I'll be honest, the only way I know to get around this is very ugly: import the page as a text file instead of HTML, and do all of the parsing manually. Nasty, nasty business. If you want to do that, go to your Source step, click the gear icon, and change "Open file as" to Text File.
After that though... then the fun begins. You'd probably want to:
- Add a custom column to say if [Column1] = "<!-- show threads -->" then "remove" else null
- Fill the new column up, so every row above that marker also gets "remove"
- Remove all rows that have "remove" in that column
- Then you're going to be doing a lot of splitting data and logic tests to work out which data you need
It won't be fun, but it should preserve all of the characters you need...
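In case it helps, here's a rough sketch of the first three steps in M. It's not your actual query: the sample rows, the "Flag" column name and the step names are all made up for illustration, and I've used a small inline table in place of the real Source step so you can paste it into a blank query's Advanced Editor and see what happens.

```powerquery
let
    // Stand-in for the page opened as a Text File: one row per line of HTML.
    // (Hypothetical sample rows; a real query would start from your Source step.)
    Source = #table(
        {"Column1"},
        {
            {"<html>"},
            {"<div>header junk</div>"},
            {"<!-- show threads -->"},
            {"<tr><td>Thread A</td></tr>"},
            {"<tr><td>Thread B</td></tr>"}
        }
    ),
    // Flag the marker row; every other row gets null so Fill Up has gaps to fill.
    Flagged = Table.AddColumn(
        Source,
        "Flag",
        each if [Column1] = "<!-- show threads -->" then "remove" else null
    ),
    // Fill Up copies "remove" from the marker row into all null rows above it.
    FilledUp = Table.FillUp(Flagged, {"Flag"}),
    // Keep only unflagged rows, i.e. everything after the marker.
    Kept = Table.SelectRows(FilledUp, each [Flag] <> "remove"),
    // Drop the helper column.
    Result = Table.RemoveColumns(Kept, {"Flag"})
in
    Result
```

With the sample rows above, that should leave just the two "Thread" rows. The splitting and logic tests after that depend entirely on what your page's HTML looks like, so I haven't tried to guess at those.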