Wednesday, 18 September 2013

Scraping HTML between two HTML comments using Nokogiri

I have some HTML pages where the content to be extracted is marked with
HTML comments, like below.
.....
<!-- begin content -->
...
Some HTML elements go here
<!-- end content -->
...
I am using Nokogiri and trying to extract the HTML between these comments.
I can get a text-only version using the characters callback, but I need the
full HTML between these comments.
I am rather new to web scraping, so any help is appreciated.
