Quick Look – Html Agility Pack

Software projects often come down to transforming data from one format to another.

A common case is extracting data from HTML. This often comes up when migrating data out of a legacy system that offers no API or export, where screen scraping may be the only practical option.

There are a variety of options for screen scraping in .NET, and the Html Agility Pack is a really good one.

To install Html Agility Pack, add the NuGet package with the .NET CLI:

dotnet add package HtmlAgilityPack

The Html Agility Pack is an open-source project that makes parsing and interacting with HTML easier.

To download a web page, you can use the following:
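A minimal sketch of the download step, assuming the library's `HtmlWeb` class is used to fetch and parse the page in one call. The `PageDownloader` class and `Download` method are names made up for illustration.

```csharp
using HtmlAgilityPack;

public static class PageDownloader
{
    // Hypothetical helper: fetches the page at the given URL and
    // returns it parsed into an HtmlDocument.
    public static HtmlDocument Download(string url)
    {
        var web = new HtmlWeb();
        return web.Load(url);
    }
}
```

Calling `PageDownloader.Download("https://example.com/")` (a placeholder URL) returns a document whose `DocumentNode` property is the root of the parsed tree.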

Once you have a page downloaded, the Html Agility Pack provides several good ways to parse the content. For example, an XPath query can find all td elements that have a particular class.
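Finding td elements by class might look like the sketch below. It assumes an exact match on the class attribute, so a cell with `class="price highlight"` would not match `price`. `CellFinder`, `FindCells`, and the sample class names are hypothetical, and the class name passed in must not contain a single quote, since it is placed directly into the XPath expression.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class CellFinder
{
    // Hypothetical helper: returns the trimmed text of every <td>
    // whose class attribute is exactly className.
    public static List<string> FindCells(string html, string className)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes($"//td[@class='{className}']");
        if (nodes == null)
        {
            return new List<string>();
        }

        return nodes.Select(node => node.InnerText.Trim()).ToList();
    }
}
```

The same XPath query can be run directly against a downloaded document through its `DocumentNode` property instead of re-parsing an HTML string.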

Using Html Agility Pack is pretty straightforward. Give it a try the next time you need to parse some HTML.

