I am trying to grab elements from HTML source based on the class or id name, using a C# Windows Forms application. I am putting the source into a string using WebClient and loading it into the HTMLAgilityPack using HtmlDocument.
However, all the examples I find for the HTMLAgilityPack parse through and find items based on tags. I need to find a specific id, say of a link in the HTML, and retrieve the value inside the tags. Is this possible, and what would be the most efficient way to do it? Everything I have tried for parsing out the ids gives me exceptions. Thanks!
You should be able to do this with XPath:
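using HtmlAgilityPack;

// Load the document from disk; use doc.LoadHtml(html) if the source is already in a string.
HtmlDocument doc = new HtmlDocument();
doc.Load(@"file.htm");
// Match any element whose id attribute equals "my_control_id"; the result is null if none exists.
HtmlNode node = doc.DocumentNode.SelectSingleNode("//*[@id=\"my_control_id\"]");
string value = (node == null) ? "Error, id not found" : node.InnerHtml;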
Quick explanation of the XPath here:
// means search everywhere in the path. Use SelectNodes if it will be matching multiples.
* means match any type of node.
[] define "Predicates", which are basically checking properties relative to this node.
[@id=\"my_control_id\"] means find nodes that have an attribute named "id" with the value "my_control_id".
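Since the question has the source in a string and also asks about class names, here is a minimal sketch of that variant. It assumes the HtmlAgilityPack NuGet package is referenced; the markup, the id "my_control_id", and the class name "link" are made up for illustration. The class test uses the common XPath 1.0 idiom of padding the attribute with spaces so that "link" matches as a whole token and not as part of a longer name.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical markup standing in for the string downloaded with WebClient.
        string html = "<html><body>"
            + "<a id=\"my_control_id\" class=\"nav link\" href=\"/home\">Home</a>"
            + "<a class=\"link\" href=\"/about\">About</a>"
            + "</body></html>";

        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string rather than from a file

        // Single element by id; SelectSingleNode returns null when nothing matches.
        HtmlNode byId = doc.DocumentNode.SelectSingleNode("//*[@id='my_control_id']");
        if (byId != null)
        {
            Console.WriteLine(byId.InnerText);                      // text between the tags
            Console.WriteLine(byId.GetAttributeValue("href", ""));  // attribute value, "" if absent
        }

        // Every element whose class attribute contains the token "link".
        HtmlNodeCollection byClass = doc.DocumentNode.SelectNodes(
            "//*[contains(concat(' ', normalize-space(@class), ' '), ' link ')]");
        if (byClass != null) // SelectNodes also returns null when nothing matches
        {
            foreach (HtmlNode n in byClass)
            {
                Console.WriteLine(n.InnerText);
            }
        }
    }
}
```

Checking both results for null before using them is probably what avoids the exceptions mentioned in the question, since neither method returns an empty result when the id or class is missing.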