
Get the content of an element of a Web page using C#

Tags: browser, c#

Is there any way to get the content of an element or control on a web page that is open in a browser from a C# application?

I tried to get the window ex, but I don't know how to use it afterwards to communicate with the page in any way. I also tried this code:

using (var client = new WebClient())
{
    var contents = client.DownloadString("http://www.google.com");
    Console.WriteLine(contents);
}

This code gives me a lot of data I can't use.

asked Sep 15 '25 13:09 by user1839169


1 Answer

You could use an HTML parser such as HTML Agility Pack to extract the information you are interested in from the HTML you downloaded:

using System;
using System.Net;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

using (var client = new WebClient())
{
    // Download the HTML
    string html = client.DownloadString("http://www.google.com");

    // Now feed it to HTML Agility Pack:
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(html);

    // Now you can query the DOM. For example, you could extract
    // the href attribute of every anchor. Note that SelectNodes returns
    // null (not an empty collection) when nothing matches.
    HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
    if (links != null)
    {
        foreach (HtmlNode link in links)
        {
            HtmlAttribute href = link.Attributes["href"];
            if (href != null)
            {
                Console.WriteLine(href.Value);
            }
        }
    }
}
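
Since the question asks about a single element rather than every link, the same approach can be narrowed to one node. The following is a minimal sketch and not part of the original answer: the ElementReader class, the TryGetInnerTextById method, and the "price" id used below are hypothetical names chosen for illustration. It assumes the HtmlAgilityPack NuGet package is referenced and that the element you want carries an id attribute. Like the code above, it only sees the HTML the server returns, not the live state of a page already open in a browser.

```csharp
using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

// Hypothetical helper, named here for illustration only.
public static class ElementReader
{
    // Parses the given HTML and looks up the element with the given id.
    // Returns true and sets 'text' to the element's trimmed inner text
    // when the element exists; returns false and an empty string otherwise.
    public static bool TryGetInnerTextById(string html, string id, out string text)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // GetElementbyId (note the lowercase 'b') returns null when no
        // element with that id is present in the document.
        HtmlNode node = doc.GetElementbyId(id);
        if (node == null)
        {
            text = string.Empty;
            return false;
        }

        // InnerText concatenates the text of the node and its descendants.
        text = node.InnerText.Trim();
        return true;
    }
}
```

To use it, pass the string returned by DownloadString as the html argument, for example ElementReader.TryGetInnerTextById(html, "price", out string text), and check the returned bool before reading text.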
answered Sep 17 '25 03:09 by Darin Dimitrov