C# using webbrowser documenttext, document stays null

I hope the title was clear enough, but I will try to explain...

I'm using C# WinForms (.NET 4.5).

I'm creating a WebBrowser control and trying to set its content with wb.DocumentText. But when I try to loop through the elements, wb.Document is null.

Here's my code:

WebBrowser wb = new WebBrowser();
wb.DocumentText = leMessage;

HtmlElementCollection elems = wb.Document.GetElementsByTagName("a");
foreach (HtmlElement elem in elems)
{
    // Do Some Stuff
}

leMessage holds an HTML newsletter message and there are some a tags in it.

I've already tried this: wb.Document.Body.InnerHtml = leMessage; but that didn't work either...

What did I miss or do wrong?

asked Jan 26 '26 01:01 by Mathlight


1 Answer

Setting WebBrowser.DocumentText loads the document asynchronously. You need to handle the DocumentCompleted event before you can access the DOM, and the UI thread has to keep pumping Windows messages in the meantime. Here's a complete example of web scraping, using async/await to keep the convenient linear code flow. Just alter the navigation part:

await NavigateAsync(ct, () => this.webBrowser.DocumentText = leMessage, timeout);
HtmlElementCollection elems = this.webBrowser.Document.GetElementsByTagName("a");

This way you could do it in a loop. In a nutshell:

using System;
using System.Diagnostics;
using System.Threading.Tasks;
using System.Windows.Forms;

namespace WinformsApp2
{
    public partial class MainForm : Form
    {
        public MainForm()
        {
            InitializeComponent();
        }

        const string leMessage = "<a href='http://example.com'>Go there</a>";

        private async void MainForm_Load(object sender, EventArgs e)
        {
            var wb = new WebBrowser();

            // Completed by the DocumentCompleted handler once the document has loaded.
            TaskCompletionSource<bool> tcs = null;
            WebBrowserDocumentCompletedEventHandler documentCompletedHandler = (sender2, e2) => tcs.TrySetResult(true);

            for (int i = 0; i < 3; i++)
            {
                tcs = new TaskCompletionSource<bool>();
                wb.DocumentCompleted += documentCompletedHandler;
                try
                {
                    // Assigning DocumentText only starts the load; wait for DocumentCompleted.
                    wb.DocumentText = leMessage;
                    await tcs.Task;
                }
                finally
                {
                    wb.DocumentCompleted -= documentCompletedHandler;
                }

                // The DOM is available from this point on.
                HtmlElementCollection elems = wb.Document.GetElementsByTagName("a");
                foreach (HtmlElement elem in elems)
                {
                    Debug.Print(elem.OuterHtml);
                }
            }
        }
    }
}
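
If you need this in more than one place, the same pattern can be wrapped in a small helper. The following is only a sketch: SetDocumentTextAsync and WebBrowserExtensions are hypothetical names (they are not part of WinForms, and this is not the NavigateAsync helper mentioned above), and the timeout handling via Task.Delay is my own assumption about what you might want. It has to be awaited on the UI thread with a running message loop, otherwise DocumentCompleted is never raised.

```csharp
using System;
using System.Threading.Tasks;
using System.Windows.Forms;

static class WebBrowserExtensions
{
    // Hypothetical helper (not a WinForms API): assigns DocumentText and
    // returns the document once DocumentCompleted has fired.
    // Throws TimeoutException if the event does not arrive within 'timeout'.
    public static async Task<HtmlDocument> SetDocumentTextAsync(
        this WebBrowser browser, string html, TimeSpan timeout)
    {
        var tcs = new TaskCompletionSource<bool>();
        WebBrowserDocumentCompletedEventHandler handler =
            (s, e) => tcs.TrySetResult(true);

        browser.DocumentCompleted += handler;
        try
        {
            // This only starts the asynchronous load.
            browser.DocumentText = html;

            // Whichever task finishes first wins: the load or the timeout.
            Task finished = await Task.WhenAny(tcs.Task, Task.Delay(timeout));
            if (finished != tcs.Task)
                throw new TimeoutException("DocumentCompleted was not raised in time.");
        }
        finally
        {
            // Always unsubscribe so repeated calls do not stack handlers.
            browser.DocumentCompleted -= handler;
        }

        return browser.Document;
    }
}
```

With that in place, the body of the loop above would shrink to something like: var doc = await wb.SetDocumentTextAsync(leMessage, TimeSpan.FromSeconds(10)); followed by doc.GetElementsByTagName("a").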
answered Jan 27 '26 14:01 by noseratio


