Adding number attribute to every HTML Tag

I need to add a custom attribute with an incrementing number to every HTML tag in a document, similar to this question, but for an HTML file rather than an XML file.

I tried to accomplish it with HTML Agility Pack, here is my code:

        HtmlDocument htmldoc = new HtmlDocument();
        htmldoc.LoadHtml(text);
        var num = 1;
        foreach (HtmlNode node in htmldoc.DocumentNode.DescendantNodes())
        {
            node.Attributes.Add("gist_num",(num++).ToString());
        }

        var numberedfilename = Path.GetDirectoryName(fname) + @"\" + Path.GetFileNameWithoutExtension(fname) + "-num.htm";

        htmldoc.Save(numberedfilename);

But I get a stack overflow exception in the HtmlTextNode class of HTML Agility Pack. I tried several ways to fix this by changing the class, but to no avail.

What would you suggest here?

--- edit --- So, the exception is just "Stack Overflow" written to the console.

"Process is terminated due to StackOverflowException."

Since it is a stack overflow, there is no way to inspect any stack values. Here is the code where Visual Studio shows the exception occurring:

    /// <summary>
    /// Gets or Sets the text of the node.
    /// </summary>
    public string Text
    {
        get
        {
            if (_text == null)
            {
                return base.OuterHtml;
            }
            return _text;
        }
        set { _text = value; }
    }

So, any ideas?

Alexander Galkin asked Mar 21 '26 19:03


1 Answer

You need to filter the nodes so that you select only the elements. For some reason, enumerating the descendants in HTML Agility Pack also returns other node types, such as text nodes. Since you are blindly adding attributes to all nodes, it chokes when serializing the non-element nodes.

// note: Descendants() and DescendantNodes() are equivalent (unfortunately)
var query = htmldoc.DocumentNode.Descendants()
    .Where(node => node.NodeType == HtmlNodeType.Element);
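
To show how the filter fits into the numbering loop from the question, here is a minimal sketch. The class name TagNumbering and the method NumberElements are hypothetical names introduced only for illustration, and the sketch returns the numbered HTML as a string rather than saving it to a file. It assumes the HtmlAgilityPack package is referenced.

```csharp
using System.Linq;
using HtmlAgilityPack;

public static class TagNumbering
{
    // Hypothetical helper (not part of HTML Agility Pack): returns the input
    // HTML with a gist_num attribute added to every element node.
    public static string NumberElements(string html)
    {
        var htmldoc = new HtmlDocument();
        htmldoc.LoadHtml(html);

        // Keep only element nodes; text and comment nodes are skipped.
        // ToList() materializes the query before any node is modified.
        var elements = htmldoc.DocumentNode.Descendants()
            .Where(node => node.NodeType == HtmlNodeType.Element)
            .ToList();

        var num = 1;
        foreach (HtmlNode node in elements)
        {
            node.Attributes.Add("gist_num", (num++).ToString());
        }

        return htmldoc.DocumentNode.OuterHtml;
    }
}
```

To write the result to a file as in the question, the last line could be replaced with a call to htmldoc.Save(numberedfilename).
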
Jeff Mercado answered Mar 24 '26 09:03