
DOM Parser Chrome extension memory leak

The problem

I have developed an extension that intercepts web requests, gets the HTML of the page the request originated from, and processes it. I use DOMParser to parse the HTML, and I have realised that DOMParser is causing a massive memory leak, which eventually causes the Chrome extension to crash.

This is the code that causes the issues. https://gist.github.com/uche1/20929b6ece7d647250828c63e4a2ffd4

What I've tried

Dev Tools Recorded Performance

I recorded the extension's performance while it was intercepting requests and noticed that each call to DOMParser.parseFromString created more nodes and documents, none of which were destroyed.

Dev tools screenshot https://i.sstatic.net/C8mVi.png

Task Manager Memory Footprint

I looked at Chrome's Task Manager and saw that the extension had a huge memory footprint that did not decrease over time, even though I expected garbage collection to kick in after a while. When the memory footprint gets too large, the extension crashes.

Task manager memory footprint screenshot https://i.sstatic.net/F3f3k.png

Heap snapshots

I took before and after snapshots of the heap, and the issue seems to originate from HTMLDocument objects that are allocated but never garbage collected.

Snapshot (before) https://i.sstatic.net/17LYh.png

Snapshot (after) https://i.sstatic.net/Jtg1X.png

Expected outcome

I would like to understand why DOMParser is causing such memory issues, why the documents aren't being cleaned up by the garbage collector, and what to do to resolve it.

Thanks

Coder Guy asked Sep 19 '25 16:09


2 Answers

I have resolved the problem. It seems that DOMParser, for some reason, kept references to the HTML documents it parsed in memory and never released them. Because my extension runs continuously in the background, this problem is exacerbated.

The solution was to parse the HTML a different way, using a template element:

// Parse HTML into an inert <template>; the parsed nodes live in template.content
const parseHtml = (html) => {
    const template = document.createElement('template');
    template.innerHTML = html;
    return template;
};

This helped resolve the issue.
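
As a minimal sketch of how the template might then be queried: the parsed nodes are held in `template.content` (a DocumentFragment), not on the template element itself. The helper name `extractTitle` and its `doc` parameter are hypothetical and not part of the original code; `doc` exists only so the function can be exercised outside a browser, and in the extension the global `document` would be used.

```javascript
// Hypothetical helper: parse HTML via a <template> and return only the title text.
// `doc` defaults to the global document; it is a parameter purely for testability.
function extractTitle(html, doc = globalThis.document) {
    const template = doc.createElement('template');
    template.innerHTML = html;
    // Parsed nodes are held in template.content, a DocumentFragment.
    const titleEl = template.content.querySelector('title');
    // Return a plain string (or null) so no DOM node outlives this call.
    return titleEl ? titleEl.textContent : null;
}
```

Returning a string rather than the template itself means the caller never holds a reference to the parsed nodes.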

Coder Guy answered Sep 21 '25 06:09


You are basically replicating the entire DOM in memory and then never releasing the memory.

We get away with this in a client side app because when we navigate away, the memory used by the scripts on that page is recovered.

In a background script, that doesn't happen, so releasing the memory is now your responsibility.

So set both the parser and the document to null when you are done using them.

chrome.webRequest.onCompleted.addListener(async request => {
    if (request.tabId !== -1) {
        let html = await getHtmlForTab(request.tabId);
        let parser = new DOMParser();
        let document = parser.parseFromString(html, "text/html");
        // Guard against pages that have no <title> element
        let title = document.querySelector("title")?.textContent;
        console.log(title);
        parser = null; // <----- DO THIS
        document = null; // <----- DO THIS
    }
}, requestFilter);
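
The same idea can be sketched as a small helper that confines the parser and the parsed document to one function and hands back only a primitive value. This assumes only the title is needed. The name `titleFromHtml` and the `ParserCtor` parameter are hypothetical; the parameter is there so the helper can be run outside a browser. Whether this alone stops the memory growth described in the question is not confirmed here.

```javascript
// Hypothetical helper: keep the parser and parsed document local to one call.
// `ParserCtor` defaults to the browser's DOMParser; it is injectable for testing only.
function titleFromHtml(html, ParserCtor = globalThis.DOMParser) {
    let parser = new ParserCtor();
    let doc = parser.parseFromString(html, "text/html");
    const titleEl = doc.querySelector("title");
    // Copy out the string before dropping the DOM references.
    const title = titleEl ? titleEl.textContent : null;
    parser = null;
    doc = null;
    return title;
}
```

Inside the listener, this would be called as `console.log(titleFromHtml(html));`.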
Randy Casburn answered Sep 21 '25 08:09