
Is it possible to reduce RAM consumption when using Selenium GeckoDriver and Firefox?

I use Selenium and the Firefox WebDriver with Python to scrape data from a website.

In the code, I need to access this website more than 10k times, and doing so consumes a lot of RAM.

Usually, by the time the script has accessed the site 2500 times, it already consumes 4 GB or more of RAM and stops working.

Is it possible to reduce RAM consumption without closing the browser session?

I ask because when I start the script, I need to log in to the site manually (two-factor authentication; that code is not shown below), and if I close the browser session, I will need to log in again.

for itemLista in lista:
    driver.get("https://mytest.site.com/query/option?opt="+str(itemLista))

    isActivated = driver.find_element_by_xpath('//div/table//tr[2]//td[1]')
    activationDate = driver.find_element_by_xpath('//div/table//tr[2]//td[2]')

    print(str(isActivated.text))
    print(str(activationDate.text))

    indice+=1
    print("numero: "+str(indice))

    file2.write(itemLista+" "+str(isActivated.text)+" "+str(activationDate.text)+"\n")

#close file
file2.close()
fabiobh asked Oct 25 '25 08:10


2 Answers

I discovered how to avoid the memory leak.

I just added

time.sleep(2)

after

file2.write(itemLista+" "+str(isActivated.text)+" "+str(activationDate.text)+"\n")

Now Firefox runs without consuming lots of RAM.

It works perfectly.

I don't know exactly why it stopped consuming so much memory, but I think memory use was growing because the browser didn't have time to finish each driver.get request.
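The loop with the added pause can be sketched as a small function. This is a minimal sketch, not the full script: `scrape_with_pause`, `out_path`, and `pause_seconds` are hypothetical names, and the driver is passed in so the existing logged-in session is reused. It assumes a Selenium 4 driver, where the locator strategy can be given as the plain string "xpath" (the value of Selenium's `By.XPATH`) instead of calling `find_element_by_xpath`.

```python
import time


def scrape_with_pause(driver, items, out_path, pause_seconds=2.0):
    """Visit one URL per item, record two table cells, and pause between requests.

    `driver` is assumed to be an already logged-in Selenium WebDriver
    (any object with compatible get() and find_element() methods works).
    Returns the number of items processed.
    """
    base_url = "https://mytest.site.com/query/option?opt="  # URL from the question
    count = 0
    with open(out_path, "a") as out_file:
        for item in items:
            driver.get(base_url + str(item))

            # "xpath" is the string value of selenium's By.XPATH
            is_activated = driver.find_element("xpath", "//div/table//tr[2]//td[1]").text
            activation_date = driver.find_element("xpath", "//div/table//tr[2]//td[2]").text

            out_file.write(f"{item} {is_activated} {activation_date}\n")
            count += 1

            # Give the browser time to finish the page before the next request
            time.sleep(pause_seconds)
    return count
```

With the `driver` and `lista` from the question, a call might look like `scrape_with_pause(driver, lista, "results.txt")`, where the output file name is a placeholder.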

fabiobh answered Oct 27 '25 21:10


As mentioned in my comment, only open and write to your file on each iteration instead of keeping it open in memory:

# remove the line file2 = open(...) from your code

for itemLista in lista:
    driver.get("https://mytest.site.com/query/option?opt="+str(itemLista))

    isActivated = driver.find_element_by_xpath('//div/table//tr[2]//td[1]')
    activationDate = driver.find_element_by_xpath('//div/table//tr[2]//td[2]')

    print(str(isActivated.text))
    print(str(activationDate.text))

    indice+=1
    print("numero: "+str(indice))

    # Append mode ("a") adds a line on each iteration; "w" would overwrite the file every time
    with open("your file path here", "a") as file2:
        file2.write(str(itemLista)+" "+str(isActivated.text)+" "+str(activationDate.text)+"\n")

While Selenium is quite a memory-hungry beast, it doesn't necessarily consume more RAM with each iteration. However, your open file2 buffer does take up more RAM the more you write to it. Only when the file is closed is that memory released and the data written to disk.
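The per-iteration write can be sketched as a small helper. This is only an illustration: `append_result` and its parameter names are hypothetical, and append mode ("a") is assumed so that each call adds a line instead of replacing the file's contents.

```python
def append_result(path, item, is_activated, activation_date):
    """Open the file, append one result line, and close it again."""
    # The with-block closes the file as soon as the line is written,
    # so nothing stays open between iterations.
    with open(path, "a") as out_file:
        out_file.write(f"{item} {is_activated} {activation_date}\n")
```

Inside the loop, the call would replace the `file2.write(...)` line, for example `append_result("your file path here", itemLista, isActivated.text, activationDate.text)`.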

r.ook answered Oct 27 '25 23:10


