I use Selenium with the Firefox webdriver in Python to scrape data from a website.
In the code, I need to access this website more than 10k times, and that consumes a lot of RAM.
Usually, by the time the script has accessed the site around 2500 times, it is already consuming 4 GB or more of RAM and it stops working.
Is it possible to reduce RAM consumption without closing the browser session?
I ask because when I start the script, I need to log in to the site manually (two-factor authentication; that code is not shown below), and if I close the browser session, I will have to log in again.
for itemLista in lista:
    driver.get("https://mytest.site.com/query/option?opt="+str(itemLista))
    isActivated = driver.find_element_by_xpath('//div/table//tr[2]//td[1]')
    activationDate = driver.find_element_by_xpath('//div/table//tr[2]//td[2]')
    print(str(isActivated.text))
    print(str(activationDate.text))
    indice+=1
    print("numero: "+str(indice))
    file2.write(itemLista+" "+str(isActivated.text)+" "+str(activationDate.text)+"\n")

# close file
file2.close()
I discovered how to avoid the memory leak.
I just add
time.sleep(2)
after
file2.write(itemLista+" "+str(isActivated.text)+" "+str(activationDate.text)+"\n")
Now Firefox works without consuming lots of RAM.
It is just perfect.
I don't know exactly why it stopped consuming so much memory, but I think memory use kept growing because the browser didn't have time to finish each driver.get request before the next one started.
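For reference, a minimal sketch of the loop with the pause added (same variables and XPaths as in the question; it assumes time is imported and that file2 and indice are set up before the loop):

import time

for itemLista in lista:
    driver.get("https://mytest.site.com/query/option?opt=" + str(itemLista))
    isActivated = driver.find_element_by_xpath('//div/table//tr[2]//td[1]')
    activationDate = driver.find_element_by_xpath('//div/table//tr[2]//td[2]')
    indice += 1
    print("numero: " + str(indice))
    file2.write(itemLista + " " + isActivated.text + " " + activationDate.text + "\n")
    time.sleep(2)  # short pause so Firefox can finish the previous page load before the next request

# close file
file2.close()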
As mentioned in my comment, only open and write to your file on each iteration instead of keeping it open in memory:
# remove the line file2 = open(...) from your code
for itemLista in lista:
    driver.get("https://mytest.site.com/query/option?opt="+str(itemLista))
    isActivated = driver.find_element_by_xpath('//div/table//tr[2]//td[1]')
    activationDate = driver.find_element_by_xpath('//div/table//tr[2]//td[2]')
    print(str(isActivated.text))
    print(str(activationDate.text))
    indice+=1
    print("numero: "+str(indice))
    # open in append mode so each iteration adds a line instead of overwriting the file
    with open("your file path here", "a") as file2:
        file2.write(itemLista+" "+str(isActivated.text)+" "+str(activationDate.text)+"\n")
While Selenium is quite a memory-hungry beast, it doesn't necessarily murder your RAM with each iteration. However, the growing open buffer of file2 does take up RAM the more you write to it. Only when it's closed does it release that memory and write everything to disk.
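If reopening the file on every iteration feels heavy, an alternative sketch (my own variation, not part of the original answer) is to keep file2 open but flush its buffer after each write, so buffered data is handed to the OS instead of piling up in the Python process:

# Assumes driver, lista and the XPath lookups from the question;
# "your file path here" is a placeholder path.
file2 = open("your file path here", "a")
indice = 0
for itemLista in lista:
    driver.get("https://mytest.site.com/query/option?opt=" + str(itemLista))
    isActivated = driver.find_element_by_xpath('//div/table//tr[2]//td[1]')
    activationDate = driver.find_element_by_xpath('//div/table//tr[2]//td[2]')
    indice += 1
    file2.write(itemLista + " " + isActivated.text + " " + activationDate.text + "\n")
    file2.flush()  # push the buffered line to the OS right away
file2.close()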