How can you show a progress bar while iterating over a pandas DataFrame?

I am trying to iterate over a pandas DataFrame with close to a million rows, using a for loop. Consider the following code as an example:

import pandas as pd 
import os 
from requests_html import HTMLSession
from tqdm import tqdm
import time


df = pd.read_csv(os.getcwd()+'/test-urls.csv')
df = df.drop('Unnamed: 0', axis=1 )

new_df = pd.DataFrame(columns = ['pid', 'orig_url', 'hosted_url'])
refused_df = pd.DataFrame(columns = ['pid', 'refused_url'])

tic = time.time()

for idx, row in df.iterrows():

    img_id = row['pid']
    url = row['image_url']

    # Scrape the page
    session = HTMLSession()
    r  = session.get(url)
    r.html.render(sleep=1, keep_page=True, scrolldown=1)

    count = 0 
    link_vals =  r.html.find('.zoomable')

    if len(link_vals) != 0 : 
        attrs = link_vals[0].attrs
        # print(attrs['src'])  
        embed_link = attrs['src']

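    else:
        # Retry the lookup up to 7 times before giving up on this URL
        while count < 7:
            link_vals = r.html.find('.zoomable')
            if len(link_vals) != 0:
                embed_link = link_vals[0].attrs['src']
                break
            count += 1
        else:
            # Runs only when the loop finished without a break (no link found)
            print('Link refused connection for 7 tries. Adding URL to Refused URLs Data Frame')
            ref_val = [img_id, url]
            len_ref = len(refused_df)
            refused_df.loc[len_ref] = ref_val
            print('Refused URL added')
            continue
    print('Got 1 link')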

    # Append scraped data to new_df
    len_df = len(new_df)
    append_value = [img_id,url, embed_link]
    new_df.loc[len_df] = append_value

How can I add a progress bar that shows how far along the loop is? I would appreciate any help. Please let me know if anything needs clarification.

sanster9292 asked Oct 17 '25 18:10

1 Answer

You could try tqdm:

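from tqdm import tqdm

# iterrows() is a generator, so pass total= to let tqdm show a percentage and ETA
for idx, row in tqdm(df.iterrows(), total=len(df)):
    ...  # do something with row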

tqdm is primarily a command-line progress bar. There are other options if you're looking for more of a GUI. PySimpleGUI comes to mind, but it is definitely a little more complicated.
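For reference, here is a small self-contained sketch of the tqdm approach. The three-row DataFrame and the `process_row` helper are hypothetical stand-ins for the question's CSV and scraping logic, not code from the original post. It assumes `pandas` and `tqdm` are installed.

```python
import pandas as pd
from tqdm import tqdm

# Hypothetical stand-in for the question's CSV; column names mirror the question.
df = pd.DataFrame({
    "pid": [1, 2, 3],
    "image_url": [
        "https://example.com/a",
        "https://example.com/b",
        "https://example.com/c",
    ],
})


def process_row(row):
    """Placeholder for the per-row scraping work (hypothetical helper)."""
    return {"pid": row["pid"], "orig_url": row["image_url"]}


results = []
# total=len(df) gives tqdm the row count, which iterrows() cannot report itself.
for idx, row in tqdm(df.iterrows(), total=len(df), desc="Scraping"):
    results.append(process_row(row))

# Build the output frame once at the end instead of growing it row by row.
new_df = pd.DataFrame(results)
```

Collecting the rows in a list and building `new_df` once is a design choice in this sketch, not something the question requires; with close to a million rows it will likely be cheaper than assigning to `new_df.loc[...]` on every iteration. If the per-row work can be written as a function, tqdm also offers `tqdm.pandas()`, which adds a `progress_apply` method to DataFrames.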

Jace Flournoy answered Oct 19 '25 07:10
