
Creating and assigning different variables using a for loop

Tags:

python

pandas

So what I'm trying to do is the following:

I have 300+ CSVs in a certain folder. What I want to do is open each CSV and take only the first row of each.

What I wanted to do was the following:

import os

list_of_csvs = os.listdir() # puts all the names of the csv files into a list.

The above generates a list for me like ['file1.csv','file2.csv','file3.csv'].

This is great and all, but where I get stuck is the next step. I'll demonstrate this using pseudo-code:

import pandas as pd

for index,file in enumerate(list_of_csvs):
    df{index} = pd.read_csv(file)    

Basically, I want my for loop to iterate over my list_of_csvs object, and read the first item to df1, 2nd to df2, etc. But upon trying to do this I just realized - I have no idea how to change the variable being assigned when doing the assigning via an iteration!!!

That's what prompts my question. I managed to find another way to get my original job done no problemo, but this issue of doing variable assignment over an iteration is something I haven't been able to find clear answers on!

Yeahprettymuch asked Mar 13 '26 05:03



1 Answer

If I understand your requirement correctly, we can do this quite simply. Let's use pathlib, which was added in Python 3.4, instead of os.

from pathlib import Path
import pandas as pd

csvs = Path.cwd().glob('*.csv') # returns a generator of matching paths.
# replace Path.cwd() with Path(your_path) if the files are in a different location

dfs = {} # let's hold the DataFrames in this dictionary

for file in csvs:
    dfs[file.stem] = pd.read_csv(file, nrows=3) # change nrows [number of rows] to your spec, e.g. nrows=1 for only the first row.

# or with a dict comprehension (raw string so the backslashes are not treated as escapes)
dfs = {file.stem: pd.read_csv(file) for file in Path(r'location\of\your\files').glob('*.csv')}

This will return a dictionary of DataFrames, with each key being the CSV file name; .stem gives the name without the extension.

Much like:

{
'csv_1' : dataframe,
'csv_2' : dataframe
} 
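To tie this back to the df1, df2, ... idea in the question: the dictionary key does the job that the numbered variable name would have done. Here is a minimal sketch, assuming two throwaway CSVs written to a temporary folder; load_first_rows is a hypothetical helper name, not something from pandas, and nrows=1 is used because the question only wants the first row.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_first_rows(folder):
    """Return {file stem: one-row DataFrame} for every CSV in `folder`.

    Hypothetical helper for illustration only.
    """
    return {
        path.stem: pd.read_csv(path, nrows=1)
        for path in sorted(Path(folder).glob("*.csv"))
    }


# Build a throwaway folder with two small CSVs so the sketch runs anywhere.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "csv_1.csv").write_text("a,b\n1,2\n3,4\n")
(demo_dir / "csv_2.csv").write_text("a,b\n5,6\n7,8\n")

dfs = load_first_rows(demo_dir)

# Look a frame up by name instead of by a generated variable such as df1.
first = dfs["csv_1"]

# Loop over every frame without knowing the names in advance.
for name, frame in dfs.items():
    print(name, frame.shape)
```

If you really want positions rather than names, list(dfs.values())[0] gives the first frame, but looking frames up by file name is usually easier to keep straight with 300+ files.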

If you want to concatenate these, then do:

df = pd.concat(dfs)

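The result has a two-level index: the outer level is the CSV file name (without the extension) and the inner level is each frame's original row index.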
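A small sketch of what the concatenated frame looks like and how to get rows back out of it. The two frames are made-up stand-ins for the one-row CSV reads, and the level names "source" and "row" are my own choice, passed through the names argument of pd.concat.

```python
import pandas as pd

# Stand-ins for the per-file frames; the keys play the role of file stems.
dfs = {
    "csv_1": pd.DataFrame({"a": [1], "b": [2]}),
    "csv_2": pd.DataFrame({"a": [5], "b": [6]}),
}

# Passing a dict makes its keys the outer level of the resulting index.
combined = pd.concat(dfs, names=["source", "row"])

# Select the rows that came from a single file by its key.
only_csv_2 = combined.loc["csv_2"]

# Or move the file name out of the index into an ordinary column.
flat = combined.reset_index(level="source")

print(combined)
print(flat)
```

Keeping the file name as a column (flat above) tends to be handier if you plan to filter or group by source file later.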

Umar.H answered Mar 15 '26 19:03



