I have a pandas dataframe and a list as follows:
mylist = ['nnn', 'mmm', 'yyy']
mydata =
   xxx  yyy  zzz  nnn  ddd  mmm
0    0   10    5    5    5    5
1    1    9    2    3    4    4
2    2    8    8    7    9    0
Now, I want to get only the columns mentioned in mylist and save them as a csv file.
i.e.
   yyy  nnn  mmm
0   10    5    5
1    9    3    4
2    8    7    0
My current code is as follows.
mydata = pd.read_csv( input_file, header=0)
for item in mylist:
    mydata_new = mydata[item]
print(mydata_new)
mydata_new.to_csv(file_name)
It seems that my new dataframe contains the wrong result. Where am I going wrong? Please help me!
We can select columns from a dataframe by passing the column names we want as a list to the filter() function. In this example, we select the species column from the dataframe. By default, filter() selects columns when we provide the column labels as a list.
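As a rough sketch of the filter() approach, here it is applied to the data from the question rather than to a species column. The dataframe is rebuilt by hand, so the values are only assumed to match the original input file.

```python
import pandas as pd

# Data rebuilt from the table in the question (assumed to match the CSV).
mydata = pd.DataFrame({
    'xxx': [0, 1, 2],
    'yyy': [10, 9, 8],
    'zzz': [5, 2, 8],
    'nnn': [5, 3, 7],
    'ddd': [5, 4, 9],
    'mmm': [5, 4, 0],
})
mylist = ['nnn', 'mmm', 'yyy']

# items= keeps only the columns whose labels appear in the list.
selected = mydata.filter(items=mylist)
print(selected)
```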
To select a single column, use square brackets [] with the column name of the column of interest.
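This is also most likely why the loop in the question misbehaves: each pass selects one column as a Series and overwrites the previous one, so only the last column in the list survives. A small sketch, assuming the data from the question:

```python
import pandas as pd

# Data rebuilt from the table in the question (assumed to match the CSV).
mydata = pd.DataFrame({
    'xxx': [0, 1, 2],
    'yyy': [10, 9, 8],
    'zzz': [5, 2, 8],
    'nnn': [5, 3, 7],
    'ddd': [5, 4, 9],
    'mmm': [5, 4, 0],
})
mylist = ['nnn', 'mmm', 'yyy']

single = mydata['yyy']    # one label in brackets gives a Series
frame = mydata[['yyy']]   # a list of labels gives a DataFrame

# The loop from the question: mydata_new is replaced on every pass.
for item in mylist:
    mydata_new = mydata[item]

print(type(single).__name__, type(frame).__name__)
print(mydata_new.name)  # name of the only column that is left
```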
You can use the loc and iloc indexers to access columns in a Pandas DataFrame. Let's see how. If we wanted to access a certain column in our DataFrame, for example the Grades column, we could simply use loc and specify the name of the column in order to retrieve it. loc selects by label, while iloc selects by integer position.
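A minimal sketch of both indexers, using the data from the question instead of a Grades column. The integer positions passed to iloc assume the column order shown in the question.

```python
import pandas as pd

# Data rebuilt from the table in the question (assumed to match the CSV).
mydata = pd.DataFrame({
    'xxx': [0, 1, 2],
    'yyy': [10, 9, 8],
    'zzz': [5, 2, 8],
    'nnn': [5, 3, 7],
    'ddd': [5, 4, 9],
    'mmm': [5, 4, 0],
})
mylist = ['nnn', 'mmm', 'yyy']

# loc: all rows, columns chosen by label.
by_label = mydata.loc[:, mylist]

# iloc: all rows, columns chosen by position (nnn=3, mmm=5, yyy=1 here).
by_position = mydata.iloc[:, [3, 5, 1]]

print(by_label)
```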
Just pass a list of column names to index df:
df[['nnn', 'mmm', 'yyy']]
   nnn  mmm  yyy
0    5    5   10
1    3    4    9
2    7    0    8
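Putting it together with the CSV step from the question, something like the following should work. This is only a sketch: the dataframe is built by hand instead of read from input_file, and it writes to an in-memory buffer so it runs without touching disk. Pass your own file name to to_csv to write a real file.

```python
import io
import pandas as pd

# Data rebuilt from the table in the question (assumed to match the CSV).
mydata = pd.DataFrame({
    'xxx': [0, 1, 2],
    'yyy': [10, 9, 8],
    'zzz': [5, 2, 8],
    'nnn': [5, 3, 7],
    'ddd': [5, 4, 9],
    'mmm': [5, 4, 0],
})
mylist = ['nnn', 'mmm', 'yyy']

# Select every wanted column in one step; no loop is needed.
mydata_new = mydata[mylist]

# Write to a buffer for demonstration; use mydata_new.to_csv(file_name)
# to write to disk, and add index=False to leave out the row index.
buffer = io.StringIO()
mydata_new.to_csv(buffer)
csv_text = buffer.getvalue()
print(csv_text)
```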
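If you need to handle non-existent column names in your list, try filtering with df.columns.isin. Names that are not in the dataframe are ignored, and the result keeps the dataframe's own column order rather than the order of the list: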
df.loc[:, df.columns.isin(['nnn', 'mmm', 'yyy', 'zzzzzz'])]
   yyy  nnn  mmm
0   10    5    5
1    9    3    4
2    8    7    0
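If you would rather keep the order of your list while still skipping missing names, one option is to filter the list itself before indexing. A sketch, where wanted and present are just illustrative variable names:

```python
import pandas as pd

# Data rebuilt from the table in the question (assumed to match the CSV).
df = pd.DataFrame({
    'xxx': [0, 1, 2],
    'yyy': [10, 9, 8],
    'zzz': [5, 2, 8],
    'nnn': [5, 3, 7],
    'ddd': [5, 4, 9],
    'mmm': [5, 4, 0],
})
wanted = ['nnn', 'mmm', 'yyy', 'zzzzzz']

# Boolean mask approach: result follows the dataframe's column order.
masked = df.loc[:, df.columns.isin(wanted)]

# Keep only the names that exist, in the order given in the list.
present = [col for col in wanted if col in df.columns]
ordered = df[present]

print(list(masked.columns))
print(list(ordered.columns))
```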