 

Float to integer for column name in pandas

Tags:

python

pandas

I have a pandas data frame which looks like this

    Data Source   World Development Indicators  Unnamed: 2                         Unnamed: 3        Unnamed: 4        Unnamed: 5
    Country Name         Country Code         Indicator Name                     Indicator Code     1.960000e+03      1.961000e+03  
    Aruba                    ABW         GDP at market prices (constant 2010 US$)   NY.GDP.MKTP.KD           NaN             NaN    

To promote the first row to the column headers I am using the code

data.columns = data.iloc[0]

As a result the data frame is modified into

Country Name    Country Code    Indicator Name  Indicator Code     1960.0         1961.0        1962.0
Country Name    Country Code    Indicator Name  Indicator Code  1.960000e+03    1.961000e+03
Aruba   ABW GDP at market prices (constant 2010 US$)    NY.GDP.MKTP.KD  NaN           NaN

My main problem is that the columns with years as headers come out as floats such as 1960.0, whereas I want them to be integers, i.e. 1960. Any help on this will be greatly appreciated.

Rajarshi Bhadra asked Oct 24 '25 19:10


2 Answers

option 1

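def rn(x):
    # Format numeric labels such as 1960.0 as '1960'; leave text labels unchanged
    try:
        return '{:0.0f}'.format(x)
    except (ValueError, TypeError):
        return x

# Promote row 0 to the header, then rename the labels with rn.
# rename(index=rn) is used because rename_axis(rn) no longer renames labels in current pandas.
df.T.set_index(0).rename(index=rn).T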
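
Note that formatting with '{:0.0f}' produces string labels such as '1960'. If true integer labels are wanted, a variation along the following lines may help. This is only a minimal sketch: the helper name to_int_label and the small sample frame are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd


def to_int_label(label):
    """Return whole-number floats as ints; leave every other label alone."""
    if isinstance(label, float) and label.is_integer():
        return int(label)
    return label


# Hypothetical frame mimicking the question: the real header sits in row 0.
df = pd.DataFrame([
    ["Country Name", "Country Code", 1960.0, 1961.0],
    ["Aruba", "ABW", np.nan, np.nan],
])

# Build the new header from row 0, then drop that row from the data.
header = [to_int_label(value) for value in df.iloc[0]]
df = df.iloc[1:].reset_index(drop=True)
df.columns = header

print(list(df.columns))
```

Converting the labels before assigning them keeps text headers untouched and avoids a second pass over the columns.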

piRSquared answered Oct 27 '25 08:10


Another possible solution, if the DataFrame is created from a CSV file, is to pass the skiprows or header parameter to read_csv:

import pandas as pd
from io import StringIO  # pandas.compat.StringIO is not available in current pandas

temp=u"""Data Source;World Development Indicators;Unnamed: 2;Unnamed: 3;Unnamed: 4;Unnamed: 5
Country Name;Country Code;Indicator Name;Indicator Code;1960;1961
Aruba;ABW;GDP at market prices (constant 2010 US$);NY.GDP.MKTP.KD;NaN;NaN"""
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp), sep=";", skiprows=1)
print(df)
  Country Name Country Code                            Indicator Name  \
0        Aruba          ABW  GDP at market prices (constant 2010 US$)   

   Indicator Code  1960  1961  
0  NY.GDP.MKTP.KD   NaN   NaN 

df = pd.read_csv(StringIO(temp), sep=";", header=1)
print(df)
  Country Name Country Code                            Indicator Name  \
0        Aruba          ABW  GDP at market prices (constant 2010 US$)   

   Indicator Code  1960  1961  
0  NY.GDP.MKTP.KD   NaN   NaN  
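
One thing to keep in mind is that read_csv returns the header labels as strings, so the year columns above are '1960' and '1961' rather than integers. If integer labels are needed, something like the following sketch could be applied afterwards. The variable name csv_text is an assumption for illustration, and the str.isdigit check is just one simple way to pick out the year labels.

```python
from io import StringIO

import pandas as pd

# Same sample data as above, with the real header on the second line.
csv_text = """Data Source;World Development Indicators;Unnamed: 2;Unnamed: 3;Unnamed: 4;Unnamed: 5
Country Name;Country Code;Indicator Name;Indicator Code;1960;1961
Aruba;ABW;GDP at market prices (constant 2010 US$);NY.GDP.MKTP.KD;NaN;NaN"""

df = pd.read_csv(StringIO(csv_text), sep=";", header=1)

# Convert labels made up only of digits (the years) to int; keep the rest as text.
df = df.rename(columns=lambda c: int(c) if c.isdigit() else c)

print(list(df.columns))
```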

If that is not possible, see MaxU's solution and add df = df[1:] to remove the first row from the data.

jezrael answered Oct 27 '25 10:10