shift particular rows of a particular column of pandas dataframe

Question

I have a dataframe and am trying to shift the rows that have NaNs in the first two columns to the left, so that the values to the right fill those columns. Here is what I am currently trying to do:

(Note: the match dataframe was downloaded from this link: https://www.kaggle.com/hugomathien/soccer)

#original dataframe
<class 'pandas.core.frame.DataFrame'>
Int64Index: 21374 entries, 145 to 25978
Data columns (total 47 columns):
id                  21374 non-null int64
country_id          21374 non-null int64
league_id           21374 non-null int64
season              21374 non-null object
stage               21374 non-null int64
date                21374 non-null object
match_api_id        21374 non-null int64
home_team_api_id    21374 non-null int64
away_team_api_id    21374 non-null int64
home_team_goal      21374 non-null int64
away_team_goal      21374 non-null int64
goal                13325 non-null object
shoton              13325 non-null object
shotoff             13325 non-null object
foulcommit          13325 non-null object
card                13325 non-null object
cross               13325 non-null object
corner              13325 non-null object
possession          13325 non-null object
BSA                 11856 non-null float64
Home Team           21374 non-null object
Away Team           21374 non-null object
League              21374 non-null object
Country             21374 non-null object
home_player_1       21374 non-null object
home_player_2       21374 non-null object
home_player_3       21374 non-null object
home_player_4       21374 non-null object
home_player_5       21374 non-null object
home_player_6       21374 non-null object
home_player_7       21374 non-null object
home_player_8       21374 non-null object
home_player_9       21374 non-null object
home_player_10      21374 non-null object
home_player_11      21374 non-null object
away_player_1       21374 non-null object
away_player_2       21374 non-null object
away_player_3       21374 non-null object
away_player_4       21374 non-null object
away_player_5       21374 non-null object
away_player_6       21374 non-null object
away_player_7       21374 non-null object
away_player_8       21374 non-null object
away_player_9       21374 non-null object
away_player_10      21374 non-null object
away_player_11      21374 non-null object
winner              21374 non-null object
dtypes: float64(1), int64(9), object(37)
memory usage: 7.8+ MB

Creating the dataframe:

columns = match.columns[match.columns.get_loc('home_player_1'):match.columns.get_loc('away_player_1')+1].values
columns = list(columns)

player_appearences = match.groupby(columns[0]).size().reset_index()
player_appearences.rename(columns = {0:"Count_{}".format(player_appearences.columns[0][len(player_appearences.columns[0])-1])}, inplace = True, errors='raise')
player_appearences
for i in range(1,12):
    player_appearences2 = match.groupby(columns[i]).size().reset_index()
    player_appearences2
    player_appearences2.rename(columns = {0:"Count_{}".format(player_appearences2.columns[0][len(player_appearences2.columns[0])-1])}, inplace = True, errors='raise')
    player_appearences = player_appearences.merge(right = player_appearences2,how="outer",left_on ="{}".format(player_appearences.columns[0]),right_on = "{}".format(player_appearences2.columns[0]))
    player_appearences
    #overwrite nans in first column with names in current [i] player column

#select rows where first two columns give nan values
player_appearences.loc[(player_appearences.loc[:,"home_player_1"].isna()==True) & (player_appearences.loc[:,"Count_1"].isna()==True),["home_player_1","Count_1"]] = player_appearences.loc[(player_appearences.loc[:,"home_player_1"].isna()==True) & (player_appearences.loc[:,"Count_1"].isna()==True),["home_player_2","Count_2"]]

When I then print player_appearences, the dataframe is unchanged. I'm unsure whether the assignment is doing nothing or is being applied to a copy of the original dataframe. Can anyone tell me why this isn't working, or suggest a better way if there is one?

ansev · Accepted Answer

Use DataFrame.rename, then you only need DataFrame.stack (dropna = True by default) + DataFrame.unstack:

df = (df.rename(columns = {'home_player_2':'home_player_1',
                          'Count_2':'Count_1'}).stack().unstack()
      .reindex(columns = df.columns[:2]))
print(df)
  home_player_1 Count_1
0         Aaron       1
1          Adam       2
2         Ziggy       3
3        Zoltan       4

Or DataFrame.shift with DataFrame.where:

df.where(df.notna(),df.shift(-1,axis = 1)).iloc[:,:2]


  home_player_1  Count_1
0         Aaron      1.0
1          Adam      2.0
2         Ziggy      3.0
3        Zoltan      4.0

Detail

print(df.where(df.notna(),df.shift(-1,axis = 1)))
  home_player_1  Count_1 home_player_2  Count_2
0         Aaron      1.0           NaN      NaN
1          Adam      2.0           NaN      NaN
2         Ziggy      3.0         Ziggy      3.0
3        Zoltan      4.0        Zoltan      4.0
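
Note that how shift(-1, axis=1) behaves on a frame with mixed dtypes may depend on the pandas version. The output above pairs each object column with the next object column; I believe newer releases shift strictly by column position, which would call for shift(-2, axis=1) here, so check this on your version. A minimal sketch that sidesteps the question, assuming the four-column sample layout used in this thread, applies where to each column pair explicitly:

```python
import pandas as pd

# Sample data with the same layout as in this thread (values are illustrative).
df = pd.DataFrame(
    [["Aaron", 1, None, None],
     ["Adam", 2, None, None],
     [None, None, "Ziggy", 3],
     [None, None, "Zoltan", 4]],
    columns=["home_player_1", "Count_1", "home_player_2", "Count_2"],
)

# Keep each value where it is present; otherwise take it from the paired column.
result = pd.DataFrame({
    "home_player_1": df["home_player_1"].where(df["home_player_1"].notna(),
                                               df["home_player_2"]),
    "Count_1": df["Count_1"].where(df["Count_1"].notna(), df["Count_2"]),
})
print(result)
```

Because each pair is named explicitly, this does not depend on column order or on how shift treats dtypes, at the cost of listing the pairs by hand.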

wombatonfire · Answer

You can use shift(-1, axis=1) to shift the columns, and the boolean mask df.home_player_1.isna() & df.Count_1.isna() to select which rows to affect. The shifted rows then have to be assigned back into the dataframe.

import pandas as pd

df = pd.DataFrame([['Aaron', 1, None, None],
                   ['Adam', 2, None, None],
                   [None, None, 'Ziggy', 3],
                   [None, None, 'Zoltan', 4]],
                  columns=['home_player_1', 'Count_1', 'home_player_2', 'Count_2'])

home_player_1   Count_1     home_player_2   Count_2
Aaron           1.0         None            NaN
Adam            2.0         None            NaN
None            NaN         Ziggy           3.0
None            NaN         Zoltan          4.0

df[df.home_player_1.isna() & df.Count_1.isna()] = df[df.home_player_1.isna() & df.Count_1.isna()].shift(-1, axis=1)

home_player_1   Count_1     home_player_2   Count_2
Aaron           1.0         None            NaN
Adam            2.0         None            NaN
Ziggy           3.0         NaN             NaN
Zoltan          4.0         NaN             NaN
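
As for why the .loc assignment in the question appears to do nothing: as far as I can tell, when the right-hand side is a DataFrame, pandas aligns it to the target on row and column labels. home_player_2/Count_2 do not match home_player_1/Count_1, so NaN is written back over NaN. A minimal sketch of the usual workaround, assuming the same sample frame as above, strips the labels with to_numpy() so the values are assigned by position:

```python
import pandas as pd

# Same illustrative sample frame as in the answer above.
df = pd.DataFrame(
    [["Aaron", 1, None, None],
     ["Adam", 2, None, None],
     [None, None, "Ziggy", 3],
     [None, None, "Zoltan", 4]],
    columns=["home_player_1", "Count_1", "home_player_2", "Count_2"],
)

# Rows where both of the first two columns are missing.
mask = df["home_player_1"].isna() & df["Count_1"].isna()
left = ["home_player_1", "Count_1"]
right = ["home_player_2", "Count_2"]

# Label-aligned assignment (the pattern from the question):
# the column names differ, so the target cells stay NaN.
aligned = df.copy()
aligned.loc[mask, left] = aligned.loc[mask, right]

# Positional assignment: to_numpy() drops the labels, so the
# values land in the target columns in order.
fixed = df.copy()
fixed.loc[mask, left] = fixed.loc[mask, right].to_numpy()

print(aligned[left])
print(fixed[left])
```

The question's own code could be adapted the same way by appending .to_numpy() to the right-hand side of the assignment.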

Tags:

python

pandas

dataframe

Sean
