
Pandas: Calculate the percentage between two rows and add the value as a column

I have a dataset structured like this:

"Date","Time","Open","High","Low","Close","Volume"

This time series represents the values of a generic stock.

I want to calculate the percentage difference between consecutive rows of the "Close" column (in effect, I want to know how much the stock's value increased or decreased; each row represents a day).

I've done this with a for loop (which is a terrible approach with pandas on a large dataset), and it produces the right results, but in a separate DataFrame:

import pandas as pd

rows_number = df_stock.shape[0]

# The first row has no previous day, so its percentage is set to 1 by convention
rows = [{'Date': df_stock.iloc[0]['Date'], 'Percentage': 1}]

# For each day, calculate the market trend as a percentage
for index in range(1, rows_number):

    # n_yesterday : 100 = (n_today - n_yesterday) : x
    n_today = df_stock.iloc[index]['Close']
    n_yesterday = df_stock.iloc[index - 1]['Close']
    difference = n_today - n_yesterday
    percentage = (100 * difference) / n_yesterday

    rows.append({'Date': df_stock.iloc[index]['Date'], 'Percentage': percentage})

# Build the result DataFrame once (DataFrame.append was removed in pandas 2.0)
percentage_df = pd.DataFrame(rows)

How could I refactor this to take advantage of the DataFrame API, removing the for loop and adding the result as a new column in place?

Aso Strife asked Jan 26 '26 11:01



1 Answer

df['Change'] = df['Close'].pct_change()

or, if you want to calculate the change in reverse order (each row relative to the next one):

df['Change'] = df['Close'].pct_change(-1)
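Note that pct_change() returns a fraction rather than a percentage, and leaves NaN where there is no row to compare against. A minimal sketch of how this maps onto the loop in the question, assuming made-up sample data and a hypothetical "ChangeVsNext" column name that are not part of the original post:

```python
import pandas as pd

# Hypothetical sample data; only the "Date" and "Close" columns matter here.
df = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-06"],
    "Close": [100.0, 110.0, 99.0, 99.0],
})

# pct_change() returns a fraction: (today - yesterday) / yesterday.
# Multiplying by 100 matches the percentage computed in the question's loop.
df["Change"] = df["Close"].pct_change() * 100

# The first row has no previous day, so it is NaN. The question's loop used 1
# there; fillna(1) reproduces that choice (an assumption, adjust as needed).
df["Change"] = df["Change"].fillna(1)

# pct_change(-1) compares each row with the NEXT row instead:
# (today - tomorrow) / tomorrow, which leaves the last row as NaN.
df["ChangeVsNext"] = df["Close"].pct_change(-1) * 100

print(df)
```

Because the new column is assigned directly on df, no second DataFrame and no explicit loop are needed.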

igo answered Jan 28 '26 00:01
