
Seaborn two plots on one fig: x-values are off by one [duplicate]

I'm experiencing strange behavior when plotting two plots on top of each other in seaborn. The bar plot appears to work fine, but the regplot appears to be off by one. Note the lack of a regplot data point at x=1, and compare the point plotted at x=2 with the values in the table below: it's clearly off by one.


My pandas Dataframe looks like this:

    Threshold per Day   # Alarms    Percent Reduction
0   1                   791         96.72
1   2                   539         93.90
2   3                   439         91.94
3   4                   361         89.82
4   5                   317         88.26
5   6                   263         85.94
6   7                   233         84.41
7   8                   205         82.78
8   9                   196         82.17
9   10                  176         80.66
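
For anyone who wants to reproduce this, the table above can be rebuilt as a DataFrame. This is a minimal sketch that assumes integer thresholds and alarm counts and float percentages; the actual dtypes of results_df are not stated in the question.

```python
import pandas as pd

# Values copied from the table above; dtypes are an assumption.
results_df = pd.DataFrame({
    "Threshold per Day": list(range(1, 11)),
    "# Alarms": [791, 539, 439, 361, 317, 263, 233, 205, 196, 176],
    "Percent Reduction": [96.72, 93.90, 91.94, 89.82, 88.26,
                          85.94, 84.41, 82.78, 82.17, 80.66],
})

print(results_df)
```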

The code I'm using here is:

%matplotlib inline

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

ax2 = ax.twinx()
sns.barplot(x='Threshold per Day', y="# Alarms", data=results_df, ax=ax, color='lightblue')
sns.regplot(x='Threshold per Day', y='Percent Reduction', data=results_df, marker='x', fit_reg=False, ax=ax2)

Any ideas what's going on or how to fix it?

Brendan asked Nov 20 '25 20:11



1 Answer

Caveat: This only addresses a possible fix; I don't know why seaborn behaves this way (but see the edit and the comment).

If you just want a decent plot in the meantime, I would recommend switching to pure matplotlib, at least for this plot and any others with similarly strange behaviour. You can get a very similar plot with the following code:

# Assumes results_df and the imports from the question are already in scope.
fig, ax = plt.subplots()
ax2 = ax.twinx()  # second y-axis sharing the same x-axis

# Both calls receive the numeric thresholds, so bars and markers line up.
ax.bar(results_df['Threshold per Day'], results_df['# Alarms'], color='lightblue')
ax2.scatter(results_df['Threshold per Day'], results_df['Percent Reduction'], marker='x')
ax.set_ylabel('# of Alarms')
ax2.set_ylabel('Percent Reduction')
ax.set_xlabel('Threshold Per Day')
plt.xticks(range(1, 11))
plt.show()


Edit to take into account ImportanceOfBeingErnest's comment:

You can obtain this plot in seaborn using:

import numpy as np

fig, ax = plt.subplots()
ax2 = ax.twinx()

sns.barplot(x=results_df['Threshold per Day'],
            y=results_df["# Alarms"], ax=ax, color='lightblue')
# seaborn draws the bars at positions 0..n-1, so place the markers there too
sns.regplot(x=np.arange(0, len(results_df)),
            y=results_df['Percent Reduction'], marker='x',
            fit_reg=False, ax=ax2)
plt.show()

It turns out that matplotlib's bar seems to interpret the x values as numbers when possible, whereas seaborn's barplot treats them as categories (strings) and places the bars at integer positions starting from 0 by default. The regplot, by contrast, uses the actual numeric values 1 to 10, which is why it looks shifted by one. Because your thresholds are evenly spaced, you can simply force the regplot points onto a range from 0 to the length of your dataframe, as above.
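
If you want to check this yourself, a rough diagnostic is to read the positions back from the axes. The sketch below uses a three-row stand-in for the question's data and draws the two plots on separate, unshared axes (my own choice, so the two x-axes cannot influence each other). With seaborn's default settings I would expect the bars to be centred at 0, 1, 2 and the regplot points to sit at 1, 2, 3.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the check runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Small stand-in for the first three rows of the question's table.
df = pd.DataFrame({
    "Threshold per Day": [1, 2, 3],
    "# Alarms": [791, 539, 439],
    "Percent Reduction": [96.72, 93.90, 91.94],
})

# Separate axes (not twinned) so each plot keeps its own x positions.
fig, (ax_bar, ax_pts) = plt.subplots(1, 2)
sns.barplot(x="Threshold per Day", y="# Alarms", data=df,
            ax=ax_bar, color="lightblue")
sns.regplot(x="Threshold per Day", y="Percent Reduction", data=df,
            marker="x", fit_reg=False, ax=ax_pts)

# Centre of each drawn bar, in data coordinates (skip any zero-width patches).
bar_centers = sorted(
    round(float(p.get_x() + p.get_width() / 2), 6)
    for p in ax_bar.patches if p.get_width() > 0
)
# x coordinates of the scatter markers drawn by regplot.
point_x = sorted(float(x) for x in ax_pts.collections[0].get_offsets()[:, 0])

print("bar centers:", bar_centers)
print("regplot x:  ", point_x)
```

The mismatch between the two lists is the off-by-one from the question; passing np.arange(len(results_df)) as x to regplot moves the markers onto the bar positions.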

sacuL answered Nov 23 '25 09:11



