I have the following problem:
I have a survey that contains a large number of answers to Likert questions, like so:
id | Q1 | Q2 | Q3
1 | 5 | 3 | 1
2 | 3 | 4 | 1
3 | 2 | 3 | 1
The problem is that not all questions are asked in 'the same direction'. An answer of '5' in Q1 indicates a positive answer, but a 5 in Q2 means a strongly negative answer.
We are currently re-encoding all questions by hand (replacing all Q2 5's with 1's, etc.), but I was wondering if there is a quicker way to solve this.
I thought about dividing all answers by 5 and then subtracting 1, but that never gives me whole numbers. Math isn't really my strong point, so I was wondering if someone here could help me out.
If I understood you correctly, you can do it like this:
df['Q2'] = df['Q2'].map({1:5, 2:4, 3:3, 4:2, 5:1})
Input:
Q1 Q2 Q3
0 5 3 1
1 3 4 1
2 2 5 1
Output:
Q1 Q2 Q3
0 5 3 1
1 3 2 1
2 2 1 1
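If several questions are reverse-scored, the same mapping can be built once and applied to each of them. This is only a sketch, assuming a 1-to-5 scale; the `reversed_cols` list is hypothetical and would be replaced by your own reverse-worded questions:

```python
import pandas as pd

df = pd.DataFrame({"Q1": [5, 3, 2], "Q2": [3, 4, 5], "Q3": [1, 1, 1]})

# Hypothetical list of reverse-scored questions (an assumption for this example)
reversed_cols = ["Q2", "Q3"]

# Build {1: 5, 2: 4, 3: 3, 4: 2, 5: 1} instead of typing it out
scale = range(1, 6)
reverse_map = dict(zip(scale, reversed(scale)))

for col in reversed_cols:
    df[col] = df[col].map(reverse_map)

print(df)
```

One property of `map` worth knowing: any value that is not a key in the dictionary becomes NaN, which can help spot answers that fall outside the scale.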
You can subtract column Q2 from 6, or use rsub, which subtracts from the right side:
print (df)
Q1 Q2 Q3
0 5 3 1
1 3 4 1
2 2 5 1
df.Q2 = 6 - df.Q2
#same as
#df.Q2 = df.Q2.rsub(6)
If performance is important, subtract using the underlying NumPy array:
df.Q2 = 6 - df.Q2.values
Or:
df.Q2 = df.eval('6 - Q2')
Or:
import numexpr
x = df.Q2.values
df.Q2 = numexpr.evaluate('(6 - x)')
print (df)
Q1 Q2 Q3
0 5 3 1
1 3 2 1
2 2 1 1
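The 6 comes from adding the lowest and highest possible answers (1 + 5), so the same idea works for any scale: reversed = (low + high) - answer. A rough sketch for reversing several columns at once; `reverse_score` and its `low`/`high` parameters are hypothetical names chosen for this example, not part of pandas:

```python
import pandas as pd

def reverse_score(df, cols, low=1, high=5):
    """Return a copy of df with the given columns reverse-scored."""
    out = df.copy()
    # (low + high) - x maps low -> high, high -> low, and keeps the midpoint
    out[cols] = (low + high) - out[cols]
    return out

df = pd.DataFrame({"Q1": [5, 3, 2], "Q2": [3, 4, 5], "Q3": [1, 1, 1]})
result = reverse_score(df, ["Q2"])
print(result)
```

For a 7-point scale you would call it with `high=7`, which subtracts each answer from 8.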