I have two variables for every user: review_count and fans.
'review_count' gives the number of reviews made by the user and 'fans' gives the number of fans they have.
The data looks like this:
The data is stored in SQLite. Is there a built-in function in SQLite for calculating the correlation between two variables?
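I am doing the same Coursera course and this is my solution. Note that in other SQL dialects, built-in covariance and correlation functions (covar, cor or similar, depending on the dialect) make this much easier. I could only calculate R squared rather than R itself, because the SQLite I was using had no SQRT() function (sqrt() only exists in SQLite 3.35.0 and later, and only in builds compiled with SQLITE_ENABLE_MATH_FUNCTIONS).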
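-- R2 = cov(x, y)^2 / (var(x) * var(y)), with x = review_count and y = fans
select avg((review_count - avg_x) * (fans - avg_y))
     * avg((review_count - avg_x) * (fans - avg_y))
     / (var_x * var_y) as R2
from user, (
    -- population variances of x and y around their means
    select
        avg_x,
        avg_y,
        avg((review_count - avg_x) * (review_count - avg_x)) as var_x,
        avg((fans - avg_y) * (fans - avg_y)) as var_y
    from user, (
        -- means of x and y over all users
        select
            avg(review_count) as avg_x,
            avg(fans) as avg_y
        from user
    )
);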
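If you can run the query from a host language, one way around the missing SQRT() is to let SQLite return the covariance and the two variances and take the square root outside the database. The following is only a minimal sketch of that idea in Python with the standard sqlite3 module. The toy rows and the helper name pearson_r are my own assumptions for illustration; only the table and column names come from the question. Dividing the covariance by the square root also keeps the sign of R, which R squared loses.

```python
import math
import sqlite3

# Hypothetical toy rows (review_count, fans) standing in for the real user table.
ROWS = [(1, 0), (2, 1), (3, 1), (4, 3), (5, 5)]


def pearson_r(conn):
    """Return Pearson's r between review_count and fans, or None if undefined."""
    cov, var_x, var_y = conn.execute(
        """
        select avg((review_count - avg_x) * (fans - avg_y)),
               avg((review_count - avg_x) * (review_count - avg_x)),
               avg((fans - avg_y) * (fans - avg_y))
        from user, (select avg(review_count) as avg_x,
                           avg(fans) as avg_y
                    from user)
        """
    ).fetchone()
    # Empty table (NULL averages) or a constant column (zero variance): r is undefined.
    if not var_x or not var_y:
        return None
    # The square root is taken in Python, so no SQRT() is needed in SQLite.
    return cov / math.sqrt(var_x * var_y)


conn = sqlite3.connect(":memory:")
conn.execute("create table user (review_count integer, fans integer)")
conn.executemany("insert into user values (?, ?)", ROWS)

r = pearson_r(conn)
print(r)  # roughly 0.949 for the toy rows above
```

This uses population variances (plain averages), the same as the query above, so squaring the result should match the R2 column from that query.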