
Vega-lite rolling averages within groups

I have a table of data that summarizes prices vs. number of users. When I want to display the scatter plot overlaid with a moving average line I use the following Julia function:

function log_scatter(df::DataFrame; smooth=2, title="Price by Number of Users")
    sort(select(df, [:Price, :Users]), :Users) |> 
    @vlplot(width=640,height=512, title=title) +
    @vlplot(mark={:point, opacity=0.5}, x={field=:Users, scale={type="log"},title="Users"}, y={:Price,title="Price per User"}) +
    @vlplot(transform=[
        { groupby=[:Users], aggregate=[{ op=:mean, field=:Price, as="AvgPrice" }] },
        { frame=[-smooth,smooth], window=[{ field="AvgPrice", op=:mean, as="rolling" }] }
        ],
        mark={:line,size=2,color="red"}, x={:Users, title="Users"}, y={"rolling:q", title="Average"})
end

It produces a nice plot: [image: scatter plot with moving-average line]

Unfortunately, when I try to do the same with grouping, I can't get the moving average to display:

function log_scatter_and(df::DataFrame, other; smooth=2, title="Price by Number of Users")
    otherSym=Symbol(other)
    prices = price_and(df, other)
    sort(select(prices, [:Price, :Users, otherSym]), :Users) |> 
    @vlplot(width=640,height=512, title=title) +
    @vlplot(mark={:point, opacity=0.5}, color=otherSym, x={field=:Users, scale={type="log"},title="Users"}, y={:Price,title="Price per User"}) +
    @vlplot(transform=[
        { groupby=[:Users, otherSym], aggregate=[{ op=:mean, field=:Price, as="AvgPrice" }] },
        { frame=[-smooth,smooth], window=[{ field="AvgPrice", op=:mean, as="rolling" }] }
        ],
        mark={:line,size=2,color=otherSym}, x={:Users, title="Users"}, y={"rolling:q", title="Average"})
end

This is the output when I try to group by year: [image: grouped scatter plot without rolling-average lines]

I want the rolling average lines to show up as well as the scatter.

leei asked Oct 24 '25 16:10



2 Answers

I'm not well versed in VegaLite, but I came across StatisticalGraphics, which I guess uses Vega-Lite under the hood. Note that you need to add the package from its main branch.

using InMemoryDatasets
using StatisticalGraphics
using RollingFunctions

ds=Dataset(y=rand(1000), x=rand(1:200,1000),year=rand(2000:2010,1000))
sort!(ds, :x)
modify!(groupby(ds,:year), :y=>(x->runmean(x,2))=>:rolling)
sgplot(ds, [Line(x=:x,y=:rolling, group=:year), Scatter(x=:x,y=:y,group=:year)], nominal=:year, xaxis=Axis(type=:log))
giantmoa answered Oct 26 '25 12:10



The window transform has to have a groupby statement:

function log_scatter_and(df::DataFrame, other; smooth=2, title="Price by Number of Users")
    otherSym=Symbol(other)
    prices = price_and(df, other)
    sort(select(prices, [:Price, :Users, otherSym]), :Users) |> 
    @vlplot(width=640,height=512, title=title) +
    @vlplot(mark={:point, opacity=0.5}, color=otherSym, x={field=:Users, scale={type="log"},title="Users"}, y={:Price,title="Price per User"}) +
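    @vlplot(transform=[
        { groupby=[:Users, otherSym], aggregate=[{ op=:mean, field=:Price, as="AvgPrice" }] },
        # Partition the window by the grouping field only; also grouping by :Users would leave one row per window
        { groupby=[otherSym], sort=[{ field="Users" }], frame=[-smooth,smooth], window=[{ field="AvgPrice", op=:mean, as="rolling" }] }
        ],
        # color is an encoding channel (as in the point layer), so each group gets its own line
        mark={:line,size=2}, color=otherSym, x={:Users, title="Users"}, y={"rolling:q", title="Average"})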
end
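
To see what a grouped window is meant to compute, here is a minimal plain-Julia sketch of a centered rolling mean applied separately within each group. It is only an illustration, not what Vega-Lite runs internally: the helper names `centered_rolling_mean` and `grouped_rolling_mean` and the sample data are made up, and the window is assumed to be clipped at the ends of each group, the way frame=[-smooth,smooth] would be.

```julia
using Statistics  # standard library, provides mean

# Centered rolling mean: element i averages values[i-smooth : i+smooth],
# with the range clipped to the bounds of the vector.
function centered_rolling_mean(values::AbstractVector{<:Real}; smooth::Integer=2)
    n = length(values)
    return [mean(view(values, max(1, i - smooth):min(n, i + smooth))) for i in 1:n]
end

# Hypothetical helper: compute the rolling mean independently per group,
# ordering the rows of each group by x before applying the window.
function grouped_rolling_mean(x, y, group; smooth::Integer=2)
    rolling = zeros(Float64, length(y))
    for g in unique(group)
        idx = findall(==(g), group)    # rows that belong to this group
        idx = idx[sortperm(x[idx])]    # order those rows by x
        rolling[idx] = centered_rolling_mean(y[idx]; smooth=smooth)
    end
    return rolling
end

# Made-up data: two years with very different price levels.
users = [1, 2, 3, 1, 2, 3]
price = [10.0, 20.0, 30.0, 100.0, 200.0, 300.0]
year  = [2020, 2020, 2020, 2021, 2021, 2021]

rolling = grouped_rolling_mean(users, price, year; smooth=1)
println(rolling)
```

With this data the 2020 rows get 15, 20, 25 and the 2021 rows get 150, 200, 250, so the two years never bleed into each other. Without the per-group partition, the window would average 2020 and 2021 prices together.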
هنروقتان answered Oct 26 '25 14:10
