In a DataFrame, I have two columns (let's call them A and B), both categorical variables, and A contains repeated values. I am trying to show only the unique values of A together with their corresponding B values. How can I do that?
I was able to do it when B is a continuous variable by using this:
by(ptable, [:A], df -> mean(df[:B]))
This worked for me (it keeps the first row for each unique value of A):
df[.!nonunique(df[:, [:A]]), [:A, :B]]
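For illustration, here is a minimal sketch of that first-occurrence approach on a made-up table (the data and the variable names `dup`, `first_rows` and `same_rows` are assumptions, not from the question). It assumes a recent DataFrames.jl, where `nonunique(df, :A)` flags rows whose A value has already appeared and `unique(df, :A)` should select the same rows directly:

```julia
using DataFrames

# Hypothetical example data: A has repeats, both columns are categorical-like strings
df = DataFrame(A = ["x", "y", "x", "z", "y"], B = ["a", "b", "c", "d", "e"])

# true for every row whose :A value was already seen in an earlier row
dup = nonunique(df, :A)

# Keep only the first row for each distinct :A (note the broadcast negation .!)
first_rows = df[.!dup, [:A, :B]]

# unique with a column argument keeps the same first-occurrence rows
same_rows = unique(df, :A)[:, [:A, :B]]
```

Note that this keeps only one B value per A and discards the others, which may or may not be what you want.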
You can get the desired result like this:
by(df, :A, x -> [x.B])
The result is a new DataFrame with two columns, :A and :x1, where column :x1 holds all values of column :B corresponding to each unique value of :A (so column :x1 is a vector of vectors).
EDIT: as of DataFrames.jl 0.22 use the following syntax:
combine(groupby(df, :A), :B => Ref => :B)
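A small sketch of how this might look on made-up data (the table contents and the names `grouped` and `distinct` are assumptions for illustration). `sort = true` is passed to `groupby` only to make the group order predictable here:

```julia
using DataFrames

# Hypothetical example data
df = DataFrame(A = ["x", "y", "x", "z", "y"], B = ["a", "b", "c", "d", "e"])

# One row per unique :A; Ref wraps each group's :B values so that
# combine stores them as a single cell instead of expanding them into rows
grouped = combine(groupby(df, :A; sort = true), :B => Ref => :B)

# Variant: keep only the distinct :B values within each group
distinct = combine(groupby(df, :A; sort = true), :B => (b -> Ref(unique(b))) => :B)
```

As far as I know, the cells produced by the plain `Ref` version are views into the original column, so wrapping a copy instead (for example `b -> Ref(copy(b))`) is the safer choice if `df` will be modified afterwards.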