Julia DataFrames Unique Rows

Question

In a DataFrame, I have two columns (let's call them A and B), both categorical variables, and A contains repeated values. I am trying to show only the rows with unique values of A, together with their corresponding B values. How can I do that?

I was able to do it when B is a continuous variable by using this:

by(ptable, [:A], df -> mean(df[:B]))

Kevin · Accepted Answer

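This worked for me. It keeps the first row for each distinct value of A (on current Julia versions the negation has to be broadcast with .!):

df[.!nonunique(df[:, [:A]]), [:A, :B]]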
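
For illustration, here is a minimal sketch of the same idea on made-up data (the values are hypothetical; only the column names A and B come from the question). It assumes a recent DataFrames.jl release and also shows unique with a column selector, which should select the same rows in one call.

```julia
using DataFrames

# Hypothetical sample data; column names A and B follow the question.
df = DataFrame(A = ["x", "y", "x", "z", "y"],
               B = ["p", "q", "r", "s", "t"])

# Boolean mask: true for rows whose A value already appeared in an earlier row.
dup = nonunique(df, :A)

# Keep the rows that are not duplicates, i.e. the first row for each value of A.
first_rows = df[.!dup, [:A, :B]]

# unique with a column selector selects the same rows in one call.
same_rows = unique(df, :A)
```

Note that both forms keep only the first B value seen for each A and silently drop the others.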

Bogumił Kamiński · Answer

You can get the desired result like this:

by(df, :A, x -> [x.B])

Now your DataFrame will have two columns, :A and :x1. Column :x1 holds all the values of column :B that correspond to each unique value of :A, so :x1 is a vector of vectors.

EDIT: as of DataFrames.jl 0.22, use the following syntax:

combine(groupby(df, :A), :B => Ref => :B)
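
As a rough sketch of what this produces, the example below runs the combine call on made-up data (the values are hypothetical). Wrapping with Ref tells combine to store each group's B values as a single cell instead of expanding them into rows. The second call, which joins the values into one string per group, is an illustrative variation and not part of the original answer.

```julia
using DataFrames

# Hypothetical sample data; column names A and B follow the question.
df = DataFrame(A = ["x", "y", "x", "z", "y"],
               B = ["p", "q", "r", "s", "t"])

# One row per value of A; each cell of B holds all B values for that group.
grouped = combine(groupby(df, :A), :B => Ref => :B)

# Variation (assumption, not from the answer): collapse each group's
# B values into a single comma-separated string.
joined = combine(groupby(df, :A), :B => (b -> join(b, ", ")) => :B)
```

Unlike the nonunique approach, this keeps every B value for each A rather than only the first one.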
