
Create a boolean feature to check if two columns are the same

I have a dataframe DF1 with three features (columns) a, b, c, all of StringType. I want to create a new dataframe DF2 from DF1 that has two columns:

  1. The column a
  2. A new column d that is 1 if b = c and 0 otherwise

Input example:

a b c  
A B B  
B C A  
D D D  

Desired output:

a d  
A 1  
B 0  
D 1  
Shuang asked Oct 26 '25 03:10



2 Answers

The missing piece is dropping the other two columns with drop.

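import org.apache.spark.sql.functions.col
// Add the Boolean column d (true when b equals c), then drop b and c.
val df2 = df1.withColumn("d", col("b") === col("c")).drop("b", "c")
df2.show()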

This gives us a Boolean column d:

+---+-----+
|  a|    d|
+---+-----+
|  A| true|
|  B|false|
|  D| true|
+---+-----+
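
The output above holds true/false, while the question asks for 1/0. One possible way to get integers is to cast the Boolean comparison to int. The following is a minimal sketch rather than a definitive solution: the object name EqualityFlag, the helper addFlag, and the local SparkSession settings are assumptions made for illustration and are not part of the original answers.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object EqualityFlag {
  // Hypothetical helper: keeps column a and adds d = 1 when b equals c, else 0.
  // Casting the Boolean result of === to int maps true -> 1 and false -> 0.
  def addFlag(df: DataFrame): DataFrame =
    df.select(col("a"), (col("b") === col("c")).cast("int").as("d"))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, just for trying the example out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("equality-flag")
      .getOrCreate()
    import spark.implicits._

    // Same rows as the input example in the question.
    val df1 = Seq(("A", "B", "B"), ("B", "C", "A"), ("D", "D", "D")).toDF("a", "b", "c")

    addFlag(df1).show()
    spark.stop()
  }
}
```

Using select here avoids the separate drop step. Note that if b or c is null, === yields null and so does the cast; if nulls should compare as equal to each other, the null-safe operator <=> could be used in place of ===.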
wlchastain answered Oct 28 '25 23:10



Use this:

val df2 = df1.withColumn("d", col("b") === col("c"))

Here withColumn adds the new column d to the result, df2.

maogautam answered Oct 29 '25 00:10



