Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Call a function for each row of a dataframe in pyspark[non pandas]

There is a function in pyspark:

def sum(a,b):
    c=a+b
    return c

It has to be run on each record of a very very large dataframe using spark sql:

x = sum(df.select["NUM1"].first()["NUM1"], df.select["NUM2"].first()["NUM2"])

But this would run it only for the first record of the df and not for all rows. I understand it could be done using a lambda, but I am not able to code it in the desired way.

In reality; c would be a dataframe and the function would be doing a lot of spark.sql stuff and return it. I would have to call that function for each row. I guess, I will try to pick it up using this sum(a,b) as an analogy.

+----------+----------+-----------+
|     NUM1 |     NUM2 |    XYZ    |
+----------+----------+-----------+
|      10  |     20   |      HELLO|                                    
|      90  |     60   |      WORLD|
|      50  |     45   |      SPARK|
+----------+----------+-----------+


+----------+----------+-----------+------+
|     NUM1 |     NUM2 |    XYZ    | VALUE|
+----------+----------+-----------+------+
|      10  |     20   |      HELLO|30    |                                     
|      90  |     60   |      WORLD|150   |
|      50  |     45   |      SPARK|95    |
+----------+----------+-----------+------+

Python: 3.7.4
Spark: 2.2
like image 515
earl Avatar asked Oct 15 '25 13:10

earl


1 Answers

You can use .withColumn function:

from pyspark.sql.functions import col
from pyspark.sql.types import LongType
df.show()
+----+----+-----+
|NUM1|NUM2|  XYZ|
+----+----+-----+
|  10|  20|HELLO|
|  90|  60|WORLD|
|  50|  45|SPARK|
+----+----+-----+

def mysum(a,b):
  return a + b

spark.udf.register("mysumudf", mysum, LongType())

df2 = df.withColumn("VALUE", mysum(col("NUM1"),col("NUM2"))

df2.show()
+----+----+-----+-----+
|NUM1|NUM2|  XYZ|VALUE|
+----+----+-----+-----+
|  10|  20|HELLO|   30|
|  90|  60|WORLD|  150|
|  50|  45|SPARK|   95|
+----+----+-----+-----+
like image 143
Vincent Avatar answered Oct 18 '25 08:10

Vincent



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!