Hi, I am using a custom UDF to take the square root of every value in each feature column.
import math
from pyspark.sql.functions import udf
from pyspark.sql.types import DoubleType
square_root_UDF = udf(lambda x: math.sqrt(x), DoubleType())
for x in features:
  dataTraining = dataTraining.withColumn(x, square_root_UDF(x))
Is there a faster way to do this? The polynomial expansion function is not suitable in this case.
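Don't use a UDF here. A Python UDF sends every row to a Python worker and back, while the built-in sqrt is evaluated inside the JVM. Use the built-in instead: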
from pyspark.sql.functions import sqrt
for x in features:
    dataTraining = dataTraining.withColumn(x, sqrt(x))