
Apache Spark: how to call UDF over dataset in Java?

What is the exact Java translation of the Scala snippet below?

import org.apache.spark.sql.functions.udf 

def upper(s:String) : String = {
    s.toUpperCase
}
val toUpper = udf(upper _)
peopleDS.select(peopleDS("name"), toUpper(peopleDS("name"))).show

Please fill in the missing statement in the Java code below:

import org.apache.spark.sql.api.java.UDF1;

UDF1<String, String> toUpper = new UDF1<String, String>() {
    public String call(final String str) throws Exception {
        return str.toUpperCase();
    }
};

peopleDS.select(peopleDS.col("name"), /* how to run toUpper("name")? */).show();

NOTE

Registering the UDF and then calling it through selectExpr works for me, but I need something similar to what is shown above.

Working example:

sqlContext.udf().register(
    "toUpper",
    (String s) -> s.toUpperCase(),
    DataTypes.StringType
);
peopleDF.selectExpr("toUpper(name)","name").show();
Rahul Sharma asked Nov 14 '25 15:11



1 Answer

In Java, calling a UDF without registering it first is not possible. See the following discussion:

  • Using UDFs in Java without registration

Below is your UDF:

private static UDF1<String, String> toUpper = new UDF1<String, String>() {
    public String call(final String str) throws Exception {
        return str.toUpperCase();
    }
};

Register the UDF, and then you can invoke it by name with the callUDF function.

import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;

sqlContext.udf().register("toUpper", toUpper, DataTypes.StringType);
peopleDF.select(col("name"),callUDF("toUpper", col("name"))).show();
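
For reference, the register-then-callUDF steps above can be put together into one small program. This is only a sketch under a few assumptions that are not in the original post: it uses a SparkSession (Spark 2.x or later) instead of sqlContext, runs in local mode, and builds a tiny made-up peopleDF with a single name column. The class name ToUpperUdfSketch and the helper upperOrNull are hypothetical; the helper also returns null for null input, since calling toUpperCase() on a null name would otherwise throw.

```java
import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;

import java.util.Arrays;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.types.DataTypes;

// Hypothetical class name; requires spark-sql on the classpath.
public class ToUpperUdfSketch {

    // Hypothetical helper: upper-cases the input, passing null through unchanged.
    public static String upperOrNull(String s) {
        return s == null ? null : s.toUpperCase();
    }

    public static void main(String[] args) {
        // Assumption: a local SparkSession stands in for the sqlContext used above.
        SparkSession spark = SparkSession.builder()
                .appName("to-upper-udf-sketch")
                .master("local[*]")
                .getOrCreate();

        // Made-up sample data with a single "name" column.
        Dataset<Row> peopleDF = spark
                .createDataset(Arrays.asList("alice", "bob"), Encoders.STRING())
                .toDF("name");

        // UDF1 has a single abstract method, so a method reference can implement it.
        UDF1<String, String> toUpper = ToUpperUdfSketch::upperOrNull;

        // Register the UDF under a name, then call it by that name.
        spark.udf().register("toUpper", toUpper, DataTypes.StringType);
        peopleDF.select(col("name"), callUDF("toUpper", col("name"))).show();

        spark.stop();
    }
}
```

The UDF is still looked up by its registered name, as in the answer above; the only difference from selectExpr is that the call is built from Column objects rather than parsed from a SQL string.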
abaghel answered Nov 17 '25 09:11



