I have tried Spark UDF with parameter using lambda function and register it. but how could I create udf with not argument and registrar it I have tried this my sample code will expected to show current time
from datetime import datetime from pyspark.sql.functions import udf
def getTime():
timevalue=datetime.now()
return timevalue
udfGateTime=udf(getTime,TimestampType())
But PySpark is showing
NameError: name 'TimestampType' is not defined
which probably means my UDF is not registered I was comfortable with this format
spark.udf.register('GATE_TIME', lambda():getTime(), TimestampType())
but does lambda function take empty argument? Though I didn't try it, I am a bit confused. How could I write the code for registering this getTime() function?
lambda
expression can be nullary. You're just using incorrect syntax:
spark.udf.register('GATE_TIME', lambda: getTime(), TimestampType())
There is nothing special in lambda
expressions in context of Spark. You can use getTime
directly:
spark.udf.register('GetTime', getTime, TimestampType())
There is no need for inefficient udf
at all. Spark provides required function out-of-the-box:
spark.sql("SELECT current_timestamp()")
or
from pyspark.sql.functions import current_timestamp
spark.range(0, 2).select(current_timestamp())
I have done a bit tweak here and it is working well for now
import datetime
from pyspark.sql.types import*
def getTime():
timevalue=datetime.datetime.now()
return timevalue
def GetVal(x):
if(True):
timevalue=getTime()
return timevalue
spark.udf.register('GetTime', lambda(x):GetVal(x),TimestampType())
spark.sql("select GetTime('currenttime')as value ").show()
instead of currenttime any value can pass at it will give current date time here
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With