
Apache Livy: query Spark SQL via REST: possible?

Tags:

apache-spark

The Apache Livy documentation is sparse: is it possible to return Spark SQL query result sets over REST using Apache Livy? The calling application is mobile and cannot connect via ODBC/JDBC, so the Spark Thrift server is not an option.

MarkTeehan asked Sep 06 '25 03:09


1 Answer

Yes, it is possible to submit Spark SQL queries through Livy. However, there is currently no support for submitting bare queries on their own; they need to be wrapped in Python or Scala code.
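To make the wrapping concrete, here is a minimal sketch of a helper that turns a SQL string into the Scala snippet and JSON body Livy expects. The names wrap_sql_as_scala and build_statement_payload are hypothetical (not part of Livy), the snippet assumes a Scala ("spark") session, and using json.dumps to quote the SQL as a Scala string literal is an approximation that should hold for ordinary queries.

```python
import json
import textwrap


def wrap_sql_as_scala(sql):
    """Return Scala source that runs `sql` and emits the rows via %json."""
    # json.dumps gives a double-quoted, backslash-escaped literal; treating it
    # as a Scala string literal is an assumption that holds for ordinary SQL.
    literal = json.dumps(sql)
    template = textwrap.dedent("""\
        val result = spark.sql({literal}).collect
        %json result
        """)
    return template.format(literal=literal)


def build_statement_payload(sql):
    """Build the JSON body for POST /sessions/{id}/statements."""
    return json.dumps({"code": wrap_sql_as_scala(sql)})


if __name__ == "__main__":
    print(build_statement_payload("SELECT COUNT(*) FROM food_item_tbl"))
```

The returned string can be passed as the data argument of requests.post against the session's /statements URL.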

Here are two examples of returning results through Livy. Both use Python with the requests library to call the REST API and send Scala code as a string to be executed in the Spark session:

1) Using the %json magic in Livy (https://github.com/apache/incubator-livy/blob/412ccc8fcf96854fedbe76af8e5a6fec2c542d25/repl/src/test/scala/org/apache/livy/repl/PythonInterpreterSpec.scala#L91)

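import json
import textwrap
import requests

# Assumed values (not in the original answer): adjust to your Livy server.
host = "http://localhost:8998"
headers = {"Content-Type": "application/json"}

# Session 1 must already exist and be a Scala ("spark") session.
session_url = host + "/sessions/1"
statements_url = session_url + "/statements"
data = {
        'code': textwrap.dedent("""\
        val d = spark.sql("SELECT COUNT(DISTINCT food_item) FROM food_item_tbl")
        val e = d.collect
        %json e
        """)}
r = requests.post(statements_url, data=json.dumps(data), headers=headers)
print(r.json())  # the statement object: id, state and, once finished, output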

2) Using the %table magic in Livy (https://github.com/apache/incubator-livy/blob/412ccc8fcf96854fedbe76af8e5a6fec2c542d25/repl/src/test/scala/org/apache/livy/repl/PythonInterpreterSpec.scala#L105)

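import json
import textwrap
import requests

# Assumed values (not in the original answer): adjust to your Livy server.
host = "http://localhost:8998"
headers = {"Content-Type": "application/json"}

session_url = host + "/sessions/21"
statements_url = session_url + "/statements"
data = {
        'code': textwrap.dedent("""\
        val x = List((1, "a", 0.12), (3, "b", 0.63))
        %table x
        """)}
r = requests.post(statements_url, data=json.dumps(data), headers=headers)
print(r.json())  # the statement object: id, state and, once finished, output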
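
The POST only queues the statement, so the response usually does not contain the result yet. Below is a rough sketch of polling for completion and pulling the data out. It assumes the statement object has the shape described in the Livy REST API docs (a "state" field and an "output" object with "status" and "data"); wait_for_statement, extract_data and the fetch callable are hypothetical names introduced for illustration.

```python
import time

# Statement states that mean "not finished yet" (per the Livy REST API docs).
PENDING_STATES = ("waiting", "running")

# MIME keys Livy is assumed to use for %json, %table and plain results.
PREFERRED_MIME_TYPES = (
    "application/json",
    "application/vnd.livy.table.v1+json",
    "text/plain",
)


def wait_for_statement(fetch, interval=1.0, timeout=60.0):
    """Call fetch() until the statement leaves the pending states."""
    deadline = time.monotonic() + timeout
    while True:
        statement = fetch()
        if statement["state"] not in PENDING_STATES:
            return statement
        if time.monotonic() >= deadline:
            raise TimeoutError("statement did not finish in time")
        time.sleep(interval)


def extract_data(statement):
    """Return the result payload of a finished statement, or raise."""
    output = statement.get("output") or {}
    if statement.get("state") != "available" or output.get("status") != "ok":
        raise RuntimeError("statement failed: %r" % (output.get("evalue"),))
    data = output["data"]
    for mime_type in PREFERRED_MIME_TYPES:
        if mime_type in data:
            return data[mime_type]
    raise KeyError("no recognised MIME type in %r" % (sorted(data),))
```

With the variables from the examples above, fetch could be something like lambda: requests.get(statements_url + "/" + str(r.json()["id"]), headers=headers).json(). Passing the fetch step in as a callable keeps the polling logic testable without a running Livy server.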

Garren S answered Sep 10 '25 01:09