Given the following snippet:
val data = sc.parallelize(0 until 10000)
val local = data.collect 
println(s"${local.size}")
Zeppelin prints out the entire value of local to the notebook screen.  How may that behavior be changed?
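You can also try wrapping your code in curly braces. The paragraph then becomes a single block expression of type Unit, so the REPL has no top-level val definitions to echo.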
{val data = sc.parallelize(0 until 10000)
val local = data.collect 
println(s"${local.size}")}
Since 0.6.0, Zeppelin provides a boolean flag, zeppelin.spark.printREPLOutput, in the Spark interpreter's configuration (accessible via the GUI). It is set to true by default.
If you set it to false, you get the desired behaviour: only explicit print statements produce output.
See also: https://issues.apache.org/jira/browse/ZEPPELIN-688
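As a rough illustration, the setting described above would appear among the Spark interpreter's properties something like this. The name = value layout is only a plain-text rendering of the GUI's property table and is an assumption, not an exact file format:

```
zeppelin.spark.printREPLOutput = false
```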
FWIW, this appears to be new behaviour. Until recently we were using Livy 0.4, which only output the result of the final statement (rather than echoing the output of the whole script).
When we upgraded to Livy 0.5, the behaviour changed to output the entire script.
While splitting the paragraph and hiding the output does work, it seems like an unnecessary burden on the usability of Zeppelin. For example, if you need to refresh your output, you have to remember to run two paragraphs (i.e. the one that sets up your output and the one containing the actual println).
There are, IMHO, other usability issues with this approach that make, again IMHO, Zeppelin less intuitive to use.
Someone has logged a JIRA ticket to address "the problem". Please vote for it: LIVY-507