I'm trying to plot some data in a Jupyter notebook with a PySpark environment (on Python 3.6) built on an EMR instance, but the plot doesn't appear. In short: when I run the plt.show() command, no plot appears.
First I tried putting %matplotlib inline at the beginning, and with that the plot appears (with the caveat described further down). Then I tried changing the backend... No luck! Only the "Agg" backend works; when I tried others, the code crashed.
This is the code I'm trying to run:
import matplotlib.pyplot as plt
plt.switch_backend('agg')
plt.plot([1,2,3,4])
plt.show()
Output
Nothing...
I also read that for plotting I need to use "%matplotlib inline", but the problem with this is that variables defined in other cells don't exist in the %matplotlib inline cell.
Let's see...
Cell 1
dummy_var = 10
Cell 2
import matplotlib.pyplot as plt
plt.plot([1,2,3,4])
plt.show()
print(dummy_var)
Output
NameError: name 'dummy_var' is not defined.
Disclaimer:
To be fair, the %matplotlib inline command works fine, but I also need the outside variables in order to plot them.
Cell 3
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot([1,2,3,4])
plt.show()
Output
[The plot]
This is my notebook, with the error and the examples...
I was able to display a plot using data from Spark by running the following in a separate cell:
%matplot plt
That is courtesy of the README sample on the sparkmagic GitHub page.
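To make that concrete, here is a rough sketch of what the plotting cell might look like before the %matplot plt cell. It assumes the notebook uses the sparkmagic PySpark kernel, where ordinary cells run on the cluster and the figure is built with the non-interactive "Agg" backend. The names values and scaled are made up for illustration, and dummy_var stands in for the variable from the question's Cell 1 (it is redefined here only so the snippet runs on its own).

```python
# Plotting cell (runs in the same session as the other cells,
# so variables such as dummy_var are visible here).
import matplotlib
matplotlib.use("Agg")  # headless backend; the only one reported to work in the question
import matplotlib.pyplot as plt

dummy_var = 10  # in the real notebook this would come from an earlier cell

values = [1, 2, 3, 4]                      # hypothetical sample data
scaled = [v * dummy_var for v in values]   # use the outside variable in the plot

plt.clf()                 # start from an empty figure
plt.plot(values, scaled)  # draw onto pyplot's current figure
plt.title("values scaled by dummy_var")
```

After this cell, running %matplot plt on its own in the next cell should display the current figure in the notebook. Because the plot is built in a regular cell, there is no need for %matplotlib inline, which is what made the outside variables unavailable in the question.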
This used to happen to me too. What fixed it for me was putting this instruction:
%matplotlib inline
in the same cell as my imports.
It is also good practice with notebooks to always put your imports in a single cell at the top of the notebook.