I have the following situation: I want to use Anaconda3 with Zeppelin and Spark.
I have installed the following components:
I configured the Python interpreter to point to my Anaconda installation, in my case /opt/anaconda3/bin/python, and this works. I also added the following to the zeppelin.sh script:
export PYTHONPATH="${SPARK_HOME}/python:${SPARK_HOME}/python/lib/py4j-0.8.2.1-src.zip"
export SPARK_YARN_USER_ENV="PYTHONPATH=${PYTHONPATH}"
export PYSPARK_DRIVER_PYTHON="/var/opt/teradata/anaconda3/envs/py35/bin/ipython"
export PYSPARK_PYTHON="/var/opt/teradata/anaconda3/envs/py35/bin/python"
export PYLIB="/var/opt/teradata/anaconda3/envs/py35/lib"
Up to this point everything works.
When I try the %python.conda and %python.sql interpreters, they fail because neither the conda command nor pandas can be found. I added the library locations to the $PATH environment variable, and Zeppelin can now find them, but as a side effect the default Python version for the whole environment becomes 3.5 instead of 2.7, and I start to get another error like this one:
apache.zeppelin.interpreter.InterpreterException: File "/usr/bin/hdp-select", line 205
print "ERROR: Invalid package - " + name
^
SyntaxError: Missing parentheses in call to 'print'
ls: cannot access /usr/hdp//hadoop/lib: No such file or directory
Exception in thread "main" java.lang.IllegalStateException: hdp.version is not set while running Spark under HDP, please set through HDP_VERSION in spark-env.sh or add a java-opts file in conf with -Dhdp.version=xxx
When I switch back and remove the Python 3 libraries from $PATH, it works again.
Is there a better way to configure my environment so that everything works while staying manageable and easy to maintain?
I was thinking of creating symlinks in /var/lib for the files that need to be found, but I don’t know how many will be needed, and I don’t want to create links for everything except python3.
Any comments will be highly appreciated.
Kind Regards, Paul
I ran into the same error. After investigating, I tracked down its source here. It looks like Zeppelin defaults to "/bin/conda" as the path for conda.
I was able to fix it by doing the following:
ln -s /opt/anaconda3/bin/conda /bin/conda
ln -s /opt/anaconda3/bin/python /bin/python
/opt/anaconda3/bin/python3
export PYTHONPATH=/opt/anaconda3/bin
Looks like there is also a JIRA issue for this behavior here.
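If you would rather not replace the system python (hdp-select in the question still expects Python 2), one option is to link only the commands that are missing and leave everything else alone. The following is a minimal sketch of that idea, not a tested Zeppelin recipe: the helper name link_if_missing and the scratch directory layout are made up for illustration, and it assumes that linking conda alone is enough for %python.conda, which I have not verified.

```shell
#!/bin/sh
# link_if_missing SRC DEST (hypothetical helper): create a symlink at DEST
# pointing to SRC, but refuse to replace anything already present at DEST.
link_if_missing() {
    src="$1"
    dest="$2"
    if [ -e "$dest" ] || [ -L "$dest" ]; then
        echo "skip: $dest already exists" >&2
        return 1
    fi
    ln -s "$src" "$dest"
}

# Demonstration in a scratch directory, so nothing under /bin is touched.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/anaconda_bin" "$demo_dir/bin"
touch "$demo_dir/anaconda_bin/conda" "$demo_dir/anaconda_bin/python"
touch "$demo_dir/bin/python"   # stands in for the existing system python

# conda is missing from the target directory, so the link is created.
link_if_missing "$demo_dir/anaconda_bin/conda" "$demo_dir/bin/conda"

# python already exists in the target directory, so it is left untouched.
link_if_missing "$demo_dir/anaconda_bin/python" "$demo_dir/bin/python" || true
```

On a real system you would call it with the actual paths, for example link_if_missing /opt/anaconda3/bin/conda /bin/conda, and skip the python link so that the default interpreter stays at 2.7.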