I previously had PySpark installed as a Python package through pip. I recently uninstalled it, set up a clean version of Python, and downloaded the standalone Spark distribution instead.
In my User variables I created a variable named SPARK_HOME
with the value: C:\spark-2.3.2-bin-hadoop2.7\bin
In System variables, under Path, I added the entry: C:\spark-2.3.2-bin-hadoop2.7\bin
When I run pyspark from the command line it fails, and I cannot run spark-shell either. Any ideas?
SPARK_HOME should point to the Spark root folder, not its bin subfolder. Hence,
set SPARK_HOME to C:\spark-2.3.2-bin-hadoop2.7\
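As a quick check (a minimal Python sketch, not part of the original answer; it assumes the folder layout described above), you can verify that SPARK_HOME points at the Spark root rather than its bin subfolder:

    import os
    from pathlib import Path

    # Assumes SPARK_HOME should be the Spark root, e.g. C:\spark-2.3.2-bin-hadoop2.7
    spark_home = Path(os.environ.get("SPARK_HOME", ""))

    # The launch scripts live directly under <SPARK_HOME>\bin
    for script in ("pyspark.cmd", "spark-shell.cmd"):
        candidate = spark_home / "bin" / script
        print(candidate, "found" if candidate.exists() else "MISSING")

    # If SPARK_HOME mistakenly ends in \bin, the scripts are looked up under
    # bin\bin and will be reported MISSING.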
Windows users also have to download a compatible winutils.exe and save it in Spark's bin folder.
Find the winutils build that matches your Spark's Hadoop version, download it, and save it in your Spark folder.
E.g. download https://github.com/steveloughran/winutils/blob/master/hadoop-2.7.1/bin/winutils.exe and save it in C:\spark-2.3.2-bin-hadoop2.7\bin
Different winutils versions can be found at https://github.com/steveloughran/winutils
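To confirm the winutils setup (again a sketch with assumed paths, not from the original answer), you can check that winutils.exe landed in Spark's bin folder and point HADOOP_HOME at the same root, which Spark reads on Windows:

    import os
    from pathlib import Path

    # Assumed install location from the steps above
    spark_home = Path(r"C:\spark-2.3.2-bin-hadoop2.7")
    winutils = spark_home / "bin" / "winutils.exe"
    print(winutils, "found" if winutils.exists() else "MISSING")

    # Spark on Windows looks for %HADOOP_HOME%\bin\winutils.exe, so reusing the
    # Spark root works here because winutils.exe was saved into its bin folder.
    os.environ.setdefault("HADOOP_HOME", str(spark_home))
    os.environ.setdefault("SPARK_HOME", str(spark_home))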