PySpark

How to install Apache Spark on Mac OS X Yosemite

Hello data scientists,

This is a quick guide to installing Apache Spark on your local machine. I found the documentation on the website a little confusing.

1. Download the Apache Spark tar file from http://spark.apache.org/downloads.html. [Choose any version you like from the dropdown menu; I recommend 1.3.1 or above. Pick the source code package, since step 4 builds Spark from source.]

2. Extract the tar file into your home directory.

3. Open your terminal and change into the Spark directory with cd spark-1.3.1 [assuming you are in your home directory and downloaded version 1.3.1]

4. Now, simply run

build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package

5. It takes at least 10 minutes to complete the whole build.

6. After the build has completed, the output should look something like the following:

Build Success

7. Now run

./bin/run-example SparkPi 10

8. You should see something like this:

[Screenshot: terminal output of the SparkPi example]

As you can see, the output reports that the job has finished, which means Spark is running successfully on your machine 🙂
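If you are curious what the SparkPi example from step 7 is doing: it estimates π by random sampling, and the 10 is, as far as I can tell from the bundled example, the number of slices the work is split into. Below is a rough PySpark sketch of the same idea, not the bundled example itself. The function names inside_unit_circle and estimate_pi are my own, and the sketch assumes you run it inside the ./bin/pyspark shell, where a SparkContext named sc is already defined.

```python
import random


def inside_unit_circle(_):
    # Draw a random point in the square [-1, 1] x [-1, 1] and
    # report 1 if it falls inside the unit circle, otherwise 0.
    x = random.uniform(-1.0, 1.0)
    y = random.uniform(-1.0, 1.0)
    return 1 if x * x + y * y <= 1.0 else 0


def estimate_pi(sc, slices=10, samples_per_slice=100000):
    # sc is assumed to be an existing SparkContext (the pyspark shell
    # provides one). The samples are spread over `slices` partitions.
    n = samples_per_slice * slices
    hits = sc.parallelize(range(n), slices).map(inside_unit_circle).sum()
    # The circle covers pi/4 of the square, so scale the hit ratio by 4.
    return 4.0 * hits / n
```

In the pyspark shell you would then call print(estimate_pi(sc, slices=10)) and should get a value close to π. More slices means more samples and a slightly better estimate.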

Note: I am assuming you have Java installed properly on your machine. This is very important.
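If you are not sure whether Java is set up, here is a small sketch that checks it before you start the build. It only runs java -version and looks at the exit status; the helper name java_available is my own, and it does not check which Java version you have.

```python
import os
import subprocess


def java_available():
    # Return True if `java -version` can be started and exits with status 0.
    try:
        with open(os.devnull, "w") as devnull:
            status = subprocess.call(
                ["java", "-version"], stdout=devnull, stderr=devnull
            )
    except OSError:
        # No `java` executable was found on the PATH.
        return False
    return status == 0


if java_available():
    print("Java found, you can go ahead with the build.")
else:
    print("Java does not seem to be installed, install it first.")
```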
