## MacOS
Here we'll show you how to install Spark 3.2.1 on MacOS.
We tested it on MacOS Monterey 12.0.1, but it should work for other MacOS versions as well.
### Installing Java
Ensure Brew and Java are installed on your system:
```bash
xcode-select --install
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
brew install java
```
Add the following environment variables to your `.bash_profile` or `.zshrc`:
```bash
export JAVA_HOME=/usr/local/Cellar/openjdk@11/11.0.12
export PATH="$JAVA_HOME/bin/:$PATH"
```
Make sure Java was installed to `/usr/local/Cellar/openjdk@11/11.0.12`: open Finder, press Cmd+Shift+G, and paste `/usr/local/Cellar/openjdk@11/11.0.12`. If you can't find it there, change the path above to the appropriate location on your machine. You can also run `brew info java` to check where Java was installed.
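Once the variables are in place, here's a quick sanity check (a minimal sketch, assuming a zsh shell; with bash, source `.bash_profile` instead):
```bash
# Reload the shell config so the new variables take effect
source ~/.zshrc

# Both of these should point at the Homebrew OpenJDK install
echo $JAVA_HOME
java -version
```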
### Installing Spark
1. Install Scala
```bash
brew install scala@2.11
```
2. Install Apache Spark
```bash
brew install apache-spark
```
3. Add environment variables:
Add the following environment variables to your `.bash_profile` or `.zshrc`. Replace the `SPARK_HOME` path with the path on your own host; run `brew info apache-spark` to find it.
```bash
export SPARK_HOME=/usr/local/Cellar/apache-spark/3.2.1/libexec
export PATH="$SPARK_HOME/bin/:$PATH"
```
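As a quick check that the variables are picked up (again assuming zsh; source `.bash_profile` with bash):
```bash
# Reload the shell config and confirm spark-shell resolves to the Homebrew install
source ~/.zshrc
echo $SPARK_HOME
spark-shell --version
```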
### Testing Spark
Execute `spark-shell` and run the following in Scala:
```scala
val data = 1 to 10000
val distData = sc.parallelize(data)
distData.filter(_ < 10).collect()
```
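If everything is set up correctly, the last call should return the values below 10, i.e. something like `Array(1, 2, 3, 4, 5, 6, 7, 8, 9)`.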
### PySpark
It's the same for all platforms. Go to [pyspark.md](pyspark.md).