Obtenez par e-mail toute l'actualité Hortonworks

Une fois par mois, recevez les dernières idées, tendances, informations d’analyse et découvertes sur le Big Data.

Sign up for the Developers Newsletter

Une fois par mois, recevez les dernières idées, tendances, informations d’analyse et découvertes sur le Big Data.

cta

Démarrer

cloud

Prêt à débuter ?

Télécharger Sandbox

Que pouvons-nous faire pour vous ?

* Je comprends que je peux me désabonner à tout moment. J'ai également compris les informations supplémentaires fournies dans la Politique de confidentialité de Hortonworks.
fermerBouton Fermer
HDP > Développez avec Hadoop > Apache Spark

Spark on YARN Example

cloud Prêt à débuter ?

TÉLÉCHARGER SANDBOX

Introduction

In this brief tutorial you will run a pre-built Spark example on YARN

Prerequisites

This tutorial assumes that you are running an HDP Sandbox.

Please ensure you complete the prerequisites before proceeding with this tutorial.

Pi Example

To test compute intensive tasks in Spark, the Pi example calculates pi by “throwing darts” at a circle. The example points in the unit square ((0,0) to (1,1)) and sees how many fall in the unit circle. The fraction should be pi/4, which is used to estimate Pi.

To calculate Pi with Spark in yarn-client mode.

Assuming you start as root user follow these steps depending on your Spark version.

Note: We have provided two examples one for Spark 2.x and another for Spark 1.6.x for reference.

Spark 2.x Version

From the Sandbox terminal (command line):

root@sandbox# export SPARK_MAJOR_VERSION=2
root@sandbox# cd /usr/hdp/current/spark2-client
root@sandbox spark2-client# su spark
spark@sandbox spark2-client$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 examples/jars/spark-examples*.jar 10

Spark 1.6.x Version

From the Sandbox terminal (command line):

root@sandbox# export SPARK_MAJOR_VERSION=1
root@sandbox# cd /usr/hdp/current/spark-client
root@sandbox spark-client# su spark
spark@sandbox spark-client$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/spark-examples*.jar 10

Note: The Pi job should complete without any failure messages and produce output similar to below, note the value of Pi in the output message:

...
16/02/25 21:27:11 INFO YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/02/25 21:27:11 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:36, took 19.346544 s
Pi is roughly 3.143648
16/02/25 21:27:12 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
...

Next steps

At this point, you might want to set up a complete development environment for writing and debugging your Spark applications.

Checkout one of the following tutorials on how to set up a full development environment for either Python, Scala, or Java.

User Reviews

User Rating
0 No Reviews
5 Star 0%
4 Star 0%
3 Star 0%
2 Star 0%
1 Star 0%
Tutorial Name
Spark on YARN Example

To ask a question, or find an answer, please visit the Hortonworks Community Connection.

No Reviews
Write Review

Inscription

Please register to write a review

Share Your Experience

Example: Best Tutorial Ever

You must write at least 50 characters for this field.

Success

Thank you for sharing your review!