Spark Execution Modes

Mode of Execution

  • The mode of execution determines where your application's resources (the driver and the executor processes) are physically located when you run your application or job. Spark offers three modes:
    • Cluster Mode
    • Client Mode
    • Local Mode
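
As a rough illustration, the three modes correspond to different `--master` and `--deploy-mode` values passed to `spark-submit`. The sketch below assumes YARN as the cluster manager for the cluster and client modes; the names `MODE_FLAGS` and `submit_flags` are hypothetical, and the helper only builds an argument list without launching anything.

```python
# Hypothetical helper: maps an execution mode to spark-submit flags.
# Assumption: YARN is the cluster manager for cluster and client modes.
MODE_FLAGS = {
    "cluster": ["--master", "yarn", "--deploy-mode", "cluster"],
    "client": ["--master", "yarn", "--deploy-mode", "client"],
    # local[*] runs Spark in one JVM with one worker thread per logical core.
    "local": ["--master", "local[*]"],
}


def submit_flags(mode):
    """Return a copy of the spark-submit flags for the given execution mode."""
    try:
        return list(MODE_FLAGS[mode])
    except KeyError:
        raise ValueError(f"unknown execution mode: {mode!r}") from None


if __name__ == "__main__":
    for mode in ("cluster", "client", "local"):
        print(mode, "->", " ".join(submit_flags(mode)))
```
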
Cluster Mode
  • In cluster mode, the Spark driver program runs within the Spark cluster itself. 
  • The driver program is launched on one of the worker nodes in the cluster, chosen by the cluster manager (e.g., YARN, Mesos, or Spark's standalone cluster manager). On YARN, for example, the driver runs inside the ApplicationMaster container.
  • When a Spark application is submitted in cluster mode, the submitting machine only hands the application over to the cluster manager; the application code then executes entirely on the cluster's resources.
  • Both the driver and the executor processes run within the cluster, so the job does not depend on the submitting machine staying connected.
  • Suitable for production deployments where resources are managed centrally by a cluster manager.
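
A cluster-mode submission could be assembled as sketched below. It assumes YARN as the cluster manager; the function name `cluster_submit_command` and the script name `etl_job.py` are placeholders. `spark.yarn.submit.waitAppCompletion` is a YARN-specific property: setting it to `false` lets `spark-submit` exit right after submission, which is workable here because the driver lives on the cluster rather than on the submitting machine.

```python
import shlex


def cluster_submit_command(app_path, wait_for_completion=True):
    """Build a spark-submit argument list for YARN cluster mode (a sketch)."""
    command = [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",  # driver is launched inside the cluster
    ]
    if not wait_for_completion:
        # YARN-specific: let spark-submit exit right after submission.
        command += ["--conf", "spark.yarn.submit.waitAppCompletion=false"]
    command.append(app_path)
    return command


if __name__ == "__main__":
    # "etl_job.py" is a placeholder application script.
    print(shlex.join(cluster_submit_command("etl_job.py", wait_for_completion=False)))
```
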
Client Mode
  • In client mode, the Spark driver program runs on the machine that submits the Spark application (often referred to as the client machine). 
  • The client machine is typically a gateway (edge) node or a developer workstation rather than one of the cluster's worker nodes.
  • When a Spark application is submitted in client mode, the driver program is launched on the client machine, while the executor processes are launched on the cluster nodes by the cluster manager.
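  • Client mode offers better visibility into the application's execution: the driver's output and logs appear directly on the client machine, where users have direct access for monitoring and debugging.
  • The client machine needs network access to the cluster, and the executors must be able to connect back to the driver. A client located far from the cluster adds network latency to driver-executor communication, and opening these connections may raise security concerns.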
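
Because the executors connect back to the driver in client mode, the driver's address has to be reachable from the cluster nodes. The sketch below builds a client-mode `spark-submit` command that optionally pins the driver's host and port through the `spark.driver.host` and `spark.driver.port` properties. YARN as the cluster manager, the function name `client_submit_command`, the address `10.0.0.5`, the port `40000`, and the script name are all assumptions made for illustration.

```python
import shlex


def client_submit_command(app_path, driver_host=None, driver_port=None):
    """Build a spark-submit argument list for YARN client mode (a sketch)."""
    command = ["spark-submit", "--master", "yarn", "--deploy-mode", "client"]
    if driver_host is not None:
        # Address the executors use to reach the driver on the client machine.
        command += ["--conf", f"spark.driver.host={driver_host}"]
    if driver_port is not None:
        # A fixed port, e.g. so that a firewall rule can allow it.
        command += ["--conf", f"spark.driver.port={driver_port}"]
    command.append(app_path)
    return command


if __name__ == "__main__":
    # Placeholder host, port, and script name.
    print(shlex.join(client_submit_command("report_job.py", "10.0.0.5", 40000)))
```
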
Local Mode
  • "Local mode" refers to a deployment configuration where both the driver program and the executor run within a single JVM (Java Virtual Machine) on a single machine, typically the developer's local machine. Tasks are executed by worker threads inside that JVM rather than by separate executor processes.
  • This mode is primarily used for development, testing, and debugging. Developers can experiment with Spark applications without a full-fledged Spark cluster, which enables faster development iterations and easier troubleshooting.
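
Local mode is selected through the master URL: `local` uses one worker thread, `local[N]` uses N threads, and `local[*]` uses one thread per logical core. The sketch below builds such a URL and, assuming PySpark is installed, passes it to `SparkSession.builder`. The helper names `local_master` and `make_local_session` are hypothetical.

```python
def local_master(threads=None):
    """Return a local-mode master URL: local[*] or local[N]."""
    if threads is None:
        return "local[*]"  # one worker thread per logical core
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return f"local[{threads}]"


def make_local_session(app_name, threads=None):
    """Create a SparkSession running in local mode (requires PySpark)."""
    # Imported lazily so local_master() works without PySpark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master(local_master(threads))
        .appName(app_name)
        .getOrCreate()
    )


if __name__ == "__main__":
    print(local_master())   # master URL using all logical cores
    print(local_master(2))  # master URL using two worker threads
```

Calling `make_local_session("demo")` would start the driver and its worker threads inside the current Python-launched JVM, so no cluster manager is involved.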
