Spark Execution Modes
Mode of Execution
- The execution mode determines where your application's resources (the driver and the executors) physically run when you submit an application or job. Spark offers three modes:
- Cluster Mode
- Client Mode
- Local Mode
Cluster Mode
- In cluster mode, the Spark driver program runs within the Spark cluster itself.
- The cluster manager (e.g., YARN, Mesos, or Spark's standalone cluster manager) launches the driver on one of the cluster's worker nodes rather than on the submitting machine; on YARN, for example, the driver runs inside the ApplicationMaster container.
- When a Spark application is submitted in cluster mode, the application code therefore executes entirely on the cluster's resources, and the submitting machine does not need to stay connected for the job to finish.
- Both the driver and executor processes run within the cluster.
- Suitable for production deployments where resources are managed centrally by a cluster manager.
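
As a rough illustration of the points above, the sketch below assembles a `spark-submit` command line that requests cluster mode. The `--master` and `--deploy-mode` options are standard `spark-submit` flags; the helper name `cluster_mode_command` and the application file `my_app.py` are hypothetical placeholders, and `yarn` is only an assumed default for the cluster manager.

```python
import shlex


def cluster_mode_command(app_path, master="yarn"):
    """Build a spark-submit argument list that asks for cluster deploy mode.

    app_path and master are illustrative; substitute your own application
    and cluster manager URL.
    """
    return [
        "spark-submit",
        "--master", master,          # which cluster manager to contact
        "--deploy-mode", "cluster",  # run the driver inside the cluster
        app_path,
    ]


cmd = cluster_mode_command("my_app.py")  # hypothetical application file
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run` on a machine where Spark is installed; printing it first makes it easy to check what would be submitted.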
Client Mode
- In client mode, the Spark driver program runs on the machine that submits the Spark application (often referred to as the client machine).
- The client machine typically resides outside of the Spark cluster.
- When a Spark application is submitted in client mode, the driver program is launched on the client machine, while the executor processes are launched on the cluster nodes by the cluster manager.
- It offers better visibility into the application's execution, as the driver program runs on the client machine where users have direct access for monitoring and debugging.
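- The client machine must keep network access to the cluster for the lifetime of the application, because the executors communicate with the driver running on it. A distant or unreliable client therefore adds network latency, and opening the cluster to an outside machine raises security concerns.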
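
The networking requirement can be made concrete with a small sketch. It builds a client-mode `spark-submit` command and pins the address that executors use to reach the driver through the `spark.driver.host` and `spark.driver.port` configuration properties. The helper name, the host name, and the port number below are assumptions chosen for illustration, not values from any real deployment.

```python
import shlex


def client_mode_command(app_path, driver_host, driver_port, master="yarn"):
    """Build a spark-submit argument list for client deploy mode.

    driver_host and driver_port describe the client machine as seen from
    the cluster; both are placeholders to be replaced with real values.
    """
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", "client",  # driver stays on the submitting machine
        "--conf", f"spark.driver.host={driver_host}",
        "--conf", f"spark.driver.port={driver_port}",
        app_path,
    ]


# Hypothetical host, port, and application file.
cmd = client_mode_command("my_app.py", "client.example.internal", 40000)
print(shlex.join(cmd))
```

Fixing the driver port is optional, but it can simplify firewall rules when the client sits outside the cluster network.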
Local Mode
- In local mode, the driver and the executors run as threads within a single JVM (Java Virtual Machine) on a single machine, typically the developer's own machine.
- This mode is primarily used for development, testing, and debugging. Developers can experiment with Spark applications without a full Spark cluster, which shortens development iterations and makes troubleshooting easier.
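
Local mode is selected with a master URL of the form `local`, `local[N]`, or `local[*]`, which mean one worker thread, N worker threads, and one thread per logical core respectively. The sketch below interprets those three forms to show how the thread count follows from the URL. It is an illustration only, not Spark's own parser, and the function name `local_thread_count` is hypothetical.

```python
import os
import re


def local_thread_count(master):
    """Return the number of worker threads implied by a local master URL.

    Handles 'local', 'local[N]' and 'local[*]'; anything else raises
    ValueError. Other variants that Spark accepts are not covered here.
    """
    if master == "local":
        return 1
    match = re.fullmatch(r"local\[(\*|[1-9]\d*)\]", master)
    if match is None:
        raise ValueError(f"not a supported local master URL: {master!r}")
    value = match.group(1)
    if value == "*":
        # One thread per logical core; fall back to 1 if the count is unknown.
        return os.cpu_count() or 1
    return int(value)


for url in ("local", "local[4]", "local[*]"):
    print(url, "->", local_thread_count(url), "thread(s)")
```

In an application, the same URL would be passed as the master, for example via `spark-submit --master "local[4]"`.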