User Guide
In-depth guides for writing, configuring, scheduling, monitoring, and operating Spark applications with the Kubeflow Spark Operator.
Working with SparkApplications
Create, list, check the status of, and delete SparkApplication objects
The full anatomy of a SparkApplication spec: application types, dependencies, pods, volumes, and more
Restart policies, failure handling, and managing running applications
Use ScheduledSparkApplication to run Spark jobs on a cron schedule
Operating the Operator
Tune operator behavior, flags, and Helm chart values
Run the operator in high-availability mode with leader election
Deploy several operator instances scoped to different namespaces
Enforce Kubernetes resource quotas on Spark workloads
Monitoring & Scheduling
Export Spark metrics to Prometheus using the JMX exporter
Batch scheduling and gang scheduling with Volcano
Resource-aware batch scheduling with Apache YuniKorn
Integrations
Read and write data with GCS and BigQuery on GKE
Run PySpark jobs from Kubeflow Notebooks