Skip to content

Using the interactive Spark Cluster Client

  • This example demonstrates how to connect to the Practicus AI Spark cluster we created, and execute simple Spark operations.
  • Please run this example on the Spark Coordinator (master).
import practicuscore as prt

# Let's get a Spark session
spark = prt.distributed.get_client()
# And execute some code
data = [("Alice", 29), ("Bob", 34), ("Cathy", 23)]
columns = ["Name", "Age"]

# Create a DataFrame
df = spark.createDataFrame(data, columns)

# Perform a transformation
df_filtered = df.filter(df.Age > 30)

# Show results
df_filtered.show()
# Let's end the session
spark.stop()

Terminating the cluster

  • You can go back to the other worker where you created the cluster to run:

coordinator_worker.terminate()
- Or, terminate "self" and children workers with the below:

prt.get_local_worker().terminate()

Troubleshooting

If you’re experiencing issues with an interactive cluster that doesn’t run job/train.py, please follow these steps:

  1. Agent Count Mismatch: If the number of distributed agents shown by prt.distributed.get_client() is less than what you expected, wait a moment and then run get_client() again. This is usually because the agents have not yet joined the cluster. Note: Batch jobs automatically wait for agents to join.

  2. Viewing Logs: To view logs, navigate to the ~/my/.distributed folder.


Previous: Start Cluster | Next: Batch Job > Batch Job