Using the interactive Ray Cluster for vLLM
- This example demonstrates how to connect to the Practicus AI Ray cluster we created and execute vLLM + Ray operations on it.
- Please run this example on the Ray Coordinator (master).
import practicuscore as prt
# Let's get a Ray session.
# This is similar to running `import ray` followed by `ray.init()`.
ray = prt.distributed.get_client()
from vllm import LLM, SamplingParams
prompts = [
    "Mexico is famous for ",
    "The largest country in the world is ",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
llm = LLM(model="facebook/opt-125m")
responses = llm.generate(prompts, sampling_params)
for response in responses:
    print(response.outputs[0].text)
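The two `SamplingParams` values control how random the generated text is: `temperature` rescales the logits before they are turned into probabilities (lower values sharpen the distribution), and `top_p=0.95` enables nucleus sampling, which keeps only the smallest set of tokens whose cumulative probability reaches 0.95. A minimal pure-Python sketch of the idea (illustrative only, not vLLM's actual implementation):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens them."""
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    m = max(scaled.values())  # subtract max for numerical stability
    exps = {tok: math.exp(v - m) for tok, v in scaled.items()}
    z = sum(exps.values())
    return {tok: v / z for tok, v in exps.items()}

def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability
    reaches top_p, then renormalize (nucleus sampling)."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        total += p
        if total >= top_p:
            break
    return {token: p / total for token, p in kept}

# Toy vocabulary and logits (hypothetical values for illustration).
logits = {"city": 2.0, "beaches": 1.5, "food": 1.0, "xyzzy": -3.0}
probs = softmax_with_temperature(logits, temperature=0.8)
filtered = top_p_filter(probs, top_p=0.95)
print(filtered)  # the very unlikely "xyzzy" token is dropped
```

After filtering, vLLM samples the next token from the renormalized distribution; raising `temperature` or `top_p` widens the candidate pool and produces more varied completions.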
Ray Dashboard
Practicus AI Ray offers an interactive dashboard where you can view execution details. Let's open the dashboard.
dashboard_url = prt.distributed.open_dashboard()
print("Page did not open? You can open this url manually:", dashboard_url)
Terminating the cluster
- You can go back to the worker where you created the cluster and run the termination steps there.
Previous: Start Cluster | Next: Unified DevOps > Introduction