Model Deployment Settings
Overview
This document describes how to configure model deployment settings within the Practicus AI platform. It covers the fields and options required to create, modify, and manage model deployments efficiently.
Adding a Model Deployment
Key Fields
- Key (Required): Unique identifier for the deployment.
- Name (Required): Human-readable name for the deployment.
- Model Object Store (Required): The storage system containing the model files.
- Worker Type (Required): Defines the capacity (e.g., Small, Large) of the worker used for the deployment.
- Default Replica (Required): Specifies the default number of pods to run.
- Auto Scaled: Enables dynamic scaling of pod counts based on workload.
- Min Replica: Minimum number of pods when auto-scaling is enabled.
- Max Replica: Maximum number of pods when auto-scaling is enabled.
- Enable Observability: Activates metrics collection for external systems like Prometheus.
- Log Level: Sets the granularity of logs (e.g., DEBUG, INFO).
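To make these fields concrete, the sketch below collects them into a plain Python mapping. The field names and values are illustrative assumptions based on the form labels above, not a documented API schema.

```python
# Illustrative only: field names mirror the form labels, not a documented schema.
deployment_settings = {
    "key": "churn-model-prod",              # Key: unique identifier for the deployment
    "name": "Churn Model (Production)",     # Name: human-readable label
    "model_object_store": "models-bucket",  # storage system holding the model files
    "worker_type": "Small",                 # worker capacity, e.g. Small or Large
    "default_replica": 2,                   # default number of pods
    "auto_scaled": True,                    # scale pod count with workload
    "min_replica": 1,                       # lower bound when auto-scaling
    "max_replica": 10,                      # upper bound when auto-scaling
    "enable_observability": True,           # expose metrics to systems such as Prometheus
    "log_level": "INFO",                    # e.g. DEBUG or INFO
}
```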
Advanced Options
- Node Selector: Assigns deployments to specific Kubernetes nodes using labels.
- Custom Image: Allows selecting or defining a custom container image.
- Startup Script: Shell commands executed before starting the API endpoint.
- Traffic Log Object Store: Specifies where request data and prediction logs are stored.
- Deployment Group Accesses: Defines groups with access permissions.
- Deployment User Accesses: Specifies individual user access rights.
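The advanced options can be pictured the same way. Every value below is a hypothetical example (the node selector label, image reference, and store names are assumptions, not platform defaults).

```python
# Continuing the illustration above; all values are hypothetical examples.
advanced_settings = {
    "node_selector": {"node-pool": "gpu"},                  # assumed Kubernetes node label
    "custom_image": "registry.example.com/my-model:1.2.0",  # custom container image
    "startup_script": "pip install extra-package",          # runs before the API endpoint starts
    "traffic_log_object_store": "traffic-logs-bucket",      # destination for request/prediction logs
    "deployment_group_accesses": ["data-science"],          # groups with access
    "deployment_user_accesses": ["alice@example.com"],      # individual users with access
}
```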
Steps to Add a Deployment
- Navigate to ML Model Hosting > Model Deployments.
- Click Add Model Deployment.
- Fill in the required fields in the "Add Model Deployment" form.
- Specify advanced configurations, if necessary.
- Save the deployment using one of the options:
- Save and add another: Save and open a new form.
- Save and continue editing: Save and remain on the current form.
- Save: Save and return to the main list.
Managing Deployment Settings
Viewing Deployment Settings
- Navigate to the list of deployments in ML Model Hosting > Model Deployments.
- Select a deployment to view or modify its settings.
Key Information Displayed
- Model Object Store
- Worker Type
- Default Replica Count
- Observability Settings
- Traffic Log Object Store
Modifying Deployments
- Select the deployment to modify from the list.
- Update necessary fields.
- Save the changes using the appropriate option.
Observability Settings
Core Metrics
- Enables tracking of essential metrics like request count and total prediction time.
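Request counts and prediction times are the kind of metrics typically exposed to Prometheus as counters and histograms. The sketch below shows that general pattern with the prometheus_client library; the metric names and the predict() wrapper are assumptions, not the platform's internal implementation.

```python
from prometheus_client import Counter, Histogram, start_http_server
import time

# Generic Prometheus-style instrumentation; names are illustrative only.
REQUEST_COUNT = Counter("model_requests_total", "Total prediction requests served")
PREDICTION_TIME = Histogram("model_prediction_seconds", "Time spent producing predictions")

def predict(features):
    REQUEST_COUNT.inc()
    with PREDICTION_TIME.time():
        time.sleep(0.01)  # placeholder for real model inference
        return 0.42

if __name__ == "__main__":
    start_http_server(9100)  # exposes /metrics for Prometheus to scrape
    while True:
        predict([1.0, 2.0])
        time.sleep(1)
```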
Model Drift Detection
- Activates comparisons between predicted results and ground truth to detect model drift.
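A minimal sketch of the idea: once ground-truth labels arrive, compare them with the logged predictions and flag drift when accuracy falls well below an expected baseline. The function name, baseline, and tolerance below are assumptions chosen for illustration.

```python
# Minimal drift-check sketch; baseline and tolerance are assumed values.
def drift_detected(predictions, ground_truth, baseline_accuracy=0.90, tolerance=0.05):
    """Return True if accuracy over this window falls notably below the baseline."""
    correct = sum(1 for p, y in zip(predictions, ground_truth) if p == y)
    accuracy = correct / len(predictions)
    return accuracy < baseline_accuracy - tolerance

# Example: accuracy over this window is 0.4, below the 0.85 cutoff, so drift is flagged.
print(drift_detected([1, 0, 1, 1, 0], [1, 1, 0, 1, 1]))  # True
```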
Logging
- Prediction Percentiles: Logs percentiles of prediction results for analysis.
- Custom Metrics: Allows defining custom metrics in Python.
- Traffic Logging: Logs request and prediction data in a compatible format.
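Because custom metrics are defined in Python, they might take the form of a function that receives a batch of logged predictions and returns named values. The function name, signature, and column name below are assumptions for illustration, not the platform's documented interface.

```python
import pandas as pd

# Hypothetical custom-metric function; name, signature, and columns are illustrative.
def calculate_custom_metrics(predictions: pd.DataFrame) -> dict:
    """Return metric name -> value pairs computed over a batch of predictions."""
    return {
        "mean_prediction": float(predictions["prediction"].mean()),
        "high_risk_ratio": float((predictions["prediction"] > 0.8).mean()),
    }

# Example batch of logged predictions
batch = pd.DataFrame({"prediction": [0.2, 0.9, 0.95, 0.4]})
print(calculate_custom_metrics(batch))  # {'mean_prediction': 0.6125, 'high_risk_ratio': 0.5}
```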
Advanced Logging Options
- Log Batch Rows: Sets the number of rows per log batch.
- Log Batch Minutes: Defines time intervals for flushing logs.
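Together, these two settings imply a simple flush rule: a batch is written as soon as either the row limit or the time limit is reached. The sketch below illustrates that rule with assumed values; it is not the platform's logging code.

```python
import time

# Sketch of the flush rule implied by Log Batch Rows and Log Batch Minutes.
class TrafficLogBuffer:
    def __init__(self, batch_rows=1000, batch_minutes=5):
        self.batch_rows = batch_rows
        self.batch_minutes = batch_minutes
        self.rows = []
        self.last_flush = time.time()

    def log_row(self, row):
        self.rows.append(row)
        elapsed_minutes = (time.time() - self.last_flush) / 60
        if len(self.rows) >= self.batch_rows or elapsed_minutes >= self.batch_minutes:
            self.flush()

    def flush(self):
        print(f"Writing {len(self.rows)} rows to the traffic log object store")
        self.rows.clear()
        self.last_flush = time.time()

buffer = TrafficLogBuffer()
for i in range(2500):
    buffer.log_row({"request_id": i, "prediction": 0.5})
# Two batches of 1000 rows are flushed; the remaining 500 wait for the next limit.
```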
Model Deployment Best Practices
- Use auto-scaling for deployments with variable workloads.
- Enable observability for monitoring model performance and drift.
- Leverage custom images for deployments requiring specialized environments.
- Use a traffic log object store for centralized logging and analysis.