Alibabacloud Pai Eas Service Deploy
by @sdk-team
clawhub install alibabacloud-pai-eas-service-deploy
About This Skill
name: alibabacloud-pai-eas-service-deploy
description: |
  Deploy AI models as PAI-EAS inference services. Supports LLMs (Qwen, Llama), image gen (SD, SDXL), speech synthesis, and more.
  When to use: deploy models, create inference services, EAS deployment, model serving, deploy vLLM/SGLang/ComfyUI.
license: Apache-2.0
metadata:
  version: "1.0.0"
  domain: aiops
  owner: pai-eas-team
  contact: pai-eas-agent@alibaba-inc.com
  tags:
    - pai-eas
    - model-deployment
    - inference-service
    - llm
    - vllm
    - sglang
  required_tools:
    - aliyun
    - jq
  prerequisites:
    - "Aliyun CLI >= 3.3.1"
    - "jq command-line JSON processor"
  required_permissions:
    - "eas:CreateService"
    - "eas:DescribeService"
    - "eas:ListServices"
    - "eas:DescribeMachineSpec"
    - "eas:ListResources"
    - "eas:ListGateway"
    - "eas:DescribeGateway"
    - "nlb:ListLoadBalancers"
    - "aiworkspace:ListImages"
    - "aiworkspace:ListWorkspaces"
    - "vpc:DescribeVpcs"
    - "vpc:DescribeVSwitches"
    - "ecs:DescribeSecurityGroups"
PAI-EAS Service Deployment
⚠️ TOP RULES (read first)
1. 🔴 NO DUPLICATE SERVICE NAMES 🔴
If a service with the target name already exists: STOP and inform the user. Do NOT delete and recreate. Do NOT reuse it either.
2. Mandatory API Calls. Execute ALL of these in order:
| # | API | CLI | Purpose |
|---|-----|-----|---------|
| 1 | ListImages | aliyun aiworkspace list-images | Validate image |
| 2 | describe-machine-spec | aliyun eas describe-machine-spec | Validate GPU type |
| 3 | create-service | aliyun eas create-service | Create service |
| 4 | describe-service | aliyun eas describe-service | Check status (once) |
| 5 | describe-service-endpoints | aliyun eas describe-service-endpoints | Get endpoints |
Execute #1 and #2 ALWAYS, even if user provided the info.
describe-machine-spec ≠ list-resources. describe-service ≠ ListServices.
3. Prohibited:
❌ Reuse existing service
❌ Write bash scripts (run CLI directly)
❌ CPU+vLLM/SGLang
❌ file:// in create-service
❌ Skip mandatory APIs
❌ Change the service name the user specified
❌ Poll describe-service in a loop (call once only)
4. Autonomous Execution: Do NOT ask user for info discoverable
via APIs. Do NOT ask "should I proceed?" Execute directly.
Timeout? Retry with --read-timeout 60. Error? Inform user and CONTINUE.
Missing param? Pick reasonable default.
If any pre-check or resource discovery step fails, log the failure
and continue to the next step. Only STOP for the specific conditions
listed in Self-Verify Checkpoints (duplicate service name, missing
NLB/GW/dedicated resource group).
5. Self-Verify Checkpoints:
Before Step 2: Did Step 1.5 confirm no duplicate service name?
If duplicate → STOP, inform user, do NOT proceed.
Before Step 5: Have I run list-images AND describe-machine-spec?
If NO → STOP and run them NOW.
Before Step 6: Did Step 1.5 confirm no duplicate service name?
If duplicate exists → STOP, inform user, do NOT proceed.
Step 4 resources: If NLB/GW/Resource Group not found → inform user and STOP.
Do NOT block or attempt workarounds.
After Step 7: Did I call describe-service once and report the status?
6. Run CLI commands DIRECTLY: Use execute_shell_command to run
each aliyun CLI command one at a time. Do NOT write bash scripts,
deployment scripts, or shell files. Do NOT use retry_command()
or any wrapper functions. Each command = one direct execution.
Pre-checks
aliyun configure ai-mode enable
aliyun configure ai-mode set-user-agent \
--user-agent "AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy"
aliyun plugin update
aliyun configure set --auto-plugin-install true
⚠️ MANDATORY: Install and update required plugins BEFORE proceeding:
aliyun plugin update
aliyun plugin install aiworkspace --force
aliyun plugin install eas --force
Verify plugins are installed:
aliyun plugin list 2>/dev/null | grep -E "aiworkspace|eas"
If either plugin is missing after install, retry with:
aliyun plugin install <plugin-name> --force
See CLI Guide.
Never read/echo AK/SK. Timeout? Retry with --read-timeout 60.
**⚠️ MANDATORY CHECKPOINT: After pre-checks, you MUST proceed to Step 1 regardless of any warnings during plugin installation. Plugin install failures are non-blocking, since the plugins may already be installed. NEVER stop execution at the pre-check stage.**
Step 1: Collect Parameters
| Param | Required | Default |
|-------|----------|---------|
| RegionId | Yes | - |
| Service Name | Yes | - |
| Workspace ID | No | Auto-select |
| Image URI | Yes | - |
| Instance Type | Yes | - |
| Replicas | No | 1 |
| Port | No | 8000 |
| OSS Path | No | - |
Service name: lowercase/digits/underscores only. No hyphens. 3-63 chars.
IMPORTANT: Use the EXACT service name the user specifies. Do NOT rename.
If the user specifies only a prefix (e.g. "starts with skill_qwen_"), generate a random 6-digit suffix (e.g. skill_qwen_482917).
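The naming rule and the random-suffix convention above can be sketched in plain bash. This is an illustrative sketch only: the function names `is_valid_service_name` and `with_random_suffix` are hypothetical helpers, not part of the skill or of the aliyun CLI, and the check covers only the character set and length stated above.

```shell
#!/usr/bin/env bash

# Sketch: check a candidate name against the stated rule
# (lowercase letters, digits, underscores; 3-63 characters; no hyphens).
is_valid_service_name() {
  local LC_ALL=C   # keep [a-z] strictly ASCII regardless of the caller's locale
  [[ "$1" =~ ^[a-z0-9_]{3,63}$ ]]
}

# Sketch: append a random 6-digit suffix to a user-supplied prefix.
# Two $RANDOM draws (15 bits each) are combined so every value from
# 000000 to 999999 is reachable.
with_random_suffix() {
  local prefix="$1"
  printf '%s%06d\n' "$prefix" $(( (RANDOM * 32768 + RANDOM) % 1000000 ))
}

name="$(with_random_suffix "skill_qwen_")"
if is_valid_service_name "$name"; then
  echo "ok: $name"
else
  echo "invalid: $name"
fi
```

A name the user gives in full should be passed through the check unchanged; the suffix helper applies only when the user supplied a prefix.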
Set profile region: Set the CLI profile region to match the
deployment region. This avoids "Region mismatch" errors when
--cluster-id differs from the profile's default region:
aliyun configure set --region <region>
Workspace ID: Required in metadata.workspace_id. If user does not
specify a workspace, query available workspaces and pick one:
aliyun aiworkspace list-workspaces --region <region> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '.Workspaces[] | select(.Status == "ENABLED") | {WorkspaceId, WorkspaceName}'
If multiple workspaces exist, list them and let the user choose.
If only one exists, use it directly.
Step 1.5: Check for Duplicate Service Name
aliyun eas list-services --region <region> --cluster-id <cluster-id> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '.ServiceList[] | select(.ServiceName == "<service-name>") | {ServiceName, Status}'
**If a service with the same name already exists → STOP and inform
the user that a service with this name already exists.**
If no duplicate → proceed to Step 2.
Step 2: ListImages (🚧 BLOCKING GATE: NEVER SKIP)
Execute even if user provided image URI. Purpose = VALIDATION.
**⚠️ If you see "parse error" or "Exit Code 4", the plugin failed to install. You MUST retry with explicit install:**
aliyun plugin install aiworkspace --force
Then retry the list-images command. Do NOT skip this step.
aliyun aiworkspace list-images --region <region> --verbose true \
--labels 'system.official=true,system.supported.eas=true' \
--page-size 50 --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '[.Images[] | {ImageId, ImageUri, ImageCategory}]' > /tmp/eas_images.json
echo "Saved $(jq length /tmp/eas_images.json) images"
If it still fails after the plugin reinstall, retry with a timeout:
aliyun aiworkspace list-images --region <region> --verbose true \
--labels 'system.official=true,system.supported.eas=true' \
--page-size 50 --read-timeout 60 \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '[.Images[] | {ImageId, ImageUri, ImageCategory}]' > /tmp/eas_images.json
echo "Saved $(jq length /tmp/eas_images.json) images"
See Model-Image.
This output is REQUIRED by Step 5. Do NOT skip.
❌ NEVER skip this step saying "user already provided the image".
Step 3: describe-machine-spec (🚧 BLOCKING GATE: NEVER SKIP)
Execute even if user specified instance type. NOT list-resources.
**⚠️ If this command fails, retry with --read-timeout 60.
NEVER proceed to Step 5 without /tmp/eas_specs.json existing.**
aliyun eas describe-machine-spec --region <region> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '[.InstanceTypes[] | select(.GPU > 0) | {InstanceType, CPU, GPU, Memory}]' \
> /tmp/eas_specs.json
echo "Available GPU specs:"; cat /tmp/eas_specs.json
CPU→GPU auto-switch: If vLLM/SGLang + CPU, silently pick GPU. Do NOT ask.
This output is REQUIRED by Step 5. Do NOT skip.
Step 4: Network & Resource Config
| Type | VPC | Config |
|------|-----|--------|
| Shared | No | (default, no networking fields) |
| Dedicated GW | Yes | networking.gateway + cloud.networking |
| NLB | Yes | networking.nlb + cloud.networking |
**⚠️ If a required resource does not exist → STOP and inform the user. Do NOT block or attempt workarounds. This is a valid outcome.**
Dedicated Gateway: Call list-gateway. If no gateway exists →
inform user and STOP. Otherwise call describe-gateway to get
VPC/VSwitch, then query security group under that VPC.
If no security group found → inform user and STOP.
aliyun eas list-gateway --region <region> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy
If gateway found, get details:
aliyun eas describe-gateway --region <region> --cluster-id <cluster-id> \
--gateway-id <gateway-id> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy
Extract the VPC ID and the comma-separated VSwitch IDs:
aliyun eas describe-gateway --region <region> --cluster-id <cluster-id> \
--gateway-id <gateway-id> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '{vpc_id: .LoadBalancerList[0].VpcId, vswitch_id: (.LoadBalancerList[0].VSwitchIds | join(","))}'
NLB: Requires VPC/VSwitch/SecurityGroup. If user does not provide
them, query via APIs. If any required resource not found → inform
user and STOP.
⚠️ NLB requires ≥2 VSwitches across different availability zones.
Use comma-separated format: "vswitch_id": "vsw-zone-a,vsw-zone-b".
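As a rough illustration of the comma-separated format, the count of VSwitch IDs in a `vswitch_id` value can be checked before it goes into service.json. The helper names `count_vswitches` and `has_enough_vswitches` are hypothetical, and the sketch checks only the count; whether the VSwitches are in different availability zones still has to be confirmed from the VPC query results.

```shell
#!/usr/bin/env bash

# Sketch: count the IDs in a comma-separated vswitch_id value.
count_vswitches() {
  local ids=()
  IFS=',' read -r -a ids <<< "$1"
  echo "${#ids[@]}"
}

# Succeeds only when at least two VSwitch IDs are present.
has_enough_vswitches() {
  (( $(count_vswitches "$1") >= 2 ))
}

vswitch_id="vsw-zone-a,vsw-zone-b"   # placeholder IDs taken from the format shown above
if has_enough_vswitches "$vswitch_id"; then
  echo "vswitch_id has enough entries for NLB: $vswitch_id"
else
  echo "NLB needs at least 2 VSwitches; got: $vswitch_id"
fi
```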
⚠️ NLB Plugin Bug (aliyun-cli-eas v0.2.0): If create-service with
NLB config returns 400 with 'vswitch can not be null' or
'vpcId, vswId and securityGroupId are required', this is a known
CLI plugin bug (not a resource issue). Fallback strategy:
1. Retry create-service with NLB config once more (max 2 attempts).
2. If both fail → Remove networking.nlb and cloud.networking from
service.json, redeploy with shared gateway.
3. Inform user: "NLB config failed due to CLI plugin limitation.
Deployed with shared gateway instead."
EAS Dedicated Resource Group: Call list-resources.
Filter for ResourceType == "Dedicated" and Status == "ResourceReady".
aliyun eas list-resources --region <region> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '.Resources[] | select(.ResourceType == "Dedicated" and .Status == "ResourceReady") | {ResourceId, ResourceType, Status}'
Use "metadata": {"resource": "<resource-id>"} instead of cloud.computing.
Step 5: Build Service JSON
β οΈ BEFORE building JSON, you MUST read these reference files:
- references/config-patterns.md: Complete JSON templates for all 8 patterns
- references/config-schema.md: Field descriptions and validation rules
- references/storage-mount.md: OSS/NAS mount configuration details
- references/network-config.md: NLB/Gateway network configuration details

**⚠️ HARD GATE: Before writing service.json, VERIFY these files exist and have content. If either is missing → STOP and run that Step NOW.**
test -s /tmp/eas_images.json || echo "MISSING: Run Step 2 NOW"
test -s /tmp/eas_specs.json || echo "MISSING: Run Step 3 NOW"
⚠️ JSON format rules:
- metadata, containers, storage, cloud, autoscaler, networking
- spec, ServiceName, Image, Cpu, Memory, Gpu, processor_path, resourceGroupId, instance, port, command, access
- processor_path, resourceGroupId, spec, access
- metadata.name = service name, metadata.workspace_id = workspace (REQUIRED)
- containers[].image = image URI, containers[].command = start command, containers[].port = port
- cloud.computing.instance_type = instance type (MANDATORY for shared gateway)

Quick Reference: JSON Skeletons
Below are minimal skeletons. **Read references/config-patterns.md for
complete templates with all fields and examples.**
Base (Shared Gateway):
{"metadata":{"name":"<service-name>","instance":1,"workspace_id":"<workspace-id>"},
"containers":[{"image":"<image-uri>","port":<port>,"command":"<command>"}],
"cloud":{"computing":{"instance_type":"<instance-type>"}}}
+ OSS → add "storage":[{"mount_path":"/dir","oss":{"path":"oss://<bucket>/<path>/","readOnly":true}}]
+ Autoscaling → add "autoscaler":{"min":1,"max":4,"scaleStrategies":[{"metricName":"qps","threshold":20}]}
+ Health Check → add startup_check to containers[] (see config-patterns.md Pattern 4)
NLB: full template (read references/network-config.md for details):
{"metadata":{"name":"<service-name>","instance":1,"workspace_id":"<workspace-id>"},
"containers":[{"image":"<image-uri>","port":<port>,"command":"<command>"}],
"cloud":{"computing":{"instance_type":"<instance-type>"},
"networking":{"vpc_id":"<vpc-id>","vswitch_id":"<vsw-zone-a>,<vsw-zone-b>","security_group_id":"<security-group-id>"}},
"networking":{"nlb":[{"id":"default","listener_port":<port>,"netType":"intranet"}]}}
⚠️ vswitch_id must be comma-separated with ≥2 VSwitches across different zones
Dedicated Resource Group → "metadata.resource" instead of cloud.computing:
{"metadata":{"name":"<service-name>","instance":1,"resource":"<resource-id>","workspace_id":"<workspace-id>"},
"containers":[{"image":"<image-uri>","port":<port>,"command":"<command>"}]}
Dedicated Gateway → networking.gateway + cloud.networking:
{"metadata":{"name":"<service-name>","instance":1,"workspace_id":"<workspace-id>"},
"containers":[{"image":"<image-uri>","port":<port>,"command":"<command>"}],
"networking":{"gateway":"<gateway-id>"},
"cloud":{"computing":{"instance_type":"<instance-type>"},
"networking":{"vpc_id":"<vpc-id>","vswitch_id":"<vswitch-id-1>,<vswitch-id-2>","security_group_id":"<security-group-id>"}}}
⚠️ vswitch_id comma-separated if gateway returns multiple VSwitches
Validate Before Writing
jq -r '.[] | select(.ImageUri | contains("vllm")) | .ImageUri' /tmp/eas_images.json
jq -r '.[] | select(.InstanceType == "<instance-type>") | .InstanceType' /tmp/eas_specs.json
Step 6: Create Service (MANDATORY)
**🔴 CONFIRM: Did Step 1.5 confirm no duplicate service name?
If a service with this name already exists → STOP. Inform the user
and do NOT proceed with create-service.**
Use $(cat service.json) NOT file://service.json.
Run this DIRECTLY via execute_shell_command, do NOT write a bash script.
aliyun eas create-service --region <region> \
--body "$(cat service.json)" \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy
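The `--body "$(cat service.json)"` form in the command above relies on shell command substitution: the file's contents are inlined into the command line, and the surrounding double quotes keep them together as a single argument. The sketch below demonstrates only that shell behavior with a throwaway temp file and a hypothetical `count_args` helper; it does not call the aliyun CLI.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

tmp="$(mktemp)"
printf '{"metadata": {"name": "demo_service",\n "instance": 1}}\n' > "$tmp"

# Quoted substitution: the whole JSON document arrives as one argument.
quoted="$(count_args "$(cat "$tmp")")"

# Unquoted substitution: the shell splits the JSON on whitespace,
# so the receiving command would see several broken fragments.
unquoted="$(count_args $(cat "$tmp"))"

echo "quoted=$quoted unquoted=$unquoted"
rm -f "$tmp"
```

Dropping the double quotes is the usual way this pattern goes wrong, since the JSON body would then be split into multiple arguments.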
409 Conflict → Service already exists. Inform the user and STOP.
400 BadRequest with 'vswitch can not be null' or
'vpcId, vswId and securityGroupId are required' → NLB CLI plugin
bug (see Step 4 fallback). Remove networking.nlb and
cloud.networking from service.json and retry.
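The two error cases above amount to a small decision table, sketched here as a bash function. This is a sketch under assumptions: `next_action_for_create_error` is a hypothetical helper, how the status code and message are obtained from the CLI output is not shown, and the "report" branches for other errors are an illustrative default rather than something this skill specifies.

```shell
#!/usr/bin/env bash

# Sketch: map a create-service failure to the action described above.
# $1 = HTTP status code, $2 = error message text (both supplied by the caller).
next_action_for_create_error() {
  local status="$1" message="$2"
  case "$status" in
    409)
      echo "stop: service already exists, inform the user" ;;
    400)
      case "$message" in
        *"vswitch can not be null"*|*"vpcId, vswId and securityGroupId are required"*)
          echo "fallback: remove networking.nlb and cloud.networking, then retry" ;;
        *)
          # Assumption: any other 400 is reported to the user as-is.
          echo "report: bad request" ;;
      esac ;;
    *)
      # Assumption: unlisted status codes are reported to the user as-is.
      echo "report: unexpected error" ;;
  esac
}

next_action_for_create_error 409 "Conflict"
next_action_for_create_error 400 "vswitch can not be null"
```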
Step 7: Verify Deployment
**Call describe-service ONCE to check the current status. Do NOT poll. Do NOT loop. Do NOT wait for Running.**
aliyun eas describe-service --region <region> --cluster-id <cluster-id> \
--service-name <service-name> \
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '{Status, ServiceName, ServiceId}'
**Report whatever status you get (Running, Waiting, Creating, etc.) and proceed to Step 8 immediately. create-service returning 200 = success.**
Step 8: Report Result (MANDATORY)
Get endpoint info via DescribeServiceEndpoints:
aliyun eas describe-service-endpoints --region <region> --cluster-id <cluster-id> \
--service-name <service-name> --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
jq '{AccessToken, Endpoints: [.Endpoints[] | {
Type: .EndpointType, Port: .Port,
InternetEndpoints: .InternetEndpoints,
IntranetEndpoints: .IntranetEndpoints
}]}'
Use the status from Step 7 and the endpoints above to report.
Copy the ENTIRE output into your final response. Format:
Deployment Summary
==================
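Service Name: <service-name>
Status: <status>
Endpoints:
<endpoint-type>:
InternetEndpoint: <internet-endpoint>
IntranetEndpoint: <intranet-endpoint>
Port: <port>
Service Invocation Examples:
curl <internet-endpoint>/api/predict/<service-name> \
-H "Authorization: <access-token>"
curl <intranet-endpoint>/api/predict/<service-name> \
-H "Authorization: <access-token>"
curl <endpoint>:<port>/api/predict/<service-name> \
-H "Authorization: <access-token>"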
**InternetEndpoint and IntranetEndpoint MUST appear in your
response, even if null.** If null: (not available for this network type)
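The null-handling rule above can be sketched as a small formatting helper. The function name `fmt_endpoint` and the sample URL are hypothetical; the fallback text is the one given above.

```shell
#!/usr/bin/env bash

# Sketch: render an endpoint value for the summary, substituting the
# fallback text when the value is empty or the literal string "null".
fmt_endpoint() {
  local value="$1"
  if [[ -z "$value" || "$value" == "null" ]]; then
    echo "(not available for this network type)"
  else
    echo "$value"
  fi
}

echo "InternetEndpoint: $(fmt_endpoint "null")"
echo "IntranetEndpoint: $(fmt_endpoint "http://example.invalid")"   # placeholder URL
```

Both lines are always printed, so the two endpoint fields appear in the report whether or not the network type provides them.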
**Always include a service invocation example using the AccessToken and endpoint URL.**
**Success criteria: create-service returning 200 with ServiceId = success. Any status (Running, Waiting, Creating) is acceptable.**
When done, disable AI-Mode: aliyun configure ai-mode disable
References (read when needed)
| Doc | When to Read |
|-----|-------------|
| Config Patterns | Step 5: Complete JSON templates for all 8 patterns |
| Config Schema | Step 5: Field descriptions and validation rules |
| Storage Mount | Step 5: OSS/NAS mount details |
| Network Config | Step 4/5: NLB/Gateway config details |
| Model-Image | Step 2: Image selection guide |
| Related APIs | Any step: CLI command reference |
| Workflow | Overview: Full deployment flow |
| CLI Guide | Pre-checks: Plugin install |
| RAM Policies | Pre-checks: Required permissions |
| Service Features | Step 5: Advanced features |