
Alibabacloud Pai Eas Service Deploy

by @sdk-team

Deploy AI models as PAI-EAS inference services. Supports LLMs (Qwen, Llama), image gen (SD, SDXL), speech synthesis, and more. When to use: deploy models, cr...

Version: v0.0.1-beta.1
Downloads: 264

clawhub install alibabacloud-pai-eas-service-deploy

📖 About This Skill


name: alibabacloud-pai-eas-service-deploy
description: |
  Deploy AI models as PAI-EAS inference services. Supports LLMs (Qwen, Llama), image gen (SD, SDXL), speech synthesis, and more.
  When to use: deploy models, create inference services, EAS deployment, model serving, deploy vLLM/SGLang/ComfyUI.
license: Apache-2.0
metadata:
  version: "1.0.0"
  domain: aiops
  owner: pai-eas-team
  contact: pai-eas-agent@alibaba-inc.com
  tags:
    - pai-eas
    - model-deployment
    - inference-service
    - llm
    - vllm
    - sglang
  required_tools:
    - aliyun
    - jq
  prerequisites:
    - "Aliyun CLI >= 3.3.1"
    - "jq command-line JSON processor"
  required_permissions:
    - "eas:CreateService"
    - "eas:DescribeService"
    - "eas:ListServices"
    - "eas:DescribeMachineSpec"
    - "eas:ListResources"
    - "eas:ListGateway"
    - "eas:DescribeGateway"
    - "nlb:ListLoadBalancers"
    - "aiworkspace:ListImages"
    - "aiworkspace:ListWorkspaces"
    - "vpc:DescribeVpcs"
    - "vpc:DescribeVSwitches"
    - "ecs:DescribeSecurityGroups"

PAI-EAS Service Deployment

⚠️ TOP RULES (read first)

1. 🔴 NO DUPLICATE SERVICE NAMES 🔴

If a service with the target name already exists: STOP and inform the user. Do NOT delete and recreate. Do NOT reuse it either.

2. Mandatory API Calls: Execute ALL of these in order:

| # | API | CLI | Purpose |
|---|-----|-----|---------|
| 1 | ListImages | aliyun aiworkspace list-images | Validate image |
| 2 | describe-machine-spec | aliyun eas describe-machine-spec | Validate GPU type |
| 3 | create-service | aliyun eas create-service | Create service |
| 4 | describe-service | aliyun eas describe-service | Check status (once) |
| 5 | describe-service-endpoints | aliyun eas describe-service-endpoints | Get endpoints |

Execute #1 and #2 ALWAYS, even if the user provided the info. describe-machine-spec ≠ list-resources. describe-service ≠ ListServices.

3. Prohibited:

  • ❌ Reuse an existing service
  • ❌ Write bash scripts (run CLI directly)
  • ❌ CPU+vLLM/SGLang
  • ❌ file:// in create-service
  • ❌ Skip mandatory APIs
  • ❌ Change the service name the user specified
  • ❌ Poll describe-service in a loop (call once only)

4. Autonomous Execution: Do NOT ask the user for info discoverable via APIs. Do NOT ask "should I proceed?" Execute directly. Timeout? Retry with --read-timeout 60. Error? Inform the user and CONTINUE. Missing param? Pick a reasonable default. If any pre-check or resource discovery step fails, log the failure and continue to the next step. Only STOP for the specific conditions listed in Self-Verify Checkpoints (duplicate service name, missing NLB/GW/dedicated resource group).

5. Self-Verify Checkpoints:

Before Step 2: Did Step 1.5 confirm no duplicate service name?
  If duplicate → STOP, inform user, do NOT proceed.
Before Step 5: Have I run list-images AND describe-machine-spec?
  If NO → STOP and run them NOW.
Before Step 6: Did Step 1.5 confirm no duplicate service name?
  If duplicate exists → STOP, inform user, do NOT proceed.
Step 4 resources: If NLB/GW/Resource Group not found → inform user and STOP.
  Do NOT block or attempt workarounds.
After Step 7: Did I call describe-service once and report the status?

6. Run CLI commands DIRECTLY: Use execute_shell_command to run each aliyun CLI command one at a time. Do NOT write bash scripts, deployment scripts, or shell files. Do NOT use retry_command() or any wrapper functions. Each command = one direct execution.


Pre-checks

aliyun configure ai-mode enable
aliyun configure ai-mode set-user-agent \
  --user-agent "AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy"
aliyun plugin update
aliyun configure set --auto-plugin-install true

⚠️ MANDATORY: Install and update required plugins BEFORE proceeding:

aliyun plugin update
aliyun plugin install aiworkspace --force
aliyun plugin install eas --force

Verify plugins are installed:

aliyun plugin list 2>/dev/null | grep -E "aiworkspace|eas"

If either plugin is missing after install, retry with:

aliyun plugin install <plugin-name> --force

See CLI Guide. Never read/echo AK/SK. Timeout? Retry with --read-timeout 60.

**⚠️ MANDATORY CHECKPOINT: After pre-checks, you MUST proceed to Step 1 regardless of any warnings during plugin installation. Plugin install failures are non-blocking because the plugins may already be installed. NEVER stop execution at the pre-check stage.**


Step 1: Collect Parameters

| Param | Required | Default |
|-------|----------|---------|
| RegionId | Yes | - |
| Service Name | Yes | - |
| Workspace ID | No | Auto-select |
| Image URI | Yes | - |
| Instance Type | Yes | - |
| Replicas | No | 1 |
| Port | No | 8000 |
| OSS Path | No | - |

Service name: lowercase letters, digits, and underscores only. No hyphens. 3-63 chars. IMPORTANT: Use the EXACT service name the user specifies. Do NOT rename. If the user specifies only a prefix (e.g. "starting with skill_qwen_"), append a random 6-digit suffix (e.g. skill_qwen_482917).
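The naming rule above can be checked mechanically before any API call. The following is a minimal sketch, not part of the skill itself: the function names `is_valid_service_name` and `with_random_suffix` are hypothetical, and the suffix generator relies on awk's time-based seed, so two calls within the same second may return the same suffix.

```shell
# Hypothetical helpers; the function names are illustrative only.

# Exit 0 when the name is 3-63 characters of lowercase letters, digits
# and underscores, as required above.
is_valid_service_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9_]{3,63}$'
}

# Print the prefix followed by a random six-digit suffix.
with_random_suffix() {
  awk -v prefix="$1" 'BEGIN { srand(); printf "%s%06d\n", prefix, int(rand() * 1000000) }'
}

is_valid_service_name "skill_qwen_482917" && echo "ok"
is_valid_service_name "skill-qwen" || echo "rejected: hyphens are not allowed"
with_random_suffix "skill_qwen_"
```

A name that fails the check should be reported to the user rather than silently rewritten, since the skill requires the exact name the user specified.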

Set profile region: Set the CLI profile region to match the deployment region. This avoids "Region mismatch" errors when --cluster-id differs from the profile's default region:

aliyun configure set --region <region>

Workspace ID: Required in metadata.workspace_id. If the user does not specify a workspace, query the available workspaces:

aliyun aiworkspace list-workspaces --region <region> \
  --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
  jq '.Workspaces[] | select(.Status == "ENABLED") | {WorkspaceId, WorkspaceName}'

If multiple workspaces exist, list them and let the user choose. If only one exists, use it directly.

Step 1.5: Check for Duplicate Service Name

aliyun eas list-services --region <region> --cluster-id <cluster-id> \
  --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
  jq '.ServiceList[] | select(.ServiceName == "<service-name>") | {ServiceName, Status}'

**If a service with the same name already exists → STOP and inform the user: "A service named <service-name> already exists (Status: <status>). Please choose a different name." Do NOT delete or reuse it.**

If no duplicate → proceed to Step 2.
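The decision in this step comes down to whether the filter matches any entry. Here is a minimal sketch of that test as an exit status, assuming the `ServiceList[].ServiceName` response shape used by the command above. The helper name `service_exists` is hypothetical, and it reads JSON from stdin so the logic can be tried on a saved response.

```shell
# Hypothetical helper: exit 0 if the list-services JSON on stdin contains
# a service whose ServiceName equals $1, non-zero otherwise.
# Assumes the ServiceList[].ServiceName shape used in Step 1.5.
service_exists() {
  jq -e --arg name "$1" \
    'any(.ServiceList[]?; .ServiceName == $name)' >/dev/null
}
```

Passing the name with `--arg` keeps it out of the filter text, so a name containing quotes cannot break the jq program. For example, `printf '%s' '{"ServiceList":[{"ServiceName":"demo_svc"}]}' | service_exists demo_svc` exits 0, which corresponds to the STOP condition.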

Step 2: ListImages (🚧 BLOCKING GATE: NEVER SKIP)

Execute even if user provided image URI. Purpose = VALIDATION.

**⚠️ If you see "parse error" or "Exit Code 4", the plugin failed to install. You MUST retry with explicit install:**

aliyun plugin install aiworkspace --force

Then retry the list-images command. Do NOT skip this step.

# Wrap the results in one JSON array so that `jq length` returns the image count
aliyun aiworkspace list-images --region <region> --verbose true \
  --labels 'system.official=true,system.supported.eas=true' \
  --page-size 50 --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
  jq '[.Images[] | {ImageId, ImageUri, ImageCategory}]' > /tmp/eas_images.json
echo "Saved $(jq length /tmp/eas_images.json) images"

If it still fails after the plugin reinstall, retry with a timeout:

aliyun aiworkspace list-images --region <region> --verbose true \
  --labels 'system.official=true,system.supported.eas=true' \
  --page-size 50 --read-timeout 60 \
  --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
  jq '[.Images[] | {ImageId, ImageUri, ImageCategory}]' > /tmp/eas_images.json
echo "Saved $(jq length /tmp/eas_images.json) images"

See Model-Image. This output is REQUIRED by Step 5. Do NOT skip. ❌ NEVER skip this step by saying "user already provided the image".

Step 3: describe-machine-spec (🚧 BLOCKING GATE: NEVER SKIP)

Execute even if user specified instance type. NOT list-resources.

**⚠️ If this command fails, retry with --read-timeout 60. NEVER proceed to Step 5 without /tmp/eas_specs.json existing.**

# Wrap the GPU specs in one JSON array so later steps can iterate it with `.[]`
aliyun eas describe-machine-spec --region <region> \
  --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
  jq '[.InstanceTypes[] | select(.GPU > 0) | {InstanceType, CPU, GPU, Memory}]' \
  > /tmp/eas_specs.json
echo "Available GPU specs:"; cat /tmp/eas_specs.json

CPU→GPU auto-switch: If the user requests vLLM/SGLang on a CPU instance type, silently pick a GPU type instead. Do NOT ask. This output is REQUIRED by Step 5. Do NOT skip.

Step 4: Network & Resource Config

| Type | VPC | Config |
|------|-----|--------|
| Shared | No | (default, no networking fields) |
| Dedicated GW | Yes | networking.gateway + cloud.networking |
| NLB | Yes | networking.nlb + cloud.networking |

**⚠️ If a required resource does not exist → STOP and inform the user. Do NOT block or attempt workarounds. This is a valid outcome.**

Dedicated Gateway: Call list-gateway. If no gateway exists → inform the user and STOP. Otherwise call describe-gateway to get the VPC/VSwitch, then query the security group under that VPC. If no security group is found → inform the user and STOP.

aliyun eas list-gateway --region <region> \
  --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy

If a gateway is found, get its details:

aliyun eas describe-gateway --region <region> --cluster-id <cluster-id> \
  --gateway-id <gateway-id> \
  --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy

Extract the VPC and the comma-separated VSwitch IDs:

aliyun eas describe-gateway --region <region> --cluster-id <cluster-id> \
  --gateway-id <gateway-id> \
  --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
  jq '{vpc_id: .LoadBalancerList[0].VpcId, vswitch_id: (.LoadBalancerList[0].VSwitchIds | join(","))}'

NLB: Requires VPC/VSwitch/SecurityGroup. If the user does not provide them, query via APIs. If any required resource is not found → inform the user and STOP.

⚠️ NLB requires ≥2 VSwitches across different availability zones. Use the comma-separated format: "vswitch_id": "vsw-zone-a,vsw-zone-b".

⚠️ NLB Plugin Bug (aliyun-cli-eas v0.2.0): If create-service with NLB config returns 400 with 'vswitch can not be null' or 'vpcId, vswId and securityGroupId are required', this is a known CLI plugin bug (not a resource issue). Fallback strategy:

1. Retry create-service with the NLB config once more (max 2 attempts).
2. If both attempts fail → remove networking.nlb and cloud.networking from service.json and redeploy with the shared gateway.
3. Inform the user: "NLB config failed due to CLI plugin limitation. Deployed with shared gateway instead."
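Two of the NLB rules above are easy to get wrong by hand: counting the VSwitch IDs and removing the right sections for the fallback. The sketch below illustrates both. The helper names `count_vswitches` and `strip_nlb_config` are hypothetical, and dropping a `networking` object that ends up empty is an assumption of this sketch rather than something the skill specifies. The count cannot confirm that the VSwitches sit in different zones.

```shell
# Hypothetical helpers illustrating the NLB rules above.

# Print how many non-empty, comma-separated VSwitch IDs a value contains.
count_vswitches() {
  printf '%s\n' "$1" | awk -F',' '{ n = 0; for (i = 1; i <= NF; i++) if ($i != "") n++; print n }'
}

# Print the service JSON from stdin without networking.nlb and
# cloud.networking (the shared-gateway fallback). If networking is left
# empty it is dropped as well; that cleanup is an assumption of this sketch.
strip_nlb_config() {
  jq 'del(.networking.nlb, .cloud.networking)
      | if .networking == {} then del(.networking) else . end'
}
```

For example, `count_vswitches "vsw-zone-a,vsw-zone-b"` prints 2, which meets the minimum, and `strip_nlb_config < service.json` prints the reduced document for review before it replaces the original.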

EAS Dedicated Resource Group: Call list-resources. Filter for ResourceType == "Dedicated" and Status == "ResourceReady".

aliyun eas list-resources --region <region> \
  --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
  jq '.Resources[] | select(.ResourceType == "Dedicated" and .Status == "ResourceReady") | {ResourceId, ResourceType, Status}'

  • If one exists → Set "metadata": {"resource": "<resource-id>"}.
  • Do NOT set cloud.computing.
  • If none exists → Inform the user and STOP.
  • Do NOT fall back to the public resource group.

Step 5: Build Service JSON

⚠️ BEFORE building JSON, you MUST read these reference files:

  • references/config-patterns.md: Complete JSON templates for all 8 patterns
  • references/config-schema.md: Field descriptions and validation rules
  • references/storage-mount.md: OSS/NAS mount configuration details
  • references/network-config.md: NLB/Gateway network configuration details

**⚠️ HARD GATE: Before writing service.json, VERIFY that /tmp/eas_images.json and /tmp/eas_specs.json exist and have content. If either is missing → STOP and run that Step NOW.**

test -s /tmp/eas_images.json || echo "MISSING: Run Step 2 NOW"
test -s /tmp/eas_specs.json || echo "MISSING: Run Step 3 NOW"

⚠️ JSON format rules:

  • Allowed top-level keys: metadata, containers, storage, cloud, autoscaler, networking
  • ❌ NEVER use as top-level keys: spec, ServiceName, Image, Cpu, Memory, Gpu, processor_path, resourceGroupId, instance, port, command, access
  • ❌ FORBIDDEN fields: processor_path, resourceGroupId, spec, access
  • metadata.name = service name, metadata.workspace_id = workspace (REQUIRED)
  • containers[].image = image URI, containers[].command = start command, containers[].port = port
  • cloud.computing.instance_type = instance type (MANDATORY for shared gateway)
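The top-level key rule and the required metadata fields can be checked with a single jq expression. This is a minimal sketch under the rules listed above. The helper name `check_service_json` is hypothetical, and it covers only the top-level keys plus metadata.name and metadata.workspace_id, not the nested fields.

```shell
# Hypothetical check: exit 0 only if every top-level key of the service
# JSON on stdin is in the allowed set and both metadata.name and
# metadata.workspace_id are present and non-empty.
check_service_json() {
  jq -e '
    (keys - ["metadata","containers","storage","cloud","autoscaler","networking"] | length == 0)
    and (.metadata.name // "" | tostring | length > 0)
    and (.metadata.workspace_id // "" | tostring | length > 0)
  ' >/dev/null
}
```

Running `check_service_json < service.json` before create-service would catch a stray top-level `spec` or `port` key locally instead of as a 400 response.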
Quick Reference: JSON Skeletons

Below are minimal skeletons. **Read references/config-patterns.md for complete templates with all fields and examples.**

Base (Shared Gateway):

{"metadata":{"name":"<service-name>","instance":1,"workspace_id":"<workspace-id>"},
 "containers":[{"image":"<image-uri>","port":<port>,"command":"<start-command>"}],
 "cloud":{"computing":{"instance_type":"<instance-type>"}}}

+ OSS → add "storage":[{"mount_path":"/dir","oss":{"path":"oss://<bucket>/<path>/","readOnly":true}}]
+ Autoscaling → add "autoscaler":{"min":1,"max":4,"scaleStrategies":[{"metricName":"qps","threshold":20}]}
+ Health Check → add startup_check to containers[] (see config-patterns.md Pattern 4)

NLB, full template (read references/network-config.md for details):

{"metadata":{"name":"<service-name>","instance":1,"workspace_id":"<workspace-id>"},
 "containers":[{"image":"<image-uri>","port":<port>,"command":"<start-command>"}],
 "cloud":{"computing":{"instance_type":"<instance-type>"},
          "networking":{"vpc_id":"<vpc-id>","vswitch_id":"<vswitch-id-1>,<vswitch-id-2>","security_group_id":"<security-group-id>"}},
 "networking":{"nlb":[{"id":"default","listener_port":<port>,"netType":"intranet"}]}}

⚠️ vswitch_id must be comma-separated with ≥2 VSwitches across different zones

Dedicated Resource Group, with "metadata.resource" instead of cloud.computing:

{"metadata":{"name":"<service-name>","instance":1,"resource":"<resource-id>","workspace_id":"<workspace-id>"},
 "containers":[{"image":"<image-uri>","port":<port>,"command":"<start-command>"}]}

Dedicated Gateway, with networking.gateway + cloud.networking:

{"metadata":{"name":"<service-name>","instance":1,"workspace_id":"<workspace-id>"},
 "containers":[{"image":"<image-uri>","port":<port>,"command":"<start-command>"}],
 "networking":{"gateway":"<gateway-id>"},
 "cloud":{"computing":{"instance_type":"<instance-type>"},
          "networking":{"vpc_id":"<vpc-id>","vswitch_id":"<vswitch-id-1>,<vswitch-id-2>","security_group_id":"<security-group-id>"}}}

⚠️ vswitch_id comma-separated if gateway returns multiple VSwitches
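Hand-editing these one-line skeletons invites quoting mistakes, especially in the start command. As one possible approach, the Base (Shared Gateway) skeleton can be assembled with `jq -n`, which handles JSON escaping. This is a sketch only: the helper name `build_base_service_json` and its argument order are hypothetical, and it covers just the fields shown in the Base skeleton.

```shell
# Hypothetical builder for the Base (Shared Gateway) skeleton.
# Arguments: name, workspace_id, image, port, command, instance_type.
# --arg passes strings; --argjson keeps the port as a JSON number.
build_base_service_json() {
  jq -n \
    --arg name "$1" --arg ws "$2" --arg image "$3" \
    --argjson port "$4" --arg cmd "$5" --arg itype "$6" \
    '{metadata: {name: $name, instance: 1, workspace_id: $ws},
      containers: [{image: $image, port: $port, command: $cmd}],
      cloud: {computing: {instance_type: $itype}}}'
}
```

Redirecting the output to service.json gives a file that Step 6 can pass with `--body "$(cat service.json)"`. The other patterns would add their own sections on top of this base.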

Validate Before Writing

jq -r '.[] | select(.ImageUri | contains("vllm")) | .ImageUri' /tmp/eas_images.json
jq -r '.[] | select(.InstanceType == "<instance-type>") | .InstanceType' /tmp/eas_specs.json


Step 6: Create Service (MANDATORY)

**🔴 CONFIRM: Did Step 1.5 confirm no duplicate service name? If a service with this name already exists → STOP. Inform the user and do NOT proceed with create-service.** Use $(cat service.json), NOT file://service.json. Run this DIRECTLY via execute_shell_command; do NOT write a bash script.

aliyun eas create-service --region <region> \
  --body "$(cat service.json)" \
  --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy

409 Conflict → Service already exists. Inform the user and STOP.

400 BadRequest with 'vswitch can not be null' or 'vpcId, vswId and securityGroupId are required' → NLB CLI plugin bug (see Step 4 fallback). Remove networking.nlb and cloud.networking from service.json and retry.

Step 7: Verify Deployment

**Call describe-service ONCE to check the current status. Do NOT poll. Do NOT loop. Do NOT wait for Running.**

aliyun eas describe-service --region <region> --cluster-id <cluster-id> \
  --service-name <service-name> \
  --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
  jq '{Status, ServiceName, ServiceId}'

**Report whatever status you get (Running, Waiting, Creating, etc.) and proceed to Step 8 immediately. create-service returning 200 = success.**

Step 8: Report Result (MANDATORY)

Get endpoint info via describe-service-endpoints:

aliyun eas describe-service-endpoints --region <region> --cluster-id <cluster-id> \
  --service-name <service-name> --user-agent AlibabaCloud-Agent-Skills/alibabacloud-pai-eas-service-deploy | \
  jq '{AccessToken, Endpoints: [.Endpoints[] | {
    Type: .EndpointType, Port: .Port,
    InternetEndpoints: .InternetEndpoints,
    IntranetEndpoints: .IntranetEndpoints
  }]}'

Use the status from Step 7 and the endpoints above to report.

Copy the ENTIRE output into your final response. Format:

Deployment Summary
==================
Service Name: <service-name>
Status: <status>

Endpoints:

  • <EndpointType>:
    InternetEndpoint: <internet-endpoint>
    IntranetEndpoint: <intranet-endpoint>
    Port: <port>

Service Invocation Examples:

curl <internet-endpoint>/api/predict/<service-name> \
  -H "Authorization: <AccessToken>"
curl <intranet-endpoint>/api/predict/<service-name> \
  -H "Authorization: <AccessToken>"
curl <host>:<port>/api/predict/<service-name> \
  -H "Authorization: <AccessToken>"

**InternetEndpoint and IntranetEndpoint MUST appear in your response, even if null.** If null: (not available for this network type)

**Always include a service invocation example using the AccessToken and endpoint URL.**

**Success criteria: create-service returning 200 with ServiceId = success. Any status (Running, Waiting, Creating) is acceptable.**


When done, disable AI-Mode: aliyun configure ai-mode disable

References (read when needed)

| Doc | When to Read |
|-----|-------------|
| Config Patterns | Step 5: Complete JSON templates for all 8 patterns |
| Config Schema | Step 5: Field descriptions and validation rules |
| Storage Mount | Step 5: OSS/NAS mount details |
| Network Config | Step 4/5: NLB/Gateway config details |
| Model-Image | Step 2: Image selection guide |
| Related APIs | Any step: CLI command reference |
| Workflow | Overview: Full deployment flow |
| CLI Guide | Pre-checks: Plugin install |
| RAM Policies | Pre-checks: Required permissions |
| Service Features | Step 5: Advanced features |