[BETA] Create Deployment

POST /ai/deployment
application/json

Body Required

  • model-id string(uuid)

    Associated model ID

  • name string

    Deployment name

    Minimum length is 1.

  • gpu-type string Required

    GPU type family (e.g., gpua5000, gpu3080ti)

  • gpu-count integer(int64) Required

    Number of GPUs (1-8)

    Schema minimum value is 0, although the description above gives the supported range as 1-8.

  • replicas integer(int64)

    Number of replicas (>=1)

    Schema minimum value is 0, although the description above calls for at least 1 replica.
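
The body constraints listed above can be checked on the client before the request is sent. The following is a minimal Python sketch, not part of the Exoscale API or any official client. The helper name `build_deployment_payload` is hypothetical, and the sketch assumes the ranges given in the descriptions (1-8 GPUs, at least 1 replica) are the intended limits rather than the schema minimum of 0.

```python
import uuid


def build_deployment_payload(gpu_type, gpu_count, model_id=None, name=None, replicas=None):
    """Build the JSON body for POST /ai/deployment (hypothetical helper)."""
    # gpu-type is marked Required in the reference.
    if not isinstance(gpu_type, str) or not gpu_type:
        raise ValueError("gpu-type is required and must be a non-empty string")
    # gpu-count is Required; the 1-8 range is taken from the field description (assumption).
    if isinstance(gpu_count, bool) or not isinstance(gpu_count, int) or not 1 <= gpu_count <= 8:
        raise ValueError("gpu-count must be an integer from 1 to 8")

    payload = {"gpu-type": gpu_type, "gpu-count": gpu_count}

    if model_id is not None:
        # uuid.UUID raises ValueError when the string is not a well-formed UUID.
        payload["model-id"] = str(uuid.UUID(str(model_id)))
    if name is not None:
        # The reference states a minimum length of 1 for name.
        if not isinstance(name, str) or len(name) < 1:
            raise ValueError("name must be a string of length 1 or more")
        payload["name"] = name
    if replicas is not None:
        # The ">=1" bound is taken from the field description (assumption).
        if isinstance(replicas, bool) or not isinstance(replicas, int) or replicas < 1:
            raise ValueError("replicas must be an integer of at least 1")
        payload["replicas"] = replicas
    return payload
```

Optional fields are left out of the payload when they are not supplied, so the server-side defaults (which this section does not document) still apply.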

Responses

  • 200 application/json

    200

    Response attributes object
    • id string(uuid)

      Operation ID

    • reason string

      Operation failure reason

      Values are incorrect, unknown, unavailable, forbidden, busy, fault, partial, not-found, interrupted, unsupported, or conflict.

    • reference object

      Related resource reference

      Reference attributes object
      • id string(uuid)

        Reference ID

      • command string

        Command name

    • message string

      Operation message

    • state string

      Operation status

      Values are failure, pending, success, or timeout.

  • 400 application/json

    400
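
A 200 response describes an operation, and its state and reason fields tell the caller what happened. The sketch below shows one way a client might interpret it in Python. The function name `describe_operation` is hypothetical, and treating `pending` as the only non-final state is an assumption, since the reference lists the values without describing the lifecycle.

```python
# State values as listed in the reference for the "state" attribute.
TERMINAL_STATES = {"failure", "success", "timeout"}  # assumed to be final
KNOWN_STATES = TERMINAL_STATES | {"pending"}


def describe_operation(op):
    """Return (finished, summary) for an operation response dict (hypothetical helper)."""
    state = op.get("state")
    if state not in KNOWN_STATES:
        raise ValueError(f"unexpected state: {state!r}")
    if state == "pending":
        return False, "operation still pending"
    if state == "success":
        return True, "operation succeeded"
    # failure or timeout: surface the documented reason and message fields.
    reason = op.get("reason", "unknown")
    message = op.get("message", "")
    return True, f"operation ended in {state} (reason: {reason}) {message}".strip()
```

A caller could keep checking the operation while the first element of the returned tuple is `False`. How to look up an operation by its ID is outside the scope of this section.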

POST /ai/deployment
curl \
 --request POST 'https://api-ch-gva-2.exoscale.com/v2/ai/deployment' \
 --header "Content-Type: application/json" \
 --data '{"model-id":"00000000-0000-0000-0000-000000000000","name":"example-deployment","gpu-type":"gpua5000","gpu-count":1,"replicas":1}'
Request examples
{
  "model-id": "00000000-0000-0000-0000-000000000000",
  "name": "example-deployment",
  "gpu-type": "gpua5000",
  "gpu-count": 1,
  "replicas": 1
}
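
For callers who prefer Python over curl, the same request can be assembled with the standard library. This is a sketch under stated assumptions: the helper name `make_create_deployment_request` is hypothetical, the URL is copied from the curl example, and any authentication the API requires is not covered in this section and is therefore left out. The code only builds the request object and does not send it.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this section.
API_URL = "https://api-ch-gva-2.exoscale.com/v2/ai/deployment"


def make_create_deployment_request(payload):
    """Build (but do not send) a POST request carrying the JSON payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Only the two Required fields are set here; the values are illustrative.
request = make_create_deployment_request({"gpu-type": "gpua5000", "gpu-count": 1})
```

Passing the resulting object to `urllib.request.urlopen` would perform the call once the necessary credentials have been added.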
Response examples (200)
{
  "id": "string",
  "reason": "incorrect",
  "reference": {
    "id": "string",
    "link": "string",
    "command": "string"
  },
  "message": "string",
  "state": "failure"
}
Response examples (400)
{}