Body
Required
- gpu-count: Number of GPUs (1-8). Minimum value is 1.
- inference-engine-version: Inference engine version. Values are 0.12.0 or 0.15.1. Default value is 0.12.0.
- name: Deployment name. Minimum length is 1.
- gpu-type: GPU type family (e.g., gpua5000, gpu3080ti).
- replicas: Number of replicas (>=1). Minimum value is 1.
- inference-engine-parameters: Optional extra inference engine server CLI args.
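The constraints listed above can be checked on the client before the request is sent. The following is a minimal sketch, not part of the API itself: the function name build_deployment_body and its argument names are hypothetical, the JSON keys are taken from the request example on this page, and the shape of the model object is assumed from that same example.

```python
import json

# Allowed engine versions, as listed in the parameter descriptions.
ENGINE_VERSIONS = ("0.12.0", "0.15.1")


def build_deployment_body(name, gpu_type, gpu_count, replicas=1,
                          engine_version="0.12.0", engine_args=None,
                          model=None):
    """Check the documented constraints and return the request body as a dict.

    Hypothetical helper: the JSON keys mirror the request example,
    everything else is an illustration.
    """
    if not isinstance(name, str) or len(name) < 1:
        raise ValueError("name must be at least 1 character long")
    if not 1 <= gpu_count <= 8:
        raise ValueError("gpu-count must be between 1 and 8")
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    if engine_version not in ENGINE_VERSIONS:
        raise ValueError("inference-engine-version must be one of "
                         + ", ".join(ENGINE_VERSIONS))

    body = {
        "name": name,
        "gpu-type": gpu_type,
        "gpu-count": gpu_count,
        "replicas": replicas,
        "inference-engine-version": engine_version,
    }
    # Optional fields are only included when the caller supplies them.
    if engine_args:
        body["inference-engine-parameters"] = list(engine_args)
    if model:
        body["model"] = dict(model)
    return body


if __name__ == "__main__":
    # Serialize a sample body for use with --data in a curl call.
    print(json.dumps(build_deployment_body("demo", "gpua5000", 1), indent=2))
```

A value outside the documented range, such as a gpu-count of 42, raises ValueError here instead of being sent to the server.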
POST /ai/deployment
curl \
--request POST 'https://api-ch-gva-2.exoscale.com/v2/ai/deployment' \
--header "Content-Type: application/json" \
--data '{"gpu-count":42,"inference-engine-version":"0.12.0","name":"string","gpu-type":"string","replicas":42,"inference-engine-parameters":["string"],"model":{"name":"string","id":"string"}}'
Request examples
{
  "gpu-count": 42,
  "inference-engine-version": "0.12.0",
  "name": "string",
  "gpu-type": "string",
  "replicas": 42,
  "inference-engine-parameters": [
    "string"
  ],
  "model": {
    "name": "string",
    "id": "string"
  }
}
Response examples (412)
{
  "error": "string"
}
Response examples (200)
{
  "id": "string",
  "reason": "incorrect",
  "reference": {
    "id": "string",
    "link": "string",
    "command": "string"
  },
  "message": "string",
  "state": "failure"
}
Response examples (400)
{
  "error": "string"
}
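The three response shapes shown on this page can be told apart by status code. The sketch below is one possible way to do that in a client; summarize_response is a hypothetical helper, it relies only on the fields shown in the response examples (id and state for 200, error for 400 and 412), and it does not assume what each status or state value means beyond what is shown here.

```python
import json


def summarize_response(status, body_text):
    """Return a one-line summary for a response from POST /ai/deployment.

    Hypothetical helper based only on the response examples on this page.
    """
    payload = json.loads(body_text)
    if status == 200:
        # The 200 example is an operation object with "id" and "state".
        return "operation {} is in state {}".format(
            payload.get("id"), payload.get("state"))
    if status in (400, 412):
        # Both error examples carry a single "error" string.
        return "request rejected ({}): {}".format(
            status, payload.get("error"))
    return "unexpected status {}".format(status)
```

For example, summarize_response(400, '{"error": "string"}') yields "request rejected (400): string".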