# Create Deployment

**POST /ai/deployment**

Deploy a model on an inference server.

## Servers

- https://api-ch-gva-2.exoscale.com/v2

## Parameters

### Body: application/json (object)

- **gpu-count** (integer(int64)) - Number of GPUs (1-8)
- **inference-engine-version** (string) - Inference engine version
- **name** (string) - Deployment name
- **gpu-type** (string) - GPU type family (e.g., `gpua5000`, `gpu3080ti`)
- **replicas** (integer(int64)) - Number of replicas (>= 1)
- **inference-engine-parameters** (array[string]) - Optional extra inference engine server CLI args
- **model** (object)

## Responses

### 200

#### Body: application/json (object)

- **id** (string(uuid)) - Operation ID
- **reason** (string) - Operation failure reason
- **reference** (object) - Related resource reference
- **message** (string) - Operation message
- **state** (string) - Operation status

### 400

#### Body: application/json (object)

- **type** (string(uri-reference))
- **title** (string)
- **status** (integer)
- **detail** (string)
- **instance** (string(uri-reference))
- **errors** (array[object])

### 403

#### Body: application/json (object)

- **code** (string) - Machine-readable forbidden error code
- **error** (string) - Forbidden error message

### 412

#### Body: application/json (object)

- **type** (string(uri-reference))
- **title** (string)
- **status** (integer)
- **detail** (string)
- **instance** (string(uri-reference))
- **errors** (array[object])
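A minimal sketch of assembling and validating the `application/json` request body for this endpoint. The helper name, the shape of the `model` object, and the example values are illustrative (the schema above does not detail `model`); authentication against the Exoscale v2 API is omitted:

```python
import json

# Constraints taken from the parameter descriptions above.
GPU_COUNT_RANGE = range(1, 9)   # gpu-count: 1-8
MIN_REPLICAS = 1                # replicas: >= 1

def build_deployment_request(name, model, gpu_type, gpu_count, replicas,
                             engine_version, engine_parameters=None):
    """Assemble the JSON body for POST /ai/deployment (illustrative helper)."""
    if gpu_count not in GPU_COUNT_RANGE:
        raise ValueError("gpu-count must be between 1 and 8")
    if replicas < MIN_REPLICAS:
        raise ValueError("replicas must be >= 1")
    body = {
        "name": name,
        "model": model,
        "gpu-type": gpu_type,
        "gpu-count": gpu_count,
        "replicas": replicas,
        "inference-engine-version": engine_version,
    }
    if engine_parameters:
        body["inference-engine-parameters"] = list(engine_parameters)
    return body

payload = build_deployment_request(
    name="my-deployment",
    model={"name": "example-model"},   # hypothetical: model schema not given above
    gpu_type="gpua5000",
    gpu_count=2,
    replicas=1,
    engine_version="latest",
    engine_parameters=["--max-model-len", "8192"],
)
print(json.dumps(payload, indent=2))
```

A body built this way would be POSTed to `https://api-ch-gva-2.exoscale.com/v2/ai/deployment` with your API credentials; on success the 200 response carries an operation object whose `state` field can be polled until the deployment completes.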