# Chat Completions

Create a model response for the given chat conversation.

***

## Endpoint

```
POST https://api.tensorix.ai/v1/chat/completions
```

***

## Authentication

```
Authorization: Bearer YOUR_API_KEY
```

An API key is required for every request, sent in the `Authorization` header as shown above. Get your key from your [Tensorix dashboard](https://app.tensorix.ai/dashboard).

***

## Request Body

| Parameter           | Type         | Required | Default       | Description                                    |
| ------------------- | ------------ | -------- | ------------- | ---------------------------------------------- |
| `model`             | string       | **Yes**  | -             | Model ID (e.g., `deepseek/deepseek-chat-v3.1`) |
| `messages`          | array        | **Yes**  | -             | Array of message objects                       |
| `max_tokens`        | integer      | No       | Model default | Maximum tokens to generate                     |
| `temperature`       | number       | No       | 1.0           | Sampling temperature (0-2)                     |
| `top_p`             | number       | No       | 1.0           | Nucleus sampling (0-1)                         |
| `top_k`             | integer      | No       | -             | Top-k sampling                                 |
| `stream`            | boolean      | No       | false         | Stream partial responses                       |
| `stop`              | string/array | No       | -             | Stop sequences (up to 4)                       |
| `presence_penalty`  | number       | No       | 0             | Presence penalty (-2 to 2)                     |
| `frequency_penalty` | number       | No       | 0             | Frequency penalty (-2 to 2)                    |
| `response_format`   | object       | No       | -             | Output format, e.g. `{"type": "json_object"}`  |
| `tools`             | array        | No       | -             | Function calling tools                         |
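
The ranges in the table can be checked on the client before a request is sent. The sketch below is a hypothetical helper (`build_payload` is not part of any Tensorix SDK) that validates the documented ranges and leaves out unset options so the server applies its own defaults. The server remains the authority on what it accepts.

```python
def build_payload(model, messages, **options):
    """Build a request body, checking the ranges documented in the table above.

    Hypothetical helper for illustration; not part of any Tensorix SDK.
    """
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")

    # (low, high) bounds taken from the parameter table.
    ranges = {
        "temperature": (0, 2),
        "top_p": (0, 1),
        "presence_penalty": (-2, 2),
        "frequency_penalty": (-2, 2),
    }
    for name, (low, high) in ranges.items():
        value = options.get(name)
        if value is not None and not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")

    # The table allows up to 4 stop sequences when an array is passed.
    stop = options.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")

    payload = {"model": model, "messages": messages}
    # Omit unset options so the server falls back to its defaults.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload
```

The returned dictionary can be serialized as the JSON request body or unpacked into `client.chat.completions.create(**payload)`.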

### Message Object

| Field     | Type   | Required | Description                      |
| --------- | ------ | -------- | -------------------------------- |
| `role`    | string | **Yes**  | `system`, `user`, or `assistant` |
| `content` | string | **Yes**  | Message content                  |
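
For multi-turn conversations, the `messages` array typically carries the earlier turns in order, followed by the new user message. A minimal sketch of assembling it, where `build_messages` and its arguments are illustrative names rather than part of the API:

```python
def build_messages(system_prompt, history, user_input):
    """Assemble a messages array: optional system message, prior turns, new user turn.

    `history` is a list of (role, content) pairs from earlier turns.
    Illustrative helper; the name and signature are assumptions.
    """
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    for role, content in history:
        # Only the roles listed in the Message Object table are accepted here.
        if role not in ("user", "assistant"):
            raise ValueError(f"unexpected role in history: {role}")
        messages.append({"role": role, "content": content})
    messages.append({"role": "user", "content": user_input})
    return messages


messages = build_messages(
    "You are a helpful assistant.",
    [("user", "What is the capital of France?"),
     ("assistant", "The capital of France is Paris.")],
    "And of Germany?",
)
```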

***

## Example Request

### cURL

```bash
curl https://api.tensorix.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TENSORIX_API_KEY" \
  -d '{
    "model": "deepseek/deepseek-chat-v3.1",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "max_tokens": 1000,
    "temperature": 0.7
  }'
```

### Python

```python
from openai import OpenAI

client = OpenAI(
    api_key="your-tensorix-api-key",
    base_url="https://api.tensorix.ai/v1"
)

response = client.chat.completions.create(
    model="deepseek/deepseek-chat-v3.1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
    max_tokens=1000,
    temperature=0.7
)

print(response.choices[0].message.content)
```

### JavaScript

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your-tensorix-api-key',
  baseURL: 'https://api.tensorix.ai/v1'
});

const response = await client.chat.completions.create({
  model: 'deepseek/deepseek-chat-v3.1',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of France?' }
  ],
  max_tokens: 1000,
  temperature: 0.7
});

console.log(response.choices[0].message.content);
```

***

## Response

### Success (200)

```json
{
  "id": "chatcmpl-abc123xyz",
  "object": "chat.completion",
  "created": 1706745600,
  "model": "deepseek/deepseek-chat-v3.1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  }
}
```

### Response Fields

| Field                     | Type    | Description                                             |
| ------------------------- | ------- | ------------------------------------------------------- |
| `id`                      | string  | Unique response identifier                              |
| `object`                  | string  | Always `chat.completion`                                |
| `created`                 | integer | Unix timestamp                                          |
| `model`                   | string  | Model used for generation                               |
| `choices`                 | array   | Generated completions                                   |
| `choices[].message`       | object  | Assistant's response message                            |
| `choices[].finish_reason` | string  | Why generation stopped (`stop`, `length`, `tool_calls`) |
| `usage`                   | object  | Token usage statistics                                  |
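
As an illustration of reading these fields, the sketch below parses the sample success body shown above and pulls out the text, the stop reason, and the token count. `summarize` is a hypothetical helper name, and only fields that appear in the sample are accessed.

```python
import json

# Sample body copied from the Success (200) example on this page.
raw = """
{
  "id": "chatcmpl-abc123xyz",
  "object": "chat.completion",
  "created": 1706745600,
  "model": "deepseek/deepseek-chat-v3.1",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The capital of France is Paris."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 25, "completion_tokens": 8, "total_tokens": 33}
}
"""


def summarize(body):
    """Extract the first choice's text, whether it was cut off, and token usage."""
    choice = body["choices"][0]
    return {
        "text": choice["message"].get("content"),
        # finish_reason "length" means generation stopped at the token limit.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": body["usage"]["total_tokens"],
    }


summary = summarize(json.loads(raw))
```

Checking `finish_reason` before using the text helps catch answers that were cut short by `max_tokens`.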

***

## Streaming

Set `stream` to `true` to receive the response incrementally as it is generated:

```bash
curl https://api.tensorix.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TENSORIX_API_KEY" \
  -d '{
    "model": "deepseek/deepseek-chat-v3.1",
    "messages": [{"role": "user", "content": "Tell me a story"}],
    "stream": true
  }'
```

Streaming responses are sent as Server-Sent Events (SSE):

```
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Once"}}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"delta":{"content":" upon"}}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"delta":{"content":" a"}}]}

data: [DONE]
```
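
If you read the stream without an SDK, each `data:` line has to be decoded by hand. The following is a minimal sketch that assumes the chunk shape shown in the sample above; `iter_sse_content` is an illustrative name, and the input can be any iterable of text lines (for example, the lines of an HTTP response body).

```python
import json


def iter_sse_content(lines):
    """Yield content fragments from an iterable of SSE text lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content


# The sample events from above, as they would arrive line by line.
sample = [
    'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Once"}}]}',
    "",
    'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"delta":{"content":" upon"}}]}',
    "",
    'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"delta":{"content":" a"}}]}',
    "",
    "data: [DONE]",
]

text = "".join(iter_sse_content(sample))
```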

### Python Streaming

```python
from openai import OpenAI

client = OpenAI(
    api_key="your-tensorix-api-key",
    base_url="https://api.tensorix.ai/v1"
)

stream = client.chat.completions.create(
    model="deepseek/deepseek-chat-v3.1",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

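for chunk in stream:
    # Some chunks may carry no choices or an empty delta, so check before printing.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)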
```

***

## JSON Mode

Set `response_format` to `{"type": "json_object"}` to request valid JSON output:

```bash
curl https://api.tensorix.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TENSORIX_API_KEY" \
  -d '{
    "model": "deepseek/deepseek-chat-v3.1",
    "messages": [
      {"role": "user", "content": "List 3 countries and their capitals as JSON"}
    ],
    "response_format": {"type": "json_object"}
  }'
```

{% hint style="info" %}
When using JSON mode, include the word "JSON" in your prompt for best results.
{% endhint %}
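
The JSON arrives as a string in `choices[0].message.content`, so the client still has to decode it. A small defensive sketch follows; it assumes the output could be incomplete if generation stops early (for example with `finish_reason` set to `length`), and `parse_json_content` is an illustrative helper name.

```python
import json


def parse_json_content(content):
    """Return the decoded JSON value, or None if the text is missing or not valid JSON."""
    try:
        return json.loads(content)
    except (json.JSONDecodeError, TypeError):
        # JSONDecodeError: malformed or truncated text; TypeError: content is None.
        return None


complete = parse_json_content('{"France": "Paris", "Japan": "Tokyo"}')
truncated = parse_json_content('{"France": "Paris", "Jap')
```

Returning `None` keeps the calling code simple; raising an exception instead may suit applications that treat invalid output as a hard failure.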

***

## Function Calling

Pass function definitions in the `tools` array so the model can request a function call:

```python
from openai import OpenAI

client = OpenAI(
    api_key="your-tensorix-api-key",
    base_url="https://api.tensorix.ai/v1"
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="deepseek/deepseek-chat-v3.1",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools
)
```
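
The example above stops at the request. What follows is a sketch of handling the reply, under the assumption that Tensorix returns tool calls in the OpenAI-compatible shape: a `tool_calls` list on the message, each entry with an `id`, a `function.name`, and `function.arguments` as a JSON string. This page does not document that shape or a `tool` message role, so treat both as assumptions. The `get_weather` body is a stub and `run_tool_calls` is an illustrative name.

```python
import json


def get_weather(location):
    # Stub for illustration; a real implementation would query a weather service.
    return {"location": location, "forecast": "unknown"}


# Maps the tool names declared in `tools` to local Python functions.
AVAILABLE_TOOLS = {"get_weather": get_weather}


def run_tool_calls(message):
    """Run each requested tool and return result messages to append to the conversation."""
    results = []
    for call in message.get("tool_calls") or []:
        func = AVAILABLE_TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        results.append({
            "role": "tool",  # assumed role name, following the OpenAI-compatible convention
            "tool_call_id": call["id"],
            "content": json.dumps(func(**args)),
        })
    return results


# Hand-written sample in the assumed shape, not captured API output.
sample_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": "{\"location\": \"Paris\"}"},
        }
    ],
}

tool_messages = run_tool_calls(sample_message)
```

In a full exchange, the assistant message and these result messages would be appended to `messages` and sent in a second request so the model can compose its final answer.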

***

## Error Responses

### 401 Unauthorized

```json
{
  "error": {
    "message": "Invalid API key",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}
```

### 402 Insufficient Credits

```json
{
  "error": {
    "message": "Insufficient credits",
    "type": "payment_error",
    "code": "insufficient_credits"
  }
}
```

### 429 Rate Limited

```json
{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
```
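
All three errors share the same `error` envelope, so a client can handle them in one place. The sketch below uses a hypothetical helper, `describe_error`, to extract the fields and flag which failures are worth retrying. Treating 429 as retryable follows from its meaning; treating 5xx responses as retryable is an assumption, since this page does not document server errors.

```python
def describe_error(status, body):
    """Summarize an error response and say whether a retry is reasonable."""
    error = body.get("error", {})
    return {
        "status": status,
        "code": error.get("code"),
        "message": error.get("message"),
        # 401 and 402 need user action (new key, more credits), so retrying will not help.
        # 5xx handling is an assumption; it is not documented on this page.
        "retryable": status == 429 or status >= 500,
    }


rate_limited = describe_error(429, {
    "error": {
        "message": "Rate limit exceeded",
        "type": "rate_limit_error",
        "code": "rate_limit_exceeded",
    }
})
unauthorized = describe_error(401, {
    "error": {
        "message": "Invalid API key",
        "type": "authentication_error",
        "code": "invalid_api_key",
    }
})
```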

***

## Best Practices

1. **Use streaming** for long responses to improve user experience
2. **Set appropriate max\_tokens** to control costs and response length
3. **Use lower temperature** (0.1-0.3) for factual tasks, higher (0.7-1.0) for creative tasks
4. **Include system prompts** to guide model behavior
5. **Handle errors gracefully** with retry logic for transient failures
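
The retry advice in item 5 could be sketched as exponential backoff with jitter. Everything named here is illustrative: `TransientError` stands in for whatever exception your HTTP client raises on a retryable failure such as a 429, and the attempt count and delays are arbitrary starting points rather than documented limits.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (for example HTTP 429)."""


def with_retries(func, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying on TransientError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Base delay doubles each attempt (0.5s, 1s, 2s, ...); jitter adds up to
            # the same amount again so concurrent clients do not retry in lockstep.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
```

A call such as `with_retries(lambda: client.chat.completions.create(...))` would work once the client's rate-limit exception is mapped to `TransientError`. The `sleep` parameter exists so tests can replace the real delay.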

***

## See Also

* [Models](https://github.com/Tensorix-ai/tensorix-docs/blob/main/api-reference/models/README.md) - Available models
* [API Examples](https://github.com/Tensorix-ai/tensorix-docs/blob/main/api-examples/README.md) - More code examples
* [Pricing](https://tensorix.ai/models) - Model pricing

