# Models

Available models on the Tensorix platform.

***

## Overview

Tensorix provides access to leading open-source and proprietary AI models through a unified API. All models are accessible via the same OpenAI-compatible endpoint.

{% hint style="info" %}
**Live Pricing & Full List**: Visit [tensorix.ai/models](https://tensorix.ai/models) for real-time pricing and the complete model catalog.
{% endhint %}

***

## Model Capabilities

Models on the platform support the following capabilities (availability varies by model):

* **Text Generation** - Generate coherent, contextual content
* **Language Understanding** - Understand meaning and context
* **Code Generation** - Write, analyze, and debug code
* **Reasoning** - Complex problem solving and analysis
* **Function Calling** - Tool use and structured outputs
* **Vision** - Image understanding (select models)
* **Multilingual** - Support for multiple languages
* **Text-to-Speech** - Convert text to natural audio
* **Speech-to-Text** - Transcribe audio to text

***

## Available Models

### GLM (Z-AI) ⭐

| Model          | Context | Features             | Best For                                     |
| -------------- | ------- | -------------------- | -------------------------------------------- |
| `z-ai/glm-5.1` | 203K    | Functions, Reasoning | Coding, reasoning, Chinese/English bilingual |
| `z-ai/glm-4.6` | 203K    | Functions, Reasoning | General purpose, bilingual                   |

### MiniMax ⭐

| Model                  | Context | Features                     | Best For                   |
| ---------------------- | ------- | ---------------------------- | -------------------------- |
| `minimax/minimax-m2.5` | 197K    | Functions, Reasoning         | Reasoning, general purpose |
| `minimax/minimax-m2`   | 197K    | Coding, Functions, Reasoning | Coding, fast responses     |

### Moonshot AI

| Model                  | Context | Features          | Best For                                    |
| ---------------------- | ------- | ----------------- | ------------------------------------------- |
| `moonshotai/kimi-k2.5` | 262K    | Vision, Functions | Vision tasks, long context, general purpose |

### DeepSeek

| Model                         | Context | Features             | Best For                             |
| ----------------------------- | ------- | -------------------- | ------------------------------------ |
| `deepseek/deepseek-chat-v3.1` | 164K    | Functions, Reasoning | General chat, coding, fast reasoning |
| `deepseek/deepseek-v3.2`      | 164K    | Functions, Reasoning | General chat, fast responses         |
| `deepseek/deepseek-r1-0528`   | 164K    | Functions, Reasoning | Complex reasoning tasks              |

### Qwen

| Model                               | Context | Features          | Best For              |
| ----------------------------------- | ------- | ----------------- | --------------------- |
| `qwen/qwen3-235b-a22b-2507`         | 131K    | Functions         | Large-scale reasoning |
| `qwen/qwen3-coder-30b-a3b-instruct` | 262K    | Coding, Functions | Code generation       |

### Llama (Meta)

| Model                               | Context | Features          | Best For                               |
| ----------------------------------- | ------- | ----------------- | -------------------------------------- |
| `meta-llama/llama-3.3-70b-instruct` | 131K    | Functions         | General purpose, instruction following |
| `meta-llama/llama-4-maverick`       | 1050K   | Vision, Functions | Long context, multimodal tasks         |

### OpenAI OSS

| Model                 | Context | Features             | Best For             |
| --------------------- | ------- | -------------------- | -------------------- |
| `openai/gpt-oss-120b` | 131K    | Functions, Reasoning | GPT-4 alternative    |
| `openai/gpt-oss-20b`  | 131K    | Functions, Reasoning | Fast, cost-effective |

### Audio Models 🎧

| Model                             | Type | Features                  | Best For                  |
| --------------------------------- | ---- | ------------------------- | ------------------------- |
| `chatterbox-turbo`                | TTS  | High-quality voices       | Text-to-speech generation |
| `Systran/faster-whisper-large-v3` | STT  | 98%+ accuracy, timestamps | Audio transcription       |

See [Audio API](/api-reference/audio.md) for detailed usage.

***

## Model Recommendations

| Use Case              | Recommended Models                                                   |
| --------------------- | -------------------------------------------------------------------- |
| **General Chat**      | `deepseek/deepseek-chat-v3.1`, `meta-llama/llama-3.3-70b-instruct`   |
| **Complex Reasoning** | `deepseek/deepseek-r1-0528`, `z-ai/glm-5.1`                          |
| **Coding**            | `z-ai/glm-5.1`, `minimax/minimax-m2`                                 |
| **Vision Tasks**      | `moonshotai/kimi-k2.5`, `meta-llama/llama-4-maverick`                |
| **Long Context**      | `moonshotai/kimi-k2.5` (262K), `meta-llama/llama-4-maverick` (1050K) |
| **Multilingual**      | `z-ai/glm-5.1` (Chinese/English)                                     |
| **Text-to-Speech**    | `chatterbox-turbo`                                                   |
| **Speech-to-Text**    | `Systran/faster-whisper-large-v3`                                    |

***

## List Models API

Retrieve the list of available models programmatically:

```bash
curl https://api.tensorix.ai/v1/models \
  -H "Authorization: Bearer $TENSORIX_API_KEY"
```

### Response

```json
{
  "object": "list",
  "data": [
    {
      "id": "deepseek/deepseek-chat-v3.1",
      "object": "model",
      "created": 1706745600,
      "owned_by": "deepseek"
    },
    {
      "id": "z-ai/glm-5.1",
      "object": "model",
      "created": 1706745600,
      "owned_by": "z-ai"
    }
  ]
}
```
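
As an illustration of consuming this response, the sketch below groups model IDs by their `owned_by` field. It parses the sample payload from this section rather than calling the API, and `model_ids_by_owner` is a hypothetical helper name.

```python
import json
from collections import defaultdict

# Sample payload copied from the response shown in this section.
SAMPLE_RESPONSE = """
{
  "object": "list",
  "data": [
    {"id": "deepseek/deepseek-chat-v3.1", "object": "model",
     "created": 1706745600, "owned_by": "deepseek"},
    {"id": "z-ai/glm-5.1", "object": "model",
     "created": 1706745600, "owned_by": "z-ai"}
  ]
}
"""


def model_ids_by_owner(payload: dict) -> dict[str, list[str]]:
    """Group model IDs from a /v1/models response by their `owned_by` field."""
    grouped = defaultdict(list)
    for entry in payload.get("data", []):
        grouped[entry["owned_by"]].append(entry["id"])
    # Sort owners and IDs so the result is stable across calls.
    return {owner: sorted(ids) for owner, ids in sorted(grouped.items())}


catalog = model_ids_by_owner(json.loads(SAMPLE_RESPONSE))
print(catalog)
```

If you use the OpenAI Python SDK with the Tensorix base URL, `client.models.list()` should expose the same entries as objects with `id` and `owned_by` attributes, assuming the endpoint follows the OpenAI response shape shown here.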

***

## Using Models

Specify the model ID in your API request:

```python
from openai import OpenAI

client = OpenAI(
    api_key="your-tensorix-api-key",
    base_url="https://api.tensorix.ai/v1"
)

# Use DeepSeek for fast reasoning
response = client.chat.completions.create(
    model="deepseek/deepseek-chat-v3.1",
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)

# Use GLM for coding
response = client.chat.completions.create(
    model="z-ai/glm-5.1",
    messages=[{"role": "user", "content": "Write a Python function to sort a list"}]
)

# Use Llama for long context
response = client.chat.completions.create(
    model="meta-llama/llama-4-maverick",
    messages=[{"role": "user", "content": "Summarize this document..."}]
)
```

***

## Pricing

Model pricing is based on token usage (input + output tokens).

{% hint style="success" %}
**View Current Pricing**: [tensorix.ai/models](https://tensorix.ai/models)

Pricing is displayed per 1M tokens for each model.
{% endhint %}

### How Pricing Works

```
Cost = (Input Tokens ÷ 1,000,000 × Input Price) + (Output Tokens ÷ 1,000,000 × Output Price)
```

* **Input tokens**: Text you send to the model
* **Output tokens**: Text the model generates
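
A minimal sketch of this calculation follows. Because prices are listed per 1M tokens, token counts are divided by 1,000,000 before multiplying. The prices in the example are hypothetical placeholders, not Tensorix rates, and `estimate_cost` is an illustrative helper name.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Estimate the cost of one request; both prices are per 1M tokens."""
    per_million = 1_000_000
    input_cost = (input_tokens / per_million) * input_price_per_m
    output_cost = (output_tokens / per_million) * output_price_per_m
    return input_cost + output_cost


# Hypothetical prices for illustration only: 0.50 (input) and 1.50 (output) per 1M tokens.
cost = estimate_cost(
    input_tokens=12_000,
    output_tokens=3_000,
    input_price_per_m=0.50,
    output_price_per_m=1.50,
)
print(f"{cost:.4f}")
```

Substitute the per-model prices from [tensorix.ai/models](https://tensorix.ai/models) to get real figures.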

### Tips to Optimize Costs

1. **Choose the right model** - Use smaller models for simple tasks
2. **Set max\_tokens** - Limit output length when appropriate
3. **Use caching** - Cache responses for repeated queries
4. **Monitor usage** - Check your [dashboard](https://app.tensorix.ai/dashboard) regularly
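
Tips 2 and 3 can be sketched together as a small in-memory cache that also caps output length. This is an illustration under stated assumptions, not a Tensorix feature: `ResponseCache` and `fake_send` are hypothetical names, and `fake_send` stands in for the real API call so the example runs without a key.

```python
import hashlib
import json
from typing import Callable


class ResponseCache:
    """In-memory cache keyed by a hash of the request parameters."""

    def __init__(self, send: Callable[[dict], str]):
        self._send = send  # function that performs the real API call
        self._store: dict[str, str] = {}
        self.hits = 0

    def complete(self, model: str, prompt: str, max_tokens: int = 256) -> str:
        request = {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens,  # cap output length to bound output-token cost
        }
        # Identical requests serialize to the same JSON, hence the same key.
        key = hashlib.sha256(json.dumps(request, sort_keys=True).encode()).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self._store[key] = self._send(request)
        return self._store[key]


# Offline stand-in for the API call (hypothetical), recording each real "send".
calls = []


def fake_send(request: dict) -> str:
    calls.append(request["model"])
    return f"reply to: {request['messages'][0]['content']}"


cache = ResponseCache(fake_send)
cache.complete("openai/gpt-oss-20b", "What is a token?")
cache.complete("openai/gpt-oss-20b", "What is a token?")  # served from the cache
print(len(calls), cache.hits)
```

With the OpenAI Python SDK, `send` could be `lambda req: client.chat.completions.create(**req).choices[0].message.content`. Caching suits repeated, identical queries; it is less appropriate when you want varied output for the same prompt.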

***

## Feature Support

| Feature              | Supported Models                                        |
| -------------------- | ------------------------------------------------------- |
| **Function Calling** | DeepSeek, GLM, MiniMax, Kimi K2.5, Qwen, Llama, GPT-OSS |
| **Reasoning**        | DeepSeek, GLM, MiniMax, GPT-OSS                         |
| **Vision**           | Kimi K2.5, Llama 4 Maverick                             |
| **Streaming**        | All models                                              |
| **JSON Mode**        | All models                                              |
| **Text-to-Speech**   | chatterbox-turbo                                        |
| **Speech-to-Text**   | Systran/faster-whisper-large-v3                         |
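
The sketch below shows what function-calling and JSON-mode requests might look like. It assumes Tensorix accepts the standard OpenAI `tools` and `response_format` parameters, which this page implies but does not spell out. The `get_weather` tool and both helper functions are hypothetical.

```python
def build_tool_request(model: str, question: str) -> dict:
    """Build chat-completion arguments that offer the model one callable tool."""
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool implemented by your application
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": [weather_tool],
    }


def build_json_request(model: str, question: str) -> dict:
    """Build chat-completion arguments that request a JSON object reply."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "Reply with a JSON object."},
            {"role": "user", "content": question},
        ],
        # Assumption: JSON mode uses the OpenAI-style response_format field.
        "response_format": {"type": "json_object"},
    }


request = build_tool_request("deepseek/deepseek-chat-v3.1", "Is it raining in Oslo?")
print(request["tools"][0]["function"]["name"])
```

Either dictionary can be passed to the OpenAI Python SDK as `client.chat.completions.create(**request)`. Adding `stream=True` to the arguments requests a streamed response.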

***

## See Also

* [Chat Completions](https://github.com/Tensorix-ai/tensorix-docs/blob/main/api-reference/chat-completions/README.md) - API endpoint documentation
* [Quantisation](https://github.com/Tensorix-ai/tensorix-docs/blob/main/api-reference/quantisation/README.md) - How models are quantised and how to check the level for any specific model
* [Audio API](https://github.com/Tensorix-ai/tensorix-docs/blob/main/audio/README.md) - TTS and STT documentation
* [API Examples](https://github.com/Tensorix-ai/tensorix-docs/blob/main/api-examples/README.md) - Code examples
* [Pricing](https://tensorix.ai/models) - Live pricing

