# LlamaIndex

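[LlamaIndex](https://www.llamaindex.ai/) is a data framework for LLM applications. Tensorix exposes an OpenAI-compatible endpoint, so you can connect LlamaIndex to it by pointing the OpenAI LLM class at the Tensorix base URL and supplying your Tensorix API key.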

## Prerequisites

* Python 3.9+ (recent LlamaIndex releases no longer support 3.8)
* LlamaIndex installed
* Tensorix API key from [app.tensorix.ai](https://app.tensorix.ai)

## Installation

```bash
pip install llama-index llama-index-llms-openai
```

## Configuration

### Basic Usage

```python
import os
from llama_index.llms.openai import OpenAI

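# Optional: set environment variables. The OpenAI class falls back to these
# when api_key / api_base are not passed explicitly.
os.environ["OPENAI_API_KEY"] = "your-tensorix-api-key"
os.environ["OPENAI_API_BASE"] = "https://api.tensorix.ai/v1"

# Create LLM instance (explicit arguments take precedence over the environment)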
llm = OpenAI(
    model="claude-sonnet-4-20250514",
    api_key="your-tensorix-api-key",
    api_base="https://api.tensorix.ai/v1"
)

# Simple completion
response = llm.complete("Explain quantum computing in simple terms")
print(response)
```
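
Hardcoding the key is convenient for a first test, but reading it from the environment keeps it out of source control. The sketch below builds the constructor arguments from a hypothetical `TENSORIX_API_KEY` variable; that variable name and the helpers `tensorix_llm_kwargs` and `make_openai_like_llm` are illustrative and not part of either library. It also shows `OpenAILike` from the separate `llama-index-llms-openai-like` package, which may be needed if the `OpenAI` class rejects a model name that is not in its built-in table of OpenAI models. Whether you hit that depends on your installed version.

```python
import os

TENSORIX_API_BASE = "https://api.tensorix.ai/v1"


def tensorix_llm_kwargs(model: str) -> dict:
    """Build LLM constructor arguments from the environment.

    TENSORIX_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    api_key = os.environ.get("TENSORIX_API_KEY")
    if not api_key:
        raise RuntimeError("Set TENSORIX_API_KEY before creating the LLM")
    return {"model": model, "api_key": api_key, "api_base": TENSORIX_API_BASE}


def make_openai_like_llm(model: str):
    """Create an OpenAILike LLM for model names the OpenAI class does not know.

    Assumes `pip install llama-index-llms-openai-like`. The import is lazy so
    that tensorix_llm_kwargs() works without the extra package.
    """
    from llama_index.llms.openai_like import OpenAILike

    # is_chat_model=True routes requests to the chat completions endpoint.
    return OpenAILike(is_chat_model=True, **tensorix_llm_kwargs(model))
```

With the variable exported in your shell, `OpenAI(**tensorix_llm_kwargs("gpt-4o"))` replaces the hardcoded arguments, and `make_openai_like_llm("claude-sonnet-4-20250514")` is the fallback for model names the `OpenAI` class does not accept.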

### Chat Interface

```python
from llama_index.core.llms import ChatMessage
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="claude-sonnet-4-20250514",
    api_key="your-tensorix-api-key",
    api_base="https://api.tensorix.ai/v1"
)

messages = [
    ChatMessage(role="system", content="You are a helpful coding assistant"),
    ChatMessage(role="user", content="Write a Python function to calculate fibonacci numbers")
]

response = llm.chat(messages)
print(response)
```

### Streaming Responses

```python
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-4o",
    api_key="your-tensorix-api-key",
    api_base="https://api.tensorix.ai/v1"
)

# Streaming completion: print each text delta as soon as it arrives
response = llm.stream_complete("Write a short story about AI")
for chunk in response:
    print(chunk.delta, end="", flush=True)
```
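
If you also need the full text after streaming finishes, you can accumulate the deltas while printing them. The helper below is a minimal sketch: `collect_stream` is a hypothetical name, and the `SimpleNamespace` objects are stand-ins for real response chunks so the example runs offline. It only assumes each chunk has a `delta` attribute, as in the loop above.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Print each delta as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        # Tolerate chunks whose delta is missing or None.
        delta = getattr(chunk, "delta", None) or ""
        print(delta, end="", flush=True)
        parts.append(delta)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Stand-in chunks; with a real LLM, pass llm.stream_complete(...) instead.
fake_stream = [
    SimpleNamespace(delta="Once "),
    SimpleNamespace(delta=None),
    SimpleNamespace(delta="upon a time"),
]
story = collect_stream(fake_stream)
```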

## RAG Applications

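Use Tensorix as the LLM (and optionally the embedding backend) of a retrieval-augmented generation (RAG) pipeline. The example below indexes the files in `./data` and answers a question over them: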

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure Tensorix LLM
Settings.llm = OpenAI(
    model="claude-sonnet-4-20250514",
    api_key="your-tensorix-api-key",
    api_base="https://api.tensorix.ai/v1"
)

# Configure embeddings (if using Tensorix embeddings)
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",
    api_key="your-tensorix-api-key",
    api_base="https://api.tensorix.ai/v1"
)

# Load documents
documents = SimpleDirectoryReader("./data").load_data()

# Create index
index = VectorStoreIndex.from_documents(documents)

# Query
query_engine = index.as_query_engine()
response = query_engine.query("What are the key points in these documents?")
print(response)
```
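
The example above expects a `./data` folder that already contains documents. A small pre-flight check can fail early with a clear message before any embedding calls are made. This is a sketch using only the standard library: `list_input_files` is a hypothetical helper, and it only approximates the reader's default behavior by looking at top-level, non-hidden files.

```python
from pathlib import Path
from typing import List


def list_input_files(data_dir: str = "./data") -> List[Path]:
    """Return the top-level, non-hidden files in data_dir, or raise if none."""
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Data directory not found: {root}")
    files = sorted(
        p for p in root.iterdir() if p.is_file() and not p.name.startswith(".")
    )
    if not files:
        raise ValueError(f"No documents found in {root}")
    return files
```

Calling `list_input_files("./data")` before `SimpleDirectoryReader("./data").load_data()` also gives you a quick list of what is about to be indexed.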

## Agent Applications

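Create a ReAct agent that calls Python functions as tools. This example uses the `ReActAgent.from_tools` API; newer LlamaIndex releases move agents to `llama_index.core.agent.workflow`, so check the agent documentation for your installed version: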

```python
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI

# Define tools
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

multiply_tool = FunctionTool.from_defaults(fn=multiply)
add_tool = FunctionTool.from_defaults(fn=add)

# Create agent
llm = OpenAI(
    model="gpt-4o",
    api_key="your-tensorix-api-key",
    api_base="https://api.tensorix.ai/v1"
)

agent = ReActAgent.from_tools(
    [multiply_tool, add_tool],
    llm=llm,
    verbose=True
)

response = agent.chat("What is 20 plus 30, then multiplied by 2?")
print(response)
```
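
Because the tools are plain Python functions, you can call them directly to work out the answer the agent should reach. This is a quick sanity check that is independent of the LLM, repeating the two tool functions from the example above:

```python
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b


def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


# "20 plus 30, then multiplied by 2": add first, then multiply the sum.
expected = multiply(add(20, 30), 2)
print(expected)  # 100
```

If the agent's final answer differs from this value, the verbose trace shows which tool call or reasoning step went wrong.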

## Configuration with Settings

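To apply one LLM configuration across your application, assign it to the global `Settings` object. Components that are not given an explicit `llm` use it by default: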

```python
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

# Set global LLM
Settings.llm = OpenAI(
    model="claude-sonnet-4-20250514",
    api_key="your-tensorix-api-key",
    api_base="https://api.tensorix.ai/v1",
    temperature=0.7,
    max_tokens=4000
)
```

## Available Models

See [Tensorix Models](/api-reference/models.md) for all available models.

| Model                        | Best For                 |
| ---------------------------- | ------------------------ |
| `claude-sonnet-4-20250514`   | Complex RAG, agents      |
| `claude-3-5-sonnet-20241022` | General LLM tasks        |
| `gpt-4o`                     | Multi-modal applications |
| `gpt-4o-mini`                | Cost-effective inference |

## Troubleshooting

### Connection errors

Ensure `api_base` ends with `/v1`:

```python
api_base="https://api.tensorix.ai/v1"  # Correct
api_base="https://api.tensorix.ai"     # Wrong
```
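
If the base URL comes from user input or a config file, you can normalize it before passing it to the LLM. The function below is a minimal sketch; `normalize_api_base` is a hypothetical helper, not part of LlamaIndex or Tensorix.

```python
def normalize_api_base(url: str) -> str:
    """Return the URL without trailing slashes and with a /v1 suffix."""
    base = url.strip().rstrip("/")
    if not base.endswith("/v1"):
        base += "/v1"
    return base


print(normalize_api_base("https://api.tensorix.ai"))      # https://api.tensorix.ai/v1
print(normalize_api_base("https://api.tensorix.ai/v1/"))  # https://api.tensorix.ai/v1
```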

### Token limits

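`max_tokens` caps the number of tokens generated per response. If answers are cut off mid-sentence, raise it; lower it to bound cost and latency: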

```python
llm = OpenAI(
    model="claude-sonnet-4-20250514",
    api_key="your-tensorix-api-key",
    api_base="https://api.tensorix.ai/v1",
    max_tokens=8000  # Increase if needed
)
```

## Resources

* [LlamaIndex Documentation](https://docs.llamaindex.ai/)
* [LlamaIndex GitHub](https://github.com/run-llama/llama_index)
* [Tensorix API Reference](/api-reference/overview.md)

