Production-ready AI inference

AI Inference API
Built for Scale

OpenAI-compatible API for chat completions, embeddings, and RAG. Deploy your own custom endpoints or use our shared infrastructure. Start free, scale infinitely.

curl
# Chat completion - OpenAI compatible
curl https://api.solidrust.ai/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-4b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
99.9% Uptime SLA
<50ms P99 Latency
5 GPU Nodes

Everything you need to build with AI

Production-ready APIs with enterprise-grade reliability. OpenAI compatible, so switching takes nothing more than pointing your existing SDK at a new base URL.

💬

Chat Completions

OpenAI-compatible chat API powered by Qwen3-4B. Streaming support, function calling, and context management.

/v1/chat/completions
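
The chat card above also lists function calling. Here is a minimal sketch using the OpenAI SDK; the get_weather tool and its schema are placeholders, and it assumes the endpoint accepts the standard OpenAI tools parameter.

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.solidrust.ai/v1",
    api_key="your-api-key"
)

# Placeholder tool definition so the model can request a function call
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"]
        }
    }
}]

response = client.chat.completions.create(
    model="qwen3-4b",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools
)

# If the model chose to call the tool, the requested call appears here
print(response.choices[0].message.tool_calls)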
🔤

Text Embeddings

1024-dimensional vectors using BAAI/bge-m3. Perfect for semantic search, clustering, and RAG applications.

/v1/embeddings
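
Because the embeddings endpoint follows the OpenAI schema, the standard SDK call works as-is. A minimal sketch follows; the model identifier "bge-m3" is an assumption, so check /v1/models for the exact ID.

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.solidrust.ai/v1",
    api_key="your-api-key"
)

# Embed a small batch of texts; model ID assumed, see /v1/models
response = client.embeddings.create(
    model="bge-m3",
    input=["semantic search example", "retrieval-augmented generation"]
)

vectors = [item.embedding for item in response.data]
print(len(vectors), len(vectors[0]))  # expect 2 vectors, 1024 dimensions each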
🔍

RAG Pipeline

Built-in retrieval-augmented generation. Semantic search, keyword search, and knowledge graph queries.

/data/v1/query/*
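
The RAG endpoints sit outside the OpenAI surface, so here is a plain-HTTP sketch of a semantic query; the request fields shown (query, top_k) are assumptions about the schema, not documented parameters.

Python
import requests

API_KEY = "your-api-key"

# Semantic search sketch; the body fields (query, top_k) are assumed
resp = requests.post(
    "https://api.solidrust.ai/data/v1/query/semantic",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"query": "How do I rotate my API key?", "top_k": 5},
)
resp.raise_for_status()
print(resp.json())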
🤖

AI Agents

Tool-enabled agents with automatic function execution. Build complex workflows with natural language.

/v1/agent/chat
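
A quick sketch of calling the agent endpoint over plain HTTP, assuming it accepts an OpenAI-style messages array; the exact request schema may differ.

Python
import requests

API_KEY = "your-api-key"

# Agent call sketch; assumes an OpenAI-style messages payload
resp = requests.post(
    "https://api.solidrust.ai/v1/agent/chat",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "messages": [
            {"role": "user", "content": "Summarize our product docs and draft a launch announcement."}
        ]
    },
)
resp.raise_for_status()
print(resp.json())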

Custom Endpoints

Deploy your own backend with dedicated routes. PostgreSQL database, custom APIs, and priority support.

/your-api/*
📊

Usage Analytics

Real-time dashboards, token tracking, and cost monitoring. Export data for your own analysis.

Dashboard
OpenAI Compatible — Use your existing code and SDKs

Simple, powerful API

Get started in minutes with our OpenAI-compatible endpoints. No vendor lock-in.

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.solidrust.ai/v1",
    api_key="your-api-key"
)

# Chat completion with streaming
response = client.chat.completions.create(
    model="qwen3-4b",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
    stream=True
)

for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
JavaScript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.solidrust.ai/v1',
  apiKey: 'your-api-key',
});

// Chat completion with streaming
const stream = await client.chat.completions.create({
  model: 'qwen3-4b',
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

Available Endpoints

Endpoint Description
/v1/chat/completions Chat completions (OpenAI compatible)
/v1/embeddings Text embeddings (1024-dim)
/data/v1/query/semantic Semantic vector search
/data/v1/query/hybrid Hybrid search (vector + keyword + graph)
/v1/agent/chat Tool-enabled AI agent
/v1/models List available models
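
Since the API is OpenAI compatible, /v1/models can be queried with the standard SDK call. A short sketch:

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.solidrust.ai/v1",
    api_key="your-api-key"
)

# Print the IDs of the models the API currently serves
for model in client.models.list():
    print(model.id)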

Simple, transparent pricing

Start free with shared infrastructure, or deploy your own custom backend for premium features.

Standard

Perfect for getting started

Free to start
  • Chat completions API
  • Text embeddings API
  • RAG pipeline access
  • AI agent endpoint
  • 1,000 free requests/month
  • Community support
Start Free

Custom Pro

For production workloads

$199/month
  • Everything in Standard
  • Dedicated database
  • Custom data connectors (2)
  • 100K requests included
  • 20% usage discount
  • Priority Slack support
Contact Sales

Usage-Based Pricing

Endpoint Standard Custom
Chat Completions $0.01 $0.05
Embeddings $0.005 $0.02
RAG Queries $0.02 $0.05
Agent Calls $0.03 $0.08

Custom tier pricing applies to requests routed through your dedicated backend.