AI Inference API
Built for Scale
OpenAI-compatible API for chat completions, embeddings, and RAG. Deploy your own custom endpoints or use our shared infrastructure. Start free, scale infinitely.
```bash
# Chat completion - OpenAI compatible
curl https://api.solidrust.ai/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vllm-primary",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Trusted by innovative teams
Powering real applications
See how teams are building with our inference platform
The vLLM-powered chat API handles thousands of gaming queries daily. Response times are incredible, and the RAG pipeline means our AI assistant always has up-to-date game knowledge.
MyAshes.ai
Gaming AI Assistant
Running our AI Council with per-member tool access was seamless. The agent endpoints let each council member have their own specialized capabilities while sharing the same inference backend.
Aidiant.com
AI Council Platform
Hybrid RAG search transformed our community wiki. Players can ask natural language questions and get instant, accurate answers from our knowledge base.
SolidRusT.net
Zone AI Companion
Everything you need to build with AI
Production-ready APIs with enterprise-grade reliability. OpenAI compatible, so you can switch with zero code changes.
Chat Completions
OpenAI-compatible chat API powered by Qwen3-4B. Streaming support, function calling, and context management.
/v1/chat/completions

Text Embeddings
1024-dimensional vectors using BAAI/bge-m3. Perfect for semantic search, clustering, and RAG applications.
/v1/embeddings

RAG Pipeline
Built-in retrieval-augmented generation. Semantic search, keyword search, and knowledge graph queries.
/data/v1/query/*

AI Agents
Tool-enabled agents with automatic function execution. Build complex workflows with natural language.
/v1/agent/chat

Custom Endpoints
Deploy your own backend with dedicated routes. PostgreSQL database, custom APIs, and priority support.
/your-api/*

Usage Analytics
Real-time dashboards, token tracking, and cost monitoring. Export data for your own analysis.
Dashboard

Simple, powerful API
Get started in minutes with our OpenAI-compatible endpoints. No vendor lock-in.
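The chat endpoint also advertises function calling; since the API is OpenAI-compatible, a request would follow the OpenAI `tools` schema. A minimal sketch of building such a payload; the `get_weather` tool and its parameters are illustrative, not part of this API's documentation:

```python
import json

# Illustrative tool definition in the OpenAI "tools" schema; get_weather
# is a made-up example, not a built-in of this API.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = {
    "model": "vllm-primary",
    "messages": [{"role": "user", "content": "What's the weather in Berlin?"}],
    "tools": [weather_tool],
    "tool_choice": "auto",
}

# This JSON body would be POSTed to /v1/chat/completions with the
# usual Authorization header.
body = json.dumps(payload)
```

When the model decides to call the tool, the response carries a `tool_calls` entry whose arguments your code executes before sending the result back as a `tool` role message.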
Get your API key from console.solidrust.ai
```bash
curl -X POST https://api.solidrust.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "vllm-primary",
    "messages": [{"role": "user", "content": "Say hello and introduce yourself briefly."}],
    "max_tokens": 512,
    "temperature": 0.7
  }'
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.solidrust.ai/v1",
    api_key="your-api-key",
)

# Chat completion with streaming
response = client.chat.completions.create(
    model="vllm-primary",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
    stream=True,
)

for chunk in response:
    # delta.content can be None on the final chunk, so guard it
    print(chunk.choices[0].delta.content or "", end="")
```

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.solidrust.ai/v1',
  apiKey: 'your-api-key',
});

// Chat completion with streaming
const stream = await client.chat.completions.create({
  model: 'vllm-primary',
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
```

Available Endpoints
| Endpoint | Description |
|---|---|
| /v1/chat/completions | Chat completions (OpenAI compatible) |
| /v1/embeddings | Text embeddings (1024-dim) |
| /data/v1/query/semantic | Semantic vector search |
| /data/v1/query/hybrid | Hybrid search (vector + keyword + graph) |
| /v1/agent/chat | Tool-enabled AI agent |
| /v1/models | List available models |
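The /v1/embeddings endpoint returns 1024-dimensional bge-m3 vectors, and ranking by cosine similarity is the standard way to turn those vectors into semantic search. A runnable sketch with tiny stand-in vectors; real ones would come from the API:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Stand-in 3-dim vectors for illustration; in practice each would be a
# 1024-dim embedding returned by POST /v1/embeddings.
corpus = {
    "boss spawn timers": [0.9, 0.1, 0.0],
    "crafting recipes": [0.1, 0.8, 0.2],
    "server maintenance": [0.0, 0.2, 0.9],
}
query_vec = [0.85, 0.15, 0.05]

# Rank documents by similarity to the query vector.
ranked = sorted(
    corpus,
    key=lambda doc: cosine_similarity(query_vec, corpus[doc]),
    reverse=True,
)
print(ranked[0])  # → boss spawn timers
```

The hosted /data/v1/query/semantic endpoint performs this ranking server-side over your indexed documents, so the local math here is only needed if you manage your own vector store.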
Simple, transparent pricing
Start free with shared infrastructure, or deploy your own custom backend for premium features.
Standard
Perfect for getting started
- Chat completions API
- Text embeddings API
- RAG pipeline access
- AI agent endpoint
- 1,000 free requests/month
- Community support
Custom Starter
Your own dedicated backend
- Everything in Standard
- Custom API routes
- Dedicated K8s namespace
- PostgreSQL database
- Redis cache allocation
- Email support
Custom Pro
For production workloads
- Everything in Starter
- Dedicated database
- Custom data connectors (2)
- 100K requests included
- 20% usage discount
- Priority Slack support
Usage-Based Pricing
| Endpoint | Standard | Custom |
|---|---|---|
| Chat Completions | $0.01 | $0.05 |
| Embeddings | $0.005 | $0.02 |
| RAG Queries | $0.02 | $0.05 |
| Agent Calls | $0.03 | $0.08 |
Custom tier pricing applies to requests routed through your dedicated backend.
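For rough budgeting, the table's rates can be plugged into a small estimator. A sketch assuming the listed prices are per request (the billing unit is an assumption, not stated in the table) and applying the Custom Pro details above: 100K included requests and the 20% usage discount. Deducting the included requests from chat traffic first is also an assumption made for illustration.

```python
# Assumed per-request Custom-tier rates taken from the table above
# (the per-request unit is an assumption).
CUSTOM_RATES = {"chat": 0.05, "embeddings": 0.02, "rag": 0.05, "agent": 0.08}

def estimate_custom_pro_bill(usage, included=100_000, discount=0.20):
    """Estimate a monthly Custom Pro bill from request counts.

    usage: dict mapping endpoint kind -> request count. Included
    requests are deducted from chat traffic first (an illustrative
    assumption); the 20% usage discount applies to the remainder.
    """
    billable_chat = max(usage.get("chat", 0) - included, 0)
    total = billable_chat * CUSTOM_RATES["chat"]
    for kind in ("embeddings", "rag", "agent"):
        total += usage.get(kind, 0) * CUSTOM_RATES[kind]
    return round(total * (1 - discount), 2)

# 150K chat + 50K embeddings + 10K RAG requests in a month:
bill = estimate_custom_pro_bill({"chat": 150_000, "embeddings": 50_000, "rag": 10_000})
print(bill)  # → 3200.0
```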