AI Inference API
Built for Scale
OpenAI-compatible API for chat completions, embeddings, and RAG. Deploy your own custom endpoints or use our shared infrastructure. Start free, scale infinitely.
```bash
# Chat completion - OpenAI compatible
curl https://api.solidrust.ai/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-4b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Trusted by innovative teams
Everything you need to build with AI
Production-ready APIs with enterprise-grade reliability. OpenAI compatible, so you can switch with zero code changes.
Chat Completions
OpenAI-compatible chat API powered by Qwen3-4B. Streaming support, function calling, and context management.
`/v1/chat/completions`

Text Embeddings
1024-dimensional vectors using BAAI/bge-m3. Perfect for semantic search, clustering, and RAG applications.
`/v1/embeddings`

RAG Pipeline
Built-in retrieval-augmented generation. Semantic search, keyword search, and knowledge graph queries.
`/data/v1/query/*`

AI Agents
Tool-enabled agents with automatic function execution. Build complex workflows with natural language.
`/v1/agent/chat`

Custom Endpoints
Deploy your own backend with dedicated routes. PostgreSQL database, custom APIs, and priority support.
`/your-api/*`

Usage Analytics
Real-time dashboards, token tracking, and cost monitoring. Export data for your own analysis.
Dashboard

Simple, powerful API
Get started in minutes with our OpenAI-compatible endpoints. No vendor lock-in.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.solidrust.ai/v1",
    api_key="your-api-key",
)

# Chat completion with streaming
response = client.chat.completions.create(
    model="qwen3-4b",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
    stream=True,
)

for chunk in response:
    # delta.content is None on the final chunk, so fall back to ""
    print(chunk.choices[0].delta.content or "", end="")
```

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.solidrust.ai/v1',
  apiKey: 'your-api-key',
});

// Chat completion with streaming
const stream = await client.chat.completions.create({
  model: 'qwen3-4b',
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
```

Available Endpoints
| Endpoint | Description |
|---|---|
| `/v1/chat/completions` | Chat completions (OpenAI compatible) |
| `/v1/embeddings` | Text embeddings (1024-dim) |
| `/data/v1/query/semantic` | Semantic vector search |
| `/data/v1/query/hybrid` | Hybrid search (vector + keyword + graph) |
| `/v1/agent/chat` | Tool-enabled AI agent |
| `/v1/models` | List available models |
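Because `/v1/embeddings` is OpenAI-compatible, it can be called with nothing but the standard library. The sketch below makes two assumptions not stated on this page: the model identifier `bge-m3` and the exact response shape (vectors under `data[i].embedding`, as in the OpenAI schema). List `/v1/models` to confirm the name your deployment exposes.

```python
import json
import urllib.request

API_BASE = "https://api.solidrust.ai/v1"

def build_embeddings_payload(texts, model="bge-m3"):
    # OpenAI-compatible embeddings body: a model name plus a string
    # or list of strings under "input".
    # NOTE: "bge-m3" is an assumed identifier; check /v1/models.
    return {"model": model, "input": list(texts)}

def embed(texts, api_key, model="bge-m3"):
    """POST to /v1/embeddings and return one 1024-dim vector per input."""
    req = urllib.request.Request(
        f"{API_BASE}/embeddings",
        data=json.dumps(build_embeddings_payload(texts, model)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses carry vectors under data[i].embedding,
    # in the same order as the inputs.
    return [item["embedding"] for item in body["data"]]
```

With a key in hand, `embed(["search query"], api_key)` returns vectors ready for cosine-similarity search or a RAG index.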
Simple, transparent pricing
Start free with shared infrastructure, or deploy your own custom backend for premium features.
Standard
Perfect for getting started
- Chat completions API
- Text embeddings API
- RAG pipeline access
- AI agent endpoint
- 1,000 free requests/month
- Community support
Custom Starter
Your own dedicated backend
- Everything in Standard
- Custom API routes
- Dedicated K8s namespace
- PostgreSQL database
- Redis cache allocation
- Email support
Custom Pro
For production workloads
- Everything in Starter
- Dedicated database
- Custom data connectors (2)
- 100K requests included
- 20% usage discount
- Priority Slack support
Usage-Based Pricing
| Endpoint | Standard | Custom |
|---|---|---|
| Chat Completions | $0.01 | $0.05 |
| Embeddings | $0.005 | $0.02 |
| RAG Queries | $0.02 | $0.05 |
| Agent Calls | $0.03 | $0.08 |
Custom tier pricing applies to requests routed through your dedicated backend.
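The table does not state a billing unit; assuming the figures are per request (consistent with the request counts quoted in the tiers above), a back-of-the-envelope estimate can be scripted. The rate values and the discount handling below are illustrative, not an official billing formula.

```python
# Per-request rates taken from the table above; "per request" is an
# assumption, since the page does not state the billing unit.
RATES = {
    "standard": {"chat": 0.01, "embed": 0.005, "rag": 0.02, "agent": 0.03},
    "custom":   {"chat": 0.05, "embed": 0.02,  "rag": 0.05, "agent": 0.08},
}

def monthly_cost(counts, tier="standard", discount=0.0):
    """Estimate monthly spend in dollars from per-endpoint request counts.

    `discount` models the Custom Pro 20% usage discount (pass 0.20);
    whether it applies to all endpoints is an assumption.
    """
    rates = RATES[tier]
    subtotal = sum(rates[kind] * n for kind, n in counts.items())
    return round(subtotal * (1 - discount), 2)

# Example: 10K chat + 50K embedding requests on the Standard tier.
monthly_cost({"chat": 10_000, "embed": 50_000})  # 100.0 + 250.0 = 350.0
```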