Introduction to Oru-el
Oru-el is a unified cloud platform for LLM inference, GPU compute, observability, and cost management.
Oru-el is a cloud platform that gives you everything you need to build, deploy, and operate AI applications — all from one place.
Instead of stitching together separate services for inference, compute, monitoring, and cost tracking, Oru-el provides a single platform with one API key, one dashboard, and one bill.
What you get
LLM Inference API
Access 100+ models through a single, OpenAI-compatible API. One base URL, one API key — no need to manage multiple accounts or integrations.
- OpenAI SDK compatible — change two lines of code (base URL and API key) and your existing OpenAI integration works with Oru-el
- Chat completions, embeddings, image generation, and text-to-speech — all through the same API
- Tool calling and JSON mode — full support for function calling and structured outputs
- Streaming — server-sent events for real-time token delivery
- Standard and turbo tiers — choose between cost-optimized and latency-optimized routing
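As a rough illustration of the streaming bullet above, the sketch below collects streamed tokens through an OpenAI-SDK-style client. It assumes Oru-el's stream uses the same chunk shape as the OpenAI SDK (`chunk.choices[0].delta.content`), which the compatibility claim suggests but this page does not spell out. `stream_chat` is a hypothetical helper name, and the model ID is borrowed from the SDK example on this page.

```python
def stream_chat(client, prompt, model="llama-4-maverick"):
    """Send one user message with stream=True and return the full reply text.

    `client` is expected to behave like openai.OpenAI (assumption: Oru-el
    streams chunks in the same shape as the OpenAI SDK).
    """
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # ask for server-sent events instead of one final response
    )
    parts = []
    for chunk in stream:
        # Some chunks carry no choices or an empty delta (for example the last one).
        if not chunk.choices:
            continue
        token = chunk.choices[0].delta.content
        if token:
            print(token, end="", flush=True)  # show tokens as they arrive
            parts.append(token)
    print()
    return "".join(parts)
```

With the real SDK, `client` would be the `OpenAI(base_url=..., api_key=...)` object from the compatibility example, and the call would look like `stream_chat(client, "Hello!")`.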
GPU Compute
Rent GPUs for training, fine-tuning, or running custom workloads.
- A100, H100, and more — access high-end GPUs on demand
- SSH access — connect directly to your machine
- Job monitoring — track job progress, resource usage, and costs in real time
- Flexible billing — pay per hour with no long-term commitments
Observability
Full visibility into your LLM usage without any extra instrumentation.
- Traces — every API call is automatically logged with request, response, latency, and cost
- Sessions — group related traces into conversations or workflows
- Evaluations — score outputs with manual ratings or automated LLM-as-judge evaluations
- Analytics — visualize usage patterns, model performance, and cost trends
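To make the relationship between traces and sessions concrete, here is a minimal local sketch of grouping per-call records into per-session totals. The `Trace` fields and the `summarize_sessions` helper are hypothetical names invented for illustration. They are not Oru-el's trace schema or API, which this page does not describe.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trace:
    """One logged API call (hypothetical fields, not Oru-el's schema)."""
    session_id: str
    model: str
    latency_ms: float
    cost_usd: float


def summarize_sessions(traces):
    """Group traces by session and total their call count, latency, and cost."""
    summary = defaultdict(lambda: {"calls": 0, "latency_ms": 0.0, "cost_usd": 0.0})
    for t in traces:
        s = summary[t.session_id]
        s["calls"] += 1
        s["latency_ms"] += t.latency_ms
        s["cost_usd"] += t.cost_usd
    return dict(summary)
```

The same grouping idea applies to any other key on a trace, such as the model name.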
Prompt Management
Version-controlled prompt templates for your team.
- Versioned templates — track changes to prompts over time
- Variables — use `{{variable}}` syntax for dynamic content
- Team collaboration — share and iterate on prompts across your organization
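The `{{variable}}` syntax can be pictured as plain placeholder substitution. The sketch below is a local illustration only, not Oru-el's renderer: `render_prompt` is a hypothetical helper, and both the tolerance for spaces inside the braces and the error on a missing variable are assumptions made for this example.

```python
import re

# Matches {{name}}, allowing optional spaces inside the braces (an assumption).
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template, variables):
    """Replace each {{name}} placeholder with variables[name].

    Raises KeyError if the template uses a variable that was not supplied,
    so a missing value fails loudly instead of leaking "{{name}}" to the model.
    """
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing prompt variable: {name}")
        return str(variables[name])

    return _PLACEHOLDER.sub(substitute, template)
```

For example, `render_prompt("Summarize {{doc}} in {{n}} words.", {"doc": "the report", "n": 50})` fills both placeholders before the text is sent as a message.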
FinOps
Control your AI spending with built-in cost management.
- Cost analytics — per-model, per-user, and per-project cost breakdowns
- Budgets — set monthly spending limits with hard or soft enforcement
- Anomaly detection — get alerted when spending patterns change unexpectedly
- Usage patterns — understand peak usage times and optimize costs
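One way to read "hard or soft enforcement" is that a hard budget rejects requests once the limit is reached, while a soft budget lets them through and raises a warning. The sketch below encodes that reading. It is an assumption about the semantics, and `Budget` and `check_spend` are hypothetical names rather than part of Oru-el's API.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Monthly spending limit (hypothetical model, not Oru-el's API)."""
    monthly_limit_usd: float
    hard: bool  # True: block requests over the limit; False: allow but warn


def check_spend(budget, spent_usd, request_cost_usd):
    """Return (allowed, warning) for a request that would cost request_cost_usd."""
    projected = spent_usd + request_cost_usd
    if projected <= budget.monthly_limit_usd:
        return True, None
    message = f"budget exceeded: {projected:.2f} > {budget.monthly_limit_usd:.2f} USD"
    # Hard enforcement rejects the request; soft enforcement allows it with a warning.
    return (not budget.hard), message
```

Under this reading, the same overspend produces `(False, message)` for a hard budget and `(True, message)` for a soft one.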
Platform capabilities at a glance
| Capability | What it covers |
|---|---|
| Inference API | Chat completions, embeddings, image generation, TTS |
| Model catalog | 100+ models across chat, reasoning, code, vision, and more |
| SDK compatibility | Works with OpenAI Python and JavaScript SDKs |
| GPU compute | On-demand GPU rentals with SSH access |
| Tracing | Automatic logging of every API call |
| Evaluations | Manual scoring and LLM-as-judge automation |
| Analytics | Usage, cost, and performance dashboards |
| Prompt management | Versioned templates with variables |
| Budgets | Monthly limits, hourly rate caps, anomaly alerts |
| Wallet | Pre-paid balance with transaction history |
How it works
- Sign up for an account at oru-el.com
- Add funds to your wallet — all usage is billed against your pre-paid balance
- Create an API key in Settings and start making requests
- Monitor usage in the dashboard — traces, costs, and analytics are tracked automatically
OpenAI SDK compatibility
If you already use the OpenAI SDK, switching to Oru-el takes two lines:
```python
from openai import OpenAI

# The only two changes from a stock OpenAI setup: base_url and api_key.
client = OpenAI(
    base_url="https://api.oru-el.com/v1/inference",
    api_key="oruel_your_api_key_here",
)

response = client.chat.completions.create(
    model="llama-4-maverick",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
No SDK changes or wrapper libraries are needed; the rest of your OpenAI code stays the same.
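To keep the key out of source code, the two settings can be read from the environment before the client is built. This is a minimal sketch: `ORUEL_API_KEY` and `oruel_client_kwargs` are hypothetical names chosen for this example, and only the base URL comes from the example above.

```python
import os

ORUEL_BASE_URL = "https://api.oru-el.com/v1/inference"  # base URL from the example above


def oruel_client_kwargs(env=os.environ):
    """Build the two keyword arguments that point the OpenAI SDK at Oru-el.

    ORUEL_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    api_key = env.get("ORUEL_API_KEY")
    if not api_key:
        raise RuntimeError("set ORUEL_API_KEY before creating the client")
    return {"base_url": ORUEL_BASE_URL, "api_key": api_key}
```

The client would then be created with `OpenAI(**oruel_client_kwargs())`, leaving every other call unchanged.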
Next steps
- Quickstart — make your first API call in under 5 minutes
- Authentication — learn about API keys and auth
- Models — browse the full model catalog
- Pricing — understand the cost model