Introduction to Oru-el

Oru-el is a unified cloud platform for LLM inference, GPU compute, observability, and cost management.

Oru-el is a cloud platform for building, deploying, and operating AI applications from one place.

Instead of stitching together separate services for inference, compute, monitoring, and cost tracking, Oru-el provides a single platform with one API key, one dashboard, and one bill.

What you get

LLM Inference API

Access 100+ models through a single, OpenAI-compatible API. One base URL, one API key — no need to manage multiple accounts or integrations.

OpenAI SDK compatible — change two lines of code (base URL and API key) and your existing OpenAI integration works with Oru-el
Chat completions, embeddings, image generation, and text-to-speech — all through the same API
Tool calling and JSON mode — full support for function calling and structured outputs
Streaming — server-sent events for real-time token delivery
Standard and turbo tiers — choose between cost-optimized and latency-optimized routing
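
As a rough illustration of the streaming bullet above, the sketch below assumes Oru-el streams chunks in the same shape the OpenAI Python SDK exposes (`chunk.choices[0].delta.content`). The compatibility claim suggests this, but this page does not spell it out. `stream_reply` is a hypothetical helper, not part of any SDK; it takes an already-configured client as an argument.

```python
from typing import Any, Callable, Optional


def stream_reply(
    client: Any,
    prompt: str,
    model: str = "llama-4-maverick",
    on_token: Optional[Callable[[str], None]] = None,
) -> str:
    """Request a streamed chat completion and return the full text.

    `client` is expected to be an OpenAI SDK client pointed at Oru-el.
    """
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # ask for incremental chunks instead of one response
    )
    parts = []
    for chunk in stream:
        # Some chunks (for example a trailing usage chunk) carry no choices.
        if not chunk.choices:
            continue
        token = chunk.choices[0].delta.content
        if token:  # role-only and finish chunks have no content
            parts.append(token)
            if on_token is not None:
                on_token(token)
    return "".join(parts)
```

A typical call would be `stream_reply(client, "Hello!", on_token=lambda t: print(t, end="", flush=True))`, which prints tokens as they arrive and still returns the assembled reply.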

GPU Compute

Rent GPUs for training, fine-tuning, or running custom workloads.

A100, H100, and more — access high-end GPUs on demand
SSH access — connect directly to your machine
Job monitoring — track job progress, resource usage, and costs in real time
Flexible billing — pay per hour with no long-term commitments

Observability

Full visibility into your LLM usage without any extra instrumentation.

Traces — every API call is automatically logged with request, response, latency, and cost
Sessions — group related traces into conversations or workflows
Evaluations — score outputs with manual ratings or automated LLM-as-judge evaluations
Analytics — visualize usage patterns, model performance, and cost trends
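
To make the relationship between traces and sessions concrete, here is a minimal local sketch of grouping trace records by session and totalling their latency and cost. The `Trace` fields and the `summarize_sessions` helper are hypothetical, loosely based on what the list above says is logged; they are not Oru-el's actual schema or API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trace:
    # Hypothetical fields; the real trace schema is not described here.
    session_id: str
    model: str
    latency_ms: float
    cost_usd: float


def summarize_sessions(traces):
    """Group traces by session and total their call count, latency, and cost."""
    summary = defaultdict(lambda: {"calls": 0, "latency_ms": 0.0, "cost_usd": 0.0})
    for trace in traces:
        entry = summary[trace.session_id]
        entry["calls"] += 1
        entry["latency_ms"] += trace.latency_ms
        entry["cost_usd"] += trace.cost_usd
    return dict(summary)
```

Each session ends up with one row of totals, which is roughly the kind of roll-up a per-conversation view would need.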

Prompt Management

Version-controlled prompt templates for your team.

Versioned templates — track changes to prompts over time
Variables — use {{variable}} syntax for dynamic content
Team collaboration — share and iterate on prompts across your organization
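
The `{{variable}}` syntax mentioned above can be pictured with a small local renderer. This is only a sketch: the helper name `render_prompt`, the tolerance for whitespace inside the braces, and the choice to raise on a missing variable are assumptions, not documented Oru-el behavior.

```python
import re

# Matches {{name}}, allowing optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def render_prompt(template: str, variables: dict) -> str:
    """Substitute {{variable}} placeholders; raise KeyError if a value is missing."""

    def replace(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing prompt variable: {name}")
        return str(variables[name])

    return _PLACEHOLDER.sub(replace, template)


template = "Summarize this {{doc_type}} in {{ language }}."
prompt = render_prompt(template, {"doc_type": "report", "language": "English"})
```

Failing loudly on a missing variable is a design choice made for this sketch; a platform could equally leave the placeholder untouched or substitute an empty string.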

FinOps

Control your AI spending with built-in cost management.

Cost analytics — per-model, per-user, and per-project cost breakdowns
Budgets — set monthly spending limits with hard or soft enforcement
Anomaly detection — get alerted when spending patterns change unexpectedly
Usage patterns — understand peak usage times and optimize costs
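
The difference between hard and soft budget enforcement can be sketched as follows. The semantics here are an assumption made for illustration (hard rejects a request that would exceed the limit, soft lets it through but raises an alert); this page does not define them, and `Budget`, `Enforcement`, and `check_request` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Enforcement(Enum):
    HARD = "hard"  # assumed: requests are rejected once the limit would be exceeded
    SOFT = "soft"  # assumed: requests continue, but an alert is raised


@dataclass
class Budget:
    monthly_limit_usd: float
    enforcement: Enforcement


def check_request(budget: Budget, spent_usd: float, request_cost_usd: float) -> Tuple[bool, bool]:
    """Return (allowed, alert) for a request that would cost `request_cost_usd`."""
    over_limit = spent_usd + request_cost_usd > budget.monthly_limit_usd
    if not over_limit:
        return True, False
    if budget.enforcement is Enforcement.HARD:
        return False, True  # block the request and alert
    return True, True  # allow the request but alert
```

With a 100 USD hard budget and 99.50 USD already spent, a 1 USD request would be blocked; under a soft budget the same request would go through with an alert.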

Platform capabilities at a glance

Capability	What it covers
Inference API	Chat completions, embeddings, image generation, TTS
Model catalog	100+ models across chat, reasoning, code, vision, and more
SDK compatibility	Works with OpenAI Python and JavaScript SDKs
GPU compute	On-demand GPU rentals with SSH access
Tracing	Automatic logging of every API call
Evaluations	Manual scoring and LLM-as-judge automation
Analytics	Usage, cost, and performance dashboards
Prompt management	Versioned templates with variables
Budgets	Monthly limits, hourly rate caps, anomaly alerts
Wallet	Pre-paid balance with transaction history

How it works

1. Sign up at oru-el.com to create an account.
2. Add funds to your wallet. All usage is billed against your pre-paid balance.
3. Create an API key in Settings and start making requests.
4. Monitor usage in the dashboard. Traces, costs, and analytics are tracked automatically.

OpenAI SDK compatibility

If you already use the OpenAI SDK, switching to Oru-el takes two lines:

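from openai import OpenAI

# The only two changes from a stock OpenAI setup: base_url and api_key.
client = OpenAI(
    base_url="https://api.oru-el.com/v1/inference",
    api_key="oruel_your_api_key_here",  # replace with a key created in Settings
)

# Everything below is the standard OpenAI chat completions call.
response = client.chat.completions.create(
    model="llama-4-maverick",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)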

The SDK itself is unchanged and no wrapper library is needed: only the base URL and API key differ.
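
In practice you may prefer not to hard-code the key. A minimal sketch, assuming the key is stored in an environment variable (the name `ORUEL_API_KEY` and the helper `oruel_client_kwargs` are hypothetical, chosen for this example), builds the two arguments once so they can be passed to `OpenAI(...)`:

```python
import os

# Base URL taken from the example above.
ORUEL_BASE_URL = "https://api.oru-el.com/v1/inference"


def oruel_client_kwargs(env=os.environ) -> dict:
    """Build keyword arguments for OpenAI(...) from the environment."""
    api_key = env.get("ORUEL_API_KEY")  # hypothetical variable name
    if not api_key:
        raise RuntimeError("set ORUEL_API_KEY before creating the client")
    return {"base_url": ORUEL_BASE_URL, "api_key": api_key}


# Intended use, with the OpenAI SDK installed:
#   from openai import OpenAI
#   client = OpenAI(**oruel_client_kwargs())
```

Keeping both values in one helper means the rest of the application code stays identical to a stock OpenAI integration.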

Next steps

Quickstart — make your first API call in under 5 minutes
Authentication — learn about API keys and auth
Models — browse the full model catalog
Pricing — understand the cost model