Introduction to Oru-el

Oru-el is a unified cloud platform for LLM inference, GPU compute, observability, and cost management.

Oru-el is a cloud platform for building, deploying, and operating AI applications from one place.

Instead of stitching together separate services for inference, compute, monitoring, and cost tracking, Oru-el provides a single platform with one API key, one dashboard, and one bill.

What you get

LLM Inference API

Access 100+ models through a single, OpenAI-compatible API. One base URL, one API key — no need to manage multiple accounts or integrations.

  • OpenAI SDK compatible — change two lines of code (base URL and API key) and your existing OpenAI integration works with Oru-el
  • Chat completions, embeddings, image generation, and text-to-speech — all through the same API
  • Tool calling and JSON mode — full support for function calling and structured outputs
  • Streaming — server-sent events for real-time token delivery
  • Standard and turbo tiers — choose between cost-optimized and latency-optimized routing
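
To make the streaming bullet above concrete, here is a minimal sketch of how server-sent events can be reassembled into text. It assumes Oru-el emits the same chunk shape as OpenAI's streaming chat completions (`data: {json}` lines carrying `choices[0].delta.content`, ending with `data: [DONE]`). The function name and the sample lines are made up for illustration, and the OpenAI SDK normally does this parsing for you when you pass `stream=True`.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        # SSE payload lines start with "data:"; skip blanks and other fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # end-of-stream sentinel used by OpenAI
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hypothetical sample stream, for illustration only.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

Yielding fragments as they arrive, rather than returning one string, lets a caller display tokens in real time.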

GPU Compute

Rent GPUs for training, fine-tuning, or running custom workloads.

  • A100, H100, and more — access high-end GPUs on demand
  • SSH access — connect directly to your machine
  • Job monitoring — track job progress, resource usage, and costs in real time
  • Flexible billing — pay per hour with no long-term commitments

Observability

Full visibility into your LLM usage without any extra instrumentation.

  • Traces — every API call is automatically logged with request, response, latency, and cost
  • Sessions — group related traces into conversations or workflows
  • Evaluations — score outputs with manual ratings or automated LLM-as-judge evaluations
  • Analytics — visualize usage patterns, model performance, and cost trends
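
The relationship between traces and sessions can be pictured with a small data model. This is only a sketch: the `Trace` fields and the `summarize_sessions` helper are hypothetical names chosen to mirror the attributes listed above, not Oru-el's actual schema or API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trace:
    # Hypothetical fields mirroring the attributes listed above.
    session_id: str
    model: str
    latency_ms: float
    cost: float


def summarize_sessions(traces):
    """Group traces by session; report call count, total cost, mean latency."""
    grouped = defaultdict(list)
    for t in traces:
        grouped[t.session_id].append(t)
    return {
        sid: {
            "calls": len(ts),
            "cost": round(sum(t.cost for t in ts), 6),
            "avg_latency_ms": sum(t.latency_ms for t in ts) / len(ts),
        }
        for sid, ts in grouped.items()
    }


# Made-up sample data, for illustration only.
traces = [
    Trace("chat-1", "llama-4-maverick", 400.0, 0.002),
    Trace("chat-1", "llama-4-maverick", 600.0, 0.003),
    Trace("chat-2", "llama-4-maverick", 250.0, 0.001),
]
summary = summarize_sessions(traces)
print(summary["chat-1"])
```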

Prompt Management

Version-controlled prompt templates for your team.

  • Versioned templates — track changes to prompts over time
  • Variables — use {{variable}} syntax for dynamic content
  • Team collaboration — share and iterate on prompts across your organization
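
As a rough illustration of the {{variable}} syntax, the sketch below substitutes values into a template locally. The `render_prompt` helper is hypothetical, and the choices to tolerate whitespace inside the braces and to raise on a missing variable are assumptions. The platform's own renderer may behave differently.

```python
import re

# Matches {{name}}, optionally with spaces inside the braces (an assumption).
_VAR = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template: str, variables: dict) -> str:
    """Replace each {{name}} placeholder; raise if a value is missing."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing prompt variable: {name}")
        return str(variables[name])

    return _VAR.sub(substitute, template)


template = "Summarize the following {{doc_type}} in {{language}}:\n{{text}}"
print(render_prompt(
    template,
    {"doc_type": "email", "language": "English", "text": "Hi team, ..."},
))
```

Failing loudly on a missing variable is a deliberate choice here: a prompt sent with an unfilled placeholder is usually harder to notice than an exception.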

FinOps

Control your AI spending with built-in cost management.

  • Cost analytics — per-model, per-user, and per-project cost breakdowns
  • Budgets — set monthly spending limits with hard or soft enforcement
  • Anomaly detection — get alerted when spending patterns change unexpectedly
  • Usage patterns — understand peak usage times and optimize costs
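
The difference between hard and soft budget enforcement can be sketched as a simple decision function. This reflects one plausible reading (hard limits reject requests, soft limits allow them and raise an alert). The `Budget` class, the `check_request` helper, and the returned labels are all hypothetical, not Oru-el's documented behavior.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    monthly_limit: float
    enforcement: str  # "hard" or "soft", as named in the text above


def check_request(budget: Budget, spent: float, estimated_cost: float) -> str:
    """Return "allow", "warn", or "block" for a request about to be made."""
    if spent + estimated_cost <= budget.monthly_limit:
        return "allow"
    # Over the limit. Assumed interpretation: a hard budget rejects the
    # request, while a soft budget lets it through and triggers an alert.
    return "block" if budget.enforcement == "hard" else "warn"


hard = Budget(monthly_limit=100.0, enforcement="hard")
soft = Budget(monthly_limit=100.0, enforcement="soft")
print(check_request(hard, spent=99.75, estimated_cost=0.5))
print(check_request(soft, spent=99.75, estimated_cost=0.5))
```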

Platform capabilities at a glance

Capability           What it covers
-------------------  ----------------------------------------------------------
Inference API        Chat completions, embeddings, image generation, TTS
Model catalog        100+ models across chat, reasoning, code, vision, and more
SDK compatibility    Works with OpenAI Python and JavaScript SDKs
GPU compute          On-demand GPU rentals with SSH access
Tracing              Automatic logging of every API call
Evaluations          Manual scoring and LLM-as-judge automation
Analytics            Usage, cost, and performance dashboards
Prompt management    Versioned templates with variables
Budgets              Monthly limits, hourly rate caps, anomaly alerts
Wallet               Pre-paid balance with transaction history

How it works

  1. Sign up at oru-el.com to create an account
  2. Add funds to your wallet — all usage is billed against your pre-paid balance
  3. Create an API key in Settings and start making requests
  4. Monitor usage in the dashboard — traces, costs, and analytics are tracked automatically
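
For readers who want to see what step 3 looks like on the wire, here is a sketch that builds (but does not send) a request using only the standard library. The `/chat/completions` path and `Bearer` authorization header are inferred from OpenAI compatibility, since that is what the OpenAI SDK sends to a custom base URL. The helper name `build_chat_request` is hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.oru-el.com/v1/inference"


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {
        "model": "llama-4-maverick",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",  # path assumed from OpenAI's API
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("oruel_your_api_key_here", "Hello!")
print(req.get_method(), req.full_url)
# Passing req to urllib.request.urlopen would send it; that needs a real key.
```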

OpenAI SDK compatibility

If you already use the OpenAI SDK, switching to Oru-el means changing two lines, the base URL and the API key:

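from openai import OpenAI

# Point the standard OpenAI client at Oru-el: only these two arguments change.
client = OpenAI(
    base_url="https://api.oru-el.com/v1/inference",
    api_key="oruel_your_api_key_here",  # replace with a key created in Settings
)

# Request a chat completion exactly as you would with OpenAI.
response = client.chat.completions.create(
    model="llama-4-maverick",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)  # the assistant's reply text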

No SDK changes or wrapper libraries are required.
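
In practice you may prefer not to hard-code the key. The sketch below reads it from an environment variable and returns the two settings that differ from a stock OpenAI client. The variable name `ORUEL_API_KEY` and the helper `oruel_client_kwargs` are assumptions made for this example, not names defined by Oru-el.

```python
import os

ORUEL_BASE_URL = "https://api.oru-el.com/v1/inference"


def oruel_client_kwargs(env=os.environ) -> dict:
    """Return the two settings that differ from a stock OpenAI client."""
    api_key = env.get("ORUEL_API_KEY")  # hypothetical variable name
    if not api_key:
        raise RuntimeError("set ORUEL_API_KEY before creating the client")
    return {"base_url": ORUEL_BASE_URL, "api_key": api_key}


# Usage with the OpenAI SDK (requires the `openai` package):
#   from openai import OpenAI
#   client = OpenAI(**oruel_client_kwargs())
print(oruel_client_kwargs({"ORUEL_API_KEY": "oruel_your_api_key_here"})["base_url"])
```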

Next steps

  • Quickstart — make your first API call in under 5 minutes
  • Authentication — learn about API keys and auth
  • Models — browse the full model catalog
  • Pricing — understand the cost model