

API & Inference

Hosted model endpoints, HuggingFace Spaces demos, and a unified cloud inference gateway. All APIs follow the OpenAI request schema for drop-in compatibility.

Quickstart

// Example: Chat completion
const res = await fetch("https://api.moddux.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Bearer <your-token>",
  },
  body: JSON.stringify({
    model: "mistralai/Mixtral-8x7B-Instruct-v0.1",
    messages: [{ role: "user", content: "Summarise this log..." }],
  }),
});
const data = await res.json();
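Because the response follows the OpenAI schema, the assistant's reply lives at `choices[0].message.content`. A minimal sketch of reading it (the `response` object below is an illustrative sample, not real API output):

```javascript
// Illustrative OpenAI-compatible chat completion response.
const response = {
  choices: [
    {
      index: 0,
      message: { role: "assistant", content: "Log summary: 3 errors, 1 warning." },
      finish_reason: "stop",
    },
  ],
  usage: { prompt_tokens: 42, completion_tokens: 12, total_tokens: 54 },
};

// Pull the assistant's reply out of the first choice.
const reply = response.choices[0].message.content;
console.log(reply);
```

The same access pattern applies to any endpoint that returns the OpenAI chat schema, which, per the note above, all of these APIs do.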
HOSTING (Planned)

HuggingFace Spaces

Moddux-curated model demos and inference endpoints hosted on HuggingFace Spaces. Each Space provides a public Gradio or Streamlit interface for testing models before integrating them via the Inference API.

Method  Path                          Description
POST    /spaces/moddux/visiondoc      Document OCR + Layout Analysis
POST    /spaces/moddux/osint-ner      OSINT Entity Extraction
POST    /spaces/moddux/code-summary   Code Summarisation (DevOps)

Spaces run on the free tier, so cold-start latency applies; use the Inference API when you need a production SLA.
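Gradio-hosted Spaces conventionally expose a `POST /run/predict` endpoint that accepts positional inputs as `{ data: [...] }`. Whether the Moddux Spaces keep that default, and their exact hostnames, is an assumption — treat this as a sketch, not a confirmed contract:

```javascript
// Sketch: calling a Gradio Space's default REST endpoint.
// Assumes the standard Gradio payload shape { data: [positional inputs] }.
function buildSpacePayload(text) {
  return { data: [text] };
}

async function extractEntities(text) {
  // Hypothetical Space URL — check the actual Space page for its endpoint.
  const res = await fetch("https://moddux-osint-ner.hf.space/run/predict", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildSpacePayload(text)),
  });
  const { data } = await res.json(); // Gradio also wraps outputs in "data"
  return data;
}
```

Once a model checks out in the Space UI, switch to the Inference API below for authenticated, rate-limited production calls.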

INFERENCE (Planned)

Cloud AI Inference

A unified inference gateway that routes requests to the optimal provider — HuggingFace Inference API, OpenAI-compatible endpoints, or self-hosted vLLM — based on cost, latency, and capability requirements.

Method  Path                   Description
POST    /v1/chat/completions   Chat Completion (OpenAI-compatible)
POST    /v1/embeddings         Embeddings
POST    /v1/classify           Text Classification
POST    /v1/extract            Structured Extraction

Authentication via Bearer token. Rate limits apply per plan tier.
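Since the gateway is OpenAI-compatible, an embeddings request should mirror the OpenAI shape (`{ model, input }`). A sketch under that assumption — the model id below is a placeholder, not a confirmed offering:

```javascript
// Sketch: embeddings request against the OpenAI-compatible gateway.
// "BAAI/bge-small-en-v1.5" is a placeholder model id.
function buildEmbeddingsRequest(texts) {
  return {
    model: "BAAI/bge-small-en-v1.5",
    input: texts, // OpenAI schema accepts a string or an array of strings
  };
}

async function embed(texts, token) {
  const res = await fetch("https://api.moddux.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${token}`,
    },
    body: JSON.stringify(buildEmbeddingsRequest(texts)),
  });
  const { data } = await res.json();
  return data.map((d) => d.embedding); // one vector per input, in order
}
```

Batching inputs into a single array keeps you well under per-request rate limits compared with one call per text.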