API & Inference
Hosted model endpoints, HuggingFace Spaces demos, and a unified cloud inference gateway. All APIs follow the OpenAI request schema for drop-in compatibility.
Quickstart
```javascript
// Example: chat completion
const res = await fetch("https://api.moddux.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Bearer <your-token>",
  },
  body: JSON.stringify({
    model: "mistralai/Mixtral-8x7B-Instruct-v0.1",
    messages: [{ role: "user", content: "Summarise this log..." }],
  }),
});
const data = await res.json();
// OpenAI-compatible schema: the reply text is at choices[0].message.content
console.log(data.choices[0].message.content);
```

HOSTING (Planned)
HuggingFace Spaces
Moddux-curated model demos and inference endpoints hosted on HuggingFace Spaces. Each Space provides a public Gradio or Streamlit interface for testing models before integrating them via the Inference API.
| Method | Path | Description |
|---|---|---|
| POST | /spaces/moddux/visiondoc | Document OCR + Layout Analysis |
| POST | /spaces/moddux/osint-ner | OSINT Entity Extraction |
| POST | /spaces/moddux/code-summary | Code Summarisation (DevOps) |
Spaces run on the free tier, so cold-start latency applies. Use the Inference API when you need a production SLA.
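As a minimal sketch, a Space endpoint from the table above can be called with a plain `fetch` POST. The gateway host and the JSON payload shape (`text` field) are assumptions, not documented here; check each Space's interface before integrating:

```javascript
// Build a POST request for a Moddux Space (paths from the table above).
// Assumptions: Spaces are served from the same gateway host, and this
// Space accepts a JSON body with a `text` field.
function spaceRequest(baseUrl, spacePath, payload) {
  return {
    url: `${baseUrl}${spacePath}`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}

const { url, options } = spaceRequest(
  "https://api.moddux.com",
  "/spaces/moddux/osint-ner",
  { text: "Contact John Doe at ACME Corp, Berlin." }
);
// const res = await fetch(url, options); // first call may hit cold-start latency
```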
INFERENCE (Planned)
Cloud AI Inference
A unified inference gateway that routes requests to the optimal provider — HuggingFace Inference API, OpenAI-compatible endpoints, or self-hosted vLLM — based on cost, latency, and capability requirements.
| Method | Path | Description |
|---|---|---|
| POST | /v1/chat/completions | Chat Completion (OpenAI-compatible) |
| POST | /v1/embeddings | Embeddings |
| POST | /v1/classify | Text Classification |
| POST | /v1/extract | Structured Extraction |
Authentication via Bearer token. Rate limits apply per plan tier.
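Since the gateway is OpenAI-compatible, the other routes follow the same request shape as the chat quickstart. Here is a sketch of an embeddings call; the model identifier is a placeholder, and the response layout assumes the OpenAI embeddings schema:

```javascript
// OpenAI-compatible embeddings request body for /v1/embeddings.
// The model name is a placeholder; substitute one your plan supports.
const embeddingsBody = {
  model: "sentence-transformers/all-MiniLM-L6-v2",
  input: ["first passage to embed", "second passage to embed"],
};

async function embed(token, body) {
  const res = await fetch("https://api.moddux.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${token}`, // per-plan rate limits apply
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Embeddings request failed: ${res.status}`);
  const data = await res.json();
  // OpenAI schema: one vector per input, at data.data[i].embedding
  return data.data.map((d) => d.embedding);
}
```

One request can carry a batch of inputs, which is usually cheaper against per-request rate limits than embedding strings one at a time.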