Migrating from OpenAI

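provocapi is a drop-in replacement for the OpenAI API's chat completions, completions, embeddings, and model-listing endpoints. If your code uses the official OpenAI Python or JavaScript SDK, you can switch by changing the client's base URL and API key and updating the model names in your requests.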

What changes

|  | OpenAI | provocapi |
|---|---|---|
| Base URL | https://api.openai.com/v1 | https://inference.provocative.earth/v1 |
| API key prefix | sk- | pk-prov- |
| Models | gpt-4o, text-embedding-3-large | llama-3.1-70b-instruct, bge-m3 |

What doesn't change

  • Request/response format — identical JSON shapes for chat completions, completions, embeddings, and model listing.
  • Streaming — same SSE format, same [DONE] sentinel.
  • Tool/function calling — supported on chat models that have it (Llama 3.1, Qwen 2.5).
  • JSON mode — pass response_format: {"type": "json_object"} and it works via grammar-constrained decoding.
  • Error format — same {"error": {"message": ..., "type": ..., "code": ...}} shape.
  • SDK methods — client.chat.completions.create(), client.embeddings.create(), client.models.list() all work unchanged.
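
To make the streaming and error bullets concrete, here is a minimal sketch of consuming the SSE stream and reading the error shape by hand, using only the standard library. The helper names (iter_sse_chunks, collect_text, describe_error) and the sample data are illustrative assumptions, not part of provocapi or the OpenAI SDK; the SDK does this parsing for you.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from SSE lines, stopping at the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and anything that is not a data field.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_text(chunks: Iterable[dict]) -> str:
    """Concatenate the delta content of streamed chat completion chunks."""
    parts = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


def describe_error(body: dict) -> str:
    """Format an error body of the shape {"error": {"message", "type", "code"}}."""
    err = body.get("error", {})
    return f'{err.get("type")}/{err.get("code")}: {err.get("message")}'


# Hand-written sample lines for illustration, not captured from a real response.
sample_stream = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
    'data: {"choices": [{"delta": {"content": "ignored"}}]}',
]
streamed_text = collect_text(iter_sse_chunks(sample_stream))
print(streamed_text)  # prints "Hello"; the line after [DONE] is never read
```

Because the wire format is the same, a hand-rolled parser like this that already works against OpenAI should not need changes after the switch.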

Python migration

  from openai import OpenAI

  client = OpenAI(
-     # defaults to OPENAI_API_KEY env var
+     base_url="https://inference.provocative.earth/v1",
+     api_key="pk-prov-YOUR-KEY",  # or set OPENAI_API_KEY
  )

  response = client.chat.completions.create(
-     model="gpt-4o",
+     model="llama-3.1-70b-instruct",
      messages=[{"role": "user", "content": "hello"}],
  )

Alternatively, set these environment variables and leave the client construction unchanged (you still need to update the model name in each request):

export OPENAI_BASE_URL=https://inference.provocative.earth/v1
export OPENAI_API_KEY=pk-prov-YOUR-KEY
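
The environment-variable route works because the official SDKs fall back to OPENAI_BASE_URL and OPENAI_API_KEY when no explicit arguments are passed. The sketch below mimics that lookup order so you can check what a given environment would resolve to. resolve_settings is a hypothetical helper written for this page, not the SDK's own code.

```python
import os
from typing import Mapping, Optional

# Default used by the OpenAI SDK when no base URL is configured.
DEFAULT_BASE_URL = "https://api.openai.com/v1"


def resolve_settings(
    base_url: Optional[str] = None,
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Resolve client settings: explicit argument, then env var, then default."""
    env = os.environ if env is None else env
    resolved_url = base_url or env.get("OPENAI_BASE_URL") or DEFAULT_BASE_URL
    resolved_key = api_key or env.get("OPENAI_API_KEY")
    if not resolved_key:
        raise ValueError("no API key: pass api_key or set OPENAI_API_KEY")
    return {"base_url": resolved_url, "api_key": resolved_key}


# Simulated environment matching the two export lines above.
demo_env = {
    "OPENAI_BASE_URL": "https://inference.provocative.earth/v1",
    "OPENAI_API_KEY": "pk-prov-YOUR-KEY",
}
demo_settings = resolve_settings(env=demo_env)
print(demo_settings["base_url"])  # prints the provocapi base URL
```

Note that an explicit base_url or api_key argument in your code takes precedence over the environment, so hard-coded OpenAI values would need to be removed for this approach to take effect.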

JavaScript migration

  import OpenAI from "openai";

  const client = new OpenAI({
+   baseURL: "https://inference.provocative.earth/v1",
+   apiKey: "pk-prov-YOUR-KEY",
  });

Model mapping

Use this table to find the provocapi equivalent of the OpenAI model you're currently using:

| OpenAI model | provocapi equivalent | Notes |
|---|---|---|
| gpt-4o | llama-3.1-70b-instruct | Comparable on most benchmarks, 30-50% cheaper |
| gpt-4o-mini | llama-3.1-8b-instruct | Fast, cheap, good for routing/classification |
| gpt-4-turbo | qwen-2.5-72b-instruct | Strong multilingual and coding |
| gpt-3.5-turbo | mistral-small-3-24b | Mid-tier, good general purpose |
| text-embedding-3-large | bge-m3 | 1024-dim, multilingual |
| text-embedding-3-small | nomic-embed-v1.5 | 768-dim, fast, cheap |
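
If model names are scattered across a codebase, one option is to centralize the mapping in a lookup. The sketch below encodes the table above; MODEL_MAP, EMBEDDING_DIMS, and to_provocapi_model are illustrative names, not something provocapi ships.

```python
# OpenAI model name -> provocapi equivalent, copied from the mapping table.
MODEL_MAP = {
    "gpt-4o": "llama-3.1-70b-instruct",
    "gpt-4o-mini": "llama-3.1-8b-instruct",
    "gpt-4-turbo": "qwen-2.5-72b-instruct",
    "gpt-3.5-turbo": "mistral-small-3-24b",
    "text-embedding-3-large": "bge-m3",
    "text-embedding-3-small": "nomic-embed-v1.5",
}

# Output dimensions of the provocapi embedding models, per the table.
EMBEDDING_DIMS = {"bge-m3": 1024, "nomic-embed-v1.5": 768}


def to_provocapi_model(openai_model: str) -> str:
    """Return the provocapi equivalent, failing loudly for unmapped models."""
    try:
        return MODEL_MAP[openai_model]
    except KeyError:
        raise ValueError(
            f"no provocapi equivalent listed for {openai_model!r}"
        ) from None


print(to_provocapi_model("gpt-4o-mini"))  # prints "llama-3.1-8b-instruct"
```

Failing on unknown names, rather than passing them through, surfaces any model that was missed during the migration. For embeddings, keep in mind that vectors from different models are not interchangeable and the dimensions differ from OpenAI's defaults (3072 for text-embedding-3-large, 1536 for text-embedding-3-small), so existing vector indexes would need to be re-embedded.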

Features not yet supported

These OpenAI features are not available on provocapi v1. Plan accordingly:

  • Assistants API (threads, runs, file search) — we're inference-only, not a RAG product
  • Image generation (DALL-E) — not on our roadmap
  • Audio (Whisper, TTS) — coming in v1.5
  • Vision (image inputs) — coming in v1.5 with Llama 3.2 Vision
  • Fine-tuning API — bring your own LoRA weights instead (see LoRA Adapters)
  • Moderation endpoint — not provided
  • Realtime API (WebRTC) — not provided
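
Since image and audio inputs are not available in v1, it can help to scan outgoing chat requests before switching traffic. The following is a rough preflight sketch that flags non-text content parts in OpenAI-style messages. find_unsupported_inputs is a hypothetical helper, and how provocapi itself responds to such parts is not specified on this page.

```python
from typing import List, Tuple


def find_unsupported_inputs(messages: List[dict]) -> List[Tuple[int, str]]:
    """Return (message_index, part_type) for every content part that is not text.

    OpenAI-style messages carry either a plain string or a list of typed
    parts such as {"type": "text"}, {"type": "image_url"}, {"type": "input_audio"}.
    """
    problems = []
    for index, message in enumerate(messages):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain string (or None) content has no typed parts
        for part in content:
            part_type = part.get("type")
            if part_type != "text":
                problems.append((index, part_type))
    return problems


# Illustrative request: one text-only message and one with an image attached.
example_messages = [
    {"role": "system", "content": "You are terse."},
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this picture?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
        ],
    },
]
flagged = find_unsupported_inputs(example_messages)
print(flagged)  # prints [(1, 'image_url')]
```

Running a check like this over logged requests gives a quick estimate of how much traffic depends on features that have to stay on OpenAI until v1.5.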

Response headers

provocapi adds the following observability headers to every response. The X-Provocapi-* headers have no OpenAI counterpart:

| Header | Description |
|---|---|
| X-Request-Id | Unique request ID for tracing |
| X-Provocapi-Worker | Which backend worker served the request |
| X-Provocapi-Queue-Ms | Time spent in the routing queue |
| X-Provocapi-Model | Resolved model ID |

Your existing code will ignore these (they're custom headers), but they're useful for debugging latency issues.
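
As a starting point for latency debugging, the sketch below pulls these headers out of any header mapping and logs them in a structured form. provocapi_debug_info is a hypothetical helper, the sample values are made up, and it assumes X-Provocapi-Queue-Ms carries a plain number, which this page does not state explicitly.

```python
from typing import Mapping, Optional


def provocapi_debug_info(headers: Mapping[str, str]) -> dict:
    """Extract the provocapi observability headers, matching names case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}

    queue_raw: Optional[str] = lowered.get("x-provocapi-queue-ms")
    try:
        # Assumption: the value is numeric; fall back to None if it is not.
        queue_ms = float(queue_raw) if queue_raw is not None else None
    except ValueError:
        queue_ms = None

    return {
        "request_id": lowered.get("x-request-id"),
        "worker": lowered.get("x-provocapi-worker"),
        "queue_ms": queue_ms,
        "model": lowered.get("x-provocapi-model"),
    }


# Made-up header values for illustration only.
sample_headers = {
    "x-request-id": "req-123",
    "X-Provocapi-Worker": "worker-7",
    "X-Provocapi-Queue-Ms": "12",
    "X-Provocapi-Model": "llama-3.1-70b-instruct",
}
debug_info = provocapi_debug_info(sample_headers)
print(debug_info["queue_ms"])  # prints 12.0
```

With the official Python SDK, response headers should be reachable through the raw-response wrapper, for example client.chat.completions.with_raw_response.create(...), whose result exposes .headers alongside .parse() for the usual response object.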