For authentication and the list of available models, see the Overview.

Endpoint

POST https://api.pipellm.com/v1/chat/completions

Code Examples

curl https://api.pipellm.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PIPELLM_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "max_completion_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Why is the sky blue?"
      }
    ]
  }'

Request Parameters

Parameter     Type      Required  Description
model         string    Yes       Model ID (e.g., gpt-4o, grok-2)
messages      array     Yes       Array of message objects
max_tokens    integer   No        Maximum tokens to generate
temperature   number    No        Sampling temperature (0-2)
stream        boolean   No        Enable streaming response
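
Setting stream to true delivers the response incrementally instead of as one final message. Below is a minimal streaming sketch; it assumes the endpoint is OpenAI-compatible, so the official OpenAI Python SDK can be pointed at PipeLLM's base URL (adjust the client setup if the service ships its own SDK).

import os
from openai import OpenAI

# Assumption: PipeLLM accepts the OpenAI Python SDK via a custom base_url.
client = OpenAI(
    api_key=os.environ["PIPELLM_API_KEY"],
    base_url="https://api.pipellm.com/v1",
)

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,  # deliver tokens as they are generated
)

for chunk in stream:
    # Each chunk carries a delta; content can be None on the final chunk.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)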

Response Format

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The sky appears blue because..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 50,
    "total_tokens": 60
  }
}
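
The fields above can be read straight from the JSON body. A short sketch using plain requests against the documented endpoint (any HTTP client works the same way):

import os
import requests

resp = requests.post(
    "https://api.pipellm.com/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ['PIPELLM_API_KEY']}",
    },
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    },
    timeout=60,
)
resp.raise_for_status()
data = resp.json()

print(data["choices"][0]["message"]["content"])  # assistant reply
print(data["usage"]["total_tokens"])             # tokens used by the call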

Function Calling

Function calling lets a model return structured JSON that describes a function for your code to execute, as shown below.

import os
from openai import OpenAI

# Client setup assumes the OpenAI-compatible Python SDK pointed at PipeLLM's
# base URL; substitute your own client if the service provides a dedicated SDK.
client = OpenAI(
  api_key=os.environ["PIPELLM_API_KEY"],
  base_url="https://api.pipellm.com/v1"
)

tools = [
  {
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the current weather in a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {"type": "string", "description": "City name"}
        },
        "required": ["location"]
      }
    }
  }
]

response = client.chat.completions.create(
  model="gpt-4.1",
  messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
  tools=tools,
  tool_choice="auto"
)

if response.choices[0].message.tool_calls:
  tool_call = response.choices[0].message.tool_calls[0]
  # tool_call.function.name and tool_call.function.arguments (a JSON string)
  # identify the function to run and its inputs; execute it and return the
  # result to the model, as shown in the sketch below.
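
The model only proposes the call; your code runs the function and sends the result back as a tool message so the model can produce a final answer. A sketch of that second round trip, continuing the example above (get_weather here is a hypothetical stand-in for your real implementation):

import json

def get_weather(location: str) -> dict:
    # Hypothetical stand-in; call a real weather service here.
    return {"location": location, "temperature_c": 21, "conditions": "clear"}

msg = response.choices[0].message
if msg.tool_calls:
    tool_call = msg.tool_calls[0]
    args = json.loads(tool_call.function.arguments)
    result = get_weather(**args)

    # Send the assistant's tool call and your result back to the model.
    messages = [
        {"role": "user", "content": "What's the weather in Tokyo?"},
        msg,
        {
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result),
        },
    ]
    final = client.chat.completions.create(
        model="gpt-4.1",
        messages=messages,
        tools=tools,
    )
    print(final.choices[0].message.content)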

See the Function Calling documentation for a complete guide on defining functions and handling responses.