
Chat Endpoint

The Chat endpoint allows you to send encrypted prompts to various LLM models through Cloakr.ai's secure gateway.

Endpoint

POST /v1/chat

Headers

Header            Type    Required  Description
Authorization     string  Yes       Bearer token with your API key
Content-Type      string  Yes       Must be application/json
X-Cloakr-Version  string  No        API version (default: 2024-01-01)

Request

Request Body

Field              Type     Required  Description
model              string   Yes       Model ID (e.g., "gpt-4o", "claude-3-sonnet")
prompt             string   Yes       User prompt (encrypted or plain text)
stream             boolean  No        Enable streaming responses (default: false)
max_tokens         integer  No        Maximum tokens in response (default: model max)
temperature        float    No        Randomness (0.0 to 2.0, default: 1.0)
top_p              float    No        Nucleus sampling (0.0 to 1.0, default: 1.0)
frequency_penalty  float    No        Frequency penalty (-2.0 to 2.0, default: 0.0)
presence_penalty   float    No        Presence penalty (-2.0 to 2.0, default: 0.0)
stop               array    No        Stop sequences
user               string   No        User identifier for audit logging

Example Request

curl -X POST https://api.cloakr.ai/v1/chat \
  -H "Authorization: Bearer $CLOAKR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "prompt": "What is Cloakr.ai?",
    "stream": false,
    "max_tokens": 1000,
    "temperature": 0.7
  }'

JavaScript Example

import { CloakrClient } from '@cloakrai/sdk';

const client = new CloakrClient({
  apiKey: process.env.CLOAKR_API_KEY
});

const response = await client.chat({
  model: 'gpt-4o',
  prompt: 'What is Cloakr.ai?',
  stream: false,
  maxTokens: 1000,
  temperature: 0.7
});

console.log(response.choices[0].text);

Python Example

from cloakrai import CloakrClient
import os

client = CloakrClient(api_key=os.getenv('CLOAKR_API_KEY'))

response = client.chat(
    model='gpt-4o',
    prompt='What is Cloakr.ai?',
    stream=False,
    max_tokens=1000,
    temperature=0.7
)

print(response.choices[0].text)

Response

Success Response

{
  "id": "chat_abc123def456",
  "object": "chat.completion",
  "created": 1640995200,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "text": "Cloakr.ai is a privacy-first enterprise AI gateway...",
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 150,
    "total_tokens": 160
  },
  "cloakr_metadata": {
    "encrypted": true,
    "pii_redacted": true,
    "model_routed": "gpt-4o",
    "processing_time_ms": 1250,
    "audit_id": "audit_xyz789"
  }
}
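
The fields a client typically needs are choices[0].text, usage, and cloakr_metadata. A minimal sketch in plain Python, using an abbreviated copy of the response above (the parsing itself is illustrative, not an SDK method):

```python
import json

# Abbreviated success response from POST /v1/chat (see the full example above)
raw = """
{
  "id": "chat_abc123def456",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [
    {"index": 0, "text": "Cloakr.ai is a privacy-first enterprise AI gateway...", "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 10, "completion_tokens": 150, "total_tokens": 160},
  "cloakr_metadata": {"encrypted": true, "pii_redacted": true, "audit_id": "audit_xyz789"}
}
"""

data = json.loads(raw)

# The generated text lives in choices[0].text
text = data["choices"][0]["text"]

# usage reports billable token counts; cloakr_metadata carries the audit trail
total_tokens = data["usage"]["total_tokens"]
audit_id = data["cloakr_metadata"]["audit_id"]
```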

Streaming Response

When stream: true, responses are returned as Server-Sent Events:

const stream = await client.chat({
  model: 'gpt-4o',
  prompt: 'Write a story about AI security',
  stream: true
});

for await (const chunk of stream) {
  if (chunk.choices[0].delta?.text) {
    process.stdout.write(chunk.choices[0].delta.text);
  }
}

Error Responses

401 Unauthorized

{
  "error": {
    "type": "authentication_error",
    "message": "Invalid API key",
    "code": "invalid_api_key"
  }
}

429 Too Many Requests

{
  "error": {
    "type": "rate_limit_error",
    "message": "Rate limit exceeded",
    "code": "rate_limit_exceeded",
    "retry_after": 60
  }
}

400 Bad Request

{
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid model specified",
    "code": "invalid_model",
    "param": "model"
  }
}
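
All three error shapes share an error.type field, so a client can dispatch on it. A minimal sketch in Python (the return values are illustrative client-side actions, not part of any SDK):

```python
def classify_error(payload: dict) -> str:
    """Map a Cloakr error payload to a client-side action.

    Returns one of: "reauthenticate", "retry", "fix_request", "unknown".
    """
    err = payload.get("error", {})
    err_type = err.get("type")

    if err_type == "authentication_error":    # 401: key is invalid or revoked
        return "reauthenticate"
    if err_type == "rate_limit_error":        # 429: wait err["retry_after"] seconds
        return "retry"
    if err_type == "invalid_request_error":   # 400: err["param"] names the bad field
        return "fix_request"
    return "unknown"
```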

Available Models

OpenAI Models

  • gpt-4o - GPT-4 Omni (latest)
  • gpt-4o-mini - GPT-4 Omni Mini
  • gpt-4-turbo - GPT-4 Turbo
  • gpt-3.5-turbo - GPT-3.5 Turbo

Anthropic Models

  • claude-3-opus - Claude 3 Opus
  • claude-3-sonnet - Claude 3 Sonnet
  • claude-3-haiku - Claude 3 Haiku

Internal Models

  • cloakr-gpt-4 - Cloakr-optimized GPT-4
  • cloakr-claude - Cloakr-optimized Claude

Rate Limits

Plan        Requests per minute  Tokens per minute
Free        60                   10,000
Pro         1,000                100,000
Enterprise  Custom               Custom

Best Practices

  1. Use streaming for long responses to improve user experience
  2. Set appropriate max_tokens to control costs
  3. Implement retry logic with exponential backoff
  4. Monitor usage through the dashboard
  5. Use user IDs for better audit trails
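
Point 3 above can be sketched as a small retry loop. This example honors the retry_after hint from the 429 payload and otherwise backs off exponentially with jitter; send_chat and RateLimitError are stand-ins for your HTTP layer, not SDK names:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for a 429 response; carries the payload's retry_after hint."""

    def __init__(self, retry_after=None):
        super().__init__("Rate limit exceeded")
        self.retry_after = retry_after

def chat_with_retry(send_chat, max_retries: int = 5):
    """Call send_chat(), retrying rate-limit errors with exponential backoff.

    send_chat should return the parsed response body, or raise RateLimitError.
    """
    for attempt in range(max_retries):
        try:
            return send_chat()
        except RateLimitError as err:
            # Prefer the server's retry_after hint; otherwise wait 2^attempt
            # seconds plus jitter so concurrent clients don't retry in lockstep.
            delay = err.retry_after or (2 ** attempt + random.random())
            time.sleep(delay)
    raise RuntimeError("rate limit not cleared after retries")
```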

Next Steps