Switching from OpenAI to ZC Inference Exchange can reduce your LLM costs by up to 70% without restructuring your application code. This guide shows how to point your existing /v1/chat/completions requests at ZC Inference Exchange, keeping the same request format while lowering your expenses. It walks through the necessary steps and gives a before/after code comparison to show how small the change is.
To start, let's review the typical setup for an application using OpenAI's /v1/chat/completions endpoint. The following code snippet demonstrates a basic request to OpenAI's API:
import requests

# Replace the placeholder with your OpenAI API key.
headers = {
    'Authorization': 'Bearer YOUR_OPENAI_API_KEY',
    'Content-Type': 'application/json'
}
# Single-turn chat request; the model must be one that OpenAI serves.
payload = {
    'model': 'gpt-4o-mini',
    'messages': [{'role': 'user', 'content': 'Hello, how are you?'}]
}
response = requests.post('https://api.openai.com/v1/chat/completions', headers=headers, json=payload)
print(response.json())
Now, let's see how the same request can be made to ZC Inference Exchange. The request format stays the same. What changes is the API endpoint URL, the Bearer token (which you can obtain after signing up at https://zcx.zctechnologies.org), and the model name, which must be one that ZC Inference Exchange serves (qwen2.5:32b in this example):
import requests

# Replace the placeholder with your ZC Inference Exchange API key.
headers = {
    'Authorization': 'Bearer YOUR_ZC_INFERENCE_EXCHANGE_API_KEY',
    'Content-Type': 'application/json'
}
# Same payload shape as the OpenAI request.
payload = {
    'model': 'qwen2.5:32b',
    'messages': [{'role': 'user', 'content': 'Hello, how are you?'}]
}
# Only the host changes; the /v1/chat/completions path is unchanged.
response = requests.post('https://zcx.zctechnologies.org/v1/chat/completions', headers=headers, json=payload)
print(response.json())
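Because the two requests differ only in a few values, one way to keep the switch out of your call sites is to move the endpoint and key into configuration. The following is a minimal sketch, not an official client: the PROVIDERS table, the environment variable names OPENAI_API_KEY and ZCX_API_KEY, and the helper names build_chat_request and first_reply are all hypothetical. The first_reply helper assumes the response follows OpenAI's chat completions layout (choices[0].message.content), which this guide implies but does not spell out for ZC Inference Exchange.

```python
import os

# Hypothetical provider table; the environment variable names are
# illustrative and not prescribed by either service.
PROVIDERS = {
    "openai": {
        "base_url": "https://api.openai.com/v1",
        "key_env": "OPENAI_API_KEY",
    },
    "zcx": {
        "base_url": "https://zcx.zctechnologies.org/v1",
        "key_env": "ZCX_API_KEY",
    },
}


def build_chat_request(provider, model, messages, env=None):
    """Return (url, headers, payload) for a /chat/completions call."""
    env = os.environ if env is None else env
    cfg = PROVIDERS[provider]
    url = cfg["base_url"] + "/chat/completions"
    headers = {
        "Authorization": "Bearer " + env.get(cfg["key_env"], ""),
        "Content-Type": "application/json",
    }
    payload = {"model": model, "messages": messages}
    return url, headers, payload


def first_reply(body):
    """Extract the assistant text, assuming an OpenAI-style response body."""
    return body["choices"][0]["message"]["content"]
```

With this in place, a call site would look like url, headers, payload = build_chat_request("zcx", "qwen2.5:32b", messages), followed by requests.post(url, headers=headers, json=payload). Switching providers then means changing the first two arguments rather than editing each request.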
To further illustrate the cost savings, consider the following pricing tiers for ZC Inference Exchange:
These prices are 60-80% lower than OpenAI's cost per million tokens. For detailed pricing, or to sign up for a prepaid LLM credit line, visit our plans page.
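To see what a 60-80% reduction would mean for your own workload, the arithmetic can be sketched as below. The token volume and the baseline price per million tokens are placeholders chosen only to make the calculation concrete. They are not actual OpenAI or ZC Inference Exchange prices, and the function names are hypothetical, so substitute the figures from your own bill and from the plans page.

```python
def token_cost(tokens, price_per_million):
    """Cost of a token volume at a given price per million tokens."""
    return tokens / 1_000_000 * price_per_million


def cost_after_reduction(baseline_cost, reduction):
    """Apply a fractional price reduction (0.60 means 60% cheaper)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return baseline_cost * (1.0 - reduction)


# Placeholder inputs, NOT real prices: 50M tokens per month at a
# baseline of 1.00 currency units per million tokens.
baseline = token_cost(50_000_000, 1.00)
at_60_percent = cost_after_reduction(baseline, 0.60)
at_80_percent = cost_after_reduction(baseline, 0.80)

print(f"baseline: {baseline:.2f}")
print(f"after 60% reduction: {at_60_percent:.2f}")
print(f"after 80% reduction: {at_80_percent:.2f}")
```

Under these placeholder numbers, the two ends of the quoted range leave between one fifth and two fifths of the baseline cost.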
Sign up now for your ZCX LLM credit line and start saving on your LLM costs!