1. Install MLX LM and the OpenAI Python client:

```bash
pip install mlx-lm openai
```
2. Run the MLX LM server with:

```bash
mlx_lm.server
```
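By default the server listens on `localhost:8080` and exposes an OpenAI-compatible API. As a sketch, you can also pin the model and port at startup; the `--model` and `--port` flags below are assumed to be available in your installed mlx-lm version:

```bash
# Assumed flags; check `mlx_lm.server --help` for your version.
mlx_lm.server --model mlx-community/Meta-Llama-3-8B-Instruct-4bit --port 8080
```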
3. Create a Python script (e.g. `test.py`) with the following:
```python
import openai

# Point the client at the local MLX LM server. The server exposes its
# OpenAI-compatible routes under the /v1 prefix; the API key is not
# checked by the server, but the client requires a value.
openai_client = openai.OpenAI(
    api_key="placeholder-api",
    base_url="http://localhost:8080/v1",
)

response = openai_client.chat.completions.create(
    model="mlx-community/Meta-Llama-3-8B-Instruct-4bit",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say this is a test!"},
    ],
)

# Process the response.
print(response.choices[0].message.content)
```

Run the script:

```bash
python test.py
```
4. Curl
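The original request is not shown here; the call below is a minimal sketch of an equivalent curl request against the server's OpenAI-compatible endpoint, with the model and prompt chosen to match the sample response that follows (the exact request fields used to produce it are assumptions):

```bash
# Assumes the server is running on the default localhost:8080.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2-0.5B-Instruct-MLX",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```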
Response:
{ "id": "chatcmpl-fab9cd3b-c56e-4ec8-9d74-b11418ed04f8", "system_fingerprint": "fp_7f85614c-8626-45bd-9e37-570fd21e4062", "object": "chat.completions", "model": "Qwen/Qwen2-0.5B-Instruct-MLX", "created": 1720402910, "choices": [{ "index": 0, "logprobs": { "token_logprobs": [-0.015625, 0.0, -0.015625, -0.015625, 0.0, -0.390625, 0.0, 0.0, 0.0, 0.0], "top_logprobs": [], "tokens": [9707, 0, 2585, 646, 358, 7789, 498, 3351, 30, 151645] }, "finish_reason": "stop", "message": { "role": "assistant", "content": "Hello! How can I assist you today?" } }], "usage": { "prompt_tokens": 21, "completion_tokens": 10, "total_tokens": 31 } }