@mostlygeek
Last active December 2, 2024 19:42
testing llama-swap settings for performance
#
# Send the prompt "write snake game in $lang" to each llama-swap model profile,
# then grep the logs for the latest llama.cpp `eval time` lines.
#
for model in "qwen-coder-32b-q4-nodraft" "qwen-coder-32b-q4" "qwen-coder-32b-q4-w-ctk"; do
  for lang in "python" "typescript" "swift"; do
    echo "Generating Snake Game in $lang using $model"
    # Send the request and discard the response body; only the timing matters here.
    curl -s --url http://localhost:8080/v1/chat/completions -d "{\"messages\": [{\"role\": \"system\", \"content\": \"you only write code.\"}, {\"role\": \"user\", \"content\": \"write snake game in $lang\"}], \"temperature\": 0.1, \"top_k\": 1, \"model\":\"$model\"}" > /dev/null
    # Print the two most recent lines containing 'eval time' (typically prompt eval and generation).
    curl -s --url http://localhost:8080/logs | grep 'eval time' | tail -n 2
    echo ""
  done
done
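
The loop above prints raw `eval time` log lines. To compare profiles by a single number, a small filter along these lines may help. This is a sketch: `extract_tps` is a hypothetical helper name, and the sample log line is an assumed format with made-up values, so adjust the pattern if your llama.cpp build logs differently.

```shell
# Hypothetical helper: print the "tokens per second" figure from each
# 'eval time' line read on stdin. Lines that do not match are dropped.
extract_tps() {
  sed -n 's/.*eval time.*, *\([0-9.][0-9.]*\) tokens per second.*/\1/p'
}

# Assumed log line shape with illustrative values, not a real measurement.
sample='eval time = 1000.00 ms / 50 tokens ( 20.00 ms per token, 50.00 tokens per second)'
printf '%s\n' "$sample" | extract_tps
```

It could be appended to the existing pipeline, for example `curl -s --url http://localhost:8080/logs | grep 'eval time' | tail -n 2 | extract_tps`.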