log2tsv.rs: aichat + llama-cpp + gemma-3n-E4B-it-GGUF

Example of aichat + llama-cpp integration using the local gemma-3n-E4B-it-GGUF model.

First terminal:

✗ grep -B 10 -A 3 gemma-3n ~/.config/aichat/config.yaml
  - type: openai-compatible
    name: local
    api_base: http://localhost:8000/v1
    api_key: ""
    # default_provider: local
    models:
      - name: cogito-v1-preview-qwen
        max_input_tokens: 131072
      - name: Mistral-Small-3.1-24B-Instruct-2503-GGUF
        max_input_tokens: 131072
      - name: gemma-3n-E4B-it-UD-Q5_K_XL.gguf
        max_input_tokens: 131072
      - name: mellum-4b-sft-rust.Q4_K_M.gguf
        max_input_tokens: 131072
✗ aichat --list-models | grep local
local:cogito-v1-preview-qwen
local:Mistral-Small-3.1-24B-Instruct-2503-GGUF
local:gemma-3n-E4B-it-UD-Q5_K_XL.gguf
local:mellum-4b-sft-rust.Q4_K_M.gguf
✗ llama-server -m ~/.cache/huggingface/hub/models--unsloth--gemma-3n-E4B-it-GGUF/snapshots/90fa8b0e431faeae50c305828bc260d6f71720e1/gemma-3n-E4B-it-UD-Q5_K_XL.gguf -c 16384 --temp 1.0 --top-k 64 --top-p 0.95 --min-p 0.0 --host 0.0.0.0 --port 8000 --threads $(nproc)  # https://huggingface.co/unsloth/gemma-3n-E4B-it-GGUF
~...~
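
Before pointing aichat at the server, it can be worth confirming that the OpenAI-compatible endpoint is actually up (llama-server exposes the standard /v1/models route; the port matches the --port 8000 flag above):

✗ curl http://localhost:8000/v1/models
~...~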

Second terminal:

✗ MY_PROMPT=$(echo -e "Help to enhance the Rust code below: what can we optimize, and how can we improve error handling? Also, take care of the following issue:\nthread 'main' panicked at core-rs/examples/log2tsv.rs:123:50:\ncaptures failed: Error { kind: InvalidData, message: \"stream did not contain valid UTF-8\" }\n\n"; cat ./core-rs/examples/log2tsv.rs); time echo "$MY_PROMPT" | aichat -m local:gemma-3n-E4B-it-UD-Q5_K_XL.gguf
~...~
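
Aside: the panic quoted in the prompt is what BufRead::lines() returns when a log line is not valid UTF-8 (an io::Error of kind InvalidData). A minimal sketch of one common fix, independent of whatever the model suggests: read raw bytes with read_until() and convert lossily. The file name and loop body here are placeholders, not the actual log2tsv.rs code:

use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() -> std::io::Result<()> {
    // Hypothetical input path; log2tsv.rs reads its own log file.
    let file = File::open("app.log")?;
    let mut reader = BufReader::new(file);
    let mut buf = Vec::new();
    // read_until() yields raw bytes, so a stray non-UTF-8 byte in the
    // log no longer aborts the run the way BufRead::lines() does.
    while reader.read_until(b'\n', &mut buf)? > 0 {
        // from_utf8_lossy() replaces invalid sequences with U+FFFD
        // instead of returning an InvalidData error.
        let line = String::from_utf8_lossy(&buf);
        // ... apply the regex captures to `line` here ...
        println!("{}", line.trim_end());
        buf.clear();
    }
    Ok(())
}

Since from_utf8_lossy() substitutes U+FFFD for invalid byte sequences, a single corrupt log entry degrades gracefully instead of panicking the whole conversion.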