@disler
Last active April 5, 2025 14:30
Prompt Chaining with QwQ, Qwen, o1-mini, Ollama, and LLM

Here we explore prompt chaining with local reasoning models in combination with base models. With shockingly powerful local models like QwQ and Qwen, we can build prompt chains that tap into their capabilities in an immediately useful, local, private, AND free way.

Explore the idea of building prompt chains where the first step is a powerful reasoning model that generates a response, and a base model then extracts the final answer from that response.
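
As a minimal sketch of this reasoner + extraction pattern, the snippet below chains two stand-in model functions. The names `run_chain`, `fake_reasoner`, and `fake_extractor` are hypothetical and exist only for illustration; in practice each step would call a real model (for example through `ollama run` or `llm -m`), as chain.py and chain.ts do.

```python
from typing import Callable

# One chain step: (prompt template containing {{input}}, model function)
Step = tuple[str, Callable[[str], str]]


def run_chain(initial_input: str, steps: list[Step]) -> str:
    """Feed each step's output into the next step's {{input}} placeholder."""
    result = initial_input
    for template, model in steps:
        prompt = template.replace("{{input}}", result)
        result = model(prompt).strip()
    return result


# Stand-ins for real models (assumptions for this sketch only).
def fake_reasoner(prompt: str) -> str:
    # A reasoning model tends to emit its chain of thought before the answer.
    return "Let me think step by step...\nFinal answer: 42"


def fake_extractor(prompt: str) -> str:
    # A base model is asked to return only the final answer.
    return prompt.rsplit("Final answer:", 1)[-1]


answer = run_chain(
    "What is 6 * 7?",
    [
        ("Solve: {{input}}", fake_reasoner),
        ("Extract the final answer from: {{input}}", fake_extractor),
    ],
)
print(answer)  # 42
```

The chain itself knows nothing about which model runs at each step, which is what makes it easy to swap a local reasoner for a hosted one and compare results.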

Play with the prompts and models to see what works best for your use cases. Use the o1 series to see how QwQ compares.

Setup

  • Bun (to run bun run chain.ts ...)
  • uv (to run uv run python chain.py ...)
  • Ollama (for qwen and qwq)
    • ollama pull qwen2.5-coder:14b
    • ollama pull qwen2.5-coder:32b
    • ollama pull qwq:32b
  • LLM (for o1-mini)

Standalone prompting with ollama and llm

ollama run qwen2.5-coder:14b "ping"
ollama run qwq:32b "ping"
llm -m o1-mini "ping"

Problem: Chain of thought is included in qwq output.
Solution: Use Prompt Chain 'Reasoner' + 'Extraction' pattern

bun run chain.ts "QwQ CANT BE: Local Prompt Chaining with Alibaba's QwQ reasoning model (bun + ollama)" "prompt_title_generation_reasoner.txt,prompt_title_generation_extraction.txt" "ollama:qwq:32b,ollama:qwen2.5-coder:32b"

Example Usage of Prompt Chaining with Chain.ts or Chain.py

Example Command

bun run chain.ts "<starting value>" "<prompt-file-1.txt,prompt-file-2.txt,...>" "<ollama:model-1,llm:model-2,...>"

or

uv run python chain.py "<starting value>" "<prompt-file-1.txt,prompt-file-2.txt,...>" "<ollama:model-1,llm:model-2,...>"
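
To make the argument format concrete, here is a hedged sketch of how the two comma-separated arguments could be paired up and validated. `parse_model_spec` and `parse_chain_args` are hypothetical helpers, not part of chain.py. The one detail worth noticing is that a spec such as `ollama:qwq:32b` contains two colons, so the backend prefix has to be split off at the first colon only.

```python
def parse_model_spec(spec: str) -> tuple[str, str]:
    """Split 'backend:model' on the first colon only."""
    backend, sep, model = spec.partition(":")
    if not sep or backend not in ("ollama", "llm") or not model:
        raise ValueError(f"Unsupported model spec: {spec!r}")
    return backend, model


def parse_chain_args(prompts_arg: str, models_arg: str) -> list[tuple[str, str, str]]:
    """Pair each prompt file with its (backend, model)."""
    prompts = prompts_arg.split(",")
    models = models_arg.split(",")
    if len(prompts) != len(models):
        raise ValueError("Number of prompts and models must match.")
    return [(p, *parse_model_spec(m)) for p, m in zip(prompts, models)]


steps = parse_chain_args(
    "prompt_title_generation_reasoner.txt,prompt_title_generation_extraction.txt",
    "ollama:qwq:32b,llm:gpt-4o-mini",
)
for step in steps:
    print(step)
```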

Single Prompt

bun run chain.ts "start-value-with-bun" "prompt_simple.txt" "ollama:qwen2.5-coder:14b"

or

uv run python chain.py "start-value-with-uv-python" "prompt_simple.txt" "ollama:qwen2.5-coder:14b"

Prompt Chain with 2 Prompts

bun run chain.ts "start-value" "prompt_simple.txt,prompt_simple.txt" "ollama:qwen2.5-coder:14b,ollama:qwen2.5-coder:14b"

or

uv run python chain.py "start-value" "prompt_simple.txt,prompt_simple.txt" "ollama:qwen2.5-coder:14b,ollama:qwen2.5-coder:14b"

Prompt Chain with 3 Prompts

bun run chain.ts "start-value" "prompt_simple.txt,prompt_simple.txt,prompt_simple.txt" "ollama:qwen2.5-coder:14b,ollama:qwen2.5-coder:14b,ollama:qwen2.5-coder:14b"

Prompt Chain with 10 Prompts (Infinite Chain)

bun run chain.ts "start-value" "prompt_simple.txt,prompt_simple.txt,prompt_simple.txt,prompt_simple.txt,prompt_simple.txt,prompt_simple.txt,prompt_simple.txt,prompt_simple.txt,prompt_simple.txt,prompt_simple.txt" "ollama:qwen2.5-coder:14b,ollama:qwen2.5-coder:14b,ollama:qwen2.5-coder:14b,ollama:qwen2.5-coder:14b,ollama:qwen2.5-coder:14b,ollama:qwen2.5-coder:14b,ollama:qwen2.5-coder:14b,ollama:qwen2.5-coder:14b,ollama:qwen2.5-coder:14b,ollama:qwen2.5-coder:14b"
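
Typing ten identical entries by hand is error-prone, so a long chain like this can also be generated. The sketch below builds the same argument list for any length `n`; `build_repeated_chain_command` is a hypothetical helper, and the default runner simply mirrors the `bun run chain.ts` invocation used above.

```python
import shlex


def build_repeated_chain_command(
    start_value: str,
    prompt_file: str,
    model: str,
    n: int,
    runner: tuple[str, ...] = ("bun", "run", "chain.ts"),
) -> list[str]:
    """Build the argv for a chain that repeats one prompt/model pair n times."""
    if n < 1:
        raise ValueError("n must be at least 1")
    prompts = ",".join([prompt_file] * n)
    models = ",".join([model] * n)
    return [*runner, start_value, prompts, models]


cmd = build_repeated_chain_command(
    "start-value", "prompt_simple.txt", "ollama:qwen2.5-coder:14b", 10
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The resulting list can be passed directly to `subprocess.run(cmd)` without going through a shell.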


Reasoning Model + Base Model Combinations

Single Prompt (Reasoning Model Only)

Mermaid Chart Generation

QwQ Reasoning Model

bun run chain.ts "Create a mermaid chart of the software development lifecycle with ai tooling in 5 steps." "prompt_mermaid_reasoner.txt" "ollama:qwq:32b" > output/pc1_mermaid_qwq_32b.txt

o1-mini Reasoning Model

bun run chain.ts "Create a mermaid chart of the software development lifecycle with ai tooling in 5 steps." "prompt_mermaid_reasoner.txt" "llm:o1-mini" > output/pc1_mermaid_o1_mini.txt

Prompt Chain with Two Chain Commands (Reasoning Model + Base Model)

Content Title Generation

QwQ Reasoning Model + qwen2.5-coder:32b Base Model

bun run chain.ts "QwQ CANT BE: Local Prompt Chaining with Alibaba's QwQ reasoning model (bun + ollama)" "prompt_title_generation_reasoner.txt,prompt_title_generation_extraction.txt" "ollama:qwq:32b,ollama:qwen2.5-coder:32b" > output/pc2_title_generation_qwq_32b.txt

o1-mini Reasoning Model + gpt-4o-mini Base Model

bun run chain.ts "QwQ CANT BE: Local Prompt Chaining with Alibaba's QwQ reasoning model (bun + ollama)" "prompt_title_generation_reasoner.txt,prompt_title_generation_extraction.txt" "llm:o1-mini,llm:gpt-4o-mini" > output/pc2_title_generation_o1_mini.txt
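
The extraction prompt in this chain asks for a JSON array of titles, so the final result can be consumed programmatically. Models sometimes wrap the array in extra text, so the sketch below looks for the last parsable JSON array of strings in a piece of text. `extract_last_json_array` is a hypothetical helper and the sample string is made up for illustration; it is not real model output.

```python
import json


def extract_last_json_array(text: str) -> list[str]:
    """Return the last JSON array of strings found in text."""
    decoder = json.JSONDecoder()
    start = text.rfind("[")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            # and ignores whatever follows it.
            value, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, list) and all(isinstance(v, str) for v in value):
            return value
        # Try the previous "[" in the text.
        start = text.rfind("[", 0, start)
    raise ValueError("No JSON array of strings found.")


sample = 'Sure! Here are the titles:\n["Title A", "Title B"]\nLet me know if you want more.'
titles = extract_last_json_array(sample)
print(titles)  # ['Title A', 'Title B']
```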

Prompt Chain with Three Chain Commands (Reasoning Model + Base Model + Base Model)

Architecture Design

QwQ Reasoning Model + qwen2.5-coder:32b Base Model + qwen2.5-coder:32b Base Model

bun run chain.ts "Design a low-latency event betting/gambling platform" "prompt_architecture_design_reasoner.txt,prompt_architecture_design_extraction.txt,prompt_architecture_design_simplifier.txt" "ollama:qwq:32b,ollama:qwen2.5-coder:32b,ollama:qwen2.5-coder:32b" > output/pc3_architecture_design_qwq_32b.txt

o1-mini Reasoning Model + gpt-4o-mini Base Model + gpt-4o-mini Base Model

bun run chain.ts "Design a low-latency event betting/gambling platform" "prompt_architecture_design_reasoner.txt,prompt_architecture_design_extraction.txt,prompt_architecture_design_simplifier.txt" "llm:o1-mini,llm:gpt-4o-mini,llm:gpt-4o-mini" > output/pc3_architecture_design_o1_mini.txt

o1-preview Reasoning Model + gpt-4o Base Model + gpt-4o Base Model

bun run chain.ts "Design a low-latency event betting/gambling platform" "prompt_architecture_design_reasoner.txt,prompt_architecture_design_extraction.txt,prompt_architecture_design_simplifier.txt" "llm:o1-preview,llm:gpt-4o,llm:gpt-4o" > output/pc3_architecture_design_o1_preview.txt

# chain.py: run a chain of prompt files through the Ollama and LLM CLIs.
import subprocess
import sys


def read_and_replace(
    file_path: str, input_str: str, template_replace: str = "{{input}}"
) -> str:
    """Replace input in the prompt file"""
    with open(file_path, "r") as f:
        file_content = f.read()
    return file_content.replace(template_replace, input_str)


def prompt_chain(initial_input: str, prompts: list[str], models: list[str]) -> str:
    """Chain processing of prompts and models"""
    if len(prompts) != len(models):
        raise ValueError("Number of prompts and models must match.")

    result = initial_input
    for i, (prompt_file, model) in enumerate(zip(prompts, models)):
        # Replace {{input}} in the prompt file with the previous result
        replaced_prompt = read_and_replace(prompt_file, result)

        # Log the step before running the model
        print(f"{'⭐️' * (i + 1)} Prompt #{i + 1}")
        print("-" * 32)
        print(
            f"""
🤖 Model: '{model}'
📄 Input file: '{prompt_file}'
📝 Prompt content:
{replaced_prompt}
🎯 Result:
"""
        )

        # Run the model based on prefix
        if model.startswith("ollama:"):
            ollama_model = model.replace("ollama:", "")
            output = subprocess.run(
                ["ollama", "run", ollama_model, replaced_prompt],
                capture_output=True,
                text=True,
                check=True,  # fail loudly instead of chaining an empty result
            )
            result = output.stdout.strip()
        elif model.startswith("llm:"):
            llm_model = model.replace("llm:", "")
            output = subprocess.run(
                ["llm", "-m", llm_model, replaced_prompt],
                capture_output=True,
                text=True,
                check=True,  # fail loudly instead of chaining an empty result
            )
            result = output.stdout.strip()
        else:
            raise ValueError(f"Unsupported model prefix in: {model}")

        print(result)

    return result


def main():
    if len(sys.argv) < 4:
        print(
            'Usage: python chain.py "<first input value>" <prompt1,prompt2,...> <model1,model2,...>'
        )
        sys.exit(1)

    initial_input = sys.argv[1]
    prompt_files = sys.argv[2].split(",")
    models = sys.argv[3].split(",")

    try:
        # Each step's result is printed inside prompt_chain
        prompt_chain(initial_input, prompt_files, models)
    except Exception as e:
        print("Error:", str(e))
        sys.exit(1)


if __name__ == "__main__":
    main()

// @ts-nocheck
// chain.ts: run a chain of prompt files through the Ollama and LLM CLIs.
import { $, ProcessOutput } from "bun";
import * as fs from "fs";

// Replace input in the prompt file
async function readAndReplace(filePath: string, input: string, templateReplace: string = "{{input}}"): Promise<string> {
  const fileContent = fs.readFileSync(filePath, "utf8");
  // split/join replaces every occurrence and, unlike String.replace with a
  // string replacement, does not interpret "$" sequences in the model output
  return fileContent.split(templateReplace).join(input);
}

// Chain processing of prompts and models
async function promptChain(initialInput: string, prompts: string[], models: string[]): Promise<string> {
  if (prompts.length !== models.length) {
    throw new Error("Number of prompts and models must match.");
  }

  let result = initialInput;
  for (let i = 0; i < prompts.length; i++) {
    const promptFile = prompts[i];
    const model = models[i];

    // Replace {{input}} in the prompt file with the previous result
    const replacedPrompt = await readAndReplace(promptFile, result);

    // Log the step before running the model
    console.log(`${'⭐️'.repeat(i + 1)} Prompt #${i + 1}
--------------------------------`);
    console.log(`
🤖 Model: '${model}'
📄 Input file: '${promptFile}'
📝 Prompt content:
${replacedPrompt}
🎯 Result:
`);

    // Run the model based on prefix
    let output: ProcessOutput;
    if (model.startsWith("ollama:")) {
      const ollamaModel = model.replace("ollama:", "");
      output = await $`ollama run ${ollamaModel} "${replacedPrompt}"`;
    } else if (model.startsWith("llm:")) {
      const llmModel = model.replace("llm:", "");
      output = await $`llm -m ${llmModel} ${replacedPrompt}`;
    } else {
      throw new Error(`Unsupported model prefix in: ${model}`);
    }

    result = await output.text().trim(); // Update result with the output of the model
  }

  return result;
}

// Main logic
(async () => {
  // First argument is initial input, second is prompts, third is models
  const [initialInput, promptFilesArg, modelsArg] = process.argv.slice(2);

  // Validate before splitting so missing arguments print the usage message instead of throwing
  if (!initialInput || !promptFilesArg || !modelsArg) {
    console.error("Usage: bun run chain.ts \"<first input value>\" <prompt1,prompt2,...> <model1,model2,...>");
    process.exit(1);
  }

  const promptFiles = promptFilesArg.split(","); // Split prompt files on comma
  const models = modelsArg.split(","); // Split models on comma

  try {
    await promptChain(initialInput, promptFiles, models);
  } catch (error) {
    console.error("Error:", error.message);
    process.exit(1);
  }
})();

<purpose>
Extract the user-input into a detailed markdown document that describes the architecture of the system.
</purpose>
<instructions>
<instruction>Based on user-input, extract the architecture of the system into a detailed markdown document</instruction>
<instruction>Be sure to explain the primary data models and how they are used in the system</instruction>
<instruction>Keep explanations thorough but concise</instruction>
<instruction>Break down the system into major components and explain each one in their own section</instruction>
<instruction>At the beginning of the document, include an explanation of the business value of this solution and then the high level overview of the system</instruction>
</instructions>
<examples>
<example>
<user-request>
Design a basic e-commerce platform
</user-request>
<design-response><![CDATA[
# Architecture Overview
Let's design a scalable e-commerce platform with core functionality.
## Business Value
The system will allow users to browse products, add them to a cart, and complete a purchase which will increase customer satisfaction and drive more sales.
## High Level Overview
The system will consist of a frontend, API gateway, and microservices for products and orders.
## System Components
```mermaid
graph TD
Client["Web Frontend"]
API["API Gateway"]
Auth["Auth Service"]
Products["Product Service"]
Orders["Order Service"]
DB[(Database)]
Cache[(Cache)]
Client --> API
API --> Auth
API --> Products
API --> Orders
Products --> DB
Orders --> DB
Products --> Cache
```
## Component Details
1. Frontend
- React.js SPA for dynamic user experience
- Server-side rendering for SEO
- Trade-off: Added complexity vs better UX
2. API Gateway
- Express.js/Node.js for high throughput
- Handles routing and basic validation
- Trade-off: Additional network hop vs cleaner architecture
3. Services
- Microservices for independent scaling
- Each service owns its data model
- Trade-off: Deployment complexity vs team autonomy
## Key Decisions
- Relational DB for orders (data consistency)
- Redis cache for products (fast reads)
- JWT auth for stateless scaling
]]></design-response>
</example>
</examples>
<user-prompt>
{{input}}
</user-prompt>
<purpose>
You are a world-class software architect and technical leader.
You excel at designing robust, scalable software systems while clearly explaining your reasoning and trade-offs.
</purpose>
<instructions>
<instruction>Based on the user's requirements, design an appropriate software architecture</instruction>
<instruction>Walk through your design process step-by-step, explaining key decisions</instruction>
<instruction>For each major component, explain its core responsibilities, key technologies/frameworks recommended, trade-offs considered, integration points with other components</instruction>
<instruction>Consider and address scalability requirements, security considerations, maintainability, development complexity, cost implications</instruction>
<instruction>Provide clear diagrams using mermaid syntax to visualize the architecture</instruction>
<instruction>Keep explanations thorough but concise</instruction>
<instruction>Focus on practical, proven solutions over bleeding-edge technologies</instruction>
<instruction>Be sure to explain the primary data models and how they are used in the system</instruction>
<instruction>At the beginning of the document, include an explanation of the business value of this solution and then the high level overview of the system</instruction>
</instructions>
<user-prompt>
{{input}}
</user-prompt>
<purpose>
Simplify the architecture of the system based on the user-prompt to the most important data models, components and relationships.
</purpose>
<instructions>
<instruction>Based on user-input, extract the architecture of the system into a detailed markdown document</instruction>
<instruction>Simplify the architecture to the most important data models, components and relationships</instruction>
<instruction>Keep explanations thorough but concise</instruction>
<instruction>Maintain the business value of the system and the high level overview of the system</instruction>
</instructions>
<user-prompt>
{{input}}
</user-prompt>
<purpose>
Extract the final answer/solution from the user-input
</purpose>
<instructions>
<instruction>Only extract the final answer/solution from the user-input</instruction>
<instruction>Do not include any other text or instructions in your response</instruction>
<instruction>Output the code only in a markdown code block</instruction>
</instructions>
<user-input>
{{input}}
</user-input>
<purpose>
Generate code based on the user-input
</purpose>
<instructions>
<instruction>Use the user-input to generate the code</instruction>
<instruction>Generate valid code in the specified language based on the user-input</instruction>
<instruction>Use the function signature as a guide to generate the code</instruction>
<instruction>Don't overthink your implementation details, just focus on the CORE logic</instruction>
</instructions>
<user-input>
{{input}}
</user-input>
<purpose>
Extract the final result of the mermaid chart from the user-prompt.
</purpose>
<instructions>
<instruction>Extract ONLY the final result of the mermaid chart from the user-prompt.</instruction>
<instruction>Be sure to include all nodes and edges.</instruction>
<instruction>We must extract a valid mermaid chart.</instruction>
<instruction>Respond with the mermaid chart only.</instruction>
<instruction>Do not wrap the mermaid chart in markdown code blocks. Respond with the mermaid chart only.</instruction>
</instructions>
<user-prompt>
{{input}}
</user-prompt>
<purpose>
You are a world-class expert at creating mermaid charts.
You follow the instructions perfectly to generate mermaid charts.
</purpose>
<instructions>
<instruction>Based on the user-prompt, create the corresponding mermaid chart.</instruction>
<instruction>Be very precise with the chart, every node and edge must be included.</instruction>
<instruction>Use double quotes for text in the chart</instruction>
<instruction>Respond with the mermaid chart only.</instruction>
<instruction>Do not wrap the mermaid chart in markdown code blocks. Respond with the mermaid chart only.</instruction>
<instruction>If you see a file-content section, use the content to help create the chart.</instruction>
<instruction>Keep node labels short and concise.</instruction>
<instruction>Avoid embedding links in the chart.</instruction>
</instructions>
<examples>
<example>
<user-chart-request>
Create a flowchart that shows A flowing to E. At C, branch out to H and I.
</user-chart-request>
<chart-response><![CDATA[
graph LR;
A
B
C
D
E
H
I
A --> B
B --> C
C --> D
D --> E
C --> H
C --> I
]]></chart-response>
</example>
<example>
<user-chart-request>
Build a pie chart that shows the distribution of Apples: 40, Bananas: 35, Oranges: 25.
</user-chart-request>
<chart-response><![CDATA[
pie title Distribution of Fruits
"Apples" : 40
"Bananas" : 35
"Oranges" : 25
]]></chart-response>
</example>
<example>
<user-chart-request>
State diagram for a traffic light. Still, Moving, Crash.
</user-chart-request>
<chart-response><![CDATA[
stateDiagram-v2
[*] --> Still
Still --> [*]
Still --> Moving
Moving --> Still
Moving --> Crash
Crash --> [*]
]]></chart-response>
</example>
<example>
<user-chart-request>
Create a timeline of major social media platforms from 2002 to 2006.
</user-chart-request>
<chart-response><![CDATA[
timeline
title History of Social Media Platforms
2002 : LinkedIn
2004 : Facebook
: Google
2005 : Youtube
2006 : Twitter
]]></chart-response>
</example>
</examples>
<user-prompt>
{{input}}
</user-prompt>
<purpose>
Suffix the user-input with ' pong'
</purpose>
<user-input>
{{input}}
</user-input>
<purpose>
Extract the titles from the final output of the user-input
</purpose>
<instructions>
<instruction>Create a list of titles from the final output of the user-input</instruction>
<instruction>Extract ONLY the titles, no other text</instruction>
<instruction>Place the titles in a json array that is parsable by JSON.parse()</instruction>
<instruction>Extract titles from the final output of the user-input to create the json array</instruction>
<instruction>Use the example-output as a guide on how to format the json array</instruction>
<instruction>Respond exclusively in json in the output format shown in the example-output</instruction>
</instructions>
<user-input>
{{input}}
</user-input>
<example-output>
[
"title-1",
"title-2",
"title-3"
]
</example-output>
<purpose>
Create eye catching, high SEO, click worthy titles
</purpose>
<instructions>
<instruction>Use the user-input to create relevant titles</instruction>
<instruction>Base your titles on the user-input but be creative with your generations</instruction>
<instruction>The goal is to maximize CTR</instruction>
<instruction>The titles should be unique</instruction>
<instruction>The titles should be between 40-85 characters</instruction>
<instruction>If not specified create 5-10 titles</instruction>
<instruction>Remember we're looking for interesting, engaging, arousing titles.</instruction>
</instructions>
<user-input>
{{input}}
</user-input>