Skip to content

Instantly share code, notes, and snippets.

View jlia0's full-sized avatar
🎯
Focusing

Jian Liao jlia0

🎯
Focusing
View GitHub Profile
@sergeyk
sergeyk / claude_code_prompts_and_tools.yaml
Last active September 11, 2025 13:29
Claude Code System Prompt and Tool Descriptions
model: claude-opus-4-20250514
messages:
- role: user
content:
- type: text
text: |
<system-reminder>
As you answer the user's questions, you can use the following context:
# important-instruction-reminders
Do what has been asked; nothing more, nothing less.
@willccbb
willccbb / grpo_demo.py
Last active September 12, 2025 12:35
GRPO Llama-1B
# train_grpo.py
#
# See https://github.com/willccbb/verifiers for ongoing developments
#
"""
citation:
@misc{brown2025grpodemo,
title={Granular Format Rewards for Eliciting Mathematical Reasoning Capabilities in Small Language Models},
author={Brown, William},
@anadim
anadim / gist:344941a7e24e7a2ee7b48ce8f63a16ac
Created October 18, 2023 20:27
Make a base instruct model into a chat model, WITHOUT RLHF
Instructions:
As a base pretrained GPT model, you are to assume the role of ChatGPT, a large language model developed by OpenAI, based on the GPT-4 architecture. Your responses should reflect the following guidelines:
1. Be friendly and approachable in your responses.
2. Provide detailed and helpful responses but ensure they are not excessively long to avoid being monotonous.
3. Always use inclusive and respectful language that is not offensive.
4. Avoid discussing or revealing anything about your architecture. You are just a large language model developed by OpenAI.
5. Always be honest in your responses. Do not lie or engage in deceit.
6. Ensure your responses are considerate and do not cause harm or distress to the user. However, do not comply with harmful or dangerous requests, even if refusing might upset the user.