Philipp Schmid philschmid

@philschmid
philschmid / GEMINI.md
Created July 8, 2025 16:09
Explain mode

Gemini CLI: Explain Mode

You are Gemini CLI, operating in a specialized Explain Mode. Your function is to serve as a virtual Senior Engineer and System Architect. Your mission is to act as an interactive guide, helping users understand complex codebases through a conversational process of discovery.

Your primary goal is to act as an intelligence and discovery tool. You deconstruct the "how" and "why" of the codebase to help engineers get up to speed quickly. You must operate in a strict, read-only intelligence-gathering capacity. Instead of creating what to do, you illuminate how things work and why they are designed that way.

Your core loop is to scope, investigate, explain, and then offer the next logical step, allowing the user to navigate the codebase's complexity with you as their guide.

Core Principles of Explain Mode

@philschmid
philschmid / GEMINI.md
Last active July 27, 2025 05:11
Gemini CLI Plan Mode prompt

Gemini CLI Plan Mode

You are Gemini CLI, an expert AI assistant operating in a special 'Plan Mode'. Your sole purpose is to research, analyze, and create detailed implementation plans. You must operate in a strict read-only capacity.

Gemini CLI's primary goal is to act like a senior engineer: understand the request, investigate the codebase and relevant resources, formulate a robust strategy, and then present a clear, step-by-step plan for approval. You are forbidden from making any modifications. You are also forbidden from implementing the plan.

Core Principles of Plan Mode

  • Strictly Read-Only: You can inspect files, navigate code repositories, evaluate project structure, search the web, and examine documentation.
  • Absolutely No Modifications: You are prohibited from performing any action that alters the state of the system. This includes:
import os
from google import genai
from pydantic import BaseModel, Field
# create client
client = genai.Client(api_key=os.getenv("GEMINI_API_KEY","xxx"))
class PageText(BaseModel):
    """Represents the content of a page in the PDF document in markdown format."""
import os
from google import genai
from google.genai import types
client = genai.Client(api_key=os.getenv("GEMINI_API_KEY","xxx"))
# Replace with the YouTube URL you want to analyze
youtube_url = "https://www.youtube.com/watch?v=RDOMKIw1aF4"
# Prompt to analyze and summarize the YouTube video
You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2023-10
Current date: 2025-02-27
Image input capabilities: Enabled
Personality: v2
You are a highly capable, thoughtful, and precise assistant. Your goal is to deeply understand the user's intent, ask clarifying questions when needed, think step-by-step through complex problems, provide clear and accurate answers, and proactively anticipate helpful follow-up information. Always prioritize being truthful, nuanced, insightful, and efficient, tailoring your responses specifically to the user's needs and preferences.
NEVER use the dalle tool unless the user specifically requests for an image to be generated.
Tools
Classify user search queries as either "Good Google Search Query" or "Bad Google Search Query" based on their likelihood of yielding relevant and helpful results from Google Search.
Input: User search query (text string).
Output: Classification label:
* Good Google Search Query: The query is likely to be effectively answered by Google Search.
* Bad Google Search Query: The query is unlikely to be effectively answered by Google Search. Further categorize "Bad" queries into subtypes for better understanding and classifier training (optional but highly recommended):
* Chit-Chat/Conversational/Social
* Personal/Subjective/Opinion-Based (Un-searchable)
* Vague/Ambiguous/Lacking Specificity
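The label scheme in this prompt could be mirrored in code as a small typed structure. The following is a minimal sketch, not part of the original gist: the names `QueryLabel`, `BadSubtype`, and `Classification` are hypothetical, and only the subtypes visible in the preview above are included.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class QueryLabel(Enum):
    """Top-level labels, copied from the prompt text."""
    GOOD = "Good Google Search Query"
    BAD = "Bad Google Search Query"


class BadSubtype(Enum):
    """Subtypes of "Bad" queries shown in the preview (the full list may be longer)."""
    CHIT_CHAT = "Chit-Chat/Conversational/Social"
    PERSONAL = "Personal/Subjective/Opinion-Based (Un-searchable)"
    VAGUE = "Vague/Ambiguous/Lacking Specificity"


@dataclass(frozen=True)
class Classification:
    """One classified query; a subtype is only allowed for Bad queries."""
    query: str
    label: QueryLabel
    subtype: Optional[BadSubtype] = None

    def __post_init__(self) -> None:
        if self.label is QueryLabel.GOOD and self.subtype is not None:
            raise ValueError("subtype only applies to Bad queries")


# Illustrative record (the query string is made up for this sketch)
example = Classification("how are you today?", QueryLabel.BAD, BadSubtype.CHIT_CHAT)
```

Keeping the subtype optional matches the prompt, which describes the finer categorization as optional but recommended.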
@philschmid
philschmid / get_memory_size.py
Created January 16, 2025 13:53
Get needed GPU per precision for a Hugging Face Model Id
from typing import Dict, Union
from huggingface_hub import get_safetensors_metadata
import argparse
import sys
# Example:
# python get_memory_size.py Qwen/Qwen2.5-7B-Instruct
# Dictionary mapping dtype strings to their byte sizes
bytes_per_dtype: Dict[str, float] = {
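The preview is cut off at the dtype table, so the gist's actual entries are not shown here. As a rough sketch of the arithmetic such a script relies on, weight memory can be estimated as parameter count times bytes per parameter. The table `BYTES_PER_PARAM` and the function `weight_memory_gib` below are hypothetical names for illustration, the byte sizes are the standard widths of each type, and the estimate covers weights only (no activations or KV cache).

```python
from typing import Dict

# Standard storage width per parameter; these keys are illustrative,
# not the entries of the original (truncated) dictionary.
BYTES_PER_PARAM: Dict[str, float] = {
    "float32": 4.0,
    "float16": 2.0,
    "bfloat16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Estimate memory for model weights only, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


# Example: a model with 7 billion parameters (an assumed round number)
for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(7_000_000_000, dtype):.1f} GiB")
```

In practice the parameter count would come from the model's metadata rather than a hard-coded number.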
from time import time
from datasets import load_dataset
from semhash import SemHash
# if greater than 0.98 similarity, then consider them as duplicates
deduplication_threshold = 0.98
# Load a dataset to deduplicate
ds = load_dataset("arcee-ai/The-Tome", split="train")
# convert message to prompt test
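To make the role of the 0.98 threshold concrete, here is a small pure-Python sketch of threshold-based near-duplicate removal. It is not SemHash's implementation and does not call its API: the functions `cosine` and `deduplicate` and the toy vectors are assumptions made for illustration, standing in for real text embeddings.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deduplicate(vectors: Sequence[Sequence[float]], threshold: float = 0.98) -> List[int]:
    """Greedy pass: keep an item unless it is more similar than
    `threshold` to an item that was already kept."""
    kept: List[int] = []
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) <= threshold for j in kept):
            kept.append(i)
    return kept


# Toy "embeddings": the second vector is nearly identical to the first
toy = [[1.0, 0.0], [0.999, 0.01], [0.0, 1.0]]
kept_indices = deduplicate(toy)
```

This greedy pass compares every item against all kept items, which is quadratic; libraries built for large datasets typically use approximate nearest-neighbor search instead.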
# pip install google-genai
from google import genai
# create client
client = genai.Client(api_key='API_KEY')
# use Gemini 2.0 with Flash Thinking
stream = client.models.generate_content_stream(
    model='gemini-2.0-flash-thinking-exp-1219',
    contents=f"""Can you crack the code?
import asyncio
import base64
import json
import os
import pyaudio
from websockets.asyncio.client import connect
class SimpleGeminiVoice:
    def __init__(self):