The original Qwen3.6-35B-A3B benchmark table, extended with scores for Claude Opus 4.7, Claude Opus 4.6, Claude Sonnet 4.6, gpt-5.4-high, and gemini-3.1-pro-preview. All scores are percentages (for SWE-bench, the share of problems resolved). Empty cells (—) mark scores a vendor has not published or that are not applicable. Cross-vendor numbers are not directly comparable, since each vendor runs its own evaluation harness; see the Methodology Flags at the bottom.
| Benchmark | Qwen3.5-27B | Qwen3.5-35B-A3B | Gemma4-31B | Gemma4-26B-A4B | Qwen3.6-35B-A3B | Opus 4.7 | Opus 4.6 | Sonnet 4.6 | gpt-5.4-high | gemini-3.1-pro-preview |
|---|---|---|---|---|---|---|---|---|---|---|
| SWE-bench Verified | 75.0 | 70.0 | 52.0 | 17.4 | 73.4 | 87.6 | 80.8 | 79.6 | 77.2* | 80.6 |
| SWE-bench Multilingual | 69.3 | 60.3 | 51.7 | 17.3 | 67.2 | — | 77.8 | — | — | — |
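
If you want to consume this table programmatically, here is a minimal Python sketch, assuming the table is kept verbatim as a string. It treats the — placeholder as a missing value and strips footnote markers such as the * on gpt-5.4-high's 77.2 so the number stays usable. `TABLE` and `parse_markdown_table` are illustrative names, not part of any published tooling.

```python
# Minimal sketch: parse the markdown table above into records.
# "—" cells become None (unpublished / not applicable); footnote
# markers such as the "*" on 77.2* are stripped before conversion.
TABLE = """\
| Benchmark | Qwen3.5-27B | Qwen3.5-35B-A3B | Gemma4-31B | Gemma4-26B-A4B | Qwen3.6-35B-A3B | Opus 4.7 | Opus 4.6 | Sonnet 4.6 | gpt-5.4-high | gemini-3.1-pro-preview |
|---|---|---|---|---|---|---|---|---|---|---|
| SWE-bench Verified | 75.0 | 70.0 | 52.0 | 17.4 | 73.4 | 87.6 | 80.8 | 79.6 | 77.2* | 80.6 |
| SWE-bench Multilingual | 69.3 | 60.3 | 51.7 | 17.3 | 67.2 | — | 77.8 | — | — | — |
"""

def parse_markdown_table(md: str) -> list[dict[str, object]]:
    # Split into lines and drop the leading/trailing pipes on each row.
    lines = [ln.strip().strip("|") for ln in md.strip().splitlines()]
    header = [c.strip() for c in lines[0].split("|")]
    rows: list[dict[str, object]] = []
    for raw in lines[2:]:  # lines[1] is the |---| separator row
        cells = [c.strip() for c in raw.split("|")]
        row: dict[str, object] = {header[0]: cells[0]}
        for name, cell in zip(header[1:], cells[1:]):
            if cell == "—":
                row[name] = None                     # score not published / not applicable
            else:
                row[name] = float(cell.rstrip("*"))  # drop footnote marker, keep the score
        rows.append(row)
    return rows

if __name__ == "__main__":
    for row in parse_markdown_table(TABLE):
        print(row["Benchmark"], row["Qwen3.6-35B-A3B"], row["Opus 4.7"])
```

Keeping missing cells as `None` rather than 0.0 matters here: averaging or ranking across columns with unpublished scores would silently penalize models that simply were not evaluated, which is exactly the cross-harness comparison the caption warns against.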