@wassname
Last active April 26, 2026 02:28
Coding agent compatible logs: concise, inline guidance, token efficient

Principles

  • use few tokens
  • make it so a dumb summary LLM can easily 1) see problems and 2) find clues to diagnose them
  • include timing information

example of good log

  • put a single-line SHOULD statement inline in the log that makes clear how the output should look, distinguishes success from subtle failures, and gives a principled clue for diagnosis

  • tables should put the longest and least important columns last, so that humans can still read them when lines wrap: e.g. short numeric columns first, long text columns (notes or descriptions) last

  • use tabulate with a plain format such as tsv for token efficiency; don't log each step or epoch, just print one table

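      ```py
      from tabulate import tabulate

      # FAMILY_TITLES, family, and table_rows are assumed to be defined earlier
      print(f"\n=== Sweep: {FAMILY_TITLES[family]} ===")
      print(tabulate(
          table_rows,
          # abs(SR) rather than |SR|, so the header has no pipes to escape
          headers=["model", "dataset", "condition", "seeds", "abs(SR)", "hgap(low)", "hgap(0)", "hgap(high)"],
          tablefmt="tsv",
          floatfmt="+.2f",
      ))
      ```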
    
  • tqdm with mininterval=60, to record timing without polluting the logs

    for task_idx, task in enumerate(tqdm(tasks, desc=f"{model_name} {cot_label}", mininterval=60)):

  • have headers for major stages, with a timestamp in each

  • avoid escape issues: for example, write abs(dS) or similar instead of |dS|

  • loguru with plain messages, no colors, written through tqdm.write

    import os as _os
    from loguru import logger
    from tqdm import tqdm

    logger.remove()  # drop the default colored stderr sink
    # TODO: change to a config option; env vars are not very trackable
    _LOG_LEVEL = _os.environ.get("SSTEER_LOG_LEVEL", "INFO")
    # route messages through tqdm.write so they don't break progress bars
    logger.add(lambda x: tqdm.write(x, end=""), level=_LOG_LEVEL, colorize=False, format="{message}")

  • to avoid false positives: if the log contains things that might trigger LLM nannies, such as ending a process or traces from red teaming, you might need to give context

  • because logs are often read with tail, make the last 30 lines carry the most important context: main metric, argv/delta(config), main diagnostics, time, commit / branch, output dir, wandb etc
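Two of the bullets above (stage headers with timestamps, and avoiding `|` in labels) can be sketched with the standard library alone. The helper names `stage_header` and `pipe_free` are hypothetical and not part of the original scripts; this is one possible shape, not the author's implementation.

```python
import re
import time
from datetime import datetime

_START = time.monotonic()  # process start, used for elapsed time in headers


def stage_header(name: str) -> str:
    """One-line header for a major stage: wall-clock timestamp plus elapsed seconds."""
    stamp = datetime.now().strftime("%Y-%m-%dT%H:%M:%S")
    elapsed = time.monotonic() - _START
    return f"=== {name} === {stamp} (+{elapsed:.0f}s)"


def pipe_free(label: str) -> str:
    """Rewrite |x| as abs(x) so the label needs no escaping in tables or shells."""
    return re.sub(r"\|([^|]+)\|", r"abs(\1)", label)


print(stage_header("eval"))
print(pipe_free("|dS|"))
```

Printing one header per stage, rather than per step, keeps the token count low while still letting a reader see where the time went.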

Examples of good logs (but should use tabulate tsv)


  coeff    logratio     pmass  passes    note
-------  ----------  --------  --------  -------------------------
 0           13.547  1         ✓
 0.0001      13.547  1         ✓
 0.001       12.641  1         ✓
 0.01        11.109  1         ✓
 0.02        13.625  1         ✓
 0.05        13.297  1         ✓
 0.1         10.844  1         ✓
 0.2          8.188  1         ✓
 0.2375       5.891  1         ✓         <-- selected
 0.275        5.635  0.949219  .         <-- breakdown pmass<floor

SHOULD: logratio should be monotonic until breakdown. should find a place where pmass breaks down and select just before it. coeff=0 should have ~perfect pmass
---
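The note column in the table above could be generated rather than written by hand. The sketch below is an assumption about how that might look: `annotate_sweep` is a hypothetical helper, and the floor of 0.95 is a guess (the source only shows that 0.949219 is below the floor). It marks the first row whose pmass falls below the floor and selects the row just before it.

```python
def annotate_sweep(rows, pmass_floor=0.95):
    """rows: list of (coeff, logratio, pmass), sorted by coeff ascending.

    Returns (selected_index, notes). selected_index is None if pmass never
    breaks down or breaks down on the very first row.
    """
    notes = [""] * len(rows)
    selected = None
    for i, (_coeff, _logratio, pmass) in enumerate(rows):
        if pmass < pmass_floor:
            notes[i] = "<-- breakdown pmass<floor"
            if i > 0:
                selected = i - 1
                notes[selected] = "<-- selected"
            break
    return selected, notes


# values copied from the example table; the floor is an assumed value
rows = [
    (0, 13.547, 1.0), (0.0001, 13.547, 1.0), (0.001, 12.641, 1.0),
    (0.01, 11.109, 1.0), (0.02, 13.625, 1.0), (0.05, 13.297, 1.0),
    (0.1, 10.844, 1.0), (0.2, 8.188, 1.0), (0.2375, 5.891, 1.0),
    (0.275, 5.635, 0.949219),
]
selected, notes = annotate_sweep(rows)
print("coeff\tlogratio\tpmass\tnote")
for (coeff, logratio, pmass), note in zip(rows, notes):
    print(f"{coeff}\t{logratio:.3f}\t{pmass:.4g}\t{note}")
print("SHOULD: pmass ~1 at coeff=0; selected row is the one just before pmass<floor")
```

Keeping the SHOULD line next to the generated notes lets a summary LLM compare the expectation and the outcome without reading the code.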

example of good final 40 lines (note it has output files, input args, main metric, and a result table with important and short things first)

out: ./outputs/20260426T015439_ssteer_v2_exp_mean_38a4_eval.jsonl
argv: eval_logratio.py --quick --model-name Qwen/Qwen3-0.6B --extraction ssteer_v2 --seed 42 --n-train-steps 5
main metric: abs_sr=6.867 [flags=quick,tasks=1/75]

cue  abs_sr  h_low  h_0   h_high  C_min/max    pmass_min  seed  n  commit   model       method              flags             run                                      out
🟢   6.867   0.758  5.75  7.625   -0.50/+0.28  0.93       42    1  773f4d5  Qwen3-0.6B  ssteer_v2/exp/mean  quick,tasks=1/75  20260426T015439_ssteer_v2_exp_mean_38a4  ./outputs/20260426T015439_ssteer_v2_exp_mean_38a4_eval.jsonl
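A footer like the one above might be assembled by a small helper that is called last, so it always lands inside the final lines read by tail. This is a minimal sketch using only the standard library; `footer_lines` and its parameters are hypothetical names, and the values are copied from the example.

```python
def footer_lines(out_path, argv, metric_name, metric_value, flags, headers, row):
    """Build the closing lines of a run log: output path, argv, main metric,
    then a one-row TSV table with short columns first and long paths last."""
    return [
        f"out: {out_path}",
        f"argv: {' '.join(argv)}",
        f"main metric: {metric_name}={metric_value:.3f} [flags={flags}]",
        "",
        "\t".join(headers),
        "\t".join(str(v) for v in row),
    ]


out = "./outputs/20260426T015439_ssteer_v2_exp_mean_38a4_eval.jsonl"
lines = footer_lines(
    out_path=out,
    argv=["eval_logratio.py", "--quick", "--model-name", "Qwen/Qwen3-0.6B",
          "--extraction", "ssteer_v2", "--seed", "42", "--n-train-steps", "5"],
    metric_name="abs_sr",
    metric_value=6.867,
    flags="quick,tasks=1/75",
    headers=["abs_sr", "seed", "commit", "model", "out"],
    row=[6.867, 42, "773f4d5", "Qwen3-0.6B", out],
)
print("\n".join(lines))
```

In a real script the argv list would presumably come from `sys.argv` and the commit from version control; both are passed in here so the sketch stays self-contained.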


