@joshrotenberg
Last active March 20, 2026 19:24
Autonomous GitHub Issue Runner design doc

Autonomous GitHub Issue Runner

Overview

This document describes an agent-agnostic system for turning GitHub issues into ready-to-merge pull requests with minimal human intervention. The system is designed to operate asynchronously against one or more repositories using standard developer infrastructure: GitHub APIs, local git worktrees or clones, language/toolchain commands, and one or more coding agents.

The system is intentionally not tied to a specific agent vendor, prompt format, or execution engine. Any capable coding agent may be used as long as it can:

  • read repository context
  • execute bounded tasks non-interactively
  • emit machine-consumable results
  • operate inside an isolated working directory

The core product goal is to treat GitHub issues and pull requests as the primary human-agent collaboration surface. Interactive chat becomes secondary. The system should be able to:

  • watch for new or updated issues
  • decide whether an issue is ready for work
  • request clarification when it is not
  • decompose ready issues into execution stages
  • run implementation work in isolation
  • open and update pull requests
  • respond to review feedback
  • optionally merge approved work

Goals

  • Convert eligible GitHub issues into implementation runs automatically.
  • Keep humans in GitHub as the main control plane.
  • Support multiple execution stages rather than one monolithic agent task.
  • Allow repositories to define policy for priority, scope, approval, and merge.
  • Preserve auditability through persisted plans, run records, comments, and PRs.
  • Be portable across different coding agents and execution providers.

Non-Goals

  • Replacing repository maintainers' judgment about product direction.
  • Fully autonomous work on ambiguous or under-specified issues by default.
  • Solving all project management needs beyond issue-to-PR automation.
  • Embedding agent-specific prompt logic into the core domain model.
  • Requiring CI-first deployment; the system must also run from a laptop or VM.

Operating Model

The system runs as a long-lived process or scheduled job against one or more repositories. It polls for repository events or receives webhooks, evaluates issues against repository policy, and advances them through a staged workflow.

GitHub becomes the source of truth for:

  • issue intent
  • clarification requests and responses
  • assignment/ownership visibility
  • PR review and merge status

The automation process becomes responsible for:

  • selecting work
  • creating an execution plan
  • coordinating agent tasks
  • tracking progress and retries
  • reflecting state back into GitHub

Core Concepts

Repository Policy

Per-repository configuration that controls:

  • which issue types are eligible
  • priority rules
  • when clarification is required
  • allowed execution stages
  • branch naming strategy
  • PR and merge policy
  • concurrency limits
  • retry behavior
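
As a rough illustration, the policy areas listed above could be captured in a small configuration object. This is only a sketch: the class name `RepositoryPolicy`, every field name, and every default value are assumptions made for the example, not part of the design.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RepositoryPolicy:
    """Hypothetical per-repository policy; all names and defaults are illustrative."""

    eligible_issue_types: frozenset[str] = frozenset({"bug", "feature", "chore"})
    # Lower number means higher priority.
    priority_by_type: dict[str, int] = field(
        default_factory=lambda: {"bug": 0, "feature": 1, "chore": 2}
    )
    allowed_stages: tuple[str, ...] = (
        "triage", "clarify", "plan", "implement", "test", "open_pr",
    )
    branch_prefix: str = "automation/"
    auto_merge: bool = False
    max_concurrent_runs: int = 1
    max_retries_per_stage: int = 2

    def allows_stage(self, stage: str) -> bool:
        """Return True if this repository permits the given stage."""
        return stage in self.allowed_stages
```

With these assumed defaults, `merge` is not an allowed stage, so automatic merging stays off until a repository opts in.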

Issue Candidate

A normalized representation of a GitHub issue enriched with repository context, labels, comments, relationships, and inferred readiness.

Readiness Decision

A triage decision for an issue. Typical outcomes:

  • ready to plan
  • needs clarification
  • blocked
  • duplicate
  • out of scope
  • already in progress
  • waiting on external dependency

Work Plan

An internal execution plan derived from an issue. This is not necessarily a user-authored manifest. It is a machine-generated plan that records:

  • the issue being addressed
  • the execution stages
  • per-stage context and constraints
  • isolation requirements
  • success criteria

Workflow Template

A repository-configurable template that defines which stages are available for a given kind of issue and how those stages are ordered.

Workflow templates allow the system to vary its behavior by work type instead of forcing every issue through the same pipeline. For example:

  • bug issues may go directly from planning to implementation and validation
  • feature issues may require clarification and design planning first
  • research issues may produce comments or reports instead of pull requests
  • chores may use a shorter implementation and validation path

Templates should be selected by issue classification, not only by literal title prefix. Conventional commit style prefixes such as fix:, feat:, chore:, and research: are useful signals, but labels, repository rules, and issue metadata may also contribute.
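
One possible way to combine these signals is sketched below. The function name `classify_issue`, the `label_to_kind` mapping, and the rule that labels take precedence over title prefixes are all assumptions for illustration; a repository could reasonably choose a different order.

```python
import re

# Illustrative mapping from title prefixes to issue kinds.
PREFIX_TO_KIND = {
    "fix": "bug", "bug": "bug",
    "feat": "feature", "feature": "feature",
    "chore": "chore", "research": "research",
}
# Matches "fix:", "feat(scope):", and "feat!:" at the start of a title.
_PREFIX_RE = re.compile(r"^\s*([a-z]+)(?:\([^)]*\))?!?:", re.IGNORECASE)


def classify_issue(title, labels, label_to_kind=None):
    """Return an issue kind, or None if no signal matches.

    Labels are checked first (an assumed precedence), then the title prefix.
    """
    label_to_kind = label_to_kind or {}
    for label in labels:
        if label in label_to_kind:
            return label_to_kind[label]
    match = _PREFIX_RE.match(title)
    if match:
        return PREFIX_TO_KIND.get(match.group(1).lower())
    return None
```

Returning `None` rather than guessing leaves unclassified issues for a later triage step or a human.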

Stage

A bounded unit of work within a run. Stages are explicit so the system can observe, retry, and gate progress. Typical stages:

  • triage
  • clarify
  • plan
  • implement
  • test
  • review
  • open_pr
  • revise_pr
  • merge

Run

A persisted attempt to move one issue through some or all stages. A run has:

  • a stable run id
  • a target repository and issue
  • a selected plan
  • per-stage status
  • artifacts and logs
  • branch and PR references

Lease

A time-bounded claim on an issue or PR to prevent duplicate work across multiple automation processes. Leases should be renewable and recoverable.
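
A minimal in-memory sketch of renewable, recoverable leases follows. `Lease`, `LeaseStore`, and the `acquire` signature are hypothetical names; a real deployment would need a persistent or shared store, and the caller passes `now` explicitly here so the behavior is easy to test.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    key: str          # e.g. "issue-123"
    owner: str        # id of the automation process holding the claim
    expires_at: float


class LeaseStore:
    """In-memory lease table (illustrative only, not safe across processes)."""

    def __init__(self):
        self._leases = {}

    def acquire(self, key, owner, ttl, now):
        """Claim or renew a lease; return None if another owner holds a live one."""
        current = self._leases.get(key)
        held_by_other = (
            current is not None
            and current.expires_at > now
            and current.owner != owner
        )
        if held_by_other:
            return None
        # Either free, expired (recoverable), or a renewal by the same owner.
        lease = Lease(key, owner, now + ttl)
        self._leases[key] = lease
        return lease
```

Because an expired lease can be taken over by any owner, a crashed worker does not block an issue forever.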

End-to-End Workflow

1. Observe

The system watches for:

  • new issues
  • updated issues
  • new issue comments
  • label changes
  • PR review activity for automation-owned PRs

2. Triage

For each candidate issue, the system evaluates:

  • title and body quality
  • labels and type prefixes
  • duplicate or blocked status
  • dependency references
  • whether the issue is already assigned or actively worked
  • whether enough acceptance criteria exist

Priority may be influenced by conventions such as:

  • bug:* is high priority
  • feature:* is medium priority
  • chore:* is low priority
  • research:* is opportunistic

3. Clarify or Plan

If the issue is not ready, the system comments with targeted questions and records the issue as waiting for response.

If the issue is ready, the system creates a work plan and acquires a lease.

4. Execute

The system runs staged agent tasks in isolated environments. Each stage should receive only the context it needs. The common pattern is:

  • plan: interpret issue, scan repository, propose implementation approach
  • implement: make code changes
  • test: run repository validation commands
  • review: inspect changes and identify fixes
  • open_pr: create branch, push, and open PR

5. Monitor and Revise

After PR creation, the system watches for:

  • CI failures
  • requested changes
  • merge conflicts
  • maintainer comments

It can then run follow-up stages such as:

  • fix_ci
  • address_review
  • rebase_or_merge_base

6. Merge or Escalate

If policy allows and all requirements are satisfied, the system may merge. Otherwise, it leaves the PR ready for human approval and records the final automation state.

Recommended State Machine

At the issue level:

  • new
  • triaging
  • needs_clarification
  • ready
  • planning
  • in_progress
  • waiting_on_review
  • waiting_on_human
  • blocked
  • completed
  • closed_unresolved

At the run level:

  • queued
  • leased
  • running
  • succeeded
  • failed
  • canceled
  • abandoned

At the stage level:

  • pending
  • running
  • succeeded
  • failed
  • skipped
  • waiting

The distinction matters. One issue may have multiple runs over time, and one run contains multiple stages.
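
To make one of these levels concrete, the stage-level states could be modeled as an enum with an explicit transition table. The state names come from the list above, but the allowed transitions shown here are an assumption, since this document does not define them.

```python
from enum import Enum


class StageStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    SKIPPED = "skipped"
    WAITING = "waiting"


# Assumed transitions; a real policy may differ.
ALLOWED_TRANSITIONS = {
    StageStatus.PENDING: {StageStatus.RUNNING, StageStatus.SKIPPED},
    StageStatus.RUNNING: {StageStatus.SUCCEEDED, StageStatus.FAILED, StageStatus.WAITING},
    StageStatus.WAITING: {StageStatus.RUNNING},
    StageStatus.FAILED: {StageStatus.PENDING},  # re-queue for a retry
    StageStatus.SUCCEEDED: set(),
    StageStatus.SKIPPED: set(),
}


def transition(current, target):
    """Return the new status, or raise ValueError for a disallowed move."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move stage from {current.value} to {target.value}")
    return target
```

The same pattern could be repeated for issue-level and run-level states with their own tables.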

System Components

Repository Adapter

Responsible for:

  • fetching issues, comments, labels, and PRs
  • creating comments and PRs
  • applying labels and status markers
  • reading review decisions and mergeability state

This should isolate all platform-specific behavior. GitHub is the first target, but the higher-level engine should not depend on GitHub-specific data shapes.

Triage Engine

Responsible for:

  • evaluating issue readiness
  • deciding whether to clarify, plan, or ignore
  • prioritizing work
  • enforcing repo policy and filters

This should be deterministic where possible and only invoke an agent when judgment is needed.

Planner

Responsible for:

  • turning a ready issue into a staged execution plan
  • deciding whether work should be split into sub-tasks
  • defining stage ordering and dependencies
  • attaching context, constraints, and success criteria to each stage

Execution Engine

Responsible for:

  • creating isolated work environments
  • invoking agent tasks
  • capturing outputs and artifacts
  • enforcing timeouts, retries, and resource limits

The execution engine must be agent-agnostic. Agent-specific adapters should translate stage requests into concrete CLI/API invocations.
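
The adapter boundary might look roughly like the sketch below. `StageRequest`, `StageResult`, `AgentAdapter`, and `EchoAdapter` are hypothetical names introduced for illustration; `EchoAdapter` is a stand-in that invokes no real agent.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class StageRequest:
    stage: str                 # e.g. "plan" or "implement"
    workdir: str               # isolated working directory for this stage
    instructions: str          # bounded task description
    timeout_seconds: int = 900


@dataclass(frozen=True)
class StageResult:
    ok: bool
    summary: str
    artifacts: dict = field(default_factory=dict)


class AgentAdapter(Protocol):
    """Anything that can run one bounded stage non-interactively."""

    def run_stage(self, request: StageRequest) -> StageResult: ...


class EchoAdapter:
    """Test double; a real adapter would call an agent CLI or API here."""

    def run_stage(self, request: StageRequest) -> StageResult:
        return StageResult(ok=True, summary=f"{request.stage} ran in {request.workdir}")


def execute(adapter: AgentAdapter, request: StageRequest) -> StageResult:
    # The engine depends only on the protocol, never on a vendor type.
    return adapter.run_stage(request)
```

Swapping agents then means writing another class with a `run_stage` method, without touching the engine.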

Source Control Manager

Responsible for:

  • creating and cleaning up branches and worktrees
  • committing with repository conventions
  • rebasing or merging base changes
  • pushing updates for PR creation and maintenance

Validation Engine

Responsible for:

  • running tests, linters, and formatters
  • collecting structured success/failure data
  • surfacing actionable failure context for follow-up stages

State Store

Responsible for:

  • run persistence
  • lease tracking
  • stage status
  • branch/PR associations
  • retry counts
  • audit trail

This may be file-based for local operation and later moved to a database if multi-process coordination becomes important.
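
For the file-based case, a minimal sketch could store one JSON file per run and write it atomically. The function names, the `run_id` key, and the one-file-per-run layout are assumptions for illustration.

```python
import json
import os
import tempfile


def save_run(state_dir, run):
    """Write a run record as <run_id>.json, replacing any previous version atomically."""
    os.makedirs(state_dir, exist_ok=True)
    final_path = os.path.join(state_dir, f"{run['run_id']}.json")
    # Write to a temp file in the same directory, then rename over the target,
    # so readers never observe a half-written record.
    fd, tmp_path = tempfile.mkstemp(dir=state_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(run, handle, indent=2, sort_keys=True)
    os.replace(tmp_path, final_path)
    return final_path


def load_run(state_dir, run_id):
    """Read a previously saved run record."""
    path = os.path.join(state_dir, f"{run_id}.json")
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

This is enough for a single process on one machine; it does not provide locking between concurrent writers.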

Agent Interaction Model

The system should not launch one long-running agent session for the entire issue. Instead, it should create bounded tasks with narrowly scoped context.

Guidelines:

  • each stage should have a clear objective
  • each stage should have a clear success/failure contract
  • each stage should receive only necessary repository and issue context
  • outputs should be machine-readable when possible
  • stages should be restartable without relying on hidden conversation state

Examples:

  • planning task: inspect issue and codebase, return proposed approach and files
  • implementation task: apply a specific plan in an isolated branch
  • review task: check for bugs, regressions, and missing tests
  • PR task: summarize changes, risks, and validation results

This bounded-stage model improves:

  • repeatability
  • observability
  • vendor portability
  • failure recovery
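
The machine-readable output guideline could be enforced with a small validator such as the one sketched here. The JSON shape (a `status` of `succeeded` or `failed` plus a string `summary`) is an assumed contract, not one defined by this document.

```python
import json

VALID_STATUSES = ("succeeded", "failed")


def parse_stage_output(raw):
    """Parse and validate an agent's JSON output; raise ValueError if malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"stage output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("stage output must be a JSON object")
    if data.get("status") not in VALID_STATUSES:
        raise ValueError(f"status must be one of {VALID_STATUSES}")
    if not isinstance(data.get("summary"), str):
        raise ValueError("summary must be a string")
    return data
```

Treating malformed output as an explicit stage failure keeps a misbehaving agent from silently advancing a run.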

Priority and Selection Policy

An initial selection policy may consider:

  • issue type prefix or label
  • explicit opt-in labels such as automation:ready
  • exclusion labels such as needs-design or blocked
  • severity or business priority
  • recency of discussion
  • absence of an active lease
  • repository concurrency limits

The system should prefer explicit repository policy over implicit heuristics.
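
A simple selection pass over these signals might look like the following sketch. The issue dictionaries with `number`, `kind`, and `labels` keys, the priority ordering, and the function name are assumptions for illustration; the label names are the examples used above.

```python
OPT_IN_LABEL = "automation:ready"
EXCLUSION_LABELS = frozenset({"needs-design", "blocked"})
# Lower number means higher priority (an assumed ordering).
PRIORITY = {"bug": 0, "feature": 1, "chore": 2, "research": 3}


def select_issues(issues, leased_numbers, free_slots):
    """Return the issues to start next, highest priority first."""
    eligible = [
        issue for issue in issues
        if OPT_IN_LABEL in issue["labels"]
        and EXCLUSION_LABELS.isdisjoint(issue["labels"])
        and issue["number"] not in leased_numbers
    ]
    # Unknown kinds sort last; ties fall back to the older (lower) issue number.
    eligible.sort(key=lambda i: (PRIORITY.get(i["kind"], len(PRIORITY)), i["number"]))
    return eligible[:max(free_slots, 0)]
```

Every rule here is an explicit, inspectable filter, which matches the preference for policy over heuristics.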

Clarification Strategy

When an issue is not ready, the system should comment with focused questions rather than broad requests for "more detail."

Good clarification prompts target specific gaps:

  • missing acceptance criteria
  • unclear expected behavior
  • ambiguity about compatibility or migration
  • uncertainty about desired UX or API shape
  • missing reproduction steps for bugs

The clarification stage should have rate limits and escalation thresholds to avoid noisy loops.
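
One way to express such limits is a small decision function like the sketch below. The thresholds (three asks, one day between asks) and the three outcomes `ask`, `wait`, and `escalate` are assumed values for illustration only.

```python
def next_clarification_action(asked_at, now, max_asks=3, min_interval=86400.0):
    """Decide what to do about an unready issue.

    asked_at: timestamps (seconds) of earlier clarification comments.
    Returns "ask", "wait", or "escalate".
    """
    if len(asked_at) >= max_asks:
        return "escalate"   # stop asking and hand off to a human
    if asked_at and now - max(asked_at) < min_interval:
        return "wait"       # too soon since the last question
    return "ask"
```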

Isolation Model

Every implementation-oriented run should execute in an isolated environment. Suitable strategies include:

  • git worktree
  • lightweight clone
  • ephemeral container

Requirements:

  • reproducible starting point
  • independent branch or workspace
  • no interference with other active tasks
  • clear cleanup policy

Isolation should be owned by the execution layer, not by the agent itself.
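
For the git worktree strategy, the execution layer could build the commands itself, as in this sketch. The helper names and the directory layout (one worktree per run id under a root directory) are assumptions; the helpers only construct argument lists, which a caller would pass to something like `subprocess.run(cmd, check=True)`.

```python
from pathlib import PurePosixPath


def worktree_path(worktrees_root, run_id):
    """Location of the isolated checkout for one run (assumed layout)."""
    return str(PurePosixPath(worktrees_root) / run_id)


def worktree_add_cmd(repo_dir, worktrees_root, run_id, branch, base_ref):
    """Build `git worktree add -b <branch> <path> <base_ref>` for a new run."""
    path = worktree_path(worktrees_root, run_id)
    return ["git", "-C", repo_dir, "worktree", "add", "-b", branch, path, base_ref]


def worktree_remove_cmd(repo_dir, worktrees_root, run_id):
    """Build the cleanup command; --force also discards uncommitted changes."""
    path = worktree_path(worktrees_root, run_id)
    return ["git", "-C", repo_dir, "worktree", "remove", "--force", path]
```

Starting each worktree from a named base ref gives the reproducible starting point, and the paired remove command gives a clear cleanup policy.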

Branch and Commit Strategy

Recommended defaults:

  • one automation branch per issue or run
  • conventional commits for local progress and final PR history
  • deterministic branch naming derived from issue id and slug

Examples:

  • automation/123-fix-login-timeout
  • automation/456-feature-export-report
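
Names like these can be derived deterministically from the issue number and title. The sketch below is one possible scheme; the function name, the slug rules, and the 40-character slug limit are assumptions.

```python
import re


def branch_name(issue_number, title, prefix="automation/", max_slug_len=40):
    """Derive a stable branch name from an issue number and title."""
    # Lowercase, collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Truncate long titles without leaving a trailing hyphen.
    slug = slug[:max_slug_len].rstrip("-")
    if not slug:
        return f"{prefix}{issue_number}"
    return f"{prefix}{issue_number}-{slug}"
```

Because the output depends only on its inputs, a restarted run can find its existing branch instead of creating a duplicate, as long as the issue title has not changed.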

Conventional commit prefixes can support prioritization and reporting:

  • fix:
  • feat:
  • chore:
  • docs:
  • refactor:

Repository policy should decide whether automation squashes commits, preserves a small commit series, or rebases before merge.

Pull Request Lifecycle

After implementation and validation, the system should:

  • push the branch
  • open a PR with structured summary
  • link the source issue
  • include validation results
  • indicate whether merge is safe or requires human review

While the PR is open, the system should handle:

  • CI failures
  • requested changes
  • merge conflicts
  • stale base branches

Automatic merge should be policy-controlled and usually gated on:

  • all required checks passing
  • no outstanding change requests
  • allowed issue type
  • optional label or approval signal
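
These gates could be combined into a single policy check, sketched below. `PullRequestStatus` and its fields are hypothetical; in practice the values would be filled in from the code-hosting platform by the repository adapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PullRequestStatus:
    required_checks_passing: bool
    open_change_requests: int
    issue_kind: str
    labels: frozenset


def may_auto_merge(pr, allowed_kinds, required_label=None):
    """Return True only if every configured merge gate is satisfied."""
    if not pr.required_checks_passing:
        return False
    if pr.open_change_requests > 0:
        return False
    if pr.issue_kind not in allowed_kinds:
        return False
    # The label gate is optional; skip it when no label is configured.
    return required_label is None or required_label in pr.labels
```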

Observability

The system should produce structured logs and persisted run records for:

  • issue decisions
  • plan creation
  • stage starts and finishes
  • agent outputs
  • validation results
  • PR transitions

Useful operator views include:

  • active leases
  • in-progress runs
  • recently failed issues
  • PRs awaiting review
  • issues waiting on clarification

Failure Handling

Failures should be explicit and recoverable.

Common failure categories:

  • repository access/auth problems
  • ambiguous issue requirements
  • agent execution errors
  • validation failures
  • git conflicts
  • GitHub API failures

Recovery mechanisms:

  • bounded retries with backoff
  • stage-level reruns
  • explicit escalation to humans via comments
  • abandonment after repeated non-actionable failures

The system should avoid silent loops. Repeated failures should be surfaced in GitHub and in operator-visible state.
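
Bounded retries with backoff can be sketched as follows. The function names and the default delays are assumptions; the `sleep` parameter is injectable so the loop can be tested without waiting.

```python
import time


def backoff_delays(max_retries, base=1.0, factor=2.0, cap=60.0):
    """Exponential delays in seconds, capped at `cap`."""
    return [min(cap, base * factor ** n) for n in range(max_retries)]


def run_with_retries(task, max_retries=3, sleep=time.sleep):
    """Call task(); on exception retry up to max_retries times, then re-raise."""
    delays = backoff_delays(max_retries)
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the failure, do not loop
            sleep(delays[attempt])
```

Re-raising after the final attempt gives the caller a clear point at which to mark the stage failed and escalate.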

Security and Safety

The system may have significant repository and network access. Safeguards should include:

  • least-privilege GitHub credentials
  • repository allowlists
  • branch protection awareness
  • command allowlists for validation steps
  • bounded execution timeouts
  • explicit merge policy
  • audit logs for automated actions

If agents can execute arbitrary shell commands, execution should occur in a controlled environment with clear repository boundaries.
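
A command allowlist for validation steps might be checked like this. The allowed prefixes are illustrative examples only, and the function name is hypothetical. The returned argument list is meant to be executed without a shell, so shell operators in the input are never interpreted.

```python
import shlex

# Illustrative allowlist: each entry is a required argv prefix.
ALLOWED_PREFIXES = (("cargo", "test"), ("npm", "test"), ("pytest",))


def validate_command(command_line, allowed=ALLOWED_PREFIXES):
    """Split a command line and return argv if it starts with an allowed prefix.

    Raises PermissionError otherwise. Run the result with shell=False.
    """
    argv = shlex.split(command_line)
    for prefix in allowed:
        if tuple(argv[:len(prefix)]) == prefix:
            return argv
    raise PermissionError(f"command not allowed: {command_line!r}")
```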

Deployment Model

The first deployment mode can be a single remote process or laptop job that:

  • polls GitHub at a fixed interval
  • stores local state on disk
  • runs one or more tasks concurrently
  • uses local git and language toolchains

Later deployment options may include:

  • a daemon or service
  • scheduled CI jobs
  • webhook-driven workers
  • distributed workers with a shared lease store

The architecture should support these without changing the core domain model.

Minimal Viable Product

An MVP should support:

  • one repository
  • polling-based issue observation
  • triage based on labels/title prefixes/body quality
  • clarification comments
  • a staged flow of plan -> implement -> test -> open_pr
  • isolated git worktrees
  • persisted run state
  • one automation-owned PR per issue

Nice-to-have later:

  • review-comment response loops
  • automatic merge
  • conflict repair
  • distributed workers
  • multi-repo fleet management

Workflow Templates By Issue Type

The system should support configurable workflow templates keyed by issue type or classification. This allows repositories to define the available steps for each kind of work.

Example intent:

  • bug or fix: triage -> plan -> implement -> test -> review -> open_pr -> merge
  • feature or feat: triage -> clarify -> plan -> implement -> test -> review -> open_pr
  • research: triage -> research -> summarize -> comment_issue
  • chore: triage -> implement -> validate -> open_pr

Repositories should be able to override both:

  • the mapping from issues to workflow type
  • the stage chain or stage graph for that type

This makes the workflow policy explicit and inspectable. It also reduces the amount of hidden decision-making delegated to the agent at runtime.
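
As plain data, the example templates and a repository override could be represented roughly as below. The stage chains are taken from the example intent above; the dictionary layout and the `stages_for` helper are assumptions.

```python
DEFAULT_TEMPLATES = {
    "bug": ("triage", "plan", "implement", "test", "review", "open_pr", "merge"),
    "feature": ("triage", "clarify", "plan", "implement", "test", "review", "open_pr"),
    "research": ("triage", "research", "summarize", "comment_issue"),
    "chore": ("triage", "implement", "validate", "open_pr"),
}


def stages_for(kind, overrides=None):
    """Return the stage chain for an issue kind, applying repository overrides."""
    templates = {**DEFAULT_TEMPLATES, **(overrides or {})}
    if kind not in templates:
        raise KeyError(f"no workflow template for issue kind {kind!r}")
    return templates[kind]
```

For example, a repository that never auto-merges bug fixes could pass an override whose `bug` chain stops at `open_pr`.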

Recommended design constraints:

  • start with a bounded set of known stage kinds
  • support linear chains first, then dependency graphs if needed
  • keep repository policy responsible for selecting templates
  • keep agent-specific prompt logic out of the template definition

Over time, workflow templates may support:

  • optional stages
  • conditional stages
  • retry limits per stage
  • stage-specific timeout and validation policies
  • stages that end in an issue comment rather than a PR

Open Design Questions

  • What exact issue signals mark work as automation-ready?
  • When should the system ask for clarification versus declining to act?
  • How much planning should be deterministic versus agent-generated?
  • How should repositories classify issues into workflow templates: prefixes, labels, rules, or a combination?
  • Should one issue map to one PR by default, or can the planner split it?
  • What validation commands are repository-controlled versus globally defined?
  • What approvals are required before merge?
  • How are automation actions distinguished from human actions in GitHub?
  • What persistence layer is needed beyond a single-node local filesystem?

Recommended Architectural Principle

Keep three layers separate:

  • domain orchestration
  • repository/platform integration
  • agent execution

If these are kept separate, the same product can later run on different agents, different hosting models, and potentially different code-hosting platforms without rewriting the core workflow model.
