@jwmatthews
Created May 14, 2026 15:19

Deep Dive: Quarkus Agent MCP Server — Information Architecture

Overview

This MCP server provides 22 tools across 7 classes that help AI agents build Quarkus applications. The Quarkus-specific information it delivers comes from 6 distinct sources, each gathered differently.


Source 1: Extension Skills (SKILL.md files embedded in JARs)

Where captured: Inside Quarkus extension deployment JARs at META-INF/quarkus-skill.md

How gathered: SkillReader.java (1,206 lines) implements a three-layer composition chain:

  1. Layer 1 — Extension JARs (from ~/.m2/repository):

    • Core extensions (io.quarkus group): Scans all quarkus-*-deployment directories in the local Maven repo for the project's detected Quarkus version. For each deployment JAR containing META-INF/quarkus-skill.md, it reads the raw skill content and composes it with:
      • Extension metadata from the runtime JAR's META-INF/quarkus-extension.yaml (name, description, guide URL, categories)
      • Discovered MCP tools via Jandex bytecode scanning of deployment, runtime, and dev JARs (looking for @JsonRpcDescription, @DevMCPEnableByDefault, @DevMcpBuildTimeTool annotations)
    • Non-core extensions (Quarkiverse, custom): Resolved from the project's pom.xml/build.gradle dependencies via DependencyResolver, then the same JAR composition process
    • Fallback: For older Quarkus versions without per-extension skills, downloads an aggregated quarkus-extension-skills JAR from Maven Central
  2. Layer 2 — User-level skills (~/.quarkus/skills/): SKILL.md files with frontmatter supporting enhance (append) or override (replace) composition modes

  3. Layer 3 — Project-level skills (.agent/skills/): Standalone files, no composition
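The enhance/override composition of Layer 2 on top of Layer 1 can be sketched as below. This is a minimal illustration, not SkillReader's actual logic: the `mode` frontmatter key, its `enhance` default, and the helper names `parse_frontmatter` and `compose` are assumptions made for the example, and only flat `key: value` frontmatter is handled.

```python
def parse_frontmatter(text):
    """Split a SKILL.md file into (frontmatter dict, body).

    Only flat `key: value` pairs are handled; real YAML needs a real parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat the file as plain Markdown


def compose(jar_skill, user_skill_text):
    """Apply a user-level skill on top of the skill read from the JAR."""
    meta, body = parse_frontmatter(user_skill_text)
    mode = meta.get("mode", "enhance")  # hypothetical key and default
    if mode == "override":
        return body  # replace the JAR content entirely
    return jar_skill.rstrip() + "\n\n" + body.lstrip()  # append to it
```

Project-level skills (Layer 3) are described as standalone, so they are deliberately left out of `compose`.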

Content type: Markdown files with YAML frontmatter containing coding patterns, testing guidelines, configuration reference, common pitfalls, and Dev MCP tool tables. This is the primary source of "how to code with Quarkus" knowledge.

Key insight for extraction: The actual content lives inside Quarkus extension JARs that ship with each Quarkus release. The SKILL.md files are authored by extension developers and shipped as resources. You'd need to extract these from all extension deployment JARs for a given Quarkus version.
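A rough sketch of that extraction for core extensions, assuming the standard Maven repository layout (`io/quarkus/<artifact>/<version>/<artifact>-<version>.jar`). The function name `extract_skills` is hypothetical; metadata composition, Jandex scanning, and non-core extensions are not modeled.

```python
import zipfile
from pathlib import Path

SKILL_ENTRY = "META-INF/quarkus-skill.md"


def extract_skills(repo_root, version):
    """Return {artifact_id: skill markdown} for one Quarkus version."""
    skills = {}
    group_dir = Path(repo_root) / "io" / "quarkus"
    for artifact_dir in sorted(group_dir.glob("quarkus-*-deployment")):
        # Assumed layout: <artifact>/<version>/<artifact>-<version>.jar
        jar = artifact_dir / version / f"{artifact_dir.name}-{version}.jar"
        if not jar.is_file():
            continue
        with zipfile.ZipFile(jar) as zf:
            if SKILL_ENTRY in zf.namelist():
                skills[artifact_dir.name] = zf.read(SKILL_ENTRY).decode("utf-8")
    return skills
```

For a local repository this would be called as `extract_skills(Path.home() / ".m2" / "repository", "3.34.2")`; JARs without a skill file are simply skipped.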


Source 2: Semantic Documentation Search (Pre-indexed pgvector)

Where captured: Pre-built Docker images at ghcr.io/quarkusio/chappie-ingestion-quarkus:<version>

How gathered: DocSearchTools.java + ContainerManager.java + EmbeddingModelLoader.java:

  • On startup, loads the BGE Small EN v1.5 embedding model (quantized, 384 dimensions) via LangChain4j
  • Starts a pgvector PostgreSQL container via Testcontainers with pre-indexed documentation
  • The container images are version-specific (e.g., tag 3.34.2 has docs for that Quarkus version)
  • Queries use vector similarity search with a minimum score threshold of 0.82

Post-processing logic embedded in Java:

  • Synonym expansion: Maps common terms to Quarkus concepts (e.g., "api" -> "rest", "orm" -> "hibernate", "di" -> "cdi" — 22 synonyms total)
  • Score boosting: +0.15 for title/topic matches, +0.10 for repo path/category matches, +0.08 for section matches
  • Legacy penalties: -0.50 for resteasy-classic guides, -0.50 for internal docs
  • Modern boosts: +0.15 for modern rest/rest-json/rest-client guides
  • Junk filtering: Skips chunks under 50 chars, pure whitespace, or config property tables
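The post-processing rules above can be sketched as follows. The numeric values come from the list; everything else is an assumption for illustration: the function names, the choice to append synonyms rather than replace terms, and the choice to apply the 0.82 threshold to the raw similarity before boosts and penalties re-rank the survivors. Config-property-table filtering is not modeled.

```python
MIN_SCORE = 0.82  # minimum similarity score from the source

# Three of the 22 synonyms mentioned above; the rest are not listed here.
SYNONYMS = {"api": "rest", "orm": "hibernate", "di": "cdi"}


def expand_query(query):
    """Append the Quarkus term for each known synonym found in the query."""
    words = query.lower().split()
    extra = [SYNONYMS[w] for w in words if w in SYNONYMS and SYNONYMS[w] not in words]
    return " ".join(words + extra)


def adjusted_score(raw, title_match=False, path_match=False, section_match=False,
                   legacy=False, internal=False, modern_rest=False):
    """Apply the boosts and penalties to a raw similarity score."""
    score = raw
    score += 0.15 if title_match else 0.0    # title/topic match
    score += 0.10 if path_match else 0.0     # repo path/category match
    score += 0.08 if section_match else 0.0  # section match
    score += 0.15 if modern_rest else 0.0    # modern rest guides
    score -= 0.50 if legacy else 0.0         # resteasy-classic guides
    score -= 0.50 if internal else 0.0       # internal docs
    return score


def rank(hits):
    """hits: list of (raw_score, flags dict, chunk text). Returns texts, best first."""
    kept = [(adjusted_score(raw, **flags), text)
            for raw, flags, text in hits
            if raw >= MIN_SCORE and len(text.strip()) >= 50]  # drop junk chunks
    return [text for _, text in sorted(kept, key=lambda p: p[0], reverse=True)]
```

With these rules a legacy guide that clears the threshold still appears, but it sorts below any modern guide with a comparable raw score.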

Key insight for extraction: The actual documentation content is inside those Docker images as pgvector data. The ingestion pipeline is the separate chappie-docling-rag project (external). You'd need either the raw indexed documents or the original Quarkus guide sources.


Source 3: Embedded Workflow Instructions (application.properties + Tool descriptions)

Where captured: Hardcoded in Java source code as annotation string literals and in application.properties

How gathered: Static content compiled into the server:

  • application.properties lines 7-53: A multi-line instructions string (about 47 lines) that defines the complete AI agent workflow: when to use each tool, testing patterns, error handling, logging, and key rules. It is sent as the MCP server's instructions field.

  • Tool description strings: Each @Tool(description = ...) annotation contains detailed behavioral instructions. For example, quarkus_create has ~20 lines of rules about extension-first development, skill loading order, testing patterns, and README maintenance.

  • quarkus_create also generates: AGENTS.md (180+ lines of project instructions) and CLAUDE.md into every new project, embedding the entire Quarkus development workflow.

Key insight for extraction: This is the "how to use the tools" / "Quarkus development methodology" content. It's easily extractable from the source code directly — it's all string literals.
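Pulling a multi-line value out of application.properties could look like the sketch below. It is simplified on purpose: it handles `key=value` lines, trailing-backslash continuation, and the `\n` escape only, not every rule of the Java .properties format. The function name `read_property` is hypothetical, and the actual property key is not named in this write-up, so the caller has to supply it.

```python
def read_property(text, key):
    """Return the value of `key`, joining backslash-continued lines, or None."""
    lines = iter(text.splitlines())
    for line in lines:
        stripped = line.lstrip()
        if stripped.startswith(("#", "!")) or "=" not in stripped:
            continue  # comment, blank line, or not a key=value line
        name, _, value = stripped.partition("=")
        if name.strip() != key:
            continue
        parts = []
        value = value.lstrip()
        while value.endswith("\\"):
            parts.append(value[:-1])            # drop the continuation backslash
            value = next(lines, "").lstrip()    # leading whitespace is ignored
        parts.append(value)
        return "".join(parts).replace("\\n", "\n")  # decode the \n escape
    return None
```

A key that is absent yields `None`, which makes it easy to tell a missing property from an empty one.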


Source 4: Live Dev MCP Tool Discovery (Runtime introspection)

Where captured: Dynamically discovered from running Quarkus applications

How gathered: DevMcpProxyTools.java:

  • Makes HTTP JSON-RPC calls to http://localhost:<port>/q/dev-mcp
  • tools/list discovers available tools on the running app
  • tools/call invokes discovered tools
  • Tool list is dynamic — changes when extensions are added/removed
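The two calls can be sketched with the standard library as below. The endpoint path and method names come from the list above, and the `name`/`arguments` shape of `tools/call` follows the MCP specification. The helper names are hypothetical, and a real Dev MCP endpoint may require an initialization handshake, session headers, or may answer with a streamed response; none of that is modeled here.

```python
import json
import urllib.request


def rpc_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return body


def call_dev_mcp(port, method, params=None, timeout=5):
    """POST one request to a running app's Dev MCP endpoint and parse the reply."""
    url = f"http://localhost:{port}/q/dev-mcp"
    data = json.dumps(rpc_request(method, params)).encode("utf-8")
    req = urllib.request.Request(url, data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Discovery would then be `call_dev_mcp(8080, "tools/list")`, and an invocation would pass something like `{"name": "<tool>", "arguments": {}}` as `params` to `"tools/call"`, with the tool name taken from the discovery result.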

Content type: Tool schemas (name, description, parameters) from extensions that implement Dev MCP tools (testing, config, OpenAPI, scheduler management, etc.)

Key insight for extraction: This is inherently runtime/dynamic. The available tools depend on which extensions are in the project. However, the Jandex scanning in SkillReader already discovers these tool definitions statically from extension JARs and includes them in skill content. So the tool documentation is partially captured in Source 1.


Source 5: Version & Update Intelligence (GitHub + reference projects)

Where captured: External GitHub repository quarkusio/code-with-quarkus-compare

How gathered: UpdateTools.java:

  • git ls-remote --tags to fetch available version tags
  • HTTP fetch of reference build files (pom.xml/build.gradle) from raw GitHub content
  • Runs quarkus update --dry-run (or Maven/Gradle plugin equivalent) locally
  • Generates comparison diffs between current and latest versions
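The tag-discovery step can be sketched as follows. It assumes that release tags in the reference repository are plain dotted numeric versions such as 3.34.2, which is not confirmed by this write-up; tags in any other shape are ignored. The function names are hypothetical.

```python
import re
import subprocess


def parse_tags(ls_remote_output):
    """Extract tag names from `git ls-remote --tags` output."""
    tags = set()
    for line in ls_remote_output.splitlines():
        _, _, ref = line.partition("\t")
        if ref.startswith("refs/tags/"):
            tag = ref[len("refs/tags/"):]
            if tag.endswith("^{}"):  # peeled entry for an annotated tag
                tag = tag[:-3]
            tags.add(tag)
    return tags


def latest_version(tags):
    """Pick the highest dotted numeric version, or None if there is none."""
    numeric = [t for t in tags if re.fullmatch(r"\d+(\.\d+)*", t)]
    if not numeric:
        return None
    return max(numeric, key=lambda t: tuple(int(p) for p in t.split(".")))


def fetch_tags(repo_url):
    """Run git against a remote and return its tag names."""
    out = subprocess.run(["git", "ls-remote", "--tags", repo_url],
                         check=True, capture_output=True, text=True).stdout
    return parse_tags(out)
```

Comparing versions as integer tuples avoids the string-ordering trap where "3.9.1" would sort above "3.34.2".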

Key insight for extraction: This is operational tooling, not knowledge content. The version comparison logic and upgrade advice are algorithmic, not content-based.


Source 6: Extension Metadata (YAML from runtime JARs)

Where captured: Inside extension runtime JARs at META-INF/quarkus-extension.yaml

How gathered: SkillReader.parseExtensionYaml():

  • Extracts: name, description, guide URL, categories
  • Used to compose the header of each skill (name, description, guide reference)
  • Also used for categorizing skills in the index display
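The categorization step can be illustrated with a small sketch. It assumes the YAML has already been parsed into dictionaries carrying `name` and `categories`; the function name `build_index`, the `uncategorized` bucket, and the sample values in the usage note are assumptions for the example, not taken from SkillReader.

```python
from collections import defaultdict


def build_index(extensions):
    """Group extension names by category, both sorted alphabetically."""
    index = defaultdict(list)
    for ext in extensions:
        # Extensions without categories land in a hypothetical fallback bucket.
        for category in ext.get("categories") or ["uncategorized"]:
            index[category].append(ext["name"])
    return {cat: sorted(names) for cat, names in sorted(index.items())}
```

For example, `build_index([{"name": "REST", "categories": ["web"]}, {"name": "ArC"}])` places "REST" under "web" and "ArC" under "uncategorized"; an extension with several categories is listed under each of them.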

Key insight for extraction: Standard Quarkus metadata that ships with every extension. Easily extractable from Maven Central JARs.


Summary: What to Extract for agentskill.io

| Source | Content Type | Extraction Difficulty | Value |
|---|---|---|---|
| Extension Skills (SKILL.md) | Coding patterns, testing, pitfalls | Medium: need to iterate all extension deployment JARs for each version | Highest: this is the core "how to write Quarkus code" knowledge |
| Documentation (pgvector) | Full Quarkus guides, chunked | Medium: need the raw guide content from quarkus.io or the ingestion pipeline | High: comprehensive API/config documentation |
| Workflow Instructions | Development methodology | Easy: copy from source code strings | High: guides AI agents on Quarkus development process |
| Extension Metadata | Names, descriptions, guide URLs, categories | Easy: parse YAML from runtime JARs | Medium: provides the extension catalog |
| Dev MCP Tool Schemas | Tool names, descriptions, parameters | Medium: Jandex scanning or runtime query | Medium: already partially in skills |
| Update Intelligence | Version comparison, migration recipes | N/A: algorithmic, not content | Low: operational, not knowledge |

The Critical Path for agentskill.io Extraction

  1. Extract all SKILL.md files from every Quarkus extension deployment JAR for a target version — this is where the actionable coding patterns live
  2. Extract the workflow instructions from application.properties and tool descriptions — this is the development methodology
  3. Get the raw documentation either from the pgvector container dump or the quarkus.io source guides
  4. Build the extension metadata catalog from runtime JAR YAML files

The MCP server itself is primarily a delivery mechanism and runtime orchestrator. It does not generate Quarkus knowledge; it aggregates it from extension JARs, pre-indexed docs, and hardcoded instructions, then delivers it through MCP tools. The knowledge itself can be extracted directly from those sources.
