@bartoszmajsak
Last active February 14, 2026 22:42
KServe make precommit optimization plan — 5 focused PRs (analysis session: kserve/optimizations/precommit)

Plan: Optimize make precommit -- 5 focused PRs

Context

make precommit takes ~101s on a clean tree. generate (57.7s) and manifests (33.8s) together account for 79% of the summed per-target time in the baseline below, and both run unconditionally even when no relevant files changed.

Deep tool-level analysis uncovered additional waste:

  • 42 sequential yq invocations in manifests (16.4s) can be batched into 5 calls (1.3s)
  • go vet on main module is redundant (5.1s) -- golangci-lint already includes govet
  • yq+jq+xargs protocol pipeline spawns yq 24+ times via xargs (4.3s → 0.3s single expression)
  • $(shell perl ...) mutates files at parse time on every make invocation
  • The default target, all: test manager agent router, runs tests on a bare make
  • The test target lists fmt, vet, and manifests as prerequisites, which duplicates work precommit already does

Timing baseline

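| Target | Time | % |
|---|---|---|
| ensure-go-version-upgrade | 0.02s | 0% |
| sync-deps | 2.29s | 2% |
| sync-img-env | 0.04s | 0% |
| vet | 5.85s | 5% |
| tidy | 0.43s | 0.4% |
| go-lint | 3.23s | 3% |
| py-fmt | 1.51s | 1% |
| py-lint | 0.05s | 0% |
| generate | 57.73s | 50% |
| manifests | 33.77s | 29% |
| uv-lock | 0.75s | 1% |
| generate-quick-install-scripts | 10.60s | 9% |
| TOTAL | ~101s | |

Note: the per-target times sum to ~116s, and the % column is relative to that sum. ~101s is the figure this plan uses as the baseline for a full make precommit run.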
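
As a quick sanity check, the percentage column can be recomputed from the per-target times. This is a minimal sketch: the `shares` and `baseline_timings` helpers are hypothetical, and it assumes each share is taken relative to the sum of the listed times.

```shell
# Hypothetical helper: read "target seconds" pairs on stdin and print each
# target's share of the summed time, rounded to a whole percent.
shares() {
  awk '{ name[NR] = $1; secs[NR] = $2; total += $2 }
       END { for (i = 1; i <= NR; i++)
               printf "%s %.0f%%\n", name[i], secs[i] / total * 100 }'
}

# Per-target times copied from the timing baseline.
baseline_timings() {
  cat <<'EOF'
ensure-go-version-upgrade 0.02
sync-deps 2.29
sync-img-env 0.04
vet 5.85
tidy 0.43
go-lint 3.23
py-fmt 1.51
py-lint 0.05
generate 57.73
manifests 33.77
uv-lock 0.75
generate-quick-install-scripts 10.60
EOF
}

baseline_timings | shares
```

With these inputs, generate comes out at 50% and manifests at 29%, which matches the table.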

Tool-level optimizations (all measured and verified)

| Optimization | Before | After | Savings | Verified |
|---|---|---|---|---|
| yq: batch 42 calls → 5 piped | 16.39s | 1.33s | -15.1s | Identical output |
| yq: single expr replaces yq+jq+xargs | 4.33s | 0.31s | -4.0s | Identical output |
| go vet: drop redundant main module scan | 5.05s | 0s | -5.1s | govet in golangci-lint |
| generate: internal parallelism | 57.73s | 39.51s | -18.2s | Tested |
| manifests: all optimizations combined | 26.22s | 15.43s | -10.8s | Tested |

Expected results after all PRs

| Scenario | Before | After | Speedup |
|---|---|---|---|
| make precommit (full, auto-parallel) | 101s | ~55s | 45% |
| make precommit-quick (no API changes) | 101s | ~8s | 92% |
| make (bare) | runs tests | precommit-quick + test | correct UX |

PR 1: Fix parse-time side-effects, default target, missing .PHONY

Files: Makefile Risk: Near zero

Changes

  1. Remove parse-time $(shell perl ...) (lines 30-31). Move to patch-manager-resources target, add as prerequisite of deploy/deploy-dev.

  2. Change all target to all: precommit-quick test.

  3. Add missing .PHONY for: precommit, check, vet, tidy, go-lint, py-fmt, py-lint, fmt, generate, manifests, test, generate-quick-install-scripts, validate-infra-scripts, uv-lock.

  4. Remove redundant deps from the test target: change test: fmt vet manifests envtest test-qpext to test: envtest setup-envtest test-qpext. Because all now calls precommit-quick before test, fmt/vet/manifests are already done on that path. Note that a standalone make test no longer runs them.
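
To illustrate why change 1 matters, the sketch below contrasts a parse-time `$(shell ...)` with the same command moved into a recipe. It uses two throwaway Makefiles. The file names, target names, and marker files are all hypothetical, `touch` stands in for the real perl command, and it assumes `make` is on the PATH.

```shell
demo_dir=$(mktemp -d)

# Variant 1: $(shell ...) in a variable assignment runs while make parses
# the file, before any target is selected.
cat > "$demo_dir/parse-time.mk" <<'EOF'
IGNORED := $(shell touch parse-time-marker)
.PHONY: noop
noop: ; @true
EOF

# Variant 2: the same command inside a recipe runs only when that target
# is requested.
cat > "$demo_dir/recipe-time.mk" <<'EOF'
.PHONY: noop patch
noop: ; @true
patch: ; touch recipe-time-marker
EOF

# A dry run of an unrelated target is enough to trigger the parse-time command.
(cd "$demo_dir" && make -n -f parse-time.mk noop >/dev/null)
(cd "$demo_dir" && make -n -f recipe-time.mk noop >/dev/null)

ls "$demo_dir"
```

After both dry runs, only `parse-time-marker` exists. The recipe variant touches nothing until `patch` is actually built.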


PR 2: Drop redundant go vet, batch yq, optimize manifests internals

Files: Makefile, hack/minimal-crdgen.sh Risk: Low. All changes verified to produce identical output.

Changes

A: Drop redundant go vet on main module (-5.1s)

golangci-lint already includes the govet linter and scans the same packages. qpext is a separate Go module not reachable by golangci-lint, so it still needs go vet.

vet:
	cd qpext && go vet ./...

B: Batch yq invocations in manifests (42 calls → 5, saves ~15s)

Replace all sequential $(YQ) ... -i file.yaml calls per file with single piped expressions using |. Example for llminferenceserviceconfigs.yaml (19 calls → 1):

@$(YQ) '
  del(.spec.versions[1]...x-kubernetes-validations) |
  del(.spec.versions[1]...pattern) |
  ... all 19 operations piped ...
' -i config/crd/full/llmisvc/serving.kserve.io_llminferenceserviceconfigs.yaml

Same for: llminferenceservices.yaml (10→1), inferenceservices.yaml (8+protocol→1), clusterservingruntimes.yaml (1), servingruntimes.yaml (1).

C: Replace yq+jq+xargs protocol pipeline with single yq expression (-4s per file)

Current (spawns yq 24 times via xargs for inferenceservices.yaml alone):

@$(YQ) '... | path' file -o j | jq -r '...' | awk '...' | xargs -n1 -I{} $(YQ) '{} = "TCP"' -i file

Replace with single expression (0.31s instead of 4.33s):

@$(YQ) '(.spec.versions[0].schema.openAPIV3Schema.properties.spec.properties | .. | select(has("protocol")).protocol.default) = "TCP"' -i file

Apply to all 3 files: inferenceservices.yaml, clusterservingruntimes.yaml, servingruntimes.yaml. Removes dependency on jq too.
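
The cost of the old pipeline comes largely from starting one process per path. The sketch below only illustrates the spawn count, not the yq expression itself: `echo` stands in for `$(YQ)`, and the `paths` helper and its three entries are hypothetical.

```shell
# Hypothetical stand-in for the list of YAML paths the old pipeline produced.
paths() { printf '%s\n' .a.protocol .b.protocol .c.protocol; }

# One process per path, as with `xargs -n1`: each output line is one spawn.
per_item=$(paths | xargs -n1 echo | wc -l | tr -d ' ')

# A single invocation that receives every path at once.
batched=$(paths | xargs echo | wc -l | tr -d ' ')

echo "per-item spawns: $per_item, batched spawns: $batched"
```

With 24 paths instead of 3, the per-item variant pays the process start-up and file re-parse cost 24 times, which is consistent with the 4.33s versus 0.31s measurement.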

D: Pre-build crd-gen in hack/minimal-crdgen.sh (-0.24s)

Build crd-gen once with go build -o and run the resulting binary, instead of calling go run ./cmd/crd-gen each time. This avoids 11 go run compilation cycles.

E: Deduplicate kubectl kustomize for llmisvc

Lines 194-195 call kubectl kustomize config/crd/full/llmisvc twice. Run once, filter twice:

@LLMISVC_CRD=$$(kubectl kustomize config/crd/full/llmisvc) && \
  echo "$$LLMISVC_CRD" | $(YQ) 'select(.metadata.name == "llminferenceservices...")' > ... && \
  echo "$$LLMISVC_CRD" | $(YQ) 'select(.metadata.name == "llminferenceserviceconfigs...")' > ...
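
The "run once, filter twice" pattern can be tried in isolation. In this sketch `render` is a hypothetical stand-in for `kubectl kustomize`, and `grep` stands in for the yq `select` filters. It logs each call so the invocation count can be checked.

```shell
# Hypothetical stand-in for the expensive command: emits two records and
# appends a line to a log file on every call.
calls_log=$(mktemp)
render() {
  echo called >> "$calls_log"
  printf '%s\n' 'name: first' 'name: second'
}

# Capture the output once, then filter the captured text twice.
rendered=$(render)
first=$(printf '%s\n' "$rendered" | grep 'first')
second=$(printf '%s\n' "$rendered" | grep 'second')

echo "render ran $(wc -l < "$calls_log" | tr -d ' ') time(s)"
```

One detail to keep in mind: command substitution strips trailing newlines from the captured text, which should be harmless for YAML but is worth knowing when diffing outputs byte for byte.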

PR 3: Automatic parallelism in precommit and generate

Files: Makefile Risk: Low. Parallelism is built into target recipes, no -j flag needed.

Changes

A: Restructure precommit into phased parallel execution

Important: tidy mutates go.mod/go.sum and go-lint --fix mutates Go source. These must complete before read-only checks (vet) run. Split into ordered phases:

# Portable CPU count (nproc on Linux, sysctl on macOS, fallback to 4)
NPROC ?= $(shell nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 4)

.PHONY: precommit-sync precommit-mutate precommit-checks

# Phase 1: sync + mutations (safe to parallelize with each other)
precommit-sync: ensure-go-version-upgrade sync-deps sync-img-env
precommit-mutate: tidy go-lint  # go-lint uses --fix, tidy modifies go.mod/go.sum

# Phase 2: read-only checks (safe to parallelize after mutations)
precommit-checks: vet py-fmt py-lint

precommit:
	@$(MAKE) --no-print-directory -j$(NPROC) precommit-sync precommit-mutate
	@$(MAKE) --no-print-directory -j$(NPROC) precommit-checks generate manifests uv-lock
	@$(MAKE) --no-print-directory generate-quick-install-scripts

  • Phase 1 (parallel): sync + tidy + go-lint --fix (all mutating, but on different files).
  • Phase 2 (parallel): vet + py-fmt + py-lint + generate + manifests + uv-lock (read-only checks are safe after mutations complete; generate/manifests write to separate dirs).
  • Phase 3 (sequential): generate-quick-install-scripts (needs manifests + sync-deps output).
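
The NPROC fallback chain can be checked outside make before relying on it, for example on a macOS machine. This sketch wraps the same chain in a function. The `detect_nproc` and `job_count` names are hypothetical.

```shell
# Same fallback chain as the NPROC variable: the first command that
# succeeds prints the CPU count, and 4 is the last resort.
detect_nproc() {
  nproc 2>/dev/null \
    || getconf _NPROCESSORS_ONLN 2>/dev/null \
    || sysctl -n hw.ncpu 2>/dev/null \
    || echo 4
}

job_count=$(detect_nproc)
echo "would run: make -j$job_count"
```

Because NPROC is assigned with `?=`, it can also be overridden per invocation (for example `make precommit NPROC=2`) on machines where full parallelism is too heavy.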

B: Parallelize within generate (-18s)

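# Each background job is waited on by PID: a bare `wait` exits 0 even when
# a job failed, which would hide codegen errors from make.
generate: controller-gen helm-docs
	@hack/update-codegen.sh & codegen_pid=$$!; \
	{ hack/update-openapigen.sh && hack/python-sdk/client-gen.sh; } & openapi_pid=$$!; \
	rc=0; wait $$codegen_pid || rc=1; wait $$openapi_pid || rc=1; \
	exit $$rc
	@$(HELM_DOCS) --chart-search-root=charts --output-file=README.md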

Codegen (29.5s) runs in parallel with openapigen→client-gen (14.2s). Verified: they write to separate dirs (pkg/client/ vs pkg/openapi/ + python/kserve/).
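
One pitfall when backgrounding jobs in a recipe: `wait` without operands exits 0 regardless of how the jobs ended, so a failed generator can go unnoticed. The sketch below shows the difference. `run_parallel` is a hypothetical helper, and `true`/`false` stand in for the real scripts.

```shell
# Hypothetical helper: run each argument as a background command and
# return non-zero if any of them failed.
run_parallel() {
  local pids="" rc=0 cmd pid
  for cmd in "$@"; do
    sh -c "$cmd" &
    pids="$pids $!"
  done
  for pid in $pids; do
    wait "$pid" || rc=1
  done
  return "$rc"
}

# A bare `wait` reports success even though one job failed.
( true & false & wait ) && echo "bare wait: reported success"

# Waiting on each PID surfaces the failure.
run_parallel "true" "false" || echo "per-PID wait: reported failure"
```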

C: Parallelize object generation within manifests (-0.7s)

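# Wait on each PID so a failing controller-gen run fails the recipe (a bare `wait` always exits 0).
@$(CONTROLLER_GEN) object:headerFile="hack/boilerplate.go.txt" paths=./pkg/apis/serving/v1alpha1 & p1=$$!; \
$(CONTROLLER_GEN) object:headerFile="hack/boilerplate.go.txt" paths=./pkg/apis/serving/v1alpha2 & p2=$$!; \
$(CONTROLLER_GEN) object:headerFile="hack/boilerplate.go.txt" paths=./pkg/apis/serving/v1beta1 & p3=$$!; \
rc=0; for p in $$p1 $$p2 $$p3; do wait $$p || rc=1; done; exit $$rc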

Note: RBAC parallelism was tested and showed no improvement (2.38s→2.61s).


PR 4: Add precommit-quick with git-based change detection

Files: Makefile, hack/changed-paths.sh (new) Risk: Medium-low. CI uses full precommit. Escape hatch: FORCE_FULL=1.

hack/changed-paths.sh (new, ~20 lines)

#!/bin/bash
# Exit 0 if files matching given patterns changed (uncommitted, untracked, or on branch).
set -euo pipefail
MERGE_BASE=$(git merge-base HEAD origin/master 2>/dev/null \
          || git merge-base HEAD main 2>/dev/null \
          || echo "HEAD~1")
# Succeed if the given git command prints at least one path. Capturing the
# output avoids `| grep -q .`, which under pipefail can report failure when
# git is killed by SIGPIPE after grep exits on its first match.
lists_paths() { [ -n "$("$@" 2>/dev/null)" ]; }
# Check uncommitted changes (staged + unstaged)
if lists_paths git diff --name-only HEAD -- "$@"; then exit 0; fi
# Check untracked files matching the patterns
if lists_paths git ls-files --others --exclude-standard -- "$@"; then exit 0; fi
# Check branch changes (committed but not in base)
if lists_paths git diff --name-only "${MERGE_BASE}"..HEAD -- "$@"; then exit 0; fi
exit 1
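
A note on output checks under `set -o pipefail`: testing the captured output as a string is more robust than piping into `grep -q .`. With pipefail, the pipeline's status includes the producer's, and a producer that is still writing when `grep -q` exits fails with a broken pipe. The sketch below makes this deterministic with `yes`. The `has_output` helper is hypothetical.

```shell
# With pipefail, the pipeline fails when the producer dies on a broken pipe
# after `grep -q` has already found its match and exited.
if ( set -o pipefail; yes 2>/dev/null | grep -q . ); then
  echo "pipeline: match reported"
else
  echo "pipeline: match lost to the producer's exit status"
fi

# Hypothetical helper: succeed if the command prints anything at all,
# judged by the captured output rather than by a pipeline exit status.
has_output() {
  [ -n "$("$@" 2>/dev/null)" ]
}

has_output printf 'changed-file\n' && echo "string test: match reported"
```

With short git output the producer usually finishes before grep exits, so the failure tends to appear only when many files have changed, which makes it easy to miss in testing.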

Change detection rules

| Trigger paths | Target |
|---|---|
| pkg/apis/serving/ hack/update-codegen.sh hack/update-openapigen.sh hack/python-sdk/ cmd/spec-gen/ go.mod charts/ hack/boilerplate.go.txt | generate |
| pkg/apis/serving/ pkg/controller/ config/crd/ config/rbac/ hack/minimal-crdgen.sh config/configmap/ charts/ kserve-deps.env Makefile | manifests |
| python/*/pyproject.toml | uv-lock |
| hack/setup/ kserve-deps.env kserve-images.env | generate-quick-install-scripts |
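
To make the rules concrete, the sketch below maps a single changed path to the targets it would trigger. It is an illustration of the table only, not part of the plan: `targets_for_path` is a hypothetical helper, and it assumes directory triggers match by prefix.

```shell
# Hypothetical helper: print the targets whose trigger paths match the
# given path. In `case` patterns, `*` also matches `/`.
targets_for_path() {
  local path=$1 out=""
  case $path in
    pkg/apis/serving/*|hack/update-codegen.sh|hack/update-openapigen.sh|hack/python-sdk/*|cmd/spec-gen/*|go.mod|charts/*|hack/boilerplate.go.txt)
      out="$out generate" ;;
  esac
  case $path in
    pkg/apis/serving/*|pkg/controller/*|config/crd/*|config/rbac/*|hack/minimal-crdgen.sh|config/configmap/*|charts/*|kserve-deps.env|Makefile)
      out="$out manifests" ;;
  esac
  case $path in
    python/*/pyproject.toml)
      out="$out uv-lock" ;;
  esac
  case $path in
    hack/setup/*|kserve-deps.env|kserve-images.env)
      out="$out generate-quick-install-scripts" ;;
  esac
  echo "${out# }"   # drop the leading separator
}

targets_for_path pkg/apis/serving/v1beta1/types.go
targets_for_path kserve-deps.env
targets_for_path README.md
```

An API type change maps to generate and manifests, kserve-deps.env maps to manifests and generate-quick-install-scripts, and a docs-only change maps to nothing, which is the case precommit-quick is meant to speed up.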

Makefile target

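FORCE_FULL ?=

# Trigger paths per target, taken from the change detection rules
# (variable names are illustrative). The glob is quoted so git expands it.
GENERATE_TRIGGERS := pkg/apis/serving/ hack/update-codegen.sh hack/update-openapigen.sh hack/python-sdk/ cmd/spec-gen/ go.mod charts/ hack/boilerplate.go.txt
MANIFESTS_TRIGGERS := pkg/apis/serving/ pkg/controller/ config/crd/ config/rbac/ hack/minimal-crdgen.sh config/configmap/ charts/ kserve-deps.env Makefile
UV_LOCK_TRIGGERS := 'python/*/pyproject.toml'
INSTALL_SCRIPTS_TRIGGERS := hack/setup/ kserve-deps.env kserve-images.env

.PHONY: precommit-quick
precommit-quick:
	@$(MAKE) --no-print-directory -j$(NPROC) precommit-sync precommit-mutate
	@$(MAKE) --no-print-directory -j$(NPROC) precommit-checks
	@if [ "$(FORCE_FULL)" = "1" ] || hack/changed-paths.sh $(GENERATE_TRIGGERS); then \
	  echo "==> Running generate..."; $(MAKE) generate; \
	else echo "==> Skipping generate"; fi
	@if [ "$(FORCE_FULL)" = "1" ] || hack/changed-paths.sh $(MANIFESTS_TRIGGERS); then \
	  echo "==> Running manifests..."; $(MAKE) manifests; \
	else echo "==> Skipping manifests"; fi
	@if [ "$(FORCE_FULL)" = "1" ] || hack/changed-paths.sh $(UV_LOCK_TRIGGERS); then \
	  echo "==> Running uv-lock..."; $(MAKE) uv-lock; \
	else echo "==> Skipping uv-lock"; fi
	@if [ "$(FORCE_FULL)" = "1" ] || hack/changed-paths.sh $(INSTALL_SCRIPTS_TRIGGERS); then \
	  echo "==> Running generate-quick-install-scripts..."; \
	  $(MAKE) generate-quick-install-scripts; \
	else echo "==> Skipping generate-quick-install-scripts"; fi

  • make precommit-quick -- fast dev workflow (~8s for non-API changes)
  • FORCE_FULL=1 make precommit-quick -- runs everything
  • make precommit -- unchanged, always runs everything (used by CI)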

PR 5: Cache network fetch in manifests

Files: Makefile, .gitignore Risk: Low.

Cache gateway-api-inference-extension CRD by version

Guard on both version match and CRD file existence (prevents false cache hit when CRD file is missing/deleted but version file remains):

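# Fetch into a temp file and chain with && so a failed fetch cannot leave a
# truncated CRD file next to a matching version file.
@if [ ! -f test/crds/.gie-version ] || \
    [ "$$(cat test/crds/.gie-version)" != "$(GIE_VERSION)" ] || \
    [ ! -f test/crds/gateway-inference-extension.yaml ]; then \
  echo "Fetching gateway-api-inference-extension CRD $(GIE_VERSION)..."; \
  kubectl kustomize https://...?ref=$(GIE_VERSION) > test/crds/gateway-inference-extension.yaml.tmp && \
  mv test/crds/gateway-inference-extension.yaml.tmp test/crds/gateway-inference-extension.yaml && \
  echo "$(GIE_VERSION)" > test/crds/.gie-version; \
else echo "Using cached gateway-inference-extension CRD ($(GIE_VERSION))"; fi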

Add test/crds/.gie-version to .gitignore.
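
The guard's three conditions can be exercised without touching the network. This is a minimal sketch: `needs_fetch` is a hypothetical helper that mirrors the guard, and the paths and the `v1.0.0` version string are made up for the demo.

```shell
# Hypothetical helper: succeed (exit 0) when a fetch is needed, i.e. the
# version file is missing, records another version, or the CRD file is gone.
needs_fetch() {
  local version_file=$1 crd_file=$2 wanted=$3
  [ ! -f "$version_file" ] \
    || [ "$(cat "$version_file")" != "$wanted" ] \
    || [ ! -f "$crd_file" ]
}

cache_dir=$(mktemp -d)
version_file="$cache_dir/.gie-version"
crd_file="$cache_dir/crd.yaml"

needs_fetch "$version_file" "$crd_file" v1.0.0 && echo "cold cache: fetch"

echo v1.0.0 > "$version_file"
echo "kind: CustomResourceDefinition" > "$crd_file"
needs_fetch "$version_file" "$crd_file" v1.0.0 || echo "warm cache: skip"

rm "$crd_file"
needs_fetch "$version_file" "$crd_file" v1.0.0 && echo "CRD file deleted: fetch"
```

The last case is the one the extra file-existence check exists for: the version file still matches, but the cached CRD is gone.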


PR merge order

  1. PR 1 (side-effects + default target + .PHONY) -- zero risk, foundational
  2. PR 2 (drop go vet + batch yq + optimize manifests) -- biggest tool-level wins
  3. PR 5 (network cache) -- independent, can merge any time
  4. PR 3 (auto-parallelism) -- depends on PR 2 for reduced manifests time
  5. PR 4 (change detection) -- depends on PR 3 for sub-target names and phase structure

PRs 1, 2, and 5 can be reviewed in parallel.


Verification

After all PRs:

  1. make -- runs precommit-quick + test, no parse-time side-effects
  2. make precommit -- full auto-parallel run, identical output to baseline
  3. make precommit-quick on clean tree -- skips heavy targets (~8s)
  4. make precommit-quick after editing pkg/apis/serving/v1beta1/*.go -- runs generate+manifests
  5. make precommit-quick with new untracked file in config/crd/ -- triggers manifests
  6. make check -- CI path, unchanged
  7. time make precommit vs baseline 101s -- expect ~55s (45% faster)
  8. time make precommit-quick (no API changes) -- expect ~8s (92% faster)
  9. git diff after each target -- verify identical output
  10. Test on macOS (if available) -- verify NPROC fallback works
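
For step 9, `git diff` does not list untracked files, so a checksum snapshot of the generated directories can complement it. This is a sketch under assumptions: `snapshot` is a hypothetical helper, and a throwaway directory stands in for the real output directories.

```shell
# Hypothetical helper: print a checksum line for every file under a
# directory, sorted by path so two snapshots can be compared as strings.
snapshot() {
  (cd "$1" && find . -type f -exec cksum {} + | sort -k3)
}

# Demo on a throwaway directory standing in for generated output.
out_dir=$(mktemp -d)
echo "apiVersion: v1" > "$out_dir/crd.yaml"

before=$(snapshot "$out_dir")
# ... the optimized target would run here ...
after=$(snapshot "$out_dir")

[ "$before" = "$after" ] && echo "output identical"
```

Taking the `before` snapshot from a baseline run and the `after` snapshot from the optimized target gives a byte-level comparison across tracked and untracked files alike.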

Review findings addressed

| Finding | Resolution |
|---|---|
| High: Race condition in parallel phase (tidy/--fix mutate) | Split into 3 ordered phases: sync + mutate → read-only checks (alongside generate, manifests, uv-lock) → generate-quick-install-scripts. tidy and go-lint --fix complete before vet runs. |
| High: changed-paths.sh misses untracked files | Added git ls-files --others --exclude-standard check between uncommitted and branch checks. |
| Medium: charts/ missing from generate triggers | Added charts/ and hack/boilerplate.go.txt to generate trigger paths. |
| Medium: kserve-deps.env missing from manifests triggers | Added kserve-deps.env and Makefile to manifests trigger paths. |
| Medium: nproc Linux-specific | Added portable NPROC variable with fallback chain: nproc → getconf _NPROCESSORS_ONLN → sysctl -n hw.ncpu → 4. |
| Low: PR5 cache guard doesn't check CRD file existence | Added [ ! -f test/crds/gateway-inference-extension.yaml ] to guard condition. |