@dims
dims / 2026-05-11-dra-driver-nvidia-gpu-external-contributors.md
Last active May 11, 2026 18:20
dra-driver-nvidia-gpu — External Contributor Report (2026-05-11)

dra-driver-nvidia-gpu — External Contributor Report

Generated: 2026-05-11 (rev. 2, Helios cross-check added)
Repo: kubernetes-sigs/dra-driver-nvidia-gpu
Repo history: 2022-07-14 → 2026-05-11 (~3.8 years)
Total commits analyzed: 1,853 (47 unique author emails)
Methodology: Extracted all unique commit authors via git log → classified by email domain (@nvidia.com = NVIDIA, all others = candidates) → mapped commits to GitHub logins via GET /repos/.../commits/{sha} → verified every candidate against GET /orgs/NVIDIA/members/{username} (HTTP 204 = confirmed member, 404 = not a member) → for ambiguous cases, additionally cross-referenced against NVIDIA Helios LDAP (helios-cli user search) to detect NVIDIA employees who contribute via personal GitHub accounts not registered in the NVIDIA org → cross-referenced GitHub profiles, DCO Signed-off-by trailers, LinkedIn, and corporate-email patterns → folded NVIDIA-personal-e
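
The first two classification steps in the methodology above (email-domain split, then interpreting the org-membership status code) can be sketched roughly as follows. This is an illustrative sketch only, not the script used for the report: the function names are hypothetical, and no network calls are made. Status codes other than 204 and 404 are treated as inconclusive rather than guessed at.

```python
NVIDIA_DOMAIN = "nvidia.com"


def classify_by_email(email):
    """Return 'nvidia' for @nvidia.com addresses, 'candidate' for all others."""
    domain = email.rsplit("@", 1)[-1].strip().lower()
    return "nvidia" if domain == NVIDIA_DOMAIN else "candidate"


def is_org_member(status_code):
    """Interpret a membership-check response: 204 = member, 404 = not a member."""
    if status_code == 204:
        return True
    if status_code == 404:
        return False
    # Anything else (for example a redirect) is inconclusive, so fail loudly.
    raise ValueError(f"inconclusive membership status: {status_code}")


def external_candidates(emails):
    """Deduplicate author emails and keep only the non-NVIDIA candidates."""
    return sorted({e.strip().lower() for e in emails
                   if classify_by_email(e) == "candidate"})
```

Candidates returned by `external_candidates` would still need the later manual cross-checks the methodology describes, since a personal email address alone does not establish that a contributor is external.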

@dims
dims / 2026-05-10-k8s-ci-failures-triage-v3.md
Created May 11, 2026 00:44
K8s CI triage runbook + v3 flakes report + v3 failures report (2026-05-10)

Kubernetes CI Failures — Triage Report (v3, independent)

Date: 2026-05-10 (PM)
Source: failures-latest.json (HTML view: failures-latest.html).
Snapshot: 231 jobs.
Method: 10 parallel cluster-investigation agents → 1 independent cross-check verifier (8 claims: 6 CONFIRMED / 2 PARTIAL / 0 REFUTED) → live PR/issue state sweep on 56 references → drift detection against 2026-05-09 snapshot.
Truly independent: no prior triage markdown was read; every claim re-derived from raw artifacts.

⚠️ Status banner:

  • Fix PRs merged today: k/k#138934 (coverage), k/k#138851 (ContainerMetrics), k/k#138584 (compat-versions, INCOMPLETE: needs release-1.36 cherry-pick), k/k#137936 (storage-kind), kops#18296 (upgrade-gossip), provider-aws-test-infra#550 (AMI build), cloud-provider-kind#407 (Pattern A digest pin).
  • Drift recovery: `ci-kubernetes-e2e
@dims
dims / 2026-05-05-kubernetes-security-findings.md
Last active May 5, 2026 18:08
Kubernetes Security Findings — May 2026

Repository: kubernetes/kubernetes
Commit: 47f990437458a2b171f51b5e97a0c28c81d949d1 (master, 2026-05-05)
Methods: Static multi-agent source review (87 files across 4 researchers) + dynamic execution harness (kubectl, 3 agents)
Subsystems: authentication, authorization/RBAC, admission control/webhooks, node authorization (NodeAuthorizer + DRA graph)


Table of Contents

@dims
dims / kube-openapi-pr590-risk-analysis.md
Last active April 27, 2026 13:13
kube-openapi PR #590 risk analysis: go-openapi/swag v0.23.0→v0.25.4 behavioral deep-dive

kube-openapi PR #590 — Deep-Dive Risk Analysis

Upgrading go-openapi/swag v0.23.0 → v0.25.4

Prepared: 2026-04-27
PR: kubernetes/kube-openapi#590
Reviewer question: "go-openapi has some reputation of changing semantics without notification by accident. As we use it in our CRD validation there is risk that we break our API (we have forked the go-openapi validator nowadays, so risk is lower than in the past, but worth a check anyway)."


Executive Summary

@dims
dims / k8s-unwanted-deps-2026-05.md
Last active May 5, 2026 12:17
Kubernetes unwanted vendor dependencies status — April 2026

Kubernetes Unwanted Dependencies: Status Report

Date: May 2026
Branch: master (commit 47f990437458a2b171f51b5e97a0c28c81d949d1)
Scope: hack/unwanted-dependencies.json — modules listed in spec.unwantedModules that are still present in vendor/
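
The scope check described above (modules named in spec.unwantedModules that still appear in vendor/) could be approximated with a sketch like the one below. It assumes spec.unwantedModules is a JSON object keyed by module path and that vendor/modules.txt uses the standard `# <module> <version>` header lines; the helper names are hypothetical, and this is not the tooling Kubernetes itself uses.

```python
import json


def vendored_modules(modules_txt):
    """Collect module paths from '# <module> <version>' header lines."""
    mods = set()
    for line in modules_txt.splitlines():
        parts = line.split()
        # '## explicit' markers and bare package paths are skipped.
        if len(parts) >= 3 and parts[0] == "#":
            mods.add(parts[1])
    return mods


def still_vendored(unwanted_json, modules_txt):
    """Return unwanted module paths that are still present in vendor/."""
    unwanted = json.loads(unwanted_json)["spec"]["unwantedModules"]
    return sorted(set(unwanted) & vendored_modules(modules_txt))
```

In practice the two arguments would be the contents of hack/unwanted-dependencies.json and vendor/modules.txt read from a checkout of the branch under review.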


Background

@dims
dims / k8s-thermal-masking-full-analysis.md
Last active April 25, 2026 12:13
Kubernetes thermal masking regression analysis and runc shared-tmpfs fix

Kubernetes Thermal Masking Regression: Full Technical Analysis

Issues: k/k#138512, k/k#138388
Root PR: k/k#131018 (merged 2025-07-15, backported 2025-09-03)
Affects: Kubernetes 1.31–1.34, Intel CPUs, high core counts
Date written: 2026-04-24
Updated: 2026-04-25 with runc implementation branch and validation results

Public disclosure note: this analysis is based on public Kubernetes, runc, containerd, and runtime ecosystem issue/PR discussion. The referenced GHSA was still inaccessible when this note was written, so no non-public advisory text is quoted here.

@dims
dims / 2026-04-23-dep-security-analysis-v2.md
Last active April 24, 2026 00:51
Kubernetes dependency security analysis 2026-04-23 (43 packages)

Kubernetes Dependency Security Analysis

Date: 2026-04-23
Packages analyzed: 43
Method: GitHub diff inspection, Go Vulnerability Database, CVE/GHSA search, K8s source grep for reachability


Executive Summary

Of 43 packages with version gaps, 2 require prompt action (live CVE or directly reachable hardening fix), 3 are medium priority (correctness/transitive security value), and the remainder are routine hygiene with no meaningful security delta. Two packages had known CVEs that are already patched in the currently pinned version.

@dims
dims / 2026-04-23-constants-module-impact.md
Created April 23, 2026 21:31
What k8s.io/constants enables — prioritized impact analysis (PR #135896)

What k8s.io/constants enables — prioritized impact analysis

PR: kubernetes/kubernetes#135896
Branch: add-constants-module at /Users/dsrinivas/go/src/k8s.io/kubernetes-pr135896
Cross-checked: all factual claims below verified against the actual branch.


The structural shift in one sentence

@dims
dims / 2026-04-23-k8s-staging-deps-radial.svg
Created April 23, 2026 17:30
k8s.io staging module dependency graph (radial, api at center)
@dims
dims / dra-driver-nvidia-gpu-ci-coverage.md
Created April 21, 2026 17:04
CI Coverage Map — sigs.k8s.io/dra-driver-nvidia-gpu (Lambda/GCP-nvkind/mock-nvml providers, BATS suites, TestGrid tabs, GPU_TYPE= resolution, gap analysis)

CI Coverage Map — sigs.k8s.io/dra-driver-nvidia-gpu

As of 2026-04-21. Sources: .github/workflows/, kubernetes/test-infra (config/jobs/kubernetes-sigs/dra-driver-nvidia-gpu/, config/testgrids/nvidia/nvidia.yaml), testgrid.k8s.io/nvidia-gpu, hack/ci/{gcp-nvkind,lambda,mock-nvml}, tests/bats/, test/e2e/.

TL;DR

  • 3 execution surfaces: GitHub Actions (lint/unit/mock-e2e only), Prow on Lambda Cloud (real GPUs, BATS), Prow on GCP-nvkind (T4 GCE, Ginkgo).
  • 7 Prow jobs on this repo: 3 e2e presubmits + 3 e2e periodics + 1 image-push postsubmit.
  • Only Lambda/arm64 (GH200) gives real arm64 GPU coverage. GCP-nvkind is amd64/T4 only.
  • Nothing is truly a required check. GitHub branch protection on main and release-25.8 lists EasyCLA as the only required status. No rulesets configured. Every CI signal above — GH Actions lint/unit/mock-e2e and all 4 Prow e2e presubmits (optional: true) — posts status but cannot block merge. Merge gating is effectively: EasyCLA + tide/OW