dra-driver-nvidia-gpu — External Contributor Report

Generated: 2026-05-11 (rev. 2: Helios cross-check added)
Repo: kubernetes-sigs/dra-driver-nvidia-gpu
Repo history: 2022-07-14 → 2026-05-11 (~3.8 years)
Total commits analyzed: 1,853 (47 unique author emails)
Methodology:
Extracted all unique commit authors via git log
→ classified by email domain (@nvidia.com = NVIDIA, all others = candidates)
→ mapped commits to GitHub logins via GET /repos/.../commits/{sha}
→ verified every candidate against GET /orgs/NVIDIA/members/{username} (HTTP 204 = confirmed member, 404 = not a member)
→ for ambiguous cases, additionally cross-referenced against NVIDIA Helios LDAP (helios-cli user search) to detect NVIDIA employees who contribute via personal GitHub accounts not registered in the NVIDIA org
→ cross-referenced GitHub profiles, DCO Signed-off-by trailers, LinkedIn, and corporate-email patterns
→ folded NVIDIA-personal-email aliases (e.g. klueska@gmail.com → kklues@nvidia.com, davanum@gmail.com → dsrinivas@nvidia.com, 7723350-elezar@…gitlab.com → elezar@nvidia.com, etc.) back into the NVIDIA cohort.
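
The domain classification and alias folding described in the methodology can be sketched as follows. This is a minimal Python illustration, not the tooling used for the report: the function names and the `NVIDIA_ALIASES` table are assumptions, and only two of the alias pairs quoted above are included.

```python
# Hypothetical alias table: personal address -> NVIDIA address.
# The two pairs below are the ones quoted in the methodology.
NVIDIA_ALIASES = {
    "klueska@gmail.com": "kklues@nvidia.com",
    "davanum@gmail.com": "dsrinivas@nvidia.com",
}


def classify_author(email: str) -> str:
    """Return "nvidia" or "candidate" for one commit-author email."""
    normalized = email.strip().lower()
    # Fold a known personal alias back to its corporate address first.
    normalized = NVIDIA_ALIASES.get(normalized, normalized)
    domain = normalized.rpartition("@")[2]
    return "nvidia" if domain == "nvidia.com" else "candidate"


def split_cohorts(emails):
    """Group unique (case-folded) author emails into the two cohorts."""
    cohorts = {"nvidia": set(), "candidate": set()}
    for email in emails:
        cohorts[classify_author(email)].add(email.strip().lower())
    return cohorts
```

Everything that lands in the `candidate` set still needs the org-membership and directory checks; the domain test alone only narrows the list.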

DCO Status ⚠️ — Five external commits unsigned

Of 37 commits from confirmed non-NVIDIA authors, 32 carry a valid Signed-off-by trailer and 5 do not. The repo does not run the standard CNCF DCO bot (only EasyCLA is configured as a branch-protection required check on main / release-25.8), so unsigned commits were not blocked at merge time.

Unsigned external commits (5):

Commit	Author	PR	Title
`1eb01b4`	coderth `<coderth@outlook.com>`	#38	fix: nvidia-dra-plugin Config
`b96e2c4`	Kevin Hannon `<kehannon@redhat.com>`	#1016	Pin GitHub Actions to commit SHAs for supply-chain protection
`a379b9f`	takonomura `<takonomura@users.noreply.github.com>`	#1053	helm: fix `maskNvidiaDriverParams` path
`ffa7858`	Xingyu Guo `<xingyug.guo.ericsson@gmail.com>`	#1039	gpu plugin: fix wrong error variable in getGpuInfo for NVML system calls
`81c4422`	Xingyu Guo `<xingyug.guo.ericsson@gmail.com>`	#1040	gpu plugin: add missing return in unpreparePartiallyPrepairedClaim when DynamicMIG is disabled
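
A presence check for the trailer can be sketched in a few lines of Python. This is an illustrative sketch, not the DCO bot's logic: it only detects `Signed-off-by:` lines in a commit message and does not compare the trailer against the commit author. `signoffs` and `is_signed` are hypothetical names.

```python
import re

# A DCO trailer is a line of the form "Signed-off-by: Name <email>".
SIGNOFF_RE = re.compile(
    r"^Signed-off-by:\s*(?P<name>.+?)\s*<(?P<email>[^<>\s]+)>\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def signoffs(commit_message: str):
    """Return every (name, email) pair found in Signed-off-by trailers."""
    return [
        (m.group("name"), m.group("email"))
        for m in SIGNOFF_RE.finditer(commit_message)
    ]


def is_signed(commit_message: str) -> bool:
    """True if the message carries at least one Signed-off-by trailer."""
    return bool(signoffs(commit_message))
```

Returning the name and email, rather than a bare boolean, is what makes the identity corrections possible: the trailer can name a person and employer that the commit email and GitHub profile do not.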

Identity corrections from DCO trailers:

yyzxw signed off with xiaowu.zhu <xiaowu.zhu@daocloud.io> → real name is Xiaowu Zhu, employer is DaoCloud (GitHub profile has no name, no company, no bio; commit email 1020938856@qq.com is opaque).
yuyue9284 signed off with both Yue Yu <yuyu3@microsoft.com> and yuyue9284 <15863499+yuyue9284@users.noreply.github.com> → real name Yue Yu, employer Microsoft (GitHub profile has no name, no company, no bio).
lengrongfu signed off with rongfu.leng <lenronfu@gmail.com> (commit email is the local hostname lengrongfu@lengrongfudeMacBook-Pro.local); GitHub profile separately confirms @DaoCloud.
cyclinder signed off with Cyclinder Kuo <kuocyclinder@gmail.com> (full real name: Cyclinder Kuo).
learner0810 signed off with zhongjun.li <zhongjun.li@daocloud.io> → real name Zhongjun Li, employer DaoCloud (GitHub profile has no name, no company, no bio).

Corrections from NVIDIA org check (false positives removed):

visheshtanksale (Vishesh Tanksale, PR #965, commit email vishesh.tanksale09@gmail.com) is an NVIDIA org member (HTTP 204). Removed from the external list. He sits in the OWNERS file as a reviewer.

Corrections from NVIDIA Helios LDAP (additional false positives, not detectable from GitHub org alone):

thesuperzapper (Mathew Wicks, PRs #510 / #511, commit email 5735406+thesuperzapper@users.noreply.github.com) is a confirmed NVIDIA employee per Helios LDAP — login mwicks, email mwicks@nvidia.com, NVIDIA hire date 2025-02-18, department Enterprise Products, location Santa Clara HQ, manager chain Joohoon Lee → Ian Buck → Jensen Huang. Both his DRA-driver PRs merged on 2025-08-30 — i.e. 6 months after his NVIDIA hire date. He contributes via his personal GitHub handle (still listing @aranui-solutions as company; "Kubeflow Lead" in bio) and has not joined the NVIDIA GitHub org (HTTP 404), which is why he initially read as external; the Helios cross-check resolved it. Removed from the external list. This is the analogue of the nvsentinel report's jamie-yu0 discovery — an NVIDIA employee whose external GitHub identity initially appears as third-party.

The two corrections above (visheshtanksale, thesuperzapper) reduce the external-contributor count from 24 candidates down to 22. The Helios pass would have caught additional cases if any NVIDIA-employed contributor used a personal email AND a personal GitHub handle AND was not in the NVIDIA GitHub org; thesuperzapper is the only such case found. The single most actionable insight from this report is that the GitHub-org check alone misses NVIDIA employees who do OSS work from personal handles — anyone doing this kind of analysis on NVIDIA-adjacent repos in the future should include a Helios pass.
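
The two-pass decision described above can be written as a small function. This is a hedged Python sketch: `classify_candidate` is a hypothetical name, and the directory lookup is reduced to a boolean input because the Helios query itself is internal tooling.

```python
def classify_candidate(org_status: int, in_employee_directory: bool) -> str:
    """Combine the GitHub org check with a second, directory-based pass.

    org_status: HTTP status returned by GET /orgs/NVIDIA/members/{username}.
    in_employee_directory: outcome of a separate employee-directory lookup
        (hypothetical boolean; the report used Helios LDAP for this step).
    """
    if org_status == 204:
        return "nvidia (org member)"
    if org_status == 404:
        # Not in the org: only the directory pass can still flag an employee.
        return "nvidia (directory only)" if in_employee_directory else "external"
    # Any other status (redirect, rate limit, auth failure) is inconclusive.
    return "unverified"
```

In these terms visheshtanksale is the 204 case and thesuperzapper is the 404 case with a directory hit. A contributor counts as external only when both passes come back negative.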

Confirmed Non-NVIDIA Contributors

22 confirmed external contributors with 33 merged PRs (plus 11 closed-unmerged + 1 open) and 35 external commits total. Profiles below are grouped by employer cluster (Google, Microsoft, Red Hat, DaoCloud), followed by contributors who are the only one from their employer.

1. John Belamaric — @johnbelamaric

Field	Value
Commit email	jbelamaric@google.com
DCO Signed-off-by	`John Belamaric <jbelamaric@google.com>` ✅
NVIDIA org check	HTTP 404
GitHub company	Google
Public email	jbelamaric@google.com
LinkedIn	linkedin.com/in/johnbelamaric (public, not linked from GitHub)
Employer	Google — Principal / Distinguished Software Engineer; SIG-Architecture co-chair; co-driver of the DRA KEP and ResourceClaim API; long-time CoreDNS maintainer; co-author of Learning CoreDNS (O'Reilly); frequent KubeCon speaker
Location	US (East Coast / Maryland area, per public bios)

Focus area: Scheduling onto tainted nodes. Added a toleration that lets the kubelet-plugin DaemonSet land on compute-class-tainted nodes (the "compute class" pattern that Google's GKE Autopilot uses).

PR	Contribution
#221	Add toleration for compute class taint

2. Antonio Ojea — @aojea

Field	Value
Commit email	aojea@google.com
DCO Signed-off-by	`Antonio Ojea <aojea@google.com>` ✅
NVIDIA org check	HTTP 404
GitHub company	Google
Public email	antonio.ojea.garcia@gmail.com
Twitter	@itsuugo
LinkedIn	linkedin.com/in/ajojea
Personal site	https://kindnet.es (creator/maintainer of KindNet CNI)
Location	Spain (remote, Google)
Followers	545
Employer	Google (GKE Networking) — Senior Software Engineer; Kubernetes SIG-Network technical lead / approver; KIND maintainer; creator of KindNet; KEP author for many networking features (kube-proxy, IPv6/dual-stack, Gateway API); frequent KubeCon speaker

Focus area: Cross-driver correctness. PR #435 prevents the NVIDIA kubelet-plugin from trying to allocate devices that belong to a different DRA driver — a real-world hazard in any cluster running multiple DRA drivers (e.g. NVIDIA GPU + a network/RDMA DRA driver), which is the configuration GKE is moving toward.

PR	Contribution
#435	filter device requests from other drivers
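
The idea behind the filter can be illustrated with a short Python sketch. It is not the driver's Go code: `AllocationResult`, its fields, and the driver names are hypothetical stand-ins for the per-device results carried by a claim.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllocationResult:
    request: str  # name of the request inside the claim
    driver: str   # DRA driver that owns the allocated device
    device: str   # device name as published by that driver


def results_for_driver(results, driver_name):
    """Keep only the allocation results owned by driver_name."""
    return [r for r in results if r.driver == driver_name]
```

A plugin that prepares only the filtered list never touches devices published by another driver in the same claim.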

3. Leiyi Zhang — @leiyiz

Field	Value
Commit email	leiyiz@google.com
DCO Signed-off-by	`Léiyì Zhang <leiyiz@google.com>` ✅
NVIDIA org check	HTTP 404
GitHub company	Google
Location	Seattle, WA
LinkedIn	not publicly linked
Employer	Google — CSI / GKE storage / AI-on-GKE engineer (contributes heavily to `gcp-filestore-csi-driver`, `gcp-compute-persistent-disk-csi-driver`, `container-engine-accelerators`, `ai-on-gke`); DRA GPU driver work targets GKE accelerators

Focus area: Image-hardening (driver image). PR #968 fixed a shell-quoting/interpolation bug where "{$var}" was being emitted as a literal {} wrapped around the filename (the shell expands $var and leaves the braces as literal text; the usual form is ${var}). A subsequent unmerged PR (#972) attempted to remove the shell dependency from the driver image altogether, consistent with running NVIDIA DRA under a hardened, shell-less base image (a Google/Distroless idiom).

PR	Contribution
#968	fix syntax error where "{$var}" results in literal {} wrapped around file name
#972 (closed, unmerged)	remove shell dependency from the driver image

Google cluster (3 contributors, 3 merged PRs): Belamaric, Ojea, and Leiyi Zhang. Together they cover the three things Google needs out of this driver: compute-class scheduling (GKE Autopilot), multi-DRA-driver coexistence (GKE multi-resource), and a shell-less hardened base image. None of these contributions are cosmetic; they read like patches that arose from Google running this driver in some kind of integration / pre-prod path.

4. Suraj Deshmukh — @surajssd

Field	Value
Commit email	surajd.service@gmail.com
DCO Signed-off-by	`Suraj Deshmukh <surajd.service@gmail.com>` ✅
NVIDIA org check	HTTP 404
GitHub company	@microsoft
Public email	surajd.service@gmail.com
Blog	https://suraj.io/
Twitter	@surajd_
Location	Redmond, WA
Public repos	304
LinkedIn	linkedin.com/in/surajssd
Bluesky	https://bsky.app/profile/suraj.io
Employer	Microsoft — Senior Software Engineer (AKS / Azure Linux); ex-Kinvolk (Flatcar Container Linux, acquired by Microsoft 2021); ex-Red Hat OpenShift/Origin; frequent KubeCon speaker and blogger

Focus area: Documentation; the PR clarifies the expected post-install state of the DRA driver in the README.

PR	Contribution
#164	README: Update the expectations after installation

5. Jon Huhn — @nojnhuh

Field	Value
Commit email	nojnhuh@users.noreply.github.com
DCO Signed-off-by	`Jon Huhn <nojnhuh@users.noreply.github.com>` ✅
NVIDIA org check	HTTP 404
GitHub company	@microsoft
LinkedIn	not publicly linked
Employer	Microsoft — Software Engineer on the Azure Kubernetes team; cluster-api-provider-azure (CAPZ) maintainer; maintainer of `kubernetes-sigs/dra-example-driver` (the reference DRA implementation he ported many learnings from into this driver); active across `kubernetes/kubernetes`, `test-infra`, `klog`, `enhancements`

Focus area: Operational softening — only warn (don't fail) when pcieRoot cannot be determined for a GPU. Relevant on AKS GPU nodes where the PCI topology query may not return a stable bus path under all node images.

PR	Contribution
#577	Only warn when pcieRoot can't be determined

6. Yue Yu — @yuyue9284

Field	Value
Commit email	yuyu3@microsoft.com (in commit author)
DCO Signed-off-by	`Yue Yu <yuyu3@microsoft.com>` + `yuyue9284 <15863499+yuyue9284@users.noreply.github.com>` ✅ (dual signoff)
NVIDIA org check	HTTP 404
GitHub profile	No name, no bio, no company set (but a member of the `microsoft` GitHub org)
LinkedIn	not publicly linked
Employer	Microsoft — Azure Arc for Kubernetes / Azure ML on Kubernetes engineer; PR distribution: `AzureArcForKubernetes/azure-cli-extensions` (12), `Azure/azureml-examples` (7), `Azure/AML-Kubernetes` (4), `microsoft/frameworkcontroller`, `MicrosoftDocs/azure-ai-docs`; also contributes to `volcano-sh/volcano` (batch scheduler) and `NVIDIA/gpu-operator`

Note: GitHub login yuyue9284 does not match the commit author email's local part (yuyu3@…); without the DCO trailer this contributor would have read as an unknown handle. The dual signoff (one corporate, one noreply) is unusual — likely a tooling artifact from a corporate review pipeline.

Focus area: Stale-cache fix in the ComputeDomain controller — guards against operating on a ComputeDomain that has been deleted but is still present in the informer cache. Production-grade hardening, not a drive-by.

PR	Contribution
#805	fix: check existence of ComputeDomain in cache before processing updates

Microsoft cluster (3 contributors, 3 merged PRs): Deshmukh, Huhn, and Yue Yu. Yue Yu's ComputeDomain cache fix is a non-trivial controller-runtime correctness improvement. Huhn's pcieRoot patch is AKS-flavored. Together they imply AKS is exercising this driver end-to-end, not just shipping it.

7. Vitaliy Emporopulo — @empovit

Field	Value
Commit email	vemporop@redhat.com
DCO Signed-off-by	`Vitaliy Emporopulo <vemporop@redhat.com>` ✅
NVIDIA org check	HTTP 404
GitHub company	Red Hat
Public email	vemporop@redhat.com
Profile name	Vitaly E.
Location	Israel
Public repos	64
LinkedIn	not publicly linked
Employer	Red Hat — Software Engineer focused on the NVIDIA / GPU ecosystem on OpenShift. Primary contributor to `rh-ecosystem-edge/nvidia-ci`, `rh-ecosystem-edge/console-plugin-nvidia-gpu`, `openshift/instaslice-operator` (MIG slice orchestration), plus `openshift/release` CI plumbing

Focus area: OpenShift integration. Five merged PRs over ~25 months (Feb 2024 → Mar 2026), covering: privileged kubelet-plugin on OpenShift; OpenShift SCC bindings for service accounts (so the IMEX daemon can write its nodes_config.cfg under a random UID); the nvidia-container-toolkit path override for non-default RHEL installs; the master-role toleration for control-plane workloads; and the original OpenShift install docs in the README. This is the most sustained external contribution in the repo's history; the OpenShift install instructions in docs/ exist because Empovit wrote them.

PR	Contribution
#72	Allow custom NVIDIA CTK path
#76	Let kubelet plugin run privileged on OpenShift
#82	Document DRA driver installation on OpenShift
#569	Add SCC to service accounts on OpenShift (fixes IMEX `writeNodesConfig` perm-denied under random UID)
#899	Add `node-role.kubernetes.io/master` toleration to controller

8. Kevin Hannon — @kannon92

Field	Value
Commit email	kehannon@redhat.com
DCO Signed-off-by	(missing — commit unsigned) ⚠️
NVIDIA org check	HTTP 404
GitHub company	Red Hat
Bio	"wg-batch lead. Kueue reviewer. JobSet Maintainer. Excited to make Kubernetes the platform of choice for AI/ML/HPC."
Location	Cleveland, Ohio
LinkedIn	not publicly linked from GitHub
Followers	98
Employer	Red Hat — Principal Software Engineer on OpenShift / Kubernetes batch & scheduling; WG-Batch Lead (Kubernetes); Kueue reviewer; JobSet maintainer; frequent KubeCon presenter on batch/AI workloads (Kueue, JobSet)

Focus area: Supply-chain hardening of the repo's own CI. The change pins every GitHub Action used in workflows to an immutable commit SHA (instead of a movable tag), explicitly motivated by the March 2025 tj-actions/changed-files and reviewdog/action-setup compromises. Fixes upstream issue #1015.

PR	Contribution
#1016	Pin GitHub Actions to commit SHAs for supply-chain protection (fixes #1015)
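
A check for unpinned actions can be sketched as below. This is an illustrative Python snippet under stated assumptions: it scans workflow text with regular expressions instead of parsing YAML, treats only a full 40-character hex ref as pinned, and ignores local `./` actions. `unpinned_actions` is a hypothetical helper, not part of the repo.

```python
import re

# Matches "uses: owner/repo[/path]@ref" and captures the action and its ref.
USES_RE = re.compile(
    r"^\s*-?\s*uses:\s*(?P<action>[^@\s]+)@(?P<ref>\S+)", re.MULTILINE
)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str):
    """Return (action, ref) pairs whose ref is not a full commit SHA."""
    return [
        (m.group("action"), m.group("ref"))
        for m in USES_RE.finditer(workflow_text)
        if not FULL_SHA_RE.match(m.group("ref"))
    ]
```

A tag such as `v4` can be moved to a different commit after review, whereas a full SHA cannot, which is why only the latter counts as pinned here.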

Red Hat cluster (2 contributors, 6 merged PRs): Emporopulo (5 PRs, all OpenShift enablement) and Hannon (1 PR, CI supply-chain hardening). Empovit is the longest-running external contributor in the repo; Hannon's contribution is the single most consequential security PR an external has landed here.

9. Xiaowu Zhu — @yyzxw

Field	Value
Real name	Xiaowu Zhu (朱晓武) — discovered via DCO trailer (`xiaowu.zhu@daocloud.io`); GitHub profile has no name, company, bio, or blog
Commit email	1020938856@qq.com
DCO Signed-off-by	`xiaowu.zhu <xiaowu.zhu@daocloud.io>` ✅
NVIDIA org check	HTTP 404
GitHub	77 public repos, created 2017-12; goes by handle "zxw"
LinkedIn	not publicly linked
Employer	DaoCloud — software engineer focused on AI/LLM serving infrastructure. PR concentration: `DaoCloud/dce-charts-repackage`, `BaizeAI/modelhub`, `BaizeAI/dataset` (DaoCloud's Baize AI platform), `llm-d/llm-d-kv-cache`, and `vllm-project/vllm`. Also active on DaoCloud's Higress plugin server

Focus area: Repo bootstrap chores during the first months after the project moved to its current form: dependabot setup, label dedupe, PR templates, Makefile helpers, CI scaffolding. Only 2 of 6 PRs merged; the other 4 were closed unmerged (likely overlapped with NVIDIA's own scaffolding direction).

PR	Contribution
#45	feat: add dependabot
#59	fix: remove repeat label key
#42, #43, #44, #64 (all closed-unmerged)	makefile help target, PR templates, PR-title check, switch-to-map refactor

10. Noah Tang — @CoderTH

Field	Value
Commit email	coderth@outlook.com
DCO Signed-off-by	`coderth <coderth@outlook.com>` ✅ on #39; missing ⚠️ on #38 (commit `1eb01b4`)
NVIDIA org check	HTTP 404
GitHub company	@DaoCloud
Real name	Noah Tang
Bio	"🚀 Cloud Native Developer"
Location	Chengdu, China
Blog	https://coderth.onrender.com/ (Hexo blog, handle "CoderTh")
LinkedIn	not publicly linked
Employer	DaoCloud — Cloud Native engineer. Works on DaoCloud's `dce-charts-repackage`, the `matrixhub-ai/matrixhub` AI model hub, NVIDIA vGPU / `k8s-vgpu-scheduler` integrations, and contributes to NVIDIA Dynamo and the Harbor operator

Focus area: Early CI/config scaffolding. PR #39 first set up the GitHub Actions CI step that runs golangci-lint/test/build, and PR #38 fixed the nvidia-dra-plugin static config.

PR	Contribution
#38	fix: nvidia-dra-plugin Config (unsigned commit)
#39	feat: add github ci step
#40 (closed-unmerged)	feat: add issue template

11. Rongfu Leng — @lengrongfu

Field	Value
Commit author email	`lengrongfu@lengrongfudeMacBook-Pro.local` (local laptop hostname — not deliverable)
DCO Signed-off-by	`rongfu.leng <lenronfu@gmail.com>` ✅
Profile email	1275177125@qq.com
NVIDIA org check	HTTP 404
GitHub company	@DaoCloud
Real name	Rongfu Leng (written `rongfu.leng` in the DCO trailer)
Location	Chengdu, Sichuan, China
Blog	https://lengrongfu.github.io/
Public repos	309
LinkedIn	not publicly linked
Employer	DaoCloud — Senior engineer; heavy contributor in the LLM-serving ecosystem (`vllm-project/vllm`, `vllm-omni`, `vllm-project/router`, `sglang-project/sglang`) plus DaoCloud's enterprise charts. Chinese name 冷荣富

Focus area: Environment-variable rename to align with the broader nvidia-container-toolkit naming convention (CONTAINER_DRIVER_ROOT → DRIVER_ROOT_CTR_PATH). Closed-unmerged follow-up #212 attempted NVIDIA_CTK_PATH → NVIDIA_CDI_HOOK_PATH for the same alignment reason.

PR	Contribution
#211	replace `CONTAINER_DRIVER_ROOT` with `DRIVER_ROOT_CTR_PATH`
#212 (closed-unmerged)	Use `NVIDIA_CDI_HOOK_PATH` instead of `NVIDIA_CTK_PATH`

12. Cyclinder Kuo — @cyclinder

Field	Value
Commit email	qifeng.guo@daocloud.io
DCO Signed-off-by	`Cyclinder Kuo <kuocyclinder@gmail.com>` ✅
NVIDIA org check	HTTP 404
GitHub company	@DaoCloud
Real name	Cyclinder Kuo (Qifeng Guo — Chinese name `郭起峰`, per commit email handle)
Bio	"everything we went through was just a waste of a time"
Location	Chengdu, China
Twitter	@Cyclinder_Kuo
LinkedIn	not publicly linked
Employer	DaoCloud — networking engineer; maintainer / major contributor to `spidernet-io/spiderpool` (CNCF sandbox Kubernetes IPAM/CNI) and the `vlan-cni` / `iaas-network-provider` projects

Focus area: Doc maintenance — adjust README to follow script/namespace renames done elsewhere in the repo.

PR	Contribution
#280	README: adjust to script/namespace renames

13. Zhongjun Li — @learner0810

Field	Value
Commit email	zhongjun.li@daocloud.io
DCO Signed-off-by	`zhongjun.li <zhongjun.li@daocloud.io>` ✅
NVIDIA org check	HTTP 404
GitHub profile	No name, no bio, no company set; 58 public repos
LinkedIn	not publicly linked
Employer	DaoCloud — Software engineer on inference / storage. PR distribution: `DaoCloud/dce-charts-repackage` (9), `kubernetes-sigs/gateway-api-inference-extension` (7), `llm-d/llm-d-inference-scheduler` (4), `hwameistor/hwameistor` (3, CNCF sandbox local-volume storage). Keeps a low public profile — identity recoverable only from DCO trailer

Focus area: Build fix. Closed-unmerged follow-up #66 proposed adding gofumpt to the code checks.

PR	Contribution
#67	fix build binaries
#66 (closed-unmerged)	Add gofumpt code checks

DaoCloud cluster (5 contributors, 7 merged PRs): Zhu, Tang, Leng, Kuo, Li. Only two of the five (Zhu, Li) sign off with DaoCloud-issued corporate emails; the other three (Tang, Leng, Kuo) sign off with personal addresses and are tied to DaoCloud through their GitHub company field (@DaoCloud), with Chengdu as their listed location. This is the single largest external-company cluster on the project. The work is heavy on chores/scaffolding/docs and lighter on architectural change; the contribution shape suggests early-days community engagement rather than a deep production-deployment-driven workstream. DCO trailers are the only identity signal for two of the five (Zhu and Li have no GitHub-profile metadata at all).

14. Kasia Kujawa — @kasia-kujawa

Field	Value
Commit email	katarzyna@cast.ai
DCO Signed-off-by	`Katarzyna Kujawa <katarzyna@cast.ai>` ✅
NVIDIA org check	HTTP 404
GitHub company	CAST AI
Real name	Katarzyna ("Kasia") Kujawa
Location	Gdańsk, Poland
LinkedIn	not publicly linked from GitHub (multiple Katarzyna Kujawa entries in Gdańsk on LinkedIn; the CAST AI–affiliated one is the match)
Employer	CAST AI — Software engineer on the cost-optimization / autoscaling product surface (`castai/k8s-agent`, `castai/terraform-provider-castai`, `castai/helm-charts`). CAST AI is a Kubernetes autoscaling / cost-management platform spanning GKE / EKS / AKS

Focus area: Production bug-fix stream from CAST AI's GPU offering. Five merged PRs in early 2026:

PR #889: fixes mps-control-daemon chroot 'sh': No such file or directory error when nvidiaDriverRoot is /home/kubernetes/bin/nvidia/ — GKE-specific driver-install path. CAST AI runs MPS GPU-sharing on GKE.
PR #978: corrects the MPS shm-dir mount path so MPS clients can actually talk to the control daemon.
PR #979: fixes the cross-compiler used for the static-bash build (related to bash-static work the NVIDIA team did for distroless images).
PR #996: actionable validation errors when GPU sharing config is malformed (improves error UX for end-users).
PR #997: nil-pointer fix when featureGates: null in the Helm chart values.

PR	Contribution
#889	Fix mps-control-daemon chroot shell execution error when nvidiaDriverRoot is set (GKE)
#978	Set proper path for MPS shm dir mount
#979	Use correct cross-compiler for bash static build
#996	Improve validation errors for GPU sharing with actionable messages and supported values
#997	Fix nil pointer when `featureGates` is set to null in values
#1009 (open)	Retry device enumeration on startup to prevent empty ResourceSlices

Note: This is the most active and the most production-driven external contributor in the repo's recent history. Every PR reads as a fix for something CAST AI hit in production.

15. Robert Northard — @RobertNorthard

Field	Value
Commit email	robertnorthard@googlemail.com
DCO Signed-off-by	`Rob <robertnorthard@googlemail.com>` ✅
NVIDIA org check	HTTP 404
GitHub company	AWS
Public email	robertnorthard@googlemail.com
Profile name	Rob
Public repos	142
LinkedIn	not publicly linked from GitHub (public Robert Northard SA-at-AWS profile exists on LinkedIn)
Employer	Amazon Web Services (AWS) — Specialist Solutions Architect / engineer. Heavy contributor to `aws/karpenter-provider-aws`, `aws/eks-anywhere`, `aws-samples/karpenter-blueprints`, `aws-ia` EKS Blueprints, and `aws-eks-best-practices`. Likely EMEA-based given his community presence in EU Karpenter / EKS workshops

Focus area: Helm-chart defaults for managed-Kubernetes (EKS-flavored) deployments. The default node affinity assumed a node-role.kubernetes.io/control-plane node existed — which is never true on managed offerings (EKS / GKE / AKS hide the control plane). The PR also adds default GPU tolerations so the controller pods can land on tainted GPU nodes. Filed as issue #1047 and fixed by his own PR #1054.

PR	Contribution
#1054	Changed default node affinity and GPU tolerations for DRA controller and kubelet plugin helm chart values (fixes #1047)

16. Marco Ebert — @Gacko

Field	Value
Commit email	marco_ebert@icloud.com
DCO Signed-off-by	`Marco Ebert <marco_ebert@icloud.com>` ✅
NVIDIA org check	HTTP 404
GitHub	97 followers, but only 1 public personal repo (`marctl`) — the 97-follower count tracks his commit volume inside the `giantswarm` GitHub org rather than personal projects
Real name	Marco Ebert
Location	Germany (Giant Swarm is headquartered in Cologne; not set on profile)
LinkedIn	not publicly linked from GitHub (public "Marco Ebert at Giant Swarm" profile exists on LinkedIn)
Employer	Giant Swarm (Cologne-based managed-Kubernetes vendor; member of the `giantswarm` org with ~2,463 internal PRs there). Platform / cluster engineer working on Cluster API tooling — `cluster-api-app`, `cluster-test-suites`, `cluster-standup-teardown`, `clustertest`, `releases`, `cluster-vsphere`. Account from 2011

Focus area: Two helm/runtime PRs in late 2025. PR #708 adds NetworkPolicy resources to the chart (egress/ingress for the controller and daemons) — meaningful in restricted clusters that block all pod-to-pod traffic by default. PR #706 adds /opt/bin to the kubelet-plugin's binary search path (Talos / Flatcar / GKE-COS-style read-only-root distros put their nvidia binaries under /opt/bin).

PR	Contribution
#706	kubelet plugins: add `/opt/bin` to binary search paths
#708	chart: add network policies

17. Herb Duan — @herb-duan

Field	Value
Commit email	herbertduan@qq.com
DCO Signed-off-by	`Herb Duan <herbertduan@qq.com>` ✅
NVIDIA org check	HTTP 404
GitHub	No bio, no company, no blog; 38 public repos; 28 followers
Real name	Herb Duan
Location	Beijing, China
LinkedIn	not publicly linked
Employer	Not disclosed — no company on GitHub bio, no blog, no social accounts. Total public footprint is only 5 PRs across 3 repos: `kubernetes/kubernetes` (3, kubelet resource-claim status handling), this repo (1, the leader-election feature below), and `BenchCouncil/BigDataBench` (1 — a Chinese academic/research benchmark, suggesting a possible prior ICT/CAS-style academic background; unverified)

Focus area: Largest external feature contribution in the repo. PR #851 adds leader election to the ComputeDomain controller so it can run multi-replica without conflicts — eliminating the previous SPOF on the controller. +1,599 / −4 lines across 14 files. Fixes issue #815 (filed by another non-NVIDIA user, Lily922). The follow-up e2e test for this feature was contributed by Anish Bista (PR #1094, below).

PR	Contribution
#851	feat(controller): Add leader election for high availability (+1599/-4, fixes #815)

18. Anish Bista — @anishbista60

Field	Value
Commit email	anishbista053@gmail.com
DCO Signed-off-by	`anish bista <anishbista053@gmail.com>` and `anishbista60 <anishbista053@gmail.com>` ✅
NVIDIA org check	HTTP 404
GitHub company	KubeRox Technologies
Bio	"Always be humble to everyone"
Location	Nepal
Twitter	@anishbista053
LinkedIn	linkedin.com/in/anishbista
Personal site	https://anishbista60.github.io/personal-website/ ; blog at https://medium.com/@anishbista
Public repos	60
Employer	KubeRox Technologies (Nepal) — Kubernetes engineer. Self-describes as the "youngest CNCF Kubestronaut from Nepal" (holds CKA, CKAD, CKS, KCNA, KCSA). Active KubeVirt contributor and maintainer of the `kubevirtbmc` project; also contributes to Kanister

Focus area: End-to-end test coverage for the leader-election feature herb-duan landed in #851. Filed against follow-up issue #970 (jgehrcke). This is the only example in the repo of two external contributors collaborating across a feature: one shipped the feature, the other shipped the test.

PR	Contribution
#1094	tests/bats: add leader election e2e test for compute-domain-controller

19. Kante Yin — @kerthcet

Field	Value
Commit email	kerthcet@gmail.com
DCO Signed-off-by	`kerthcet <kerthcet@gmail.com>` ✅
NVIDIA org check	HTTP 404
GitHub	No company set; bio "Building AI Infrastructure @InftyAI @hiverge"
Real name	Kante Yin
Location	Cambridge, UK
Blog	https://ky.dev/
Twitter	@kerthcet
Followers	212
LinkedIn	not publicly linked
Employer	InftyAI (non-profit AI-infra org he co-founded) and Hiverge. Kubernetes SIG-Scheduling reviewer/approver; historical Kueue maintainer; now focused on InftyAI's open-source LLM-infra projects (PUMA, alphatrion, llmaz). Chinese name 银坎特; frequent KubeCon China speaker

Focus area: Tiny correctness PR — drop empty values from a map (i.e. the make rev helper script wasn't removing now-empty entries after key deletions). Kerthcet is best known elsewhere in the K8s ecosystem (Kueue, llmaz) — this is a drive-by from an early Kueue + DRA crossover phase.

PR	Contribution
#48	Remove item when values are empty in map

20. takonomura — @takonomura

Field	Value
Commit email	takonomura@users.noreply.github.com
DCO Signed-off-by	(missing — commit unsigned) ⚠️
NVIDIA org check	HTTP 404
GitHub	No name, no bio, no company; 52 public repos; 46 followers
Location	Japan
LinkedIn	not publicly linked (handle is consistently pseudonymous across all repos, all commits use `users.noreply.github.com`)
Employer	Not publicly disclosed — deliberately pseudonymous. 228 PRs across many orgs. Strong circumstantial signal of association with the Japanese university IT-contest scene: heavy contributor to `ictsc/ictsc-regalia`, `ictsc/ictsc-k8s-infra`, and `ictsc/ictsc-regalia-release` (ICTSC = ICT Service Contest, a Japanese inter-college infra competition). Also contributes to `whywaita/myshoes` and `whywaita/shoes-lxd-multi` (whywaita is "Tachibana Waita" at CyberAgent), suggesting peer-group ties to the CyberAgent / Japanese-cloud community. Won ISUCON14 in 2024 (his isucon14 repo description claims "優勝" = winner). Expertise: DRA (this repo + kubernetes/kubernetes), CUE, netavark, LXD, GitHub Actions self-hosted runners

Focus area: Helm chart bug — maskNvidiaDriverParams was being rendered at the wrong YAML path so the feature toggle didn't actually take effect. The first PR to the repo from this contributor (welcomed by k8s-ci-robot).

PR	Contribution
#1053	helm: fix `maskNvidiaDriverParams` path (unsigned commit)

21. Jia-Wei Jiang — @JiangJiaWei1103

Field	Value
Commit email	waynechuang97@gmail.com
DCO Signed-off-by	`JiangJiaWei1103 <waynechuang97@gmail.com>` ✅
NVIDIA org check	HTTP 404
GitHub	No company set; bio "Never sell out · De-noobing · @ray-project Contributor · @flyteorg Committer"
Real name	Jia-Wei Jiang (江家瑋)
Location	Taiwan
Kaggle	https://www.kaggle.com/abaojiang
LinkedIn	linkedin.com/in/jiawei-jiang-mr-denoober
Employer	Not publicly stated. Open-source committer relationships: Flyte committer; Ray (KubeRay) contributor — recent activity is mostly large multi-part PRs to `ray-project/kuberay`'s History Server beta. ML/data-science background (active on Kaggle)

Focus area: Documentation polish — clarify an error message in an input-validation path.

PR	Contribution
#444	docs: Clarify err msg for input validation

22. Xingyu "Richard" Guo — @xingyug

Field	Value
Commit email	xingyug.guo.ericsson@gmail.com (the `.ericsson` infix in a personal gmail handle is unusual — see Notable Observations)
DCO Signed-off-by	(missing — both commits unsigned) ⚠️
NVIDIA org check	HTTP 404
GitHub profile name	Richard Guo
GitHub	14 public repos; only 1 follower; account created 2022-10-24 (recent)
LinkedIn	not publicly linked
Employer	Now likely Red Hat (with a prior Ericsson stint). The `.ericsson` token in the gmail handle is consistent with his earlier corporate identity, but the current PR concentration is on Red Hat–maintained projects: `rhel-lightspeed/linux-mcp-server` and `containers/kubernetes-mcp-server` (the RHEL Lightspeed / containers MCP-server work). Additional PRs to `vllm`, `BerriAI/litellm`, `jumpserver/jumpserver` — all security/correctness focused (SSRF protections, path-traversal hardening, unbounded-read fixes, auth-bug fixes). Profile pattern reads "security-focused engineer who recently joined a Red Hat MCP team"

Focus area: Two unrelated DRA-driver bug fixes landed on the same day (2026-04-15), both indicating real production NVML use:

PR #1039: getGpuInfo was logging <nil> instead of the actual NVML return code because the error format string referenced %w, err instead of %v, ret. Misleading error UX, easy to miss without hitting it.
PR #1040: when the DynamicMIG feature gate is disabled (the default), unpreparePartiallyPrepairedClaim() logs "nothing to do" but falls through into DynamicMIG-specific cleanup code, causing a spurious NVML error for static-MIG claims in PrepareStarted state. Adds the missing return nil.

Both bugs look like things you only hit if you're running this driver against real GPUs at non-trivial scale. The defect-detection pattern (NVML semantics + SSRF / path-traversal / auth elsewhere) reads as a security-focused engineer.

PR	Contribution
#1039	gpu plugin: fix wrong error variable in getGpuInfo for NVML system calls (unsigned)
#1040	gpu plugin: add missing return in unpreparePartiallyPrepairedClaim when DynamicMIG is disabled (unsigned)
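
The control-flow defect behind PR #1040 is language-neutral and can be shown with a short Python sketch. This is not the driver's Go code: the function signature, the `cleanup` and `log` callbacks, and the message text are hypothetical, and only the shape of the guard follows the description above.

```python
def unprepare_partially_prepared_claim(claim_id, dynamic_mig_enabled, cleanup, log):
    """Run DynamicMIG cleanup for a claim only when the feature is enabled."""
    if not dynamic_mig_enabled:
        log(f"claim {claim_id}: DynamicMIG disabled, nothing to do")
        # The fix: return here. Without it, execution falls through into
        # the DynamicMIG-specific cleanup below.
        return
    cleanup(claim_id)
```

With the early return in place, a disabled feature gate produces one log line and no cleanup call. Without it, the same input would also reach `cleanup`, which is the spurious error path the PR removes.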

Summary Table

#	Contributor	GitHub	Employer	DCO email	NVIDIA org	Merged PRs	Notes
1	John Belamaric	@johnbelamaric	Google	jbelamaric@google.com	❌	1	SIG-Architecture co-chair, DRA KEP lead
2	Antonio Ojea	@aojea	Google	aojea@google.com	❌	1	SIG-Network chair; KIND/kube-proxy maintainer
3	Leiyi Zhang	@leiyiz	Google	leiyiz@google.com	❌	1	shell-quoting fix + distroless follow-up
4	Suraj Deshmukh	@surajssd	Microsoft (AKS)	surajd.service@gmail.com	❌	1	README expectations
5	Jon Huhn	@nojnhuh	Microsoft (AKS)	nojnhuh@users.noreply.github.com	❌	1	pcieRoot soft-fail
6	Yue Yu	@yuyue9284	Microsoft	yuyu3@microsoft.com	❌	1	ComputeDomain cache existence check
7	Vitaliy Emporopulo	@empovit	Red Hat (OpenShift AI)	vemporop@redhat.com	❌	5	OpenShift enablement (longest-running external)
8	Kevin Hannon	@kannon92	Red Hat	kehannon@redhat.com	❌ ⚠️	1	CI supply-chain (action-pin) — DCO unsigned
9	Xiaowu Zhu	@yyzxw	DaoCloud	xiaowu.zhu@daocloud.io	❌	2	repo-bootstrap chores
10	Noah Tang	@CoderTH	DaoCloud	coderth@outlook.com	❌ ⚠️	2	early CI scaffolding — #38 DCO unsigned
11	Rongfu Leng	@lengrongfu	DaoCloud	lenronfu@gmail.com	❌	1	env-var rename
12	Cyclinder Kuo	@cyclinder	DaoCloud	kuocyclinder@gmail.com	❌	1	README maintenance
13	Zhongjun Li	@learner0810	DaoCloud	zhongjun.li@daocloud.io	❌	1	build fix
14	Kasia Kujawa	@kasia-kujawa	CAST AI	katarzyna@cast.ai	❌	5	GKE MPS production bug-fix stream
15	Robert Northard	@RobertNorthard	AWS (EKS)	robertnorthard@googlemail.com	❌	1	Helm defaults for managed K8s
16	Marco Ebert	@Gacko	Giant Swarm (Cluster API / Cologne)	marco_ebert@icloud.com	❌	2	NetworkPolicy + `/opt/bin`
17	Herb Duan	@herb-duan	undisclosed (Beijing)	herbertduan@qq.com	❌	1	Leader election (largest feature)
18	Anish Bista	@anishbista60	KubeRox Technologies (Nepal) — KubeVirt contrib; CNCF Kubestronaut	anishbista053@gmail.com	❌	1	e2e test for herb-duan's #851
19	Kante Yin	@kerthcet	InftyAI / Hiverge (co-founder)	kerthcet@gmail.com	❌	1	tiny map-cleanup
20	takonomura	@takonomura	undisclosed; ISUCON14 winner; ICTSC contributor	(noreply)	❌ ⚠️	1	helm path fix — DCO unsigned
21	Jia-Wei Jiang	@JiangJiaWei1103	undisclosed (Taiwan; Flyte committer / KubeRay contrib)	waynechuang97@gmail.com	❌	1	docs polish
22	Xingyu "Richard" Guo	@xingyug	likely Red Hat now (ex-Ericsson per email handle)	xingyug.guo.ericsson@gmail.com	❌ ⚠️	2	NVML correctness fixes — both DCO unsigned

Totals: 22 confirmed non-NVIDIA contributors · 33 merged PRs (plus 11 closed-unmerged + 1 open) · 35 commits · 30 signed / 5 unsigned.

Reference point: NVIDIA commits in the same window total ~1,445 (about 78% of the 1,853-commit repo, after reclassifying Mathew Wicks's 2 commits from external to NVIDIA per Helios); external work is ~2% of total commits but disproportionately concentrated in production-flavored bug fixes and platform-enablement (OpenShift, GKE, EKS, AKS, Talos/Flatcar, IPv6) rather than feature work.

Notable Observations

OpenShift / Red Hat is the most sustained external engagement. Vitaliy Emporopulo (@empovit) has been landing PRs from vemporop@redhat.com continuously since Feb 2024 — the original "DRA driver on OpenShift" install docs in the repo exist because he wrote them, and he is the only external contributor who has shipped PRs across more than two release cycles. Kevin Hannon (@kannon92) adds a second Red Hat surface area (CI supply-chain hardening). If you wanted to point at one external company that runs this driver in production today, it's Red Hat / OpenShift.
CAST AI is the most active production user this year. Kasia Kujawa (@kasia-kujawa) has merged 5 PRs in 2026 alone, every one a fix to a real bug they hit on GKE — MPS chroot, MPS shm-dir path, validation messages, nil pointers, cross-compiler. CAST AI has a managed-GPU-sharing offering that sits on top of GKE and they are clearly running this driver against real customer workloads. PR #1009 (retry device enumeration on startup to prevent empty ResourceSlices) is still open and is a classic race-on-startup symptom.
DaoCloud is the largest external cluster by headcount (5), but the thinnest per-person engagement. Xiaowu Zhu, Noah Tang, Rongfu Leng, Cyclinder Kuo, and Zhongjun Li all either signed off with a @daocloud.io address or are listed under @DaoCloud on GitHub. All five together account for only 7 merged PRs, mostly chores/docs/build-fixes from the project's early days. Two of the five (Zhu and Li), like Yue Yu at Microsoft (#6 in the summary table), have no GitHub-profile metadata at all; their real names and employers are only recoverable from DCO trailers. The DaoCloud pattern is consistent with a corporate community-engagement quota rather than production usage.
Hyperscaler-adjacent helm-defaults work is happening. RobertNorthard (AWS) filed and fixed issue #1047 because the default helm chart node-affinity targets node-role.kubernetes.io/control-plane, which never exists on EKS / GKE / AKS. johnbelamaric (Google) added the compute-class toleration. Together these are the closest the project has to "make defaults work on managed Kubernetes" patches, and they have come, one at a time, from two of the three hyperscalers (AWS and Google).
The DRA KEP author has personally contributed. John Belamaric (@johnbelamaric) — co-author of the DRA KEP-3063 that this entire repo exists to implement, and co-chair of SIG-Architecture — has one merged PR (#221, compute-class toleration). That's a useful piece of social proof for the repo: the API designer landed code in the reference driver.
One feature PR dominates external impact: herb-duan's #851 (+1,599 / −4 across 14 files), adding leader election to the ComputeDomain controller. This eliminates a controller SPOF in HA deployments. The feature was requested by another non-NVIDIA contributor (@Lily922, issue #815) and the follow-up e2e test was contributed by another non-NVIDIA contributor (@anishbista60, PR #1094). Three external contributors collaborating across feature-request → implementation → test is the most coordinated external work the repo has seen.
Identity discovery via DCO trailers — 4 contributors. DCO Signed-off-by lines were the only signal that revealed:
- yyzxw is Xiaowu Zhu at DaoCloud (GitHub profile is empty),
- yuyue9284 is Yue Yu at Microsoft (GitHub profile is empty),
- learner0810 is Zhongjun Li at DaoCloud (GitHub profile is empty),
- lengrongfu's real DCO email is lenronfu@gmail.com (commit author email is a local hostname).
Without DCO, three of these would be "unknown handle at unknown company."
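Trailer-based identity recovery of this kind can be sketched as follows. This is an illustration, not the tooling used for the report: `parseSigners`, `emailDomain`, and the sample commit message are hypothetical, and the regular expression only handles the common `Signed-off-by: Name <email>` form.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// signedOffBy matches one "Signed-off-by: Name <email>" line in a commit
// message. (?m) makes ^ and $ match at line boundaries.
var signedOffBy = regexp.MustCompile(`(?m)^Signed-off-by:[ \t]*(.+?)[ \t]*<([^<>\s]+)>[ \t]*$`)

// signer is the identity recovered from one trailer.
type signer struct {
	Name  string
	Email string
}

// parseSigners returns every Signed-off-by identity found in msg. An empty
// result means the commit carries no recognizable sign-off.
func parseSigners(msg string) []signer {
	var out []signer
	for _, m := range signedOffBy.FindAllStringSubmatch(msg, -1) {
		out = append(out, signer{Name: m[1], Email: strings.ToLower(m[2])})
	}
	return out
}

// emailDomain returns the part after the last "@", or "" if there is none.
func emailDomain(email string) string {
	i := strings.LastIndex(email, "@")
	if i < 0 {
		return ""
	}
	return email[i+1:]
}

func main() {
	// The subject line is invented; the name and email come from the
	// summary table.
	msg := "chore: example commit\n\nSigned-off-by: Xiaowu Zhu <xiaowu.zhu@daocloud.io>\n"
	for _, s := range parseSigners(msg) {
		fmt.Printf("%s (%s) domain=%s\n", s.Name, s.Email, emailDomain(s.Email))
	}
	fmt.Println("unsigned commit signers:", len(parseSigners("chore: no trailer here\n")))
}
```

The same function doubles as a crude DCO check: a commit whose message yields zero signers would be counted as unsigned.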
Identity discovery via email pattern + repo-graph (1 contributor). xingyug.guo.ericsson@gmail.com has the substring .ericsson deliberately embedded in a personal gmail handle. GitHub login xingyug lists no company and no LinkedIn surface, but cross-referencing his other PRs shows the current concentration is on Red Hat-maintained projects (rhel-lightspeed/linux-mcp-server, containers/kubernetes-mcp-server), not Ericsson properties. Best read: an Ericsson alum who recently joined a Red Hat MCP-server team. The defect-spotting pattern (NVML semantics here + SSRF / path-traversal / auth-bug fixes elsewhere) is consistent with a security-focused engineer. Both commits from this contributor are unsigned, and they account for two of the five unsigned external commits.
DCO compliance is weaker on this repo than on nvsentinel. Five external commits are unsigned (one each from kannon92, takonomura, and CoderTH, plus two from xingyug). The repo's branch protection only requires EasyCLA; there is no DCO bot blocking unsigned merges. By contrast, the nvsentinel repo's external commits were 100% signed (and the gap that surfaced there was on NVIDIA-internal committers, not externals).
No competing-GPU-vendor contributions detected. Unlike nvsentinel (which received a MooreThreads-affiliated style PR), this repo has not received any contributions from MooreThreads, AMD, Intel, Habana, Tenstorrent, or other competing accelerator vendors. The Ericsson signal (the @xingyug email-handle finding above) is the closest to "third-party hardware vendor" and even there the work is on NVML / DRA semantics, not on cross-vendor abstraction.
One pseudonymous contributor who is demonstrably world-class — takonomura. Won ISUCON14 (2024) — Japan's premier infrastructure-tuning competition. Heavy contributor to ICTSC (Japan's inter-college infra contest) and to whywaita/CyberAgent-community tools. DRA contributions across both this repo and kubernetes/kubernetes. Uses users.noreply.github.com everywhere; the only external contributor we genuinely cannot de-anonymize, but his work and contribution graph make clear he is an experienced platform engineer.
Giant Swarm joins the managed-K8s cluster. Marco Ebert (@Gacko) is a Giant Swarm Cluster API engineer with ~2,463 PRs inside the giantswarm org. That brings the managed-Kubernetes-vendor count contributing to this repo to five: AWS (RobertNorthard), Google (Belamaric / Ojea / Zhang), Microsoft (Deshmukh / Huhn / Yu), Red Hat / OpenShift (Emporopulo / Hannon), and Giant Swarm (Ebert), plus the upstream-distribution vendors DaoCloud and CAST AI. Every major Kubernetes commercial distribution model is represented in the external contributor pool except SUSE / Rancher and VMware Tanzu (Broadcom). The contributions are consistently shaped by what each vendor needs to ship the driver to their customers.
The IPv6 / driver-580 startup-probe fix was actually NVIDIA-internal. PRs #510 / #511 (compute-domain startup probe on IPv6 with NVIDIA driver 580+) initially appeared to be the only external IPv6 contribution. After the Helios cross-check those PRs are NVIDIA-internal (Mathew Wicks, NVIDIA Enterprise Products) — meaning the external IPv6 surface area in this repo is currently zero. Worth flagging because IPv6 / dual-stack support is a real production-readiness gap that no third party has yet pushed on.
The contribution shape is "platform-enablement," not "feature work." Of the 33 merged external PRs, the breakdown is roughly:
- Platform / installer / chart enablement (OpenShift, GKE, EKS, AKS, NetworkPolicy, /opt/bin, masters-toleration, control-plane affinity): ~11 PRs
- Production bug fixes (MPS chroot, NVML error vars, cache existence checks, race on startup): ~9 PRs
- CI / supply-chain / build / dependabot scaffolding: ~7 PRs
- Docs: ~3 PRs
- Features (leader-election, e2e test for it): 2 PRs (one of them 1.6k lines)
- Style / chores (map cleanup, repeat-label fix, env-var rename): ~3 PRs
External contributors are exercising this driver against real cloud-vendor distributions and fixing what breaks. NVIDIA continues to own all of the architectural and feature direction.
Helios cross-check caught a false-positive that GitHub-org check alone missed. Mathew Wicks (@thesuperzapper) initially read as a Kubeflow-lead external contributor running his own consultancy. Helios LDAP revealed he has been an NVIDIA employee since 2025-02-18 (Enterprise Products, Santa Clara HQ, manager-chain ending at Jensen Huang) — i.e. he was already on NVIDIA payroll 6 months before his two DRA-driver PRs merged on 2025-08-30. GitHub-org membership is not provisioned (HTTP 404); he contributes via his personal handle with no @nvidia.com signoff. Without a Helios pass this would have been miscategorized. This is the analogue of nvsentinel's jamie-yu0 finding (NVIDIA employee using a university-style external identity) and is a pattern any future external-contributor audit on NVIDIA-adjacent repos should look for.
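The decision order implied by this finding can be sketched as a small classifier. It is a simplification under stated assumptions: the `evidence` struct, the cohort names, and the sample email are hypothetical; the internal directory lookup is modelled as a plain boolean; and only the two HTTP statuses named in the methodology (204 and 404) are interpreted.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

type cohort string

const (
	cohortNVIDIA   cohort = "nvidia"
	cohortExternal cohort = "external"
	cohortReview   cohort = "needs-review"
)

// evidence collects the signals gathered for one commit author.
type evidence struct {
	AuthorEmail     string
	OrgMemberStatus int  // HTTP status of the org-membership lookup
	InDirectory     bool // hit in the internal employee directory (assumed input)
}

// classify applies the checks from cheapest to most authoritative. A
// directory hit overrides a 404 from the org-membership lookup, which is
// the case the cross-check caught.
func classify(e evidence) cohort {
	if strings.HasSuffix(strings.ToLower(e.AuthorEmail), "@nvidia.com") {
		return cohortNVIDIA
	}
	if e.OrgMemberStatus == http.StatusNoContent {
		return cohortNVIDIA
	}
	if e.InDirectory {
		return cohortNVIDIA
	}
	if e.OrgMemberStatus == http.StatusNotFound {
		return cohortExternal
	}
	// Any other status (redirect, rate limit, server error) is inconclusive.
	return cohortReview
}

func main() {
	personal := evidence{AuthorEmail: "someone@example.com", OrgMemberStatus: http.StatusNotFound, InDirectory: true}
	outside := evidence{AuthorEmail: "someone@example.com", OrgMemberStatus: http.StatusNotFound}
	fmt.Println(classify(personal)) // directory hit wins over the 404
	fmt.Println(classify(outside))
}
```

Putting the directory check ahead of the 404 branch is the design point: an org-membership miss alone is treated as a candidate signal, not a verdict.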

