Generated: 2026-05-11 (rev. 2 — Helios cross-check added)
Repo: kubernetes-sigs/dra-driver-nvidia-gpu
Repo history: 2022-07-14 → 2026-05-11 (~3.8 years)
Total commits analyzed: 1,853 (47 unique author emails)
Methodology: Extracted all unique commit authors via git log → classified by email domain (@nvidia.com = NVIDIA, all others = candidates) → mapped commits to GitHub logins via GET /repos/.../commits/{sha} → verified every candidate against GET /orgs/NVIDIA/members/{username} (HTTP 204 = confirmed member, 404 = not a member) → for ambiguous cases, additionally cross-referenced against NVIDIA Helios LDAP (helios-cli user search) to detect NVIDIA employees who contribute via personal GitHub accounts not registered in the NVIDIA org → cross-referenced GitHub profiles, DCO Signed-off-by trailers, LinkedIn, and corporate-email patterns → folded NVIDIA-personal-email aliases (e.g. klueska@gmail.com → kklues@nvidia.com, davanum@gmail.com → dsrinivas@nvidia.com, 7723350-elezar@…gitlab.com → elezar@nvidia.com, etc.) back into the NVIDIA cohort.
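The first two stages of this pipeline (alias folding and domain classification) and the interpretation of the org-membership status codes can be sketched as below. This is a minimal illustration, not the script used for the report: the alias map contains only the two complete examples quoted above, the function names are hypothetical, and no network call is made.

```python
# Alias map limited to the complete examples quoted in the methodology;
# the full list used for the report is not reproduced here.
NVIDIA_ALIASES = {
    "klueska@gmail.com": "kklues@nvidia.com",
    "davanum@gmail.com": "dsrinivas@nvidia.com",
}


def canonical_email(email: str) -> str:
    """Normalize an author email and fold known personal aliases."""
    normalized = email.strip().lower()
    return NVIDIA_ALIASES.get(normalized, normalized)


def classify_by_domain(email: str) -> str:
    """Return 'nvidia' for @nvidia.com authors, 'candidate' otherwise."""
    if canonical_email(email).endswith("@nvidia.com"):
        return "nvidia"
    return "candidate"


def org_membership(status_code: int) -> str:
    """Interpret the status of GET /orgs/NVIDIA/members/{username}."""
    if status_code == 204:
        return "member"
    if status_code == 404:
        return "not-a-member"
    # Anything else (redirect, rate limit, auth error) is inconclusive.
    return "unknown"
```

A candidate only stays on the external list if `classify_by_domain` returns `candidate` and `org_membership` returns `not-a-member`. As the thesuperzapper case shows, even that combination is not conclusive without the Helios pass.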
Of 37 commits from confirmed non-NVIDIA authors, 32 carry a valid Signed-off-by trailer and 5 do not. The repo does not run the standard CNCF DCO bot (only EasyCLA is configured as a branch-protection required check on main / release-25.8), so unsigned commits were not blocked at merge time.
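The signed/unsigned split can be reproduced with a trailer check along these lines. It is a sketch under one stated assumption: the report does not define "valid", so any well-formed `Signed-off-by: Name <email>` line is treated as a sign-off here. The function names are illustrative.

```python
import re

# One trailer per line: "Signed-off-by: Name <email>"
SIGNOFF_RE = re.compile(
    r"^Signed-off-by:[ \t]*(?P<name>.+?)[ \t]*<(?P<email>[^<>\s]+)>[ \t]*$",
    re.MULTILINE,
)


def signoffs(message: str) -> list[tuple[str, str]]:
    """Return every (name, email) sign-off found in a commit message."""
    return [(m["name"], m["email"]) for m in SIGNOFF_RE.finditer(message)]


def is_signed(message: str) -> bool:
    """A commit counts as signed if it carries at least one trailer."""
    return bool(signoffs(message))
```

Because `signoffs` returns all trailers, it also surfaces dual sign-offs such as the one on yuyue9284's commit.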
Unsigned external commits (5):
| Commit | Author | PR | Title |
|---|---|---|---|
| 1eb01b4 | coderth <coderth@outlook.com> | #38 | fix: nvidia-dra-plugin Config |
| b96e2c4 | Kevin Hannon <kehannon@redhat.com> | #1016 | Pin GitHub Actions to commit SHAs for supply-chain protection |
| a379b9f | takonomura <takonomura@users.noreply.github.com> | #1053 | helm: fix `maskNvidiaDriverParams` path |
| ffa7858 | Xingyu Guo <xingyug.guo.ericsson@gmail.com> | #1039 | gpu plugin: fix wrong error variable in getGpuInfo for NVML system calls |
| 81c4422 | Xingyu Guo <xingyug.guo.ericsson@gmail.com> | #1040 | gpu plugin: add missing return in unpreparePartiallyPrepairedClaim when DynamicMIG is disabled |
Identity corrections from DCO trailers:
- yyzxw signed off with `xiaowu.zhu <xiaowu.zhu@daocloud.io>` → real name is Xiaowu Zhu, employer is DaoCloud (GitHub profile has no name, no company, no bio; commit email `1020938856@qq.com` is opaque).
- yuyue9284 signed off with both `Yue Yu <yuyu3@microsoft.com>` and `yuyue9284 <15863499+yuyue9284@users.noreply.github.com>` → real name Yue Yu, employer Microsoft (GitHub profile has no name, no company, no bio).
- lengrongfu signed off with `rongfu.leng <lenronfu@gmail.com>` (commit email is the local hostname `lengrongfu@lengrongfudeMacBook-Pro.local`); GitHub profile separately confirms `@DaoCloud`.
- cyclinder signed off with `Cyclinder Kuo <kuocyclinder@gmail.com>` (full real name: Cyclinder Kuo).
- learner0810 signed off with `zhongjun.li <zhongjun.li@daocloud.io>` → real name Zhongjun Li, employer DaoCloud (GitHub profile has no name, no company, no bio).
Corrections from NVIDIA org check (false positives removed):
- visheshtanksale (Vishesh Tanksale, PR #965, commit email `vishesh.tanksale09@gmail.com`) is an NVIDIA org member (HTTP 204). Removed from the external list. He sits in the OWNERS file as a reviewer.
Corrections from NVIDIA Helios LDAP (additional false positives, not detectable from GitHub org alone):
- thesuperzapper (Mathew Wicks, PRs #510 / #511, commit email `5735406+thesuperzapper@users.noreply.github.com`) is a confirmed NVIDIA employee per Helios LDAP: login `mwicks`, email `mwicks@nvidia.com`, NVIDIA hire date 2025-02-18, department Enterprise Products, location Santa Clara HQ, manager chain Joohoon Lee → Ian Buck → Jensen Huang. Both of his DRA-driver PRs merged on 2025-08-30, about 6 months after his NVIDIA hire date. He contributes via his personal GitHub handle (still listing `@aranui-solutions` as company; "Kubeflow Lead" in bio) and has not joined the NVIDIA GitHub org (HTTP 404), which is why he initially read as external; the Helios cross-check resolved it. Removed from the external list. This is the analogue of the nvsentinel report's `jamie-yu0` discovery: an NVIDIA employee whose external GitHub identity initially appears as third-party.
The two corrections above (visheshtanksale, thesuperzapper) reduce the external-contributor count from 24 candidates down to 22. The Helios pass would have caught additional cases if any NVIDIA-employed contributor used a personal email AND a personal GitHub handle AND was not in the NVIDIA GitHub org; thesuperzapper is the only such case found. The single most actionable insight from this report is that the GitHub-org check alone misses NVIDIA employees who do OSS work from personal handles — anyone doing this kind of analysis on NVIDIA-adjacent repos in the future should include a Helios pass.
22 confirmed external contributors with 33 merged PRs (plus 11 closed-unmerged + 1 open) and 35 external commits total. Profiles below are grouped by employer cluster (Google, Microsoft, Red Hat, DaoCloud), followed by contributors who are the only external from their employer or whose employer is not disclosed.
1. John Belamaric — @johnbelamaric
| Field | Value |
|---|---|
| Commit email | jbelamaric@google.com |
| DCO Signed-off-by | John Belamaric <jbelamaric@google.com> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub company | |
| Public email | jbelamaric@google.com |
| LinkedIn | linkedin.com/in/johnbelamaric (public, not linked from GitHub) |
| Employer | Google — Principal / Distinguished Software Engineer; SIG-Architecture co-chair; co-driver of the DRA KEP and ResourceClaim API; long-time CoreDNS maintainer; co-author of Learning CoreDNS (O'Reilly); frequent KubeCon speaker |
| Location | US (East Coast / Maryland area, per public bios) |
Focus area: Manageability for control-plane scheduling. Added a toleration that lets the kubelet-plugin DaemonSet land on compute-class-tainted nodes (the "compute class" pattern Google's GKE Autopilot uses).
| PR | Contribution |
|---|---|
| #221 | Add toleration for compute class taint |
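For readers less familiar with why a toleration is needed here, the standard Kubernetes matching rule between a toleration and a node taint can be sketched as follows. The taint key and value in the usage note are assumptions for illustration only; the report does not quote the key that PR #221 tolerates.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Simplified Kubernetes rule: does this toleration match this taint?"""
    effect = toleration.get("effect", "")
    # An empty toleration effect matches every taint effect.
    if effect and effect != taint.get("effect"):
        return False

    key = toleration.get("key", "")
    operator = toleration.get("operator", "Equal")
    if not key:
        # An empty key is only meaningful with Exists: tolerate everything.
        return operator == "Exists"
    if key != taint.get("key"):
        return False
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")
```

With an assumed taint of `{"key": "cloud.google.com/compute-class", "value": "gpu", "effect": "NoSchedule"}`, a toleration using the same key with `operator: Exists` matches, so a DaemonSet pod carrying it can be scheduled onto the tainted node.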
2. Antonio Ojea — @aojea
| Field | Value |
|---|---|
| Commit email | aojea@google.com |
| DCO Signed-off-by | Antonio Ojea <aojea@google.com> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub company | |
| Public email | antonio.ojea.garcia@gmail.com |
| X (Twitter) | @itsuugo |
| LinkedIn | linkedin.com/in/ajojea |
| Personal site | https://kindnet.es (creator/maintainer of KindNet CNI) |
| Location | Spain (remote, Google) |
| Followers | 545 |
| Employer | Google (GKE Networking) — Senior Software Engineer; Kubernetes SIG-Network technical lead / approver; KIND maintainer; creator of KindNet; KEP author for many networking features (kube-proxy, IPv6/dual-stack, Gateway API); frequent KubeCon speaker |
Focus area: Cross-driver correctness. PR #435 prevents the NVIDIA kubelet-plugin from trying to allocate devices that belong to a different DRA driver — a real-world hazard in any cluster running multiple DRA drivers (e.g. NVIDIA GPU + a network/RDMA DRA driver), which is the configuration GKE is moving toward.
| PR | Contribution |
|---|---|
| #435 | filter device requests from other drivers |
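The idea behind the fix can be illustrated with a small filter over a claim's allocation results. This is a sketch of the concept, not the Go code from PR #435: the results are simplified dictionaries, `gpu.nvidia.com` is assumed as this driver's name, and `rdma.example.com` is a made-up second driver.

```python
# Assumed driver name for illustration; results are simplified dictionaries.
DRIVER_NAME = "gpu.nvidia.com"


def results_for_driver(allocation_results: list[dict],
                       driver_name: str = DRIVER_NAME) -> list[dict]:
    """Keep only the allocation results that this driver is responsible for."""
    return [r for r in allocation_results if r.get("driver") == driver_name]


claim_results = [
    {"request": "gpu", "driver": "gpu.nvidia.com", "device": "gpu-0"},
    {"request": "nic", "driver": "rdma.example.com", "device": "nic-3"},
]
own_results = results_for_driver(claim_results)
```

Only the `gpu-0` entry survives the filter, so the plugin never attempts to prepare the device owned by the other driver.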
3. Leiyi Zhang — @leiyiz
| Field | Value |
|---|---|
| Commit email | leiyiz@google.com |
| DCO Signed-off-by | Léiyì Zhang <leiyiz@google.com> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub company | |
| Location | Seattle, WA |
| LinkedIn | not publicly linked |
| Employer | Google — CSI / GKE storage / AI-on-GKE engineer (contributes heavily to gcp-filestore-csi-driver, gcp-compute-persistent-disk-csi-driver, container-engine-accelerators, ai-on-gke); DRA GPU driver work targets GKE accelerators |
Focus area: Image-hardening (driver image). The PR fixed a shell-quoting/interpolation bug where "{$var}" was being emitted as a literal {} wrapped around the filename. A subsequent unmerged PR (#972) attempted to remove the shell dependency from the driver image altogether — consistent with running NVIDIA DRA under a hardened, shell-less base image (a Google/Distroless idiom).
| PR | Contribution |
|---|---|
| #968 | fix syntax error where "{$var}" results in literal {} wrapped around file name |
| #972 (closed, unmerged) | remove shell dependency from the driver image |
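The quoting bug in #968 is easy to reproduce outside a shell, because Python's `string.Template` uses the same `$name` / `${name}` convention. The snippet below is only an analogue of the shell behavior; the filename is hypothetical.

```python
from string import Template

filename = "example.conf"  # hypothetical filename, for illustration only

# "{$var}": the braces sit outside the placeholder, so they stay literal.
broken = Template("{$var}").substitute(var=filename)

# "${var}": the braces delimit the placeholder name and disappear.
fixed = Template("${var}").substitute(var=filename)
```

Here `broken` evaluates to `{example.conf}` while `fixed` evaluates to `example.conf`, which is the difference the PR title describes.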
Google cluster (3 contributors, 3 merged PRs): Belamaric, Ojea, and Leiyi Zhang. Together they cover the three things Google needs out of this driver: compute-class scheduling (GKE Autopilot), multi-DRA-driver coexistence (GKE multi-resource), and a shell-less hardened base image. None of these contributions are cosmetic; they read like patches that arose from Google running this driver in some kind of integration / pre-prod path.
4. Suraj Deshmukh — @surajssd
| Field | Value |
|---|---|
| Commit email | surajd.service@gmail.com |
| DCO Signed-off-by | Suraj Deshmukh <surajd.service@gmail.com> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub company | @microsoft |
| Public email | surajd.service@gmail.com |
| Blog | https://suraj.io/ |
| X (Twitter) | @surajd_ |
| Location | Redmond, WA |
| Public repos | 304 |
| LinkedIn | linkedin.com/in/surajssd |
| Bluesky | https://bsky.app/profile/suraj.io |
| Employer | Microsoft — Senior Software Engineer (AKS / Azure Linux); ex-Kinvolk (Flatcar Container Linux, acquired by Microsoft 2021); ex-Red Hat OpenShift/Origin; frequent KubeCon speaker and blogger |
Focus area: Documentation; the PR clarifies the expected post-install state of the DRA driver in the README.
| PR | Contribution |
|---|---|
| #164 | README: Update the expectations after installation |
5. Jon Huhn — @nojnhuh
| Field | Value |
|---|---|
| Commit email | nojnhuh@users.noreply.github.com |
| DCO Signed-off-by | Jon Huhn <nojnhuh@users.noreply.github.com> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub company | @microsoft |
| LinkedIn | not publicly linked |
| Employer | Microsoft — Software Engineer on the Azure Kubernetes team; cluster-api-provider-azure (CAPZ) maintainer; maintainer of kubernetes-sigs/dra-example-driver (the reference DRA implementation he ported many learnings from into this driver); active across kubernetes/kubernetes, test-infra, klog, enhancements |
Focus area: Operational softening — only warn (don't fail) when pcieRoot cannot be determined for a GPU. Relevant on AKS GPU nodes where the PCI topology query may not return a stable bus path under all node images.
| PR | Contribution |
|---|---|
| #577 | Only warn when pcieRoot can't be determined |
6. Yue Yu — @yuyue9284
| Field | Value |
|---|---|
| Commit email | yuyu3@microsoft.com (in commit author) |
| DCO Signed-off-by | Yue Yu <yuyu3@microsoft.com> + yuyue9284 <15863499+yuyue9284@users.noreply.github.com> ✅ (dual signoff) |
| NVIDIA org check | HTTP 404 |
| GitHub profile | No name, no bio, no company set (but a member of the microsoft GitHub org) |
| LinkedIn | not publicly linked |
| Employer | Microsoft — Azure Arc for Kubernetes / Azure ML on Kubernetes engineer; PR distribution: AzureArcForKubernetes/azure-cli-extensions (12), Azure/azureml-examples (7), Azure/AML-Kubernetes (4), microsoft/frameworkcontroller, MicrosoftDocs/azure-ai-docs; also contributes to volcano-sh/volcano (batch scheduler) and NVIDIA/gpu-operator |
Note: GitHub login yuyue9284 does not match the commit author email's local part (yuyu3@…); without the DCO trailer this contributor would have read as an unknown handle. The dual signoff (one corporate, one noreply) is unusual — likely a tooling artifact from a corporate review pipeline.
Focus area: Stale-cache fix in the ComputeDomain controller — guards against operating on a ComputeDomain that has been deleted but is still present in the informer cache. Production-grade hardening, not a drive-by.
| PR | Contribution |
|---|---|
| #805 | fix: check existence of ComputeDomain in cache before processing updates |
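Following the wording of the PR title, the guard amounts to looking the object up in the cache before acting on an update and skipping the event when the lookup fails. The sketch below uses a plain dictionary in place of the informer cache and hypothetical names throughout; the actual fix is in the Go controller.

```python
from typing import Callable


def handle_update(cache: dict, key: str, process: Callable[[dict], None]) -> bool:
    """Process an update only if the object is still known to the cache."""
    obj = cache.get(key)
    if obj is None:
        # The object is gone; acting on the stale event would operate on
        # something that no longer exists.
        return False
    process(obj)
    return True
```

Returning a boolean rather than raising keeps a stale event from being treated as an error that the work queue would retry.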
Microsoft cluster (3 contributors, 3 merged PRs): Deshmukh, Huhn, and Yue Yu. Yue Yu's ComputeDomain cache fix is a non-trivial controller-runtime correctness improvement. Huhn's pcieRoot patch is AKS-flavored. Together they imply AKS is exercising this driver end-to-end, not just shipping it.
7. Vitaliy Emporopulo — @empovit
| Field | Value |
|---|---|
| Commit email | vemporop@redhat.com |
| DCO Signed-off-by | Vitaliy Emporopulo <vemporop@redhat.com> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub company | Red Hat |
| Public email | vemporop@redhat.com |
| Profile name | Vitaly E. |
| Location | Israel |
| Public repos | 64 |
| LinkedIn | not publicly linked |
| Employer | Red Hat — Software Engineer focused on the NVIDIA / GPU ecosystem on OpenShift. Primary contributor to rh-ecosystem-edge/nvidia-ci, rh-ecosystem-edge/console-plugin-nvidia-gpu, openshift/instaslice-operator (MIG slice orchestration), plus openshift/release CI plumbing |
Focus area: OpenShift integration. Five merged PRs over ~25 months (Feb 2024 → Mar 2026), covering: privileged kubelet-plugin on OpenShift; OpenShift SCC bindings for service accounts (so the IMEX daemon can write its nodes_config.cfg under a random UID); the nvidia-container-toolkit path override for non-default RHEL installs; the master-role toleration for control-plane workloads; and the original OpenShift install docs in the README. This is the most sustained external contribution in the repo's history; the OpenShift install instructions in docs/ exist because Emporopulo wrote them.
| PR | Contribution |
|---|---|
| #72 | Allow custom NVIDIA CTK path |
| #76 | Let kubelet plugin run privileged on OpenShift |
| #82 | Document DRA driver installation on OpenShift |
| #569 | Add SCC to service accounts on OpenShift (fixes IMEX writeNodesConfig perm-denied under random UID) |
| #899 | Add node-role.kubernetes.io/master toleration to controller |
8. Kevin Hannon — @kannon92
| Field | Value |
|---|---|
| Commit email | kehannon@redhat.com |
| DCO Signed-off-by | (missing — commit unsigned) |
| NVIDIA org check | HTTP 404 |
| GitHub company | Red Hat |
| Bio | "wg-batch lead. Kueue reviewer. JobSet Maintainer. Excited to make Kubernetes the platform of choice for AI/ML/HPC." |
| Location | Cleveland, Ohio |
| LinkedIn | not publicly linked from GitHub |
| Followers | 98 |
| Employer | Red Hat — Principal Software Engineer on OpenShift / Kubernetes batch & scheduling; WG-Batch Lead (Kubernetes); Kueue reviewer; JobSet maintainer; frequent KubeCon presenter on batch/AI workloads (Kueue, JobSet) |
Focus area: Supply-chain hardening of the repo's own CI. The change pins every GitHub Action used in workflows to an immutable commit SHA (instead of a movable tag), explicitly motivated by the March 2025 tj-actions/changed-files and reviewdog/action-setup compromises. Fixes upstream issue #1015.
| PR | Contribution |
|---|---|
| #1016 | Pin GitHub Actions to commit SHAs for supply-chain protection (fixes #1015) |
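A check of the kind this PR makes enforceable can be sketched as a scan for `uses:` references that are not pinned to a full 40-character commit SHA. This is an illustrative line-based scan, not tooling from the repo; it skips local (`./`) and `docker://` references, and the function name is hypothetical.

```python
import re

USES_RE = re.compile(r"^[ \t]*(?:-[ \t]*)?uses:[ \t]*(?P<ref>\S+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return action references that use a tag or branch instead of a SHA."""
    unpinned = []
    for match in USES_RE.finditer(workflow_text):
        ref = match["ref"].strip("'\"")
        if ref.startswith("./") or ref.startswith("docker://"):
            continue  # local and container references carry no git ref
        _, _, version = ref.partition("@")
        if not FULL_SHA_RE.match(version):
            unpinned.append(ref)
    return unpinned
```

A movable tag such as `@v4` is reported, while a reference ending in a full commit SHA passes. A trailing `# v4` comment after the SHA is ignored because the reference stops at the first whitespace.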
Red Hat cluster (2 contributors, 6 merged PRs): Emporopulo (5 PRs, all OpenShift enablement) and Hannon (1 PR, CI supply-chain hardening). Emporopulo is the longest-running external contributor in the repo; Hannon's contribution is the single most consequential security PR an external has landed here.
9. Xiaowu Zhu — @yyzxw
| Field | Value |
|---|---|
| Real name | Xiaowu Zhu (朱晓武) — discovered via DCO trailer (xiaowu.zhu@daocloud.io); GitHub profile has no name, company, bio, or blog |
| Commit email | 1020938856@qq.com |
| DCO Signed-off-by | xiaowu.zhu <xiaowu.zhu@daocloud.io> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub | 77 public repos, created 2017-12; goes by handle "zxw" |
| LinkedIn | not publicly linked |
| Employer | DaoCloud — software engineer focused on AI/LLM serving infrastructure. PR concentration: DaoCloud/dce-charts-repackage, BaizeAI/modelhub, BaizeAI/dataset (DaoCloud's Baize AI platform), llm-d/llm-d-kv-cache, and vllm-project/vllm. Also active on DaoCloud's Higress plugin server |
Focus area: Repo bootstrap chores during the first months after the project moved to its current form — dependabot setup, label dedupe, PR templates, makefile helpers, ci scaffolding. Only 2 of 6 PRs merged; the other 4 were closed unmerged (likely overlapped with NVIDIA's own scaffolding direction).
| PR | Contribution |
|---|---|
| #45 | feat: add dependabot |
| #59 | fix: remove repeat label key |
| #42, #43, #44, #64 (all closed-unmerged) | makefile help target, PR templates, PR-title check, switch-to-map refactor |
10. Noah Tang — @CoderTH
| Field | Value |
|---|---|
| Commit email | coderth@outlook.com |
| DCO Signed-off-by | coderth <coderth@outlook.com> ✅ on #39; missing on 1eb01b4 (#38) |
| NVIDIA org check | HTTP 404 |
| GitHub company | @DaoCloud |
| Real name | Noah Tang |
| Bio | "🚀 Cloud Native Developer |
| Location | Chengdu, China |
| Blog | https://coderth.onrender.com/ (Hexo blog, handle "CoderTh") |
| LinkedIn | not publicly linked |
| Employer | DaoCloud — Cloud Native engineer. Works on DaoCloud's dce-charts-repackage, the matrixhub-ai/matrixhub AI model hub, NVIDIA vGPU / k8s-vgpu-scheduler integrations, and contributes to NVIDIA Dynamo and the Harbor operator |
Focus area: Early CI/config scaffolding — the GitHub-actions CI step that runs golangci-lint/test/build was first set up by this PR, and the nvidia-dra-plugin static config fix.
| PR | Contribution |
|---|---|
| #38 | fix: nvidia-dra-plugin Config (unsigned commit) |
| #39 | feat: add github ci step |
| #40 (closed-unmerged) | feat: add issue template |
11. Rongfu Leng — @lengrongfu
| Field | Value |
|---|---|
| Commit author email | lengrongfu@lengrongfudeMacBook-Pro.local (local laptop hostname — not deliverable) |
| DCO Signed-off-by | rongfu.leng <lenronfu@gmail.com> ✅ |
| Profile email | 1275177125@qq.com |
| NVIDIA org check | HTTP 404 |
| GitHub company | @DaoCloud |
| Real name | rongfu.leng |
| Location | Chengdu, Sichuan, China |
| Blog | https://lengrongfu.github.io/ |
| Public repos | 309 |
| LinkedIn | not publicly linked |
| Employer | DaoCloud — Senior engineer; heavy contributor in the LLM-serving ecosystem (vllm-project/vllm, vllm-omni, vllm-project/router, sglang-project/sglang) plus DaoCloud's enterprise charts. Chinese name 冷荣富 |
Focus area: Environment-variable rename to align with the broader nvidia-container-toolkit naming convention (CONTAINER_DRIVER_ROOT → DRIVER_ROOT_CTR_PATH). Closed-unmerged follow-up #212 attempted NVIDIA_CTK_PATH → NVIDIA_CDI_HOOK_PATH for the same alignment reason.
| PR | Contribution |
|---|---|
| #211 | replace CONTAINER_DRIVER_ROOT with DRIVER_ROOT_CTR_PATH |
| #212 (closed-unmerged) | Use NVIDIA_CDI_HOOK_PATH instead of NVIDIA_CTK_PATH |
12. Cyclinder Kuo — @cyclinder
| Field | Value |
|---|---|
| Commit email | qifeng.guo@daocloud.io |
| DCO Signed-off-by | Cyclinder Kuo <kuocyclinder@gmail.com> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub company | @DaoCloud |
| Real name | Cyclinder Kuo (Qifeng Guo — Chinese name 郭起峰, per commit email handle) |
| Bio | "everything we went through was just a waste of a time" |
| Location | Chengdu, China |
| X (Twitter) | @Cyclinder_Kuo |
| LinkedIn | not publicly linked |
| Employer | DaoCloud — networking engineer; maintainer / major contributor to spidernet-io/spiderpool (CNCF sandbox Kubernetes IPAM/CNI) and the vlan-cni / iaas-network-provider projects |
Focus area: Doc maintenance — adjust README to follow script/namespace renames done elsewhere in the repo.
| PR | Contribution |
|---|---|
| #280 | README: adjust to script/namespace renames |
13. Zhongjun Li — @learner0810
| Field | Value |
|---|---|
| Commit email | zhongjun.li@daocloud.io |
| DCO Signed-off-by | zhongjun.li <zhongjun.li@daocloud.io> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub profile | No name, no bio, no company set; 58 public repos |
| LinkedIn | not publicly linked |
| Employer | DaoCloud — Software engineer on inference / storage. PR distribution: DaoCloud/dce-charts-repackage (9), kubernetes-sigs/gateway-api-inference-extension (7), llm-d/llm-d-inference-scheduler (4), hwameistor/hwameistor (3, CNCF sandbox local-volume storage). Keeps a low public profile — identity recoverable only from DCO trailer |
Focus area: Build fix. Closed-unmerged follow-up #66 proposed adding gofumpt to the code checks.
| PR | Contribution |
|---|---|
| #67 | fix build binaries |
| #66 (closed-unmerged) | Add gofumpt code checks |
DaoCloud cluster (5 contributors, 7 merged PRs): Zhu, Tang, Leng, Kuo, Li. All five tie back to DaoCloud through a daocloud.io address (DCO trailer or commit email) or an @DaoCloud GitHub company field, and three of them (Tang, Leng, Kuo) list Chengdu as their location. This is the single largest external-company cluster on the project. The work is heavy on chores/scaffolding/docs and lighter on architectural change; the contribution shape suggests early-days community engagement rather than a deep production-deployment-driven workstream. DCO trailers are the only identity signal for two of the five (Zhu and Li have no name, company, or bio on their GitHub profiles).
14. Kasia Kujawa — @kasia-kujawa
| Field | Value |
|---|---|
| Commit email | katarzyna@cast.ai |
| DCO Signed-off-by | Katarzyna Kujawa <katarzyna@cast.ai> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub company | CAST AI |
| Real name | Katarzyna ("Kasia") Kujawa |
| Location | Gdańsk, Poland |
| LinkedIn | not publicly linked from GitHub (multiple Katarzyna Kujawa entries in Gdańsk on LinkedIn; the CAST AI–affiliated one is the match) |
| Employer | CAST AI — Software engineer on the cost-optimization / autoscaling product surface (castai/k8s-agent, castai/terraform-provider-castai, castai/helm-charts). CAST AI is a Kubernetes autoscaling / cost-management platform spanning GKE / EKS / AKS |
Focus area: Production bug-fix stream from CAST AI's GPU offering. Five merged PRs in early 2026:
- PR #889: fixes the `mps-control-daemon` chroot `'sh': No such file or directory` error when `nvidiaDriverRoot` is `/home/kubernetes/bin/nvidia/`, a GKE-specific driver-install path. CAST AI runs MPS GPU-sharing on GKE.
- PR #978: corrects the MPS shm-dir mount path so MPS clients can actually talk to the control daemon.
- PR #979: fixes the cross-compiler used for the static-bash build (related to bash-static work the NVIDIA team did for distroless images).
- PR #996: actionable validation errors when GPU sharing config is malformed (improves error UX for end-users).
- PR #997: nil-pointer fix when `featureGates: null` is set in the Helm chart values.
| PR | Contribution |
|---|---|
| #889 | Fix mps-control-daemon chroot shell execution error when nvidiaDriverRoot is set (GKE) |
| #978 | Set proper path for MPS shm dir mount |
| #979 | Use correct cross-compiler for bash static build |
| #996 | Improve validation errors for GPU sharing with actionable messages and supported values |
| #997 | Fix nil pointer when featureGates is set to null in values |
| #1009 (open) | Retry device enumeration on startup to prevent empty ResourceSlices |
Note: This is the most active and the most production-driven external contributor in the repo's recent history. Every PR is a fix for something CAST AI hit in production.
15. Robert Northard — @RobertNorthard
| Field | Value |
|---|---|
| Commit email | robertnorthard@googlemail.com |
| DCO Signed-off-by | Rob <robertnorthard@googlemail.com> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub company | AWS |
| Public email | robertnorthard@googlemail.com |
| Profile name | Rob |
| Public repos | 142 |
| LinkedIn | not publicly linked from GitHub (public Robert Northard SA-at-AWS profile exists on LinkedIn) |
| Employer | Amazon Web Services (AWS) — Specialist Solutions Architect / engineer. Heavy contributor to aws/karpenter-provider-aws, aws/eks-anywhere, aws-samples/karpenter-blueprints, aws-ia EKS Blueprints, and aws-eks-best-practices. Likely EMEA-based given his community presence in EU Karpenter / EKS workshops |
Focus area: Helm-chart defaults for managed-Kubernetes (EKS-flavored) deployments. The default node affinity assumed a node-role.kubernetes.io/control-plane node existed — which is never true on managed offerings (EKS / GKE / AKS hide the control plane). The PR also adds default GPU tolerations so the controller pods can land on tainted GPU nodes. Filed as issue #1047 and fixed by his own PR #1054.
| PR | Contribution |
|---|---|
| #1054 | Changed default node affinity and GPU tolerations for DRA controller and kubelet plugin helm chart values (fixes #1047) |
16. Marco Ebert — @Gacko
| Field | Value |
|---|---|
| Commit email | marco_ebert@icloud.com |
| DCO Signed-off-by | Marco Ebert <marco_ebert@icloud.com> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub | 97 followers, but only 1 public personal repo (marctl) — the 97-follower count tracks his commit volume inside the giantswarm GitHub org rather than personal projects |
| Real name | Marco Ebert |
| Location | Germany (Giant Swarm is headquartered in Cologne; not set on profile) |
| LinkedIn | not publicly linked from GitHub (public "Marco Ebert at Giant Swarm" profile exists on LinkedIn) |
| Employer | Giant Swarm (Cologne-based managed-Kubernetes vendor; member of the giantswarm org with ~2,463 internal PRs there). Platform / cluster engineer working on Cluster API tooling — cluster-api-app, cluster-test-suites, cluster-standup-teardown, clustertest, releases, cluster-vsphere. Account from 2011 |
Focus area: Two helm/runtime PRs in late 2025. PR #708 adds NetworkPolicy resources to the chart (egress/ingress for the controller and daemons) — meaningful in restricted clusters that block all pod-to-pod traffic by default. PR #706 adds /opt/bin to the kubelet-plugin's binary search path (Talos / Flatcar / GKE-COS-style read-only-root distros put their nvidia binaries under /opt/bin).
| PR | Contribution |
|---|---|
| #706 | kubelet plugins: add /opt/bin to binary search paths |
| #708 | chart: add network policies |
17. Herb Duan — @herb-duan
| Field | Value |
|---|---|
| Commit email | herbertduan@qq.com |
| DCO Signed-off-by | Herb Duan <herbertduan@qq.com> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub | No bio, no company, no blog; 38 public repos; 28 followers |
| Real name | Herb Duan |
| Location | Beijing, China |
| LinkedIn | not publicly linked |
| Employer | Not disclosed — no company on GitHub bio, no blog, no social accounts. Total public footprint is only 5 PRs across 3 repos: kubernetes/kubernetes (3, kubelet resource-claim status handling), this repo (1, the leader-election feature below), and BenchCouncil/BigDataBench (1 — a Chinese academic/research benchmark, suggesting a possible prior ICT/CAS-style academic background; unverified) |
Focus area: Largest external feature contribution in the repo. PR #851 adds leader election to the ComputeDomain controller so it can run multi-replica without conflicts — eliminating the previous SPOF on the controller. +1,599 / −4 lines across 14 files. Fixes issue #815 (filed by another non-NVIDIA user, Lily922). The follow-up e2e test for this feature was contributed by Anish Bista (PR #1094, below).
| PR | Contribution |
|---|---|
| #851 | feat(controller): Add leader election for high availability (+1599/-4, fixes #815) |
18. Anish Bista — @anishbista60
| Field | Value |
|---|---|
| Commit email | anishbista053@gmail.com |
| DCO Signed-off-by | anish bista <anishbista053@gmail.com> and anishbista60 <anishbista053@gmail.com> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub company | KubeRox Technologies |
| Bio | "Always be humble to everyone" |
| Location | Nepal |
| X (Twitter) | @anishbista053 |
| LinkedIn | linkedin.com/in/anishbista |
| Personal site | https://anishbista60.github.io/personal-website/ ; blog at https://medium.com/@anishbista |
| Public repos | 60 |
| Employer | KubeRox Technologies (Nepal) — Kubernetes engineer. Self-describes as the "youngest CNCF Kubestronaut from Nepal" (holds CKA, CKAD, CKS, KCNA, KCSA). Active KubeVirt contributor and maintainer of the kubevirtbmc project; also contributes to Kanister |
Focus area: End-to-end test coverage for the leader-election feature herb-duan landed in #851. Filed against follow-up issue #970 (jgehrcke). This is the only example in the repo of two external contributors collaborating across a feature: one shipped the feature, the other shipped the test.
| PR | Contribution |
|---|---|
| #1094 | tests/bats: add leader election e2e test for compute-domain-controller |
19. Kante Yin — @kerthcet
| Field | Value |
|---|---|
| Commit email | kerthcet@gmail.com |
| DCO Signed-off-by | kerthcet <kerthcet@gmail.com> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub | No company set; bio "Building AI Infrastructure @InftyAI @hiverge" |
| Real name | Kante Yin |
| Location | Cambridge, UK |
| Blog | https://ky.dev/ |
| X (Twitter) | @kerthcet |
| Followers | 212 |
| LinkedIn | not publicly linked |
| Employer | InftyAI (non-profit AI-infra org he co-founded) and Hiverge. Kubernetes SIG-Scheduling reviewer/approver; historical Kueue maintainer; now focused on InftyAI's open-source LLM-infra projects (PUMA, alphatrion, llmaz). Chinese name 银坎特; frequent KubeCon China speaker |
Focus area: Tiny correctness PR — drop empty values from a map (i.e. the make rev helper script wasn't removing now-empty entries after key deletions). Kerthcet is best known elsewhere in the K8s ecosystem (Kueue, llmaz) — this is a drive-by from an early Kueue + DRA crossover phase.
| PR | Contribution |
|---|---|
| #48 | Remove item when values are empty in map |
20. takonomura — @takonomura
| Field | Value |
|---|---|
| Commit email | takonomura@users.noreply.github.com |
| DCO Signed-off-by | (missing — commit unsigned) |
| NVIDIA org check | HTTP 404 |
| GitHub | No name, no bio, no company; 52 public repos; 46 followers |
| Location | Japan |
| LinkedIn | not publicly linked (handle is consistently pseudonymous across all repos, all commits use users.noreply.github.com) |
| Employer | Not publicly disclosed — deliberately pseudonymous. 228 PRs across many orgs. Strong circumstantial signal of association with the Japanese university IT-contest scene: heavy contributor to ictsc/ictsc-regalia, ictsc/ictsc-k8s-infra, and ictsc/ictsc-regalia-release (ICTSC = ICT Service Contest, a Japanese inter-college infra competition). Also contributes to whywaita/myshoes and whywaita/shoes-lxd-multi (whywaita is "Tachibana Waita" at CyberAgent), suggesting peer-group ties to the CyberAgent / Japanese-cloud community. Won ISUCON14 in 2024 (his isucon14 repo description claims "優勝" = winner). Expertise: DRA (this repo + kubernetes/kubernetes), CUE, netavark, LXD, GitHub Actions self-hosted runners |
Focus area: Helm chart bug — maskNvidiaDriverParams was being rendered at the wrong YAML path so the feature toggle didn't actually take effect. The first PR to the repo from this contributor (welcomed by k8s-ci-robot).
| PR | Contribution |
|---|---|
| #1053 | helm: fix maskNvidiaDriverParams path (unsigned commit) |
21. Jia-Wei Jiang — @JiangJiaWei1103
| Field | Value |
|---|---|
| Commit email | waynechuang97@gmail.com |
| DCO Signed-off-by | JiangJiaWei1103 <waynechuang97@gmail.com> ✅ |
| NVIDIA org check | HTTP 404 |
| GitHub | No company set; bio "Never sell out · De-noobing · @ray-project Contributor · @flyteorg Committer" |
| Real name | Jia-Wei Jiang (江家瑋) |
| Location | Taiwan |
| Kaggle | https://www.kaggle.com/abaojiang |
| LinkedIn | linkedin.com/in/jiawei-jiang-mr-denoober |
| Employer | Not publicly stated. Open-source committer relationships: Flyte committer; Ray (KubeRay) contributor — recent activity is mostly large multi-part PRs to ray-project/kuberay's History Server beta. ML/data-science background (active on Kaggle) |
Focus area: Documentation polish — clarify an error message in an input-validation path.
| PR | Contribution |
|---|---|
| #444 | docs: Clarify err msg for input validation |
22. Xingyu "Richard" Guo — @xingyug
| Field | Value |
|---|---|
| Commit email | xingyug.guo.ericsson@gmail.com (the .ericsson infix in a personal gmail handle is unusual — see Notable Observations) |
| DCO Signed-off-by | (missing — both commits unsigned) |
| NVIDIA org check | HTTP 404 |
| GitHub profile name | Richard Guo |
| GitHub | 14 public repos; only 1 follower; account created 2022-10-24 (recent) |
| LinkedIn | not publicly linked |
| Employer | Now likely Red Hat (with a prior Ericsson stint). The .ericsson token in the gmail handle is consistent with his earlier corporate identity, but the current PR concentration is on Red Hat–maintained projects: rhel-lightspeed/linux-mcp-server and containers/kubernetes-mcp-server (the RHEL Lightspeed / containers MCP-server work). Additional PRs to vllm, BerriAI/litellm, jumpserver/jumpserver — all security/correctness focused (SSRF protections, path-traversal hardening, unbounded-read fixes, auth-bug fixes). Profile pattern reads "security-focused engineer who recently joined a Red Hat MCP team" |
Focus area: Two unrelated DRA-driver bug fixes landed on the same day (2026-04-15), both indicating real production NVML use:
- PR #1039: `getGpuInfo` was logging `<nil>` instead of the actual NVML return code because the error format string referenced `%w, err` instead of `%v, ret`. Misleading error UX, easy to miss without hitting it.
- PR #1040: when the `DynamicMIG` feature gate is disabled (the default), `unpreparePartiallyPrepairedClaim()` logs "nothing to do" but falls through into DynamicMIG-specific cleanup code, causing a spurious NVML error for static-MIG claims in `PrepareStarted` state. Adds the missing `return nil`.
Both bugs look like things you only hit if you're running this driver against real GPUs at non-trivial scale. The defect-detection pattern (NVML semantics + SSRF / path-traversal / auth elsewhere) reads as a security-focused engineer.
| PR | Contribution |
|---|---|
| #1039 | gpu plugin: fix wrong error variable in getGpuInfo for NVML system calls (unsigned) |
| #1040 | gpu plugin: add missing return in unpreparePartiallyPrepairedClaim when DynamicMIG is disabled (unsigned) |
| # | Contributor | GitHub | Employer | DCO email | NVIDIA org | Merged PRs | Notes |
|---|---|---|---|---|---|---|---|
| 1 | John Belamaric | @johnbelamaric | Google | jbelamaric@google.com | ❌ | 1 | SIG-Architecture co-chair, DRA KEP lead |
| 2 | Antonio Ojea | @aojea | Google | aojea@google.com | ❌ | 1 | SIG-Network chair; KIND/kube-proxy maintainer |
| 3 | Leiyi Zhang | @leiyiz | Google | leiyiz@google.com | ❌ | 1 | shell-quoting fix + distroless follow-up |
| 4 | Suraj Deshmukh | @surajssd | Microsoft (AKS) | surajd.service@gmail.com | ❌ | 1 | README expectations |
| 5 | Jon Huhn | @nojnhuh | Microsoft (AKS) | nojnhuh@users.noreply.github.com | ❌ | 1 | pcieRoot soft-fail |
| 6 | Yue Yu | @yuyue9284 | Microsoft | yuyu3@microsoft.com | ❌ | 1 | ComputeDomain cache existence check |
| 7 | Vitaliy Emporopulo | @empovit | Red Hat (OpenShift AI) | vemporop@redhat.com | ❌ | 5 | OpenShift enablement (longest-running external) |
| 8 | Kevin Hannon | @kannon92 | Red Hat | kehannon@redhat.com | ❌ | 1 | CI supply-chain (action-pin); DCO unsigned |
| 9 | Xiaowu Zhu | @yyzxw | DaoCloud | xiaowu.zhu@daocloud.io | ❌ | 2 | repo-bootstrap chores |
| 10 | Noah Tang | @CoderTH | DaoCloud | coderth@outlook.com | ❌ | 2 | early CI scaffolding; #38 DCO unsigned |
| 11 | Rongfu Leng | @lengrongfu | DaoCloud | lenronfu@gmail.com | ❌ | 1 | env-var rename |
| 12 | Cyclinder Kuo | @cyclinder | DaoCloud | kuocyclinder@gmail.com | ❌ | 1 | README maintenance |
| 13 | Zhongjun Li | @learner0810 | DaoCloud | zhongjun.li@daocloud.io | ❌ | 1 | build fix |
| 14 | Kasia Kujawa | @kasia-kujawa | CAST AI | katarzyna@cast.ai | ❌ | 5 | GKE MPS production bug-fix stream |
| 15 | Robert Northard | @RobertNorthard | AWS (EKS) | robertnorthard@googlemail.com | ❌ | 1 | Helm defaults for managed K8s |
| 16 | Marco Ebert | @Gacko | Giant Swarm (Cluster API / Cologne) | marco_ebert@icloud.com | ❌ | 2 | NetworkPolicy + /opt/bin |
| 17 | Herb Duan | @herb-duan | undisclosed (Beijing) | herbertduan@qq.com | ❌ | 1 | Leader election (largest feature) |
| 18 | Anish Bista | @anishbista60 | KubeRox Technologies (Nepal) — KubeVirt contrib; CNCF Kubestronaut | anishbista053@gmail.com | ❌ | 1 | e2e test for herb-duan's #851 |
| 19 | Kante Yin | @kerthcet | InftyAI / Hiverge (co-founder) | kerthcet@gmail.com | ❌ | 1 | tiny map-cleanup |
| 20 | takonomura | @takonomura | undisclosed; ISUCON14 winner; ICTSC contributor | (noreply) | ❌ | 1 | helm path fix; DCO unsigned |
| 21 | Jia-Wei Jiang | @JiangJiaWei1103 | undisclosed (Taiwan; Flyte committer / KubeRay contrib) | waynechuang97@gmail.com | ❌ | 1 | docs polish |
| 22 | Xingyu "Richard" Guo | @xingyug | likely Red Hat now (ex-Ericsson per email handle) | xingyug.guo.ericsson@gmail.com | ❌ | 2 | NVML correctness fixes; both DCO unsigned |
Totals: 22 confirmed non-NVIDIA contributors · 33 merged PRs (plus 11 closed-unmerged + 1 open) · 35 commits · 30 signed / 5 unsigned.
Reference point: NVIDIA commits in the same window total ~1,445 (about 78% of the 1,853-commit repo, after reclassifying Mathew Wicks's 2 commits from external to NVIDIA per Helios); external work is ~2% of total commits but disproportionately concentrated in production-flavored bug fixes and platform-enablement (OpenShift, GKE, EKS, AKS, Talos/Flatcar, IPv6) rather than feature work.
1. OpenShift / Red Hat is the most sustained external engagement. Vitaliy Emporopulo (@empovit) has been landing PRs from `vemporop@redhat.com` continuously since Feb 2024. The original "DRA driver on OpenShift" install docs in the repo exist because he wrote them, and he is the only external contributor who has shipped PRs across more than two release cycles. Kevin Hannon (@kannon92) adds a second Red Hat surface area (CI supply-chain hardening). If you wanted to point at one external company that runs this driver in production today, it's Red Hat / OpenShift.
2. CAST AI is the most active production user this year. Kasia Kujawa (@kasia-kujawa) has merged 5 PRs in 2026 alone, every one a fix to a real bug they hit on GKE: MPS chroot, MPS shm-dir path, validation messages, nil pointers, cross-compiler. CAST AI has a managed-GPU-sharing offering that sits on top of GKE and they are clearly running this driver against real customer workloads. PR #1009 (retry device enumeration on startup to prevent empty ResourceSlices) is still open and is a classic race-on-startup symptom.
3. DaoCloud is the largest external cluster by headcount (5), but the thinnest per-person engagement. Xiaowu Zhu, Noah Tang, Rongfu Leng, Cyclinder Kuo, and Zhongjun Li all signed off with `@daocloud.io` (or are listed `@DaoCloud` on GitHub). All five together account for only 7 merged PRs, mostly chores/docs/build-fixes from the project's early days. Two of the five (Zhu and Li), like Yue Yu at Microsoft (contributor #6 in the table above), have no GitHub-profile metadata at all; their real names and employers are only recoverable from DCO trailers. This is consistent with a corporate community-engagement quota rather than production usage.
4. Hyperscaler-adjacent helm-defaults work is happening. RobertNorthard (AWS) filed and fixed issue #1047 because the default helm chart node-affinity targets `node-role.kubernetes.io/control-plane`, which never exists on EKS / GKE / AKS. johnbelamaric (Google) added the compute-class toleration. Together these are the closest the project has to "make defaults work on managed Kubernetes" patches, and they have come, one at a time, from engineers at the hyperscalers themselves.
5. The DRA KEP author has personally contributed. John Belamaric (@johnbelamaric), co-author of the DRA KEP-3063 that this entire repo exists to implement and co-chair of SIG-Architecture, has one merged PR (#221, compute-class toleration). That's a useful piece of social proof for the repo: the API designer landed code in the reference driver.
6. One feature PR dominates external impact: herb-duan's #851 (+1,599 / −4 across 14 files), adding leader election to the ComputeDomain controller. This eliminates a controller SPOF in HA deployments. The feature was requested by another non-NVIDIA contributor (`@Lily922`, issue #815) and the follow-up e2e test was contributed by a third (`@anishbista60`, PR #1094). Three external contributors collaborating across feature-request → implementation → test is the most coordinated external work the repo has seen.
7. Identity discovery via DCO trailers: 4 contributors. DCO `Signed-off-by` lines were the only signal that revealed: `yyzxw` is Xiaowu Zhu at DaoCloud (GitHub profile is empty), `yuyue9284` is Yue Yu at Microsoft (GitHub profile is empty), `learner0810` is Zhongjun Li at DaoCloud (GitHub profile is empty), and `lengrongfu`'s real DCO email is `lenronfu@gmail.com` (commit author email is a local hostname). Without DCO, three of these would be "unknown handle at unknown company."
8. Identity discovery via email pattern + repo-graph: 1 contributor. `xingyug.guo.ericsson@gmail.com` has the substring `.ericsson` deliberately embedded in a personal gmail handle. GitHub login `xingyug` lists no company and no LinkedIn, but cross-referencing his other PRs shows the current concentration is on Red Hat–maintained projects (`rhel-lightspeed/linux-mcp-server`, `containers/kubernetes-mcp-server`), not Ericsson properties. Best read: an Ericsson alum who recently joined a Red Hat MCP-server team. The defect-spotting pattern (NVML semantics here + SSRF / path-traversal / auth-bug fixes elsewhere) is consistent with a security-focused engineer. His two unsigned commits account for two of the five unsigned external commits.
9. DCO compliance is weaker on this repo than on nvsentinel. Five external commits are unsigned (one from kannon92, one from takonomura, two from xingyug, one from CoderTH). The repo's branch protection only requires `EasyCLA`; there is no DCO bot blocking unsigned merges. By contrast, the nvsentinel repo's external commits were 100% signed (and the gap that surfaced there was on NVIDIA-internal committers, not externals).
10. No competing-GPU-vendor contributions detected. Unlike nvsentinel (which received a MooreThreads-affiliated style PR), this repo has not received any contributions from MooreThreads, AMD, Intel, Habana, Tenstorrent, or other competing accelerator vendors. The Ericsson signal (from observation 8) is the closest to "third-party hardware vendor" and even there the work is on NVML / DRA semantics, not on cross-vendor abstraction.
11. One pseudonymous contributor is demonstrably world-class: takonomura. He won ISUCON14 (2024), Japan's premier infrastructure-tuning competition, and is a heavy contributor to ICTSC (Japan's inter-college infra contest) and to whywaita/CyberAgent-community tools. His DRA contributions span both this repo and `kubernetes/kubernetes`. He uses `users.noreply.github.com` everywhere; he is the only external contributor we genuinely cannot de-anonymize, but his work and contribution graph make clear he is an experienced platform engineer.
12. Giant Swarm joins the managed-K8s cluster. Marco Ebert (@Gacko) is a Giant Swarm Cluster API engineer with ~2,463 PRs inside the `giantswarm` org. That brings the managed-Kubernetes-vendor count contributing to this repo to five: AWS (Northard), Google (Belamaric / Ojea / Zhang), Microsoft (Deshmukh / Huhn / Yu), Red Hat OpenShift (Emporopulo / Hannon), and Giant Swarm (Ebert), plus the upstream-distribution vendors DaoCloud and CAST AI. Every major Kubernetes commercial distribution model is represented in the external contributor pool except SUSE / Rancher and VMware Tanzu (Broadcom). The contributions are consistently shaped by what each vendor needs to ship the driver to their customers.
13. The IPv6 / driver-580 startup-probe fix was actually NVIDIA-internal. PRs #510 / #511 (compute-domain startup probe on IPv6 with NVIDIA driver 580+) initially appeared to be the only external IPv6 contribution. After the Helios cross-check those PRs are NVIDIA-internal (Mathew Wicks, NVIDIA Enterprise Products), meaning the external IPv6 surface area in this repo is currently zero. Worth flagging because IPv6 / dual-stack support is a real production-readiness gap that no third party has yet pushed on.
14. The contribution shape is "platform-enablement," not "feature work." Of the 33 merged external PRs, the breakdown is roughly:
   - Platform / installer / chart enablement (OpenShift, GKE, EKS, AKS, NetworkPolicy, `/opt/bin`, masters-toleration, control-plane affinity): ~11 PRs
   - Production bug fixes (MPS chroot, NVML error vars, cache existence checks, race on startup): ~9 PRs
   - CI / supply-chain / build / dependabot scaffolding: ~7 PRs
   - Docs: ~3 PRs
   - Features (leader-election, e2e test for it): 2 PRs (one of them 1.6k lines)
   - Style / chores (map cleanup, repeat-label fix, env-var rename): ~3 PRs

   External contributors are exercising this driver against real cloud-vendor distributions and fixing what breaks. NVIDIA continues to own all of the architectural and feature direction.
15. Helios cross-check caught a false positive that the GitHub-org check alone missed. Mathew Wicks (`@thesuperzapper`) initially read as a Kubeflow-lead external contributor running his own consultancy. Helios LDAP revealed he has been an NVIDIA employee since 2025-02-18 (Enterprise Products, Santa Clara HQ, manager-chain ending at Jensen Huang), i.e. he was already on NVIDIA payroll 6 months before his two DRA-driver PRs merged on 2025-08-30. GitHub-org membership is not provisioned (HTTP 404); he contributes via his personal handle with no `@nvidia.com` signoff. Without a Helios pass this would have been miscategorized. This is the analogue of nvsentinel's `jamie-yu0` finding (NVIDIA employee using a university-style external identity) and is a pattern any future external-contributor audit on NVIDIA-adjacent repos should look for.
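
The signed / unsigned split discussed in the DCO observations can be reproduced with a simple trailer check. The sketch below is an assumption about how such a classification might be done, not the tooling this audit actually ran and not the logic of any DCO bot; `signoffRe`, `hasSignoff`, and `tally` are hypothetical names, and the sample commit messages are invented.

```go
package main

import (
	"fmt"
	"regexp"
)

// signoffRe matches a "Signed-off-by: Name <email>" trailer on its own line.
// The pattern is a simplification assumed for illustration.
var signoffRe = regexp.MustCompile(`(?m)^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>\s*$`)

// hasSignoff reports whether a commit message carries a sign-off trailer.
func hasSignoff(message string) bool {
	return signoffRe.MatchString(message)
}

// tally counts signed and unsigned commit messages.
func tally(messages []string) (signed, unsigned int) {
	for _, m := range messages {
		if hasSignoff(m) {
			signed++
		} else {
			unsigned++
		}
	}
	return signed, unsigned
}

func main() {
	// Invented sample messages; real input would come from `git log`.
	msgs := []string{
		"fix: example change\n\nSigned-off-by: A Dev <a.dev@example.com>\n",
		"helm: example change without a trailer\n",
	}
	signed, unsigned := tally(msgs)
	fmt.Printf("signed=%d unsigned=%d\n", signed, unsigned)
}
```

A check like this only confirms that a trailer is present and well-formed; whether the sign-off email matches the commit author, as in the `lengrongfu` case, would need a separate comparison.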