dra-driver-nvidia-gpu — External Contributor Report (2026-05-11)

dra-driver-nvidia-gpu — External Contributor Report

Generated: 2026-05-11 (rev. 2, Helios cross-check added)
Repo: kubernetes-sigs/dra-driver-nvidia-gpu
Repo history: 2022-07-14 → 2026-05-11 (~3.8 years)
Total commits analyzed: 1,853 (47 unique author emails)
Methodology:

  • Extracted all unique commit authors via git log.
  • Classified them by email domain (@nvidia.com = NVIDIA, all others = candidates).
  • Mapped commits to GitHub logins via GET /repos/.../commits/{sha}.
  • Verified every candidate against GET /orgs/NVIDIA/members/{username} (HTTP 204 = confirmed member, 404 = not a member).
  • For ambiguous cases, additionally cross-referenced NVIDIA Helios LDAP (helios-cli user search) to detect NVIDIA employees who contribute via personal GitHub accounts not registered in the NVIDIA org.
  • Cross-referenced GitHub profiles, DCO Signed-off-by trailers, LinkedIn, and corporate-email patterns.
  • Folded NVIDIA-personal-email aliases (e.g. klueska@gmail.com → kklues@nvidia.com, davanum@gmail.com → dsrinivas@nvidia.com, 7723350-elezar@…gitlab.com → elezar@nvidia.com, etc.) back into the NVIDIA cohort.
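
The domain-classification and alias-folding steps of the methodology above can be sketched roughly as follows. This is an illustration only, not the script used to produce the report: `NVIDIA_ALIASES`, `classify_author`, and `is_org_member` are hypothetical names, and the alias table holds just the two complete mappings quoted in the methodology.

```python
# Hypothetical sketch of the author-classification pass (not the report's script).
# The alias table contains only the two complete mappings quoted in the methodology.
NVIDIA_ALIASES = {
    "klueska@gmail.com": "kklues@nvidia.com",
    "davanum@gmail.com": "dsrinivas@nvidia.com",
}


def classify_author(email: str) -> str:
    """Return 'nvidia' or 'candidate' for a commit author email."""
    normalized = email.strip().lower()
    # Fold known personal-email aliases back into the corporate identity.
    normalized = NVIDIA_ALIASES.get(normalized, normalized)
    if normalized.endswith("@nvidia.com"):
        return "nvidia"
    return "candidate"


def is_org_member(status_code: int) -> bool:
    """Interpret the status of GET /orgs/NVIDIA/members/{username}."""
    if status_code == 204:
        return True
    if status_code == 404:
        return False
    # Any other status (redirect, auth failure, rate limit) is left unresolved.
    raise ValueError(f"unexpected status {status_code}")
```

A candidate that resolves to `is_org_member(...) == False` is still only a candidate; as the report notes further down, a directory cross-check may be needed before calling someone external.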


DCO Status ⚠️ — Five external commits unsigned

Of 37 commits from confirmed non-NVIDIA authors, 32 carry a valid Signed-off-by trailer and 5 do not. The repo does not run the standard CNCF DCO bot (only EasyCLA is configured as a branch-protection required check on main / release-25.8), so unsigned commits were not blocked at merge time.

Unsigned external commits (5):

Commit Author PR Title
1eb01b4 coderth <coderth@outlook.com> #38 fix: nvidia-dra-plugin Config
b96e2c4 Kevin Hannon <kehannon@redhat.com> #1016 Pin GitHub Actions to commit SHAs for supply-chain protection
a379b9f takonomura <takonomura@users.noreply.github.com> #1053 helm: fix `maskNvidiaDriverParams` path
ffa7858 Xingyu Guo <xingyug.guo.ericsson@gmail.com> #1039 gpu plugin: fix wrong error variable in getGpuInfo for NVML system calls
81c4422 Xingyu Guo <xingyug.guo.ericsson@gmail.com> #1040 gpu plugin: add missing return in unpreparePartiallyPrepairedClaim when DynamicMIG is disabled
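
A check for Signed-off-by trailers like the one behind the table above might look like the sketch below. It is a simplified assumption of how such a scan could work: `SIGNOFF_RE`, `signoffs`, and `is_signed` are hypothetical names, and the regex scans the whole commit message rather than only the final trailer block, which a stricter DCO check would do.

```python
import re

# Matches a trailer line such as "Signed-off-by: Jane Doe <jane@example.com>".
SIGNOFF_RE = re.compile(
    r"^Signed-off-by:\s*(.+?)\s*<([^<>\s]+@[^<>\s]+)>\s*$",
    re.MULTILINE,
)


def signoffs(message: str) -> list[tuple[str, str]]:
    """Return (name, email) pairs for every Signed-off-by line in a commit message."""
    return SIGNOFF_RE.findall(message)


def is_signed(message: str) -> bool:
    """True when the commit message carries at least one Signed-off-by line."""
    return bool(signoffs(message))
```

Returning the (name, email) pairs, rather than a bare boolean, is what makes the identity corrections in the next section possible: the trailer often carries a real name or corporate address that the commit author field does not.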

Identity corrections from DCO trailers:

  • yyzxw signed off with xiaowu.zhu <xiaowu.zhu@daocloud.io> → real name is Xiaowu Zhu, employer is DaoCloud (GitHub profile has no name, no company, no bio; commit email 1020938856@qq.com is opaque).
  • yuyue9284 signed off with both Yue Yu <yuyu3@microsoft.com> and yuyue9284 <15863499+yuyue9284@users.noreply.github.com> → real name Yue Yu, employer Microsoft (GitHub profile has no name, no company, no bio).
  • lengrongfu signed off with rongfu.leng <lenronfu@gmail.com> (commit email is the local hostname lengrongfu@lengrongfudeMacBook-Pro.local); GitHub profile separately confirms @DaoCloud.
  • cyclinder signed off with Cyclinder Kuo <kuocyclinder@gmail.com> (full real name: Cyclinder Kuo).
  • learner0810 signed off with zhongjun.li <zhongjun.li@daocloud.io> → real name Zhongjun Li, employer DaoCloud (GitHub profile has no name, no company, no bio).

Corrections from NVIDIA org check (false positives removed):

  • visheshtanksale (Vishesh Tanksale, PR #965, commit email vishesh.tanksale09@gmail.com) is an NVIDIA org member (HTTP 204). Removed from the external list. He sits in the OWNERS file as a reviewer.

Corrections from NVIDIA Helios LDAP (additional false positives, not detectable from GitHub org alone):

  • thesuperzapper (Mathew Wicks, PRs #510 / #511, commit email 5735406+thesuperzapper@users.noreply.github.com) is a confirmed NVIDIA employee per Helios LDAP — login mwicks, email mwicks@nvidia.com, NVIDIA hire date 2025-02-18, department Enterprise Products, location Santa Clara HQ, manager chain Joohoon Lee → Ian Buck → Jensen Huang. Both his DRA-driver PRs merged on 2025-08-30 — i.e. 6 months after his NVIDIA hire date. He contributes via his personal GitHub handle (still listing @aranui-solutions as company; "Kubeflow Lead" in bio) and has not joined the NVIDIA GitHub org (HTTP 404), which is why he initially read as external; the Helios cross-check resolved it. Removed from the external list. This is the analogue of the nvsentinel report's jamie-yu0 discovery — an NVIDIA employee whose external GitHub identity initially appears as third-party.

The two corrections above (visheshtanksale, thesuperzapper) reduce the external-contributor count from 24 candidates down to 22. The Helios pass would have caught additional cases if any NVIDIA-employed contributor used a personal email AND a personal GitHub handle AND was not in the NVIDIA GitHub org; thesuperzapper is the only such case found. The single most actionable insight from this report is that the GitHub-org check alone misses NVIDIA employees who do OSS work from personal handles — anyone doing this kind of analysis on NVIDIA-adjacent repos in the future should include a Helios pass.


Confirmed Non-NVIDIA Contributors

22 confirmed external contributors with 33 merged PRs (plus 11 closed-unmerged + 1 open) and 35 external commits total. Profiles below are grouped by employer: the multi-contributor clusters (Google, Microsoft, Red Hat, DaoCloud) come first, followed by individual contributors.

1. John Belamaric — @johnbelamaric

Field Value
Commit email jbelamaric@google.com
DCO Signed-off-by John Belamaric <jbelamaric@google.com>
NVIDIA org check HTTP 404
GitHub company Google
Public email jbelamaric@google.com
LinkedIn linkedin.com/in/johnbelamaric (public, not linked from GitHub)
Employer Google — Principal / Distinguished Software Engineer; SIG-Architecture co-chair; co-driver of the DRA KEP and ResourceClaim API; long-time CoreDNS maintainer; co-author of Learning CoreDNS (O'Reilly); frequent KubeCon speaker
Location US (East Coast / Maryland area, per public bios)

Focus area: Manageability for control-plane scheduling. Added a toleration that lets the kubelet-plugin DaemonSet land on compute-class-tainted nodes (the "compute class" pattern Google's GKE Autopilot uses).

PR Contribution
#221 Add toleration for compute class taint

2. Antonio Ojea — @aojea

Field Value
Commit email aojea@google.com
DCO Signed-off-by Antonio Ojea <aojea@google.com>
NVIDIA org check HTTP 404
GitHub company Google
Public email antonio.ojea.garcia@gmail.com
Twitter @itsuugo
LinkedIn linkedin.com/in/ajojea
Personal site https://kindnet.es (creator/maintainer of KindNet CNI)
Location Spain (remote, Google)
Followers 545
Employer Google (GKE Networking) — Senior Software Engineer; Kubernetes SIG-Network technical lead / approver; KIND maintainer; creator of KindNet; KEP author for many networking features (kube-proxy, IPv6/dual-stack, Gateway API); frequent KubeCon speaker

Focus area: Cross-driver correctness. PR #435 prevents the NVIDIA kubelet-plugin from trying to allocate devices that belong to a different DRA driver — a real-world hazard in any cluster running multiple DRA drivers (e.g. NVIDIA GPU + a network/RDMA DRA driver), which is the configuration GKE is moving toward.

PR Contribution
#435 filter device requests from other drivers
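
The idea behind PR #435 can be illustrated with a small sketch. The actual change lives in the driver's Go code and is not reproduced here; the `AllocationResult` shape, the `DRIVER_NAME` value, and the `own_results` helper are assumptions made purely for illustration.

```python
from dataclasses import dataclass

DRIVER_NAME = "gpu.nvidia.com"  # assumed driver name, for illustration only


@dataclass(frozen=True)
class AllocationResult:
    driver: str  # DRA driver that owns the allocated device
    pool: str
    device: str


def own_results(results, driver_name=DRIVER_NAME):
    """Keep only the allocation results that belong to this driver."""
    return [r for r in results if r.driver == driver_name]
```

In a cluster that runs a GPU driver next to, say, a network DRA driver, a single claim can carry results for both; filtering first means the plugin never tries to prepare a device it does not manage.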

3. Leiyi Zhang — @leiyiz

Field Value
Commit email leiyiz@google.com
DCO Signed-off-by Léiyì Zhang <leiyiz@google.com>
NVIDIA org check HTTP 404
GitHub company Google
Location Seattle, WA
LinkedIn not publicly linked
Employer Google — CSI / GKE storage / AI-on-GKE engineer (contributes heavily to gcp-filestore-csi-driver, gcp-compute-persistent-disk-csi-driver, container-engine-accelerators, ai-on-gke); DRA GPU driver work targets GKE accelerators

Focus area: Image-hardening (driver image). The PR fixed a shell-quoting/interpolation bug where "{$var}" was being emitted as a literal {} wrapped around the filename. A subsequent unmerged PR (#972) attempted to remove the shell dependency from the driver image altogether — consistent with running NVIDIA DRA under a hardened, shell-less base image (a Google/Distroless idiom).

PR Contribution
#968 fix syntax error where "{$var}" results in literal {} wrapped around file name
#972 (closed, unmerged) remove shell dependency from the driver image

Google cluster (3 contributors, 3 merged PRs): Belamaric, Ojea, and Leiyi Zhang. Together they cover the three things Google needs out of this driver: compute-class scheduling (GKE Autopilot), multi-DRA-driver coexistence (GKE multi-resource), and a shell-less hardened base image. None of these contributions are cosmetic; they read like patches that arose from Google running this driver in some kind of integration / pre-prod path.


4. Suraj Deshmukh — @surajssd

Field Value
Commit email surajd.service@gmail.com
DCO Signed-off-by Suraj Deshmukh <surajd.service@gmail.com>
NVIDIA org check HTTP 404
GitHub company @microsoft
Public email surajd.service@gmail.com
Blog https://suraj.io/
Twitter @surajd_
Location Redmond, WA
Public repos 304
LinkedIn linkedin.com/in/surajssd
Bluesky https://bsky.app/profile/suraj.io
Employer Microsoft — Senior Software Engineer (AKS / Azure Linux); ex-Kinvolk (Flatcar Container Linux, acquired by Microsoft 2021); ex-Red Hat OpenShift/Origin; frequent KubeCon speaker and blogger

Focus area: Documentation; the PR clarifies the expected post-install state of the DRA driver in the README.

PR Contribution
#164 README: Update the expectations after installation

5. Jon Huhn — @nojnhuh

Field Value
Commit email nojnhuh@users.noreply.github.com
DCO Signed-off-by Jon Huhn <nojnhuh@users.noreply.github.com>
NVIDIA org check HTTP 404
GitHub company @microsoft
LinkedIn not publicly linked
Employer Microsoft — Software Engineer on the Azure Kubernetes team; cluster-api-provider-azure (CAPZ) maintainer; maintainer of kubernetes-sigs/dra-example-driver (the reference DRA implementation he ported many learnings from into this driver); active across kubernetes/kubernetes, test-infra, klog, enhancements

Focus area: Operational softening — only warn (don't fail) when pcieRoot cannot be determined for a GPU. Relevant on AKS GPU nodes where the PCI topology query may not return a stable bus path under all node images.

PR Contribution
#577 Only warn when pcieRoot can't be determined

6. Yue Yu — @yuyue9284

Field Value
Commit email yuyu3@microsoft.com (in commit author)
DCO Signed-off-by Yue Yu <yuyu3@microsoft.com> + yuyue9284 <15863499+yuyue9284@users.noreply.github.com> ✅ (dual signoff)
NVIDIA org check HTTP 404
GitHub profile No name, no bio, no company set (but a member of the microsoft GitHub org)
LinkedIn not publicly linked
Employer Microsoft — Azure Arc for Kubernetes / Azure ML on Kubernetes engineer; PR distribution: AzureArcForKubernetes/azure-cli-extensions (12), Azure/azureml-examples (7), Azure/AML-Kubernetes (4), microsoft/frameworkcontroller, MicrosoftDocs/azure-ai-docs; also contributes to volcano-sh/volcano (batch scheduler) and NVIDIA/gpu-operator

Note: GitHub login yuyue9284 does not match the commit author email's local part (yuyu3@…); without the DCO trailer this contributor would have read as an unknown handle. The dual signoff (one corporate, one noreply) is unusual — likely a tooling artifact from a corporate review pipeline.

Focus area: Stale-cache fix in the ComputeDomain controller — guards against operating on a ComputeDomain that has been deleted but is still present in the informer cache. Production-grade hardening, not a drive-by.

PR Contribution
#805 fix: check existence of ComputeDomain in cache before processing updates
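
A guard of the kind PR #805 describes can be sketched as follows. This is a generic illustration under stated assumptions, not the controller's Go code: the cache is modeled as a plain dict, and `should_process`, `handle_update`, and the `deletionTimestamp` key are hypothetical stand-ins.

```python
def should_process(cache: dict, key: str) -> bool:
    """Decide whether an update for `key` should still be handled."""
    obj = cache.get(key)
    if obj is None:
        # Object is gone from the cache: the update is stale.
        return False
    if obj.get("deletionTimestamp") is not None:
        # Object is still cached but already marked for deletion (assumed field).
        return False
    return True


def handle_update(cache: dict, key: str, process) -> bool:
    """Run `process` on the cached object only if it is still live.

    Returns True when `process` ran, False when the update was skipped.
    """
    if not should_process(cache, key):
        return False
    process(cache[key])
    return True
```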

Microsoft cluster (3 contributors, 3 merged PRs): Deshmukh, Huhn, and Yue Yu. Yue Yu's ComputeDomain cache fix is a non-trivial controller-runtime correctness improvement. Huhn's pcieRoot patch is AKS-flavored. Together they imply AKS is exercising this driver end-to-end, not just shipping it.


7. Vitaliy Emporopulo — @empovit

Field Value
Commit email vemporop@redhat.com
DCO Signed-off-by Vitaliy Emporopulo <vemporop@redhat.com>
NVIDIA org check HTTP 404
GitHub company Red Hat
Public email vemporop@redhat.com
Profile name Vitaly E.
Location Israel
Public repos 64
LinkedIn not publicly linked
Employer Red Hat — Software Engineer focused on the NVIDIA / GPU ecosystem on OpenShift. Primary contributor to rh-ecosystem-edge/nvidia-ci, rh-ecosystem-edge/console-plugin-nvidia-gpu, openshift/instaslice-operator (MIG slice orchestration), plus openshift/release CI plumbing

Focus area: OpenShift integration. Five merged PRs over ~25 months (Feb 2024 → Mar 2026), covering: privileged kubelet-plugin on OpenShift; OpenShift SCC bindings for service accounts (so the IMEX daemon can write its nodes_config.cfg under a random UID); the nvidia-container-toolkit path override for non-default RHEL installs; the node-role.kubernetes.io/master toleration for the controller; and the original OpenShift install docs in the README. This is the most sustained external contribution in the repo's history; the OpenShift install instructions in docs/ exist because Emporopulo wrote them.

PR Contribution
#72 Allow custom NVIDIA CTK path
#76 Let kubelet plugin run privileged on OpenShift
#82 Document DRA driver installation on OpenShift
#569 Add SCC to service accounts on OpenShift (fixes IMEX writeNodesConfig perm-denied under random UID)
#899 Add node-role.kubernetes.io/master toleration to controller

8. Kevin Hannon — @kannon92

Field Value
Commit email kehannon@redhat.com
DCO Signed-off-by (missing — commit unsigned) ⚠️
NVIDIA org check HTTP 404
GitHub company Red Hat
Bio "wg-batch lead. Kueue reviewer. JobSet Maintainer. Excited to make Kubernetes the platform of choice for AI/ML/HPC."
Location Cleveland, Ohio
LinkedIn not publicly linked from GitHub
Followers 98
Employer Red Hat — Principal Software Engineer on OpenShift / Kubernetes batch & scheduling; WG-Batch Lead (Kubernetes); Kueue reviewer; JobSet maintainer; frequent KubeCon presenter on batch/AI workloads (Kueue, JobSet)

Focus area: Supply-chain hardening of the repo's own CI. The change pins every GitHub Action used in workflows to an immutable commit SHA (instead of a movable tag), explicitly motivated by the March 2025 tj-actions/changed-files and reviewdog/action-setup compromises. Fixes upstream issue #1015.

PR Contribution
#1016 Pin GitHub Actions to commit SHAs for supply-chain protection (fixes #1015)
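
A quick way to audit a workflow file for the pinning practice described above is sketched below. It is a minimal text-level check, assuming that a reference counts as pinned when the part after "@" is a full 40-character hex commit SHA; `unpinned_actions` and the regexes are hypothetical helpers, not part of the PR.

```python
import re

# Captures the reference on a `uses:` line, with or without a leading list dash.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)", re.MULTILINE)
# A full-length commit SHA: exactly 40 lowercase hex characters.
SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return the `uses:` references that are not pinned to a commit SHA."""
    unpinned = []
    for ref in USES_RE.findall(workflow_text):
        ref = ref.strip("'\"")
        if ref.startswith("./") or ref.startswith("docker://"):
            continue  # local actions and Docker images are out of scope here
        _, _, version = ref.partition("@")
        if not SHA_RE.match(version):
            unpinned.append(ref)
    return unpinned
```

A tag such as `v4` can be moved to point at different code after the fact, whereas a commit SHA cannot, which is why the check treats anything other than a full SHA as unpinned.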

Red Hat cluster (2 contributors, 6 merged PRs): Emporopulo (5 PRs, all OpenShift enablement) and Hannon (1 PR, CI supply-chain hardening). Empovit is the longest-running external contributor in the repo; Hannon's contribution is the single most consequential security PR an external has landed here.


9. Xiaowu Zhu — @yyzxw

Field Value
Real name Xiaowu Zhu (朱晓武) — discovered via DCO trailer (xiaowu.zhu@daocloud.io); GitHub profile has no name, company, bio, or blog
Commit email 1020938856@qq.com
DCO Signed-off-by xiaowu.zhu <xiaowu.zhu@daocloud.io>
NVIDIA org check HTTP 404
GitHub 77 public repos, created 2017-12; goes by handle "zxw"
LinkedIn not publicly linked
Employer DaoCloud — software engineer focused on AI/LLM serving infrastructure. PR concentration: DaoCloud/dce-charts-repackage, BaizeAI/modelhub, BaizeAI/dataset (DaoCloud's Baize AI platform), llm-d/llm-d-kv-cache, and vllm-project/vllm. Also active on DaoCloud's Higress plugin server

Focus area: Repo bootstrap chores during the first months after the project moved to its current form — dependabot setup, label dedupe, PR templates, makefile helpers, ci scaffolding. Only 2 of 6 PRs merged; the other 4 were closed unmerged (likely overlapped with NVIDIA's own scaffolding direction).

PR Contribution
#45 feat: add dependabot
#59 fix: remove repeat label key
#42, #43, #44, #64 (all closed-unmerged) makefile help target, PR templates, PR-title check, switch-to-map refactor

10. Noah Tang — @CoderTH

Field Value
Commit email coderth@outlook.com
DCO Signed-off-by coderth <coderth@outlook.com> ✅ on #39; missing ⚠️ on #38 (commit 1eb01b4)
NVIDIA org check HTTP 404
GitHub company @DaoCloud
Real name Noah Tang
Bio "🚀 Cloud Native Developer"
Location Chengdu, China
Blog https://coderth.onrender.com/ (Hexo blog, handle "CoderTh")
LinkedIn not publicly linked
Employer DaoCloud — Cloud Native engineer. Works on DaoCloud's dce-charts-repackage, the matrixhub-ai/matrixhub AI model hub, NVIDIA vGPU / k8s-vgpu-scheduler integrations, and contributes to NVIDIA Dynamo and the Harbor operator

Focus area: Early CI/config scaffolding. PR #39 first set up the GitHub Actions CI step that runs golangci-lint, tests, and the build; PR #38 fixed the nvidia-dra-plugin static config.

PR Contribution
#38 fix: nvidia-dra-plugin Config (unsigned commit)
#39 feat: add github ci step
#40 (closed-unmerged) feat: add issue template

11. Rongfu Leng — @lengrongfu

Field Value
Commit author email lengrongfu@lengrongfudeMacBook-Pro.local (local laptop hostname — not deliverable)
DCO Signed-off-by rongfu.leng <lenronfu@gmail.com>
Profile email 1275177125@qq.com
NVIDIA org check HTTP 404
GitHub company @DaoCloud
Real name rongfu.leng
Location Chengdu, Sichuan, China
Blog https://lengrongfu.github.io/
Public repos 309
LinkedIn not publicly linked
Employer DaoCloud — Senior engineer; heavy contributor in the LLM-serving ecosystem (vllm-project/vllm, vllm-omni, vllm-project/router, sglang-project/sglang) plus DaoCloud's enterprise charts. Chinese name 冷荣富

Focus area: Environment-variable rename to align with the broader nvidia-container-toolkit naming convention (CONTAINER_DRIVER_ROOT → DRIVER_ROOT_CTR_PATH). Closed-unmerged follow-up #212 attempted NVIDIA_CTK_PATH → NVIDIA_CDI_HOOK_PATH for the same alignment reason.

PR Contribution
#211 replace CONTAINER_DRIVER_ROOT with DRIVER_ROOT_CTR_PATH
#212 (closed-unmerged) Use NVIDIA_CDI_HOOK_PATH instead of NVIDIA_CTK_PATH

12. Cyclinder Kuo — @cyclinder

Field Value
Commit email qifeng.guo@daocloud.io
DCO Signed-off-by Cyclinder Kuo <kuocyclinder@gmail.com>
NVIDIA org check HTTP 404
GitHub company @DaoCloud
Real name Cyclinder Kuo (Qifeng Guo — Chinese name 郭起峰, per commit email handle)
Bio "everything we went through was just a waste of a time"
Location Chengdu, China
Twitter @Cyclinder_Kuo
LinkedIn not publicly linked
Employer DaoCloud — networking engineer; maintainer / major contributor to spidernet-io/spiderpool (CNCF sandbox Kubernetes IPAM/CNI) and the vlan-cni / iaas-network-provider projects

Focus area: Doc maintenance — adjust README to follow script/namespace renames done elsewhere in the repo.

PR Contribution
#280 README: adjust to script/namespace renames

13. Zhongjun Li — @learner0810

Field Value
Commit email zhongjun.li@daocloud.io
DCO Signed-off-by zhongjun.li <zhongjun.li@daocloud.io>
NVIDIA org check HTTP 404
GitHub profile No name, no bio, no company set; 58 public repos
LinkedIn not publicly linked
Employer DaoCloud — Software engineer on inference / storage. PR distribution: DaoCloud/dce-charts-repackage (9), kubernetes-sigs/gateway-api-inference-extension (7), llm-d/llm-d-inference-scheduler (4), hwameistor/hwameistor (3, CNCF sandbox local-volume storage). Keeps a low public profile — identity recoverable only from DCO trailer

Focus area: Build fix. Closed-unmerged follow-up #66 proposed adding gofumpt to the code checks.

PR Contribution
#67 fix build binaries
#66 (closed-unmerged) Add gofumpt code checks

DaoCloud cluster (5 contributors, 7 merged PRs): Zhu, Tang, Leng, Kuo, Li. Tang, Leng, and Kuo list @DaoCloud as their GitHub company and are based in Chengdu, while Zhu and Li are tied to DaoCloud only through the daocloud.io addresses in their DCO trailers; neither has any GitHub-profile metadata. This is the single largest external-company cluster on the project. The work is heavy on chores, scaffolding, and docs and lighter on architectural change; the contribution shape suggests early-days community engagement rather than a deep production-deployment-driven workstream.


14. Kasia Kujawa — @kasia-kujawa

Field Value
Commit email katarzyna@cast.ai
DCO Signed-off-by Katarzyna Kujawa <katarzyna@cast.ai>
NVIDIA org check HTTP 404
GitHub company CAST AI
Real name Katarzyna ("Kasia") Kujawa
Location Gdańsk, Poland
LinkedIn not publicly linked from GitHub (multiple Katarzyna Kujawa entries in Gdańsk on LinkedIn; the CAST AI–affiliated one is the match)
Employer CAST AI — Software engineer on the cost-optimization / autoscaling product surface (castai/k8s-agent, castai/terraform-provider-castai, castai/helm-charts). CAST AI is a Kubernetes autoscaling / cost-management platform spanning GKE / EKS / AKS

Focus area: Production bug-fix stream from CAST AI's GPU offering. Five merged PRs in early 2026:

  • PR #889: fixes the mps-control-daemon chroot 'sh': No such file or directory error when nvidiaDriverRoot is /home/kubernetes/bin/nvidia/ (the GKE-specific driver-install path). CAST AI runs MPS GPU-sharing on GKE.
  • PR #978: corrects the MPS shm-dir mount path so MPS clients can actually talk to the control daemon.
  • PR #979: fixes the cross-compiler used for the static-bash build (related to bash-static work the NVIDIA team did for distroless images).
  • PR #996: actionable validation errors when GPU sharing config is malformed (improves error UX for end-users).
  • PR #997: nil-pointer fix when featureGates: null in the Helm chart values.

PR Contribution
#889 Fix mps-control-daemon chroot shell execution error when nvidiaDriverRoot is set (GKE)
#978 Set proper path for MPS shm dir mount
#979 Use correct cross-compiler for bash static build
#996 Improve validation errors for GPU sharing with actionable messages and supported values
#997 Fix nil pointer when featureGates is set to null in values
#1009 (open) Retry device enumeration on startup to prevent empty ResourceSlices
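
The class of bug behind PR #997, an explicit null where a map is expected, can be illustrated with a short sketch. It is an analogy in Python rather than the chart or driver code: `feature_gate_args` and the flag format are assumptions, and the only name taken from the report is the DynamicMIG gate.

```python
def feature_gate_args(values: dict) -> list[str]:
    """Render feature-gate flags, treating a null or missing map as empty."""
    # values.get("featureGates", {}) would still return None for an explicit
    # null, so fall back with `or` instead of relying on the default.
    gates = values.get("featureGates") or {}
    return [
        f"{name}={str(enabled).lower()}"
        for name, enabled in sorted(gates.items())
    ]
```

The distinction matters because a YAML `featureGates: null` is a key that is present with an empty value, not a missing key, so a default that only covers the missing case is bypassed.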

Note: This is the most active and the most production-driven external contributor in the repo's recent history. Every PR is a fix for something CAST AI hit in production.


15. Robert Northard — @RobertNorthard

Field Value
Commit email robertnorthard@googlemail.com
DCO Signed-off-by Rob <robertnorthard@googlemail.com>
NVIDIA org check HTTP 404
GitHub company AWS
Public email robertnorthard@googlemail.com
Profile name Rob
Public repos 142
LinkedIn not publicly linked from GitHub (public Robert Northard SA-at-AWS profile exists on LinkedIn)
Employer Amazon Web Services (AWS) — Specialist Solutions Architect / engineer. Heavy contributor to aws/karpenter-provider-aws, aws/eks-anywhere, aws-samples/karpenter-blueprints, aws-ia EKS Blueprints, and aws-eks-best-practices. Likely EMEA-based given his community presence in EU Karpenter / EKS workshops

Focus area: Helm-chart defaults for managed-Kubernetes (EKS-flavored) deployments. The default node affinity assumed a node-role.kubernetes.io/control-plane node existed — which is never true on managed offerings (EKS / GKE / AKS hide the control plane). The PR also adds default GPU tolerations so the controller pods can land on tainted GPU nodes. Filed as issue #1047 and fixed by his own PR #1054.

PR Contribution
#1054 Changed default node affinity and GPU tolerations for DRA controller and kubelet plugin helm chart values (fixes #1047)

16. Marco Ebert — @Gacko

Field Value
Commit email marco_ebert@icloud.com
DCO Signed-off-by Marco Ebert <marco_ebert@icloud.com>
NVIDIA org check HTTP 404
GitHub 97 followers, but only 1 public personal repo (marctl) — the 97-follower count tracks his commit volume inside the giantswarm GitHub org rather than personal projects
Real name Marco Ebert
Location Germany (Giant Swarm is headquartered in Cologne; not set on profile)
LinkedIn not publicly linked from GitHub (public "Marco Ebert at Giant Swarm" profile exists on LinkedIn)
Employer Giant Swarm (Cologne-based managed-Kubernetes vendor; member of the giantswarm org with ~2,463 internal PRs there). Platform / cluster engineer working on Cluster API tooling — cluster-api-app, cluster-test-suites, cluster-standup-teardown, clustertest, releases, cluster-vsphere. Account from 2011

Focus area: Two helm/runtime PRs in late 2025. PR #708 adds NetworkPolicy resources to the chart (egress/ingress for the controller and daemons) — meaningful in restricted clusters that block all pod-to-pod traffic by default. PR #706 adds /opt/bin to the kubelet-plugin's binary search path (Talos / Flatcar / GKE-COS-style read-only-root distros put their nvidia binaries under /opt/bin).

PR Contribution
#706 kubelet plugins: add /opt/bin to binary search paths
#708 chart: add network policies

17. Herb Duan — @herb-duan

Field Value
Commit email herbertduan@qq.com
DCO Signed-off-by Herb Duan <herbertduan@qq.com>
NVIDIA org check HTTP 404
GitHub No bio, no company, no blog; 38 public repos; 28 followers
Real name Herb Duan
Location Beijing, China
LinkedIn not publicly linked
Employer Not disclosed — no company on GitHub bio, no blog, no social accounts. Total public footprint is only 5 PRs across 3 repos: kubernetes/kubernetes (3, kubelet resource-claim status handling), this repo (1, the leader-election feature below), and BenchCouncil/BigDataBench (1 — a Chinese academic/research benchmark, suggesting a possible prior ICT/CAS-style academic background; unverified)

Focus area: Largest external feature contribution in the repo. PR #851 adds leader election to the ComputeDomain controller so it can run multi-replica without conflicts — eliminating the previous SPOF on the controller. +1,599 / −4 lines across 14 files. Fixes issue #815 (filed by another non-NVIDIA user, Lily922). The follow-up e2e test for this feature was contributed by Anish Bista (PR #1094, below).

PR Contribution
#851 feat(controller): Add leader election for high availability (+1599/-4, fixes #815)

18. Anish Bista — @anishbista60

Field Value
Commit email anishbista053@gmail.com
DCO Signed-off-by anish bista <anishbista053@gmail.com> and anishbista60 <anishbista053@gmail.com>
NVIDIA org check HTTP 404
GitHub company KubeRox Technologies
Bio "Always be humble to everyone"
Location Nepal
Twitter @anishbista053
LinkedIn linkedin.com/in/anishbista
Personal site https://anishbista60.github.io/personal-website/ ; blog at https://medium.com/@anishbista
Public repos 60
Employer KubeRox Technologies (Nepal) — Kubernetes engineer. Self-describes as the "youngest CNCF Kubestronaut from Nepal" (holds CKA, CKAD, CKS, KCNA, KCSA). Active KubeVirt contributor and maintainer of the kubevirtbmc project; also contributes to Kanister

Focus area: End-to-end test coverage for the leader-election feature herb-duan landed in #851. Filed against follow-up issue #970 (jgehrcke). This is the only example in the repo of two external contributors collaborating across a feature: one shipped the feature, the other shipped the test.

PR Contribution
#1094 tests/bats: add leader election e2e test for compute-domain-controller

19. Kante Yin — @kerthcet

Field Value
Commit email kerthcet@gmail.com
DCO Signed-off-by kerthcet <kerthcet@gmail.com>
NVIDIA org check HTTP 404
GitHub No company set; bio "Building AI Infrastructure @InftyAI @hiverge"
Real name Kante Yin
Location Cambridge, UK
Blog https://ky.dev/
Twitter @kerthcet
Followers 212
LinkedIn not publicly linked
Employer InftyAI (non-profit AI-infra org he co-founded) and Hiverge. Kubernetes SIG-Scheduling reviewer/approver; historical Kueue maintainer; now focused on InftyAI's open-source LLM-infra projects (PUMA, alphatrion, llmaz). Chinese name 银坎特; frequent KubeCon China speaker

Focus area: Tiny correctness PR — drop empty values from a map (i.e. the make rev helper script wasn't removing now-empty entries after key deletions). Kerthcet is best known elsewhere in the K8s ecosystem (Kueue, llmaz) — this is a drive-by from an early Kueue + DRA crossover phase.

PR Contribution
#48 Remove item when values are empty in map

20. takonomura — @takonomura

Field Value
Commit email takonomura@users.noreply.github.com
DCO Signed-off-by (missing — commit unsigned) ⚠️
NVIDIA org check HTTP 404
GitHub No name, no bio, no company; 52 public repos; 46 followers
Location Japan
LinkedIn not publicly linked (handle is consistently pseudonymous across all repos, all commits use users.noreply.github.com)
Employer Not publicly disclosed — deliberately pseudonymous. 228 PRs across many orgs. Strong circumstantial signal of association with the Japanese university IT-contest scene: heavy contributor to ictsc/ictsc-regalia, ictsc/ictsc-k8s-infra, and ictsc/ictsc-regalia-release (ICTSC = ICT Service Contest, a Japanese inter-college infra competition). Also contributes to whywaita/myshoes and whywaita/shoes-lxd-multi (whywaita is "Tachibana Waita" at CyberAgent), suggesting peer-group ties to the CyberAgent / Japanese-cloud community. Won ISUCON14 in 2024 (his isucon14 repo description claims "優勝" = winner). Expertise: DRA (this repo + kubernetes/kubernetes), CUE, netavark, LXD, GitHub Actions self-hosted runners

Focus area: Helm chart bug — maskNvidiaDriverParams was being rendered at the wrong YAML path so the feature toggle didn't actually take effect. The first PR to the repo from this contributor (welcomed by k8s-ci-robot).

PR Contribution
#1053 helm: fix maskNvidiaDriverParams path (unsigned commit)

21. Jia-Wei Jiang — @JiangJiaWei1103

Field Value
Commit email waynechuang97@gmail.com
DCO Signed-off-by JiangJiaWei1103 <waynechuang97@gmail.com>
NVIDIA org check HTTP 404
GitHub No company set; bio "Never sell out · De-noobing · @ray-project Contributor · @flyteorg Committer"
Real name Jia-Wei Jiang (江家瑋)
Location Taiwan
Kaggle https://www.kaggle.com/abaojiang
LinkedIn linkedin.com/in/jiawei-jiang-mr-denoober
Employer Not publicly stated. Open-source committer relationships: Flyte committer; Ray (KubeRay) contributor — recent activity is mostly large multi-part PRs to ray-project/kuberay's History Server beta. ML/data-science background (active on Kaggle)

Focus area: Documentation polish — clarify an error message in an input-validation path.

PR Contribution
#444 docs: Clarify err msg for input validation

22. Xingyu "Richard" Guo — @xingyug

Field Value
Commit email xingyug.guo.ericsson@gmail.com (the .ericsson infix in a personal gmail handle is unusual — see Notable Observations)
DCO Signed-off-by (missing — both commits unsigned) ⚠️
NVIDIA org check HTTP 404
GitHub profile name Richard Guo
GitHub 14 public repos; only 1 follower; account created 2022-10-24 (recent)
LinkedIn not publicly linked
Employer Now likely Red Hat (with a prior Ericsson stint). The .ericsson token in the gmail handle is consistent with his earlier corporate identity, but the current PR concentration is on Red Hat–maintained projects: rhel-lightspeed/linux-mcp-server and containers/kubernetes-mcp-server (the RHEL Lightspeed / containers MCP-server work). Additional PRs to vllm, BerriAI/litellm, jumpserver/jumpserver — all security/correctness focused (SSRF protections, path-traversal hardening, unbounded-read fixes, auth-bug fixes). Profile pattern reads "security-focused engineer who recently joined a Red Hat MCP team"

Focus area: Two unrelated DRA-driver bug fixes landed on the same day (2026-04-15), both indicating real production NVML use:

  • PR #1039: getGpuInfo was logging <nil> instead of the actual NVML return code because the error format string referenced %w, err instead of %v, ret. Misleading error UX, easy to miss without hitting it.
  • PR #1040: when the DynamicMIG feature gate is disabled (the default), unpreparePartiallyPrepairedClaim() logs "nothing to do" but falls through into DynamicMIG-specific cleanup code, causing a spurious NVML error for static-MIG claims in PrepareStarted state. Adds the missing return nil.

Both bugs look like things you only hit if you're running this driver against real GPUs at non-trivial scale. The defect-detection pattern (NVML semantics + SSRF / path-traversal / auth elsewhere) reads as a security-focused engineer.

PR Contribution
#1039 gpu plugin: fix wrong error variable in getGpuInfo for NVML system calls (unsigned)
#1040 gpu plugin: add missing return in unpreparePartiallyPrepairedClaim when DynamicMIG is disabled (unsigned)

Summary Table

| # | Contributor | GitHub | Employer | DCO email | NVIDIA org | Merged PRs | Notes |
|---|---|---|---|---|---|---|---|
| 1 | John Belamaric | @johnbelamaric | Google | jbelamaric@google.com | | 1 | SIG-Architecture co-chair, DRA KEP lead |
| 2 | Antonio Ojea | @aojea | Google | aojea@google.com | | 1 | SIG-Network chair; KIND/kube-proxy maintainer |
| 3 | Leiyi Zhang | @leiyiz | Google | leiyiz@google.com | | 1 | shell-quoting fix + distroless follow-up |
| 4 | Suraj Deshmukh | @surajssd | Microsoft (AKS) | surajd.service@gmail.com | | 1 | README expectations |
| 5 | Jon Huhn | @nojnhuh | Microsoft (AKS) | nojnhuh@users.noreply.github.com | | 1 | pcieRoot soft-fail |
| 6 | Yue Yu | @yuyue9284 | Microsoft | yuyu3@microsoft.com | | 1 | ComputeDomain cache existence check |
| 7 | Vitaliy Emporopulo | @empovit | Red Hat (OpenShift AI) | vemporop@redhat.com | | 5 | OpenShift enablement (longest-running external) |
| 8 | Kevin Hannon | @kannon92 | Red Hat | kehannon@redhat.com ⚠️ | | 1 | CI supply-chain (action-pin); DCO unsigned |
| 9 | Xiaowu Zhu | @yyzxw | DaoCloud | xiaowu.zhu@daocloud.io | | 2 | repo-bootstrap chores |
| 10 | Noah Tang | @CoderTH | DaoCloud | coderth@outlook.com ⚠️ | | 2 | early CI scaffolding; #38 DCO unsigned |
| 11 | Rongfu Leng | @lengrongfu | DaoCloud | lenronfu@gmail.com | | 1 | env-var rename |
| 12 | Cyclinder Kuo | @cyclinder | DaoCloud | kuocyclinder@gmail.com | | 1 | README maintenance |
| 13 | Zhongjun Li | @learner0810 | DaoCloud | zhongjun.li@daocloud.io | | 1 | build fix |
| 14 | Kasia Kujawa | @kasia-kujawa | CAST AI | katarzyna@cast.ai | | 5 | GKE MPS production bug-fix stream |
| 15 | Robert Northard | @RobertNorthard | AWS (EKS) | robertnorthard@googlemail.com | | 1 | Helm defaults for managed K8s |
| 16 | Marco Ebert | @Gacko | Giant Swarm (Cluster API / Cologne) | marco_ebert@icloud.com | | 2 | NetworkPolicy + /opt/bin |
| 17 | Herb Duan | @herb-duan | undisclosed (Beijing) | herbertduan@qq.com | | 1 | Leader election (largest feature) |
| 18 | Anish Bista | @anishbista60 | KubeRox Technologies (Nepal); KubeVirt contrib; CNCF Kubestronaut | anishbista053@gmail.com | | 1 | e2e test for herb-duan's #851 |
| 19 | Kante Yin | @kerthcet | InftyAI / Hiverge (co-founder) | kerthcet@gmail.com | | 1 | tiny map-cleanup |
| 20 | takonomura | @takonomura | undisclosed; ISUCON14 winner; ICTSC contributor | (noreply) ⚠️ | | 1 | helm path fix; DCO unsigned |
| 21 | Jia-Wei Jiang | @JiangJiaWei1103 | undisclosed (Taiwan; Flyte committer / KubeRay contrib) | waynechuang97@gmail.com | | 1 | docs polish |
| 22 | Xingyu "Richard" Guo | @xingyug | likely Red Hat now (ex-Ericsson per email handle) | xingyug.guo.ericsson@gmail.com ⚠️ | | 2 | NVML correctness fixes; both DCO unsigned |

Totals: 22 confirmed non-NVIDIA contributors · 33 merged PRs (plus 11 closed-unmerged + 1 open) · 35 commits · 30 signed / 5 unsigned.

Reference point: NVIDIA commits in the same window total ~1,445 (about 78% of the 1,853-commit repo, after reclassifying Mathew Wicks's 2 commits from external to NVIDIA per Helios). External work is ~2% of total commits, but it is disproportionately concentrated in production-flavored bug fixes and platform enablement (OpenShift, GKE, EKS, AKS, Talos/Flatcar) rather than feature work.
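A signed/unsigned split like the one in the totals can be reproduced by scanning each commit message for a `Signed-off-by:` trailer. The Go sketch below is a simplified illustration, not the tooling used for this report: the `Commit` type, the function names, and the sample commits are hypothetical, and the check only looks for a line shaped like `Signed-off-by: Name <email>` without verifying that the email matches the commit author.

```go
package main

import (
	"fmt"
	"strings"
)

// Commit is a hypothetical, minimal commit record for this sketch.
type Commit struct {
	SHA     string
	Message string
}

// hasSignoff reports whether any line of the commit message looks like
// "Signed-off-by: Name <email>". It does not compare the email to the author.
func hasSignoff(message string) bool {
	for _, line := range strings.Split(message, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "Signed-off-by:") {
			continue
		}
		value := strings.TrimSpace(strings.TrimPrefix(line, "Signed-off-by:"))
		open := strings.LastIndex(value, "<")
		// Require a non-empty name, then an <...> part that contains an "@".
		if open > 0 && strings.HasSuffix(value, ">") && strings.Contains(value[open:], "@") {
			return true
		}
	}
	return false
}

// tallySignoffs splits commits into signed and unsigned SHAs.
func tallySignoffs(commits []Commit) (signed, unsigned []string) {
	for _, c := range commits {
		if hasSignoff(c.Message) {
			signed = append(signed, c.SHA)
		} else {
			unsigned = append(unsigned, c.SHA)
		}
	}
	return signed, unsigned
}

func main() {
	// Invented sample data; these are not commits from the audited repo.
	commits := []Commit{
		{SHA: "aaaaaaa", Message: "fix: example change\n\nSigned-off-by: Jane Doe <jane@example.com>"},
		{SHA: "bbbbbbb", Message: "docs: example change without a trailer"},
	}
	signed, unsigned := tallySignoffs(commits)
	fmt.Printf("signed=%d unsigned=%d unsigned SHAs=%v\n", len(signed), len(unsigned), unsigned)
}
```

In practice the messages would come from `git log`; a stricter audit would also compare the trailer email against the commit author, which is how the DCO trailers in this report doubled as an identity signal.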


Notable Observations

  1. OpenShift / Red Hat is the most sustained external engagement. Vitaliy Emporopulo (@empovit) has been landing PRs from vemporop@redhat.com continuously since Feb 2024 — the original "DRA driver on OpenShift" install docs in the repo exist because he wrote them, and he is the only external contributor who has shipped PRs across more than two release cycles. Kevin Hannon (@kannon92) adds a second Red Hat surface area (CI supply-chain hardening). If you wanted to point at one external company that runs this driver in production today, it's Red Hat / OpenShift.

  2. CAST AI is the most active production user this year. Kasia Kujawa (@kasia-kujawa) has merged 5 PRs in 2026 alone, every one a fix to a real bug they hit on GKE — MPS chroot, MPS shm-dir path, validation messages, nil pointers, cross-compiler. CAST AI has a managed-GPU-sharing offering that sits on top of GKE and they are clearly running this driver against real customer workloads. PR #1009 (retry device enumeration on startup to prevent empty ResourceSlices) is still open and is a classic race-on-startup symptom.

  3. DaoCloud is the largest headcount external cluster (5), but the thinnest per-person engagement. Xiaowu Zhu, Noah Tang, Rongfu Leng, Cyclinder Kuo, and Zhongjun Li all signed off with @daocloud.io (or are listed @DaoCloud on GitHub). All five together account for only 7 merged PRs, mostly chores/docs/build-fixes from the project's early days. Two of the five (Zhu and Li), like Yue Yu at Microsoft (#6), have no GitHub-profile metadata at all; their real names and employers are only recoverable from DCO trailers. This is consistent with a corporate community-engagement quota rather than production usage.

  4. Hyperscaler-adjacent Helm-defaults work is happening. RobertNorthard (AWS) filed and fixed issue #1047 because the default Helm chart node-affinity targets node-role.kubernetes.io/control-plane, which never exists on EKS / GKE / AKS. johnbelamaric (Google) added the compute-class toleration. Together these are the closest the project has to "make defaults work on managed Kubernetes" patches, and so far they have come, one at a time, from two of the three hyperscalers (AWS and Google).

  5. The DRA KEP author has personally contributed. John Belamaric (@johnbelamaric) — co-author of the DRA KEP-3063 that this entire repo exists to implement, and co-chair of SIG-Architecture — has one merged PR (#221, compute-class toleration). That's a useful piece of social proof for the repo: the API designer landed code in the reference driver.

  6. One feature PR dominates external impact: herb-duan's #851 (+1,599 / −4 across 14 files), adding leader election to the ComputeDomain controller. This eliminates a controller SPOF in HA deployments. The feature was requested by another non-NVIDIA contributor (@Lily922, issue #815) and the follow-up e2e test was contributed by another non-NVIDIA contributor (@anishbista60, PR #1094). Three external contributors collaborating across feature-request → implementation → test is the most coordinated external work the repo has seen.

  7. Identity discovery via DCO trailers — 4 contributors. DCO Signed-off-by lines were the only signal that revealed:

    • yyzxw is Xiaowu Zhu at DaoCloud (GitHub profile is empty),
    • yuyue9284 is Yue Yu at Microsoft (GitHub profile is empty),
    • learner0810 is Zhongjun Li at DaoCloud (GitHub profile is empty),
    • lengrongfu's real DCO email is lenronfu@gmail.com (commit author email is a local hostname).

    Without DCO, three of these would be "unknown handle at unknown company."

  8. Identity discovery via email pattern + repo-graph (1 contributor). xingyug.guo.ericsson@gmail.com has the substring .ericsson deliberately embedded in a personal gmail handle. GitHub login xingyug lists a profile name (Richard Guo) but no company and no LinkedIn surface. Cross-referencing his other PRs shows the current concentration is on Red Hat-maintained projects (rhel-lightspeed/linux-mcp-server, containers/kubernetes-mcp-server), not Ericsson properties. Best read: an Ericsson alum who recently joined a Red Hat MCP-server team. The defect-spotting pattern (NVML semantics here + SSRF / path-traversal / auth-bug fixes elsewhere) is consistent with a security-focused engineer. Two unsigned commits from this contributor are the only unsigned external work in the recent (2026 onward) window.

  9. DCO compliance is weaker on this repo than on nvsentinel. Five external commits are unsigned (kannon92, takonomura, two from xingyug, one CoderTH). The repo's branch protection only requires EasyCLA; there is no DCO bot blocking unsigned merges. By contrast, the nvsentinel repo's external commits were 100% signed (and the gap that surfaced there was on NVIDIA-internal committers, not externals).

  10. No competing-GPU-vendor contributions detected. Unlike nvsentinel (which received a MooreThreads-affiliated style PR), this repo has not received any contributions from MooreThreads, AMD, Intel, Habana, Tenstorrent, or other competing accelerator vendors. The Ericsson signal (from #8) is the closest to "third-party hardware vendor" and even there the work is on NVML / DRA semantics, not on cross-vendor abstraction.

  11. One pseudonymous contributor who is demonstrably world-class — takonomura. Won ISUCON14 (2024) — Japan's premier infrastructure-tuning competition. Heavy contributor to ICTSC (Japan's inter-college infra contest) and to whywaita/CyberAgent-community tools. DRA contributions across both this repo and kubernetes/kubernetes. Uses users.noreply.github.com everywhere; the only external contributor we genuinely cannot de-anonymize, but his work and contribution graph make clear he is an experienced platform engineer.

  12. Giant Swarm joins the managed-K8s cluster. Marco Ebert (@Gacko) is a Giant Swarm Cluster API engineer with ~2,463 PRs inside the giantswarm org. That brings the managed-Kubernetes-vendor count contributing to this repo to five: AWS (RobertNorthard), Google (Belamaric / Ojea / Zhang), Microsoft (Deshmukh / Huhn / Yu), Red Hat / OpenShift (Emporopulo / Hannon), and Giant Swarm (Ebert), plus DaoCloud (upstream-distribution vendor) and CAST AI (managed GPU sharing on top of GKE). Every major Kubernetes commercial distribution model is represented in the external contributor pool except SUSE / Rancher and VMware Tanzu (Broadcom). The contributions are consistently shaped by what each vendor needs to ship the driver to their customers.

  13. The IPv6 / driver-580 startup-probe fix was actually NVIDIA-internal. PRs #510 / #511 (compute-domain startup probe on IPv6 with NVIDIA driver 580+) initially appeared to be the only external IPv6 contribution. After the Helios cross-check those PRs are NVIDIA-internal (Mathew Wicks, NVIDIA Enterprise Products) — meaning the external IPv6 surface area in this repo is currently zero. Worth flagging because IPv6 / dual-stack support is a real production-readiness gap that no third party has yet pushed on.

  14. The contribution shape is "platform-enablement," not "feature work." Of the 33 merged external PRs, the breakdown is roughly:

    • Platform / installer / chart enablement (OpenShift, GKE, EKS, AKS, NetworkPolicy, /opt/bin, masters-toleration, control-plane affinity): ~11 PRs
    • Production bug fixes (MPS chroot, NVML error vars, cache existence checks, race on startup): ~9 PRs
    • CI / supply-chain / build / dependabot scaffolding: ~7 PRs
    • Docs: ~3 PRs
    • Features (leader-election, e2e test for it): 2 PRs (one of them 1.6k lines)
    • Style / chores (map cleanup, repeat-label fix, env-var rename): ~3 PRs

    External contributors are exercising this driver against real cloud-vendor distributions and fixing what breaks. NVIDIA continues to own all of the architectural and feature direction.

  15. Helios cross-check caught a false-positive that GitHub-org check alone missed. Mathew Wicks (@thesuperzapper) initially read as a Kubeflow-lead external contributor running his own consultancy. Helios LDAP revealed he has been an NVIDIA employee since 2025-02-18 (Enterprise Products, Santa Clara HQ, manager-chain ending at Jensen Huang) — i.e. he was already on NVIDIA payroll 6 months before his two DRA-driver PRs merged on 2025-08-30. GitHub-org membership is not provisioned (HTTP 404); he contributes via his personal handle with no @nvidia.com signoff. Without a Helios pass this would have been miscategorized. This is the analogue of nvsentinel's jamie-yu0 finding (NVIDIA employee using a university-style external identity) and is a pattern any future external-contributor audit on NVIDIA-adjacent repos should look for.
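The org-membership step of the methodology (HTTP 204 = member, 404 = not a member) can be sketched in Go as below. This is an illustration under stated assumptions, not the script behind the report: `checkOrgMember`, `newFakeAPI`, the `Membership` values, and the `example-org` / `alice` / `bob` names are hypothetical, and the demo talks to a local fake server rather than the GitHub API. As the Wicks case shows, a 404 only says the account is not a member of the org; it does not establish that the person is external.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// Membership is the outcome of one membership lookup.
type Membership string

const (
	Member    Membership = "member"
	NotMember Membership = "not-a-member"
	Unknown   Membership = "unknown"
)

// checkOrgMember calls GET {baseURL}/orgs/{org}/members/{user} and maps
// 204 to Member and 404 to NotMember. Any other status is reported as Unknown.
func checkOrgMember(client *http.Client, baseURL, org, user, token string) (Membership, error) {
	endpoint := fmt.Sprintf("%s/orgs/%s/members/%s", baseURL, url.PathEscape(org), url.PathEscape(user))
	req, err := http.NewRequest(http.MethodGet, endpoint, nil)
	if err != nil {
		return Unknown, err
	}
	req.Header.Set("Accept", "application/vnd.github+json")
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	resp, err := client.Do(req)
	if err != nil {
		return Unknown, err
	}
	defer resp.Body.Close()

	switch resp.StatusCode {
	case http.StatusNoContent:
		return Member, nil
	case http.StatusNotFound:
		return NotMember, nil
	default:
		return Unknown, fmt.Errorf("unexpected status %d for %s", resp.StatusCode, user)
	}
}

// newFakeAPI returns a local server that imitates the membership endpoint
// for the given org, so the sketch runs without network access or a token.
func newFakeAPI(org string, members map[string]bool) *httptest.Server {
	prefix := "/orgs/" + org + "/members/"
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		user := strings.TrimPrefix(r.URL.Path, prefix)
		if strings.HasPrefix(r.URL.Path, prefix) && members[user] {
			w.WriteHeader(http.StatusNoContent)
			return
		}
		http.NotFound(w, r)
	}))
}

func main() {
	srv := newFakeAPI("example-org", map[string]bool{"alice": true})
	defer srv.Close()

	for _, user := range []string{"alice", "bob"} {
		status, err := checkOrgMember(srv.Client(), srv.URL, "example-org", user, "")
		fmt.Println(user, status, err)
	}
}
```

Against the real API the base URL would be `https://api.github.com` with a token that can see the org's members. Any `NotMember` result should be treated as a candidate for a second check, not as a final classification.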
