Updated Red Hat mounts for air-gapped clusters #2391

Open
JunAr7112 wants to merge 1 commit into NVIDIA:main from JunAr7112:air_gapped_mount

Contributor

JunAr7112 commented Apr 23, 2026

Description

This change responds to this issue. In summary, when the gpu-operator is installed on an air-gapped cluster, the driver pod can fail to start because the node lacks the host paths that the pod spec mounts with hostPath type checks (one File and one Directory):

MountVolume.SetUp failed for volume "subscription-config-1" : hostPath type check failed: /etc/yum.repos.d/redhat.repo is not a file
Warning FailedMount 3m55s (x23 over 34m) kubelet MountVolume.SetUp failed for volume "subscription-config-0" : hostPath type check failed: /etc/pki/entitlement is not a directory
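To make the failure mode concrete, here is a minimal Go sketch of a hostPath-style type check. It is a simplified illustration, not kubelet's code: `hostPathType` and `checkHostPath` are hypothetical names, and the only assumption carried over from the events above is that a path which is missing or of the wrong kind produces an "is not a file" or "is not a directory" error.

```go
package main

import (
	"fmt"
	"os"
)

// hostPathType is a local stand-in for the Kubernetes hostPath type names.
// It is hypothetical and not the k8s.io/api type.
type hostPathType string

const (
	hostPathFile      hostPathType = "File"
	hostPathDirectory hostPathType = "Directory"
)

// checkHostPath reports an error when path is missing on the host or does
// not match the declared type, mimicking the wording of the events above.
func checkHostPath(path string, want hostPathType) error {
	info, err := os.Stat(path)
	switch want {
	case hostPathFile:
		if err != nil || !info.Mode().IsRegular() {
			return fmt.Errorf("hostPath type check failed: %s is not a file", path)
		}
	case hostPathDirectory:
		if err != nil || !info.IsDir() {
			return fmt.Errorf("hostPath type check failed: %s is not a directory", path)
		}
	default:
		return fmt.Errorf("unsupported hostPath type %q", want)
	}
	return nil
}

func main() {
	// The two paths named in the FailedMount events.
	checks := []struct {
		path string
		want hostPathType
	}{
		{"/etc/yum.repos.d/redhat.repo", hostPathFile},
		{"/etc/pki/entitlement", hostPathDirectory},
	}
	for _, c := range checks {
		if err := checkHostPath(c.path, c.want); err != nil {
			fmt.Println("FailedMount:", err)
			continue
		}
		fmt.Println("ok:", c.path)
	}
}
```

On a node without RHSM content, both checks in this sketch would report an error, which matches the situation described for air-gapped clusters.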

When we set up the gpu-operator to run in air-gapped environments (see here), we set up a local package repository containing the Red Hat packages the driver needs, and create a ConfigMap referencing a repo list for those packages. In a regular environment, the driver container uses Red Hat Subscription Management (RHSM) to determine which repos it needs and how to install from them via Red Hat's CDN. With a repoConfig ConfigMap (air-gapped or custom mirror), package location does not depend on that subscription path: the ConfigMap provides the server and files from which the driver pulls its packages. Therefore, the operator can skip the RHSM mounts, because the repo definitions and mirror content are already provided.
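For readers unfamiliar with what such a repo list contains, the following Go sketch renders one yum-style `.repo` stanza pointing at a local mirror. Everything in it is an assumption for illustration: the `repo` type, the repo ID, the display name, and the mirror URL are hypothetical and are not taken from this PR or from the operator's documentation.

```go
package main

import "fmt"

// repo describes one yum repository entry. All values used in main are
// hypothetical; substitute the IDs and URLs of your own mirror.
type repo struct {
	ID      string
	Name    string
	BaseURL string
}

// render returns the entry in yum's INI-style .repo format.
// gpgcheck=0 only keeps the sketch short; a real mirror would normally
// keep GPG verification enabled.
func (r repo) render() string {
	return fmt.Sprintf("[%s]\nname=%s\nbaseurl=%s\nenabled=1\ngpgcheck=0\n",
		r.ID, r.Name, r.BaseURL)
}

func main() {
	r := repo{
		ID:      "local-baseos",
		Name:    "Local BaseOS mirror",
		BaseURL: "http://mirror.example.internal/rhel/baseos",
	}
	fmt.Print(r.render())
}
```

A file with stanzas like this is what the ConfigMap would carry in place of the host's RHSM-managed redhat.repo.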

The following changes were made:

// Custom repo ConfigMap supplies yum repos in offline/air-gapped installs.
// Skip mounting host RHSM paths when using a custom repo ConfigMap.
if cr.Spec.IsRepoConfigEnabled() && pool.osRelease == "rhel" {
	pathToVolumeSource = map[string]corev1.VolumeSource{}
}
...ensuring that we no longer mount the RHSM paths.
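The effect of that condition can be sketched in isolation. The snippet below is a standalone approximation, not the operator's code: `volumeSource` stands in for `corev1.VolumeSource` so it compiles without Kubernetes dependencies, `subscriptionMounts` is a hypothetical helper name, and the default map lists only the two paths that appear in the FailedMount events.

```go
package main

import "fmt"

// volumeSource is a hypothetical stand-in for corev1.VolumeSource.
type volumeSource struct {
	HostPath string
}

// subscriptionMounts returns the host mounts to add to the driver pod.
// When a custom repo ConfigMap is enabled on a RHEL node pool, it returns
// an empty map so no RHSM host paths are mounted; otherwise the defaults
// pass through unchanged.
func subscriptionMounts(defaults map[string]volumeSource, repoConfigEnabled bool, osRelease string) map[string]volumeSource {
	if repoConfigEnabled && osRelease == "rhel" {
		return map[string]volumeSource{}
	}
	return defaults
}

func main() {
	defaults := map[string]volumeSource{
		"/etc/pki/entitlement":         {HostPath: "/etc/pki/entitlement"},
		"/etc/yum.repos.d/redhat.repo": {HostPath: "/etc/yum.repos.d/redhat.repo"},
	}
	fmt.Println(len(subscriptionMounts(defaults, true, "rhel")))  // prints 0
	fmt.Println(len(subscriptionMounts(defaults, false, "rhel"))) // prints 2
}
```

Returning an empty map rather than nil keeps any later iteration over the result a no-op, which is the same shape the PR's snippet assigns to `pathToVolumeSource`.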

Checklist

  • No secrets, sensitive information, or unrelated changes
  • Lint checks passing (make lint)
  • Generated assets in-sync (make validate-generated-assets)
  • Go mod artifacts in-sync (make validate-modules)
  • Test cases are added for new code paths

Testing

An air-gapped cluster was set up following the instructions here: https://kubernetes.io/blog/2023/10/12/bootstrap-an-air-gapped-cluster-with-kubeadm/.

copy-pr-bot Bot commented Apr 23, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: Arjun <agadiyar@nvidia.com>