[Feature]: Support arm64 for gpu-operator containers #331

@cyanidium

Suggestion Description

gpu-operator should be able to run on arm64 control plane nodes.

I understand there might be dependencies for the workloads/kmm on amd64, but the gpu-operator container is very straightforward and runs fine on arm64. It is also the only image that seems to need to run on the control plane. Supporting arm64 is useful for multi-arch clusters with no amd64 control planes.

It seems like this already works with some simple changes:

sed -i 's/amd64/arm64/g' Makefile
sed -i 's/amd64/arm64/g' Dockerfile
sed -i 's/amd64/arm64/g' Dockerfile.build
make docker-build

This generates a working image that I've got running in my test cluster.

There seem to be two ways to implement this:

  1. Add build steps for dedicated arm64 containers/tags (could copy the existing files and find/replace, but would result in a lot of duplication), or
  2. Implement multi-platform builds (a bit more work needed to get the Makefile changes in place)
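For illustration, option 2 could look roughly like the sketch below. It assumes Docker Buildx is available and that the Dockerfiles take the target architecture from BuildKit's automatic `TARGETARCH` build argument instead of a hard-coded `amd64`. The `IMG`, `PLATFORMS` and `DRY_RUN` variables and the helper function names are hypothetical and are not taken from the repository's Makefile.

```shell
#!/bin/sh
# Sketch only: IMG, PLATFORMS and DRY_RUN are illustrative names, not
# variables defined by the gpu-operator Makefile.
IMG="${IMG:-example.com/gpu-operator:dev}"
PLATFORMS="${PLATFORMS:-linux/amd64,linux/arm64}"

# Map `uname -m` output to the architecture names used in image platforms.
go_arch() {
    case "$1" in
        x86_64|amd64)  echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Build one image for every platform in PLATFORMS.
# With DRY_RUN=1 (the default here) the command is only printed.
# --push is used because a multi-platform result usually cannot be
# loaded into the classic local image store.
multiarch_build() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "docker buildx build --platform $PLATFORMS --tag $IMG --push ."
    else
        docker buildx build --platform "$PLATFORMS" --tag "$IMG" --push .
    fi
}
```

Sourcing the file and calling `multiarch_build` prints the command that would run; setting `DRY_RUN=0` runs it. To build for the current host only, `PLATFORMS="linux/$(go_arch "$(uname -m)")"` gives a single-architecture build without editing any files.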

Operating System

No response

GPU

No response

ROCm Component

No response

Labels: enhancement (New feature or request)
