Description
Test environment:
VM:
rocm-7.1.1.70101-38.el8.x86_64
amdgpu-dkms-firmware-30.20.1.0.30200100-2255209.el9.noarch
amdgpu-dkms-6.16.6-2255209.el9.noarch
amdgpu-core-7.1.70101-2255337.el9.noarch
5.14.0-570.75.1.el9_6.x86_64
Test steps to reproduce the issue:
[1] make sure the in-box (distribution-provided) amdgpu driver is blacklisted
[2] start a RHEL 9.6 VM with an AMD MI300X GPU passed through
[3] configure the amdgpu and ROCm repositories and install the amdgpu-related packages
[VM] # os_version=${1:-"9.6"}
[VM] # cat > /etc/yum.repos.d/amdgpu.repo <<EOF
[amdgpu]
name=amdgpu
baseurl=https://repo.radeon.com/amdgpu/latest/rhel/$os_version/main/x86_64/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
[ROCm]
name=ROCm
baseurl=https://repo.radeon.com/rocm/el9/latest/main/
enabled=1
priority=50
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
[4] reboot the VM
[5] load the amdgpu driver in the VM
[6] check dmesg in the VM
[VM] # dmesg
...
[ 70.449258] amdkcl: loading out-of-tree module taints kernel.
[ 70.449304] amdkcl: module verification failed: signature and/or required key missing - tainting kernel
[ 73.210070] [drm] amdgpu kernel modesetting enabled.
[ 73.210096] [drm] amdgpu version: 6.16.6
[ 73.210107] [drm] OS DRM version: 6.12.0
[ 73.211415] amdgpu: Virtual CRAT table created for CPU
[ 73.211500] amdgpu: Topology: Add CPU node
[ 73.222175] amdgpu 0000:04:00.0: amdgpu: initializing kernel modesetting (IP DISCOVERY 0x1002:0x74A1 0x1002:0x74A1 0x00).
[ 73.222518] amdgpu 0000:04:00.0: amdgpu: register mmio base: 0x82400000
[ 73.222545] amdgpu 0000:04:00.0: amdgpu: register mmio size: 2097152
[ 79.226575] amdgpu 0000:04:00.0: amdgpu: failed to read discovery info from memory, vram size read: 0
[ 79.226669] amdgpu 0000:04:00.0: amdgpu: [drm] ERROR discovery failed: -2
[ 79.226707] amdgpu 0000:04:00.0: amdgpu: Fatal error during GPU init
[ 79.226820] amdgpu 0000:04:00.0: amdgpu: amdgpu: finishing device.
[ 79.229911] amdgpu: probe of 0000:04:00.0 failed with error -2
[ 79.230021] amdgpu: legacy kernel without apple_gmux_detect()
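
The failure signature in the log above can be pulled out of a longer dmesg capture with a small filter. This is a minimal sketch, assuming a POSIX shell with grep; the function name amdgpu_probe_errors is hypothetical, and the patterns are copied from the messages shown in this report rather than from any documented list of amdgpu errors.

```shell
# Hypothetical helper: read a kernel log on stdin and keep only the amdgpu
# lines that match the probe failure messages seen in this report.
amdgpu_probe_errors() {
    grep -E 'amdgpu.*(failed to read discovery info|discovery failed|Fatal error during GPU init|probe of .* failed)'
}

# Usage inside the VM (reading the kernel ring buffer usually needs root):
#   dmesg | amdgpu_probe_errors
```

An empty result only means these particular messages are absent; it does not prove that the driver initialized correctly.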