Add chipStar (SPIR-V) support for HIP backends#1942
Draft
Detect HIP platform at build time via hipconfig output:
- __HIP_PLATFORM_SPIRV__ → HIP_LIB_NAME=CHIP (chipStar)
- __HIP_PLATFORM_HCC__/__HIP_PLATFORM_AMD__ → HIP_LIB_NAME=amdhip64

Move ROCM_DIR and HIP_ARCH defaults to the top of the file where other tool-path defaults live. Use HIP_LIB_NAME in the library detection glob and in PKG_LIBS. Remove the subst=,, stripping from HIPCONFIG_CPPFLAGS so flags are passed through unmodified.
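The platform-to-library mapping described in this commit can be sketched outside of Make. This is a minimal Python illustration of the decision logic only, not the Makefile code from the PR; the function name `hip_lib_name`, the sample flag strings, and the `None` fallback are assumptions made for the example.

```python
from typing import Optional


def hip_lib_name(hipconfig_output: str) -> Optional[str]:
    """Map hipconfig output to the HIP runtime library name.

    Hypothetical helper: mirrors the mapping in the commit message.
    Returning None for an unrecognized platform is an assumption.
    """
    if "__HIP_PLATFORM_SPIRV__" in hipconfig_output:
        return "CHIP"  # chipStar (libCHIP.so)
    if ("__HIP_PLATFORM_HCC__" in hipconfig_output
            or "__HIP_PLATFORM_AMD__" in hipconfig_output):
        return "amdhip64"  # AMD ROCm (libamdhip64.so)
    return None


# Sample strings are illustrative, not captured hipconfig output.
print(hip_lib_name("-D__HIP_PLATFORM_SPIRV__= -I/opt/chipstar/include"))
print(hip_lib_name("-D__HIP_PLATFORM_AMD__= -I/opt/rocm/include"))
```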
When SYCL backends are built, libceed.so must be linked with icpx (SYCLCXX) rather than g++, and -fsycl must appear in CEED_LDFLAGS (before object files) so icpx can merge the SYCL fat binary device sections. Without this, libceed.so lacks NEEDED: libsycl.so.7 and SYCL kernels fail to load at runtime.
Replace the outer element for-loop with a single element assignment and guard all memory accesses with if (elem < num_elem). Shared memory operations (Interp, Grad, etc.) must execute unconditionally across all threads so __syncthreads() is reached uniformly; only the load/store steps are guarded. Also guard qfunction calls with thread-id bounds checks and comment out the pragma unroll that triggered miscompilation on chipStar's LLVM.
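To see why the loop form can leave barriers non-uniform, the following Python model counts how many barriers each thread of one block would reach under each pattern. It is a simplified sketch, not the kernel code: it assumes one barrier per element step, an element index of `block_idx * block_dim_z + tz`, and, for the guarded form, a launch grid large enough to cover every element. All function and parameter names are hypothetical.

```python
def loop_pattern(num_elem, block_idx, block_dim_z, grid_dim):
    """Barriers reached per thread when the barrier is inside the element loop."""
    counts = []
    for tz in range(block_dim_z):
        elem = block_idx * block_dim_z + tz
        reached = 0
        while elem < num_elem:
            reached += 1                     # one barrier per iteration (assumed)
            elem += grid_dim * block_dim_z   # stride to this thread's next element
        counts.append(reached)
    return counts


def guarded_pattern(num_elem, block_idx, block_dim_z):
    """(touches_memory, barriers) per thread with single assignment + guard."""
    result = []
    for tz in range(block_dim_z):
        elem = block_idx * block_dim_z + tz
        touches_memory = elem < num_elem     # guard applies to load/store only
        barriers = 1                         # shared-memory step runs unconditionally
        result.append((touches_memory, barriers))
    return result


# 5 elements, 4 element slots per block, 2 blocks: the last block is partly idle.
print(loop_pattern(5, block_idx=1, block_dim_z=4, grid_dim=2))
print(guarded_pattern(5, block_idx=1, block_dim_z=4))
```

In the loop form, only the first thread of the last block enters the loop body, so the threads of that block disagree on how many barriers they reach. In the guarded form every thread reaches the same barrier and only the memory accesses differ.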
Prevents UnicodeDecodeError when GPU runtimes emit non-UTF-8 bytes (e.g. chipStar CHIP warnings contain raw binary data in some paths).
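A minimal sketch of the decoding behavior, assuming the test runner captures output through `subprocess.run` (the exact call site in the PR is not shown here). The helper name `run_and_decode` and the child command are hypothetical; `encoding` and `errors` are standard `subprocess.run` parameters.

```python
import subprocess
import sys


def run_and_decode(cmd):
    """Run cmd and return its stdout as text, replacing undecodable bytes."""
    proc = subprocess.run(
        cmd,
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # invalid bytes become U+FFFD instead of raising
    )
    return proc.stdout


# Child process that writes bytes which are not valid UTF-8.
child = "import sys; sys.stdout.buffer.write(b'ok \\xff\\xfe')"
text = run_and_decode([sys.executable, "-c", child])
print(repr(text))
```

With the default strict error handling, decoding the same bytes would raise `UnicodeDecodeError`.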
chipStar prints 'CHIP info/warning/debug ...' lines to stderr on every run. These are not test failures but caused JUnit to mark all HIP tests as failed. Filter them out before checking whether stderr is non-empty.
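The filtering step might look like the sketch below. The regular expression and the helper name `filter_chip_lines` are assumptions based on the 'CHIP info/warning/debug ...' prefix described above; the exact log format and the code in the PR may differ.

```python
import re

# Assumed prefix of chipStar log lines, per the commit message.
CHIP_LOG = re.compile(r"^CHIP (info|warning|debug)\b")


def filter_chip_lines(stderr: str) -> str:
    """Drop chipStar log lines so only real diagnostics remain."""
    kept = [line for line in stderr.splitlines() if not CHIP_LOG.match(line)]
    return "\n".join(kept)


sample = "CHIP warning something\nreal error\nCHIP info something else"
remaining = filter_chip_lines(sample)
print(repr(remaining))

# A test is flagged only if something other than CHIP log lines is left.
has_real_stderr = bool(remaining.strip())
```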
Note: Please use the libCEED PR template, including the LLM usage declaration (a requirement in CONTRIBUTING.md). Thanks.
…r gcc

chipStar's hipconfig -C outputs --offload=spirv64, -nohipwrapperinc, --hip-path=, and --target=, which are clang-only flags. When CC=gcc is used for .c files (or CXX != HIPCC for .cpp files), these flags cause build failures. Add HIPCONFIG_CPPFLAGS_C, which filters out the clang-only flags and adds an explicit -I$(ROCM_DIR)/include (since -nohipwrapperinc was suppressing the wrapper that would have pulled in hip_runtime.h).
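The filtering rule can be illustrated in Python. This is a sketch of the logic only, not the Makefile implementation from the PR; the helper name `cppflags_for_c`, the prefix-matching approach, and the sample paths are assumptions.

```python
# Clang-only flags named in the commit message.
CLANG_ONLY_EXACT = {"-nohipwrapperinc"}
CLANG_ONLY_PREFIXES = ("--offload=", "--hip-path=", "--target=")


def cppflags_for_c(hipconfig_flags: str, rocm_dir: str) -> str:
    """Drop clang-only flags and add the explicit HIP include directory."""
    kept = [
        flag
        for flag in hipconfig_flags.split()
        if flag not in CLANG_ONLY_EXACT
        and not flag.startswith(CLANG_ONLY_PREFIXES)
    ]
    # -nohipwrapperinc was removed, so point the compiler at the headers directly.
    kept.append(f"-I{rocm_dir}/include")
    return " ".join(kept)


# Illustrative input; paths and the target triple are made up.
flags = ("--offload=spirv64 -nohipwrapperinc --hip-path=/opt/chipstar "
         "--target=x86_64-linux-gnu -D__HIP_PLATFORM_SPIRV__=")
print(cppflags_for_c(flags, "/opt/chipstar"))
```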
Purpose:
Add support for chipStar, enabling libCEED HIP backends (/gpu/hip/ref, /gpu/hip/shared, /gpu/hip/gen) to run on any OpenCL or Level Zero GPU (Intel, etc.) via SPIR-V.

Changes
- … __HIP_PLATFORM_SPIRV__), handle dynamic lib name (libCHIP.so vs libamdhip64.so), filter clang-only flags from HIPCONFIG_CPPFLAGS when using gcc as CC, use SYCLCXX as linker when SYCL backend is also enabled
- … #ifdef __HIP_PLATFORM_SPIRV__): chipStar SPIR-V target requires all threads in a workgroup to reach __syncthreads() barriers. Kernel element loops are refactored to a single-assignment + guard pattern on chipStar, while preserving the original for-loop structure on AMD ROCm/NVIDIA
  - hip-ref-basis-nontensor.h: 5 kernel functions
  - hip-shared-basis-tensor.h: 13 kernel functions
  - ceed-hip-gen-operator-build.cpp: code-generator emits conditional kernel source
- … errors='replace' for binary data in subprocess output

Benchmarks: HIP (chipStar) vs SYCL (oneAPI) on Intel Arc A770
chipStar main + hiprtc output caching. oneAPI 2024.2.2 + Level Zero. 5M DOFs, polynomial degree p=4, warm JIT cache.
Testing was done with the SYCL caching PR #1943 applied; without it, chipStar outperforms SYCL by a much larger margin because of JIT costs.
ex1-volume (mass operator)
ex2-surface (surface integral)
chipStar HIP is 15-52% faster than native SYCL across all backends.
LLM/GenAI Disclosure:
Claude Code was used to rebase on latest main (bulk of this work was done in 2024/2025), write the #ifdef __HIP_PLATFORM_SPIRV__ preprocessor guards across HIP kernel files and the code-generator, and to draft this PR description.

By submitting this PR, the author certifies to its contents as described by the Developer's Certificate of Origin.
Please follow the Contributing Guidelines for all PRs.