Skip to content

Conversation

@liqiangxl
Copy link
Collaborator

No description provided.

@liqiangxl liqiangxl added the enable-auto-merge Auto-merge a PR when: 1) PR mergeable 2) Internal CI complete 3) No failures label Jan 13, 2026
@github-actions
Copy link

github-actions bot commented Jan 13, 2026

Auto-merge Status

✅ Internal CI is finished
✅ No failed checks
✅ PR is mergeable
ℹ️ PR mergeable_state: unstable

Description

  • Added debug output for shared memory limited blocks per SM

  • Prints total requested vs available shared memory bytes

  • Calculates and displays blocks per SM ratio for memory planning

  • Handles unlimited case when no shared memory requested

Changes walkthrough

Relevant files
Enhancement
allocations.cpp
Add shared memory blocks per SM debug logging                       

csrc/runtime/allocations.cpp

  • Added debug output section after shared memory computation
  • Prints total requested and available shared memory bytes
  • Calculates and logs shared memory limited blocks per SM
  • Handles division by zero with unlimited message when no memory
    requested
  • +14/-0   

    PR Reviewer Guide

    Here are some key observations to aid the review process:

    🧪 PR contains tests
    ⚡ Recommended focus areas for review
    Missing function validation

    The code calls deviceAvailableSharedMemoryBytes() but there's no indication this function exists or is properly implemented. Need to verify this function is available and returns the correct value for the target device.

    int64_t available_shared_memory_bytes = deviceAvailableSharedMemoryBytes();
    Potential division by zero

    While the code handles total == 0 case, it doesn't validate that total is positive before division. Consider adding a bounds check to ensure total > 0 before performing the division operation.

    debug() << "shared memory limited blocks per SM: "
            << available_shared_memory_bytes / total << std::endl;

    @liqiangxl
    Copy link
    Collaborator Author

    !build

    @liqiangxl liqiangxl requested a review from jacobhinkle January 13, 2026 17:20
    @greptile-apps
    Copy link
    Contributor

    greptile-apps bot commented Jan 13, 2026

    Greptile Overview

    Greptile Summary

    Added debug output for shared memory usage metrics when DebugDumpOption::DynamicSharedMemory is enabled.

    • Prints total requested shared memory bytes
    • Prints available shared memory bytes
    • Calculates and prints shared memory limited blocks per SM (available/total)
    • Handles zero division case with "unlimited" message when no shared memory is requested

    Issue found:

    • Type mismatch: deviceAvailableSharedMemoryBytes() returns size_t but is assigned to int64_t

    Confidence Score: 4/5

    • Safe to merge with one type mismatch fix needed
    • The change adds useful debug output with proper division-by-zero handling. One minor type mismatch exists where size_t is assigned to int64_t, which should be corrected for type consistency with the function's return type
    • csrc/runtime/allocations.cpp needs the type mismatch fixed on line 119

    Important Files Changed

    File Analysis

    Filename Score Overview
    csrc/runtime/allocations.cpp 4/5 Added debug output for shared memory usage metrics including blocks per SM calculation; minor type mismatch detected

    Sequence Diagram

    sequenceDiagram
        participant Caller
        participant computeSharedMemory
        participant deviceAvailableSharedMemoryBytes
        participant debug
    
        Caller->>computeSharedMemory: Request shared memory calculation
        computeSharedMemory->>computeSharedMemory: Calculate total from buffers
        
        alt DebugDumpOption::DynamicSharedMemory enabled
            computeSharedMemory->>deviceAvailableSharedMemoryBytes: Get available shared memory
            deviceAvailableSharedMemoryBytes-->>computeSharedMemory: Return size_t bytes
            computeSharedMemory->>debug: Print total requested bytes
            computeSharedMemory->>debug: Print available bytes
            
            alt total > 0
                computeSharedMemory->>computeSharedMemory: Calculate blocks per SM
                computeSharedMemory->>debug: Print blocks per SM (available/total)
            else total == 0
                computeSharedMemory->>debug: Print unlimited blocks per SM
            end
        end
        
        computeSharedMemory-->>Caller: Return total bytes
    
    Loading

    Copy link
    Contributor

    @greptile-apps greptile-apps bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    1 file reviewed, 1 comment

    Edit Code Review Agent Settings | Greptile

    }
    }
    if (isDebugDumpEnabled(DebugDumpOption::DynamicSharedMemory)) {
    int64_t available_shared_memory_bytes = deviceAvailableSharedMemoryBytes();
    Copy link
    Contributor

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    deviceAvailableSharedMemoryBytes() returns size_t (unsigned), not int64_t (signed)

    Suggested change
    int64_t available_shared_memory_bytes = deviceAvailableSharedMemoryBytes();
    size_t available_shared_memory_bytes = deviceAvailableSharedMemoryBytes();

    Copy link
    Collaborator

    @jacobhinkle jacobhinkle left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    LGTM

    }
    }
    if (isDebugDumpEnabled(DebugDumpOption::DynamicSharedMemory)) {
    int64_t available_shared_memory_bytes = deviceAvailableSharedMemoryBytes();
    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Suggested change
    int64_t available_shared_memory_bytes = deviceAvailableSharedMemoryBytes();
    const size_t available_shared_memory_bytes = deviceAvailableSharedMemoryBytes();

    @github-actions github-actions bot merged commit d7a414b into main Jan 14, 2026
    20 checks passed
    @github-actions github-actions bot removed the enable-auto-merge Auto-merge a PR when: 1) PR mergeable 2) Internal CI complete 3) No failures label Jan 14, 2026
    @github-actions github-actions bot deleted the llu/print_smem_info branch January 14, 2026 18:24
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Labels

    None yet

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    2 participants