
Conversation

@1374839016

I fixed the memory-access bug described in #55: I force CuPy to allocate memory on the PyTorch tensors' device.
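For illustration, a minimal sketch of the idea (the helper name run_on_tensor_device is hypothetical, not the actual diff): select the PyTorch tensor's CUDA device as CuPy's current device before allocating or launching anything, so CuPy does not fall back to the default device.

import cupy

def run_on_tensor_device(tensor, fn, *args, **kwargs):
    # Sketch: make CuPy's current device match the device of the given
    # PyTorch CUDA tensor, so CuPy allocations and kernel launches target
    # the same GPU instead of the default device (GPU 0).
    assert tensor.is_cuda
    with cupy.cuda.Device(tensor.device.index):
        return fn(*args, **kwargs)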

fix CUDA_ERROR_ILLEGAL_ADDRESS
@sniklaus
Owner

Huge thanks for bringing this up!

Could you provide some more technical details on how this makes a difference? Currently, all the involved tensors will be on the same device as the first input as per:

rbot0 = one.new_zeros([ one.shape[0], one.shape[2] + 8, one.shape[3] + 8, one.shape[1] ])
rbot1 = one.new_zeros([ one.shape[0], one.shape[2] + 8, one.shape[3] + 8, one.shape[1] ])
one = one.contiguous(); assert(one.is_cuda == True)
two = two.contiguous(); assert(two.is_cuda == True)
output = one.new_zeros([ one.shape[0], 81, one.shape[2], one.shape[3] ])

I am hence a little bit confused on what the proposed fix would change. 🤔

@1374839016
Author

Sorry, I don't know for sure, but my guess is that the code allocates the kernel's shared memory on the default device (GPU 0):

cupy_launch('kernel_Correlation_updateOutput', cupy_kernel('kernel_Correlation_updateOutput', {
    'rbot0': rbot0,
    'rbot1': rbot1,
    'top': output
}))(
    grid=tuple([ output.shape[3], output.shape[2], output.shape[0] ]),
    block=tuple([ 32, 1, 1 ]),
    shared_mem=one.shape[1] * 4,
    args=[ cupy.int32(n), rbot0.data_ptr(), rbot1.data_ptr(), output.data_ptr() ]
)
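If that guess is right, one way to express the fix is to pin the launch to the device that actually holds the tensors. A hedged sketch against the snippet above, using CuPy's device context manager (not necessarily the exact patch in this PR):

# Sketch: run the existing launch with the tensors' device selected as
# CuPy's current device, rather than whatever device happens to be
# current (by default, GPU 0).
with cupy.cuda.Device(one.device.index):
    cupy_launch('kernel_Correlation_updateOutput', cupy_kernel('kernel_Correlation_updateOutput', {
        'rbot0': rbot0,
        'rbot1': rbot1,
        'top': output
    }))(
        grid=tuple([ output.shape[3], output.shape[2], output.shape[0] ]),
        block=tuple([ 32, 1, 1 ]),
        shared_mem=one.shape[1] * 4,
        args=[ cupy.int32(n), rbot0.data_ptr(), rbot1.data_ptr(), output.data_ptr() ]
    )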
