Issue Description
Background
PaddleOCR-VL currently uses _keep_in_fp32_modules = ["visual", "mlp_AR"] in the PaddleOCRVLForConditionalGeneration model class to work around MIOpen BF16 convolution bugs on ROCm 7.0. This forces the entire SigLIP vision encoder to run in FP32 precision, even when the model is loaded with BF16 dtype.
Problem
This workaround has significant downsides:
- VRAM usage doubled: FP32 weights and activations consume twice the memory of their BF16 equivalents
- Throughput reduced: the vision encoder cannot leverage the native BF16 throughput of AMD matrix cores
- Inconsistent behavior: the model reports BF16 dtype, but the vision encoder silently runs in FP32
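A back-of-the-envelope estimate makes the memory cost concrete. The 400M parameter count below is a hypothetical figure for illustration, not the actual SigLIP encoder size:

```python
# Rough estimate of the extra VRAM from keeping the vision encoder in FP32.
# NUM_PARAMS is a hypothetical figure, not the real SigLIP encoder size.
def weight_bytes(num_params: int, bytes_per_element: int) -> int:
    """Memory footprint of the weights alone, ignoring activations."""
    return num_params * bytes_per_element

NUM_PARAMS = 400_000_000          # hypothetical encoder size
fp32 = weight_bytes(NUM_PARAMS, 4)  # FP32: 4 bytes per element
bf16 = weight_bytes(NUM_PARAMS, 2)  # BF16: 2 bytes per element

print(f"FP32 weights: {fp32 / 1e9:.1f} GB")   # 1.6 GB
print(f"BF16 weights: {bf16 / 1e9:.1f} GB")   # 0.8 GB
print(f"Overhead factor: {fp32 // bf16}x")    # 2x
```

Activations in FP32 add a similar factor on top of the weights, so the 2x figure is a lower bound on the workaround's cost.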
Root Cause
The upstream Paddle framework did not register BF16 convolution kernels (conv2d, conv3d, depthwise_conv2d) for the HIP (ROCm) backend. When a BF16 model attempted convolution, it failed with:
RuntimeError: The kernel with key (GPU, Undefined(AnyLayout), bfloat16) of kernel `conv2d` is not registered
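The error comes from kernel dispatch: Paddle looks up kernels by a (backend, layout, dtype) key, and no entry exists for BF16 convolution on the HIP backend. A toy sketch of that lookup, mimicking only the shape of the error (this is illustrative Python, not the phi framework's actual C++ registry):

```python
# Toy model of a kernel registry keyed by (backend, layout, dtype), to show
# why an unregistered BF16 conv kernel raises at dispatch time. Illustrative
# only; the real phi registry is C++ and populated by registration macros.
class KernelNotRegistered(RuntimeError):
    pass

REGISTRY = {
    ("GPU", "AnyLayout", "float32"): "conv2d_fp32_kernel",
    ("GPU", "AnyLayout", "float16"): "conv2d_fp16_kernel",
    # Before the fix, no ("GPU", "AnyLayout", "bfloat16") entry exists on HIP.
}

def dispatch(op: str, key: tuple) -> str:
    if key not in REGISTRY:
        raise KernelNotRegistered(
            f"The kernel with key {key} of kernel `{op}` is not registered"
        )
    return REGISTRY[key]

dispatch("conv2d", ("GPU", "AnyLayout", "float32"))    # resolves fine
# dispatch("conv2d", ("GPU", "AnyLayout", "bfloat16")) # raises KernelNotRegistered
```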
Resolution Path
A fix has been submitted to the Paddle framework: PaddlePaddle/Paddle#78587
This PR adds phi::bfloat16 to the HIP kernel registration macros in conv_kernel.cu and conv_grad_kernel.cu, enabling native BF16 convolution on AMD GPUs.
Proposed Change
Once the Paddle framework fix is merged, this workaround in PaddleX should be removed:
# Before (current)
_keep_in_fp32_modules = ["visual", "mlp_AR"]
# After
_keep_in_fp32_modules = None
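The effect of the change can be sketched with the selection logic behind _keep_in_fp32_modules: parameter names are matched against the listed module names, and matches are upcast to FP32 at load time. This mock mirrors the general idea only; the exact PaddleX matching rules are an assumption for illustration:

```python
# Illustrative mock of how a _keep_in_fp32_modules list selects which
# parameters to upcast at load time. Not the actual PaddleX implementation;
# the name-component matching rule here is an assumption.
def param_dtype(param_name: str, keep_in_fp32_modules, load_dtype: str) -> str:
    """Return the dtype a parameter would end up with at load time."""
    if keep_in_fp32_modules and any(
        mod in param_name.split(".") for mod in keep_in_fp32_modules
    ):
        return "float32"  # forced upcast for the MIOpen workaround
    return load_dtype     # everything else keeps the requested dtype

# With the current workaround, vision-encoder weights land in FP32:
print(param_dtype("visual.blocks.0.attn.qkv.weight",
                  ["visual", "mlp_AR"], "bfloat16"))  # float32
# After the fix (_keep_in_fp32_modules = None), everything stays BF16:
print(param_dtype("visual.blocks.0.attn.qkv.weight",
                  None, "bfloat16"))                  # bfloat16
```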
A corresponding PR for PaddleX will be submitted alongside the Paddle framework fix.