[NPU] ERNIE 4.5 support #3399

starmountain1997 · 2025-08-14T07:34:51Z

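Environment setup

Base environment setup

Starting the image

Installing from the Docker image is recommended, although a bare-metal install also works.

First, pull the image that matches your system architecture: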

docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu:cann80RC2-ubuntu20-npu-base-x86_64-gcc84 # x86 architecture

docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu:cann80RC2-ubuntu20-npu-base-aarch64-gcc84 # ARM architecture
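
If you script the setup, the choice between the two images can be made automatically. The following is only a sketch: the registry and tags are copied from the two commands above, while the alias table and the function name `image_for` are illustrative assumptions.

```python
import platform

# Registry and tags copied from the two `docker pull` commands above.
REGISTRY = "ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu"
TAGS = {
    "x86_64": "cann80RC2-ubuntu20-npu-base-x86_64-gcc84",
    "aarch64": "cann80RC2-ubuntu20-npu-base-aarch64-gcc84",
}
# Assumed aliases: other names some systems report for the same CPU families.
ALIASES = {"amd64": "x86_64", "arm64": "aarch64"}


def image_for(machine: str) -> str:
    """Return the full image reference for a machine string such as 'x86_64'."""
    arch = ALIASES.get(machine.lower(), machine.lower())
    if arch not in TAGS:
        raise ValueError(f"no paddle-npu image listed for architecture {machine!r}")
    return f"{REGISTRY}:{TAGS[arch]}"


if __name__ == "__main__":
    # Print the pull command for the machine this script runs on.
    print("docker pull", image_for(platform.machine()))
```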

Start the container:

docker run -it --name ${NAME} -v /home/guozr:/home/guozr \
    --privileged --shm-size=128G -w=/home/guozr \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/dcmi:/usr/local/dcmi \
    --net host \
    -e ASCEND_RT_VISIBLE_DEVICES="0,1,2,3,4,5,6,7" \
    e6acd904bbcf /bin/bash

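Installing a newer CANN

The CANN suite shipped in the image is outdated, so CANN Toolkit, CANN Kernels, and NNAL must be reinstalled at version >= 8.1.RC1. The three packages must be the same version; 8.2.RC1 is recommended. Pick the download that matches your CPU architecture, and note that CANN Kernels is also specific to the hardware model. After downloading, install in this order: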

yes | toolkit.run --install
yes | kernels.run --install
yes | nnal.run --install
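
Before running the installers, it may help to sanity-check the versions you downloaded. The sketch below is a hypothetical helper, not part of CANN: it assumes plain version strings of the form `8.1.RC1` or `8.1.0` (suffixes such as alpha or beta tags are not handled) and assumes that RC builds sort before numbered builds within the same major.minor line.

```python
import re


def cann_key(version: str) -> tuple:
    """Turn '8.1.RC1' or '8.1.0' into a sortable tuple.

    Assumption: within one major.minor line, RC builds sort before
    numbered builds (8.1.RC1 < 8.1.0).
    """
    m = re.fullmatch(r"(\d+)\.(\d+)\.(RC)?(\d+)", version.strip(), re.IGNORECASE)
    if m is None:
        raise ValueError(f"unrecognized CANN version string: {version!r}")
    major, minor, rc, n = m.groups()
    return (int(major), int(minor), 0 if rc else 1, int(n))


def check_cann_set(versions: dict, minimum: str = "8.1.RC1") -> list:
    """Return a list of problems; an empty list means the set looks consistent."""
    problems = []
    # All three packages are expected to carry the same version string.
    if len({v.upper() for v in versions.values()}) > 1:
        problems.append(f"versions do not match: {versions}")
    # Each package must be at least the minimum version.
    for name, v in versions.items():
        if cann_key(v) < cann_key(minimum):
            problems.append(f"{name} {v} is older than {minimum}")
    return problems
```

For example, `check_cann_set({"toolkit": "8.2.RC1", "kernels": "8.2.RC1", "nnal": "8.2.RC1"})` returns an empty list.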

Configuring environment variables

Set the following environment variables before running:

source /usr/local/Ascend/ascend-toolkit/set_env.sh
source /usr/local/Ascend/atb/set_env.sh
source /usr/local/Ascend/nnal/atb/set_env.sh --cxx_abi=1

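The default memory allocation strategy is naive_best_fit. You can instead set Paddle's allocation strategy to auto_growth, which claims host and device memory only as the actual data requires it, at the cost of possible memory fragmentation.

For reasons not yet understood, device memory currently runs out unless the allocation strategy is auto_growth, so please also set the following environment variable: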

export FLAGS_allocator_strategy=auto_growth
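
If you launch from a Python entry point rather than a shell, the same flag can be set in code. This is a minimal sketch, assuming the flag is picked up from the environment when the framework initializes, so it has to run before `import paddle`.

```python
import os

# Must run before `import paddle`: the assumption here is that FLAGS_*
# variables are read from the environment when the framework initializes.
# setdefault keeps any value already exported in the shell.
os.environ.setdefault("FLAGS_allocator_strategy", "auto_growth")

strategy = os.environ["FLAGS_allocator_strategy"]
if strategy != "auto_growth":
    print(
        f"warning: FLAGS_allocator_strategy={strategy!r}; "
        "the note above reports out-of-memory errors with this setting"
    )
```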

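`Python` environment setup

Installing `Paddle`

Install with the commands below. Newer paddlepaddle versions conflict with paddleformers, so version 3.1 is recommended here:

# Install the PaddlePaddle CPU package first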
pip install paddlepaddle==3.1
# Then install the PaddlePaddle NPU plugin package
pip install paddle-custom-npu -i https://www.paddlepaddle.org.cn/packages/stable/npu

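See the Ascend NPU installation guide for details.

Installing third-party libraries

Before building PaddleCustomDevice, install the third-party libraries spdlog and json:

# Install spdlog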
git clone https://github.com/gabime/spdlog.git
cd spdlog
mkdir build && cd build
cmake ..
make -j$(nproc)
make install

# Install json
git clone https://github.com/nlohmann/json.git
cd json
mkdir build && cd build
cmake ..
make -j$(nproc)
make install

Installing PaddleCustomDevice

git clone https://github.com/PaddlePaddle/PaddleCustomDevice.git
cd PaddleCustomDevice/backends/npu
bash tools/compile.sh

After the build finishes, install the wheel:

pip install build/dist/paddle_custom_npu-*.whl --force-reinstall

Then manually install the build from this PR:

git clone https://github.com/llliiilil/PaddleCustomDevicetmp.git miPaddleCustomDevice
cd miPaddleCustomDevice/backends/npu
bash tools/compile.sh
pip install build/dist/paddle_custom_npu-*.whl --force-reinstall


source tools/set_env.sh
cd opp/ascendc_custom_ops/build/
bash build_ops.sh
cd custom_project/build_out/
./custom_opp*.run
export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/opp/vendors/aie_ascendc/op_api/lib/:${LD_LIBRARY_PATH}

If you later hit the error "please make sure you registered your op first and try again", go back after the manual install and reinstall the whl built from mainline PaddlePaddle/PaddleCustomDevice over it.

Installing PaddleNLP

Clone from source:

git clone https://github.com/PaddlePaddle/PaddleNLP.git

In the csrc/npu directory, install as described in README.md:

python setup.py build bdist_wheel
pip install dist/paddlenlp_ops*.whl

Building FastDeploy

bash build.sh

At runtime you may see this error:

ModuleNotFoundError: No module named 'distutils.dir_util'

To fix it, edit line 22 of /usr/local/lib/python3.10/dist-packages/paddleformers/utils/pdc_sdk.py and replace from distutils.dir_util import copy_tree with:

from shutil import copytree as copy_tree
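
One caveat with the one-line alias: `shutil.copytree` raises an error when the destination directory already exists, whereas the distutils version copies into it. If that matters for your run, a small wrapper like the following could be used in place of the alias. This is only a sketch; whether pdc_sdk.py ever copies into an existing directory has not been checked here, and the return value differs from the distutils function (destination path instead of a list of copied files).

```python
import shutil


def copy_tree(src, dst, **_ignored):
    """Minimal stand-in for distutils.dir_util.copy_tree.

    dirs_exist_ok=True (Python 3.8+) lets the copy merge into an existing
    destination, which plain shutil.copytree refuses to do.
    Extra keyword arguments accepted by the distutils version are ignored.
    """
    return shutil.copytree(src, dst, dirs_exist_ok=True)
```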

Before running, add your FastDeploy directory to PYTHONPATH:

export PYTHONPATH="/work/FastDeploy":${PYTHONPATH}
export LD_LIBRARY_PATH=/usr/local/Ascend/npt/lib:$LD_LIBRARY_PATH
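
To confirm that the PYTHONPATH change took effect, you can ask Python whether it can locate the package without importing it. A minimal sketch; the module name `fastdeploy` is taken from the package directory that appears in the traceback paths in this description.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """True if Python can locate the top-level module on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Only locates the package; nothing is imported or executed.
    print("fastdeploy importable:", is_importable("fastdeploy"))
```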

If you hit the error "libgomp cannot allocate memory in static TLS block", work around it as follows:

export LD_PRELOAD=$LD_PRELOAD:/usr/local/lib/python3.10/dist-packages/scikit_learn.libs/libgomp-{string of digits, depends on your installation}.so.1.0.0
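
Since the file name differs between installations, a short script can look it up for you. This sketch assumes the library sits in the scikit_learn.libs directory named in the export line above; the function name `find_libgomp` is illustrative.

```python
import glob
import os

# Directory taken from the export line above; adjust for other Python versions.
DEFAULT_LIBS_DIR = "/usr/local/lib/python3.10/dist-packages/scikit_learn.libs"


def find_libgomp(libs_dir: str = DEFAULT_LIBS_DIR) -> list:
    """Return sorted paths of bundled libgomp shared objects in libs_dir."""
    return sorted(glob.glob(os.path.join(libs_dir, "libgomp-*.so*")))


if __name__ == "__main__":
    # Print a ready-to-paste export line for each match (usually just one).
    for path in find_libgomp():
        print(f"export LD_PRELOAD=$LD_PRELOAD:{path}")
```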

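If you run into a circular import and are not running multimodal models, you can temporarily uninstall opencv. Also note that numpy 2.0 is poorly supported at the moment, so as a final step force-install numpy 1.26.4: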

pip uninstall opencv-python
pip install numpy==1.26.4
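
Because later installs can silently pull numpy back up to 2.x, it may be worth re-checking the installed version before launching. A small sketch using only the standard library; the helper name `numpy_version_ok` is hypothetical.

```python
def numpy_version_ok(version: str) -> bool:
    """True for NumPy 1.x version strings, False for 2.x and later."""
    major = int(version.split(".", 1)[0])
    return major < 2


if __name__ == "__main__":
    from importlib.metadata import PackageNotFoundError, version

    try:
        installed = version("numpy")
        status = "ok" if numpy_version_ok(installed) else "too new; pin numpy==1.26.4"
        print(installed, status)
    except PackageNotFoundError:
        print("numpy is not installed")
```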

If you encounter:

  File "/home/guozr/CODE/FastDeploy/fastdeploy/utils.py", line 443, in get_host_ip
    ip = socket.gethostbyname(socket.gethostname())
socket.gaierror: [Errno -2] Name or service not known

First look up the hostname:

hostname

Then add the hostname returned above to /etc/hosts:

127.0.0.1   hostname-mbqbc.foreman.pxe localhost
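
The failing call in the traceback is `socket.gethostbyname(socket.gethostname())`, so the same lookup can be used to check whether the /etc/hosts entry is needed. A minimal sketch; the helper names are illustrative, and the generated line simply mirrors the layout of the example entry above.

```python
import socket


def can_resolve(name: str) -> bool:
    """True if the resolver can map name to an IPv4 address."""
    try:
        socket.gethostbyname(name)
        return True
    except socket.gaierror:
        return False


def hosts_entry(name: str, address: str = "127.0.0.1") -> str:
    """Build an /etc/hosts line in the same layout as the example above."""
    return f"{address}   {name} localhost"


if __name__ == "__main__":
    host = socket.gethostname()
    if not can_resolve(host):
        print("add this line to /etc/hosts:")
        print(hosts_entry(host))
```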

paddle-bot · 2025-08-14T07:34:57Z

Thanks for your contribution!

CLAassistant · 2025-08-14T08:00:22Z

All committers have signed the CLA.

Cndbk · 2025-11-12T09:18:09Z

Hello, I would like to ask about an error when starting the ernie4.5-21b-a3b model:
FastDeploy/fastdeploy/model_executor/ops/npu/sparse_moe.py", line 6, in <module>
from paddlenlp_ops import sparse_moe
ImportError: cannot import name 'sparse_moe' from 'paddlenlp_ops' (/usr/local/lib/python3.10/dist-packages/paddlenlp_ops/__init__.py)

nd-toolkit/latest/opp/vendors/aie_ascendc/op_api/lib/libcust_opapi.so: undefined symbol: aclnnSub.dlsym aclnnSubGetWorkspaceSize from libcust_opapi.so failed, error:/usr/local/Ascend/ascend-toolkit/latest/opp/vendors/aie_ascendc/op_api/lib/libcust_opapi.so: undefined symbol: aclnnSubGetWorkspaceSize.dlsym aclnnSub from libcust_opapi.so failed, error:/usr/local/Ascend/ascend-toolkit/latest/opp/vendors/aie_ascendc/op_api/lib/libcust_opapi.so: undefined symbol: aclnnSub.

starmountain1997 · 2025-11-28T08:19:21Z

Is PaddleCustomDevice fully installed?

Cndbk · 2025-11-28T08:25:01Z

Yes, I built everything following the steps, but it still reports that sparse_moe is missing. Is there a specific PaddleNLP branch I should be using?

TBD1 · 2025-12-01T03:08:37Z

The adaptation works with the NPU630 release; it has not been merged yet. To be followed up.

TBD1 · 2025-12-01T03:11:29Z

fastdeploy/model_executor/ops/npu/weight_quantize.py

+def clip_and_round(x):
+    return np.clip(np.around(x), -127, 127).astype("int8")
+
+def npu_quant_weight(weight_np):


Loading weights with dynamic quantization performs poorly; this needs optimization.

paddle-bot bot added the contributor External developers label Aug 14, 2025

starmountain1997 changed the title ~~【wip】npu support~~ [NPU] ERNIE 4.5 support Aug 14, 2025

starmountain1997 force-pushed the npu branch from 7207f7e to 7eee023 Compare August 21, 2025 11:21

starmountain1997 marked this pull request as ready for review August 21, 2025 11:22

starmountain1997 force-pushed the npu branch 3 times, most recently from a48c978 to 59ad44e Compare August 22, 2025 03:43

npu adapter

9d5d11a

starmountain1997 force-pushed the npu branch 2 times, most recently from 9c1eb8e to 9d5d11a Compare August 26, 2025 02:30

asinkLuno added 2 commits August 26, 2025 17:15

Complete the missing NPU parts

9573de8

moe support

f3c2b68

starmountain1997 force-pushed the npu branch from 3972293 to f3c2b68 Compare September 1, 2025 04:59

quant moe

7c02a32

TBD1 self-requested a review December 1, 2025 03:09

TBD1 reviewed Dec 1, 2025

5 participants
