Bench

FasterTransformer

Pitfalls

Dependencies: OpenMPI, cuDNN

Set the environment variables below. CMake does not let environment variables override these paths, so also pass the paths with -D.

export CUDNN_LIBRARY_PATH=/root/data/tensorrt_test/cudnn-linux-x86_64-8.9.1.23_cuda11-archive/lib
export CUDNN_HOME=/root/data/tensorrt_test/cudnn-linux-x86_64-8.9.1.23_cuda11-archive
export LD_LIBRARY_PATH=${CUDNN_HOME}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDNN_ROOT=/root/data/tensorrt_test/cudnn-linux-x86_64-8.9.1.23_cuda11-archive
export CUDNN_INCLUDE_PATH=/root/data/tensorrt_test/cudnn-linux-x86_64-8.9.1.23_cuda11-archive/include

Edit the MPI path in CMakeLists.txt

Build flags

-DBUILD_MULTI_GPU=ON

-DBUILD_PYT=ON (for torch benchmark)

-DCUDNN_LIBRARY_PATH=<cuDNN lib directory>

cmake -DSM=80 -DCMAKE_BUILD_TYPE=Release -DBUILD_MULTI_GPU=ON -DBUILD_PYT=ON -DCUDNN_LIBRARY_PATH=/root/data/tensorrt_test/cudnn-linux-x86_64-8.9.1.23_cuda11-archive/lib ..
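The cuDNN path is repeated in the exports and again in the cmake call, so a typo in one place is easy to miss. A minimal sketch that keeps the path in one variable and assembles the same configure command from it; the helper name ft_cmake_cmd is hypothetical (not part of the project), and it only prints the command so it can be checked before running it from the build directory.

```shell
#!/usr/bin/env bash
# Single source of truth for the cuDNN root (same path as in the exports above).
CUDNN_HOME=/root/data/tensorrt_test/cudnn-linux-x86_64-8.9.1.23_cuda11-archive

# Hypothetical helper: print the configure command for a given cuDNN root.
# $1: cuDNN root directory (defaults to CUDNN_HOME).
ft_cmake_cmd() {
    local cudnn_root="${1:-$CUDNN_HOME}"
    printf 'cmake -DSM=80 -DCMAKE_BUILD_TYPE=Release -DBUILD_MULTI_GPU=ON -DBUILD_PYT=ON -DCUDNN_LIBRARY_PATH=%s/lib ..\n' "$cudnn_root"
}

# Print the command for review; nothing is executed here.
ft_cmake_cmd
```

Once the printed line looks right, copy it into the build directory shell. -DSM=80 targets compute capability 8.0 (for example A100), so change it for other GPUs.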

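vit, swin, and similar examples may fail to build. If so, go into the corresponding example directory and delete the matching add_subdirectory line from its CMakeLists.txt.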

EnergonAI

Build

The CUDA version that torch (and its C++ extensions) was built against must equal the version of the installed CUDA toolkit runtime.
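A quick way to catch a mismatch before building is to compare the two version strings. This is a minimal sketch: the helper names cuda_major_minor and cuda_versions_match are hypothetical, and comparing only major.minor is an assumption about what "equal" needs to mean here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the first "major.minor" number from a string,
# e.g. from "Cuda compilation tools, release 11.8, V11.8.89" or from "11.8".
cuda_major_minor() {
    printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+' | head -n 1
}

# Hypothetical helper: succeed only if both strings carry the same major.minor.
# $1: CUDA version reported by torch, $2: CUDA version reported by the toolkit.
cuda_versions_match() {
    local torch_cuda toolkit_cuda
    torch_cuda="$(cuda_major_minor "$1")"
    toolkit_cuda="$(cuda_major_minor "$2")"
    # An empty result (e.g. a CPU-only torch build printing "None") never matches.
    [ -n "$torch_cuda" ] && [ "$torch_cuda" = "$toolkit_cuda" ]
}
```

With torch and nvcc on the PATH, usage could look like: cuda_versions_match "$(python -c 'import torch; print(torch.version.cuda)')" "$(nvcc --version | grep release)" && echo "CUDA versions match" || echo "CUDA version mismatch".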

TensorRT