Cherry dataloader exit error #34502

Closed
156 commits
54ab656
[OPs] Bug fix, fix the segment mean for illegal syncthreads usage. (#…
ZHUI Apr 27, 2021
938a5a5
cherry-pick from develop: update 2.0 public api in nn #31912 (#32621)
zhiboniu Apr 28, 2021
32203c3
update 2.0 public api in paddle.init (#32034) (#32620)
zhiboniu Apr 28, 2021
33703da
[Cherry-pick] Optimize update_loss_scaling_op(#32554) (#32606)
thisjiang Apr 28, 2021
056a2fc
conservative judgment (#32619)
b3602sss Apr 28, 2021
e60c08f
add __all__=[] to python files not in API public list; import * only …
zhiboniu Apr 29, 2021
0e904d4
update 2.0 public api in hapi (#32651)
zhiboniu Apr 29, 2021
7ae0a80
- Added clearing oneDNN per executor (#32664)
jczaja Apr 29, 2021
a5627df
fix mem release error. (#32655)
jiweibo Apr 29, 2021
263710c
edit paddle.save/load API (#32532) (#32612)
hbwx24 Apr 29, 2021
93d34f8
'jit.save/load' support save/load function without parameters. (#3243…
hbwx24 Apr 29, 2021
ef7b6d5
Add fake interface for register_hook in static mode (#32642) (#32660)
chenwhql Apr 29, 2021
30dfa74
specify multihead_matmul_fuse_pass_v3 QK path (#32659) (#32668)
cryoco Apr 29, 2021
3c324f0
[cherry-pick to 2.1] [Modify spectralnorm #32633] (#32667)
wangna11BD Apr 29, 2021
ca2ef41
[Cherry-pick] Polish custom operator overrided method impl (#32666) …
chenwhql Apr 29, 2021
93535c5
Added pure_bf16 mode (#32281) (#32681)
arlesniak Apr 29, 2021
e7c8160
Add BF16 uniform random initializer (#32468) (#32677)
wozna Apr 29, 2021
cb50657
Nne integration (#32604) (#32658)
shangzhizhou Apr 30, 2021
79ce2a6
skip fuse repeated fc when the fc with weight padding (#32648) (#32680)
juncaipeng Apr 30, 2021
2817239
Add op read_file and decode_jpeg (#32564) (#32686)
LielinJiang Apr 30, 2021
1a417a4
remove is_test=True in grad (#32683)
ceci3 Apr 30, 2021
097d5f5
Add 12 inplace APIs including auto generated (#32573) (#32699)
pangyoki Apr 30, 2021
09adf20
add flag to check_kernel launch (#32692) (#32709)
jeff41404 Apr 30, 2021
2c1ed9b
[Kunlun]fix multi xpu dygraph hang, test=kunlun (#32662) (#32696)
vslyu May 1, 2021
6a1957e
slove develop bugs (#32560) (#32684)
Baibaifan May 1, 2021
4593597
add_c_sync_npu_kernel (#32687) (#32723)
Baibaifan May 4, 2021
6b86e96
Fix the bug in pipeline for dygraph mode (#32716) (#32728)
May 5, 2021
d19b5da
bug fix, test=develop (#32730)
May 5, 2021
4626afa
fix traverse graph in reducer (#32721)
ForFishes May 5, 2021
cdfc34d
[Dy2stat] Fix to_tensor Bug Reported from QA (#32701) (#32713)
zhhsplendid May 6, 2021
035c742
add API Tensor.item() to convert Tensor element to a Python scalar (#…
zhwesky2010 May 6, 2021
df00636
update, test=develop (#32731)
May 6, 2021
c0f2668
fix l1 decay for inplace (#32718)
May 6, 2021
43b3e99
fix error imformation when trigger import error (#32702)
lyuwenyu May 6, 2021
a9d330a
[cherry-pick pr31970] Support transforms for paddle tensor image (#32…
lyuwenyu May 6, 2021
0bb079c
avoid polluting logging's root logger (#32673) (#32706)
May 6, 2021
9a589de
cherry-pick:change softmax_with_cross_entropy_op's parameter name fro…
chajchaj May 6, 2021
2144852
[CHERRY-PICK] Reduce grad fix cherrypick (#32742)
jakpiase May 6, 2021
f3436af
[cherry-pick] Sum kernel for CPU supporting BF16 and SelectedRows (#…
arogowie-intel May 6, 2021
4f06cd1
Pick revert data generator (#32700)
tianshuo78520a May 6, 2021
7e35ef3
[Cherry-Pick] Clear 'BasicEngine' when an exception occurs in the bac…
hbwx24 May 7, 2021
c67a5d9
pylayer_op:release context after compute. (#32707) (#32744)
hbwx24 May 7, 2021
ce27821
[2.1 API] Enable printing deprecated warning info. (#32712) (#32756)
xiemoyuan May 7, 2021
5fdd85b
bug fix, test=develop (#32753)
May 7, 2021
70e0e3d
[cherry-pick] Mechanism that converts startup_program initializers t…
lidanqing-intel May 7, 2021
3ba8c48
[CHERRY-PICK2.1]Remove paddle_custom_op dynamic libraries, and link …
zhwesky2010 May 7, 2021
ded39f8
[Cherrypick 2.1] fix compile error on jetson platform (#32760)
LielinJiang May 7, 2021
f54fb1e
fix stack grad gpu (#32781)
bjjwwang May 7, 2021
957cbe6
fix ce error message, test=release/2.1 (#32758)
huangjun12 May 7, 2021
2ec6b6f
remove packages in __all__ (#32757)
zhiboniu May 7, 2021
09b18a4
[Paddle-TRT] Implement MHA fp16 order same as training (#32629) (#32785)
shangzhizhou May 7, 2021
0251320
fix find_unused_parameters default value (#32829)
ForFishes May 11, 2021
4ccd9a0
fix dataloader exit hang when join re-enter (#32835)
heavengate May 12, 2021
4831e37
fix the error of fake_quant_dequant op name (#32866) (#32879)
juncaipeng May 17, 2021
b619648
bugfix: parallel_executor for xpu should use BindThreadedSSAGraphExec…
houj04 May 18, 2021
7b0b064
[cherry-pick] Fix CI Python3 on release/2.1 (#32930)
tianshuo78520a May 18, 2021
4639f5d
[Cherry-pick]Add code examples for paddle.save/load (#32900) (#32929)
hbwx24 May 18, 2021
ab1a4df
【cherrypick】support cuda11 for heterps; add profiler in oneps (#32957)
danleifeng May 19, 2021
b4b9438
[Cherry-pick] add enforce check for set_value (#32972) (#32981)
chenwhql May 19, 2021
bdce8a1
[Cherry-pick] Change Paddle CI-Cverage Python3.8 [32515] (#32960)
tianshuo78520a May 19, 2021
c7848ac
[Cherry-Pick]fix test_paddle_save_load and test_paddle_save_load_bin…
hbwx24 May 20, 2021
ef2ee5e
[cherry-pick] BugFix StaticAanlysis with gast.Subscript (#32969) (#32…
Aurelius84 May 20, 2021
8ecaa8a
BugFix with ParseInputDataType from LodTensorArray (#32918) (#32984)
Aurelius84 May 20, 2021
26c2911
[Cherry-pick]Refactor param_guard logic of @to_static (#32867) (#3285…
Aurelius84 May 20, 2021
50356eb
[Cherry-pick] Change Paddle CI-Cverage Python3.8 [32515] #33013
tianshuo78520a May 21, 2021
7c0b96e
update 2.0 public api in distributed (#32990)
zhiboniu May 24, 2021
4026e22
[HybridParallel]Fix precision problem of model parallel (#32897) (#33…
ForFishes May 25, 2021
8fe6d55
[Cherry-pick][Dy2Stat]Support convert sublayers in Sequential Contai…
Aurelius84 May 26, 2021
d7d3090
[Cherry-Pick][HybridParallel]Fix pipeline in dygraph (#33097)
ForFishes May 26, 2021
7766721
disable conv plugin in TRT old versions (#33198)
b3602sss May 31, 2021
92a7d11
[cherry-pick][CustomOP]Set GLIBCXX_USE_CXX11_ABI=1 to fix potential …
Aurelius84 May 31, 2021
ca0cc8a
[Cherry-pick][CustomOp]Specify -std=c++14 cflags by default (#33213…
Aurelius84 Jun 1, 2021
6fb6460
[Cherry-Pick]Set the default value of protocol to 4. (#32904) #33009
hbwx24 Jun 1, 2021
3fe99ad
[ROCM] add is_compiled_with_rocm api, test=develop (#33043) (#33228)
qili93 Jun 1, 2021
8a5a45f
Fix cuda kernel launch of grid sampler (#33100) (#33232)
wanghaoshuang Jun 1, 2021
5d8e439
[Cherry-pick] Fix spawn default nprocs get error (#33215) (#33249)
chenwhql Jun 1, 2021
ef6120f
[ROCM] fix fused_fc_elementwise_layernorm, test=develop (#33281) (#33…
qili93 Jun 3, 2021
b032b57
[ROCM] update paddle inference cmake, test=develop (#33260) (#33290)
qili93 Jun 3, 2021
c42ccf1
[CherryPick] fix compare ops when broadcast (#33086)
wawltor Jun 4, 2021
f17d643
Fix syncbn (#32989) (#33321)
ceci3 Jun 7, 2021
d522514
Fix inference prepare data (#33370)
b3602sss Jun 7, 2021
3c22b17
[cherry-pick] Fix code examples #32861 #33395 (#33396)
TCChenlong Jun 8, 2021
ccabafa
OP:strided_slice_op supports bool type inputs (#33373) (#33393)
TeslaZhao Jun 8, 2021
0549d4a
Cherry pick deconv & jetson single arch (#33387)
cryoco Jun 8, 2021
5e09d67
fix API: normalize_program. test=develop (#33408)
T8T9 Jun 8, 2021
bad3beb
Add trt convert reshape_op in release/2.1.1 (#33372)
Wangzheee Jun 8, 2021
28a18af
fix output_padding in conv (#33429)
jerrywgz Jun 9, 2021
6385f5e
[Paddle-TRT] Add gather_nd and reduce_sum trt op. (#33324) (#33365)
jiweibo Jun 9, 2021
d496722
fix the bug of yolo_box which can't run on nano and tx2 (#33422) (#33…
fengxiaoshuai Jun 9, 2021
c4a417f
fix the bug in repeated_fc_relu_fuse_pass.test=develop (#33386) (#33431)
winter-wang Jun 10, 2021
03f4668
fix aligned in roi_align (#33446)
jerrywgz Jun 10, 2021
fe84179
fix the bug in the creation of pp groups to avoid hang (#32890) (#33473)
Jun 10, 2021
9035fd2
[cherry-pick] Fix retry error in download when exception occurs #3281…
lyuwenyu Jun 10, 2021
1cdf69b
[cherry pick] add random state generate in DataLoader worker (#33434)
heavengate Jun 10, 2021
dfa05da
[cherry-pick] fuse L2Decay and momentum when param.regularizer is se…
zhangting2020 Jun 10, 2021
8461ab1
add sample code for summary (#33337) (#33427)
LielinJiang Jun 10, 2021
61cae0d
[cherry-pick]Fixed a bug of log_softmax: op input was modified to 'n…
AshburnLee Jun 11, 2021
f57ae4d
[cherry-pick] use the required instruction to determine if the envir…
wadefelix Jun 11, 2021
1444090
[Cherry-pick] Support diff dataset tensor place in single process da…
chenwhql Jun 11, 2021
9567cbd
[cherry-pick 2.1.1]2.1/fix concat (#33383)
vslyu Jun 11, 2021
45f8b9d
update 2.0 public api in vision (#33307)
zhiboniu Jun 11, 2021
e48f7a5
update 2.0 public api in all left files (#33314)
zhiboniu Jun 11, 2021
de612f7
Add comments to ColorJitter parameters;test=document_fix (#33432)
WenmuZhou Jun 11, 2021
a43e1fa
Fix LayerNorm Problem Release2.1 (#33534)
zhiboniu Jun 12, 2021
f703461
refix if-else logic for inference: missing if (#33531)
b3602sss Jun 15, 2021
0079e0b
[Cherry-Pick] Fix the segfault when using to_tensor in PyLayer. (#3…
hbwx24 Jun 15, 2021
bbedca4
[cherry pick] add warning for dataloader incompatable upgrade (#33514)
heavengate Jun 15, 2021
2b44ae5
[cherry-pick] Polish code for setitem/getitem and support index for l…
liym27 Jun 15, 2021
036f81f
bugfix: param init with fill constant str_value (#33381) (#33472)
JZ-LIANG Jun 15, 2021
a4e841e
[cherry-pick] fix gather bug && fix hang of new_group (#33553)
ForFishes Jun 15, 2021
06c2d0c
[cherry-pick] tar CAPI lib in paddle build scripts (#33563)
OliverLPH Jun 15, 2021
c334d2b
Cherry-pick support the bool tensor for the compare ops (#33551)
wawltor Jun 15, 2021
e5bd7eb
Add trt layer norm dynamic (#33448)
shangzhizhou Jun 16, 2021
5c68e79
[cherry pick] Fix issue #33021 setCacheCapacity could not limit memor…
lidanqing-intel Jun 16, 2021
172f271
bug fix, test=develop (#33595)
Jun 16, 2021
7be50f9
update, test=develop (#33588)
Jun 16, 2021
bb5963d
[CP] add a strategy to run program with fleet (#33511)
Jun 16, 2021
63aeb02
fix gather op and add logsumexp op on kunlun (#32931) (#33592)
tangzhiyi11 Jun 16, 2021
7bbeeb5
cherry-pick fix output padding conv (#33587)
jerrywgz Jun 17, 2021
c3807f9
fix Windows CI unstable (#33606)
zhwesky2010 Jun 17, 2021
8e163f9
[Inference Tensorrt] Add attr for trt engine and handle the input seq…
jiweibo Jun 17, 2021
40b2a03
[cherry-pick 32784] Fix distro (#33638)
tianshuo78520a Jun 18, 2021
370fb10
remove check for optim_cache_dir in trt slim int8 (#32676) (#33629)
cryoco Jun 18, 2021
6ec2ea0
[cherry-pick] fix cmake expressions error #33621
Avin0323 Jun 18, 2021
bd3aa03
[XPU] Update cmake options for xpu. (#33450) (#33581)
jiweibo Jun 18, 2021
9a3d859
cherry-pick .Align the code of trt under the develop and release/2.1 …
jiweibo Jun 18, 2021
18043ab
fix the but that concat op can't support uint8 (#33667)
youth123 Jun 21, 2021
cdeffff
fix gpt2 train loss Nan problem by add a line __syncthreads in BlockR…
zhiboniu Jun 21, 2021
bf3161b
fix emb_eltwise_ln gpu_id bug (#33701) (#33706)
cryoco Jun 22, 2021
3b3bd93
add layernorm (#33610) (#33707)
ceci3 Jun 22, 2021
a029d36
[Cherry-pick] solve ANSI escape sequences print error in cmd and pow…
thisjiang Jun 22, 2021
1e62c23
Dynamic amp support sync_batch_norm op (#32770) (#33709)
sljlp Jun 22, 2021
89fdd6c
Fix wrong scale length for QkvToContext (#33763) (#33784)
b3602sss Jun 28, 2021
3749af5
[Dy2stat]Specify gast version in requirements.txt (#33850) (#33865)
Aurelius84 Jul 1, 2021
702610e
fix the opt path create error in windows, test=develop (#33853) (#33885)
winter-wang Jul 1, 2021
bedcf0d
[cherry-pick] fix bug when the cuda kernel config exceeds dims max (…
zhiqiu Jul 1, 2021
aa12737
[Dy2Stat]Support Python3 type hint (#33745) (#33914)
Aurelius84 Jul 2, 2021
adca05f
[cherry-pick2.1]polish avx/no_avx install error message (#33818) (#3…
zhwesky2010 Jul 2, 2021
50cb945
update readme test=document_fix
TCChenlong Jul 1, 2021
16ed3cc
Adjust the reviewers for the 2.1 branch (#33890)
iducn Jul 2, 2021
fe82754
cherry-pick prs. (#33932)
jiweibo Jul 5, 2021
0d6c753
[Cherry-pick][Dy2Stat] Fix unique_name in create_static_variable_gas…
Aurelius84 Jul 6, 2021
12f103a
[Cherry-Pick 33556]del python2 code (#33987)
tianshuo78520a Jul 7, 2021
f2f2fd8
[oneDNN] Fix to #33282 , added support of X input broadcasting to one…
jczaja Jul 9, 2021
8417ad6
[Cherry-pick] Up cxx11 check to cxx14 (#34015) (#34034)
chenwhql Jul 9, 2021
ed7903c
make DataLoader warning less noisy. test=develop (#34001)
heavengate Jul 9, 2021
0f266ac
cherry pick xpu to 2.1 (#34000)
taixiurong Jul 12, 2021
999c291
[Cherry-pick]Delete the function of saving layer object. (#34039)
hbwx24 Jul 12, 2021
1d1ca0f
[Cherry-Pick]Support finetuning the model saved on the MAC on the Li…
hbwx24 Jul 15, 2021
a456a1b
add the size of libpaddle_inference.so to Inference CI, test=develop …
winter-wang Jul 19, 2021
519df32
cherry-pick 34040 (#34228)
jiweibo Jul 19, 2021
8db945a
Update while loop (#34229)
TCChenlong Jul 19, 2021
4ffd339
[Cherry-pick][Dy2Stat]Support Nest sequtial container (#34246) #34262
0x45f Jul 21, 2021
0f5e0ba
【cherry-pick】add more info to tensor.grad warning message (#34264) #…
MingMingShangTian Jul 21, 2021
2041a0d
fix dataloader exit terminate error. test=develop
heavengate Jul 30, 2021
7674f51
fix format. test=develop
heavengate Jul 30, 2021
43 changes: 22 additions & 21 deletions CMakeLists.txt
@@ -209,7 +209,7 @@ option(WITH_STRIP "Strip so files of Whl packages" OFF)

# PY_VERSION
if(NOT PY_VERSION)
-set(PY_VERSION 2.7)
+set(PY_VERSION 3.6)
endif()
set(PYBIND11_PYTHON_VERSION ${PY_VERSION})

@@ -283,6 +283,27 @@ if(WITH_GPU)
endif()
endif()

+if(WITH_ROCM)
+  include(hip)
+  include(miopen) # set miopen libraries, must before configure
+endif(WITH_ROCM)
+
+if (NOT WITH_ROCM AND WITH_RCCL)
+  MESSAGE(WARNING
+    "Disable RCCL when compiling without ROCM. Force WITH_RCCL=OFF.")
+  set(WITH_RCCL OFF CACHE STRING
+    "Disable RCCL when compiling without ROCM" FORCE)
+endif()
+
+if(WITH_RCCL)
+  add_definitions("-DPADDLE_WITH_RCCL")
+  include(rccl)
+else()
+  if(WITH_ROCM)
+    MESSAGE(WARNING "If the environment is multi-card, the WITH_RCCL option needs to be turned on, otherwise only a single card can be used.")
+  endif()
+endif()
+
include(third_party) # download, build, install third_party, Contains about 20+ dependencies

include(flags) # set paddle compile flags
@@ -307,26 +328,6 @@ include(configure) # add paddle env configuration

include_directories("${PADDLE_SOURCE_DIR}")

-if(WITH_ROCM)
-  include(hip)
-endif(WITH_ROCM)
-
-if (NOT WITH_ROCM AND WITH_RCCL)
-  MESSAGE(WARNING
-    "Disable RCCL when compiling without ROCM. Force WITH_RCCL=OFF.")
-  set(WITH_RCCL OFF CACHE STRING
-    "Disable RCCL when compiling without ROCM" FORCE)
-endif()
-
-if(WITH_RCCL)
-  add_definitions("-DPADDLE_WITH_RCCL")
-  include(rccl)
-else()
-  if(WITH_ROCM)
-    MESSAGE(WARNING "If the environment is multi-card, the WITH_RCCL option needs to be turned on, otherwise only a single card can be used.")
-  endif()
-endif()
-
if(WITH_NV_JETSON)
set(WITH_ARM ON CACHE STRING "Set WITH_ARM=ON when compiling WITH_NV_JETSON=ON." FORCE)
endif()
4 changes: 2 additions & 2 deletions README.md
@@ -22,7 +22,7 @@ PaddlePaddle is originated from industrial practices with dedication and commitm

## Installation

-### Latest PaddlePaddle Release: [v2.0](https://github.com/PaddlePaddle/Paddle/tree/release/2.0)
+### Latest PaddlePaddle Release: [v2.1](https://github.com/PaddlePaddle/Paddle/tree/release/2.1)

Our vision is to enable deep learning for everyone via PaddlePaddle.
Please refer to our [release announcement](https://github.com/PaddlePaddle/Paddle/releases) to track the latest features of PaddlePaddle.
@@ -36,7 +36,7 @@ pip install paddlepaddle-gpu
```
More infomation about installation, please view [Quick Install](https://www.paddlepaddle.org.cn/install/quick)

-Now our developers can acquire Tesla V100 online computing resources for free. If you create a program by AI Studio, you will obtain 10 hours to train models online per day. [Click here to start](https://aistudio.baidu.com/aistudio/index).
+Now our developers can acquire Tesla V100 online computing resources for free. If you create a program by AI Studio, you will obtain 8 hours to train models online per day. [Click here to start](https://aistudio.baidu.com/aistudio/index).

## FOUR LEADING TECHNOLOGIES

4 changes: 2 additions & 2 deletions README_cn.md
@@ -19,7 +19,7 @@

## 安装

-### PaddlePaddle最新版本: [v2.0](https://github.com/PaddlePaddle/Paddle/tree/release/2.0)
+### PaddlePaddle最新版本: [v2.1](https://github.com/PaddlePaddle/Paddle/tree/release/2.1)

跟进PaddlePaddle最新特性请参考我们的[版本说明](https://github.com/PaddlePaddle/Paddle/releases)

@@ -32,7 +32,7 @@ pip install paddlepaddle-gpu
```
更多安装信息详见官网 [安装说明](https://www.paddlepaddle.org.cn/install/quick)

-PaddlePaddle用户可领取**免费Tesla V100在线算力资源**,训练模型更高效。**每日登陆即送10小时**,[前往使用免费算力](https://aistudio.baidu.com/aistudio/index)。
+PaddlePaddle用户可领取**免费Tesla V100在线算力资源**,训练模型更高效。**每日登陆即送8小时**,[前往使用免费算力](https://aistudio.baidu.com/aistudio/index)。

## 四大领先技术

8 changes: 8 additions & 0 deletions cmake/configure.cmake
@@ -143,6 +143,14 @@ elseif(WITH_ROCM)
add_definitions(-DPADDLE_WITH_HIP)
add_definitions(-DEIGEN_USE_GPU)
add_definitions(-DEIGEN_USE_HIP)

+if(NOT MIOPEN_FOUND)
+  message(FATAL_ERROR "Paddle needs MIOpen to compile")
+endif()
+
+if(${MIOPEN_VERSION} VERSION_LESS 2090)
+  message(FATAL_ERROR "Paddle needs MIOPEN >= 2.9 to compile")
+endif()
else()
add_definitions(-DHPPL_STUB_FUNC)
list(APPEND CMAKE_CXX_SOURCE_FILE_EXTENSIONS cu)
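The threshold `2090` in the hunk above is paired with the message "MIOPEN >= 2.9", which suggests that `MIOPEN_VERSION` is a packed integer. The sketch below assumes an encoding of `major * 1000 + minor * 10 + patch`; that formula is an assumption that fits the pair 2.9 and 2090, and it is not shown in this diff. The three component variables are hypothetical sample inputs.

```cmake
# Run in script mode: cmake -P miopen_version_sketch.cmake
cmake_minimum_required(VERSION 3.15)

# Hypothetical sample inputs; a real build would read them from the MIOpen headers.
set(MIOPEN_MAJOR_VERSION 2)
set(MIOPEN_MINOR_VERSION 9)
set(MIOPEN_PATCH_VERSION 0)

# Assumed packing (not confirmed by the diff): major * 1000 + minor * 10 + patch.
math(EXPR MIOPEN_VERSION
  "${MIOPEN_MAJOR_VERSION} * 1000 + ${MIOPEN_MINOR_VERSION} * 10 + ${MIOPEN_PATCH_VERSION}")

# Same style of comparison as the hunk: reject anything below the packed form of 2.9.
if(MIOPEN_VERSION VERSION_LESS 2090)
  message(FATAL_ERROR "MIOpen ${MIOPEN_VERSION} is older than the required 2090")
else()
  message(STATUS "MIOpen packed version ${MIOPEN_VERSION} passes the check")
endif()
```

Under this assumed encoding, changing the sample minor version to 8 would make the script stop with the fatal error.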
18 changes: 15 additions & 3 deletions cmake/cuda.cmake
@@ -95,11 +95,23 @@ function(select_nvcc_arch_flags out_variable)
if(${CUDA_ARCH_NAME} STREQUAL "Kepler")
set(cuda_arch_bin "30 35")
elseif(${CUDA_ARCH_NAME} STREQUAL "Maxwell")
-set(cuda_arch_bin "50")
+if (WITH_NV_JETSON)
+  set(cuda_arch_bin "53")
+else()
+  set(cuda_arch_bin "50")
+endif()
elseif(${CUDA_ARCH_NAME} STREQUAL "Pascal")
-set(cuda_arch_bin "60 61")
+if (WITH_NV_JETSON)
+  set(cuda_arch_bin "62")
+else()
+  set(cuda_arch_bin "60 61")
+endif()
elseif(${CUDA_ARCH_NAME} STREQUAL "Volta")
-set(cuda_arch_bin "70")
+if (WITH_NV_JETSON)
+  set(cuda_arch_bin "72")
+else()
+  set(cuda_arch_bin "70")
+endif()
elseif(${CUDA_ARCH_NAME} STREQUAL "Turing")
set(cuda_arch_bin "75")
elseif(${CUDA_ARCH_NAME} STREQUAL "Ampere")
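The hunk above picks a single Jetson-specific compute capability when `WITH_NV_JETSON` is set. The values 53, 62 and 72 are the capabilities usually associated with Jetson Nano/TX1, TX2 and Xavier boards. A minimal sketch of that mapping as a standalone function follows; `sketch_arch_bin` and its parameters are hypothetical names for illustration, not part of the PR.

```cmake
# Run in script mode: cmake -P jetson_arch_sketch.cmake
cmake_minimum_required(VERSION 3.15)

# sketch_arch_bin is a hypothetical helper that mirrors the values in the hunk.
function(sketch_arch_bin out_var arch_name on_jetson)
  if(arch_name STREQUAL "Maxwell")
    if(on_jetson)
      set(bin "53")
    else()
      set(bin "50")
    endif()
  elseif(arch_name STREQUAL "Pascal")
    if(on_jetson)
      set(bin "62")
    else()
      set(bin "60 61")
    endif()
  elseif(arch_name STREQUAL "Volta")
    if(on_jetson)
      set(bin "72")
    else()
      set(bin "70")
    endif()
  else()
    message(FATAL_ERROR "Architecture not covered by this sketch: ${arch_name}")
  endif()
  # Hand the selected value back to the caller's scope.
  set(${out_var} "${bin}" PARENT_SCOPE)
endfunction()

sketch_arch_bin(desktop_pascal "Pascal" OFF)
sketch_arch_bin(jetson_pascal "Pascal" ON)
message(STATUS "Pascal, desktop build: ${desktop_pascal}")
message(STATUS "Pascal, Jetson build:  ${jetson_pascal}")
```

Building for one architecture instead of two on Jetson presumably shortens compile time and shrinks the binary, which would match the commit title "Cherry pick deconv & jetson single arch".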
30 changes: 22 additions & 8 deletions cmake/external/lite.cmake
@@ -18,13 +18,21 @@ if(NOT LINUX)
return()
endif()

-if(XPU_SDK_ROOT)
-  set(LITE_WITH_XPU ON)
-  include_directories("${XPU_SDK_ROOT}/XTDK/include")
-  include_directories("${XPU_SDK_ROOT}/XTCL/include")
+if (LITE_WITH_XPU)
add_definitions(-DLITE_SUBGRAPH_WITH_XPU)
-LINK_DIRECTORIES("${XPU_SDK_ROOT}/XTDK/shlib/")
-LINK_DIRECTORIES("${XPU_SDK_ROOT}/XTDK/runtime/shlib/")
+IF(WITH_AARCH64)
+  SET(XPU_SDK_ENV "kylin_aarch64")
+ELSEIF(WITH_SUNWAY)
+  SET(XPU_SDK_ENV "deepin_sw6_64")
+ELSEIF(WITH_BDCENTOS)
+  SET(XPU_SDK_ENV "bdcentos_x86_64")
+ELSEIF(WITH_UBUNTU)
+  SET(XPU_SDK_ENV "ubuntu_x86_64")
+ELSEIF(WITH_CENTOS)
+  SET(XPU_SDK_ENV "centos7_x86_64")
+ELSE ()
+  SET(XPU_SDK_ENV "ubuntu_x86_64")
+ENDIF()
endif()

if (NOT LITE_SOURCE_DIR OR NOT LITE_BINARY_DIR)
@@ -57,7 +65,8 @@ if (NOT LITE_SOURCE_DIR OR NOT LITE_BINARY_DIR)
  -DWITH_TESTING=OFF
  -DLITE_BUILD_EXTRA=ON
  -DLITE_WITH_XPU=${LITE_WITH_XPU}
- -DXPU_SDK_ROOT=${XPU_SDK_ROOT}
+ -DXPU_SDK_URL=${XPU_BASE_URL}
+ -DXPU_SDK_ENV=${XPU_SDK_ENV}
  -DLITE_WITH_CODE_META_INFO=OFF
  -DLITE_WITH_ARM=ON)
ExternalProject_Add(
@@ -99,7 +108,8 @@ if (NOT LITE_SOURCE_DIR OR NOT LITE_BINARY_DIR)
  -DLITE_WITH_STATIC_CUDA=OFF
  -DCUDA_ARCH_NAME=${CUDA_ARCH_NAME}
  -DLITE_WITH_XPU=${LITE_WITH_XPU}
- -DXPU_SDK_ROOT=${XPU_SDK_ROOT}
+ -DXPU_SDK_URL=${XPU_BASE_URL}
+ -DXPU_SDK_ENV=${XPU_SDK_ENV}
  -DLITE_WITH_CODE_META_INFO=OFF
  -DLITE_WITH_ARM=OFF)

@@ -147,6 +157,10 @@ message(STATUS "Paddle-lite BINARY_DIR: ${LITE_BINARY_DIR}")
message(STATUS "Paddle-lite SOURCE_DIR: ${LITE_SOURCE_DIR}")
include_directories(${LITE_SOURCE_DIR})
include_directories(${LITE_BINARY_DIR})
+if(LITE_WITH_XPU)
+  include_directories(${LITE_BINARY_DIR}/third_party/install/xpu/xdnn/include/)
+  include_directories(${LITE_BINARY_DIR}/third_party/install/xpu/xre/include/)
+endif()

function(external_lite_libs alias path)
add_library(${alias} SHARED IMPORTED GLOBAL)
27 changes: 21 additions & 6 deletions cmake/external/warpctc.cmake
@@ -78,6 +78,21 @@ if(WITH_ASCEND OR WITH_ASCEND_CL)
  -DCMAKE_INSTALL_PREFIX:PATH=${WARPCTC_INSTALL_DIR}
  )
  else()
+ if(WIN32)
+   set(WARPCTC_C_FLAGS $<FILTER:${CMAKE_C_FLAGS},EXCLUDE,/Zc:inline>)
+   set(WARPCTC_C_FLAGS_DEBUG $<FILTER:${CMAKE_C_FLAGS_DEBUG},EXCLUDE,/Zc:inline>)
+   set(WARPCTC_C_FLAGS_RELEASE $<FILTER:${CMAKE_C_FLAGS_RELEASE},EXCLUDE,/Zc:inline>)
+   set(WARPCTC_CXX_FLAGS $<FILTER:${CMAKE_CXX_FLAGS},EXCLUDE,/Zc:inline>)
+   set(WARPCTC_CXX_FLAGS_RELEASE $<FILTER:${CMAKE_CXX_FLAGS_RELEASE},EXCLUDE,/Zc:inline>)
+   set(WARPCTC_CXX_FLAGS_DEBUG $<FILTER:${CMAKE_CXX_FLAGS_DEBUG},EXCLUDE,/Zc:inline>)
+ else()
+   set(WARPCTC_C_FLAGS ${CMAKE_C_FLAGS})
+   set(WARPCTC_C_FLAGS_DEBUG ${CMAKE_C_FLAGS_DEBUG})
+   set(WARPCTC_C_FLAGS_RELEASE ${CMAKE_C_FLAGS_RELEASE})
+   set(WARPCTC_CXX_FLAGS ${CMAKE_CXX_FLAGS})
+   set(WARPCTC_CXX_FLAGS_RELEASE ${CMAKE_CXX_FLAGS_RELEASE})
+   set(WARPCTC_CXX_FLAGS_DEBUG ${CMAKE_CXX_FLAGS_DEBUG})
+ endif()
  ExternalProject_Add(
  extern_warpctc
  ${EXTERNAL_PROJECT_LOG_ARGS}
@@ -90,12 +105,12 @@ else()
  BUILD_ALWAYS 1
  CMAKE_ARGS -DCMAKE_CXX_COMPILER=${CMAKE_CXX_COMPILER}
  -DCMAKE_C_COMPILER=${CMAKE_C_COMPILER}
- -DCMAKE_C_FLAGS=$<FILTER:${CMAKE_C_FLAGS},EXCLUDE,/Zc:inline>
- -DCMAKE_C_FLAGS_DEBUG=$<FILTER:${CMAKE_C_FLAGS_DEBUG},EXCLUDE,/Zc:inline>
- -DCMAKE_C_FLAGS_RELEASE=$<FILTER:${CMAKE_C_FLAGS_RELEASE},EXCLUDE,/Zc:inline>
- -DCMAKE_CXX_FLAGS=$<FILTER:${CMAKE_CXX_FLAGS},EXCLUDE,/Zc:inline>
- -DCMAKE_CXX_FLAGS_RELEASE=$<FILTER:${CMAKE_CXX_FLAGS_RELEASE},EXCLUDE,/Zc:inline>
- -DCMAKE_CXX_FLAGS_DEBUG=$<FILTER:${CMAKE_CXX_FLAGS_DEBUG},EXCLUDE,/Zc:inline>
+ -DCMAKE_C_FLAGS=${WARPCTC_C_FLAGS}
+ -DCMAKE_C_FLAGS_DEBUG=${WARPCTC_C_FLAGS_DEBUG}
+ -DCMAKE_C_FLAGS_RELEASE=${WARPCTC_C_FLAGS_RELEASE}
+ -DCMAKE_CXX_FLAGS=${WARPCTC_CXX_FLAGS}
+ -DCMAKE_CXX_FLAGS_RELEASE=${WARPCTC_CXX_FLAGS_RELEASE}
+ -DCMAKE_CXX_FLAGS_DEBUG=${WARPCTC_CXX_FLAGS_DEBUG}
  -DCMAKE_INSTALL_PREFIX=${WARPCTC_INSTALL_DIR}
  -DWITH_GPU=${WITH_GPU}
  -DWITH_ROCM=${WITH_ROCM}
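The hunks above limit the `$<FILTER:...>` generator expression to Windows builds and pass the flags through unchanged elsewhere. `$<FILTER:...>` was introduced in CMake 3.15, and `/Zc:inline` is an MSVC option, which is presumably why the filter is guarded by `WIN32`. As an alternative for illustration only, a flag can also be removed at configure time with `list(FILTER)`. The helper name `sketch_strip_flag` and the sample flag string are hypothetical, and this is not the approach the PR takes.

```cmake
# Run in script mode: cmake -P strip_flag_sketch.cmake
cmake_minimum_required(VERSION 3.15)

# sketch_strip_flag is a hypothetical helper. It treats `flags` as a
# space-separated string and drops every entry that equals `flag`.
# Note: `flag` is used inside a regex, so regex metacharacters would need escaping.
function(sketch_strip_flag out_var flags flag)
  # Turn the space-separated string into a CMake list.
  string(REPLACE " " ";" flag_list "${flags}")
  if(NOT "${flag_list}" STREQUAL "")
    list(FILTER flag_list EXCLUDE REGEX "^${flag}$")
  endif()
  # Join the remaining entries back into a space-separated string.
  string(REPLACE ";" " " result "${flag_list}")
  set(${out_var} "${result}" PARENT_SCOPE)
endfunction()

# Sample MSVC-style flag string, chosen only for the demonstration.
sketch_strip_flag(filtered_flags "/W3 /Zc:inline /EHsc" "/Zc:inline")
message(STATUS "Flags after filtering: ${filtered_flags}")
```

The generator-expression form used in the diff is evaluated at generate time, while this variant runs when the script is processed, so the two are not drop-in replacements for each other.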
118 changes: 69 additions & 49 deletions cmake/external/xpu.cmake
@@ -7,52 +7,72 @@ SET(XPU_PROJECT "extern_xpu")
SET(XPU_API_LIB_NAME "libxpuapi.so")
SET(XPU_RT_LIB_NAME "libxpurt.so")

-if(NOT XPU_SDK_ROOT)
-  if (WITH_AARCH64)
-    SET(XPU_URL "https://baidu-kunlun-public.su.bcebos.com/paddle_depence/aarch64/xpu_2021_01_13.tar.gz" CACHE STRING "" FORCE)
-  elseif(WITH_SUNWAY)
-    SET(XPU_URL "https://baidu-kunlun-public.su.bcebos.com/paddle_depence/sunway/xpu_2021_01_13.tar.gz" CACHE STRING "" FORCE)
-  else()
-    SET(XPU_URL "https://baidu-kunlun-public.su.bcebos.com/paddle_depence/xpu_2021_04_09.tar.gz" CACHE STRING "" FORCE)
-  endif()
-
-  SET(XPU_SOURCE_DIR "${THIRD_PARTY_PATH}/xpu")
-  SET(XPU_DOWNLOAD_DIR "${XPU_SOURCE_DIR}/src/${XPU_PROJECT}")
-  SET(XPU_INSTALL_DIR "${THIRD_PARTY_PATH}/install/xpu")
-  SET(XPU_API_INC_DIR "${THIRD_PARTY_PATH}/install/xpu/include")
-  SET(XPU_LIB_DIR "${THIRD_PARTY_PATH}/install/xpu/lib")
-
-  SET(XPU_API_LIB "${XPU_LIB_DIR}/${XPU_API_LIB_NAME}")
-  SET(XPU_RT_LIB "${XPU_LIB_DIR}/${XPU_RT_LIB_NAME}")
-
-  SET(CMAKE_INSTALL_RPATH "${CMAKE_INSTALL_RPATH}" "${XPU_INSTALL_DIR}/lib")
-
-  FILE(WRITE ${XPU_DOWNLOAD_DIR}/CMakeLists.txt
-    "PROJECT(XPU)\n"
-    "cmake_minimum_required(VERSION 3.0)\n"
-    "install(DIRECTORY xpu/include xpu/lib \n"
-    " DESTINATION ${XPU_INSTALL_DIR})\n")
-
-  ExternalProject_Add(
-    ${XPU_PROJECT}
-    ${EXTERNAL_PROJECT_LOG_ARGS}
-    PREFIX ${XPU_SOURCE_DIR}
-    DOWNLOAD_DIR ${XPU_DOWNLOAD_DIR}
-    DOWNLOAD_COMMAND wget --no-check-certificate ${XPU_URL} -c -q -O xpu.tar.gz
-    && tar xvf xpu.tar.gz
-    DOWNLOAD_NO_PROGRESS 1
-    UPDATE_COMMAND ""
-    CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${XPU_INSTALL_ROOT}
-    CMAKE_CACHE_ARGS -DCMAKE_INSTALL_PREFIX:PATH=${XPU_INSTALL_ROOT}
-  )
-else()
-  SET(XPU_API_INC_DIR "${XPU_SDK_ROOT}/XTDK/include/")
-  SET(XPU_API_LIB "${XPU_SDK_ROOT}/XTDK/shlib/libxpuapi.so")
-  SET(XPU_RT_LIB "${XPU_SDK_ROOT}/XTDK/runtime/shlib/libxpurt.so")
-  SET(XPU_LIB_DIR "${XPU_SDK_ROOT}/XTDK/shlib/")
-endif()
+IF(WITH_AARCH64)
+  SET(XPU_XRE_DIR_NAME "xre-kylin_aarch64")
+  SET(XPU_XDNN_DIR_NAME "xdnn-kylin_aarch64")
+  SET(XPU_XCCL_DIR_NAME "xccl-kylin_aarch64")
+ELSEIF(WITH_SUNWAY)
+  SET(XPU_XRE_DIR_NAME "xre-deepin_sw6_64")
+  SET(XPU_XDNN_DIR_NAME "xdnn-deepin_sw6_64")
+  SET(XPU_XCCL_DIR_NAME "xccl-deepin_sw6_64")
+ELSEIF(WITH_BDCENTOS)
+  SET(XPU_XRE_DIR_NAME "xre-bdcentos_x86_64")
+  SET(XPU_XDNN_DIR_NAME "xdnn-bdcentos_x86_64")
+  SET(XPU_XCCL_DIR_NAME "xccl-bdcentos_x86_64")
+ELSEIF(WITH_UBUNTU)
+  SET(XPU_XRE_DIR_NAME "xre-ubuntu_x86_64")
+  SET(XPU_XDNN_DIR_NAME "xdnn-ubuntu_x86_64")
+  SET(XPU_XCCL_DIR_NAME "xccl-bdcentos_x86_64")
+ELSEIF(WITH_CENTOS)
+  SET(XPU_XRE_DIR_NAME "xre-centos7_x86_64")
+  SET(XPU_XDNN_DIR_NAME "xdnn-centos7_x86_64")
+  SET(XPU_XCCL_DIR_NAME "xccl-bdcentos_x86_64")
+
+ELSE ()
+  SET(XPU_XRE_DIR_NAME "xre-ubuntu_x86_64")
+  SET(XPU_XDNN_DIR_NAME "xdnn-ubuntu_x86_64")
+  SET(XPU_XCCL_DIR_NAME "xccl-bdcentos_x86_64")
+ENDIF()
+
+SET(XPU_BASE_URL_WITHOUT_DATE "https://baidu-kunlun-product.cdn.bcebos.com/KL-SDK/klsdk-dev")
+SET(XPU_BASE_URL "${XPU_BASE_URL_WITHOUT_DATE}/20210701")
+SET(XPU_XRE_URL "${XPU_BASE_URL}/${XPU_XRE_DIR_NAME}.tar.gz" CACHE STRING "" FORCE)
+SET(XPU_XDNN_URL "${XPU_BASE_URL}/${XPU_XDNN_DIR_NAME}.tar.gz" CACHE STRING "" FORCE)
+SET(XPU_XCCL_URL "${XPU_BASE_URL_WITHOUT_DATE}/20210623/${XPU_XCCL_DIR_NAME}.tar.gz" CACHE STRING "" FORCE)
+SET(XPU_PACK_DEPENCE_URL "https://baidu-kunlun-public.su.bcebos.com/paddle_depence/pack_paddle_depence.sh" CACHE STRING "" FORCE)
+
+SET(XPU_SOURCE_DIR "${THIRD_PARTY_PATH}/xpu")
+SET(XPU_DOWNLOAD_DIR "${XPU_SOURCE_DIR}/src/${XPU_PROJECT}")
+SET(XPU_INSTALL_DIR "${THIRD_PARTY_PATH}/install/xpu")
+SET(XPU_INC_DIR "${THIRD_PARTY_PATH}/install/xpu/include")
+SET(XPU_LIB_DIR "${THIRD_PARTY_PATH}/install/xpu/lib")
+
+SET(XPU_API_LIB "${XPU_LIB_DIR}/${XPU_API_LIB_NAME}")
+SET(XPU_RT_LIB "${XPU_LIB_DIR}/${XPU_RT_LIB_NAME}")
+
+SET(CMAKE_INSTALL_RPATH "${CMAKE_INSTALL_RPATH}" "${XPU_INSTALL_DIR}/lib")
+
+FILE(WRITE ${XPU_DOWNLOAD_DIR}/CMakeLists.txt
+  "PROJECT(XPU)\n"
+  "cmake_minimum_required(VERSION 3.0)\n"
+  "install(DIRECTORY xpu/include xpu/lib \n"
+  " DESTINATION ${XPU_INSTALL_DIR})\n")
+
+ExternalProject_Add(
+  ${XPU_PROJECT}
+  ${EXTERNAL_PROJECT_LOG_ARGS}
+  PREFIX ${XPU_SOURCE_DIR}
+  DOWNLOAD_DIR ${XPU_DOWNLOAD_DIR}
+  DOWNLOAD_COMMAND wget ${XPU_PACK_DEPENCE_URL}
+  && bash pack_paddle_depence.sh ${XPU_XRE_URL} ${XPU_XRE_DIR_NAME} ${XPU_XDNN_URL} ${XPU_XDNN_DIR_NAME} ${XPU_XCCL_URL} ${XPU_XCCL_DIR_NAME}
+
+  DOWNLOAD_NO_PROGRESS 1
+  UPDATE_COMMAND ""
+  CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${XPU_INSTALL_ROOT}
+  CMAKE_CACHE_ARGS -DCMAKE_INSTALL_PREFIX:PATH=${XPU_INSTALL_ROOT}
+)

-INCLUDE_DIRECTORIES(${XPU_API_INC_DIR})
+INCLUDE_DIRECTORIES(${XPU_INC_DIR})
ADD_LIBRARY(shared_xpuapi SHARED IMPORTED GLOBAL)
set_property(TARGET shared_xpuapi PROPERTY IMPORTED_LOCATION "${XPU_API_LIB}")

@@ -62,7 +82,7 @@ generate_dummy_static_lib(LIB_NAME "xpulib" GENERATOR "xpu.cmake")

TARGET_LINK_LIBRARIES(xpulib ${XPU_API_LIB} ${XPU_RT_LIB})

-if (WITH_XPU_BKCL)
+IF(WITH_XPU_BKCL)
MESSAGE(STATUS "Compile with XPU BKCL!")
ADD_DEFINITIONS(-DPADDLE_WITH_XPU_BKCL)

@@ -71,9 +91,9 @@ if (WITH_XPU_BKCL)
SET(XPU_BKCL_INC_DIR "${THIRD_PARTY_PATH}/install/xpu/include")
INCLUDE_DIRECTORIES(${XPU_BKCL_INC_DIR})
TARGET_LINK_LIBRARIES(xpulib ${XPU_API_LIB} ${XPU_RT_LIB} ${XPU_BKCL_LIB})
-else(WITH_XPU_BKCL)
-TARGET_LINK_LIBRARIES(xpulib ${XPU_API_LIB} ${XPU_RT_LIB} )
-endif(WITH_XPU_BKCL)
+ELSE(WITH_XPU_BKCL)
+TARGET_LINK_LIBRARIES(xpulib ${XPU_API_LIB} ${XPU_RT_LIB})
+ENDIF(WITH_XPU_BKCL)

if(NOT XPU_SDK_ROOT)
ADD_DEPENDENCIES(xpulib ${XPU_PROJECT})
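To make the new download layout in `cmake/external/xpu.cmake` easier to follow, the sketch below reproduces only the string composition for the fallback (`ELSE`) branch and prints the three archive URLs. Every value is copied from the diff above; only the `message` calls and the script wrapper are added for illustration.

```cmake
# Run in script mode: cmake -P xpu_url_sketch.cmake
cmake_minimum_required(VERSION 3.15)

# Directory names from the ELSE() branch of the diff.
set(XPU_XRE_DIR_NAME "xre-ubuntu_x86_64")
set(XPU_XDNN_DIR_NAME "xdnn-ubuntu_x86_64")
set(XPU_XCCL_DIR_NAME "xccl-bdcentos_x86_64")

# Base URLs from the diff; XRE and XDNN use the 20210701 snapshot.
set(XPU_BASE_URL_WITHOUT_DATE "https://baidu-kunlun-product.cdn.bcebos.com/KL-SDK/klsdk-dev")
set(XPU_BASE_URL "${XPU_BASE_URL_WITHOUT_DATE}/20210701")

set(XPU_XRE_URL "${XPU_BASE_URL}/${XPU_XRE_DIR_NAME}.tar.gz")
set(XPU_XDNN_URL "${XPU_BASE_URL}/${XPU_XDNN_DIR_NAME}.tar.gz")
# XCCL is pinned to a different snapshot date (20210623) in the diff.
set(XPU_XCCL_URL "${XPU_BASE_URL_WITHOUT_DATE}/20210623/${XPU_XCCL_DIR_NAME}.tar.gz")

message(STATUS "XRE:  ${XPU_XRE_URL}")
message(STATUS "XDNN: ${XPU_XDNN_URL}")
message(STATUS "XCCL: ${XPU_XCCL_URL}")
```

As the diff is written, the Ubuntu, CentOS and fallback branches all reuse the `xccl-bdcentos_x86_64` archive name while XRE and XDNN follow the selected platform.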