
update #22

Merged: 57 commits into AnnaTrainingG:develop on Aug 2, 2021
Conversation

AnnaTrainingG (Owner)

PR types

PR changes

Describe

thisjiang and others added 30 commits July 28, 2021 10:14
When a Graph has sub-graphs, apply the pass to it and to all of its sub-graphs. Also add a single test script.
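
A minimal, purely illustrative Python sketch of the recursive idea described above; `sub_graphs` and `apply_pass` are hypothetical stand-ins, not Paddle's actual IR Graph/Pass API:

```python
def apply_pass_recursively(graph, apply_pass):
    """Apply `apply_pass` to `graph`, then to every sub-graph it owns."""
    apply_pass(graph)
    # `sub_graphs` is a hypothetical attribute standing in for however the
    # real IR Graph stores its sub-graphs.
    for sub_graph in getattr(graph, "sub_graphs", []):
        apply_pass_recursively(sub_graph, apply_pass)
```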
* [NPU] add NPU ops & UTs of compare, test=develop

* testing

* try style-format

* [NPU] update compare_op_npu uts

* [NPU] fix code style of test_compare_op_npu.py
This PR adds the optional booleans is_parameter and stop_gradient to the VarDesc proto, and removes them during save_inference_model.
* trt buildEngineWithConfig is deprecated

* add trt version control
* support ScaleTensor for scale npu kernel

* add more tests for adam npu

* fix compile

* fix unittest

* refine adam optimizer
* Support C++ import python on windows for paddle

* Support C++ import python on windows for paddle
* Add build_strategy in @to_static to support open pass

* fix os.environ

* add timeout

* disable test_build_strategy on openblas
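
A rough usage sketch for the build_strategy change above, assuming `paddle.jit.to_static` accepts a `build_strategy` argument as this commit describes; the specific strategy flag is only an example:

```python
import paddle

build_strategy = paddle.static.BuildStrategy()
# Example flag only; enable whichever graph passes are actually needed.
build_strategy.fuse_elewise_add_act_ops = True

@paddle.jit.to_static(build_strategy=build_strategy)
def forward(x):
    return paddle.nn.functional.relu(x + 1.0)

out = forward(paddle.rand([4, 8]))
```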
* tile op

* more uts

* disable tile if trt6.0

* typo

* fix timeout issue

* opteller

* opteller remove duplicate code

* comments.	test=document_fix

* modify PADDLE_ENFORCE.

* fix reduce_mean issue
* add input option in model.summary
* add persistent_workers. test=develop
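
A hedged usage sketch for the persistent_workers commit above, assuming `paddle.io.DataLoader` exposes a `persistent_workers` flag as the commit title suggests:

```python
import paddle
from paddle.io import DataLoader, TensorDataset

dataset = TensorDataset([paddle.rand([100, 16]), paddle.randint(0, 2, [100, 1])])
# persistent_workers keeps worker processes alive across epochs instead of
# respawning them, which mainly saves worker startup cost when num_workers > 0.
loader = DataLoader(dataset, batch_size=10, num_workers=2, persistent_workers=True)

for epoch in range(2):
    for batch in loader:
        pass
```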
* fix paddle.summary's bug when output contains non-tensor
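
A hedged sketch for the two model.summary commits in this list, assuming the new `input` option lets a concrete tensor be passed instead of an `input_size` shape (the argument name is taken from the commit title):

```python
import paddle

net = paddle.nn.Sequential(paddle.nn.Linear(16, 8), paddle.nn.ReLU())

# Existing form: describe the input by shape.
paddle.summary(net, input_size=(1, 16))

# Form suggested by this PR (assumed): pass a real tensor via `input`,
# useful when the network needs inputs that a plain shape cannot describe.
x = paddle.rand([1, 16])
paddle.summary(net, input=x)
```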
* graph engine demo

* upload unsaved changes

* fix dependency error

* fix shard_num problem

* py client

* remove lock and graph-type

* add load direct graph

* add load direct graph

* add load direct graph

* batch random_sample

* batch_sample_k

* fix num_nodes size

* batch brpc

* batch brpc

* add test

* add test

* add load_nodes; change add_node function

* change sample return type to pair

* resolve conflict

* resolved conflict

* resolved conflict

* separate server and client

* merge pair type

* fix

* resolved conflict

* fixed segmentation fault; high-level VLOG for load edges and load nodes

* random_sample return 0

* rm useless loop

* test:load edge

* fix ret -1

* test: rm sample

* rm sample

* random_sample return future

* random_sample return int

* test fake node

* fixed here

* memory leak

* remove test code

* fix return problem

* add common_graph_table

* random sample node & test & change data structure from linked list to vector

* add common_graph_table

* sample with srand

* add node_types

* optimize nodes sample

* recover test

* random sample

* destruct weighted sampler

* GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* WeightedGraphEdgeBlob to GraphEdgeBlob

* pybind sample nodes api

* pull nodes with step

* fixed pull_graph_list bug; add test for pull_graph_list by step

* add graph table;name

* add graph table;name

* add pybind

* add pybind

* add FeatureNode

* add FeatureNode

* add FeatureNode Serialize

* add FeatureNode Serialize

* get_feat_node

* avoid local rpc

* fix get_node_feat

* fix get_node_feat

* remove log

* get_node_feat returns py::bytes

* merge develop with graph_engine

* fix threadpool.h head

* fix

* fix typo

* resolve conflict

* fix conflict

* recover lost content

* fix pybind of FeatureNode

* recover cmake

* recover tools

* resolve conflict

* resolve linking problem

* code style

* change test_server port

* fix code problems

* remove shard_num config

* remove redundant threads

* optimize start server

* remove logs

* fix code problems by reviewers' suggestions

* move graph files into a folder

* code style change

* remove graph operations from base table

* optimize get_feat function of graph engine

* fix long long count problem

* remove redundant graph files

* remove unused shell

* recover dropout_op_pass.h

* fix potential stack overflow when request number is too large & node add & node clear & node remove

* when sample k is larger than neighbor num, return directly

* using random seed generator of paddle to speed up

* fix bug of random sample k

* fix code style

* fix code style

* fix blocking_queue problem

* fix style

* fix

* recover capacity check

Co-authored-by: Huang Zhengjie <270018958@qq.com>
Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
Co-authored-by: suweiyue <suweiyue@baidu.com>
Co-authored-by: luobin06 <luobin06@baidu.com>
Co-authored-by: liweibin02 <liweibin02@baidu.com>
Co-authored-by: tangwei12 <tangwei12@baidu.com>
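
Several of the graph-engine commits above concern random neighbor sampling ('fix bug of random sample k', 'when sample k is larger than neighbor num, return directly'). A purely illustrative Python sketch of that sampling rule, not the engine's actual C++/pybind API:

```python
import random

def sample_neighbors(neighbors, k, seed=None):
    """Return k distinct neighbors; if k >= len(neighbors), return them all directly."""
    if k >= len(neighbors):
        return list(neighbors)
    rng = random.Random(seed)
    return rng.sample(neighbors, k)

print(sample_neighbors([10, 11, 12, 13], k=2, seed=7))
print(sample_neighbors([10, 11], k=5))  # fewer neighbors than k: returned as-is
```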
qili93 and others added 27 commits July 29, 2021 20:50
* add fix op run order pass

* add ut for fix_op_run_order

* fix ci error

* improve coverage

* improve coverage again and fix CPU test case

* follow some comments
* fix lr in param group

* add unittest for adamw
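
A hedged sketch of per-group learning rates with `paddle.optimizer.AdamW`, the area the two commits above touch; the group dict keys follow Paddle's parameter-group convention, and the concrete values are illustrative:

```python
import paddle

linear1 = paddle.nn.Linear(8, 8)
linear2 = paddle.nn.Linear(8, 2)

opt = paddle.optimizer.AdamW(
    learning_rate=0.1,
    parameters=[
        {"params": linear1.parameters()},                          # uses the global setting
        {"params": linear2.parameters(), "learning_rate": 0.01},   # per-group learning rate
    ],
    weight_decay=0.01,
)

loss = linear2(paddle.nn.functional.relu(linear1(paddle.rand([4, 8])))).mean()
loss.backward()
opt.step()
opt.clear_grad()
```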
* Support setitem by None index

* remove unreachable code

* Add checkpoint for set_value_op because a new attribute was added
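
A small sketch of what 'Support setitem by None index' could allow, assuming `None` in an assignment index behaves like `numpy.newaxis`; this is inferred from the commit title, not a documented guarantee:

```python
import paddle

x = paddle.zeros([3, 4])
# Assumed behaviour: the None axis is inserted into the left-hand-side view,
# and the right-hand side broadcasts into it, as in NumPy.
x[1, None] = paddle.ones([1, 4])
print(x)
```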
* fix force kill for elastic
* [NPU] add clip and clip_grad on NPU, test=develop

* address review comments, test=develop

* update, test=develop
* added expand_v2 bf16/fp32 kernel

* minor change

* CI fix

* added missing test file

* added formatting

* reduced binary size

* CI fix
* add trainer desc config to distributed strategy

* code style modified
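
A hedged sketch for the trainer-desc commit above, assuming fleet's `DistributedStrategy` exposes the config as a dict-style `trainer_desc_configs` field; the field and key names are inferred from the commit and should be treated as assumptions:

```python
import paddle.distributed.fleet as fleet

strategy = fleet.DistributedStrategy()
# Assumed dict-style config; the keys are illustrative of what a trainer-desc
# configuration typically carries (dump path / fields), not a specification.
strategy.trainer_desc_configs = {
    "dump_fields_path": "/tmp/dump",
    "dump_fields": ["embedding_0.tmp_0"],
    "dump_param": [],
}
```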
…ls (#34219)

* test version of matmul_v2

* added matmul_v2 grad kernel

* minor changes

* minor changes

* minor change for CI approval

* CI fix

* CI fix

* added squeeze and squeeze2 kernels

* CI fix

* CI fix

* CI fix

* disabled tests when compiled with cuda

* added setting format_tag by strides

* added sigmoid BF16 FWD/BWD and gelu BF16 BWD

* changes after review

* Revert "added sigmoid BF16 FWD/BWD and gelu BF16 BWD"

This reverts commit 6e3f767.

* Revert "Merge branch 'matmul_v2_grad' into squeeze2_op"

This reverts commit 06fcf67, reversing
changes made to 6e3f767.

* minor change

* added reshape1/2 kernels

* moved some functions into private block

* CI fix

* CI fix

* CI fix
* test version of matmul_v2

* added matmul_v2 grad kernel

* minor changes

* minor changes

* minor change for CI approval

* CI fix

* CI fix

* trigger CI

* changes after review, not working yet

* moved ops to anonymous namespaces

* changes after review
* add resnet50 trt test in pr-ci-inference test
The comment background message is too long, see details at #34521
* [NPU] add reduce_max

* [NPU] delete skipIf

* [NPU] add attrs support or check

* [NPU] add attr out_dtype

* [NPU] delete debug codes
* notest;test=cpu-benchmark

* benchmark-cpu

* notest;test=cpu-benchmark

* notest;benchmark-cpu

* notest;benchmark-cpu

* notest;benchmark-cpu

* notest;benchmark-cpu

* notest;benchmark-cpu

* notest;benchmark-cpu

* fix

* fix

* add test_ci_model_benchmark.sh
* test=develop

* update identity

* add unittest

* notest,test=mac_py3

* modify comment & testname

* test=document_fix

* update comment

* test=document_fix

* activate all of the CI
@AnnaTrainingG AnnaTrainingG merged commit c1e59cf into AnnaTrainingG:develop Aug 2, 2021
AnnaTrainingG pushed a commit that referenced this pull request Sep 19, 2022
Refine print log and add args