DALI v0.13.0

Pre-release
@klecki released this 29 Aug 15:34

Bug fixes

  • Upgrade PyTorch to 1.2, TorchVision to 0.4 (#1155)
  • Add use_batched_decode argument to nvJPEGDecoder API (only for legacy nvJPEGDecoder implementation) (#1151)
  • Make loading of the versioned libnvidia-opticalflow.so the primary path (#1147)
  • Fix tests that are not using prolog/epilog functions (#1143)
  • Provide default initialization for scratch sizes in KernelRequirements. (#1141)
  • Fix coco loader (#1135)
  • Fix GET_PROC_EX macro (#1128)
  • Fix typo in installation doc (#1126)
  • Fix capitalization in docs for docker dir (#1122)
  • Fix pipeline serialization/deserialization for logical_id (#1121)
  • Use the right PyTorch capitalization everywhere (#1119)
  • Fix Gluon example that mixes simple and iterator DALI API (#1117)
  • Fix lint in ../dali/pipeline/operators/reader/loader/loader.h (#1113)
  • Fix float16 support in DALI TensorFlow plugin (#1086)
  • Fix python operator with side effects. (#1105)
  • Fix warning (#1061)
  • Fix test header inclusion (#1100)
  • Make dali_kernel_test_lib respect BUILD_TEST (#1101)
  • Fix a race condition in async pipeline executor (#1103)
  • Typo fixed in getting started notebook (#1091)
  • Reduced batch size to avoid out of memory condition in 19.07 container. (#1089)
  • Fix error of indexing shape in Optical Flow (#1087)
  • Disable video_reader_op test when we disable NVDEC (#1077)
  • Add video error message (#1067)
  • Fix sampling of chroma in the VideoReader op (#1054)
  • Fix detection pipeline example (#1055)
  • Fix fp16 bug from #1129 and add fp16 test case (#1160)

Improvements

  • Adjust customdummy plugin in Docs to new API (#1150)
  • Add view overload to get TensorListView from TensorVector. (#1152)
  • Warp kernels (#1063)
  • Add Setup API to Operator (#1045)
  • Input & output TYPED_TEST (#1133)
  • Refactor SliceFlipNormalizePermutePad (super)kernel (#1129)
  • Add virtual env and conda test case for DALI TF plugin (#1107)
  • Add test for water operator (#1075)
  • BrightnessContrast kernel first implementation (#1060)
  • Add default_cuda_stream_priority documentation (#1131)
  • Fast coco reader (#1098)
  • Optimize docker images building (#1053)
  • Remove explicit Multiple Input Sets handling from C++ Backend (#1088)
  • Document pre-built WML CE packages in Installation docs (#1124)
  • Upgrade VideoCodecSDK to 9.0.20 (#1120)
  • UniformRandomFill for unified storage (#1070)
  • Calculation layout setup for GPU kernels. (#1106)
  • Rework multiple input sets API (#1104)
  • Use per-sample RNG in SSDRandomCrop and RandomBBoxCrop (#1109)
  • Add compile-time mapping for DALIDataType. For use in TYPE_SWITCH. (#1108)
  • Rework how the reader picks samples from the shuffling buffer (#1005)
  • Add checking if Python API is not mixed between simple, scheduled and iterator (#1074)
  • Enable OpticalFlow test on CI (#1096)
  • Make protobuf linking mode configurable (#1102)
  • Kernel manager (#1079)
  • Add JIRA Task placeholder in PR template (#1090)
  • Replace vector<shared_ptr> with TensorVector (#1040)
  • Deprecate NormalizePermute in favor of CropMirrorNormalize (#982)
  • Adjust TensorFlow ResNet50 example to 1.14 version API (#1081)
  • Update DALI TF plugin docs to be aligned with the current functionality (#1066)
  • Adds BUILD_TF_PLUGIN flag to one-click build script (#1051)
  • Enforce shares_data_ in Buffer (#1057)
  • Improved sampler (#1071)
  • Change test prefix from L*_ to TL*_ (#1069)
  • Rounding Convert and ConvertSat added. (#1068)
  • Copy multiple collections to scratchpad. (#1044)
  • Use DALI_extra in loader test (#1064)
  • Add filename to LMDB reader errors (#1059)
  • Add make check target that runs basic tests (#1019)
  • Bounding box representation (#1052)
  • Add option to enable fast IDCT in libjpeg-turbo (#1031)
  • Adjust Tests to use DALI_EXTRA (#1056)
  • Basic geometric transform functions. (#1047)
  • Add TorchPythonFunction operator (#1033)
  • Add support for reading video files with labels using file_list argument (#1029)
  • Add TensorFlow 1.14 (#1037)
  • Enable sink operators. (#1004)
  • Update PR template (#1043)

Breaking API changes

  • Added Setup API to Operator with pure virtual SetupImpl
  • Multiple Input Sets handling was removed from the backend and is now only Python-level syntactic sugar
  • Reader sampling from shuffling buffer was adjusted
  • Replace vector<shared_ptr> with TensorVector as input and output of CPU Operators allowing for contiguous outputs from CPU Ops
  • Deprecate NormalizePermute in favor of CropMirrorNormalize (#982)
  • Enforce shares_data_ in Buffer - sharing data cannot be implicitly reallocated and must match allocation size
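
To illustrate what "Python-level syntactic sugar" can mean for Multiple Input Sets, here is a toy model, not DALI's actual implementation. The helper `apply_with_input_sets` and the stand-in operator `fake_rotate` are hypothetical names invented for this sketch; the assumption is that passing a list of inputs simply expands into one operator call per input set on the Python side.

```python
from typing import Callable, List, Sequence, Union


def apply_with_input_sets(
    op: Callable[[str], str],
    inputs: Union[str, Sequence[str]],
) -> Union[str, List[str]]:
    """Toy model (hypothetical): a list of inputs means 'call op once per input set'."""
    if isinstance(inputs, (list, tuple)):
        # The expansion happens here, in Python; each call sees a single input.
        return [op(single) for single in inputs]
    return op(inputs)


def fake_rotate(node: str) -> str:
    """Stand-in for an operator; it only records what it was applied to."""
    return f"rotate({node})"


single = apply_with_input_sets(fake_rotate, "images")
multi = apply_with_input_sets(fake_rotate, ["images", "masks"])
print(single)
print(multi)
```

In this model the operator itself never needs to know about input sets, which is the sense in which explicit handling can disappear from a backend.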

Known issues:

  • The new video reader operator requires NVIDIA Video Codec SDK support on the platform. NVIDIA GPU Cloud (NGC) optimized containers prior to 19.01 lack this functionality in the default configuration. To enable it, run the container with the ‘video’ capability enabled, i.e.:
    -e "NVIDIA_DRIVER_CAPABILITIES=compute,utility,video"
  • The video loader operator requires key frames to occur at least every 10 to 15 frames of the video stream. If key frames occur less frequently, the returned frames may be out of sync.
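
Both known issues can be checked up front. The sketch below is a minimal illustration under stated assumptions: `video_capability_enabled` and `max_keyframe_gap` are hypothetical helper names, treating the value `all` as including video is an assumption about the NVIDIA container runtime, and the key-frame indices are assumed to come from an external tool, since nothing here parses a video file.

```python
import os
from typing import Iterable, Mapping, Optional


def video_capability_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if NVIDIA_DRIVER_CAPABILITIES lists 'video' (or 'all', by assumption)."""
    env = os.environ if env is None else env
    raw = env.get("NVIDIA_DRIVER_CAPABILITIES", "")
    caps = {c.strip() for c in raw.split(",") if c.strip()}
    return "video" in caps or "all" in caps


def max_keyframe_gap(keyframe_indices: Iterable[int]) -> int:
    """Largest distance, in frames, between consecutive key frames (0 if fewer than two)."""
    idx = sorted(keyframe_indices)
    if len(idx) < 2:
        return 0
    return max(b - a for a, b in zip(idx, idx[1:]))


# Example with made-up values: the environment from the flag above,
# and key-frame positions assumed to be obtained elsewhere.
env = {"NVIDIA_DRIVER_CAPABILITIES": "compute,utility,video"}
print(video_capability_enabled(env))
print(max_keyframe_gap([0, 12, 24, 40]))
```

A gap above the 10 to 15 frame range quoted in the note suggests the stream may need re-encoding with more frequent key frames before it is passed to the video reader.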

Binary builds

Install via pip for CUDA 9:
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/cuda/9.0 nvidia-dali==0.13.0
or for CUDA 10:
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/cuda/10.0 nvidia-dali==0.13.0
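
For scripted setups, the two commands above can be generated from the CUDA version, and the result verified afterwards. This is a rough sketch: `pip_install_args` and `installed_version` are hypothetical helper names, the index URLs and the package pin are copied from the commands above, and `importlib.metadata` assumes Python 3.8 or newer.

```python
from importlib import metadata
from typing import List, Optional

INDEX_BASE = "https://developer.download.nvidia.com/compute/redist/cuda"
SUPPORTED_CUDA = ("9.0", "10.0")  # the two variants listed in these notes


def pip_install_args(cuda: str, version: str = "0.13.0") -> List[str]:
    """Build the pip argument list for the given CUDA variant."""
    if cuda not in SUPPORTED_CUDA:
        raise ValueError(f"unsupported CUDA version: {cuda!r}")
    return [
        "pip", "install",
        "--extra-index-url", f"{INDEX_BASE}/{cuda}",
        f"nvidia-dali=={version}",
    ]


def installed_version(dist: str = "nvidia-dali") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


print(" ".join(pip_install_args("10.0")))
print(installed_version())  # None when nvidia-dali is not installed
```

The argument list can be passed to `subprocess.run` as-is; comparing `installed_version()` against `"0.13.0"` afterwards confirms that the pinned release was picked up.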

Or use direct download links (CUDA 9.0):

Or use direct download links (CUDA 10.0):

FFmpeg source code:

  • This software uses code of FFmpeg licensed under the LGPLv2.1 and its source can be downloaded here