-
Notifications
You must be signed in to change notification settings - Fork 615
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FileReader filter #3459
FileReader filter #3459
Conversation
4e1e090
to
170d88b
Compare
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Please rebase this PR on latest |
3c55cdd
to
175ad5c
Compare
53f8131
to
b7532de
Compare
!build |
CI MESSAGE: [3291616]: BUILD STARTED |
CI MESSAGE: [3291616]: BUILD PASSED |
dali/benchmark/file_reader_bench.cc
Outdated
@@ -0,0 +1,136 @@ | |||
// Copyright (c) 2017-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Copyright (c) 2017-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | |
// Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. |
dali/benchmark/file_reader_bench.cc
Outdated
// open the root | ||
DIR *dir = opendir(file_root.c_str()); | ||
|
||
DALI_ENFORCE(dir != nullptr, "Directory " + file_root + " could not be opened."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DALI_ENFORCE(dir != nullptr, "Directory " + file_root + " could not be opened."); | |
DALI_ENFORCE(dir != nullptr, make_string("Directory ", file_root, " could not be opened.")); |
nitpick
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dali::make_string can be found in dali/core/format.h
@@ -41,7 +42,7 @@ For example, this directory structure:: | |||
<file_root>/2/car.jpeg | |||
<file_root>/2/truck.jp2 | |||
|
|||
will yield the following outputs:: | |||
without specified ``file_filters`` will yield the following outputs:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without specified ``file_filters`` will yield the following outputs:: | |
by default will yield the following outputs:: |
@@ -0,0 +1,124 @@ | |||
// Copyright (c) 2017-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Copyright (c) 2017-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | |
// Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. |
if (HasKnownExtension(std::string(entry->d_name))) { | ||
std::string rel_path = curr_entry + dir_sep + std::string{entry->d_name}; | ||
file_label_pairs.emplace_back(rel_path, label); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (HasKnownExtension(std::string(entry->d_name))) { | |
std::string rel_path = curr_entry + dir_sep + std::string{entry->d_name}; | |
file_label_pairs.emplace_back(rel_path, label); | |
std::string d_name(entry->d_name); | |
if (HasKnownExtension(d_name)) { | |
file_label_pairs.emplace_back(join_path(curr_entry, d_name), label); |
nitpick: to avoid creating the same string twice
class FilesystemTest : public ::testing::Test { | ||
std::vector<std::pair<std::string, int>> readFileLabelFile() { | ||
std::vector<std::pair<std::string, int>> image_label_pairs; | ||
std::string file_list = file_root + "/image_list.txt"; | ||
std::ifstream s(file_list); | ||
DALI_ENFORCE(s.is_open(), "Cannot open: " + file_list); | ||
|
||
std::vector<char> line_buf(16 << 10); | ||
char *line = line_buf.data(); | ||
for (int n = 1; s.getline(line, line_buf.size()); n++) { | ||
int i = strlen(line) - 1; | ||
|
||
for (; i >= 0 && isspace(line[i]); i--) {} | ||
|
||
int label_end = i + 1; | ||
|
||
if (i < 0) | ||
continue; | ||
|
||
for (; i >= 0 && isdigit(line[i]); i--) {} | ||
|
||
int label_start = i + 1; | ||
|
||
for (; i >= 0 && isspace(line[i]); i--) {} | ||
|
||
int name_end = i + 1; | ||
DALI_ENFORCE( | ||
name_end > 0 && name_end < label_start && label_start >= 2 && label_end > label_start, | ||
make_string("Incorrect format of the list file \"", file_list, "\":", n, | ||
" expected file name followed by a label; got: ", line)); | ||
|
||
line[label_end] = 0; | ||
line[name_end] = 0; | ||
|
||
image_label_pairs.emplace_back(line, std::atoi(line + label_start)); | ||
} | ||
std::sort(image_label_pairs.begin(), image_label_pairs.end()); | ||
DALI_ENFORCE(s.eof(), "Wrong format of file_list: " + file_list); | ||
|
||
return image_label_pairs; | ||
} | ||
|
||
protected: | ||
FilesystemTest() | ||
: file_root(testing::dali_extra_path() + "/db/single/jpeg"), | ||
file_label_pairs(readFileLabelFile()) {} | ||
|
||
std::vector<std::string> globMatch(std::vector<std::string> &filters) { | ||
std::vector<std::string> correct_match; | ||
glob_t pglob; | ||
for (auto &filter : filters) { | ||
std::string pattern = file_root + filesystem::dir_sep + '*' + filesystem::dir_sep + filter; | ||
if (glob(pattern.c_str(), GLOB_TILDE, NULL, &pglob) == 0) { | ||
for (unsigned int count = 0; count < pglob.gl_pathc; ++count) { | ||
std::string match(pglob.gl_pathv[count]); | ||
correct_match.push_back(match.substr(file_root.length() + 1, std::string::npos)); | ||
} | ||
globfree(&pglob); | ||
} | ||
} | ||
std::sort(correct_match.begin(), correct_match.end()); | ||
std::unique(correct_match.begin(), correct_match.end()); | ||
return correct_match; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A thought. A significant part of this file is about re-implementing what the reader does, to build the reference. I think the test would be easier to read if we just hardcoded the input list of file/label and the expected outcome in each testcase.
I get that this makes it more flexible in case of changing the files in the test directory, but on the other hand it also introduces another source of bugs.
Feel free to disregard this idea if you feel is better as is now.
@@ -126,7 +128,35 @@ def test_file_reader_relpath_file_list(): | |||
assert contents == ref_contents(fnames[index]) | |||
|
|||
|
|||
batch_size_alias_test=64 | |||
def _test_file_reader_filter(filters): | |||
batch_size = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd test bigger batch size>1 as well
def _test_file_reader_filter(filters): | ||
batch_size = 1 | ||
|
||
pipe = Pipeline(batch_size, 1, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd test with more than one thread
for filter in filters: | ||
for file in glob.glob(os.path.join(root, dir, filter)): | ||
fnames.append((label, file.split('/')[-1], file)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
total nitpick: why not use a set() from the start?
DIR *dir = opendir(file_root.c_str()); | ||
|
||
DALI_ENFORCE(dir != nullptr, "Directory " + file_root + " could not be opened."); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Open question: Do we want those benchmarks to stay in DALI repository? Doesn't seem to be benchmarking any DALI component
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's to compare the two solutions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some nitpick, I think we can go with making the list of the known extensions be the default - either add it in case of empty filters
in traverse_directories
or put it as the default there, and have the if in the call site. Probably the second one works better as the documentation.
dali/benchmark/file_reader_bench.cc
Outdated
@@ -0,0 +1,135 @@ | |||
// Copyright (c) 2017-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a nitpick:
// Copyright (c) 2017-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | |
// Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. |
DIR *dir = opendir(file_root.c_str()); | ||
|
||
DALI_ENFORCE(dir != nullptr, "Directory " + file_root + " could not be opened."); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's to compare the two solutions.
dali/benchmark/file_reader_bench.cc
Outdated
// open the root | ||
DIR *dir = opendir(file_root.c_str()); | ||
|
||
DALI_ENFORCE(dir != nullptr, "Directory " + file_root + " could not be opened."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dali::make_string can be found in dali/core/format.h
Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
…ngs argument. Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
bae8436
to
1af339a
Compare
!build |
CI MESSAGE: [3316146]: BUILD STARTED |
@@ -51,6 +53,13 @@ will yield the following outputs:: | |||
<contents of 2/car.jpeg> 2 | |||
<contents of 2/truck.jp2> 2 | |||
|
|||
and with ``file_filters = [*.jpg, *.jpeg]`` will yield the following outputs:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick:
and with ``file_filters = [*.jpg, *.jpeg]`` will yield the following outputs:: | |
and with ``file_filters = ["*.jpg", "*.jpeg"]`` will yield the following outputs:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not addressed?
static const std::vector<std::string> kKnownExtensionsGlob = { | ||
"*.jpg", "*.jpeg", "*.png", "*.bmp", "*.tif", "*.tiff", "*.pnm", "*.ppm", | ||
"*.pgm", "*.pbm", "*.jp2", "*.webp", "*.flac", "*.ogg", "*.wav"}; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we checked how that displays on the documentation?
dali/pipeline/operator/op_schema.cc
Outdated
@@ -189,7 +189,7 @@ std::string OpSchema::GetArgumentDefaultValueString(const std::string &name) con | |||
|
|||
auto &val = *value_ptr; | |||
auto str = val.ToString(); | |||
if (val.GetTypeId() == DALI_STRING || | |||
if (val.GetTypeId() == DALI_STRING || val.GetTypeId() == DALI_STRING_VEC || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
def test_file_reader_filters(): | ||
for filters in [['*.jpg'], ['*.jpg', '*.png', '*.jpeg'], ['dog*.jpg', 'cat*.png', '*.jpg']]: | ||
for num_threads in [1, 2, 4, 8]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
num_threads = random.choice([1, 2, 4, 8])
batch_size = random.choice([1, 3, 10])
should be enough, IMO
CI MESSAGE: [3316146]: BUILD FAILED |
Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
@@ -29,7 +30,8 @@ This operator can be used in the following modes: | |||
|
|||
In this mode, the directory indicated in ``file_root`` argument should contain one or more | |||
subdirectories. The files in these subdirectories are listed and assigned labels based on | |||
lexicographical order of the subdirectory. | |||
lexicographical order of the subdirectory. If you provide ``file_filters`` argument with | |||
list of glob strings, the operator will list files matching at least one of the patterns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
list of glob strings, the operator will list files matching at least one of the patterns. | |
a list of glob strings, the operator will list files matching at least one of the patterns. Otherwise, a default set of filters is used (see the default value of ``file_filters`` for details). |
I'd also mention what happens by default
@@ -51,6 +53,13 @@ will yield the following outputs:: | |||
<contents of 2/car.jpeg> 2 | |||
<contents of 2/truck.jp2> 2 | |||
|
|||
and with ``file_filters = [*.jpg, *.jpeg]`` will yield the following outputs:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not addressed?
.AddOptionalArg<bool>("case_sensitive_filter", R"(If set to True, the filter will be applied | ||
to original file paths, otherwise to lower-cased file paths.)", false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine, but I think it'd be even better if we just make the filter case insensitive as well. That is, case insensitive filter="*.JPEG"
would match "dog.jpeg"
for (auto &filter : filters) { | ||
if (fnmatch(filter.c_str(), entry->d_name, case_sensitive_filter ? 0 : FNM_CASEFOLD) == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for (auto &filter : filters) { | |
if (fnmatch(filter.c_str(), entry->d_name, case_sensitive_filter ? 0 : FNM_CASEFOLD) == 0) { | |
int flags = case_sensitive_filter ? 0 : FNM_CASEFOLD; | |
for (auto &filter : filters) { | |
if (fnmatch(filter.c_str(), entry->d_name, flags) == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the documentation it seems to me that the documentation is not accurate (the part saying that it matches lowercase filters).
I'd double check:
filter="*.JPEG"
case_sensitive = false
file="dog.jpeg"
^^ this should match. Then the documentation should be updated
dali/pipeline/operator/op_schema.cc
Outdated
@@ -189,6 +189,7 @@ std::string OpSchema::GetArgumentDefaultValueString(const std::string &name) con | |||
|
|||
auto &val = *value_ptr; | |||
auto str = val.ToString(); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert? Otherwise this file would should changes from this PR, when there are actually none
Update FileReader documentation. Fix glob matching in nosetests for file_filters. Revert op_schema. Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
!build |
CI MESSAGE: [3361929]: BUILD STARTED |
CI MESSAGE: [3361929]: BUILD PASSED |
Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
!build |
CI MESSAGE: [3362777]: BUILD STARTED |
CI MESSAGE: [3362799]: BUILD STARTED |
CI MESSAGE: [3362777]: BUILD PASSED |
CI MESSAGE: [3362799]: BUILD PASSED |
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Add file_filters argument to FileReader allowing to select files by matching glob patterns. Filters can be matched case-sensitively or case-insensitively by specifying the value of case_sensitive_filter argument (by default matching case-insensitive). Add benchmark comparing original matching by known extensions and the new matching with fnmatch. Changed implementation entirely to match only with fnmatch in (file, label) case. Add tests for file_filters in FileReader. Temporary fix for other readers that use FileLabelLoader and don't support case_sensitive_filter and thus do not assign it the default value when initializing with GetArgument("case_sensitive_filter"). Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Description
What happened in this PR
It adds file_filter argument to FileReader allowing to filter files matching list of glob patterns.
Additional information
FileReader operator
Checklist
Tests
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A