Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix issue 1145 #1146

Closed
wants to merge 4 commits into from
Closed

fix issue 1145 #1146

wants to merge 4 commits into from

Conversation

tommady
Copy link

@tommady tommady commented Dec 14, 2022

fix issue 1145

according to the GNU doc

The default makefile names `GNUmakefile', `makefile' and `Makefile'

results
image
image

Copy link

@DriesMeerman DriesMeerman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nice to put the starts_with calls into an array of files with certain prefixes

src/info/filetype.rs Outdated Show resolved Hide resolved
@marbx
Copy link

marbx commented Feb 22, 2023 via email

@tommady
Copy link
Author

tommady commented Feb 23, 2023

oh! I see what you meant ahah
sure will fix that, thank you!

@tommady
Copy link
Author

tommady commented Feb 23, 2023

new result
image

@marbx
Copy link

marbx commented Feb 23, 2023

Your commit 92f8eba fixes what I meant.
Thank you for you swift reply :-)

I have a question regarding performance.

The function name_is_one_of() is called with 35 file names and is implemented with

choices.contains(&&self.name[..])

choices is the only parameter of the function and hold the file names:

pub fn name_is_one_of(&self, choices: &[&str]) -> bool {

I am beginner of Rust, so reading the Standard library, &[T] is a "shared slice".

The method contains() of the slice primitive has a comment that says

This operation is O(n).
Note that if you have a sorted slice, binary_search may be faster.

Is this the method used?

"O(n)" means linear search.
I have the impression that 35 file names could be large enough to justify sorting and using a binary search.

And for the sake of performance, sorting could be done by sorting the array in the source.

What do you think?

@tommady
Copy link
Author

tommady commented Feb 24, 2023

Your commit 92f8eba fixes what I meant. Thank you for you swift reply :-)

I have a question regarding performance.

The function name_is_one_of() is called with 35 file names and is implemented with

choices.contains(&&self.name[..])

choices is the only parameter of the function and hold the file names:

pub fn name_is_one_of(&self, choices: &[&str]) -> bool {

I am beginner of Rust, so reading the Standard library, &[T] is a "shared slice".

The method contains() of the slice primitive has a comment that says

This operation is O(n).
Note that if you have a sorted slice, binary_search may be faster.

Is this the method used?

"O(n)" means linear search. I have the impression that 35 file names could be large enough to justify sorting and using a binary search.

And for the sake of performance, sorting could be done by sorting the array in the source.

What do you think?

ya, I agree with your point.
I will do that!

BTW, I am a rust beginner too~
thanks for your suggestion!

@marbx
Copy link

marbx commented Feb 25, 2023

@tommady, I appreciate the collaboration!
The sorted, vertical list from ba9c762 is much more readable then the unsorted list across 6 lines, I think.

I also think that is goes without saying, that any new filename needs to be added at its sorted positions.
I leave it to the exa maintainers whether a comment like this would make sense:

// binary_search() requires sorted file names

@marbx
Copy link

marbx commented Feb 25, 2023

On a second thought:
the function name_is_one_of() is currently used only here and this PR would makes the function unused.
Unused code is a bad thing.

As there could be other users, I would like to suggest to move the binary search into the function, together with a break-even calculation.

Along the lines:
if length of choices is below 20: contains(), else: binary_search()

So the potential other users would profit, too.

Final argument: the name_is_one_of() function is currently "too simple". It cyclomatic complexity is probably 1.
In other words: it could use some "flesh" :-)

@marbx
Copy link

marbx commented Feb 25, 2023

Also, we must not use binary_search() on only 35 file names!
It is 16% slower than contains()

I benchmarked contains() and binary_search() on 35 and 85 words:
https://github.com/marbx/LearnRust/blob/main/BenchmarkContainsBinarySearch/src/lib.rs

test tests::bench_binary_search_with_35_words ... bench:          25 ns/iter (+/- 0)
test tests::bench_binary_search_with_85_words ... bench:          33 ns/iter (+/- 1)
test tests::bench_contains_with_35_word       ... bench:          21 ns/iter (+/- 0)
test tests::bench_contains_with_85_words      ... bench:          58 ns/iter (+/- 5)
test tests::bench_xor                         ... bench:          93 ns/iter (+/- 2)

Break-even must be between 35 and 85, I will try to find it in the next days.

@marbx
Copy link

marbx commented Feb 25, 2023

Renaming the functions changes the benchmarked times!

test tests::bench_35_words_binary_search ... bench:          25 ns/iter (+/- 0)
test tests::bench_35_words_contains      ... bench:          25 ns/iter (+/- 0)
test tests::bench_85_words_binary_search ... bench:          33 ns/iter (+/- 1)
test tests::bench_85_words_contains      ... bench:          63 ns/iter (+/- 1)
test tests::bench_xor                    ... bench:          92 ns/iter (+/- 3)

@tommady
Copy link
Author

tommady commented Feb 25, 2023

wow
so based on your comments, this PR's filenames are only 34 which means we should use the contains method right?

and thank you for your survey which made me learn so much.

@marbx
Copy link

marbx commented Feb 25, 2023

I thought so 1 hour ago, but I realized that in the "real world" there are much more files which are not in the list. The number above come from a benchmark "1 file absent, 1 file present".
This is unrealistic

So I changed the benchmark to: "4 files absent, 1 file present" and got:

test tests::bench_35_words_binary_search ... bench:          72 ns/iter (+/- 1)
test tests::bench_35_words_contains      ... bench:          78 ns/iter (+/- 8)
test tests::bench_85_words_binary_search ... bench:          99 ns/iter (+/- 2)
test tests::bench_85_words_contains      ... bench:         138 ns/iter (+/- 1)

This I understand, because contains() must go though the whole list, before it can conclude that the file name is absent, so this benchmark hurts it a lot more than binary_search().

As you see, this benchmark is in favor of binary_search() already for 35 file names.
I would say in the "real world", there are more then 4 times files outside of the list.

In short: binary_search() is safer.

Could you please run and play with the benchmark? I don't know how trustworthy the measurements are.

I just realize: for a directory (or tree listing) larger then 35, one would have to reverse the search and (binary) search 35 file names in the directory/listing.

If you have a directory of 100 files, exa currently searches 100 times over a list of 35 entries.
While it is enough to search 35 times in a list of 100 entries.

@tommady
Copy link
Author

tommady commented Feb 25, 2023

or just a stupid idea, we can make the immediate files into a static hashmap which will lead every search O(1).

WDYT?

@marbx
Copy link

marbx commented Feb 26, 2023

Not stupid idea at all - I got the same.

A hashmap is the fastest of the three methods, even if created within each benchmark function. So hashmap is the way forward.

test tests::bench_35_words_binary_search         ... bench:          72 ns/iter (+/- 5)
test tests::bench_35_words_contains              ... bench:          69 ns/iter (+/- 27)
test tests::bench_35_words_hashmap_local_mutable ... bench:          57 ns/iter (+/- 1)
test tests::bench_85_words_binary_search         ... bench:          99 ns/iter (+/- 2)
test tests::bench_85_words_contains              ... bench:         138 ns/iter (+/- 2)
test tests::bench_85_words_hashmap_local_mutable ... bench:          57 ns/iter (+/- 4)
test tests::bench_xor                            ... bench:          93 ns/iter (+/- 3)

The measurement supports the O(1) expectation.

it is easy to setup a local, mutable hashmap.
I have not yet figured out how to setup a static, global, unmutable hashmap.

@tommady
Copy link
Author

tommady commented Feb 26, 2023

I saw the exa already uses the dependency "lazy_static",
so ideally it is not hard to archive the goal, let me do a new commit!

@marbx
Copy link

marbx commented Feb 26, 2023

Rust-PHF is a library to generate efficient lookup tables at compile time using perfect hash functions.

Unfortunately, PHF is slower than the dynamically created hashmap!

test tests::bench_35_words_binary_search         ... bench:          74 ns/iter (+/- 2)
test tests::bench_35_words_contains              ... bench:          76 ns/iter (+/- 1)
test tests::bench_35_words_hashmap_local_mutable ... bench:          59 ns/iter (+/- 1)
test tests::bench_35_words_hashmap_static_phf    ... bench:          71 ns/iter (+/- 3)
test tests::bench_35_words_static_phf_set        ... bench:          70 ns/iter (+/- 1)
test tests::bench_85_words_binary_search         ... bench:          98 ns/iter (+/- 2)
test tests::bench_85_words_contains              ... bench:         228 ns/iter (+/- 24)
test tests::bench_85_words_hashmap_local_mutable ... bench:          59 ns/iter (+/- 1)
test tests::bench_85_words_hashmap_static_phf    ... bench:          71 ns/iter (+/- 1)
test tests::bench_85_words_static_phf_set        ... bench:          70 ns/iter (+/- 2)
test tests::bench_xor                            ... bench:          91 ns/iter (+/- 0)

What am I doing wrong?
https://github.com/marbx/LearnRust/blob/b41ff2cd5c4c73118c2fea9e2e8ba39f5e1a3425/BenchmarkContainsBinarySearch/src/lib.rs#L510

@tommady
Copy link
Author

tommady commented Feb 26, 2023

or a more simple one?
try the simple match method.
https://www.reddit.com/r/rust/comments/5mnj3y/which_has_better_performance_a_hashmap_or_a/

@marbx
Copy link

marbx commented Feb 26, 2023

The first time, the lazy_static HashMap is as fast as the dynamically created one.
This makes lazy_static HashMap the (current) winner.

Although, the second time there is no speed up.
This may have to to with the benchmark.

I'll try match

test tests::bench_35_words_binary_search         ... bench:          73 ns/iter (+/- 3)
test tests::bench_35_words_contains              ... bench:          94 ns/iter (+/- 10)
test tests::bench_35_words_hashmap_local_mutable ... bench:          60 ns/iter (+/- 3)
test tests::bench_35_words_hashmap_static_phf    ... bench:          70 ns/iter (+/- 2)
test tests::bench_35_words_lazy_hashset_1        ... bench:          60 ns/iter (+/- 4)
test tests::bench_35_words_lazy_hashset_2        ... bench:          60 ns/iter (+/- 1)
test tests::bench_35_words_static_phf_set        ... bench:          70 ns/iter (+/- 3)
test tests::bench_85_words_binary_search         ... bench:          99 ns/iter (+/- 3)
test tests::bench_85_words_contains              ... bench:         138 ns/iter (+/- 4)
test tests::bench_85_words_hashmap_local_mutable ... bench:          58 ns/iter (+/- 3)
test tests::bench_85_words_hashmap_static_phf    ... bench:          70 ns/iter (+/- 1)
test tests::bench_85_words_lazy_hashset_1        ... bench:          59 ns/iter (+/- 2)
test tests::bench_85_words_lazy_hashset_2        ... bench:          59 ns/iter (+/- 3)
test tests::bench_85_words_static_phf_set        ... bench:          70 ns/iter (+/- 2)
test tests::bench_xor                            ... bench:          91 ns/iter (+/- 0)


@marbx
Copy link

marbx commented Feb 26, 2023

Match takes 0ns for 35 words, and 121ns for 85 words, double as for hash
This makes no sense.

test tests::bench_35_words_binary_search         ... bench:          72 ns/iter (+/- 3)
test tests::bench_35_words_contains              ... bench:          78 ns/iter (+/- 3)
test tests::bench_35_words_hashmap_local_mutable ... bench:          59 ns/iter (+/- 1)
test tests::bench_35_words_hashmap_static_phf    ... bench:          70 ns/iter (+/- 2)
test tests::bench_35_words_lazy_hashset_1        ... bench:          58 ns/iter (+/- 1)
test tests::bench_35_words_lazy_hashset_2        ... bench:          59 ns/iter (+/- 4)
test tests::bench_35_words_match                 ... bench:           0 ns/iter (+/- 0)
test tests::bench_35_words_static_phf_set        ... bench:          70 ns/iter (+/- 2)
test tests::bench_85_words_binary_search         ... bench:          99 ns/iter (+/- 4)
test tests::bench_85_words_contains              ... bench:         229 ns/iter (+/- 13)
test tests::bench_85_words_hashmap_local_mutable ... bench:          58 ns/iter (+/- 4)
test tests::bench_85_words_hashmap_static_phf    ... bench:          70 ns/iter (+/- 2)
test tests::bench_85_words_lazy_hashset_1        ... bench:          58 ns/iter (+/- 1)
test tests::bench_85_words_lazy_hashset_2        ... bench:          58 ns/iter (+/- 2)
test tests::bench_85_words_match                 ... bench:         121 ns/iter (+/- 6)
test tests::bench_85_words_static_phf_set        ... bench:          69 ns/iter (+/- 2)
test tests::bench_xor                            ... bench:          92 ns/iter (+/- 5)

Copy link

@marbx marbx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

matches! macro increases readability.

@tommady
Copy link
Author

tommady commented Feb 26, 2023

Ya I think we should leave all those surveys for the maintainer to do the judgement.

Clear winner for this PR (which the file names are below 35) is the matches macro with the fastest result with no doubts.

@ariasuni
Copy link
Collaborator

Does it do a difference in term of final executable size?

@tommady
Copy link
Author

tommady commented Mar 4, 2023

Does it do a difference in term of final executable size?

from this PR's modification, it added three strings

  • `GNUmakefile'
  • `makefile'
  • `Makefile'

is this the reason?

@1stDimension
Copy link
Contributor

The issues with CI are not present in current master c697d06. I believe if you rebase your branch onto it the checks will pass. 1stDimension#3 I tried to create similar environment in my fork.

@ariasuni
Copy link
Collaborator

ariasuni commented Sep 8, 2023

Closing this, since exa is unmaintained (see #1243), and this has been merged in the active fork eza. Thanks!

@ariasuni ariasuni closed this Sep 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants