Fix comparisons for complex types using floats in RowContainer #5833

Closed

Collaborator

@czentgr czentgr commented Jul 25, 2023

The issue is that leftValue == rightValue is false when both sides are NaN. In fact, no built-in comparison involving a NaN gives a usable result here: equality is always false, and the ordered comparisons (<, >) are also false and may raise the IEEE 754 invalid-operation floating-point exception.
Therefore, floating-point types need additional logic to handle the equals case.
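
As a minimal sketch of that additional logic: the helper names below are hypothetical and this is not the Velox code. It assumes the convention that NaN equals NaN and sorts after every other value.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: three-way compare that treats NaN as equal to NaN
// and greater than any non-NaN value (an assumed convention).
template <typename T>
int compareNanAware(T left, T right) {
  const bool leftNan = std::isnan(left);
  const bool rightNan = std::isnan(right);
  if (leftNan) {
    return rightNan ? 0 : 1;
  }
  if (rightNan) {
    return -1;
  }
  return left < right ? -1 : (left == right ? 0 : 1);
}

// The naive pattern, kept only for contrast.
template <typename T>
int compareNaive(T left, T right) {
  return left == right ? 0 : (left < right ? -1 : 1);
}

// Shows how the two helpers differ once a NaN is involved.
inline void nanCompareDemo() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  assert(compareNaive(nan, nan) == 1); // NaN looks "greater than itself".
  assert(compareNaive(1.0, nan) == 1); // Both argument orders return 1,
  assert(compareNaive(nan, 1.0) == 1); // so the ordering is inconsistent.
  assert(compareNanAware(nan, nan) == 0);
  assert(compareNanAware(1.0, nan) == -1);
  assert(compareNanAware(nan, 1.0) == 1);
}
```

The naive version never returns 0 for two NaN values, which is why equal keys containing NaN are treated as different.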

The row container implementation contained two bugs when
using complex types:

  1. Incorrect comparison implementation for floating point
    types when NaN values are used.
  2. The comparison flags were not passed down to
    lower level functions.

The PR also refactors RowContainer and ContainerRowSerde to use
SimpleVector::comparePrimitiveAsc as a common comparison function.
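
To make the second bug (comparison flags not passed down) concrete, here is a rough sketch. SketchFlags, compareLeaf and compareArrays are hypothetical names, and only the sort direction is modeled; the real compare flags carry more than that.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a compare-flags struct; only direction is modeled.
struct SketchFlags {
  bool ascending = true;
};

// Leaf comparison: NaN-aware first, then the requested direction is applied.
inline int compareLeaf(double left, double right, SketchFlags flags) {
  int result = 0;
  if (std::isnan(left)) {
    result = std::isnan(right) ? 0 : 1;
  } else if (std::isnan(right)) {
    result = -1;
  } else if (left != right) {
    result = left < right ? -1 : 1;
  }
  return flags.ascending ? result : -result;
}

// Array comparison: the caller's flags are forwarded to every element
// comparison. Calling compareLeaf with default flags here instead would be
// the kind of bug described in item 2.
inline int compareArrays(
    const std::vector<double>& left,
    const std::vector<double>& right,
    SketchFlags flags) {
  const std::size_t common = std::min(left.size(), right.size());
  for (std::size_t i = 0; i < common; ++i) {
    const int result = compareLeaf(left[i], right[i], flags);
    if (result != 0) {
      return result;
    }
  }
  if (left.size() == right.size()) {
    return 0;
  }
  // Assumed tie-break: the shorter array sorts first in ascending order.
  const int bySize = left.size() < right.size() ? -1 : 1;
  return flags.ascending ? bySize : -bySize;
}

// Same inputs, opposite directions, opposite results.
inline void flagsDemo() {
  SketchFlags descending;
  descending.ascending = false;
  assert(compareArrays({1.0, 2.0}, {1.0, 3.0}, SketchFlags()) == -1);
  assert(compareArrays({1.0, 2.0}, {1.0, 3.0}, descending) == 1);
}
```

If the nested call ignored the flags, a descending sort on an array column would silently order its elements as if ascending had been requested.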

In addition, the PR adds testing for floating point using the
RowContainer (and subsequent ContainerRowSerde).

When using floating point values in complex types (ROW, ARRAY, MAP (key))
the following operators are affected by the changes:

Operators:
FilterProject, TopN, TopNRowNumber, OrderBy, MergeExchange,
LocalMerge, HashProbe, NestedLoopJoinProbe

Functions:
array_distinct, array_duplicates, array_except, array_max,
array_min, array_has_duplicates, array_sort, array_sort_desc,
map_union, min, max, min_by, max_by, set_union

The lists may not be complete. Any operator/function that requires a
comparison of elements of complex types that use
floating point type is affected.

Fixes prestodb/presto#20283

@facebook-github-bot added the CLA Signed label Jul 25, 2023

@czentgr czentgr changed the title from "[WIP][Native] Fix group by with arrays containing NaN" to "[WIP] Fix group by with arrays containing NaN" Jul 27, 2023
@czentgr czentgr force-pushed the fix_nan_compare_array branch 2 times, most recently from 66e859a to 8dc3e63 Compare July 28, 2023 18:46
@czentgr
Collaborator Author

czentgr commented Jul 28, 2023

Test results prior to code change:

czentgr@Christians-MacBook-Pro tests % ./velox_exec_test --gtest_filter=AggregationTest.testFloatingPointArrayNaNGroupBy
Note: Google Test filter = AggregationTest.testFloatingPointArrayNaNGroupBy
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from AggregationTest
[ RUN      ] AggregationTest.testFloatingPointArrayNaNGroupBy
/Users/czentgr/gitspace/velox/velox/exec/tests/utils/QueryAssertions.cpp:1013: Failure
Failed
Expected 1, got 2
1 extra rows, 0 missing rows
1 of extra rows:
        ["NaN",1,2]

0 of missing rows:

Note: DuckDB only supports timestamps of millisecond precision. If this test involves timestamp inputs, please make sure you use the right precision.
DuckDB query: SELECT c0 FROM tmp GROUP BY c0
/Users/czentgr/gitspace/velox/velox/exec/tests/utils/QueryAssertions.cpp:1013: Failure
Failed
Expected 1, got 2
1 extra rows, 0 missing rows
1 of extra rows:
        ["NaN",1,2]

0 of missing rows:

Note: DuckDB only supports timestamps of millisecond precision. If this test involves timestamp inputs, please make sure you use the right precision.
DuckDB query: SELECT c0 FROM tmp GROUP BY c0
[  FAILED  ] AggregationTest.testFloatingPointArrayNaNGroupBy (29 ms)
[----------] 1 test from AggregationTest (29 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (58 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] AggregationTest.testFloatingPointArrayNaNGroupBy

 1 FAILED TEST

Test result after fix

czentgr@Christians-MacBook-Pro tests % ./velox_exec_test --gtest_filter=AggregationTest.testFloatingPointArrayNaNGroupBy
Note: Google Test filter = AggregationTest.testFloatingPointArrayNaNGroupBy
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from AggregationTest
[ RUN      ] AggregationTest.testFloatingPointArrayNaNGroupBy
[       OK ] AggregationTest.testFloatingPointArrayNaNGroupBy (29 ms)
[----------] 1 test from AggregationTest (29 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (57 ms total)
[  PASSED  ] 1 test.
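
The failure above (two groups where one is expected) can be reproduced outside Velox with a small sketch. Key, equalNaive, equalNanAware and countGroups are hypothetical names used only for illustration, and the grouping loop is deliberately simplistic.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Key = std::vector<double>;

// Plain operator== on the arrays: NaN never equals NaN.
inline bool equalNaive(const Key& a, const Key& b) {
  return a == b;
}

// Element-wise equality that treats NaN as equal to NaN (assumed semantics).
inline bool equalNanAware(const Key& a, const Key& b) {
  if (a.size() != b.size()) {
    return false;
  }
  for (std::size_t i = 0; i < a.size(); ++i) {
    const bool bothNan = std::isnan(a[i]) && std::isnan(b[i]);
    if (!bothNan && !(a[i] == b[i])) {
      return false;
    }
  }
  return true;
}

// Counts distinct groups under the supplied equality.
// Quadratic on purpose; this is an illustration, not a hash aggregation.
template <typename Equal>
std::size_t countGroups(const std::vector<Key>& rows, Equal equal) {
  std::vector<Key> groups;
  for (const auto& row : rows) {
    bool found = false;
    for (const auto& group : groups) {
      if (equal(group, row)) {
        found = true;
        break;
      }
    }
    if (!found) {
      groups.push_back(row);
    }
  }
  return groups.size();
}

// Two identical rows [NaN, 1, 2], as in the failing test output.
inline void groupByDemo() {
  const double nan = std::nan("");
  const std::vector<Key> rows = {{nan, 1.0, 2.0}, {nan, 1.0, 2.0}};
  assert(countGroups(rows, equalNaive) == 2u);    // Mirrors "Expected 1, got 2".
  assert(countGroups(rows, equalNanAware) == 1u); // One group, as expected.
}
```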

@czentgr czentgr changed the title from "[WIP] Fix group by with arrays containing NaN" to "Fix group by with arrays containing NaN" Jul 28, 2023
@czentgr czentgr marked this pull request as ready for review July 28, 2023 20:22
@czentgr
Collaborator Author

czentgr commented Jul 28, 2023

@aditi-pandit @mbasmanova if you get a chance please review. This is the fix for the NaN value in array group by issue.

Contributor

@mbasmanova mbasmanova left a comment

@czentgr Thank you for fixing this.

Should we also check BaseVector::compare and equalValueAt to see if it has a similar issue?

@@ -123,6 +123,22 @@ class AggregationTest : public OperatorTestBase {
return vectors;
}

template <typename T>
void testFloatingPointArrayNaNGroupBy() {
Contributor

This change is better tested in a unit test inside velox/exec/tests/ContainerRowSerdeTest.cpp

Also, please, document this behavior in a method comment in .h file.

Collaborator

I think https://github.com/facebookincubator/velox/blob/main/velox/exec/tests/RowContainerTest.cpp#L281 was intended to test float comparison. But it didn't go as far as testing when these cases are in Arrays, Complex types. Might be good to enhance the tests here in RowContainer to check for the special floating point values in the Arrays, Maps and Complex types.

}

template <>
int compare<TypeKind::DOUBLE>(
Contributor

There is another version of compare that takes 2 ByteStreams. Should it be modified as well?

@czentgr
Collaborator Author

czentgr commented Jul 31, 2023

@mbasmanova Thank you for your review. I'm on vacation for the next two weeks and will be back Aug 17. Then I will take care of your comments.

@aditi-pandit
Collaborator

> @czentgr Thank you for fixing this.
>
> Should we also check BaseVector::compare and equalValueAt to see if it has a similar issue?

@czentgr : Yes, I too feel that this code change would be needed but the problem has a bigger scope requiring a change at BaseVector::compare and equalValueAt level. So it would be worth spending some time looking at different SQL queries which need a NaN comparison in different places to come up with the most generic solution.

@aditi-pandit aditi-pandit self-requested a review July 31, 2023 16:32
@mbasmanova
Contributor

@czentgr

> I'm on vacation for the next two weeks and will be back Aug 17. Then I will take care of your comments.

Thank you for the heads up.

@czentgr
Collaborator Author

czentgr commented Aug 21, 2023

I checked BaseVector::compare and BaseVector::equalValueAt. Both of these functions call a pure virtual compare function that is implemented by the more specific classes derived from BaseVector. For example, I checked the SimpleVector class, and it handles the NaN case (see the function comparePrimitiveAsc, which is called by its implementation of the base class's pure virtual compare function). It appears that for the other vector types (FlatVector, ComplexVector and ConstantVector), compare will eventually call comparePrimitiveAsc of SimpleVector.

The difference between the vector functions and ContainerRowSerde is that in ContainerRowSerde one side is a vector input that is compared against a byte stream (and not another vector). It doesn't build a vector from the stream itself (which would let it use the vector compare function without running into the problem).

I believe the Vector cases are already handled properly. Perhaps there is something where two streams are compared that could have a similar situation? It doesn't appear that ContainerRowSerde implements a compare for two byte streams.

@mbasmanova
Contributor

> It doesn't appear that ContainerRowSerde implements a compare for two byte streams.

I see this method in velox/exec/ContainerRowSerde.h

  static int32_t compare(
      ByteStream& left,
      ByteStream& right,
      const Type* type,
      CompareFlags flags);

@czentgr
Collaborator Author

czentgr commented Aug 21, 2023

Yes. I have since found this as well, which means this piece is also susceptible to the bug. There is a lot of near-identical code with minor differences in how values are accessed from the input (byte stream or vector).

@aditi-pandit
Collaborator

> > It doesn't appear that ContainerRowSerde implements a compare for two byte streams.
>
> I see this method in velox/exec/ContainerRowSerde.h
>
>   static int32_t compare(
>       ByteStream& left,
>       ByteStream& right,
>       const Type* type,
>       CompareFlags flags);

@czentgr : In general RowContainer::compare (which is used in HashAgg, HashJoin, Window etc) would be the origin of these calls.
There is a variant being used from
https://github.com/facebookincubator/velox/blob/main/velox/exec/RowContainer.cpp#L521
that invokes the compare on 2 ByteStreams. We should fix this.

@aditi-pandit
Collaborator

aditi-pandit commented Aug 21, 2023

> Yes. I have since found this as well, which means this piece is also susceptible to the bug. There is a lot of near-identical code with minor differences in how values are accessed from the input (byte stream or vector).

We should standardize on this style I feel https://github.com/facebookincubator/velox/blob/main/velox/exec/RowContainer.h#L1023

@czentgr
Collaborator Author

czentgr commented Aug 21, 2023

> > Yes. I have since found this as well, which means this piece is also susceptible to the bug. There is a lot of near-identical code with minor differences in how values are accessed from the input (byte stream or vector).
>
> We should standardize on this style I feel https://github.com/facebookincubator/velox/blob/main/velox/exec/RowContainer.h#L1023

Yes, this should be a better fix than adding new templated functions. It needs some extra handling for the (possibly requested) sort order, but let's see if this can be worked out.

@@ -587,7 +600,20 @@ int32_t compare(
   using T = typename TypeTraits<Kind>::NativeType;
   T leftValue = left.read<T>();
   T rightValue = right.read<T>();
-  auto result = leftValue == rightValue ? 0 : leftValue < rightValue ? -1 : 1;
+  int32_t result{0};
Collaborator

This code looks exactly the same as for the other function. Can we abstract this in a common function ?
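
One possible shape for such a common function, as a sketch only: comparePrimitiveSketch, SequentialSource and compareNext are hypothetical names, with SequentialSource standing in for a byte stream. Both overloads differ only in how they fetch the right-hand value and share a single leaf comparison.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <type_traits>
#include <vector>

// Shared leaf comparison used by both overloads below.
// NaN equals NaN and sorts after all other values (assumed convention).
template <typename T>
int comparePrimitiveSketch(T left, T right, bool ascending) {
  static_assert(std::is_floating_point<T>::value, "sketch covers floats only");
  int result = 0;
  if (std::isnan(left)) {
    result = std::isnan(right) ? 0 : 1;
  } else if (std::isnan(right)) {
    result = -1;
  } else if (left != right) {
    result = left < right ? -1 : 1;
  }
  return ascending ? result : -result;
}

// Hypothetical sequential reader; it only hands out values in order.
template <typename T>
class SequentialSource {
 public:
  explicit SequentialSource(const std::vector<T>& values) : values_(values) {}

  T next() {
    return values_.at(position_++);
  }

 private:
  const std::vector<T>& values_;
  std::size_t position_ = 0;
};

// Variant 1: two sequential sources.
template <typename T>
int compareNext(
    SequentialSource<T>& left, SequentialSource<T>& right, bool ascending) {
  return comparePrimitiveSketch(left.next(), right.next(), ascending);
}

// Variant 2: a sequential source against an indexed vector.
template <typename T>
int compareNext(
    SequentialSource<T>& left,
    const std::vector<T>& right,
    std::size_t index,
    bool ascending) {
  return comparePrimitiveSketch(left.next(), right[index], ascending);
}

inline void commonCompareDemo() {
  const double nan = std::nan("");
  const std::vector<double> stored = {nan, 1.0};
  const std::vector<double> probe = {nan, 2.0};
  SequentialSource<double> left(stored);
  SequentialSource<double> right(stored);
  assert(compareNext(left, right, true) == 0);     // NaN vs NaN, two sources.
  assert(compareNext(left, probe, 1, true) == -1); // 1.0 vs 2.0, source and vector.
}
```

With this layout a NaN fix lands in one place instead of being repeated per input type.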

@@ -144,5 +144,36 @@ TEST_F(ContainerRowSerdeTest, nested) {
testRoundTrip(nestedArray);
}

TEST_F(ContainerRowSerdeTest, compareDoubleEquals) {
Collaborator

In general it seems like all the comparison function tests are done through RowContainerTests. That is the main caller and touch-point for Velox operators. Please can you add tests there also.

Collaborator Author

I will take a look at those tests.

auto leftPosition = serialize(data);
HashStringAllocator::prepareRead(leftPosition.header, left);

SelectivityVector allRows(data->size());
Collaborator

Do we need this DecodedVector ? Can we not compare with a FlatVector ?

Collaborator Author

Yes. The API requires a DecodedVector (there is a difference in the interface and implementation). When using FlatVector (which I originally tried) compilation fails with

/Users/czentgr/gitspace/velox/velox/exec/tests/ContainerRowSerdeTest.cpp:160:18: error: no matching function for call to 'compare'
    EXPECT_EQ(0, ContainerRowSerde::compare(left, data, i, flags));
...
/Users/czentgr/gitspace/velox/./velox/exec/ContainerRowSerde.h:35:18: note: candidate function not viable: no known conversion from 'FlatVectorPtr<EvalType<double>>' (aka 'shared_ptr<FlatVector<double>>') to 'const facebook::velox::DecodedVector' for 2nd argument
  static int32_t compare(

HashStringAllocator::prepareRead(rightPosition.header, right);
auto type = data->type();

for (auto i = 0; i < data->size(); i++) {
Collaborator

Also try different combinations where NaN is compared with non-NaN values, as those code paths have changed now.

Also add tests for the ascending/descending variations of the flags.
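
A table-driven sketch of such a test matrix is below. It is not the test that was added in this PR; compareWithDirection, CompareCase and the expected values are illustrative and assume NaN equals NaN and sorts after every other value, including infinity.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical comparison under test: NaN-aware, with a direction switch.
inline int compareWithDirection(double left, double right, bool ascending) {
  int result = 0;
  if (std::isnan(left)) {
    result = std::isnan(right) ? 0 : 1;
  } else if (std::isnan(right)) {
    result = -1;
  } else if (left != right) {
    result = left < right ? -1 : 1;
  }
  return ascending ? result : -result;
}

// One row of the matrix: inputs plus the expected ascending result.
struct CompareCase {
  double left;
  double right;
  int expectedAscending;
};

inline std::vector<CompareCase> nanCompareCases() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  const double inf = std::numeric_limits<double>::infinity();
  return {
      {nan, nan, 0},
      {nan, 1.0, 1},
      {1.0, nan, -1},
      {nan, inf, 1},
      {inf, nan, -1},
      {1.0, 2.0, -1},
      {2.0, 2.0, 0},
  };
}

// Runs every case in both directions and returns the number of mismatches.
// The descending expectation is the negated ascending one.
inline int runNanCompareCases() {
  int failures = 0;
  for (const auto& testCase : nanCompareCases()) {
    if (compareWithDirection(testCase.left, testCase.right, true) !=
        testCase.expectedAscending) {
      ++failures;
    }
    if (compareWithDirection(testCase.left, testCase.right, false) !=
        -testCase.expectedAscending) {
      ++failures;
    }
  }
  return failures;
}

inline void nanMatrixDemo() {
  assert(runNanCompareCases() == 0);
}
```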

@czentgr czentgr force-pushed the fix_nan_compare_array branch 5 times, most recently from 1d7f0d1 to 0a985eb Compare September 11, 2023 21:32
@czentgr czentgr changed the title from "Fix group by with arrays containing NaN" to "Fix comparisons for complex types using floats in RowContainer" Sep 11, 2023
@czentgr
Collaborator Author

czentgr commented Sep 11, 2023

@mbasmanova @aditi-pandit Please take another look when you get a chance. I've added comprehensive tests for the complex types based on the existing RowContainer test. These tests exercise both interfaces: (1) two byte streams and (2) a byte stream and a decoded vector.
As a result, I found a second bug: the compare flags were not passed down to the lower-level comparison functions for the complex types. This is fixed in this PR as well.

@facebook-github-bot
Contributor

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@mbasmanova merged this pull request in 714a7ab.

@conbench-facebook

Conbench analyzed the 1 benchmark run on commit 714a7ab3.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

czentgr added a commit to czentgr/velox that referenced this pull request Nov 29, 2023
The row container implementation for
equalsNoNulls and equalsWithNulls contained a bug:

1.  Incorrect equals check for floating point
types when NaN values are used.
2. Refactor to use SimpleVector::comparePrimitiveAsc in RowContainer
and ContainerRowSerde for a common comparison function.
3. Change static SimpleVector::comparePrimitiveAsc to be static inline
to reduce function call overhead in this expanded usage.

This is a continuation of PR facebookincubator#5833 which addressed
floating point comparisons for complex types.

Affected operators:
FilterProject, TopN, TopNRowNumber, OrderBy, MergeExchange,
LocalMerge, HashProbe, NestedLoopJoinProbe

Affected functions:
array_distinct, array_duplicates, array_except, array_max,
array_min, array_has_duplicates, array_sort, array_sort_desc,
map_union, min, max, min_by, max_by, set_union

The lists may not be complete.
czentgr added a commit to czentgr/velox that referenced this pull request Dec 8, 2023
czentgr added a commit to czentgr/velox that referenced this pull request Dec 11, 2023
czentgr added a commit to czentgr/velox that referenced this pull request Dec 15, 2023
czentgr added a commit to czentgr/velox that referenced this pull request Jan 4, 2024
czentgr added a commit to czentgr/velox that referenced this pull request Jan 9, 2024
czentgr added a commit to czentgr/velox that referenced this pull request Jan 11, 2024
czentgr added a commit to czentgr/velox that referenced this pull request Jan 12, 2024
czentgr added a commit to czentgr/velox that referenced this pull request Jan 23, 2024
czentgr added a commit to czentgr/velox that referenced this pull request Jan 29, 2024
czentgr added a commit to czentgr/velox that referenced this pull request Jan 31, 2024
czentgr added a commit to czentgr/velox that referenced this pull request Feb 2, 2024
czentgr added a commit to czentgr/velox that referenced this pull request Feb 9, 2024
czentgr added a commit to czentgr/velox that referenced this pull request Feb 16, 2024
czentgr added a commit to czentgr/velox that referenced this pull request Feb 23, 2024
czentgr added a commit to czentgr/velox that referenced this pull request Feb 29, 2024
The row container implementation for
equalsNoNulls and equalsWithNulls contained a bug:

1.  Incorrect equals check for floating point
types when NaN values are used.
2. Refactor to use SimpleVector::comparePrimitiveAsc in RowContainer
and ContainerRowSerde for a common comparison function.
3. Change static SimpleVector::comparePrimitiveAsc to be static inline
to reduce function call overhead in this expanded usage.

This is a continuation of PR facebookincubator#5833 which addressed
floating point comparisons for complex types.

Affected operators:
FilterProject, TopN, TopNRowNumber, OrderBy, MergeExchange,
LocalMerge, HashProbe, NestedLoopJoinProbe

The lists may not be complete.
czentgr added a commit to czentgr/velox that referenced this pull request Mar 6, 2024
czentgr added a commit to czentgr/velox that referenced this pull request Mar 6, 2024
czentgr added a commit to czentgr/velox that referenced this pull request Mar 12, 2024
facebook-github-bot pushed a commit that referenced this pull request Mar 12, 2024
Summary:
The row container implementation for
equalsNoNulls and equalsWithNulls contained a bug:

1.  Incorrect equals check for floating point types when NaN values are used.
2. Refactor to use SimpleVector::comparePrimitiveAsc in RowContainer and ContainerRowSerde for a common comparison function.
3. Change static SimpleVector::comparePrimitiveAsc to be static inline to reduce function call overhead in this expanded usage.

This is a continuation of PR #5833 which addressed floating point comparisons for complex types.

Affected operators:
FilterProject, TopN, TopNRowNumber, OrderBy, MergeExchange, LocalMerge, HashProbe, NestedLoopJoinProbe

The lists may not be complete.

Pull Request resolved: #7780

Reviewed By: Yuhta

Differential Revision: D54141907

Pulled By: kagamiori

fbshipit-source-id: 0306cfaffd4d486a0b72f6e6b659b40b2d66688f
Joe-Abraham pushed a commit to Joe-Abraham/velox that referenced this pull request Jun 7, 2024
…kincubator#7780)

@czentgr czentgr deleted the fix_nan_compare_array branch July 31, 2024 17:47
Labels
CLA Signed, Merged
Development

Successfully merging this pull request may close these issues.

[native] Support GROUP BY on nan() values
4 participants