distsql: small fixes (debug logs, hash join encoding problem) #12119

Merged

Conversation

RaduBerinde
Member

@RaduBerinde RaduBerinde commented Dec 6, 2016

See individual commits.


@rjnn
Contributor

rjnn commented Dec 7, 2016

:lgtm:


Reviewed 5 of 5 files at r1, 1 of 1 files at r2, 2 of 2 files at r3.
Review status: :shipit: all files reviewed at latest revision, all discussions resolved, all commit checks successful.


Comments from Reviewable

@irfansharif
Contributor

Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/sql/distsql/hashjoiner.go, line 245 at r3 (raw file):

		// Note: we cannot compare VALUE encodings because they contain column IDs
		// which can vary.
		appendTo, err = row[colIdx].Encode(&h.datumAlloc, sqlbase.DatumEncoding_ASCENDING_KEY, appendTo)

Wait, I don't understand this. If you look at what Encode(...) actually does when passed sqlbase.DatumEncoding_VALUE here, it uses the default encoding.NoColumnID, so the column ID doesn't really make a difference?


pkg/sql/distsql/hashjoiner.go, line 245 at r3 (raw file):

		// Note: we cannot compare VALUE encodings because they contain column IDs
		// which can vary.
		appendTo, err = row[colIdx].Encode(&h.datumAlloc, sqlbase.DatumEncoding_ASCENDING_KEY, appendTo)

// TODO(irfansharif): Different rows may come with different encodings,
// e.g. if they come from different streams that were merged, in which
// case the encodings don't match (despite having the same underlying
// datums). We instead opt to always choose sqlbase.DatumEncoding_VALUE
// but we may want to check the first row for what encodings are already
// available.

this is from distinct.go and is applicable here as well, no?



@RaduBerinde
Member Author

Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/sql/distsql/hashjoiner.go, line 245 at r3 (raw file):

Previously, irfansharif (irfan sharif) wrote…
// TODO(irfansharif): Different rows may come with different encodings,
// e.g. if they come from different streams that were merged, in which
// case the encodings don't match (despite having the same underlying
// datums). We instead opt to always choose sqlbase.DatumEncoding_VALUE
// but we may want to check the first row for what encodings are already
// available.

this is from distinct.go and is applicable here as well, no?

Yeah. In the future, we can add a function that compares VALUE encodings ignoring any column difference (in most cases it would come down to just skipping the first byte or something like that).
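A rough Go sketch of the kind of comparison helper suggested here. It assumes a toy layout in which the first byte of a VALUE encoding carries the column ID; the real CockroachDB value tag is laid out differently and may span more than one byte. `toyValueEncode` and `equalIgnoringColumn` are hypothetical names used only for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// toyValueEncode is a hypothetical stand-in for a VALUE encoding:
// one byte of column ID followed by the payload bytes.
func toyValueEncode(colID byte, payload []byte) []byte {
	out := make([]byte, 0, 1+len(payload))
	out = append(out, colID)
	return append(out, payload...)
}

// equalIgnoringColumn compares two toy encodings while skipping the
// leading column-ID byte. An empty input only equals another empty input.
func equalIgnoringColumn(a, b []byte) bool {
	if len(a) == 0 || len(b) == 0 {
		return len(a) == len(b)
	}
	return bytes.Equal(a[1:], b[1:])
}

func main() {
	a := toyValueEncode(1, []byte("hello"))
	b := toyValueEncode(7, []byte("hello"))
	fmt.Println(bytes.Equal(a, b))         // false: the column bytes differ
	fmt.Println(equalIgnoringColumn(a, b)) // true: the payloads match
}
```

A production version would have to parse the actual tag rather than assume a fixed one-byte prefix.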


pkg/sql/distsql/hashjoiner.go, line 245 at r3 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Wait, I don't understand this. If you look at what Encode(...) actually does when passed sqlbase.DatumEncoding_VALUE here, it uses the default encoding.NoColumnID, so the column ID doesn't really make a difference?

You are assuming that this particular code always creates the encoding, but the whole point of EncDatum is that we could have the encoding given to us, either directly from what is stored in KV or from another host (we would return that here).
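To make this concrete, here is a minimal Go sketch of a datum wrapper that may already carry an encoding produced elsewhere. `toyEncDatum`, its fields, the encoding constants, and the one-byte column prefix are all assumptions made for illustration; this is not the actual `sqlbase.EncDatum` implementation.

```go
package main

import "fmt"

type encodingKind int

const (
	noEnc encodingKind = iota
	ascendingKeyEnc
	valueEnc
)

// toyEncDatum is a hypothetical wrapper: it may hold bytes that were
// encoded elsewhere (for example read from KV or sent by another host).
type toyEncDatum struct {
	kind    encodingKind // which encoding `encoded` holds, if any
	encoded []byte       // pre-existing encoding, possibly nil
	datum   string       // decoded datum (always available in this toy)
}

// encode appends the requested encoding to appendTo. Bytes that already
// exist in the requested encoding are reused verbatim, column byte and all.
func (d toyEncDatum) encode(want encodingKind, appendTo []byte) []byte {
	if d.kind == want && d.encoded != nil {
		return append(appendTo, d.encoded...)
	}
	if want == valueEnc {
		appendTo = append(appendTo, 0) // toy "no column ID" prefix
	}
	return append(appendTo, d.datum...)
}

func main() {
	local := toyEncDatum{datum: "x"}
	fromKV := toyEncDatum{kind: valueEnc, encoded: []byte{5, 'x'}, datum: "x"}

	// Same datum, but the VALUE bytes differ because one was handed to us.
	fmt.Println(string(local.encode(valueEnc, nil)) == string(fromKV.encode(valueEnc, nil))) // false

	// Re-encoding both as keys gives identical bytes in this toy model.
	fmt.Println(string(local.encode(ascendingKeyEnc, nil)) == string(fromKV.encode(ascendingKeyEnc, nil))) // true
}
```

In this toy model, two equal datums produce different VALUE bytes when one of them arrives pre-encoded with a column ID, which is why the hash join buckets on the key encoding instead.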



@irfansharif
Contributor

:lgtm:


Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/sql/distsql/hashjoiner.go, line 245 at r3 (raw file):

Previously, RaduBerinde wrote…

You are assuming that this particular code always creates the encoding, but the whole point of EncDatum is that we could have the encoding given to us, either directly from what is stored in KV or from another host (we would return that here).

got it.


pkg/sql/distsql/hashjoiner.go, line 245 at r3 (raw file):

Previously, RaduBerinde wrote…

Yeah. In the future, we can add a function that compares VALUE encodings ignoring any column difference (in most cases it would come down to just skipping the first byte or something like that).

Right, I think it's worth adding this as a TODO.



It looks like the VALUE encodings are not unique (they contain a column ID) so
they shouldn't be used for equality testing in hash joins.
@RaduBerinde
Member Author

Review status: 6 of 7 files reviewed at latest revision, all discussions resolved.


pkg/sql/distsql/hashjoiner.go, line 245 at r3 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Right, I think it's worth adding this as a TODO.

Done.



@irfansharif
Contributor

Reviewed 5 of 5 files at r1, 2 of 2 files at r3, 1 of 1 files at r4.
Review status: all files reviewed at latest revision, all discussions resolved.



@RaduBerinde RaduBerinde merged commit 76d810b into cockroachdb:master Dec 7, 2016
@RaduBerinde RaduBerinde deleted the distsql-join-minor-fixes branch December 7, 2016 20:44
pav-kv pushed a commit to pav-kv/cockroach that referenced this pull request Mar 5, 2024