Fix PII pseudonymizer enrichment + unit tests #336
Looks great!
...les/common/src/test/scala/com.snowplowanalytics.snowplow.enrich.common/EtlPipelineSpec.scala
...les/common/src/test/scala/com.snowplowanalytics.snowplow.enrich.common/EtlPipelineSpec.scala
...t/scala/com.snowplowanalytics.snowplow.enrich.common/enrichments/EnrichmentManagerSpec.scala
Outdated
c313290 to eaa7c9d
@benjben I'm not sure. But:
I'm happy to consider counterpoints, but with the above in mind I'm leaning towards hashing the empty string. WDYT?
IMHO that's the case. And hashing the empty string will always produce the same hash.
IMHO hashing of PII is not related to data validation and we should hash whatever there is to hash.
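To illustrate the point that hashing the empty string always produces the same hash, here is a minimal sketch using the JDK's `MessageDigest`. The object name `EmptyStringHashSketch` and the helper `hashHex` are hypothetical and are not the enrichment's actual API; the choice of SHA-256 is also an assumption made only for the example.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object EmptyStringHashSketch {
  // Hypothetical helper (not part of the enrichment): hex-encoded digest of a string.
  def hashHex(algorithm: String, value: String): String = {
    val digest = MessageDigest.getInstance(algorithm)
    digest
      .digest(value.getBytes(StandardCharsets.UTF_8))
      .map("%02x".format(_))
      .mkString
  }

  def main(args: Array[String]): Unit = {
    // Hashing the empty string twice yields the same value: the digest is deterministic.
    val first  = hashHex("SHA-256", "")
    val second = hashHex("SHA-256", "")
    println(first == second) // prints "true"
    println(first)           // prints "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
  }
}
```

The empty string is a valid input like any other, so its digest is a stable, well-defined value rather than an error case.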
In that case, the test reveals a bug... I think I agree with:
but we differ in that I think this includes all values, including empty string. Btw, do you want to keep this test? I can rebase to merge my commit with your first one.
But that doesn't reproduce the error
Sure!! The more testing the better!
b7e893f to 087c2c5
Sure @chuwy! Just to be sure that we don't duplicate work: this is what you first wanted to take care of, but in the end I'm doing it, right?
@benjben nope, or at least I don't remember taking care of it. I implemented the recovery for a client.
Oh, actually you're right - I wanted to do this! Sorry. Yeah, let's decide tomorrow - I'm just signing off today and thought you'd be working on something. Ignore me.
…sts coverage (close #334)
087c2c5 to b7cd91c
b7cd91c to cb3b570
Not sure I understand everything that is going on here, but happy that we have tests.
My implementation of removeAddedFields is here just for reference; feel free to add or ignore it, although it's a bit shorter.
...owanalytics.snowplow.enrich/common/enrichments/registry/pii/PiiPseudonymizerEnrichment.scala
Outdated
LGTM!
2d5f6a5 to 9e65fe9
The failing test is due to http://snowplow-hosted-assets.s3.amazonaws.com/third-party/maxmind/GeoLite2-City.mmdb being removed/empty.
This will be fixed in another commit.