fix mmh3.hash64 unicode exception with python2 #10685

Merged: 2 commits merged into master from djova/fix-mmh3-unicode-windows-error on Nov 19, 2021

djova (Contributor) commented Nov 18, 2021

What does this PR do?

Fixes the `UnicodeEncodeError` raised by `mmh3.hash64` when the query contains non-ASCII characters on Python 2.

datadog_checks/base/utils/db/sql.py:27: in compute_sql_signature
    return format(mmh3.hash64(normalized_query, signed=False)[0], 'x')
E   UnicodeEncodeError: 'ascii' codec can't encode character u'\xd2' in position 15: ordinal not in range(128)
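
For readers unfamiliar with this failure mode, here is a minimal sketch of one common way to avoid it: encode the text to UTF-8 bytes explicitly before hashing, so no implicit ASCII encoding takes place. This is not necessarily the exact change made in this PR. The names `to_bytes` and `compute_signature_sketch` are hypothetical, and the `zlib.crc32` fallback exists only so the sketch runs without `mmh3` installed; it does not produce the same values.

```python
# -*- coding: utf-8 -*-
import zlib

try:
    import mmh3  # third-party MurmurHash3 bindings, as seen in the traceback
except ImportError:  # keep the sketch runnable without mmh3
    mmh3 = None


def to_bytes(text, encoding='utf-8'):
    """Hypothetical helper: return bytes unchanged, encode text explicitly."""
    if isinstance(text, bytes):
        return text
    return text.encode(encoding)


def compute_signature_sketch(normalized_query):
    """Hash a query after an explicit UTF-8 encode (illustrative only)."""
    data = to_bytes(normalized_query)
    if mmh3 is not None:
        # Same call shape as in the traceback, but fed bytes instead of text.
        return format(mmh3.hash64(data, signed=False)[0], 'x')
    # Stand-in checksum when mmh3 is unavailable; values differ from mmh3.
    return format(zlib.crc32(data) & 0xFFFFFFFF, 'x')


# A query containing the same non-ASCII character reported in the error.
query = u'SELECT * FROM t WHERE name = \xd2'
signature = compute_signature_sketch(query)
print(signature)
```

Encoding at the hashing boundary keeps the rest of the code free to pass either text or bytes, which is one reason a small helper like this is often preferred over encoding at every call site.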

Motivation

Fix bug originally surfaced while working on #10637

Additional Notes

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • PR title must be written as a CHANGELOG entry (see why)
  • File changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have changelog/ and integration/ labels attached

Required for #10637

The test suite improvements in that PR which expand Windows test coverage are what surfaced this bug.

codecov bot commented Nov 18, 2021

Codecov Report

Merging #10685 (3d0f82e) into master (7ec881f) will increase coverage by 0.03%.
The diff coverage is 100.00%.

Flag Coverage Δ
active_directory 100.00% <ø> (ø)
activemq_xml 82.31% <ø> (ø)
aerospike 86.97% <ø> (+0.36%) ⬆️
airflow 90.00% <ø> (ø)
amazon_msk 88.83% <ø> (ø)
ambari 85.75% <ø> (ø)
apache 95.08% <ø> (ø)
aspdotnet 93.87% <ø> (ø)
avi_vantage 91.92% <ø> (ø)
azure_iot_edge 81.93% <ø> (ø)
btrfs 82.91% <ø> (ø)
cacti 83.95% <ø> (ø)
cassandra_nodetool 94.19% <ø> (ø)
ceph 91.02% <ø> (ø)
cilium 85.84% <ø> (+1.88%) ⬆️
cisco_aci 95.88% <ø> (ø)
citrix_hypervisor 87.50% <ø> (ø)
clickhouse 95.63% <ø> (ø)
cloud_foundry_api 95.98% <ø> (+0.12%) ⬆️
cockroachdb 97.18% <ø> (ø)
consul 91.74% <ø> (ø)
coredns 95.74% <ø> (ø)
couch 95.19% <ø> (+0.24%) ⬆️
couchbase 81.45% <ø> (ø)
crio 100.00% <ø> (ø)
datadog_checks_base 90.22% <100.00%> (+0.36%) ⬆️
datadog_checks_dev 80.05% <ø> (ø)
datadog_checks_downloader 80.64% <ø> (ø)
datadog_cluster_agent 97.50% <ø> (ø)
directory 94.87% <ø> (ø)
disk 91.13% <ø> (-0.49%) ⬇️
dns_check 93.84% <ø> (ø)
dotnetclr 100.00% <ø> (ø)
druid 97.70% <ø> (ø)
ecs_fargate 80.23% <ø> (ø)
eks_fargate 94.05% <ø> (ø)
elastic 88.65% <ø> (ø)
envoy 93.90% <ø> (ø)
etcd 93.27% <ø> (ø)
exchange_server 100.00% <ø> (ø)
external_dns 100.00% <ø> (ø)
fluentd 94.77% <ø> (ø)
gearmand 78.26% <ø> (+1.24%) ⬆️
gitlab 89.94% <ø> (ø)
gitlab_runner 91.94% <ø> (ø)
glusterfs 80.09% <ø> (+0.92%) ⬆️
go_expvar 92.73% <ø> (ø)
gunicorn 93.60% <ø> (+0.75%) ⬆️
haproxy 95.08% <ø> (+0.16%) ⬆️
harbor 81.29% <ø> (ø)
hazelcast 92.39% <ø> (ø)
hdfs_datanode 89.74% <ø> (ø)
hdfs_namenode 86.72% <ø> (ø)
http_check 89.98% <ø> (+1.76%) ⬆️
ibm_db2 94.84% <ø> (ø)
ibm_i 80.65% <ø> (ø)
ibm_mq 89.45% <ø> (ø)
ibm_was 96.06% <ø> (ø)
iis 93.01% <ø> (ø)
istio 76.87% <ø> (+0.57%) ⬆️
kafka_consumer 82.28% <ø> (ø)
kong 92.21% <ø> (ø)
kube_apiserver_metrics 97.35% <ø> (ø)
kube_controller_manager 96.85% <ø> (ø)
kube_dns 98.85% <ø> (ø)
kube_metrics_server 100.00% <ø> (ø)
kube_proxy 100.00% <ø> (ø)
kube_scheduler 96.20% <ø> (ø)
kubelet 89.61% <ø> (ø)
kubernetes_state 89.52% <ø> (ø)
kyototycoon 85.96% <ø> (ø)
lighttpd 83.64% <ø> (ø)
linkerd 85.14% <ø> (+1.14%) ⬆️
linux_proc_extras 96.22% <ø> (ø)
mapr 82.62% <ø> (ø)
mapreduce 81.77% <ø> (+0.46%) ⬆️
marathon 83.12% <ø> (ø)
marklogic 95.33% <ø> (ø)
mcache 93.52% <ø> (ø)
mesos_master 90.68% <ø> (ø)
mesos_slave 93.63% <ø> (ø)
mongo 94.45% <ø> (+0.49%) ⬆️
mysql 86.87% <ø> (+0.13%) ⬆️
nagios 89.53% <ø> (ø)
network 77.76% <ø> (+1.00%) ⬆️
nfsstat 95.20% <ø> (ø)
nginx 94.64% <ø> (+0.85%) ⬆️
nginx_ingress_controller 98.30% <ø> (ø)
openldap 96.33% <ø> (ø)
openmetrics 97.14% <ø> (ø)
openstack 51.30% <ø> (ø)
openstack_controller 90.74% <ø> (ø)
oracle 93.65% <ø> (+0.52%) ⬆️
pdh_check 95.65% <ø> (ø)
pgbouncer 90.45% <ø> (ø)
php_fpm 90.04% <ø> (+0.43%) ⬆️
postfix 88.04% <ø> (ø)
postgres 91.49% <ø> (+0.21%) ⬆️
powerdns_recursor 96.65% <ø> (ø)
process 85.07% <ø> (+0.28%) ⬆️
prometheus 94.17% <ø> (ø)
proxysql 98.97% <ø> (ø)
rabbitmq 94.40% <ø> (ø)
redisdb 87.12% <ø> (ø)
rethinkdb 97.93% <ø> (ø)
riak 99.22% <ø> (ø)
riakcs 93.61% <ø> (ø)
sap_hana 92.12% <ø> (ø)
scylla 100.00% <ø> (ø)
singlestore 90.81% <ø> (ø)
snmp 90.56% <ø> (+0.04%) ⬆️
snowflake 93.48% <ø> (ø)
sonarqube 95.69% <ø> (ø)
spark 93.21% <ø> (ø)
sqlserver 84.66% <ø> (ø)
squid 100.00% <ø> (ø)
ssh_check 91.58% <ø> (ø)
statsd 87.36% <ø> (+1.05%) ⬆️
supervisord 92.30% <ø> (ø)
system_core 91.04% <ø> (ø)
system_swap 98.30% <ø> (ø)
tcp_check 88.83% <ø> (ø)
teamcity 80.00% <ø> (ø)
tls 97.04% <ø> (+0.87%) ⬆️
tokumx 58.40% <ø> (?)
twemproxy 78.33% <ø> (ø)
twistlock 80.25% <ø> (ø)
varnish 84.57% <ø> (+0.24%) ⬆️
vault 95.00% <ø> (+0.55%) ⬆️
vertica 92.33% <ø> (ø)
voltdb 96.81% <ø> (ø)
vsphere 89.75% <ø> (+0.05%) ⬆️
win32_event_log 86.03% <ø> (+0.28%) ⬆️
windows_performance_counters 98.36% <ø> (ø)
windows_service 95.83% <ø> (ø)
wmi_check 92.91% <ø> (ø)
yarn 89.85% <ø> (ø)
zk 85.34% <ø> (+0.46%) ⬆️

Flags with carried forward coverage won't be shown.

yzhan289 (Contributor) left a comment

Thanks for the PR, is it possible to add a test case to ensure this works?

djova changed the title from "fix mmh3.hash64 unicode exception with python2 on Windows" to "fix mmh3.hash64 unicode exception with python2" on Nov 18, 2021

djova (Contributor, Author) commented Nov 18, 2021

> Thanks for the PR, is it possible to add a test case to ensure this works?

Yes, I just added a test case with a unicode character.

djova merged commit cd2c541 into master on Nov 19, 2021
djova deleted the djova/fix-mmh3-unicode-windows-error branch on Nov 19, 2021, 01:48
fanny-jiang added the category/bugfix label (For use during Agent Release period) on Nov 19, 2021
fanny-jiang pushed a commit that referenced this pull request on Nov 19, 2021:
* fix `mmh3.hash64` unicode exception with python2 on Windows

Required for #10637

The test suite improvements in that PR which expand Windows test coverage are what surfaced this bug.

* add unicode test case

(cherry picked from commit cd2c541)
Labels: category/bugfix (For use during Agent Release period), integration/datadog_checks_base
3 participants