Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fixed invalid function date_trunc when connecting to hive -- issue#7270 #7657

Closed
wants to merge 8 commits into from

Conversation

jeepxiaozi
Copy link

@jeepxiaozi jeepxiaozi commented Jun 6, 2019

CATEGORY

  • Bug Fix

SUMMARY

adding time_grain_functions to support hive date functions

ADDITIONAL INFORMATION

dpgaspar and others added 2 commits June 6, 2019 22:15
* Add new escape characters to OnPasteSelect

* Add Unit Tests to onPasteSelect_spec

* Update new line to tab escape character

* Fix indentation in 2 cases
@mistercrunch
Copy link
Member

Wait HiveEngineSpec was probably inheriting a lot more than just time_grain_functions here. It would be much less risky to still inherit from PrestoEngineSpec and only override time_grain_functions.

I'm not sure if our Hive never worked in the explore view (too slow! times out), therefore this error never popped up, or if somehow date_trunc was implemented at Airbnb and/or Lyft in our Hive version/UDF packages.......

@codecov-io
Copy link

codecov-io commented Jun 7, 2019

Codecov Report

Merging #7657 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7657      +/-   ##
==========================================
+ Coverage   65.57%   65.58%   +<.01%     
==========================================
  Files         435      435              
  Lines       21754    21757       +3     
  Branches     2394     2394              
==========================================
+ Hits        14266    14269       +3     
  Misses       7367     7367              
  Partials      121      121
Impacted Files Coverage Δ
superset/assets/src/components/OnPasteSelect.jsx 88.09% <ø> (ø) ⬆️
superset/db_engine_specs.py 67.5% <100%> (+0.02%) ⬆️
superset/views/core.py 72.91% <100%> (+0.03%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d62c37b...aab60f5. Read the comment docs.

@jeepxiaozi
Copy link
Author

Wait HiveEngineSpec was probably inheriting a lot more than just time_grain_functions here. It would be much less risky to still inherit from PrestoEngineSpec and only override time_grain_functions.

I'm not sure if our Hive never worked in the explore view (too slow! times out), therefore this error never popped up, or if somehow date_trunc was implemented at Airbnb and/or Lyft in our Hive version/UDF packages.......

Yes you are absolutely right, I'm sorry, I should override the time_grain_functions only and I've changed it already. Sorry again~

Copy link
Member

@john-bodley john-bodley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeepxiaozi would you mind rebasing your change? It seems the PR currently has unrelated changes from #7660.


engine = 'hive'
max_column_name_length = 767
time_grain_functions = {
None: '{col}',
'PT1S': "from_unixtime(unix_timestamp({col}), 'yyyy-MM-dd HH:mm:ss')",
Copy link
Member

@john-bodley john-bodley Jun 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeepxiaozi I don't believe this logic is correct, i.e., it's assuming that the {col} is of type STRING (with a pre-specified format) where it could be a DATE or TIMESTAMP. I plan to address this issue in SIP-15A but for the meantime I would suggest implying that {col} can be successfully cast to a TIMESTAMP, i.e., CAST({col} AS TIMESTAMP} (see here for details) as this is consistent with the logic in the other engines.

I'll address the invalid casting in SIP-15A which I'm actively working on.

@villebro
Copy link
Member

@jeepxiaozi db_engine_specs.py has been changed into a package with one module per engine. So when rebasing the Hive logic should go into superset/db_engine_specs/hive.py.

@stale
Copy link

stale bot commented Aug 15, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

@stale stale bot added the inactive Inactive for >= 30 days label Aug 15, 2019
@stale stale bot closed this Aug 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
inactive Inactive for >= 30 days size/S
Projects
None yet
Development

Successfully merging this pull request may close these issues.

date_trunc function when connect to hive
7 participants