Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugfix: Improve support for special characters in schema and table names #7297

Merged
merged 9 commits into from
May 8, 2019
Merged

bugfix: Improve support for special characters in schema and table names #7297

merged 9 commits into from
May 8, 2019

Conversation

villebro
Copy link
Member

@villebro villebro commented Apr 14, 2019

CATEGORY

  • Bug Fix
  • Add unit tests
  • Refactor

SUMMARY

Schema and table names that contain special characters currently cause the following problems:

  • SQL Lab: If schema contains slashes, browsing tables doesn't work. This is due to the schema being passed unquoted in the path of the AJAX call, causing it to add a new component to the path.
  • SQL Lab: if table contains slashes, preview of the table doesn't always work for similar reasons as the schema. This has been worked around by using the path:-prefix for table name; however, path: can't be used if there are multiple components in the path containing slashes.
  • Engine specs: Snowflake, Presto, Hive and MySQL currently pass the selected schema unquoted to the connection URI, most likely causing all queries to fail if schema contains slashes.

This PR adds quoting to schema and table names that are passed in the URI path, and ensures that unquoting is performed in the receiving end. Quoting is done by calling encodeURIComponent() first on the schema/table name, and finally calling encodeURI() on the full URI. Both calls are required, as only encoding the table/schema component will be converted to a regular slash by most (all?) webservers. In addition, a typo was fixed in the TableSelector onChange() event, and the utils function js_string_to_python was refactored to be more descriptive and generic.

BEFORE

Screenshot 2019-04-15 at 16 10 54

AFTER

Screenshot 2019-04-15 at 12 55 22

Screenshot 2019-05-07 at 8 05 26

Screenshot 2019-05-07 at 8 04 49

TEST PLAN

Tested locally to work both with and without slashes in table and schema names. Has been confirmed to work on Presto with Pulsar connector (slashes in schema name), Postgres (slashes in both schema and table names) and Snowflake (no slashes, mainly for regression testing).

ADDITIONAL INFORMATION

REVIEWERS

@yciabaud - Tested on Presto + Pulsar.

@codecov-io
Copy link

codecov-io commented Apr 14, 2019

Codecov Report

Merging #7297 into master will increase coverage by <.01%.
The diff coverage is 80.95%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7297      +/-   ##
==========================================
+ Coverage   65.22%   65.23%   +<.01%     
==========================================
  Files         431      431              
  Lines       21147    21154       +7     
  Branches     2343     2343              
==========================================
+ Hits        13794    13800       +6     
- Misses       7237     7238       +1     
  Partials      116      116
Impacted Files Coverage Δ
superset/assets/src/SqlLab/actions/sqlLab.js 70.64% <0%> (ø) ⬆️
superset/utils/core.py 88.15% <100%> (+0.04%) ⬆️
superset/assets/src/components/TableSelector.jsx 84.67% <100%> (ø) ⬆️
superset/db_engine_specs.py 61.47% <80%> (+0.01%) ⬆️
superset/views/core.py 72.58% <80%> (+0.03%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5cf454b...65b1e78. Read the comment docs.

@yciabaud
Copy link

yciabaud commented Apr 14, 2019

Hello I tried your branch using docker but I am still having some issues when selecting a table.
There is a call on http://localhost:8088/superset/table/2/generator_test/public%252Fdefault/ with a status 500.
Maybe I did something wrong?

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 2309, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 2295, in wsgi_app
    response = self.handle_exception(e)
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1741, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/local/lib/python3.6/site-packages/flask_appbuilder/security/decorators.py", line 26, in wraps
    return f(self, *args, **kwargs)
  File "/home/superset/superset/models/core.py", line 1163, in wrapper
    value = f(*args, **kwargs)
  File "/home/superset/superset/views/core.py", line 2404, in table
    cols=columns, latest_partition=True),
  File "/home/superset/superset/models/core.py", line 908, in select_star
    schema=schema, source=utils.sources.get('sql_lab', None))
  File "/home/superset/superset/utils/core.py", line 114, in __call__
    value = self.func(*args, **kwargs)
  File "/home/superset/superset/models/core.py", line 828, in get_sqla_engine
    return create_engine(url, **params)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/__init__.py", line 423, in create_engine
    return strategy.create(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/strategies.py", line 98, in create
    (cargs, cparams) = dialect.create_connect_args(u)
  File "/usr/local/lib/python3.6/site-packages/pyhive/sqlalchemy_presto.py", line 106, in create_connect_args
    raise ValueError("Unexpected database format {}".format(url.database))
ValueError: Unexpected database format pulsar/public/default

Here the database is "pulsar" and the schema is "public/default". A working query is select * from pulsar."public/default".generator_test;

@yciabaud
Copy link

It looks like pyhive is not managing slashes in the DB URL: https://github.com/dropbox/PyHive/blob/master/pyhive/sqlalchemy_presto.py#L92

Will file an issue there too.

@villebro
Copy link
Member Author

It seems schemas can be specified both in the SQLAlchemy URI with a slash after the database (db/schema) name or as a query parameter (?schema=schema), and Superset is currently using the former, while your case would require the latter. That, or just escaping the schema name. Should be easy to fix as long as we know exactly how to construct the URI.

@villebro
Copy link
Member Author

@yciabaud I added a quick fix that might work; try it out and post the stacktrace if not.

@yciabaud
Copy link

I am testing this, meanwhile I can confirm I managed to make engine = create_engine('presto://localhost:8081/pulsar?schema=public/default') working.

I guess your fix will do.

@yciabaud
Copy link

I confirm everything is working for me now.
Thank you very much for your work.

@villebro villebro changed the title [WIP] bugfix: Tables and schemas with slashes break SQL Lab bugfix: Improve support for special characters in schema and table names Apr 15, 2019
@villebro
Copy link
Member Author

Ready for review.

@villebro
Copy link
Member Author

villebro commented May 6, 2019

@mistercrunch could you take a look at this PR? I added unit tests (both python and js) and fixed existing async tests that weren't functioning properly. This seems to be working well and shouldn't have adverse side-effects.

@mistercrunch
Copy link
Member

fantas

@mistercrunch mistercrunch merged commit 959c35d into apache:master May 8, 2019
@villebro villebro deleted the encode_schema branch May 9, 2019 05:59
xtinec pushed a commit that referenced this pull request May 10, 2019
* filter out all nan series (#7313)

* improve not rich tooltip (#7345)

* Create issue_label_bot.yaml (#7341)

* fix: do not save colors without a color scheme (#7347)

* [wtforms] Strip leading/trailing whitespace (#7084)

* [schema] Updating the datasources schema (#5451)

* limit tables/views returned if schema is not provided (#7358)

* limit tables/views returned if schema is not provided

* fix typo

* improve code performance

* handle the case when table name or view name does not present a schema

* Add type anno (#7342)

* Updated local dev instructions to include missing step

* First pass at type annotations

* [schema] Updating the base column schema (#5452)

* Update 937d04c16b64_update_datasources.py (#7361)

* Feature flag for client cache (#7348)

* Feature flag for client cache

* Fix integration test

* Revert "Fix integration test"

This reverts commit 58434ab.

* Feature flag for client cache

* Fix integration tests

* Add feature flag to config.py

* Add another feature check

* Fix more integration tests

* Fix raw HTML in SliceAdder (#7338)

* remove backendSync.json (#7331)

* [bubbles] issue when using duplicated metrics (#7087)

* SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04 (#7359)

* SUPERSET-8: Update text in docs copyright footer (#7360)

* SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04

* SUPERSET-8: Extra text in docs copyright footer

* [schema] Adding commits and removing unnecessary foreign-key definitions (#7371)

*  Store last selected dashboard in sessionStorage (#7181)

* Store last selected dashboard in sessionStorage

* Fix tests

* [schema] Updating the base metric schema (#5453)

* Fix NoneType bug & fill the test recipients with original recipients if empty (#7365)

* Added living goods as among the users of Superset (#7407)

* Added living goods as among the users of Superset

Living Goods is a non profit organisation with operation in africa and the middle east. We work in community health use data heavily on day to day. Superset is our platform of choice for dashboards.

* Update README.md

* [dashboard] allow user re-order top-level tabs (#7390)

* [SQL Lab] Increase timeout threshold for offline check (#7411)

* Bump FAB to 2.0.0 (#7323)

* Bump FAB to 2.0.0

* [tests] whitelist SecurityApi login and refresh endpoints

* [style] Fix, C812 missing trailing commas

* [security] Remove SUPERSET_UPDATE_PERMS flag

Registering sources needs to be performed after the views are
initialized on UPDATE_PERMS=False configuration

* [docs] New, FAB_UPDATE_PERMS and flask fab cli

* [docs] Fix, db upgrade needs to come first, create-admin needs a db

* [cli] New, superset init bootstraps all permissions for FAB and Superset

* [style] Fix, flakes

* [annotations] Improves UX on annotation validation, start_dttm, end_dttm (#7326)

* Setting renderTrigger on label_colors (#7410)

* Refactor out controlUtils.js module + unit tests (#7350)

* [WiP]refactor out a controlUtils.js file

* unit tests

* add missing license

* Addressing comments

* feature: see Presto row and array data types (#7413)

* Merge lastest from master into lyft-release-sp8 (#7405)

* filter out all nan series (#7313)

* improve not rich tooltip (#7345)

* Create issue_label_bot.yaml (#7341)

* fix: do not save colors without a color scheme (#7347)

* [wtforms] Strip leading/trailing whitespace (#7084)

* [schema] Updating the datasources schema (#5451)

* limit tables/views returned if schema is not provided (#7358)

* limit tables/views returned if schema is not provided

* fix typo

* improve code performance

* handle the case when table name or view name does not present a schema

* Add type anno (#7342)

* Updated local dev instructions to include missing step

* First pass at type annotations

* [schema] Updating the base column schema (#5452)

* Update 937d04c16b64_update_datasources.py (#7361)

* Feature flag for client cache (#7348)

* Feature flag for client cache

* Fix integration test

* Revert "Fix integration test"

This reverts commit 58434ab.

* Feature flag for client cache

* Fix integration tests

* Add feature flag to config.py

* Add another feature check

* Fix more integration tests

* Fix raw HTML in SliceAdder (#7338)

* remove backendSync.json (#7331)

* [bubbles] issue when using duplicated metrics (#7087)

* SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04 (#7359)

* SUPERSET-8: Update text in docs copyright footer (#7360)

* SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04

* SUPERSET-8: Extra text in docs copyright footer

* [schema] Adding commits and removing unnecessary foreign-key definitions (#7371)

*  Store last selected dashboard in sessionStorage (#7181)

* Store last selected dashboard in sessionStorage

* Fix tests

* [schema] Updating the base metric schema (#5453)

* Fix NoneType bug & fill the test recipients with original recipients if empty (#7365)

* feat: see Presto row and array data types (#7391)

* feat: see Presto row and array data types

* fix: address PR comments

* fix: lint and build issues

* fix: add types

* add stronger type hints where possible

* fix: lint issues and add select_star func in Hive

* add missing pkg init

* fix: build issues

* fix: pylint issues

* fix: use logging instead of print

* Removed --console-log and superset runserver (#7421)

* Fixes dashboard export button missing download and #7353 (#7427)

* Added additional German translations to string file (#6604)

* Added additional German translations to string file

Updates to German translation files as per directions

* Removed messages.json

* [fix] Fixing SQL parsing issue (#7374)

* add chinese translate (#7402)

* Quick fix to address deadlock issue (#7434)

* feat: view presto row objects in data grid (#7445)

* Merge lastest from master into lyft-release-sp8 (#7405)

* filter out all nan series (#7313)

* improve not rich tooltip (#7345)

* Create issue_label_bot.yaml (#7341)

* fix: do not save colors without a color scheme (#7347)

* [wtforms] Strip leading/trailing whitespace (#7084)

* [schema] Updating the datasources schema (#5451)

* limit tables/views returned if schema is not provided (#7358)

* limit tables/views returned if schema is not provided

* fix typo

* improve code performance

* handle the case when table name or view name does not present a schema

* Add type anno (#7342)

* Updated local dev instructions to include missing step

* First pass at type annotations

* [schema] Updating the base column schema (#5452)

* Update 937d04c16b64_update_datasources.py (#7361)

* Feature flag for client cache (#7348)

* Feature flag for client cache

* Fix integration test

* Revert "Fix integration test"

This reverts commit 58434ab.

* Feature flag for client cache

* Fix integration tests

* Add feature flag to config.py

* Add another feature check

* Fix more integration tests

* Fix raw HTML in SliceAdder (#7338)

* remove backendSync.json (#7331)

* [bubbles] issue when using duplicated metrics (#7087)

* SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04 (#7359)

* SUPERSET-8: Update text in docs copyright footer (#7360)

* SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04

* SUPERSET-8: Extra text in docs copyright footer

* [schema] Adding commits and removing unnecessary foreign-key definitions (#7371)

*  Store last selected dashboard in sessionStorage (#7181)

* Store last selected dashboard in sessionStorage

* Fix tests

* [schema] Updating the base metric schema (#5453)

* Fix NoneType bug & fill the test recipients with original recipients if empty (#7365)

* feat: see Presto row and array data types (#7391)

* feat: see Presto row and array data types

* fix: address PR comments

* fix: lint and build issues

* fix: add types

* Incorporate feedback from initial PR (prematurely merged to lyft-release-sp8) (#7415)

* add stronger type hints where possible

* fix: lint issues and add select_star func in Hive

* add missing pkg init

* fix: build issues

* fix: pylint issues

* fix: use logging instead of print

* feat: view presto row objects in data grid

* fix: address feedback

* fix: spacing

* feat: Scheduling queries from SQL Lab (#7416) (#7446)

* Merge lastest from master into lyft-release-sp8 (#7405)

* filter out all nan series (#7313)

* improve not rich tooltip (#7345)

* Create issue_label_bot.yaml (#7341)

* fix: do not save colors without a color scheme (#7347)

* [wtforms] Strip leading/trailing whitespace (#7084)

* [schema] Updating the datasources schema (#5451)

* limit tables/views returned if schema is not provided (#7358)

* limit tables/views returned if schema is not provided

* fix typo

* improve code performance

* handle the case when table name or view name does not present a schema

* Add type anno (#7342)

* Updated local dev instructions to include missing step

* First pass at type annotations

* [schema] Updating the base column schema (#5452)

* Update 937d04c16b64_update_datasources.py (#7361)

* Feature flag for client cache (#7348)

* Feature flag for client cache

* Fix integration test

* Revert "Fix integration test"

This reverts commit 58434ab.

* Feature flag for client cache

* Fix integration tests

* Add feature flag to config.py

* Add another feature check

* Fix more integration tests

* Fix raw HTML in SliceAdder (#7338)

* remove backendSync.json (#7331)

* [bubbles] issue when using duplicated metrics (#7087)

* SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04 (#7359)

* SUPERSET-8: Update text in docs copyright footer (#7360)

* SUPERSET-7: Docker compose config version breaks on Ubuntu 16.04

* SUPERSET-8: Extra text in docs copyright footer

* [schema] Adding commits and removing unnecessary foreign-key definitions (#7371)

*  Store last selected dashboard in sessionStorage (#7181)

* Store last selected dashboard in sessionStorage

* Fix tests

* [schema] Updating the base metric schema (#5453)

* Fix NoneType bug & fill the test recipients with original recipients if empty (#7365)

* feat: see Presto row and array data types (#7391)

* feat: see Presto row and array data types

* fix: address PR comments

* fix: lint and build issues

* fix: add types

* Incorporate feedback from initial PR (prematurely merged to lyft-release-sp8) (#7415)

* add stronger type hints where possible

* fix: lint issues and add select_star func in Hive

* add missing pkg init

* fix: build issues

* fix: pylint issues

* fix: use logging instead of print

* feat: view presto row objects in data grid

* fix: address feedback

* fix: spacing

* Workaround for no results returned (#7442)

* feat: view presto row objects in data grid (#7436)

* feat: view presto row objects in data grid

* fix: address feedback

* fix: spacing

* feat: Scheduling queries from SQL Lab (#7416)

* Lightweight pipelines POC

* Add docs

* Minor fixes

* Remove Lyft URL

* Use enum

* Minor fix

* Fix unit tests

* Mark props as required

* feat: Add `validate_sql_json` endpoint for checking that a given sql query is valid for the chosen database (#7422) (#7462)

merge from lyft-release-sp8 to master

* Adds missing metric sum__SP_RUR_TOTL (#7452)

* Late import for optional lib pyhive (#7471)

* Late import for optional lib pyhive

* fix

* fix: calendar heatmap examples (#7375)

Fixing a set of examples that trip on ValueError vs TypeError

* bugfix: Improve support for special characters in schema and table names (#7297)

* Bugfix to SQL Lab to support tables and schemas with characters that require quoting

* Remove debugging prints

* Add uri encoding to secondary tables call

* Quote schema names for presto

* Quote selected_schema on Snowflake, MySQL and Hive

* Remove redundant parens

* Add python unit tests

* Add js unit test

* Fix flake8 linting error

* [dashboard] After update filter, trigger new queries when charts are visible (#7233)

* trigger query when chart is visible

* add integration test

* fix: alter sql columns to long text #7463 (#7476)

Merge lyft-release-sp8@7bfe7bc to master

* Refactor ConsoleLog (#7428)

* Revised Chinese translation (#7464)

* add chinese translate

* edit chinese translation

* druid connector: avoid using 'dimensions' for scan queries (#7377)

After the following PyDruid change (contained in version 0.5.2)
the Superset Histogram charts rendered with Druid data are
broken:

druid-io/pydruid@0a59a70

Bump the pydruid requirements accordingly in setup.py

Issue: #7368
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.34.0 labels Feb 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/M 🚢 0.34.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error in SQL editor with slashes in schema name
4 participants