Adding YAML Import-Export for Datasources to CLI #3978

Merged
merged 2 commits into from
Dec 5, 2017


fabianmenges
Contributor

@fabianmenges fabianmenges commented Dec 1, 2017

Summary

Adds YAML import and export for datasources, including (SQLAlchemy) databases and Druid clusters, to the Superset CLI.

Description

I added the core of the import/export logic to the ImportMixin mix-in class. It relies heavily on SQLAlchemy to determine the schema of the YAML and the relationships between objects. Specifically, it uses unique constraints to identify and update existing elements. This required adding unique constraints to existing tables, but I'm fairly confident this will not cause major issues for existing installations, since I added them in the "spirit" of the current design.
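
The "identify and update by unique constraint" idea can be sketched in plain Python. This is only an illustration under assumptions: `Record`, `upsert`, and the column names are hypothetical stand-ins, and the unique columns are passed in by hand, whereas the PR derives them from the SQLAlchemy table metadata.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical stand-in for an ORM row; `values` maps column name to value."""
    values: dict = field(default_factory=dict)


def upsert(existing, incoming, unique_columns):
    """Update the record matching `incoming` on `unique_columns`, or add a new one.

    Returns the record that was updated or created.
    """
    key = tuple(incoming[col] for col in unique_columns)
    for record in existing:
        if tuple(record.values.get(col) for col in unique_columns) == key:
            record.values.update(incoming)  # match found: update in place
            return record
    record = Record(dict(incoming))  # no match: create a new record
    existing.append(record)
    return record


# Illustrative data only; the unique column is assumed to be "database_name".
rows = [Record({"database_name": "main", "sqlalchemy_uri": "sqlite:///old.db"})]
upsert(rows, {"database_name": "main", "sqlalchemy_uri": "sqlite:///new.db"}, ["database_name"])
upsert(rows, {"database_name": "analytics", "sqlalchemy_uri": "sqlite:///a.db"}, ["database_name"])
```

Importing the same name twice updates the existing entry instead of duplicating it, which is why each table needs a unique constraint that identifies a row.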

In addition to the SQLAlchemy relationships, the main object hierarchy needs to be defined by configuring the export_parent and export_children attributes appropriately (documented in code).
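
A minimal sketch of how such a hierarchy could be declared and walked on export. Only the attribute names export_parent and export_children come from this PR; `ExportNode`, `export_fields`, `export_to_dict`, and the three example classes are assumptions made for illustration, and plain Python objects stand in for the SQLAlchemy models.

```python
class ExportNode:
    """Hypothetical stand-in for a model class using the mix-in."""

    export_parent = None    # attribute name pointing at the parent (unused in this export-only sketch)
    export_children = []    # attribute names that hold lists of child objects
    export_fields = []      # assumed helper: plain attributes to serialize

    def __init__(self, **attrs):
        for name, value in attrs.items():
            setattr(self, name, value)

    def export_to_dict(self):
        """Serialize own fields, then recurse into each configured child list."""
        data = {name: getattr(self, name) for name in self.export_fields}
        for child_attr in self.export_children:
            data[child_attr] = [
                child.export_to_dict() for child in getattr(self, child_attr)
            ]
        return data


class Column(ExportNode):
    export_parent = "table"
    export_fields = ["column_name"]


class Table(ExportNode):
    export_parent = "database"
    export_children = ["columns"]
    export_fields = ["table_name"]


class Database(ExportNode):
    export_children = ["tables"]
    export_fields = ["database_name"]


db = Database(
    database_name="main",
    tables=[Table(table_name="events", columns=[Column(column_name="ts")])],
)
exported = db.export_to_dict()
```

The resulting nested dict mirrors the parent/child configuration and could then be handed to a YAML serializer.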

The unit test covers only basic importing and exporting (it was lifted from the existing pickle import/export) and should probably be extended to cover more edge cases.

You can export databases, druid clusters, tables, datasources from the UI:
[Screenshot, 2017-11-16 at 6:52:46 PM]

Possible Future Projects/Improvements

  • Add YAML import/export for Slices and Dashboards.
  • Support importing/exporting individual databases, tables, metrics, etc. as the root element of a YAML file.
  • Support importing multiple YAML files at the same time.
  • Support splitting exports into smaller files.
  • Use the YAML import for the example datasets.
  • Add YAML import to the Web UI and/or API.

Re-creation of #2993, which was getting too old for Travis.
Context: https://groups.google.com/forum/#!topic/airbnb_superset/GeWZs42_NyA

@mistercrunch
Member

Looks like Travis isn't kicking in; any idea why? I've only noticed this before on your PRs. Bad luck, or something related to your workflow?

@fabianmenges fabianmenges changed the title Adding YAML Import/Export for Datasources to CLI Adding YAML Import Export for Datasources to CLI Dec 2, 2017
@fabianmenges fabianmenges changed the title Adding YAML Import Export for Datasources to CLI Adding YAML Import-Export for Datasources to CLI Dec 2, 2017
@fabianmenges
Contributor Author

Apparently Travis says 'abuse detected'. I'm contacting Travis support and will CC you.

@mistercrunch
Member

In the meantime you can probably branch off your branch, amend the commit with a different commit message to get a different SHA, and open a new PR off of it referencing this one. Might work...

@fabianmenges
Contributor Author

Finally got picked up by Travis.

@mistercrunch mistercrunch merged commit 72627b1 into apache:master Dec 5, 2017
@yamyamyuo
Contributor

yamyamyuo commented Mar 6, 2018

Hi @mistercrunch, I am very interested in this feature:

Possible Future Projects/Improvements:
Add YAML import/export for Slices.

Are there any plans for that?

@mistercrunch
Member

@yamyamyuo note that it's possible to do most of this using the ORM directly. Could be good to have a code generation feature as an export. Exporting would generate code that uses the ORM to create the slice and/or dashboard.

Check out this file for an example of how we generate the examples:
https://github.com/apache/incubator-superset/blob/master/superset/data/__init__.py#L79

Note that of course this isn't a public API, and we won't provide guarantees around forward compatibility, though it shouldn't change much.
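
As a rough sketch of the code-generation idea above, an exporter could emit Python source that constructs the object through the ORM. Everything here is hypothetical: `generate_slice_code` is not a Superset function, and the field names are illustrative only.

```python
def generate_slice_code(slice_fields, class_name="Slice"):
    """Return Python source that constructs `class_name` with the given fields.

    Field values are rendered with repr(), so this sketch only handles
    values whose repr() is valid Python (strings, numbers, None, ...).
    """
    args = ", ".join(
        f"{name}={value!r}" for name, value in sorted(slice_fields.items())
    )
    return f"{class_name}({args})"


source = generate_slice_code({"slice_name": "Energy Sankey", "viz_type": "sankey"})
print(source)  # Slice(slice_name='Energy Sankey', viz_type='sankey')
```

The generated line could be pasted into a script that imports the model class and adds the object to a session, in the same way the example-loading file builds its slices.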

@yamyamyuo
Contributor

yamyamyuo commented Mar 7, 2018

@mistercrunch I took a deeper look at Superset and found the security part very detailed but also complex.

  1. How does Airbnb manage user authorization? Do you pick some people as admins who grant permissions when new users arrive? How does an admin know every user's role? For example, should I grant the Finance group's datasource to a marketing employee?
  2. Isn't it tedious for the admin to add, delete, and edit all the roles and assign them to each user? Is there an automatic or more efficient way?
  3. Does Airbnb deploy Superset to separate environments, such as development, staging, and production? Do analysts create a slice/dashboard in the development environment and, once it works, repeat the process in production? How do analysts create the same slice/dashboard in more than one environment? We can't just copy and paste, since slices cannot be imported/exported. Do they repeat the same process in each environment?

That's a lot of questions 🙈 I would really appreciate it if you could share your experience or advice!

michellethomas pushed a commit to michellethomas/panoramix that referenced this pull request May 24, 2018
* Adding import and export for databases

* Linting
wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018
* Adding import and export for databases

* Linting
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.21.0 labels Feb 27, 2024