Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ORM query service #11774

Open
aaronc opened this issue Apr 26, 2022 · 5 comments
Open

ORM query service #11774

aaronc opened this issue Apr 26, 2022 · 5 comments
Labels

Comments

@aaronc
Copy link
Member

aaronc commented Apr 26, 2022

User Story

As a module developer, I would like the ORM to auto-generate a query service for clients so that I don't have to manually implement gRPC queries.

Possible Designs

None of these are mutually exclusive, it's a matter of prioritizing where we focus energy.

A) Generic gRPC service

This would involve creating a single gRPC endpoint that allows querying all the ORM tables in an app and returns responses using protobuf Anys. One possible approach is to support a set of operators similar to what is supported in mongodb.

Ex:

service Query {
  rpc Find(FindRequest) returns (FindResponse);
}

message FindRequest {
  string table = 1;
  repeated QueryOperator operators = 2;
  repeated Ordering order = 3;
  cosmos.base.query.v1beta1.PageRequest page = 4;
}

message FindResponse {
  repeated google.protobuf.Any results = 1;
  cosmos.base.query.v1beta1.PageResponse page = 2;
}

B) Generic JSON REST service

Same as the above but with JSON instead of protobuf to allow for a simpler more flexible query language along the lines of what is provided in mongo.

C) Generate gRPC .proto files for all tables

We could generate a query service for every ORM .proto file and dynamically serve it. For instance, for the bank example in the ORM tests, we would generate a query service somewhat like this:

service Query {
  rpc Balance(QueryBalanceRequest) returns (QueryBalanceResponse);
  rpc Balances(QueryBalancesRequest) returns (QueryBalancesResponse);
...
}

This would require another layer of codegen and is more similar to what we currently have with gRPC.

D) Graphql

While graphql is a departure from gRPC, it is likely better for building client apps and would allow for getting a bunch of objects in a single query and also traversing some joins (which could be done with additional protobuf annotations). A proposed design to follow is https://www.graphile.org/postgraphile/introduction/ which automatically generates Graphql services just from SQL table definitions.

Design Considerations

Some issues to address when designing this are:

Storage Types vs Presentation Types

See regen-network/regen-ledger#930. In particular, in lots of state tables we store addresses as bytes but users are used to seeing these usually as bech32 strings. All of these fields should be annotated with cosmos.AddressBytes so a smart query layer could do the conversion.

Logical Queries vs Direct Index Access

ORM queries in the state machine are currently done with direct index access and don't really have a separation between a logical and physical plan as more robust databases do. This is probably good for state machines generally because it is more predictable in terms of performance and gas consumption. gRPC queries are also generally written as direct index access just because that's easier to do.

An auto-generated client query layer could expose indexes directly as we currently do or include a more generic set of logical query filters like MongoDB or SQL. The latter would require some query planner even if it is fairly naive. The advantage of full logical queries are:

  • allows for the same queries regardless of which indexes exist
  • is stable between different versions of a state machine which may add/remove indexes
  • allows for the same queries to be retargeted against a traditional DB so that clients can write the same app where the backing data store is direct state or something like Postgres or Mongo with indexed data

For these reasons, I'm leaning towards supporting logical queries with the most naive query planner that is non-trivial (meaning that indexes are selected based on some very simple rule-based heuristics and in memory filtering and sorting are the fallback for all non-trivial cases).

@aaronc aaronc mentioned this issue Apr 26, 2022
33 tasks
@robert-zaremba
Copy link
Collaborator

I know that my opinion may not be the most popular one, but honestly, if we knock out an out of the box integration with a DB with tonnes of integrations from the get go, like: Postgresql (graphql) or Mongodb (mongodb cli), then instead of reinventing and maintaining solutions we embrace existing leading solutions with web2 world with tonnes of integration and tooling.

@aaronc mention that we still need historical queries (tbh, I don't know any app except some bridges which may require it). So maybe only querying by a primary key is enough?

@aaronc
Copy link
Member Author

aaronc commented Apr 26, 2022

I know that my opinion may not be the most popular one, but honestly, if we knock out an out of the box integration with a DB with tonnes of integrations from the get go, like: Postgresql (graphql) or Mongodb (mongodb cli), then instead of reinventing and maintaining solutions we embrace existing leading solutions with web2 world with tonnes of integration and tooling.

@aaronc mention that we still need historical queries (tbh, I don't know any app except some bridges which may require it). So maybe only querying by a primary key is enough?

I totally agree we should make it really easy to index state in conventional databases like Postgres and Mongo. However, I also think the SDK should provide decent out of the box querying which is the current status quo. The reasons in my mind for this are:

  • it's a much simpler deployment - a single binary without lots of devops. Especially in testing and experimental scenarios needing to deploy postgres &/or mongo could be a deal breaker
  • people are already used to it
  • IAVL and store/v2 support historical queries which is actually pretty hard to engineer into a major conventional DB, and the only DBs that do support this would require a fair amount of extra work and involve complex deployments and other limitations (see Datomic or Crux)
  • proofs and queries supported at the same layer

@aaronc
Copy link
Member Author

aaronc commented Apr 26, 2022

My thinking right now is to move forward with approach A & D together, starting with A. Not sure about the logical query layer. On one hand it is sort of better and it's not that hard to implement it naively. But also it obscures performance problems if there's no plan to improve it. So direct index access may be the simple way to get the most bang for the buck and further efforts can be placed on indexing in other dbs as @robert-zaremba suggests.

@aaronc
Copy link
Member Author

aaronc commented Apr 26, 2022

#11791 proposes .proto definitions for a generic gRPC query service (approach A) that uses direct index access rather than logical queries

@aaronc
Copy link
Member Author

aaronc commented Apr 28, 2022

I discussed this a bit with @robert-zaremba today and I think I agree with his point that there's a risk of postponing the inevitable need of a lot of projects to break queries out into a secondary layer as the project grows.

I still think we should provide out-of-the-box queries as that will be desirable at the experimental stage. But I think it may be worth the effort to define a stable API with logical queries that can be retargetted against other DBs as a project grows.

In particular, there's a risk with direct index access of encouraging clients to build too much around available on-chain indexes. And I think we actually should allow a module remove an index if it is only used by clients but not the state machine, with the understanding that this can easily be offloaded to other DBs.

So I'm thinking of closing #11791 and proposing a logical query layer that is sufficent but also not too complex probably in an ADR.

mergify bot pushed a commit that referenced this issue Jun 13, 2022
## Description

Ref #11774 

This proposes a generic gRPC query service that uses direct index access and can basically be implemented as a thin layer directly on top of `ormtable.Table`. There is relatively little code required to implement this but it can potentially provide a lot of value as a way to expose queries to basically everything that's using the ORM without needing to implement any custom gRPC queries.

A version of this which is type safe is also proposed in #11774 but that would require additional codegen of .proto files. #11774 also proposes a logical query layer instead of direct index access but that's substantially more code to do even naively and an optimized version would take more effort.



---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [ ] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] added `!` to the type prefix if API or client breaking change
- [ ] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#pr-targeting))
- [ ] provided a link to the relevant issue or specification
- [ ] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/main/docs/building-modules)
- [ ] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#testing)
- [ ] added a changelog entry to `CHANGELOG.md`
- [ ] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [ ] updated the relevant documentation or specification
- [ ] reviewed "Files changed" and left comments if necessary
- [ ] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed 
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)
mergify bot pushed a commit that referenced this issue Oct 7, 2022
## Description

This starts the implementation of approach (C) in #11774 by generating `_query.proto` files for `.proto` files which have ORM definitions. It does this using a new protoc plugin called `protoc-gen-go-cosmos-orm-proto`.

Please review `bank_query.proto` and `test_schema_query.proto` as these are the outputs of the codegen.

The approach taken does not attempt to do any sort of logical queries as discussed in #11774 and instead provides direct index access in the most simple and complete way that we can easily implement now. More advanced things in #11774 could be implemented later, but this should be possible to deliver relatively quickly and should allow developers to completely skip writing gRPC queries manually if they use ORM.

Note that the implementation of these queries can provide merkle proofs because the mapping to state is well known.

I did need to disable pulsar codegen in this PR because it doesn't support proto3 optionals which are needed here (depends on cosmos/cosmos-proto#12). Fortunately, pulsar is 100% compatible with the official google codegen and there is no problem running ORM tests against the official codegen.



---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [ ] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] added `!` to the type prefix if API or client breaking change
- [ ] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#pr-targeting))
- [ ] provided a link to the relevant issue or specification
- [ ] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/main/docs/building-modules)
- [ ] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#testing)
- [ ] added a changelog entry to `CHANGELOG.md`
- [ ] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [ ] updated the relevant documentation or specification
- [ ] reviewed "Files changed" and left comments if necessary
- [ ] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed 
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)
Wryhder pushed a commit to Wryhder/cosmos-sdk that referenced this issue Oct 26, 2022
## Description

This starts the implementation of approach (C) in cosmos#11774 by generating `_query.proto` files for `.proto` files which have ORM definitions. It does this using a new protoc plugin called `protoc-gen-go-cosmos-orm-proto`.

Please review `bank_query.proto` and `test_schema_query.proto` as these are the outputs of the codegen.

The approach taken does not attempt to do any sort of logical queries as discussed in cosmos#11774 and instead provides direct index access in the most simple and complete way that we can easily implement now. More advanced things in cosmos#11774 could be implemented later, but this should be possible to deliver relatively quickly and should allow developers to completely skip writing gRPC queries manually if they use ORM.

Note that the implementation of these queries can provide merkle proofs because the mapping to state is well known.

I did need to disable pulsar codegen in this PR because it doesn't support proto3 optionals which are needed here (depends on cosmos/cosmos-proto#12). Fortunately, pulsar is 100% compatible with the official google codegen and there is no problem running ORM tests against the official codegen.



---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [ ] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] added `!` to the type prefix if API or client breaking change
- [ ] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#pr-targeting))
- [ ] provided a link to the relevant issue or specification
- [ ] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/main/docs/building-modules)
- [ ] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#testing)
- [ ] added a changelog entry to `CHANGELOG.md`
- [ ] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [ ] updated the relevant documentation or specification
- [ ] reviewed "Files changed" and left comments if necessary
- [ ] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed 
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)
@tac0turtle tac0turtle removed the T:Epic Epics label Nov 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: ☃️ Icebox
Status: Epic Backlog
Development

No branches or pull requests

3 participants