Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feedback on Cosmos DB database provider when building an application #23688

Closed
JonPSmith opened this issue Dec 15, 2020 · 3 comments
Closed

Feedback on Cosmos DB database provider when building an application #23688

JonPSmith opened this issue Dec 15, 2020 · 3 comments

Comments

@JonPSmith
Copy link

JonPSmith commented Dec 15, 2020

As part of the update of my book Entity Framework Core in Action I wrote a chapter than looked as using Cosmos DB in EF Core 5. The code can be found in the associated EfCoreinAction-SecondEdition, branch Part3 GitHub.

I could get my application using Cosmos DB to work, but I did have to implement some workarounds to make it work with EF Core 5

Here is my feedback on what I would like added to the Cosmos DB database provider, with the most useful ones first.

1. Logging output, especially Microsoft.EntityFrameworkCore.Database.Command

I was really off-putting to not having logging (I didn't realise how much I rely on logging). Not knowing what was going on made me do more testing to check that things were working.

2. Implement FromSqlRaw, FromSqlInterpolated, etc.

I'm fine with some commands missing from the Cosmos DB database provider as I am happy to add raw commands myself. The problem was that FromSqlRaw, FromSqlInterpolated, etc. didn't work.

Instead I used context.Database.GetCosmosClient and so on, but the database name isn't available (well I couldn't find it), so I had to build a system to get the database name from the appsetting.json file. It worked but it was too much work, and easy to get wrong.

3. Add RUs to the Microsoft.EntityFrameworkCore.Database.Command logging

Knowing how much RUs are used for each command is really important. It makes a big impact on performance and cost. I think it is vital to get this information.

PS. I would love to have a way to get the RUs via some sort of event/interception too, but logging is good enough.

4. Support subqueries, e.g. "SELECT DISTINCT value f.TagId FROM c JOIN f in c.Tags"

Cosmos DB is designed to have nested items, which works with EF Core's Owned Type relationships. This means subqueries are needed to work with the nested items.

5. Add more mappings to Cosmos DB SQL commands

Obviously there is a ton of work here and you know that. I expect if you do decide to improve the Cosmos DB database provider, then you will do the lot, so its not really worth me picking out certain commands.

6. Case-insensitive string commands

The main string commands like STRINGEQUAL, STARTSWITH, CONTAINS have boolean to say if case-insensitive or not.

7. Implement CanConnect to detect no Cosmos DB service

I would like to use Database.CanConnect() to check if there is Cosmos DB service available in my unit tests. Then I can show a better error message, or in an DevOps pipeline ignore these tests. Other things:

  • I don't know how CanConnect works, but I would it to not throw an exception if the service doesn't responds and times out.
  • I also would like to set a shorter timeout time to make it faster, e.g. Database.CanConnect(4) sets a timeout time of 4 seconds.

Last comments, not requests

A. I'm not sure adding relationships between entity classes in Cosmos DB is a good idea

Cosmos is about nested items, which fits EF Core's Owned Type approach really well. If you create entity to entity relationships they most likely won't work like relational database relationships (cascade delete is the obvious difference). I fear developers will use entity to entity relationships with Cosmos DB and get unstuck/confused because it doesn't work the same way.

Be interesting to see want other NoSQL databases would look like on this. MongoDb seems to use nested entities, but does talk about One-to-Many Relationships with Document References.

B. You can get some nasty errors if you add a new non-null property to Cosmos DB entity class

Scenario:

  1. You have built an app and you have saved data to Cosmos DB
  2. You need to add a new property which is non-nullable
  3. You read in the existing data and you get an exception

Lots of people are going to do that and have problems. Here are some possible solutions:

  • Maybe you add a Cosmos DB configuration setting that tells EF Core to set a non-null property which hasn't got data from Cosmos DB to its default value.
  • An MissingCosmosData exception that comes after all the properties that can be filled in have been set. The exception gives the names of al the non-null properties that haven't got data from Cosmos DB
@AndriySvyryd
Copy link
Member

Hello Jon,

Thanks for the comprehensive list. Here are some comments:

  1. Logging output, especially Microsoft.EntityFrameworkCore.Database.Command

We do log queries under DbLoggerCategory.Database.Command. Are you referring to CUD command logging?

  1. Implement FromSqlRaw, FromSqlInterpolated, etc.

Tracked by #17311

  1. Add RUs to the Microsoft.EntityFrameworkCore.Database.Command logging

Tracked by #17298

  1. Support subqueries, e.g. "SELECT DISTINCT value f.TagId FROM c JOIN f in c.Tags"

Mostly tracked by #16926 and #17957

  1. Add more mappings to Cosmos DB SQL commands
  2. Case-insensitive string commands

#16143

  1. Implement CanConnect to detect no Cosmos DB service

#23501

A. I'm not sure adding relationships between entity classes in Cosmos DB is a good idea

We share this concern, but there's also an argument that this is something that EF can add on top of Cosmos, even if it doesn't work the same way as for other providers. We are considering adding limited support for Include: #16920

B. You can get some nasty errors if you add a new non-null property to Cosmos DB entity class

#22374

@JonPSmith
Copy link
Author

JonPSmith commented Dec 15, 2020

Thanks @AndriySvyryd. Looks like you have it covered, even with the ToConnect timeout parameter. Also don't forget #16146 under item 5.

On the logging I'm used to the Information logging of all read/writes to the database with the SQL in it. Obviously the Cosmos DB writes won't be SQL but I assume the queries will have the Cosmos SQL commands in the logging.

@AndriySvyryd
Copy link
Member

On the logging I'm used to the Information logging of all read/writes to the database with the SQL in it.

Ah, I see the problem now. We are currently logging at the Debug level, filed #23693

I think that takes care of everything, so I'm going to close this.

@AndriySvyryd AndriySvyryd removed their assignment Dec 15, 2020
@ajcvickers ajcvickers reopened this Oct 16, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants