Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for deferred concurrency check by using token's manually set value rather than value at time of retrieval #27865

Closed
JohnCgp opened this issue Apr 22, 2022 · 8 comments

Comments

@JohnCgp
Copy link

JohnCgp commented Apr 22, 2022

On EF Core 5.0.0, using Npgsql 5.0.10. I'm not sure if this should be filed under Npgsql itself, as the EF Core docs say

Database providers are responsible for implementing the comparison of concurrency token values.

But I don't know if this behaviour is delegated to the provider or is part of EF Core itself.

What problem are you trying to solve?

I want to implement optimistic concurrency on a web API, in a deferred manner. By deferred, I mean I would like to retrieve an entity, and then set the value for its concurrency token as provided by the caller of the API. This way, if the entity has been changed between the caller originally retrieving it and then trying to change it, a DbUpdateConcurrencyException is thrown. An arbitrary amount of time could have passed between the initial retrieval and the subsequent edit.

The scenario would look something like:

Given Bob queries the API for car ID = 1
And Anna queries the API for car ID = 1
And Anna queries the API to change the colour to Red
When Bob queries the API to change the colour to Blue
Then a DbUpdateConcurrencyException is thrown

Initially I thought that is how it worked, but I'm not able to cause a concurrency exception to be thrown. A closer reading of the documentation says:

whenever an update or delete operation is performed during SaveChanges, the value of the concurrency token on the database is compared against the original value read by EF Core.

So I understand that EF Core doesn't actually check the value of the token at the moment of calling SaveChanges(), but rather it internally keeps a handle on the value the token had at the moment of querying it from the database.

Here's a summarised version of what I'm doing:

public class Car
{
    public int Id { get; set; }
    public string Colour { get; set; }
    public uint Version { get; set; }
}

public partial class CarsDbContext : DbContext
{
    public DbSet<Car> Cars { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder
        .Entity<Car>(carBuilder =>
        {
            // Equivalent to UseXminAsConcurrencyToken(), but using a
            // more semantic property name "Version" instead of "xmin" which
            // might be unfamiliar to future readers.
            // Implemented as an explicit property to ensure cars
            // can be passed across context instances without
            // losing version information. See
            // https://stackoverflow.com/a/60871156 and
            // https://www.npgsql.org/efcore/modeling/concurrency.html
            carBuilder.Property(car => car.Version)
            .HasColumnType("xid")
            .HasColumnName("xmin")
            .ValueGeneratedOnAddOrUpdate()
            .IsConcurrencyToken();
        });
    }
}

public class Tests
{
    [Fact]
    public void Car_EditedByOtherUser_DeferredSaveThrows()
    {
        var car = new Car() { Colour = "Red" };

        using (var dbContext = NewDbContext())
        {
            dbContext.Cars.Add(car);
            dbContext.SaveChanges();
        }

        // Imagine Bob and Anna retrieved the car as of this version.
        var originalVersion = car.Version;

        // Anna saves changes.
        using (var dbContext = NewDbContext())
        {
            var concurrentUpdate = dbContext.Cars
                .Single(x => x.Id == car.Id);

            car.Colour = "Blue";
            dbContext.SaveChanges();
        }

        // Bob also saves changes, without having retrieved the new version of
        // the car, so his API call passes in `originalVersion`.
        using (var dbContext = NewDbContext())
        {
            car = dbContext.Cars
                .Single(x => x.Id == car.Id);

            car.Colour = "Yellow";
            car.Version = originalVersion;

            // Should throw DbUpdateConcurrencyException as the Car's version has
            // been set to reflect the value at the time the API consumer retrieved
            // it, but it no longer matches the value in the database because the Car
            // has been edited in the meantime.
            dbContext.SaveChanges();
        }
    }
}

I am able to cause a DbUpdateConcurrencyException to be thrown if, instead of retrieving car again, I just Attach it to the final DbContext, so at least I know the concurrency check is happening.

Describe the solution you'd like

For dbContext.SaveChanges() to throw a DbUpdateConcurrencyException in the described scenario. My current workaround is to manually compare the passed in version with the one on the entity before saving.

edit: fixed car value setting code.

@JohnCgp
Copy link
Author

JohnCgp commented Apr 22, 2022

For any future readers, I just stumbled upon a way to have EF Core use a manually assigned value for the concurrency token. I'd tried updating the CurrentValues at some point, so close!

I'd still be keen to have EF Core use the concurrency token's value itself - my Car's properties are actually private set, with an Update method. To apply this approach I'd have to have my model be aware of the persistence layer, which I'd rather avoid.

edit: found another discussion on this behaviour and how it's initially confusing.

@JohnCgp
Copy link
Author

JohnCgp commented Apr 26, 2022

Ended up overriding the DbContext's SaveChanges and SaveChangesAsync methods, doing this before calling the base implementation:

foreach (var entry in ChangeTracker.Entries<Car>())
    entry.OriginalValues[nameof(Car.Version)] = entry.Entity.Version;

Which ensures the version is checked regardless of how the entity is saved.

@ajcvickers
Copy link
Member

ajcvickers commented Apr 26, 2022

If the current value is checked, then this would flow throw:

        // Anna saves changes.
        using (var dbContext = NewDbContext())
        {
            var concurrentUpdate = dbContext.Cars
                .Single(x => x.Id == car.Id);

            car.Colour = "Blue";
            dbContext.SaveChanges();
        }

Since the colour is now different in the database than it is in the tracked entity. This is why the original value is used.

@JohnCgp
Copy link
Author

JohnCgp commented Apr 26, 2022

If the current value is checked, then this would flow:

Not sure I follow, by flow do you mean the change would be saved without any error? If so, that's what I would expect (and how it works in the test). Anna retrieves the latest entity and updates a property that is not used as a concurrency token (Colour).

Since the colour is now different in the database than it is in the tracked entity.

Yep, the colour in the database is red, and in the tracked entity it is blue. Because the concurrency token (the Version property) is the expected one, the save is successful. After the save, Version has been bumped allowing edits that use the original value of Version (when it was red) to be caught as out of date.

This is why the original value is used.

This is where I get lost. Due to the original value of Version being used, when Bob queries the API with the value he is aware of (when the car was red, prior to Anna's update), the save is successful and Bob clobbers Anna's change that made the car blue.

        // Bob also saves changes, without having retrieved the new version of
        // the car, so his API call passes in `originalVersion`.
        using (var dbContext = NewDbContext())
        {
            car = dbContext.Cars
                .Single(x => x.Id == car.Id);

            car.Colour = "Yellow";
            car.Version = originalVersion; // <--- this does not cause the optimistic concurrency check to trigger, it has to be done as
            // car.OriginalValues[nameof(Car.Version)] = originalVersion;

            // Should throw DbUpdateConcurrencyException as the Car's version has
            // been set to reflect the value at the time the API consumer retrieved
            // it, but it no longer matches the value in the database because the Car
            // has been edited in the meantime.
            dbContext.SaveChanges();
        }

@ajcvickers
Copy link
Member

@JohnCgp Sorry, "flow" was a typo. I missed that there was a different concurrency token. Yes, I can see what you are saying; we will discuss.

@KillerBoogie
Copy link

Hi John, to my knowledge when using optimistic locking with concurrency tokens you need to consider two different cases.

  1. One is as you described above with Anna and Bob. Here the entity has been changed already before the second call hits the API.

  2. The other one is a race condition between two API calls. Both calls enter the API with the same current concurrency token, but one thread is first for the database update.

Both types can be detected inside the database comparing the concurrency token. Still you need to make sure a suitable isolation level is set for the transaction and that the check and the update are atomic. If you use a stored procedure that does a check in one statement and then the update in another one you are not protected against those race conditions. In general standard isolation levels for transactions, despite what you would naively expect (I did), do not protect against all race conditions. To prevent write skew you need to use additional locking.

With the first case, the second API call already entered with an older token. When you handle the command you normally read the entity from the database first. Either you do this because you utilize EF Core's change tracker or you follow Domain Driven Design principles and separate your business logic in a domain layer. In any case you can immediately check the token that was provided through the ETag with the token from the read entity. If they don't match, you can immediately return a concurrency error and save to run the logic and the update database call, because it must fail. This check can also be implemented with a proper hash function instead of the database concurrency token. With hash functions you need to consider collisions.

The second case can be caught with the concurrency token in the update call to the database when properly configured. As another option you can span a transaction over the read and the update and put a lock For Update on the read record(s). In most cases this lock is only brief. It is the time that your business logic takes to process. The problem is that EF Core does not support For Update, yet, but there extensions that add this option.

If you follow HTTP standards and use If-Match/If-None-Match headers to transport the ETag, the concurrency error in first case must return a 419 Precondition Failed and the second to be precise a 409 Conflict, because the pre-condition (matching ETag) was met, but a 419 in both cases would be fine if there are no other side effects.

@ajcvickers
Copy link
Member

We discussed the case where the current value should be used because it has been set explicitly after a second database query from the one used to originally obtain the data. We could add additional metadata/rules for this case, but that would make the rules in general for understanding how this works more complicated. Instead we will continue to use the original value, if it is available, as the value that was obtained from the most recent database query, and instead document situations like this, as covered by: dotnet/EntityFramework.Docs#860

@roji
Copy link
Member

roji commented Jun 6, 2022

Additional note for reference... If we switched optimistic concurrency to check the current value rather than the original (as now), that would solve the above problem. However, it would break cases of manual concurrency tokens, where users set the concurrency token themselves. We discussed predicating checking original vs. current based on whether the concurrency token is automatically-generated (i.e. database-generated) or not, but concluded that's too much rule complexity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants