Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow managing transactions via SQL commands in the update pipeline #27532

Open
Tracked by #28479 ...
roji opened this issue Mar 1, 2022 · 2 comments
Open
Tracked by #28479 ...

Allow managing transactions via SQL commands in the update pipeline #27532

roji opened this issue Mar 1, 2022 · 2 comments
Assignees
Labels
area-perf area-save-changes needs-design punted-for-7.0 Originally planned for the EF Core 7.0 (EF7) release, but moved out due to resource constraints. type-enhancement
Milestone

Comments

@roji
Copy link
Member

roji commented Mar 1, 2022

SaveChanges implicitly starts a transaction around all changes (except where one isn't needed, see #27439). This is done by calling the standard ADO.NET APIs, DbConnection.BeginTransaction and DbTransaction.Commit. Unfortunately, each of these APIs does a roundtrip, so a typical SaveChanges involves 3 roundtrips instead of just one. Some benchmarking:

Method DatabaseType Mean Error StdDev Median Ratio RatioSD
Without_transaction Postgresql 187.4 us 3.72 us 6.32 us 188.0 us 1.00 0.00
With_transaction_via_API Postgresql 354.8 us 11.70 us 34.51 us 362.7 us 1.87 0.26
With_transaction_via_commands Postgresql 234.4 us 5.71 us 16.84 us 237.1 us 1.24 0.11
Benchmark code
BenchmarkRunner.Run<Benchmark>();

public class Benchmark
{
    [Params(DatabaseType.Postgresql, DatabaseType.SqlServer)]
    public DatabaseType DatabaseType { get; set; }

    private DbConnection _connection;
    private DbCommand _command;

    private async Task Setup()
    {
        _connection = DatabaseType switch
        {
            DatabaseType.Postgresql => new NpgsqlConnection("Host=localhost;Username=test;Password=test"),
            DatabaseType.SqlServer => new SqlConnection("Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Trust Server Certificate=true"),
            _ => throw new ArgumentOutOfRangeException()
        };

        await _connection.OpenAsync();
    }

    [GlobalSetup(Targets = new[] { nameof(Without_transaction), nameof(With_transaction_via_API) })]
    public async Task Setup_command_without_transaction()
    {
        await Setup();

        _command = _connection.CreateCommand();
        _command.CommandText = "SELECT 1";
    }

    [GlobalSetup(Target = nameof(With_transaction_via_commands))]
    public async Task Setup_command_with_transaction()
    {
        await Setup();

        _command = _connection.CreateCommand();
        _command.CommandText = DatabaseType switch
        {
            DatabaseType.Postgresql => "BEGIN; SELECT 1; COMMIT",
            DatabaseType.SqlServer => "BEGIN TRANSACTION; SELECT 1; COMMIT",
            _ => throw new ArgumentOutOfRangeException()
        };
    }

    [Benchmark(Baseline = true)]
    public int Without_transaction()
        => (int)_command.ExecuteScalar()!;

    [Benchmark]
    public int With_transaction_via_API()
    {
        var tx = _connection.BeginTransaction();
        _command.Transaction = tx;

        var result = (int)_command.ExecuteScalar()!;

        tx.Commit();

        return result;
    }

    [Benchmark]
    public int With_transaction_via_commands()
        => (int)_command.ExecuteScalar()!;
}

public enum DatabaseType
{
    Postgresql,
    SqlServer
}
Results for SQL Server are similar, though more extreme: starting a transaction with the API is more than twice as long as doing it via the commands. Note that Npgsql already batches BeginTransaction via the API, but SqlClient does not, so the API imposes 2 roundtrips on Npgsql but 3 on SqlClient (probably part of why the difference is bigger on SqlClient).

We could get around this by managing transactions manually: prepend BEGIN to the first batch, and append COMMIT to the last batch.

  • We're not sure if it's always safe/correct bypass the ADO.NET provider APIs like this, since the provider may not be aware of the transaction flow and behavior may change. To be on the safe side, it may be better to implement this in relational but require a provider opt-in.
  • Capabilities such as intercepting transactions would no longer be possible, so a user opt-out may also be a good idea.

Originally discussed in #4351

@roji
Copy link
Member Author

roji commented Mar 19, 2022

This is unfortunately problematic: if we append COMMIT to the end, we have no opportunity to roll back if a concurrency exception occurs. Some thoughts:

  • We can append COMMIT only if users opt out of concurrency checks (#10443).
    • Another "creative" solution is to implement concurrency checks server-side, i.e. send SQL that would generate a database error when 0 rows are affected (i.e. wrap the individual update in a check and error). We may even be able to communicate which row failed the concurrency. But this would be pretty advanced.
  • Even then, we still have to handle database errors (e.g. unique constraint violations):
    • In PG, any error automatically causes the remaining batch statements to be skipped (including the COMMIT), and so we'd just roll back afterwards.
    • In SQL Server, we should be able to set XACT_ABORT to ON to achieve roughly the same behavior - though this needs to be confirmed.
    • SQLite seems dodgy, with some error types triggering a full transaction rollback, and others not. It isn't clear whether SQLite continues with statements after the error and commit them.
  • We could allow providers to opt into this behavior, so that PG would do it but not SQLite. If we do that, we may want to expose BatchExecutor as public and add appropriate extensibility hooks.
  • Even if we don't implement COMMIT append, we could still implement prepending of the BEGIN. However, Npgsql already does this at the ADO.NET layer; we could do the same for SQLite (but is it worth it?) and suggest SqlClient (#1554) and MySqlConnector (#1152) do the same.

@roji roji removed this from the 7.0.0 milestone Mar 19, 2022
@roji
Copy link
Member Author

roji commented Apr 20, 2022

This depends on:

Also, make sure that even after #10433, this is still compatible with SQL Server, i.e. that XACT_ABORT gives us the transaction abort behavior we need (in any case, this needs to be provider-opt-in since providers may not support the right abort behavior, or need special settings like XACT_ABORT).

Leaving in 7.0 for now, but deferring for later in the milestone.

@ajcvickers ajcvickers added the punted-for-7.0 Originally planned for the EF Core 7.0 (EF7) release, but moved out due to resource constraints. label Apr 23, 2022
@ajcvickers ajcvickers added this to the Backlog milestone Apr 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-perf area-save-changes needs-design punted-for-7.0 Originally planned for the EF Core 7.0 (EF7) release, but moved out due to resource constraints. type-enhancement
Projects
None yet
Development

No branches or pull requests

2 participants