-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Microsoft.Data.Sqlite: Support deferred transactions #20786
Comments
/cc @bricelam |
SQLite doesn't support snapshot isolation. It ensures isolation via locking. For serialized, as soon as a write operation is performed, all other SQL statements must wait for the write operation's transaction to complete (they actually fail with SQLITE_BUSY and the app needs to retry them). For read uncommitted, additional reads are allowed, and yes, they will be affected by the write operation (dirty reads, non-repeatable reads, and phantoms). But again, other read operations are blocked. Snapshot isolation works by taking a snapshot of the data when the transaction begins so that read operations are completely isolated from write operations. No locking is required. But SQLite simply doesn't implement this behavior. Deferred transactions aren't exactly related to isolation levels. It means that the locking is deferred until the first write operation is performed. As soon as one takes the lock, the other transaction fails entirely and needs to be retried since any data it read earlier might now be invalid. This gives the perception of greater concurrency but the cost of recovering from concurrency issues also goes up. We tried to switch to deferred transactions by default in version 2.0, but several users started seeing deadlocks in their apps (aspnet/Microsoft.Data.Sqlite#474). I'm open to the idea of adding an option like See also Transactions in Microsoft.Data.Sqlite. |
The scenario I am facing: My application has to generate multiple complicated reports in parallel on background threads. Each report requires multiple select statements and the data must be consistent, therefore a transaction is required. Generating one of those reports could take minutes, so it makes a huge difference, if it happens sequentially or in parallel. With the old System.Data.SQLite provider I simply started deferred transactions. As long as you execute no updates or inserts but only selects in a deferred transaction, SQLite uses a read transaction instead of a write transaction. This way the reports can be generated in parallel, because SQLite supports multiple simultaneous read transactions but only one simultaneous write transaction. In general, I follow these two rules:
I think it’s correct to NOT make deferred the default, because if two deferred transactions execute a select (which starts a read transaction) and then execute an insert or update (which upgrades to a write transaction) it could easily lead to deadlocks, because none of the transactions can upgrade to a write transaction before the other transaction finishes its read transaction. |
Sounds like a great scenario. I’m ok adding API for advanced scenarios like this. The workaround is to manually send the BEGIN TRANSACTION statement: command.CommandText = “BEGIN DEFERRED TRANSACTION”;
command.ExecuteNonQuery();
try
{
// ...
command.CommandText = “COMMIT”;
command.ExecuteNonQuery();
}
catch (SqliteException ex)
when (ex.SqliteErrorCode == SQLitePCL.raw.SQLITE_BUSY)
{
// TODO: Retry
} |
Thanks, that's great! How long does it normally take, until new features like this one are available? |
Our next release is 5.0 in November. Are you interested in contributing? |
It seems to me than Sqlite can support parallel readers when in WAL mode (see https://www.sqlite.org/wal.html, § 2.2). Wouldn't that be an interesting addon to the provider? It shall be possible to support Snapshot isolation level only after checking the DB is in WAL mode... |
@bricelam Why do you ask? Do you want me to implement the deferred transaction feature? |
@tzehetner Only if you’re interested. We might not be able to get to it for 5.0 |
@ArnaudDebaene I don’t think the semantics are quite the same as snapshot isolation. That said, once we enable deferred transactions, you can get even greater throughput by enabling WAL. Note, EF Core automatically enables WAL on the databases it creates. |
I just read the SQLite documentation. If I understand the page correct, than all transactions are deferred transactions by default ("Transactions can be DEFERRED, IMMEDIATE, or EXCLUSIVE. The default transaction behavior is DEFERRED."). From that page, the only mode which is not yet accessible via |
@AlexanderTaeschner At the moment the C# code starts transactions mostly as IMMEDIATE and in one case as DEFERRED, but with the read_uncommitted PRAGMA. There is no way to start a deferred transaction without the read_uncommitted PRAGMA. @bricelam AlexanderTaeschner mentioned the third transaction mode: EXCLUSIVE. Perhaps the new BeginTransaction overload should have an enum parameter instead of a boolean parameter, so that you can pass either DEFERRED, IMMEDIATE or EXCLUSIVE? @bricelam Regarding the implementation: I don't have time now, but I will look into it and perhaps provide an implementation later on. |
I don’t think the difference between IMMEDIATE and EXCLUSIVE is significant. I’d like to see a real-world scenario for it before we start designing for it. A bool to defer when serialized is enough for now. Long-running reads in the background is a valid real-world scenario to enable. |
OK, I understand. |
I just took a look at the actual code and I don't see what you describe above. If one calls |
It's easy to miss, but before |
@AlexanderTaeschner @bricelam Thanks a lot! |
Can you explain why the semantics are not the same as snapshot isolation? Is there a divergent definition of "snapshot isolation" here or a misunderstanding of WAL mode? |
This sounds reasonable. Please file a new issue. @marchy AFAIK, you can't mix in-memory databases and WAL |
What problem are you trying to solve?
I would like to execute multiple read transactions in parallel, while each transaction sees a consistent snapshot of the data (no read uncommited) and the transactions do not block each other. This would be a huge improvement and could be implemented by changing one line of code in SqliteTransaction.cs.
Describe the solution you'd like
At the moment SqliteConnection.BeginTransaction() supports isolation levels “ReadUncommitted” and “Serializable”.
Serializable provides a consistent data snapshot but blocks other serializable transactions. One reader would have to wait until the other reader has finished his transaction, which is quite bad for performance, if you have multiple long running transactions.
ReadUncommited requires SqliteCacheMode.Shared and suffers from dirty reads, non repeatable reads and phantoms.
Solution: Allow IsolationLevel.Snapshot by changing line 43 in SqliteTransaction.cs, so that it doesn’t throw an exception. The remaining code will then automatically start a deferred transaction, which starts a SQLite read transaction on the first select statement.
SQLite read transactions provide a consistent data snapshot without suffering from dirty reads, non repeatable reads and phantoms and without blocking other read transactions (see https://www.sqlite.org/lang_transaction.html).
Additonal information
In SQL Server the isolation level “Snapshot” specifies that data read by any statement in a transaction will be the transactionally consistent version of the data that existed at the start of the transaction (see https://docs.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql?view=sql-server-ver15).
The proposed implementation is different in that data read will be the transactionally consistent version of the data that existed at the first select statement – not at the start of the transactions. I tried to execute a “select 1;” statement after the “begin;” statement, but SQLite still started the underlying read transaction only at the first “real” select statement, that accessed a table in the database.
The System.Data.SQLite provider requires you to specify IsolationLevel.ReadCommitted instead of IsolationLevel.Snapshot to start a read transaction, but I think that Snapshot is more correct, because SQLite read transactions do not suffer from non repeatable reads and phantoms as ReadCommited does (see isolation level table on this page: https://docs.microsoft.com/en-us/sql/connect/jdbc/understanding-isolation-levels?view=sql-server-ver15).
The text was updated successfully, but these errors were encountered: