Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RegexMatchTimeoutException for SqlServerMigrationsSqlGenerator #16175

Closed
oatsoda opened this issue Jun 20, 2019 · 9 comments · Fixed by #30069
Closed

RegexMatchTimeoutException for SqlServerMigrationsSqlGenerator #16175

oatsoda opened this issue Jun 20, 2019 · 9 comments · Fixed by #30069
Labels
area-migrations area-sqlserver closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. punted-for-3.0 punted-for-5.0 punted-for-6.0 punted-for-7.0 Originally planned for the EF Core 7.0 (EF7) release, but moved out due to resource constraints. type-bug
Milestone

Comments

@oatsoda
Copy link

oatsoda commented Jun 20, 2019

System.Text.RegularExpressions.RegexMatchTimeoutException: The RegEx engine has timed out while trying to match a pattern to an input string. This can occur for many reasons, including very large inputs or excessive backtracking caused by nested quantifiers, back-references and other factors.
   at System.Text.RegularExpressions.RegexRunner.DoCheckTimeout()
   at System.Text.RegularExpressions.RegexInterpreter.Go()
   at System.Text.RegularExpressions.RegexRunner.Scan(Regex regex, String text, Int32 textbeg, Int32 textend, Int32 textstart, Int32 prevlen, Boolean quick, TimeSpan timeout)
   at System.Text.RegularExpressions.Regex.Run(Boolean quick, Int32 prevlen, String input, Int32 beginning, Int32 length, Int32 startat)
   at System.Text.RegularExpressions.Regex.Match(String input, Int32 startat)
   at System.Text.RegularExpressions.RegexReplacement.Replace(Regex regex, String input, Int32 count, Int32 startat)
   at System.Text.RegularExpressions.Regex.Replace(String input, String replacement, Int32 count, Int32 startat)
   at System.Text.RegularExpressions.Regex.Replace(String input, String replacement)
   at System.Text.RegularExpressions.Regex.Replace(String input, String pattern, String replacement, RegexOptions options, TimeSpan matchTimeout)
   at Microsoft.EntityFrameworkCore.Migrations.SqlServerMigrationsSqlGenerator.Generate(SqlOperation operation, IModel model, MigrationCommandListBuilder builder)
   at Microsoft.EntityFrameworkCore.Migrations.MigrationsSqlGenerator.<>c.<.cctor>b__73_25(MigrationsSqlGenerator g, MigrationOperation o, IModel m, MigrationCommandListBuilder b)
   at Microsoft.EntityFrameworkCore.Migrations.MigrationsSqlGenerator.Generate(MigrationOperation operation, IModel model, MigrationCommandListBuilder builder)
   at Microsoft.EntityFrameworkCore.Migrations.SqlServerMigrationsSqlGenerator.Generate(MigrationOperation operation, IModel model, MigrationCommandListBuilder builder)
   at Microsoft.EntityFrameworkCore.Migrations.MigrationsSqlGenerator.Generate(IReadOnlyList`1 operations, IModel model)
   at Microsoft.EntityFrameworkCore.Migrations.SqlServerMigrationsSqlGenerator.Generate(IReadOnlyList`1 operations, IModel model)
   at Microsoft.EntityFrameworkCore.Migrations.Internal.Migrator.GenerateUpSql(Migration migration)
   at Microsoft.EntityFrameworkCore.Migrations.Internal.Migrator.<>c__DisplayClass13_2.<GetMigrationCommandLists>b__2()
   at Microsoft.EntityFrameworkCore.Migrations.Internal.Migrator.Migrate(String targetMigration)
   at Microsoft.EntityFrameworkCore.RelationalDatabaseFacadeExtensions.Migrate(DatabaseFacade databaseFacade)

Version: 2.2.4
Provider: SqlServer

I have some fairly large initial Migrations to set up some static data (approx 100,000 lines):

    public partial class InitialStateStaticData : Migration
    {
        protected override void Up(MigrationBuilder migrationBuilder)
        {            
            var sql = @"
                <snip>
            ";
            migrationBuilder.Sql(sql, true); 
        }

        // ...
    }

The error seems to be coming from the fact that there is a limited 1000ms timeout on the regex parsing of this.

https://github.com/aspnet/EntityFrameworkCore/blob/f9d1df40fed1e1d7780553c3f9c1813e81eca1aa/src/EFCore.SqlServer/Migrations/SqlServerMigrationsSqlGenerator.cs#L1105

To mitigate this I added more GO statements into my SQL to avoid it, which worked fine. That was until I deployed it into an Azure Function and now presumably because of the lower CPU/Mem I am running into issues.

Obviously I am looking in to way I could reduce the 100,000 lines and/or increasing resources, but it might also be helpful if this Timeout was overidable somehow (if the value wasn't set at all I could have used the REGEX_DEFAULT_MATCH_TIMEOUT Environment variable), perhaps in the migrationBuilder.Sql( method call?

@divega
Copy link
Contributor

divega commented Jun 20, 2019

@bricelam can you please explain the workaround?

@divega
Copy link
Contributor

divega commented Jun 20, 2019

Triage: We believe the root cause of the issue is what @oatsoda mentioned above: For a sufficiently long input and a sufficiently slow CPU, the regular expression processing times out before it can complete the work.

We have aggressively added timeout to any regex usage across our codebase as a blanket measure to mitigate catastrophic backtracking. When doing so actually causes functional issues we need to consider other alternatives. For example:

  1. Rewrite the functionality without using a regex
  2. Improve the regex to be more efficient (e.g. using lookahead or lookbehind subexpressions) so that they are more likely to process the whole input in the allotted time.
  3. Employ a non backtracking (?> subexpression ) to prevent any backtracking if the regex doesn't require it

Adding a timeout parameter to the Sql() method doesn't seem to make sense (users shouldn't need to know that there is a regex doing work).

All that said, we don't have much time left in 3.0 and we wish to punt this issue to the backlog.

In the meantime we believe that customers that hit this can implement themselves the split in application code and make multiple calls to the Sql() methods rather than pass long SQL strings with multiple GO to a single call.

@oatsoda
Copy link
Author

oatsoda commented Jun 21, 2019

@divega I will give it a go splitting it. Thanks.

@bricelam
Copy link
Contributor

Here's our splitting code copied into an extension method. You can tweak it as needed:

static void LargeSql(this MigrationBuilder migrationBuilder, string sql, bool suppressTransaction = false)
{
    var batches = Regex.Split(
        Regex.Replace(
            sql,
            @"\\\r?\n",
            string.Empty,
            default,
            TimeSpan.FromMilliseconds(1000.0)), // TODO: Increase timeout as needed
        @"^\s*(GO[ \t]+[0-9]+|GO)(?:\s+|$)",
        RegexOptions.IgnoreCase | RegexOptions.Multiline,
        TimeSpan.FromMilliseconds(1000.0)); // TODO: Increase timeout as needed
    for (var i = 0; i < batches.Length; i++)
    {
        if (batches[i].StartsWith("GO", StringComparison.OrdinalIgnoreCase)
            || string.IsNullOrWhiteSpace(batches[i]))
        {
            continue;
        }

        var count = 1;
        if (i != batches.Length - 1
            && batches[i + 1].StartsWith("GO", StringComparison.OrdinalIgnoreCase))
        {
            var match = Regex.Match(
                batches[i + 1], "([0-9]+)",
                default,
                TimeSpan.FromMilliseconds(1000.0)); // TODO: Increase timeout as needed
            if (match.Success)
            {
                count = int.Parse(match.Value);
            }
        }

        for (var j = 0; j < count; j++)
        {
            var batchSql = batches[i];
            if (i == batches.Length - 1)
            {
                batchSql += Environment.NewLine;
            }
            
            migrationBuilder.Sql(batchSql, suppressTransaction);
        }
    }
}

@oatsoda
Copy link
Author

oatsoda commented Jun 24, 2019

Thanks @bricelam I will give it a go.

@divega I can understand that another parameter might be a bit ugly as consumers won't really want to know that a Regex is taking place, but equally they may be like me and hit this issue! Perhaps you could detect if the environment variable REGEX_DEFAULT_MATCH_TIMEOUT is set and if so don't force the 1000ms timeout - allow Regex to just use the default that the environment has set?

@divega
Copy link
Contributor

divega commented Jun 24, 2019

Thanks @oatsoda. IMO, that is reasonable. When we actually try to fix this issue we can try this option if we can't come up with something better that makes the timeout unnecessary.

cc @bricelam

@oatsoda
Copy link
Author

oatsoda commented Jun 26, 2019

@bricelam I've used your workaround to increase the timeout but I think there could be somewhere else that a Regex is applied.

It gets past the migrationBuilder.LargeSql (to the end of the Up method in the Migration) that I have added...but then AFTER that it gets a Regex timeout...

System.Text.RegularExpressions.RegexMatchTimeoutException:
   at System.Text.RegularExpressions.RegexRunner.DoCheckTimeout (System.Text.RegularExpressions, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a)
   at System.Text.RegularExpressions.RegexInterpreter.Go (System.Text.RegularExpressions, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a)
   at System.Text.RegularExpressions.RegexRunner.Scan (System.Text.RegularExpressions, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a)
   at System.Text.RegularExpressions.Regex.Run (System.Text.RegularExpressions, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a)
   at System.Text.RegularExpressions.Regex.Match (System.Text.RegularExpressions, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a)
   at System.Text.RegularExpressions.Regex.Split (System.Text.RegularExpressions, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a)
   at System.Text.RegularExpressions.Regex.Split (System.Text.RegularExpressions, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a)
   at System.Text.RegularExpressions.Regex.Split (System.Text.RegularExpressions, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a)
   at System.Text.RegularExpressions.Regex.Split (System.Text.RegularExpressions, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a)
   at Microsoft.EntityFrameworkCore.Migrations.SqlServerMigrationsSqlGenerator.Generate (Microsoft.EntityFrameworkCore.SqlServer, Version=2.2.4.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at Microsoft.EntityFrameworkCore.Migrations.MigrationsSqlGenerator+<>c.<.cctor>b__73_25 (Microsoft.EntityFrameworkCore.Relational, Version=2.2.4.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at Microsoft.EntityFrameworkCore.Migrations.MigrationsSqlGenerator.Generate (Microsoft.EntityFrameworkCore.Relational, Version=2.2.4.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at Microsoft.EntityFrameworkCore.Migrations.SqlServerMigrationsSqlGenerator.Generate (Microsoft.EntityFrameworkCore.SqlServer, Version=2.2.4.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at Microsoft.EntityFrameworkCore.Migrations.MigrationsSqlGenerator.Generate (Microsoft.EntityFrameworkCore.Relational, Version=2.2.4.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at Microsoft.EntityFrameworkCore.Migrations.SqlServerMigrationsSqlGenerator.Generate (Microsoft.EntityFrameworkCore.SqlServer, Version=2.2.4.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at Microsoft.EntityFrameworkCore.Migrations.Internal.Migrator.GenerateUpSql (Microsoft.EntityFrameworkCore.Relational, Version=2.2.4.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at Microsoft.EntityFrameworkCore.Migrations.Internal.Migrator+<>c__DisplayClass13_2.<GetMigrationCommandLists>b__2 (Microsoft.EntityFrameworkCore.Relational, Version=2.2.4.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at Microsoft.EntityFrameworkCore.Migrations.Internal.Migrator.Migrate (Microsoft.EntityFrameworkCore.Relational, Version=2.2.4.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at Microsoft.EntityFrameworkCore.RelationalDatabaseFacadeExtensions.Migrate (Microsoft.EntityFrameworkCore.Relational, Version=2.2.4.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)

@bricelam
Copy link
Contributor

I was afraid of that. If there aren't actually any GO statements, pre-splitting won't help. Need to override this code instead.

class MySqlServerMigrationsSqlGenerator : SqlServerMigrationsSqlGenerator
{
    protected override void Generate(
        SqlOperation operation,
        IModel model,
        MigrationCommandListBuilder builder)
    {
        // TODO: Reimplement without timeouts
    }
}
optionsBuilder
    .UseSqlServer(...)
    .ReplaceService<IMigrationsSqlGenerator, MySqlServerMigrationsSqlGenerator>();

@oatsoda
Copy link
Author

oatsoda commented Jun 27, 2019

@bricelam Yeah, I do have GO statements, but I had to reduce it down to each batch being no more than 25 lines (I have about 100,000 in total) to avoid the timeout. Significantly reduced resources running on an Azure Function unfortunately.

Thanks though, the override of the SqlServerMigrationsSqlGenerator worked.

@ajcvickers ajcvickers modified the milestones: 6.0.0, Backlog Jul 30, 2021
@ajcvickers ajcvickers modified the milestones: Backlog, MQ Oct 22, 2021
@ajcvickers ajcvickers modified the milestones: MQ, 7.0.0 Jan 21, 2022
@ajcvickers ajcvickers added propose-punt punted-for-7.0 Originally planned for the EF Core 7.0 (EF7) release, but moved out due to resource constraints. and removed propose-punt labels Jul 6, 2022
@ajcvickers ajcvickers modified the milestones: 7.0.0, Backlog Jul 7, 2022
@ajcvickers ajcvickers assigned ajcvickers and unassigned bricelam Jan 14, 2023
@ajcvickers ajcvickers modified the milestones: Backlog, 8.0.0 Jan 14, 2023
@ajcvickers ajcvickers added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Jan 14, 2023
@ajcvickers ajcvickers modified the milestones: 8.0.0, 8.0.0-preview1 Jan 29, 2023
@ajcvickers ajcvickers modified the milestones: 8.0.0-preview1, 8.0.0 Nov 14, 2023
@ajcvickers ajcvickers removed their assignment Sep 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-migrations area-sqlserver closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. punted-for-3.0 punted-for-5.0 punted-for-6.0 punted-for-7.0 Originally planned for the EF Core 7.0 (EF7) release, but moved out due to resource constraints. type-bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants