Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EF Functions Contains Search generates nvarchar(max) #19327

Closed
MCSaarloos opened this issue Dec 16, 2019 · 11 comments · Fixed by #19428
Closed

EF Functions Contains Search generates nvarchar(max) #19327

MCSaarloos opened this issue Dec 16, 2019 · 11 comments · Fixed by #19428
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported type-bug
Milestone

Comments

@MCSaarloos
Copy link

Hi,

I am working with Fulltext search in MSSQL in combination with EF Core 3.1. It's great that we can use EF.Functions.Contains('[Some Column name]', '[Seachvalue]*'). But now I am facing a problem. once the Stringlength of the searchvalue is larger than 8 characters, it will change the parameterlength to nvarchar(max), which results in The argument type "nvarchar(max)" is invalid for argument 2 of "CONTAINS".

How can I force the parameter value in the linq query to have a certain number?

Via the profiler, I found the generated sql:
exec sp_executesql N'SELECT [i].*
FROM [] AS [i]
WHERE (([i].[id] = @__id_0) AND [i].[DeletedAt] IS NULL)
AND CONTAINS([i].[ColumnName], @__Format_2)',N'@__id_0 uniqueidentifier, @__Format_2 nvarchar(max) ',@__id_0 ='00000000-0000-0000-0000-000000000000',@__Format_2=N'SOME1111111*'

But if the value of SOME1111111 is a bit shorter; 8 characters, then the generated SQL is the following:
exec sp_executesql N'SELECT [i].*
FROM [] AS [i]
WHERE (([i].[id] = @__id_0) AND [i].[DeletedAt] IS NULL)
AND CONTAINS([i].[ColumnName], @__Format_2)',N'@__id_0 uniqueidentifier,@__Format_2 nvarchar(11) ',@__id_0 ='00000000-0000-0000-0000-000000000000',@__Format_2=N'SOME1111*'

which works perfectly.

The code used:
var query = _dbContext.SomeTable .Where(x => x.id== companyId && !x.DeletedAt.HasValue) .Where(x => EF.Functions.Contains(x.ColumnName, $"{searchQuery}*")).ToList();

Hope there will be a solution for the problem soon.

@ajcvickers
Copy link
Member

@MCSaarloos What does the EF mapping for ColumnName looks like? Specifically, this issue might happen if it is configured to allow a max of 8 characters and then the search string is longer than that.

If it's not that, then please post a small, runnable project or code listing that demonstrates the behavior you are seeing so that we can investigate.

@MCSaarloos
Copy link
Author

FullTextSampleProject.zip

Hi @ajcvickers I have attached a sample project which I made to reproduce this exact problem.
To run the application,

  • first create a database, set the connection settings in the appsettings
  • run Update-Database from the FullTextSampleProject.Services project.
  • Insert some sample data in the database (FullTextSampleProject.Services\Data\SampleDataScript.sql

Now start the application
First with for example: https://localhost:5001/api/sample/v1/?s=Test
Confirm you see some data,
Next https://localhost:5001/api/sample/v1/?s=PVGHN345678900987654
Here you get the exception:
SqlException: The argument type "nvarchar(max)" is invalid for argument 2 of "CONTAINS".

Hope you can reproduce it with this sample. Have fun ;)

@ajcvickers
Copy link
Member

@smitpatel to take a look, since he has SQL Server Developer Edition.

@smitpatel
Copy link
Member

Related #18861

As long as the search string is in size of the column we infer type correctly. Once it is larger size, we create parameter of nvarchar(max)

@MCSaarloos
Copy link
Author

Yeah I have noticed it, it only occurs when we do a search on multiple columns where the size of the columns differs.
If all the columns have the same size, then the size of the nvarchar parameter will be the size of the columns. Isn't it a solution to take the biggest column size of the columns you search on?

In a contains search, but also in a freetext, I have noticed that the call crashes when the parameter size is nvarchar(max), but when you use for example nvarchar(4000) then the call will work.

@smitpatel
Copy link
Member

Yeah I have noticed it, it only occurs when we do a search on multiple columns where the size of the columns differs.

That is not true. Each parameter is individually processed for what type mapping to assign based on column. The moment your parameter value does not fit in the database type, it will expand to nvarchar(max). If you are using same parameters in conjunction with multiple columns then it would infer based on first occurrence but regardless, if it is oversized value for one, it needs to be expanded anyway. Hence the solution you propose does not work.

@MCSaarloos
Copy link
Author

@smitpatel I don't agree with you, as you don't know what the customer is searching for you can't say if it doesn't fit in the first column, then we expand it to nvarchar(max).

No, think the value of the size of the parameter should be the size of the string you use as value for the parameter, or should have the maximum size of the columns you are searching in. In case there is no size defined for the columns use a maximum of nvarchar(4000) for the size of the parameter. As nvarchar(max) will throw the exception The argument type "nvarchar(max)" is invalid for argument 2 of "CONTAINS".

If the value is oversized for one, then the value won't be found in that column anyways, so no problem I guess

@smitpatel
Copy link
Member

@MCSaarloos - I am not trying to say anything about customer intent. I am merely stating what current code does.

@ajcvickers
Copy link
Member

@MCSaarloos This is quite a tricky area where the most obvious behavior is not necessary the best. Specifically we try to:

  • Avoid query cache fragmentation in SQL Server (perf)
  • Avoid silent truncation in SQLClient (data loss)
  • Avoid using parameter sizes that are larger than they need be (perf)

On top of that, the inference of parameter types in a query is also non-trivial and involves special cases.

We'll certainly look into fixing this issue, but there are a lot of considerations around how best to do that.

@MCSaarloos
Copy link
Author

Can imagine that it's a tricky area, as a lot of code is probably dependent on it. Will see when this problem is addressed.
For now we have given columns we search on in the database the same size. Bit of a dirty solution, but it works.

@ajcvickers ajcvickers self-assigned this Dec 26, 2019
@ajcvickers ajcvickers added this to the 5.0.0 milestone Dec 26, 2019
@ajcvickers
Copy link
Member

Note from triage: look into why we don't go from a specific size to nvarchar(4000) before going to nvarchar(max).

@ajcvickers ajcvickers added type-bug closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. and removed type-enhancement labels Dec 26, 2019
ajcvickers added a commit that referenced this issue Dec 30, 2019
Fixes #19327

Create fixed-length type mappings by default, and then use the correct DbType in the parameter if the length of the value matches the parameter length. This has a better chance of hitting an index. In other cases, change the DbType to match what we did before so that SQLClient doesn't truncate or pad.

Fixes #14555

If the parameter value length is greater than the max length but less than the max bounded length, then using the max bounded length rather than going immediately to unbounded.
ajcvickers added a commit that referenced this issue Dec 30, 2019
Fixes #19327

Create fixed-length type mappings by default, and then use the correct DbType in the parameter if the length of the value matches the parameter length. This has a better chance of hitting an index. In other cases, change the DbType to match what we did before so that SQLClient doesn't truncate or pad.

Fixes #14555

If the parameter value length is greater than the max length but less than the max bounded length, then using the max bounded length rather than going immediately to unbounded.
ajcvickers added a commit that referenced this issue Dec 30, 2019
Fixes #19327

Create fixed-length type mappings by default, and then use the correct DbType in the parameter if the length of the value matches the parameter length. This has a better chance of hitting an index. In other cases, change the DbType to match what we did before so that SQLClient doesn't truncate or pad.

Fixes #14555

If the parameter value length is greater than the max length but less than the max bounded length, then using the max bounded length rather than going immediately to unbounded.
ajcvickers added a commit that referenced this issue Dec 30, 2019
Fixes #19327

Create fixed-length type mappings by default, and then use the correct DbType in the parameter if the length of the value matches the parameter length. This has a better chance of hitting an index. In other cases, change the DbType to match what we did before so that SQLClient doesn't truncate or pad.

Fixes #14555

If the parameter value length is greater than the max length but less than the max bounded length, then using the max bounded length rather than going immediately to unbounded.
@ajcvickers ajcvickers modified the milestones: 5.0.0, 5.0.0-preview1 Mar 13, 2020
@ajcvickers ajcvickers modified the milestones: 5.0.0-preview1, 5.0.0 Nov 7, 2020
@ajcvickers ajcvickers removed their assignment Sep 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported type-bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants