Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle SQL injection vulnerabilities within ObjectToSQLString #3547

Merged
merged 26 commits into from
Jul 2, 2024

Conversation

fredericDelaporte
Copy link
Member

@fredericDelaporte fredericDelaporte commented May 12, 2024

Fix SQL injections noticed in #3516.

It does not fix wholly #3516. Bugs noted in #3516 that do not cause a security issue are not fixed.

@fredericDelaporte fredericDelaporte added this to the 5.4.9 milestone May 12, 2024
@fredericDelaporte

This comment was marked as resolved.

.Append('\'');

if (isUnicode)
literal.Insert(0, "U&");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other databases supports that escaping, but only DB2 documentation seemed to imply Unicode string literals were supported only by using it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if this code avoided multiple iterations over the initial value. The longer the value and more escaping added, the more expensive this operation could become. I would recommend similar changes elsewhere.

src/NHibernate/Dialect/Dialect.cs Outdated Show resolved Hide resolved
@deAtog
Copy link
Contributor

deAtog commented May 16, 2024

I don't like this fix. In my opinion, every dialect should have a method to escape a string value for use in a SQL statement. This allows the mechanism of how strings are escaped to vary from dialect to dialect. Additionally, this alleviates the need for a flag as the default implementation of this function in the Dialect class can simply return the value as is.

For example, here is an EscapeString method that I wrote to escape string literals for MySql. Feel free to adapt the code as needed:

public static string EscapeString(string s) {
    StringBuilder sb = new StringBuilder();

    foreach (char c in s) {
        switch (c) {
            case '\0':
                sb.Append(@"\0");
                break;
            case '\b':
                sb.Append(@"\b");
                break;
            case '\n':
                sb.Append(@"\n");
                break;
            case '\r':
                sb.Append(@"\r");
                break;
            case '\t':
                sb.Append(@"\t");
                break;
            case '\x1A':
                sb.Append(@"\Z");
                break;
            case '\'':
                sb.Append(@"\'");
                break;
            case '\"':
                sb.Append("\\\"");
                break;
            case '\\':
                sb.Append(@"\\");
                break;
            default:
                sb.Append(c);
                break;
         }
    }

    return sb.ToString();
}

@hazzik
Copy link
Member

hazzik commented May 17, 2024

I would agree with @deAtog that it would be better to delegate the escape to the dialect instead of introducing the configuration option.

EDIT: the implementation looks legit.

@fredericDelaporte
Copy link
Member Author

fredericDelaporte commented May 18, 2024

I do not really understand your comment, deAtog.

every dialect should have a method to escape a string value for use in a SQL statement

But that is what this PR does, since the base implementation is overridable. Any dialect can define its own implementation if needed. To minimize the likeliness of leaving a dialect with a vulnerability, we should have the most common escaping implemented in the base dialect. All dialects in NHibernate support doubling the quotes. Quite many also support backslash escaping as an alternative, which forces to escapes backslashes too for them. That is why the base implementation handles doubling single quotes and provides an option to easily enable backslash escapes.

Furthermore, dialects known to support backslash escapes are enabling the option by default. I still let open the possibility to enable/disable it by configuration due to cases like PostgreSQL, which has varied about its backslash escape handling.

A dialect whose needs would require a different implementation can still provide it, as done in the DB2 case.

About additional escapes that may be required for database also handling a backslash escape, I may be wrong but it seemed to me the backslash escaping was enough. Or would not escaping line feeds and the like still open injection vulnerabilities for them?

I am not attempting to implement completely all the various escaping of all the supported databases. The subject is to remove a corner cases injection vulnerability. (There is very few uses of literal injection in NHibernate: discriminators, which are normally static values in mapping files, not application user data; HQL queries referencing static fields, which is a very little known and used feature; and some sql builder helpers overloads that NHibernate does not use and that we are going to obsolete.)

@fredericDelaporte fredericDelaporte force-pushed the sql-injection branch 2 times, most recently from 528536b to f079965 Compare May 19, 2024 19:28
/// Indicates if the database needs to have backslash escaped in string literals.
/// </summary>
/// <remarks>The default value is dialect dependent.</remarks>
public const string EscapeBackslashInStrings = "escape_backslash_in_strings";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be a user configurable option in my opinion. The dialect should perform the all appropriate escaping where necessary. For example, your implementation for DB2 ignores this setting while always escaping back slashes for Unicode strings.

@Pfmawlaw

This comment was marked as duplicate.

@deAtog
Copy link
Contributor

deAtog commented May 21, 2024

I do not really understand your comment, deAtog.

every dialect should have a method to escape a string value for use in a SQL statement

My point here is that the base dialect shouldn't try to do anything. In my opinion this should probably be an abstract method that all dialects are required to implement. I personally do not like the idea of having a base implementation for something that is largely database dependent.

But that is what this PR does, since the base implementation is overridable. Any dialect can define its own implementation if needed. To minimize the likeliness of leaving a dialect with a vulnerability, we should have the most common escaping implemented in the base dialect. All dialects in NHibernate support doubling the quotes. Quite many also support backslash escaping as an alternative, which forces to escapes backslashes too for them. That is why the base implementation handles doubling single quotes and provides an option to easily enable backslash escapes.

Furthermore, dialects known to support backslash escapes are enabling the option by default. I still let open the possibility to enable/disable it by configuration due to cases like PostgreSQL, which has varied about its backslash escape handling.

A dialect whose needs would require a different implementation can still provide it, as done in the DB2 case.

About additional escapes that may be required for database also handling a backslash escape, I may be wrong but it seemed to me the backslash escaping was enough. Or would not escaping line feeds and the like still open injection vulnerabilities for them?

Personally, I don't know. I can see cases where certain control characters may cause problems. The handling of these characters is solely database/library dependent. Some may have issues, others may not. For example, one might be able to insert a backspace character into a string to delete previous characters or insert a null character to early terminate the string.

Also, while there may be a standard for escaping single quotes within a single quoted string literal, not every database escapes them according to the standard. MySQL, for instance, recommends escaping single quotes with a backslash, but supports the use of doubling the number of single quotes within a single quoted string literal.

I am not attempting to implement completely all the various escaping of all the supported databases. The subject is to remove a corner cases injection vulnerability. (There is very few uses of literal injection in NHibernate: discriminators, which are normally static values in mapping files, not application user data; HQL queries referencing static fields, which is a very little known and used feature; and some sql builder helpers overloads that NHibernate does not use and that we are going to obsolete.)

I'll admit, I don't know the full scope of this issue. Most of the above seem like cases where the injection would have to be purposeful. However, if there is a case where a user-provided value can be directly inserted into a query then I wouldn't leave anything to chance. In my experience, it's better to err on the side of caution than to assume a partial implementation corrects all the issues.

@fredericDelaporte
Copy link
Member Author

In my opinion this should probably be an abstract method that all dialects are required to implement.

That is a breaking change which would mandate a new major version. This cannot be done in a patch release. (A throwing base implementation would not be acceptable either for a patch release.)

This should not be a user configurable option in my opinion.

See the PostgreSQL case: it was supporting backslash escapes by default up to its 9.1 version, with an option to disable it. Then it has ceased supporting them by default, but it still allows to re-enable them with a configuration setting. That case can only be handled through a configuration setting on our side too, since we cannot assume which configuration will have the database, whatever its version. The user needs to be able to enable/disable the escape for PostgreSql according to its database configuration, if he uses one of the features relying on literal injection.

For example, your implementation for DB2 ignores this setting while always escaping back slashes for Unicode strings.

Because its Unicode string encoding mandates backslash escapes, no matter what. So, that would be nonsense to follow the configuration settings. Such a setting is meant for cases where the user can know better what is the correct course of action.

It would be nice if this code avoided multiple iterations over the initial value.

Sure. But I see this as just a nice to have, since the implied features are either very seldomly used, or, in the case of discriminators, usually very short.

My first course of resolution was to suggest banning all literal injections in favor of parameterization, which is the best way to avoid injection vulnerabilities. But some discriminator use cases where this could significantly lower performances were raised. So, now, I attempt a "good enough" fix instead.

@deAtog
Copy link
Contributor

deAtog commented May 21, 2024

In my opinion this should probably be an abstract method that all dialects are required to implement.

That is a breaking change which would mandate a new major version. This cannot be done in a patch release. (A throwing base implementation would not be acceptable either for a patch release.)

A base implementation that does nothing but return the value provided would be best in my opinion. This would allow the method to be changed to abstract in a future major release. This also guarantees that the base implementation does not break some obscure dialect. As is, there is no guarantee that the current implementation doesn't introduce a breaking change for all dialects.

This should not be a user configurable option in my opinion.

See the PostgreSQL case: it was supporting backslash escapes by default up to its 9.1 version, with an option to disable it. Then it has ceased supporting them by default, but it still allows to re-enable them with a configuration setting. That case can only be handled through a configuration setting on our side too, since we cannot assume which configuration will have the database, whatever its version. The user needs to be able to enable/disable the escape for PostgreSql according to its database configuration, if he uses one of the features relying on literal injection.

There's no doubt that a configuration flag is needed here for PostgreSQL. The problem here is that you've found a case where the dialect breaks the base implementation and have introduced a global flag to avoid a dialect specific issue, hence my recommendation above. Dialects are provided all configuration properties when the Dialect.Configure is called. My recommendation would be to only interpret the escape_backslash_in_strings property from within the PostgreSQL dialect. There is no convention on where to define dialect specific properties, but I don't think defining it with all the other global properties is appropriate as it implies that it affects all dialects alike. I have seen many cases of cache specific properties that use a similar approach. Such properties are usually only mentioned in the documentation, but it would be good to have constants defined for them and a convention of where to find them.

For example, your implementation for DB2 ignores this setting while always escaping back slashes for Unicode strings.

Because its Unicode string encoding mandates backslash escapes, no matter what. So, that would be nonsense to follow the configuration settings. Such a setting is meant for cases where the user can know better what is the correct course of action.

That's exactly my point. From the context provided, the user has no idea whether or not the setting will affect the dialect they're using.

It would be nice if this code avoided multiple iterations over the initial value.

Sure. But I see this as just a nice to have, since the implied features are either very seldomly used, or, in the case of discriminators, usually very short.

My first course of resolution was to suggest banning all literal injections in favor of parameterization, which is the best way to avoid injection vulnerabilities. But some discriminator use cases where this could significantly lower performances were raised. So, now, I attempt a "good enough" fix instead.

Parameters would undoubtedly be better in my opinion, regardless of performance. There are many opportunities where performance could be improved, but now is not the time to address that concern.

literal
.Replace("'", "''")
.Insert(0, '\'')
.Append('\'');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no guarantee that the implementation of this method won't break an obscure dialect that does not support doubling the number of single quotes inside a single quoted string to escape a single quote. While this is the ANSI standard, not all dialects adhere to that standard.

The only non-breaking thing to do here is to simply return the value passed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing a bug of lack of escapism is anyway behavior breaking for those which were compensating it by escaping values themselves (if anyone).

And from a security standpoint, better take the risk of breaking some dialects not natively supported by NHibernate rather than leaving the most common vulnerability (lack of doubling the quote) open for all not natively supported dialects.

Fixing bug is anyway always somewhat behavior breaking. That is normal and accepted for patch releases. That is binary breaking changes or source breaking changes which are banned by SemVer for a patch or minor release.

Mores specifically, from https://semver.org/ :

Patch version Z (x.y.Z | x > 0) MUST be incremented if only backward compatible bug fixes are introduced. A bug fix is defined as an internal change that fixes incorrect behavior.

(So, a new throwing base method causing previously "working in most cases" features to cease working is neither a binary breaking change nor a source one. But undoubtedly it does not qualify as "an internal change that fixes incorrect behavior".)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, you are saying that in the case of a bug fix, it's okay to throw an exception for changes that correct behavior. Admittedly, I do not have a problem with that so long as the exception is thrown in the appropriate location. The changes here would cause an exception to be thrown once the statement is executed, not when the method is called. Any dialects broken by this change would therefore not know the root cause of the exception. Therefore, if you'd like to throw an exception in the base case instead of leaving external dialects vulnerable, that would be acceptable.

Alternatively, as already discussed, NHibernate could use query parameters to avoid escaping any values. It may not be as performant as the solution here, but it would undoubtedly be more maintainable and secure. It would also not introduce any breaking changes.

Copy link
Member Author

@fredericDelaporte fredericDelaporte May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I am not saying that:

you are saying that in the case of a bug fix, it's okay to throw an exception for changes that correct behavior.

/// <returns>The appropriate SQL string literal.</returns>
/// <exception cref="ArgumentNullException">Thrown if <paramref name="value"/> or
/// <paramref name="type"/> is <see langword="null" />.</exception>
public virtual string ToStringLiteral(string value, SqlType type)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sold on necessity of passing the SqlType here. The primary purpose of this value is to determine whether or not the value to be escaped should result in a Unicode value. I would suggest one of the following:

  1. Create a separate method called ToUnicodeStringLiteral.
  2. Pass a boolean value indicating whether or not to return a Unicode string literal.

Admittedly, I prefer the former over the latter as it allows dialects to have different implementations if needed without a conditional on the boolean value provided.

@fredericDelaporte
Copy link
Member Author

I have added the fix for the char type. I have reordered and cleaned the commits by the way, putting tests first to facilitate checking them without the fix.

@@ -51,9 +51,7 @@ public override void Set(DbCommand cmd, object value, int index, ISessionImpleme
}

public override string ObjectToSQLString(object value, Dialect.Dialect dialect)
{
return '\'' + value.ToString() + '\'';
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ToString instead of cast does allow for some type mismatch in queries using static fields. That allows comparing a char entity property to a string instead of a char. The reverse fails due to the string type having a cast instead.

So, we have inconsistent behavior here. As that is not the subject of this PR, I leave it as is.

@hazzik
Copy link
Member

hazzik commented Jun 3, 2024

@fredericDelaporte is it still WIP?

@fredericDelaporte
Copy link
Member Author

fredericDelaporte commented Jun 4, 2024

I was intending to fix the cases of the other types allowing injection, like decimal.

For it to be no more WIP, either I fix the other type cases, or I change it to be the fix only for character and string types.

@hazzik
Copy link
Member

hazzik commented Jun 5, 2024

Ok, thanks for the clarification

@deAtog
Copy link
Contributor

deAtog commented Jun 6, 2024

@hazzik Even if this were only for strings this is not ready for merging in my opinion. I have reservations about breaking non-ANSI compliant dialects with the changes proposed here. It would be far better to switch to parameters, regardless of performance impacts, than use what has been implemented here. NHibernate already suffers from quite a few performance related issues and the impact of using parameters here is irrelevant given their security benefits.

@fredericDelaporte
Copy link
Member Author

Still WIP: I have to review all the other types to check if they allow some SQL injection.

Copy link
Member Author

@fredericDelaporte fredericDelaporte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, still WIP

src/NHibernate.Test/NHSpecificTest/GH3516/FixtureByCode.cs Outdated Show resolved Hide resolved
public override string ObjectToSQLString(object value, Dialect.Dialect dialect) =>
"'" + (DateTime) value + "'";
public override string ObjectToSQLString(object value, Dialect.Dialect dialect)
=> dialect.ToStringLiteral(((DateTime) value).ToString(), SqlTypeFactory.GetAnsiString(50));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is still far from being a correct datetime representation for many SQL vendors and cultures. But at least, it does no more allow SQL injections. Granted, injecting through a very custom culture is a hard to exploit attack vector, even more with the unlikeliness of the vulnerable features to be used by applications. But after all, who knows how corrupted a computer culture might be?

src/NHibernate/Type/DecimalType.cs Outdated Show resolved Hide resolved
src/NHibernate/Type/DecimalType.cs Outdated Show resolved Hide resolved
@@ -100,7 +101,7 @@ public new object StringToObject(string xml)
/// <inheritdoc />
public override string ObjectToSQLString(object value, Dialect.Dialect dialect)
{
return '\'' + ((DateTime)value).Ticks.ToString() + '\'';
return '\'' + ((DateTime)value).Ticks.ToString(CultureInfo.InvariantCulture) + '\'';
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another fix for this case could be to use ToSqlStringLiteral. But it seems more lightweight to fix it like this.

I do not understand why is this returned as a string literal by the way. I would have expected it to be a int64 literal. But that is beyond the scope of this PR.

src/NHibernate/Type/Int32Type.cs Outdated Show resolved Hide resolved
@fredericDelaporte fredericDelaporte changed the title WIP - Handles SQL injection vulnerability within ObjectToSQLString Handles SQL injection vulnerability within ObjectToSQLString Jun 12, 2024
@fredericDelaporte fredericDelaporte changed the title Handles SQL injection vulnerability within ObjectToSQLString Handle SQL injection vulnerabilities within ObjectToSQLString Jun 12, 2024
@fredericDelaporte fredericDelaporte merged commit b4a69d1 into nhibernate:5.4.x Jul 2, 2024
20 of 21 checks passed
@fredericDelaporte fredericDelaporte deleted the sql-injection branch July 2, 2024 22:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants