Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid SQL generated with 3.1 #20505

Closed
lukemurray opened this issue Apr 2, 2020 · 9 comments
Closed

Invalid SQL generated with 3.1 #20505

lukemurray opened this issue Apr 2, 2020 · 9 comments

Comments

@lukemurray
Copy link

Hi, I am upgrading a project to dotnet 3.1 and EF 3.1 and therefore Npgsql too. A query that previously worked in 2.2 generates incorrect SQL now.

Moving from npgsql/efcore.pg#1331 as suggested.

Steps to reproduce

Here is the C# query.

// query
var t = dbContext.Floors.Where(f => f.Id == 1)
                .Select(f => new
                {
                    SpaceGroups = f.Spaces
                    .SelectMany(fs => fs.SpaceGroupFloorSpaces.Select(sgfs => sgfs.SpaceGroup))
                    .Distinct()
                    .OrderBy(s => s.Name)
                    .Select(g => new
                    {
                        g.Id,
                    }),
                }).FirstOrDefault();

The relationships here: Floor has 0-many Spaces, each Space can have 0-many SpaceGroups. The Distinct() is there as many spaces might be assigned to the same SpaceGroup.

Worth noting that if I remove the .Distinct() it works. Or if I leave the .Distinct() but remove the .OrderBy(s => s.Name) it also works. But both together it fails.

Here is the SQL it generates. I have replaced the parameters. Note the entity also has a default filter for the current date on it.

  SELECT t.id, t.id
  FROM (
      SELECT f.id
      FROM floor AS f
      WHERE f.start_date <= now() AND f.end_date > now() AND (f.id =1)
      LIMIT 1
  ) AS t
  LEFT JOIN LATERAL (
      SELECT DISTINCT t1.id, t1.color, t1.created, t1.customer_id, t1.description, t1.end_date, t1.name, t1.ratio, t1.start_date, t1.type, t1.updated
      FROM floor_space AS f0
      INNER JOIN (
          SELECT t0.id, t0.color, t0.created, t0.customer_id, t0.description, t0.end_date, t0.name, t0.ratio, t0.start_date, t0.type, t0.updated, s.id AS id0, s.floor_space_id
          FROM space_group_floor_space AS s
          INNER JOIN (
              SELECT s0.id, s0.color, s0.created, s0.customer_id, s0.description, s0.end_date, s0.name, s0.ratio, s0.start_date, s0.type, s0.updated
              FROM space_group AS s0
              WHERE s0.start_date <= now() AND s0.end_date > now()
          ) AS t0 ON s.space_group_id = t0.id
          WHERE s.start_date <= now() AND s.end_date > now()
      ) AS t1 ON f0.id = t1.floor_space_id
      WHERE f0.start_date <= now() AND f0.end_date > now() AND (t.id = f0.floor_id)
  ) AS t2 ON TRUE
  ORDER BY t.id, t.name, t.id

Error is that t.name does not exist.

Further technical details

EF Core version: 3.1
Database provider: NPGSQL
Target framework: 3.1
Operating system: Mac OS
IDE: VS Code

@ajcvickers
Copy link
Member

@maumar to look for dupe.

@maumar
Copy link
Contributor

maumar commented Apr 13, 2020

this is actually still broken on master.

Full repro:

        [ConditionalFact]
        public virtual void Test()
        {
            using (var dbContext = new MyContext())
            {
                dbContext.Database.EnsureDeleted();
                dbContext.Database.EnsureCreated();


                var t = dbContext.Floors.Where(f => f.Id == 1)
                    .Select(f => new
                    {
                        SpaceGroups = f.Spaces
                        .SelectMany(fs => fs.SpaceGroupFloorSpaces.Select(sgfs => sgfs.SpaceGroup))
                        .Distinct()
                        .OrderBy(s => s.Name)
                        .Select(g => new
                        {
                            g.Id,
                        }),
                    }).FirstOrDefault();
            }

        }

        public class Floor
        {
            public int Id { get; set; }
            public List<Space> Spaces { get; set; }
        }


        public class Space
        {
            public int Id { get; set; }
            public List<SpaceGroupFloorSpace> SpaceGroupFloorSpaces { get; set; }
        }


        public class SpaceGroup
        {
            public int Id { get; set; }
            public string Name { get; set; }
            public List<SpaceGroupFloorSpace> SpaceGroupFloorSpaces { get; set; }
        }

        public class SpaceGroupFloorSpace
        {
            public int SpaceGroupId { get; set; }
            public int SpaceId { get; set; }
            public SpaceGroup SpaceGroup { get; set; }
            public Space Space { get; set; }
        }


        public class MyContext : DbContext
        {
            protected override void OnModelCreating(ModelBuilder modelBuilder)
            {
                modelBuilder.Entity<SpaceGroupFloorSpace>()
                    .HasKey(x => new { x.SpaceGroupId, x.SpaceId });
                modelBuilder.Entity<SpaceGroupFloorSpace>()
                    .HasOne(x => x.SpaceGroup)
                    .WithMany(x => x.SpaceGroupFloorSpaces)
                    .HasForeignKey(x => x.SpaceGroupId);

                modelBuilder.Entity<SpaceGroupFloorSpace>()
                    .HasOne(x => x.Space)
                    .WithMany(x => x.SpaceGroupFloorSpaces)
                    .HasForeignKey(x => x.SpaceId);
            }

            public DbSet<Floor> Floors { get; set; }
            protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
            {
                optionsBuilder.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Repro20505;Trusted_Connection=True;MultipleActiveResultSets=true");
            }
        }

sql:

SELECT [t].[Id], [t].[Id]
FROM (
    SELECT TOP(1) [f].[Id]
    FROM [Floors] AS [f]
    WHERE [f].[Id] = 1
) AS [t]
OUTER APPLY (
    SELECT DISTINCT [t0].[Id], [t0].[Name]
    FROM [Space] AS [s]
    INNER JOIN (
        SELECT [s1].[Id], [s1].[Name], [s0].[SpaceGroupId], [s0].[SpaceId]
        FROM [SpaceGroupFloorSpace] AS [s0]
        INNER JOIN [SpaceGroup] AS [s1] ON [s0].[SpaceGroupId] = [s1].[Id]
    ) AS [t0] ON [s].[Id] = [t0].[SpaceId]
    WHERE [t].[Id] = [s].[FloorId]
) AS [t1]
ORDER BY [t].[Id], [t].[Name], [t].[Id]

issues are in order by: t.Name is invalid (should be t1), t.Id specified twice in order by (should be de-duplicated and removed)

@smitpatel smitpatel self-assigned this Apr 13, 2020
@ajcvickers ajcvickers added this to the 5.0.0 milestone Apr 17, 2020
@joakimriedel
Copy link
Contributor

@lukemurray @ajcvickers workaround: change FirstOrDefault() to ToList() (or fully (query.ToList()).FirstOrDefault())

@smitpatel similar to my issue #17809 ?

@joakimriedel
Copy link
Contributor

forgot to ping @maumar to bring the entire team's attention on this query generation bug :)

@smitpatel
Copy link
Member

Duplicate of #15873

I am not sure we can correctly translate this query. With the presence of Distinct we don't have enough data to figure out when to create new groups.

@lukemurray
Copy link
Author

@maumar does this mean this issue will not be fixed?

@lukemurray
Copy link
Author

@maumar are we now tracking this on #15873 ?

Any plans that this will be fixed in a 3.1.x release?

@smitpatel why do you say you can't translate it? I can fix the resulting SQL without adding any new joins or aliases so there should be enough information already right?

If I knew the code base I'd happily have a go fixing this, done a lot of SQL translations from .net in the past.

@smitpatel
Copy link
Member

@lukemurray - It cannot be translated because

  • For each floor, you are taking their Spaces-> SpaceGroupFloorSpaces -> SpaceGroup. Plus applying distinct to it.
    At that point all we have is a enumerable of SpaceGroup inside which we somehow have to correlate with each floor from outer. Due to the collection join, outer elements will be repeated. In simple query we have when floor key changes but if for any reason if floor has repeated key then it is not sufficient. That means we need to figure out based on inner elements which of them are part of same group. Given the navigation chain, we need the key properties of Spaces/SpaceGroupFloorSpaces/SpaceGroup. The SpaceGroup one can be fetch but others we cannot otherwise it would change meaning of Distinct operation leaving us without information we need to successfully form groups.

Since you are only selecting one element from Floors, consider rewriting query to do a where predicate on Floor.Id rather than composition via navigation.

@lukemurray
Copy link
Author

Just want to confirm that using ToList() works for a workaround. It will generate the correct SQL.

var t = dbContext.Floors.Where(f => f.Id == 1)
                .Select(f => new
                {
                    SpaceGroups = f.Spaces
                    .SelectMany(fs => fs.SpaceGroupFloorSpaces.Select(sgfs => sgfs.SpaceGroup))
                    .Distinct()
                    .OrderBy(s => s.Name)
                    .Select(g => new
                    {
                        g.Id,
                    }),
                })
                .ToList()
                .FirstOrDefault();

Which leads me to believe without the ToList() and just the First() it should be able to generate the correct SQL.

@ajcvickers ajcvickers reopened this Oct 16, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants