Print Page | Close Window

Left join

Printed From: IdeaBlade
Category: DevForce
Forum Name: DevForce 2010
Forum Discription: For .NET 4.0
URL: http://www.ideablade.com/forum/forum_posts.asp?TID=3404
Printed Date: 13-May-2026 at 8:27am


Topic: Left join
Posted By: scottarlp
Subject: Left join
Date Posted: 24-Apr-2012 at 11:49am
I have a question about left joins. I saw a couple of articles of people trying various things to ensure their query is using left joins instead of inner joins. We've noticed some odd behavior with .Include(). We've got a simple query with a .Where() and then two .Include(). The order of the .Include() is causing one to inner join and the other to left join.
 
So something like this:
var q = manager.Customers.Where(c => c.ID == 7)
.Include(c => c.table2)
.Include(c => c.table3);
 
What we're seeing is that this query gives us Customers inner join table2 ... left join table3.
 
If we just swap the order of the .Include() to:
var q = manager.Customers.Where(c => c.ID == 7)
.Include(c => c.table3)
.Include(c => c.table2);
 
We get Customers inner join table3 ... left join table2.
 
According to the docs on http://drc.ideablade.com/xwiki/bin/view/Documentation/include-related-entities - http://drc.ideablade.com/xwiki/bin/view/Documentation/include-related-entities , the records returned shouldn't be influenced by .Include(). This is causing a problem because the table in question has a soft link to the related, so the joins have to be left to keep from excluding data.
 
Any thoughts or is there another approach I can take?



Replies:
Posted By: sbelini
Date Posted: 24-Apr-2012 at 12:58pm
Hi scottarlp,
 
I cannot repro your scenario. Here's my test:
 
var query1 = mgr.Employees.Where(emp => emp.EmployeeID == 2)
  .Include(emp => emp.ManagedEmployees)
  .Include(emp => emp.Orders);
 
var result1 = query1.Execute();
var employees1 = mgr.Employees.With(QueryStrategy.CacheOnly).ToList().Count();
var orders1 = mgr.Orders.With(QueryStrategy.CacheOnly).ToList().Count();
 
result1 = null;
mgr.Clear();
 
var query2 = mgr.Employees.Where(emp => emp.EmployeeID == 2)
  .Include(emp => emp.Orders)
  .Include(emp => emp.ManagedEmployees);
 
var result2 = query2.Execute();
var employees2 = mgr.Employees.With(QueryStrategy.CacheOnly).ToList().Count();
var orders2 = mgr.Orders.With(QueryStrategy.CacheOnly).ToList().Count();
 
 
According to our documentation:
Note that the use of the Include method doesn’t change the list of entities returned from the query. The caller still receives the same results with or without the use of an Include method. The difference is that before returning these results, the "Include" processing fetches the related entities and merges them into the cache. This occurs behind the scenes and does not effect the query result.
The query result, in your case, will be var Customers.Where(c => c.ID == 7) and it is the same regardless of the order or presence of Include's. Note that the entities retrieved via Include are NOT part of the result. Even so, I have checked and confirmed that the are the same regardless of the order of the Include's.
 
Please provide a test solution reproducing/demonstrating your issue (against NorthwindIB if possible). Please make sure that your solution focus only on the stated issue by avoiding unnecessary complexity.
 
Kind regards,
   Silvio.


Posted By: scottarlp
Date Posted: 25-Apr-2012 at 7:02am
I think I've got a solid example. I'm including a script that creates 3 simple tables (1 table has 2 soft links to the other 2).
The tables look like table1 (table1ID, table2ID, table3ID, UpdateDate), table2 (table2ID, UpdateDate), table3 (table3ID, UpdateDate). FK between table1 and the other two tables have No Check and no Enforce FK (just to clarify the soft relationship).
 
Pulling those tables (no data required) into EDMX and then simple query of something like:
var manager = new NorthwindIBEntities();
var q3 = manager.table1
.Include(tab => tab.table2)
.Include(tab => tab.table3)
.ToList();
 
Resulting query it generates for me is (using SQL Profiler):
SELECT
[Extent1].[table1ID] AS [table1ID],
[Extent1].[table2ID] AS [table2ID],
[Extent1].[table3ID] AS [table3ID],
[Extent1].[UpdateDate] AS [UpdateDate],
[Extent2].[table2ID] AS [table2ID1],
[Extent2].[UpdateDate] AS [UpdateDate1],
[Extent3].[table3ID] AS [table3ID1],
[Extent3].[UpdateDate] AS [UpdateDate2]
FROM   [dbo].[table1] AS [Extent1]
INNER JOIN [dbo].[table2] AS [Extent2] ON [Extent1].[table2ID] = [Extent2].[table2ID]
LEFT OUTER JOIN [dbo].[table3] AS [Extent3] ON [Extent1].[table3ID] = [Extent3].[table3ID]
 
If I reverse the order of the .Include() so it's:
var manager = new NorthwindIBEntities();
var q3 = manager.table1
.Include(tab => tab.table3)
.Include(tab => tab.table2)
.ToList();
 
I get the following SQL:
SELECT
[Extent1].[table1ID] AS [table1ID],
[Extent1].[table2ID] AS [table2ID],
[Extent1].[table3ID] AS [table3ID],
[Extent1].[UpdateDate] AS [UpdateDate],
[Extent2].[table3ID] AS [table3ID1],
[Extent2].[UpdateDate] AS [UpdateDate1],
[Extent3].[table2ID] AS [table2ID1],
[Extent3].[UpdateDate] AS [UpdateDate2]
FROM   [dbo].[table1] AS [Extent1]
INNER JOIN [dbo].[table3] AS [Extent2] ON [Extent1].[table3ID] = [Extent2].[table3ID]
LEFT OUTER JOIN [dbo].[table2] AS [Extent3] ON [Extent1].[table2ID] = [Extent3].[table2ID]
 
I'm using VS2010, DF 6.1.6.0.
 
 
/****** Object:  Table [dbo].[table3]    Script Date: 04/25/2012 09:55:28 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[table3](
 [table3ID] [int] IDENTITY(1,1) NOT NULL,
 [UpdateDate] [smalldatetime] NOT NULL,
 CONSTRAINT [PK_table3] PRIMARY KEY CLUSTERED
(
 [table3ID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]
GO
/****** Object:  Table [dbo].[table2]    Script Date: 04/25/2012 09:55:28 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[table2](
 [table2ID] [int] IDENTITY(1,1) NOT NULL,
 [UpdateDate] [smalldatetime] NOT NULL,
 CONSTRAINT [PK_table2] PRIMARY KEY CLUSTERED
(
 [table2ID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]
GO
/****** Object:  Table [dbo].[table1]    Script Date: 04/25/2012 09:55:28 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[table1](
 [table1ID] [int] IDENTITY(1,1) NOT NULL,
 [table2ID] [int] NOT NULL,
 [table3ID] [int] NOT NULL,
 [UpdateDate] [smalldatetime] NOT NULL,
 CONSTRAINT [PK_table1] PRIMARY KEY CLUSTERED
(
 [table1ID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]
GO
/****** Object:  ForeignKey [FK_table1_table2]    Script Date: 04/25/2012 09:55:28 ******/
ALTER TABLE [dbo].[table1]  WITH NOCHECK ADD  CONSTRAINT [FK_table1_table2] FOREIGN KEY([table2ID])
REFERENCES [dbo].[table2] ([table2ID])
GO
ALTER TABLE [dbo].[table1] NOCHECK CONSTRAINT [FK_table1_table2]
GO
/****** Object:  ForeignKey [FK_table1_table3]    Script Date: 04/25/2012 09:55:28 ******/
ALTER TABLE [dbo].[table1]  WITH NOCHECK ADD  CONSTRAINT [FK_table1_table3] FOREIGN KEY([table3ID])
REFERENCES [dbo].[table3] ([table3ID])
NOT FOR REPLICATION
GO
ALTER TABLE [dbo].[table1] CHECK CONSTRAINT [FK_table1_table3]
GO


Posted By: scottarlp
Date Posted: 25-Apr-2012 at 7:12am
Ok, I think I see what causes this to happen.
When I first started trying to duplicate the issue, I changes the relationship of the Product -> Supplier & Category tables in NorthwindIB to soft links. It still did a left join no matter what. Then I noticed the fields where nullable. Changing them to not nullable will reproduce the issue with them.
 
It appears that whatever generates the query sees a nullable field and knows that it should be a left join. But when it sees a non-nullable field, it's making the first include inner and then subsequent left joins.


Posted By: sbelini
Date Posted: 26-Apr-2012 at 2:24pm
Hi scottarlp,
 
I first wanted to clarify that DevForce plays no role in generating the SQL executed in the datasource. This is done by EF/Data Provider.
 
Next, I generated tables using the SQL script you sent which gave me:
 
=======           =======           =======
|| Table2 || 1__*  || Table1 || *__1  || Table3 ||
=======           =======           =======
 
Based on the above, a join between table1 and table2 (or table3) will ALWAYS return the same results regardless of being an Inner or Left Join.
Being that said, you will always get the same results.
 
Could you clarify if you are actually getting different results based on the order of Include's or are just being cautious in regards of how the SQL is generated?
 
Kind regards,
   Silvio.


Posted By: scottarlp
Date Posted: 27-Apr-2012 at 4:52am
Thanks for the information. I'll have to mark this up as another MS oversight. The use .Include() should provide consistent results I would think and the order of applying them shouldn't change the resulting query.
 
One comment on yours though. You don't get the same results using .Include() when the relationship is soft (not enforced). An inner join to a FK table that has no data will limit the results from the source table. So in the example I provided, if I've got 10 rows and neither table2 or table3 has any data, the inner join that EF is creating between table1 to table2 (using table2 as the first .Include) returns 0 total records. EF creates a query that runs once against the data source, so every .Include is going to influence the results although in the cases where left joins occur or you end up with expanded results (going 1:m), EF appears to re-normalize the data and give you the correct results.
 
Personally I'm in that camp against nullable fields, but I've given on this and make my soft FK nullable so that EF will generate the proper query.
 
Appreciate the help.
Scott


Posted By: sbelini
Date Posted: 27-Apr-2012 at 10:41am
Hi Scott,
 
I'm still confused when you use the "soft relationship" term.
Correct me if I'm mistaken, but from what I understood you meant that in a soft relationship the FK is not enforced (i.e. can be null).
If my guess is correct, then the relationship would be a 0..1_to_* rather than a 1_to_*. (I can't think of a 1_to_* relationship when the FK is not enforced - that is actually a 0..1_to_* )
 
With that in mind, If I actually execute a query where the relationships are 0..1_to_* the SQL statement will actually start with a Left Join. (I used Employee 0..1_* Order *_0..1 Customer from NorthwindIB to check it)
 
Regards,
   Silvio.



Print Page | Close Window