I have a list of people's ID and their first name, and a list of people's ID and their surname. Some people don't have a first name and some don't have a surname; I'd like to do a full outer join on the two lists.
So the following lists:
ID FirstName
-- ---------
1 John
2 Sue
ID LastName
-- --------
1 Doe
3 Smith
Should produce:
ID FirstName LastName
-- --------- --------
1 John Doe
2 Sue
3 Smith
I'm new to LINQ (so forgive me if I'm being lame) and have found quite a few solutions for 'LINQ Outer Joins' which all look quite similar, but really seem to be left outer joins.
My attempts so far go something like this:
private void OuterJoinTest()
{
List<FirstName> firstNames = new List<FirstName>();
firstNames.Add(new FirstName { ID = 1, Name = "John" });
firstNames.Add(new FirstName { ID = 2, Name = "Sue" });
List<LastName> lastNames = new List<LastName>();
lastNames.Add(new LastName { ID = 1, Name = "Doe" });
lastNames.Add(new LastName { ID = 3, Name = "Smith" });
var outerJoin = from first in firstNames
join last in lastNames
on first.ID equals last.ID
into temp
from last in temp.DefaultIfEmpty()
select new
{
id = first != null ? first.ID : last.ID,
firstname = first != null ? first.Name : string.Empty,
surname = last != null ? last.Name : string.Empty
};
}
}
public class FirstName
{
public int ID;
public string Name;
}
public class LastName
{
public int ID;
public string Name;
}
But this returns:
ID FirstName LastName
-- --------- --------
1 John Doe
2 Sue
What am I doing wrong?
Best Answer
Update 1: providing a truly generalized extension method
FullOuterJoin
Update 2: optionally accepting a custom
IEqualityComparer
for the key typeUpdate 3: this implementation has recently become part of
MoreLinq
- Thanks guys!Edit Added
FullOuterGroupJoin
(ideone). I reused theGetOuter<>
implementation, making this a fraction less performant than it could be, but I'm aiming for 'highlevel' code, not bleeding-edge optimized, right now.See it live on http://ideone.com/O36nWc
Prints the output:
You could also supply defaults: http://ideone.com/kG4kqO
Printing:
Explanation of terms used:
Joining is a term borrowed from relational database design:
a
as many times as there are elements inb
with corresponding key (i.e.: nothing ifb
were empty). Database lingo calls thisinner (equi)join
.a
for which no corresponding element exists inb
. (i.e.: even results ifb
were empty). This is usually referred to asleft join
.a
as well asb
if no corresponding element exists in the other. (i.e. even results ifa
were empty)Something not usually seen in RDBMS is a group join[1]:
a
for multiple correspondingb
, it groups the records with corresponding keys. This is often more convenient when you wish to enumerate through 'joined' records, based on a common key.See also GroupJoin which contains some general background explanations as well.
[1] (I believe Oracle and MSSQL have proprietary extensions for this)
Full code
A generalized 'drop-in' Extension class for this