LINQ to Entities Distinct() returns duplicates when LINQ to SQL works as expected


I have been using LINQ to SQL for the last year and have had no problems. I decided to experiment with the Entity Framework on a project where the only requirement was to return some raw data from several views in a customer's database. I created the entity model and included the views, but when I wrote a query to return the distinct values of one column, duplicates were returned.

The only column with EntityKey set to true was PWSID, but that column contains many duplicate values. This was my query:

int[] pwsIds = (from attribs in db.VW_DOR_ParcelAttribs_All
                select attribs.PWSID).Distinct().ToArray();

I then created a LINQ to SQL data model, ran the same query, and everything worked as expected. I didn't make any changes to either data model.

I did try setting the EntityKey property to false, but then I got a compiler error. I don't have control over the customer's tables, and there really is no unique key.

For this application LINQ to SQL is acceptable and works great, but I would like to understand what I did wrong in case I need to use the Entity Framework in the future. Any insight is appreciated.

Best Solution

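EntityKey should be set only on a property, or a combination of properties, that is guaranteed to be unique for every row in that table or view. I'm guessing EF is optimizing the .Distinct() away because it assumes that values of a key column can't contain duplicates. Since PWSID is not actually unique here, that assumption does not hold and the duplicates come through.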
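
If that guess is right, one possible workaround (not verified against the asker's model) is to keep the entity key as it is and move the de-duplication to the client by calling AsEnumerable() before Distinct(), so that LINQ to Objects performs it instead of the generated SQL. The sketch below is self-contained so that it can run without a database: the ParcelAttrib class and the in-memory array are hypothetical stand-ins for db.VW_DOR_ParcelAttribs_All, and PWSID is assumed to be an int because the original query fills an int[].

```csharp
using System;
using System.Linq;

class Program
{
    // Hypothetical stand-in for one row of the VW_DOR_ParcelAttribs_All view.
    class ParcelAttrib
    {
        public int PWSID { get; set; }
    }

    static void Main()
    {
        // In-memory stand-in for db.VW_DOR_ParcelAttribs_All.
        // The repeated PWSID values mimic the non-unique "key" column.
        IQueryable<ParcelAttrib> view = new[]
        {
            new ParcelAttrib { PWSID = 1001 },
            new ParcelAttrib { PWSID = 1001 },
            new ParcelAttrib { PWSID = 1002 },
            new ParcelAttrib { PWSID = 1003 },
            new ParcelAttrib { PWSID = 1003 },
        }.AsQueryable();

        int[] pwsIds = view
            .Select(a => a.PWSID)
            .AsEnumerable()   // everything after this runs as LINQ to Objects
            .Distinct()       // de-duplicated in memory, not by the query provider
            .ToArray();

        Console.WriteLine(string.Join(", ", pwsIds));
    }
}
```

The trade-off is that every PWSID value is pulled from the database before the duplicates are dropped, which may be expensive on a large view.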
