C# – LINQ and what does it do?


What is LINQ? I know it's for databases, but what does it do?

Best Solution

LINQ stands for Language Integrated Query.

Instead of writing YAQL (Yet Another Query Language), Microsoft language developers provided a way to express queries directly in their languages (such as C# and Visual Basic). The techniques for forming these queries do not rely on the implementation details of the thing being queried, so that you can write valid queries against many targets (databases, in-memory objects, XML) with practically no consideration of the underlying way in which the query will be executed.

Let's start this exploration with the parts belonging to the .NET Framework (3.5).

  • LINQ To Objects - examine System.Linq.Enumerable for query methods. These target IEnumerable<T>, allowing any typed loopable collection to be queried in a type-safe manner. These queries rely on compiled .NET methods, not Expressions.

  • LINQ To Anything - examine System.Linq.Queryable for some query methods. These target IQueryable<T>, allowing the construction of Expression Trees that can be translated by the underlying implementation.

  • Expression Trees - examine System.Linq.Expressions namespace. This is code as data. In practice, you should be aware of this stuff, but don't really need to write code against these types. Language features (such as lambda expressions) can allow you to use various short-hands to avoid dealing with these types directly.

  • LINQ To SQL - examine the System.Data.Linq namespace. Especially note the DataContext. This is a data access technology built by the C# team. It just works.

  • LINQ To Entities - examine the System.Data.Objects namespace. Especially note the ObjectContext. This is a data access technology built by the ADO.NET team. It is complex, powerful, and harder to use than LINQ To SQL.

  • LINQ To XML - examine the System.Xml.Linq namespace. Essentially, people weren't satisfied with the stuff in System.Xml. So Microsoft re-wrote it and took advantage of the re-write to introduce some methods that make it easier to use LINQ To Objects against XML.

  • Some nice helper types, such as Func and Action. These types are delegates with generic support. Gone are the days of declaring your own custom (and non-interchangeable) delegate types.
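
To make the difference between the Enumerable and Queryable bullets concrete, here is a minimal sketch. The class name, method names and sample data are made up for illustration; only the System.Linq and System.Linq.Expressions calls are real. The same lambda is used once as a compiled Func delegate and once as an expression tree, which a provider could inspect and translate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class QueryTargetsSketch
{
    // Hypothetical sample data; not from the original answer.
    public static readonly int[] Numbers = { 1, 2, 3, 4, 5, 6 };

    // LINQ To Objects: Enumerable.Where takes a compiled delegate (Func<int, bool>).
    public static List<int> EvensViaEnumerable()
    {
        Func<int, bool> isEven = n => n % 2 == 0;
        return Numbers.Where(isEven).ToList();
    }

    // Queryable.Where takes an Expression<Func<int, bool>>: the same lambda,
    // but kept as a data structure that the underlying provider can translate.
    public static List<int> EvensViaQueryable()
    {
        Expression<Func<int, bool>> isEven = n => n % 2 == 0;
        IQueryable<int> source = Numbers.AsQueryable();
        return source.Where(isEven).ToList();
    }

    // "Code as data": the body of the lambda can be examined at run time.
    public static string DescribeBody()
    {
        Expression<Func<int, bool>> isEven = n => n % 2 == 0;
        return isEven.Body.NodeType.ToString();
    }
}
```

Both methods return the same values here because AsQueryable simply wraps the in-memory array. With LINQ To SQL or LINQ To Entities, the expression tree would instead be turned into SQL by the provider.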

All of the above is part of the .NET Framework, and available from any .NET language (VB.NET, C#, IronPython, COBOL .NET etc).
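
As a rough illustration of the LINQ To XML point above, the sketch below runs ordinary LINQ To Objects methods over XML elements. The XML document, its element and attribute names, and the class name are assumptions invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlQuerySketch
{
    // Hypothetical document; element and attribute names are made up.
    private const string Xml =
        "<customers>" +
        "<customer name=\"Bob\" age=\"42\" />" +
        "<customer name=\"Alice\" age=\"30\" />" +
        "<customer name=\"Barbara\" age=\"25\" />" +
        "</customers>";

    public static List<string> NamesStartingWithB()
    {
        XElement root = XElement.Parse(Xml);

        // Elements() returns IEnumerable<XElement>, so the Enumerable
        // query methods (Select, Where, ...) apply directly.
        return root.Elements("customer")
                   .Select(e => (string)e.Attribute("name"))
                   .Where(name => name.StartsWith("B", StringComparison.Ordinal))
                   .ToList();
    }
}
```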

Ok, on to language features. I'm going to stick to C#, since that's what I know best. VB.NET also had several similar improvements (and a couple that C# didn't get - XML literals). This is a short and incomplete list.

  • Extension Methods - this allows you to "add" a method to a type. The method is really a static method that is passed an instance of the type, and it is restricted to the type's public contract, but it is very useful for adding methods to types you don't control (string), or adding (fully implemented) helper methods to interfaces.

  • Query Comprehension Syntax - this allows you to write queries in a SQL-like structure. All of this stuff gets translated to the methods on System.Linq.Queryable or System.Linq.Enumerable (depending on the type of myCustomers). It is completely optional and you can use LINQ well without it. One advantage to this style of query declaration is that the range variables are scoped: they do not need to be re-declared for each clause.

    IEnumerable<string> result =
     from c in myCustomers
     where c.Name.StartsWith("B")
     select c.Name;
  • Lambda Expressions - This is a shorthand for specifying a method. The C# compiler will translate each into either an anonymous method or a true System.Linq.Expressions.Expression. You really need to understand these to use LINQ well. There are three parts: a parameter list, an arrow, and a method body.

    IEnumerable<string> result = myCustomers
     .Where(c => c.Name.StartsWith("B"))
     .Select(c => c.Name);
  • Anonymous Types - Sometimes the compiler has enough information to create a type for you. These types aren't truly anonymous: the compiler names them when it makes them. But those names are made at compile time, which is too late for a developer to use that name at design time.

    myCustomers.Select(c => new
    {
      Name = c.Name,
      Age = c.Age
    });
  • Implicit Types - Sometimes the compiler has enough information from an initialization that it can figure out the type for you. You can instruct the compiler to do so by using the var keyword. Implicit typing is required to declare variables for Anonymous Types, since programmers may not use the name of an anonymous type.

    // The compiler will determine that names is an IEnumerable<string>
    var names = myCustomers.Select(c => c.Name);
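
The extension method bullet above has no example of its own, so here is a small sketch that combines it with lambdas, an anonymous type and var. The Customer class, the StartsWithLetter method and the Describe helper are all hypothetical names introduced for illustration; myCustomers is assumed to be a sequence of objects with Name and Age properties, as in the snippets above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Customer type, standing in for the elements of myCustomers.
public class Customer
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class StringExtensions
{
    // The "this" modifier on the first parameter makes this an extension
    // method: it can be called as if it were an instance method of string.
    public static bool StartsWithLetter(this string value, char letter)
    {
        return !string.IsNullOrEmpty(value) && value[0] == letter;
    }
}

public static class LanguageFeaturesSketch
{
    public static List<string> Describe(IEnumerable<Customer> myCustomers)
    {
        // var is required here: the projected anonymous type has no name
        // that a programmer could write down.
        var projected = myCustomers
            .Where(c => c.Name.StartsWithLetter('B'))
            .Select(c => new { c.Name, IsAdult = c.Age >= 18 });

        return projected
            .Select(p => p.Name + (p.IsAdult ? " (adult)" : " (minor)"))
            .ToList();
    }
}
```

Note that Where and Select are themselves extension methods, defined on System.Linq.Enumerable, which is why they can be chained on any IEnumerable<T>.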