When you’re using LINQ to work with a database the experience is kind of magical, right?. You treat the database entities like a regular collection, use LINQ operators like Where, Select, or Take, and it just works. But let’s consider whaHere’s how the this functionality can be achieved with dynamic queries and expression trees:t happens under the hood. The LINQ query you wrote transform into SQL (or something else), and that SQL query is sent to the database. The database’s response is then mapped to C# objects. But how does that conversion to SQL happen exactly?

In this article, you’ll see how frameworks like Entity Framework and MongoDB C# driver use expressions trees to make that conversion. And you’ll see how to use expression trees yourself to build dynamic queries. Those are queries that you aren’t able to create at compile time because you’ll know how the query should look like only at runtime.

Demystifying IQueryable and Expression Trees

Consider the following C# code that uses Entity Framework 6:

DbSet<Student> students = context.Students;
var billie = await students.Where(s => s.StudentName == "Billie").ToListAsync();

When executed, entity framework produces the following SQL query:

SQL:SELECT
    [Extent1].[StudentID] AS [StudentID],
    [Extent1].[StudentName] AS [StudentName],
    [Extent1].[DateOfBirth] AS [DateOfBirth],
    FROM [dbo].[Students] AS [Extent1]
    WHERE N'Billie' = [Extent1].[StudentName]

Note that there’s a WHERE operation in the SQL query. That’s not obvious. If the SQL didn’t contain WHERE, all the students would be brought from the database and the filtering would be executed in the .NET process. In fact, the following code does exactly that:

//BAD:
DbSet<Student> students = context.Students;
Func<Student, bool> predicate = s => s.StudentName == "Billie";
var x = students.Where(predicate).ToList();

In the last example, the SQL query got all the students to the process and mapped it to a regular collection. The difference is that in the first snippet of code, the lambda was an Expression<Func<Student, bool>>, which allowed entity framework to add it to the SQL query. In the second snippet of code, the lambda was a Func<Student, bool>, so everything up to the Where operator was executed and turned into a regular IEnumerable collection, and then the rest of the query was executed.

That second snippet of code is terrible in terms of performance, memory, and network. Instead of bringing just a single item from the database, we got a lot of objects using up network. We then serialized them into C# objects using up CPU. And stored them in the process`s heap using up memory.

So let’s go back to the first piece of code. How did await students.Where(s => s.StudentName == "Billie").ToListAsync() produce an SQL query that included WHERE N'Billie' = [Extent1].[StudentName]?

The answer is expression trees. The code s => s.StudentName == "Billie" is actually a structured query that can be programmatically broken down into a tree of nodes. In this example, there are 6 nodes. The topmost node is the lambda expression. On the left is the lambda parameter. On the right of it is the lambda body that’s an Equal expression. Entity framework has an algorithm that goes over these expression trees and constructs SQL queries. Same thing happens with other data source providers like Mongo DB C# Driver, except that it constructs a MongoDB json query.

C# expression tree

In the first snippet of code, the type of s => s.StudentName == "Billie" is Expression<Func<Student, bool>>. That means an expression tree that produces a Func<Student, bool>. The Where clause accepts this type of parameter because a DbSet<TEntity> implements the IQueryable<T> interface, which requires it to work with expression trees. In contrast, a regular collection, like an array or a List<T> is an IEnumerable which means it will use the lambda s => s.StudentName == "Billie" like a regular function.

Alright, but how can I use that to my advantage?

In most cases, the folks who use expression trees are the ones building the entity frameworks of the world. But in some specific cases it becomes extremely useful. Here’s a use case that we encountered recently at Ozcode (my day job):

We wanted to create a dynamic server-side filtering on a database entity called Error. This entity had many properties and we wanted to allow the user to filter on any of them. So filtering should be allowed according to Username, Country, Version, or any other property. Here’s the API we needed to achieve:

IQueryable<Error> _errors;

public IEnumerable<Error> GetErrors(string propertyToFilter, string value){ /*..*/}

In this case, the propertyToFilter is a property of Error. With regular LINQ, the only way to achieve this is with a giant switch/case statement. Kind of like this:

IQueryable<Error> _errors;

public IEnumerable<Error> GetErrors(string propertyToFilter, string value)
{
	switch (propertyToFilter)
	{
		case "Username":
			return await _errors.Where(e=> e.Username == value).ToListAsync();
        case "Country":
			return await _errors.Where(e=> e.Country == value).ToListAsync();
		case "Version":
			return await _errors.Where(e=> e.Version == value).ToListAsync();
        // ...		
	}
}

You’ll probably agree this is not ideal. Besides having to write all that stuff, it’s also very prone to bugs. What if a property is added? What if it’s renamed? The whole thing a mess.

Here’s how the this functionality can be achieved with dynamic queries and expression trees:

private async static Task<IEnumerable<Error>> GetErrors(string propertyToFilter, string value)
{
    var error = Expression.Parameter(typeof(Error));
    var memberAccess = Expression.PropertyOrField(error, propertyToFilter);
    var exprRight = Expression.Constant(value);
    var equalExpr = Expression.Equal(memberAccess, exprRight);
    Expression<Func<Error, bool>> lambda = Expression.Lambda<Func<Error, bool>>(equalExpr, error);
            
    return await _errors.Where(lambda).ToListAsync();
}

Each line of code here represents a node in the expression tree. Together they construct the topmost node—the lambda. Then, the dynamic expression can be used in LINQ and produce a server-side SQL query. Quite nice I think.

Another way to solve this would be to build a custom SQL query string. In Ozcode we are using MongoDB, so SQL wouldn’t fit but we could have created a custom MongoDB JSON query string. Not too hard either, but the expression tree approach is much more flexible and reliable I think. For one thing, you can place it in LINQ and combine with other LINQ operators. Besides, why write your own query when there are tested frameworks like Entity Framework that do that for you.

Summary

To recap. here are some of the key points in this post:

  • The difference between a regular function/delegate and an Expression is that an expression can be represented with a structured tree. This tree can be easily analyzed to create things like database queries.
  • Data sources that support expressions implement the IQueryable interface.
  • If you fail to use expressions (and use regular methods or delegates), the query will be on the server side and not on the database side, which will be terrible for performance.
  • When using lambdas (without a body), expressions are created seamlessly, so you’ve probably been doing this right all these years.
  • You can use expression trees yourself to create dynamic queries. That’s useful in scenarios where you can’t build the query at compile time, only at runtime.