NashTech Insights

Entity Framework Core Internals: Query Pipelines

Phong Nguyen


This post continues the Entity Framework Core Internals series. If you haven’t followed along, please check out the previous posts first.

In today’s article, we’ll dive deep into how EF Core processes LINQ queries, translates them, and executes them on your target database. We’ll also explore IQueryable and Query Providers, and see how EF Core uses caching to do this efficiently.

IEnumerable vs IQueryable

IEnumerable is an abstraction in .NET referring to something you can enumerate over.

Let’s take a closer look at the Where method:

Because we enumerate a list, let’s drill down to the WhereListIterator next:

As you can see, when enumerating, the predicate we passed in is executed to evaluate and filter the data. In our example, any Blog that has BlogId > 1 is returned.

IQueryable is similar to IEnumerable. However, it’s not a stream of data that we can enumerate right now; it’s an abstraction of something we can query over.

For example, we don’t want to pull all the data from the SQL Server database and then apply the filter. Instead, we want to translate the instruction to SQL and send it to the server, so that the server evaluates it and sends back only the results. In that sense, we can think of EF Core as a special kind of compiler.
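To make the deferred behaviour concrete, here is a minimal sketch. It does not reproduce the real Enumerable.Where source: MyWhere is a hypothetical, simplified stand-in, and the Blog record with its Url property is assumed for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var blogs = new List<Blog>
{
    new(1, "https://a.example"),
    new(2, "https://b.example"),
    new(3, "https://c.example")
};

// Count how often the predicate actually runs.
var evaluations = 0;
Func<Blog, bool> predicate = b => { evaluations++; return b.BlogId > 1; };

// Building the query runs nothing: the iterator body has not started yet.
IEnumerable<Blog> filtered = blogs.MyWhere(predicate);
var evaluationsBeforeEnumerating = evaluations;

// Enumerating pulls each item through the predicate, one at a time.
List<int> ids = filtered.Select(b => b.BlogId).ToList();

// prints "before: 0, after: 3, ids: 2,3"
Console.WriteLine($"before: {evaluationsBeforeEnumerating}, after: {evaluations}, ids: {string.Join(",", ids)}");

public record Blog(int BlogId, string Url);

public static class MyEnumerableExtensions
{
    // Hypothetical, simplified stand-in for Enumerable.Where (assumption, not the BCL source).
    public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (var item in source)
        {
            if (predicate(item))
            {
                yield return item;
            }
        }
    }
}
```

The predicate is an ordinary delegate here, so the filtering can only happen in memory, item by item.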

Let’s take a closer look at the Where method, this time it’s different:

This time the predicate is an Expression<Func<TSource, bool>> rather than a Func<TSource, bool>. So what’s the difference?

The Expression<Func<TSource, bool>> is a LambdaExpression that captures a block of code that is similar to a .NET method body. In our example: it’s not a method that receives a Blog and returns true if BlogId > 1, it represents a method that receives a Blog and returns true if BlogId > 1.
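The difference can be observed directly with a small sketch. The Blog class below is assumed for illustration; everything else comes from System.Linq.Expressions.

```csharp
using System;
using System.Linq.Expressions;

// A delegate: compiled code that can only be invoked.
Func<Blog, bool> asDelegate = b => b.BlogId > 1;

// An expression: a data structure describing the same code.
Expression<Func<Blog, bool>> asExpression = b => b.BlogId > 1;

// The expression can be taken apart node by node.
var body = (BinaryExpression)asExpression.Body;
var left = (MemberExpression)body.Left;
var right = (ConstantExpression)body.Right;

Console.WriteLine(body.NodeType);    // GreaterThan
Console.WriteLine(left.Member.Name); // BlogId
Console.WriteLine(right.Value);      // 1

// It can still be turned into a delegate when in-memory execution is wanted.
Func<Blog, bool> compiled = asExpression.Compile();
var sample = new Blog { BlogId = 2 };
Console.WriteLine(asDelegate(sample) == compiled(sample)); // True

public class Blog
{
    public int BlogId { get; set; }
    public string Url { get; set; } = "";
}
```

Because the expression exposes its structure, a query provider can read it and produce something other than .NET code, such as SQL.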

Let’s make our query a bit more complicated:

And examine the OrderBy method:

If we visualize the Expression Tree now, it should look like below:
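One way to inspect the tree without a diagram is to walk it in code. The sketch below assumes a query that filters on BlogId and orders by a Url property (the Url property and the exact query are assumptions, since they are not spelled out here), and uses an in-memory list wrapped with AsQueryable as the source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

IQueryable<Blog> query = new List<Blog>().AsQueryable()
    .Where(b => b.BlogId > 1)
    .OrderBy(b => b.Url);

// Each Queryable operator wraps the previous expression as its first argument,
// so the outermost node is the last operator that was applied.
var lines = new List<string>();
Expression node = query.Expression;
var depth = 0;
while (node is MethodCallExpression call)
{
    // Arguments[1] is the lambda, wrapped in a Quote node.
    var lambda = ((UnaryExpression)call.Arguments[1]).Operand;
    lines.Add($"{new string(' ', depth * 2)}{call.Method.Name}: {lambda}");
    node = call.Arguments[0];
    depth++;
}
lines.Add($"{new string(' ', depth * 2)}{node.NodeType} (the data source)");

foreach (var line in lines)
{
    Console.WriteLine(line);
}

public class Blog
{
    public int BlogId { get; set; }
    public string Url { get; set; } = "";
}
```

The walk lists OrderBy first, then Where nested inside it, and finally a Constant node that holds the data source.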

Now that we have the Expression Tree, how does EF Core (or any other ORM) process it? Before investigating that, let’s talk about IQueryProvider.

Query Providers

The easiest way to understand how IQueryable and IQueryProvider work is to create our own simple implementations of them.

First, create our MyQueryable class to represent something that we can query over:

The class depends on an IQueryProvider (its Provider property) to process the Expression. Let’s create the MyQueryProvider class next:

Rather than process the expression directly inside the Execute method, let’s create our own ExpressionVisitor class to traverse the expression and translate it to an SQL query.

Finally, let’s use the MyQueryable class in our application:

Set a breakpoint and debug the application. You should see the SQL statement being constructed, which we could then send to SQL Server. Of course, the statement doesn’t work yet, but it demonstrates how we can process the expression.

If you recall one of the earlier posts in the series, Entity Framework Core Internals: DbContext Instantiation and Initialization, we figured out that EF Core has its own implementations of IQueryable and IQueryProvider.
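Put together, the pieces described in this section might look like the following minimal sketch. The names MyQueryable and MyQueryProvider follow the text; SqlTranslatingVisitor, the LastSql property, the Blog class, and the decision to return an empty result instead of calling a database are all assumptions made for illustration. Only Where with the > and = operators is handled.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Text;

var provider = new MyQueryProvider();
var blogs = new MyQueryable<Blog>(provider);

// Queryable.Where only builds the tree; ToList() enumerates and calls Execute.
var results = blogs.Where(b => b.BlogId > 1).ToList();
Console.WriteLine(provider.LastSql); // SELECT * FROM Blog WHERE BlogId > 1

public class Blog { public int BlogId { get; set; } public string Url { get; set; } = ""; }

public class MyQueryable<T> : IQueryable<T>
{
    public MyQueryable(IQueryProvider provider)
    {
        Provider = provider;
        Expression = Expression.Constant(this); // root of the tree: the source itself
    }

    public MyQueryable(IQueryProvider provider, Expression expression)
    {
        Provider = provider;
        Expression = expression;
    }

    public Type ElementType => typeof(T);
    public Expression Expression { get; }
    public IQueryProvider Provider { get; }

    public IEnumerator<T> GetEnumerator() => Provider.Execute<IEnumerable<T>>(Expression).GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public class MyQueryProvider : IQueryProvider
{
    public string LastSql { get; private set; } = ""; // hypothetical: exposes the generated SQL

    public IQueryable<TElement> CreateQuery<TElement>(Expression expression) => new MyQueryable<TElement>(this, expression);
    public IQueryable CreateQuery(Expression expression) => throw new NotSupportedException();
    public object Execute(Expression expression) => throw new NotSupportedException();

    public TResult Execute<TResult>(Expression expression)
    {
        LastSql = new SqlTranslatingVisitor().Translate(expression);
        // No database here: return an empty sequence. Assumes TResult is IEnumerable<T>.
        var elementType = typeof(TResult).GetGenericArguments()[0];
        return (TResult)(object)Array.CreateInstance(elementType, 0);
    }
}

public class SqlTranslatingVisitor : ExpressionVisitor
{
    private readonly StringBuilder _sql = new();

    public string Translate(Expression expression)
    {
        Visit(expression);
        return _sql.ToString();
    }

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        if (node.Method.DeclaringType != typeof(Queryable) || node.Method.Name != nameof(Queryable.Where))
            throw new NotSupportedException($"Method {node.Method.Name} is not supported.");

        Visit(node.Arguments[0]); // the source: emits SELECT ... FROM
        _sql.Append(" WHERE ");
        var lambda = (LambdaExpression)((UnaryExpression)node.Arguments[1]).Operand;
        Visit(lambda.Body);       // the predicate body
        return node;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        if (node.Value is IQueryable source) _sql.Append($"SELECT * FROM {source.ElementType.Name}");
        else _sql.Append(node.Value);
        return node;
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        Visit(node.Left);
        _sql.Append(node.NodeType switch
        {
            ExpressionType.GreaterThan => " > ",
            ExpressionType.Equal => " = ",
            _ => throw new NotSupportedException($"Operator {node.NodeType} is not supported.")
        });
        Visit(node.Right);
        return node;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        _sql.Append(node.Member.Name); // column name assumed equal to the property name
        return node;
    }
}
```

A real provider would also map table and column names, parameterize values instead of inlining them, and read rows back from the database.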

And here is the Execute method of the EntityQueryProvider class (implementation of IAsyncQueryProvider which also is an IQueryProvider).

It uses the IQueryCompiler _queryCompiler dependency to invoke the Execute method, and here is the code of the Execute method in the QueryCompiler class (default implementation of the IQueryCompiler interface).

Query Pipelines

Now you have a basic understanding of how EF Core translates and processes our queries. In reality, it’s a bit more complex, because EF Core supports many kinds of complex queries. The implementation code is super complicated, so rather than showing the code here, we’ll use some diagrams to visualize it. Let’s get started.

The Query Provider (the EntityQueryProvider class, which is an IQueryProvider), as mentioned in the last section, is responsible for handling the expression and returning results. It doesn’t do much itself: it just forwards the expression to the Query Compiler.

Query Compiler (QueryCompiler class) is responsible for compiling the query tree, compiling the logic to invoke the Translator, and compiling the Materializer.

Translator is responsible for traversing the expression tree and generating the instructions (e.g. a SQL script) that can be understood by the underlying database.

Materializer is code generated at runtime to create optimized code for the user’s model. It’s responsible for reading back the database results and materializing them. This is also a part of the query compilation.
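To give a feel for what "code generated at runtime" means, here is a small sketch that builds a materializer with expression trees and compiles it into a delegate. This is not EF Core's actual implementation: the object[] row, the column order, and the Blog class are assumptions made for illustration.

```csharp
using System;
using System.Linq.Expressions;

// The generated delegate takes one raw row and returns a Blog.
var row = Expression.Parameter(typeof(object[]), "row");

// Reads column i from the row and converts it to the property type.
Expression Column(int index, Type type) =>
    Expression.Convert(Expression.ArrayIndex(row, Expression.Constant(index)), type);

// Equivalent to: row => new Blog { BlogId = (int)row[0], Url = (string)row[1] }
var body = Expression.MemberInit(
    Expression.New(typeof(Blog)),
    Expression.Bind(typeof(Blog).GetProperty(nameof(Blog.BlogId))!, Column(0, typeof(int))),
    Expression.Bind(typeof(Blog).GetProperty(nameof(Blog.Url))!, Column(1, typeof(string))));

// Compile once; afterwards each row is materialized without reflection.
Func<object[], Blog> materialize = Expression.Lambda<Func<object[], Blog>>(body, row).Compile();

var blog = materialize(new object[] { 7, "https://example.com" });
Console.WriteLine($"{blog.BlogId} {blog.Url}"); // 7 https://example.com

public class Blog
{
    public int BlogId { get; set; }
    public string Url { get; set; } = "";
}
```

The tree is built from the shape of the model rather than written by hand, which is why it belongs to query compilation and benefits from being cached.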

One thing EF Core does well is caching: it avoids compiling the same expression repeatedly.

As you can see, the cacheKey is based on the query expression. However, if the cacheKey were generated from the query exactly as it was passed in, queries with the same structure but different parameter values would each be compiled and cached separately, even though they are nearly identical. For example:

To overcome this issue, EF Core extracts the parameters and generalizes the query.
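The idea can be sketched with a small ExpressionVisitor that replaces captured values with named parameters. This is a simplification, not EF Core's code: ParameterExtractingVisitor, the ById helper, the Blog class, the __name_index naming, and the use of ToString() as a cache key are all assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Same query shape, different captured value for id.
static Expression<Func<Blog, bool>> ById(int id) => b => b.BlogId == id;

var first = new ParameterExtractingVisitor();
var second = new ParameterExtractingVisitor();

// After extraction, the text of the tree no longer depends on the value of id.
string firstKey = first.Visit(ById(1)).ToString();
string secondKey = second.Visit(ById(2)).ToString();

Console.WriteLine(firstKey == secondKey); // True
Console.WriteLine($"{string.Join(",", first.Values.Values)} / {string.Join(",", second.Values.Values)}"); // 1 / 2

public class Blog
{
    public int BlogId { get; set; }
    public string Url { get; set; } = "";
}

public class ParameterExtractingVisitor : ExpressionVisitor
{
    // Extracted values, keyed by the generated parameter name.
    public Dictionary<string, object> Values { get; } = new();

    protected override Expression VisitMember(MemberExpression node)
    {
        // A captured variable shows up as a member access on a constant closure object.
        if (node.Expression is ConstantExpression closure)
        {
            object value = node.Member switch
            {
                FieldInfo field => field.GetValue(closure.Value),
                PropertyInfo property => property.GetValue(closure.Value),
                _ => throw new NotSupportedException()
            };

            var name = $"__{node.Member.Name}_{Values.Count}";
            Values[name] = value;

            // Replace the captured value with a placeholder parameter.
            return Expression.Parameter(node.Type, name);
        }

        return base.VisitMember(node);
    }
}
```

Both trees now produce the same key, so the compiled result can be shared while the actual values travel separately.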

Below is the query with id = 1 before generalizing:

After generalizing:

And the query with id = 2 before generalizing:

After generalizing:

As you can see, after generalizing, both queries look the same. Let’s go back and update the diagram:

Compiled Queries work in much the same way: EF Core takes your predefined query, extracts the parameters, compiles the query, and returns a delegate that contains the logic to translate the query and materialize the data. When you use that delegate to execute your query multiple times, it doesn’t need to go through the entire process again and again.
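As a rough analogy that needs no database, the sketch below compiles a parameterized expression once and reuses the resulting delegate with different values. In EF Core itself the entry point is EF.CompileQuery; the code here only uses Expression.Compile from the base class library, and the Blog class is assumed for illustration.

```csharp
using System;
using System.Linq.Expressions;

// The parameter is part of the lambda signature instead of a captured variable.
Expression<Func<Blog, int, bool>> template = (blog, id) => blog.BlogId == id;

// Compilation is the expensive step, so it happens exactly once.
Func<Blog, int, bool> compiled = template.Compile();

// The delegate is then reused with different parameter values.
var target = new Blog { BlogId = 2 };
bool matchesOne = compiled(target, 1);
bool matchesTwo = compiled(target, 2);

Console.WriteLine($"{matchesOne} {matchesTwo}"); // False True

public class Blog
{
    public int BlogId { get; set; }
    public string Url { get; set; } = "";
}
```

Holding on to the delegate skips both the compilation and the cache lookup on every later call.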

I also have a post related to Compiled Queries, in case you’re interested please check it out: Entity Framework Core Tips: Making Queries Run As Fast As Dapper

Wrapping Up

We now know the concepts behind IQueryable and Query Providers, and understand how EF Core does all the hard work to process our LINQ queries efficiently.

That should be the end of the Internals series. Thanks for following along with me from the beginning. If you are interested, the diagrams file for this article is also available.

Phong Nguyen

Phong is currently working as a Technical Architect at NashTech and has over 12 years of experience in designing, building and integrating Enterprise Applications. He is interested in Performance Optimization, Security, Code Analysis, Architecture and Cloud Computing.
