EF Pitfalls

Every technology has its pitfalls. You may not even notice it. Here, I’ll present some of the most important pitfalls of Entity Framework that you should learn about and have an eye on during development. Additionally interviewers commonly asks about those things, because if you know answer for these, it’s sure you’ve been working with EF long enough.

IEnumerable vs IQueryable vs IList

It is question that distinguishes developers that used EF from those who understand EF. At glance you can say it doesn’t matter which one you’ll return, at the end you’ll have your collection of entities. Wrong! It’s very important to understand what gives you each of them.

IQueryable

When you operate on IQueryable every operation that you call will generate SQL, so every filtering, selection and sorting will be done on database. It’s great for operation on large amounts of data, like initial search. You need to be careful thou, because it is two-edged sword.

Everything will be translated to SQL, if you let IQueryable out someone may access more data than you would like them to get. Additionally client may misuse IQueryable by adding too much filters, which may slow down SQL Execution. Additionally if EF can’t translate linq to SQL it will translate what can, and rest of linq will be done in memory. So more than expected data maybe loaded from database, which may kill your server.

One of the most important information is that IQueryable is not executed immediately. That means, EF will call database only when you will need to read, iterate thru data etc. OR, when you create linq that EF can’t translate to SQL.

IEnumerable

IEnumerable is similar to IQueryable, it is also “lazy”, there’s one significant difference every operation will be done in memory. First EF will load data from database then start filtering, sorting etc. in your application instead of database.

IList

There are methods in LinQ that will call database immediately. One of the most known is ToList() and ToListAsync() which will gather whole query and make call to database. After that data specified by query in linq will be loaded to your application.

It is worth to mention that, when you call ToList() and then add more filtering it will be done in memory. It is the same mechanism as I described in IEnumerable section and it’s called Client Evaluation.

Client Evaluation

//Edit Net core 3.0

According to documentation from EF core 3.0 client evaluation is dissabled and EF will throw runtime exception when client evaluation is detected. We can still use it, but we need to switch to LINQ to Object from Query. By doing it EF will load filtered data to memory and rest of query will be done in memory. AsEnumerable() is lazy, so it won’t load data immediately.

var specialCustomers =
  context.Customers
    .Where(c => c.Name.StartsWith(n)) 
    .AsEnumerable() // switches to LINQ to Objects
    .Where(c => IsSpecialCustomer(c));

It is mechanism that will load data from database and continue operation on client side in our case in Web API. It may bring a lot of problems. Loading too much data from database at once may run your API out of memory. By default EF will inform you about client evaluation by logging warning. It is possible to change this behavior.

public void ConfigureServices(IServiceCollection services)
{
     services.AddDbContext<BlogPostContext>(options => options.UseSqlServer(Configuration.GetConnectionString("DefaultConnection"))
    .ConfigureWarnings(warnings => warnings.Throw(RelationalEventId.QueryClientEvaluationWarning)));
...
}

In Startup.cs we can change a little EF configuration by adding last line. From this point EF will throw exception when it detects that client evaluation is used.

Client evaluation isn’t bad thing in my opinion. It’s just place you should know about and have an eye on, because it may cost you gravely. I would say it’s better to call ToList() by yourself and let know other developers that rest of operation will be done in memory, or change query so whole thing can be finished on database.

Select N + 1

It’s the most known pitfall in Entity Framework. Select N+1 means that we make more database calls than we need. Let’s see it on example. We have Customer table and Order table, every customer has multiple orders, so it’s 1 to many relationship. Now we want to select amount from orders for every customer. Let’s look at example.

var query = context.Customers.Select(
    c => c.Orders.Where(o => o.Amount  > 100).Select(o => o.Amount));

Code above will generate N+1 calls. One for all customers and then N for orders, where N is number of customers. In other words.

SELECT * FROM Customers;	
SELECT Amount FROM Order WHERE CustomerId = ? AND Amount > 100

This is what happens. EF will load Customers and then start loading Orders for every Customer. As you may know calling database twice or even 10 times isn’t bad. You can live with 100 calls, but thousands call made by thousands clients may kill your server.

How to protect yourself?

Entity Framework Core 2.1 has means to deal with Select N+1 easily. We can tell EF that we want to cache some information.

var query = context.Customers.Select(
    c => c.Orders.Where(o => o.Amount  > 100).Select(o => o.Amount).ToList());

By adding .ToList() in right place, we can tell EF that we want to cache results from specified call and reuse it later. Do mind that .ToList() isn’t added at the end of call, but after second select statement.

Outro

These are two most common questions on job interviews regarding ORMs / Entity Framework and very important questions, because you can’t use ORM correctly if you don’t know answer for them. So make sure you remember that and bear in mind to look at EF logs when you use it in development.

Leave a Reply