Improving Data Access Performance in EF Core

By FoxLearn 1/6/2025 8:53:18 AM   7
Entity Framework Core (EF Core) is an open-source ORM (Object-Relational Mapping) framework that connects your application's object model with the underlying database schema.

It handles CRUD operations (create, read, update, delete) without requiring knowledge of how the data is stored in the database, enabling easier manipulation and retrieval of data through C#.

Fetch Only What You Need

When working with large datasets, it's important to retrieve only the records you need for a specific query. Use projections to select only the necessary fields, ensuring you don't fetch unnecessary data.

For example, how to retrieve a subset of records with filtering and sorting.

decimal minPrice = 100, maxPrice = 500;
var dataContext = new ProductDbContext();
var filteredProducts = dataContext.Products
    .Where(p => p.Price >= minPrice && p.Price <= maxPrice)
    .OrderBy(p => p.Name)
    .ToList();

In this example, only products within the specified price range are retrieved, optimizing performance and ensuring that you only work with the data you truly need.

Break Large Data Contexts into Smaller Ones

In Entity Framework Core, using a single large data context can slow down startup time and impact performance. It's better to break the data context into smaller, more manageable ones, ideally having one data context per module or unit of work. To implement this, create separate classes for each data context, each extending the DbContext class.

Using batch updates for large numbers of entities

By default, EF Core sends individual update statements for each entity in a batch, which can lead to performance issues due to multiple database calls. You can use the UpdateRange() method to update multiple entities in one operation, reducing the number of database hits.

For example of batch updating records using UpdateRange():

public class DataContext : DbContext
{
    public void BatchUpdateBooks(List<Book> books)
    {
        var booksToUpdate = this.Books.Where(b => b.PublishedYear < 2000).ToList();
        this.UpdateRange(books);
        SaveChanges();
    }

    protected override void OnConfiguring(DbContextOptionsBuilder options)
    {
        options.UseInMemoryDatabase("BookDb");
    }

    public DbSet<Book> Books { get; set; }
}

If you're using EF Core 7 or later, you can further optimize batch operations using ExecuteUpdate() or ExecuteDelete() to directly perform updates or deletions without loading entities into memory, like this:

_context.Books.Where(b => b.PublishedYear < 2000).ExecuteUpdate();

This eliminates the need for multiple database calls and boosts performance.

Disable change tracking for read-only queries

By default, EF Core tracks entities retrieved from the database, which is necessary for updating data but can be inefficient with large datasets. To improve performance for read-only queries, you can disable tracking using the AsNoTracking() method, which avoids the overhead of tracking entities that you won’t modify.

For example, if you want to retrieve entities without modifying them, you can use AsNoTracking() as shown below:

var dbModel = await this._context.Books.AsNoTracking()
    .FirstOrDefaultAsync(b => b.Title == "Some Book Title");

In cases where you want to retrieve multiple entities for read-only purposes, you can apply AsNoTracking() to the entire query to prevent unnecessary tracking:

public class DataContext : DbContext
{
    public IQueryable<Book> GetBooks()
    {
        return Set<Book>().AsNoTracking();
    }
}

This ensures that EF Core doesn’t track the entities, improving performance when you’re simply reading data without any intention of updating it.

Use DbContext pooling

Applications often use multiple data contexts, and creating and disposing of DbContext objects can be resource-intensive. To address this, EF Core supports DbContext pooling, which allows DbContext instances to be created once and reused, reducing the overhead of frequent instantiation.

DbContext pooling improves performance by minimizing the cost of creating and disposing of DbContext objects, and can also reduce memory usage.

Here’s how you can configure DbContext pooling in your application’s Program.cs file:

builder.Services.AddDbContextPool<MyDbContext>(options => options.UseSqlServer(connection));

This configuration ensures that your MyDbContext objects are pooled and reused efficiently, boosting the performance of your application.

Using IQueryable instead of IEnumerable

When querying data in EF Core, it's more efficient to use IQueryable instead of IEnumerable. The key difference is that IQueryable translates the query into SQL and executes it on the server side, where the data is stored. In contrast, IEnumerable executes the query on the client side, after the data has been loaded into memory. Additionally, IQueryable supports query optimizations and lazy loading, which can make queries execute faster compared to IEnumerable.

For example, how to use IQueryable to query data efficiently:

IQueryable<Product> query = _context.Products;
query = query.Where(p => p.Price > 100);
query = query.OrderBy(p => p.Name);
List<Product> products = query.ToList();

In this example, the query is constructed using IQueryable, which ensures that the filtering and sorting are done at the database level, resulting in better performance compared to using IEnumerable.

Using eager loading instead of lazy loading

EF Core uses lazy loading by default, meaning related entities are only loaded when accessed. While this can be beneficial by reducing unnecessary data loading, it can also lead to performance issues due to multiple database queries being triggered for each related entity.

To mitigate this, you can use eager loading in EF Core, which loads both the primary entities and their related entities in a single query. This reduces the number of database round trips and improves performance.

For example, how to implement eager loading:

public class DataContext : DbContext
{
    public List<Order> GetOrdersWithEagerLoading()
    {
        List<Order> orders = this.Set<Order>()
            .Include(o => o.Customer)
            .Include(o => o.OrderItems)
            .ToList();
        return orders;
    }
}

In this example, both the Customer and OrderItems related to each Order are loaded in a single query, improving efficiency by minimizing database calls.

Disable lazy loading

Lazy loading in EF Core automatically loads related entities when accessed, which can simplify development. However, it can lead to unnecessary database round trips, potentially slowing down your application.

To improve performance, you can disable lazy loading by setting ChangeTracker.LazyLoadingEnabled = false in your data context.

Using asynchronous instead of synchronous code

Using asynchronous programming in EF Core is a great way to enhance your application's performance, particularly for I/O-bound operations like database queries. Asynchronous methods help prevent blocking the main thread, allowing your application to remain responsive while performing database operations.

Let’s explore an example where we query a list of Customer entities asynchronously from the database.

First, define the Customer and Order model classes:

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<Order> Orders { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public string OrderDetails { get; set; }
    public DateTime OrderDate { get; set; }
    public int CustomerId { get; set; }
    public Customer Customer { get; set; }
}

Now, let's create a custom DbContext for the Customer and Order entities:

public class DataContext : DbContext
{
    public DataContext(DbContextOptions<DataContext> options) : base(options) { }

    public DbSet<Customer> Customers { get; set; }
    public DbSet<Order> Orders { get; set; }
}

Next, here’s how you can use async code to retrieve a list of customers along with their orders asynchronously. This helps avoid blocking the thread while waiting for the database to respond.

public class CustomerService
{
    private readonly DataContext _context;

    public CustomerService(DataContext context)
    {
        _context = context;
    }

    public async Task<List<Customer>> GetCustomersWithOrdersAsync()
    {
        return await _context.Customers
            .Include(c => c.Orders)  // Eager loading related Orders
            .ToListAsync();  // Asynchronously retrieve the customers and their orders
    }
}

In this example, the GetCustomersWithOrdersAsync method asynchronously retrieves a list of Customer entities along with their related Orders. The use of Include(c => c.Orders) ensures that related Order entities are fetched in a single query (eager loading). By calling ToListAsync(), the query is executed asynchronously, preventing the main thread from being blocked while the data is being fetched.

Using async queries like this one improves the responsiveness of your application, especially in scenarios where you are working with large datasets or performing multiple database operations concurrently.

Reduce the round trips to the database

The N+1 select problem is a common performance issue in database-driven applications. It occurs when multiple queries are sent to the database to retrieve related data that could have been fetched in a single query. This can lead to significant performance bottlenecks, especially when dealing with large datasets.

In EF Core, the N+1 problem often arises when querying entities with a one-to-many or many-to-many relationship. For example, imagine you're querying a list of Department entities and their associated Employee entities:

foreach (var department in this._context.Departments)
{
    foreach (var employee in department.Employees)
    {
        Console.WriteLine(employee.Name);
    }
}

In this case, the outer loop will execute a single query to retrieve all departments, which is the "1" in the N+1 problem. However, for each department, the inner loop queries the Employees table to load employees for that specific department. This results in N additional queries, where N is the number of departments. So in total, this approach sends N+1 queries to the database.

To avoid the N+1 problem, you can use eager loading to retrieve both the departments and their related employees in a single query. This is done using the Include method in EF Core:

var departments = this._context.Departments
    .Include(d => d.Employees)  // Eagerly load Employees along with Departments
    .ToList();

foreach (var department in departments)
{
    foreach (var employee in department.Employees)
    {
        Console.WriteLine(employee.Name);
    }
}

In this updated example, the Include(d => d.Employees) ensures that both the departments and their associated employees are loaded in a single query, reducing the total number of queries to just one.

This change eliminates the N+1 problem and reduces database round trips, making the application more efficient by fetching all the necessary data in one go, rather than executing separate queries for each related entity.

Furthermore, starting from EF Core 7, some optimizations have been made to reduce the number of database round trips for specific operations. For example, single insert operations no longer require separate queries to begin and commit a transaction, providing additional performance improvements for such operations.

By using eager loading, your application can fetch related data more efficiently and avoid the costly N+1 problem.