EF Core Aggregation Queries
This presentation walks through the EF Core LINQ queries that power delivery stats, exception reason aggregation, and other analytics endpoints. Each query translates to efficient SQL with GROUP BY and aggregate functions.
Setting Up the Analytics Service
Create a dedicated service for analytics queries. This keeps the aggregation logic separate from the controller and makes it testable.
csharp1public interface IAnalyticsService2{3 Task<List<ParcelCountByStatusDto>> GetParcelCountByStatusAsync();4 Task<DeliveryPerformanceDto> GetDeliveryPerformanceAsync(DateTimeOffset from, DateTimeOffset to);5}
csharp1public class AnalyticsService : IAnalyticsService2{3 private readonly ParcelTrackingDbContext _context;45 public AnalyticsService(ParcelTrackingDbContext context)6 {7 _context = context;8 }9}
The DTOs for these methods are:
csharp1public class ParcelCountByStatusDto2{3 public string Status { get; set; } = string.Empty;4 public int Count { get; set; }5}67public class DeliveryPerformanceDto8{9 public DateTimeOffset From { get; set; }10 public DateTimeOffset To { get; set; }11 public int TotalParcels { get; set; }12 public int Delivered { get; set; }13 public int InTransit { get; set; }14 public int Exceptions { get; set; }15 public double AverageDeliveryTimeHours { get; set; }16 public double OnTimePercentage { get; set; }17}
Register the service in Program.cs:
csharp1builder.Services.AddScoped<IAnalyticsService, AnalyticsService>();
Parcel Count By Status Query
The parcel count by status query returns the current count of parcels in each status. This uses a simple GroupBy aggregation.
csharp1public async Task<List<ParcelCountByStatusDto>> GetParcelCountByStatusAsync()2{3 var counts = await _context.Parcels4 .GroupBy(p => p.Status)5 .Select(g => new ParcelCountByStatusDto6 {7 Status = g.Key.ToString(),8 Count = g.Count()9 })10 .OrderByDescending(p => p.Count)11 .ToListAsync();1213 return counts;14}
This generates:
sql1SELECT Status, COUNT(*) AS Count2FROM Parcels3GROUP BY Status4ORDER BY Count DESC
Including All Statuses
To ensure all status values appear in the result (even with zero counts), fill in the gaps after the query:
csharp1public async Task<List<ParcelCountByStatusDto>> GetParcelCountByStatusAsync()2{3 var counts = await _context.Parcels4 .GroupBy(p => p.Status)5 .Select(g => new6 {7 Status = g.Key,8 Count = g.Count()9 })10 .ToListAsync();1112 var allStatuses = Enum.GetValues<ParcelStatus>();1314 return allStatuses.Select(status => new ParcelCountByStatusDto15 {16 Status = status.ToString(),17 Count = counts.FirstOrDefault(c => c.Status == status)?.Count ?? 018 }).ToList();19}
Delivery Performance Query
The delivery performance endpoint returns comprehensive statistics about parcels within a date range, including counts, average delivery time, and on-time percentage.
Filtering by Date Range
Start by building the base query with the date filter:
csharp1public async Task<DeliveryPerformanceDto> GetDeliveryPerformanceAsync(2 DateTimeOffset from, DateTimeOffset to)3{4 var query = _context.Parcels5 .Where(p => p.CreatedAt >= from && p.CreatedAt <= to);
This creates an IQueryable that has not executed yet. All subsequent operations build on this filtered set.
Counting by Status
Use conditional counting to get totals for each status category:
csharp1 var totalParcels = await query.CountAsync();23 var delivered = await query4 .CountAsync(p => p.Status == ParcelStatus.Delivered);56 var inTransit = await query7 .CountAsync(p => p.Status == ParcelStatus.InTransit8 || p.Status == ParcelStatus.OutForDelivery9 || p.Status == ParcelStatus.PickedUp);1011 var exceptions = await query12 .CountAsync(p => p.Status == ParcelStatus.Exception);
Each CountAsync call generates a separate SQL query. For small-to-medium datasets, this is perfectly fine. For very large datasets, you could combine these into a single GroupBy query.
Single-Query Approach with GroupBy
To get all status counts in one round trip:
csharp1 var statusCounts = await query2 .GroupBy(p => p.Status)3 .Select(g => new4 {5 Status = g.Key,6 Count = g.Count()7 })8 .ToListAsync();910 var delivered = statusCounts11 .FirstOrDefault(s => s.Status == ParcelStatus.Delivered)?.Count ?? 0;1213 var inTransit = statusCounts14 .Where(s => s.Status == ParcelStatus.InTransit15 || s.Status == ParcelStatus.OutForDelivery16 || s.Status == ParcelStatus.PickedUp)17 .Sum(s => s.Count);1819 var exceptions = statusCounts20 .FirstOrDefault(s => s.Status == ParcelStatus.Exception)?.Count ?? 0;
This executes a single SQL GROUP BY query. The in-memory LINQ after ToListAsync() operates on the small result set (at most 7 rows, one per status).
Average Delivery Time
Calculate the average time between parcel creation and actual delivery. Only delivered parcels have an ActualDeliveryDate:
csharp1 var averageDeliveryTimeHours = await query2 .Where(p => p.Status == ParcelStatus.Delivered3 && p.ActualDeliveryDate != null)4 .AverageAsync(p =>5 EF.Functions.DateDiffHour(p.CreatedAt, p.ActualDeliveryDate!.Value));
EF.Functions.DateDiffHour translates to the database's date difference function. The Npgsql provider translates this to PostgreSQL's native date arithmetic.
If no parcels have been delivered in the date range, AverageAsync throws an InvalidOperationException because the sequence is empty. Handle this:
csharp1 double averageDeliveryTimeHours = 0;23 var deliveredParcels = query4 .Where(p => p.Status == ParcelStatus.Delivered5 && p.ActualDeliveryDate != null);67 if (await deliveredParcels.AnyAsync())8 {9 averageDeliveryTimeHours = await deliveredParcels10 .AverageAsync(p =>11 EF.Functions.DateDiffHour(p.CreatedAt, p.ActualDeliveryDate!.Value));12 }
On-Time Percentage
A parcel is on time if ActualDeliveryDate <= EstimatedDeliveryDate:
csharp1 double onTimePercentage = 0;23 if (delivered > 0)4 {5 var onTimeCount = await query6 .CountAsync(p => p.Status == ParcelStatus.Delivered7 && p.ActualDeliveryDate != null8 && p.EstimatedDeliveryDate != null9 && p.ActualDeliveryDate <= p.EstimatedDeliveryDate);1011 onTimePercentage = Math.Round((double)onTimeCount / delivered * 100, 1);12 }
Assembling the Result
Return the assembled DTO:
csharp1 return new DeliveryPerformanceDto2 {3 From = from,4 To = to,5 TotalParcels = totalParcels,6 Delivered = delivered,7 InTransit = inTransit,8 Exceptions = exceptions,9 AverageDeliveryTimeHours = Math.Round(averageDeliveryTimeHours, 1),10 OnTimePercentage = onTimePercentage11 };12}
Understanding GroupBy Projections
The Select after GroupBy is the key to aggregation queries. Each group g has:
g.Key- the value being grouped by (the enum or property value)g.Count()- number of items in that group- Any other aggregate:
g.Average(...),g.Sum(...),g.Min(...),g.Max(...)
EF Core translates these to efficient SQL with GROUP BY and aggregate functions.
Handling Empty Results
Every aggregation query must handle the case where the filtered dataset is empty:
CountAsync()returns 0 for empty sets (safe)AverageAsync()throwsInvalidOperationExceptionon empty sequencesSumAsync()returns 0 for empty sets (safe)GroupBy(...).Select(...).ToListAsync()returns an empty list (safe)
Always check with AnyAsync() before calling AverageAsync(), or use a try-catch, or use the nullable overload:
csharp1// Safe approach: check first2if (await query.AnyAsync())3{4 var avg = await query.AverageAsync(p => p.SomeValue);5}67// Alternative: cast to nullable and use DefaultIfEmpty8var avg = await query9 .Select(p => (double?)p.SomeValue)10 .AverageAsync() ?? 0;
Performance Considerations
Database Indexes
Aggregation queries benefit from indexes on the columns used in WHERE and GROUP BY clauses. For the analytics endpoints, these columns are important:
csharp1// In your DbContext OnModelCreating2modelBuilder.Entity<Parcel>()3 .HasIndex(p => p.CreatedAt);45modelBuilder.Entity<Parcel>()6 .HasIndex(p => p.Status);78modelBuilder.Entity<Parcel>()9 .HasIndex(p => p.ServiceType);1011modelBuilder.Entity<TrackingEvent>()12 .HasIndex(e => new { e.EventType, e.Timestamp });
Avoiding N+1 Queries
The analytics queries in this topic do not use Include() because they do not need navigation properties. They project directly to DTOs with aggregates. This is one of the advantages of aggregation queries: they naturally avoid the N+1 problem because they work with scalar results, not entity graphs.
Let the Database Do the Work
A common mistake is to load all parcels into memory and aggregate in C#:
csharp1// BAD: loads all parcels into memory2var allParcels = await _context.Parcels.ToListAsync();3var delivered = allParcels.Count(p => p.Status == ParcelStatus.Delivered);
Instead, keep the aggregation in the LINQ query so EF Core translates it to SQL:
csharp1// GOOD: database performs the count2var delivered = await _context.Parcels3 .CountAsync(p => p.Status == ParcelStatus.Delivered);
The database engine is optimized for aggregation. It uses indexes, parallelism, and optimized algorithms. Loading rows into your application just to count them wastes memory, network bandwidth, and CPU time.
Key Takeaways
- Use
GroupBywithSelectto project aggregated results into DTOs g.Keyholds the grouping value;g.Count(),g.Sum(),g.Average()compute aggregates- Handle empty sequences before calling
AverageAsync()to avoid exceptions - Compute conditional averages using
Sum/Countand divide in C# after materialization - Add database indexes on columns used in
WHEREandGROUP BYfor performance - Keep aggregation in the LINQ query so the database does the work, not your application
- The
IAnalyticsServiceencapsulates complex aggregation logic and keeps controllers thin