Conquering the N+1 Problem in Spring GraphQL with DataLoader

The N+1 problem can turn your elegant GraphQL API into a database nightmare. Learn how Spring GraphQL's batch loading solves this efficiently.
The N+1 Problem Revisited
Before diving into solutions, let's understand the problem clearly. Consider this query:
query {
books { # 1 query to get 100 books
title
author { # 100 queries - one per book!
name
}
}
}
With a naive implementation, fetching 100 books triggers 101 database queries:
SELECT * FROM books; -- 1 query
SELECT * FROM authors WHERE id = 1; -- +100 queries
SELECT * FROM authors WHERE id = 2;
SELECT * FROM authors WHERE id = 3;
-- ... 97 more queries
That's the N+1 problem: 1 query for the list, plus N queries for related items.
How Spring GraphQL Solves This
Spring GraphQL provides built-in support for batch loading through the @BatchMapping annotation and the BatchLoaderRegistry.
Approach 1: @BatchMapping (Recommended)
The simplest approach uses @BatchMapping:
@Controller
public class BookController {
private final BookRepository bookRepository;
private final AuthorRepository authorRepository;
@QueryMapping
public List<Book> books() {
return bookRepository.findAll();
}
@BatchMapping
public Map<Book, Author> author(List<Book> books) {
// Collect all author IDs
Set<String> authorIds = books.stream()
.map(Book::authorId)
.collect(Collectors.toSet());
// Single query for all authors
Map<String, Author> authorsById = authorRepository
.findAllById(authorIds)
.stream()
.collect(Collectors.toMap(Author::id, Function.identity()));
// Map each book to its author
return books.stream()
.collect(Collectors.toMap(
Function.identity(),
book -> authorsById.get(book.authorId())
));
}
}
Now the same query executes only 2 queries:
SELECT * FROM books;
SELECT * FROM authors WHERE id IN (1, 2, 3, ...);
Understanding @BatchMapping
The @BatchMapping annotation:
- Collects all instances that need the field resolved
- Calls your method once with the entire batch
- Maps results back to individual instances
┌─────────────────────────────────────────────────────────────────┐
│ Query Execution │
└─────────────────────────────────────────────────────────────────┘
books query returns: Book-1, Book-2, Book-3, ..., Book-100
│ │ │ │
└───────┴───────┴────────────┘
│
┌───────────▼───────────┐
│ @BatchMapping │
│ author(List<Book>) │
│ │
│ Single batch call │
│ with all 100 books │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ Map<Book, Author> │
│ returned │
└───────────────────────┘
Return Types for @BatchMapping
@BatchMapping supports several return types:
// Map - each source maps to one result
@BatchMapping
public Map<Book, Author> author(List<Book> books) { ... }
// Mono<Map> - async loading
@BatchMapping
public Mono<Map<Book, Author>> author(List<Book> books) { ... }
// For collections (one-to-many relationships)
@BatchMapping
public Map<Author, List<Book>> books(List<Author> authors) { ... }
Approach 2: BatchLoaderRegistry
For more control, register batch loaders manually:
@Configuration
public class GraphQLConfig {
@Bean
public RuntimeWiringConfigurer runtimeWiringConfigurer(
AuthorRepository authorRepository) {
return wiringBuilder -> wiringBuilder
.type("Book", builder -> builder
.dataFetcher("author", environment -> {
Book book = environment.getSource();
DataLoader<String, Author> loader =
environment.getDataLoader("authorLoader");
return loader.load(book.authorId());
}));
}
@Bean
public BatchLoaderRegistry batchLoaderRegistry(
AuthorRepository authorRepository) {
return new BatchLoaderRegistry() {
@Override
public <K, V> BatchLoaderRegistry forName(String name) {
return this;
}
@Override
public void registerFor(GraphQLContext context) {
context.put("authorLoader",
DataLoaderFactory.newDataLoader(
(List<String> ids) -> {
Map<String, Author> authors = authorRepository
.findAllById(ids)
.stream()
.collect(Collectors.toMap(Author::id, a -> a));
// Return in same order as requested
return CompletableFuture.completedFuture(
ids.stream()
.map(authors::get)
.toList()
);
}
));
}
};
}
}
Simpler Registration with Spring Boot
Spring Boot 3+ provides a cleaner way:
@Component
public class AuthorBatchLoader implements BatchLoaderWithContext<String, Author> {
private final AuthorRepository authorRepository;
public AuthorBatchLoader(AuthorRepository authorRepository) {
this.authorRepository = authorRepository;
}
@Override
public CompletionStage<List<Author>> load(List<String> authorIds,
BatchLoaderEnvironment env) {
return CompletableFuture.supplyAsync(() -> {
Map<String, Author> authorsById = authorRepository
.findAllById(authorIds)
.stream()
.collect(Collectors.toMap(Author::id, Function.identity()));
// Must return in same order as input
return authorIds.stream()
.map(authorsById::get)
.toList();
});
}
}
@Configuration
public class DataLoaderConfig {
@Bean
public BatchLoaderRegistry batchLoaderRegistry(
AuthorBatchLoader authorBatchLoader) {
return registry -> registry
.forTypePair(String.class, Author.class)
.registerBatchLoader(authorBatchLoader);
}
}
Nested Batch Loading
What about nested relationships?
query {
books {
title
author {
name
publisher { # Another level!
name
}
}
}
}
Just add another @BatchMapping:
@Controller
public class AuthorController {
@BatchMapping
public Map<Author, Publisher> publisher(List<Author> authors) {
Set<String> publisherIds = authors.stream()
.map(Author::publisherId)
.filter(Objects::nonNull)
.collect(Collectors.toSet());
Map<String, Publisher> publishersById = publisherRepository
.findAllById(publisherIds)
.stream()
.collect(Collectors.toMap(Publisher::id, Function.identity()));
return authors.stream()
.filter(a -> a.publisherId() != null)
.collect(Collectors.toMap(
Function.identity(),
a -> publishersById.get(a.publisherId())
));
}
}
Spring GraphQL handles the execution order automatically.
One-to-Many Relationships
For collections, the return type changes slightly:
@Controller
public class AuthorController {
@QueryMapping
public List<Author> authors() {
return authorRepository.findAll();
}
@BatchMapping
public Map<Author, List<Book>> books(List<Author> authors) {
Set<String> authorIds = authors.stream()
.map(Author::id)
.collect(Collectors.toSet());
// Single query for all books by these authors
List<Book> allBooks = bookRepository.findByAuthorIdIn(authorIds);
// Group by author ID
Map<String, List<Book>> booksByAuthorId = allBooks.stream()
.collect(Collectors.groupingBy(Book::authorId));
// Map each author to their books
return authors.stream()
.collect(Collectors.toMap(
Function.identity(),
author -> booksByAuthorId.getOrDefault(author.id(), List.of())
));
}
}
Performance Comparison
Let's measure the difference:
@SpringBootTest
class BatchLoadingPerformanceTest {
@Autowired
private GraphQlTester graphQlTester;
@Test
void measureQueryPerformance() {
// Setup: 100 books, 20 authors
long start = System.currentTimeMillis();
graphQlTester.document("""
query {
books {
title
author {
name
}
}
}
""")
.execute()
.path("books").entityList(Book.class).hasSize(100);
long duration = System.currentTimeMillis() - start;
System.out.println("Query executed in: " + duration + "ms");
}
}
| Approach | Queries | Time |
|---|---|---|
| Naive (N+1) | 101 | ~500ms |
| @BatchMapping | 2 | ~50ms |
10x improvement just by changing how we fetch data!
Caching Within a Request
DataLoader automatically caches within a request:
query {
book1: bookById(id: "1") {
author { name } # Loads author "A"
}
book2: bookById(id: "2") {
author { name } # Also loads author "A" - cached!
}
}
If both books have the same author, the author is fetched only once per request.
Important: This cache is per-request. New requests start with an empty cache.
Common Pitfalls
1. Returning Results in Wrong Order
DataLoader requires results in the same order as the input keys:
// WRONG - HashMap doesn't preserve order
Map<String, Author> map = ...;
return ids.stream().map(map::get).toList();
// RIGHT - explicitly maintain order
return ids.stream()
.map(id -> authorsById.getOrDefault(id, null))
.toList();
2. Forgetting to Handle Nulls
Some IDs might not have corresponding data:
@BatchMapping
public Map<Book, Author> author(List<Book> books) {
// Some books might have invalid authorId
Map<String, Author> authorsById = ...;
return books.stream()
.collect(Collectors.toMap(
Function.identity(),
book -> authorsById.get(book.authorId()), // Might be null!
(a, b) -> a, // Handle duplicate keys
HashMap::new // HashMap allows null values
));
}
3. Not Using Indexes
Batch loading is only fast if the database query is fast:
-- Make sure you have an index!
CREATE INDEX idx_authors_id ON authors(id);
CREATE INDEX idx_books_author_id ON books(author_id);
Advanced: Custom DataLoader Configuration
For fine-tuned control:
@Bean
public DataLoaderOptions dataLoaderOptions() {
return DataLoaderOptions.newOptions()
.setBatchingEnabled(true)
.setCachingEnabled(true)
.setMaxBatchSize(100) // Limit batch size
.setBatchLoaderScheduler(new BatchLoaderScheduler() {
@Override
public <K, V> void scheduleBatchLoader(
ScheduledBatchLoaderCall<V> call,
List<K> keys,
BatchLoaderEnvironment env) {
// Custom scheduling logic
CompletableFuture.runAsync(call::invoke);
}
});
}
Debugging Batch Loading
Add logging to see what's happening:
@BatchMapping
public Map<Book, Author> author(List<Book> books) {
log.debug("Batch loading authors for {} books", books.size());
log.debug("Author IDs: {}", books.stream()
.map(Book::authorId)
.distinct()
.toList());
// ... rest of implementation
}
Or enable SQL logging:
logging:
level:
org.hibernate.SQL: DEBUG
org.hibernate.type.descriptor.sql: TRACE
Summary
| Pattern | Use Case | Annotation |
|---|---|---|
| Single related object | Book.author | @BatchMapping returning Map<Book, Author> |
| List of related objects | Author.books | @BatchMapping returning Map<Author, List<Book>> |
| Manual control | Complex logic | BatchLoaderRegistry |
The N+1 problem is GraphQL's most common performance issue, but Spring GraphQL makes it easy to solve. Use @BatchMapping by default, and you'll have efficient queries without complex configuration.
Next: Testing Spring GraphQL - comprehensive strategies for unit and integration testing.