Most beginners understand “threads”, but they struggle to visualize how multithreading works in Spring Boot.
This article goes deeper into the why and the how: internals, threading concepts, performance behavior, and production considerations.
In a typical Spring Boot application, each incoming HTTP request is handled by a Tomcat worker thread.
This thread
Everything happens inside one thread unless you explicitly decide to go async.
This becomes a problem when your request needs to perform slow operations, such as
The Tomcat thread is blocked → slow API → low throughput.
Doing them one-by-one makes your API slow.
Task A → Task B → Task C
Total time = A + B + C
But many of these tasks can run in parallel.
Task A Task B Task C (run at the same time)
Your API needs to gather user information
If you do this sequentially
2 + 3 + 4 = 9 seconds
Users will assume your API is broken.
But notice these calls have no dependency on each other.
So, they can run in parallel
Run all 3 calls together → total time = 4 sec (longest task)
This is exactly what ExecutorService + CompletableFuture helps you achieve.
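The difference between the two timings can be checked with a small stand-alone sketch. It is scaled down to milliseconds so it runs quickly, and `slowCall` with the names `callA`, `callB`, and `callC` are hypothetical stand-ins for the three independent remote calls; a real service would perform HTTP or database work instead of sleeping.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ParallelTimingDemo {

    // Hypothetical stand-in for a slow remote call; it only sleeps.
    static String slowCall(String name, long millis) {
        try {
            TimeUnit.MILLISECONDS.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return name;
    }

    // Runs the three calls one after another on the calling thread.
    static long sequentialMillis() {
        long start = System.nanoTime();
        slowCall("callA", 200);
        slowCall("callB", 300);
        slowCall("callC", 400);
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Runs the same three calls at the same time on a small thread pool.
    static long parallelMillis() {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            long start = System.nanoTime();
            CompletableFuture<String> a = CompletableFuture.supplyAsync(() -> slowCall("callA", 200), pool);
            CompletableFuture<String> b = CompletableFuture.supplyAsync(() -> slowCall("callB", 300), pool);
            CompletableFuture<String> c = CompletableFuture.supplyAsync(() -> slowCall("callC", 400), pool);
            CompletableFuture.allOf(a, b, c).join(); // wait for the slowest call
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("sequential ms: " + sequentialMillis());
        System.out.println("parallel ms:   " + parallelMillis());
    }
}
```

The sequential run should take roughly the sum of the three delays, while the parallel run should take roughly as long as the slowest one. Exact numbers vary by machine.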
Think of it like a worker team.
A Future on steroids
    [API Call]
        |
        |--> Task 1 (3 sec)
        |--> Task 2 (2 sec)
        |--> Task 3 (5 sec)
    Total = 10 seconds
    [API Call]
        |
        |--> Task 1 (3 sec)
        |--> Task 2 (2 sec)
        |--> Task 3 (5 sec)
    All run at the same time
    Total = 5 seconds (longest task)
Let’s break this down in extremely simple terms.
A Tomcat thread (say Thread #27) picks it up.
ExecutorService is a thread pool.
Think of it like
“Here are 5 workers (threads). They will do tasks for you.”
You submit tasks:
    executor.submit(taskA);
    executor.submit(taskB);
    executor.submit(taskC);
Now 3 worker threads run tasks in parallel.
The Tomcat thread is free to continue with other work until it needs the results.
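A minimal sketch of this hand-off, outside Spring, might look like the following. The task used here simply reports the name of the thread that ran it, which is an assumption made for illustration; `SubmitDemo` and `runOnPool` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SubmitDemo {

    // Submits three tasks and returns the name of the thread that ran each one.
    static List<String> runOnPool() {
        ExecutorService executor = Executors.newFixedThreadPool(5);
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < 3; i++) {
                futures.add(executor.submit(task)); // returns immediately
            }
            List<String> names = new ArrayList<>();
            for (Future<String> future : futures) {
                names.add(future.get()); // blocks until that task has finished
            }
            return names;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("caller:  " + Thread.currentThread().getName());
        System.out.println("workers: " + runOnPool());
    }
}
```

The worker names come from the pool's default thread factory (`pool-N-thread-M`), so none of them matches the calling thread. Note that `submit` returns at once, while `Future.get()` is the point where the caller waits.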
CompletableFuture is like a promise
So,
CompletableFuture<String> orders = service.fetchOrders();
…means
“Start the orders task now and return a future immediately, without waiting for the result.”
This is a synchronization point
CompletableFuture.allOf(orders, payments, shipment).join();
This says
“Combine results only when ALL futures have completed.”
By the time the Tomcat thread gathers the results, all tasks are already done.
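One detail that is easy to miss: `allOf` returns a `CompletableFuture<Void>`, so the values still have to be read from the original futures. The sketch below shows one way to do that with `thenApply`. The class name `AllOfDemo` is hypothetical and the tasks return fixed strings instead of doing real work.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AllOfDemo {

    // Starts three tasks, then builds one combined future from them.
    static CompletableFuture<String> aggregate(ExecutorService executor) {
        CompletableFuture<String> orders = CompletableFuture.supplyAsync(() -> "OrdersLoaded", executor);
        CompletableFuture<String> payments = CompletableFuture.supplyAsync(() -> "PaymentsLoaded", executor);
        CompletableFuture<String> shipment = CompletableFuture.supplyAsync(() -> "ShipmentLoaded", executor);

        // allOf carries no values, so the results are read from the original
        // futures. join() does not wait here because thenApply only runs
        // after all three futures have completed.
        return CompletableFuture.allOf(orders, payments, shipment)
                .thenApply(ignored -> orders.join() + " | " + payments.join() + " | " + shipment.join());
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        try {
            System.out.println(aggregate(executor).join());
        } finally {
            executor.shutdown();
        }
    }
}
```

Returning the combined future, rather than joining inside the method, leaves the decision about where to wait to the caller.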
Result →
| Concept | Meaning | Analogy |
|----|----|----|
| Thread | Lowest unit of execution | One worker |
| ExecutorService | A pool of reusable threads | A team of workers |
| CompletableFuture | Async task handler, easy API | A promise that work will finish |
Because manual threads cause:
ExecutorService manages threads properly:
CompletableFuture adds additional magic:
Together → powerful and clean async code.
    @Configuration
    public class AsyncConfig {

        @Bean
        public ExecutorService executorService() {
            return Executors.newFixedThreadPool(5);
        }
    }
Meaning
This is crucial for performance.
    return CompletableFuture.supplyAsync(() -> {
        sleep(3000);
        return "Result A";
    }, executor);
Breakdown
This ensures your tasks do not run on the main request thread.
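To see which thread actually runs the task, it can help to give the pool threads recognizable names. The sketch below is an illustration rather than part of the application: the `agg-worker-` prefix, `namedPool`, and `workerThreadName` are hypothetical names chosen for this example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class ExecutorThreadDemo {

    // Builds a fixed pool whose threads are named agg-worker-1, agg-worker-2, ...
    static ExecutorService namedPool(int size) {
        AtomicInteger counter = new AtomicInteger(1);
        ThreadFactory factory = runnable -> {
            Thread thread = new Thread(runnable, "agg-worker-" + counter.getAndIncrement());
            thread.setDaemon(true);
            return thread;
        };
        return Executors.newFixedThreadPool(size, factory);
    }

    // Runs one task through supplyAsync and reports which thread executed it.
    static String workerThreadName() {
        ExecutorService executor = namedPool(5);
        try {
            return CompletableFuture
                    .supplyAsync(() -> Thread.currentThread().getName(), executor)
                    .join();
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("caller: " + Thread.currentThread().getName());
        System.out.println("task:   " + workerThreadName());
    }
}
```

Because the executor is passed as the second argument, the supplier runs on one of the pool's threads. Named threads also make thread dumps and log lines easier to read in production.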
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.TimeUnit;

    import org.springframework.stereotype.Service;

    @Service
    public class AggregationService {

        private final ExecutorService executor;

        public AggregationService(ExecutorService executor) {
            this.executor = executor;
        }

        // Simulate a remote call or IO-bound work
        public CompletableFuture<String> fetchOrders() {
            return CompletableFuture.supplyAsync(() -> {
                sleep(300);
                return "OrdersLoaded";
            }, executor);
        }

        public CompletableFuture<String> fetchPayments() {
            return CompletableFuture.supplyAsync(() -> {
                sleep(250);
                return "PaymentsLoaded";
            }, executor);
        }

        public CompletableFuture<String> fetchShipment() {
            return CompletableFuture.supplyAsync(() -> {
                sleep(500);
                return "ShipmentLoaded";
            }, executor);
        }

        // Sleep helper that restores the interrupt flag if interrupted
        private void sleep(long ms) {
            try {
                TimeUnit.MILLISECONDS.sleep(ms);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
    import java.time.Duration;
    import java.time.Instant;
    import java.util.concurrent.CompletableFuture;

    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    @RequestMapping("/api")
    public class AggregationController {

        private final AggregationService service;

        public AggregationController(AggregationService service) {
            this.service = service;
        }

        // Endpoint using CompletableFuture + custom ExecutorService
        @GetMapping("/aggregate")
        public String aggregate() {
            Instant start = Instant.now();

            // All three tasks start immediately on the executor's threads
            CompletableFuture<String> orders = service.fetchOrders();
            CompletableFuture<String> payments = service.fetchPayments();
            CompletableFuture<String> shipment = service.fetchShipment();

            // Wait for all to complete
            CompletableFuture.allOf(orders, payments, shipment).join();

            String result = orders.join() + " | " + payments.join() + " | " + shipment.join();

            Instant end = Instant.now();
            long elapsedMs = Duration.between(start, end).toMillis();
            return String.format("result=%s; elapsedMs=%d", result, elapsedMs);
        }
    }
Meaning
“Wait until all async tasks finish.”
Then collect results
String result = orders.join() + " | " + payments.join() + " | " + shipment.join();
This is done only when all tasks complete.
Orders = 3 sec
Payments = 2 sec
Shipment = 5 sec
All run simultaneously.
Total time = 5 seconds (longest task)
Without parallelism → 3 + 2 + 5 = 10 seconds
With parallelism → only 5 seconds
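| Scenario | Execution Time |
|----|----|
| Sequential Processing | 10 sec |
| Parallel Processing (3 tasks) | 5 sec |
| Parallel + non-blocking I/O | about 5 sec, with fewer blocked threads |

For this example, that is a 50% reduction in response time. The response can never be faster than the slowest call, so the gain grows with the number of independent calls.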
Here are real use cases where multi-threading is used in enterprise applications:
User Profile API → 2 sec
Orders API → 3 sec
Payments API → 1 sec
Running them in parallel makes the response time 3 seconds instead of 6.
Spark-like parallel job in Spring Boot:
ExecutorService is ideal here.
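As a rough illustration of the batch case, the sketch below splits a list into chunks and processes every chunk on a pool with `invokeAll`. Summing integers is a placeholder for real per-record work, and `BatchDemo`, `parallelSum`, and the pool size of 4 are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BatchDemo {

    // Splits the items into chunks, sums each chunk on the pool, then adds the partial sums.
    static long parallelSum(List<Integer> items, int chunkSize) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<Long>> chunks = new ArrayList<>();
            for (int from = 0; from < items.size(); from += chunkSize) {
                List<Integer> chunk = items.subList(from, Math.min(from + chunkSize, items.size()));
                chunks.add(() -> chunk.stream().mapToLong(Integer::longValue).sum());
            }
            long total = 0;
            // invokeAll blocks until every chunk has been processed
            for (Future<Long> partial : pool.invokeAll(chunks)) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            items.add(i);
        }
        System.out.println(parallelSum(items, 100)); // prints 500500
    }
}
```

The chunk size controls how much work each task carries. Very small chunks spend more time on scheduling than on the work itself.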
A PDF report may contain:
Each section can be calculated in parallel.
Extract:
These can run independently → perfect for threads.
Your system triggers:
All can run asynchronously.
When using multi-threading
Spring beans are singletons, so ensure they don’t store per-request state.
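When a shared bean truly needs state, such as a counter, that state has to be safe for concurrent updates. The sketch below uses an `AtomicInteger` for this. `RequestCounter` is a hypothetical stand-in for a singleton bean, written as a plain class so the example runs without Spring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a singleton bean shared by every request thread.
class RequestCounter {

    // The only shared state is one atomic field, so concurrent updates are safe.
    private final AtomicInteger handled = new AtomicInteger();

    void record() {
        handled.incrementAndGet();
    }

    int total() {
        return handled.get();
    }
}

class ThreadSafetyDemo {

    // Calls record() from many pool threads and returns the final count.
    static int countWithPool(int tasks) {
        RequestCounter counter = new RequestCounter();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(CompletableFuture.runAsync(counter::record, pool));
            }
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            return counter.total();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(countWithPool(10_000)); // prints 10000
    }
}
```

With a plain `int` field and `handled++` instead, some increments could be lost under contention, because the read, add, and write steps are not performed as one atomic operation.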
Thread pool size depends on workload:
CPU-bound tasks: threads = number of CPU cores + 1
I/O-bound tasks: threads = 2 × cores or even higher
Always benchmark thread pool sizes.
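These rules of thumb can be turned into starting values at runtime. The sketch below is only a starting point for benchmarking, not a recommendation; `PoolSizing` and its method names are hypothetical.

```java
class PoolSizing {

    // Rule of thumb from the text for CPU-heavy work.
    static int cpuBoundThreads(int cores) {
        return cores + 1;
    }

    // Rule of thumb from the text for work that mostly waits on I/O (lower bound).
    static int ioBoundThreads(int cores) {
        return 2 * cores;
    }

    public static void main(String[] args) {
        // Number of cores visible to the JVM; containers may report fewer than the host has.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores:           " + cores);
        System.out.println("CPU-bound start: " + cpuBoundThreads(cores));
        System.out.println("I/O-bound start: " + ioBoundThreads(cores));
    }
}
```

On an 8-core machine this gives 9 threads for CPU-bound work and 16 as the initial value for I/O-bound work.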
Massive performance improvement → Parallelism reduces wasted time.
Non-blocking architecture → Return the future instead of calling join() and the server can handle more requests.
Clear async syntax → Very readable.
Built-in error handling → Computation doesn’t silently fail.
Thread pooling for efficient usage → No thread explosion.
Works with Microservice Aggregation pattern → Modern microservices use this everywhere.
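The error-handling point can be illustrated with `exceptionally`, which replaces a failed result with a fallback value. The failure below is simulated with a flag, and `ErrorHandlingDemo`, `fetchWithFallback`, and the fallback string are hypothetical names for this sketch.

```java
import java.util.concurrent.CompletableFuture;

class ErrorHandlingDemo {

    // Returns the task result, or a fallback value if the task threw an exception.
    static String fetchWithFallback(boolean fail) {
        return CompletableFuture
                .supplyAsync(() -> {
                    if (fail) {
                        // Simulated failure of a downstream call
                        throw new IllegalStateException("payments service down");
                    }
                    return "PaymentsLoaded";
                })
                // Runs only when the stage above completed exceptionally
                .exceptionally(ex -> "PaymentsUnavailable")
                .join();
    }

    public static void main(String[] args) {
        System.out.println(fetchWithFallback(false)); // prints PaymentsLoaded
        System.out.println(fetchWithFallback(true));  // prints PaymentsUnavailable
    }
}
```

Without `exceptionally` (or `handle`), the exception would surface at `join()` wrapped in a `CompletionException`, so the failure is still visible to the caller.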


