
In many architecture discussions, you’ll hear something like:
“Let’s split this into microservices. It’ll be more scalable.”
So the monolith becomes:
- Auth service
- User service
- Order service
- Payment service
- Notification service
Everything looks clean and modular.
But then, a few months later:
- Latency increases
- CPU usage spikes
- Retry storms appear
- The system becomes fragile
What changed?
Nothing… except the number of network calls.
The dangerous assumption
Many engineers unconsciously assume:
A network call is almost free.
It looks simple in code:
var user = await userService.GetUser(id);
One line of code.
But under the hood, this single call may involve:
- Kernel context switches
- DNS lookup
- TCP handshake
- TLS handshake
- Packet retransmissions
- Congestion control
What looks like one function call is actually a complex multi-step protocol dance.
Local call vs network call
Local function call
CPU register → memory → function → return

Time: nanoseconds to microseconds
Network call
App → Kernel → NIC → Switch → Router → Internet → Remote kernel → Remote app → Response back the same way

Time: milliseconds to hundreds of milliseconds
Difference:
- Local call: ~0.1–1 µs
- Network call: 1–100+ ms
That’s roughly 1,000 to 1,000,000 times slower.
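The gap is easy to measure yourself. Here is a minimal Python sketch that times a local function call against a single request/response round trip over the loopback interface. The exact numbers depend on your machine, and loopback skips most of the real cost (no NIC, no routers, no congestion), yet the network path is still orders of magnitude slower:

```python
import socket
import threading
import time

def local_add(a, b):
    return a + b

# Time a local function call, averaged over many iterations.
n = 100_000
start = time.perf_counter()
for _ in range(n):
    local_add(1, 2)
local_ns = (time.perf_counter() - start) / n * 1e9

# Start a trivial echo server on the loopback interface.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

def serve():
    conn, _ = server.accept()
    conn.sendall(conn.recv(16))
    conn.close()

threading.Thread(target=serve, daemon=True).start()

# Time one request/response round trip on an established connection.
client = socket.create_connection(("127.0.0.1", port))
start = time.perf_counter()
client.sendall(b"ping")
client.recv(16)
net_ns = (time.perf_counter() - start) * 1e9
client.close()

print(f"local call : {local_ns:,.0f} ns")
print(f"loopback RT: {net_ns:,.0f} ns")
```

Even this best-case network call is typically hundreds of times slower than the function call.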
What actually happens during one network call
Let’s say:
Service A → calls → Service B
Here’s what may happen internally.
Step 1: Kernel context switch
Your app runs in user space.
When it sends data:
- The app makes a system call.
- The CPU switches from user mode to kernel mode.
- The kernel handles the network stack.
Diagram:
User space (your app)
        |
        |  system call
        v
Kernel space (network stack)
        |
        v
Network card
Each context switch:
- Costs CPU cycles
- Flushes CPU caches
- Adds latency
And it happens:
- On send
- On receive
- Often multiple times per request
Step 2: DNS lookup (if not cached)
Before connecting to a service:
user-service.internal
The system must resolve it to an IP:
10.2.4.15
Flow:
App
 |
 v
OS resolver
 |
 v
DNS server
 |
 v
IP returned
Typical latency:
- Cached DNS: ~0.1–1 ms
- Uncached DNS: 5–50+ ms
In microservices with short-lived connections, DNS cost can become significant.
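You can observe the resolver step from Python's standard library. A sketch (it resolves `localhost` so it runs anywhere; swap in your own service hostname to measure a real lookup):

```python
import socket
import time

start = time.perf_counter()
# getaddrinfo follows the same resolution path the OS uses:
# local cache / hosts file first, then the configured DNS server.
results = socket.getaddrinfo("localhost", 443, proto=socket.IPPROTO_TCP)
elapsed_ms = (time.perf_counter() - start) * 1000

for family, _, _, _, sockaddr in results:
    print(family.name, sockaddr[0])
print(f"resolution took {elapsed_ms:.2f} ms")
```

Run it twice against an uncached name and the second call is usually dramatically faster, which is exactly the cached-vs-uncached gap described above.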
Step 3: TCP handshake
Before sending data, TCP must establish a connection.
The three-way handshake:
Client                    Server
  |  ------ SYN ------>     |
  |  <--- SYN-ACK -----     |
  |  ------ ACK ------>     |
This requires:
- At least 1 round-trip time (RTT)
If RTT = 20 ms:
- TCP handshake ≈ 20 ms
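The handshake cost can be isolated by timing just the connect. A sketch against a throwaway local listener (on loopback this is near zero; over a WAN link this is where the full RTT shows up):

```python
import socket
import time

# A listener that never even accepts: the kernel completes the
# three-way handshake on its own, so connect() still succeeds.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
addr = listener.getsockname()

start = time.perf_counter()
conn = socket.create_connection(addr)  # blocks until SYN / SYN-ACK / ACK complete
handshake_ms = (time.perf_counter() - start) * 1000
conn.close()
listener.close()

print(f"TCP connect took {handshake_ms:.3f} ms")
```

Point the same code at a remote host and the connect time converges on one round-trip time, just as the diagram predicts.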
Step 4: TLS handshake (for HTTPS/gRPC)
If the connection is encrypted:
Client                     Server
  |  --- ClientHello -->     |
  |  <-- ServerHello ---     |
  |  <--- Certificate ---    |
  |  --- Key exchange -->    |
This may cost:
- 1–2 extra RTTs
So:
If RTT = 20 ms:
- TCP handshake: ~20 ms
- TLS handshake: ~20–40 ms
- Total before first byte: 40–60 ms
And this is before your API even starts processing.
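The arithmetic above generalizes: time-to-first-byte before any application work is roughly one RTT for the TCP handshake plus one or two RTTs for TLS (TLS 1.3 needs one round trip; TLS 1.2 typically needs two). A sketch of that budget:

```python
def time_to_first_byte_ms(rtt_ms, tls_round_trips=2):
    """Rough pre-request cost: 1 RTT for the TCP handshake plus
    1-2 RTTs for the TLS handshake (pass 0 for plain HTTP)."""
    tcp = rtt_ms
    tls = rtt_ms * tls_round_trips
    return tcp + tls

print(time_to_first_byte_ms(20, tls_round_trips=1))  # 40  (TLS 1.3)
print(time_to_first_byte_ms(20, tls_round_trips=2))  # 60  (TLS 1.2)
```

This is why connection reuse (discussed below) matters so much: it pays this cost once instead of on every request.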
Step 5: Actual data transfer
Now the request is finally sent.
But TCP doesn’t just blast data at full speed.
It uses congestion control.
Congestion control: why TCP starts slow
TCP uses a mechanism called slow start.
At the beginning:
- It sends a small amount of data.
- Waits for acknowledgments.
- Gradually increases the sending rate.
Diagram:
Time →
Packets sent: 1 → 2 → 4 → 8 → 16 → ...
This prevents:
- Network congestion
- Packet loss
- Router overload
But it also means:
Small, frequent requests never reach full speed.
This is exactly what happens in microservice architectures.
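The doubling pattern above means small transfers finish before the window ever grows. A sketch counting round trips under idealized slow start (no loss, window doubling every RTT; real TCP stacks start with a larger initial window, often 10 segments):

```python
def rtts_to_send(total_packets, initial_window=1):
    """Round trips needed to deliver `total_packets` when the
    congestion window starts at `initial_window` and doubles
    every RTT (idealized slow start, no packet loss)."""
    window, sent, rtts = initial_window, 0, 0
    while sent < total_packets:
        sent += window
        window *= 2
        rtts += 1
    return rtts

print(rtts_to_send(4))    # 3 RTTs: windows 1, 2, 4
print(rtts_to_send(100))  # 7 RTTs: 1+2+4+8+16+32+64 >= 100
```

A fresh connection pays several RTTs just to ramp up, and a short request tears the connection down before that ramp-up ever pays off.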
What happens when retries are added
Now add this common pattern:
If request fails → retry 3 times
In a microservice chain:
API Gateway
     |
     v
Service A
     |
     v
Service B
     |
     v
Service C
If each service retries 3 times:
Service A → 3 attempts
Service B → 3 attempts
Service C → 3 attempts
Worst-case amplification:
3 × 3 × 3 = 27 calls
One failing request becomes 27 network calls.
This is called a retry storm.
And it causes:
- CPU spikes
- Network congestion
- Cascading failures
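The multiplication is simple but brutal. A sketch of worst-case call amplification through a chain where every hop retries independently:

```python
def worst_case_calls(retries_per_hop, depth):
    """Worst-case network calls triggered by one user request when
    each hop in a chain of `depth` services makes `retries_per_hop`
    attempts, and every attempt fans out to the next hop."""
    return retries_per_hop ** depth

print(worst_case_calls(3, 3))  # 27  -- one request becomes 27 calls
print(worst_case_calls(3, 5))  # 243 -- add two more hops and it's 243
```

The exponent is the chain depth, which is why deep service chains plus naive retries fail so spectacularly under load.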
End-to-end latency of a single call
Let’s estimate a simple HTTPS call.
Assume:
- RTT: 20 ms
- DNS: 5 ms
- App processing: 10 ms
Breakdown:
DNS lookup:         5 ms
TCP handshake:     20 ms
TLS handshake:     40 ms
Request/response:  20 ms
App processing:    10 ms
-------------------------
Total:             95 ms
Almost 100 ms for one call.
Now imagine:
API → Service A → Service B → Service C
Four sequential calls:
100 ms × 4 = 400 ms
And that’s under ideal conditions.
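The budget above can be expressed as a tiny model, using the same assumed numbers (RTT 20 ms, DNS 5 ms, processing 10 ms, 2-RTT TLS):

```python
def call_latency_ms(rtt=20, dns=5, processing=10, tls_round_trips=2):
    """One cold HTTPS call: DNS + TCP handshake (1 RTT)
    + TLS handshake (tls_round_trips RTTs)
    + request/response (1 RTT) + server processing."""
    return dns + rtt + rtt * tls_round_trips + rtt + processing

single = call_latency_ms()     # 5 + 20 + 40 + 20 + 10 = 95 ms
chain = 4 * call_latency_ms()  # four sequential hops = 380 ms
print(single, chain)
```

Plug in your own RTT and the model makes the architectural point concrete: chain depth multiplies every hidden cost.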
Why this matters in microservice architecture
In a monolith:
Function A → Function B → Function C
All in memory.
Latency:
- Microseconds
In microservices:
Service A → network → Service B → network → Service C
Latency:
- Hundreds of milliseconds
So microservices introduce:
- More network calls
- More handshakes
- More context switches
- More congestion risks
A simple mental model
Monolith
[App]
  |  function call
  v
[Module]
Microservices
[Service A]
     |
     |  DNS
     |  TCP
     |  TLS
     |  kernel switches
     v
[Service B]
One arrow = many hidden costs.
Practical lessons for DevOps and architects
1. Don’t over-split services
Not every module needs to be a microservice.
If two components:
- Always call each other
- Share the same lifecycle
- Have tight coupling
They may belong in the same service.
2. Reuse connections
Avoid:
- Opening a new TCP/TLS connection per request
Use:
- Connection pooling
- HTTP/2 or gRPC
This removes:
- Handshake cost
- DNS overhead
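With Python's standard library you can see reuse directly: `http.client.HTTPConnection` keeps one TCP connection open across requests. A sketch against a throwaway local server (a real service would add TLS and a proper connection pool):

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep-alive, so the connection survives

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# One TCP handshake (and, over HTTPS, one TLS handshake), then all
# three requests reuse the same connection: no per-request setup cost.
conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
statuses = []
for _ in range(3):
    conn.request("GET", "/")
    resp = conn.getresponse()
    statuses.append(resp.status)
    resp.read()  # drain the body so the connection can be reused
conn.close()
server.shutdown()

print(statuses)  # [200, 200, 200]
```

HTTP/2 and gRPC go further by multiplexing many concurrent requests over that single long-lived connection.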
3. Be careful with retries
Retries are useful, but dangerous.
Use:
- Exponential backoff
- Jitter
- Circuit breakers
Avoid:
- Immediate aggressive retries
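A sketch of the safer retry pattern, exponential backoff with "full jitter" (the `base` and `cap` values here are illustrative, not recommendations):

```python
import random

def backoff_delays(attempts, base=0.1, cap=10.0):
    """Full-jitter exponential backoff: each delay is drawn uniformly
    from [0, min(cap, base * 2**attempt)]. The randomness spreads
    retries out instead of synchronizing clients into a storm."""
    return [random.uniform(0, min(cap, base * 2 ** a)) for a in range(attempts)]

for attempt, delay in enumerate(backoff_delays(5)):
    print(f"attempt {attempt}: sleep {delay:.3f}s before retrying")
    # in real code: time.sleep(delay), then retry the call
```

Combine this with a circuit breaker so that after repeated failures the caller stops sending traffic entirely rather than backing off forever.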
4. Measure network latency, not just CPU
Track:
- p95 and p99 latency
- Network RTT
- Connection counts
- Retry rates
The key takeaway
A network call is not just:
serviceB.DoSomething();
It is:
DNS → TCP → TLS → kernel switches → congestion control → retries
Microservices turn:
- Cheap function calls
- Into expensive network operations
Understanding these costs helps you:
- Design better architectures
- Avoid latency explosions
- Prevent cascading failures