
When engineers talk about the Internet, we often imagine something simple:
- A request leaves our server
- It travels across the network
- It reaches the destination
- A response comes back
But in reality, your packet may cross several independent networks operated by different ISPs, transit carriers, and cloud providers, often in different countries.
And there is one protocol quietly deciding that journey.
That protocol is BGP.
A small incident that explains everything
Imagine this scenario.
You deploy a new version of your service to production.
Everything looks fine:
- Health checks are green
- CPU and memory are stable
- No errors in logs
But suddenly:
- Users in Europe cannot access your API
- Latency from Singapore jumps from 40 ms to 300 ms
- Some requests start timing out
You check:
- Application logs → nothing unusual
- Database → healthy
- Load balancer → normal
- DNS → correct
So what’s wrong?
After some digging, you discover:
Traffic to your cloud region is being routed through a completely different path across the Internet.
No code change caused it.
No infrastructure change caused it.
It was a BGP routing change somewhere on the Internet.
So what exactly is BGP?
BGP (Border Gateway Protocol) is the protocol that decides:
“Which path should data take across the Internet?”
The Internet is not one big network.
It is made of tens of thousands of independent networks.
Each of these is called an Autonomous System (AS).
The Internet as a group of networks
            +----------------+
            |   Cloudflare   |
            |    AS13335     |
            +--------+-------+
                     |
                     |
+-----------+   +----+----+   +-----------+
|   ISP A   +---+   BGP   +---+   ISP B   |
|  AS64501  |   | Peering |   |  AS64502  |
+-----+-----+   +----+----+   +-----+-----+
      |              |              |
      |              |              |
+-----+-----+   +----+----+   +-----+-----+
|  Company  |   | Google  |   |    AWS    |
|  AS65010  |   | AS15169 |   |  AS16509  |
+-----------+   +---------+   +-----------+
Each box is:
- An independent network
- With its own policies
- Its own infrastructure
- Its own business decisions
BGP is what connects them all.
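The AS numbers in the diagram are not all the same kind. As a rough sketch, the function below sorts an AS number into the ranges reserved for documentation (RFC 5398), private use (RFC 6996), and a few special values. The name `classify_asn` and the category labels are made up for this example, and "public" here only means "not in a reserved range"; it does not check whether the number is actually allocated to anyone.

```python
def classify_asn(asn: int) -> str:
    """Return a coarse category for a 16-bit or 32-bit AS number."""
    if not 0 <= asn <= 4294967295:
        raise ValueError(f"not a valid AS number: {asn}")
    # Reserved for documentation and examples (RFC 5398).
    if 64496 <= asn <= 64511 or 65536 <= asn <= 65551:
        return "documentation"
    # Reserved for private use inside an organization (RFC 6996).
    if 64512 <= asn <= 65534 or 4200000000 <= asn <= 4294967294:
        return "private"
    # Special values that are never assigned to a network.
    if asn in (0, 23456, 65535, 4294967295):
        return "reserved"
    # Coarse fallback: everything else is treated as a public AS number.
    return "public"


# The networks from the diagram above.
diagram = {
    "Cloudflare": 13335,
    "ISP A": 64501,
    "ISP B": 64502,
    "Company": 65010,
    "Google": 15169,
    "AWS": 16509,
}

for name, asn in diagram.items():
    print(f"{name:<10} AS{asn:<6} {classify_asn(asn)}")
```

In this sketch the cloud and CDN networks come out as public, while "ISP A", "ISP B", and "Company" use example or private numbers, which is what you would expect from an illustration.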
How traffic moves across multiple networks
When a user opens your website, the packet may travel like this:
   User
     |
     | 1. Request
     v
+---------+
| Home ISP|  AS64512
+----+----+
     |
     | BGP path decision
     v
+----+----+
| Transit |  AS3356
| Provider|
+----+----+
     |
     v
+----+----+
|   AWS   |  AS16509
| Region  |
+----+----+
     |
     v
 Your Server
Every hop between these networks is decided by BGP.
How networks announce routes
Each Autonomous System tells others what it can reach:
      AS100
 (Cloud Provider)
   10.0.0.0/8
       |
       |  "I can reach 10.0.0.0/8"
       v
     AS200
   (Transit)
       |
       |  "To reach 10.0.0.0/8,
       |   go through AS100"
       v
     AS300
     (ISP)
So when a user in AS300 needs to reach 10.1.2.3:
- It sends traffic to AS200
- AS200 sends it to AS100
- AS100 delivers it
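The announcement chain above can be sketched in a few lines of Python. This is a toy model, not how a real router is implemented: `propagate` and `lookup` are names invented for this example, and a real BGP speaker exchanges UPDATE messages carrying many more attributes. The sketch keeps only two ideas: each AS adds itself to the path before passing the route on, and a router picks the most specific prefix that contains the destination.

```python
import ipaddress


def propagate(prefix, origin_as, chain):
    """Simulate one prefix announcement travelling along a chain of ASes.

    Returns {asn: {network: as_path}}, where as_path lists the networks
    traffic must cross, nearest first.
    """
    tables = {}
    path = [origin_as]
    for asn in chain:
        # This AS learns the route with the path announced to it.
        tables[asn] = {ipaddress.ip_network(prefix): list(path)}
        # Before re-announcing, it prepends its own AS number.
        path = [asn] + path
    return tables


def lookup(table, address):
    """Return the AS path for the most specific prefix containing address."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in table if addr in net]
    if not matches:
        return None  # no route: the packet would be dropped
    best = max(matches, key=lambda net: net.prefixlen)
    return table[best]


tables = propagate("10.0.0.0/8", origin_as=100, chain=[200, 300])

print(lookup(tables[300], "10.1.2.3"))  # [200, 100]
print(lookup(tables[200], "10.1.2.3"))  # [100]
```

From AS300's point of view, the path `[200, 100]` reads exactly like the bullets above: hand the packet to AS200, which hands it to AS100.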
When the path is not what you expect
You may expect traffic to follow the shortest geographic path:
Vietnam → Singapore → AWS Singapore
But the path BGP actually selects may be:
Vietnam → Hong Kong → Japan → Singapore → AWS
Expected path:

[User VN] ---> [Singapore ISP] ---> [AWS Singapore]

Actual BGP path:

[User VN]
    |
    v
[ISP VN]
    |
    v
[Transit HK]
    |
    v
[Transit JP]
    |
    v
[AWS Singapore]
Why?
Because:
- Transit HK is cheaper
- Or there is a policy preference
- Or a route change happened
BGP cares about policy, not geography.
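To see why policy can beat the shorter route, here is a simplified sketch of the first two steps of BGP best-path selection: prefer the highest local preference, and only then the shortest AS path. The real decision process has more tie-breakers after these. The `Route` class, the `best_route` function, and the AS numbers for the ISPs are all hypothetical, chosen only to mirror the Vietnam example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    via: str               # neighbor the route was learned from
    as_path: tuple         # AS numbers to cross, nearest first
    local_pref: int = 100  # operator-assigned policy value; 100 is a common default


def best_route(routes):
    """Pick a route: highest local_pref wins, AS path length only breaks ties."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


# Hypothetical AS numbers from the documentation range, plus AWS (AS16509).
direct = Route(via="Singapore ISP", as_path=(64500, 16509))
detour = Route(via="Transit HK", as_path=(64510, 64511, 16509), local_pref=200)

# The operator prefers the Hong Kong transit (for example, because it is cheaper),
# so the longer path wins.
print(best_route([direct, detour]).via)  # Transit HK

# With equal preference, the shorter AS path wins instead.
neutral = Route(via="Transit HK", as_path=(64510, 64511, 16509))
print(best_route([direct, neutral]).via)  # Singapore ISP
```

Nothing in this comparison looks at distance or latency, which is why the selected path can be geographically odd.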
BGP in a hybrid cloud setup
This is a common DevOps scenario.
On-prem Data Center
      AS65010
  +-------------+
  | Core Router |
  +------+------+
         |
         | BGP session
         |
  +------+------+
  | Cloud Edge  |
  |   AS16509   |
  +------+------+
         |
         v
  +-------------+
  | VPC / VNet  |
  | Application |
  +-------------+
When the BGP session is up:
- Routes are exchanged
- On-prem can reach cloud subnets
- Cloud can reach on-prem networks
If BGP goes down:
- Routes disappear
- Connectivity breaks
- Your app may suddenly lose database access
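The effect of a dropped session can be modeled with a tiny routing table. This is a sketch under stated assumptions: `RoutingTable`, the peer names, and the address ranges are all invented for the example, and real routers also handle timers, graceful restart, and backup paths. The point is only that routes learned from a peer are withdrawn when the session to that peer goes down.

```python
import ipaddress


class RoutingTable:
    """Toy routing table that remembers which peer each route came from."""

    def __init__(self):
        self._routes = {}  # network -> peer name

    def learn(self, prefix, peer):
        self._routes[ipaddress.ip_network(prefix)] = peer

    def session_down(self, peer):
        # Every route learned from this peer is withdrawn.
        self._routes = {
            net: p for net, p in self._routes.items() if p != peer
        }

    def next_hop(self, address):
        addr = ipaddress.ip_address(address)
        matches = [net for net in self._routes if addr in net]
        if not matches:
            return None  # no route: traffic cannot be delivered
        return self._routes[max(matches, key=lambda net: net.prefixlen)]


onprem = RoutingTable()
onprem.learn("10.20.0.0/16", peer="cloud-edge")  # hypothetical VPC range
onprem.learn("192.168.0.0/16", peer="local")     # hypothetical on-prem range

print(onprem.next_hop("10.20.5.10"))   # cloud-edge
onprem.session_down("cloud-edge")
print(onprem.next_hop("10.20.5.10"))   # None
print(onprem.next_hop("192.168.1.5"))  # local
```

If `10.20.5.10` were your cloud database, the application on-prem would see connection timeouts at this point even though both sides are running normally.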
What a BGP problem looks like
Normal routing
User
 |
 v
ISP A
 |
 v
Transit
 |
 v
Your Cloud
After a route leak or misconfiguration
User
 |
 v
ISP A
 |
 v
Wrong Network
 |
 v
Black hole (traffic dropped)
From your monitoring, this appears as:
- Sudden regional outage
- Increased latency
- Timeouts from specific countries
Even though:
- Your servers are healthy
- Your deployments are fine
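That pattern (healthy servers, failures from only some regions) can be turned into a simple triage rule. The function below is a heuristic sketch, not a diagnosis: `triage`, the region names, and the wording of the results are assumptions for illustration, and the probe results would come from whatever external monitoring you already run.

```python
def triage(servers_healthy: bool, probes: dict) -> str:
    """Suggest where to look first.

    probes maps a region name to True if an external check from that
    region reached the service, False otherwise.
    """
    failing = sorted(region for region, ok in probes.items() if not ok)
    if not failing:
        return "ok"
    if not servers_healthy:
        # Internal checks are red, so start with the application itself.
        return "check the application first"
    if len(failing) < len(probes):
        # Healthy inside, unreachable from only some places:
        # the path across the Internet is a likely suspect.
        return "suspect routing: " + ", ".join(failing)
    return "unreachable from every region"


probes = {"eu-west": False, "singapore": False, "us-east": True}
print(triage(servers_healthy=True, probes=probes))
# suspect routing: eu-west, singapore
```

A result of "suspect routing" is only a hint to go and compare network paths from the affected regions; it does not prove that BGP is the cause.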
Key takeaway diagram
This is the simplest mental model of BGP:
Your App (Cloud)
      |
      v
Cloud Network (AS1)
      |
      v
Transit Network (AS2)
      |
      v
ISP (AS3)
      |
      v
User
BGP decides:
- Which AS to use
- In which order
- Based on policies and agreements
Practical takeaway for DevOps engineers
You don’t need to configure BGP daily.
But you should understand this chain:
User → ISP → Transit → Cloud → Your App
Because problems can happen at any link in that chain.
And often:
If the app is healthy but users in some regions cannot reach it, BGP is part of the story.