
Internet exchange renumbering: everything old is new again (posted 2014-10-15)

This week, the Amsterdam Internet Exchange is renumbering its peering LAN.

An internet exchange (IX) is simply a very big Ethernet. Members connect a router port to that Ethernet, and can then exchange packets with each other. When you want to exchange traffic with many other networks, obviously this is more efficient than setting up dedicated connections with all these other networks.

Until this week, AMS-IX used a /22 prefix, allowing for about a thousand connected routers. That was no longer enough, so they got a new /21 prefix, which can accommodate two thousand connected routers. This means that all the currently connected routers must get a new address. No big deal. This is why search-and-replace was invented.
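As a rough check on those numbers, here is a small Python sketch that counts the usable addresses in a /22 and a /21. The 192.0.0.0 prefixes are placeholders chosen for illustration, not the actual AMS-IX ranges, and `usable_hosts` is a made-up helper name.

```python
import ipaddress

def usable_hosts(prefix: str) -> int:
    """Addresses left for routers once the network and broadcast addresses are set aside."""
    net = ipaddress.ip_network(prefix)
    return net.num_addresses - 2

old_lan = usable_hosts("192.0.0.0/22")  # placeholder for the old peering LAN
new_lan = usable_hosts("192.0.0.0/21")  # placeholder for the new peering LAN
print(old_lan, new_lan)  # 1022 2046
```

Each bit removed from the prefix length doubles the address space, which is why going from /22 to /21 takes the LAN from roughly one thousand to roughly two thousand routers.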

However, sometimes someone makes a mistake. Like configuring <new address>/22 instead of <new address>/21. And then letting that /22 propagate to other networks over BGP. Suppose:

  • 192.0.0.0/21 is the prefix used on the IX peering LAN
  • 192.0.2.1 peers with 192.0.2.2
  • 192.0.2.3 advertises 192.0.0.0/22
  • 192.0.2.1 wants to send a BGP keepalive packet to 192.0.2.2. Normally this packet would go to the port where 192.0.0.0/21 is configured. But the 192.0.0.0/22 prefix is "more specific", so the packet is sent on its way to the source of the 192.0.0.0/22 prefix: router 192.0.2.3.
  • 192.0.2.3 forwards the packet to router 192.0.2.2
  • All's well that ends well? Not so fast. In BGP, there are not supposed to be any intermediate routers between two BGP routers. 192.0.2.2 sees the packet traversed an extra hop, rejects it and resets the BGP session.
  • 192.0.2.1 can no longer send traffic to 192.0.2.2

(A more specific prefix is a smaller range of IP addresses. 192.0.0.0/21 is BGP talk for the address range 192.0.0.0 - 192.0.7.255. 192.0.0.0/22 is the range 192.0.0.0 - 192.0.3.255. Because the latter identifies a smaller range of IP addresses, the packets are sent in that direction, just like you'd follow a sign "Paris" rather than a sign "France" if you were going to Paris, even though Paris is part of France so presumably following the sign "France" would also get you to Paris.)
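The "most specific prefix wins" rule can be sketched in a few lines of Python. This is a simplified model of a route lookup, not how any particular router implements it; the two-entry table and the `lookup` function are assumptions made up for the example, using the addresses from the scenario above.

```python
import ipaddress

# Hypothetical routing table on router 192.0.2.1: (prefix, where packets go).
ROUTES = [
    (ipaddress.ip_network("192.0.0.0/21"), "directly connected IX port"),
    (ipaddress.ip_network("192.0.0.0/22"), "via 192.0.2.3 (learned over BGP)"),
]

def lookup(destination: str, routes=ROUTES) -> str:
    """Return the next hop of the longest (most specific) matching prefix."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if addr in net]
    if not matches:
        raise LookupError(f"no route to {destination}")
    # The route with the largest prefix length covers the smallest range.
    _, best_hop = max(matches, key=lambda m: m[0].prefixlen)
    return best_hop

print(lookup("192.0.2.2"))  # via 192.0.2.3 (learned over BGP)
print(lookup("192.0.5.1"))  # directly connected IX port
```

Note that in this model only the lower half of the /21 is hijacked: 192.0.5.1 falls outside 192.0.0.0/22, so it still matches the connected /21 alone.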

The sad thing is that the exact same thing happened in 2003, when the AMS-IX renumbered from a /24 to a /23. I always warn against this issue during my training courses, and tell students to filter the IX prefixes of internet exchanges they're connected to, as well as all possible subprefixes (more specifics) that fall within that IX prefix. For instance:

!
ip prefix-list import deny 172.16.0.0/12 le 32
ip prefix-list import deny 80.249.208.0/21 le 32
ip prefix-list import permit 0.0.0.0/0 le 24
!

This prefix list will reject incoming updates with your own prefix and all possible more specifics (assuming your prefix is 172.16.0.0/12) as well as the AMS-IX prefix 80.249.208.0/21 and all possible subprefixes. It then allows all prefixes with a prefix length of no more than /24, which is common practice for IPv4.
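To see which updates this filter would accept, here is a minimal Python model of the prefix list. It assumes the usual prefix-list behavior: entries are checked in order, the first match decides, and anything that matches nothing is denied. The `PREFIX_LIST` table and the `evaluate` function are illustrative names, not part of any router software.

```python
import ipaddress

# (action, prefix, le) entries mirroring the prefix list above, in order.
PREFIX_LIST = [
    ("deny", "172.16.0.0/12", 32),
    ("deny", "80.249.208.0/21", 32),
    ("permit", "0.0.0.0/0", 24),
]

def evaluate(route: str, entries=PREFIX_LIST) -> str:
    """First matching entry wins; anything unmatched is denied."""
    net = ipaddress.ip_network(route)
    for action, prefix, le in entries:
        base = ipaddress.ip_network(prefix)
        # Match when the route falls inside the entry's prefix and its
        # length is between the entry's own length and the "le" bound.
        if net.subnet_of(base) and base.prefixlen <= net.prefixlen <= le:
            return action
    return "deny"  # implicit deny at the end of the list

print(evaluate("80.249.208.0/22"))   # deny: more specific of the IX prefix
print(evaluate("198.51.100.0/24"))   # permit: ordinary /24
print(evaluate("198.51.100.0/25"))   # deny: longer than /24
```

The first line of output is the case that matters here: a /22 carved out of the IX prefix never makes it into the routing table.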

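"le" means "less than or equal to" and applies to the prefix length, so "172.16.0.0/12 le 32" matches 172.16.0.0/12 itself plus every more specific prefix inside it, up to and including /32: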

172.16.0.0/12
172.16.0.0/13
172.24.0.0/13
...
172.31.255.254/31
172.31.255.254/32
172.31.255.255/32
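For a sense of how much that one line covers, the following Python sketch counts the prefixes matched by "172.16.0.0/12 le 32" and prints the two /13 halves from the listing. It is only an illustration using the standard `ipaddress` module; the variable names are arbitrary.

```python
import ipaddress

block = ipaddress.ip_network("172.16.0.0/12")

# One prefix of length 12, two of length 13, four of length 14, and so on to /32.
total = sum(2 ** (length - block.prefixlen) for length in range(block.prefixlen, 33))
print(total)  # 2097151

# The two /13 halves of the block.
halves = [str(net) for net in block.subnets(new_prefix=13)]
print(halves)  # ['172.16.0.0/13', '172.24.0.0/13']

# The last address in the block, which is also the last /32.
print(block.broadcast_address)  # 172.31.255.255
```

So a single "le 32" entry stands in for a little over two million individual prefixes.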

Hopefully, by the time AMS-IX connects more than 2000 routers, the issue is moot because we no longer use IPv4. But for now: happy renumbering!