If you could use some help with BGP, have a look at my business web site: inet6consult.com.
BGP routing courses
There are currently no training courses planned.
Interdomain Routing & IPv6 News
For my training courses, I always check the current size of the IPv4 and IPv6 BGP tables over at the CIDR Report so I can tell the participants what table size capacity to look for when shopping for routers.
Currently, the IPv4 table is at 925k, readying itself for scaling the 1M summit late next year. The IPv6 table is 160k prefixes.
The IPv4 table grew at about 10% per year in the 2010s and 6% last year. At this rate, it'll be at 1.43 million at the beginning of 2030.
The IPv6 table, on the other hand, had been growing at some 31% per year between 2015 and 2020, but last year it grew 37%. At that rate, the IPv6 table will reach 1.7 million prefixes by 2030! Even at a somewhat slower growth rate of 34% the IPv6 table will overtake the IPv4 table before the decade is out.
Of course it's hard to predict 7.5 years into the future, but stranger things have happened.
Also, at this rate, you'll need a router that can handle more than 2 million prefixes five years from now. That pretty much means that if you are buying a router today that has to be able to hold the full global IPv4 and IPv6 tables, it should already be able to handle more than 2M prefixes in order to have a five-year economic lifespan.
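To check this arithmetic yourself, here is a minimal sketch of the compound growth calculation. It assumes growth stays constant at the quoted yearly rates, and that "the beginning of 2030" is 7.5 years away, as stated above. The function and variable names are mine, purely for illustration.

```python
def project(prefixes, annual_growth, years):
    """Compound growth: prefixes * (1 + rate) ** years."""
    return prefixes * (1 + annual_growth) ** years

# Table sizes quoted in the text (assumed starting point)
IPV4_NOW = 925_000
IPV6_NOW = 160_000

# 7.5 years ahead, i.e. the beginning of 2030
ipv4_2030 = project(IPV4_NOW, 0.06, 7.5)
ipv6_2030 = project(IPV6_NOW, 0.37, 7.5)
ipv6_2030_slow = project(IPV6_NOW, 0.34, 7.5)

# Five years ahead: both tables together in one router
combined_5y = project(IPV4_NOW, 0.06, 5) + project(IPV6_NOW, 0.37, 5)

for label, value in [
    ("IPv4 in 2030", ipv4_2030),
    ("IPv6 in 2030 at 37%", ipv6_2030),
    ("IPv6 in 2030 at 34%", ipv6_2030_slow),
    ("IPv4 + IPv6 in five years", combined_5y),
]:
    print(f"{label}: {value / 1e6:.2f} million prefixes")
```

Changing the growth rates by a few points moves the 2030 numbers a lot, which is the whole point: treat the output as a rough guide, not a forecast.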
Recently, I was looking through some networking certification material. A very large part of it was about OSPF. That's fair: OSPF is probably the most widely used routing protocol in IP networks. But the poor students were subjected to a relentless sequence of increasingly baroquely named features: stub areas, not-so-stubby areas, totally stubby areas, culminating in totally not-so-stubby areas.
Can we please get rid of some of that legacy? And if not from the standard documents or the router implementations, then at least from the certification requirements and training materials?
Shortest path first, but not so fast
The Open Shortest Path First routing protocol (OSPF, Internet Standard 54) was first defined in RFC 1131 in 1989. So in internet time, OSPF is truly ancient. The base OSPFv2 specification is over 200 pages, with additional extensions in separate documents spanning the early 1990s to the late 2010s.
OSPF is powered by Edsger Dijkstra's shortest path first algorithm. SPF is a relatively efficient algorithm for finding the shortest path between two places, in the real world or in a network. Still, in a large network there are a lot of paths to check until you can be sure you've found the shortest one. The problem here is that for a network that's 10 times larger, SPF needs 60 times as long to run. So if a router in a network with, say, 100 routers, needs a second to do its SPF calculations after an update, in a network with 1000 routers that takes a minute, and in a network with 10,000 routers an hour.
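For readers who have never seen it spelled out, here is a minimal sketch of Dijkstra's algorithm using a priority queue. This is a textbook version, not what any particular router implementation does. The four-router topology and its link costs are made up for the example.

```python
import heapq

def shortest_paths(graph, source):
    """Return the lowest total cost from source to every reachable node.

    graph maps each node to a dict of {neighbor: link_cost}.
    """
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, link_cost in graph.get(node, {}).items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return dist

# Hypothetical topology: the direct A-B link is expensive,
# so the cheapest path from A to B runs via C and D.
topology = {
    "A": {"B": 10, "C": 1},
    "B": {"A": 10, "D": 1},
    "C": {"A": 1, "D": 5},
    "D": {"B": 1, "C": 5},
}
costs = shortest_paths(topology, "A")
print(costs)
```

Every router runs this over the whole link state database with itself as the source, which is why the size of that database matters.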
So in order to make OSPF useful in large networks, you can split your network into different areas. The SPF calculations are then confined to the routers within each area. So rather than calculate SPF over a 10,000-router network, you could have 100 areas with 100 routers each. Then routers that connect two areas would have to calculate SPF over 100 routers for two areas, so 2 seconds rather than an hour's worth of SPF calculations.
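The numbers above follow from the "10 times larger, 60 times as long" rule of thumb. A small sketch of that arithmetic, assuming the rule holds across the whole range and taking the 100-router, one-second case as the baseline (the function name and parameters are mine):

```python
import math

def spf_seconds(routers, base_routers=100, base_seconds=1.0, factor_per_10x=60):
    """SPF run time under the rule of thumb: 10x the routers costs 60x the time."""
    return base_seconds * factor_per_10x ** math.log10(routers / base_routers)

flat = spf_seconds(10_000)       # one flat 10,000-router area
border = 2 * spf_seconds(100)    # border router sitting in two 100-router areas

print(f"flat network: {flat:.0f} s, area border router: {border:.0f} s")
```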
But if each of those 10,000 routers still injects two, three or four address blocks into OSPF, that means the OSPF database will have something like 30,000 entries. So now updating and remembering all those address blocks becomes a bottleneck. Solution: summarize link advertisements. So if routers in area 35 advertise address blocks 10.35.1.x, 10.35.2.x, … 10.35.95.x, rather than push out all that information to all 10,000 routers throughout the network, the area border routers for area 35 simply say “10.35.x.x” to the rest of the network.
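A minimal sketch of the area 35 example with Python's `ipaddress` module. It assumes each "10.35.N.x" block is a /24 and that "10.35.x.x" means 10.35.0.0/16. The helper `covering_prefix` is my own illustration, not something OSPF does automatically: on a real router you configure the summary range by hand.

```python
import ipaddress

def covering_prefix(networks):
    """Return the longest single prefix that contains all the given networks."""
    nets = list(networks)
    candidate = nets[0]
    while not all(n.subnet_of(candidate) for n in nets):
        candidate = candidate.supernet()  # widen by one bit and try again
    return candidate

# 10.35.1.x through 10.35.95.x, assumed to be /24s
area35 = [ipaddress.ip_network(f"10.35.{i}.0/24") for i in range(1, 96)]

# The summary the area border routers advertise in the text
summary = ipaddress.ip_network("10.35.0.0/16")
covered = all(n.subnet_of(summary) for n in area35)

tightest = covering_prefix(area35)

print(f"{len(area35)} prefixes inside the area, 1 summary outside: {summary}")
print(f"all covered: {covered}, tightest possible summary: {tightest}")
```

The tightest covering prefix is a bit longer than the /16, but summarizing on the octet boundary leaves room for growth and is easier on the humans.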
Even better: if an area only connects to the “backbone” area (area 0) and doesn't learn any routing information from other areas or from outside OSPF, it's a stub area that really doesn't even need to know anything that's happening in the rest of the network, so let's give it a default route to reach the rest of the world.
Variations on a stubby theme
Stub areas still have some OSPF routing information from other areas. We can get rid of that too, and then we have a totally stubby area.
On the other hand, maybe we want to import external routing information into OSPF even in our stub area, and then propagate that external information to other areas. This makes for a not-so-stubby area.
And who said you can't have your cake and eat it: let's make our totally stubby area not-so-stubby, and we'll have a totally not-so-stubby area, guaranteeing certification income for years to come. (See Wikipedia's page on OSPF for more details.)
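The four flavors are really just two on/off switches. Here is a small sketch that makes that explicit. The field names are mine. The sketch only models what the paragraphs above describe, and it leaves out details such as default route handling, which differs between flavors and implementations.

```python
from typing import NamedTuple

class AreaBehavior(NamedTuple):
    name: str
    inter_area_routes: bool        # still learns OSPF routes from other areas
    external_routes_from_rest: bool  # learns externals imported elsewhere
    imports_external_routes: bool  # may import externals itself

def stub_flavor(totally=False, not_so_stubby=False):
    """Combine the two switches into one of the four stub area flavors."""
    if not_so_stubby:
        base = "not-so-stubby area"
    else:
        base = "stubby area" if totally else "stub area"
    name = f"totally {base}" if totally else base
    return AreaBehavior(
        name=name,
        inter_area_routes=not totally,
        external_routes_from_rest=False,  # true for no stub flavor at all
        imports_external_routes=not_so_stubby,
    )

for totally in (False, True):
    for nssa in (False, True):
        print(stub_flavor(totally, nssa))
```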
As protocol designers, we're really good at adding more capabilities, more options. As network architects and engineers, we're really good at adding complexity to make our networks do something they won't do out of the box. But we can't just keep adding options and complexity without ever taking any of it away. At least not if we want to have a fighting chance at teaching our craft to the next generation so we can retire at some point.
10,000 routers in one area will melt the network operations center long before the SPF calculations melt the router CPUs. I've personally worked on a network with 600 routers in area 0 back in 1999. SPF performance was the least of our concerns.
So I'm calling it: OSPF areas and summarization are now legacy. New and current OSPF networks should just use a flat area 0 rather than try to micromanage the information flow between areas. Students should no longer have to learn how areas work, and only be informed about the various flavors of stubbiness as an example of humorous naming that doesn't age well.
Laurens Dassen, a new member of the Dutch parliament after the March elections, representing the pan-European party Volt, put several questions about the October 4th Facebook outage to the Dutch cabinet (administration). Yesterday, minister Blok of the ministry of Economic Affairs and Climate answered those. The fourth question was about BGP, among other things.
The Facebook outage was caused by installing a BGP configuration with an error in it. Which underlines what I've been saying for a long time: when all the important parts of your network are redundant and you're using BGP to reroute automatically when failures happen, the remaining outages are your fault. So quite a heavy responsibility. Redundancy wasn't an issue with this incident: Facebook has datacenters all over the world. But if you use automated tools to push out a broken configuration to all of them at once, then it's game over. Remote access also didn't work anymore, and I gather that even access to the buildings didn't work anymore. Probably not exactly what Zuckerberg had in mind with move fast and break things.
The main points of the minister's answer: BGP worked correctly and BGP is being developed by the IETF, which is an appropriate forum for that work.
They probably don't realize that BGP has been essentially the same for 27 years: in 1994 BGP version 4 was defined and that's the version we still use today, with relatively minor additions.
I love podcasts. So I'm very happy to have been interviewed about BGP on Software Engineering Radio:
Iljitsch van Beijnum, author of the book BGP: Building Reliable Networks with the Border Gateway Protocol https://www.oreilly.com/pub/au/970 discusses internet routing and BGP – the border gateway protocol used by ISPs to update routing information. Host Robert Blumen spoke with Iljitsch about the topology of the internet, autonomous systems (AS), regulatory bodies that coordinate the AS space, IP addresses, the assignment of IPs to ASs; tier-one ISPs, carriers, and home/business ISPs; Internet routing; the path of a packet; routing tables, what they contain, and how they are constructed; routing algorithms; BGP and its role in updating routers with the knowledge of routes held by other routers; and BGP messages. Drill down into the update message. How updates progress from BGP into routing algorithms and then routing tables. What can go wrong. Attacks on BGP.
My Books: "BGP" and "Running IPv6"
On this page you can find more information about my book "BGP". Or you can jump immediately to chapter 6, "Traffic Engineering", (approx. 150kB) that O'Reilly has put online as a sample chapter. Information about the Japanese translation can be found here.
More information about my second book, "Running IPv6", is available here.
BGP Security
BGP has some security holes. This sounds very bad, and of course it isn't good, but don't be overly alarmed. There are basically two problems: sessions can be hijacked, and anyone who can hijack a session or who has a legitimate BGP session can inject incorrect information into the BGP tables.
Session hijacking is hard to do for someone who can't see the TCP sequence number for the TCP session the BGP protocol runs over, and with good anti-spoofing filters in place it is impossible. And of course using the TCP MD5 password option (RFC 2385) makes all of this nearly impossible even for someone who can sniff the BGP traffic.
Nearly all ISPs filter BGP information from customers, so in most cases it isn't possible to successfully inject false information. However, filtering on peering sessions between ISPs isn't as widespread, although some networks do this. A rogue ISP could do some real damage here.
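The idea behind customer filtering is simple enough to sketch in a few lines. This is only an illustration of the logic, not router configuration: the function name is mine and the address blocks are documentation prefixes, not real customer assignments.

```python
import ipaddress

def accept_announcement(prefix, allowed_blocks):
    """Accept a customer route only if it falls inside a block registered to them."""
    net = ipaddress.ip_network(prefix)
    return any(
        net.version == block.version and net.subnet_of(block)
        for block in allowed_blocks
    )

# Hypothetical customer with one IPv4 and one IPv6 block
customer_blocks = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

for announced in ("192.0.2.0/24", "2001:db8:1::/48", "198.51.100.0/24"):
    verdict = "accept" if accept_announcement(announced, customer_blocks) else "reject"
    print(f"{announced}: {verdict}")
```

Real filters usually also limit how long an announced prefix may be. This sketch leaves that out.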
There are now two efforts underway to better secure BGP:
The IETF RPSEC (routing protocol security) working group is active in this area.
What is BGPexpert.com?
BGPexpert.com is a website dedicated to Internet routing issues. What we want is for packets to find their way from one end of the globe to another, and make the jobs of the people that make this happen a little easier.
Ok, but what is BGP?
Have a look at the "what is BGP" page. There is also a list of BGP and interdomain routing terms on this page.
BGP and Multihoming
If you are not an ISP, your main reason to be interested in BGP will probably be to multihome. By connecting to two or more ISPs at the same time, you are "multihomed" and you no longer have to depend on a single ISP for your network connectivity.
This sounds simple enough, but as always, there is a catch. For regular customers, it's the Internet Service Provider who makes sure the rest of the Internet knows where packets have to be sent to reach their customer. If you are multihomed, you can't let your ISP do this, because then you would have to depend on a single ISP again. This is where the BGP protocol comes in: this is the protocol used to carry this information from ISP to ISP. By announcing reachability information for your network to two ISPs, you can make sure everybody still knows how to reach you if one of those ISPs has an outage.
For those of you interested in multihoming in IPv6 (which is pretty much impossible at the moment), have a look at the "IPv6 multihoming solutions" page.
Are you a BGP expert? Take the test to find out!
These questions are somewhat Cisco-centric. We now also have another set of questions and answers for self-study purposes.