

Path MTU Discovery problems (posted 2003-05-19)

If there is a link with an MTU smaller than 1500 bytes somewhere in the middle of your network, you're in trouble. This isn't the first time this has come up on the NANOG list and it won't be the last.

To avoid wasting resources, either by sending packets that are smaller than the largest size the network supports or by sending packets so large that they must be fragmented, hosts implement Path MTU Discovery (PMTUD). A host assumes a large packet size, transmits its packets with the don't fragment (DF) bit set (in IPv4; in IPv6, DF is implied) and listens for ICMP messages that say the packet is too big. This way, hosts can quickly determine the lowest Maximum Transmission Unit (MTU) in effect along a certain path.
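The discovery loop can be sketched as a small simulation. This is only an illustration, not a real network stack: the path is modelled as a list of link MTUs, and the function names `forward` and `discover_path_mtu` are made up for this example. It assumes that every router reports the MTU of the offending link in its ICMP message, as RFC 1191 specifies.

```python
from typing import List, Optional, Tuple


def forward(packet_size: int, link_mtus: List[int]) -> Optional[int]:
    """Send one DF packet along the path.

    Returns None if the packet fits every link, otherwise the MTU of the
    first link that is too small (the value the ICMP message would carry).
    """
    for mtu in link_mtus:
        if packet_size > mtu:
            return mtu
    return None


def discover_path_mtu(link_mtus: List[int], first_guess: int = 1500) -> Tuple[int, int]:
    """Return (path MTU, number of packets sent to find it)."""
    size = first_guess
    probes = 0
    while True:
        probes += 1
        reported = forward(size, link_mtus)
        if reported is None:
            return size, probes
        # Shrink to the MTU reported in the "packet too big" message and retry.
        size = reported


# Hypothetical path: Ethernet, a PPPoE hop (1492), Ethernet, a GRE tunnel (1476).
path = [1500, 1492, 1500, 1476, 1500]
print(discover_path_mtu(path))  # (1476, 3)
```

Each too-small link costs one lost packet, after which the host settles on the smallest MTU along the path.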

Most of the time, that is. RFC 1191 suggests that hosts react quickly to a changing path MTU, so implementers decided to simply set the DF bit on ALL packets. At the same time, many people are very suspicious of ICMP packets, since they can be used in denial of service attacks or to uncover information about a network. So to be on the safe side, a significant number of people filter all ICMP messages. Or routers are configured in such a way that the ICMP packet too big messages aren't generated or can't make it back to the source host. NAT really doesn't help in this regard either.
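To make the combination of DF on every packet and filtered ICMP concrete, here is a minimal simulation. The function `send_df_packet` and the example path are hypothetical, and the model ignores everything except packet size and link MTU.

```python
from typing import List, Optional, Tuple


def send_df_packet(size: int, link_mtus: List[int],
                   icmp_filtered: bool) -> Tuple[bool, Optional[int]]:
    """Return (delivered, mtu_learned_by_sender) for one packet with DF set."""
    for mtu in link_mtus:
        if size > mtu:
            # The router must drop the packet because DF forbids fragmenting.
            # The sender only learns the MTU if the ICMP message gets back.
            return False, (None if icmp_filtered else mtu)
    return True, None


path = [1500, 1476, 1500]  # a GRE tunnel in the middle

# Small session setup packets get through either way.
print(send_df_packet(60, path, icmp_filtered=True))     # (True, None)

# A full-size data packet is dropped, and the sender hears nothing.
print(send_df_packet(1500, path, icmp_filtered=True))   # (False, None)

# With ICMP allowed through, the sender learns it has to shrink to 1476.
print(send_df_packet(1500, path, icmp_filtered=False))  # (False, 1476)
```

In the filtered case the sender has no signal to act on, so it keeps retransmitting the same oversized packet.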

So what happens when all packets have DF set and there are no ICMP packet too big messages? Right: nothing. Since the first few packets in a session are typically small, sessions get set up without problems, but as soon as the data transfer starts, the session times out. So what can we do?

  • The real solution would be for TCP implementers to clean up their act and stop depending on ICMP messages that may or may not be generated somewhere along the way. Unfortunately, it looks like they think they don't have to do anything since:

  • People who configure firewalls and routers should make sure the ICMP packet too big messages are generated with correct IP addresses and passed along to the source host.

Unfortunately, this doesn't really solve the problem for a user who is behind some kind of tunneling mechanism that brings the MTU down below 1500 bytes, such as PPPoE, PPTP or GRE. Fortunately, TCP has a Maximum Segment Size (MSS) option that is used to tell the other side the maximum size of the TCP segment (that is, the packet without its IP and TCP headers) it should send. When a host itself knows about the smaller MTU, it adjusts the MSS so there is no problem. The trouble starts when a router sits between the end hosts and the network with the reduced MTU, so the end hosts only see the regular Ethernet MTU. Many routers implement a feature where the MSS option in TCP session establishment packets is manipulated to avoid this problem. On Cisco routers this is a fairly recent feature that is enabled with the interface command ip tcp adjust-mss .... Unfortunately, it is not entirely clear what impact this feature has on forwarding performance.
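The arithmetic behind the MSS adjustment is simple enough to sketch. The helper names `mss_for_mtu` and `clamp_mss` are invented for this illustration and do not describe how any particular router implements the feature. The sketch assumes IPv4 and TCP headers of 20 bytes each without options, a PPPoE link MTU of 1492 bytes and a GRE over IPv4 tunnel MTU of 1476 bytes.

```python
IPV4_HEADER = 20  # bytes, no IP options
TCP_HEADER = 20   # bytes, no TCP options


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP segment that fits in one unfragmented packet of this MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def clamp_mss(advertised_mss: int, egress_mtu: int) -> int:
    """Lower the MSS seen in a SYN to what the smaller link can carry.

    The value is never raised: a host that already advertises a smaller
    MSS keeps it.
    """
    return min(advertised_mss, mss_for_mtu(egress_mtu))


print(mss_for_mtu(1500))      # 1460, what a host on plain Ethernet advertises
print(clamp_mss(1460, 1492))  # 1452, after passing a PPPoE link
print(clamp_mss(1460, 1476))  # 1436, after passing a GRE tunnel
print(clamp_mss(536, 1492))   # 536, already small enough, left alone
```

Because the other side then never sends segments larger than the clamped value, full-size data packets fit through the tunnel even when the ICMP packet too big messages are lost.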

RFC 2923 TCP Problems with Path MTU Discovery
The MSS Initiative
Cisco - Why Can't I Browse the Internet when Using a GRE Tunnel?