Tuesday, May 17, 2011

MTU/Fragmentation Strikes Again

I have this strange setup:

A <---OpenVPN/TCP over SSH--> B <--localnet--> C

                             :5432 ---DNAT---> :5432
in which host A is a VPS server somewhere in Illinois and hosts B & C are at home.

Host A needs to connect to a PostgreSQL server running on C, but for obscure reasons I do not want to run full routing/masquerading on B. So I put a DNAT rule on B, so that A connecting to B:5432 in effect talks to C:5432.

I had problems with a SQL insert A->C (it was the body of an e-mail). My test case message had just a few bytes in the body so the INSERT was completing A-OK.

However, in real use this insert was taking forever and my Milter was timing out as a result (brr). Debugging on A was rather harsh, as it's a VM with SELinux enabled, so many things around ptrace(2) are borked.

After a few missteps I divined that the default MTU for the VPN interface was 1500, and since the link A->B is point-to-point (A is blissfully unaware of C's existence), A will never perform path MTU discovery.
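
To see why a tiny test message can sail through while a real body stalls, it helps to count packets. The following is only a rough sketch: it assumes IPv4 and TCP headers of 20 bytes each with no options, and the two body sizes (64 and 20,000 bytes) are made up for illustration, not measured.

```python
# Rough packet arithmetic for a TCP stream at a given interface MTU.
# Assumptions (not from the original setup): IPv4, no IP or TCP options.
IPV4_HEADER = 20  # bytes
TCP_HEADER = 20   # bytes; TCP timestamps would add 12 more


def tcp_mss(mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def segments_needed(payload_bytes: int, mtu: int) -> int:
    """Number of TCP segments needed for payload_bytes (ceiling division)."""
    mss = tcp_mss(mtu)
    return max(1, -(-payload_bytes // mss))


if __name__ == "__main__":
    # Hypothetical body sizes: a few-byte test message vs. a real e-mail body.
    for label, size in [("tiny test body", 64), ("real e-mail body", 20_000)]:
        for mtu in (1500, 1300):
            print(f"{label}: {size} B at MTU {mtu} -> "
                  f"{segments_needed(size, mtu)} segment(s), "
                  f"MSS {tcp_mss(mtu)}")
```

A body smaller than one MSS travels in a single small packet that never comes close to the MTU. A larger body produces a run of full-sized packets, and those are presumably the ones that get into trouble once the tunnel envelope is added on top.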

The fix was to lower the MTU to 1300 on A and B (to be on the safe side, as I did not bother to measure the overhead of the SSH envelope).
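
Since the change has to be made on both ends, it is worth confirming that it actually took effect. Below is a minimal sketch that reads an interface's MTU from Linux sysfs. The interface name `tun0` in the usage note is an assumption (the post does not name the VPN interface), and the helper names are hypothetical.

```python
from pathlib import Path


def interface_mtu(iface: str, sysfs_root: str = "/sys/class/net") -> int:
    """Return the MTU the Linux kernel reports for iface, read from sysfs."""
    return int((Path(sysfs_root) / iface / "mtu").read_text().strip())


def mtu_is_at_most(iface: str, limit: int,
                   sysfs_root: str = "/sys/class/net") -> bool:
    """True if the interface MTU does not exceed limit."""
    return interface_mtu(iface, sysfs_root) <= limit
```

Run on both A and B, something like `mtu_is_at_most("tun0", 1300)` should return `True` once the lower MTU is in place. It raises `FileNotFoundError` if the interface does not exist.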