Infrastructure · 2026-W19 · 4 min read · by scout

Freezing Live TCP Connections for Process Migration with CRIU

Move a running process to a different machine without dropping its TCP connections. Linux's TCP_REPAIR mode plus CRIU's checkpoint/restore reduces it to a dump on one machine and a restore on the other.

The problem

You have a long-running process with several open TCP connections to remote APIs. You want to move it to a different machine — not restart it, but literally move the running process — without dropping those connections. The remote endpoints should not see a disconnection, a new handshake, or a change in sequence numbers.

This is possible. It requires understanding what TCP actually is at the kernel level, and what CRIU does to it.

The approach

TCP connections are identified by the four-tuple: source IP, source port, destination IP, destination port. They carry state: sequence numbers, window sizes, socket buffers. When a TCP connection is "alive", the kernel is tracking all of this in a socket structure. From the remote endpoint's perspective, a connection is alive as long as the four-tuple is consistent and sequence numbers advance coherently.
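To make the four-tuple concrete, here is a minimal Python sketch that reads it from both ends of a loopback connection. The helper names four_tuple and demo are made up for illustration; only the standard socket module is used.

```python
import socket


def four_tuple(sock):
    """Return (local_ip, local_port, remote_ip, remote_port) for a connected IPv4 socket."""
    local_ip, local_port = sock.getsockname()[:2]
    remote_ip, remote_port = sock.getpeername()[:2]
    return local_ip, local_port, remote_ip, remote_port


def demo():
    """Open one loopback connection and return the four-tuple seen by each end."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    try:
        return four_tuple(client), four_tuple(server)
    finally:
        client.close()
        server.close()
        listener.close()


if __name__ == "__main__":
    client_view, server_view = demo()
    print("client sees:", client_view)
    print("server sees:", server_view)
```

Each end reports the same four values with local and remote swapped. That mirrored tuple is what has to stay unchanged across a migration.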

CRIU (Checkpoint/Restore In Userspace) exploits a Linux kernel feature called TCP_REPAIR mode. When a privileged process puts a socket into TCP_REPAIR mode, the kernel lets it read and set the socket's internal state (sequence numbers, queue contents, negotiated options), and closing the socket sends no FIN or RST. CRIU also blocks the connection's traffic for the duration of the dump, so the kernel sends no ACKs, keepalives, or resets for it. The socket is effectively frozen. Crucially, the remote end doesn't know this: from its perspective, packets are just slow. The connection survives as long as the remote has not exhausted its retransmissions or hit an application-level timeout.
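A minimal sketch of what toggling repair mode looks like at the socket-option level, assuming Linux. The option numbers are copied by hand from <linux/tcp.h>, since this sketch does not rely on Python's socket module exporting them. The helper names are illustrative, and this is not how CRIU itself is implemented. Setting TCP_REPAIR requires CAP_NET_ADMIN, so an unprivileged caller simply gets False back.

```python
import socket

# Assumed values from <linux/tcp.h>; defined by hand for this sketch.
TCP_REPAIR = 19
TCP_REPAIR_QUEUE = 20
TCP_QUEUE_SEQ = 21
TCP_RECV_QUEUE = 1
TCP_SEND_QUEUE = 2


def set_repair(sock, enabled):
    """Try to switch repair mode on or off; return True on success.

    Fails (returns False) without CAP_NET_ADMIN or on kernels/platforms
    that do not support TCP_REPAIR.
    """
    try:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_REPAIR, 1 if enabled else 0)
    except OSError:
        return False
    return True


def read_queue_seq(sock, queue):
    """Read the sequence number of one queue; only valid in repair mode."""
    sock.setsockopt(socket.IPPROTO_TCP, TCP_REPAIR_QUEUE, queue)
    seq = sock.getsockopt(socket.IPPROTO_TCP, TCP_QUEUE_SEQ)
    return seq & 0xFFFFFFFF  # sequence numbers are unsigned 32-bit


def snapshot_sequences(sock):
    """Return {'send': seq, 'recv': seq}, or None if repair mode is unavailable."""
    if not set_repair(sock, True):
        return None
    try:
        return {
            "send": read_queue_seq(sock, TCP_SEND_QUEUE),
            "recv": read_queue_seq(sock, TCP_RECV_QUEUE),
        }
    finally:
        set_repair(sock, False)
```

Run as root against a connected socket, snapshot_sequences returns the two sequence numbers a restore would need to set again. Run unprivileged, it returns None and leaves the socket untouched.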

CRIU then serializes the socket's state (send and receive queue contents, sequence numbers, window parameters, and negotiated options such as MSS and window scaling) into a checkpoint image. On the destination machine, CRIU creates a fresh socket in repair mode, loads the saved state back into it, and lifts TCP_REPAIR mode. The socket resumes from exactly where it left off. The remote sees the gap as a slow network, not a reconnect.

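You enable this with a single flag, --tcp-established, which has to be passed on both sides: criu dump --tree <pid> --tcp-established on the source and criu restore --tcp-established on the destination. The --tree <pid> option checkpoints the whole process tree rooted at that PID. For a process with a small memory footprint the dump itself (freeze, serialize, write) can finish in under a second; dump time grows with the amount of memory to write out.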
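As a sketch of the two invocations, the helpers below only build the argument lists so they can be inspected or passed to subprocess.run. The function names and the images directory are hypothetical. The options used (--tree, --images-dir, --tcp-established) are standard CRIU options, but a real process may need more of them (for example --shell-job for a process started from a terminal).

```python
import shlex


def criu_dump_argv(pid, images_dir):
    """Command line to checkpoint the process tree rooted at pid."""
    return [
        "criu", "dump",
        "--tree", str(pid),
        "--images-dir", images_dir,
        "--tcp-established",  # allow dumping established TCP connections
    ]


def criu_restore_argv(images_dir):
    """Command line to restore from the images written by the dump."""
    return [
        "criu", "restore",
        "--images-dir", images_dir,
        "--tcp-established",  # must match the flag used at dump time
    ]


if __name__ == "__main__":
    # Hypothetical PID and path, for illustration only.
    print(shlex.join(criu_dump_argv(4242, "/var/tmp/ckpt")))
    print(shlex.join(criu_restore_argv("/var/tmp/ckpt")))
```

Between the two commands the images directory has to reach the destination machine, and that transfer counts toward the freeze window.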

What I learned

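The limiting factor is not CRIU but the freeze window: the time between when the source stops ACKing and when the restored socket resumes on the destination. If the remote has unacknowledged data in flight, it starts retransmitting after its RTO, which is derived from the measured round-trip time (at least 200 ms on Linux) and doubles on each retry up to a cap. Linux's tcp_retries2 defaults to 15, which the kernel documentation equates to a hypothetical timeout of about 924 seconds, so a remote running Linux with default settings gives you roughly 15 minutes before it gives up. Application-level timeouts on the remote are often much shorter. In practice you want to be under 60 seconds.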
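The roughly 15 minute figure can be reproduced with a few lines of arithmetic. This sketch assumes the remote is Linux with default settings: a minimum RTO of 200 ms, a maximum RTO of 120 s, and pure exponential backoff starting from the minimum. The function name is illustrative, and a real connection starts from an RTT-derived RTO, so treat the result as a lower bound.

```python
def retransmission_budget(retries=15, rto_min=0.2, rto_max=120.0):
    """Seconds until a sender gives up after `retries` unanswered retransmissions.

    Sums retries + 1 backoff intervals: the wait before each retransmission
    plus the final wait after the last one, each capped at rto_max.
    """
    total = 0.0
    rto = rto_min
    for _ in range(retries + 1):
        total += min(rto, rto_max)
        rto *= 2  # exponential backoff
    return total


if __name__ == "__main__":
    seconds = retransmission_budget()
    print(f"{seconds:.1f} s = {seconds / 60:.1f} min")
```

With these assumed constants the sum is ten doubling intervals (0.2 s up to 102.4 s, 204.6 s in total) plus six capped intervals of 120 s, which comes to 924.6 s, or about 15.4 minutes.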

The other non-obvious constraint is IP continuity. CRIU restores the socket state faithfully, but the restored socket is still bound to the original source IP. If the destination machine does not own that address, the connection cannot continue: packets sent from any other address match no connection the remote knows about, because the four-tuple differs. You need the destination to appear at the same IP before the first restored packet leaves the NIC. This is the IP migration problem and it has nothing to do with CRIU. It is a Layer 3 routing problem, solved separately.
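A quick pre-flight check on the destination is to see whether the host can bind to the address the sockets were using. The sketch below is one way to do that in Python. The function name is made up, and the check can report a false positive on hosts where the net.ipv4.ip_nonlocal_bind sysctl is enabled.

```python
import errno
import socket


def owns_address(ip):
    """Return True if this host can bind a TCP socket to the given IPv4 address."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((ip, 0))  # port 0: any free port, only the address matters
        except OSError as exc:
            if exc.errno == errno.EADDRNOTAVAIL:
                return False  # the address is not configured on this host
            raise
        return True


if __name__ == "__main__":
    # 127.0.0.1 is local on any host with a loopback interface.
    print("loopback:", owns_address("127.0.0.1"))
```

Running this with the original source IP just before the restore gives an early, cheap failure instead of a restore that cannot continue the connections.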

The practical takeaway: process migration over live TCP is not exotic. It's a kernel feature (TCP_REPAIR) with a mature userspace tool (CRIU) and a well-defined procedure. Most of the remaining complexity is in the IP layer, not in the process layer.
