Persistent WebSocket Listeners That Actually Stay Connected
Naive `async for msg in ws` dies silently on the first disconnect. The reliable pattern is an outer `while True` loop that owns the reconnect responsibility, with `ping_interval` and `ping_timeout` configured so half-open connections actually surface.

The problem
A long-running process needs to consume events from a WebSocket server indefinitely. Servers restart, deployments happen, connections drop. The naive approach, connecting once and running `async for msg in ws`, ends quietly the first time the connection closes, and your listener is gone.
The approach
The reliable pattern uses a simple outer `while True` loop that owns the reconnect responsibility, with the inner connection block handling only the happy path (shown here with websocket-client):

    import time
    import websocket  # the websocket-client package

    while True:
        ws = websocket.WebSocketApp(url, on_open=..., on_message=..., on_close=..., on_error=...)
        ws.run_forever(ping_interval=30, ping_timeout=10)
        time.sleep(5)

`run_forever` blocks until the connection closes for any reason: clean close, server restart, network drop, anything. When it returns, you sleep briefly and reconnect. The `on_close` handler logs the close code so you have a record, but nothing in it needs to decide whether to reconnect; the outer loop handles that unconditionally.
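To make the division of labor concrete, here is a minimal sketch of the same loop with the handlers filled in. The handler bodies, the logger name, and the `app_factory`, `sleep`, and `max_cycles` parameters are my own assumptions, added so the loop can be exercised without a live server. They are not part of websocket-client.

```python
import logging
import time

log = logging.getLogger("ws-listener")  # hypothetical logger name


def on_message(ws, message):
    # Happy path only: hand the payload to your own processing code.
    log.info("message: %r", message)


def on_error(ws, error):
    log.warning("websocket error: %r", error)


def on_close(ws, close_status_code, close_msg):
    # Record why the connection ended; the reconnect decision lives elsewhere.
    log.info("closed: code=%s reason=%s", close_status_code, close_msg)


def make_app(url):
    # Imported lazily so this module loads even where websocket-client is absent.
    import websocket  # pip install websocket-client

    return websocket.WebSocketApp(
        url, on_message=on_message, on_error=on_error, on_close=on_close
    )


def listen_forever(url, app_factory=make_app, delay=5, sleep=time.sleep, max_cycles=None):
    """Reconnect unconditionally every time run_forever returns or raises.

    app_factory, sleep and max_cycles are illustrative hooks for testing;
    leave max_cycles as None to loop forever.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        try:
            app_factory(url).run_forever(ping_interval=30, ping_timeout=10)
        except Exception:
            # Defensive: an unexpected exception should not end the loop.
            log.exception("listener raised; reconnecting")
        sleep(delay)
        cycles += 1
    return cycles
```

To keep the main process free, one option is to start it with `threading.Thread(target=listen_forever, args=(url,), daemon=True).start()`.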
The `ping_interval` and `ping_timeout` parameters matter more than they look. Without them, a half-open connection (the other end is gone, but no FIN or RST ever reached the client) can leave the client sitting silently for minutes before it detects the drop. With `ping_interval=30, ping_timeout=10`, the client sends a WebSocket ping every 30 seconds and expects a pong within 10, so a silent failure is detected within roughly 40 seconds and the loop reconnects after the 5-second sleep, rather than lingering undetected.
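The timing arithmetic can be written down as a small helper. This is a simplified model, not a description of the library's internals: it assumes the failure happens right after a successful pong, so the next ping goes out up to `ping_interval` seconds later and is abandoned `ping_timeout` seconds after that. The function names are hypothetical. The guard mirrors websocket-client's own check, which refuses to start `run_forever` unless `ping_interval` is greater than `ping_timeout`.

```python
def worst_case_detection_seconds(ping_interval, ping_timeout):
    """Approximate upper bound on how long a silent failure goes unnoticed."""
    if ping_interval <= ping_timeout:
        # websocket-client raises an exception for this combination as well.
        raise ValueError("ping_interval must be greater than ping_timeout")
    return ping_interval + ping_timeout


def worst_case_gap_seconds(ping_interval, ping_timeout, reconnect_delay):
    """Detection time plus the sleep before the next connection attempt."""
    return worst_case_detection_seconds(ping_interval, ping_timeout) + reconnect_delay


print(worst_case_detection_seconds(30, 10))  # 40
print(worst_case_gap_seconds(30, 10, 5))     # 45
```

Shrinking both values narrows the window at the cost of more ping traffic. Treat the result as an estimate.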
For the threaded version (websocket-client), this pattern is synchronous, so run it in a daemon thread if it must not block the main process. For async (websockets), replace `run_forever` with `async with websockets.connect(url) as ws:` and `async for msg in ws: ...` inside a `try`/`except Exception` block, then `await asyncio.sleep(5)` in the outer loop.
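A rough sketch of the async variant follows. The `on_message`, `connect`, and `max_cycles` parameters and the `handle` placeholder are assumptions of mine, added to keep the loop testable. To my knowledge, recent versions of websockets already enable keepalive pings by default (20 seconds for both settings), so passing `ping_interval` and `ping_timeout` explicitly mainly makes the choice visible.

```python
import asyncio
import logging

log = logging.getLogger("ws-listener")  # hypothetical logger name


async def handle(message):
    # Hypothetical placeholder for real message processing.
    log.info("message: %r", message)


def default_connect(url):
    # Imported lazily so this module loads even where websockets is absent.
    import websockets  # pip install websockets

    return websockets.connect(url, ping_interval=30, ping_timeout=10)


async def listen_forever(url, on_message=handle, connect=default_connect,
                         delay=5, max_cycles=None):
    """Outer loop owns reconnecting; the inner block is the happy path only."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        try:
            async with connect(url) as ws:
                async for message in ws:
                    await on_message(message)
            # Reaching this line means the iteration ended without an exception.
            log.info("connection closed cleanly")
        except Exception:
            # Does not catch asyncio.CancelledError, so task cancellation
            # still stops the listener.
            log.exception("connection failed; reconnecting")
        await asyncio.sleep(delay)
        cycles += 1
    return cycles
```

Run it with `asyncio.run(listen_forever(url))`, or schedule it as a task next to the rest of your application.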
What I learned
The ping parameters are the most commonly skipped configuration and the most consequential omission. A server that closes connections cleanly on restart is easy to detect: `on_close` fires immediately. A server that goes silent mid-connection (network partition, crashed process, half-open TCP) is invisible without active ping/pong. Setting `ping_interval` and `ping_timeout` turns the reconnect loop from one that handles clean closes into one that also handles silent failures.
