Skip to content

Droid 0.57.15 hangs on VPS when updater is enabled (IPv6 stalls; workaround in use) #666

@ain3sh

Description

@ain3sh

Droid updater startup hang on Linux VPS with partial IPv6 egress

Summary

On a Hetzner Ubuntu 24.04 VPS, droid hangs at startup when auto-update is enabled. The hang is reproducible for both droid and droid --version.

The same Droid version works on WSL2. Current stable workaround on VPS is:

export FACTORY_DROID_AUTO_UPDATE_ENABLED=false

With this workaround, interactive Droid starts normally.

Impact

  • Interactive CLI is blocked on affected host when updater is enabled.
  • Non-interactive version checks also hang (--version).
  • User can work only with updater disabled.

Environment

  • Droid: 0.57.15
  • Failing host: Hetzner VPS, Ubuntu 24.04, kernel 6.8.0-100-generic
  • Working host: WSL2 Ubuntu 24.04, kernel 6.6.87.2-microsoft-standard-WSL2

Reproduction

# Hangs on VPS
env -u FACTORY_DROID_AUTO_UPDATE_ENABLED ~/.local/bin/droid --version

# Works immediately on VPS
env FACTORY_DROID_AUTO_UPDATE_ENABLED=false ~/.local/bin/droid --version

Expected

Updater check should fail fast and CLI should still start/exit promptly.

Actual

Updater-enabled startup enters a long-running wait loop and never completes in observed runs (>40s, and continued until externally terminated).

Key evidence

  1. VPS behavior split by updater flag

    • See vps_droid_behavior.txt (noenv_ec:124, dis_ec:0, dis_out:0.57.15)
  2. Endpoint connectivity differs by IP family on VPS

    • vps_downloads_connectivity.txt: IPv6 connects to downloads.factory.ai:443 time out; IPv4 succeeds quickly
    • vps_curl_ipv4_ipv6.txt: curl -4 succeeds, curl -6 times out
  3. Hanging process socket state on VPS

    • vps_hung_socket_snapshot.txt: multiple IPv6 sockets in SYN-SENT to CloudFront v6 addresses while IPv4 sockets are established
  4. WSL contrast

    • wsl_downloads_connectivity.txt: IPv6 fails immediately with ENETUNREACH and IPv4 succeeds; Droid does not hang there
  5. Source-address clue on VPS

    • vps_ipv6_source_binding.txt: IPv6 requests fail from 2a01:4ff:f2:4002::2 but succeed from 2a01:4ff:f0:733e::1
  6. Strace shape

    • vps_strace_excerpt_tail.txt: repeated epoll_pwait2 + periodic timerfd reads during hang (no forward progress to version output)

Working workaround (currently in use)

FACTORY_DROID_AUTO_UPDATE_ENABLED=false exported in shell profile. Confirmed working in Termius session (droid --version and full interactive droid).

Likely fix direction

  • Ensure updater check never blocks startup/exit path.
  • Use strict per-address connection timeouts and robust Happy Eyeballs/IPv4 fallback when IPv6 handshakes stall.
  • Treat stalled IPv6 paths as non-fatal and continue.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions