SDK - CircleCI test jobs failing intermittently due to resource constraints and external dependencies #1703

@MantisClone

Description

Problem

CircleCI test jobs (test-unit, test-integration-with-request-node, test-integration-with-smart-contracts) were experiencing intermittent failures due to:

  1. Insufficient compute resources: tests timed out or ran out of memory (OOM) on the large and default medium resource classes
  2. Lit Protocol network unavailability: integration tests hard-failed whenever the Lit datil-dev network was unreachable (an external dependency outside our control, and one that is being sunset on 2/25)
  3. Tight Jest timeouts: some async tests had timeouts too close to their actual execution time, causing sporadic failures under load

Impact

Flaky CI blocks the entire PR merge pipeline and wastes developer time re-running builds.

Proposed Solution

Fixed in #1698:

  1. Upgraded resource_class from large/medium to xlarge for all three test jobs
  2. Added graceful skip logic for Lit Protocol tests when the datil-dev network is unreachable
  3. Increased Jest timeouts to 180s for affected tests
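
For reference, the skip logic in item 2 can be sketched as a reachability probe that never throws. This is a minimal illustration, not the code from #1698: `isReachable`, the `probe` callback, and the timeout value are hypothetical names chosen here, and the actual connection call to the Lit network is deliberately left to the caller.

```typescript
// Hypothetical helper (not taken from #1698): resolves to true if `probe`
// succeeds within `timeoutMs`, and to false if it rejects, throws, or is too slow.
async function isReachable(
  probe: () => Promise<unknown>,
  timeoutMs: number,
): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Resolves to false once the deadline passes.
  const deadline = new Promise<boolean>((resolve) => {
    timer = setTimeout(() => resolve(false), timeoutMs);
  });

  // Map success to true and any failure (sync throw or rejection) to false.
  const attempt = Promise.resolve()
    .then(probe)
    .then(
      () => true,
      () => false,
    );

  try {
    return await Promise.race([attempt, deadline]);
  } finally {
    // Clear the timer so it does not keep the test process alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

In a Jest suite, one way to use a result like this is to compute it once in `beforeAll` and have the Lit-dependent tests log a notice and return early when it is `false`, so an outage shows up as a skipped check instead of a red build.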

Considerations

  • The xlarge resource class increases CircleCI credit consumption. Request Finance pays for the account, so a significant increase could prompt them to ask us to set up our own account.
  • Jest timeouts were increased uniformly to 180s (from 15-60s). This is generous and could mask performance regressions, so it may be worth revisiting with more targeted values.
  • The Lit Protocol skip logic is a temporary measure; see SDK - Migrate Lit Protocol from Datil to Naga network before 2/25 shutdown #1702 for the migration to Naga.
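
As a possible starting point for the "more targeted values" mentioned in the Jest timeout bullet, a per-test timeout could be derived from an observed duration plus headroom, clamped between the old lower bound and the current 180s ceiling. This is only a sketch: `targetedTimeout` and the default headroom factor of 3 are assumptions made for illustration, and only the 15s and 180s bounds come from this issue.

```typescript
// Bounds taken from the issue: old timeouts started at 15s, the new ceiling is 180s.
const FLOOR_MS = 15000;
const CEILING_MS = 180000;

// Hypothetical helper: scale an observed test duration by a headroom factor
// (assumed default: 3x) and clamp the result to [FLOOR_MS, CEILING_MS].
function targetedTimeout(observedMs: number, headroom: number = 3): number {
  if (!Number.isFinite(observedMs) || observedMs <= 0) {
    throw new RangeError("observedMs must be a positive, finite number");
  }
  const scaled = Math.ceil(observedMs * headroom);
  return Math.min(CEILING_MS, Math.max(FLOOR_MS, scaled));
}
```

The result can be passed as the third argument to Jest's `it(name, fn, timeout)`, so that a test which normally takes about 20s fails at 60s instead of waiting the full 180s when it regresses.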

Status

✅ Done
