Commit b9fa345

Han5991 and vespa7 committed
test_runner: add flaky option to retry on failure
Add a `flaky` option that re-runs a failing test until it passes, intended for tests with unavoidable nondeterminism. Setting `flaky: true` retries up to 20 times; `flaky: <positive integer>` sets an explicit retry budget.

The option is accepted on tests and suites and via the `it.flaky`/`test.flaky`/`describe.flaky`/`suite.flaky` shorthands. A test-case value overrides an inherited suite value (nearest wins), and `flaky: false` opts a test out.

Only the final attempt is observable: intermediate failures emit no `test:fail`, no per-test diagnostics, and nothing on the node.test error channel. Each result carries a new `retryCount` field on the `test:pass`, `test:fail`, and `test:complete` events (the number of retries performed, `undefined` for non-flaky tests), reporters print a `# FLAKY` directive, and the run summary gains a `flaky` counter.

`beforeEach`/`afterEach` re-run on every attempt while `before`/`after` run once, so state set up in the per-test hooks is reset between attempts; as the documentation change notes, non-idempotent state outside those hooks may still leak between attempts. An externally aborted test and an `expectFailure` test are not retried. A flaky test whose timeout is exhausted is reported as a failure rather than cancelled.

Co-authored-by: vespa7 <98526766+vespa7@users.noreply.github.com>
Signed-off-by: sangwook <rewq5991@gmail.com>
1 parent 30a7e28 commit b9fa345
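
The retry rules in the commit message can be restated as a small standalone sketch. This is not the code the commit adds: the helper names (`resolveRetryBudget`, `effectiveFlaky`, `runWithRetries`), the synchronous test body, and the `RangeError` type are assumptions made for illustration. Only the default budget of 20, the nearest-wins rule, and the meaning of `retryCount` come from the text above.

```javascript
// Hypothetical helpers; the real logic lives inside the test runner.
const kDefaultFlakyRetries = 20; // default budget stated in the commit message

// Turn a `flaky` option value into a retry budget (0 means "do not retry").
function resolveRetryBudget(flaky) {
  if (flaky === undefined || flaky === false) return 0;
  if (flaky === true) return kDefaultFlakyRetries;
  if (Number.isInteger(flaky) && flaky > 0) return flaky;
  // The docs only say such values throw; the error type is an assumption.
  throw new RangeError(`invalid flaky value: ${flaky}`);
}

// Nearest wins: a test's own value, when given, overrides the suite's value.
function effectiveFlaky(ownValue, suiteValue) {
  return ownValue !== undefined ? ownValue : suiteValue;
}

// Run `fn` until it stops throwing or the retry budget is used up.
// Only the final outcome is returned; errors from earlier attempts are dropped.
function runWithRetries(fn, flaky) {
  const budget = resolveRetryBudget(flaky);
  const isFlaky = budget > 0;
  let retryCount = 0;
  for (;;) {
    try {
      fn();
      return { passed: true, retryCount: isFlaky ? retryCount : undefined };
    } catch (error) {
      if (retryCount >= budget) {
        return { passed: false, error, retryCount: isFlaky ? retryCount : undefined };
      }
      retryCount++;
    }
  }
}

// A body that fails twice and then passes needs two retries.
let calls = 0;
const result = runWithRetries(() => {
  if (++calls < 3) throw new Error('transient');
}, effectiveFlaky(undefined, 5));
console.log(result.passed, result.retryCount);
```

Under these assumptions a budget of `n` allows at most `n + 1` attempts, and a test that is not flaky reports `retryCount` as `undefined` rather than `0`.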

31 files changed

Lines changed: 1006 additions & 92 deletions

doc/api/test.md

Lines changed: 65 additions & 0 deletions
@@ -1868,6 +1868,16 @@ added:
 Shorthand for marking a suite as `TODO`. This is the same as
 [`suite([name], { todo: true }[, fn])`][suite options].
 
+## `suite.flaky([name][, options][, fn])`
+
+<!-- YAML
+added:
+  - REPLACEME
+-->
+
+Shorthand for marking a suite as flaky. This is the same as
+[`suite([name], { flaky: true }[, fn])`][suite options].
+
 ## `suite.only([name][, options][, fn])`
 
 <!-- YAML
@@ -1939,6 +1949,14 @@ changes:
   * `todo` {boolean|string} If truthy, the test marked as `TODO`. If a string
     is provided, that string is displayed in the test results as the reason why
     the test is `TODO`. **Default:** `false`.
+  * `flaky` {boolean|number} If `true`, the test (or, for a suite, each of its
+    test-cases) is retried until it passes, up to a default of 20 retries. If a
+    positive integer is provided, it is retried up to that many times. A
+    non-positive or non-integer value throws. Each retry re-runs the test's
+    `beforeEach` and `afterEach` hooks. A test-case's own value overrides one
+    inherited from its suite; `flaky: false` opts out. Retries are intended for
+    tests that fail intermittently due to transient conditions; be aware that
+    non-idempotent state may leak between attempts. **Default:** `false`.
   * `timeout` {number} A number of milliseconds the test will fail after.
     If unspecified, subtests inherit this value from their parent.
     **Default:** `Infinity`.
@@ -2000,6 +2018,16 @@ same as [`test([name], { todo: true }[, fn])`][it options].
 Shorthand for marking a test as `only`,
 same as [`test([name], { only: true }[, fn])`][it options].
 
+## `test.flaky([name][, options][, fn])`
+
+<!-- YAML
+added:
+  - REPLACEME
+-->
+
+Shorthand for marking a test as flaky,
+same as [`test([name], { flaky: true }[, fn])`][it options].
+
 ## `describe([name][, options][, fn])`
 
 Alias for [`suite()`][].
@@ -2027,6 +2055,16 @@ added:
 Shorthand for marking a suite as `only`. This is the same as
 [`describe([name], { only: true }[, fn])`][describe options].
 
+## `describe.flaky([name][, options][, fn])`
+
+<!-- YAML
+added:
+  - REPLACEME
+-->
+
+Shorthand for marking a suite as flaky. This is the same as
+[`describe([name], { flaky: true }[, fn])`][describe options].
+
 ## `it([name][, options][, fn])`
 
 <!-- YAML
@@ -2066,6 +2104,16 @@ added:
 Shorthand for marking a test as `only`,
 same as [`it([name], { only: true }[, fn])`][it options].
 
+## `it.flaky([name][, options][, fn])`
+
+<!-- YAML
+added:
+  - REPLACEME
+-->
+
+Shorthand for marking a test as flaky,
+same as [`it([name], { flaky: true }[, fn])`][it options].
+
 ## `before([fn][, options])`
 
 <!-- YAML
@@ -3539,6 +3587,9 @@ Emitted when code coverage is enabled and all tests have completed.
   * `testNumber` {number} The ordinal number of the test.
   * `todo` {string|boolean|undefined} Present if [`context.todo`][] is called
   * `skip` {string|boolean|undefined} Present if [`context.skip`][] is called
+  * `retryCount` {number|undefined} The number of retries performed for a test
+    marked `flaky` (`0` if it passed on the first attempt). `undefined` for
+    tests that are not flaky.
 
 Emitted when a test completes its execution.
 This event is not emitted in the same order as the tests are
@@ -3647,6 +3698,9 @@ Emitted when a test is enqueued for execution.
   * `testNumber` {number} The ordinal number of the test.
   * `todo` {string|boolean|undefined} Present if [`context.todo`][] is called
   * `skip` {string|boolean|undefined} Present if [`context.skip`][] is called
+  * `retryCount` {number|undefined} The number of retries performed for a test
+    marked `flaky` (`0` if it passed on the first attempt). `undefined` for
+    tests that are not flaky.
 
 Emitted when a test fails.
 This event is guaranteed to be emitted in the same order as the tests are
@@ -3712,6 +3766,9 @@ since the parent runner only knows about file-level tests. When using
   * `testNumber` {number} The ordinal number of the test.
   * `todo` {string|boolean|undefined} Present if [`context.todo`][] is called
   * `skip` {string|boolean|undefined} Present if [`context.skip`][] is called
+  * `retryCount` {number|undefined} The number of retries performed for a test
+    marked `flaky` (`0` if it passed on the first attempt). `undefined` for
+    tests that are not flaky.
 
 Emitted when a test passes.
 This event is guaranteed to be emitted in the same order as the tests are
@@ -4527,6 +4584,14 @@ changes:
   * `todo` {boolean|string} If truthy, the test marked as `TODO`. If a string
     is provided, that string is displayed in the test results as the reason why
     the test is `TODO`. **Default:** `false`.
+  * `flaky` {boolean|number} If `true`, the test (or, for a suite, each of its
+    test-cases) is retried until it passes, up to a default of 20 retries. If a
+    positive integer is provided, it is retried up to that many times. A
+    non-positive or non-integer value throws. Each retry re-runs the test's
+    `beforeEach` and `afterEach` hooks. A test-case's own value overrides one
+    inherited from its suite; `flaky: false` opts out. Retries are intended for
+    tests that fail intermittently due to transient conditions; be aware that
+    non-idempotent state may leak between attempts. **Default:** `false`.
   * `timeout` {number} A number of milliseconds the test will fail after.
     If unspecified, subtests inherit this value from their parent.
     **Default:** `Infinity`.
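
As a rough illustration of how the documented `retryCount` field could be consumed, the sketch below is a hypothetical custom reporter (an async generator over the event stream, the shape `node:test` custom reporters already use). The name `flakySummaryReporter` and the output wording are invented here; whether suite-level events also carry `retryCount` is not stated in the docs, so the sketch simply counts every `test:pass`/`test:fail` event that has a numeric value.

```javascript
// Hypothetical reporter, not part of the commit: summarizes flaky results
// using only the `retryCount` field documented above.
async function* flakySummaryReporter(source) {
  let flaky = 0;   // events whose retryCount is defined (flaky-marked)
  let retried = 0; // flaky events that needed at least one retry
  for await (const { type, data } of source) {
    if (type !== 'test:pass' && type !== 'test:fail') continue;
    if (data.retryCount === undefined) continue; // not a flaky test
    flaky++;
    if (data.retryCount > 0) {
      retried++;
      const outcome = type === 'test:pass' ? 'passed' : 'failed';
      yield `${data.name}: ${outcome} after ${data.retryCount} retries\n`;
    }
  }
  yield `flaky tests: ${flaky}, retried: ${retried}\n`;
}

module.exports = flakySummaryReporter;
```

Saved to a file, a reporter of this shape could be passed to `--test-reporter`; on a Node.js build without this commit, `retryCount` is never set and the summary would report zero flaky tests.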

lib/internal/test_runner/harness.js

Lines changed: 2 additions & 1 deletion
@@ -63,6 +63,7 @@ function createTestTree(rootTestOptions, globalOptions) {
       failed: 0,
       passed: 0,
       cancelled: 0,
+      flaky: 0,
       skipped: 0,
       todo: 0,
       topLevel: 0,
@@ -406,7 +407,7 @@ function runInParentContext(Factory) {
 
     return run(name, options, fn, overrides);
   };
-  ArrayPrototypeForEach(['expectFailure', 'skip', 'todo', 'only'], (keyword) => {
+  ArrayPrototypeForEach(['expectFailure', 'flaky', 'skip', 'todo', 'only'], (keyword) => {
     test[keyword] = (name, options, fn) => {
       const overrides = {
         __proto__: null,

lib/internal/test_runner/reporter/dot.js

Lines changed: 5 additions & 1 deletion
@@ -12,7 +12,11 @@ module.exports = async function* dot(source) {
   const failedTests = [];
   for await (const { type, data } of source) {
     if (type === 'test:pass') {
-      yield `${colors.green}.${colors.reset}`;
+      // A flaky test that needed retries is shown in yellow to distinguish it
+      // from a clean pass, without using 'F' (which reads as a failure).
+      yield data.retryCount > 0 ?
+        `${colors.yellow}.${colors.reset}` :
+        `${colors.green}.${colors.reset}`;
     }
     if (type === 'test:fail') {
       yield `${colors.red}X${colors.reset}`;

lib/internal/test_runner/reporter/junit.js

Lines changed: 10 additions & 0 deletions
@@ -131,6 +131,16 @@ module.exports = async function* junitReporter(source) {
           attrs: { __proto__: null, type: 'todo', message: event.data.todo },
         });
       }
+      if (event.data.retryCount > 0) {
+        ArrayPrototypePush(currentTest.children, {
+          __proto__: null, nesting: event.data.nesting + 1, tag: 'properties',
+          attrs: { __proto__: null },
+          children: [{
+            __proto__: null, nesting: event.data.nesting + 2, tag: 'property',
+            attrs: { __proto__: null, name: 'flaky', value: `${event.data.retryCount} retries` },
+          }],
+        });
+      }
       if (event.type === 'test:fail') {
         const error = event.data.details?.error;
         currentTest.children.push({

lib/internal/test_runner/reporter/tap.js

Lines changed: 10 additions & 3 deletions
@@ -33,12 +33,14 @@ async function * tapReporter(source) {
   for await (const { type, data } of source) {
     switch (type) {
       case 'test:fail': {
-        yield reportTest(data.nesting, data.testNumber, 'not ok', data.name, data.skip, data.todo, data.expectFailure);
+        yield reportTest(data.nesting, data.testNumber, 'not ok', data.name,
+                         data.skip, data.todo, data.expectFailure, data.retryCount);
         const location = data.file ? `${data.file}:${data.line}:${data.column}` : null;
         yield reportDetails(data.nesting, data.details, location);
         break;
       } case 'test:pass':
-        yield reportTest(data.nesting, data.testNumber, 'ok', data.name, data.skip, data.todo, data.expectFailure);
+        yield reportTest(data.nesting, data.testNumber, 'ok', data.name,
+                         data.skip, data.todo, data.expectFailure, data.retryCount);
         yield reportDetails(data.nesting, data.details, null);
         break;
       case 'test:plan':
@@ -75,7 +77,7 @@ async function * tapReporter(source) {
   }
 }
 
-function reportTest(nesting, testNumber, status, name, skip, todo, expectFailure) {
+function reportTest(nesting, testNumber, status, name, skip, todo, expectFailure, retryCount) {
   let line = `${indent(nesting)}${status} ${testNumber}`;
 
   if (name) {
@@ -88,6 +90,11 @@ function reportTest(nesting, testNumber, status, name, skip, todo, expectFailure
     line += ` # TODO${typeof todo === 'string' && todo.length ? ` ${tapEscape(todo)}` : ''}`;
   } else if (expectFailure !== undefined) {
     line += ` # EXPECTED FAILURE${typeof expectFailure === 'string' ? ` ${tapEscape(expectFailure)}` : ''}`;
+  } else if (retryCount > 0) {
+    const word = retryCount === 1 ? 're-try' : 're-tries';
+    line += status === 'ok' ?
+      ` # FLAKY ${retryCount} ${word}` :
+      ` # FLAKY failed after ${retryCount} ${word}`;
   }
 
   line += '\n';
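
To make the new TAP branch easier to read in isolation, here is a standalone restatement of it. The function name `flakyDirective` and the sample test lines are made up for this sketch; the branch logic and the `re-try`/`re-tries` wording are copied from the hunk above. Because the branch sits at the end of an `else if` chain, the skip, TODO, and expected-failure directives take precedence in the real reporter, which this sketch does not model.

```javascript
// Standalone copy of the directive logic added to reportTest().
// `flakyDirective` is a hypothetical name used only in this sketch.
function flakyDirective(status, retryCount) {
  if (!(retryCount > 0)) return ''; // undefined or 0: no directive
  const word = retryCount === 1 ? 're-try' : 're-tries';
  return status === 'ok' ?
    ` # FLAKY ${retryCount} ${word}` :
    ` # FLAKY failed after ${retryCount} ${word}`;
}

// Illustrative TAP lines (test names are invented).
console.log(`ok 1 - connects${flakyDirective('ok', 1)}`);
console.log(`not ok 2 - uploads${flakyDirective('not ok', 3)}`);
console.log(`ok 3 - parses${flakyDirective('ok', 0)}`);
```

A flaky test that passes on its first attempt (`retryCount` of `0`) therefore prints no directive, the same as a test that is not flaky.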

lib/internal/test_runner/reporter/utils.js

Lines changed: 6 additions & 1 deletion
@@ -71,7 +71,7 @@ function formatError(error, indent) {
 function formatTestReport(type, data, showErrorDetails = true, prefix = '', indent = '') {
   let color = reporterColorMap[type] ?? colors.white;
   let symbol = reporterUnicodeSymbolMap[type] ?? ' ';
-  const { skip, todo, expectFailure } = data;
+  const { skip, todo, expectFailure, retryCount } = data;
   const duration_ms = data.details?.duration_ms ? ` ${colors.gray}(${data.details.duration_ms}ms)${colors.white}` : '';
   const replayed = data.details?.passed_on_attempt !== undefined ?
     ` ${colors.gray}(passed on attempt ${data.details.passed_on_attempt})${colors.white}` :
@@ -90,6 +90,11 @@ function formatTestReport(type, data, showErrorDetails = true, prefix = '', inde
     }
   } else if (expectFailure !== undefined) {
     title += ` # EXPECTED FAILURE`;
+  } else if (retryCount > 0) {
+    const word = retryCount === 1 ? 're-try' : 're-tries';
+    title += type === 'test:fail' ?
+      ` # FLAKY failed after ${retryCount} ${word}` :
+      ` # FLAKY ${retryCount} ${word}`;
   }
 
   const err = showErrorDetails && data.details?.error ? formatError(data.details.error, indent) : '';

lib/internal/test_runner/runner.js

Lines changed: 7 additions & 1 deletion
@@ -313,7 +313,13 @@ class FileTest extends Test {
       skipped: item.data.skip !== undefined,
       isTodo: item.data.todo !== undefined,
       passed: item.type === 'test:pass',
-      cancelled: kCanceledTests.has(item.data.details?.error?.failureType),
+      // A retried timeout that exhausted (retryCount > 0) is a failure, not a
+      // cancellation; only an un-retried timeout/abort stays cancelled.
+      cancelled: kCanceledTests.has(item.data.details?.error?.failureType) &&
+                 !(item.data.retryCount > 0),
+      // retryCount is present (even 0) only for flaky-marked tests, so it lets
+      // the parent count flaky tests across the process-isolation IPC boundary.
+      flakyRetries: item.data.retryCount !== undefined ? 1 : 0,
       nesting: item.data.nesting,
       reportedType: item.data.details?.type,
     }, this.root.harness);
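
The two derived fields in this hunk can be checked with a standalone sketch. The `classify` name and the literal failure-type strings in `canceledTypes` are assumptions standing in for the runner's internal `kCanceledTests` set; the two boolean expressions mirror the diff.

```javascript
// Assumed stand-in for the internal kCanceledTests set; the exact strings
// are not shown in the diff and are illustrative only.
const canceledTypes = new Set(['cancelledByParent', 'testAborted', 'testTimeoutFailure']);

// Hypothetical helper mirroring the `cancelled` and `flakyRetries` fields.
function classify(item) {
  const failureType = item.data.details?.error?.failureType;
  return {
    passed: item.type === 'test:pass',
    // A timeout that was retried and still failed counts as a failure.
    cancelled: canceledTypes.has(failureType) && !(item.data.retryCount > 0),
    // retryCount is defined (even 0) only for flaky-marked tests.
    flakyRetries: item.data.retryCount !== undefined ? 1 : 0,
  };
}

const timeout = { error: { failureType: 'testTimeoutFailure' } };
console.log(classify({ type: 'test:fail', data: { details: timeout } }));
console.log(classify({ type: 'test:fail', data: { details: timeout, retryCount: 2 } }));
console.log(classify({ type: 'test:pass', data: { retryCount: 0 } }));
```

Note that `flakyRetries` contributes `1` per flaky-marked test regardless of how many retries ran, which matches the comment in the diff about counting flaky tests rather than retries.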
