mirror of
https://github.com/MODSetter/SurfSense.git
synced 2026-05-13 01:32:40 +02:00
chore: add playwright cursor skill
This commit is contained in:
parent
25aad38ca4
commit
d52225c18d
57 changed files with 25244 additions and 0 deletions
496
.cursor/skills/playwright-testing/debugging/flaky-tests.md
Normal file
# Debugging and Managing Flaky Tests

## Table of Contents

1. [Understanding Flakiness Types](#understanding-flakiness-types)
2. [Detection and Reproduction](#detection-and-reproduction)
3. [Root Cause Analysis](#root-cause-analysis)
4. [Fixing Strategies by Type](#fixing-strategies-by-type)
5. [CI-Specific Flakiness](#ci-specific-flakiness)
6. [Quarantine and Management](#quarantine-and-management)
7. [Prevention Strategies](#prevention-strategies)
8. [Anti-Patterns to Avoid](#anti-patterns-to-avoid)
9. [Related References](#related-references)

## Understanding Flakiness Types

### Categories of Flakiness

Most flaky tests fall into distinct categories requiring different remediation:

| Category                    | Symptoms                        | Common Causes                                          |
| --------------------------- | ------------------------------- | ------------------------------------------------------ |
| **UI-driven**               | Element not found, click missed | Missing waits, animations, dynamic rendering           |
| **Environment-driven**      | CI-only failures                | Slower CPU, memory limits, cold browser starts         |
| **Data/parallelism-driven** | Fails with multiple workers     | Shared backend data, reused accounts, state collisions |
| **Test-suite-driven**       | Fails when run with other tests | Leaked state, shared fixtures, order dependencies      |

### Flakiness Decision Tree

```
Test fails intermittently
├─ Fails locally too?
│  ├─ YES → Timing/async issue → Check waits and assertions
│  └─ NO → CI-specific → Check environment differences
│
├─ Fails only with multiple workers?
│  └─ YES → Parallelism issue → Check data isolation
│
├─ Fails only when run after specific tests?
│  └─ YES → State leak → Check fixtures and cleanup
│
└─ Fails randomly regardless of conditions?
   └─ External dependency → Check network/API stability
```

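The decision tree can also be expressed as a small triage helper. This is only a sketch: the `FlakeObservations` shape, its field names, and the order in which the branches are checked are assumptions made for illustration, not part of Playwright.

```typescript
// Hypothetical observation record; the field names are illustrative only.
type FlakeObservations = {
  failsLocally: boolean;
  failsOnlyWithMultipleWorkers: boolean;
  failsOnlyAfterSpecificTests: boolean;
  callsExternalService: boolean;
};

type FlakeDiagnosis = { category: string; nextStep: string };

// Checks the most specific signals first (an assumed ordering), then falls
// back to the "fails locally too?" branch of the tree.
function classifyFlake(obs: FlakeObservations): FlakeDiagnosis {
  if (obs.failsOnlyWithMultipleWorkers) {
    return { category: "Parallelism issue", nextStep: "Check data isolation" };
  }
  if (obs.failsOnlyAfterSpecificTests) {
    return { category: "State leak", nextStep: "Check fixtures and cleanup" };
  }
  if (!obs.failsLocally) {
    return { category: "CI-specific", nextStep: "Check environment differences" };
  }
  if (obs.callsExternalService) {
    return { category: "External dependency", nextStep: "Check network/API stability" };
  }
  return { category: "Timing/async issue", nextStep: "Check waits and assertions" };
}

const diagnosis = classifyFlake({
  failsLocally: false,
  failsOnlyWithMultipleWorkers: false,
  failsOnlyAfterSpecificTests: false,
  callsExternalService: false,
});
console.log(`${diagnosis.category}: ${diagnosis.nextStep}`);
```

A real triage usually needs more than one pass through the tree, since a test can be flaky for several reasons at once.
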
## Detection and Reproduction

### Confirming Flakiness

```bash
# Run test multiple times to confirm instability
npx playwright test tests/checkout.spec.ts --repeat-each=20

# Run with single worker to isolate parallelism issues
npx playwright test --workers=1

# Run in CI-like conditions locally
CI=true npx playwright test --repeat-each=10
```

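How many repeats are enough depends on how rarely the test fails. As a rough guide, if each run fails independently with probability p, the chance of seeing at least one failure in n runs is 1 - (1 - p)^n. The sketch below turns that into a repeat count; the function names are hypothetical, and the independence assumption rarely holds exactly (load and ordering effects correlate runs).

```typescript
// Probability of observing at least one failure in `runs` independent runs
// when each run fails with probability `flakeRate`.
function detectionProbability(flakeRate: number, runs: number): number {
  return 1 - Math.pow(1 - flakeRate, runs);
}

// Smallest run count whose detection probability reaches `confidence`.
function runsNeeded(flakeRate: number, confidence: number): number {
  if (flakeRate <= 0 || flakeRate >= 1) {
    throw new RangeError("flakeRate must be strictly between 0 and 1");
  }
  if (confidence <= 0 || confidence >= 1) {
    throw new RangeError("confidence must be strictly between 0 and 1");
  }
  return Math.ceil(Math.log(1 - confidence) / Math.log(1 - flakeRate));
}

// A test that fails about 5% of the time:
console.log(detectionProbability(0.05, 20).toFixed(2)); // roughly 0.64
console.log(runsNeeded(0.05, 0.95)); // 59
```

Under these assumptions, `--repeat-each=20` misses a 5% flake about one time in three, which is one reason a clean local repeat run does not prove a test is stable.
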
### Reproduction Strategies

```typescript
// playwright.config.ts - Enable artifacts for flaky test investigation
import { defineConfig } from "@playwright/test";

export default defineConfig({
  retries: process.env.CI ? 2 : 0,
  use: {
    trace: "on-first-retry", // Capture trace on retry
    video: "retain-on-failure",
    screenshot: "only-on-failure",
  },
});
```

### Identify Flaky Tests Programmatically

```typescript
import { test } from "@playwright/test";

// Track test results across runs
test.afterEach(async ({}, testInfo) => {
  // A pass on a retry means an earlier attempt of the same test failed
  if (testInfo.retry > 0 && testInfo.status === "passed") {
    console.warn(`FLAKY: ${testInfo.title} passed on retry ${testInfo.retry}`);
    // Log to your tracking system
  }
});
```

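Once such records reach a tracking system, they can be aggregated to see which tests pass on retry most often. The following is a minimal sketch: the `AttemptRecord` shape is an assumption that mirrors the `title`, `retry`, and `status` values logged from `testInfo`, and the sample data is made up.

```typescript
// Hypothetical record of one test attempt, as a tracking system might store it.
type AttemptRecord = {
  title: string;
  retry: number; // 0 for the first attempt, 1+ for retries
  status: "passed" | "failed" | "timedOut" | "skipped" | "interrupted";
};

// Counts, per test title, how many times the test passed only on a retry.
function countFlakyPasses(attempts: AttemptRecord[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const attempt of attempts) {
    if (attempt.retry > 0 && attempt.status === "passed") {
      counts[attempt.title] = (counts[attempt.title] || 0) + 1;
    }
  }
  return counts;
}

// Illustrative data only.
const recordedAttempts: AttemptRecord[] = [
  { title: "checkout", retry: 0, status: "failed" },
  { title: "checkout", retry: 1, status: "passed" },
  { title: "login", retry: 0, status: "passed" },
  { title: "checkout", retry: 0, status: "timedOut" },
  { title: "checkout", retry: 1, status: "passed" },
];
const flakyCounts = countFlakyPasses(recordedAttempts);
console.log(flakyCounts);
```

Tests with the highest counts are usually the best candidates to investigate or quarantine first.
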
## Root Cause Analysis

### Event Logging for Race Conditions

Add comprehensive event logging to expose timing issues:

```typescript
test.beforeEach(async ({ page }) => {
  page.on("console", (msg) =>
    console.log(`CONSOLE [${msg.type()}]:`, msg.text()),
  );
  page.on("pageerror", (err) => console.error("PAGE ERROR:", err.message));
  page.on("requestfailed", (req) =>
    console.error(`REQUEST FAILED: ${req.url()}`),
  );
});
```

> **For comprehensive console error handling** (fail on errors, allowed patterns, fixtures), see [console-errors.md](console-errors.md).

### Network Timing Analysis

```typescript
// Capture slow or failed requests
test.beforeEach(async ({ page }) => {
  page.on("requestfinished", (request) => {
    const timing = request.timing();
    // Timing fields are relative to startTime and are -1 when unavailable
    if (timing.requestStart < 0 || timing.responseEnd < 0) return;
    const duration = timing.responseEnd - timing.requestStart;
    if (duration > 2000) {
      console.warn(`SLOW: ${request.url()} took ${Math.round(duration)}ms`);
    }
  });

  page.on("requestfailed", (request) => {
    console.error(`Failed: ${request.url()} - ${request.failure()?.errorText}`);
  });
});
```

### Trace Analysis

```bash
# View trace from failed CI run
npx playwright show-trace path/to/trace.zip

# Generate trace for specific test
npx playwright test tests/flaky.spec.ts --trace on
```

## Fixing Strategies by Type

### UI-Driven Flakiness

**Problem: Element not ready when action executes**

```typescript
// ❌ BAD: No wait for element state
await page.click("#submit");
await page.fill("#username", "test"); // Element may not be ready

// ✅ GOOD: Actions + assertions pattern (auto-waiting built-in)
await page.getByRole("button", { name: "Submit" }).click();
await expect(page.getByRole("heading", { name: "Dashboard" })).toBeVisible();
```

**Problem: Animations or transitions interfere**

```typescript
// ❌ BAD: Click during animation
await page.click(".menu-item");

// ✅ GOOD: Wait for animation to complete
await page.getByRole("menuitem", { name: "Settings" }).click();
await expect(page.getByRole("dialog")).toBeVisible();

// Or request reduced motion before interacting
// (only helps if the app honors prefers-reduced-motion)
await page.emulateMedia({ reducedMotion: "reduce" });
```

**Problem: Brittle selectors**

```typescript
// ❌ BAD: Fragile CSS chain
await page.click("div.container > div:nth-child(2) > button.btn-primary");

// ✅ GOOD: Semantic selectors
await page.getByRole("button", { name: "Continue" }).click();
await page.getByTestId("checkout-button").click();
await page.getByLabel("Email address").fill("test@example.com");
```

### Async/Timing Flakiness

**Problem: Race between test and application**

```typescript
// ❌ BAD: Arbitrary sleep
await page.click("#load-data");
await page.waitForTimeout(3000); // Hope data loads in 3s

// ✅ GOOD: Wait for specific condition
await page.click("#load-data");
await expect(page.locator(".data-row")).toHaveCount(10, { timeout: 10000 });

// ✅ BETTER: Wait for network response, then assert
// Register the wait before clicking so a fast response is not missed
const responsePromise = page.waitForResponse(
  (r) =>
    r.url().includes("/api/data") &&
    r.request().method() === "GET" &&
    r.ok(),
);
await page.click("#load-data");
await responsePromise;
await expect(page.locator(".data-row")).toHaveCount(10);
```

> **For comprehensive waiting strategies** (navigation, element state, network, polling with `toPass()`), see [assertions-waiting.md](../core/assertions-waiting.md#waiting-strategies).

**Problem: Complex async state**

```typescript
// Custom wait for application-specific conditions
await page.waitForFunction(() => {
  const app = (window as any).__APP_STATE__;
  return app?.isReady && !app?.isLoading;
});

// Wait for multiple conditions
await Promise.all([
  page.waitForResponse("**/api/user"),
  page.waitForResponse("**/api/settings"),
  page.getByRole("button", { name: "Load" }).click(),
]);
```

### Data/Parallelism-Driven Flakiness

**Problem: Tests share backend data**

```typescript
// ❌ BAD: All workers use same user
const testUser = { email: "test@example.com", password: "pass123" };

// ✅ GOOD: Unique data per worker
import { test as base } from "@playwright/test";

// createTestUser / deleteTestUser stand in for your own backend helpers
export const test = base.extend<
  {},
  { testUser: { email: string; id: string } }
>({
  testUser: [
    async ({}, use, workerInfo) => {
      const email = `test-${workerInfo.workerIndex}-${Date.now()}@example.com`;
      const user = await createTestUser(email);
      await use(user);
      await deleteTestUser(user.id);
    },
    { scope: "worker" },
  ],
});
```

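The reason the worker index belongs in generated data can be checked without a browser. In this sketch, `uniqueTestEmail` is a hypothetical helper that follows the `test-<workerIndex>-<timestamp>` naming scheme used in this section; the timestamp is passed in rather than read from `Date.now()` so the collision case is reproducible.

```typescript
// Hypothetical helper following the test-<workerIndex>-<timestamp> scheme.
function uniqueTestEmail(workerIndex: number, timestamp: number): string {
  return `test-${workerIndex}-${timestamp}@example.com`;
}

// Four workers that start within the same millisecond.
const sameInstant = 1700000000000;
const withWorkerIndex = [0, 1, 2, 3].map((w) => uniqueTestEmail(w, sameInstant));
const timestampOnly = [0, 1, 2, 3].map(() => `test-${sameInstant}@example.com`);

// Count distinct values by using them as object keys.
function distinctCount(values: string[]): number {
  const seen: Record<string, boolean> = {};
  for (const value of values) seen[value] = true;
  return Object.keys(seen).length;
}

console.log(distinctCount(withWorkerIndex)); // 4 distinct addresses
console.log(distinctCount(timestampOnly)); // 1: every worker collides
```

A timestamp alone is not a safe uniqueness source under parallelism; combining it with the worker index (or a random suffix) keeps accounts from colliding.
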
**Problem: Shared storageState across workers**

```typescript
// ❌ BAD: All workers share same auth state
use: {
  storageState: '.auth/user.json',
}

// ✅ GOOD: Per-worker auth state
import fs from "fs";
import { test as base } from "@playwright/test";

export const test = base.extend<{}, { workerStorageState: string }>({
  // Feed each test the storage state produced by its worker
  storageState: ({ workerStorageState }, use) => use(workerStorageState),

  workerStorageState: [
    async ({ browser }, use, workerInfo) => {
      // parallelIndex stays within 0..workers-1, even after a worker restarts
      const id = workerInfo.parallelIndex;
      const fileName = `.auth/user-${id}.json`;

      if (!fs.existsSync(fileName)) {
        const page = await browser.newPage({ storageState: undefined });
        // authenticateUser stands in for your own login helper
        await authenticateUser(page, `worker${id}@test.com`);
        await page.context().storageState({ path: fileName });
        await page.close();
      }

      await use(fileName);
    },
    { scope: "worker" },
  ],
});
```

### Test-Suite-Driven Flakiness (State Leaks)

**Problem: Tests affect each other**

```typescript
import { test, type Page } from "@playwright/test";

// ❌ BAD: Module-level state persists across tests
let sharedPage: Page;

test.beforeAll(async ({ browser }) => {
  sharedPage = await browser.newPage(); // Shared across tests!
});

// ✅ GOOD: Use Playwright's default isolation (fresh context per test)
test("first test", async ({ page }) => {
  // Fresh page for this test
});

test("second test", async ({ page }) => {
  // Fresh page for this test
});
```

**Problem: Fixture cleanup not happening**

```typescript
import fs from "fs";
import { test as base } from "@playwright/test";

// ✅ GOOD: Proper fixture with cleanup
export const test = base.extend<{ tempFile: string }>({
  tempFile: async ({}, use) => {
    const file = `/tmp/test-${Date.now()}.json`;
    fs.writeFileSync(file, "{}");

    await use(file);

    // Code after use() is teardown: it runs even when the test fails
    if (fs.existsSync(file)) {
      fs.unlinkSync(file);
    }
  },
});
```

## CI-Specific Flakiness

### Why Tests Fail Only in CI

| CI Condition       | Impact                                | Solution                                             |
| ------------------ | ------------------------------------- | ---------------------------------------------------- |
| Slower CPU         | Actions complete later than expected  | Use auto-waiting, not timeouts                       |
| Cold browser start | No cached assets, slower initial load | Add explicit waits for first navigation              |
| Headless mode      | Different rendering behavior          | Test locally in headless mode                        |
| Shared runners     | Resource contention                   | Reduce parallelism or use dedicated runners          |
| Network latency    | API calls slower                      | Mock external APIs, increase timeouts for real calls |

### Simulating CI Locally

```bash
# Run headless with CI environment variable
CI=true npx playwright test

# Limit CPU (Linux/Mac)
cpulimit -l 50 -- npx playwright test

# Run in Docker matching CI environment
# (use the image tag that matches your installed @playwright/test version)
docker run -it --rm \
  -v "$(pwd)":/work \
  -w /work \
  mcr.microsoft.com/playwright:v1.40.0-jammy \
  npx playwright test
```

### Consistent Viewport and Scale

```typescript
// playwright.config.ts - Match CI rendering exactly
import { defineConfig } from "@playwright/test";

export default defineConfig({
  use: {
    viewport: { width: 1280, height: 720 },
    deviceScaleFactor: 1,
  },
});
```

### Network Stubbing for External APIs

```typescript
// Eliminate external API flakiness
test.beforeEach(async ({ page }) => {
  // Stub unstable third-party APIs
  await page.route("**/api.analytics.com/**", (route) =>
    route.fulfill({ body: "" }),
  );
  await page.route("**/api.payment-provider.com/**", (route) =>
    route.fulfill({ json: { status: "ok" } }),
  );
});

// Test-specific stub
test("checkout with payment", async ({ page }) => {
  await page.route("**/api/payment", (route) =>
    route.fulfill({ json: { success: true, transactionId: "test-123" } }),
  );
  // Test proceeds with deterministic response
});
```

## Quarantine and Management

### Quarantine Pattern

```typescript
// playwright.config.ts - Separate flaky tests
import { defineConfig } from "@playwright/test";

export default defineConfig({
  projects: [
    {
      name: "stable",
      testIgnore: ["**/*.flaky.spec.ts"],
    },
    {
      name: "quarantine",
      testMatch: ["**/*.flaky.spec.ts"],
      retries: 3,
    },
  ],
});
```

### Annotation-Based Quarantine

```typescript
// Mark flaky tests with annotations
test("intermittent checkout issue", async ({ page }, testInfo) => {
  testInfo.annotations.push({
    type: "flaky",
    description: "Investigating payment API timing - JIRA-1234",
  });

  // Test implementation
});

// Skip flaky test conditionally
test("known CI flaky", async ({ page }) => {
  test.skip(!!process.env.CI, "Flaky in CI - investigating JIRA-5678");
  // Test implementation
});
```

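Quarantine only helps if quarantined tests are eventually fixed. One way to keep the list honest is to flag entries that have been parked too long. The sketch below assumes a hand-maintained list; the `QuarantineEntry` shape, the dates, and the 14-day limit are illustrative assumptions, not a Playwright feature.

```typescript
// Hypothetical quarantine registry entry.
type QuarantineEntry = {
  test: string;
  ticket: string;
  quarantinedOn: string; // ISO date, e.g. "2024-01-01"
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the entries that have been quarantined for more than maxAgeDays.
function overdueQuarantines(
  entries: QuarantineEntry[],
  today: Date,
  maxAgeDays: number,
): QuarantineEntry[] {
  return entries.filter((entry) => {
    const ageMs = today.getTime() - new Date(entry.quarantinedOn).getTime();
    return ageMs / MS_PER_DAY > maxAgeDays;
  });
}

// Illustrative data only.
const quarantined: QuarantineEntry[] = [
  { test: "intermittent checkout issue", ticket: "JIRA-1234", quarantinedOn: "2024-01-01" },
  { test: "known CI flaky", ticket: "JIRA-5678", quarantinedOn: "2024-01-25" },
];
const overdue = overdueQuarantines(quarantined, new Date("2024-02-01"), 14);
console.log(overdue.map((entry) => `${entry.test} (${entry.ticket})`));
```

A check like this could run in CI and fail the quarantine job, or post a reminder, when any entry is overdue.
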
## Prevention Strategies

### Test Burn-In

```bash
# Run new tests many times before merging
npx playwright test tests/new-feature.spec.ts --repeat-each=50

# Run in parallel to expose race conditions
npx playwright test tests/new-feature.spec.ts --repeat-each=20 --workers=4
```

### Isolation Checklist

```typescript
// ✅ Each test should be self-contained
test.describe("User profile", () => {
  test("can update name", async ({ page, testUser }) => {
    // Uses unique testUser fixture
    // No dependency on other tests
    // Cleanup handled by fixture
  });

  test("can update email", async ({ page, testUser }) => {
    // Independent of "can update name"
    // Own testUser, own state
  });
});
```

### Defensive Assertions

```typescript
// ❌ BAD: Single point of failure
await expect(page.locator(".items")).toHaveCount(5);

// ✅ GOOD: Progressive assertions that help diagnose
await expect(page.locator(".items-container")).toBeVisible();
await expect(page.locator(".loading")).not.toBeVisible();
await expect(page.locator(".items")).toHaveCount(5);
```

### Retry Budget

```typescript
// playwright.config.ts - Limit retries to avoid masking issues
import { defineConfig } from "@playwright/test";

export default defineConfig({
  retries: process.env.CI ? 2 : 0, // Only retry in CI
  expect: {
    timeout: 10000, // Reasonable assertion timeout
  },
  timeout: 60000, // Test timeout
});
```

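Capping retries per test is one half of a budget; the other half is capping how much of the suite is allowed to depend on them. The sketch below checks a run summary against such a cap. The `TestOutcome` shape and the 10% threshold are assumptions for illustration; in practice the data would come from whatever reporter output is available.

```typescript
// Hypothetical per-test summary of one CI run.
type TestOutcome = {
  title: string;
  attempts: number; // 1 means the test passed or failed without retries
  passed: boolean;
};

type RetryBudgetReport = {
  flakyTitles: string[];
  flakyRatio: number;
  withinBudget: boolean;
};

// A test counts against the budget when it passed but needed more than one attempt.
function checkRetryBudget(
  outcomes: TestOutcome[],
  maxFlakyRatio: number,
): RetryBudgetReport {
  const flaky = outcomes.filter((o) => o.passed && o.attempts > 1);
  const flakyRatio = outcomes.length === 0 ? 0 : flaky.length / outcomes.length;
  return {
    flakyTitles: flaky.map((o) => o.title),
    flakyRatio,
    withinBudget: flakyRatio <= maxFlakyRatio,
  };
}

// Illustrative data only.
const runOutcomes: TestOutcome[] = [
  { title: "login", attempts: 1, passed: true },
  { title: "checkout", attempts: 3, passed: true },
  { title: "profile", attempts: 1, passed: true },
  { title: "search", attempts: 3, passed: false },
];
const budgetReport = checkRetryBudget(runOutcomes, 0.1);
console.log(budgetReport);
```

With this sample data, one of four tests passed only on retry (25%), so the 10% budget is exceeded even though the run would otherwise look green.
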
## Anti-Patterns to Avoid

| Anti-Pattern                              | Problem                             | Solution                                       |
| ----------------------------------------- | ----------------------------------- | ---------------------------------------------- |
| `waitForTimeout()` as primary wait        | Arbitrary, hides real timing issues | Use auto-waiting assertions                    |
| Increasing global timeout to "fix" flakes | Masks root cause, slows all tests   | Find and fix actual timing issue               |
| Retrying until pass                       | Hides systemic problems             | Fix root cause, use retries for diagnosis only |
| Shared test data across workers           | Race conditions, collisions         | Isolate data per worker                        |
| Testing real external APIs                | Network variability                 | Mock external dependencies                     |
| Module-level mutable state                | Leaks between tests                 | Use fixtures with proper cleanup               |
| Ignoring flaky tests                      | Problem compounds over time         | Quarantine and track for fixing                |

## Related References

- **Debugging**: See [debugging.md](debugging.md) for trace viewer and inspector
- **Fixtures**: See [fixtures-hooks.md](../core/fixtures-hooks.md) for worker-scoped isolation
- **Performance**: See [performance.md](../infrastructure-ci-cd/performance.md) for parallel execution patterns
- **Assertions**: See [assertions-waiting.md](../core/assertions-waiting.md) for auto-waiting patterns
- **Global Setup**: See [global-setup.md](../core/global-setup.md) for setup vs fixtures decision