Why toast-only Playwright tests are not enough

AI-generated end-to-end tests often look useful at first glance.

They click through a flow.

They wait for something visible.

They pass.

But sometimes the entire assertion is basically this:

await expect(page.locator('.toast-success')).toBeVisible();

That is not always wrong.

A toast assertion can be useful.

The problem starts when the toast is the only assertion for a critical business flow.

A success toast is not a business outcome

Imagine a checkout test.

The test selects a plan, enters payment details, clicks “Pay”, and then checks that a green success toast appears.

That proves only one thing:

The frontend displayed a success message.

It does not prove that:

the order was actually persisted
the selected plan was applied
the final price was correct
the payment state changed correctly
the subscription was activated
a confirmation email or background job was triggered
the backend accepted the correct request payload

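The UI can show a success toast even if the backend state is wrong, for example when the toast is triggered by the HTTP status alone rather than by the resulting order state.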

That is why toast-only tests can create false confidence.
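Here is a minimal sketch of how a toast can disagree with the backend. Everything in it is hypothetical: the `CheckoutResponse` shape and both handler functions are invented for illustration and are not taken from the demo repo. It assumes a backend that answers with HTTP 200 but reports a failed payment in the body.

```typescript
// Hypothetical response shape; a real checkout API may look different.
interface CheckoutResponse {
  httpStatus: number;
  body: { orderId: string | null; paymentState: 'paid' | 'pending' | 'failed' };
}

type Toast = 'success' | 'error';

// Naive handler: any 2xx status produces a success toast.
function toastFromStatusOnly(res: CheckoutResponse): Toast {
  return res.httpStatus >= 200 && res.httpStatus < 300 ? 'success' : 'error';
}

// Stricter handler: success requires a persisted order and a paid state.
function toastFromOrderState(res: CheckoutResponse): Toast {
  const is2xx = res.httpStatus >= 200 && res.httpStatus < 300;
  const paid = res.body.orderId !== null && res.body.paymentState === 'paid';
  return is2xx && paid ? 'success' : 'error';
}

// A 200 response whose body says the payment failed and no order exists.
const declined: CheckoutResponse = {
  httpStatus: 200,
  body: { orderId: null, paymentState: 'failed' },
};
```

With the naive handler, `toastFromStatusOnly(declined)` returns `'success'`, so a toast-only test would pass even though no order was created.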

The problem with AI-generated tests

General-purpose AI tools often generate tests by looking only at the visible UI.

They are good at producing steps like:

await page.getByRole('button', { name: 'Pay' }).click();
await expect(page.locator('.toast-success')).toBeVisible();

But they may not understand the business rules behind the flow.

For checkout, the real question is not:

Did the user see a green toast?

The real questions are:

Was the correct plan selected?
Was the price calculated server-side?
Was the discount validated?
Was the order created?
Was the subscription updated?
Did the system reject manipulated client-side values?

Without that context, the test may pass while the product is still broken.
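To make "calculated server-side" concrete, here is a rough sketch of the kind of rule a checkout test should be able to prove. The plan catalog, the `LAUNCH20` coupon, and the function name are all assumptions; the numbers are invented so that the result lines up with the `$79.20` used later in this post, and they do not come from the demo repo.

```typescript
// Hypothetical catalog, in integer cents to avoid floating-point drift.
const PLAN_PRICES_CENTS = new Map<string, number>([['startup', 9900]]);
const COUPON_PERCENT_OFF = new Map<string, number>([['LAUNCH20', 20]]);

interface CheckoutRequest {
  planId: string;
  couponCode?: string;
  clientTotalCents?: number; // untrusted: sent by the browser, never used for pricing
}

// Computes the total from server-owned data only.
function priceOrderCents(req: CheckoutRequest): number {
  const base = PLAN_PRICES_CENTS.get(req.planId);
  if (base === undefined) {
    throw new Error(`Unknown plan: ${req.planId}`);
  }
  let percentOff = 0;
  if (req.couponCode !== undefined) {
    const found = COUPON_PERCENT_OFF.get(req.couponCode);
    if (found === undefined) {
      throw new Error(`Invalid coupon: ${req.couponCode}`);
    }
    percentOff = found;
  }
  return base - Math.round((base * percentOff) / 100);
}
```

A request that claims `clientTotalCents: 1` still prices at 7920 cents here, because the client value is ignored. A toast-only test cannot tell this implementation apart from one that trusts the client.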

Demo PR: a toast-only checkout test

I created a small public demo PR that adds a Playwright checkout test with exactly this problem.

Public demo PR:

https://github.com/TerFree70/qa-boutique-demo-checkout-risk/pull/2

The test clicks through checkout and checks only this:

await expect(page.locator('.toast-success')).toBeVisible();

The PR looks reasonable at first glance: it adds a Playwright test, clicks through the checkout flow, and checks a visible success signal.

But it still does not verify the actual checkout outcome.

[Screenshot: PR Risk Summary in GitHub Actions]

What the PR risk scanner detected

The free QA Boutique PR Risk Scanner flagged this pull request as high risk.

Why?

Because the PR touched a test for a critical flow, checkout, but the assertion did not prove the business outcome.

The GitHub Actions summary generated a reviewer checklist with questions such as:

Is the selected plan persisted and used by the backend?
Is the final price calculated server-side?
Can checkout be manipulated through URL or query parameters?
Does the full user flow complete, not only show a success message?
Does the expected backend state change after the flow?

That kind of checklist is useful because it turns a vague review concern into concrete verification steps.

What the AI review found

QA Boutique then analyzed the same PR and found concrete risks.

[Screenshot: Slack alert for a risky PR]

The AI review identified three issues:

Missing payment success validation beyond the UI toast.
No validation of selected plan details in payment.
No error handling or declined card scenario coverage.

The key point is simple:

A visible success message is not proof of a completed transaction.

A user could see a success toast while:

the payment was not processed
the subscription was not created
the wrong plan was selected
the wrong amount was charged
the user account state was not updated
the confirmation flow silently failed

This is the difference between UI-level confidence and product-level confidence.

Keep the toast assertion, but do not stop there

I would not automatically delete toast assertions.

They are still useful as UI checks.

But I would treat them as one layer of validation, not the final proof that the flow works.

A stronger checkout test might verify:

success toast or confirmation UI
confirmation page or order ID
selected plan name
final price
API response status
created order state in the test environment
subscription or billing state
invalid discount or invalid card behavior

For example, a toast-only test stops here:

await expect(page.locator('.toast-success')).toBeVisible();

A stronger test also checks something closer to the actual business outcome:

await expect(page.getByTestId('order-confirmation')).toBeVisible();
await expect(page.getByTestId('selected-plan')).toHaveText('Startup');
await expect(page.getByTestId('final-price')).toHaveText('$79.20');

In a real test environment, you might also verify backend state directly.
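One possible shape for that backend check is sketched below. The `OrderRecord` fields and the `/api/orders` endpoint in the comments are assumptions about a test environment, not the demo app's real API. The comparison is kept as a plain function so it can be reused from any Playwright test that fetches the order through the built-in `request` fixture.

```typescript
// Hypothetical order shape returned by a test-environment API.
interface OrderRecord {
  status: string;
  plan: string;
  totalCents: number;
  subscriptionActive: boolean;
}

// Returns readable mismatches; an empty array means the outcome matches.
function orderMismatches(actual: OrderRecord, expected: OrderRecord): string[] {
  const keys: (keyof OrderRecord)[] = ['status', 'plan', 'totalCents', 'subscriptionActive'];
  const problems: string[] = [];
  for (const key of keys) {
    if (actual[key] !== expected[key]) {
      problems.push(`${key}: expected ${String(expected[key])}, got ${String(actual[key])}`);
    }
  }
  return problems;
}

// Inside a Playwright test this could be used roughly like this
// (the endpoint and the orderId variable are assumptions):
//   const res = await request.get(`/api/orders/${orderId}`);
//   expect(res.ok()).toBeTruthy();
//   expect(orderMismatches(await res.json(), expected)).toEqual([]);
```

Returning a list of mismatches instead of a boolean keeps the failure message useful: the report names the field that was wrong, not just that something was.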

A simple review rule

When reviewing AI-generated Playwright tests, I like to ask:

If this test passes, what business fact do we actually know?

If the answer is only:

A toast appeared.

Then the test is probably too weak for regression coverage.

A better test should prove at least one important business outcome.

For checkout, that might be:

the order exists
the selected plan is correct
the charged amount is correct
the subscription is active
the user can access the paid feature
the system rejects invalid payment states

For auth, that might be:

unauthorized users are rejected
users cannot access another workspace
role-based permissions are enforced on the backend

For billing, that might be:

coupons are validated server-side
subscription status changes correctly
failed payments do not unlock paid features
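The last billing point, failed payments not unlocking paid features, is the kind of rule a negative test should pin down. Here is a small sketch of the expected behavior. The `Subscription` shape, the `'free'` plan name, and both functions are hypothetical, and a real billing system will have more states than these three.

```typescript
type PaymentResult = 'succeeded' | 'declined' | 'pending';

interface Subscription {
  plan: string;
  active: boolean;
}

// Only a succeeded payment changes the subscription; anything else leaves it as-is.
function applyPaymentResult(
  current: Subscription,
  requestedPlan: string,
  result: PaymentResult,
): Subscription {
  if (result === 'succeeded') {
    return { plan: requestedPlan, active: true };
  }
  return current;
}

// Paid features require an active subscription on a non-free plan.
function canUsePaidFeature(sub: Subscription): boolean {
  return sub.active && sub.plan !== 'free';
}
```

A declined-card test would then assert that `canUsePaidFeature` stays `false` after checkout, regardless of what the UI displayed.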

Practical checklist

For critical flows like checkout, auth, billing, subscriptions, or permissions, I would check:

Does the test verify a real business outcome?
Does it assert backend state or API result where possible?
Does it cover at least one negative or error case?
Does it avoid relying only on generic success messages?
Does it prove the behavior that would matter in production?

Toast assertions are fine.

Toast-only regression coverage is the risky part.
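As a rough illustration of the "generic success message only" check, here is a deliberately simple heuristic. It is my own sketch and not how the QA Boutique scanner works. It looks only at single lines that contain `expect(`, so multi-line assertions and custom helpers would be missed.

```typescript
// Returns true when a test source has at least one expect() line
// and every such line mentions a toast.
function isToastOnly(testSource: string): boolean {
  const expectLines = testSource
    .split('\n')
    .filter((line) => line.includes('expect('));
  return expectLines.length > 0 && expectLines.every((line) => /toast/i.test(line));
}
```

Run against the toast-only snippet from this post it returns `true`. Once an assertion on `order-confirmation` is added it returns `false`.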

Related tool

I also built a free read-only GitHub Action around this idea:

https://github.com/marketplace/actions/qa-boutique-pr-risk-scanner

It does not use AI, does not upload code, and does not require write access.

It flags risky PR areas such as checkout, pricing, auth, API changes, and missing test signals before merge.

Source code:

https://github.com/TerFree70/qa-boutique-pr-risk-scanner

For deeper AI PR review:

https://qaboutique.com/#free-pr-audit

I would be curious how other teams handle this:

Do you treat toast-only Playwright tests as useful smoke checks, or do you require business-outcome assertions for critical flows like checkout, auth, and billing?