18+ BDD Test Case Types Explained for QA Teams

Behavior-Driven Development sits between product intent and executable tests. You write what the system should do in plain language. The same text drives conversation, documentation, and automation.

This post walks through what BDD actually is, then maps the 18+ test case types QA Lab AI generates from acceptance criteria, URLs, screenshots, or OpenAPI specs. Each group gets a Gherkin example you can drop into Cucumber, SpecFlow, Behave, or Playwright with a Gherkin runner.

What BDD is, in one paragraph

BDD describes behavior in Given/When/Then steps. Given sets state. When performs an action. Then asserts an outcome. Scenarios live in .feature files. Step definitions bind those steps to code in Playwright, Cypress, Selenium, or WebdriverIO. The format is language-agnostic, which is why BDD test generation scales across stacks.
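Under the hood, a step-definition layer is just a mapping from step text to functions. A minimal sketch in Python makes the binding concrete — this is a toy registry for illustration, not Behave's or Cucumber's actual API, and the cart/coupon steps are invented:

```python
import re

# Toy step registry: maps step patterns to handler functions.
# Real frameworks (Behave, Cucumber) do the same binding with
# richer matching; this is a sketch of the idea only.
STEPS = []

def step(pattern):
    def register(fn):
        STEPS.append((re.compile(pattern), fn))
        return fn
    return register

@step(r'a cart with one item priced (?P<price>[\d.]+)')
def given_cart(ctx, price):
    ctx["cart_total"] = float(price)

@step(r'the shopper applies coupon "(?P<code>[^"]+)"')
def when_apply(ctx, code):
    # Hypothetical rule: VALID10 grants 10% off.
    ctx["discount"] = 0.10 if code == "VALID10" else 0.0

def run_step(ctx, text):
    for pattern, fn in STEPS:
        m = pattern.search(text)
        if m:
            fn(ctx, **m.groupdict())
            return
    raise LookupError(f"No step definition matches: {text}")

ctx = {}
run_step(ctx, 'Given a cart with one item priced 50.00')
run_step(ctx, 'When the shopper applies coupon "VALID10"')
print(round(ctx["cart_total"] * (1 - ctx["discount"]), 2))  # 45.0
```

Because the binding is text-to-function, the same .feature file can drive any runtime that implements the steps.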

The trap is volume. A serious feature has happy paths, edge cases, error states, accessibility checks, performance budgets, and security scenarios. Writing all of that by hand is where coverage dies.

How QA Lab AI groups the 18+ types

QA Lab AI takes one input — acceptance criteria, a URL, a screenshot, or an OpenAPI spec — and produces a .feature file covering 18+ distinct test types. We group them into six families below.

You can browse the full catalog on /test-cases or see how generation fits into a workflow on /ai-testing.

1. Functional and behavioral tests

These are the core paths users actually walk.

  • Happy path
  • Negative path
  • Boundary value
  • Equivalence partitioning
  • Error handling and validation

Boundary tests catch off-by-one and field-length bugs that unit tests miss because the bug lives at the form layer.

Feature: Coupon code at checkout

  Scenario Outline: Coupon length boundaries
    Given a cart with one item priced 50.00
    When the shopper applies coupon "<code>"
    Then the system responds with "<result>"

    Examples:
      | code         | result                |
      | A            | "Code too short"      |
      | VALID10      | "10% discount applied"|
      | AAAAAAAAAAAA | "Code too long"       |
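The step definition behind that outline reduces to a small validator. A sketch, where the length limits (2 to 11 characters) and messages are assumptions chosen to match the Examples table, not real product rules:

```python
def apply_coupon(code: str) -> str:
    """Validate a coupon code's length boundaries before lookup.
    The 2..11 character limits are illustrative assumptions."""
    MIN_LEN, MAX_LEN = 2, 11
    if len(code) < MIN_LEN:
        return "Code too short"
    if len(code) > MAX_LEN:
        return "Code too long"
    # A real implementation would look the code up here.
    if code == "VALID10":
        return "10% discount applied"
    return "Unknown code"

print(apply_coupon("A"))        # Code too short
print(apply_coupon("VALID10"))  # 10% discount applied
print(apply_coupon("A" * 12))   # Code too long
```

The boundary rows (1 character, 12 characters) sit exactly one off the limits, which is what catches off-by-one bugs in the form layer.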

2. Integration and data-flow tests

Most production bugs are integration bugs. QA Lab AI generates scenarios that cross service or component boundaries.

  • API contract (especially when fed an OpenAPI spec)
  • End-to-end user journey
  • Data-driven (table-backed)
  • State transition

State-transition tests are underused. They prevent the "I clicked back, now the order is in two states at once" class of bug.

Feature: Order state machine

  Scenario: Refund cannot precede capture
    Given an order in state "authorized"
    When the merchant requests a refund
    Then the response is "Cannot refund: payment not captured"
    And the order remains in state "authorized"
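A transition table makes scenarios like this mechanical to check: any (state, event) pair not in the table is rejected and the state is unchanged. A minimal sketch — the states, events, and error wording are illustrative, not the product's actual state machine:

```python
# Allowed transitions: (current_state, event) -> next_state.
# Illustrative subset of an order state machine.
TRANSITIONS = {
    ("authorized", "capture"): "captured",
    ("authorized", "void"): "voided",
    ("captured", "refund"): "refunded",
}

def transition(state: str, event: str) -> str:
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"Cannot {event} from state {state!r}")

# Refund before capture is rejected; the order stays "authorized".
state = "authorized"
try:
    state = transition(state, "refund")
except ValueError as err:
    print(err)
print(state)  # authorized
```

Each row of the table becomes one positive scenario, and each missing pair you care about becomes one negative scenario, so coverage is enumerable rather than guessed.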

3. UI and cross-environment tests

Pulled from a URL or screenshot input, these check that the rendered product matches intent.

  • Cross-browser
  • Responsive and mobile viewport
  • Visual regression cues
  • Locale and i18n

Generation here pairs cleanly with Playwright projects, where each browser is a separate project entry. The .feature file stays the same; the runner fans out.

Feature: Pricing page on small viewports

  Scenario: CTA stays visible on iPhone SE
    Given the viewport is 375 by 667
    When the shopper opens "/pricing"
    Then the "Start free" button is visible without scrolling
    And the page has no horizontal scrollbar

4. Non-functional tests

Behavior is not just "does it work." It is "does it work fast enough, for everyone, reliably."

  • Performance and Lighthouse budget
  • Accessibility (WCAG via axe-core)
  • Load and stress (smoke-level)
  • SEO and metadata

QA Lab AI's audit engine runs Lighthouse, axe-core, OWASP, SEO, broken-link, and cross-browser checks against live sites — see /free-audit for the no-login version. The generated .feature files mirror the same checks so they live in your CI alongside functional suites.

Feature: Accessibility budget for the marketing home

  Scenario: No critical or serious axe violations
    Given the page "/" is loaded in Chromium
    When axe-core runs with WCAG 2.2 AA rules
    Then there are zero "critical" violations
    And there are zero "serious" violations
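In a step definition, those Then assertions reduce to filtering axe-core's violations array by its `impact` field. A sketch over that result shape — the sample violations below are made up, not real audit output:

```python
def count_by_impact(violations, impact):
    """Count axe-core-style violations at a given impact level."""
    return sum(1 for v in violations if v.get("impact") == impact)

# Illustrative axe-core-style results, not a real audit.
violations = [
    {"id": "color-contrast", "impact": "serious"},
    {"id": "image-alt", "impact": "critical"},
    {"id": "region", "impact": "moderate"},
]

print(count_by_impact(violations, "critical"))  # 1 — the budget says this must be 0
print(count_by_impact(violations, "serious"))   # 1 — likewise
```

Asserting on impact levels rather than total count keeps the budget stable while lower-severity findings are triaged separately.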

5. Security tests

Security cases are where untrained generators tend to hallucinate. QA Lab AI scopes them to OWASP categories tied to the input.

  • Authentication and session
  • Authorization (role and tenant)
  • Input sanitization (XSS, SQLi probes)
  • Secrets and headers

Feature: Tenant isolation on report export

  Scenario: User from tenant A cannot read tenant B reports
    Given a session for "alice@tenant-a"
    When she requests "/api/reports/r-tenant-b-001"
    Then the response status is 404
    And the body does not contain "tenant-b"
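The server-side check behind that scenario: scope every lookup by the caller's tenant and return 404 rather than 403, so the existence of another tenant's resource is not leaked. A sketch with an in-memory store — the data and function names are illustrative:

```python
# Illustrative store: report_id -> owning tenant.
REPORTS = {
    "r-tenant-b-001": "tenant-b",
}

def get_report(session_tenant: str, report_id: str):
    """Return (status, body). 404 for missing AND cross-tenant
    reads alike, so tenant A cannot probe tenant B's report IDs."""
    owner = REPORTS.get(report_id)
    if owner is None or owner != session_tenant:
        return 404, {"error": "not found"}
    return 200, {"id": report_id, "tenant": owner}

status, body = get_report("tenant-a", "r-tenant-b-001")
print(status)  # 404
```

The generated scenario asserts both the status and the absence of tenant-b strings in the body, which catches the common bug of a correct status code wrapped around a leaky error message.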

6. Resilience and recovery tests

The scenarios that get skipped in tight sprints, then bite in production.

  • Network failure and retry
  • Offline and reconnect
  • Idempotency
  • Concurrency and race conditions

Feature: Idempotent payment intent

  Scenario: Duplicate submission charges once
    Given the shopper submits a payment with idempotency key "k-1729"
    And the request succeeds with order "o-42"
    When the same request is submitted again within 60 seconds
    Then the response references order "o-42"
    And no second charge is created
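The usual implementation is a result cache keyed by the idempotency key: the first request charges and records its result, replays return that result unchanged. A sketch, with class and field names that are assumptions for illustration:

```python
import threading

class PaymentGateway:
    """Replays the stored result for a repeated idempotency key,
    so duplicate submissions create exactly one charge. Sketch only."""

    def __init__(self):
        self._results = {}          # idempotency_key -> order_id
        self._lock = threading.Lock()
        self._charges = 0

    def submit(self, idempotency_key: str, amount: float) -> str:
        # amount is unused in this sketch; a real gateway would also
        # reject a replay whose payload differs from the original.
        with self._lock:
            if idempotency_key in self._results:
                return self._results[idempotency_key]  # replay, no new charge
            self._charges += 1
            order_id = f"o-{41 + self._charges}"
            self._results[idempotency_key] = order_id
            return order_id

gw = PaymentGateway()
first = gw.submit("k-1729", 50.00)
second = gw.submit("k-1729", 50.00)
print(first == second, gw._charges)  # True 1
```

The lock is what makes the concurrency scenario pass too: two simultaneous submissions with the same key serialize on it, and the loser of the race hits the replay branch.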

That is the 18+. Counted strictly: happy, negative, boundary, equivalence, validation, contract, end-to-end, data-driven, state-transition, cross-browser, responsive, visual, i18n, performance, accessibility, load, SEO, security-auth, security-authz, sanitization, secrets-headers, network-failure, offline-reconnect, idempotency, concurrency. The catalog grows as new audit modules ship.

Why generation, not templates

Templates produce shaped boilerplate. They do not read your acceptance criteria. QA Lab AI parses the input, identifies the entity model, and emits scenarios that reference the actual fields, states, and routes in your product.

Three practical results:

  1. The .feature file references your real field names, not <input>.
  2. Scenario Outlines get realistic Examples tables, not value1, value2.
  3. Tags map to your test pyramid: @smoke, @regression, @a11y, @security.

Tagging is what makes the suite usable in CI. You run @smoke on every push, @regression on merge to main, @a11y and @security on a nightly schedule.
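Tag selection is just set intersection over each scenario's tags; runners expose it through flags such as Behave's and Cucumber's `--tags`. A sketch of the filtering itself, with an invented scenario list:

```python
def select(scenarios, wanted):
    """Pick scenarios carrying at least one of the wanted tags."""
    return [name for name, tags in scenarios if tags & wanted]

# Illustrative suite: (scenario name, tag set).
suite = [
    ("Coupon length boundaries", {"@smoke", "@regression"}),
    ("Refund cannot precede capture", {"@regression"}),
    ("No critical axe violations", {"@a11y"}),
]

print(select(suite, {"@smoke"}))               # ['Coupon length boundaries']
print(select(suite, {"@a11y", "@security"}))   # ['No critical axe violations']
```

Because tags live in the .feature file, the per-push, per-merge, and nightly splits are visible in the same artifact the product manager reads.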

Output formats and where they go

Every generation run can be exported as:

  • A .feature file (Gherkin, ready for Cucumber, SpecFlow, Behave, or Playwright with a Gherkin adapter)
  • JSON (for custom pipelines and dashboards)
  • Excel (for QA leads who report up to non-engineering stakeholders)

Enterprise plans sync the same artifacts into Jira, Zephyr, or Azure DevOps through the Test Repository, so your product manager sees the same scenarios your CI runs. Pricing for both tiers lives on /pricing.

Try it

Generate a .feature file from your own acceptance criteria. The Starter plan is free forever, with 200 cases per run and 5 AI generations per month — enough to evaluate coverage on a real ticket.

Start at /test-cases. Paste a user story, pick the test types you want, and download the Gherkin. If you would rather audit a live URL first, jump to /free-audit — no signup, no card.