Tests Handbook

Comprehensive guide to KAP automated testing suites

Author

Kav AI

Published

April 5, 2026

1 Overview

The Kav AI Platform (KAP) uses a multi-layered testing strategy to ensure reliability across the Active Physical Intelligence™ ecosystem. This handbook documents the available test suites, what each one is for, and the official Test Registry used for requirement traceability.

1.1 🏗️ SDRT² v3.2 Framework

KAP follows the SDRT² (Structured Design, Requirements, and Traceability) methodology. Every functional test must be registered and linked to a project requirement defined in the .plan folder.

Test Registry: Located at tests.yaml.
Requirements Library: Located at requirements.yaml.

1.1.1 Traceability Chain

Requirement (FR-AI-001) → Test Registry (TEST-AI-001) → Implementation (test_*.py)
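The chain can be checked mechanically. The following is a minimal sketch, assuming the two YAML files have already been parsed into plain dictionaries. The field names `req_ids` and `implementation`, the helper `find_broken_links`, the file names, and the IDs `TEST-AI-999` and `FR-AI-404` are hypothetical and only illustrate the idea; the real schemas of tests.yaml and requirements.yaml may differ.

```python
from typing import Dict, List

# Hypothetical, already-parsed contents of requirements.yaml and tests.yaml.
# Only FR-AI-001 and TEST-AI-001 come from this handbook; the rest is made up.
REQUIREMENTS: Dict[str, str] = {
    "FR-AI-001": "Example requirement text",
}
TEST_REGISTRY: Dict[str, dict] = {
    "TEST-AI-001": {"req_ids": ["FR-AI-001"], "implementation": "test_gateway.py"},
    "TEST-AI-999": {"req_ids": ["FR-AI-404"], "implementation": "test_orphan.py"},
}


def find_broken_links(requirements: Dict[str, str],
                      registry: Dict[str, dict]) -> List[str]:
    """Return one message per broken link in the traceability chain."""
    problems = []
    for test_id, entry in sorted(registry.items()):
        req_ids = entry.get("req_ids", [])
        if not req_ids:
            problems.append(f"{test_id}: no requirement linked")
        for req_id in req_ids:
            if req_id not in requirements:
                problems.append(f"{test_id}: unknown requirement {req_id}")
        if not entry.get("implementation"):
            problems.append(f"{test_id}: no implementation file")
    return problems


if __name__ == "__main__":
    for line in find_broken_links(REQUIREMENTS, TEST_REGISTRY):
        print(line)
```

A check like this could run before the test suites, so that a registry entry pointing at a missing requirement is caught early.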

1.2 AI Backend Suite (`ai/tests`)

The backend test suite runs on pytest and is organized into three execution tiers.

1.2.1 Test Tiers

| Tier | Name | Description | Command |
| --- | --- | --- | --- |
| Tier 1 | Unit | Fast, isolated tests with no external dependencies (mocked). | `pixi run -e crewai universal-tests-tier1` |
| Tier 2 | Integration | Validates multi-step pipeline evaluations and tool interactions. | `pixi run -e crewai universal-tests-tier2` |
| Tier 3 | E2E / Golden | Full evaluation against golden datasets using LLM-as-a-judge. | `pixi run -e crewai universal-tests-tier3` |
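One way to use the tiers locally is fail-fast: run the cheapest tier first and continue only if it passes. The sketch below assumes that higher tiers are slower and that this ordering is wanted; the handbook does not prescribe it. The helpers `tier_command` and `run_up_to` are hypothetical, while the task names are copied from the table above.

```python
import subprocess
from typing import List

# Tier number -> pixi task name, copied from the tier table.
TIER_TASKS = {
    1: "universal-tests-tier1",
    2: "universal-tests-tier2",
    3: "universal-tests-tier3",
}


def tier_command(tier: int) -> List[str]:
    """Build the argv list for one tier in the crewai pixi environment."""
    if tier not in TIER_TASKS:
        raise ValueError(f"unknown tier: {tier}")
    return ["pixi", "run", "-e", "crewai", TIER_TASKS[tier]]


def run_up_to(max_tier: int) -> int:
    """Run tiers 1..max_tier in order, stopping at the first failure.

    Returns the exit code of the last command that ran (0 if none ran).
    """
    code = 0
    for tier in range(1, max_tier + 1):
        code = subprocess.run(tier_command(tier)).returncode
        if code != 0:
            break
    return code
```

For example, `run_up_to(2)` would run Tier 1 and then Tier 2, skipping Tier 2 if Tier 1 fails.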

1.2.2 The `registry` Marker

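To maintain traceability, functional Python tests should carry the registry marker, which is defined in pyproject.toml. The marker records the test's registry ID, its feature set, and the requirement IDs it verifies.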

import pytest

# Links this test to its Test Registry entry and the requirement(s) it verifies.
@pytest.mark.registry(
    id="TEST-AI-001",
    feature_set="F-AI-01",
    req_ids=["FR-AI-002"],
)
def test_gateway_routing():
    # Test logic here...
    pass
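Marker metadata can also be read back at collection time, for example to reject incomplete registrations. The following is a sketch of a possible conftest.py hook, not the project's actual configuration. It assumes that `id`, `feature_set`, and `req_ids` are all required, which is inferred only from the example above. The helpers `registry_metadata` and `missing_registry_keys` are hypothetical; `get_closest_marker` and `pytest_collection_modifyitems` are standard pytest APIs.

```python
from typing import Optional

# Assumed required keys, inferred from the marker example in this handbook.
REQUIRED_KEYS = ("id", "feature_set", "req_ids")


def registry_metadata(item) -> Optional[dict]:
    """Return the registry marker's keyword arguments, or None if unmarked."""
    marker = item.get_closest_marker("registry")
    return dict(marker.kwargs) if marker is not None else None


def missing_registry_keys(item) -> list:
    """List the required keys absent from an item's registry marker."""
    meta = registry_metadata(item)
    if meta is None:
        return []  # unmarked tests are not checked in this sketch
    return [key for key in REQUIRED_KEYS if key not in meta]


def pytest_collection_modifyitems(config, items):
    """pytest hook: abort the run when a registry marker is incomplete."""
    errors = []
    for item in items:
        missing = missing_registry_keys(item)
        if missing:
            errors.append(f"{item.nodeid}: registry marker missing {missing}")
    if errors:
        # A plain exception keeps the sketch dependency-free; a real
        # conftest.py might raise pytest.UsageError for cleaner output.
        raise RuntimeError("\n".join(errors))
```

The same `registry_metadata` helper could feed a report that compares collected tests against the entries in the Test Registry.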

1.2.3 Key Functional Suites (Registry Index)

| Registry ID | Name | Type | Coverage |
| --- | --- | --- | --- |
| `TEST-VIS-001` | 3D Viewer - CesiumJS | Unit | Globe, terrain, imagery |
| `TEST-AUTH-002` | Auth - RLS Enforcement | Integration | Cross-org isolation |
| `TEST-AI-001` | Backend - System Routing | Unit | Gateway, system selection |
| `TEST-INT-001` | IOW Classification | Unit | API 584 Levels & Zones |
| `TEST-INT-006` | Integrity Pipeline | E2E | Full DMR → Risk flow |

1.3 Frontend Web Suite (`web/tests`)

The web application uses Playwright for E2E tests and Vitest for API and component tests.

1.3.1 Test Categories

| Category | Tool | Description | Command |
| --- | --- | --- | --- |
| E2E | Playwright | Full user flow validation in a headless browser. | `npm run test:e2e` |
| API | Vitest | Integration tests for frontend-to-backend communication. | `npm run test:api` |
| Components | Vitest | Unit tests for React components and hooks. | `npm run test` |

1.3.2 Critical E2E Scenarios

Image Loading Validation (e2e/image-viewer-validation.spec.ts): Ensures images load correctly before bounding boxes are rendered.
Wacker Asset Viewer: Validates asset-specific visualization logic.
Auth SSR: Verifies server-side rendering with authentication.

1.4 Deployment & CI/CD

Tests are automatically executed in the following environments:

GitHub Actions: Runs Tier 1 & 2 tests on every Pull Request.
Google Cloud Build: Executes full E2E suites during deployment to the dev environment.

1.5 Contributing New Tests

When adding new features, please follow these guidelines:

Unit First: Every tool/module should have a corresponding Tier 1 unit test.
Stable Selectors: For web tests, select elements by data-testid attributes instead of CSS classes, which tend to change with styling.
Mocking: Use the provided mocks directory in both ai and web to keep tests deterministic.
Documentation: Add a docstring explaining the test scenario and expected outcome.
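The guidelines on unit tests, mocking, and docstrings can be combined in a single test. The following is a minimal sketch, not code from the repository: `route_request` and its `classifier` argument are hypothetical stand-ins for a real module. It uses `unittest.mock.Mock` inline for brevity; a real test would take its test doubles from the provided mocks directory.

```python
from unittest.mock import Mock


def route_request(query: str, classifier) -> str:
    """Hypothetical unit under test: pick a system name for a query."""
    label = classifier.classify(query)
    return "integrity" if label == "inspection" else "general"


def test_route_request_uses_classifier_label():
    """Scenario: the classifier labels a query as an inspection request.

    Expected outcome: the request is routed to the "integrity" system and
    the classifier is called exactly once with the original query.
    """
    classifier = Mock()
    classifier.classify.return_value = "inspection"  # deterministic stand-in

    assert route_request("check corrosion on line 4", classifier) == "integrity"
    classifier.classify.assert_called_once_with("check corrosion on line 4")
```

Because the classifier is mocked, the test has no external dependencies and fits Tier 1; the docstring states both the scenario and the expected outcome.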

1.6 Command Reference (Cheatsheet)

1.6.1 Backend (AI)

# Run all tests
pixi run pytest

# Run only Orion tests
pixi run -e adk test-orion

# Run question bank validation
pixi run test-question-bank

1.6.2 Frontend (Web)

# Run Playwright UI mode
npx playwright test --ui

# Run specific E2E test
npx playwright test tests/e2e/chat-hardening.spec.ts

Last Updated: 2026-04-05