Tests Handbook

Comprehensive guide to KAP automated testing suites

Author

Kav AI

Published

April 5, 2026

1 Overview

Kav AI Platform (KAP) uses a multi-layered testing strategy to ensure reliability across the Active Physical Intelligence™ ecosystem. This handbook documents the available test suites, their purposes, and our official Test Registry for requirement traceability.


1.1 🏗️ SDRT² v3.2 Framework

KAP follows the SDRT² (Structured Design, Requirements, and Traceability) methodology. Every functional test must be registered and linked to a project requirement defined in the .plan folder.

1.1.1 Traceability Chain

Requirement (FR-AI-001) → Test Registry (TEST-AI-001) → Implementation (test_*.py)
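A minimal sketch of how the chain could be checked mechanically, assuming the registry is available as a list of records. `RegistryEntry`, `untraced_requirements`, and the file path are hypothetical names used for illustration only; they are not part of KAP.

```python
from dataclasses import dataclass


@dataclass
class RegistryEntry:
    """One Test Registry row linking requirements to a test implementation."""
    test_id: str          # e.g. "TEST-AI-001"
    req_ids: list[str]    # requirements this test covers
    implementation: str   # hypothetical path to the test file


def untraced_requirements(requirements: list[str],
                          registry: list[RegistryEntry]) -> list[str]:
    """Return the requirements that no registry entry links to."""
    covered = {req for entry in registry for req in entry.req_ids}
    return [req for req in requirements if req not in covered]


requirements = ["FR-AI-001", "FR-AI-002"]
registry = [
    RegistryEntry("TEST-AI-001", ["FR-AI-001"], "ai/tests/test_gateway.py"),
]
print(untraced_requirements(requirements, registry))  # ['FR-AI-002']
```

A check like this could run as a pre-merge step so that a requirement without a registered test is reported before it reaches review.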


1.2 1. AI Backend Suite (ai/tests)

The backend test suite runs on pytest and is organized into three execution tiers.

1.2.1 1.1 Test Tiers

| Tier | Name | Description | Command |
|------|------|-------------|---------|
| Tier 1 | Unit | Fast, isolated tests with no external dependencies (mocked). | pixi run -e crewai universal-tests-tier1 |
| Tier 2 | Integration | Validates multi-step pipeline evaluations and tool interactions. | pixi run -e crewai universal-tests-tier2 |
| Tier 3 | E2E / Golden | Full evaluation against golden datasets using LLM-as-a-judge. | pixi run -e crewai universal-tests-tier3 |
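The tiers are naturally run cheapest first. The sketch below, under the assumption that a later tier is only worth running once the earlier ones pass, wraps the three commands from the table. `tier_command` and `run_up_to` are hypothetical helpers, not existing KAP tooling; only the pixi task names come from this handbook.

```python
import subprocess

# Task names copied from the tier table above.
TIER_TASKS = {
    1: "universal-tests-tier1",
    2: "universal-tests-tier2",
    3: "universal-tests-tier3",
}


def tier_command(tier: int) -> list[str]:
    """Build the pixi command line for one tier."""
    if tier not in TIER_TASKS:
        raise ValueError(f"unknown tier: {tier}")
    return ["pixi", "run", "-e", "crewai", TIER_TASKS[tier]]


def run_up_to(tier: int, runner=subprocess.run) -> bool:
    """Run tiers 1..tier in order, stopping at the first failure.

    `runner` is injectable so the helper can be exercised without pixi.
    """
    for current in range(1, tier + 1):
        if runner(tier_command(current)).returncode != 0:
            return False
    return True
```

Calling `run_up_to(2)` would run Tier 1 and then Tier 2, and skip Tier 2 entirely if Tier 1 fails.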

1.2.2 1.2 The registry Marker

To maintain traceability, most Python tests should use the registry marker defined in pyproject.toml.

import pytest


# Link this test to its Test Registry entry and the requirement(s) it covers.
@pytest.mark.registry(
    id="TEST-AI-001",
    feature_set="F-AI-01",
    req_ids=["FR-AI-002"],
)
def test_gateway_routing():
    # Test logic here...
    pass
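Marker metadata is only useful for traceability if it is well formed. The following is a sketch of a `conftest.py` check, not the project's actual enforcement: the ID patterns are assumptions inferred from the examples in this handbook, and `registry_problems` is a hypothetical helper. The hook name `pytest_collection_modifyitems` and `item.get_closest_marker` are standard pytest APIs.

```python
import re

# Assumed ID formats, inferred from the examples in this handbook.
TEST_ID = re.compile(r"^TEST-[A-Z]+-\d{3}$")
REQ_ID = re.compile(r"^FR-[A-Z]+-\d{3}$")


def registry_problems(kwargs: dict) -> list[str]:
    """Return human-readable problems found in a registry marker's kwargs."""
    problems = []
    if not TEST_ID.match(str(kwargs.get("id", ""))):
        problems.append("id must look like TEST-AI-001")
    if not kwargs.get("feature_set"):
        problems.append("feature_set is required")
    req_ids = kwargs.get("req_ids") or []
    if not req_ids:
        problems.append("req_ids must list at least one requirement")
    problems += [f"bad requirement id: {r}" for r in req_ids
                 if not REQ_ID.match(str(r))]
    return problems


def pytest_collection_modifyitems(config, items):
    """conftest.py hook: fail collection if any registry marker is malformed."""
    errors = []
    for item in items:
        marker = item.get_closest_marker("registry")
        if marker is None:
            continue  # unmarked tests are left alone in this sketch
        errors += [f"{item.nodeid}: {p}"
                   for p in registry_problems(marker.kwargs)]
    if errors:
        # A real conftest.py might raise pytest.UsageError instead.
        raise ValueError("\n".join(errors))
```

Keeping the validation in a plain function means it can itself be covered by a Tier 1 unit test, independent of pytest's collection machinery.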

1.2.3 1.3 Key Functional Suites (Registry Index)

| Registry ID | Name | Type | Coverage |
|-------------|------|------|----------|
| TEST-VIS-001 | 3D Viewer - CesiumJS | Unit | Globe, terrain, imagery |
| TEST-AUTH-002 | Auth - RLS Enforcement | Integration | Cross-org isolation |
| TEST-AI-001 | Backend - System Routing | Unit | Gateway, system selection |
| TEST-INT-001 | IOW Classification | Unit | API 584 Levels & Zones |
| TEST-INT-006 | Integrity Pipeline | E2E | Full DMR → Risk flow |

1.3 2. Frontend Web Suite (web/tests)

The web application uses Playwright for E2E tests and Vitest for API and component tests.

1.3.1 2.1 Test Categories

| Category | Tool | Description | Command |
|----------|------|-------------|---------|
| E2E | Playwright | Full user flow validation in a headless browser. | npm run test:e2e |
| API | Vitest | Integration tests for frontend-to-backend communication. | npm run test:api |
| Components | Vitest | Unit tests for React components and hooks. | npm run test |

1.3.2 2.2 Critical E2E Scenarios

  • Image Loading Validation (e2e/image-viewer-validation.spec.ts): Ensures images load correctly before bounding boxes are rendered.
  • Wacker Asset Viewer: Validates asset-specific visualization logic.
  • Auth SSR: Verifies server-side rendering with authentication.

1.4 3. Deployment & CI/CD

Tests are automatically executed in the following environments:

  • GitHub Actions: Runs Tier 1 & 2 tests on every Pull Request.
  • Google Cloud Build: Executes full E2E suites during deployment to the dev environment.

1.5 4. Contributing New Tests

When adding new features, please follow these guidelines:

  1. Unit First: Every tool/module should have a corresponding Tier 1 unit test.
  2. Stable Selectors: For web tests, use data-testid instead of CSS classes.
  3. Mocking: Use the provided mocks directory in both ai and web to keep tests deterministic.
  4. Documentation: Add a docstring explaining the test scenario and expected outcome.
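As an illustration of guidelines 1, 3, and 4 together, here is a sketch of a Tier 1 test that mocks its only dependency and documents the scenario in a docstring. `select_system` and its classifier are hypothetical stand-ins, not KAP code, and the mock comes from the standard library's `unittest.mock` instead of the project's mocks directory.

```python
from unittest.mock import Mock


def select_system(query: str, classifier) -> str:
    """Hypothetical routing function: pick the system that handles a query."""
    label = classifier.classify(query)
    return label if label in {"integrity", "vision"} else "default"


def test_select_system_falls_back_to_default():
    """Scenario: the classifier returns a label no system is registered for.

    Expected outcome: routing falls back to the "default" system, and the
    classifier is called exactly once with the original query.
    """
    classifier = Mock()
    classifier.classify.return_value = "unknown-label"

    assert select_system("inspect pump P-101", classifier) == "default"
    classifier.classify.assert_called_once_with("inspect pump P-101")
```

Because the classifier is mocked, the test makes no network or model calls and gives the same result on every run.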

1.6 5. Command Reference (Cheatsheet)

1.6.1 Backend (AI)

# Run all tests
pixi run pytest

# Run only Orion tests
pixi run -e adk test-orion

# Run question bank validation
pixi run test-question-bank

1.6.2 Frontend (Web)

# Run Playwright UI mode
npx playwright test --ui

# Run specific E2E test
npx playwright test tests/e2e/chat-hardening.spec.ts

Last Updated: 2026-04-05