Human data infrastructure for frontier model training and evaluation.

KT-22 provides frontier labs and model teams with high-signal datasets, in-house research workflows, and expert data operations for model training, evaluation, and capability development.

What KT-22 offers

Production-ready datasets, standardized formats, and reliable infrastructure.

Off-the-Shelf Datasets

Ready-to-evaluate datasets for labs and model teams that want to move fast on benchmark coverage, capability validation, and data acquisition.

In-House Research Capability

Senior technical operators design dataset structure, validation protocols, benchmark alignment, and release packaging before delivery.

External Expert Network

A curated network of engineers, researchers, annotators, and domain specialists who support difficult collection, review, red-teaming, and QA workflows.

Featured datasets

High-value datasets for frontier labs.

Designed around difficult, deployment-relevant workflows and packaged to be easy to test, trust, and adopt.

GUI Trajectories

High-quality interaction traces for desktop and web workflows, built for model teams working on perception, planning, and execution across real interfaces.

Multi-step UI tasks

Action + state trajectories

Training and evaluation ready

Terminal Bench + SWE Environments

Task suites and execution environments for coding models operating in terminals and real software workflows.

Terminal-first workflows

Bug fixing and repo tasks

Useful for training, benchmarking, and QA

MCP Tool Use Datasets

Structured examples of models using tools, APIs, and MCP-style workflows with clear task framing and grounded outcome validation.

Tool selection and sequencing

Realistic task decomposition

Grounded success criteria

Research + expert network

Rigorous research in-house. Expert network on hand.

KT-22 combines internal research capability with an external expert network that can generate, validate, and benchmark difficult data products across coding, tools, search, and scientific workflows.

In-house capability

Senior researchers and technical operators who design dataset structure, validation protocols, benchmark alignment, and release packaging.

External expert network

Engineers, researchers, annotators, and domain specialists who support data creation, review, red-teaming, rubricing, and scalable QA.

Why it matters

The best data products are not just collected. They are researched, validated internally, and packaged in a way that makes downstream model improvement easier to trust.

Engagement model

Off-the-shelf dataset packages, benchmark-backed pilots, or custom data programs with pre-training and benchmarking support.

From off-the-shelf datasets to custom frontier data programs.

KT-22 is not just a dataset broker. We build and package hard-to-produce human data products with internal research rigor, external expert capacity, and pilot-ready validation.

Get in touch

Interested in datasets, pilots, or custom collection?

Let's talk about your training, evaluation, or data program needs.

hello@kt22.ai