KT-22
Research-grade data products for frontier labs
Off-the-shelf datasets + research and expert network

Human data infrastructure for frontier model training and evaluation.

KT-22 gives frontier labs and model teams access to high-signal datasets, in-house research workflows, and expert data operations for model training, evaluation, and capability development.

What KT-22 offers
Production-ready datasets, standardized formats, and reliable infrastructure.
Off-the-Shelf Datasets
Ready-to-evaluate datasets for labs and model teams that want to move fast on benchmark coverage, capability validation, and data acquisition.
In-House Research Capability
Senior technical operators design dataset structure, validation protocols, benchmark alignment, and release packaging before delivery.
External Expert Network
A curated network of engineers, researchers, annotators, and domain specialists who support difficult collection, review, red-teaming, and QA workflows.
Featured datasets

High-value datasets for frontier labs.

Designed around difficult, deployment-relevant workflows and packaged to be easy to test, trust, and adopt.

GUI Trajectories

High-quality interaction traces for desktop and web workflows, built for model teams working on perception, planning, and execution across real interfaces.

Multi-step UI tasks
Action + state trajectories
Training and evaluation ready
Terminal Bench + SWE Environments

Task suites and execution environments for coding models operating in terminals and real software workflows.

Terminal-first workflows
Bug fixing and repo tasks
Useful for training, benchmarking, and QA
MCP Tool Use Datasets

Structured examples of models using tools, APIs, and MCP-style workflows with clear task framing and grounded outcome validation.

Tool selection and sequencing
Realistic task decomposition
Grounded success criteria
Research + expert network

Hard-core research in-house. Expert network on hand.

KT-22 combines internal research capability with an external expert network that can generate, validate, and benchmark difficult data products across coding, tools, search, and scientific workflows.

In-house capability
Senior researchers and technical operators who design dataset structure, validation protocols, benchmark alignment, and release packaging.
External expert network
Engineers, researchers, annotators, and domain specialists who support data creation, review, red-teaming, rubricing, and scalable QA.
Why it matters
The best data products are not just collected. They are researched, validated internally, and packaged so that downstream model improvements are easier to trust.
Engagement model
Off-the-shelf dataset packages, benchmark-backed pilots, or custom data programs with pre-training and benchmarking support.

From off-the-shelf datasets to custom frontier data programs.

KT-22 is not just a dataset broker. We build and package hard-to-produce human data products with internal research rigor, external expert capacity, and pilot-ready validation.

Get in touch
Interested in datasets, pilots, or custom collection?
Let's talk about your training, evaluation, or data program needs.
hello@kt22.ai