Research-backed · AI-era technical hiring

Assess how candidates work with AI — not just what they submit.

FairShot is a process-aware technical assessment platform grounded in peer-reviewed research on behavioral telemetry. It captures 51 signals across coding sessions to distinguish strategic AI use from blind copying.

51

Behavioral signals captured

96.75%

Model accuracy on synthetic benchmark

7

Human–AI collaboration archetypes

HACI

Human–AI Collaboration Index

The future of hiring is not about banning AI — it is about understanding who uses it with judgment, speed, and real problem-solving ability.

The Problem

Traditional evaluation breaks when AI rewrites the process.

Two candidates can submit identical correct code for completely different reasons. Final-output evaluation is blind to the difference.

Correct code proves nothing

AI can generate working solutions in seconds. A polished final result no longer signals understanding, problem-solving, or independent thought.

The question is how, not whether

Did they prompt strategically? Edit AI output critically? Debug intentionally? Or paste the first plausible result without verification?

Reviewers lack evidence

Hiring teams have no visibility into the collaboration process — only a deliverable stripped of every signal that actually mattered.

Research Foundation

Built on peer-reviewed behavioral telemetry research.

FairShot's evaluation model is grounded in a controlled synthetic simulation study that defined 51 signals across five behavioral categories and reached 96.75% classification accuracy on held-out data.

Signal Distribution — 51 total

Code Evolution

10 signals

AI Prompt NLP

12 signals

IDE Interaction

10 signals

Keystroke Dynamics

9 signals

Temporal Workflow

10 signals
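As a quick sanity check on the distribution above, the five families can be written down as a small catalog and summed. This is only an illustrative sketch: the counts come from this page, while the names `SIGNAL_FAMILIES`, `total_signals`, and `family_share` are assumptions made for the example, not FairShot identifiers.

```python
# Signal families and counts as listed on this page.
# The dict and function names are illustrative, not part of FairShot.
SIGNAL_FAMILIES = {
    "Code Evolution": 10,
    "AI Prompt NLP": 12,
    "IDE Interaction": 10,
    "Keystroke Dynamics": 9,
    "Temporal Workflow": 10,
}


def total_signals(families):
    """Sum the per-family signal counts."""
    return sum(families.values())


def family_share(families):
    """Return each family's share of the total, in percent, rounded to 1 decimal."""
    total = total_signals(families)
    return {name: round(100 * count / total, 1) for name, count in families.items()}


print(total_signals(SIGNAL_FAMILIES))
print(family_share(SIGNAL_FAMILIES))
```

AI Prompt NLP is the largest family by count, which is worth keeping in mind when reading per-family ablation results.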

Behavioral Telemetry for Process-Aware Evaluation of AI-Assisted Programming

−20.25 pts

Removing AI Prompt NLP signals drops accuracy by 20.25 percentage points

Ablation studies confirm that how a candidate interacts with AI output is by far the most predictive feature family — more than IDE activity, keystrokes, or code evolution combined.
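A leave-one-group-out ablation of this kind can be sketched roughly as follows. The harness scores the full feature set, removes one signal family at a time, and records the drop. Everything here is hypothetical: the group and feature names, the toy weights, and `toy_score` are placeholders invented for illustration and are not results from the study. In a real run, `score_fn` would retrain and evaluate the classifier on the remaining features.

```python
from typing import Callable, Dict, List


def ablation_drops(
    groups: Dict[str, List[str]],
    score_fn: Callable[[List[str]], float],
) -> Dict[str, float]:
    """Score with all features, then with each group removed, and return the drops."""
    all_features = [f for feats in groups.values() for f in feats]
    baseline = score_fn(all_features)
    drops = {}
    for name, feats in groups.items():
        removed = set(feats)
        kept = [f for f in all_features if f not in removed]
        drops[name] = baseline - score_fn(kept)
    return drops


# Hypothetical groups and arbitrary toy weights, NOT figures from the study.
TOY_GROUPS = {
    "ai_prompt_nlp": ["prompt_a", "prompt_b"],
    "keystroke_dynamics": ["keys_a"],
    "ide_interaction": ["ide_a", "ide_b"],
}
TOY_WEIGHTS = {"prompt_a": 40, "prompt_b": 30, "keys_a": 5, "ide_a": 5, "ide_b": 20}


def toy_score(features: List[str]) -> float:
    """Stand-in for 'train a model on these features and return its accuracy'."""
    return float(sum(TOY_WEIGHTS[f] for f in features))


drops = ablation_drops(TOY_GROUPS, toy_score)
most_predictive = max(drops, key=drops.get)
print(drops, most_predictive)
```

The group with the largest drop is the one the model depends on most. That is the sense in which the AI Prompt family is described as most predictive here.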

0.0872

Silhouette score reveals a behavioral spectrum

Unsupervised clustering yields a silhouette score close to zero, which means the clusters overlap heavily: collaboration styles don't form neat boxes, they sit on a continuum. HACI captures this gradient more faithfully than any binary label.
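For readers unfamiliar with the metric, the silhouette coefficient of a point is (b − a) / max(a, b), where a is its mean distance to its own cluster and b is its mean distance to the nearest other cluster. Scores range from −1 to 1, and values near 0 indicate overlapping clusters. The sketch below is a minimal pure-Python version, assuming Euclidean distance and toy 2-D points. It is not the study's clustering code.

```python
import math
from typing import Sequence


def silhouette_score(points: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # common convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data: the same four points with a clean labeling and a scrambled one.
POINTS = [(0, 0), (0, 1), (10, 0), (10, 1)]
tight = silhouette_score(POINTS, [0, 0, 1, 1])  # well-separated clusters
mixed = silhouette_score(POINTS, [0, 1, 0, 1])  # labels cut across the groups
print(tight, mixed)
```

A clean labeling scores close to 1, while a labeling that cuts across the natural groups scores far lower. A value like 0.0872 sits near the overlapping end of that range.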

XGBoost

96.75% held-out accuracy, robust 5-fold CV

A StandardScaler + XGBoost pipeline recovers the intended synthetic archetype labels. Random Forest follows closely at 95.30%, and both clearly outperform SVM.

Collaboration Archetypes

Seven patterns of human–AI collaboration.

FairShot maps every session to one of seven research-defined archetypes — from independent problem-solvers who barely touch AI, to blind copiers who paste without review.

🧠

Independent Solver

22.9%

🤝

Structured Collaborator

21.3%

⚙️

Prompt Engineer Solver

15.4%

🔄

Iterative Debugger

14.9%

🤖

AI-Dependent Constructor

15.2%

🔍

Exploratory Learner

5.7%

📋

Blind Copier

4.6%

What the Research Found

Process beats output, every time.

SHAP feature importance analysis reveals a clear hierarchy: how a candidate interacts with AI output predicts collaboration style far better than raw activity counts.

SHAP Feature Importance

Feature · Signal type · Importance
AI Output Edit Distance · process · 0.120
Prompt Refinement Count · process · 0.102
Max Paste Length · process · 0.089
Compile Events · process · 0.085
Avg Prompt Length · process · 0.078
Total Keystrokes · count only · 0.031
Files Opened · count only · 0.018
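The gap between process signals and raw counts can be made concrete by averaging the importances listed above per signal type. This is a small illustrative sketch: the values are copied from the list above, and the helper names `mean_importance_by_kind` and `rank_features` are assumptions made for the example.

```python
from statistics import mean

# (feature, signal type, importance) rows copied from the list above.
SHAP_ROWS = [
    ("AI Output Edit Distance", "process", 0.120),
    ("Prompt Refinement Count", "process", 0.102),
    ("Max Paste Length", "process", 0.089),
    ("Compile Events", "process", 0.085),
    ("Avg Prompt Length", "process", 0.078),
    ("Total Keystrokes", "count only", 0.031),
    ("Files Opened", "count only", 0.018),
]


def mean_importance_by_kind(rows):
    """Average the importance values within each signal type."""
    by_kind = {}
    for _feature, kind, value in rows:
        by_kind.setdefault(kind, []).append(value)
    return {kind: mean(values) for kind, values in by_kind.items()}


def rank_features(rows):
    """Return feature names ordered from most to least important."""
    return [feature for feature, _kind, _value in sorted(rows, key=lambda r: r[2], reverse=True)]


print(mean_importance_by_kind(SHAP_ROWS))
print(rank_features(SHAP_ROWS)[:3])
```

Among the seven features shown, the process signals average roughly four times the importance of the count-only signals.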

20.25 pts

accuracy drop when AI Prompt NLP signals are removed

Accuracy falls from 96.75% to 76.50%, the single most dramatic finding from the ablation study. No other signal group comes close. Removing code evolution features actually increased accuracy slightly, suggesting partial redundancy.

Source: Ablation Study, Fig. 5
"AI interaction features are the most important feature family to consider."

How It Works

Three steps to process-aware evaluation.

FairShot is built for controlled pilot assessments — with telemetry capture, session evidence, and reviewer-facing analysis.

Run a managed assessment

Candidates complete a technical task in a structured environment designed for AI-era workflows — not an AI-free fiction that bears no resemblance to real work.

Capture behavioral evidence

FairShot collects 51 session-level telemetry signals spanning IDE interaction, prompt patterns, code evolution, keystrokes, and temporal flow — preserving the path, not just the destination.

Support reviewer decisions

Reviewers get integrity-aware summaries, archetype classification, a HACI score, and evidence packs that make collaboration style visible — keeping humans in the loop at every step.

Who This Is For

Pilot-ready for the teams that need it most.

Best suited today for forward-thinking partners who want better signal than traditional coding tests can provide.

Startups hiring engineers

Teams that want to evaluate tool-augmented performance in context — not memorized LeetCode solutions delivered under artificial constraints.

Bootcamps & training programs

Programs that need honest evidence of how learners use AI to solve problems — not just whether they submit something that runs.

Universities & placement cells

Academic settings exploring fair AI-era technical evaluation for emerging developers entering a workforce that already runs on AI assistance.

Open Pilot

Join a small, curated pilot group.

FairShot is currently best suited for controlled pilots with hiring teams, bootcamps, or university partners. If you want to test AI-era technical evaluation with real reviewer workflows, let's talk.

Current stage

Pilot-ready for controlled deployments. Research-backed and hypothesis-validated. Not yet offered as broad self-serve enterprise software.

By submitting, you're joining a curated waitlist. No spam — just a direct conversation about whether FairShot is a fit for your team.