Research Initiative
Building models that reason deeply within a narrow vertical. Not a frontier model competitor—a specialized system that understands hiring better than generic AI, costs less to run, and works inside product workflows.
Putting candidate info into a frontier model gets you summaries. Getting true hiring intelligence requires something different.
Frontier models excel at general reasoning, but lack deep hiring domain knowledge. They produce surface-level summaries without understanding role requirements or competency frameworks.
Frontier models are expensive to run repeatedly. Hiring decisions need to be made many times—on many candidates, across many roles. Generic AI doesn't scale economically.
Hiring is legally sensitive. You need evidence traceability, explainable reasoning, and comparable outputs—not opaque neural black boxes.
Instead of competing with frontier models, build specialized systems that work better inside one domain. Below, we outline a 5-phase proof of concept for an interview-specific AI.
The Evidence-to-Intelligence Pipeline
Qualitative Evidence Ingestion
Interview notes, resumes, scorecards, interviewer comments
Evidence Normalization
Convert messy inputs → structured evidence objects with competency, direction, strength, source
Rubric-Aware Candidate State
Merge evidence with org/role-specific rubrics (company standards, role requirements)
Structured Decision Intelligence
Output: comparable candidate views, explicit reasoning, evidence traceability, recommendation + uncertainty
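As a concrete sketch of the normalization step, an evidence object can be modeled as a small typed record. The field names below mirror the pipeline description (competency, direction, strength, source), but the exact schema and the keyword heuristic are assumptions for illustration, not Strata's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    competency: str   # e.g. "problem solving"
    direction: str    # "positive" or "negative" signal
    strength: float   # 0.0 (weak) to 1.0 (strong)
    source: str       # where the signal came from

def normalize(raw_note: str, source: str) -> Evidence:
    """Toy normalizer: map a messy interviewer note to a structured
    evidence object. A real system would use the tuned model here."""
    negative_markers = ("struggled", "failed", "unclear", "weak")
    direction = "negative" if any(m in raw_note.lower() for m in negative_markers) else "positive"
    return Evidence(
        competency="problem solving",  # hypothetical: a real model infers this
        direction=direction,
        strength=0.5,                  # placeholder strength
        source=source,
    )

ev = normalize("Candidate struggled with the scaling question", "interview notes")
```

The point of the structured record is that downstream stages (rubric merge, comparison, traceability) consume fields, not free text.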
Strata is built on a methodical, measurable approach: five phases to prove the model works, then expand to broader capabilities.
Phase 1: Prove that a compact model can be adapted cheaply for hiring intelligence using QLoRA fine-tuning, structured output schemas, and domain datasets.
# Fine-tune with QLoRA: load the base model quantized to 4-bit, then attach LoRA adapters
model = AutoModelForCausalLM.from_pretrained(base_model, quantization_config=bnb_config)
model = get_peft_model(model, peft_config)
# Structured output schema
output = {"competencies": [...], "strengths": [...], "risks": [...]}
Phase 2: Make the model meaningfully better at hiring reasoning. Clean data, remove ambiguous categories, expand edge cases, measure where the model misjudges role context.
Success: Tuned model significantly outperforms base on held-out hiring examples, with fewer confused labels and better role awareness.
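The Phase 2 success criterion can be made measurable with a small held-out evaluation harness. The labels below are made-up toy data to show the comparison, not real results.

```python
def accuracy(predictions, gold):
    """Fraction of held-out hiring examples where the predicted
    label matches the gold label."""
    assert len(predictions) == len(gold)
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)

# Illustrative (made-up) labels on a tiny held-out set:
gold = ["strong", "weak", "strong", "mixed"]
base_preds = ["strong", "strong", "strong", "strong"]   # base model over-predicts "strong"
tuned_preds = ["strong", "weak", "strong", "strong"]    # tuned model fixes some confusions

base_acc = accuracy(base_preds, gold)
tuned_acc = accuracy(tuned_preds, gold)
```

In practice the harness would also break accuracy down per competency and per role to locate the "confused labels" the success criterion names.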
Phase 3: Lock the schema and make outputs production-grade. Add confidence levels, evidence traceability, and explicit reasoning that app logic can consume reliably.
# Structured, machine-consumable output
{
  "competencies": [{ "name": "problem solving", "confidence": 0.87, "evidence": [...] }],
  "risks": [{ "type": "communication", "severity": "medium" }],
  "recommendation": "Strong fit",
  "uncertainty": 0.12
}
Phase 4: Let the system use org-specific rubrics, role requirements, and company preferences without retraining. One core model, multiple company contexts.
Success: Same model adapts to different company standards. Retrieved context improves judgment on real role scenarios. Company knowledge lives outside model weights.
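One way to keep company knowledge outside model weights is plain retrieval: store org rubrics as text, fetch the most relevant entries for a role, and prepend them to the model prompt. The word-overlap scorer below is a toy stand-in for a real retriever, and the rubric texts are hypothetical.

```python
def retrieve_rubrics(role: str, rubrics: dict, top_k: int = 1) -> list:
    """Rank rubric entries by word overlap with the role description."""
    role_words = set(role.lower().split())
    scored = sorted(
        rubrics.items(),
        key=lambda kv: len(role_words & set(kv[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:top_k]]

# Hypothetical company rubrics, stored outside the model:
rubrics = {
    "backend": "backend engineer: weight system design and scaling heavily",
    "sales": "sales role: weight communication and persuasion heavily",
}
context = retrieve_rubrics("senior backend engineer", rubrics)
prompt = "\n".join(context) + "\nEvaluate the following evidence: ..."
```

Swapping the rubric store per company changes the system's judgment without touching the weights, which is the point of this phase.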
Phase 5: Make it cheap enough for production. Benchmark latency, throughput, quantization, and local/private runtimes, and establish a viable cost per decision-support action.
Success: Clear cost/quality tradeoff for production. Viable local or server deployment. Cheap enough to run repeatedly for every candidate.
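The cost target reduces to simple arithmetic: tokens per decision times price per token, compared across deployment options. The prices below are placeholder assumptions for illustration, not quotes for any real provider or GPU.

```python
def cost_per_decision(prompt_tokens: int, output_tokens: int,
                      price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Dollar cost of one decision-support action."""
    return (prompt_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k

# Placeholder numbers: a compact self-hosted model vs. a generic frontier API
compact = cost_per_decision(2000, 500, 0.0001, 0.0002)   # assumed amortized GPU cost
frontier = cost_per_decision(2000, 500, 0.01, 0.03)      # assumed API pricing

# 50 candidates x 5 analyses each:
compact_total = compact * 50 * 5
frontier_total = frontier * 50 * 5
```

Under these assumed prices the gap is two orders of magnitude, which is what makes "run repeatedly for every candidate" plausible for the compact model.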
Decision support, not autonomous decisions. The system informs; humans decide.
JSON schemas, confidence levels, and explicit reasoning—not vague narrative text.
Every claim ties back to source evidence. Full traceability for legal and compliance requirements.
Candidate state is persistent, versioned, and queryable—not hidden in chat history.
Deep specialization in hiring, not generic model usage. Workflows bound what the system can do.
Economically viable to run for many candidates, many times, repeatedly.
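The persistence principle above can be sketched as an append-only, versioned candidate store: every update creates a new version rather than overwriting, so past states stay queryable. Purely illustrative; the real store would be a database, not an in-memory list.

```python
class CandidateState:
    """Append-only store: each update produces a new immutable version."""

    def __init__(self):
        self._versions = []

    def update(self, evidence: dict) -> int:
        """Merge new evidence into a copy of the latest state; return version number."""
        base = dict(self._versions[-1]) if self._versions else {}
        base.update(evidence)
        self._versions.append(base)
        return len(self._versions) - 1

    def at(self, version: int) -> dict:
        return dict(self._versions[version])

    def latest(self) -> dict:
        return self.at(len(self._versions) - 1)

state = CandidateState()
v0 = state.update({"problem solving": "strong"})
v1 = state.update({"communication": "mixed"})
```

Because no version is ever mutated, any past recommendation can be replayed against the state it was made from, which supports the traceability requirement above.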
Once the vertical model is proven, Strata can expand in multiple directions.
Build richer candidate state models that track competencies over time, aggregate evidence across multiple interviews, and surface contradictions or missing signals.
Systems that support head-to-head candidate comparison and reviewer alignment—not just individual analysis.
Models that identify gaps in candidate evidence and suggest what to ask next—turning the system into an interviewing co-pilot.
Apply the same methodology to other decision-heavy domains: design evaluation, customer fit, partnership assessment, and beyond.
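Surfacing contradictions across interviews, as described above, can start as a check for opposite-direction signals on the same competency. The evidence format here is the same assumed competency/direction/source record used in the pipeline sketch.

```python
def find_contradictions(evidence: list) -> list:
    """Competencies that received both positive and negative signals."""
    directions = {}
    for ev in evidence:
        directions.setdefault(ev["competency"], set()).add(ev["direction"])
    return sorted(c for c, d in directions.items() if {"positive", "negative"} <= d)

evidence = [
    {"competency": "communication", "direction": "positive", "source": "interview 1"},
    {"competency": "communication", "direction": "negative", "source": "interview 2"},
    {"competency": "problem solving", "direction": "positive", "source": "interview 1"},
]
conflicts = find_contradictions(evidence)
```

A flagged competency is exactly the "what to ask next" signal for the interviewing co-pilot direction.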
Strata research powers Debrev Interview, proving the thesis in production.
Debrev Interview is where Strata's domain specialization meets real hiring teams. It takes the structured outputs from our models and integrates them into product workflows—transcription analysis, diarization, candidate comparison, team alignment.
Every interview analyzed, every candidate compared, every team alignment achieved proves that specialized models can outperform generic AI on the narrow tasks that matter most.
Explore Debrev Interview
Evidence Input: Interview audio, notes, resumes, scorecards
Strata Processing: Transcription → normalization → rubric-aware analysis
Structured Output: Candidate insights, competency scores, team comparison
Product Integration: Dashboard, AI chat, pool analysis, offer decisions
Strata is an ongoing research initiative into specialized AI. Stay tuned for updates on the phases, new findings, and expansions beyond hiring.