
How Sarah Cut 6 Hours of Weekly Interview Processing with Make.com™ and AI Transcription Automation
Interview transcription is one of the most reliably expensive invisible costs in recruiting. Recruiters sit through interviews, take fragmented notes, attempt to reconstruct what was said hours later, and then manually populate candidate records in the ATS — all before any actual decision-making happens. This article drills into the specific operational pattern that wastes that time and shows exactly how a Make.com™ automation workflow eliminates it. For the broader framework connecting this to your full HR tech stack, see our parent guide on smart AI workflows for HR and recruiting with Make.com™.
Case Snapshot
| Field | Detail |
|---|---|
| Organization | Regional healthcare system, 800+ employees |
| Contact | Sarah — HR Director, responsible for clinical and administrative hiring |
| Baseline Problem | 12+ hours per week on manual interview scheduling and post-interview processing; inconsistent notes across recruiters; slow candidate record updates |
| Constraints | Healthcare compliance requirements; no dedicated engineering resources; existing ATS with API access; mixed technical fluency on the recruiting team |
| Approach | Make.com™ scenario automating audio capture → transcription API → LLM analysis → structured ATS push |
| Outcomes | 6 hours per week reclaimed per recruiter; interview-to-structured-record time reduced from 3–4 hours to under 5 minutes; hiring decision consistency measurably improved |
Context and Baseline: What Manual Interview Processing Actually Costs
Before automation, Sarah’s team processed interviews the way most healthcare HR teams do — and the waste was structural, not behavioral. Recruiters weren’t lazy. The process was designed to be slow.
After every interview, a recruiter would review their notes (partial at best, since active listening and note-taking compete for the same cognitive bandwidth), attempt to reconstruct a coherent summary, and manually enter fields into the ATS. For panel interviews, they’d collect notes from two or three additional interviewers by email, reconcile contradictions, and then update the record. Average elapsed time between interview completion and a fully updated ATS record: 3 to 4 hours. For a team running 15–20 interviews per week, that’s 45–80 hours of post-interview admin — before anyone makes a single hiring call.
According to Asana’s Anatomy of Work research, knowledge workers spend roughly 60% of their time on work about work rather than skilled work itself. Interview note reconciliation is a textbook example: it produces an input to a decision, not the decision. Automating it doesn’t remove human judgment — it removes the manual labor that precedes judgment.
McKinsey Global Institute research on generative AI’s economic potential identifies data capture and synthesis as among the highest-leverage automation targets in knowledge work, precisely because the output is structured and the quality criteria are explicit. Interview transcription fits both criteria exactly.
Sarah’s specific pain points broke down into three categories:
- Inconsistency: Different recruiters captured different fields, used different terminology, and applied different standards for what counted as a “strong” skills indicator. Hiring managers were making decisions based on incomparable data.
- Latency: Candidate records sat incomplete for hours or days after interviews. Follow-up decisions were delayed. Candidate experience suffered.
- Volume ceiling: The team couldn’t scale interview throughput without proportionally scaling admin time. Every additional 5 interviews per week added roughly 20 hours of processing work across the team.
Approach: Structure Before Intelligence
The instinct for most teams at this point is to search for an AI tool that “summarizes interviews.” That instinct produces mediocre results, because AI layered on top of an inconsistent, manual data pipeline inherits all the pipeline’s inconsistencies. Sarah’s team started differently: they mapped the data flow before touching any AI configuration.
The workflow architecture broke into four deterministic steps — all owned by Make.com™ — and one AI step that fired only after the structured foundation was in place:
1. Audio capture and routing: Video conferencing platform recordings were configured to drop automatically into a designated cloud storage folder at interview completion. Make.com™ watched that folder and triggered the scenario on every new file.
2. Transcription API call: The Make.com™ scenario passed the audio file URL to a transcription API configured for speaker diarization. Output: a structured text transcript with speaker labels and timestamps.
3. LLM analysis: The transcript was passed to an OpenAI GPT model via Make.com™ with a structured prompt specifying the exact output fields required — candidate name, role, interview date, key skills identified, specific answers to four standard screening questions, sentiment indicators, and a 4-sentence summary. The prompt was engineered to produce consistent JSON output on every run.
4. ATS data push: Make.com™ parsed the LLM’s JSON output and wrote each field directly to the candidate record via the ATS API. No human copy-paste step.
5. Recruiter notification: A Slack message to the responsible recruiter confirmed the record was updated and included a link to review the AI output before any decision actions were taken.
The critical design principle: the AI fires at step 3, after the audio has been reliably captured, routed, and converted to structured text. It does not fire at step 1. Asking an LLM to compensate for an unreliable data pipeline is a reliability tax you pay on every run.
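To make the ordering concrete, here is a minimal sketch of the scenario's control flow expressed as code. In Sarah's build every step is a Make.com™ module rather than custom code, so every function below is a hypothetical stub that returns placeholder data; the point is the sequence, not the implementations.

```python
# Hypothetical stubs standing in for Make.com modules. The folder-watch trigger
# (step 1) supplies audio_url; everything else runs in the order shown.

def request_transcript(audio_url: str) -> str:
    """Step 2: transcription API call; returns speaker-labelled text (stubbed)."""
    return "Speaker 1 (Hiring Manager): ...\nSpeaker 2 (Candidate): ..."

def extract_fields(transcript: str) -> dict:
    """Step 3: the only AI step; returns structured fields parsed from the model's JSON (stubbed)."""
    return {"candidate_name": "Jane Doe", "key_skills": [], "summary": ""}

def push_to_ats(fields: dict) -> str:
    """Step 4: ATS API write; returns the updated record's URL (stubbed)."""
    return "https://ats.example.com/candidates/123"

def notify_recruiter(record_url: str) -> None:
    """Step 5: Slack confirmation with a reminder to review before deciding (stubbed)."""
    print(f"Record updated: {record_url}. Review the AI output before any decision actions.")

def process_interview(audio_url: str) -> None:
    transcript = request_transcript(audio_url)   # deterministic
    fields = extract_fields(transcript)          # AI fires here, on structured text
    record_url = push_to_ats(fields)             # deterministic
    notify_recruiter(record_url)                 # deterministic

process_interview("https://storage.example.com/interviews/example.mp3")
```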
For teams building a parallel workflow for candidate screening, the same principle applies — see our guide on AI candidate screening workflows with Make.com™ and GPT.
Implementation: The Make.com™ Scenario in Detail
Sarah’s team built the scenario without dedicated engineering support. The Make.com™ visual interface handled all connections; no custom code was required for the core workflow. The implementation moved through four phases over approximately three weeks.
Phase 1 — Data Source Standardization (Week 1)
Before a single Make.com™ module was configured, the team standardized where interview recordings lived. Previously, recordings ended up in personal cloud drives, email attachments, and the video platform’s internal storage depending on which recruiter ran the interview. Standardization meant one folder, one naming convention, one permission set. This took a week of change management, not a week of technical work — and it was the most important week of the project.
Parseur’s Manual Data Entry Report documents that organizations lose roughly $28,500 per employee per year to manual data handling errors and inefficiencies. For Sarah’s team, the recording location inconsistency was producing exactly this type of cost: interviews couldn’t be processed centrally because the inputs weren’t centrally available.
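As one concrete illustration of what "one naming convention" can mean in practice, a pattern that encodes date, role code, candidate name, and interview type makes every recording machine-routable. The pattern below is a hypothetical example for illustration, not the convention Sarah's team actually adopted.

```python
import re

# Hypothetical convention: YYYY-MM-DD_<role-code>_<candidate-name>_<interview-type>.<ext>
# e.g. 2024-03-14_RN-ICU_doe-jane_panel.mp3
FILENAME_PATTERN = re.compile(
    r"^\d{4}-\d{2}-\d{2}_[A-Za-z0-9-]+_[a-z-]+_(screen|panel|final)\.(mp3|mp4|m4a)$"
)

def is_standard_name(filename: str) -> bool:
    """Return True if a recording follows the agreed convention."""
    return bool(FILENAME_PATTERN.match(filename))

print(is_standard_name("2024-03-14_RN-ICU_doe-jane_panel.mp3"))  # True
print(is_standard_name("interview recording FINAL (2).mp4"))     # False
```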
Phase 2 — Trigger and Transcription Setup (Days 8–12)
With a reliable folder established, the Make.com™ trigger was configured to watch for new audio files. The scenario then called the transcription API, passing the file URL and requesting speaker-diarized output. Testing across 10 sample recordings confirmed that two-person interviews returned clean speaker labels; panel interviews with three or more speakers required a post-processing prompt step to reconcile ambiguous speaker attribution.
Speaker diarization is the feature that separates useful transcription from merely accurate transcription for HR purposes. A word-for-word transcript of an interview is raw material. A transcript where “Speaker 1 (Hiring Manager): Tell me about your approach to prioritization” and “Speaker 2 (Candidate): In my last role, I owned the project intake process…” is structured data that an LLM can analyze with precision.
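For teams curious what the transcription call looks like outside the Make.com™ HTTP module, the sketch below shows the general shape: submit an audio URL, request speaker labels, read back diarized segments. The endpoint, parameter names, and response shape are assumptions for illustration, not a specific vendor's API.

```python
import requests

TRANSCRIPTION_ENDPOINT = "https://api.transcription-vendor.example/v1/transcripts"  # hypothetical
API_KEY = "YOUR_API_KEY"  # stored as a connection credential inside Make.com in the real build

def request_transcript(audio_url: str) -> list[dict]:
    """Submit an audio file URL and return speaker-labelled segments."""
    response = requests.post(
        TRANSCRIPTION_ENDPOINT,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"audio_url": audio_url, "speaker_labels": True},  # diarization flag (assumed name)
        timeout=120,
    )
    response.raise_for_status()
    # Assumed response shape: {"segments": [{"speaker": "A", "start": 0.0, "text": "..."}, ...]}
    return response.json()["segments"]

for segment in request_transcript("https://storage.example.com/interviews/example.mp3"):
    print(f"Speaker {segment['speaker']}: {segment['text']}")
```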
Phase 3 — LLM Prompt Engineering (Days 13–18)
This phase took longer than expected and produced the largest quality gains. The team’s first prompt was a natural-language instruction: “Summarize this interview and identify the candidate’s key skills.” The output was coherent but inconsistent: summaries of varying length, shifting field names, and no reliable structure for downstream ATS parsing.
The engineered prompt replaced natural language instructions with explicit field specifications, character limits, output format requirements (JSON), and instructions for handling ambiguity (“If the candidate does not address [question X], output ‘Not addressed’ for that field rather than inferring an answer”). Three rounds of prompt refinement against a test set of 15 transcripts produced output that parsed cleanly into ATS fields on every run.
This experience matches what we consistently observe across HR automation builds: the difference in output quality between a draft prompt and a refined prompt is larger than the difference between model versions. Prompt engineering is not a technical task — it is a process definition task, and HR teams are well-positioned to own it.
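The exact prompt Sarah's team converged on is not reproduced here, but a simplified reconstruction of its structure looks like the following: explicit keys, hard limits, a JSON-only output mode, and an instruction for unanswered questions. The field list and model name are assumptions; the case study says only that an OpenAI GPT model was used.

```python
import json
from openai import OpenAI

EXTRACTION_PROMPT = """You are extracting structured interview data for an ATS.
Return ONLY valid JSON with exactly these keys:
  "candidate_name": string
  "role": string
  "interview_date": string in YYYY-MM-DD format
  "key_skills": array of strings, maximum 8 items
  "screening_answers": object with keys "q1" through "q4", each a string of at most 300 characters
  "summary": string, exactly 4 sentences
If the candidate does not address a screening question, output "Not addressed" for that key
rather than inferring an answer.

Transcript:
{transcript}
"""

def extract_fields(transcript: str) -> dict:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model choice for illustration
        messages=[{"role": "user", "content": EXTRACTION_PROMPT.format(transcript=transcript)}],
        response_format={"type": "json_object"},  # forces parseable JSON output
    )
    return json.loads(response.choices[0].message.content)
```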
Phase 4 — ATS Integration and Error Handling (Days 19–21)
The ATS API accepted Make.com™’s structured POST requests and confirmed field writes with standard HTTP 200 responses. The Make.com™ error handler was configured to catch non-200 responses and route them to a dedicated Slack channel with the candidate name, audio file link, and error code — so that a failed API call produced an immediate human-readable alert rather than a silent data gap.
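Outside Make.com™, the same write-plus-alert pattern looks roughly like the sketch below. The ATS endpoint, Slack webhook, auth token, and response fields are placeholders; in the actual build, Make.com™'s built-in error handler does this routing without custom code.

```python
import requests

ATS_ENDPOINT = "https://ats.example.com/api/v1/interview-notes"        # hypothetical
SLACK_ERROR_WEBHOOK = "https://hooks.slack.com/services/XXX/YYY/ZZZ"   # placeholder URL

def push_to_ats(fields: dict, audio_url: str) -> str | None:
    """Write extracted fields to the candidate record; return the record URL on success."""
    response = requests.post(
        ATS_ENDPOINT,
        headers={"Authorization": "Bearer ATS_API_TOKEN"},
        json=fields,
        timeout=30,
    )
    if response.status_code == 200:
        return response.json().get("record_url")  # assumed response field
    # Non-200: alert a human immediately instead of leaving a silent data gap
    requests.post(
        SLACK_ERROR_WEBHOOK,
        json={
            "text": (
                f"ATS update failed for {fields.get('candidate_name', 'unknown candidate')} "
                f"(HTTP {response.status_code}). Audio file: {audio_url}"
            )
        },
        timeout=10,
    )
    return None
```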
The Slack notification at workflow completion included a direct link to the updated ATS record and a reminder prompt for recruiters to review AI output before taking any decision actions. This preserved human oversight at the only point where it added value — the decision — while removing human involvement from all points where it only added latency.
Results: Before and After
| Metric | Before Automation | After Automation | Change |
|---|---|---|---|
| Interview-to-ATS record time | 3–4 hours | Under 5 minutes | ~97% reduction |
| Weekly admin time per recruiter | 12 hrs (scheduling + processing) | 6 hrs | 6 hrs/week reclaimed |
| ATS record completeness | Inconsistent; ~60% of required fields populated | 100% of defined fields populated on every record | +40 percentage points |
| Recruiter review time per interview | 45–60 minutes | 15–20 minutes (reviewing structured AI output) | ~60% reduction |
| Inter-recruiter consistency of notes | Low — freeform, non-comparable | High — identical fields, defined format | Structural improvement |
The 6 hours per week reclaimed per recruiter were reallocated to candidate relationship management — follow-up calls, offer negotiation prep, and proactive pipeline development. These are high-judgment tasks that move hiring outcomes. SHRM benchmarks confirm that time-to-fill and offer acceptance rates are strongly correlated with recruiter responsiveness during the candidate decision window; freeing recruiter time directly improves both metrics.
For a deeper look at how AI-generated candidate data feeds into resume and screening workflows, see our guide on AI resume analysis powered by Make.com™ automation.
What We Would Do Differently
Transparency about what didn’t work is where case studies earn their credibility. Three things would change in a repeat build:
1. Invest More in Prompt Testing Before Go-Live
The team went live with a prompt that had been tested on 15 transcripts. The first week of production exposed edge cases — very short interviews, interviews where the candidate asked most of the questions, interviews conducted in a second language — that the prompt handled poorly. A test set of 30 deliberately diverse transcripts would have caught most of these before any recruiter saw a malformed output.
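A lightweight regression check makes that kind of pre-go-live testing repeatable. The sketch below assumes a local folder of plain-text test transcripts and reuses the hypothetical extract_fields() helper from the Phase 3 sketch; the file layout and required key names are illustrative.

```python
import json
from pathlib import Path

REQUIRED_KEYS = {"candidate_name", "role", "interview_date",
                 "key_skills", "screening_answers", "summary"}

def run_prompt_regression(test_dir: str = "test_transcripts") -> None:
    """Run the extraction prompt over every saved transcript and report malformed outputs."""
    paths = sorted(Path(test_dir).glob("*.txt"))
    failures = []
    for path in paths:
        try:
            fields = extract_fields(path.read_text(encoding="utf-8"))  # helper from the Phase 3 sketch
        except json.JSONDecodeError as exc:
            failures.append((path.name, f"unparseable JSON: {exc}"))
            continue
        missing = REQUIRED_KEYS - fields.keys()
        if missing:
            failures.append((path.name, f"missing keys: {sorted(missing)}"))
    print(f"{len(failures)} failure(s) across {len(paths)} transcripts")
    for name, reason in failures:
        print(f"  {name}: {reason}")
```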
2. Build the Error Handler First, Not Last
Error handling was the last module configured, which meant the first week of testing ran without a safety net. A transcription API timeout produced a silent null in the ATS record — discovered two days later. Error routing should be the second thing you build, immediately after the trigger.
3. Document the Prompt Library from Day One
The final prompt evolved through three versions during testing. Version history was not maintained. When the team wanted to modify the output fields six weeks after go-live, they couldn’t compare the current prompt to earlier versions to understand what had changed and why. Prompt documentation is as important as workflow documentation — treat it as such.
Security and compliance considerations for healthcare HR automation are covered in depth in our guide on securing Make.com™ AI HR workflows for data and compliance.
Lessons Learned: The Principles That Transfer
Sarah’s results are specific to her organization, her team size, and her existing ATS. The principles that produced those results transfer to any recruiting environment running manual interview processing.
Standardize Inputs Before Automating Outputs
The single highest-leverage action in the project was standardizing where recordings landed before building any automation. If your audio inputs are inconsistent, your transcription outputs will be inconsistent, and your AI analysis will be inconsistent. Automation amplifies whatever is upstream of it — including chaos.
Define Output Fields Before Writing Prompts
Start with the ATS fields you need populated and work backward to the prompt. “What do we need to know about this candidate in our ATS?” is a more productive question than “What can AI extract from this transcript?” The former produces a field spec. The latter produces a brainstorming session.
The Last Mile Matters as Much as the First
Automating transcription but leaving the ATS update manual is a half-built workflow. The efficiency gain happens at the end, when a recruiter opens a fully populated candidate record instead of a blank form. Every manual step left in the chain is a future failure point and a future relapse into the old process.
Gartner’s future of work research consistently identifies workflow integration — the connection between discrete tools — as the primary barrier to realized productivity gains from automation. The Make.com™ ATS integration step is not a nice-to-have. It is the step that makes every previous step produce value.
For the quantitative business case supporting this type of workflow investment, see our analysis of Make.com™ AI workflows ROI and HR cost savings. For teams focused on reducing overall hiring cycle time, the companion guide on reducing time-to-hire with Make.com™ AI recruitment automation covers the downstream metrics that interview processing automation directly improves.
Next Steps
If your team is processing more than 10 interviews per week manually, the efficiency case for automating this workflow is unambiguous. The build is within reach for any team with Make.com™ access and a transcription API. The sequence is fixed: standardize inputs, build the deterministic pipeline, engineer the prompt, connect the ATS, document everything.
What this case study demonstrates — and what the broader parent guide on practical AI workflows that boost HR efficiency and recruiting elaborates — is that the technology is not the constraint. The constraint is process discipline before the first module is configured. Get that right, and the automation performs exactly as designed.