Transcript Alignment Tools

Align machine-generated transcripts with human corrections to preserve word-level timing.

1. Pre-processor

Prepare your corrected transcript for alignment. Remove page headers, process various formats, and extract clean paragraphs with speaker labels.
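As a rough illustration of this kind of clean-up, the sketch below drops page-header lines, joins wrapped lines into paragraphs, and splits off a leading speaker label. It is not the Pre-processor's actual code: the function name preprocessTranscript, the "Page N of M" header pattern, and the "NAME: text" label convention are all assumptions made for the example.

```javascript
// Hypothetical helper for illustration; not part of word-alignment.js.
// Assumes page headers look like "Page 3" or "Page 3 of 10" on their own line,
// paragraphs are separated by blank lines, and speaker labels look like "NAME: text".
function preprocessTranscript(raw, headerPattern = /^\s*Page \d+( of \d+)?\s*$/i) {
  const paragraphs = [];
  let current = [];

  // Turn the buffered lines into one paragraph object.
  const flush = () => {
    if (current.length === 0) return;
    const text = current.join(" ").replace(/\s+/g, " ").trim();
    current = [];
    if (!text) return;
    const m = text.match(/^([A-Z][A-Za-z .'-]{0,40}):\s+(.*)$/);
    paragraphs.push(m ? { speaker: m[1], text: m[2] } : { speaker: null, text });
  };

  for (const line of raw.split(/\r?\n/)) {
    if (headerPattern.test(line)) continue; // drop page headers
    if (line.trim() === "") flush();        // blank line ends a paragraph
    else current.push(line.trim());
  }
  flush();
  return paragraphs;
}

const sample = "Page 1 of 2\nALICE: Hello\nthere.\n\nPage 2 of 2\nBOB: Hi.\n\nNo label here.";
console.log(preprocessTranscript(sample));
```

Real transcripts vary widely, so the header pattern is passed in as a parameter rather than hard-coded; a paragraph without a recognisable label is kept with speaker set to null.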

2. Word Alignment

Align your machine transcript JSON with the corrected text. View results as JSON, interactive transcript, or visual diff.

Typical Workflow

  1. Get machine transcript - Export word-level JSON from your ASR service (timestamps + words)
  2. Get corrected text - The human-corrected version of the transcript, with transcription errors fixed
  3. Pre-process - If required, use the Pre-processor to clean up the corrected text
  4. Align - Use Word Alignment to transfer timing from the machine transcript to the corrected text
  5. Test - Check the word timings using our Interactive Transcript tool
  6. Export - Copy the aligned JSON for use in your application
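
To make the Align step concrete, here is a minimal sketch of one way timing can be transferred. It uses a plain longest-common-subsequence match, which is a stand-in technique chosen for the example and not necessarily what word-alignment.js does. The function name transferTimings and the { word, start, end } input shape are assumptions; check your ASR export and the library README for the real formats.

```javascript
// Illustrative only: not the word-alignment.js API or its algorithm.
// Assumed input: machineWords = [{ word, start, end }, ...] with times in seconds.
const norm = (w) => w.toLowerCase().replace(/[^\p{L}\p{N}']/gu, "");

function transferTimings(machineWords, correctedText) {
  const corrected = correctedText.split(/\s+/).filter(Boolean);
  const a = machineWords.map((w) => norm(w.word));
  const b = corrected.map(norm);

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
  const lcs = Array.from({ length: a.length + 1 }, () => new Array(b.length + 1).fill(0));
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk the table: corrected words that match a machine word copy its timing.
  const out = corrected.map((word) => ({ word, start: null, end: null }));
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out[j].start = machineWords[i].start;
      out[j].end = machineWords[i].end;
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      i++; // machine word with no counterpart in the corrected text
    } else {
      j++; // corrected word with no counterpart; timed in the next pass
    }
  }

  // Spread each run of unmatched words evenly between the neighbouring timed words.
  const first = machineWords.length ? machineWords[0].start : 0;
  const last = machineWords.length ? machineWords[machineWords.length - 1].end : 0;
  for (let k = 0; k < out.length; k++) {
    if (out[k].start !== null) continue;
    let n = k;
    while (n < out.length && out[n].start === null) n++;
    const from = k > 0 ? out[k - 1].end : first;
    const to = n < out.length ? out[n].start : last;
    const step = (to - from) / (n - k);
    for (let m = k; m < n; m++) {
      out[m].start = from + (m - k) * step;
      out[m].end = from + (m - k + 1) * step;
    }
    k = n - 1;
  }
  return out;
}

const machine = [
  { word: "the", start: 0, end: 0.5 },
  { word: "quik", start: 0.5, end: 1.0 },
  { word: "fox", start: 1.0, end: 1.5 },
];
console.log(transferTimings(machine, "The quick fox."));
```

In this example the corrected word "quick" has no exact match, so it receives the gap between "The" and "fox." The quadratic table is fine for short clips but grows quickly with transcript length, which is one reason to prefer a purpose-built library for real use.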

Using the Library

The alignment algorithm is available as a standalone JavaScript library: word-alignment.js

See the README for API documentation and integration examples.