Batch Transcription API Guide


Our Batch Transcription API transcribes pre-recorded audio files. The process is asynchronous: you start a transcription job, and then either we notify you when it completes or you check its status yourself.

This guide covers the entire workflow, from uploading your audio to retrieving the final transcription.

How It Works: The Workflow

The process consists of three main steps:

  1. Request an Upload URL: You tell our API about the file you want to transcribe (e.g., its format) and get a secure, temporary URL to upload it.
  2. Upload Your Audio: You upload your audio file directly to the provided URL.
  3. Retrieve Your Transcription: Once we've processed the audio, you can get the result in one of two ways:
    • Webhooks: We send the result directly to a callbackUrl you provide. (Recommended)
    • Polling: You periodically check an endpoint for the status of the transcription job.

Step 1: Request an Upload URL

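First, request a pre-signed URL for uploading your audio file. This is also where you specify the file's mimeType and, optionally, a callbackUrl; the response includes the presignedUrl used in Step 2. For details, see Pre-signed URL.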

Step 2: Upload Your Audio File

Next, upload your audio file using an HTTP PUT request to the presignedUrl you received.

Important: The Content-Type header in this PUT request must match the mimeType you specified in Step 1.

JavaScript Example (using Fetch API)

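// Uploads the audio file to the pre-signed URL from Step 1.
// `audioFile` can be any valid fetch body, such as a File, Blob, or ArrayBuffer.
async function uploadAudio(presignedUrl, audioFile, mimeType) {
  try {
    const response = await fetch(presignedUrl, {
      method: 'PUT',
      headers: {
        // Must match the mimeType specified in Step 1.
        'Content-Type': mimeType
      },
      body: audioFile
    });

    if (response.ok) {
      console.log('Upload successful! Transcription is now processing.');
    } else {
      // statusText can be empty (for example over HTTP/2), so log the status code too.
      console.error('Upload failed:', response.status, response.statusText);
    }
  } catch (error) {
    // Network-level failures (DNS, dropped connection, CORS) end up here.
    console.error('An error occurred during upload:', error);
  }
}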
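
If the calling code needs to know whether the upload succeeded, for example to avoid waiting for a transcription that will never start, a variant that throws on failure can be easier to compose. The sketch below is one possible approach; the name uploadAudioOrThrow and the fetchImpl parameter (which lets a test substitute a fake) are illustrative assumptions and not part of the API.

```javascript
// Hypothetical variant of uploadAudio that reports failure to the caller.
// fetchImpl defaults to the global fetch (browsers, Node.js 18+).
async function uploadAudioOrThrow(presignedUrl, audioFile, mimeType, fetchImpl = fetch) {
  const response = await fetchImpl(presignedUrl, {
    method: 'PUT',
    // Must match the mimeType specified in Step 1.
    headers: { 'Content-Type': mimeType },
    body: audioFile
  });

  if (!response.ok) {
    throw new Error(`Upload failed with HTTP status ${response.status}`);
  }
  return response.status;
}
```

Callers can then wrap the call in their own try/catch and only move on to Step 3 when no error was thrown.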

Step 3: Retrieve Your Transcription

Once the upload is complete, our servers will begin processing the audio. You can get the result using one of the following methods.

Option A: Webhooks (Recommended)

If you provided a callbackUrl in Step 1, our server will send an HTTP POST request to that URL once the transcription is complete.

Your endpoint should be prepared to receive the following body:

{
  "requestId": "job-abc-123-xyz",
  "transcription": [
    {
      "speakerId": "Speaker_00",
      "text": "Hello, Doctor.",
      "start": 0.5,
      "end": 2.2,
      "speakerName": "Patient 1"
    }
  ],
  "status": "COMPLETED",
  "signature": "invox-medical-generated-signature",
  "createdAt": 1721060645
}
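
As an illustration of consuming this payload, the sketch below turns the transcription array into readable lines. The function name formatTranscript and the fallback to speakerId when speakerName is missing are assumptions made for the example; only the field names come from the payload above.

```javascript
// Hypothetical helper: renders each transcription segment as
// "[start-end] speaker: text", using the field names from the payload above.
function formatTranscript(payload) {
  if (!payload || !Array.isArray(payload.transcription)) {
    throw new TypeError('Expected a payload with a transcription array');
  }
  return payload.transcription.map((segment) => {
    // Assumption: speakerName may be absent, so fall back to speakerId.
    const speaker = segment.speakerName || segment.speakerId;
    return `[${segment.start}-${segment.end}] ${speaker}: ${segment.text}`;
  });
}

const examplePayload = {
  requestId: 'job-abc-123-xyz',
  transcription: [
    { speakerId: 'Speaker_00', text: 'Hello, Doctor.', start: 0.5, end: 2.2, speakerName: 'Patient 1' }
  ],
  status: 'COMPLETED'
};

console.log(formatTranscript(examplePayload).join('\n'));
// prints: [0.5-2.2] Patient 1: Hello, Doctor.
```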

Security Warning: To verify that the request genuinely comes from Invox Medical, you must validate the signature.

The following Node.js (TypeScript) example decrypts the signature and checks it against the expected value:


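import * as CryptoJS from 'crypto-js';

// Returns true only if the signature decrypts to the expected `${appId}~${apiKey}` value.
export const validateSignature = (signature: string): boolean => {
    const secret = process.env.APP_SECRET;
    const appId = process.env.APP_ID;
    const apiKey = process.env.API_KEY;

    // Without all three settings there is nothing to compare against, so reject.
    if (!secret || !appId || !apiKey) {
        return false;
    }
    const expectedValue = `${appId}~${apiKey}`;

    try {
        // AES-decrypt the signature with the shared secret.
        const bytes = CryptoJS.AES.decrypt(signature, secret);
        const originalText = bytes.toString(CryptoJS.enc.Utf8);

        // An empty string means decryption failed (wrong secret or malformed input).
        if (!originalText) {
            return false;
        }
        return originalText === expectedValue;
    } catch (error) {
        // Malformed input can make the UTF-8 decoding throw; treat it as invalid.
        return false;
    }
};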
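
To show where this check fits, here is a rough sketch of a webhook handler that rejects any request whose signature fails validation before touching the transcription. It is framework-agnostic: handleWebhook, its return shape, and the HTTP status codes chosen are assumptions for illustration. The validator is passed in as an argument so that the validateSignature function above, or a stub in tests, can be supplied.

```javascript
// Hypothetical handler: takes the raw request body and a signature validator,
// and returns the HTTP status to respond with plus the parsed payload when accepted.
function handleWebhook(rawBody, validate) {
  let payload;
  try {
    payload = JSON.parse(rawBody);
  } catch (error) {
    // The body is not valid JSON.
    return { httpStatus: 400, payload: null };
  }

  // Reject before any processing if the signature is missing or invalid.
  if (!payload || typeof payload.signature !== 'string' || !validate(payload.signature)) {
    return { httpStatus: 401, payload: null };
  }
  return { httpStatus: 200, payload };
}
```

In a real server, the route for your callbackUrl would pass the request body to a function like this and respond with the returned status.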

Option B: Polling for Status

If you did not provide a callbackUrl, periodically check the job status by polling the status endpoint. For details, see Get transcription status.
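
A polling loop might look like the sketch below. Because the status endpoint is described on a separate page, the actual request is left to a caller-supplied fetchStatus function; that function, the pollUntilCompleted name, and the interval and attempt defaults are all assumptions. The only status value taken from this guide is "COMPLETED"; other values, including any failure states, should be handled according to the endpoint's reference.

```javascript
// Resolves after the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical polling loop. fetchStatus must return a promise for an object
// with a `status` field, for example by calling the status endpoint.
async function pollUntilCompleted(fetchStatus, { intervalMs = 5000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await fetchStatus();
    if (result.status === 'COMPLETED') {
      return result;
    }
    // Wait before the next check, but not after the final attempt.
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`Job not completed after ${maxAttempts} attempts`);
}
```

Capping the number of attempts keeps a stuck job from being polled indefinitely; the caller decides what to do when the error is thrown.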