Audio Protection & Error Recovery


Invox Medical's streaming transcription client implements layered audio protection mechanisms designed to prevent audio loss during network failures and service disruptions, and to minimise loss when a page unloads or the browser crashes. This guide explains the technical architecture that makes audio transcription reliable and resilient.

Overview

The transcription client employs multiple layers of protection:

  • In-Memory Send Queue with sequence-ordered retry
  • Circuit Breaker Pattern to prevent cascading failures
  • Intelligent Retry Logic with exponential backoff
  • Dual Acknowledgment System for efficient confirmation
  • Automatic Recovery from network interruptions
  • Online/Offline Detection with automatic queue drain on reconnect
  • Queue Management with flow control
  • Graceful Session Teardown on page unload to prevent orphaned sessions

In-Memory Queue

Audio chunks are held in two RAM structures that together guarantee chunks are not silently dropped:

interface PendingChunk {
  id: string;             // Unique identifier
  sessionId: string;      // UUID of the current session
  sequenceNumber: number; // Monotonically increasing — used for ordering and batch ACKs
  bytesB64: string;       // Base64-encoded PCM16 @ 16 kHz, mono
  durationMs: number;     // Duration of audio represented by this chunk
  retryCount?: number;    // Number of retry attempts so far (max: 3)
}

// Chunks waiting to be sent over the WebSocket
const sendQueue: PendingChunk[] = [];

// Chunks that have been sent but not yet acknowledged by the server
const inFlight = new Map<string, { chunk: PendingChunk; timestamp: number }>();

Why RAM is sufficient for typical disconnections:

  • Most network outages last seconds, not minutes. The in-memory queue holds everything during that window.
  • When connectivity returns, sendQueue and inFlight are drained in sequence order before a new chunk is sent.
  • Audio continues to accumulate locally in the PCM accumulator even while the queue is draining.

Because storage is in RAM, a hard browser crash (e.g. the tab is force-killed) will lose any audio that had not yet been acknowledged by the server. The mitigation is to call end() or trigger a graceful teardown on beforeunload/visibilitychange (see Best Practices).

Flow Control

To prevent unbounded memory growth and network congestion, you can track the total audio duration of in-flight chunks:

const MAX_IN_FLIGHT_DURATION_MS = 10_000; // 10 s of unacknowledged audio maximum

function canSendNow(chunkDurationMs: number): boolean {
  return (
    ws?.readyState === WebSocket.OPEN &&
    isOnline &&
    inFlightAudioDurationMs + chunkDurationMs <= MAX_IN_FLIGHT_DURATION_MS
  );
}

Example: With 0.5 s chunks, up to 20 chunks (~10 s) can be in-flight simultaneously. The 21st chunk waits until an acknowledgment frees up budget.
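
The arithmetic in this example can be checked with a small helper. The sketch below is illustrative only: maxInFlightChunks is a hypothetical name, not part of the client, and it assumes every chunk has the same duration.

```typescript
// Hypothetical helper: how many equal-length chunks fit in the in-flight budget.
// canSendNow() uses "<=", so a chunk that exactly fills the budget is still allowed.
function maxInFlightChunks(budgetMs: number, chunkDurationMs: number): number {
  if (chunkDurationMs <= 0) {
    throw new RangeError("chunkDurationMs must be positive");
  }
  return Math.floor(budgetMs / chunkDurationMs);
}

const IN_FLIGHT_BUDGET_MS = 10_000;

console.log(maxInFlightChunks(IN_FLIGHT_BUDGET_MS, 500)); // 20
console.log(maxInFlightChunks(IN_FLIGHT_BUDGET_MS, 600)); // 16
```

With 0.5 s chunks the 21st chunk has to wait; with 600 ms chunks the limit is reached at the 17th.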

Queue Processing

function processQueueIfReady(): void {
  if (ws?.readyState !== WebSocket.OPEN || !isOnline || isPaused) return;

  while (sendQueue.length > 0) {
    const chunk = sendQueue[0];
    if (!canSendNow(chunk.durationMs)) break; // Wait for ACKs

    sendQueue.shift();
    inFlight.set(chunk.id, { chunk, timestamp: Date.now() });
    inFlightAudioDurationMs += chunk.durationMs;
    ws.send(JSON.stringify({ message: chunk.bytesB64, messageType: 'AUDIO', sessionId: chunk.sessionId }));
  }
}

Error Handling Strategies

Circuit Breaker Pattern

Implement a circuit breaker to prevent cascading failures during service outages.

Configuration

const circuitBreaker = new CircuitBreaker({
  failureThreshold: 5, // Open after 5 consecutive failures
  timeout: 60000, // Wait 60s before retry
  name: "TranscriptionClient",
});

Circuit States

1. CLOSED (Normal Operation)

  • All operations allowed
  • Failures are counted
  • Transitions to OPEN after 5 consecutive failures

2. OPEN (Service Degraded)

  • Operations immediately rejected
  • Transcription paused automatically
  • User notified: "Service temporarily unavailable"
  • After 60 seconds, transitions to HALF_OPEN

3. HALF_OPEN (Testing Recovery)

  • Single test operation allowed
  • Success → CLOSED (full recovery)
  • Failure → OPEN (back to degraded)
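
This guide uses a CircuitBreaker class without showing its implementation. The following is a minimal sketch of the three states described above, not the client's actual class. The option names mirror the configuration shown earlier; the now clock option and getState() are assumptions added so the behaviour can be exercised without waiting 60 seconds. For brevity it does not limit HALF_OPEN to a single concurrent probe.

```typescript
type CircuitState = "CLOSED" | "OPEN" | "HALF_OPEN";

class CircuitBreakerError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "CircuitBreakerError";
  }
}

interface CircuitBreakerOptions {
  failureThreshold: number; // consecutive failures before opening
  timeout: number;          // ms to stay OPEN before allowing a probe
  name: string;
  now?: () => number;       // assumption: injectable clock, defaults to Date.now
}

class CircuitBreaker {
  private state: CircuitState = "CLOSED";
  private failures = 0;
  private openedAt = 0;
  private readonly opts: CircuitBreakerOptions;
  private readonly now: () => number;

  constructor(opts: CircuitBreakerOptions) {
    this.opts = opts;
    this.now = opts.now ?? (() => Date.now());
  }

  getState(): CircuitState {
    // An OPEN circuit becomes HALF_OPEN once the timeout has elapsed.
    if (this.state === "OPEN" && this.now() - this.openedAt >= this.opts.timeout) {
      this.state = "HALF_OPEN";
    }
    return this.state;
  }

  async execute<T>(operation: string, fn: () => Promise<T>): Promise<T> {
    if (this.getState() === "OPEN") {
      throw new CircuitBreakerError(`${this.opts.name}: circuit open, rejected "${operation}"`);
    }
    try {
      const result = await fn();
      // Any success resets the failure count and closes the circuit.
      this.failures = 0;
      this.state = "CLOSED";
      return result;
    } catch (err) {
      this.failures++;
      // A failed probe, or reaching the threshold, (re)opens the circuit.
      if (this.state === "HALF_OPEN" || this.failures >= this.opts.failureThreshold) {
        this.state = "OPEN";
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```

Callers can tell a rejected call from an operation failure by checking for CircuitBreakerError, which is how the queue decides whether to stop draining or to retry a chunk.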

Protected Operations

The circuit breaker wraps critical operations:

// WebSocket connection
await circuitBreaker.execute("connect", async () => {
  // Connection logic
});

// Audio chunk transmission
await circuitBreaker.execute("sendChunk", async () => {
  // Send logic
});

// Transcription retrieval
await circuitBreaker.execute("getTranscription", async () => {
  // Fetch logic
});

Connection Error Handling

WebSocket Failures:

// Tracks failure patterns
interface ConnectionMetrics {
  consecutiveFailures: number;
  lastFailureTime: number;
  networkLatency: number;
  throughput: number;
}

// On each failure
this.connectionMetrics.consecutiveFailures++;
this.connectionMetrics.lastFailureTime = Date.now();

Poor Network Adaptation:

// Automatically adjusts heartbeat interval
private handlePoorNetworkConditions() {
  // Doubles heartbeat interval (max: 30s)
  this.heartbeatMs = Math.min(this.heartbeatMs * 2, 30000)
}

Recovery Processes

Session Recovery on Reconnection

When the WebSocket reconnects after a disruption, you can re-enqueue everything that was not yet confirmed by the server:

function recoverPendingChunks(): void {
  // Move all in-flight chunks back to the front of the queue
  for (const [chunkId, inFlightData] of inFlight) {
    clearTimeout(acknowledgmentTimeouts.get(chunkId));
    acknowledgmentTimeouts.delete(chunkId);
    inFlightAudioDurationMs -= inFlightData.chunk.durationMs;
    sendQueue.unshift(inFlightData.chunk); // Priority: front of queue
  }
  inFlight.clear();

  // Re-sort by sequence number to preserve order
  sendQueue.sort((a, b) => a.sequenceNumber - b.sequenceNumber);

  // Resume draining
  processQueueIfReady();
}

Recovery Triggers:

  • After successful WebSocket reconnection (onopen)
  • After network comes back online (online event)
  • After circuit breaker transitions back to CLOSED (from OPEN via HALF_OPEN)
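
The ordering guarantee of this recovery step can be shown in isolation. The sketch below uses a hypothetical pure helper, requeueInFlight, and a trimmed-down chunk type; it leaves out timers and budget accounting and only demonstrates that unacknowledged chunks end up ahead of newer ones, in sequence order.

```typescript
interface QueuedChunk {
  id: string;
  sequenceNumber: number;
  durationMs: number;
}

// Hypothetical pure helper: returns the new send queue after folding
// unacknowledged in-flight chunks back in, ordered by sequence number.
// Neither input is mutated.
function requeueInFlight(
  sendQueue: readonly QueuedChunk[],
  inFlight: ReadonlyMap<string, QueuedChunk>
): QueuedChunk[] {
  const merged = Array.from(inFlight.values()).concat(sendQueue);
  return merged.sort((a, b) => a.sequenceNumber - b.sequenceNumber);
}

const waiting: QueuedChunk[] = [
  { id: "c5", sequenceNumber: 5, durationMs: 500 },
  { id: "c6", sequenceNumber: 6, durationMs: 500 },
];
const unacked = new Map<string, QueuedChunk>([
  ["c4", { id: "c4", sequenceNumber: 4, durationMs: 500 }],
  ["c3", { id: "c3", sequenceNumber: 3, durationMs: 500 }],
]);

const recovered = requeueInFlight(waiting, unacked);
console.log(recovered.map((c) => c.id).join(",")); // c3,c4,c5,c6
```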

Resending Pending Chunks

After network recovery, in-flight chunks are moved back to the send queue and retried:

function resendPendingChunks(): void {
  let recoveredDuration = 0;

  for (const [chunkId, inFlightData] of inFlight) {
    clearTimeout(acknowledgmentTimeouts.get(chunkId));
    acknowledgmentTimeouts.delete(chunkId);
    recoveredDuration += inFlightData.chunk.durationMs;
    sendQueue.unshift(inFlightData.chunk); // Priority: front
  }

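  inFlight.clear();
  inFlightAudioDurationMs = Math.max(0, inFlightAudioDurationMs - recoveredDuration);

  // unshift() reverses the in-flight order, so restore sequence order before draining
  sendQueue.sort((a, b) => a.sequenceNumber - b.sequenceNumber);

  processQueueIfReady();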
}

Triggered by:

  • Network reconnection (online event)
  • WebSocket state change to connected
  • Recovery from disconnected/error state

Exponential Backoff Reconnection

Failed connections retry with increasing delays:

private async reconnectLoop() {
  // Add jittered delay (prevents thundering herd)
  await sleep(jitter(this.backoffMs))

  // Exponential backoff: 500ms → 1s → 2s → 4s → ... → 30s
  this.backoffMs = Math.min(maxBackoffMs, this.backoffMs * 2)

  return this.connect()
}

// Jitter adds randomness (±20%) to prevent synchronized retries
function jitter(ms: number) {
  const r = Math.random() * 0.4 + 0.8  // Range: 0.8-1.2
  return Math.floor(ms * r)
}

Backoff Schedule:

  • Initial: 500ms ± 20%
  • Maximum: 30,000ms (30 seconds)
  • Multiplier: 2x per attempt
  • Jitter: ±20% randomization
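
As a quick check of this schedule, the sketch below lists the base delay for successive attempts and reuses the jitter function shown above. The constant and function names are illustrative, not part of the client.

```typescript
const INITIAL_BACKOFF = 500;
const MAX_BACKOFF = 30_000;

// Base delay (before jitter) for the n-th reconnect attempt, n = 0, 1, 2, ...
function baseBackoffMs(attempt: number): number {
  return Math.min(MAX_BACKOFF, INITIAL_BACKOFF * 2 ** attempt);
}

// Same jitter as in the guide: multiplies by a random factor in [0.8, 1.2).
function jitter(ms: number): number {
  return Math.floor(ms * (Math.random() * 0.4 + 0.8));
}

const schedule = Array.from({ length: 8 }, (_, i) => baseBackoffMs(i));
console.log(schedule); // [500, 1000, 2000, 4000, 8000, 16000, 30000, 30000]

// The actual wait for the first attempt lands between 400 and 600 ms.
console.log(jitter(baseBackoffMs(0)));
```

The cap is reached on the seventh attempt, because doubling 16 s would give 32 s.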

Network Resilience Features

Online/Offline Detection

You can monitor browser network status in real time:

// Setup network monitoring
window.addEventListener("online", () => {
  this.isOnline = true;
  this.handleNetworkReconnection(); // Immediate reconnection
});

window.addEventListener("offline", () => {
  this.isOnline = false;
  this.handleNetworkDisconnection(); // Pause operations
});

Benefits:

  • Prevents sending when offline
  • Automatic reconnection when online
  • User feedback on network status
  • Preserves audio during outages

WebSocket State Management

You can maintain detailed connection states:

enum ClientStates {
  IDLE = "idle", // Not yet connected
  CONNECTING = "connecting", // Establishing connection
  CONNECTED = "connected", // Active connection
  RECONNECTING = "reconnecting", // Attempting reconnection
  DISCONNECTED = "disconnected", // Intentionally disconnected
  ERROR = "error", // Error state
  PAUSED = "PAUSED", // Circuit breaker open or manual pause
}

State Transitions:

IDLE → CONNECTING → CONNECTED
         ↓              ↓
      ERROR ← → RECONNECTING
                        ↓
                  DISCONNECTED

Connection Health Checks

Heartbeat Mechanism:

// Send periodic ping to detect silent failures
private startHeartbeat() {
  this.heartbeatTimer = setInterval(() => {
    this.sendJson({ type: 'ping', t: Date.now() })
  }, this.heartbeatMs)  // Default: 10,000ms (10 seconds)
}

Adaptive Heartbeat:

  • Normal conditions: 10 seconds
  • Poor network: Doubles to 20s, then 30s (max)
  • Detects silent connection failures
  • Prevents idle connection timeouts
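
The doubling rule can be traced step by step. This is a small sketch with illustrative names that mirrors the Math.min(heartbeatMs * 2, 30000) rule shown under Poor Network Adaptation.

```typescript
const DEFAULT_HEARTBEAT_MS = 10_000;
const MAX_HEARTBEAT_MS = 30_000;

// Doubles the interval under poor conditions, capped at the maximum.
function nextHeartbeatMs(currentMs: number): number {
  return Math.min(currentMs * 2, MAX_HEARTBEAT_MS);
}

let interval = DEFAULT_HEARTBEAT_MS;
const steps: number[] = [interval];
for (let i = 0; i < 3; i++) {
  interval = nextHeartbeatMs(interval);
  steps.push(interval);
}
console.log(steps); // [10000, 20000, 30000, 30000]
```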

Automatic Reconnection

You can reconnect on unexpected closures:

private async onCloseWebSocket(event?: CloseEvent) {
  console.log('WebSocket closed:', event?.code, event?.reason)
  this.stopHeartbeat()

  // Skip reconnection if intentionally ended
  if (this.isEndingTranscription) {
    this.updateState(ClientStates.DISCONNECTED)
    return
  }

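  // Capture the state before it is overwritten; otherwise the ERROR check below can never match
  const wasInErrorState = this.state === ClientStates.ERROR
  this.updateState(ClientStates.DISCONNECTED)

  // Auto-reconnect unless in ERROR or PAUSED
  if (!wasInErrorState && !this.isPaused) {
    await this.reconnectLoop()
  }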
}

Acknowledgment & Retry Logic

Dual Acknowledgment System

We support two acknowledgment methods for flexibility and efficiency:

1. Individual Chunk Acknowledgment

Each chunk receives explicit confirmation:

// Server response format
{
  "id": "1732579200000_000042",
  "chunkId": "1732579200000_000042"
}

// Client handling
private async handleChunkAcknowledgment(chunkId: string) {
  this.clearAcknowledmentTimeout(chunkId)

  const inFlightData = this.inFlight.get(chunkId)
  if (inFlightData) {
    // Free up in-flight budget
    this.inFlightAudioDurationMs -= inFlightData.chunk.durationMs

    // Remove from tracking
    this.inFlight.delete(chunkId)

    // Persist acknowledgment
    await this.markChunkAsAcknowledged(chunkId)

    // Try sending more chunks
    await this.processQueueIfReady()
  }
}

Benefits:

  • Fine-grained confirmation
  • Low-latency feedback
  • Simple to implement

2. Batch Acknowledgment

Multiple chunks confirmed with single message:

// Server response format
{
  "type": "AUDIO_CHUNK_ACK",
  "stream_id": "session-uuid",
  "ack_id": 42  // All chunks with sequenceNumber <= 42 confirmed
}

// Client handling
private async handleBatchAcknowledgment(streamId: string, ackId: number) {
  const chunksToAck = []

  // Find all chunks with sequenceNumber <= ackId
  for (const [chunkId, inFlightData] of this.inFlight) {
    if (inFlightData.chunk.sequenceNumber <= ackId) {
      chunksToAck.push(inFlightData)
    }
  }

  // Sort by sequence to maintain order (entries are { chunk, timestamp })
  chunksToAck.sort((a, b) => a.chunk.sequenceNumber - b.chunk.sequenceNumber)

  // Acknowledge all matched chunks
  for (const data of chunksToAck) {
    this.clearAcknowledmentTimeout(data.chunk.id)
    this.inFlight.delete(data.chunk.id)
    await this.markChunkAsAcknowledged(data.chunk.id)
    this.inFlightAudioDurationMs -= data.chunk.durationMs
  }

  await this.processQueueIfReady()
}

Benefits:

  • 90%+ reduction in ACK messages
  • Handles high-throughput scenarios
  • Prevents ACK storms
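
To make the cumulative semantics of ack_id concrete, here is a self-contained sketch. applyBatchAck is a hypothetical helper, and it assumes the map is keyed by chunk id; it returns the audio duration released so the caller can adjust its in-flight budget.

```typescript
interface InFlightEntry {
  chunk: { id: string; sequenceNumber: number; durationMs: number };
  timestamp: number;
}

// Hypothetical helper: removes every entry with sequenceNumber <= ackId and
// returns the total audio duration (ms) that was released.
function applyBatchAck(inFlight: Map<string, InFlightEntry>, ackId: number): number {
  let releasedMs = 0;
  // Snapshot the entries first so the map is not mutated while being iterated.
  for (const entry of Array.from(inFlight.values())) {
    if (entry.chunk.sequenceNumber <= ackId) {
      inFlight.delete(entry.chunk.id);
      releasedMs += entry.chunk.durationMs;
    }
  }
  return releasedMs;
}

const pending = new Map<string, InFlightEntry>();
for (let seq = 40; seq <= 44; seq++) {
  const id = `chunk_${seq}`;
  pending.set(id, { chunk: { id, sequenceNumber: seq, durationMs: 500 }, timestamp: 0 });
}

const released = applyBatchAck(pending, 42);
console.log(released);                   // 1500 (chunks 40, 41 and 42)
console.log(Array.from(pending.keys())); // ["chunk_43", "chunk_44"]
```

One message confirmed three chunks here; individual acknowledgments would have needed three.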

Timeout-Based Retry

Chunks without acknowledgment are automatically retried:

// Set 30-second timeout on send
private setAcknowledmentTimeout(chunkId: string) {
  const timeoutId = setTimeout(() => {
    this.handleChunkTimeout(chunkId)
  }, CHUNK_TIMEOUT_MS)  // 30,000ms

  this.acknowledgmentTimeouts.set(chunkId, timeoutId)
}

// Handle timeout
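private handleChunkTimeout(chunkId: string) {
  // The timer has fired, so drop its handle
  this.acknowledgmentTimeouts.delete(chunkId)

  const inFlightData = this.inFlight.get(chunkId)
  if (inFlightData) {
    this.inFlight.delete(chunkId)
    // Release the flow-control budget held by the timed-out chunk
    this.inFlightAudioDurationMs = Math.max(0, this.inFlightAudioDurationMs - inFlightData.chunk.durationMs)
    this.retryChunk(inFlightData.chunk)
  }
}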

Timeout Duration: 30 seconds (conservative for high-latency networks)

Exponential Backoff Retry

Failed chunks retry with increasing delays:

private retryChunk(chunk: PendingChunk) {
  chunk.retryCount = (chunk.retryCount || 0) + 1

  if (chunk.retryCount <= MAX_RETRY_ATTEMPTS) {  // MAX = 3
    setTimeout(() => {
      this.sendQueue.unshift(chunk)  // Priority: front of queue
      this.processQueueIfReady()
    }, jitter(1000 * 2 ** (chunk.retryCount - 1)))
  } else {
    // Report failure after max attempts
    this.handleChunkFailure(chunk)
  }
}

Retry Schedule:

  • 1st retry: ~1s (2⁰ × 1000ms ± jitter)
  • 2nd retry: ~2s (2¹ × 1000ms ± jitter)
  • 3rd retry: ~4s (2² × 1000ms ± jitter)
  • After 3 failures: Report error via onError callback

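Total Time: ~37 seconds from initial send to final failure when each retry fails immediately (30 s timeout + 1 s + 2 s + 4 s). If every attempt instead waits out the full 30-second ACK timeout, the worst case is roughly 127 seconds (4 × 30 s + 7 s of backoff).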
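
The ~37 s figure is one 30 s ACK timeout plus the three backoff delays. As a rough check, the sketch below computes that sum and, for comparison, the case where every attempt waits for its own ACK timeout before the next retry is scheduled. Jitter is ignored and the constant names are illustrative.

```typescript
const ACK_TIMEOUT = 30_000;
const RETRIES = 3;
const BASE_DELAY = 1_000;

// Sum of the un-jittered backoff delays: 1 s + 2 s + 4 s for three retries.
function totalBackoffMs(retries: number, baseMs: number): number {
  let total = 0;
  for (let n = 0; n < retries; n++) total += baseMs * 2 ** n;
  return total;
}

// Only the first attempt waits for the ACK timeout; retries fail immediately.
const fastFailMs = ACK_TIMEOUT + totalBackoffMs(RETRIES, BASE_DELAY);

// Every attempt (1 initial + 3 retries) waits out the full ACK timeout.
const allTimeoutsMs = (RETRIES + 1) * ACK_TIMEOUT + totalBackoffMs(RETRIES, BASE_DELAY);

console.log(fastFailMs);    // 37000
console.log(allTimeoutsMs); // 127000
```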

Queue Management

Queue Structure

// FIFO queue for chunks awaiting transmission
private sendQueue: PendingChunk[] = []

// Map of chunks sent but not yet acknowledged
private inFlight: Map<string, { chunk: PendingChunk; timestamp: number }> = new Map()

Flow Control

To prevent memory overflow and buffer bloat:

const MAX_IN_FLIGHT_DURATION_MS = 10000  // 10 seconds max unacknowledged

private canSendNow(chunkDurationMs: number) {
  return (
    this.ws?.readyState === WebSocket.OPEN &&
    this.isOnline &&
    this.inFlightAudioDurationMs + chunkDurationMs <= MAX_IN_FLIGHT_DURATION_MS
  )
}

Mechanism:

  • Tracks total duration of unacknowledged audio
  • Blocks new sends when limit reached
  • Waits for acknowledgments to free budget
  • Prevents network congestion

Example:

  • Max 10 seconds in-flight
  • Chunks are ~600ms each
  • Max ~16 chunks simultaneously
  • 17th chunk waits until ACK received
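
The mechanism above amounts to a reserve/release budget. The following sketch isolates that bookkeeping in a hypothetical InFlightBudget class (not part of the client) and reproduces the 600 ms example.

```typescript
// Hypothetical bookkeeping class for the in-flight audio budget.
class InFlightBudget {
  private usedMs = 0;
  private readonly maxMs: number;

  constructor(maxMs: number) {
    this.maxMs = maxMs;
  }

  // Reserves budget for a chunk about to be sent; false means "wait for ACKs".
  tryReserve(durationMs: number): boolean {
    if (this.usedMs + durationMs > this.maxMs) return false;
    this.usedMs += durationMs;
    return true;
  }

  // Called when a chunk is acknowledged, times out, or is re-queued.
  release(durationMs: number): void {
    this.usedMs = Math.max(0, this.usedMs - durationMs);
  }

  get inUseMs(): number {
    return this.usedMs;
  }
}

const budget = new InFlightBudget(10_000);
let sent = 0;
while (budget.tryReserve(600)) sent++;

console.log(sent);                   // 16
console.log(budget.inUseMs);         // 9600
budget.release(600);                 // one ACK arrives
console.log(budget.tryReserve(600)); // true
```

Clamping at zero in release() guards against the counter going negative if the same chunk is released twice.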

Queue Processing

private async processQueueIfReady() {
  // Guard: Check connection state
  if (!this.ws || this.ws.readyState !== WebSocket.OPEN ||
      !this.isOnline || this.isPaused) {
    return
  }

  while (this.sendQueue.length > 0) {
    const chunk = this.sendQueue[0]

    // Respect flow control limits
    if (!this.canSendNow(chunk.durationMs)) {
      break  // Wait for ACKs
    }

    this.sendQueue.shift()

    try {
      await this.sendChunkInternal(chunk, 'sendChunk')
    } catch (error) {
      if (error instanceof CircuitBreakerError) {
        // Return to queue, stop processing
        this.sendQueue.unshift(chunk)
        break
      }

      // Other errors: retry with backoff
      this.retryChunk(chunk)
      break
    }
  }
}

Queue Discipline:

  • FIFO (First In, First Out)
  • Priority for retries (added to front via unshift)
  • Respects flow control limits
  • Breaks on error to prevent starvation

Error Recovery Scenarios

Scenario 1: Network Drop During Recording

Flow:

  1. Browser detects network loss (offline event)
  2. isOnline flag set to false
  3. canSendNow() returns false (blocks queue)
  4. Audio continues recording into the PCM accumulator
  5. Accumulated chunks are flushed and held in sendQueue in memory
  6. In-flight chunks time out → moved back to the front of sendQueue
  7. Network returns (online event)
  8. resendPendingChunks() re-queues in-flight chunks
  9. connect() re-establishes WebSocket
  10. processQueueIfReady() drains the queue in sequence order

Result: ✅ Zero audio loss (network outages of typical duration)

Scenario 2: Browser Crash Mid-Session

Flow:

  1. Browser is force-killed or crashes — all RAM is lost
  2. Any audio still in sendQueue or inFlight that had not been acknowledged by the server is lost
  3. The session on the server side remains open for the TTL window

Mitigation: Minimise exposure by calling end() (or at minimum flushing the buffer) when the page is about to unload:

// Flush remaining audio and send END_TRANSCRIPTION before the page closes
window.addEventListener('beforeunload', () => {
  flushAudioBuffer(); // flush any sub-threshold PCM still accumulated
  ws?.send(JSON.stringify({ messageType: 'END_TRANSCRIPTION', sessionId }));
});

// Handle tab hidden (mobile or background tabs)
document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') {
    flushAudioBuffer();
  }
});

Result: ⚠️ Hard crashes may lose a small tail of unacknowledged audio. Graceful teardown via beforeunload prevents most data loss.

Scenario 3: Circuit Breaker Opens

Flow:

  1. Server returns 5 consecutive errors
  2. Circuit breaker transitions to OPEN
  3. isPaused set to true
  4. User shown: "Service temporarily unavailable"
  5. All sends fail immediately with CircuitBreakerError
  6. Chunks remain in sendQueue in memory
  7. After 60s, circuit transitions to HALF_OPEN
  8. Single test operation attempted
  9. Success: Circuit closes, queue processed
  10. Failure: Circuit reopens, wait another 60s

Result: ✅ Graceful degradation, no data loss

Scenario 4: Chunk Timeout (No ACK)

Flow:

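  1. Chunk sent at t=0
  2. No ACK by t=30s
  3. Timeout handler fires
  4. Chunk removed from in-flight
  5. Retry scheduled with backoff
  6. Retry 1 at t≈31s (1s delay)
  7. Retry 2 at t≈33s (2s delay), if retry 1 fails immediately
  8. Retry 3 at t≈37s (4s delay), if retry 2 fails immediately
  9. handleChunkFailure() called once retry 3 also fails
  10. Error reported via onError callback

Result: ✅ 4 total attempts. Failure is reported after ~37s when retries fail fast, or after roughly 127s (4 × 30s timeouts + 7s of backoff) when every attempt waits out the full ACK timeout.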

Performance Characteristics

Network Performance

  • Chunk size: ~16 KB (0.5 s PCM16 @ 16 kHz, mono)
  • Send frequency: ~2 chunks/sec (500 ms intervals)
  • Bandwidth: ~32 KB/sec (~256 Kbps)
  • In-flight limit: 10 seconds of audio (~320 KB of raw PCM, ~427 KB once Base64-encoded)

Memory Footprint

  • Send queue: 200 KB - 1 MB (10-50 chunks)
  • In-flight map: ~320 KB (~16 chunks)
  • Audio buffer: ~19 KB (~0.6s buffering)
  • Total RAM: ~500 KB - 1.5 MB (worst case)
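
The network figures follow directly from the audio format. The sketch below derives them for PCM16 mono at 16 kHz and also shows the Base64 overhead, since chunks are held and sent as Base64 strings. Function names are illustrative.

```typescript
const SAMPLE_RATE_HZ = 16_000;
const BYTES_PER_SAMPLE = 2; // PCM16, mono

// Raw PCM size for a given duration of audio.
function rawPcmBytes(durationMs: number): number {
  return (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * durationMs) / 1000;
}

// Padded Base64 emits 4 output characters for every 3 input bytes.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

console.log(rawPcmBytes(500));               // 16000 bytes per 0.5 s chunk
console.log(base64Length(rawPcmBytes(500))); // 21336 characters once encoded
console.log(rawPcmBytes(1000) * 8);          // 256000 bits per second
console.log(rawPcmBytes(10_000));            // 320000 bytes for 10 s of audio
```

Base64 adds about a third on top of the raw size, which is worth keeping in mind when estimating queue memory.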

Monitoring API

We can expose methods for monitoring health and performance:

Client State

// Get current connection state
const state = client.getState();
// Returns: 'idle' | 'connecting' | 'connected' | 'reconnecting' | 'disconnected' | 'error' | 'PAUSED'

// Get current session ID
const sessionId = client.getSessionId();
// Returns: UUID string

Network Metrics

const metrics = client.getNetworkMetrics();
// Returns: {
//   consecutiveFailures: number,
//   lastFailureTime: number,
//   networkLatency: number,
//   throughput: number,
//   isOnline: boolean
// }

Circuit Breaker Status

const stats = client.getCircuitBreakerStats();
// Returns: {
//   state: 'CLOSED' | 'OPEN' | 'HALF_OPEN',
//   failureCount: number,
//   lastFailureTime: number,
//   retryAfter: number,
//   operations: Record<string, OperationStats>
// }

const state = client.getCircuitBreakerState();
// Returns: 'CLOSED' | 'OPEN' | 'HALF_OPEN'

Storage Information

const storageInfo = await client.getStorageInfo();
// Returns: {
//   totalChunks: number,
//   totalSessions: number,
//   currentSessionChunks: number
// }

Best Practices

1. Monitor Circuit Breaker State

client.onState = (state) => {
  if (state === "PAUSED") {
    // Show user-friendly message
    showNotification("Transcription temporarily paused. Retrying...");
  }
};

2. Handle Errors Gracefully

client.onError = (error) => {
  console.error("Transcription error:", error);
  // Log to monitoring service
  logError(error, { sessionId: client.getSessionId() });
};

3. Track Network Status

client.onNetworkStatus = (isOnline) => {
  if (!isOnline) {
    showWarning("Network connection lost. Audio being saved locally.");
  } else {
    showSuccess("Network reconnected. Resuming transcription.");
  }
};

4. Call end() on page unload

// Prevents orphaned sessions and minimises data loss on tab close
window.addEventListener('beforeunload', () => {
  flushAudioBuffer();
  ws?.send(JSON.stringify({ messageType: 'END_TRANSCRIPTION', sessionId }));
});

5. Use crypto.randomUUID() for session IDs

// Always use the platform-native UUID generator — no custom implementations needed
const sessionId = crypto.randomUUID();

6. Monitor queue depth

// Log a warning if the send queue grows large (sign of sustained disconnection)
setInterval(() => {
  if (sendQueue.length > 40) {
    console.warn('Large send queue:', sendQueue.length, 'chunks pending');
  }
}, 10_000);

7. Clean up on session end

// After transcription is complete, close the audio pipeline
workletNode.disconnect();
audioContext.close();
mediaStream.getTracks().forEach(t => t.stop());

Complete Resilient Integration Example

The following is a self-contained TypeScript snippet that combines all the protection mechanisms described in this guide. It can be used as a starting point for any integration that needs to be robust against network instability.

// ─── Types ────────────────────────────────────────────────────────────────────

interface PendingChunk {
  id: string;
  sessionId: string;
  sequenceNumber: number;
  bytesB64: string;
  durationMs: number;
  retryCount: number;
}

// ─── Configuration ────────────────────────────────────────────────────────────

const WS_URL           = 'wss://live.invoxmedical.com';
const ACCESS_TOKEN     = 'your_access_token';
const CALLBACK_URL     = encodeURIComponent('https://example.mydomain.com');
const SESSION_ID       = crypto.randomUUID();

// Audio accumulation: 0.5 s of PCM16 @ 16 kHz mono = 16 000 bytes
const TARGET_PCM_BYTES       = 16_000 * 2 * 0.5;
// Backoff
const INITIAL_BACKOFF_MS     = 500;
const MAX_BACKOFF_MS         = 30_000;
// Flow control: max 10 s of audio in-flight at once
const MAX_IN_FLIGHT_DURATION = 10_000;
// Circuit breaker
const CB_FAILURE_THRESHOLD   = 5;
const CB_COOLDOWN_MS         = 60_000;
// Chunk acknowledgment timeout
const ACK_TIMEOUT_MS         = 30_000;
const MAX_RETRY_ATTEMPTS     = 3;

// ─── State ────────────────────────────────────────────────────────────────────

let ws: WebSocket | null = null;
let backoffMs = INITIAL_BACKOFF_MS;
let intentionalClose = false;
let isOnline = navigator.onLine;
let isPaused = false;
let sequenceNumber = 0;
let inFlightAudioDurationMs = 0;

const sendQueue: PendingChunk[]                        = [];
const inFlight = new Map<string, { chunk: PendingChunk; timestamp: number }>();
const ackTimeouts = new Map<string, ReturnType<typeof setTimeout>>();

// PCM accumulator
let pcmBuffer: ArrayBuffer[] = [];
let pcmBytesCollected = 0;

// Circuit breaker
let cbFailures = 0;
let cbOpenUntil = 0;

// ─── Helpers ──────────────────────────────────────────────────────────────────

function arrayBufferToBase64(buffer: ArrayBuffer): string {
  const bytes = new Uint8Array(buffer);
  let binary = '';
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary);
}

function jitter(ms: number): number {
  return Math.floor(ms * (Math.random() * 0.4 + 0.8));
}

function isCircuitOpen(): boolean {
  return cbFailures >= CB_FAILURE_THRESHOLD && Date.now() < cbOpenUntil;
}

function recordFailure(): void {
  cbFailures++;
  if (cbFailures >= CB_FAILURE_THRESHOLD) {
    cbOpenUntil = Date.now() + CB_COOLDOWN_MS;
    isPaused = true;
    console.warn('[circuit] Circuit opened. Pausing for', CB_COOLDOWN_MS / 1000, 's');
    setTimeout(() => {
      isPaused = false;
      cbFailures = 0;
      processQueueIfReady();
    }, CB_COOLDOWN_MS);
  }
}

function recordSuccess(): void {
  cbFailures = 0;
}

// ─── Queue Management ─────────────────────────────────────────────────────────

function canSendNow(durationMs: number): boolean {
  return (
    ws?.readyState === WebSocket.OPEN &&
    isOnline &&
    !isPaused &&
    !isCircuitOpen() &&
    inFlightAudioDurationMs + durationMs <= MAX_IN_FLIGHT_DURATION
  );
}

function processQueueIfReady(): void {
  while (sendQueue.length > 0) {
    const chunk = sendQueue[0];
    if (!canSendNow(chunk.durationMs)) break;

    sendQueue.shift();
    inFlight.set(chunk.id, { chunk, timestamp: Date.now() });
    inFlightAudioDurationMs += chunk.durationMs;

    // Set acknowledgment timeout
    ackTimeouts.set(chunk.id, setTimeout(() => handleAckTimeout(chunk.id), ACK_TIMEOUT_MS));

    try {
      ws!.send(JSON.stringify({
        message: chunk.bytesB64,
        messageType: 'AUDIO',
        sessionId: chunk.sessionId,
      }));
      recordSuccess();
    } catch {
      // Return chunk to queue, cancel its ACK timer, and stop
      clearTimeout(ackTimeouts.get(chunk.id));
      ackTimeouts.delete(chunk.id);
      sendQueue.unshift(chunk);
      inFlight.delete(chunk.id);
      inFlightAudioDurationMs -= chunk.durationMs;
      break;
    }
  }
}

// ─── Acknowledgment Handling ──────────────────────────────────────────────────

function acknowledgeChunk(chunkId: string): void {
  const data = inFlight.get(chunkId);
  if (!data) return;

  clearTimeout(ackTimeouts.get(chunkId));
  ackTimeouts.delete(chunkId);
  inFlight.delete(chunkId);
  inFlightAudioDurationMs = Math.max(0, inFlightAudioDurationMs - data.chunk.durationMs);
  processQueueIfReady();
}

function acknowledgeBatch(ackId: number): void {
  for (const [chunkId, data] of inFlight) {
    if (data.chunk.sequenceNumber <= ackId) {
      acknowledgeChunk(chunkId);
    }
  }
}

function handleAckTimeout(chunkId: string): void {
  const data = inFlight.get(chunkId);
  if (!data) return;

  inFlight.delete(chunkId);
  inFlightAudioDurationMs = Math.max(0, inFlightAudioDurationMs - data.chunk.durationMs);
  retryChunk(data.chunk);
}

function retryChunk(chunk: PendingChunk): void {
  chunk.retryCount++;
  if (chunk.retryCount <= MAX_RETRY_ATTEMPTS) {
    setTimeout(() => {
      sendQueue.unshift(chunk); // Priority: front
      processQueueIfReady();
    }, jitter(1000 * Math.pow(2, chunk.retryCount - 1)));
  } else {
    console.error('[transcription] Chunk failed after', MAX_RETRY_ATTEMPTS, 'retries:', chunk.id);
    recordFailure();
  }
}

// ─── Network Recovery ─────────────────────────────────────────────────────────

function resendPendingChunks(): void {
  let recoveredMs = 0;
  for (const [chunkId, data] of inFlight) {
    clearTimeout(ackTimeouts.get(chunkId));
    ackTimeouts.delete(chunkId);
    recoveredMs += data.chunk.durationMs;
    sendQueue.unshift(data.chunk);
  }
  inFlight.clear();
  inFlightAudioDurationMs = Math.max(0, inFlightAudioDurationMs - recoveredMs);
  sendQueue.sort((a, b) => a.sequenceNumber - b.sequenceNumber);
}

window.addEventListener('online',  () => { isOnline = true;  connectWebSocket(); });
window.addEventListener('offline', () => { isOnline = false; });

// ─── WebSocket with Exponential Backoff ───────────────────────────────────────

function connectWebSocket(): void {
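  if (intentionalClose || !isOnline) return;
  // Avoid opening a second socket if one is already open or connecting
  if (ws && (ws.readyState === WebSocket.OPEN || ws.readyState === WebSocket.CONNECTING)) return;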

  ws = new WebSocket(
    `${WS_URL}?callbackUrl=${CALLBACK_URL}&accessToken=${ACCESS_TOKEN}&sessionId=${SESSION_ID}`
  );

  ws.onopen = () => {
    backoffMs = INITIAL_BACKOFF_MS;
    resendPendingChunks();
    processQueueIfReady();
  };

  ws.onmessage = (event) => {
    const data = JSON.parse(event.data);

    // Individual chunk ACK
    if (data.chunkId || data.id) {
      acknowledgeChunk(data.chunkId ?? data.id);
    }

    // Batch ACK: all chunks with sequenceNumber <= ack_id are confirmed
    if (data.type === 'AUDIO_CHUNK_ACK' && typeof data.ack_id === 'number') {
      acknowledgeBatch(data.ack_id);
    }

    // Transcription complete
    if (data.isTranscriptionCompleted === true) {
      console.log('[transcription] Complete:', data);
    }
  };

  ws.onclose = () => {
    if (intentionalClose) return;
    setTimeout(() => {
      backoffMs = Math.min(MAX_BACKOFF_MS, backoffMs * 2);
      connectWebSocket();
    }, jitter(backoffMs));
  };

  ws.onerror = () => {
    recordFailure();
    ws?.close();
  };
}

// ─── Audio Accumulator (0.5 s threshold) ─────────────────────────────────────

function onPcmChunk(chunk: ArrayBuffer): void {
  pcmBuffer.push(chunk);
  pcmBytesCollected += chunk.byteLength;

  if (pcmBytesCollected >= TARGET_PCM_BYTES) {
    flushAudioBuffer();
  }
}

function flushAudioBuffer(): void {
  if (pcmBuffer.length === 0) return;

  const combined = new Uint8Array(pcmBytesCollected);
  let offset = 0;
  for (const buf of pcmBuffer) {
    combined.set(new Uint8Array(buf), offset);
    offset += buf.byteLength;
  }

  const durationMs = Math.round((pcmBytesCollected / 2 / 16_000) * 1000);
  const seq = sequenceNumber++;
  const chunk: PendingChunk = {
    id: `chunk_${SESSION_ID}_${seq}`,
    sessionId: SESSION_ID,
    sequenceNumber: seq,
    bytesB64: arrayBufferToBase64(combined.buffer),
    durationMs,
    retryCount: 0,
  };

  pcmBuffer = [];
  pcmBytesCollected = 0;

  sendQueue.push(chunk);
  processQueueIfReady();
}

// ─── AudioWorklet Capture ─────────────────────────────────────────────────────

const processorCode = `
class AudioProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.inputSampleRate = sampleRate; // AudioWorkletGlobalScope global: the context's actual rate (often 48000)
    this.targetSampleRate = 16000;
    this.targetChunkSamples = Math.floor(this.targetSampleRate * 0.085);
    this.samplesNeeded = Math.floor(
      this.targetChunkSamples * (this.inputSampleRate / this.targetSampleRate)
    );
    this.bufferSize = this.inputSampleRate * 2;
    this.buffer = new Float32Array(this.bufferSize);
    this.writeIndex = 0;
  }
  downsample(length) {
    const ratio = this.inputSampleRate / this.targetSampleRate;
    const outLen = Math.round(length / ratio);
    const result = new Float32Array(outLen);
    for (let i = 0; i < outLen; i++) {
      const start = Math.floor(i * ratio);
      const end = Math.min(Math.floor((i + 1) * ratio), length);
      let sum = 0;
      for (let j = start; j < end; j++) sum += this.buffer[j];
      result[i] = sum / (end - start);
    }
    return result;
  }
  floatTo16BitPCM(input) {
    const buf = new ArrayBuffer(input.length * 2);
    const view = new DataView(buf);
    for (let i = 0; i < input.length; i++) {
      const s = Math.max(-1, Math.min(1, input[i]));
      view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
    }
    return buf;
  }
  process(inputs) {
    const input = inputs[0];
    if (!input || !input[0]) return true;
    const data = input[0];
    if (this.writeIndex + data.length <= this.bufferSize) {
      this.buffer.set(data, this.writeIndex);
      this.writeIndex += data.length;
    } else {
      this.writeIndex = 0;
      return true;
    }
    while (this.writeIndex >= this.samplesNeeded) {
      const downsampled = this.downsample(this.samplesNeeded);
      const pcm16 = this.floatTo16BitPCM(downsampled);
      this.buffer.copyWithin(0, this.samplesNeeded, this.writeIndex);
      this.writeIndex -= this.samplesNeeded;
      this.port.postMessage({ pcm16, samples: downsampled.length }, [pcm16]);
    }
    return true;
  }
}
registerProcessor('audio-processor', AudioProcessor);
`;

const blob = new Blob([processorCode], { type: 'application/javascript' });
const processorUrl = URL.createObjectURL(blob);

const mediaStream   = await navigator.mediaDevices.getUserMedia({ audio: true });
const audioContext  = new AudioContext();
await audioContext.audioWorklet.addModule(processorUrl);
URL.revokeObjectURL(processorUrl);

const source      = audioContext.createMediaStreamSource(mediaStream);
const workletNode = new AudioWorkletNode(audioContext, 'audio-processor');
source.connect(workletNode);
workletNode.connect(audioContext.destination);

workletNode.port.onmessage = (event) => {
  onPcmChunk(event.data.pcm16);
};

// ─── Graceful teardown on page close ─────────────────────────────────────────

window.addEventListener('beforeunload', () => {
  flushAudioBuffer();
  ws?.send(JSON.stringify({ messageType: 'END_TRANSCRIPTION', sessionId: SESSION_ID }));
});

document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') flushAudioBuffer();
});

// ─── Start ────────────────────────────────────────────────────────────────────

connectWebSocket();

// To stop recording intentionally:
function stopRecording(): void {
  intentionalClose = true;
  flushAudioBuffer();
  ws?.send(JSON.stringify({ messageType: 'END_TRANSCRIPTION', sessionId: SESSION_ID }));
}

Conclusion

The Invox Medical streaming transcription client provides enterprise-grade audio protection through:

  • In-memory send queue with sequence-ordered retry
  • Intelligent retry logic with exponential backoff
  • Circuit breaker protection against cascading failures
  • Dual acknowledgment system for high-throughput efficiency
  • Automatic recovery from network failures and transient outages
  • Flow control to prevent buffer overflow
  • Online/offline detection with immediate queue drain on reconnect
  • Graceful teardown via beforeunload to minimise data loss on page close

This architecture ensures that audio transcription remains robust and reliable even under adverse conditions including network failures, server outages, and high-latency scenarios.

The system prioritises data integrity while maintaining optimal performance characteristics for real-world production environments.