General considerations


Browser compatibility

The Dictation SDK relies on the Web Audio API for microphone capture, which is available in all modern browsers. The following environments are supported:

  • Chromium-based browsers (Chrome, Edge, Opera) — fully supported, recommended for production.
  • Firefox — supported.
  • Safari — supported on macOS and iOS (version 14.1 or later).

Internet Explorer is not supported. Ensure your application does not serve the SDK to IE users.

The SDK requires a secure context (https:// or localhost). Browsers block microphone access on plain http:// origins.
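
The secure-context requirement can be checked before the SDK is loaded. The following is a minimal sketch, not part of the SDK: checkMicrophoneSupport is a hypothetical helper, and it assumes microphone access is obtained through the standard navigator.mediaDevices.getUserMedia entry point, which browsers only expose in secure contexts.

```javascript
// Hypothetical helper (not an SDK function): reports whether the current
// environment can capture microphone audio at all.
// `env` defaults to the global object; passing it in keeps the check testable.
function checkMicrophoneSupport(env = globalThis) {
  // window.isSecureContext is true on https:// origins and on localhost.
  if (env.isSecureContext !== true) {
    return { ok: false, reason: 'insecure-context' };
  }
  // mediaDevices is missing in old browsers and in insecure contexts.
  const media = env.navigator && env.navigator.mediaDevices;
  if (!media || typeof media.getUserMedia !== 'function') {
    return { ok: false, reason: 'no-getusermedia' };
  }
  return { ok: true, reason: null };
}
```

An application could call checkMicrophoneSupport() at startup and show an explanatory message instead of initializing dictation when ok is false.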

Connection modes

The SDK supports two ways to connect to the Dictation Service:

Remote Dictation Service

The recommended mode for web applications. Audio is processed on Invox Medical's cloud infrastructure. No software installation is required on the end user's machine.

Set useDictationService: true and provide the host assigned to your organization:

const connectionConfig = {
  host: 'your-host.invoxmedical.com',
  port: 8443,
  useDictationService: true
};

The libs/opus/ folder must be present alongside invox.min.js, as the SDK loads the Opus audio codec automatically to compress audio before transmission.

Local Dictation Agent

An alternative mode where audio is processed by the Invox Medical Dictation Agent installed on the user's machine. This can reduce network latency in environments where the agent is pre-deployed.

Set useDictationService: false. No host is required:

const connectionConfig = {
  port: 8443,
  useDictationService: false,
  localTimeoutAttempt: 13000 // ms to wait before giving up (default: 13000)
};

If the agent is not running, the connection attempt will reject after localTimeoutAttempt milliseconds. See Best Practices for a recommended fallback pattern.
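
One possible shape for such a fallback is sketched below. It is an illustration only and is not taken from Best Practices: connectWithFallback is a hypothetical helper, and connect stands for whatever function your application uses to open a connection, assumed here to accept a connection config and return a Promise that rejects on failure.

```javascript
// Hypothetical helper (not an SDK function): tries the Local Dictation Agent
// first and falls back to the Remote Dictation Service if that attempt rejects.
// `connect` is supplied by the caller and is assumed to return a Promise.
async function connectWithFallback(connect, localConfig, remoteConfig) {
  try {
    const session = await connect(localConfig);
    return { mode: 'local', session };
  } catch (localError) {
    // The local attempt failed (for example, after localTimeoutAttempt ms).
    // A failure of the remote attempt is left to propagate to the caller.
    const session = await connect(remoteConfig);
    return { mode: 'remote', session };
  }
}
```

With the default timeout, the user waits up to 13 seconds before the remote attempt starts, so a lower localTimeoutAttempt may be preferable when the fallback is expected to be common.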

Supported languages

The following dictation languages are available. The value passed to INVOX.SetLang() must be one of the constants below:

Language                   Constant
Spanish (Spain)            INVOX.LangEnum.es_ES
English (United States)    INVOX.LangEnum.en_US
Portuguese (Portugal)      INVOX.LangEnum.pt_PT
Portuguese (Brazil)        INVOX.LangEnum.pt_BR

The language must be set before calling INVOX.Login(). It determines both the speech recognition model and the locale used for the SDK's built-in UI elements.
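
If the dictation language should follow the browser locale, the locale tag has to be mapped to one of the four constants first. The sketch below shows one way to do that. pickLangKey is a hypothetical helper, and both the en_US fallback and the choice of the first listed variant for a bare language code (for example, pt resolving to pt_PT) are assumptions.

```javascript
// Keys of the four constants documented for INVOX.LangEnum.
const SUPPORTED_LANG_KEYS = ['es_ES', 'en_US', 'pt_PT', 'pt_BR'];

// Hypothetical helper (not an SDK function): maps a locale tag such as
// "pt-BR" to a supported key, falling back to `fallback` when nothing matches.
function pickLangKey(locale, fallback = 'en_US') {
  const normalized = String(locale || '').replace('-', '_');
  const [language, region] = normalized.split('_');
  const lang = (language || '').toLowerCase();
  // Prefer an exact language + region match.
  const exact = `${lang}_${(region || '').toUpperCase()}`;
  if (SUPPORTED_LANG_KEYS.includes(exact)) return exact;
  // Otherwise take the first supported variant of the same language.
  const sameLanguage = SUPPORTED_LANG_KEYS.find((key) => key.startsWith(lang + '_'));
  return lang && sameLanguage ? sameLanguage : fallback;
}
```

The result can then be passed on before login, for example INVOX.SetLang(INVOX.LangEnum[pickLangKey(navigator.language)]).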

Audio codec

The SDK uses the Opus codec to compress microphone audio before sending it to the Remote Dictation Service. The codec resources are included in the SDK package under libs/opus/. This folder must remain co-located with invox.min.js — do not move or rename it.

No additional configuration is required; the SDK loads the codec automatically.

Hardware dictation device support

In addition to standard microphone input, the SDK supports the following USB dictation devices:

  • Nuance PowerMic — buttons configurable via INVOX.CustomizeNuanceControl()
  • Philips SpeechMike — buttons configurable via INVOX.CustomizeSpeechMikeControl()
  • Olympus DR / RM devices — buttons configurable via INVOX.CustomizeOlympusDRControl() and INVOX.CustomizeOlympusRMControl()

Any custom function can be assigned to a device button. Example for the record button of a Nuance PowerMic:

INVOX.CustomizeNuanceControl(INVOX.NuanceButtonEnum.RECORD, function () {
  INVOX.SwitchDictation();
});

Contact support if your device model is not listed here.

Acoustic model adaptation

The SDK can trigger a personalized adaptation of the speech recognition model to improve accuracy for an individual user. This process is initiated with:

INVOX.StartAdaptation();

The adaptation runs in the background. Register the corresponding callbacks inside INVOX.CustomizeComponents() to track its progress:

INVOX.CustomizeComponents(function () {
  INVOX.OnStartAdaptation(function (message) {
    console.log('Adaptation started:', message);
  });

  INVOX.OnFinishAdaptation(function (message, info) {
    console.log(`Adaptation complete in ${info.duration}s`);
  });

  INVOX.OnRejectAdaptation(function (message) {
    console.warn('Adaptation was rejected:', message);
  });

  INVOX.OnErrorAdaptation(function (message, error) {
    console.error('Adaptation error:', message, error);
  });
});

Adaptation requires the user to have sufficient recorded dictation data. It is typically performed periodically rather than at every session start.
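
One way to run adaptation periodically is to remember when it last completed and compare that against an interval. The sketch below is illustrative only: isAdaptationDue is a hypothetical helper, the 30-day default is an arbitrary assumption rather than a documented recommendation, and where the timestamp is stored is left to the application.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper (not an SDK function): decides whether enough time has
// passed since the last adaptation. Timestamps are milliseconds since epoch.
function isAdaptationDue(lastRunMs, nowMs, intervalDays = 30) {
  // No recorded run yet: treat adaptation as due.
  if (lastRunMs == null) return true;
  return nowMs - lastRunMs >= intervalDays * DAY_MS;
}
```

An application might call INVOX.StartAdaptation() only when isAdaptationDue(lastRun, Date.now()) returns true, and update the stored timestamp from the OnFinishAdaptation callback. A first attempt made before enough dictation data exists would be reported through OnRejectAdaptation.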

Custom writer targets

If your application uses an editor other than a plain <textarea>, you can implement a custom writer by extending INVOX.TextWriterBase:

const myWriter = new INVOX.TextWriterBase();

myWriter.write        = function (text) { /* insert text into your editor */ };
myWriter.getText      = function ()     { return /* full editor content */; };
myWriter.getSelection = function ()     { return /* Range object */; };
myWriter.setSelection = function (r)    { /* apply range */ };
myWriter.undo         = function ()     { /* undo */ };
myWriter.redo         = function ()     { /* redo */ };

INVOX.SetTextWriter(myWriter);
INVOX.SetWriterTarget(myEditorInstance);

For asynchronous editor APIs, extend INVOX.TextWriterBaseAsync instead.