
Is there a way to use the Javascript SpeechRecognition API with an audio file?

I want to use the SpeechRecognition API with an audio file (MP3, WAV, etc.). Is that possible?

The Surrican asked Aug 30 '25 18:08



2 Answers

The short answer is no.

The Web Speech API specification does not prohibit this (a browser could let the end user choose a file to use as input), but in the current draft the audio input stream is never exposed to the calling JavaScript code, so you have no way to read or change the audio that is fed to the speech recognition service.

The specification was designed so that JavaScript code has access only to the result text coming from the speech recognition service.
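
To illustrate, here is a minimal sketch of how recognition is typically driven from a page. The helper name `recognizeOnce` and its injectable `Recognition` parameter are assumptions made for illustration (the injection only makes the sketch testable outside a browser); Chrome exposes the constructor as `webkitSpeechRecognition`. Nothing in this flow hands a file or audio buffer to the recognizer: the browser chooses the audio source, and the script only receives transcripts.

```javascript
// Sketch: run one recognition pass and resolve with the transcript text.
// `Recognition` is a hypothetical injectable parameter; by default the
// browser's own constructor is used.
function recognizeOnce(
  Recognition = globalThis.SpeechRecognition || globalThis.webkitSpeechRecognition,
  lang = "en-US"
) {
  return new Promise((resolve, reject) => {
    if (!Recognition) {
      reject(new Error("SpeechRecognition is not available in this environment"));
      return;
    }
    const recognition = new Recognition();
    recognition.lang = lang;
    recognition.interimResults = false;

    // The result event carries text alternatives only, never audio data.
    recognition.onresult = (event) => {
      resolve(event.results[0][0].transcript);
    };
    recognition.onerror = (event) => reject(new Error(event.error));
    // If the session ends without a result, fail instead of hanging.
    // (This is a no-op when the promise has already been resolved.)
    recognition.onend = () => reject(new Error("no speech recognized"));

    // Called without arguments, so the browser captures from its own input device.
    recognition.start();
  });
}
```

In a page, `recognizeOnce().then(console.log)` would log whatever the browser hears on its input device.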

Tiago Sousa answered Sep 02 '25 06:09



Basically, you can use it only with the default audio input device, which is chosen at the OS level.

So you just need to play your file into your default audio input.

Two options are possible:

1

  • Install VB-CABLE: https://www.vb-audio.com/Cable/
  • Update the system settings to use the VB-CABLE device as both the default audio output and the default audio input
  • Play your file with any audio player you have
  • Recognize it, e.g. with the standard demo UI: https://www.google.com/intl/fr/chrome/demos/speech.html

Tested this today, and it works perfectly :-)
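
Before playing the file, it can help to check which audio inputs the browser actually sees. The following is a minimal sketch using `navigator.mediaDevices.enumerateDevices()`; the helper name `listAudioInputs` and its injectable `mediaDevices` parameter are assumptions for illustration, not part of the setup above.

```javascript
// Sketch: list the audio input devices the browser can see.
// `mediaDevices` is a hypothetical injectable parameter for testing;
// in a page it defaults to navigator.mediaDevices.
async function listAudioInputs(mediaDevices = globalThis.navigator?.mediaDevices) {
  if (!mediaDevices) {
    throw new Error("navigator.mediaDevices is not available");
  }
  const devices = await mediaDevices.enumerateDevices();
  // Labels may be empty strings until the page has microphone permission.
  return devices
    .filter((device) => device.kind === "audioinput")
    .map((device) => ({ id: device.deviceId, label: device.label }));
}
```

Running `listAudioInputs().then(console.table)` in the DevTools console should show the virtual cable among the inputs once it is installed; the exact label depends on the driver.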

2

I HAVE NOT TESTED THIS, so I cannot confirm that it works, but you may be able to feed an audio file into Chrome using Selenium, like this:

import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

ChromeOptions options = new ChromeOptions();
options.addArguments("--allow-file-access-from-files",
                     "--use-fake-ui-for-media-stream",      // skip the microphone permission prompt
                     "--allow-file-access",
                     "--use-file-for-fake-audio-capture=D:\\PATH\\TO\\WAV\\xxx.wav", // WAV file served as the fake microphone
                     "--use-fake-device-for-media-stream"); // use a fake capture device instead of real hardware
// ChromeOptions can be passed directly; the DesiredCapabilities wrapper is not needed.
ChromeDriver driver = new ChromeDriver(options);

But I'm not sure whether this stream will replace the default audio input.
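
For anyone driving Chrome from JavaScript instead of Java, the same switches can be assembled with a small helper. This is only a sketch: the function name `fakeAudioChromeArgs` and its parameter are hypothetical, the switches are copied from the Java snippet above, and it is just as untested against SpeechRecognition as that snippet.

```javascript
// Sketch: build the Chrome command-line switches from the Java snippet
// for a given WAV file. Function name and parameter are hypothetical.
function fakeAudioChromeArgs(wavPath) {
  if (typeof wavPath !== "string" || wavPath.length === 0) {
    throw new TypeError("wavPath must be a non-empty string");
  }
  return [
    "--allow-file-access-from-files",
    "--use-fake-ui-for-media-stream", // skip the microphone permission prompt
    "--allow-file-access",
    `--use-file-for-fake-audio-capture=${wavPath}`, // WAV file served as the fake microphone
    "--use-fake-device-for-media-stream", // fake capture device instead of real hardware
  ];
}
```

With the `selenium-webdriver` package for Node (an assumed setup), the result could be passed along as `new chrome.Options().addArguments(...fakeAudioChromeArgs(path))`.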

Andrii Muzalevskyi answered Sep 02 '25 06:09
