Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


javascript - Google Cloud Speech API on HTML page

I have implemented the Google Cloud Speech API in a C# console application.

Now I want to implement the same on an HTML page. Below are the steps I have followed.

I capture the voice on the HTML page using MediaRecorder and post it to a Web API:

    mediaRecorder.ondataavailable = function (e) {
        chunks.push(e.data);
        var blob = new Blob(chunks, { 'type': 'audio/wav; codecs=0' });
        var fd = new FormData();
        fd.append('fname', 'test.wav');
        //fd.append('data', chunks[0]);
        fd.append('data', blob);
        $.ajax({
            type: 'POST',
            url: APIUrl,
            data: fd,
            processData: false,
            contentType: false
        }).done(function (data) {
            console.log(data);
        });
    };

On the Web API I am using Google Cloud speech recognition. Unfortunately, it returns a null response. The test file provided by Google, Audio.raw, works fine with the same code, but audio sent from the web page never produces any results.

    string text = "";
    var speech = SpeechClient.Create();
    var response = speech.Recognize(new RecognitionConfig()
    {
        Encoding = RecognitionConfig.Types.AudioEncoding.OggOpus,
        SampleRateHertz = 48000,
        LanguageCode = "en",
    }, RecognitionAudio.FromStream(HttpContext.Current.Request.Files[0].InputStream));
    foreach (var result in response.Results)
    {
        foreach (var alternative in result.Alternatives)
        {
            text = alternative.Transcript;
        }
    }

I have tried different combinations of Encoding and SampleRateHertz, but none of them works. I also tried saving the audio to the local drive in WAV format first and running recognition on that local file, but that does not work either.

Asked by Abhishek (translated from Stack Overflow)


1 Answer


You are not recording in the format you think you are recording.
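One way to confirm this is to look at the first bytes of the recorded data instead of trusting the MIME label or the file extension. The following is a minimal sketch, not part of the original answer: `sniffContainer` is a hypothetical helper name, and it only recognizes the well-known leading signatures of Ogg ("OggS"), WebM/Matroska (the EBML header) and WAV ("RIFF" followed by "WAVE" at offset 8).

```javascript
// Hypothetical helper: guess the container format from the leading "magic" bytes.
// `bytes` is expected to be a Uint8Array (or a plain array of byte values).
function sniffContainer(bytes) {
  // True when `sig` appears in `bytes` starting at `offset`.
  const startsWith = (sig, offset = 0) =>
    bytes.length >= offset + sig.length &&
    sig.every((b, i) => bytes[offset + i] === b);

  // "OggS" capture pattern at the start of every Ogg page.
  if (startsWith([0x4f, 0x67, 0x67, 0x53])) return 'ogg';

  // EBML header ID, used by both Matroska and WebM files.
  if (startsWith([0x1a, 0x45, 0xdf, 0xa3])) return 'webm/matroska';

  // "RIFF" at offset 0 and "WAVE" at offset 8.
  if (startsWith([0x52, 0x49, 0x46, 0x46]) &&
      startsWith([0x57, 0x41, 0x56, 0x45], 8)) return 'wav';

  return 'unknown';
}
```

In a browser, the first bytes of the recorded blob could be read with something like `new Uint8Array(await blob.slice(0, 12).arrayBuffer())` and passed to the helper before uploading.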

MediaRecorder in Chrome only supports the Opus codec in a WebM container. MediaRecorder in Firefox, however, supports the Opus codec in an Ogg container. This can be quickly validated by running the following snippet in the respective browser's JS console. You will see true or false depending on the support.

    MediaRecorder.isTypeSupported('audio/webm;codecs=opus')
    MediaRecorder.isTypeSupported('audio/ogg;codecs=opus')

Google Cloud Speech API supports Opus, but only in an Ogg container. If you run the code in Firefox and try the output with the Speech API, it should work.

For this to work with Chrome, you will need to re-mux the file into an Ogg container on the server side before sending it to the Cloud Speech API. You can use ffmpeg to do so:

    ffmpeg -i file_chrome.wav -acodec copy resources/file.oga

Note that this is a re-mux and not a re-encode. You are just copying the same audio data into a different container.

Bonus tip: if you are on Linux/Mac, you can use the file <file_name> command to check the output file type. The Chrome file shows up as WebM, and the Firefox output shows up as Ogg data, Opus audio.
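The isTypeSupported check can also be done in page code, so the recorder is created with an explicit type and the page knows whether the server will have to re-mux. This is only a sketch under assumptions: `pickOpusMimeType` and `needsRemux` are hypothetical helper names, and the recorder API is passed in as a parameter so the logic can be exercised outside a browser.

```javascript
// Hypothetical helper: return the first Opus MIME type the recorder API supports,
// preferring Ogg because that is the container the answer says the Speech API accepts.
// `recorderApi` is any object with an isTypeSupported(type) method, e.g. MediaRecorder.
function pickOpusMimeType(recorderApi) {
  const candidates = ['audio/ogg;codecs=opus', 'audio/webm;codecs=opus'];
  return candidates.find((type) => recorderApi.isTypeSupported(type)) || null;
}

// Hypothetical helper: true when the chosen type is not Ogg, meaning the upload
// would still need the server-side re-mux step before recognition.
function needsRemux(mimeType) {
  return mimeType !== null && !mimeType.startsWith('audio/ogg');
}
```

In the page this might be used as `const mimeType = pickOpusMimeType(MediaRecorder);` followed by `new MediaRecorder(stream, { mimeType })`, with the result of `needsRemux(mimeType)` sent along with the upload so the server can decide whether to run ffmpeg.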
