Google text to speech SSML ignores pauses occasionally for unknown reasons

asked Jan 27, 2021 in Technique [Technology] by 深蓝 (71.8m points)

I've had this problem ever since I began using Google's Text-to-Speech, and now I'm determined to fix it. 22% of the time the SSML is not honored and a text is rendered without pauses, for no reason that I'm aware of. I really wish Google would just put the pauses in automatically for me. Audio without pauses is virtually unlistenable. In short, the program ignores the syntax

<break time="0.4s"/>

But it only does this for some of the texts. I should also add that I divide the text into chunks of, I think, 3000 characters, and the software either obeys all of the break times in a chunk or, 5% of the time, ignores all of the break times in that chunk.
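
For reference, here is a minimal sketch of the kind of chunking described above. It is not the exact code that produced the failing chunks: the helper names `chunk_sentences` and `to_ssml`, the sentence-splitting regex, and the choice to measure the 3000-character limit before the SSML tags are added are all assumptions made for illustration.

```python
import re
from xml.sax.saxutils import escape

MAX_CHARS = 3000                  # assumed chunk size ("I think 3000")
BREAK = '<break time="0.4s"/>'    # the pause tag quoted above


def chunk_sentences(text, max_chars=MAX_CHARS):
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole, so such a
    chunk can exceed the limit.
    """
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ''
    for sentence in sentences:
        candidate = f'{current} {sentence}'.strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def to_ssml(chunk):
    """Escape the text, put a break between sentences, wrap in <speak>."""
    sentences = re.split(r'(?<=[.!?])\s+', chunk)
    body = f' {BREAK} '.join(escape(s) for s in sentences)
    return f'<speak>{body}</speak>'
```

Each string returned by `to_ssml` would then be passed as the `ssml=` argument of `SynthesisInput`. Because the tags are added after chunking, the SSML sent to the service is longer than the plain-text chunk it came from.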

Because Stack Overflow does not display text between < >, I cannot post the actual text that causes the problem inline, so I have to link to it instead.

The exact text I converted into audio is located here:

problematic texts

Each text is preceded by a number surrounded by __ . I ran the conversion twice, and the following chunks failed both times:

16 41 46 58 59 61 65 74 80 85 86 87 90 91 92 94 95 96 97 98

The following chunks failed once out of the two tries:

40 45 47 81 82 89

Here is the code I'm using

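import os
from google.cloud import texttospeech

# Point the client library at the service-account key file.
str1 = 'my_credentials.json'
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = str1
client = texttospeech.TextToSpeechClient()
# txt1 holds the SSML for one chunk of text.
input_text = texttospeech.SynthesisInput(ssml=txt1)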
voice = texttospeech.VoiceSelectionParams(
    language_code='en-US',
    name='en-US-Wavenet-C',
    ssml_gender=texttospeech.SsmlVoiceGender.FEMALE,
)

audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3,
    speaking_rate=0.85,
)

response = client.synthesize_speech(input=input_text,
                                    voice=voice,
                                    audio_config=audio_config)
with open(f'{self.folder}{idx}.mp3', 'wb') as out:
    out.write(response.audio_content)

I'm using version 2.2.0
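
One thing that can at least be ruled out locally is malformed input. The sketch below checks each chunk before it is sent: that it parses as XML, that its root element is `speak`, and how many `break` tags it contains. The helper name `inspect_ssml` is hypothetical, and this is only a diagnostic under the assumption that bad markup could be involved. The cause of the missing pauses is unknown, and I can't confirm how the service treats such input.

```python
import xml.etree.ElementTree as ET


def inspect_ssml(ssml):
    """Return (ok, detail) for one SSML chunk, using only local checks."""
    try:
        root = ET.fromstring(ssml)
    except ET.ParseError as err:
        # Raised for e.g. a bare "&", an unclosed tag, or a missing wrapper.
        return False, f'not well-formed XML: {err}'
    if root.tag != 'speak':
        return False, f'root element is <{root.tag}>, expected <speak>'
    breaks = len(root.findall('.//break'))
    return True, f'{breaks} break tag(s)'
```

Running this over the chunks that failed both times and comparing them with the chunks that never failed would show whether the two groups differ in any of these respects.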
