Saving audio from mp4 as wav file using Moviepy Audiofile

I have a video file named 'video.mp4'. I am trying to separate a section of audio from the video and save it as a wav file that can be used with other Python modules. I want to do this with MoviePy.

I pass arguments to the write_audiofile function, specifying the filename, fps, nbytes, buffersize, and codec.

Following the MoviePy AudioClip docs, I specified the codec as ‘pcm_s32le’ for a 32-bit wav file.

from moviepy.editor import *

sound = AudioFileClip("video.mp4")
newsound = sound.subclip("00:00:13","00:00:15")   #audio from 13 to 15 seconds
newsound.write_audiofile("sound.wav", 44100, 2, 2000,"pcm_s32le")

This code generates a .wav file, named 'sound.wav'.


Opening the audio file in Audacity

The resulting file, sound.wav, can be opened in Audacity. However, I run into problems when I try to use it as a wav file with other Python modules.


Playing the sound file in pygame

import pygame
pygame.mixer.init()
sound=pygame.mixer.Sound("sound.wav")

The third line gives the following error:

pygame.error: Unable to open file 'sound.wav'


Determining type of sound file using sndhdr.what()

import sndhdr
sndhdr.what("sound.wav")

sndhdr.what() returned None. According to the docs, this means the function failed to determine the type of sound data stored in the file.


Reading the file with Google Speech Recognition

import speech_recognition as sr
r = sr.Recognizer()
audio = "sound.wav"

with sr.AudioFile(audio) as source:
    audio = r.record(source)
text= r.recognize_google(audio)
print(text)

This code stops execution on the second to last line:

ValueError: Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format

Why does the audio file open in Audacity if sndhdr.what() cannot recognize it as an audio file type? How can I properly export a MoviePy AudioClip as a wav file?

Frank Fanelli asked Sep 03 '25 07:09


2 Answers

I had the same issue when no codec was specified and when codec='pcm_s32le'. The codec that worked for me was pcm_s16le. Note that I am using the "fr-FR" language, so you should adapt it to your needs. Here is the entire code:

# Python code to convert video to audio
import moviepy.editor as mp
import speech_recognition as sr

# Insert Local Video File Path
clip = mp.VideoFileClip("/tmp/data/test.mp4")

# Insert Local Audio File Path
clip.audio.write_audiofile("/tmp/data/test.wav",codec='pcm_s16le')

# initialize the recognizer
r = sr.Recognizer()

# open the file
with sr.AudioFile("/tmp/data/test.wav") as source:
    # listen for the data (load audio to memory)
    audio_data = r.record(source)
    # recognize (convert from speech to text)
    text = r.recognize_google(audio_data, language = "fr-FR")
    print(text)
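
One possible way to see why pcm_s16le works where pcm_s32le did not is to look at the format tag stored in the file's fmt chunk. The sketch below rests on an assumption that is not confirmed in this thread: that the 32-bit export is written with the WAVE_FORMAT_EXTENSIBLE tag (0xFFFE), which older versions of Python's wave module reject, while a 16-bit export carries the plain PCM tag (1). The helper name wav_format_tag is made up for illustration.

```python
import struct

WAVE_FORMAT_PCM = 0x0001         # plain integer PCM
WAVE_FORMAT_EXTENSIBLE = 0xFFFE  # extended header (assumed cause of the failures)


def wav_format_tag(path):
    """Return (format_tag, bits_per_sample) read from a WAV file's fmt chunk."""
    with open(path, "rb") as f:
        header = f.read(12)
        if len(header) < 12:
            raise ValueError("file too short to be a WAV file")
        riff, _size, wave_id = struct.unpack("<4sI4s", header)
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            chunk_header = f.read(8)
            if len(chunk_header) < 8:
                raise ValueError("no fmt chunk found")
            chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
            if chunk_id == b"fmt ":
                fmt = f.read(16)
                if len(fmt) < 16:
                    raise ValueError("truncated fmt chunk")
                # tag, channels, sample rate, byte rate, block align, bits
                tag, _ch, _rate, _byte_rate, _align, bits = struct.unpack(
                    "<HHIIHH", fmt
                )
                return tag, bits
            # Chunks are word-aligned: skip the payload plus any pad byte
            f.seek(chunk_size + (chunk_size & 1), 1)
```

Calling wav_format_tag("/tmp/data/test.wav") on the file written by the code above should report a tag of 1 with 16 bits per sample. If a file that fails to load reports 0xFFFE instead, that would support the assumption.
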
FXA answered Sep 04 '25 21:09


I had the same issue. I was trying to fetch an MP4 file from a URL, convert it to a WAV file, and run Google Speech Recognition on it. I used pydub to handle the conversion instead, and it worked. Here's a sample of the code:

    import requests
    import io
    import speech_recognition as sr
    from pydub import AudioSegment


    # Transcribe speech from an audio file (path or file-like object) to text
    def speech_to_text(file):
        recognizer = sr.Recognizer()
        audio = sr.AudioFile(file)
        with audio as source:
            speech = recognizer.record(source)
            try:
                # Call the recognizer with the audio and language
                text = recognizer.recognize_google(speech, language='pt-BR')
                print("You said: " + text)
                return text
            # Raised when the recognizer cannot understand the audio
            except sr.UnknownValueError:
                print("Could not understand the audio")

    def mp4_to_wav(file):
        audio = AudioSegment.from_file(file, format="mp4")
        audio.export("audio.wav", format="wav")
        return audio

    def mp4_to_wav_mem(file):
        audio = AudioSegment.from_file_using_temporary_files(file, 'mp4')
        file = io.BytesIO()
        file = audio.export(file, format="wav")
        file.seek(0)
        return file


    url = ''
    r = requests.get(url, stream=True)
    file = io.BytesIO(r.content)
    file = mp4_to_wav_mem(file)
    speech_to_text(file)

Note that I wrote two functions: mp4_to_wav and mp4_to_wav_mem. The only difference is that mp4_to_wav_mem handles the converted audio in memory, while mp4_to_wav writes a .wav file to disk.
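
Before handing the in-memory buffer to the recognizer, it may help to confirm that Python's standard wave module can parse it. The sketch below assumes the buffer is seekable, as the io.BytesIO returned by mp4_to_wav_mem is. The helper name describe_wav and the keys of the returned dictionary are hypothetical.

```python
import wave


def describe_wav(fileobj):
    """Return basic PCM parameters of a WAV held in a seekable file object.

    Raises wave.Error if the wave module cannot parse the data.
    """
    fileobj.seek(0)
    try:
        with wave.open(fileobj, "rb") as reader:
            info = {
                "channels": reader.getnchannels(),
                "sample_width_bytes": reader.getsampwidth(),
                "frame_rate": reader.getframerate(),
                "duration_s": reader.getnframes() / reader.getframerate(),
            }
    finally:
        # Rewind so the caller can reuse the same buffer afterwards
        fileobj.seek(0)
    return info
```

In the sample above, this could be called as describe_wav(file) right after mp4_to_wav_mem(file) and before speech_to_text(file). A wave.Error at that point would suggest the conversion step, not the recognizer, is where things go wrong.
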

Filipe Medeiros answered Sep 04 '25 20:09