I have implemented FFT/DFT that I intend to use for frequency analysis of an audio file recorded from a fan. Here's the input I get when using scipy's fft. As opposed to this, here's the output from my implementation.
So far, my code looks like this:
from scipy.fft import fft
import soundfile as sf
import numpy as np
import matplotlib.pyplot as plt
from pydub import AudioSegment
def dft(filename):
#read file
audio_seg = AudioSegment.from_file(filename)
data = audio_seg.set_channels(1).get_array_of_samples()
N = len(data)
n = np.arange(0, N)
k = np.linspace(1, 1000, 1500)
X = np.zeros(len(k))
#generate sinusoids and calculate dot product
for i in range (0, len(k)):
base_sinusoid = np.exp(-2j * np.pi * k[i] * n / N)
dot_product = np.dot(data, base_sinusoid)
X[i] = np.abs(dot_product)
#calculate signal's energy
energy_sum = np.sum(X ** 2)
#find frequencies whose energy is more than 1% of the signal's energy
present_frequencies = k[(X ** 2) * 100 / energy_sum >= 1]
print("The following frequencies are present: ", present_frequencies)
plt.figure(1)
plt.title("Frequency analysis")
plt.xlabel("Frequency")
plt.ylabel("Magnitude")
plt.plot(k, X)
plt.show()
and then I simply call the function:
dft(<absolute path>)
Apart from the image, I also output the frequencies whose energy represents more than 1% of the entire signal's energy.
The audio file is recorded from a fan with 10 blades, and around 200 Hz is the expected fundamental frequency. The frequency content is supposed to go up to 1000 Hz.
What I suspect may be the problem is an incorrect frequency resolution. Since I generate my own sinusoids, I may just be selecting the wrong frequencies. In my opinion, instead of generating 1500 frequencies (line 8), I should be generating N frequencies, however that takes a very long time to execute. So, I've mostly tried changing the maximum frequency, and the width of the frequency bins, but to no avail. I should also note that I've used this function before on audio files where I say a vowel for 3 seconds, and it generated a correct output. The only difference there was in line 8, where I set the maximum frequency to be 20000.
Thank you for your time and insight!
EDIT: I added comments to the code, as well as improvements on how I calculate the energy. Thank you @GaTe for your suggestions! I should also add that I'm not allowed to use scipy's fft or fftfreq.
I suggest three modifications:
linspace since you can use the exact frequency bins given by the FFT like that.N manually for each frequency is indeed very slow. You can exploit the FFT algorithm itself.N.This is the modified code:
from scipy.fft import fft, fftfreq
import soundfile as sf
import numpy as np
import matplotlib.pyplot as plt
from pydub import AudioSegment
def fft_analysis(filename):
# Load audio file
audio_seg = AudioSegment.from_file(filename)
data = np.array(audio_seg.set_channels(1).get_array_of_samples())
# Number of samples in the signal
N = len(data)
# Perform FFT
X = fft(data)
# Compute the frequencies corresponding to the FFT output
freqs = fftfreq(N, 1.0 / audio_seg.frame_rate)
# Only take the positive frequencies
pos_indices = np.where(freqs >= 0)
freqs = freqs[pos_indices]
X = np.abs(X[pos_indices])
# Calculate the total energy of the signal
energy_sum = np.sum(X ** 2)
# Find frequencies with energy more than 1% of the total energy
significant_freqs = freqs[(X ** 2) * 100 / energy_sum >= 1]
print("The following frequencies are present: ", significant_freqs)
# Plot the frequency spectrum
plt.figure(figsize=(10, 6))
plt.plot(freqs, X)
plt.title("Frequency analysis")
plt.xlabel("Frequency (Hz)")
plt.ylabel("Magnitude")
plt.xlim(0, 1000) # Limit the x-axis to 1000 Hz for clarity
plt.show()
fft_analysis("<absolute path>")
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With