I'm trying to create a homemade spectrum analyzer with 8 strips of LEDs.
The part I'm struggling with is performing the FFT and understanding how to use the results.
So far this is what I have:
import opc
import time
import pyaudio
import wave
import sys
import numpy
import math
CHUNK = 1024
# Gets the pitch from the audio
def pitch(signal):
  # NOT SURE IF ANY OF THIS IS CORRECT
  signal = numpy.fromstring(signal, 'Int16');
  print "signal = ", signal
  testing = numpy.fft.fft(signal)
  print "testing = ", testing
wf = wave.open(sys.argv[1], 'rb')
RATE = wf.getframerate()
p = pyaudio.PyAudio() # Instantiate PyAudio
# Open Stream
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)
# Read data
data = wf.readframes(CHUNK)
# Play Stream
while data != '':
    stream.write(data)
    data = wf.readframes(CHUNK)
    frequency = pitch(data)
    print "%f frequency" %frequency
I'm struggling with what to do in the pitch method. I know I need to perform an FFT on the data that is passed in, but I'm really unsure how to do it.
Also, should I be using this function?
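Because of the way np.fft.fft works, if you use 1024 real data points you get 1024 complex values back, but only the first 513 are unique: the value for zero Hz (the DC offset) plus values for 512 frequencies up to half the sample rate. The rest are mirror images of those. If you only want 8 frequencies you have to feed it 16 data points.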
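To make the bin counting concrete, here is a minimal sketch using only numpy. It is not taken from the question's code: the 1 kHz test tone and the 44100 Hz sample rate are assumptions chosen for illustration, and np.fft.rfft is used instead of np.fft.fft because it returns just the non-redundant half for real input.

```python
import numpy as np

RATE = 44100   # assumed sample rate (audio CD)
CHUNK = 1024   # samples per chunk, as in the question

# Synthetic test chunk (an assumption, not real audio): a 1 kHz sine as 16-bit samples.
t = np.arange(CHUNK) / RATE
chunk = (10000 * np.sin(2 * np.pi * 1000 * t)).astype(np.int16)

spectrum = np.fft.rfft(chunk)                  # complex values, length CHUNK // 2 + 1
freqs = np.fft.rfftfreq(CHUNK, d=1.0 / RATE)   # frequency of each bin in Hz
magnitude = np.abs(spectrum)                   # magnitude is what a display would use

# Find the strongest bin, skipping bin 0 (the DC offset).
peak_bin = int(np.argmax(magnitude[1:])) + 1
peak_freq = freqs[peak_bin]

print(len(spectrum), "bins, spacing", freqs[1], "Hz, peak near", peak_freq, "Hz")
```

With these numbers the bins are RATE / CHUNK apart (roughly 43 Hz), so the reported peak can only be as accurate as that spacing.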
You might be able to do what you want by downsampling by a factor of 64 - then 16 downsampled points would be time-equivalent to 1024 original points. I've never explored this so I don't know what this entails or what the pitfalls might be.
You're going to have to do some learning - The Scientist and Engineer's Guide to Digital Signal Processing really is an excellent resource, at least it was for me.
Keep in mind that for an audio CD .wav file the sample frequency is 44100 Hz - a 1024 sample chunk is only about 23 ms of the sound.
scipy.io.wavfile.read makes getting the data easy.
samp_rate, data = scipy.io.wavfile.read(filename)
For a stereo file, data is a 2-D numpy array with one channel in column zero, data[:,0], and the other in column 1, data[:,1]. (For a mono file it is 1-D.)
Matplotlib's specgram and psd functions can give you the data you want. A graphing analog to what you are trying to do would be:
from matplotlib import pyplot as plt
import scipy.io.wavfile
samp_rate, data = scipy.io.wavfile.read(filename)
Pxx, freqs, bins, im = plt.specgram(data[:1024,0], NFFT = 16, noverlap = 0, Fs = samp_rate)
plt.show()
plt.close()
Since you aren't doing any plotting, just use matplotlib.mlab.specgram.
import matplotlib.mlab
Pxx, freqs, t = matplotlib.mlab.specgram(data[:1024,0], NFFT = 16, noverlap = 0, Fs = samp_rate)
Its return values (Pxx, freqs, t) are
     - *Pxx*: 2-D array, columns are the periodograms of successive segments
     - *freqs*: 1-D array of frequencies corresponding to the rows in Pxx
     - *t*: 1-D array of times corresponding to midpoints of segments.
Pxx[1:, 0] would be the values for the frequencies for T0, Pxx[1:, 1] for T1, Pxx[1:, 2] for T2, ... This is what you would feed to your display. You don't use Pxx[0, :] because it is for 0 Hz.
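If it helps to see the layout of Pxx without installing matplotlib, here is a rough numpy-only sketch. It is not matplotlib's implementation: simple_specgram is a hypothetical helper, the random samples stand in for real audio, and the scaling of the values will differ from what mlab.specgram returns. Only the shape (rows are frequencies, columns are successive segments) is meant to match.

```python
import numpy as np

def simple_specgram(samples, nfft=16, fs=44100):
    """Hypothetical helper: power per frequency bin for successive
    non-overlapping segments. Rows = frequencies, columns = segments."""
    n_seg = len(samples) // nfft
    # One segment per row, converted to float so the window can be applied.
    segments = np.reshape(samples[:n_seg * nfft], (n_seg, nfft)).astype(float)
    segments *= np.hanning(nfft)                        # taper each segment
    power = np.abs(np.fft.rfft(segments, axis=1)) ** 2  # unscaled power
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)           # Hz for each row of the result
    times = (np.arange(n_seg) + 0.5) * nfft / fs        # segment midpoints in seconds
    return power.T, freqs, times

# Stand-in data (an assumption): 1024 random 16-bit-range samples.
rng = np.random.default_rng(0)
samples = rng.integers(-2000, 2000, size=1024)

Pxx, freqs, t = simple_specgram(samples)
first_column = Pxx[1:, 0]   # the 8 values for T0, with the 0 Hz row dropped
print(Pxx.shape, first_column.shape)
```

Each column of Pxx[1:, :] then holds the 8 values you would send to the 8 strips for one time step.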
For the power spectral density, see matplotlib.mlab.psd().
Maybe another strategy to get down to 8 bands would be to use large chunks and normalize the values. Then you could break the values up into eight segments and take the sum of each segment. I think this is valid - maybe only for the power spectral density. sklearn.preprocessing.normalize can do the normalization:
w = sklearn.preprocessing.normalize(Pxx[1:,:], norm = 'l1', axis = 0)
But then again, I just made all that up.
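For what it's worth, the "large chunk, normalize, sum into eight segments" idea can be sketched with numpy alone. Treat this as an illustration under assumptions rather than a tested design: eight_band_levels is a hypothetical helper, the bands are equal-width (which may not suit music, where most energy is at low frequencies), and the 10 kHz test tone is synthetic.

```python
import numpy as np

def eight_band_levels(chunk, n_bands=8):
    """Hypothetical helper: split a chunk's power spectrum into
    n_bands equal-width bands and return the share of power in each."""
    power = np.abs(np.fft.rfft(chunk)) ** 2   # unscaled power per bin
    power = power[1:]                          # drop the 0 Hz (DC) bin
    total = power.sum()
    if total > 0:
        power = power / total                  # L1 normalisation: values sum to 1
    bands = np.array_split(power, n_bands)     # n_bands groups of adjacent bins
    return np.array([band.sum() for band in bands])

RATE = 44100   # assumed sample rate
CHUNK = 1024   # samples per chunk, as in the question

# Synthetic chunk (an assumption): a 10 kHz sine as 16-bit samples.
t = np.arange(CHUNK) / RATE
chunk = (10000 * np.sin(2 * np.pi * 10000 * t)).astype(np.int16)

levels = eight_band_levels(chunk)
print(levels)
```

Because the levels sum to 1, each one could be scaled by the number of LEDs in a strip to decide how many to light.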