I'm implementing the forward recursion for a Hidden Markov Model; the necessary steps (a)-(b) are shown here.
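For readers without the picture: a minimal sketch of the standard scaled forward recursion that the code below implements (my notation, not necessarily the original figure's), where b_j(o_t) is the Gaussian-mixture emission density of state j and A is the transition matrix:

(a) Initialization, t = 0:
    \alpha_0(i) = \pi_i \, b_i(o_0), \quad c_0 = 1 / \sum_i \alpha_0(i), \quad \hat{\alpha}_0(i) = c_0 \, \alpha_0(i)
(b) Induction, t = 1, \dots, T-1:
    \alpha_t(j) = \Big(\sum_i \hat{\alpha}_{t-1}(i) \, A_{ij}\Big) \, b_j(o_t), \quad c_t = 1 / \sum_j \alpha_t(j), \quad \hat{\alpha}_t(j) = c_t \, \alpha_t(j)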

Below is my implementation
from scipy.stats import norm
import numpy as np
import random

def gmmdensity(obs, w, mu, sd):
    # compute probability density function of the Gaussian mixture:
    # norm.pdf(obs, mu, sd) has shape (T, M); broadcasting against w (N, M)
    # gives one weighted component density per (time, state, component)
    gauss_mixt = norm.pdf(obs, mu, sd)[:, None] * w
    return gauss_mixt
def alpha(obs, states, A, pi, w, mu, sigma):
    dens = np.sum(gmmdensity(obs, w, mu, sigma), axis=2)
    # scaling factor is used to renormalize probabilities in order to
    # avoid numerical underflow
    scaling_factor = np.ones(len(obs))
    alpha_matrix = np.zeros((len(states), len(obs)))
    # for t = 0
    alpha_matrix[:, 0] = pi * dens[0]
    scaling_factor[0] = 1 / np.sum(alpha_matrix[:, 0], axis=0)
    alpha_matrix[:, 0] *= scaling_factor[0]
    # for t = 1, ..., T-1
    for t in range(1, len(obs)):
        alpha_matrix[:, t] = np.matmul(alpha_matrix[:, t-1], A) * dens[t]
        scaling_factor[t] = 1 / np.sum(alpha_matrix[:, t], axis=0)
        alpha_matrix[:, t] *= scaling_factor[t]
    return alpha_matrix, scaling_factor
Let's generate some data to run the algorithm
obs = np.concatenate((np.random.normal(0, 1, size=500),
                      np.random.normal(1.5, 1, size=500))).reshape(-1, 1)
N = 2 # number of hidden states
M = 3 # number of mixture components
states = list(range(N))
pi = np.array([0.5, 0.5]) # initial probabilities
A = np.array([[0.8, 0.2], [0.3, 0.7]]) # transition matrix
mu = np.array([np.min(obs), np.median(obs), np.max(obs)]) # means of mixture components
sigma = np.array([1, 1, 1]) # standard deviations of mixture components
w = np.array([[0.2, 0.3, 0.5], [0.6, 0.2, 0.2]]) # weights of mixture components
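As a quick sanity check of the shapes involved (a sketch added for illustration, using the variables defined above):
dens_raw = gmmdensity(obs, w, mu, sigma)   # (1000, 2, 3): weighted density of each mixture component per state
dens = np.sum(dens_raw, axis=2)            # (1000, 2): total emission density of each state at each time step
print(dens_raw.shape, dens.shape)          # expected: (1000, 2, 3) (1000, 2)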
Let's see how fast the algorithm is
%timeit alpha(obs, states, A, pi, w, mu, sigma)
13.6 ms ± 1.24 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
Is there any way to make this code faster? I thought about using Numba or Cython, but I never got it fully working in this case.
I profiled this code and found the following ideas helpful for speeding it up:
- Writing into the alpha_matrix[:,t] vector multiple times is slow. It is faster to keep the vector you are manipulating as a local variable until you are done manipulating it. The same goes for scaling_factor[t].
- alpha_matrix[:,t] has poor memory locality: it accesses an element, skips a few elements, accesses an element, and so on. Since you are doing alpha_matrix[:,t] everywhere, I found it easier to swap the two indices so each time step is a contiguous row. Once we're done manipulating it, we can transpose it back to its original orientation so you don't have to change the rest of your code.
- np.sum(..., axis=0) on an array with only one axis is the same thing as np.sum() without an axis argument, but using the axis argument is slower.
- Broadcasting the per-component densities against the w array creates a big (1000, 2, 3) array, which you then sum over the last dimension. You can combine these two steps using einsum(), which is faster because it avoids creating the big intermediate array. I copied the subscript pattern from "Using Python numpy einsum to obtain dot product between 2 Matrices". (A small standalone check of this step is shown after the benchmarks below.)
- scipy.stats.norm.pdf() is weirdly slow here. I think SciPy is using a finite difference of the CDF to define the PDF? I re-coded it with the formula for the normal PDF, and found that was about 3x faster.
This adds up to a roughly 40x speedup on the provided benchmark:
%timeit alpha_orig(obs, states, A, pi, w, mu, sigma)
14.4 ms ± 159 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit alpha(obs, states, A, pi, w, mu, sigma)
308 µs ± 2.02 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
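Here is the einsum step in isolation, as a small check that it matches the original broadcast-and-sum (a sketch using the variables defined above; pdf is the (1000, 3) array of per-component normal densities):
pdf = norm.pdf(obs, mu, sigma)             # (1000, 3) per-component densities
slow = np.sum(pdf[:, None] * w, axis=2)    # builds the (1000, 2, 3) intermediate, then sums it away
fast = np.einsum('ij,kj->ik', pdf, w)      # same contraction with no intermediate array
print(np.allclose(slow, fast))             # expected: True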
Full code:
from scipy.stats import norm
import numpy as np
import random
import numba as nb

@nb.njit()
def fast_normal_pdf(x, mu, sigma):
    # hand-coded normal PDF; noticeably faster than scipy.stats.norm.pdf here
    z = (x - mu) / sigma
    one_over_sqrt_2pi = 1 / np.sqrt(2 * np.pi)
    return (1 / sigma) * one_over_sqrt_2pi * np.exp(-0.5 * z * z)

def gmmdensitysum(obs, w, mu, sd):
    # einsum sums the weighted component densities without building
    # the (T, N, M) intermediate array
    pdf = fast_normal_pdf(obs, mu, sd)
    return np.einsum('ij,kj->ik', pdf, w)

@nb.njit()
def alpha_inner(T, A, alpha_matrix, dens, scaling_factor):
    # recursion for t = 1, ..., T-1; works on contiguous rows and local variables
    for t in range(1, T):
        current_alpha = (alpha_matrix[t-1] @ A) * dens[t]
        current_scaling_factor = 1 / current_alpha.sum()
        current_alpha *= current_scaling_factor
        scaling_factor[t] = current_scaling_factor
        alpha_matrix[t] = current_alpha

def alpha(obs, states, A, pi, w, mu, sigma):
    dens = gmmdensitysum(obs, w, mu, sigma)
    # scaling factor is used to renormalize probabilities in order to
    # avoid numerical underflow
    scaling_factor = np.ones(len(obs))
    # time runs along axis 0 here, so each alpha_matrix[t] is a contiguous row
    alpha_matrix = np.zeros((len(obs), len(states)))
    # for t = 0
    alpha_matrix[0] = pi * dens[0]
    scaling_factor[0] = 1 / np.sum(alpha_matrix[0])
    alpha_matrix[0] *= scaling_factor[0]
    alpha_inner(len(obs), A, alpha_matrix, dens, scaling_factor)
    # transpose back so callers still see the original (states, T) layout
    return alpha_matrix.T, scaling_factor
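To check the rewrite against the original (a sketch; it assumes the question's alpha is still available under the name alpha_orig, as in the benchmark above):
alpha_new, sf_new = alpha(obs, states, A, pi, w, mu, sigma)
alpha_old, sf_old = alpha_orig(obs, states, A, pi, w, mu, sigma)
print(np.allclose(alpha_new, alpha_old), np.allclose(sf_new, sf_old))  # expected: True True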