
What is the best way to obtain the optimal number of topics for an LDA model using Gensim?


People also ask

How do you choose optimal number of topics in LDA?

To decide on a suitable number of topics, you can compare the goodness-of-fit of LDA models fit with varying numbers of topics. You can evaluate the goodness-of-fit of an LDA model by calculating the perplexity of a held-out set of documents. The perplexity indicates how well the model describes a set of documents; a lower perplexity indicates a better fit.
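A minimal sketch of that held-out comparison, assuming Gensim's `LdaModel` and its `log_perplexity` method, which returns a per-word likelihood bound. The conversion to perplexity follows the convention Gensim uses in its own log output (2 raised to the negative bound). The helper names, the `passes` default, and the fixed `random_state` are illustrative choices, not part of the library.

```python
def perplexity_from_bound(per_word_bound):
    """Convert a per-word likelihood bound to perplexity as 2 ** (-bound)."""
    return 2.0 ** (-per_word_bound)


def heldout_perplexity_by_k(train_corpus, heldout_corpus, dictionary, ks, passes=10):
    """Train one LdaModel per candidate k and score it on held-out documents.

    train_corpus / heldout_corpus: lists of bag-of-words documents.
    Returns {k: perplexity}; lower values indicate a better fit.
    """
    from gensim.models import LdaModel  # imported lazily; requires gensim

    results = {}
    for k in ks:
        model = LdaModel(
            corpus=train_corpus,
            id2word=dictionary,
            num_topics=k,
            passes=passes,
            random_state=0,  # fixed seed so runs are comparable
        )
        bound = model.log_perplexity(heldout_corpus)
        results[k] = perplexity_from_bound(bound)
    return results
```

Calling `heldout_perplexity_by_k(train, heldout, dictionary, ks=range(2, 21, 2))` would give one perplexity per candidate k; the held-out documents should not have been used for training.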

How do you pick the number of topics K When you run a LDA topic model?

My approach to finding the optimal number of topics is to build many LDA models with different numbers of topics (k) and pick the one that gives the highest coherence value. Choosing a k that marks the end of a rapid growth in topic coherence usually yields meaningful and interpretable topics.
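The coherence-based approach can be sketched as follows, assuming Gensim's `LdaModel` and `CoherenceModel` with the `c_v` measure. The function names and the `min_gain` threshold are hypothetical; "end of rapid growth" is interpreted here as the first k after which coherence improves by less than `min_gain`, which is only one possible reading.

```python
def coherence_by_k(texts, ks, passes=10):
    """Return {k: c_v coherence} for LDA models trained on tokenized texts."""
    from gensim.corpora import Dictionary  # imported lazily; requires gensim
    from gensim.models import CoherenceModel, LdaModel

    dictionary = Dictionary(texts)
    corpus = [dictionary.doc2bow(text) for text in texts]
    scores = {}
    for k in ks:
        model = LdaModel(corpus=corpus, id2word=dictionary,
                         num_topics=k, passes=passes, random_state=0)
        cm = CoherenceModel(model=model, texts=texts,
                            dictionary=dictionary, coherence="c_v")
        scores[k] = cm.get_coherence()
    return scores


def pick_k_at_plateau(scores, min_gain=0.01):
    """Pick the first k whose gain to the next candidate k is below min_gain.

    scores: {k: coherence}. Falls back to the largest k if coherence
    keeps growing by at least min_gain across all candidates.
    """
    ks = sorted(scores)
    for current, nxt in zip(ks, ks[1:]):
        if scores[nxt] - scores[current] < min_gain:
            return current
    return ks[-1]
```

With `texts` as a list of token lists, `pick_k_at_plateau(coherence_by_k(texts, range(2, 21, 2)))` returns one candidate k. Plotting the scores and inspecting the topics by hand is still advisable, since coherence varies between runs.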

How many iterations does LDA have?

LDA is often described as a 4-step iterative process. Because the topic probabilities are re-estimated on every iteration, the results generally improve as the number of iterations increases, until the model converges.


I am trying to obtain the optimal number of topics for an LDA model within Gensim. One method I found is to calculate the log likelihood for each model and compare the models against each other, as described, for example, in the post "The input parameters for using latent Dirichlet allocation".

Hence I looked into calculating the log likelihood of an LDA model with Gensim and came across the following post: "How do you estimate α parameter of a latent dirichlet allocation model?"

It basically states that the update_alpha() method implements the method described in Huang, Jonathan. Maximum likelihood estimation of Dirichlet distribution parameters. Still, I don't know how to obtain this parameter using the library without changing its code.

How can I obtain the log likelihood from an LDA model with Gensim?

Is there a better way to obtain the optimal number of topics with Gensim?
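On the log-likelihood question, a possible starting point, assuming a trained Gensim `LdaModel`: its `bound` method returns a variational lower bound on the corpus log likelihood (not the exact likelihood), and `log_perplexity` returns a per-word version of that bound. The wrapper name `log_likelihood_summary` is hypothetical and only groups the two calls.

```python
def log_likelihood_summary(model, corpus):
    """Collect likelihood-bound statistics from a trained LDA model.

    model: an object exposing bound() and log_perplexity(), such as a
           trained gensim.models.LdaModel.
    corpus: a list of bag-of-words documents (a list, so that its
            length is available to the model).
    """
    return {
        # Variational lower bound on the log likelihood of the whole corpus.
        "total_bound": model.bound(corpus),
        # The same kind of bound, normalized per word.
        "per_word_bound": model.log_perplexity(corpus),
    }
```

Comparing `total_bound` across models with different numbers of topics is only meaningful when every model is scored on the same corpus, ideally documents held out from training.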