I have a data set that is several gigabytes (GB) in size and want to estimate the parameters of a model for the missing values in it.
There is a method called maximum-likelihood estimation (MLE) in machine learning/statistics that can be used for this.
Since R might not work on such a large data set, which library would be best to use for it?
From the Wikipedia article on MLE:
In statistics, maximum-likelihood estimation (MLE) is a method of estimating the parameters of a statistical model. When applied to a data set and given a statistical model, maximum-likelihood estimation provides estimates for the model's parameters.
Generally you need two steps before you can apply MLE:

1. Choose a statistical model for your data, i.e. the probability distribution (pdf) you assume generated it.
2. Write down the likelihood (usually the log-likelihood) of the parameters given your observed data; this is the function you will maximize.
If you can then obtain an analytic (closed-form) solution for the MLE, just stream your data through the estimate calculation. For example, for a Gaussian distribution the MLE of the mean is the sample mean, so you only need to accumulate a running sum and a count; dividing the sum by the count gives your estimate.
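Here is a minimal sketch of that streaming calculation, assuming the values sit one per line in a plain-text file (the file name and format below are placeholders, not something specified in the question):

```python
# Streaming MLE of a Gaussian mean: keep only a running sum and a count,
# so the multi-GB file never has to fit in memory.
def streaming_gaussian_mean(path):
    total = 0.0
    count = 0
    with open(path) as f:
        for line in f:
            value = line.strip()
            if not value:          # skip blank lines / missing entries
                continue
            total += float(value)
            count += 1
    return total / count           # sample mean = MLE of the mean

# Hypothetical usage:
# mu_hat = streaming_gaussian_mean("values.txt")
```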
However, when the model involves many parameters and its pdf is highly non-linear, the MLE must be sought numerically using a nonlinear optimization algorithm. If your data set is huge, try stochastic gradient descent (SGD): the true gradient is approximated by the gradient at a single example, and as the algorithm sweeps through the training set it applies the update rule to each example in turn. You can therefore still stream your data one example at a time into the update program over multiple sweeps (epochs), so the memory constraint should not be a problem at all.
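As a rough illustration of this streaming SGD idea (not code from the original answer), the sketch below fits the mean and variance of a Gaussian by descending the per-example negative log-likelihood; the function name, learning rate, and the synthetic data in the usage comment are all assumptions:

```python
import math

def sgd_gaussian_mle(stream_factory, epochs=5, lr=0.01):
    """Estimate a Gaussian's mean and variance by per-example SGD on the
    negative log-likelihood; the data is streamed one value at a time."""
    mu = 0.0   # mean estimate
    s = 0.0    # log-variance estimate (keeps the variance positive)
    for _ in range(epochs):
        for x in stream_factory():                  # re-open the stream each sweep
            inv_var = math.exp(-s)
            grad_mu = -(x - mu) * inv_var                   # d NLL / d mu
            grad_s = 0.5 - 0.5 * (x - mu) ** 2 * inv_var    # d NLL / d log-variance
            mu -= lr * grad_mu
            s -= lr * grad_s
    return mu, math.exp(s)

# Hypothetical usage with synthetic data; a real run would read from disk:
# import random
# data = [random.gauss(3.0, 2.0) for _ in range(100_000)]
# mu_hat, var_hat = sgd_gaussian_mle(lambda: iter(data))
```

Because each update touches only one example, the same loop works whether the stream comes from an in-memory list or from a file read line by line.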