
When is it appropriate to use Python's fmean instead of mean?

Python 3.8 added fmean to the statistics module, alongside the existing mean function. Per the docs:

Convert data to floats and compute the arithmetic mean. This runs faster than the mean() function and it always returns a float.

Source: https://docs.python.org/3/library/statistics.html#statistics.fmean

However, the docs don't discuss any trade-offs. My question: when can I use fmean in place of mean, and when should I stick with mean?

My specific example is averaging error probabilities, derived from Phred scores, in FASTQ reads. Example:

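from statistics import mean

def decode(c):
    # Phred+33 encoding: the quality score is the ASCII code minus 33
    return ord(c) - 33

def phred_to_probability(phred_score):
    # Error probability P = 10^(-Q/10)
    return 10**(-phred_score/10)

def raw_q_to_probability(q):
    return phred_to_probability(decode(q))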

qualities = list("3===RONT{{QKLIFGHEH=::::CAAA@[email protected]::;IBCHKJIIHHHEGGGHHGIJGFFFFMKKPILMLGGGGGIMNMEB@CBCDEKNMQQSJJMT{UKOKLLEEDEELGKIJKPEBA==>??@@@HD@?@AH?>?>?IIKIPFFEEFFKIEDDCEHFFHIHIKMLPOHQOH")


mean(raw_q_to_probability(q) for q in qualities)

Should I expect any difference in accuracy between the two functions for this application? I note that fmean reports one additional digit for the example above, but the result is otherwise the same.

Equation2876 asked Oct 27 '25 13:10


1 Answer

From higher up in the statistics docs:

Unless explicitly noted, these functions support int, float, Decimal and Fraction.

statistics.mean can take a sequence of Fractions and give you the mean as a Fraction, or take a list of Decimal instances and give you a Decimal. statistics.fmean cannot do that.
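To make the type behaviour concrete, here is a minimal sketch. The input values are made up for illustration; only mean and fmean come from the statistics module.

```python
from decimal import Decimal
from fractions import Fraction
from statistics import fmean, mean

# Illustrative inputs (not from the question).
fractions = [Fraction(1, 3), Fraction(1, 6), Fraction(1, 2)]
decimals = [Decimal("0.1"), Decimal("0.2"), Decimal("0.3")]

# mean keeps the input type: Fraction in, Fraction out; Decimal in, Decimal out.
frac_mean = mean(fractions)
dec_mean = mean(decimals)

# fmean converts every input to float first, so the result is always a float.
frac_fmean = fmean(fractions)
dec_fmean = fmean(decimals)

print(type(frac_mean).__name__, type(dec_mean).__name__)
print(type(frac_fmean).__name__, type(dec_fmean).__name__)
```

So if you need an exact rational or decimal result, mean is the one to use; fmean accepts those inputs but discards the exactness.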

Also, even when the inputs are already floats, statistics.mean may avoid a very slight bit of rounding error, as it does all computations in exact arithmetic until the final conversion to the result type. statistics.fmean uses math.fsum to sum the inputs with as much precision as float will allow, but the result of fsum is still a float, so that's one rounding that statistics.mean avoids.
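A rough way to check how much this matters for your data is to compute the exact mean with Fraction and compare both results against it. The sketch below uses hypothetical error probabilities (Phred scores 2 to 40) rather than your actual read, and the fsum comparison reflects how CPython currently implements fmean, which could change.

```python
from fractions import Fraction
from math import fsum, isclose
from statistics import fmean, mean

# Hypothetical data: error probabilities for Phred scores 2..40.
probs = [10 ** (-q / 10) for q in range(2, 41)]

# Exact mean of the stored float values, held as a Fraction.
exact = sum(map(Fraction, probs)) / len(probs)

mean_result = mean(probs)    # exact arithmetic, rounded once at the end
fmean_result = fmean(probs)  # fsum rounds the sum, then the division rounds again

# mean returns the float nearest to the exact mean.
mean_is_nearest_float = mean_result == float(exact)

# Assumption about the current CPython implementation: fmean is fsum(data) / n.
fmean_matches_fsum = fmean_result == fsum(probs) / len(probs)

# Absolute errors relative to the exact mean, computed exactly.
mean_error = abs(Fraction(mean_result) - exact)
fmean_error = abs(Fraction(fmean_result) - exact)

print(mean_is_nearest_float, fmean_matches_fsum)
print(float(mean_error), float(fmean_error))
```

Any difference between the two is on the order of one unit in the last place, which is far below the precision that Phred scores carry in the first place.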

Finally, statistics.fmean supports weights (since Python 3.11). statistics.mean does not.
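A small sketch of the weighted form follows. The probabilities and weights are invented for illustration, and the fallback branch is just a hand-rolled equivalent for interpreters older than 3.11, not something the statistics module provides.

```python
import sys
from math import fsum, isclose
from statistics import fmean, mean

# Invented example: three error probabilities with integer weights.
error_probs = [0.001, 0.01, 0.1]
weights = [5, 3, 2]

if sys.version_info >= (3, 11):
    # The weights parameter was added to fmean in Python 3.11.
    weighted = fmean(error_probs, weights=weights)
else:
    # Hand-rolled fallback for older interpreters (an assumption, not library code).
    weighted = fsum(p * w for p, w in zip(error_probs, weights)) / fsum(weights)

# mean takes only the data argument, so passing weights raises TypeError.
try:
    mean(error_probs, weights)
    mean_accepts_weights = True
except TypeError:
    mean_accepts_weights = False

print(weighted, mean_accepts_weights)
```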

user2357112 supports Monica answered Oct 30 '25 10:10


