In the July 2012 issue of "Mensa Bulletin" there is an article entitled "The Digital Brain." In it, the author relates the human brain to base-64 computing. It is an interesting and fun article with a prompt at the end. The prompt asks the reader to convert Cytosine Guanine Adenine Guanine Adenine Guanine (cgagag for short) to a base-10 number, using the fact that Cytosine Cytosine Guanine Cytosine Adenine Guanine (ccgcag for short) equals 2011. Basically, you have to convert a base-64 number to base 10 using a table in the article that lists all of the possible codons in order, with aug = 0, uuu = 1, uuc = 2, ... , gga = 61, ggg = 62, uag = 63.
I decided to give this a go and settled on writing a Python program to convert codon numbers to base 10 and base-10 numbers to codons. After writing a quick algorithm for each direction, I ran it. The program gave no errors and produced codons for my numbers and vice versa. However, they were the wrong numbers! I cannot see what is going wrong and would greatly appreciate any help.
Without further ado, the code:
codons = ['aug', 'uuu', 'uuc', 'uua', 'uug', 'ucu', 'ucc', 'uca', 'ucg', 'uau', 'uac', 'uaa', 'ugu', 'ugc', 'uga', 'ugg', 'cuu', 'cuc', 'cua', 'cug', 'ccu', 'ccc', 'cca', 'ccg', 'cau', 'cac', 'caa', 'cag', 'cgu', 'cgc', 'cga', 'cgg', 'auu', 'auc', 'aua', 'acu', 'acc', 'aca', 'acg', 'aau', 'aac', 'aaa', 'aag', 'agu', 'agc', 'aga', 'agg', 'guu', 'guc', 'gua', 'gug', 'gcu', 'gcc', 'gca', 'gcg', 'gau', 'gac', 'gaa', 'gag', 'ggu', 'ggc', 'gga', 'ggg', 'uag' ]
def codonNumToBase10 ( codonValue ) :
    numberOfChars = len( codonValue )
    # check to see if contains sets of threes
    if len( codonValue ) % 3 != 0 :
        return -1
    # check to see if it contains the correct characters
    for ch in codonValue :
        if ch not in 'aucg' :
            return -2
    # populate an array with decimal versions of each codon in the input
    codonNumbers = []
    base10Value = 0
    numberOfCodons = numberOfChars // 3
    for i in range(0, numberOfCodons) :
        charVal = codonValue[ 0 + (i*3) ] + codonValue[ 1 + (i*3) ] + codonValue[ 2 + (i*3) ]
        val = 0
        for j in codons :
            if j == charVal :
                codonNumbers.append( val )
                break
            val += 1
        base10Value += ( pow( 64, numberOfCodons - i - 1 ) ) * codonNumbers[i]
    return base10Value
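def base10ToCodonNum ( number ) :
    codonNumber = ''
    hitZeroCount = 0
    while True :
        # peel off the least significant base-64 digit and prepend its codon
        val = number % 64
        number = number // 64    # integer division; int( number / 64 ) loses precision for large values
        codonNumber = codons[val] + codonNumber
        if number == 0 :
            # runs one extra pass after the quotient reaches zero,
            # so the result carries a leading 'aug' (a zero digit)
            if hitZeroCount > 0:
                break
            hitZeroCount += 1
    return codonNumber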
val_2011 = 'ccgcag'
val_unknown = 'cgagag'
print( base10ToCodonNum( codonNumToBase10( val_2011 ) ), '::', codonNumToBase10( val_2011 ) )
print( base10ToCodonNum( codonNumToBase10( val_unknown ) ), '::', codonNumToBase10( val_unknown ) )
EDIT 1: The values I am getting are 1499 for ccgcag and 1978 for cgagag.
EDIT 2: base10ToCodonNum function fixed thanks to Ashwini Chaudhary.
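For comparison, the decimal-to-codon direction can be sketched with repeated `divmod` by 64. This is only an illustrative sketch, not the code from the question: the names `CODON_TABLE` and `decimal_to_codons` are made up here, and the table is assumed to follow the ordering quoted above (aug = 0, ..., uag = 63). Unlike the loop above, it stops as soon as the quotient reaches zero, so it emits no leading `aug`.

```python
# Codon table in the order quoted in the question (aug = 0, ..., uag = 63).
CODON_TABLE = (
    "aug uuu uuc uua uug ucu ucc uca ucg uau uac uaa ugu ugc uga ugg "
    "cuu cuc cua cug ccu ccc cca ccg cau cac caa cag cgu cgc cga cgg "
    "auu auc aua acu acc aca acg aau aac aaa aag agu agc aga agg guu "
    "guc gua gug gcu gcc gca gcg gau gac gaa gag ggu ggc gga ggg uag"
).split()


def decimal_to_codons(number):
    """Encode a non-negative integer as base-64 codons, most significant first.

    Hypothetical helper for illustration only.
    """
    if number < 0:
        raise ValueError("number must be non-negative")
    digits = []
    while True:
        # divmod yields the remaining quotient and the current base-64 digit
        number, remainder = divmod(number, 64)
        digits.append(CODON_TABLE[remainder])
        if number == 0:
            break
    # digits were collected least significant first, so reverse them
    return "".join(reversed(digits))


print(decimal_to_codons(1499))  # ccgcag
print(decimal_to_codons(2011))  # cggcag
```

Using `divmod` keeps everything in integer arithmetic, which avoids the float rounding that `int( number / 64 )` can introduce for very large inputs.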
I could not follow your code, so I made another implementation, but I got the same results:
CODONS = [
    'aug', 'uuu', 'uuc', 'uua', 'uug', 'ucu', 'ucc', 'uca',
    'ucg', 'uau', 'uac', 'uaa', 'ugu', 'ugc', 'uga', 'ugg',
    'cuu', 'cuc', 'cua', 'cug', 'ccu', 'ccc', 'cca', 'ccg',
    'cau', 'cac', 'caa', 'cag', 'cgu', 'cgc', 'cga', 'cgg',
    'auu', 'auc', 'aua', 'acu', 'acc', 'aca', 'acg', 'aau',
    'aac', 'aaa', 'aag', 'agu', 'agc', 'aga', 'agg', 'guu',
    'guc', 'gua', 'gug', 'gcu', 'gcc', 'gca', 'gcg', 'gau',
    'gac', 'gaa', 'gag', 'ggu', 'ggc', 'gga', 'ggg', 'uag',
]
def codon2decimal(s):
    if len(s) % 3 != 0:
        raise ValueError("%s doesn't look like a codon number." % s)
    digits = reversed([ s[i*3:i*3+3] for i in range(len(s) // 3) ])
    val = 0
    for i, digit in enumerate(digits):
        if digit not in CODONS:
            raise ValueError("invalid sequence: %s." % digit)
        val += CODONS.index(digit) * 64 ** i
    return val
def main():
    for number in ('cggcag', 'ccgcag', 'cgagag', 'auguuuuuc'):
        print('%s : %s' % (number, codon2decimal(number)))
if __name__ == '__main__':
    main()
results:
cggcag : 2011
ccgcag : 1499
cgagag : 1978
auguuuuuc : 66
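These results can be checked by hand, since each codon is just one base-64 digit with a positional weight. The sketch below spells out the individual terms; `CODON_TABLE` and `positional_terms` are hypothetical names introduced only for this illustration, and the table is assumed to match the ordering given in the question.

```python
# Codon table in the order quoted in the question (aug = 0, ..., uag = 63).
CODON_TABLE = (
    "aug uuu uuc uua uug ucu ucc uca ucg uau uac uaa ugu ugc uga ugg "
    "cuu cuc cua cug ccu ccc cca ccg cau cac caa cag cgu cgc cga cgg "
    "auu auc aua acu acc aca acg aau aac aaa aag agu agc aga agg guu "
    "guc gua gug gcu gcc gca gcg gau gac gaa gag ggu ggc gga ggg uag"
).split()


def positional_terms(sequence):
    """Return (codon, digit, weight) for each codon, most significant first.

    Hypothetical helper, written only to expose the arithmetic.
    """
    if len(sequence) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    count = len(codons)
    return [
        (codon, CODON_TABLE.index(codon), 64 ** (count - position - 1))
        for position, codon in enumerate(codons)
    ]


for sequence in ("ccgcag", "cggcag"):
    terms = positional_terms(sequence)
    total = sum(digit * weight for _, digit, weight in terms)
    print(sequence, terms, total)
```

For ccgcag the terms are 23 * 64 + 27 = 1499, and for cggcag they are 31 * 64 + 27 = 2011. The two sequences differ only in the first digit (ccg = 23 versus cgg = 31), which accounts for the whole gap of 8 * 64 = 512. Whether the article's given sequence contains a typo is not something this thread confirms.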