I have this dictionary, where the keys represent atom types and the values represent the atomic masses:
mass = {'H': 1.007825, 'C': 12.01, 'O': 15.9994, 'N': 14.0067, 'S': 31.972071,
'P': 30.973762}
what I want to do is to create a function that given a molecule, for instance ('H2-N-C6-H4-C-O-2H'), iterates over the mass dictionary and calculates the atomic mass on the given molecule. The value of the mass must be multiplied by the number that comes right after the atom type: H2 = H.value * 2
I know that firstly I must isolate the keys of the given molecules, for this I could use string.split('-'). Then, I think I could use and if block to stablish a condition to accomplish if the key of the given molecule is in the dictionary. But later I'm lost about how I should proceed to find the mass for each key of the dictionary.
The expected result should be something like:
mass_counter('H2-N15-P3')
out[0] 39351.14
How could I do this?
EDIT:
This is what I've tried so far
# Atomic masses
mass = {'H': 1.007825, 'C': 12.01, 'O': 15.9994, 'N': 14.0067, 'S': 31.972071,
'P': 30.973762}
def calculate_atomic_mass(molecule):
"""
Calculate the atomic mass of a given molecule
"""
mass = 0.0
mol = molecule.split('-')
for key in mass:
if key in mol:
atom = key
return mass
print calculate_atomic_mass('H2-O')
print calculate_atomic_mass('H2-S-O4')
print calculate_atomic_mass('C2-H5-O-H')
print calculate_atomic_mass('H2-N-C6-H4-C-O-2H')
Given all components have the shape Aa123, It might be easier here to identify parts with a regex, for example:
import re
srch = re.compile(r'([A-Za-z]+)(\d*)')
mass = {'H': 1.007825, 'C': 12.01, 'O': 15.9994, 'N': 14.0067, 'S': 31.972071, 'P': 30.973762}
def calculate_atomic_mass(molecule):
return sum(mass[a[1]]*int(a[2] or '1') for a in srch.finditer(molecule))
Here our regular expression [wiki] thus captures a sequence of [A-Z-a-z]s, and a (possibly empty) sequence of digits (\d*), these are the first and second capture group respectively, and thus can be obtained for a match with a[1] and a[2].
this then yields:
>>> print(calculate_atomic_mass('H2-O'))
18.01505
>>> print(calculate_atomic_mass('H2-S-O4'))
97.985321
>>> print(calculate_atomic_mass('C2-H5-O-H'))
46.06635
>>> print(calculate_atomic_mass('H2-N-C6-H4-C-O-2H'))
121.130875
>>> print(calculate_atomic_mass('H2-N15-P3'))
305.037436
We thus take the sum of the mass[..] of the first capture group (the name of the atom) times the number at the end, and we use '1' in case no such number can be found.
Or we can first split the data, and then look for a atom part and a number part:
import re
srch = re.compile(r'^([A-Za-z]+)(\d*)$')
def calculate_atomic_mass(molecule):
"""
Calculate the atomic mass of a given molecule
"""
result = 0.0
mol = molecule.split('-')
if atm in mol:
c = srch.find(atm)
result += result[c[1]] * int(c[2] or '1')
return result
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With