I'm writing a text to cdr (chordpro) converter and I'm having trouble detecting chord lines of the form:
               Cmaj7    F#m           C7    
Xxx xxxxxx xxx xxxxx xx x xxxxxxxxxxx xxx 
This is my Python code:
def getChordMatches(line):
    import re
    notes = "[CDEFGAB]";
    accidentals = "(#|##|b|bb)?";
    chords = "(maj|min|m|sus|aug|dim)?";
    additions = "[0-9]?"
    return re.findall(notes + accidentals + chords + additions, line)
I want it to return the list ["Cmaj7", "F#m", "C7"]. The above code doesn't work. I've struggled with the documentation, but I'm not getting anywhere.
Why doesn't it work to just chain the classes and groups together?
Edit:
Thanks, I ended up with the following, which covers most of my needs (it won't match E#m11, for instance).
import re

def getChordMatches(line):
    notes = "[ABCDEFG]"
    accidentals = "(?:#|##|b|bb)?"
    chords = "(?:maj|min|m|sus|aug|dim)?"
    additions = "[0-9]?"
    chordFormPattern = notes + accidentals + chords + additions
    # Optional slash bass note (e.g. C/G), followed by required whitespace.
    # Raw string so that \s reaches the regex engine unchanged.
    fullPattern = chordFormPattern + r"(?:/%s)?\s" % (notes + accidentals)
    found = list(re.finditer(fullPattern, line))
    # Strip the trailing whitespace character that the pattern consumes.
    matches = [m.group().strip() for m in found]
    positions = [m.start() for m in found]
    return matches, positions
You should make your groups non-capturing by changing (...) to (?:...).
accidentals = "(?:#|##|b|bb)?";
chords = "(?:maj|min|m|sus|aug|dim)?";
See it working online: ideone
The reason why it doesn't work when you have capturing groups is that it only returns those groups and not the entire match. From the documentation:
re.findall(pattern, string, flags=0)
Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
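The difference is easy to see on the sample chord line from the question. This is only a minimal sketch; the variable names capturing and non_capturing are made up for illustration.

```python
import re

line = "               Cmaj7    F#m           C7    "

# Capturing groups: findall returns one tuple of group contents per match,
# with '' for any group that did not take part in the match.
capturing = r"[CDEFGAB](#|##|b|bb)?(maj|min|m|sus|aug|dim)?[0-9]?"
print(re.findall(capturing, line))
# [('', 'maj'), ('#', 'm'), ('', '')]

# Non-capturing groups: findall returns the full text of each match.
non_capturing = r"[CDEFGAB](?:#|##|b|bb)?(?:maj|min|m|sus|aug|dim)?[0-9]?"
print(re.findall(non_capturing, line))
# ['Cmaj7', 'F#m', 'C7']
```

Note that the trailing digit never shows up in the first result, because [0-9]? is not inside a group.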
There is a specific syntax for writing a verbose regex:
import re

regex = re.compile(
    r"""[CDEFGAB]                  # Notes
        (?:\#|\#\#|b|bb)?          # Accidentals ('#' must be escaped in verbose mode)
        (?:maj|min|m|sus|aug|dim)? # Chords (optional, so that C7 matches)
        [0-9]?                     # Additions
     """, re.VERBOSE
)
result_list = regex.findall(line)
It's arguably a bit clearer than joining the strings together.
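If the column of each chord is needed as well (as in the question's edit), a verbose pattern can be combined with finditer. This is a sketch under a few assumptions: CHORD_RE and chords_with_positions are illustrative names, the # is escaped because an unescaped # starts a comment in verbose mode, the longer accidentals are listed first so that ## and bb are preferred over # and b, and the chord-quality group is optional so that plain chords such as C7 match.

```python
import re

CHORD_RE = re.compile(
    r"""[CDEFGAB]                   # Notes
        (?:\#\#|\#|bb|b)?           # Accidentals (longer alternatives first)
        (?:maj|min|m|sus|aug|dim)?  # Chords
        [0-9]?                      # Additions
     """,
    re.VERBOSE,
)

def chords_with_positions(line):
    """Return (column, chord) pairs for every match in the line."""
    return [(m.start(), m.group()) for m in CHORD_RE.finditer(line)]

# Sample chord line, built explicitly so the column numbers are unambiguous.
line = " " * 15 + "Cmaj7" + " " * 4 + "F#m" + " " * 11 + "C7"
print(chords_with_positions(line))
# [(15, 'Cmaj7'), (24, 'F#m'), (38, 'C7')]
```

Like the patterns above, this would also match capital letters A to G inside a lyric line, so it is best applied only to lines already believed to contain chords.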