Using regex to avoid if elifs

Question

I'm new to Python, which I'm using to do an ugly little put-this-tabular-data-into-a-db conversion. The program looks at the data, creates a table in MySQL, and then reads the data into the table. In this section, header row text is checked to make some decision about data typing. I had an idea that I could be clever and do this with a single regex rather than if/elifs. My solution works for this case at least, where I don't have to worry about multiple matches. What I'm asking is, is there any real merit to this approach in terms of efficiency?

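import re

def _typeMe(self, header_txt):
    # Map each recognised header keyword to a MySQL column type;
    # 'def' is the fallback when nothing matches.
    colspecs = {
        'id': 'SMALLINT(10)',
        'date': 'DATE',
        'comments': 'TEXT(4000)',
        'flag': 'BIT(1)',
        'def': 'VARCHAR(255)'
    }
    # regex to match on header text e.g. 'Provisioner ID'
    r = re.search(re.compile('(ID$)|(Date)|(Comments$)|(FLAG$)', re.IGNORECASE), header_txt)
    # Only one group can match, so max() picks it out from the None entries
    # (this relies on Python 2 ordering, where None sorts below any string).
    checktype = lambda m: max(m.groups()).lower() if m else 'def'
    return colspecs[checktype(r)]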

ecatmur · Accepted Answer

Absolutely; what you've got is called data-driven programming. In general it's good style because it allows you to make changes easily without having to worry about duplicating code sections.

In terms of performance it's unlikely to make much difference; the important thing is that it's more readable and more maintainable than the alternative.
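To make the contrast concrete, here is a rough sketch of the same decision written both ways. The function names, RULES and DEFAULT_TYPE are made up for illustration and are not from the question; the table version uses an ordered list of (pattern, type) pairs rather than the single regex, which is just one possible way to drive the logic from data.

```python
import re

# Hypothetical if/elif version of the header check, for comparison only.
def type_with_branches(header_txt):
    text = header_txt.lower()
    if text.endswith('id'):
        return 'SMALLINT(10)'
    elif 'date' in text:
        return 'DATE'
    elif text.endswith('comments'):
        return 'TEXT(4000)'
    elif text.endswith('flag'):
        return 'BIT(1)'
    else:
        return 'VARCHAR(255)'


# Data-driven version: the rules live in a table and the logic is one loop.
# Adding a new column type means adding one row, not another branch.
RULES = [
    (r'ID$', 'SMALLINT(10)'),
    (r'Date', 'DATE'),
    (r'Comments$', 'TEXT(4000)'),
    (r'FLAG$', 'BIT(1)'),
]
DEFAULT_TYPE = 'VARCHAR(255)'


def type_with_table(header_txt):
    # Rules are tried in order; the first pattern that matches wins.
    for pattern, sql_type in RULES:
        if re.search(pattern, header_txt, re.IGNORECASE):
            return sql_type
    return DEFAULT_TYPE
```

Both functions should agree on headers like 'Provisioner ID' or 'Start Date'; the difference is only in where the rules are kept.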

Danica · Answer

I agree with @ecatmur's answer; I just wanted to post some slight code suggestions that are a little too long for a comment.

There's no need to do re.search(re.compile('...', re.IGNORECASE), header_txt). Instead, you can just pass the string straight in as re.search('...', header_txt, re.IGNORECASE). If you're using the same regex over and over, compiling it once with re.compile and reusing the result is faster, but re.search and friends will compile the pattern for you if you don't.
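A minimal sketch of the two spellings side by side. HEADER_RE, find_keyword and find_keyword_inline are hypothetical names used only for this illustration; the pattern is the one from the question.

```python
import re

# Hypothetical module-level pattern, compiled once and reused for every header.
HEADER_RE = re.compile(r'(ID$)|(Date)|(Comments$)|(FLAG$)', re.IGNORECASE)


def find_keyword(header_txt):
    """Search with the precompiled pattern; returns a match object or None."""
    return HEADER_RE.search(header_txt)


def find_keyword_inline(header_txt):
    """Same search, passing the pattern string and flags straight to re.search."""
    return re.search(r'(ID$)|(Date)|(Comments$)|(FLAG$)', header_txt, re.IGNORECASE)
```

Either form gives the same match; the precompiled one only pays off if the function is called many times.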

Though I don't share Colin's disdain for named lambdas (they can be handy just because they're still one line instead of two), you don't need an inner function here at all:

return colspecs[max(m.groups()).lower() if m else 'def']

The max(m.groups()) trick also isn't necessary if you just make one capturing group instead of four: '(ID|Date|Comments|Flag)$'. Then you can do m.group(1).
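Putting those suggestions together might look something like the sketch below. It is written as a plain function rather than a method, and COLSPECS, HEADER_RE and type_for are names chosen for this example, not anything from the question.

```python
import re

# Same mapping as in the question; 'def' is the fallback type.
COLSPECS = {
    'id': 'SMALLINT(10)',
    'date': 'DATE',
    'comments': 'TEXT(4000)',
    'flag': 'BIT(1)',
    'def': 'VARCHAR(255)',
}

# One capturing group, anchored at the end of the header text.
HEADER_RE = re.compile(r'(ID|Date|Comments|Flag)$', re.IGNORECASE)


def type_for(header_txt):
    """Return the column type for a header, falling back to 'def'."""
    m = HEADER_RE.search(header_txt)
    key = m.group(1).lower() if m else 'def'
    return COLSPECS[key]
```

One difference to be aware of: with the shared $ anchor, Date now has to come at the end of the header as well, whereas the original pattern matched Date anywhere in the text. If that matters for your data, the anchor would need to stay per-alternative.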

Tags:

python

Ernesto Frittata
