
Using regex to avoid if elifs

Tags: python

I'm new to Python, which I'm using to do an ugly little put-this-tabular-data-into-a-db conversion. The program looks at the data, creates a table in MySQL, and then reads the data into the table. In this section, header row text is checked to make some decision about data typing. I had an idea that I could be clever and do this with a single regex rather than if/elifs. My solution works for this case at least, where I don't have to worry about multiple matches. What I'm asking is, is there any real merit to this approach in terms of efficiency?

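import re

def _typeMe(self, header_txt):
    # data typing: map a keyword found in the header text to a MySQL column spec
    colspecs = {
        'id':'SMALLINT(10)',
        'date':'DATE',
        'comments':'TEXT(4000)',
        'flag':'BIT(1)',
        'def':'VARCHAR(255)'
    }
    # regex to match on header text e.g. 'Provisioner ID'
    r = re.search(re.compile('(ID$)|(Date)|(Comments$)|(FLAG$)', re.IGNORECASE), header_txt)
    # max() picks the one group that matched; the others are None.
    # This relies on Python 2 ordering None below strings (Python 3 raises TypeError here).
    checktype = lambda m: max(m.groups()).lower() if m else 'def'
    return colspecs[checktype(r)]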
Ernesto Frittata asked Feb 01 '26 15:02



2 Answers

Absolutely; what you've got is called data-driven programming. In general it's good style because it allows you to make changes easily without having to worry about duplicating code sections.

In terms of performance it's unlikely to make much difference; the important thing is that it's more readable and more maintainable than the alternative.
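To make the maintainability point concrete, here is a rough sketch that contrasts an if/elif chain with a table-driven lookup. The names RULES, DEFAULT_SPEC, type_with_elifs and type_with_table are made up for this illustration and are not from the question; the patterns and column specs are copied from it.

```python
import re

# Hypothetical rule table: (compiled pattern, column spec), checked in order.
RULES = [
    (re.compile(r'ID$', re.IGNORECASE), 'SMALLINT(10)'),
    (re.compile(r'Date', re.IGNORECASE), 'DATE'),
    (re.compile(r'Comments$', re.IGNORECASE), 'TEXT(4000)'),
    (re.compile(r'FLAG$', re.IGNORECASE), 'BIT(1)'),
]
DEFAULT_SPEC = 'VARCHAR(255)'


def type_with_elifs(header_txt):
    # Branching version: every new type needs another elif in the code.
    if re.search(r'ID$', header_txt, re.IGNORECASE):
        return 'SMALLINT(10)'
    elif re.search(r'Date', header_txt, re.IGNORECASE):
        return 'DATE'
    elif re.search(r'Comments$', header_txt, re.IGNORECASE):
        return 'TEXT(4000)'
    elif re.search(r'FLAG$', header_txt, re.IGNORECASE):
        return 'BIT(1)'
    return 'VARCHAR(255)'


def type_with_table(header_txt):
    # Data-driven version: the logic stays fixed, only RULES changes.
    for pattern, spec in RULES:
        if pattern.search(header_txt):
            return spec
    return DEFAULT_SPEC
```

With the table version, supporting a new header keyword should only mean adding one entry to RULES; the function body does not change.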

ecatmur answered Feb 04 '26 06:02



I agree with @ecatmur's answer; I just wanted to post some slight code suggestions that are a little too long for a comment.

There's no need to do re.search(re.compile('...', re.IGNORECASE), header_txt). Instead, you can just pass the string straight in as re.search('...', header_txt, re.IGNORECASE). If you're using the same regex over and over, precompiling it with re.compile is a little faster, but re.search and friends will compile (and cache) the pattern for you if you didn't.
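As a small sketch of the two call styles: the flag can go in as the third argument to re.search, or the pattern can be compiled once and reused. HEADER_RE and the sample header value are names chosen for this illustration only.

```python
import re

header_txt = 'Provisioner ID'  # sample header text, as in the question

# Style 1: pass the pattern string and the flag straight to re.search.
m1 = re.search('(ID$)|(Date)|(Comments$)|(FLAG$)', header_txt, re.IGNORECASE)

# Style 2: compile once (e.g. at module level) and reuse the pattern object.
HEADER_RE = re.compile('(ID$)|(Date)|(Comments$)|(FLAG$)', re.IGNORECASE)
m2 = HEADER_RE.search(header_txt)

# Both styles find the same match.
print(m1.group(0), m2.group(0))
```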

Though I don't share Colin's disdain for named lambdas (they can be handy just because they're still one line instead of two), you don't need an inner function here at all:

return colspecs[max(m.groups()).lower() if m else 'def']

The max(m.groups()) trick also isn't necessary if you just make one capturing group instead of four: '(ID|Date|Comments|Flag)$'. Then you can do m.group(1).
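Put together, the single-group version might look roughly like the sketch below. It is written as a standalone function rather than a method, and the names COLSPECS, HEADER_RE and type_me are assumptions made for this illustration. Two things to be aware of: the trailing $ now applies to Date as well, which the original pattern left unanchored, and m.group(1) never has to compare None with a string, which Python 3 refuses to do inside max().

```python
import re

# Column specs copied from the question; 'def' is the fallback type.
COLSPECS = {
    'id': 'SMALLINT(10)',
    'date': 'DATE',
    'comments': 'TEXT(4000)',
    'flag': 'BIT(1)',
    'def': 'VARCHAR(255)',
}

# One capturing group, anchored at the end of the header text.
HEADER_RE = re.compile('(ID|Date|Comments|Flag)$', re.IGNORECASE)


def type_me(header_txt):
    # Look up the column spec for the keyword that ends the header, if any.
    m = HEADER_RE.search(header_txt)
    return COLSPECS[m.group(1).lower() if m else 'def']
```

If headers such as 'Date Created' should still be typed as DATE, the Date alternative would have to stay outside the anchored group.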

Danica answered Feb 04 '26 06:02



