Python: Data validation using regular expression

Tags:

python

regex

I am trying to use a Python regular expression to validate the value of a variable.

The validation rules are as follows:

  • The value can contain only a-z, A-Z, 0-9 and * (no spaces, no -, no ,)
  • The value can start with a digit (0-9), a letter (a-z, A-Z) or *
  • The value can end with a digit (0-9), a letter (a-z, A-Z) or *
  • In the middle, the value can contain a digit (0-9), a letter (a-z, A-Z) or *
  • Any other character must not be allowed

Currently I am using the following snippet of code to do the validation:

import re
data = "asdsaq2323-asds"
if re.compile("[a-zA-Z0-9*]+").match(data).group() == data:
    print("match")
else:
    print("no match")

I feel there should be a better way of doing the above. I am looking for something like the following:

validate_func(pattern, data)
# returns data if the data passes the validation rules
# returns None if the data does not pass the validation rules
# should not return only the part of the data that matches the validation rules

Does such a built-in function exist?

Sangeeth Saravanaraj asked Mar 17 '26 17:03


2 Answers

In a regex, the metacharacters ^ and $ mean "start-of-string" and "end-of-string" (respectively); so, rather than seeing what matches, and comparing it to the whole string, you can simply require that the regex match the whole string to begin with:

import re
data = "asdsaq2323-asds"
if re.compile("^[a-zA-Z0-9*]+$").match(data):
    print("match")
else:
    print("no match")

In addition, since you're only using the regex once (you compile it and immediately use it), you can use the convenience function re.match to handle that as a single step:

import re
data = "asdsaq2323-asds"
if re.match("^[a-zA-Z0-9*]+$", data):
    print("match")
else:
    print("no match")
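
As for the built-in the question asks about: on Python 3.4 and later, the re module also provides fullmatch, which succeeds only when the entire string matches, so no anchors are needed. The following is a minimal sketch under that version assumption; the names PATTERN and validate_full are hypothetical and chosen only for illustration.

```python
import re

# Assumes Python 3.4+, where re.fullmatch / Pattern.fullmatch are available.
# PATTERN and validate_full are illustrative names, not part of the re module.
PATTERN = re.compile(r"[a-zA-Z0-9*]+")


def validate_full(pattern, data):
    """Return data if the whole string matches pattern, otherwise None."""
    return data if pattern.fullmatch(data) else None


print(validate_full(PATTERN, "asdsaq2323-asds"))  # None
print(validate_full(PATTERN, "asdsaq2323asds"))   # asdsaq2323asds
```

Because fullmatch must consume the whole string, a trailing newline is rejected as well, since "\n" is not in the character class.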
ruakh answered Mar 19 '26 05:03


To make sure the entire string matches your pattern, use beginning and end of string anchors in your regex. For example:

import re
data = "asdsaq2323-asds"
regex = re.compile(r'\A[a-zA-Z0-9*]+\Z')
if regex.match(data):
    print("match")
else:
    print("no match")

Making this a function:

def validate_func(regex, data):
    return data if regex.match(data) else None

Example:

>>> regex = re.compile(r'\A[a-zA-Z0-9*]+\Z')
>>> validate_func(regex, 'asdsaq2323-asds')
>>> validate_func(regex, 'asdsaq2323asds')
'asdsaq2323asds'

As a side note, I prefer \A and \Z over ^ and $ for validation like this, because the meaning of ^ and $ can change depending on the flags used, and $ will also match just before a line break character at the end of the string.
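
To make that difference concrete, here is a small sketch comparing the two anchor styles on inputs that contain a line break. The variable names and test strings are made up for illustration.

```python
import re

# Same character class, three anchoring styles (names are illustrative).
dollar = re.compile(r"^[a-zA-Z0-9*]+$")
strict = re.compile(r"\A[a-zA-Z0-9*]+\Z")
multiline = re.compile(r"^[a-zA-Z0-9*]+$", re.MULTILINE)

trailing_newline = "abc123\n"
two_lines = "abc123\nnot-valid"

# $ matches just before a newline at the very end of the string.
print(bool(dollar.match(trailing_newline)))   # True
# \Z matches only at the true end of the string.
print(bool(strict.match(trailing_newline)))   # False

# With re.MULTILINE, $ matches at the end of every line,
# so only the first line is actually checked here.
print(bool(multiline.match(two_lines)))       # True
print(bool(strict.match(two_lines)))          # False
```

In both cases the ^ and $ version accepts input that contains characters outside the allowed set, while the \A and \Z version rejects it.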

Andrew Clark answered Mar 19 '26 05:03


