I want to create a script that will add new domains to our DNS servers. I found a fully qualified domain name validation regex. However, when I use it with sed, it does not work as I would expect:
echo test | sed '/(?=^.{5,254}$)(^(?:(?!\d+\.)[a-zA-Z0-9_\-]{1,63}\.?)+(:[a-zA-Z]{2,})$)/p'
--------
Output is:
test
echo test.com | sed '/(?=^.{5,254}$)(^(?:(?!\d+\.)[a-zA-Z0-9_\-]{1,63}\.?)+(:[a-zA-Z]{2,})$)/p'
--------
Output is:
test.com
I expected the output of the first command to be a blank line. What am I doing wrong?
I find this to be a more comprehensive regex:
(?=^.{4,253}$)(^(?:[a-zA-Z0-9](?:(?:[a-zA-Z0-9\-]){0,61}[a-zA-Z0-9])?\.)+([a-zA-Z]{2,}|xn--[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])$)
(?=^.{4,253}$)
(?:[a-zA-Z0-9](?:(?:[a-zA-Z0-9\-]){0,61}[a-zA-Z0-9])?\.)+([a-zA-Z]{2,}|xn--[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])
RFC 3696§2: The DNS spec technically permits numerics in the TLD, as well as single-letter TLDs; however, there are currently no single-letter TLDs or TLDs with numbers, and all-numeric TLDs are not permitted, so this part of the regex has been simplified to [a-zA-Z]{2,}.
--OR--
RFC 3490§5: An internationalized domain name ccTLD (IDN ccTLD) may be punycoded, as indicated by an "xn--" prefix, after which it may contain letters, numbers, or hyphens. This approximates to xn--[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9].
Be aware that this pattern does not validate a punycode TLD! Invalid punycode will be tolerated, e.g. "xn--qqqq", because attempting to validate punycode against the appropriate encoding mechanisms is beyond the scope of a regular expression. While punycode itself technically permits an encoded string ending in a hyphen, RFC 3492§5 observes and respects the IDNA limitation that labels may not end in a hyphen.
EDIT 02/2021: Hat tip to user2241415 for pointing out that IDN ccTLDs did not match the previously-specified regex.
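As a rough illustration of how the pattern above behaves, here is a minimal sketch that wraps it in a shell function. It assumes GNU grep built with PCRE support (the -P flag); the names FQDN_RE and is_fqdn are made up for this example and are not part of any tool.

```shell
#!/bin/sh
# Assumption: GNU grep with PCRE support (-P) is available.
# FQDN_RE holds the comprehensive pattern quoted above, unchanged.
FQDN_RE='(?=^.{4,253}$)(^(?:[a-zA-Z0-9](?:(?:[a-zA-Z0-9\-]){0,61}[a-zA-Z0-9])?\.)+([a-zA-Z]{2,}|xn--[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])$)'

# is_fqdn (hypothetical helper): exit status 0 if $1 matches the pattern.
is_fqdn() {
    printf '%s\n' "$1" | grep -qP "$FQDN_RE"
}

# Try a few sample names and report how the pattern treats each one.
for name in test test.com -bad.com example.xn--qqqq; do
    if is_fqdn "$name"; then
        echo "$name: accepted"
    else
        echo "$name: rejected"
    fi
done
```

With this pattern, "test" and "-bad.com" should be rejected, while "test.com" is accepted. "example.xn--qqqq" is also accepted, which shows the caveat above: the punycode TLD is checked only for its shape, not for validity.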
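You are missing a question mark in your regex. The last group should be the non-capturing group (?:[a-zA-Z]{2,}), not (:[a-zA-Z]{2,}), which requires a literal colon: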
(?=^.{5,254}$)(^(?:(?!\d+\.)[a-zA-Z0-9_\-]{1,63}\.?)+(?:[a-zA-Z]{2,})$)
You can test your regex here
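Note also that sed does not support Perl-style constructs such as (?=...), (?!...) and (?:...), and without -n it prints every input line whether or not the pattern matches. You can do what you want with grep -P: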
$ echo test.com | grep -P '(?=^.{5,254}$)(^(?:(?!\d+\.)[a-zA-Z0-9_\-]{1,63}\.?)+(?:[a-zA-Z]{2,})$)'
test.com
$ echo test | grep -P '(?=^.{5,254}$)(^(?:(?!\d+\.)[a-zA-Z0-9_\-]{1,63}\.?)+(?:[a-zA-Z]{2,})$)'
$
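Since the goal is a script, the grep check above could be folded into a small filter along these lines. This is only a sketch under assumptions: it needs GNU grep with PCRE support (-P), and filter_domains is a hypothetical name. The step that actually adds a domain to the DNS servers is not shown because it depends on your setup.

```shell
#!/bin/sh
# Assumption: GNU grep with PCRE support (-P) is available.
# RE is the corrected pattern from this answer.
RE='(?=^.{5,254}$)(^(?:(?!\d+\.)[a-zA-Z0-9_\-]{1,63}\.?)+(?:[a-zA-Z]{2,})$)'

# filter_domains (hypothetical helper): reads one candidate per line on stdin,
# prints accepted names on stdout and reports rejected names on stderr.
filter_domains() {
    while IFS= read -r domain; do
        if printf '%s\n' "$domain" | grep -qP "$RE"; then
            printf '%s\n' "$domain"
        else
            printf 'rejected: %s\n' "$domain" >&2
        fi
    done
}

# Example run: only names that pass the check reach stdout.
printf '%s\n' test.com test example.org | filter_domains
```

The accepted names on stdout can then be piped into whatever command your environment uses to create the zone, while the rejected ones stay visible on stderr.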