
RegExp to match comma separated hostnames

The problem

I'm trying to validate the content of a <textarea> using JavaScript, so I created a validate() function, which returns true or false depending on whether the text inside the textarea is valid.

The textarea can only contain comma separated hostnames. By hostname I mean something like subdomain.domain.com, so it's basically some dot separated strings. Since users don't always type carefully, I also want to allow any amount of whitespace between the hostnames and the commas, but not inside a hostname.

Here are some examples of what should or shouldn't match:

  • Should match:

    • domain.com,domain2.co.vu,sub.domain.org
    • ​ domai2n.com , dom-ain.org.co.vu.nl ,domain.it ​
    • dom-ain.it, domain.com, domain.eu.org.something
    • a.b.c, a.b, a.a.a , a.r
    • 0191481.com
  • Should not match:

    • domain.com., sub.domain.it (incomplete hostname)
    • domain.me, domain2 (incomplete hostname)
    • sub.sub.sub.domain.tv, do main.it (hostname contains spaces)
    • site (incomplete hostname)
    • hèy.com (hostname cannot contain accents)
    • hey.01com (hostname cannot end with a number or with a string containing numbers)
    • hello.org..wow (incomplete hostname)

What I have tried so far

I built my function using the following code:

function validate(text) {
    return (
        (/^([a-z0-9\-\.]+ *, *)*[a-z0-9\-\.]+[^, ]$/i.test(text) 
        && !/\.[^a-z]|\.$/i.test(text)
        && ~text.indexOf('.'))
    );
}

Unfortunately, my function doesn't work: it fails to recognize incomplete hostnames and returns true for them.

Is there any way to accomplish this? A solution without RegExps would be fine, although I'd prefer to use a single RegExp.

Marco Bonelli asked Oct 20 '25 13:10


1 Answer

The answers saying to not use regex are perfectly fine, but I like regex so:

^\s*(?:(?:\w+(?:-+\w+)*\.)+[a-z]+)\s*(?:,\s*(?:(?:\w+(?:-+\w+)*\.)+[a-z]+)\s*)*$

Yeah, it's not so pretty, but it works: tested on your sample cases at http://regex101.com
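A minimal sketch of how the pattern above might be dropped into the validate() function from the question. The `i` flag is an assumption on my part: without it, `[a-z]+` only accepts lowercase TLDs, and the question's own attempt was case-insensitive. The name `HOSTNAME_LIST` is just illustrative. Note also that `\w` accepts underscores, so something like `my_host.com` would pass.

```javascript
// Pattern from this answer, with the `i` flag added
// (assumption: uppercase input such as DOMAIN.COM should be accepted).
const HOSTNAME_LIST = /^\s*(?:(?:\w+(?:-+\w+)*\.)+[a-z]+)\s*(?:,\s*(?:(?:\w+(?:-+\w+)*\.)+[a-z]+)\s*)*$/i;

function validate(text) {
    return HOSTNAME_LIST.test(text);
}

// Sample cases taken from the question.
const shouldMatch = [
    'domain.com,domain2.co.vu,sub.domain.org',
    ' domai2n.com , dom-ain.org.co.vu.nl ,domain.it ',
    'dom-ain.it, domain.com, domain.eu.org.something',
    'a.b.c, a.b, a.a.a , a.r',
    '0191481.com'
];
const shouldNotMatch = [
    'domain.com., sub.domain.it',
    'domain.me, domain2',
    'sub.sub.sub.domain.tv, do main.it',
    'site',
    'hèy.com',
    'hey.01com',
    'hello.org..wow'
];

shouldMatch.forEach(function (text) {
    console.log(JSON.stringify(text), validate(text));    // expected: true
});
shouldNotMatch.forEach(function (text) {
    console.log(JSON.stringify(text), validate(text));    // expected: false
});
```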

Edit: OK, let's break it down. The pattern now only allows things like sub-domain-01.com and a--b.com, and not -.com

Each subdomain thingo: \w+(?:-+\w+)* matches a string of word characters, optionally followed by more strings of word characters, each preceded by one or more dashes.
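To see that piece in isolation, here is a small sketch. The `^` and `$` anchors and the name `LABEL` are added only for this demonstration and are not part of the answer's pattern.

```javascript
// The "subdomain thingo" on its own, anchored so that it must match
// the whole string (the anchors are an addition for this demo).
const LABEL = /^\w+(?:-+\w+)*$/;

console.log(LABEL.test('sub-domain-01')); // true: dashes sit between word characters
console.log(LABEL.test('a--b'));          // true: repeated dashes are allowed by -+
console.log(LABEL.test('-'));             // false: must start with a word character
console.log(LABEL.test('abc-'));          // false: a dash must be followed by word characters
console.log(LABEL.test('a_b'));           // true: \w also accepts the underscore
```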

Each hostname: \s*(?:(?:\w+(?:-+\w+)*\.)+[a-z]+)\s* one or more subdomain thingos, each followed by a dot, then finally a string of letters only (the TLD). And of course the optional whitespace around the sides.

Whole thing: ^\s*(?:(?:\w+(?:-+\w+)*\.)+[a-z]+)\s*(?:,\s*(?:(?:\w+(?:-+\w+)*\.)+[a-z]+)\s*)*$ a single hostname, followed by 0 or more ",hostname" groups for our comma separated list, anchored with ^ and $ so that the entire text has to match.
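If the one-liner is hard to maintain, the same pieces could be assembled in code. This is only a sketch: the variable names are made up for illustration, and the `i` flag is again an assumption. The resulting source is identical to the full pattern given at the top of this answer.

```javascript
// Build the full pattern from the pieces described in the breakdown.
const label = /\w+(?:-+\w+)*/.source;           // one "subdomain thingo"
const hostname = `(?:(?:${label}\\.)+[a-z]+)`;  // labels with dots, then a letters-only TLD
const list = new RegExp(
    `^\\s*${hostname}\\s*(?:,\\s*${hostname}\\s*)*$`,
    'i'                                         // assumption: ignore case
);

console.log(list.source);
console.log(list.test('a.b.c, a.b, a.a.a , a.r')); // true
console.log(list.test('domain.me, domain2'));      // false
```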

Pretty simple really.
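For comparison, the non-regex route mentioned at the start of this answer could look roughly like the sketch below. It is not taken from any of those other answers: `validateBySplitting` is a hypothetical name, and it still leans on two tiny per-label patterns rather than one long one. It applies the same rules as the big pattern: at least one dot, a letters-only last label, and dashes only between word characters.

```javascript
// Hypothetical helper: split the list and check each hostname piece by piece.
function validateBySplitting(text) {
    return text.split(',').every(function (entry) {
        const labels = entry.trim().split('.');
        if (labels.length < 2) {
            return false;                          // no dot at all, e.g. "site"
        }
        const tld = labels[labels.length - 1];
        if (!/^[a-z]+$/i.test(tld)) {
            return false;                          // last label must be letters only
        }
        // Every label before the last: word characters, dashes only in between.
        return labels.slice(0, -1).every(function (label) {
            return /^\w+(?:-+\w+)*$/.test(label);
        });
    });
}

console.log(validateBySplitting('dom-ain.it, domain.com, domain.eu.org.something')); // true
console.log(validateBySplitting('sub.sub.sub.domain.tv, do main.it'));               // false
console.log(validateBySplitting('hello.org..wow'));                                  // false
```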

cbreezier answered Oct 22 '25 02:10