Validating e-mail addresses with single-character domain names using a regex

I have a regex that I am using to validate email addresses. I like this regex because it is fairly relaxed and has proven to work quite well.

Here is the regex:

(['\"]{1,}.+['\"]{1,}\s+)?<?[\w\.\-]+@[^\.][\w\.\-]+\.[A-Za-z]{2,}>?

Ok great, basically all reasonably valid email addresses that you can throw at it will validate. I know that maybe even some invalid ones will fall through but that is ok for my specific use-case.

Now it happens that an address at x.com does not validate. And guess what: x.com is a domain name that actually exists (owned by PayPal).

Looking at the regex part that validates the domain name:

@[^\.][\w\.\-]+

It looks like this should be able to parse the x.com domain name, but it doesn't. The culprit is the part that checks that a domain name cannot begin with a dot (such as an address at .test.com):

@[^\.]

If I remove the [^\.] part of my regex, the domain x.com validates, but now the regex allows domain names beginning with a dot, such as .test.com; this is a little bit too relaxed for me ;-)
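For what it's worth, the behaviour can be reproduced with a small Python sketch. It only tests the domain part of the pattern (with the TLD tail attached), and the sample addresses and the helper name `matches` are made up for illustration:

```python
import re

# Domain part of the pattern with and without the leading [^\.] check.
WITH_CHECK = re.compile(r"@[^\.][\w\.\-]+\.[A-Za-z]{2,}$")
WITHOUT_CHECK = re.compile(r"@[\w\.\-]+\.[A-Za-z]{2,}$")


def matches(pattern: re.Pattern, address: str) -> bool:
    """Return True if the domain part of the address satisfies the pattern."""
    return pattern.search(address) is not None


# Hypothetical sample addresses.
for address in ("user@xy.com", "user@x.com", "user@.test.com"):
    print(address, matches(WITH_CHECK, address), matches(WITHOUT_CHECK, address))
```

With the check in place, the two-character name is accepted while x.com and .test.com are both rejected; without it, all three are accepted.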

So my question is: how can the negated character class affect my single-character check? The way I am reading the regex is "make sure this string does not start with a dot", but apparently it does more.

Any help would be appreciated.

Regards,

Waseem


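As Luis suggested, you can use [^\.][\w\.\-]* to match the domain name. The original fails because [^\.] is not a "does not start with a dot" assertion: it consumes one character itself, and [\w\.\-]+ then demands at least one more character before the final period, so a one-character name such as x can never match. However, the * version will now also match addresses like [email protected] and [email protected]@.com. You might want to make sure that there is only one period at a time, and that the first character after the @ is more restricted than just not being a period.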
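A quick sketch of what the * variant lets through, in Python. Only the domain part of the pattern is tested, and the sample addresses and the helper name `domain_matches` are invented for illustration, not taken from the question:

```python
import re

# Domain part using * instead of +, followed by the TLD tail.
STAR_VARIANT = re.compile(r"@[^\.][\w\.\-]*\.[A-Za-z]{2,}$")


def domain_matches(address: str) -> bool:
    """Return True if the domain part of the address satisfies the * variant."""
    return STAR_VARIANT.search(address) is not None


# Hypothetical sample addresses.
samples = [
    "user@x.com",      # one-character name: now accepted
    "user@x..com",     # two periods in a row: also accepted
    "user@@.com",      # [^\.] happily consumes a second @
    "user@.test.com",  # leading period: still rejected
]
for address in samples:
    print(address, domain_matches(address))
```

The second and third samples show why "anything but a period" is too loose for the first character after the @.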

Match the domain name and the period (and subdomains and their periods) using:

([\w\-]+\.)+

So your pattern would be:

(['\"]{1,}.+['\"]{1,}\s+)?<?[\w\.\-]+@([\w\-]+\.)+[A-Za-z]{2,}>?
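Here is a minimal Python sketch to try the full pattern. It assumes the part just before the domain is [\w\.\-]+@ and that the whole string has to match (hence `re.fullmatch`); the helper name `is_valid` and the sample addresses are made up for illustration:

```python
import re

# Full pattern with the ([\w\-]+\.)+ domain group.
# Assumption: the local part is [\w\.\-]+ followed by @.
EMAIL = re.compile(
    r"""(['\"]{1,}.+['\"]{1,}\s+)?<?[\w\.\-]+@([\w\-]+\.)+[A-Za-z]{2,}>?"""
)


def is_valid(address: str) -> bool:
    """Return True if the whole string matches the pattern."""
    return EMAIL.fullmatch(address) is not None


# Hypothetical sample addresses.
samples = [
    "user@x.com",
    "user@mail.example.com",
    '"Jane Doe" <jane@example.org>',
    "user@.test.com",
    "user@x..com",
]
for address in samples:
    print(address, is_valid(address))
```

The first three samples should be accepted and the last two rejected. Note that [\w\-] still allows underscores and a leading or trailing hyphen in a domain label, so this stays on the relaxed side.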