How to validate internationalized domain names

I want to validate a domain URL in PHP that may be an internationalized domain name (IDN), for example the Greek domain http://παράδειγμα.δοκιμή. Is there any way to validate it using a regular expression?


These are IDN domains. I would first convert the name to its Punycode version and then validate that.

But if you really want to validate it with a regex:

<?php

$domain = 'παράδειγμα.gr';
$regex = '#^([\w-]+://?|www[\.])?([^\-\s\,\;\:\+\/\\\?\^\`\=\&\%\"\'\*\#\<\>]*)\.[a-z]{2,7}$#';
if (preg_match($regex, $domain)) {
    echo "VALID";
}

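But this will give you false positives, because validating an IDN properly is really complex. I only tried to make sure that no invalid characters are present, and the list is NOT complete. Note also that the pattern only accepts an ASCII TLD of 2 to 7 lowercase letters, so a name such as παράδειγμα.δοκιμή from the question is rejected.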
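One way to reduce the false positives is to whitelist what a label may contain instead of blacklisting characters. The following is only a rough sketch and not part of the answer's original code: the function name is made up, it assumes the input is a bare host name in valid UTF-8 (no scheme or path), and it still does not implement the full IDNA rules.

```php
<?php
// Hypothetical helper: whitelist-based plausibility check for a Unicode host name.
// It only checks the shape of the name; it is not a full IDNA validation.
function looks_like_unicode_hostname(string $host): bool
{
    // A label starts with a letter or digit, may contain letters, digits,
    // combining marks and hyphens, and ends with a letter, digit or mark.
    $label = '[\p{L}\p{N}](?:[\p{L}\p{N}\p{M}\-]*[\p{L}\p{N}\p{M}])?';

    // At least two labels separated by dots. "u" enables UTF-8 mode,
    // "D" makes "$" match only at the very end of the string.
    $pattern = '/^' . $label . '(?:\.' . $label . ')+$/uD';

    return preg_match($pattern, $host) === 1;
}
```

Unlike the blacklist above, this accepts hyphens inside a label and a non-ASCII TLD, but it should still be treated as a first filter only.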

It is better to convert the name to Punycode first:

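// idn_to_ascii() needs the intl extension and returns false on failure.
$ascii = idn_to_ascii($domain);
// The TLD may itself be Punycode (xn--...), so [a-z]{2,7} alone is too strict.
$regex = '#^([\w-]+://?|www[\.])?[a-z0-9]+[a-z0-9\-\.]*[a-z0-9]+\.([a-z]{2,63}|xn--[a-z0-9\-]+)$#';
if ($ascii !== false && preg_match($regex, $ascii)) {
    echo "VALID";
}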
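Once the name is plain ASCII, the check can also be done label by label instead of with one large pattern. This is a minimal sketch under a few assumptions: the function name is hypothetical, the input is the already converted ASCII name without scheme or trailing dot, and the limits used are the usual DNS host name limits (63 characters per label, 253 in total).

```php
<?php
// Hypothetical helper: validates an ASCII (already Punycode-converted) host name
// label by label, using the usual DNS host name length limits.
function is_valid_ascii_hostname(string $host): bool
{
    $host = strtolower($host);

    // The whole name is limited to 253 characters.
    if ($host === '' || strlen($host) > 253) {
        return false;
    }

    $labels = explode('.', $host);
    if (count($labels) < 2) {
        return false; // require at least "name.tld"
    }

    foreach ($labels as $label) {
        // 1 to 63 characters, letters, digits and hyphens only,
        // and no hyphen at the start or the end of a label.
        if (preg_match('/^(?!-)[a-z0-9-]{1,63}(?<!-)$/D', $label) !== 1) {
            return false;
        }
    }

    // Reject an all-numeric last label, so "192.168.1.1" is not treated as a domain.
    return preg_match('/^[0-9]+$/D', $labels[count($labels) - 1]) !== 1;
}
```

It would be called on the result of idn_to_ascii(), after checking that the conversion did not return false.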

And if you additionally want to test whether the domain can be resolved, try:

$regex = '#^([\w-]+://?|www[\.])?[a-z0-9]+[a-z0-9\-\.]*[a-z0-9]+\.([a-z]{2,63}|xn--[a-z0-9\-]+)$#';
$puny_domain = idn_to_ascii($domain);
if ($puny_domain !== false && preg_match($regex, $puny_domain)) {
    // gethostbyname() returns its argument unchanged when the lookup fails.
    if (gethostbyname($puny_domain) != $puny_domain) {
        echo "VALID";
    }
}
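One caveat: idn_to_ascii() and gethostbyname() expect a bare host name, while the regex also tolerates a scheme prefix such as http://. If the input can be a full URL like the one in the question, the host should be cut out first. A minimal sketch of that step, with a made-up function name and the simplifying assumption that user:password@ parts and IPv6 literals do not occur:

```php
<?php
// Hypothetical helper: reduces a URL-like string to its host part.
// Simplification: user info ("user@") and IPv6 literals are not handled.
function extract_host(string $url): string
{
    // Drop a leading scheme such as "http://".
    $rest = preg_replace('#^[a-z][a-z0-9+.\-]*://#i', '', $url) ?? $url;

    // Keep everything before the first port, path, query or fragment delimiter.
    return preg_split('~[:/?#]~', $rest, 2)[0];
}
```

The returned host can then be passed through idn_to_ascii() and the checks shown above.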