Random data generator for a regex in Python


I am looking for Python code that I can use to create random data matching any given regex. For example, if the regex is


I want to get a list of random numbers with a random length between 1 and 100 (uniformly distributed).

There are some 'regex inverters' available (see here) that compute ALL possible matches. That is not what I want, and it is extremely impractical: the example above has more than 10^100 possible matches, which can never be stored in a list. I just need a function that returns one match at random.
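
To make the size concrete, here is a quick check. It assumes the example means digit strings of length 1 to 100, as described above; the variable name is only illustrative.

```python
# Count all digit strings of length 1 to 100 (assumed reading of the example).
total = sum(10 ** k for k in range(1, 101))

print(total > 10 ** 100)   # True
print(len(str(total)))     # 101 (the count itself has 101 digits)
```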

Maybe there is already a package that can do this? I need a function that creates a matching string for ANY regex, not just the one given above, since I have maybe 100 different regexes. I cannot hand-code a generator for each of them; I want the function to read the pattern itself and return a matching string.

If the expressions you match do not have any "advanced" features, like look-ahead or look-behind, then you can parse them yourself and build a proper generator.

Treat each part of the regex as a function returning something (e.g., between 1 and 100 digits) and glue them together at the top:

import random
from string import ascii_letters, ascii_uppercase, digits

def joiner(*items):
    # Returns a string directly; to nest it inside roll(), it would have
    # to return a lambda like the other functions.
    return ''.join(item() for item in items)

def roll(item, n1, n2=None):
    # Repeat item between n1 and n2 times (exactly n1 times if n2 is omitted).
    if n2 is None:
        n2 = n1
    return lambda: ''.join(item() for _ in range(random.randint(n1, n2)))

def rand(collection):
    # Pick one random character from collection.
    return lambda: random.choice(collection)

# this is a generator for /\d{1,10}:[A-Z]{5}/
print(joiner(roll(rand(digits), 1, 10),
             rand(':'),
             roll(rand(ascii_uppercase), 5)))

# [A-C]{2}\d{2,20}@\w{10,1000}
# (ASCII letters only: a subset of what \w matches)
print(joiner(roll(rand('ABC'), 2),
             roll(rand(digits), 2, 20),
             rand('@'),
             roll(rand(ascii_letters), 10, 1000)))
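
As the comment in joiner hints, the pieces compose better if every combinator returns a callable. Here is a minimal sketch of that idea for Python 3. The helpers seq, either and lit are hypothetical names introduced only for this illustration; they add nesting, alternation and literals on top of the same approach.

```python
import random
import re
from string import ascii_uppercase, digits

def rand(collection):
    # One random character from collection.
    return lambda: random.choice(collection)

def lit(text):
    # A fixed literal string.
    return lambda: text

def roll(item, n1, n2=None):
    # Repeat item between n1 and n2 times (exactly n1 if n2 is omitted).
    if n2 is None:
        n2 = n1
    return lambda: ''.join(item() for _ in range(random.randint(n1, n2)))

def seq(*items):
    # Like joiner, but returns a callable, so it can be nested in roll().
    return lambda: ''.join(item() for item in items)

def either(*items):
    # Alternation: pick one branch at random, then sample from it.
    return lambda: random.choice(items)()

# (\d{2}-){1,3}(foo|[A-Z]{3})
gen = seq(roll(seq(roll(rand(digits), 2), lit('-')), 1, 3),
          either(lit('foo'), roll(rand(ascii_uppercase), 3)))

sample = gen()
print(sample, bool(re.fullmatch(r'(\d{2}-){1,3}(foo|[A-Z]{3})', sample)))
```

Note that either picks each branch with equal probability, which is not the same as picking uniformly among all matching strings.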

Parsing the regex is a separate problem, so this solution is not universal, but maybe it is sufficient.
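
For the parsing step, a rough sketch for a very small regex subset might look like the following. It is an assumption-laden illustration, not a general solution: it handles only literal characters, \d, \w (ASCII only), simple character classes like [A-Z], and the {m} / {m,n} quantifiers. The names CLASSES, parse_class, parse_count and compile_generator are made up for this example. Anything else (groups, alternation, *, +, ?, negated classes, open-ended {m,}) is rejected or not handled.

```python
import random
import re
import string

# Supported escapes; \w is restricted to ASCII here (an assumption).
CLASSES = {'d': string.digits,
           'w': string.ascii_letters + string.digits + '_'}

def parse_class(pattern, i):
    # pattern[i] is the first char after '['; returns (chars, index after ']').
    chars = []
    while pattern[i] != ']':
        if pattern[i + 1] == '-' and pattern[i + 2] != ']':
            lo, hi = ord(pattern[i]), ord(pattern[i + 2])
            chars.extend(chr(c) for c in range(lo, hi + 1))
            i += 3
        else:
            chars.append(pattern[i])
            i += 1
    return ''.join(chars), i + 1

def parse_count(pattern, i):
    # Optional {m} or {m,n} at index i; returns (m, n, next index).
    if i < len(pattern) and pattern[i] == '{':
        end = pattern.index('}', i)
        parts = pattern[i + 1:end].split(',')
        return int(parts[0]), int(parts[-1]), end + 1
    return 1, 1, i

def compile_generator(pattern):
    parts = []  # list of (alphabet, min_count, max_count)
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == '\\':
            nxt = pattern[i + 1]
            if nxt in CLASSES:
                alphabet = CLASSES[nxt]
            elif nxt.isalnum():
                raise ValueError('unsupported escape: \\%s' % nxt)
            else:
                alphabet = nxt  # escaped punctuation, e.g. \.
            i += 2
        elif ch == '[':
            alphabet, i = parse_class(pattern, i + 1)
        elif ch in '()|*+?.^$':
            raise ValueError('unsupported regex feature: %r' % ch)
        else:
            alphabet, i = ch, i + 1
        lo, hi, i = parse_count(pattern, i)
        parts.append((alphabet, lo, hi))

    def generate():
        out = []
        for alphabet, lo, hi in parts:
            count = random.randint(lo, hi)  # uniform repetition count
            out.extend(random.choice(alphabet) for _ in range(count))
        return ''.join(out)
    return generate

for pattern in (r'\d{1,10}:[A-Z]{5}', r'[A-C]{2}\d{2,20}@\w{10,1000}'):
    sample = compile_generator(pattern)()
    print(pattern, '->', sample[:40], bool(re.fullmatch(pattern, sample)))
```

Each repetition count is drawn uniformly from its range, so the length is equally distributed per part, but the strings themselves are not sampled uniformly over all possible matches.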