Storing logic in the database

advertisements

This is NOT about stored procedures (or at least i don't think it is).

Lets say i have a database. A database has some cities in it and events that take place in those cities. Some people are going to use this site and they want to get notified about certain events, when they enter the site.
The rules for specifying what events they want to be notified about are supposed to be general.

For example, i would want user to be able to say "I want to be notified about all events, that will take place on sunday, in city that was founded between year 1200 and 1400, in the country which name starts with the letter "F" or which is in South America", which would be translate in pseudo-logic-code of:

  (
  event.date.day == Sunday
  and
  event.city.founded_date.year.between(1200, 1400)
  )
AND
  (
  event.city.country.starts_with("F")
  or
  event.city.country.continent == "South Africa"
  )

The rules like "continent is", "day is", "foundation date is between" et cetera are to be predefined, user will choose it, but i want to be able to add new rules in future.

What is the best way to store such logic? The only solution i can come up with is a "NotificationGatherer" model. It would contain id of a user, and a string with json. I would create a json binary tree, for this particular case, the capitalized "AND" would be a root with two children - the inner and and inner or. The first children would have two simple conditions, which would reflect the actual conditions as abive.

Then i would have a method that's called on user's request, which would either:

  1. Evaluate the value (true/false) of this conditions set for all upcoming events

    or

  2. Form a queryset with filters, which would fetch all the upcoming events that satisfy given conditions (much more difficult and much more efficient).

Now, is this even a good approach, or should i try something else? It seems rather complex (I already see how painful it would be to test it), and i can imagine that a lot of people could have needed something like this in the past, yet i can't find any suggestions, because any searching for "logic in database" automatically points me to articles/questions about stored procedures.

If that makes any difference, I'm using django and mysql.


If it was for me, I would store the rules in database, then process them from time to time using Celery.

For the model part, I think multi-table inheritance is the way to go, since different rules need to store different data. In my opinion, django-polymorphic is your friend here:

I suggest something such as:

from django.db import models
from polymorphic import PolymorphicModel

class AbtractRuleObject(models.Model):
    class Meta:
        abstract = True

    def filter_queryset(self, queryset):
        """Will handle actual filtering of the event queryset"""
        raise NotImplementedError

    def match_instance(self, instance):
        raise NotImplementedError

class RuleSet(AbtractRuleObject):
    """Will manage the painful part o handling the OR / AND logic inside the database"""
    NATURE_CHOICES = (
        ('or', 'OR'),
        ('and', 'AND'),
    )
    nature = models.CharField(max_length=5, choices=NATURE_CHOICES, default='and')

    # since a set can belong to another set, etc.
    parent_set = models.ForeignKey('self', null=True, blank=True, related_name='children')

    def filter_queryset(self, queryset):
        """This is rather naive and could be optimized"""
        if not self.parent_set:
            # this is a root rule set so we just filter according to registered rules
            for rule in self.rules:
                if self.nature == 'and':
                    queryset = rule.filter_queryset(queryset)
                elif self.nature == 'or':
                    queryset = queryset | rule.filter_queryset(queryset)
        else:
            # it has children rules set
            for rule_set in self.children:
                if self.nature == 'and':
                    queryset = rule_set.filter_queryset(queryset)
                elif self.nature == 'or':
                    queryset = queryset | rule_set.filter_queryset(queryset)
        return queryset

    def match_instance(self, instance):
        if not self.parent_set:
            if self.nature == 'and':
                return all([rule_set.match_instance(instance) for rule_set in self.children])
            if self.nature == 'any':
                return any([rule_set.match_instance(instance) for rule_set in self.children])
        else:
            if self.nature == 'and':
                return all([rule_set.match_instance(instance) for rule_set in self.children])
            if self.nature == 'any':
                return any([rule_set.match_instance(instance) for rule_set in self.children])

class Rule(AbtractRuleObject, PolymorphicModel):
    """Base class for all rules"""
    attribute = models.CharField(help_text="Attribute of the model on which the rule will apply")
    rule_set = models.ForeignKey(RuleSet, related_name='rules')

class DateRangeRule(Rule):
    start = models.DateField(null=True, blank=True)
    end = models.DateField(null=True, blank=True)

    def filter_queryset(self, queryset):
        filters = {}
        if self.start:
            filters['{0}__gte'.format(self.attribute)] = self.start
        if self.end:
            filters['{0}__lte'.format(self.attribute)] = self.end
        return queryset.filter(**filters)

    def match_instance(self, instance):
        start_ok = True
        end_ok = True
        if self.start:
            start_ok = getattr(instance, self.attribute) >= self.start
        if self.end:
            end_ok = getattr(instance, self.attribute) <= self.end

        return start_ok and end_ok

class MatchStringRule(Rule):
    match = models.CharField()
    def filter_queryset(self, queryset):
        filters = {'{0}'.format(self.attribute): self.match}
        return queryset.filter(**filters)

    def match_instance(self, instance):
        return getattr(instance, self.attribute) == self.match

class StartsWithRule(Rule):
    start = models.CharField()

    def filter_queryset(self, queryset):
        filters = {'{0}__startswith'.format(self.attribute): self.start}
        return queryset.filter(**filters)

    def match_instance(self, instance):
        return getattr(instance, self.attribute).startswith(self.start)

Now, assuming your Event and City models look like:

class Country(models.Model):
    continent = models.CharField()
    name = models.CharField(unique=True)

class City(models.Model):
    name = models.CharField(unique=True)
    country = models.ForeignKey(Country)
    founded_date = models.DateField()

class Event(models.Model):
    name = models.CharField(unique=True)
    city = models.ForeignKey(City)
    start = models.DateField()
    end = models.DateField()

Then you can use my example as follow:

global_set = RuleSet(nature='and')
global_set.save()

set1 = RuleSet(nature='and', parent_set=global_set)
set1.save()

year_range = DateRangeRule(start=datetime.date(1200, 1, 1),
                           end=datetime.date(1400, 1, 1),
                           attribute='city__founded_date',
                           rule_set=set1)
year_range.save()

set2 = RuleSet(nature='or', parent_set=global_set)
set2.save()

startswith_f = StartsWithRule(start='F',
                              attribute='city__country__name')
                              rule_set=set2)
startswith_f.save()

exact_match = MatchStringRule(match='South Africa',
                              attribute='city__country__continent')
                              rule_set=set2)
exact_match.save()

queryset = Event.objects.all()

# Magic happens here

# Get all instances corresponding to the rules
filtered_queryset = global_set.filter_queryset(queryset)

# Check if a specific instance match the rules
assert global_set.match_instance(filtered_queryset[0]) == True

The code is absolutely untested, but I think it could eventually work or, at least, give you an implementation idea.

I hope it helps!