Optimizing Django database access when performing multiple filters on the same date-filtered base queryset


I have a view that does something like this:

objectBase = MyModel.objects.filter(startDate__range=(start, end))
automatedObjects = objectBase.filter(automated=True).count()
userCreatedObjects = objectBase.filter(userCreated=True).count()
bookObjects = objectBase.filter(subClass='book').count()
pageObjects = objectBase.filter(subClass='page').count()
allObjectsCount = objectBase.count()

I am using Django 1.2.4 and the latest Postgres.

Anyway, I have about 20 different ways I need to filter objectBase (which is already filtered by date), and I noticed that each resulting SQL query repeats the date filter. Is there a more efficient way for the subsequent queries to avoid re-filtering by date, and would it make a measurable speed difference?

Also, what do you think would be the best method for caching the objectBase query? In theory it could hold hundreds or thousands of objects for the filtered dates, and the likelihood of the start and end dates being identical across requests is very low.

For example, somebody could request the stats between dates t1 and t2, and later request t3 to t4 where t1 < t3 < t2 and t2 < t4, so there is some overlap. Is there a way to cache the results so that, where requests overlap, the overlapping portion would not have to hit the db again?

Sorry if this seems like a hefty request, but any help would be appreciated.


To reduce the number of queries...

objectBase = MyModel.objects.filter(startDate__range=(start, end))
automated, user_created, books, pages, total = 0, 0, 0, 0, 0
for o in objectBase:  # one query; all counting happens in Python
    if o.automated: automated += 1
    if o.userCreated: user_created += 1
    if o.subClass == 'book': books += 1
    if o.subClass == 'page': pages += 1
    total += 1

This will only execute a single query, but it may well be slower than what you're already doing, depending on your SQL indexes. If all of the fields you're counting on are indexed, along with the date range, your current solution will be quick. I doubt that you have all of those fields indexed, however.
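Another way to get everything in a single query, without pulling every row into Python, is conditional aggregation in SQL (`SUM(CASE WHEN ...)`); in Django 1.2 you would reach for `.extra()` or raw SQL to express it. Here is a sketch of the query shape using sqlite3 so it is self-contained; the table and column names are assumptions mirroring the question's model:

```python
import sqlite3

# Hypothetical table mirroring MyModel; names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE mymodel (
        start_date TEXT,
        automated INTEGER,
        user_created INTEGER,
        sub_class TEXT
    )
""")
conn.executemany("INSERT INTO mymodel VALUES (?, ?, ?, ?)", [
    ("2011-01-05", 1, 0, "book"),
    ("2011-01-10", 0, 1, "page"),
    ("2011-01-15", 1, 1, "book"),
    ("2011-02-01", 0, 0, "page"),  # outside the range queried below
])

# One pass over the date-filtered rows computes every count at once.
query = """
    SELECT COUNT(*),
           SUM(CASE WHEN automated = 1 THEN 1 ELSE 0 END),
           SUM(CASE WHEN user_created = 1 THEN 1 ELSE 0 END),
           SUM(CASE WHEN sub_class = 'book' THEN 1 ELSE 0 END),
           SUM(CASE WHEN sub_class = 'page' THEN 1 ELSE 0 END)
    FROM mymodel
    WHERE start_date BETWEEN ? AND ?
"""
total, automated, user_created, books, pages = conn.execute(
    query, ("2011-01-01", "2011-01-31")
).fetchone()
print(total, automated, user_created, books, pages)  # prints: 3 2 2 2 1
```

The date filter is evaluated once and the database does all the counting, so only one small result row crosses the wire instead of the whole queryset.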

To your question on caching: there is no easy way to cache queryset results without reusing the same queryset instance. You could try Django's cache framework, but if you have many thousands of rows in your table, I don't think caching will help you much.
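If you did want to exploit the overlap between date ranges, one approach (independent of Django) is to cache counts per day, so a new range only queries the database for days it has not seen before. A minimal sketch, where `fetch_counts_for_day` is a hypothetical stand-in for a real per-day query such as `MyModel.objects.filter(startDate=day).count()`:

```python
import datetime

_day_cache = {}   # day -> count, shared across requests
db_calls = 0      # instrumentation: how many "queries" actually ran

def fetch_counts_for_day(day, db):
    """Stand-in for a real per-day database query."""
    global db_calls
    db_calls += 1
    return db.get(day, 0)

def count_in_range(start, end, db):
    """Sum per-day counts over [start, end], hitting the db only for uncached days."""
    total = 0
    day = start
    while day <= end:
        if day not in _day_cache:
            _day_cache[day] = fetch_counts_for_day(day, db)
        total += _day_cache[day]
        day += datetime.timedelta(days=1)
    return total

# Overlapping requests: the second range reuses the cached count for Jan 2.
db = {datetime.date(2011, 1, 1): 5,
      datetime.date(2011, 1, 2): 3,
      datetime.date(2011, 1, 3): 2}
print(count_in_range(datetime.date(2011, 1, 1), datetime.date(2011, 1, 2), db))  # prints: 8
print(count_in_range(datetime.date(2011, 1, 2), datetime.date(2011, 1, 3), db))  # prints: 5
print(db_calls)  # prints: 3 (Jan 2 was fetched only once)
```

Whether this wins depends on your data: it trades one range query for up to one query per uncached day, so it only pays off when the same days recur across many requests.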

My advice would be to create indexes on all the columns you are counting, alongside the date column. This should make your .count() queries fast without having to iterate over a potentially massive collection.
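In Django this is a matter of adding `db_index=True` to the relevant fields (available in 1.2). A sketch assuming a model shaped like the question's MyModel; the field types are guesses from how the fields are filtered:

```python
from django.db import models

class MyModel(models.Model):
    # db_index=True asks the backend (Postgres here) to create an
    # index on the column when the table is created.
    startDate = models.DateField(db_index=True)
    automated = models.BooleanField(db_index=True)
    userCreated = models.BooleanField(db_index=True)
    subClass = models.CharField(max_length=32, db_index=True)
```

Note that for an existing table, Django 1.2 won't alter the schema for you; you would add the indexes with `CREATE INDEX` statements in Postgres yourself.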