The query works several times more slowly when comparing the BIT column with 0 instead of 1

advertisements

I use a view based on a complex query with 17 joins (both inner and left/right outer) and subqueries. All view rows are shown in about 5 seconds.

SELECT * FROM a_view;

One of the view columns has type BIT. And when I filter view rows comparing it with 1, a query works again about 5 seconds.

SELECT * FROM a_view WHERE c = 1;

But when I compare this BIT column with 0, a query works about 50 seconds (10 times slower).

SELECT * FROM a_view WHERE c = 0;

This query which returns the same result rows works as expected about 10 seconds:

SELECT * FROM a_view
EXCEPT
SELECT * FROM a_view WHERE c = 1;

So I wonder why comparing with 0 or 'FALSE' takes so much time? Any ideas, please.

Sorting on this BIT field is fast. Filtering by other columns is fast too.


Usually there is more than one way to execute a query involving joins. All modern RDBMSs search through different join plans looking for the best plan by estimating costs (CPU and disk access times) for each.

The problem is, each additional join in a query multiplies the number of possible plans by an increasing number, resulting in double-factorial (worse than exponential) growth in the number of plans to consider as the number of joins increases. For that reason, the DB has to limit the search somewhere, meaning suboptimal plans inevitably get chosen for queries involving many joins.

For reference, PostgreSQL stops evaluating all possible plans after 12 joins by default. SQL Server will have a similar limit for sure. 17 joins would take (2 * 13) * (2 * 14) * (2 * 15) * (2 * 16) * (2 * 17) times as long to evaluate -- it's more than enough to overwhelm any RDBMS that has ever existed, or ever will.

There is also the fact to consider that DBs estimate costs based on crude statistics, such as the number of distinct values in a column and/or a list of the 10 most common values in a column. This all adds up to the fact that, as the number of joins goes up, the likelihood of choosing the best (or even a reasonable) join strategy goes way down.

Why do you need to join 17 tables? Is there no way you can simplify your DB schema?