Compound OR SQL query (query # 1 with WITH) and (query # 2) with COUNT () excursions



Yes, you are correct. What I am looking for in pseudo code is a list that

(st_area(geom)>0.1) OR (COUNT(*) > 1)

and in words:

return a list, that only has only states that have an area greater than 0.1, but don't exclude that state if it is the only one in the country (usually and island country, which has plenty room for labels next to it). The excluded states are places like Slovenia that has 100 provinces, but a tiny land area (Great Britain is also and offender).

I have a table for the entire world listing all states and provinces (I call the table states, but it also can mean province).

StateName, ContryName, Pop, geometry

The table is on PostGreSQL 9.2 PostGIS 2.0

I need to remove small states (area too small) to label. But if it is an island (one country, one state) then I want to leave it in.

My Naive query is like this, but there is a syntax error:

SELECT,s.admin, st_area(geom)
FROM vector.states s
SELECT ss.admin
FROM vector.states ss
GROUP BY ss.admin
HAVING (COUNT(*) > 1) AND (st_area(ss.geom) > 0.01)
) a ON a.admin = s.admin
ORDER BY s.admin ASC;

this is the syntax error (and I expected this to happen).

ERROR:  column "ss.geom" must appear in the GROUP BY clause or be used in an aggregate function
LINE 7: HAVING (COUNT(*) > 1) AND (st_area(ss.geom) > 0.01)

Two problems:

  1. Just like the error message tells you, geom needs to be wrapped in an aggregate function if it is not listed in GROUP BY. You could just use min() ...
  2. You got your logic backwards. It needs to be COUNT(*) = 1 OR ..

But this can be solved more elegantly with an anti-semi-join using NOT EXISTS:

SELECT, s.admin, st_area(geom)
FROM   vector.states s
WHERE  st_area(s.geom) > 0.01           -- state big enough ...
   OR NOT EXISTS (                      -- ... or there are no other counties
         SELECT 1 FROM vector.states s2
         WHERE  s2.admin = s.admin
         AND    s2.pk_column <> s.pk_column  -- exclude self-join
ORDER BY s.admin;

Replace pk_column with your actual primary key column(s).