Select the rows where the last row of the associated table has a specific value


I have two tables:

User (id, name)
UserEvent (id, user_id, name, date)

How can I get all the users where the last (ordered by date) UserEvent.name has a value of 'played'?

I wrote an example on SQLFiddle with some specific data: http://sqlfiddle.com/#!9/b76e24. For this scenario I should get only 'Mery' from table User, because even though 'John' has associated events, the name of his last one is not 'played'.


This is probably fastest:

SELECT u.*
FROM   usr u  -- avoiding "User" as table name
JOIN   LATERAL (
   SELECT name
   FROM   userevent
   WHERE  user_id = u.id
   ORDER  BY date DESC NULLS LAST
   LIMIT  1
   ) ue ON ue.name = 'played';

LATERAL requires Postgres 9.3+:

  • What is the difference between LATERAL and a subquery in PostgreSQL?

Or you could use DISTINCT ON (faster for few rows per user):

SELECT u.*
FROM   usr u  -- avoiding "User" as table name
JOIN   (
   SELECT DISTINCT ON (user_id)
          user_id, name
   FROM   userevent
   ORDER  BY user_id, date DESC NULLS LAST
   ) ue ON ue.user_id = u.id
       AND ue.name = 'played';

Details for DISTINCT ON:

  • Select first row in each GROUP BY group?

SQL Fiddle with valid test case.
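
In case the fiddle is unavailable, here is a minimal test setup to try both queries above. The column types and sample rows are my assumptions (only the table and column names come from the question), so adapt as needed:

```sql
-- Hypothetical schema and data, for illustration only
CREATE TABLE usr (
   id   integer PRIMARY KEY,
   name text NOT NULL
);

CREATE TABLE userevent (
   id      integer PRIMARY KEY,
   user_id integer NOT NULL REFERENCES usr,
   name    text NOT NULL,
   date    timestamp               -- nullable on purpose, to exercise NULLS LAST
);

INSERT INTO usr (id, name) VALUES
   (1, 'John')
 , (2, 'Mery')
 , (3, 'Ann');                     -- no events at all

INSERT INTO userevent (id, user_id, name, date) VALUES
   (1, 1, 'played', '2015-01-01')
 , (2, 1, 'paused', '2015-01-02')  -- John's latest event is not 'played'
 , (3, 2, 'paused', '2015-01-01')
 , (4, 2, 'played', '2015-01-02')  -- Mery's latest dated event is 'played'
 , (5, 2, 'paused', NULL);         -- NULL date must not count as "latest"
```

With this data both queries return only 'Mery'. Drop NULLS LAST from the ORDER BY and Mery disappears, because the row with the NULL date then sorts first.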

If date is defined NOT NULL, you don't need NULLS LAST in the queries, nor in the index below.

  • PostgreSQL sort by datetime asc, null first?
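
To see why NULLS LAST matters, here is a small self-contained demo. The inline VALUES list and the names t and d are made up for illustration. Postgres sorts NULL as if it were larger than any other value, so it comes first in descending order unless told otherwise:

```sql
-- Default for DESC is NULLS FIRST: the NULL row comes out on top
SELECT d
FROM  (VALUES (timestamp '2015-01-01'), (timestamp '2015-01-02'), (NULL)) t(d)
ORDER  BY d DESC;

-- With NULLS LAST the latest real timestamp comes first, NULL at the end
SELECT d
FROM  (VALUES (timestamp '2015-01-01'), (timestamp '2015-01-02'), (NULL)) t(d)
ORDER  BY d DESC NULLS LAST;
```

Combined with LIMIT 1, the first form would pick the NULL row as the "last" event.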

Key to read performance for both queries, but especially the first, is a matching multicolumn index:

CREATE INDEX userevent_foo_idx ON userevent (user_id, date DESC NULLS LAST, name);

  • Optimize GROUP BY query to retrieve latest record per user
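
To check whether the planner actually uses the index, you can look at the query plan. A sketch, assuming the tables and the index from above exist. Note that on tiny test tables Postgres may well prefer a sequential scan, so test with realistic data volumes:

```sql
-- ANALYZE executes the query and reports actual times; BUFFERS adds I/O counts
EXPLAIN (ANALYZE, BUFFERS)
SELECT u.*
FROM   usr u
JOIN   LATERAL (
   SELECT name
   FROM   userevent
   WHERE  user_id = u.id
   ORDER  BY date DESC NULLS LAST
   LIMIT  1
   ) ue ON ue.name = 'played';
```

Ideally the plan shows an index scan on userevent_foo_idx inside the lateral subquery. Since name is included in the index, an index-only scan is possible as well, provided the table has been vacuumed recently enough.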

Aside: Never use reserved words like "User" as identifiers.