Query Nasty SQL: Is there a way to find the first and last lines in a group without a cursor?


I have data that looks like this:

For records with the same ClientId, I need to group consecutive rows (by CpId) where PlaceId is not null, then find the first and last row in each group so that I can retrieve the DateAdmitted value from the first row and the DateDischarged value from the last row. So, the above data needs to be organized like this and then filtered for the values I need:

Using the above example, I would want the following based on ClientId:

ClientId    FirstCpIdInSet    DateAdmitted    LastCpIdInSet    DateDischarged
1967        NULL              NULL            NULL             NULL
1983        45                1986-12-29      45               1987-10-09
1983        47                1990-10-01      49               2009-04-12
1983        52                2009-08-31      52               2009-11-30
1988        62                1997-12-15      65               2000-01-07

ClientId 1967 could be excluded from the result set, since it never has a row where PlaceId is not null. A couple of other things to note:

  • This is taken from a temp table that is created with CpId as the IDENTITY, and the table is populated with a strict ORDER BY, so CpId is sequential in the order needed.
  • For those rows that have PlaceId and are consecutive for a single ClientId, the DateAdmitted should equal the DateDischarged in the previous row.

I'd really like to be able to do this without a cursor, if possible, but after puzzling on it for two days I just can't figure it out. This is on SQL Server 2008 R2.

You don't say what you are basing first and last on. Let me assume it is CPID. You can do this with ranking functions:

select ClientID, PlaceId,
       max(CpID) as maxCPID,
       -- seqnumasc = 1 marks the first row of the group, seqnumdesc = 1 the last
       min(case when seqnumasc = 1 then DateAdmitted end) as DateAdmitted,
       max(case when seqnumdesc = 1 then DateDischarged end) as DateDischarged
from (select t.*,
             row_number() over (partition by clientID, placeID order by cpid) as seqnumasc,
             row_number() over (partition by clientID, placeID order by cpid desc) as seqnumdesc
      from t
     ) t
where placeID is not null
group by ClientID, placeID

This adds sequence numbers to identify the first and last rows in each group. However, why can't you just use min and max on DateAdmitted and DateDischarged?

Based on enhanced information . . .

The question now appears to be how to define the "sets" of records according to the following conditions:

  • Consecutive CPIDs
  • Same client, same company
  • Place not null

If so, the following will give you a "set id". This uses a trick for bringing together consecutive values, based on subtracting a sequential number from the CPID. This difference is a constant for consecutive values, providing a set id.

select clientid, setid,
       min(DateAdmitted) as DateAdmitted,
       max(DateDischarged) as DateDischarged,
       min(cpid) as minCPID,
       max(cpid) as maxCPID
from (select clientid, setid, cpid, DateAdmitted, DateDischarged,
             row_number() over (partition by clientid, setid order by cpid) as seqnum,
             count(*) over (partition by clientid, setid) as setsize
      from (select t.*,
                   (cpid - row_number() over (partition by clientid order by cpid)
                   ) as setid
            from t
            where PlaceID is not NULL
           ) t
    ) t
group by clientid, setid
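
To see why the subtraction produces a usable set id, it can help to work the arithmetic outside the database. The following is only a rough sketch in Python, not part of the query: it emulates cpid - row_number() per client on a handful of CpIds. The first and last CpId of each set are taken from the expected output in the question; the interior values (48, 63, 64) and the function name find_sets are assumptions made for illustration.

```python
from itertools import groupby

# (ClientId, CpId) pairs for rows whose PlaceId is not null.
# Set endpoints come from the question's expected output; the interior
# CpIds (48, 63, 64) are assumed for illustration.
rows = [
    (1983, 45), (1983, 47), (1983, 48), (1983, 49), (1983, 52),
    (1988, 62), (1988, 63), (1988, 64), (1988, 65),
]

def find_sets(rows):
    """Return (clientid, first_cpid, last_cpid) for each run of consecutive CpIds."""
    result = []
    for clientid, client_rows in groupby(sorted(rows), key=lambda r: r[0]):
        # Emulates: cpid - row_number() over (partition by clientid order by cpid)
        keyed = [(cpid - rn, cpid)
                 for rn, (_, cpid) in enumerate(client_rows, start=1)]
        # Within one client the difference never decreases, so equal
        # differences are always adjacent and form one set.
        for _setid, members in groupby(keyed, key=lambda k: k[0]):
            cpids = [cpid for _, cpid in members]
            result.append((clientid, min(cpids), max(cpids)))
    return result

for clientid, first_cpid, last_cpid in find_sets(rows):
    print(clientid, first_cpid, last_cpid)
```

For client 1983 the differences come out as 44, 45, 45, 45, 47, so CpIds 47 through 49 share one set id while 45 and 52 each stand alone.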