I have data that looks like this:

What I need to do is, for records having the same ClientId, group consecutive rows (using CpId) where PlaceId is not null, and find the first and last row in each group so that I can retrieve the DateAdmitted value from the first row and the DateDischarged value from the last row. So, the above data needs to be organized like this and then filtered for the values I need:

Using the above example, I would want the following based on ClientId:
ClientId  FirstCpIdInSet  DateAdmitted  LastCpIdInSet  DateDischarged
--------  --------------  ------------  -------------  --------------
1967      NULL            NULL          NULL           NULL
1983      45              1986-12-29    45             1987-10-09
1983      47              1990-10-01    49             2009-04-12
1983      52              2009-08-31    52             2009-11-30
1988      62              1997-12-15    65             2000-01-07
ClientId 1967 could be excluded from the result set, since it never has a row where PlaceId is not null. A couple of other things to note:
- This is taken from a temp table that is created with CpId as the IDENTITY, and the table is populated with a strict ORDER BY, so CpId is sequential in the order needed.
- For those rows that have PlaceId and are consecutive for a single ClientId, the DateAdmitted should equal the DateDischarged in the previous row.
I'd really like to be able to do this without a cursor, if possible, but after puzzling on it for two days I just can't figure it out. This is on SQL Server 2008 R2.
You don't say what you are basing first and last on. Let me assume it is CPID. You can do this with ranking functions:
    select ClientID, PlaceId,
           max(CpID) as maxCpId,
           -- seqnumasc = 1 marks the first row of the group, seqnumdesc = 1 the last
           min(case when seqnumasc = 1 then DateAdmitted end) as DateAdmitted,
           max(case when seqnumdesc = 1 then DateDischarged end) as DateDischarged
    from (select t.*,
                 row_number() over (partition by ClientID, PlaceId order by CpId) as seqnumasc,
                 row_number() over (partition by ClientID, PlaceId order by CpId desc) as seqnumdesc
          from t
         ) t
    where PlaceId is not null
    group by ClientID, PlaceId
This adds sequence numbers to determine the first and last rows in each group. However, why can't you just use min and max on DateAdmitted and DateDischarged?
Based on enhanced information . . .
Now the question appears to be to define the "sets" of records according to the following conditions:
- Consecutive CPIDs
- Same client, same company
- Place not null
If so, the following will give you a "set id". This uses a trick for bringing together consecutive values, based on subtracting a sequential number from the CPID. This difference is a constant for consecutive values, providing a set id.
    select clientid, setid,
           min(DateAdmitted) as DateAdmitted,
           max(DateDischarged) as DateDischarged,
           min(cpid) as minCPID,
           max(cpid) as maxCPID
    from (select clientid, setid, cpid, DateAdmitted, DateDischarged,
                 row_number() over (partition by clientid, setid order by cpid) as seqnum,
                 count(*) over (partition by clientid, setid) as setsize
          from (select t.*,
                       -- cpid minus its rank stays constant across a run of consecutive cpids
                       (cpid - row_number() over (partition by clientid order by cpid)
                       ) as setid
                from t
                where PlaceID is not NULL
               ) t
         ) t
    group by clientid, setid
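
To see why the subtraction yields a usable set id, here is a minimal sketch against made-up rows for client 1983. The table variable @t, the PlaceId values, the null rows at CpId 46 and 50, and the intermediate dates on rows 47 to 49 are all assumptions for illustration; only the first and last CpId and date of each set come from the expected output in the question.

```sql
-- Hypothetical sample rows (assumed, not the asker's actual data).
declare @t table (
    CpId           int  primary key,
    ClientId       int  not null,
    PlaceId        int  null,
    DateAdmitted   date null,
    DateDischarged date null
);

insert into @t (CpId, ClientId, PlaceId, DateAdmitted, DateDischarged) values
    (45, 1983, 10,   '1986-12-29', '1987-10-09'),
    (46, 1983, null, null,         null),
    (47, 1983, 11,   '1990-10-01', '1995-03-15'),
    (48, 1983, 12,   '1995-03-15', '2001-06-30'),
    (49, 1983, 13,   '2001-06-30', '2009-04-12'),
    (50, 1983, null, null,         null),
    (52, 1983, 14,   '2009-08-31', '2009-11-30');

-- Show the row number and the derived set id for every non-null PlaceId row.
select CpId,
       row_number() over (partition by ClientId order by CpId) as rn,
       CpId - row_number() over (partition by ClientId order by CpId) as setid
from @t
where PlaceId is not null
order by CpId;

-- CpId  rn  setid
-- 45    1   44
-- 47    2   45
-- 48    3   45
-- 49    4   45
-- 52    5   47
```

Each gap in CpId shifts the difference, so rows 47 to 49 share setid 45 while rows 45 and 52 each get their own. Grouping by clientid and setid then gives one row per set. Note that the setid value itself has no meaning beyond identifying the set within a client.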