Oracle SQL Self Join

advertisements

This is my first question on this forum, so be patient... My question is about the most efficient way to build a query on the following table:

create table OpenTrades(
    AccountNumber   number,
    SnapshotTime    date,
    Ticket      number,
    OpenTime    date,
    TradeType   varchar2(4),
    TradeSize   number,
    TradeItem   char(6),
    OpenPrice   number,
    TradeSL     number,
    TradeTP     number,
    TradeSwap   number,
    TradeProfit number
);

alter table OpenTrades add constraint OpenTrades_PK Primary Key (AccountNumber, SnapshotTime, Ticket) using index tablespace MyNNIdx;`

This table is populated every 15 minutes, and the timestamp of the insert is saved in SnapshotTime column. The sought query should group records by Week number (based on SnapshotTime), and filter the records so that it would return those with the latest (SnapshotTime) within the same week.

So far, I've tried the following:

select MyWeekNo(ot1.SnapshotTime), max(ot2.SnapshotTime) from OpenTrades ot1, OpenTrades ot2 where MyWeekNo(ot2.SnapshotTime)=MyWeekNo(ot1.SnapshotTime) group by MyWeekNo(ot1.SnapshotTime);.

( Note: MyWeekNo() is a slightly modified version of to_char(,'IW'). )

However, this takes way longer than expected (15 minutes for less than 1500 rows); I'm sure there's a much better way to write this, but so far, it has eluded me. Any idea?


You don't need to do a self-join at all. You'll get the same result with just:

select MyWeekNo(ot.SnapshotTime), max(ot.SnapshotTime)
from OpenTrades ot
group by MyWeekNo(ot.SnapshotTime);

You may find it useful to add a virtual column holding the calculated week number, but it won't have any performance impact for this query, and you wouldn't benefit from indexing either. With no filter you'd need to do a full table scan anyway, unless that calculated value and SnapshotTime were indexed, and that may not be desirable.

Your question suggests you might want other data from the latest rows in each grouped week, and you have multiple rows per snapshot. That might explain the self-join approach: you're initially just trying to find the latest snapshot date in each week and then plan to join back to find all rows with those snapshot dates? Though I may be misinterpreting that. If you are trying to do that though, you can use an analytic function to assign a ranking to each row based on the snapshot date and its week number, and then filter to only see those ranked first:

select WeekNo, SnapShotTime, AccountNumber -- and other columns
from (
  select ot.*, MyWeekNo(ot.SnapshotTime) as WeekNo,
    rank() over (partition by MyWeekNo(ot.SnapshotTime)
      order by ot.SnapshotTime desc) as Ranking
  from OpenTrades ot
)
where Ranking = 1
order by WeekNo, AccountNumber, Ticket;

SQL Fiddle demo.


If a 'snapshot' can contain dates spanning a period of, say, 10 minutes, you could use an alternative analytic approach to get all records in the latest load:

select WeekNo, SnapShotTime, AccountNumber, Ticket
from (
  select ot.*, MyWeekNo(ot.SnapshotTime) as WeekNo,
    max(SnapshotTime) over (partition by MyWeekNo(ot.SnapshotTime)) as LatestEnd
  from OpenTrades ot
)
where SnapshotTime > LatestEnd - interval '10' minute
and SnapshotTime <= LatestEnd
order by WeekNo, AccountNumber, Ticket;

Here LatestEnd from the inner query is the last SnapshotTime for the week, and the filter ignores anything that isn't between that time and ten minutes before it.

SQL Fiddle showing all five records from the most recent load selected, and those from the preceding load skipped.