I am working with historical price data of many different products.
I have a table
products that lists all products with their name, description etc., identifying them with an uuid. Then there is another table
history that stores every price change ever made for every product. The prices may (usually do) change quite a few times per day.
Now I want to calculate what each product's price was at a specific point in time, say 14th March 2015 at 12 noon. How can I do this in SQL?
I am able to do this for one product:
SELECT product_id, price, date FROM history WHERE product_id = 'aa6d9976-e9ae-4478-486e-097e86c1e5fe' AND (date-'2015-03-14 12:00:00+02') < interval '1 second' ORDER BY diff DESC LIMIT 1 -> aa6d9976-e9ae-4478-486e-097e86c1e5fe 109 2015-03-14 11:55:00+01
But I would like ALL products in one query in one set. My idea was to get all the products and LEFT JOIN that table with the history, selecting the appropriate price for each, but I fail at the latter:
SELECT products.product_id, name, price, date FROM products LEFT JOIN history ON products.product_id = history.product_id WHERE date "is the greatest value that is still somewhat smaller than" '2015-03-14 12:00:00+01'
How do you write properly what I tried to express in the quotes?
I use PostgreSQL (although I have mainly worked with MySQL before). The tables are about 15 thousand (products) and 50 million (history) rows respectively.
Some sample data if you like some:
PRODUCTS product_id name aa6d9976-e9ae-4478-486e-097e86c1e5fe One 8da97d50-540e-4fdb-d032-7f443a9869a0 Two b51654ea-6190-4ed2-5e23-7075ffd3b472 Three HISTORY id product_id price date 1 aa6d9976-e9ae-4478-486e-097e86c1e5fe 100 2015-03-14 09:30:00+01 2 aa6d9976-e9ae-4478-486e-097e86c1e5fe 110 2015-03-14 10:48:00+01 3 b51654ea-6190-4ed2-5e23-7075ffd3b472 9 2015-03-14 11:01:00+01 4 8da97d50-540e-4fdb-d032-7f443a9869a0 49 2015-03-14 11:27:00+01 5 aa6d9976-e9ae-4478-486e-097e86c1e5fe 109 2015-03-14 11:55:00+01 6 b51654ea-6190-4ed2-5e23-7075ffd3b472 8 2015-03-14 13:59:00+01 7 aa6d9976-e9ae-4478-486e-097e86c1e5fe 110 2015-03-14 16:10:00+01 8 8da97d50-540e-4fdb-d032-7f443a9869a0 48 2015-03-14 19:34:00+01 9 8da97d50-540e-4fdb-d032-7f443a9869a0 49 2015-03-14 23:30:00+01 10 aa6d9976-e9ae-4478-486e-097e86c1e5fe 103 2015-03-14 23:33:00+01 DESIRED OUTPUT id name price date aa6d9976-e9ae-4478-486e-097e86c1e5fe One 109 2015-03-14 11:55:00+01 8da97d50-540e-4fdb-d032-7f443a9869a0 Two 49 2015-03-14 11:27:00+01 b51654ea-6190-4ed2-5e23-7075ffd3b472 Three 9 2015-03-14 11:01:00+01
First you write a query to find the maximum date that is less than the date you are querying for each product. That would look something like this:
select product_id, MAX(date) date from history where date < '3/14/2015 12:00:00' group by product_id
Then you can join that sub-query with your
history tables to get the results you want:
select products.*, history.price, history.date from products left join ( select product_id, MAX(date) date from history where date < '3/14/2015 12:00:00' group by product_id ) PriceDates on products.product_id = PriceDates.product_id join history on PriceDates.product_id = history.product_id and PriceDates.date = history.date