How to calculate the monthly average in MySQL monthly data that stretches over more than a month?

advertisements

I have data that spans multiple months and I want to be able to take the average per day and separate it to the appropriate months. For example, say that one data point is 2/9/2010 - 3/8/2010 and the amount is 1500. Then, the query should return 1071.4 for February 2010 and 428.6 for March. I am hoping there is a MySQL statement that will be able to do the computations instead of my PHP logic. Thanks.

EDIT (added table definition): start (datetime), end (datetime), use

EDIT 2: Here is some dummy data



DROP TABLE IF EXISTS `dummy_data`;
CREATE TABLE `dummy_data` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `start_date` date NOT NULL,
  `end_date` date NOT NULL,
  `data` double(15,4) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=8 DEFAULT CHARSET=latin1;

-- ----------------------------
--  Records of `dummy_data`
-- ----------------------------
BEGIN;
INSERT INTO `dummy_data` VALUES ('1', '2010-01-01', '2010-02-02', '200.0000'), ('2', '2010-02-03', '2010-02-25', '250.0000'), ('3', '2010-02-26', '2010-03-08', '300.0000'), ('4', '2010-03-09', '2010-04-12', '210.0000'), ('5', '2010-04-13', '2010-05-10', '260.0000'), ('6', '2010-05-11', '2010-06-15', '310.0000'), ('7', '2010-06-16', '2010-07-20', '320.0000');
COMMIT;


This solution handles [start_date, end_date] spans of as small as one (1) day and as large as twelve (12) months, but is incorrect on a span of thirteen (13) or more months:

CREATE TABLE integers (i INT NOT NULL);

INSERT INTO integers VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9);

CREATE VIEW hundreds AS
   SELECT iii.i * 100 + ii.i * 10 + i AS i
     FROM integers i JOIN integers ii JOIN integers iii;

-- We do not have CTEs, so we create a view
CREATE VIEW spans AS
   SELECT id, start_date, DATEDIFF(end_date, start_date) + 1 AS ndays, data
     FROM dummy_data;

   SELECT spans.id,
          month_name,
          spans.data * COUNT(month_name)/spans.ndays AS month_amount
     FROM spans
LEFT JOIN (SELECT id,
                  MONTH(start_date + INTERVAL i DAY) AS month_num,
                  MONTHNAME(start_date + INTERVAL i DAY) AS month_name
             FROM spans
             JOIN hundreds WHERE i < ndays) daybyday
       ON spans.id = daybyday.id
 GROUP BY id, month_name
 ORDER BY id, month_num;

Output looks like this:

+----+------------+---------------+
| id | month_name | month_amount  |
+----+------------+---------------+
|  1 | January    |  187.87878788 |
|  1 | February   |   12.12121212 |
|  2 | February   |  250.00000000 |
|  3 | February   |   81.81818182 |
...

We use DATEDIFF to determine the number of days represented by a source record. Then, building off an integers table, we can enumerate the month of each day in a particular span. From there it's a matter of SQL aggregation by record id and month_name.