Optimizing Mysql Query For Group by with date functions -
i have report pulls information summary table , ideally pull 2 periods @ once, current period , previous period. table structured thusly:
report_table item_id int(11) amount decimal(8,2) day date
the primary key item_id, day. table holds 37k records 92 different items , 1200 different days. using mysql 5.1.
here select statement:
select r.day, sum(r.amount)/(count(distinct r.item_id)*count(r.day)) `current_avg_day`, sum(r2.amount)/(count(distinct r2.item_id)*count(r2.day)) `previous_avg_day` `client_location_item` `cla` inner join `client_location` `cl` inner join `report_item_day` `r` inner join `report_item_day` `r2` (r.item_id = cla.item_id) , (cla.location_id = cl.location_id) , (r.day between from_unixtime(1293840000) , from_unixtime(1296518399)) , (r2.day between from_unixtime(1291161600) , from_unixtime(1293839999)) , (cl.location_code = 'location') group month(r.day);
at present query takes 2.2 seconds in environment. explain plan is:
'1', 'simple', 'cl', 'all', 'primary', null, null, null, '33', 'using where; using temporary; using filesort' '1', 'simple', 'cla', 'ref', 'primary,location_id,location_id_idxfk', 'location_id', '4', 'cl.location_id', '1', 'using index' '1', 'simple', 'r', 'ref', 'primary', 'primary', '4', cla.asset_id', '211', 'using where' '1', 'simple', 'r2', 'all', null, null, null, null, '37602', 'using where; using join buffer'
if add index "day" column, instead of query running faster, runs in 2.4 seconds. explain plan query @ time is:
'1', 'simple', 'r2', 'range', 'report_day_day_idx', 'report_day_day_idx', '3', null, '1092', 'using where; using temporary; using filesort' '1', 'simple', 'r', 'range', 'primary,report_day_day_idx', 'report_day_day_idx', '3', null, '1180', 'using where; using join buffer' '1', 'simple', 'cla', 'eq_ref', 'primary,location_id,location_id_idxfk', 'primary', '4', 'r.asset_id', '1', 'using where' '1', 'simple', 'cl', 'eq_ref', 'primary', 'primary', '4', cla.location_id', '1', 'using where'
according mysql documentation efficient group execution when there index retrieve grouping columns. states functions can make use of indexes min() , max(). have ideas can further optimize query? or why, 'indexed' version runs more despite having fewer rows overall non-indexed version?
create table:
create table `report_item_day` ( `item_id` int(11) not null, `amount` decimal(8,2) default null, `day` date not null, primary key (`item_id`,`day`) ) engine=innodb default charset=latin1
of course other option have make 2 db calls, 1 each time period. if that, straight away query each drops 0.031s. still feel there should way optimize query achieve comparable results.
why selecting day when grouping on month? don't entirely output of query like. hate mysql allowing that!
i show 2 approaches query 2 periods in 1 go. first 1 union query. should 2-query approach does. return 2 rows, 1 each period.
select sum(r.amount) / (count(distinct r.item_id) * count(r.day) ) curr_avg report_item_day r join client_location_item cla using(item_id) join client_location cl using(location_id) cl.location_code = 'location' , r.day between from_unixtime(1293840000) , from_unixtime(1296518399) union select sum(r.amount) / (count(distinct r.item_id) * count(r.day) ) prev_avg report_item_day r join client_location_item cla using(item_id) join client_location cl using(location_id) cl.location_code = 'location' , r.day between from_unixtime(1291161600) , from_unixtime(1293839999)
the following approach potentially faster above, uglier , harder read.
select period ,sum(amount) / (count(distinct item_id) * count(day) ) avg_day (select case when r.day between from_unixtime(1293840000) , from_unixtime(1296518399) 'current' when r.day between from_unixtime(1291161600) , from_unixtime(1293839999) 'previous' end period ,r.amount ,r.item_id ,r.day report_item_day r join client_location_item cla using(item_id) join client_location cl using(location_id) cl.location_code = 'location' , ( r.day between from_unixtime(1293840000) , from_unixtime(1296518399) or r.day between from_unixtime(1291161600) , from_unixtime(1293839999) ) ) v group period;
note 1: didn't give ddl, can't test if syntax correct
note 2: consider creating calendar table, keyed date. add appropriate columns such month, week, financial_year etcetera, able support reporting doing. queries much easier write , understand.
Comments
Post a Comment