I'm working on a MySQL database that stores "Facebook-like" images: every user follows a set of "artists", and these artists upload images.
I wrote a query that, for a given user, returns the images of all the artists he follows, that are more recent than 30 days, sorted by date:
SELECT img.id, img.url, a.name FROM (SELECT artist FROM user_follow_artist WHERE user = <USER_ID>) AS f -- need to work only on the followed artists JOIN artistimage AS img ON img.artist = f.artist -- join on the table that contains the images (8.000.000 rows!) JOIN artist AS a ON a.id = img.artist -- join on artist table to add details on image's owner -- following: where + orderby that slow down everything WHERE img.uploadDate > NOW() - INTERVAL 30 DAY ORDER BY img.uploadDate DESC LIMIT 50
The query tooks an average of 30 seconds, the more the user follows artists, the more it takes. Here are some details:
- As written, the artistimage table is 8 millions rows big
- If I remove the ORDER BY clause, the time drops to 2-3 seconds
- img.uploadDate is a TIMESTAMP and it's INDEXED
- The engine is InnoDB
I hoped indexing uploadDate could increase the speed, but nothing changed. How can I solve this problem?
EDIT: Table structures
artist ------------------ id (integer, primary) name (string) user_follow_artist ------------------ user (integer, foreign key on user.id, indexed) artist (integer, foreign key on artist.id, indexed) artistimage ------------------ id (integer, primary) artist (integer, foreign key on artist.id, indexed) url (string) uploadDate (timestamp, indexed)
id select_type table type possible_keys key key_len ref rows Extra 1 PRIMARY <derived2> ALL NULL NULL NULL NULL 327 Using temporary; Using filesort 1 PRIMARY img ref artist,uploadDate artist 9 img.artist 36 Using where 1 PRIMARY user eq_ref PRIMARY PRIMARY 8 db.img.artist 1 2 DERIVED user_follow_artist ref PRIMARY,user user 8 327 Using index
If I remove the ORDER BY, the EXPLAIN is the same but without
Using temporary; Using filesort in first row.
I think your "Limit 50" is confusing you. It's getting you the first 50 rows with no criteria, which will always return fast. When you add the where and order by clauses, it will require MySQL to fetch all rows and then sort and filter to get your results.
I think if you took out the WHERE, ORDER and LIMIT 50, you will see your query takes a long time to fetch all rows.
To speed it up, you can try re-writing your query.
You can make a temp table for your "(SELECT artist FROM user_follow_artist WHERE user = )" set.
You can move the WHERE clause to the JOIN like this:
JOIN artistimage AS img ON img.artist = f.artist AND img.uploadDate > NOW() - INTERVAL 30 DAY