Today I would like to share my experience of Query Optimization. I’ve heard of Query Optimization for a long time, but I’ve never realized how powerful it was until now. Following are some graphs showing how my system performed before and after I optimized my query. Not to mention that it cut my query time by 80%. Isn’t that amazing?

CPU usage before/after query optimization

IO rate before/after query optimization
My web service is using Hibernate in data access layer. At the beginning, I just wrote HQL (Hibernate Query Language) code to retrieve my objects based on business logic. And it worked. So I never thought about looking into the exact SQL query it generated. Until my query became slower and slower while my database grew bigger.
Here are some pointers I learned through the tuning process. (My frameworks: Hibernate + Spring Annotation)
1. Take a look at your actual queries
First make sure you set the show_sql property to true in your Hibernate configuration. Then you will see the Hibernate logs every SQL query in the stderrout.log file. Below is an example:
Hibernate: select photo0_.photo_id as col_0_0_, count(votes1_.vote_id) as col_1_0_, (select count(votes2_.vote_id) from votes votes2_ where photo0_.photo_id=votes2_.photo_id and votes2_.thumbs_up=0) as col_2_0_, (select count(comment3_.comment_id) from comments comment3_ where comment3_.photo_id=photo0_.photo_id) as col_3_0_, photo0_.photo_id as photo1_8_, photo0_.book_id as book2_8_, photo0_.inappropriate as inapprop3_8_, photo0_.thumbnail as thumbnail8_, photo0_.w_id as w5_8_ from photos photo0_ inner join votes votes1_ on photo0_.photo_id=votes1_.photo_id where (photo0_.inappropriate is null or photo0_.inappropriate=0) and votes1_.thumbs_up=1 and votes1_.time_stamp>? group by photo0_.photo_id order by count(votes1_.vote_id) desc limit ?
2. Use explain to analyze your query
Take the query logged by Hibernate and add explain keyword in front of the query. Run it against your database. Then you will get MySQL explains how it will process the query. What type of join? Which index will it use? etc.. Especially pay attention to the extra values from the result. Watch out “Using filesort” and “Using temporary“. Those indicators mean your query consumes higher CPU and memory. At least that was the reason slowed down my query significantly. And It turned out there was other way to achieve the same result without the heavy cost.
Further Reading ..
- http://dev.mysql.com/doc/refman/5.0/en/explain-output.html