There is a table messages that contains data as shown below:
Id  Name  Other_Columns
-------------------------
1   A     A_data_1
2   A     A_data_2
3   A     A_data_3
4   B     B_data_1
5   B     B_data_2
6   C     C_data_1
If I run a query select * from messages group by name, I will get the result as:
1   A     A_data_1
4   B     B_data_1
6   C     C_data_1
What query will return the following result?
3   A     A_data_3
5   B     B_data_2
6   C     C_data_1
That is, the last record in each group should be returned.
At present, this is the query that I use:
SELECT *
FROM (SELECT *
      FROM messages
      ORDER BY id DESC) AS x
GROUP BY name
But this looks highly inefficient. Any other ways to achieve the same result?
Best Answer
MySQL 8.0 now supports windowing functions, like almost all popular SQL implementations. With this standard syntax, we can write greatest-n-per-group queries:
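As a minimal sketch of such a query against the messages table from the question (the CTE name ranked_messages and the column alias rn are illustrative assumptions), ROW_NUMBER() can number the rows of each Name group from the highest Id down, so that the row numbered 1 is the last record in its group:

```sql
-- Sketch only: assumes MySQL 8.0+ and the messages table from the question.
-- ranked_messages and rn are illustrative names, not part of the schema.
WITH ranked_messages AS (
  SELECT m.*,
         ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Id DESC) AS rn
  FROM messages AS m
)
SELECT Id, Name, Other_Columns
FROM ranked_messages
WHERE rn = 1;  -- keep only the highest Id within each Name
```

For the sample data in the question, this should select the rows with Id 3, 5, and 6. Add an ORDER BY to the outer query if the output order matters.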
Below is the original answer I wrote for this question in 2009:
I write the solution this way:
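A minimal sketch of a LEFT JOIN formulation of this kind, written against the messages table from the question (the aliases m1 and m2 are illustrative). Each row is joined to any row in the same group that has a greater Id, and only the rows with no such match are kept:

```sql
-- Sketch only: m1 is the candidate row, m2 is any later row in the same group.
SELECT m1.*
FROM messages AS m1
LEFT JOIN messages AS m2
  ON m1.Name = m2.Name
 AND m1.Id < m2.Id
WHERE m2.Id IS NULL;  -- no later row exists, so m1 is the last in its group
```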
Regarding performance, either solution can be faster, depending on the nature of your data. So you should test both queries and use whichever performs better on your database.
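For comparison, a common alternative computes the greatest Id per group in a derived table and joins it back to the base table. A sketch under the same assumptions (the alias latest and the column max_id are hypothetical names, and Id is assumed to be unique):

```sql
-- Sketch only: find the highest Id per Name, then fetch the matching full rows.
SELECT m.*
FROM messages AS m
INNER JOIN (
  SELECT Name, MAX(Id) AS max_id
  FROM messages
  GROUP BY Name
) AS latest
  ON m.Id = latest.max_id;
```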
For example, I have a copy of the StackOverflow August data dump. I'll use that for benchmarking. There are 1,114,357 rows in the Posts table. This is running on MySQL 5.0.75 on my Macbook Pro 2.40GHz.
I'll write a query to find the most recent post for a given user ID (mine).
First using the technique shown by @Eric with the GROUP BY in a subquery:
Even the EXPLAIN analysis takes over 16 seconds:
Now produce the same query result using my technique with LEFT JOIN:
The EXPLAIN analysis shows that both tables are able to use their indexes:
Here's the DDL for my Posts table:
Note to commenters: If you want another benchmark with a different version of MySQL, a different dataset, or different table design, feel free to do it yourself. I have shown the technique above. Stack Overflow is here to show you how to do software development work, not to do all the work for you.
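A hedged sketch of how such a comparison might be started on the messages table from the question: add an index covering the grouping column and the Id, then ask MySQL for the execution plan. The index name idx_messages_name_id is a hypothetical choice, and the plan reported will depend on the MySQL version, the table design, and the data.

```sql
-- Sketch only: idx_messages_name_id is an illustrative index name.
CREATE INDEX idx_messages_name_id ON messages (Name, Id);

-- Show the execution plan for the LEFT JOIN form without running the query.
EXPLAIN
SELECT m1.*
FROM messages AS m1
LEFT JOIN messages AS m2
  ON m1.Name = m2.Name
 AND m1.Id < m2.Id
WHERE m2.Id IS NULL;
```

Running the same EXPLAIN on the other candidate queries makes it possible to compare which indexes each one can use before timing them.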