I'm new to Postgres, coming from MySQL and hoping that one of y'all would be able to help me out.
I have a table with three columns: name
, week
, and value
. This table has a record of the names, the week at which they recorded the height, and the value of their height.
Something like this:
Name | Week | Value
------+--------+-------
John | 1 | 9
Cassie| 2 | 5
Luke | 6 | 3
John | 8 | 14
Cassie| 5 | 7
Luke | 9 | 5
John | 2 | 10
Cassie| 4 | 4
Luke | 7 | 4
What I want is a list per user of the value at the minimum week and the max week. Something like this:
Name |minWeek | Value |maxWeek | value
------+--------+-------+--------+-------
John | 1 | 9 | 8 | 14
Cassie| 2 | 5 | 5 | 7
Luke | 6 | 3 | 9 | 5
In Postgres, I use this query:
select name, week, value
from table t
inner join(
select name, min(week) as minweek
from table
group by name)
ss on t.name = ss.name and t.week = ss.minweek
group by t.name
;
However, I receive an error:
column "w.week" must appear in the GROUP BY clause or be used in an aggregate function
Position: 20
This worked fine for me in MySQL so I'm wondering what I'm doing wrong here?
Best Solution
There are various simpler and faster ways.
2x
DISTINCT ON
Or shorter:
Simple and easy to understand. Also fastest in my old tests. Detailed explanation for
DISTINCT ON
:2x window function, 1x
DISTINCT ON
The explicit
WINDOW
clause only shortens the code, no effect on performance.first_value()
of composite typeThe aggregate functions
min()
ormax()
do not accept composite types as input. You would have to create custom aggregate functions (which is not that hard).But the window functions
first_value()
andlast_value()
do. Building on that we can devise simple solutions:Simple query
The output has all data, but the values for the last week are stuffed into an anonymous record (optionally cast to
text
). You may need decomposed values.Decomposed result with opportunistic use of table type
For that we need a well-known composite type. An adapted table definition would allow for the opportunistic use of the table type itself directly:
week
andvalue
come first, so now we can sort by the table type itself:Decomposed result from user-defined row type
That's probably not possible in most cases. Register a composite type with
CREATE TYPE
(permanent) or withCREATE TEMP TABLE
(for the duration of the session):Custom aggregate functions
first()
&last()
Create functions and aggregates once per database:
Then:
Probably the most elegant solution. Faster with the additional module
first_last_agg
providing a C implementation.Compare instructions in the Postgres Wiki.
Related:
db<>fiddle here (showing all)
Old sqlfiddle
Each of these queries was substantially faster than the currently accepted answer in a quick test on a table with 50k rows with
EXPLAIN ANALYZE
.There are more ways. Depending on data distribution, different query styles may be (much) faster, yet. See: