Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

POSTGRESQL : How to select the first row of each group?

With this query :

WITH responsesNew AS
(
  SELECT DISTINCT responses."studentId", notation, responses."givenHeart", 
  SUM(notation + responses."givenHeart") OVER (partition BY responses."studentId" 
  ORDER BY responses."createdAt") AS total, responses."createdAt",  
  FROM responses
)
SELECT responsesNew."studentId", notation, responsesNew."givenHeart", total, 
responsesNew."createdAt"
FROM responsesNew
WHERE total = 3
GROUP BY responsesNew."studentId", notation, responsesNew."givenHeart", total, 
responsesNew."createdAt"
ORDER BY responsesNew."studentId" ASC

I get this data table :

studentId | notation | givenHeart | total |      createdAt     |
----------+----------+------------+-------+--------------------+
 374      | 1        | 0          | 3     | 2017-02-13 12:43:03   
 374      | null     | 0          | 3     | 2017-02-15 22:22:17
 639      | 1        | 2          | 3     | 2017-04-03 17:21:30 
 790      | 1        | 0          | 3     | 2017-02-12 21:12:23
 ...

My goal is to keep only in my data table the early row of each group like shown below :

studentId | notation | givenHeart | total |      createdAt     |
----------+----------+------------+-------+--------------------+
 374      | 1        | 0          | 3     | 2017-02-13 12:43:03 
 639      | 1        | 2          | 3     | 2017-04-03 17:21:30 
 790      | 1        | 0          | 3     | 2017-02-12 21:12:23
 ...

How can I get there?

I've read many topics over here but nothing I've tried with DISTINCT, DISTINCT ON, subqueries in WHERE, LIMIT, etc have worked for me (surely due to my poor understanding). I've met errors related to window function, missing column in ORDER BY and a few others I can't remember.

like image 727
DFATPUNK Avatar asked Aug 30 '25 16:08

DFATPUNK


2 Answers

You can do this with distinct on. The query would look like this:

WITH responsesNew AS (
      SELECT DISTINCT r."studentId", notation, r."givenHeart", 
             SUM(notation + r."givenHeart") OVER (partition BY r."studentId" 
                                                  ORDER BY r."createdAt") AS total,
             r."createdAt" 
      FROM responses r
     )
SELECT DISTINCT ON (r."studentId") r."studentId", notation, r."givenHeart", total, 
r."createdAt"
FROM responsesNew r
WHERE total = 3
ORDER BY r."studentId" ASC, r."createdAt";

I'm pretty sure this can be simplified. I just don't understand the purpose of the CTE. Using SELECT DISTINCT in this way is very curious.

If you want a simplified query, ask another question with sample data, desired results, and explanation of what you are doing and include the query or a link to this question.

like image 159
Gordon Linoff Avatar answered Sep 02 '25 05:09

Gordon Linoff


use Row_number() window function to add a row number to each partition and then only show row 1.

no need to fully qualify names if only one table is involved. and use aliases when qualifying to simplify readability.

WITH responsesNew AS
(
  SELECT "studentId"
       , notation
       , "givenHeart"
       , SUM(notation + "givenHeart") OVER (partition BY "studentId" ORDER BY "createdAt") AS total
       , "createdAt"
       , Row_number() OVER ("studentId" ORDER BY "createdAt") As RNum
  FROM responses r
)
SELECT RN."studentId"
     , notation, RN."givenHeart"
     , total
     , RN."createdAt"
FROM responsesNew RN
WHERE total = 3
  AND RNum = 1
GROUP BY RN."studentId"
       , notation
       , RN."givenHeart", total
       , RN."createdAt"
ORDER BY RN."studentId" ASC
like image 22
xQbert Avatar answered Sep 02 '25 05:09

xQbert