PostgreSQL GROUP BY Clause
The GROUP BY clause in PostgreSQL is used to group rows that have the same values in one or more columns. It is commonly used with aggregate functions likeCOUNT, SUM, AVG, MAX, and MINto calculate summarized information.
1. Basic Syntax
SELECT column1, aggregate_function(column2)
FROM table_name
WHERE condition
GROUP BY column1;- column1 → the column(s) to group by - aggregate_function → SUM, AVG, COUNT, MAX, MIN, etc. - WHERE is optional to filter rows before grouping
2. Example: Count Rows in Groups
-- Count number of employees in each department
SELECT department, COUNT(*) AS total_employees
FROM employees
GROUP BY department;3. Using Multiple Columns in GROUP BY
You can group by more than one column to get finer-grained summaries.
-- Count employees by department and salary
SELECT department, salary, COUNT(*) AS total
FROM employees
GROUP BY department, salary;4. GROUP BY with Aggregate Functions
-- Average salary by department
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;
-- Maximum and minimum salary per department
SELECT department, MAX(salary) AS max_salary, MIN(salary) AS min_salary
FROM employees
GROUP BY department;5. Filtering Groups with HAVING
The HAVING clause is used to filter groups after aggregation, unlike WHEREwhich filters rows before grouping.
-- Departments with more than 5 employees
SELECT department, COUNT(*) AS total_employees
FROM employees
GROUP BY department
HAVING COUNT(*) > 5;6. GROUP BY with ORDER BY
You can sort grouped results using ORDER BY.
-- Departments by highest average salary
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
ORDER BY avg_salary DESC;7. GROUP BY with JOIN
You can group results after joining multiple tables.
-- Total bonuses per department
SELECT e.department, SUM(b.bonus) AS total_bonus
FROM employees e
JOIN bonuses b ON e.id = b.emp_id
GROUP BY e.department;8. GROUPING SETS, ROLLUP, and CUBE
PostgreSQL supports advanced grouping for multiple levels of aggregation:
-- Total salary by department and overall total
SELECT department, SUM(salary) AS total_salary
FROM employees
GROUP BY GROUPING SETS ((department), ());
-- ROLLUP example: subtotal and grand total
SELECT department, SUM(salary) AS total_salary
FROM employees
GROUP BY ROLLUP(department);
-- CUBE example: subtotal by department and another column
SELECT department, job_title, SUM(salary) AS total_salary
FROM employees
GROUP BY CUBE(department, job_title);These features are useful for reporting and analytics, providing subtotals and totals automatically.
9. Best Practices
- Always include in
GROUP BYall non-aggregated columns in the SELECT list. - Use
HAVINGto filter aggregated results, notWHERE. - For large datasets, create indexes on grouping columns to improve performance.
- Consider using
ROLLUPandCUBEfor advanced reporting needs.
Conclusion
The GROUP BY clause is essential for aggregating data in PostgreSQL. Combined with aggregate functions, HAVING, ORDER BY, and advanced grouping techniques like ROLLUP and CUBE, you can perform complex analytical queries efficiently. In the next tutorial, we will explore the HAVING clause in detail, which filters groups based on aggregate values.