SQL GROUP BY Clause

The GROUP BY clause in SQL is used to group rows that have the same values in one or more columns. It is often used with aggregate functions like COUNT, SUM,AVG, MAX, and MIN.

📌 Syntax

SELECT column1, aggregate_function(column2)
FROM table_name
WHERE condition
GROUP BY column1
HAVING aggregate_function(column2) condition;

📊 Example Table: users

id | name    | age | city
---+---------+-----+---------
1  | Alice   | 22  | New York
2  | Bob     | 30  | London
3  | Charlie | 18  | Sydney
4  | Diana   | 27  | Toronto
5  | Ethan   | 35  | Paris
6  | Frank   | 28  | London

🔍 Example 1: Count Users in Each City

Group by city and count the number of users:

SELECT city, COUNT(*) AS user_count FROM users GROUP BY city;

🔍 Example 2: Sum of Ages by City

Calculate total age for users in each city:

SELECT city, SUM(age) AS total_age
FROM users
GROUP BY city;

🔍 Example 3: Using HAVING with GROUP BY

Show only cities with more than 1 user:

SELECT city, COUNT(*) AS user_count
FROM users
GROUP BY city
HAVING COUNT(*) > 1;

🔍 Example 4: Multiple Columns Grouping

Group by city and age:

SELECT city, age, COUNT(*) AS user_count
FROM users
GROUP BY city, age;

⚡ Best Practices

  • Always include all non-aggregated columns in GROUP BY.
  • Use HAVING to filter aggregated results (WHERE cannot filter aggregates).
  • Index the columns used in GROUP BY for better performance.
  • Combine with ORDER BY to sort grouped results.

📝 Summary

The GROUP BY clause allows you to summarize and aggregate data efficiently. It is especially useful in reporting, analytics, and generating insights from large datasets. You can group by one or more columns, apply aggregate functions, and filter with HAVING.

🚀 Next Steps

Next, we will explore SQL JOINs, which allow combining data from multiple tables to perform relational queries.