SQL GROUP BY Clause
The GROUP BY clause in SQL is used to group rows that have the same values in one or more columns. It is often used with aggregate functions like COUNT, SUM,AVG, MAX, and MIN.
📌 Syntax
SELECT column1, aggregate_function(column2) FROM table_name WHERE condition GROUP BY column1 HAVING aggregate_function(column2) condition;
📊 Example Table: users
id | name | age | city ---+---------+-----+--------- 1 | Alice | 22 | New York 2 | Bob | 30 | London 3 | Charlie | 18 | Sydney 4 | Diana | 27 | Toronto 5 | Ethan | 35 | Paris 6 | Frank | 28 | London
🔍 Example 1: Count Users in Each City
Group by city and count the number of users:
SELECT city, COUNT(*) AS user_count FROM users GROUP BY city;
🔍 Example 2: Sum of Ages by City
Calculate total age for users in each city:
SELECT city, SUM(age) AS total_age FROM users GROUP BY city;
🔍 Example 3: Using HAVING with GROUP BY
Show only cities with more than 1 user:
SELECT city, COUNT(*) AS user_count FROM users GROUP BY city HAVING COUNT(*) > 1;
🔍 Example 4: Multiple Columns Grouping
Group by city and age:
SELECT city, age, COUNT(*) AS user_count FROM users GROUP BY city, age;
⚡ Best Practices
- Always include all non-aggregated columns in GROUP BY.
- Use HAVING to filter aggregated results (WHERE cannot filter aggregates).
- Index the columns used in GROUP BY for better performance.
- Combine with ORDER BY to sort grouped results.
📝 Summary
The GROUP BY clause allows you to summarize and aggregate data efficiently. It is especially useful in reporting, analytics, and generating insights from large datasets. You can group by one or more columns, apply aggregate functions, and filter with HAVING.
🚀 Next Steps
Next, we will explore SQL JOINs, which allow combining data from multiple tables to perform relational queries.