Microsoft SQL Server

GROUP BY vs DISTINCT in SQL Server

Yesterday, I was travelling in the metro where I overheard two college students puzzling over the difference between GROUP BY and DISTINCT.

GROUP BY and DISTINCT cause a lot of confusion and I couldn’t help myself so I jumped into their conversation and explained the differences.

Their problem was understanding where to use GROUP BY and where to use DISTINCT when both return same results in most scenarios when no aggregates are used, and which one had better performance.

GROUP BY and DISTINCT both generate the same query execution plan when no aggregates are used, so there is no appreciable difference in performance in this situation.

GROUP BY becomes important when aggregates are used. DISTINCT is just for filtering out duplicate records from query result sets.

Ideally, when aggregates are not used in a query, then one should use DISTINCT to enhance readability. But when aggregates are used GROUP BY performs much better.

Here are some examples:

DISTINCT

SELECT DISTINCT Name, RollNo
 FROM Student

GROUP BY

SELECT Name, RollNo
 FROM Student
 GROUP BY Name, RollNo

GROUP BY with Aggregate Function

SELECT Name, RollNo, Sum(Marks) TotalMarks
 FROM Student
 GROUP BY Name, RollNo

The post GROUP BY vs DISTINCT in SQL Server appeared first on OptimusBI.