Aggregating Correlated Sub-Queries in SQL Server 2008

While working on a query I was trying to calculate an aggregate function such as SUM() on a correlated subquery and received the following error:

Msg 130, Level 15, State 1, Line 24
Cannot perform an aggregate function on an
expression containing an aggregate or a subquery.

Here are a couple of solutions you can use to get around this error.

Let’s start by looking at a typical example of summing up a correlated SUM(). Suppose you have written the following SELECT:

[code lang=”sql”]select Company.Name
,Company.Region
, (select sum(Amount) from Orders where Orders.CompanyID = Company.CompanyID) as Total
from Company[/code]

Notice that we have used a correlated sub-query to return the total Order Amount per Company. Working from that, let’s say that you instead would like to the return the total Order Amount by Region. Typically, you would remove Company from the SELECT list, add GROUP BY Region to the end, and wrap what you’d like to total in a SUM() aggregate function:

[code lang=”sql”]select Company.Region
,sum(select sum(Amount) from Orders where Orders.CompanyID = Company.CompanyID) as Total
from Company
group by Company.Region[/code]

Unfortunately, that is not a valid SQL statement, since you are trying to directly SUM() a correlated sub-query (which also happens to have a SUM() in it already). This is not allowed in SQL Server; if you try to execute that, you’ll be greeted with the error mentioned above.

There are two ways to fix this

Option 1: Use a derived table

Perhaps the easiest solution is to simply wrap your original SELECT in a derived table, and then select from that derived table and do the grouping in the outer select:

[code lang=”sql”]select Region
,sum(Total) as Total
from
(select Company.Name, Company.Region, (select sum(Amount) from Orders
where Orders.CompanyID = Company.CompanyID) as Total from Company
) x
group by Region[/code]

This is logically equivalent to what we tried to do earlier, except it is now a valid SQL statement and it will return the correct results without an error. This is a very quick and easy way to solve this situation in general, though it is not always the optimal solution in terms of performance and readability.

Option 2: Rewrite your SELECT without a Correlated Subquery

Often the best solution is to rewrite your SELECT without using a correlated sub-query at all. In our example, we could write the original as:

[code lang=”sql”]select Company.Name
,Company.Region
,sum(Orders.Amount) as Total
from Company
left Join Orders on Orders.CompanyID = Company.CompanyID
group by Company.Name
,Company.Region[/code]

Once the SELECT is re-written, we can now easily adjust that to return only totals by Region simply by removing Company.Name from the SELECT and GROUP BY clauses:

[code lang=”sql”]select Company.Region, sum(Orders.Amount) as Total
from Company
left outer Orders on Orders.CompanyID = Company.CompanyID
group by Company.Region[/code]

This is usually a good way to solve the issue because the end result is very efficient and well-structured, so it is easier to understand and maintain.