A few weeks ago, I got the chance to put on my T-SQL Sherlock Holmes hat and solve a problem!
My customer base is over 300 business units (franchises). Each of these units has their own customers. The Customer Service department wanted a list of each unit’s top 10 customers. There are several ways to accomplish this, but I wanted to avoid cursors. I knew options were there in SQL Server 2008 just begging to rid my database servers of cursors once and for all.
Since I’ve been reading Itzik Ben-Gan’s “Microsoft SQL Server 2008 T-SQL Fundamentals”, I wanted to try using a new method, OVER, with PARTITION and RANK.
What is OVER? MSDN say, “Determines the partitioning and ordering of the rowset before the associated window function is applied.” (http://msdn.microsoft.com/en-us/library/ms189461.aspx)
Ben-Gan explains it as, “The OVER clause exposes a window of rows to certain kinds of calculations.”
OVER is particularly useful with aggregations. Say I have 10 customers, and I want to see the value of each order each customer has placed. I could write:
SELECT C.CustID, O.OrderID, O.OrderValue FROM #Order O INNER JOIN #Customer C ON C.CustID = O.CustID
Then, I want the total value of all orders each customer has placed. I could SUM the value.
SELECT C.CustID, SUM(O.OrderValue) AS SumOrders FROM #Order O INNER JOIN #Customer C ON C.CustID = O.CustID GROUP BY C.CustID
Lastly, I want the total value of all orders all customers have placed. I could find this with another SUM.
SELECT SUM(O.OrderValue) AS SumOrders FROM #Order O INNER JOIN #Customer C ON C.CustID = O.CustID
The difficulty with this is that I have to do three separate queries, and I can’t compare the results of each query directly. However, I can do this in one query, using OVER with a PARTITION clause.
Then, I can add in the RANK function to, rank the results in order. It’s one of four ranking functions available – the others are ROW_NUMBER, DENSE_RANK and NTILE. MSDN has information on them here, http://msdn.microsoft.com/en-us/library/ms189798.aspx, and Ben-Gan’s book has a good explanation with examples also.
Building the Query
I took this step by step. Why rush into OVER and RANK? We don’t build houses by putting the roof on first!
The steps I used, roughly, were:
- Select my information – the division, the customer, and the sales value of the order.
- Aggregate the information – SUM the sales value.
- Rank the customers by sales values, using RANK.
- Use a CTE (Common Table Expression) to select only the top customers.
This isn’t perfect, and I see how I could have cut out a couple of steps. Oh, hindsight! However, I hope my thought process will help you understand this.
First, here is a small sample of test data for you to use. (If I’d had Developer Edition when I did this a couple months ago, I would have used AdventureWorks. I did not.)
First, the customer information.
CREATE TABLE #Customer (CustID INT, DivID INT) INSERT INTO #Customer VALUES (100, 1) INSERT INTO #Customer VALUES (135, 1) INSERT INTO #Customer VALUES (159, 1) INSERT INTO #Customer VALUES (205, 2) INSERT INTO #Customer VALUES (222, 2) INSERT INTO #Customer VALUES (248, 2) INSERT INTO #Customer VALUES (312, 3) INSERT INTO #Customer VALUES (345, 3) INSERT INTO #Customer VALUES (389, 3)
Then, the order information.
CREATE TABLE #Order (CustID INT, OrderID INT, OrderValue MONEY) INSERT INTO #Order VALUES (100, 1, 100.00) INSERT INTO #Order VALUES (100, 2, 250.00) INSERT INTO #Order VALUES (100, 3, 150.00) INSERT INTO #Order VALUES (100, 4, 130.00) INSERT INTO #Order VALUES (135, 5, 110.00) INSERT INTO #Order VALUES (135, 6, 260.00) INSERT INTO #Order VALUES (135, 7, 160.00) INSERT INTO #Order VALUES (135, 8, 140.00) INSERT INTO #Order VALUES (159, 9, 100.00) INSERT INTO #Order VALUES (159, 10, 200.00) INSERT INTO #Order VALUES (159, 11, 120.00) INSERT INTO #Order VALUES (159, 12, 130.00) INSERT INTO #Order VALUES (205, 13, 100.00) INSERT INTO #Order VALUES (205, 14, 250.00) INSERT INTO #Order VALUES (205, 15, 150.00) INSERT INTO #Order VALUES (205, 16, 130.00) INSERT INTO #Order VALUES (222, 17, 110.00) INSERT INTO #Order VALUES (222, 18, 260.00) INSERT INTO #Order VALUES (222, 19, 160.00) INSERT INTO #Order VALUES (222, 20, 140.00) INSERT INTO #Order VALUES (248, 21, 100.00) INSERT INTO #Order VALUES (248, 22, 200.00) INSERT INTO #Order VALUES (248, 23, 120.00) INSERT INTO #Order VALUES (248, 24, 130.00) INSERT INTO #Order VALUES (312, 25, 100.00) INSERT INTO #Order VALUES (312, 26, 250.00) INSERT INTO #Order VALUES (312, 27, 150.00) INSERT INTO #Order VALUES (312, 28, 130.00) INSERT INTO #Order VALUES (345, 29, 110.00) INSERT INTO #Order VALUES (345, 30, 260.00) INSERT INTO #Order VALUES (345, 31, 160.00) INSERT INTO #Order VALUES (345, 32, 140.00) INSERT INTO #Order VALUES (389, 33, 100.00) INSERT INTO #Order VALUES (389, 34, 200.00) INSERT INTO #Order VALUES (389, 35, 120.00) INSERT INTO #Order VALUES (389, 36, 130.00)
While this example only has a small number of divisions and customers, I was working with over 300. So, my first step, as it usually is for testing and my sanity, was to work with the orders for one division.
SELECT C.DivID, C.CustID, O.OrderID, O.OrderValue FROM #Order O INNER JOIN #Customer C ON C.CustID = O.CustID WHERE C.DivID = 1
There are three customers. Each has placed four orders. You can see the order values here.
I then add in OVER and PARTITION. This will show me the value of each order, the sum of all orders for the customer, and the sum of all orders for the division.
SELECT C.DivID, C.CustID, O.OrderValue, SUM(O.OrderValue) OVER (PARTITION BY C.CustID ) AS CustSales, SUM(O.OrderValue) OVER (PARTITION BY C.DivID ) AS DivSales FROM #Order O INNER JOIN #Customer C ON C.CustID = O.CustID WHERE C.DivID = 1
This is a great way to see a simple example of what OVER can do. All of the orders for customer 100 are summed in the CustSales column, and all orders for division 1 are summed in the DivSales column. The same goes for customers 135 and 159.
Next, I will use RANK to rank the customers by total sales value of their orders.
SELECT C.DivID, C.CustID, SUM(O.OrderValue) AS SumOrderValue, RANK() OVER (PARTITION BY C.DivID ORDER BY SUM(O.OrderValue) DESC) AS Ranking FROM #Order O INNER JOIN #Customer C ON C.CustID = O.CustID WHERE C.DivID = 1 GROUP BY C.DivID, C.CustID
I can see that customer 135 had the largest total value of orders, followed by customers 100 and 159. Results:
I can easily expand this, now, to all of the divisions.
SELECT C.DivID, C.CustID, SUM(O.OrderValue) AS SumOrderValue, RANK() OVER (PARTITION BY C.DivID ORDER BY SUM(O.OrderValue) DESC) AS Ranking FROM #Order O INNER JOIN #Customer C ON C.CustID = O.CustID GROUP BY C.DivID, C.CustID
For each division, the order values have been summed per customer, and then ranked in order.
There was one final step in my task, and that was to only choose the top 10 customers from each division. I love a good CTE (Common Table Expression), so I chose to use that.
;WITH Ranking ( DivID, CustID, OrderValue, Ranking) AS ( SELECT C.DivID, C.CustID, SUM(O.OrderValue) AS SumOrderValue, RANK() OVER (PARTITION BY C.DivID ORDER BY SUM(O.OrderValue) DESC) AS Ranking FROM #Order O INNER JOIN #Customer C ON C.CustID = O.CustID GROUP BY C.DivID, C.CustID ) SELECT R.DivID, R.CustID, R.OrderValue FROM Ranking R WHERE R.Ranking = 1
And the results are in: the top customer for each division.

I’m glad I took the time to learn this very useful bit of T-SQL to solve this problem. I learned something new, and I hope you have too!
I owe a big, “Thanks!” to Denis Gobo (blog | Twitter) from LessThanDot.com for helping me out with this!









This is a great post– I really like the way you step through the process of developing the query, as well as explaining what you didn’t do.
Great post, Jes!
Great Explanation!
Pingback: Weekly Link Post 152 « Rhonda Tipton's WebLog
Jes, Love the detailed explanation!