Pages

Tuesday, April 3, 2012

Basics of SQL Ranking Functions

This is a subject that I didn't know much about until I had to really investigate and use a month ago. Ranking Functions, that is SQL commands to obtain results sets with a column containing some "order" or ranking related information.


This addition came with SQL 2005, and it comes handy in certain scenarios where, in the past, I used to get the result set and then manually (programatically) add the ranking information before displaying on the UI. Obviously, there was extra and boring work required to do this, which is not necessary today.

These are the ranking functions which I will try to clarify with examples:

  • ROW_NUMBER
  • RANKING
  • DENSE_RANKING
  • NTILE

But before going with the examples, I will create a simple table with some sample data: a basic Products table with Name, Category, and Price that looks like this:


Then I'll add some sample Products:

ROW_NUMBER
This function allows us to add a ranking column where the values will be based simply on the order of the row in the final result set. Obviously we need to use some sorting criteria, so let's say we want to order these products by Price:


Notice how in the case of equal prices, Row_Number just keeps enumerating, adding one to the Ranking column no matter what.

RANK
Rank is pretty similar to ROW_NUMBER except in case of a tie, the next spot in the ranking won't be a consecutive number:




Notice that Condensed Milk and Butter have the same price so they share the number 3 in the ranking. Then Pork chunks is in the 5th of the ranking, leaving spot 4 not assigned as there was a tie in the 3rd position.

DENSE_RANK
Similar to RANK but leaving no empty spots after the ties:


Notice how there are no empty spots in the ranking. Spot 4 comes right after 3, even though there were two elements tied in position number three.

NTILE
NTILE divides the ranking in groups. Let's say you are interested in knowing the 25% of products with lower prices. Well you can divide your ranking in four blocks assigning positions 1 to 4 to each block. By the same logic, If I am interested in the third that contains the least expensive products, I could use NTILE to divide the result set in three blocks:




This one is maybe of less use. But it is good for you to know that is there, that it exists. In practice I can see ROW_NUMBER and RANK being pretty handy.


No comments:

Post a Comment