DBA Hub

📋Steps in this guide1/10

Top-N Queries

Top-N queries provide a method for limiting the number of rows returned from ordered sets of data.

oracle miscconfigurationintermediate
by OracleDba
11 views
1

Setup

First we must create and populate a test table. The following query shows we have 20 rows with 10 distinct values.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
DROP TABLE rownum_order_test;

CREATE TABLE rownum_order_test (
  val  NUMBER
);

INSERT ALL
  INTO rownum_order_test
  INTO rownum_order_test
SELECT level
FROM   dual
CONNECT BY level <= 10;

COMMIT;

SELECT val
FROM   rownum_order_test
ORDER BY val;

       VAL
----------
         1
         1
         2
         2
         3
         3
         4
         4
         5
         5
         6

       VAL
----------
         6
         7
         7
         8
         8
         9
         9
        10
        10

20 rows selected.

SQL>
2

What not to do!

The following example shows a common trap people fall into when they don't understand the way the pseudocolumn and clause interact. Let's assume we wanted to return the top 5 values in the ID column. We might decide to order the data by descending ID and pick off the first five rows. That sounds correct, so we go ahead and issue the following query. That didn't do what we wanted! The problem is that the assignment is performed prior to the operation, resulting in potentially random data being returned.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
SELECT val
FROM   rownum_order_test
WHERE  rownum <= 5
ORDER BY val DESC;

       VAL
----------
         5
         4
         3
         2
         1

5 rows selected.

SQL>
3

Row Limiting Clause (12c onward)

If you are using Oracle 12c onward, Top-N queries just got a whole lot easier with the introduction of the Row Limiting Clause . If backwards compatibility is not your concern, switch to using this new clause. There is a quick example below, but the separate article explains all the ways of using this new syntax.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
SELECT val
FROM   rownum_order_test
ORDER BY val DESC
FETCH FIRST 5 ROWS ONLY;

       VAL
----------
        10
        10
         9
         9
         8

5 rows selected.

SQL>
4

Inline View and ROWNUM

The classic Top-N style query uses an ordered inline view to force the data into the correct order, then uses the check to limit the data returned. As the data is in the desired order before the check is performed, we get the result we wanted. Notice that we asked for 5 rows and we got five, even though there is a second row with the value "8". We can return the 5 smallest values by altering the clause to ascending. This method can also be used for paging through data, like paged web reports. This looks like it might not perform well, but we can see from the execution plan that Oracle can push the predicates down into the inline views to make them much more efficient, so this is the best way to page through data using SQL.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
SELECT val
FROM   (SELECT val
        FROM   rownum_order_test
        ORDER BY val DESC)
WHERE ROWNUM <= 5;

       VAL
----------
        10
        10
         9
         9
         8

5 rows selected.

SQL>

SELECT val
FROM   (SELECT val
        FROM   rownum_order_test
        ORDER BY val)
WHERE rownum <= 5;

       VAL
----------
         1
         1
         2
         2
         3

5 rows selected.

SQL>

SELECT val
FROM   (SELECT val, rownum AS rnum
        FROM   (SELECT val
                FROM   rownum_order_test
                ORDER BY val)
        WHERE rownum <= 8)
WHERE  rnum >= 4;

       VAL
----------
         2
         3
         3
         4
         4

5 rows selected.

SQL>

Execution Plan
----------------------------------------------------------
Plan hash value: 2927523340

----------------------------------------------------------------------------------------------
| Id  | Operation                | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                   |     8 |   208 |     4  (25)| 00:00:01 |
|*  1 |  VIEW                    |                   |     8 |   208 |     4  (25)| 00:00:01 |
|*  2 |   COUNT STOPKEY          |                   |       |       |            |          |
|   3 |    VIEW                  |                   |   100 |  1300 |     4  (25)| 00:00:01 |
|*  4 |     SORT ORDER BY STOPKEY|                   |   100 |  1300 |     4  (25)| 00:00:01 |
|   5 |      TABLE ACCESS FULL   | ROWNUM_ORDER_TEST |   100 |  1300 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RNUM">=4)
   2 - filter(ROWNUM<=8)
   4 - filter(ROWNUM<=8)
5

WITH Clause and ROWNUM

The previous example can be rewritten to use a clause in place of the inline view.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
WITH ordered_query AS
  (SELECT val
   FROM   rownum_order_test
   ORDER BY val DESC)
SELECT val
FROM   ordered_query
WHERE  rownum <= 5;

       VAL
----------
        10
        10
         9
         9
         8

5 rows selected.

SQL>
6

RANK

The analytic function assigns a sequential rank for each distinct value in the specified window. At first glance this looks like there may be a problem, but displaying the rank information shows us what is happening. From this we can see that duplicate rows are assigned the same rank, followed by a skip in the sequence to keep the rank consistent. Similar to Olympic medal places. This means the function doesn't give us the "top N rows" or the "top N distinct values". The number of rows returned is dependent on the number of duplicates in the data. There is an article on the analytic function here .

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
SELECT val
FROM   (SELECT val,
               RANK() OVER (ORDER BY val DESC) AS val_rank 
        FROM   rownum_order_test)
WHERE  val_rank <= 5;

       VAL
----------
        10
        10
         9
         9
         8
         8

6 rows selected.

SQL>

SELECT val, val_rank
FROM   (SELECT val,
               RANK() OVER (ORDER BY val DESC) AS val_rank 
        FROM   rownum_order_test)
WHERE  val_rank <= 5;

       VAL   VAL_RANK
---------- ----------
        10          1
        10          1
         9          3
         9          3
         8          5
         8          5

6 rows selected.

SQL>
7

DENSE_RANK

The analytic function is similar to the analytic function in that it assigns a sequential rank for each distinct value in the specified window. The difference being the ranks are compacted, so there are no gaps. Displaying the rank information shows us what is happening. Once again, duplicate values are assigned the same rank, but there is no gap in the rank sequence. As a result always gives us a "top N distinct values" result. There is an article on the analytic function here .

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
SELECT val
FROM   (SELECT val,
               DENSE_RANK() OVER (ORDER BY val DESC) AS val_rank 
        FROM   rownum_order_test)
WHERE  val_rank <= 5;

       VAL
----------
        10
        10
         9
         9
         8
         8
         7
         7
         6
         6

10 rows selected.

SQL>

SELECT val, val_rank
FROM   (SELECT val,
               DENSE_RANK() OVER (ORDER BY val DESC) AS val_rank 
        FROM   rownum_order_test)
WHERE  val_rank <= 5;

       VAL   VAL_RANK
---------- ----------
        10          1
        10          1
         9          2
         9          2
         8          3
         8          3
         7          4
         7          4
         6          5
         6          5

10 rows selected.

SQL>
8

ROW_NUMBER

The analytic function is similar to the pseudocolumn in that it assigns a unique number for each row returned, but like all analytic functions its action can be limited to a specific window of data in the result set and based on the order of data in that window. In this simple example using a window of the whole result set it functions the same as the psuedocolumn. There is an article on the analytic function here .

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
SELECT val
FROM   (SELECT val,
               ROW_NUMBER() OVER (ORDER BY val DESC) AS val_row_number 
        FROM   rownum_order_test)
WHERE  val_row_number <= 5;

       VAL
----------
        10
        10
         9
         9
         8

5 rows selected.

SQL>
9

PERCENT_RANK

The analytic function assigns value between 0-1 which represents the position of the current row relative to the set as a percentage. The following example displays the top 80% of the rows based on the value. There is an article on the analytic function here .

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
SELECT val
FROM   (SELECT val,
               PERCENT_RANK() OVER (ORDER BY val) AS val_percent_rank 
        FROM   rownum_order_test)
WHERE  val_percent_rank >= 0.8;

       VAL
----------
         9
         9
        10
        10

4 rows selected.

SQL>
10

NTILE

The analytic function divides the rows into the specified number of buckets, each with approximately equal numbers of rows. The following example divides the results into three buckets, and shows the rows from the third bucket, or the top third of the results. There is an article on the analytic function here . For more information see: Hope this helps. Regards Tim...

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
SELECT val
FROM   (SELECT val,
               NTILE(3) OVER (ORDER BY val) AS val_ntile 
        FROM   rownum_order_test)
WHERE  val_ntile = 3;

       VAL
----------
         8
         8
         9
         9
        10
        10

6 rows selected.

SQL>

Comments (0)

Please to add comments

No comments yet. Be the first to comment!