DBA Hub

📋Steps in this guide1/4

Online Statistics Gathering for Bulk Loads in Oracle Database 12c Release 1 (12.1)

See how Oracle 12c can automatically gather statistics for some bulk operations.

oracle 12cconfigurationintermediate
by OracleDba
14 views
1

CREATE TABLE ... AS SELECT (CTAS)

Oracle is able to gather statistics during a (CTAS) operation for most regular heap organized tables, as shown below. We use a or CTAS to create a table with 1000 rows. This is dummy data, but it could have been based on a select from an external table. Checking the column in the view shows us some stats have been gathered. There are other table-level stats, but for simplicity we'll use the column to show if statistics have been gathered.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
CONN test/test@pdb1

DROP TABLE tab1 PURGE;

CREATE TABLE tab1 AS
SELECT level AS id,
       'Description of ' || level AS description
FROM   dual
CONNECT BY level <= 1000;

COLUMN table_name FORMAT A20

SELECT table_name,
       num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1
1000
SQL>
2

INSERT INTO ... SELECT

In a similar fashion, operations on tables with no segments using direct path can also benefit from online statistics gathering. These operations are performed using direct path if they are running in parallel, or if the hint is specified. We truncate the table to remove all the segments, then we do a direct path insert using the hint. This time we load 500 rows. A conventional path insert, without a or hint, will not result in the gathering of statistics. The following is a repeat of the first example but for 700 rows instead of 500, and it doesn't have the hint. The column still shows 500 rows, from the previous test. The table must have no segments (missing segment due to delayed segment creation) for this to work. A regular delete will not leave the table in the correct state to allow the online gathering of statistics to kick in. In this example, rather than truncating the table we delete the rows. The table is empty, but there are existing segments from the previous load. We do a direct path insert of 700 rows. Notice the APPEND hint is back. Once again the stats were not gathered during the insert. With direct path inserts the stats are only gathered for tables with no segments. The hint can be used to explicitly prevent the online gathering of statistics. In this example we truncate the table and do a direct path insert of 700 rows, but we include the hint. As expected, the hint prevented the stats from being gathered during the insert, even though the rest of the conditions were correct. We still see the 500 row value from a previous load.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
TRUNCATE TABLE tab1;

INSERT /*+ APPEND */ INTO tab1
SELECT level AS id,
       'Description of ' || level AS description
FROM   dual
CONNECT BY level <= 500;
COMMIT;

COLUMN table_name FORMAT A20

SELECT table_name,
       num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1
500
SQL>

TRUNCATE TABLE tab1;

INSERT INTO tab1
SELECT level AS id,
       'Description of ' || level AS description
FROM   dual
CONNECT BY level <= 700;
COMMIT;

COLUMN table_name FORMAT A20

SELECT table_name,
       num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1
500
SQL>

DELETE FROM tab1;
COMMIT;

INSERT /*+ APPEND */ INTO tab1
SELECT level AS id,
       'Description of ' || level AS description
FROM   dual
CONNECT BY level <= 700;
COMMIT;

COLUMN table_name FORMAT A20

SELECT table_name,
       num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1
500
SQL>

TRUNCATE TABLE tab1;

INSERT /*+ APPEND */ INTO tab1
SELECT
/*+ NO_GATHER_OPTIMIZER_STATISTICS */
level AS id,
       'Description of ' || level AS description
FROM   dual
CONNECT BY level <= 700;
COMMIT;

COLUMN table_name FORMAT A20

SELECT table_name,
       num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1
500
SQL>
3

Partitioned Tables

A bulk load into an empty partitioned table will trigger the gathering of global statistics, but no partition-level statistics. We drop and recreate the test table as a partitioned table with 2 partitions. We do a direct path insert of 100 rows into the new table. All the data will go into the 2016 partition. We see the view has the correct global value of 100 rows, but there are no rows recorded at partition-level in the view. A bulk load that explicitly references an empty partition will result in partition-level statistics, but no global statistics. We delete the table stats, truncate the table, and do a direct path insert into the table, explicitly naming the table partition. This time the view doesn't contain any global stats, but the partition stats are present in the view.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
DROP TABLE tab1 PURGE;

CREATE TABLE tab1
(id            NUMBER,
 created_date  DATE)
PARTITION BY RANGE (created_date)
(PARTITION tab1_2015 VALUES LESS THAN (TO_DATE('01/01/2016', 'DD/MM/YYYY')) TABLESPACE users,
 PARTITION tab1_2016 VALUES LESS THAN (TO_DATE('01/01/2017', 'DD/MM/YYYY')) TABLESPACE users);

INSERT /*+ APPEND */ INTO tab1
SELECT level,
       TO_DATE('01/01/2016', 'DD/MM/YYYY')
FROM   dual
CONNECT BY level <= 100;
COMMIT;

COLUMN table_name FORMAT A20

SELECT table_name,
       num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1                        100

SQL>


COLUMN table_name FORMAT A20
COLUMN partition_name FORMAT A20

SELECT table_name,
       partition_name,
       num_rows
FROM   user_tab_partitions
WHERE  table_name = 'TAB1'
ORDER BY partition_name;

TABLE_NAME           PARTITION_NAME         NUM_ROWS
-------------------- -------------------- ----------
TAB1                 TAB1_2015
TAB1                 TAB1_2016

SQL>

EXEC DBMS_STATS.delete_table_stats(USER, 'TAB1');
TRUNCATE TABLE tab1;

INSERT /*+ APPEND */ INTO tab1 PARTITION (TAB1_2016)
SELECT level,
       TO_DATE('01/01/2016', 'DD/MM/YYYY')
FROM   dual
CONNECT BY level <= 100;
COMMIT;

COLUMN table_name FORMAT A20

SELECT table_name,
       num_rows
FROM   user_tables
WHERE  table_name = 'TAB1';

TABLE_NAME             NUM_ROWS
-------------------- ----------
TAB1

SQL>


COLUMN table_name FORMAT A20
COLUMN partition_name FORMAT A20

SELECT table_name,
       partition_name,
       num_rows
FROM   user_tab_partitions
WHERE  table_name = 'TAB1'
ORDER BY partition_name;

TABLE_NAME           PARTITION_NAME         NUM_ROWS
-------------------- -------------------- ----------
TAB1                 TAB1_2015
TAB1                 TAB1_2016                   100

SQL>
4

Restrictions

There are a number of restrictions associated with online statistics gathering. They are not gathered for: - Index statistics or histograms. If those are required the must be gathered in a separate operation. The following call will gather missing statistics, but will not re-gather table or column statistics unless the existing statistics are already stale. Index statistics or histograms. If those are required the must be gathered in a separate operation. The following call will gather missing statistics, but will not re-gather table or column statistics unless the existing statistics are already stale. - Non-empty segments, as described above. - Tables in built-in schemas. Only those in user-defined schemas. - Nested, index-organized or external tables. - Global temporary tables using the clause. - Table with virtual columns. - Tables if the preference is set to FALSE for . - Tables with locked statistics. - Partitioned tables using incremental statistics, where the insert is not explicitly referencing a partition using the clause. - Tables loaded using multitable inserts. For more information see: Hope this helps. Regards Tim...

Code/Command (click line numbers to comment):

1
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'TAB1', options => 'GATHER AUTO');

Comments (0)

Please to add comments

No comments yet. Be the first to comment!