DBA Hub

📋Steps in this guide1/6

Indexing JSON Data in Oracle Database 12c Release 1 (12.1.0.2)

This article provides an overview of how to index JSON data in Oracle Database 12c Release 1 (12.1.0.2).

oracle 12cconfigurationintermediate
by OracleDba
14 views
1

Considerations

Before you launch into indexing your JSON documents, there are a few things to consider. - Many developers will have used document stores that are designed for extremely fast throughput. Adding indexes to any table will affect the performance of DML against it, as the indexes have to be maintained. You have to weigh up the overhead of indexing on DML performance against the improved query performance. - The majority of the JSON indexes are function-based indexes, which means the maintenance overhead is event higher than a regular B*Tree index. - If the JSON documents are very large, the overhead of indexing is likely to be even greater. - During this article you will see the optimizer is very picky about using JSON indexes. Slight changes in parameters can make the index unavailable for a specific query. You will definitely not be able to index every possible combination. - The same rules apply with respect to the maintenance of bitmap and full-text indexes. You will probably not want to consider these for highly volatile tables.
2

Setup

The examples in this article assume the following table as been created.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
DROP TABLE json_documents PURGE;

CREATE TABLE json_documents (
  id    RAW(16) NOT NULL,
  data  CLOB,
  CONSTRAINT json_documents_pk PRIMARY KEY (id),
  CONSTRAINT json_documents_json_chk CHECK (data IS JSON)
);

INSERT INTO json_documents (id, data)
VALUES (SYS_GUID(),
        '{
          "FirstName"      : "John",
          "LastName"       : "Doe",
          "Job"            : "Clerk",
          "Address"        : {
                              "Street"   : "99 My Street",
                              "City"     : "My City",
                              "Country"  : "UK",
                              "Postcode" : "A12 34B"
                             },
          "ContactDetails" : {
                              "Email"    : "[email protected]",
                              "Phone"    : "44 123 123456",
                              "Twitter"  : "@johndoe"
                             },
          "DateOfBirth"    : "01-JAN-1980",
          "Active"         : true
         }');

INSERT INTO json_documents (id, data)
VALUES (SYS_GUID(),
        '{
          "FirstName"      : "Jayne",
          "LastName"       : "Doe",
          "Job"            : "Manager",
          "Address"        : {
                              "Street"   : "100 My Street",
                              "City"     : "My City",
                              "Country"  : "UK",
                              "Postcode" : "A12 34B"
                             },
          "ContactDetails" : {
                              "Email"    : "[email protected]",
                              "Phone"    : ""
                             },
          "DateOfBirth"    : "01-JAN-1982",
          "Active"         : false
         }');

COMMIT;
3

Function-Based Indexes

The following example creates a function-based index on the email element of the JSON documents using the function. The execution plan shows the index is used by a suitable query. The same index is also used if we query the table using the dot notation. Notice that although we use the dot notation, the predicate is actually treated as a call to enable it to use the index. Normally we see dot notation references transformed into calls (as described here ), but in this case, the optimizer has performed a query transformation to a call to take advantage of the index. We can see a typical example of a query transformation of dot notation in the following example, which uses the dot notation to create the index. Notice how the index is used when querying using the dot notation, but not when using the direct call. Looking at the predicate information in the execution plan produced by the query using dot notation, we can see a reference to the function, which is what we would expect from a typical query transformation of dot notation. Since the index creation and the query both used dot notation, which was transformed to a call, the index was used as expected. The query using the direct call found no suitable function-based index using , so no index was used. If we rewrite the query to use a direct call to , the index is used as expected, but only if we match all parameters of the function call. The example below compares two queries using a direct call to . The first matches the index definition exactly and so the index is used. The second uses the default settings, which are missing the keyword, so the index is not used. This means your indexing choices have to be carefully considered and you should try to standardise your approach to using the JSON functionality in SQL to make sure you can take advantage of the indexes you create. The use of a relational view over the JSON data, with queries accessing the relation view, rather than the JSON directly, would force this standardisation, but with a loss of flexibility. In the previous examples, we've seen index creation using the function explicitly and the function implicitly via dot notation. It is also possible to create function-based indexes using and explicitly. With some restrictions , queries containing can use JSON indexes if predicates and filters against columns projected by match the index definition.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
CREATE INDEX json_docs_email_idx
  ON json_documents (JSON_VALUE(data, '$.ContactDetails.Email' RETURNING VARCHAR2 ERROR ON ERROR));

SET AUTOTRACE TRACE EXPLAIN

SELECT a.data.FirstName,
       a.data.LastName,
       a.data.ContactDetails.Email AS Email
FROM   json_documents a
WHERE  JSON_VALUE(data, '$.ContactDetails.Email' RETURNING VARCHAR2 ERROR ON ERROR) = '[email protected]';

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                     |     1 |  1499 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| JSON_DOCUMENTS      |     1 |  1499 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | JSON_DOCS_EMAIL_IDX |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(JSON_VALUE("DATA" FORMAT JSON , '$.ContactDetails.Email' RETURNING VARCHAR2(4000)
              ERROR ON ERROR)='[email protected]')

SET AUTOTRACE TRACE EXPLAIN

SELECT a.data.FirstName,
       a.data.LastName,
       a.data.ContactDetails.Email AS Email
FROM   json_documents a
WHERE  a.data.ContactDetails.Email = '[email protected]';

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                     |     1 |  1499 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| JSON_DOCUMENTS      |     1 |  1499 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | JSON_DOCS_EMAIL_IDX |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(JSON_VALUE("DATA" FORMAT JSON , '$.ContactDetails.Email' RETURNING VARCHAR2(4000)
              ERROR ON ERROR)='[email protected]')

DROP INDEX json_docs_email_idx;

CREATE INDEX json_docs_email_idx
  ON json_documents a (a.data.ContactDetails.Email);


SET AUTOTRACE TRACE EXPLAIN

SELECT a.data.FirstName,
       a.data.LastName,
       a.data.ContactDetails.Email AS Email
FROM   json_documents a
WHERE  a.data.ContactDetails.Email = '[email protected]';

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                     |     1 |  1499 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| JSON_DOCUMENTS      |     1 |  1499 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | JSON_DOCS_EMAIL_IDX |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(JSON_QUERY("DATA" FORMAT JSON , '$.ContactDetails.Email' RETURNING VARCHAR2(4000)
              ASIS  WITHOUT ARRAY WRAPPER NULL ON ERROR)='[email protected]')


SET AUTOTRACE TRACE EXPLAIN

SELECT a.data.FirstName,
       a.data.LastName,
       a.data.ContactDetails.Email AS Email
FROM   json_documents a
WHERE  JSON_VALUE(data, '$.ContactDetails.Email' RETURNING VARCHAR2 ERROR ON ERROR) = '[email protected]';

------------------------------------------------------------------------------------
| Id  | Operation         | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                |     1 |  1499 |     4   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| JSON_DOCUMENTS |     1 |  1499 |     4   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(JSON_VALUE("DATA" FORMAT JSON , '$.ContactDetails.Email'
              RETURNING VARCHAR2(4000) ERROR ON ERROR)='[email protected]')

SET AUTOTRACE TRACE EXPLAIN

SELECT a.data.FirstName,
       a.data.LastName,
       a.data.ContactDetails.Email AS Email
FROM   json_documents a
WHERE  JSON_QUERY("DATA" FORMAT JSON , '$.ContactDetails.Email' RETURNING VARCHAR2(4000)
              ASIS  WITHOUT ARRAY WRAPPER NULL ON ERROR) = '[email protected]';

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                     |     1 |  1499 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| JSON_DOCUMENTS      |     1 |  1499 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | JSON_DOCS_EMAIL_IDX |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(JSON_QUERY("DATA" FORMAT JSON , '$.ContactDetails.Email' RETURNING VARCHAR2(4000)
              ASIS  WITHOUT ARRAY WRAPPER NULL ON ERROR)='[email protected]')


SET AUTOTRACE TRACE EXPLAIN

SELECT a.data.FirstName,
       a.data.LastName,
       a.data.ContactDetails.Email AS Email
FROM   json_documents a
WHERE  JSON_QUERY("DATA" FORMAT JSON , '$.ContactDetails.Email') = '[email protected]';

------------------------------------------------------------------------------------
| Id  | Operation         | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                |     1 |  1499 |     4   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| JSON_DOCUMENTS |     1 |  1499 |     4   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(JSON_QUERY("DATA" FORMAT JSON , '$.ContactDetails.Email'
              RETURNING VARCHAR2(4000) WITHOUT ARRAY WRAPPER NULL ON
              ERROR)='[email protected]')

DROP INDEX json_docs_name_idx;
CREATE INDEX json_docs_name_idx ON json_documents (
  JSON_QUERY(data, '$.FirstName')
);

DROP INDEX json_docs_name_idx;
CREATE INDEX json_docs_name_idx ON json_documents (
  JSON_EXISTS(data, '$.FirstName')
);
4

Composite B-Tree Indexes

Composite indexes can be created by defining virtual columns, with a conventional index against those columns. Internally, this is still a function-based index, but the definition looks much simpler and it gives you the option of querying the virtual columns directly. Alternatively, you can define the composite function-based index directly.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
-- Create the virtual columns and index.
ALTER TABLE json_documents ADD (first_name VARCHAR2(50)
  GENERATED ALWAYS AS (JSON_VALUE(data, '$.FirstName' RETURNING VARCHAR2(50))));

ALTER TABLE json_documents ADD (last_name VARCHAR2(50)
  GENERATED ALWAYS AS (JSON_VALUE(data, '$.LastName' RETURNING VARCHAR2(50))));

CREATE INDEX json_docs_name_idx ON json_documents (first_name, last_name);


-- Test a query against the virtual columns,
SET AUTOTRACE TRACE EXPLAIN

SELECT COUNT(*)
FROM   json_documents
WHERE  first_name = 'John'
AND    last_name  = 'Doe';

----------------------------------------------------------------------------------------
| Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                    |     1 |    54 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |                    |     1 |    54 |            |          |
|*  2 |   INDEX RANGE SCAN| JSON_DOCS_NAME_IDX |     1 |    54 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("FIRST_NAME"='John' AND "LAST_NAME"='Doe')


-- Test a query using direct JSON_VALUE calls.
SET AUTOTRACE TRACE EXPLAIN

SELECT COUNT(*)
FROM   json_documents
WHERE  JSON_VALUE(data, '$.FirstName' RETURNING VARCHAR2(50)) = 'John'
AND    JSON_VALUE(data, '$.LastName' RETURNING VARCHAR2(50)) = 'Doe';

----------------------------------------------------------------------------------------
| Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                    |     1 |    54 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |                    |     1 |    54 |            |          |
|*  2 |   INDEX RANGE SCAN| JSON_DOCS_NAME_IDX |     1 |    54 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("JSON_DOCUMENTS"."FIRST_NAME"='John' AND
              "JSON_DOCUMENTS"."LAST_NAME"='Doe')

-- Remove the previous index and virtual columns.
DROP INDEX json_docs_name_idx;
ALTER TABLE json_documents DROP COLUMN first_name;
ALTER TABLE json_documents DROP COLUMN last_name;

-- Create the composite index directly.
CREATE INDEX json_docs_name_idx ON json_documents (
  JSON_VALUE(data, '$.FirstName' RETURNING VARCHAR2(50)),
  JSON_VALUE(data, '$.LastName' RETURNING VARCHAR2(50))
);


-- Test a query using direct JSON_VALUE calls.
SET AUTOTRACE TRACE EXPLAIN

SELECT COUNT(*)
FROM   json_documents
WHERE  JSON_VALUE(data, '$.FirstName' RETURNING VARCHAR2(50)) = 'John'
AND    JSON_VALUE(data, '$.LastName' RETURNING VARCHAR2(50)) = 'Doe';

----------------------------------------------------------------------------------------
| Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                    |     1 |    54 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |                    |     1 |    54 |            |          |
|*  2 |   INDEX RANGE SCAN| JSON_DOCS_NAME_IDX |     1 |    54 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(JSON_VALUE("DATA" FORMAT JSON , '$.FirstName' RETURNING
              VARCHAR2(50) NULL ON ERROR)='John' AND JSON_VALUE("DATA" FORMAT JSON ,
              '$.LastName' RETURNING VARCHAR2(50) NULL ON ERROR)='Doe')
5

Bitmap Indexes

Much of the discussion from the previous section applies equally to bitmap indexes against JSON data. The examples below show you that , and can all be used to define bitmap indexes. The normal rules apply here. If the data is highly volatile, bitmap indexes may not be an ideal choice.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
DROP INDEX json_docs_name_idx;
CREATE BITMAP INDEX json_docs_name_idx ON json_documents (
  JSON_VALUE(data, '$.FirstName')
);

DROP INDEX json_docs_name_idx;
CREATE BITMAP INDEX json_docs_name_idx ON json_documents (
  JSON_QUERY(data, '$.FirstName')
);

DROP INDEX json_docs_name_idx;
CREATE BITMAP INDEX json_docs_name_idx ON json_documents (
  JSON_EXISTS(data, '$.FirstName')
);
6

Full-Text Search (JSON Search Indexes)

A new simplified JSON search index syntax has been introduced in Oracle Database 12.2. There are no differences in the way the resulting JSON search index performs. A JSON search index is a type of full text index specifically for JSON data. The optimiser will only consider using the index if the database uses a character set of AL32UTF8 or WE8ISO8859P1, and only for JSON data in , , or columns, not and columns. Once a JSON search index has been created, the condition can be used in a query. JSON search indexes can also be used to support queries other than those containing conditions. For more information see: Hope this helps. Regards Tim...

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
-- 12.1 Syntax
CREATE INDEX json_docs_search_idx ON json_documents (data)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('section group CTXSYS.JSON_SECTION_GROUP SYNC (ON COMMIT)');

-- 12.2 Syntax
CREATE
SEARCH
INDEX json_docs_search_idx ON json_documents (data)
FOR JSON
;

EXEC DBMS_STATS.gather_table_stats(USER, 'JSON_DOCUMENTS');

SET AUTOTRACE TRACE EXPLAIN

SELECT COUNT(*)
FROM   json_documents
WHERE  JSON_TEXTCONTAINS(data, '$.ContactDetails.Email', '[email protected]');

-----------------------------------------------------------------------------------------
| Id  | Operation        | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                      |     1 |  2014 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE  |                      |     1 |  2014 |            |          |
|*  2 |   DOMAIN INDEX   | JSON_DOCS_SEARCH_IDX |     1 |  2014 |     4   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CTXSYS"."CONTAINS"("JSON_DOCUMENTS"."DATA",'{[email protected]}
               INPATH(/ContactDetails/Email)')>0)

SELECT COUNT(*)
FROM   json_documents
WHERE  JSON_EXISTS(data, '$.ContactDetails');

-----------------------------------------------------------------------------------------
| Id  | Operation        | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                      |     1 |  1494 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE  |                      |     1 |  1494 |            |          |
|*  2 |   DOMAIN INDEX   | JSON_DOCS_SEARCH_IDX |     1 |  1494 |     4   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CTXSYS"."CONTAINS"("JSON_DOCUMENTS"."DATA",'HASPATH(/ContactDetail
              s)')>0)


SELECT COUNT(*)
FROM   json_documents
WHERE  JSON_VALUE(data, '$.ContactDetails.Email') = '[email protected]';

-----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                      |     1 |  1494 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |                      |     1 |  1494 |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID| JSON_DOCUMENTS       |     1 |  1494 |     4   (0)| 00:00:01 |
|*  3 |    DOMAIN INDEX              | JSON_DOCS_SEARCH_IDX |       |       |     4   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(JSON_VALUE("DATA" FORMAT JSON , '$.ContactDetails.Email' RETURNING
              VARCHAR2(4000) NULL ON ERROR)='[email protected]')
   3 - access("CTXSYS"."CONTAINS"("JSON_DOCUMENTS"."DATA",'{[email protected]}
              INPATH(/ContactDetails/Email)')>0)

Comments (0)

Please to add comments

No comments yet. Be the first to comment!