DBA Hub

📋Steps in this guide1/5

Multivalue Function-Based Indexes for JSON_EXISTS in Oracle Database 21c

Multivalue function-based indexes allow us to index multiple scalar values in JSON documents stored in the JSON data type.

oracle 21cconfigurationintermediate
by OracleDba
12 views
1

Setup

Create and populate a test table with some JSON data. Each row represents a book, with an array of words. For each word there is an array of pages that word can be found on. Between each test we will flush the shared pool to make sure we get a hard parse each time. We use to display the execution plan when we run queries, but not clutter the screen with the output rows or the statistics.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
-- drop table t1 purge;

create table t1 (
  id         number generated always as identity,
  name       varchar2(20),
  json_data  json,
  constraint t1_pk primary key (id)
);


begin
  insert into t1 (name, json_data)
  values ('book 1', json('{"words":[
                             {"word":"apple","pages":[5, 10, 15, 20, 25, 30]},
                             {"word":"orange","pages":[10, 20, 30, 40, 50, 60]}
                          ]}'));

  insert into t1 (name, json_data)
  values ('book 2', json('{"words":[
                             {"word":"apple","pages":[1, 6, 11, 16, 21, 26]},
                             {"word":"orange","pages":[11, 21, 31, 41, 51, 61]}
                          ]}'));

  commit;
  dbms_stats.gather_table_stats(null, 't1');
end;
/

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain
2

Non-Composite Multivalue Indexes for Arrays

We use the following query to check for a row which has a reference to page 40. It's a bit of a silly query, since we are not checking what word is referenced on that page, but it serves our purpose for this test. There are no indexes present, so we get a full table scan to retrieve the data. We create an index on the pages array, but this doesn't use the keyword. As a result the index is not used and the query still requires a full table scan. We run the query again and it still uses a full table scan. We recreate the index using the keyword, so now it can index the array properly. This is a non-composite multivalue index, so we must use a data type conversion method. In this case we use the "number()" method. We run the same query again, but this time notice the operation. The index can still be used if it only represents part of our search term. In the following query we are looking for books with the word "apple" on the 25th page.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

select *
from   t1
where  json_exists(json_data, '$.words.pages?(@.number() == 40)');

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   183 |    36   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   183 |    36   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(JSON_EXISTS2("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON
              , '$.words.pages?(@.number() == 40)' FALSE ON ERROR)=1)

SQL>

drop index t1_mvi;

create index t1_mvi on t1 t
  (t.json_data.words.pages.number());

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

select *
from   t1
where  json_exists(json_data, '$.words.pages?(@.number() == 40)');

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   183 |    36   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   183 |    36   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(JSON_EXISTS2("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON
              , '$.words.pages?(@.number() == 40)' FALSE ON ERROR)=1)

SQL>

drop index t1_mvi;

create multivalue index t1_mvi on t1 t
  (t.json_data.words.pages.number());

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

select *
from   t1
where  json_exists(json_data, '$.words.pages?(@.number() == 40)');

Execution Plan
----------------------------------------------------------
Plan hash value: 1854116654

----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |     1 |   185 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     1 |   185 |     2   (0)| 00:00:01 |
|   2 |   HASH UNIQUE                       |        |     1 |   185 |            |          |
|*  3 |    INDEX RANGE SCAN (MULTI VALUE)   | T1_MVI |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access(JSON_QUERY("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.words.pages.number()' RETURNING NUMBER ASIS  WITHOUT ARRAY WRAPPER ERROR ON ERROR
              NULL ON EMPTY NULL ON MISMATCH MULTIVALUE)=40)

SQL>

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word == "apple" && @.pages.number() == 25)');

Execution Plan
----------------------------------------------------------
Plan hash value: 1854116654

----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |     1 |   185 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     1 |   185 |     2   (0)| 00:00:01 |
|   2 |   HASH UNIQUE                       |        |     1 |   185 |            |          |
|*  3 |    INDEX RANGE SCAN (MULTI VALUE)   | T1_MVI |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(JSON_EXISTS2("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.words[*]?(@.word == "apple" && @.pages.number() == 25)' FALSE ON ERROR)=1)
   3 - access(JSON_QUERY("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.words.pages.number()' RETURNING NUMBER ASIS  WITHOUT ARRAY WRAPPER ERROR ON ERROR
              NULL ON EMPTY NULL ON MISMATCH MULTIVALUE)=25)

SQL>
3

Data Type Conversion Methods

We mentioned previously that non-composite multivalue indexes must include a data type conversion method. The list of data type conversion methods are show below, but the full list of item methods and their descriptions can be found here . - binary() - boolean() - booleanOnly() - date() - dateWithTime() - double() - dsInterval() - float() - number() - numberOnly() - string() - stringOnly() - timestamp() - ymInterval() When deciding on the conversion methods to use, remember the "Only" methods can be quite restrictive and reduce the chances of the index being used. Our original multivalue index uses the "number()" method. This can be used equally well by the following queries, using the "number()" and "numberOnly()" methods, and even when no conversion method is used. We recreate the index using the "numberOnly()" method, which is more restrictive. Now only the query using the "numberOnly()" conversion method can use the index. The other two do a full table scan. Choose the data type conversion methods used in the index definition carefully, to make sure they are actually used by your application.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
drop index t1_mvi;

create multivalue index t1_mvi on t1 t
  (t.json_data.words.pages.number());

-- number()
select *
from   t1
where  json_exists(json_data, '$.words.pages?(@.number() == 40)');

-- numberOnly()
select *
from   t1
where  json_exists(json_data, '$.words.pages?(@.numberOnly() == 40)');

-- No data type conversion method.
select *
from   t1
where  json_exists(json_data, '$.words.pages?(@ == 40)');

drop index t1_mvi;

create multivalue index t1_mvi on t1 t
  (t.json_data.words.pages.numberOnly());

-- numberOnly()
select *
from   t1
where  json_exists(json_data, '$.words.pages?(@.numberOnly() == 40)');
4

Composite Multivalue Indexes Using JSON_TABLE

We can create more complex composite multivalue indexes using the syntax. I find it easier to plan these out using a query before building the index itself. In this example we want to index all the pages that reference a specific word, so we use to flatten out the nested data. We used the error handler because it is mandatory for the multivalue indexes using , so it makes sense to test the query using it. We can now create the multivalue index using this same JSON_TABLE call. In the following example we search for books with the word "apple" on page 25, and we can see the index is used. The index can also be used for a query referencing one of the values, provided it is on the leading edge of the index. For example searching for books containing the word apple, with "word" being the first element in the composite index. Notice we don't use the index if we don't reference the leading edge of the index. In this case we search for books with a reference to page 25, but do not reference to the "word" element. Notice the index is not used. Similar to normal composite indexes, we need to be careful when deciding the order of the indexed elements. The approach is very tolerant of data type conversion methods. All the following queries are able to use the index.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
set autotrace off

select id, name, word, page
from t1,
  (json_table(json_data, '$.words[*]'
     error on error null on empty null on mismatch
     columns (
       word varchar2(10) path '$.word',
       nested
         path '$.pages[*]'
         columns (page number(20) path '$'))));

        ID NAME                 WORD             PAGE
---------- -------------------- ---------- ----------
         1 book 1               apple               5
         1 book 1               apple              10
         1 book 1               apple              15
         1 book 1               apple              20
         1 book 1               apple              25
         1 book 1               apple              30
         1 book 1               orange             10
         1 book 1               orange             20
         1 book 1               orange             30
         1 book 1               orange             40
         1 book 1               orange             50
         1 book 1               orange             60
         2 book 2               apple               1
         2 book 2               apple               6
         2 book 2               apple              11
         2 book 2               apple              16
         2 book 2               apple              21
         2 book 2               apple              26
         2 book 2               orange             11
         2 book 2               orange             21
         2 book 2               orange             31
         2 book 2               orange             41
         2 book 2               orange             51
         2 book 2               orange             61

24 rows selected.

SQL>

drop index t1_mvi;

create multivalue index t1_mvi on t1
  (json_table(json_data, '$.words[*]'
     error on error null on empty null on mismatch
     columns (
       word varchar2(10) path '$.word',
       nested
         path '$.pages[*]'
         columns (page number(20) path '$'))));

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word == "apple" && @.pages == 25)');

Execution Plan
----------------------------------------------------------
Plan hash value: 1854116654

----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |     1 |   185 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     1 |   185 |     2   (0)| 00:00:01 |
|   2 |   HASH UNIQUE                       |        |     1 |   185 |            |          |
|*  3 |    INDEX RANGE SCAN (MULTI VALUE)   | T1_MVI |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access(JSON_QUERY("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.words[*].word' RETURNING VARCHAR2(10) ASIS  WITHOUT ARRAY WRAPPER ERROR ON ERROR
              NULL ON EMPTY NULL ON MISMATCH MULTIVALUE)='apple' AND JSON_QUERY("JSON_DATA" /*+
              LOB_BY_VALUE */  FORMAT OSON , '$.words[*].pages[*]' RETURNING NUMBER(20,0) ASIS
              WITHOUT ARRAY WRAPPER ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH MULTIVALUE)=25)

SQL>

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word == "apple")');

Execution Plan
----------------------------------------------------------
Plan hash value: 1854116654

----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |     1 |   185 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     1 |   185 |     2   (0)| 00:00:01 |
|   2 |   HASH UNIQUE                       |        |     1 |   185 |            |          |
|*  3 |    INDEX RANGE SCAN (MULTI VALUE)   | T1_MVI |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access(JSON_QUERY("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON ,
              '$.words[*].word' RETURNING VARCHAR2(10) ASIS  WITHOUT ARRAY WRAPPER ERROR ON ERROR
              NULL ON EMPTY NULL ON MISMATCH MULTIVALUE)='apple')

SQL>

conn sys/SysPassword1@//localhost:1521/pdb1 as sysdba
alter system flush shared_pool;
conn testuser1/testuser1@//localhost:1521/pdb1
set autotrace trace explain

select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.pages.numberOnly() == 25)');

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   183 |    36   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   183 |    36   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(JSON_EXISTS2("JSON_DATA" /*+ LOB_BY_VALUE */  FORMAT OSON
              , '$.words[*]?(@.pages.numberOnly() == 25)' FALSE ON ERROR)=1)

SQL>

-- No data type conversion methods.
select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word == "apple" && @.pages == 25)');

-- number()
select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word == "apple" && @.pages.number() == 25)');

-- numberOnly()
select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word == "apple" && @.pages.numberOnly() == 25)');

-- string()
select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word.string() == "apple" && @.pages == 25)');

-- stringOnly()
select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word.stringOnly() == "apple" && @.pages == 25)');

-- stringOnly() and numberOnly()
select *
from t1 t
where json_exists(json_data,'$.words[*]?(@.word.stringOnly() == "apple" && @.pages.numberOnly() == 25)');
5

Considerations

Here are some things to consider when using multivalue function-based indexes. - Every index creates additional overhead on DML performance. This is especially true of function-based indexes, so pick your indexes carefully. - For non-composite indexes, be careful in your choice of data type conversion methods. Make sure you are not reducing the usefulness of the index by being too restrictive. - Like any composite index, make a careful decision about the element/column order in the composite multivalue index. - Like any index, remember to check the impact on the performance of all statements that could be affected by the presence of the new index, not just the statement you are focusing on. For more information see: Hope this helps. Regards Tim...

Comments (0)

Please to add comments

No comments yet. Be the first to comment!