DBA Hub

📋Steps in this guide1/7

Loading JSON Data using External Tables in Oracle Database 23ai/26ai

In Oracle database 23ai/26ai loading JSON data using external tables got much simpler.

oracle 23configurationintermediate
by OracleDba
26 views
1

Directory Object

We create a directory object pointing to the "/tmp" directory on the database server. We make sure our test user can access this directory object.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
conn sys/SysPassword1@//localhost:1521/freepdb1 as sysdba

create or replace directory tmp_dir as '/tmp';
grant read, write on directory tmp_dir to testuser1;

conn testuser1/testuser1@//localhost:1521/freepdb1
2

Array of JSON Objects

We create a file called "fruit_array.json" containing an array of JSON documents. We create an external table pointing at the "fruit_array.json" file. Notice the type is "oracle_bigdata" and the access parameters are set to "com.oracle.bigdata.fileformat = jsondoc". This access type understands JSON documents, which means we don't need to figure out how to describe the contents of the file so they can be loaded accurately. We query the data from the "fruit_array.json" file and we see the data is loaded as one JSON document. We use to pretty print it. If we wanted to see each document in the array as a separate row, we just indicate the JSON Path of the array in our external table definition. Now when we query the data we see each array element presented as a separate row. Alternatively we may see the array wrapped up in a document as follows. As long as we set the JSON path to the array correctly, we can still unpack this array. We recreate the external table using the new JSON path. We query the external table and see the array elements as separate rows.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
cat > /tmp/fruit_array.json <<EOF
[
  {"fruit":"apple", "quantity": 20},
  {"fruit":"orange", "quantity": 10},
  {"fruit":"banana", "quantity": 15}
]
EOF

drop table if exists json_ext;

create table json_ext (
  data json
)
  organization external
    (type oracle_bigdata
     access parameters (com.oracle.bigdata.fileformat = jsondoc)
     location (tmp_dir:'fruit_array.json'))
  parallel
  reject limit unlimited;

select json_serialize(data pretty)
from   json_ext;

JSON_SERIALIZE(DATAPRETTY)
--------------------------------------------------------------------------------
[
  {
    "fruit" : "apple",
    "quantity" : 20
  },
  {
    "fruit" : "orange",
    "quantity" : 10
  },
  {
    "fruit" : "banana",
    "quantity" : 15
  }
]


1 row selected.

SQL>

drop table if exists json_ext;

create table json_ext (
  data json
)
  organization external
    (type oracle_bigdata
     access parameters (
                         com.oracle.bigdata.json.path = '$[*]'
                         com.oracle.bigdata.fileformat = jsondoc
                       )
     location (tmp_dir:'fruit_array.json'))
  parallel
  reject limit unlimited;

select json_serialize(data pretty)
from   json_ext;

JSON_SERIALIZE(DATAPRETTY)
--------------------------------------------------------------------------------
{
  "fruit" : "apple",
  "quantity" : 20
}

{
  "fruit" : "orange",
  "quantity" : 10
}

{
  "fruit" : "banana",
  "quantity" : 15
}


3 rows selected.

SQL>

cat > /tmp/fruit_array2.json <<EOF
{
  "rows": [
    {"fruit":"apple", "quantity": 20},
    {"fruit":"orange", "quantity": 10},
    {"fruit":"banana", "quantity": 15}
  ]
}
EOF

drop table if exists json_ext;

create table json_ext (
  data json
)
  organization external
    (type oracle_bigdata
     access parameters (
                         com.oracle.bigdata.json.path = '$.rows[*]'
                         com.oracle.bigdata.fileformat = jsondoc
                       )
     location (tmp_dir:'fruit_array2.json'))
  parallel
  reject limit unlimited;

select json_serialize(data pretty)
from   json_ext;

JSON_SERIALIZE(DATAPRETTY)
--------------------------------------------------------------------------------
{
  "fruit" : "apple",
  "quantity" : 20
}

{
  "fruit" : "orange",
  "quantity" : 10
}

{
  "fruit" : "banana",
  "quantity" : 15
}


3 rows selected.

SQL>
3

JSON Document Per File

We create three files called "fruit_file{n}.json", each with a single JSON document. We recreate the external table with the location parameter pointing to multiple "fruit_file*.json" files using a wildcard. This time each document is loaded as a separate row.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
cat > /tmp/fruit_file1.json <<EOF
{"fruit":"apple", "quantity": 20}
EOF

cat > /tmp/fruit_file2.json <<EOF
{"fruit":"orange", "quantity": 10}
EOF

cat > /tmp/fruit_file3.json <<EOF
{"fruit":"banana", "quantity": 15}
EOF

drop table if exists json_ext;

create table json_ext (
  data json
)
  organization external
    (type oracle_bigdata
     access parameters (com.oracle.bigdata.fileformat = jsondoc)
     location (tmp_dir:'fruit_file*.json'))
  parallel
  reject limit unlimited;


SELECT json_serialize(data pretty)
FROM   json_ext;

JSON_SERIALIZE(DATAPRETTY)
--------------------------------------------------------------------------------
{
  "fruit" : "apple",
  "quantity" : 20
}

{
  "fruit" : "orange",
  "quantity" : 10
}

{
  "fruit" : "banana",
  "quantity" : 15
}


3 rows selected.

SQL>
4

Multiple JSON Documents Per File

We create a file called "fruit_multidoc.json" containing several JSON documents, with each document on a new line. We could recreate the external table, but this time we will override the location in the query. We point the table to the "fruit_multidoc.json" file. Even though the documents are read from a single file, they are loaded as a separate row per document.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
cat > /tmp/fruit_multidoc.json <<EOF
{"fruit":"apple", "quantity": 20}
{"fruit":"orange", "quantity": 10}
{"fruit":"banana", "quantity": 15}
EOF

SELECT json_serialize(data pretty)
FROM   json_ext external modify (
                   location (tmp_dir:'fruit_multidoc.json') 
                 );

JSON_SERIALIZE(DATAPRETTY)
--------------------------------------------------------------------------------
{
  "fruit" : "apple",
  "quantity" : 20
}

{
  "fruit" : "orange",
  "quantity" : 10
}

{
  "fruit" : "banana",
  "quantity" : 15
}


3 rows selected.

SQL>
5

Multiline JSON Documents

We create a file called "fruit_multidoc_multiline.json" containing three JSON documents. Each document is spread across multiple lines. We point the table to the "fruit_multidoc_multiline.json" file. The "jsondoc" format is smart enough to understand JSON, so we can have JSON documents that span multiple lines, and they will still be loaded correctly. Once again we have one row per document. The external table recognized each document was spread across multiple lines, and loaded it correctly.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
cat > /tmp/fruit_multidoc_multiline.json <<EOF
{
  "fruit":"apple",
  "quantity": 20,
  "extra_info": {"val1": "one",
                 "val2": "two"}
}
{
  "fruit":"orange",
  "quantity": 10,
  "extra_info": {"val1": "three",
                 "val2": "four"}
}
{
  "fruit":"banana",
  "quantity": 15,
  "extra_info": {"val1": "five",
                 "val2": "six"}
}
EOF

SELECT json_serialize(data pretty)
FROM   json_ext external modify (
                   location (tmp_dir:'fruit_multidoc_multiline.json') 
                 );

JSON_SERIALIZE(DATAPRETTY)
--------------------------------------------------------------------------------
{
  "fruit" : "apple",
  "quantity" : 20,
  "extra_info" :
  {
    "val1" : "one",
    "val2" : "two"
  }
}

{
  "fruit" : "orange",
  "quantity" : 10,
  "extra_info" :
  {
    "val1" : "three",
    "val2" : "four"
  }
}

{
  "fruit" : "banana",
  "quantity" : 15,
  "extra_info" :
  {
    "val1" : "five",
    "val2" : "six"
  }
}


3 rows selected.

SQL>
6

Loading JSON Data From External Tables

Now we can read the data using external tables, loading the data into our database can take one of two forms. - We load the JSON data directly into JSON columns. - We use the built JSON functions to convert the JSON data into rows and columns in relation tables. Here are examples of each method.

Code/Command (click line numbers to comment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
-- JSON column
drop table if exists json_data_table;

create table json_data_table (
  id        number generated always as identity,
  json_data json
);


insert into json_data_table (json_data)
select data from json_ext;
commit;


-- Relational table.
drop table if exists relational_data_table;

create table relational_data_table (
  id        number generated always as identity,
  fruit     varchar2(10),
  quantity  number
);

-- Dot Notation
insert into relational_data_table(fruit, quantity)
select je.data.fruit,
       je.data.quantity
from   json_ext je;

-- JSON_VALUE
insert into relational_data_table(fruit, quantity)
select json_value(data, '$.fruit'),
       json_value(data, '$.quantity')
from   json_ext;
commit;
7

Additional Information

The "jsondoc" file format understands the annotations added to the JSON by the mongodump utility, so files produced by can be loaded using external tables. For more information see: Hope this helps. Regards Tim...

Comments (0)

Please to add comments

No comments yet. Be the first to comment!