JSON
type, it is not used or
produced by the following functions.
Trino supports three functions for querying JSON data:
json_exists<json-exists>
,
json_query<json-query>
, and json_value<json-value>
. Each of them
is based on the same mechanism of exploring and processing JSON input using
JSON path.
Trino also supports two functions for generating JSON data —
json_array<json-array>
, and json_object<json-object>
.
json-path-modes
.json-comparison-rules
.
<path1>
and <path2>
, are evaluated to sequences of
items. For arithmetic binary operators, each input sequence must contain a
single numeric item. The arithmetic operation is performed according to SQL
semantics, and it returns a sequence containing a single element with the
result.
The operators follow the same precedence rules as in SQL arithmetic operations,
and parentheses can be used for grouping.
<path>
is evaluated to a sequence of items. Every item must be
a numeric value. The unary plus or minus is applied to every item in the
sequence, following SQL semantics, and the results form the returned sequence.
<path>
return a sequence of three JSON objects:
<path>.customer
succeeds in the first and the third object,
but the second object lacks the required member. In strict mode, path
evaluation fails. In lax mode, the second object is silently skipped, and the
resulting sequence is 100, 300
.
All items in the input sequence must be JSON objects.
<path>
return a sequence of three JSON objects:
<path>
be a sequence containing a JSON object:
<subscripts>
list contains one or more subscripts. Each subscript
specifies a single index or a range (ends inclusive):
last
is set to the last index of the array.<path1>
, the result must be a singleton numeric item. For a range subscript <path2> to <path3>
, two numeric items are expected.<path>
return a sequence of three JSON arrays:
5 to 3
.
Note that the subscripts may overlap, and they do not need to follow the
element order. The order in the returned sequence follows the subscripts:
<path>
return a sequence of three JSON arrays:
true
, false
, or unknown
. Note that some
predicate expressions involve nested JSON path expressions. When evaluating the
nested path, the variable @
refers to the currently examined item from the
input sequence.
The following predicate expressions are supported:
exists
predicatetrue
if the nested path evaluates to a non-empty sequence, and
false
when the nested path evaluates to an empty sequence. If the path
evaluation throws an error, returns unknown
.
starts with
predicate<path>
must evaluate to a sequence of textual items, and the
other operand must evaluate to a single textual item. If evaluating of either
operand throws an error, the result is unknown
. All items from the sequence
are checked for starting with the right operand. The result is true
if a
match is found, otherwise false
. However, if any of the comparisons throws
an error, the result in the strict mode is unknown
. The result in the lax
mode depends on whether the match or the error was found first.
is unknown
predicatetrue
if the nested predicate evaluates to unknown
, and
false
otherwise.
unknown
. Items from the left and
right sequence are then compared pairwise. Similarly to the starts with
predicate, the result is true
if any of the comparisons returns true
,
otherwise false
. However, if any of the comparisons throws an error, for
example because the compared types are not compatible, the result in the strict
mode is unknown
. The result in the lax mode depends on whether the true
comparison or the error was found first.
true
false
false
false
true
or false
is returned if the
comparison is successfully performed. The semantics of the comparison is the
same as in SQL. In case of an error, e.g. comparing text and number,
unknown
is returned.
Comparing a scalar value with a JSON array or a JSON object, and comparing JSON
arrays/objects is an error, so unknown
is returned.
<path>
return a sequence of three JSON objects:
<path>
return a sequence -1, 23e4, "5.6"
:
<path>
return a sequence -1.5, -1, 1.3
:
<path>
be a sequence of three JSON objects:
"null"
for JSON null,"number"
for a numeric item,"string"
for a textual item,"boolean"
for a boolean item,"date"
for an item of type date,"time without time zone"
for an item of type time,"time with time zone"
for an item of type time with time zone,"timestamp without time zone"
for an item of type timestamp,"timestamp with time zone"
for an item of type timestamp with time zone,"array"
for JSON array,"object"
for JSON object,1
.
It is required that all items in the input sequence are JSON arrays.
Let <path>
return a sequence of three JSON arrays:
datetime()
JSON path item method and the
like_regex()
JSON path predicate. Trino does not support them.
Condition | Strict Mode | Lax Mode |
---|---|---|
Performing an operation which requires a non-array on an array, e.g.: $.key requires a JSON object; $.floor() requires a numeric value | ERROR | The array is automatically unnested, and the operation is performed on each array element. |
Performing an operation which requires an array on a non-array, e.g.: $[0] , $[*] , $.size() | ERROR | The non-array item is automatically wrapped in a singleton array, and the operation is performed on the array. |
A structural error: accessing a non-existent element of an array or a non-existent member of a JSON object, e.g.: $[-1] (array index out of bounds); $.key , where the input JSON object does not have a member key | ERROR | The error is suppressed, and the operation results in an empty sequence. |
<path>
return a sequence of three items, a JSON array, a JSON object,
and a scalar numeric value:
size()
method, the JSON object and the number are also
wrapped in singleton arrays:
floor()
method, the item "a"
causes type mismatch.
json_exists
function determines whether a JSON value satisfies a JSON
path specification.
json_path
is evaluated using the json_input
as the context variable
($
), and the passed arguments as the named variables ($variable_name
).
The returned value is true
if the path returns a non-empty sequence, and
false
if the path returns an empty sequence. If an error occurs, the
returned value depends on the ON ERROR
clause. The default value returned
ON ERROR
is FALSE
. The ON ERROR
clause is applied for the following
kinds of errors:
json_input
is a character string or a binary string. It should contain
a single JSON item. For a binary string, you can specify encoding.
json_path
is a string literal, containing the path mode specification, and
the path expression, following the syntax rules described in
linkjson-path-syntax-and-semantics
.
PASSING
clause you can pass arbitrary expressions to be used by the
path expression.
$
.
PASSING
clause:
customers
be a table containing two columns: id:bigint
,
description:varchar
.
id | description |
---|---|
101 | {"comment": "nice", "children": [10, 13, 16]} |
102 | {"comment": "problematic", "children" :[8, 11]} |
103 | {"comment": "knows best", "children": [2]} |
id | children_above_ten |
---|---|
101 | true |
102 | true |
103 | false |
ON ERROR
clause.
id | child_3_above_ten |
---|---|
101 | true |
102 | NULL |
103 | NULL |
json_query
function extracts a JSON value from a JSON value.
json_path
is evaluated using the json_input
as the
context variable ($
), and the passed arguments as the named variables
($variable_name
).
The returned value is a JSON item returned by the path. By default, it is
represented as a character string (varchar
). In the RETURNING
clause,
you can specify other character string type or varbinary
. With
varbinary
, you can also specify the desired encoding.
json_input
is a character string or a binary string. It should contain
a single JSON item. For a binary string, you can specify encoding.
json_path
is a string literal, containing the path mode specification, and
the path expression, following the syntax rules described in
linkjson-path-syntax-and-semantics
.
PASSING
clause you can pass arbitrary expressions to be used by the
path expression.
$
.
PASSING
clause:
ARRAY WRAPPER
clause lets you modify the output by wrapping the results
in a JSON array. WITHOUT ARRAY WRAPPER
is the default option. WITH CONDITIONAL ARRAY WRAPPER
wraps every result which is not a singleton JSON
array or JSON object. WITH UNCONDITIONAL ARRAY WRAPPER
wraps every result.
The QUOTES
clause lets you modify the result for a scalar string by
removing the double quotes being part of the JSON string representation.
customers
be a table containing two columns: id:bigint
,
description:varchar
.
id | description |
---|---|
101 | {"comment": "nice", "children": [10, 13, 16]} |
102 | {"comment": "problematic", "children": [8, 11]} |
103 | {"comment": "knows best", "children": [2]} |
children
array for each customer:
id | children |
---|---|
101 | [10,13,16] |
102 | [8,11] |
103 | [2] |
json_query
function can only output a single JSON item. If
you dont use array wrapper, you get an error for every customer with multiple
children. The error is handled according to the ON ERROR
clause.
id | children |
---|---|
101 | NULL |
102 | NULL |
103 | 2 |
id | last_child |
---|---|
101 | [16] |
102 | [11] |
103 | [2] |
ON EMPTY
clause. The
default value returned ON EMPTY
is NULL
. In the following example,
EMPTY ARRAY ON EMPTY
is specified.
id | children |
---|---|
101 | [13,16] |
102 | [] |
103 | [] |
QUOTES
clause. Note that KEEP QUOTES
is the default.
id | quoted_comment | unquoted_comment |
---|---|---|
101 | ”nice” | nice |
102 | ”problematic” | problematic |
103 | ”knows best” | knows best |
ON ERROR
clause. The
default value returned ON ERROR
is NULL
. One example of error is
multiple items returned by the path. Other errors caught and handled according
to the ON ERROR
clause are:
json_value
function extracts a scalar SQL value from a JSON value.
json_path
is evaluated using the json_input
as the context variable
($
), and the passed arguments as the named variables ($variable_name
).
The returned value is the SQL scalar returned by the path. By default, it is
converted to string (varchar
). In the RETURNING
clause, you can specify
other desired type: a character string type, numeric, boolean or datetime type.
json_input
is a character string or a binary string. It should contain
a single JSON item. For a binary string, you can specify encoding.
json_path
is a string literal, containing the path mode specification, and
the path expression, following the syntax rules described in
linkjson-path-syntax-and-semantics
.
PASSING
clause you can pass arbitrary expressions to be used by the
path expression.
$
.
PASSING
clause:
ON EMPTY
clause is applied. The
default value returned ON EMPTY
is NULL
. You can also specify the
default value:
ON ERROR
clause. The
default value returned ON ERROR
is NULL
. One example of error is
multiple items returned by the path. Other errors caught and handled according
to the ON ERROR
clause are:
customers
be a table containing two columns: id:bigint
,
description:varchar
.
id | description |
---|---|
101 | {"comment" : "nice", "children" : [10, 13, 16]} |
102 | {"comment" : "problematic", "children" : [8, 11]} |
103 | {"comment" : "knows best", "children" : [2]} |
comment
for each customer as char(12)
:
id | comment |
---|---|
101 | nice |
102 | problematic |
103 | knows best |
tinyint
:
id | child |
---|---|
101 | 10 |
102 | 8 |
103 | 2 |
ON ERROR
clause.
id | child |
---|---|
101 | 16 |
102 | err |
103 | err |
ON EMPTY
clause.
id | child |
---|---|
101 | 16 |
102 | missing |
103 | missing |
json_array
function creates a JSON array containing given elements.
FORMAT
and
ENCODING
specification.
You can pass SQL values of types boolean, numeric, and character string. They
are converted to corresponding JSON literals:
FORMAT
option is implicit:
null
, it is treated according to
the specified null treatment option. If ABSENT ON NULL
is specified, the
null element is omitted in the result. If NULL ON NULL
is specified, JSON
null
is added to the result. ABSENT ON NULL
is the default
configuration:
json_array
function returns varchar containing the textual
representation of the JSON array. With the RETURNING
clause, you can
specify other character string type:
json_object
function creates a JSON object containing given key-value pairs.
KEY
keyword:
FORMAT
and
ENCODING
specification.
You can pass SQL values of types boolean, numeric, and character string. They
are converted to corresponding JSON literals:
FORMAT
option is implicit:
null
for JSON object values. A null value is treated according to the
specified null treatment option. If NULL ON NULL
is specified, a JSON
object entry with null
value is added to the result. If ABSENT ON NULL
is specified, the entry is omitted in the result. NULL ON NULL
is the
default configuration.:
WITH UNIQUE KEYS
is specified, a duplicate key results in a query
failure:
FORMAT
specification.
If WITHOUT UNIQUE KEYS
is specified, duplicate keys are not supported due
to implementation limitation. WITHOUT UNIQUE KEYS
is the default
configuration.
json_object
function returns varchar containing the textual
representation of the JSON object. With the RETURNING
clause, you can
specify other character string type:
JSON
data type. Instead, JSON values
should be represented as string values. The remaining functionality of the
following functions is covered by the functions described previously.BOOLEAN
TINYINT
SMALLINT
INTEGER
BIGINT
REAL
DOUBLE
VARCHAR
ARRAY
, MAP
, and ROW
types can be cast to JSON when
the following requirements are met:
ARRAY
types can be cast when the element type of the array is one
of the supported types.MAP
types can be cast when the key type of the map is VARCHAR
and
the value type of the map is a supported type,ROW
types can be cast when every field type of the row is a supported
type.character string types <string-data-types>
treat the input as a string, not validated as JSON.
This means that a cast operation with a string-type input of invalid JSON
results in a successful cast to invalid JSON.
Instead, consider using the json_parse
function to
create validated JSON from a string.JSON
is not straightforward. Casting
from a standalone NULL
will produce SQL NULL
instead of
JSON null
. However, when casting from arrays or map containing
NULL
s, the produced JSON
will have null
s in it.
BOOLEAN
, TINYINT
, SMALLINT
, INTEGER
,
BIGINT
, REAL
, DOUBLE
or VARCHAR
is supported.
Casting to ARRAY
and MAP
is supported when the element type of
the array is one of the supported types, or when the key type of the map
is VARCHAR
and value type of the map is one of the supported types.
Behaviors of the casts are shown with the examples below:
JSON
to ROW
, both JSON array and JSON object are supported.
is_json_scalar(json)
→ boolean
Determine if json
is a scalar (i.e. a JSON number, a JSON string, true
, false
or null
):
json_array_contains(json, value)
→ boolean
Determine if value
exists in json
(a string containing a JSON array):
json_array_get(json_array, index)
→ json
JSON
value that
is not properly quoted (the value will not be surrounded by quotes
and any interior quotes will not be escaped).We recommend against using this function. It cannot be fixed without
impacting existing usages and may be removed in a future release.json_array
.
The index is zero-based:
json_array_length(json)
→ bigint
Returns the array length of json
(a string containing a JSON array):
json_extract(json, json_path)
→ json
Evaluates the [JSONPath]-like expression json_path
on json
(a string containing JSON) and returns the result as a JSON string:
json_query
function provides a more powerful and
feature-rich alternative to parse and extract JSON data.
json_extract_scalar(json, json_path)
→ varchar
Like json_extract, but returns the result value as a string (as opposed to being encoded as JSON). The value referenced by
json_path` must be a
scalar (boolean, number or string).
json_format(json)
→ varchar
Returns the JSON text serialized from the input JSON value.
This is inverse function to json_parse
:
json_format
and CAST(json AS VARCHAR)
have completely different semantics.json_format
serializes the input JSON value to JSON text conforming to
rfc7159
. The JSON value can be a JSON object, a JSON array, a JSON string,
a JSON number, true
, false
or null
.CAST(json AS VARCHAR)
casts the JSON value to the corresponding SQL VARCHAR value.
For JSON string, JSON number, true
, false
or null
, the cast
behavior is same as the corresponding SQL type. JSON object and JSON array
cannot be cast to VARCHAR.json_parse(string)
→ json
Returns the JSON value deserialized from the input JSON text.
This is inverse function to json_format
:
json_parse
expects a JSON text conforming to rfc7159
, and returns
the JSON value deserialized from the JSON text.
The JSON value can be a JSON object, a JSON array, a JSON string, a JSON number,
true
, false
or null
.
CAST(string AS JSON)
takes any VARCHAR value as input, and returns
a JSON string with its value set to input string.
json_size(json, json_path)
→ bigint
Like json_extract
, but returns the size of the value.
For objects or arrays, the size is the number of members, and the size of a scalar value is zero.