Primitive and compound JSON data types
JSON can represent (sub)values of four primitive data types and of two compound data types.
The primitive data types are string, number, boolean, and null. There is no way to declare the data type of a JSON value; rather, it emerges from the syntax of the representation.
Compare this with SQL and PL/pgSQL. SQL establishes the data type of a value from the metadata for the column in the table, or the field in the record, into which it is written or from which it is read. It also has the typecast notation, like ::text
or ::boolean
to establish the data type of a SQL literal. PL/pgSQL also supports the typecast notation and establishes the data type of a variable or a formal parameter by declaration. In this way, JSON is better compared with Python, which also implements an emergent data type paradigm. It is for this reason that, in the JSON type system, null is defined as a data type rather than as a "value" (strictly, the absence of information about the value) of one of the other data types.
Notice that JSON cannot represent a date-time value except as a conventionally formatted string value.
The two compound data types are object and array.
A JSON literal is represented in a SQL statement or a PL/pgSQL program by the enquoted ::json
or ::jsonb
typecast of a text
value that conforms to RFC 7159.
JSON string
A JSON string value is a sequence of zero, one, or many Unicode characters enclosed by the "
character. Here are some examples, shown as SQL literals:
'"Dog"'::jsonb
The empty string is legal, and is distinct from the JSON null.
'""'::jsonb
Case and whitespace are significant. Special characters within a string value need to be escaped, thus:
- Backspace: \b
- Form feed: \f
- Newline: \n
- Carriage return: \r
- Tab: \t
- Double quote: \"
- Backslash: \\
For example:
'"\"First line\"\n\"second line\""'::jsonb
The explanation of the difference between the ->
and ->>
operators is illustrated nicely by this JSON string value.
JSON number
Here are some examples, shown as SQL literals:
'17'::jsonb
and:
'4.2'::jsonb
and:
'2.99792E8'::jsonb
Notice that JSON makes no distinction between integers and real numbers.
JSON boolean
Here are the two allowed values, shown as SQL literals:
'true'::jsonb
and:
'false'::jsonb
JSON null
As explained above, null is special in JSON in that it is its own data type that allows exactly one "value", thus:
'null'::jsonb
JSON object
An object is a set of key-value pairs separated by commas and surrounded by curly braces. The order is insignificant. The values in an object do not have to have the same data types as each other. For example:
'{
"a 1" : "Abc",
"a 2" : 42,
"a 3" : true,
"a 4" : null,
"a 5" : {"x" : 1, "y": "Pqr"}
}'::jsonb
Keys are case-sensitive and whitespace within such keys is significant. They can even contain characters that must be escaped. However, if a key does include spaces and special characters, the syntax that you need to read its value can become rather complex. It is sensible, therefore, to avoid exploiting this freedom.
An object can include more than one key-value pair with the same key. This is not recommended, but the outcome is well-defined: the last-mentioned key-value pair, in left-to-right order, in the set of key-value pairs with the same key "wins". You can test this by reading the value for a specified key with an operator like ->
.
JSON array
An array is an ordered list of unnamed JSON values—in other words, the order is defined and is significant. The values in an array do not have to have the same data types as each other. For example:
'[1, 2, "Abc", true, false, null, {"x": 17, "y": 42}]'::jsonb
The values in an array are indexed from 0
. See the account of the ->
operator.
Example compound JSON value
{
"given_name" : "Fred",
"family_name" : "Smith",
"email_address" : "fred@example.com",
"hire_date" : "17-Jan-2015",
"job" : "sales",
"base_annual_salary" : 50000,
"commisission_rate" : 0.05,
"phones" : ["+11234567890", "+13216540987"]
}
This is a JSON object with eight fields. The first seven are primitive string or number values (one of which conventionally represents a date) and the eighth is an array of two primitive string values. The text representations of the phone numbers follow a convention by starting with + and (presumably) a country code. JSON has no mechanisms for defining such conventions and for enforcing conformance.
Note
To see how these limitations can be ameliorated when a JSON document is stored in a column in a SQL table, see Create indexes and check constraints onjson
and jsonb
columns.
In general, the top-level JSON document is an arbitrarily deep and wide hierarchy of subdocuments whose leaves are primitive values.
Notably, and in contrast to XML, JSON is not self-describing. Moreover, JSON does not support comments. But the intention is that the syntax should be reasonably intuitively obvious and human-readable.
It is the responsibility of the consumers of JSON documents to discover the composition rules of any corpus that they have to deal with by ad hoc methods—at best external documentation and at worst human (or mechanical) inspection.
Most programming languages have data types that correspond directly to JSON's primitive data types and to its compound object and array data types.
Note
Because YSQL manipulates a JSON document as the value of ajson
or jsonb
table-row intersection or PL/pgSQL variable, the terms "json
[sub]value" or "jsonb
[sub]value" (and JSON value as the superclass) are preferred from now on in this section—using "value" rather than "document".