JSON (JavaScript Object Notation), specified by RFC 7159 (which obsoletes RFC 4627) and by ECMA-404, is a lightweight data interchange format inspired by JavaScript object literal syntax (although it is not a strict subset of JavaScript).
json provides an API familiar to users of the standard library marshal and pickle modules.
Encoding basic Python object hierarchies:
>>> import json
>>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
'["foo", {"bar": ["baz", null, 1.0, 2]}]'
>>> print(json.dumps("\"foo\bar"))
"\"foo\bar"
>>> print(json.dumps('\u1234'))
"\u1234"
>>> print(json.dumps('\\'))
"\\"
>>> print(json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True))
{"a": 0, "b": 0, "c": 0}
>>> from io import StringIO
>>> io = StringIO()
>>> json.dump(['streaming API'], io)
>>> io.getvalue()
'["streaming API"]'
Compact encoding:
>>> import json
>>> json.dumps([1, 2, 3, {'4': 5, '6': 7}], separators=(',', ':'))
'[1,2,3,{"4":5,"6":7}]'
Pretty printing:
>>> import json
>>> print(json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4))
{
    "4": 5,
    "6": 7
}
Decoding JSON into Python objects:
>>> import json
>>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]')
['foo', {'bar': ['baz', None, 1.0, 2]}]
>>> json.loads('"\\"foo\\bar"')
'"foo\x08ar'
>>> from io import StringIO
>>> io = StringIO('["streaming API"]')
>>> json.load(io)
['streaming API']
Specializing JSON object decoding:
>>> import json
>>> def as_complex(dct):
...     if '__complex__' in dct:
...         return complex(dct['real'], dct['imag'])
...     return dct
...
>>> json.loads('{"__complex__": true, "real": 1, "imag": 2}',
... object_hook=as_complex)
(1+2j)
>>> import decimal
>>> json.loads('1.1', parse_float=decimal.Decimal)
Decimal('1.1')
Extending JSONEncoder:
>>> import json
>>> class ComplexEncoder(json.JSONEncoder):
...     def default(self, obj):
...         if isinstance(obj, complex):
...             return [obj.real, obj.imag]
...         # Let the base class default method raise the TypeError
...         return json.JSONEncoder.default(self, obj)
...
>>> json.dumps(2 + 1j, cls=ComplexEncoder)
'[2.0, 1.0]'
>>> ComplexEncoder().encode(2 + 1j)
'[2.0, 1.0]'
>>> list(ComplexEncoder().iterencode(2 + 1j))
['[2.0', ', 1.0', ']']
Using json.tool from the shell to validate and pretty-print:
$ echo '{"json":"obj"}' | python -m json.tool
{
    "json": "obj"
}
$ echo '{1.2:3.4}' | python -m json.tool
Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
JSON is a subset of YAML 1.2. The JSON produced by this module's default settings (in particular, the default separators value) is also a subset of YAML 1.0 and 1.1. This module can therefore also be used as a YAML serializer.
Before Python 3.7, dict was not guaranteed to preserve key order, so input and output data tended to come out in a different order unless collections.OrderedDict was requested explicitly. Since Python 3.7, the regular dict preserves insertion order, so it is no longer necessary to use collections.OrderedDict to generate or parse JSON.
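As a small illustration of the ordering note above, the following sketch round-trips a JSON object on Python 3.7 or later and checks that the keys come back in document order with a plain dict. The sample document is made up for the example.

```python
import json

# Hypothetical document whose keys are deliberately not in alphabetical order.
text = '{"z": 1, "a": 2, "m": 3}'

decoded = json.loads(text)        # a plain dict; insertion order is kept (Python 3.7+)
key_order = list(decoded)         # keys in the order they appeared in the document
round_tripped = json.dumps(decoded)

print(key_order)
print(round_tripped == text)
```

Because the default separators are (', ', ': '), the re-encoded text matches the input only when the input uses the same spacing.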
The main methods
The json dump method
json.dump(obj, fp, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
Serializes obj as a JSON formatted stream to fp (a file-like object that supports .write()) using the conversion table given under JSONEncoder.
If skipkeys=True (default: False), dictionary keys that are not of a basic type (str, int, float, bool, None) are skipped instead of raising a TypeError.
The json module always produces str objects, not bytes. Therefore, fp.write() must support str input.
If ensure_ascii=True (the default), all non-ASCII characters in the output are escaped with \uXXXX sequences. If ensure_ascii=False, these characters are written as is.
If check_circular=False (default: True), the circular reference check for container types is skipped, and a circular reference results in a RecursionError (or worse).
If allow_nan=False (default: True), serializing an out-of-range float value (nan, inf, -inf) raises a ValueError, in strict compliance with the JSON specification. If allow_nan=True, their JavaScript equivalents (NaN, Infinity, -Infinity) are used.
If indent is a non-negative integer or a string, JSON array elements and object members are pretty-printed with that indent level. An indent level of 0, a negative value, or "" only inserts newlines. None (the default) selects the most compact representation. A positive integer indents that many spaces per level; if indent is a string (e.g. "\t"), that string is used to indent each level. Changed in version 3.2: strings are allowed for indent in addition to integers.
If specified, separators must be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent=None and (',', ': ') otherwise. To get the most compact JSON representation, specify (',', ':') to eliminate whitespace. Changed in version 3.4: (',', ': ') is the default if indent is not None.
If specified, default must be a function. It is called for objects that cannot otherwise be serialized, and must return a JSON-encodable version of the object or raise a TypeError. If default is not specified, a TypeError is raised.
If sort_keys=True (default: False), the output of dictionaries is sorted by key.
To use a custom JSONEncoder subclass (for example, one that overrides the default() method to serialize additional types), specify it with the cls argument; otherwise JSONEncoder is used.
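The following sketch shows how several of these parameters interact in one call. The record, the encode_set helper and the io.StringIO target are assumptions made for the example, not part of the module.

```python
import io
import json

# Hypothetical record: the tuple key and the set are not JSON-serializable as is.
# "\u00eb" is the non-ASCII character e-with-diaeresis.
record = {"name": "Zo\u00eb", ("x", 1): "tuple key", "tags": {"b", "a"}}

def encode_set(obj):
    """Fallback passed as default=: represent a set as a sorted list."""
    if isinstance(obj, set):
        return sorted(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

buffer = io.StringIO()            # any object with a str-accepting .write() works
json.dump(
    record,
    buffer,
    skipkeys=True,                # drop the tuple key instead of raising TypeError
    ensure_ascii=False,           # keep the non-ASCII character instead of \u00eb
    default=encode_set,           # called for the set, which json cannot encode itself
    separators=(",", ":"),        # most compact form: no spaces after separators
)
compact = buffer.getvalue()

# With indent set, the default item separator is "," and newlines are inserted.
pretty = json.dumps({"b": 1, "a": [1, 2]}, indent=2, sort_keys=True)
print(pretty)
```

sort_keys is left out of the first call on purpose: keys are sorted before skipkeys is applied, so a dictionary mixing str and tuple keys cannot be sorted.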
The json dumps method
json.dumps(obj, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
Serializes obj to a JSON formatted str using the conversion table given under JSONEncoder. The arguments have the same meaning as in dump().
Keys in JSON key/value pairs are always strings. When a dictionary is converted to JSON, all of its keys are coerced to strings. As a result, if a dictionary is converted to JSON and then back, the new dictionary may not be equal to the original one. In other words, loads(dumps(x)) != x if x has non-string keys.
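A short sketch of the key coercion described above, using a made-up dictionary with int, float and None keys:

```python
import json

# Hypothetical dictionary with non-string keys.
original = {1: "one", 2.5: "two and a half", None: "nothing"}

encoded = json.dumps(original)
round_tripped = json.loads(encoded)

print(encoded)
print(round_tripped == original)   # False: every key is now a str

# When the keys are known to be integers, they can be converted back by hand.
int_keyed = {int(key): value
             for key, value in json.loads(json.dumps({1: "a", 2: "b"})).items()}
```

Converting keys back is the caller's responsibility; the decoder has no way of knowing what type a key originally had.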
The json load method
json.load(fp, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
Deserializes fp (a text file or binary file that supports .read() and contains a JSON document) to a Python object using the conversion table given under JSONDecoder.
object_hook is an optional function that is called with the result of any object literal decoded (a dict). Its return value is used in place of the dict. This feature can be used to implement custom decoders (e.g. JSON-RPC class hinting).
object_pairs_hook is an optional function that is called with the result of any object literal decoded as an ordered list of key/value pairs. Its return value is used in place of the dict. This feature can also be used to implement custom decoders. If object_hook is also defined, object_pairs_hook takes priority.
If parse_float is specified, it is called with the string of every JSON float to be decoded. By default, this is equivalent to float(num_str). This can be used to apply another data type or parser to JSON floats (e.g. decimal.Decimal).
If parse_int is specified, it is called with the string of every JSON int to be decoded. By default, this is equivalent to int(num_str). This can be used to apply another data type or parser to JSON integers (e.g. float).
If parse_constant is specified, it is called with one of the following strings: '-Infinity', 'Infinity', 'NaN'. This can be used to raise an exception when invalid JSON numbers are encountered. parse_constant is no longer called on 'null', 'true', 'false'.
To use a custom JSONDecoder subclass, specify it with the cls argument; otherwise JSONDecoder is used. Additional keyword arguments are passed to the constructor of the class.
If the data being deserialized is not a valid JSON document, a JSONDecodeError is raised.
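A minimal sketch of the number parsers and of the hook priority described above. io.StringIO stands in for an open text file, and the documents and hook names are invented for the example.

```python
import io
import json
from decimal import Decimal

# parse_float and parse_int receive the number as a string.
document = io.StringIO('{"price": 19.99, "qty": 3}')
data = json.load(document, parse_float=Decimal, parse_int=float)
print(data)

calls = []

def hook(dct):
    calls.append("object_hook")
    return dct

def pairs_hook(pairs):
    calls.append("object_pairs_hook")
    return dict(pairs)

# When both hooks are given, only object_pairs_hook is called.
json.load(io.StringIO('{"a": 1}'), object_hook=hook, object_pairs_hook=pairs_hook)
print(calls)
```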
The json loads method
json.loads(s, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
Deserializes s (a str, bytes or bytearray instance containing a JSON document) to a Python object using the conversion table given under JSONDecoder. The other arguments have the same meaning as in load(), except encoding, which is ignored and deprecated. If the data being deserialized is not a valid JSON document, a JSONDecodeError is raised.
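The sketch below passes bytes and bytearray input to loads(). It assumes Python 3.6 or later, where binary input encoded as UTF-8, UTF-16 or UTF-32 is detected automatically; the sample text is made up.

```python
import json

# "\u00fc" is u-with-diaeresis; the text is encoded three different ways below.
text = '{"city": "Z\u00fcrich"}'
expected = {"city": "Z\u00fcrich"}

results = {
    encoding: json.loads(text.encode(encoding))
    for encoding in ("utf-8", "utf-16-le", "utf-32-be")
}
all_equal = all(value == expected for value in results.values())

from_bytearray = json.loads(bytearray(b"[1, 2, 3]"))
print(all_equal, from_bytearray)
```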
Encoders and decoders
JSONDecoder
Class json.JSONDecoder(*, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, strict=True, object_pairs_hook=None)
Simple JSON decoder. Performs the following translations in decoding by default:
JSON            Python
object          dict
array           list
string          str
number (int)    int
number (real)   float
true            True
false           False
null            None
It also understands NaN, Infinity, and -Infinity as their corresponding float values, which is outside the JSON specification.
object_hook, if specified, is called with the result of every JSON object decoded, and its return value is used in place of the given dict. This can be used to provide custom deserializations (e.g. to support JSON-RPC class hinting).
object_pairs_hook, if specified, is called with the result of every JSON object decoded as an ordered list of pairs. Its return value is used in place of the dict. This feature can be used to implement custom decoders. If object_hook is also defined, object_pairs_hook takes priority.
parse_float, if specified, is called with the string of every JSON float to be decoded. By default, this is equivalent to float(num_str). This can be used to apply another data type or parser to JSON floats (e.g. decimal.Decimal).
parse_int, if specified, is called with the string of every JSON int to be decoded. By default, this is equivalent to int(num_str). This can be used to apply another data type or parser to JSON integers (e.g. float).
parse_constant, if specified, is called with one of the following strings: '-Infinity', 'Infinity', 'NaN'. This can be used to raise an exception when invalid JSON numbers are encountered.
If strict=False (True is the default), control characters are allowed inside strings. Control characters in this context are those with character codes in the 0-31 range, including '\t' (tab), '\n', '\r' and '\0'.
If the data being deserialized is not a valid JSON document, a JSONDecodeError is raised.
decode(s): Returns the Python representation of s (a str instance containing a JSON document). JSONDecodeError is raised if the given JSON document is not valid.
raw_decode(s): Decodes a JSON document from s (a str beginning with a JSON document) and returns a 2-tuple of the Python representation and the index in s where the document ended. This can be used to decode a JSON document from a string that may have extraneous data at the end.
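One possible use of raw_decode() is reading several JSON documents concatenated in one string. The sketch below is an illustration under that assumption; iter_documents is a hypothetical helper, not part of the module. It also shows the effect of strict=False on a raw newline inside a string.

```python
import json

decoder = json.JSONDecoder()

def iter_documents(text):
    """Yield each JSON document found in text, one after another."""
    # raw_decode() expects the document to start at the first character,
    # so whitespace between documents is stripped by hand.
    remaining = text.lstrip()
    while remaining:
        obj, end = decoder.raw_decode(remaining)
        yield obj
        remaining = remaining[end:].lstrip()

documents = list(iter_documents('{"id": 1} {"id": 2}\n[3]'))
print(documents)

# A raw newline inside a JSON string is a control character.
raw_newline = '"line1\nline2"'
try:
    json.JSONDecoder().decode(raw_newline)
    strict_rejected = False
except json.JSONDecodeError:
    strict_rejected = True

lenient = json.JSONDecoder(strict=False).decode(raw_newline)
```

Slicing the string on every step keeps the example simple; for very large inputs a different approach may be preferable.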
JSONEncoder
Class json.JSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
An extensible JSON encoder for Python data structures. Supports the following objects and types by default:
Python          JSON
dict            object
list, tuple     array
str             string
int, float      number
True            true
False           false
None            null
To extend this to recognize other objects, subclass and implement a default() method that returns a serializable object for o if possible; otherwise it should call the superclass implementation (to raise TypeError).
If skipkeys=False (the default), a TypeError is raised when trying to encode keys that are not str, int, float or None. If skipkeys=True, such items are simply skipped.
If ensure_ascii=True (the default), the output is guaranteed to have all incoming non-ASCII characters escaped with \uXXXX sequences. If ensure_ascii=False, these characters are output as is.
If check_circular=True (the default), lists, dicts and custom encoded objects are checked for circular references during encoding to prevent infinite recursion (which would cause a RecursionError). Otherwise, no such check takes place.
If allow_nan=True (the default), NaN, Infinity and -Infinity are encoded as such. This behavior does not comply with the JSON specification, but is consistent with most JavaScript-based encoders and decoders. Otherwise, encoding such floats raises a ValueError.
If sort_keys=True (default: False), the output of dictionaries is sorted by key; this is useful for regression tests, because it makes JSON serializations comparable from one run to the next.
If indent is a non-negative integer or a string, JSON array elements and object members are pretty-printed with that indent level. An indent level of 0, a negative value, or "" only inserts newlines. None (the default) selects the most compact representation. A positive integer indents that many spaces per level; if indent is a string (e.g. "\t"), that string is used to indent each level.
If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent=None and (',', ': ') otherwise. To get the most compact JSON representation, specify (',', ':') to eliminate whitespace.
If specified, default should be a function. It is called for objects that cannot otherwise be serialized, and should return a JSON-encodable version of the object or raise a TypeError. If default is not specified, a TypeError is raised.
default(o): Implement this method in a subclass so that it returns a serializable object for o or calls the base implementation (to raise TypeError). For example, to support arbitrary iterators, you can implement default as follows:
def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return json.JSONEncoder.default(self, o)
encode(o): Returns a JSON string representation of a Python data structure, o. For example:
>>> json.JSONEncoder().encode({'foo': ['bar', 'baz']})
'{"foo": ["bar", "baz"]}'
iterencode(o): Encodes the given object o and yields each string representation as it becomes available. For example:
for chunk in json.JSONEncoder().iterencode(bigobject):
    mysocket.write(chunk)
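A runnable variant of the iterencode() loop above, with an io.StringIO object standing in for the socket. The data is made up, and the number of chunks produced is an implementation detail that should not be relied upon.

```python
import io
import json

encoder = json.JSONEncoder(sort_keys=True)
data = {"b": [1, 2, 3], "a": {"nested": None}}   # hypothetical payload

sink = io.StringIO()              # stands in for a socket or an open file
chunk_count = 0
for chunk in encoder.iterencode(data):
    sink.write(chunk)             # each chunk is a str fragment of the document
    chunk_count += 1

streamed = sink.getvalue()
print(streamed == encoder.encode(data))
```

Joining the chunks gives the same text as encode(), so the two methods differ only in when the output becomes available.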
JSONDecodeError exception
Exception json.JSONDecodeError(msg, doc, pos)
Subclass of ValueError with the following additional attributes:
msg – the unformatted error message.
doc – the JSON document being parsed.
pos – the start index of doc where parsing failed.
lineno – the line corresponding to pos.
colno – the column corresponding to pos.
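The attributes listed above can be inspected when the exception is caught. The sketch below uses a deliberately broken two-line document made up for the example.

```python
import json

# Hypothetical invalid document: the value for "b" is missing on the second line.
bad_document = '{"a": 1,\n "b": }'

try:
    json.loads(bad_document)
except json.JSONDecodeError as exc:
    report = {
        "msg": exc.msg,           # message without the position suffix
        "pos": exc.pos,           # 0-based index into exc.doc
        "lineno": exc.lineno,     # 1-based line number
        "colno": exc.colno,       # 1-based column number
    }
    offending_char = exc.doc[exc.pos]
    is_value_error = isinstance(exc, ValueError)

print(report)
```

Because JSONDecodeError is a ValueError subclass, code that already catches ValueError keeps working.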
Standard compliance and interoperability
The JSON format is specified by RFC 7159 and by ECMA-404. This section details this module’s level of compliance with the RFC. For simplicity, JSONEncoder and JSONDecoder subclasses, and parameters other than those explicitly mentioned, are not considered.
This module does not comply with the RFC in a strict fashion, implementing some extensions that are valid JavaScript but not valid JSON. In particular:
- Infinite and NaN number values are accepted and output;
- Repeated names within an object are accepted, and only the value of the last name-value pair is used.
Since the RFC permits RFC-compliant parsers to accept input texts that are not RFC-compliant, this module’s deserializer is technically RFC-compliant under default settings.
Character encodings
The RFC requires that JSON be represented using UTF-8, UTF-16 or UTF-32, with UTF-8 being the recommended default for maximum interoperability.
As permitted, though not required, by the RFC, this module’s serializer sets ensure_ascii=True by default, so the resulting strings contain only ASCII characters.
Other than the ensure_ascii parameter, this module does not directly address the issue of character encodings.
The RFC prohibits adding a byte order mark (BOM) to the start of a JSON text, and this module’s serializer does not add a BOM to its output. The RFC permits, but does not require, JSON deserializers to ignore an initial BOM in their input. This module’s deserializer raises a ValueError when an initial BOM is present.
The RFC does not explicitly forbid JSON strings that contain byte sequences that do not correspond to valid Unicode characters (e.g. unpaired UTF-16 surrogates), but it does note that they may cause interoperability problems. By default, this module accepts and outputs (when present in the original str) code points for such sequences.
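A sketch of the BOM and surrogate behavior described above, for str input. Decoding the bytes with the utf-8-sig codec is one way to strip a BOM before parsing; the payload is made up.

```python
import json

raw = b'\xef\xbb\xbf{"k": "v"}'   # UTF-8 bytes that start with a BOM

# Decoding with plain UTF-8 keeps the BOM as U+FEFF, which the parser rejects.
try:
    json.loads(raw.decode("utf-8"))
    bom_rejected = False
except ValueError:
    bom_rejected = True

# The utf-8-sig codec removes the BOM before the text reaches the parser.
data = json.loads(raw.decode("utf-8-sig"))

# ensure_ascii only changes how non-ASCII characters are written out.
escaped = json.dumps("\u00e9")                        # ASCII-only output
literal = json.dumps("\u00e9", ensure_ascii=False)    # character written as is

# An unpaired surrogate is accepted on input and escaped again on output.
lone = json.loads('"\\ud800"')
lone_encoded = json.dumps(lone)
```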
Infinite and NaN number values
The RFC does not permit the representation of infinite or NaN number values. Despite that, by default, this module accepts and outputs Infinity, -Infinity, and NaN as if they were valid JSON number literal values:
>>> # Neither of these calls raises an exception, but the results are not valid JSON
>>> json.dumps(float('-inf'))
'-Infinity'
>>> json.dumps(float('nan'))
'NaN'
>>> # Same for deserialization
>>> json.loads('-Infinity')
-inf
>>> json.loads('NaN')
nan
In the serializer, the allow_nan parameter can be used to alter this behavior. In the deserializer, the parse_constant parameter can be used to alter this behavior.
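A minimal sketch of strict handling in both directions. reject_constant is a hypothetical helper written for this example.

```python
import json

def reject_constant(name):
    """parse_constant callback: refuse 'NaN', 'Infinity' and '-Infinity'."""
    raise ValueError(f"{name} is not valid JSON")

# Serializer: allow_nan=False turns out-of-range floats into an error.
try:
    json.dumps({"ratio": float("nan")}, allow_nan=False)
    dumps_rejected = False
except ValueError:
    dumps_rejected = True

# Deserializer: the callback is invoked with the constant's name.
try:
    json.loads('{"ratio": NaN}', parse_constant=reject_constant)
    loads_rejected = False
except ValueError as exc:
    loads_rejected = True
    loads_message = str(exc)

print(dumps_rejected, loads_rejected)
```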
Repeated names within an object
The RFC specifies that the names within a JSON object should be unique, but does not mandate how repeated names in JSON objects should be handled. By default, this module does not raise an exception; instead, it ignores all but the last name-value pair for a given name:
>>> weird_json = '{"x": 1, "x": 2, "x": 3}'
>>> json.loads(weird_json)
{'x': 3}
The object_pairs_hook parameter can be used to alter this behavior.
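Two ways object_pairs_hook might be used here, as a sketch: a hypothetical reject_duplicates helper that raises on a repeated name, and the built-in list, which keeps every pair.

```python
import json

def reject_duplicates(pairs):
    """object_pairs_hook that raises instead of silently keeping the last value."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result

weird_json = '{"x": 1, "x": 2, "x": 3}'

try:
    json.loads(weird_json, object_pairs_hook=reject_duplicates)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

# Passing list keeps all pairs, in document order, as (name, value) tuples.
all_pairs = json.loads(weird_json, object_pairs_hook=list)
print(all_pairs)
```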
Top-level Non-Object, Non-Array values
The old version of JSON specified by the obsolete RFC 4627 required that the top-level value of a JSON text be either a JSON object or array (Python dict or list), and could not be a JSON null, boolean, number, or string value. RFC 7159 removed that restriction, and this module does not and has never implemented that restriction in either its serializer or its deserializer. Regardless, for maximum interoperability, you may wish to voluntarily adhere to the restriction yourself.
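A brief sketch showing that scalar top-level values round-trip through this module without being wrapped in an object or array:

```python
import json

scalars = [None, True, 3, 2.5, "text"]

encoded = [json.dumps(value) for value in scalars]
decoded = [json.loads(text) for text in encoded]

print(encoded)
print(decoded == scalars)
```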
Implementation limitations
Some JSON deserializer implementations may set limits on:
- the size of accepted JSON texts
- the maximum level of nesting of JSON objects and arrays
- the range and precision of JSON numbers
- the content and maximum length of JSON strings
This module does not impose any such limits beyond those of the relevant Python datatypes themselves or the Python interpreter itself. When serializing to JSON, beware of any such limitations in applications that may consume your JSON. In particular, it is common for JSON numbers to be deserialized into IEEE 754 double precision numbers and thus be subject to that representation’s range and precision limitations. This is especially relevant when serializing Python int values of extremely large magnitude, or when serializing instances of “exotic” numerical types such as decimal.Decimal.
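The precision concern can be sketched as follows. float() is used here only to imitate what a consumer that stores numbers as IEEE 754 doubles would see, and encoding a Decimal as a string with default=str is one possible workaround, not something the module prescribes.

```python
import json
from decimal import Decimal

big = 2**53 + 1                   # first integer a double cannot represent exactly
encoded = json.dumps(big)

exact = json.loads(encoded)       # Python int: arbitrary precision, no loss
as_double = float(encoded)        # imitates a consumer using IEEE 754 doubles
lost_precision = int(as_double) != big

# Decimal is not serializable by default; here it is sent as a string instead.
price = Decimal("0.10")
encoded_price = json.dumps({"price": price}, default=str)
restored = Decimal(json.loads(encoded_price)["price"])

print(encoded, lost_precision, encoded_price)
```

Sending numbers as strings requires the consumer to agree on that convention, so it is a trade-off rather than a general fix.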
Command Line Interface
Source code: Lib/json/tool.py
The json.tool module provides a simple command line interface to validate and pretty-print JSON objects. If the optional infile and outfile arguments are not specified, sys.stdin and sys.stdout will be used respectively:
$ echo '{"json": "obj"}' | python -m json.tool
{
    "json": "obj"
}
$ echo '{1.2:3.4}' | python -m json.tool
Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
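The same two invocations can be driven from Python, which may be convenient in tests. This is a sketch assuming Python 3.7 or later (for capture_output) and that sys.executable points at the interpreter to use; the input is passed on stdin, as in the shell examples above.

```python
import json
import subprocess
import sys

command = [sys.executable, "-m", "json.tool"]

# Valid input: the pretty-printed document is written to stdout.
valid = subprocess.run(command, input='{"json": "obj"}',
                       capture_output=True, text=True)
pretty = valid.stdout

# Invalid input: the tool reports the error and exits with a non-zero status.
invalid = subprocess.run(command, input='{1.2:3.4}',
                         capture_output=True, text=True)

print(pretty)
print(invalid.returncode)
```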
Command line options
infile: the JSON file to be validated or pretty-printed:
$ python -m json.tool mp_films.json
[
    {
        "title": "And Now for Something Completely Different",
        "year": 1971
    },
    {
        "title": "Monty Python and the Holy Grail",
        "year": 1975
    }
]
If infile is not specified, input is read from sys.stdin.