| Internet-Draft | UBER Format | March 2026 |
| Smith | Expires 21 September 2026 | [Page] |
This document defines the Universal Basic Element Representation (ÜBER), a language-independent, lightweight, text-based serialization format designed for the portable representation, transmission, and storage of structured data.¶
ÜBER employs a unified node architecture in which each node serves simultaneously as a value-carrier and a structural container. This recursive model allows any element to encapsulate a discrete data payload while concurrently functioning as a parent to nested members. The same structure also provides the granularity needed by implementations that target high-concurrency access patterns with localized, low-contention updates.¶
A top-level ÜBER profile consists of either an explicit root object or a sequence of members and directives. To prioritize human-centric design and authoring ergonomics, the grammar supports comments, flexible arbitrary-length key-value separators, optional commas, and dotted member names.¶
The ÜBER type system is defined by recursive containers and scalar
forms. Associative arrays ({}) and collections
([]) support arbitrary nesting, while concrete ordering
behavior remains implementation-dependent. Scalar forms include
deterministic numbers with arbitrary-precision integer syntax and
Java-aligned unsuffixed floating-point lexical forms, automatic
magnitude-based fitting and promotion,
versatile string representations including text blocks,
flexible boolean literals, and both explicit and omitted null
states.¶
This specification defines the lexical and syntactic grammar of ÜBER. It does not define implementation-specific runtime facilities except where a grammar element, such as directives, requires a brief statement of purpose.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 21 September 2026.¶
Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
ÜBER is a lightweight, text-based, language-independent data-serialization and data-interchange format for structured data. A complete top-level document is a profile, written either as an explicit root object or as a top-level sequence of profile statements, where each profile statement is a member or directive. The grammar is deliberately deterministic: prefixed forms are distinguished by leading syntax, and bare-token scalar forms are interpreted using a fixed fallback order in which unquoted text is terminal. This keeps parsing predictable across the broader authoring surface.¶
The core object model is recursive. A member can hold a scalar value, nested child members, or both at the same time. This unified node architecture is what allows valued members to exist without introducing a separate node taxonomy for container and scalar forms.¶
The format is intentionally human-oriented at the surface-syntax level. Comments, optional commas, unquoted forms, and implicit top-level profile forms are provided so that human-authored documents can remain concise without changing the deterministic underlying parse model.¶
The grammar defines recursive containers and scalar forms.
Associative arrays and collections may nest arbitrarily. Scalar forms
include deterministic numbers with arbitrary-precision integer syntax
and Java-aligned unsuffixed floating-point lexical forms, multiple
string syntaxes including text blocks,
flexible boolean literals, and a distinction between explicit
null and omitted values in member position.¶
The grammar defined here has the following high-level properties:¶
This document is restricted to the grammar of the format. Features such as defaults chaining, macro resolution, and concrete directive effects are outside the scope of this specification.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
For convenience, this document also cites the combined requirements language specification as [BCP14].¶
ABNF in this document uses the notation of [RFC5234]. Case-sensitive literal strings use the extensions defined by [RFC7405].¶
For readability, some productions use ABNF prose values for character classes that are more naturally stated as exclusions, such as "any character except DQUOTE or reverse solidus". Such prose values are normative.¶
A complete top-level ÜBER document is a profile. A
profile is either:¶
The profile is the complete document unit. In the implicit form, the top level consists of profile statements, and a profile statement is either a member or a directive. The implicit form exists so that configuration-oriented documents can omit outer braces. Commas between statements are optional both at the top level and inside objects, and whitespace alone can separate adjacent statements. Nested objects, however, are always explicit.¶
Directives therefore belong to the profile model itself rather than to some separate outer feature layer. The grammar defines their shape but does not define directive names or effects. Specific directive vocabularies are left to implementations or to future specifications.¶
<CODE BEGINS>
profile = *ws object *ws
/ *ws profile-statements *ws
profile-statement = member / directive
profile-statements = profile-statement
/ profile-statements [ "," ] profile-statement
<CODE ENDS>
Whitespace in ÜBER includes inline spacing, line terminators, and comments. Comments are treated as whitespace wherever the grammar permits whitespace.¶
<CODE BEGINS>
ws = whitespace / comment
whitespace = inline-space / line-terminator
inline-space = SP / HTAB / %x0B / %x0C
line-terminator = LF / CR / CRLF
line-end = line-terminator / eof
eof = <end of input>
LF = %x0A
CR = %x0D
control-character = %x00-1F
comment = single-line-comment / block-comment
single-line-comment = single-line-marker [ comment-chars ] line-end
single-line-marker = "//" / "#" / "!"
block-comment = "/*" [ block-comment-chars ] "*/"
comment-chars =
<characters not containing CR or LF>
block-comment-chars = <any character sequence not containing "*/">
<CODE ENDS>
The line-oriented and block delimiter forms are defined by Figure 2. The block form terminates at the next matching closing delimiter.¶
<CODE BEGINS>
digit = DIGIT
onenine = %x31-39
octdigit = %x30-37
bindigit = %x30 / %x31
hexdigit = HEXDIG
digit-or-underscore = digit / "_"
hex-digit-or-underscore = hexdigit / "_"
octal-digit-or-underscore = octdigit / "_"
binary-digit-or-underscore = bindigit / "_"
decimal-digits = digit-or-underscore
/ decimal-digits digit-or-underscore
hex-digits = hex-digit-or-underscore
/ hex-digits hex-digit-or-underscore
octal-digits = octal-digit-or-underscore
/ octal-digits octal-digit-or-underscore
binary-digits = binary-digit-or-underscore
/ binary-digits binary-digit-or-underscore
true-literal = %s"true" / %s"yes" / %s"on"
false-literal = %s"false" / %s"no" / %s"off"
null-literal = %s"null"
<CODE ENDS>
Underscores are permitted inside digit runs as visual separators. The accepted placement is the one defined by these productions, including the permissive underscore handling defined by this specification.¶
A value is parsed deterministically. For prefixed forms, the leading syntax selects the parse path directly. Precedence-sensitive fallback applies only among the bare-token scalar candidates. The grammar accepts the following value forms:¶
<CODE BEGINS>
element = *ws value *ws
value = text-block
/ object
/ array
/ double-quoted-string
/ single-quoted-string
/ number
/ boolean
/ null-literal
/ unquoted-string
scalar-value = text-block
/ array
/ double-quoted-string
/ single-quoted-string
/ number
/ boolean
/ null-literal
/ unquoted-string
boolean = true-literal / false-literal
<CODE ENDS>
The object alternative is not present in scalar-value
because member syntax separately permits a trailing object after an
optional scalar value. This is what allows a member to hold both a
scalar value and nested members.¶
In the reference implementation, precedence matters only after the
prefixed forms have been excluded. The bare-token scalar branch is then
interpreted as number, boolean, null, and
finally unquoted-string fallback.¶
An object is a collection of members enclosed in braces. Members can be separated by commas, whitespace, or both. Trailing commas are not permitted.¶
<CODE BEGINS>
object = "{" [ members ] "}"
members = member
/ members [ "," ] member
member = *ws dotted-name *ws separator
[ scalar-element ] *ws [ object ]
scalar-element = *ws scalar-value *ws
separator = explicit-separator / ws
explicit-separator = separator-char
/ explicit-separator separator-char
separator-char = ":" / "="
<CODE ENDS>
A member MAY therefore hold only a scalar value, only an object, or a scalar value followed by an object. The last form creates a valued member: the member simultaneously has a scalar value and child members.¶
The explicit-separator production permits any non-empty run of
: and = characters. Accordingly, :,
=, ::, :=, ==, ::=, and
other mixed runs are all valid explicit separators.¶
This grammar assigns no distinct meaning to different explicit separator runs. They are syntactically equivalent ways to separate a member name from its value. An implementation or profile layered on top of ÜBER SHOULD NOT assign different semantics to different separator runs unless it intentionally defines a narrower grammar than the one specified here.¶
<CODE BEGINS> alpha : 1 beta = 2 gamma := 3 delta :: 4 epsilon == 5 zeta ::= 6 <CODE ENDS>
Member names are parsed as dotted paths. Each name atom contributes one level of path structure, and dots separate adjacent atoms unless the dot is escaped or appears inside a single-quoted name atom.¶
<CODE BEGINS>
dotted-name = name-atom
/ dotted-name *ws "." *ws name-atom
name-atom = single-quoted-string / dq-name-atom / uq-name
dq-name-atom = DQUOTE dq-name-content DQUOTE
dq-name-content = dq-name-segment
/ dq-name-content "." dq-name-segment
dq-name-segment = *dq-name-char
dq-name-char = dq-name-unescaped / "\" escape-sequence
dq-name-unescaped = <any character except DQUOTE,
"\" , "." or control-character>
uq-name = *uq-name-char
uq-name-char = uq-name-unescaped / "\" escape-sequence
uq-name-unescaped = <any character except whitespace,
",", "{", "}", "[", "]",
":", "=", DQUOTE, "'", "\", or ".">
<CODE ENDS>
Name atoms may use any of the three non-text-block string forms: single-quoted, double-quoted, or unquoted. Empty name atoms are therefore permitted in all three forms. This means that a dotted name can legally begin with a dot, end with a dot, contain adjacent dots, or contain an explicit empty quoted atom, each of which yields an empty-string key at that path level.¶
In the double-quoted and unquoted name forms, a bare dot is a chain
separator. A literal dot in those forms therefore requires escaping.
In the single-quoted name form, escapes are not processed, so
\. remains two literal characters, while a single-quoted
atom such as 'a.b' still contributes multiple path
segments.¶
An array is an ordered sequence of elements enclosed in brackets. Commas between elements are optional. As with objects, trailing commas are not permitted.¶
<CODE BEGINS>
array = "[" [ elements ] "]"
elements = element
/ elements [ "," ] element
<CODE ENDS>
ÜBER supports four string forms:¶
Double-quoted strings, text blocks, and unquoted strings recognize the general escape syntax. Single-quoted strings are literal except for the closing quote.¶
Text blocks follow the Java text-block model described by JEP 378.¶
<CODE BEGINS>
text-block = %s"\"\"\""
line-terminator
[ text-block-chars ]
%s"\"\"\""
text-block-chars = text-block-char
/ text-block-chars text-block-char
text-block-char = text-block-unescaped
/ line-terminator
/ "\" escape-sequence
text-block-unescaped = <any character except "\" ,
control-character,
or the closing three-DQUOTE delimiter>
double-quoted-string = DQUOTE [ dq-string-chars ] DQUOTE
dq-string-chars = dq-string-char
/ dq-string-chars dq-string-char
dq-string-char = dq-string-unescaped / "\" escape-sequence
dq-string-unescaped = <any character except DQUOTE,
"\" or control-character>
quoted-string = double-quoted-string / single-quoted-string
single-quoted-string = "'" [ sq-string-chars ] "'"
sq-string-chars = sq-string-char
/ sq-string-chars sq-string-char
sq-string-char = <any character except "'"
or control-character>
unquoted-string = uq-string-chars
uq-string-chars = uq-string-char
/ uq-string-chars uq-string-char
uq-string-char = uq-string-unescaped / "\" escape-sequence
uq-string-unescaped = <any character except whitespace,
control-character, ",", "{", "}",
"[", "]", ":", "=", DQUOTE,
"'", or "\">
<CODE ENDS>
<CODE BEGINS>
escape-sequence = simple-escape
/ unicode-escape
/ hex-escape
/ octal-escape
/ ws-escape
/ ","
/ "{"
/ "}"
/ "["
/ "]"
/ ":"
/ "="
simple-escape = %s"a" / %s"b" / %s"e" / %s"f" / %s"n" / %s"r"
/ %s"s" / %s"t" / %s"v"
/ "\" / "'" / DQUOTE / "/"
/ %s"0" / "." / "#" / "!" / "@"
ws-escape = SP
unicode-escape = %s"u" 4hexdigit
/ %s"u" 6hexdigit
/ %s"u" 8hexdigit
/ %s"u{" braced-unicode-digits "}"
braced-unicode-digits = hexdigit
/ braced-unicode-digits hex-digit-or-underscore
hex-escape = %s"x" hexdigit
/ hex-escape hexdigit
octal-escape = octdigit
/ octdigit octdigit
/ octdigit octdigit octdigit
<CODE ENDS>
The punctuation escapes and the whitespace escape each yield the literal character represented by the escaped form.¶
In unquoted strings, punctuation characters that would otherwise terminate the token MUST be escaped to be included literally.¶
ÜBER numbers include ordinary decimal numbers, hexadecimal integers, octal integers in both legacy and explicit-prefixed forms, binary integers, decimal floating-point literals, hexadecimal floating-point literals, and special floating-point keywords. This section defines only the grammar of those forms.¶
The digit productions used here are the underscore-permissive lexical
productions defined earlier in this document. In particular,
decimal-digits, hex-digits, octal-digits,
and binary-digits each include underscore forms exactly as
specified by this grammar.¶
<CODE BEGINS>
number = special-value / numeric-literal
special-value = [ sign ] special-keyword
special-keyword = %s"NaN" / %s"Infinity"
numeric-literal = integer-literal / floating-point-literal
integer-literal = [ sign ] integer-value
integer-value = decimal-integer
/ hexadecimal-integer
/ octal-integer
/ binary-integer
decimal-integer = onenine [ decimal-digits ]
/ "0"
hexadecimal-integer = "0" hex-indicator hex-digits
octal-integer = "0" octal-digits
/ "0" octal-indicator octal-digits
binary-integer = "0" binary-indicator binary-digits
hex-indicator = %s"x" / %s"X"
binary-indicator = %s"b" / %s"B"
octal-indicator = %s"o" / %s"O"
floating-point-literal = [ sign ] decimal-floating-point-literal
/ [ sign ]
hexadecimal-floating-point-literal
decimal-floating-point-literal
= decimal-digits decimal-float-tail
/ "." decimal-digits [ exponent ]
hexadecimal-floating-point-literal
= hex-significand binary-exponent
hex-significand = "0" hex-indicator hex-digits
/ "0" hex-indicator
hex-digits "." [ hex-digits ]
/ "0" hex-indicator "." hex-digits
decimal-float-tail = "." [ decimal-digits ] [ exponent ]
/ exponent
exponent = exponent-indicator [ sign ] decimal-digits
exponent-indicator = %s"e" / %s"E"
binary-exponent = binary-exponent-indicator
[ sign ]
decimal-digits
binary-exponent-indicator
= %s"p" / %s"P"
sign = "+" / "-"
<CODE ENDS>
A decimal integer either consists of the single digit zero or begins
with a non-zero digit. This avoids alternative decimal spellings with
leading zeros, leaving the leading-zero form available for the distinct
octal grammar. Octal literals are accepted both in that legacy
leading-zero form and in explicit 0o or 0O-prefixed
form. Floating-point literals follow Java-style unsuffixed decimal and
hexadecimal spellings, including leading-dot decimal forms such as
.5 and hexadecimal forms such as 0x1.fp3; numeric
type suffixes such as L, l, F, f,
D, and d are not part of the ÜBER grammar.¶
The grammar places no fixed upper bound on the length of integer digit sequences or decimal digit sequences. Accordingly, ÜBER syntactically permits arbitrarily large integers and arbitrarily precise decimal literals.¶
An implementation that maps numeric literals into host-language numeric types SHOULD preserve the value of the literal as faithfully as possible. This specification does not require any particular set of runtime numeric classes, so the concrete representation of values outside common fixed-width ranges or common floating-point precision is implementation-defined. A common strategy is automatic fitting and promotion of unsuffixed literals based on magnitude and required precision.¶
The Java reference implementation, for example, promotes unsuffixed
integers through Integer and Long before using
BigInteger. For decimal floating-point literals, it selects
Float, Double, or BigDecimal based on the
authored mantissa precision together with range. For hexadecimal
floating-point literals, it selects Float or
Double. Other implementations MAY choose different
concrete numeric classes while preserving the same grammar.¶
For example, a very large integer such as
999999999999999999999999999999 remains grammatically valid,
and an unsuffixed literal such as 1e400 remains
grammatically valid even when it exceeds the range of common binary
floating-point types.¶
A directive is a top-level statement that begins with the at-sign character. The directive grammar is part of this specification because it affects the accepted top-level syntax of the format.¶
This document does not define any directive names or directive effects. Implementations MAY define directive vocabularies, and future specifications MAY standardize such vocabularies, but those semantics are outside the scope of this document.¶
<CODE BEGINS>
directive = "@"
[ inline-space ]
directive-name
1*inline-space
value
directive-name = 1*LOWALPHA
LOWALPHA = %x61-7A
<CODE ENDS>
Because the directive payload is a normal value, the grammar
permits any value form after a directive name, including structured
values such as arrays or objects.¶
All valid JSON texts as defined by [RFC8259] are valid
ÜBER texts. An implementation or
protocol that wishes to remain within the JSON subset can do so by
using explicit root objects or arrays, comma-separated collections,
explicit colon separators, double-quoted strings, decimal numbers, and
the JSON literals true, false, and null.¶
ÜBER additionally defines the following grammar features, none of which are available in JSON:¶
NaN and Infinity numeric tokens,¶
yes and on, and¶
These extensions improve authoring ergonomics while preserving a deterministic parse order and a grammar that remains tractable for implementers.¶
For interchange between systems that are not part of a closed ecosystem, an ÜBER text MUST be encoded using UTF-8.¶
Implementations MAY accept other encodings when those encodings are selected out of band, by local policy, or by an explicit API parameter. This specification does not define an in-band charset declaration.¶
Generators intended for interoperable exchange MUST emit UTF-8 when producing octets.¶
Strings and member-name atoms denote Unicode character sequences after quote handling and escape processing have been applied.¶
Implementations comparing strings or member names for semantic equality
MUST compare the resulting character sequences, not the original source
spellings. For example, "a" and "\u0061" compare
equal, and "\\" and "\u005C" compare equal.¶
Single-quoted strings do not process escapes. Accordingly,
'\u0061' is six literal characters, not the single character
a.¶
This specification does not require Unicode normalization. Distinct Unicode character sequences remain distinct unless another specification defines additional normalization rules.¶
A parser MUST accept every text that conforms to the grammar in this document.¶
A parser MAY accept extensions in addition to ÜBER. Such extensions are outside this specification. An implementation that accepts extensions SHOULD make it possible to disable them when strict conformance is required.¶
Parsers MAY set limits on input size, nesting depth, collection size, numeric length, comment length, text-block length, and other resource dimensions. A parser MAY reject texts that exceed such limits.¶
Because directives are specified here only as syntax, a parser MAY recognize directive-shaped input yet reject or ignore particular directive names at a higher semantic layer.¶
A generator MUST emit texts that conform to this grammar.¶
When more than one surface spelling is available for the same data, a generator MAY choose any conforming spelling. For maximum interoperability, generators SHOULD prefer explicit root objects, explicit separators, comma-separated collections, and UTF-8 octet output.¶
Generators intended for recipients that only understand the JSON subset SHOULD avoid comments, directives, implicit top-level objects, single-quoted strings, unquoted strings, text blocks, non-decimal numbers, non-JSON boolean keywords, dotted-name shorthand, and valued members.¶
ÜBER permits several syntactic conveniences that can reduce interoperability when documents are exchanged with implementations that support only a narrower subset.¶
In particular, repeated effective member paths, mixing dotted-name shorthand with explicit nested objects, empty name segments, valued members, directive usage, non-decimal numbers, and additional boolean keywords can all reduce interoperability unless another specification defines their use more narrowly.¶
Different syntactic spellings can denote the same effective tree. For example, dotted-name shorthand and explicitly nested objects can encode the same hierarchical result. Specifications that require stable round-tripping SHOULD define a canonical surface form if this matters.¶
Valued members are part of the ÜBER grammar but do not map directly to ordinary JSON objects. Protocols that rely on plain JSON object models SHOULD avoid valued members.¶
For maximum interoperability, documents SHOULD avoid duplicate effective member paths, empty name segments, directives, valued members, and non-JSON literal forms unless those features are required by the application protocol.¶
This document requests registration of the media type
application/uber.¶
.uber¶
curtis.allen.smith@gmail.com¶
A conforming ÜBER parser:¶
A conforming specification of directive semantics MUST define directive names and their processing behavior separately from this grammar document.¶
The following examples illustrate the grammar in more detail. They are examples of syntax and structure only; they do not standardize any directive semantics or other implementation-defined runtime behavior.¶
<CODE BEGINS>
{
"server" : {
"host" : "127.0.0.1",
"port" : 8080,
"enabled" : true
},
"paths": [ "/srv/app", "/srv/log" ]
}
<CODE ENDS>
<CODE BEGINS> // host and port use different separator forms server.host : "127.0.0.1" server.port = 8080 enabled yes # commas remain optional paths [ /srv/app /srv/log, /srv/cache ] <CODE ENDS>
<CODE BEGINS>
{
users: [
alice
bob, // comma allowed but not required
carol
]
retry-count : 3
timeout-ms = 5000
}
<CODE ENDS>
<CODE BEGINS> alpha 1 beta : 2 gamma = 3 delta := 4 epsilon :: 5 zeta == 6 eta ::= 7 <CODE ENDS>
<CODE BEGINS>
{
simple.name : 1
"quoted.segment".name : 2
'literal.dot.name' : 3
escaped\.dot.name : 4
.leading.empty : 5
trailing.empty. : 6
}
<CODE ENDS>
<CODE BEGINS>
entry: scalar {
child: 1
nested.flag: on
}
<CODE ENDS>
<CODE BEGINS>
{
dq : "line\nbreak and escaped \{ braces \}"
sq : 'backslash sequences stay literal: \n \u0041'
block
"""
multi-line text block
with "quotes" and embedded line breaks
"""
uq : bareword
}
<CODE ENDS>
<CODE BEGINS>
{
decimal = 1_000_000
hexadecimal = 0xFF_EC_DE_5E
octal = 0755
octal-alt = 0o755
binary = 0b1010_0110
leading-dot = .5
scientific = 6.022e23
hex-float = 0x1.fp3
wider-int = 3_000_000_000
big-integer = 999999999999999999999999999999
big-decimal = 1e400
not-a-number = NaN
infinity = -Infinity
}
<CODE ENDS>
<CODE BEGINS>
@import imports/user.profile # implementation-defined semantics
@ example {
payload : true,
note : "semantics are implementation-defined"
}
<CODE ENDS>
<CODE BEGINS>
# human-oriented top-level form
app.name : "Example Service"
app.version : 1.2.0
app.enabled yes
server {
host : "127.0.0.1"
port = 8080
banner : """
Example Service
ready for requests
"""
}
paths.static /srv/www
paths.logs /srv/log
limits {
retries : 3
backoff-ms : 1_500
mask : 0xFF00
}
feature: on {
child.flag: on
}
@example [alpha beta gamma]
<CODE ENDS>
This document defines only grammar. Even so, parsers for the format are exposed to the usual risks of processing nested and potentially large textual inputs. Implementations SHOULD apply reasonable limits to input size, nesting depth, string length, numeric length, and comment length in order to mitigate resource-exhaustion attacks.¶
Implementations that define directive semantics should additionally consider the security implications of those semantics. Such semantics are not specified here.¶
IANA is requested to register the media type
application/uber as described in
Section 17.¶
This appendix consolidates the full grammar into one ABNF block. The production inventory and rule names are the ones defined by this document.¶
<CODE BEGINS>
; Lexical structure
ws = whitespace / comment
whitespace = inline-space / line-terminator
inline-space = SP / HTAB / %x0B / %x0C
line-terminator = CR / LF / CRLF
line-end = line-terminator / eof
eof = <end of input>
LF = %x0A
CR = %x0D
control-character = %x00-1F
comment = single-line-comment / block-comment
single-line-comment = single-line-marker
[ comment-chars ]
line-end
single-line-marker = "//" / "#" / "!"
block-comment = "/*" [ block-comment-chars ] "*/"
comment-chars =
<characters not containing CR or LF>
block-comment-chars =
<any character sequence
not containing "*/">
digit = DIGIT
onenine = %x31-39
octdigit = %x30-37
bindigit = %x30 / %x31
hexdigit = HEXDIG
digit-or-underscore = digit / "_"
hex-digit-or-underscore = hexdigit / "_"
octal-digit-or-underscore = octdigit / "_"
binary-digit-or-underscore = bindigit / "_"
decimal-digits = digit-or-underscore
/ decimal-digits digit-or-underscore
hex-digits = hex-digit-or-underscore
/ hex-digits hex-digit-or-underscore
octal-digits = octal-digit-or-underscore
/ octal-digits octal-digit-or-underscore
binary-digits = binary-digit-or-underscore
/ binary-digits binary-digit-or-underscore
true-literal = %s"true" / %s"yes" / %s"on"
false-literal = %s"false" / %s"no" / %s"off"
null-literal = %s"null"
; Syntactic structure
; Top-level profile
profile = *ws object *ws
/ *ws profile-statements *ws
profile-statement = member / directive
profile-statements = profile-statement
/ profile-statements
[ "," ]
profile-statement
directive = "@"
[ inline-space ]
directive-name
1*inline-space
value
directive-name = 1*LOWALPHA
LOWALPHA = %x61-7A
; Values
element = *ws value *ws
value = text-block
/ object
/ array
/ double-quoted-string
/ single-quoted-string
/ number
/ boolean
/ null-literal
/ unquoted-string
scalar-value = text-block
/ array
/ double-quoted-string
/ single-quoted-string
/ number
/ boolean
/ null-literal
/ unquoted-string
boolean = true-literal / false-literal
; Objects and members
object = "{" [ members ] "}"
members = member
/ members [ "," ] member
member = *ws dotted-name *ws separator
[ scalar-element ] *ws [ object ]
scalar-element = *ws scalar-value *ws
dotted-name = name-atom
/ dotted-name *ws "." *ws name-atom
name-atom = single-quoted-string
/ dq-name-atom
/ uq-name
dq-name-atom = DQUOTE dq-name-content DQUOTE
dq-name-content = dq-name-segment
/ dq-name-content "." dq-name-segment
dq-name-segment = *dq-name-char
dq-name-char =
<any character except DQUOTE, "\",
".", or control-character>
/ "\" escape-sequence
uq-name = *uq-name-char
uq-name-char =
<any character except whitespace, ",",
"{", "}", "[", "]", ":", "=",
DQUOTE, "'", "\", or ".">
/ "\" escape-sequence
separator = explicit-separator / ws
explicit-separator = separator-char
/ explicit-separator separator-char
separator-char = ":" / "="
; Arrays
array = "[" [ elements ] "]"
elements = element
/ elements [ "," ] element
; Strings
text-block = %s"\"\"\""
line-terminator
[ text-block-chars ]
%s"\"\"\""
text-block-chars = text-block-char
/ text-block-chars text-block-char
text-block-char =
<any character except "\",
control-character,
or the closing three-DQUOTE delimiter>
/ line-terminator
/ "\" escape-sequence
double-quoted-string = DQUOTE [ dq-string-chars ] DQUOTE
dq-string-chars = dq-string-char
/ dq-string-chars dq-string-char
dq-string-char =
<any character except DQUOTE, "\",
or control-character>
/ "\" escape-sequence
quoted-string = double-quoted-string
/ single-quoted-string
single-quoted-string = "'" [ sq-string-chars ] "'"
sq-string-chars = sq-string-char
/ sq-string-chars sq-string-char
sq-string-char =
<any character except "'"
or control-character>
unquoted-string = uq-string-chars
uq-string-chars = uq-string-char
/ uq-string-chars uq-string-char
uq-string-char =
<any character except whitespace,
control-character, ",", "{", "}",
"[", "]", ":", "=", DQUOTE, "'",
or "\">
/ "\" escape-sequence
ws-escape = SP
escape-sequence = simple-escape
/ unicode-escape
/ hex-escape
/ octal-escape
/ ws-escape
/ ","
/ "{"
/ "}"
/ "["
/ "]"
/ ":"
/ "="
simple-escape = %s"a" / %s"b" / %s"e"
/ %s"f" / %s"n" / %s"r"
/ %s"s" / %s"t" / %s"v"
/ "\" / "'" / DQUOTE / "/"
/ %s"0" / "." / "#" / "!" / "@"
unicode-escape = %s"u" 4hexdigit
/ %s"u" 6hexdigit
/ %s"u" 8hexdigit
/ %s"u{" braced-unicode-digits "}"
braced-unicode-digits = hexdigit
/ braced-unicode-digits
hex-digit-or-underscore
hex-escape = %s"x" hexdigit
/ hex-escape hexdigit
octal-escape = octdigit
/ octdigit octdigit
/ octdigit octdigit octdigit
; Numbers
number = special-value / numeric-literal
special-value = [ sign ] special-keyword
special-keyword = %s"NaN" / %s"Infinity"
numeric-literal = integer-literal / floating-point-literal
integer-literal = [ sign ] integer-value
integer-value = decimal-integer
/ hexadecimal-integer
/ octal-integer
/ binary-integer
decimal-integer = onenine [ decimal-digits ]
/ "0"
hexadecimal-integer = "0" hex-indicator hex-digits
octal-integer = "0" octal-digits
/ "0" octal-indicator octal-digits
binary-integer = "0" binary-indicator binary-digits
hex-indicator = %s"x" / %s"X"
binary-indicator = %s"b" / %s"B"
octal-indicator = %s"o" / %s"O"
floating-point-literal = [ sign ] decimal-floating-point-literal
/ [ sign ]
hexadecimal-floating-point-literal
decimal-floating-point-literal
= decimal-digits decimal-float-tail
/ "." decimal-digits [ exponent ]
hexadecimal-floating-point-literal
= hex-significand binary-exponent
hex-significand = "0" hex-indicator hex-digits
/ "0" hex-indicator
hex-digits "." [ hex-digits ]
/ "0" hex-indicator "." hex-digits
decimal-float-tail = "." [ decimal-digits ] [ exponent ]
/ exponent
exponent = exponent-indicator
[ sign ]
decimal-digits
exponent-indicator = %s"e" / %s"E"
binary-exponent = binary-exponent-indicator
[ sign ]
decimal-digits
binary-exponent-indicator = %s"p" / %s"P"
sign = "+" / "-"
<CODE ENDS>¶