Concise Binary Object Representation Maint&Ext                C. Bormann
Internet-Draft                                    Universität Bremen TZI
Intended status: Informational                               9 July 2024
Expires: 10 January 2025


                           On Numbers in CBOR
                     draft-bormann-cbor-numbers-00

Abstract

   The Concise Binary Object Representation (CBOR), as defined in STD 94
   (RFC 8949), is a data representation format whose design goals
   include the possibility of extremely small code size, fairly small
   message size, and extensibility without the need for version
   negotiation.

   Among the kinds of data that a data representation format needs to be
   able to carry, numbers have a prominent role, but also have inherent
   complexity that needs attention from protocol designers and
   implementers of CBOR libraries and of the applications that use them.

   This document gives an overview of the number formats available in
   CBOR and of some notable registered CBOR tags, and it attempts to
   provide information about opportunities and potential pitfalls of
   these number formats.


   // This is a rather drafty initial revision, pieced together from
   // various components, so it has a higher level of redundancy than
   // ultimately desired.

About This Document

   This note is to be removed before publishing as an RFC.

   Status information for this document may be found at
   https://datatracker.ietf.org/doc/draft-bormann-cbor-numbers/.

   Discussion of this document takes place on the Concise Binary Object
   Representation Maintenance and Extensions Working Group mailing list
   (mailto:cbor@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/browse/cbor/.  Subscribe at
   https://www.ietf.org/mailman/listinfo/cbor/.

   Source for this draft and an issue tracker can be found at
   https://github.com/cabo/cbor-numbers.


Bormann                  Expires 10 January 2025                [Page 1]

Internet-Draft                CBOR Numbers                     July 2024


Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 10 January 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
     1.1.  Conventions and Definitions
   2.  Integer Numbers
   3.  IEEE 754 Floating Point Numbers
     3.1.  Integer vs. Floating Point
     3.2.  Considerations for non-finite numbers and non-numbers
       3.2.1.  Protocol Design Considerations
       3.2.2.  Implementation Considerations
   4.  Other Floating Point Numbers
   5.  Tagged Arrays of Numbers
   6.  Security Considerations
   7.  IANA Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Implementers' Checklists for Floating Point Values
     A.1.  NaN Payloads
       A.1.1.  NaN Implementation Details
       A.1.2.  NaN Tests Examples
   Acknowledgments
   Contributors
   Author's Address

1.  Introduction

   The Concise Binary Object Representation (CBOR), as defined in RFC
   8949 [STD94], is a data representation format whose design goals
   include the possibility of extremely small code size, fairly small
   message size, and extensibility without the need for version
   negotiation.

   Among the kinds of data that a data representation format needs to be
   able to carry, numbers have a prominent role, but also have inherent
   complexity that needs attention from protocol designers and
   implementers of CBOR libraries and of the applications that use them.

   This document gives an overview of the number formats available in
   CBOR and of some notable registered CBOR tags, and it attempts to
   provide information about opportunities and potential pitfalls of
   these number formats.

   It discusses CBOR representation of numbers in four main sections:

   *  Integer Numbers (Section 2),

   *  IEEE 754 Floating Point Numbers (Section 3),

   *  Other Floating Point Numbers (Section 4),

   *  Tagged Arrays of Numbers (Section 5).

   These sections will generally address considerations such as:

   *  Encoding efficiency (number of bytes needed), possibly processing
      efficiency (CPU used in processing)

   *  Preferred Serialization, Common Deterministic Encoding Profile
      (CDE, [I-D.ietf-cbor-cde], see also [I-D.bormann-cbor-det] for
      more background discussion)

   *  Use by applications


   *  Interoperability considerations, potential "dark corners"

1.1.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [BCP14] when, and only when, they appear in all capitals, as
   shown here.

   Terms and definitions from [STD94], [RFC8610], and [IEEE754] apply.

2.  Integer Numbers

   CBOR provides representations of integer numbers in unsigned and
   negative forms:

   *  Unsigned integers up to 2^64−1, major type 0

   *  Negative integers down to −2^64, major type 1

   *  Unsigned integers with no size limitations, tag 2 on a byte string

   *  Negative integers with no size limitations, tag 3 on a byte string

   The latter two forms are often called "bignums" for historical
   reasons, the former "basic" integers.  The Concise Data Definition
   Language (CDDL) [RFC8610] has the types uint, nint, and int for the
   ranges of values covered by major type 0, major type 1, and either of
   them, respectively; biguint, bignint, and bigint for the ranges of
   values covered by tag 2, tag 3, and either of them; and unsigned and
   integer for a choice of either form (but, interestingly, no
   corresponding type for negative numbers).  As the preferred encoding
   for an integer chooses between major type 0/1 and tag 2/3
   automatically, in practice biguint and unsigned are the same type, as
   are bigint and integer.

   The major type 0 numbers come in five different encoding sizes, as
   indicated by their initial byte: immediate ("1+0", 0..23), one-byte
   ("1+1", 0..255), two-byte ("1+2", 0..65535), four-byte ("1+4",
   0..2^32−1), and eight-byte ("1+8", 0..2^64−1).  The Preferred
   Serialization always uses the shortest of the major type 0 encodings
   available for an unsigned integer.  The intention is that there is no
   semantic difference between the major type 0 encodings, and there
   also is no semantic difference between major type 0 and tag 2.  This
   means that Preferred Serialization always uses major type 0 over tag
   2 when possible, and the shortest encoding of these (and thus no
   leading zero bytes for the tagged encodings).  Major type 1 and tag 3
   are analogous.
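
   The selection rules above can be sketched in a few lines of Python.
   This is a minimal illustration, not a reference implementation; the
   function names encode_head and encode_integer are hypothetical and
   chosen for this example only.

```python
def encode_head(major: int, arg: int) -> bytes:
    """Encode a CBOR head (initial byte plus argument), shortest form."""
    if arg < 24:
        return bytes([major << 5 | arg])              # "1+0" encoding
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([major << 5 | ai]) + arg.to_bytes(size, "big")
    raise ValueError("argument does not fit into 64 bits")


def encode_integer(n: int) -> bytes:
    """Preferred serialization of an integer of arbitrary size."""
    # Major type 1 and tag 3 both carry the value -1 - arg.
    major, arg = (0, n) if n >= 0 else (1, -1 - n)
    if arg < 1 << 64:
        return encode_head(major, arg)                # basic integer
    # Bignum: tag 2 or 3 on a byte string without leading zero bytes.
    body = arg.to_bytes((arg.bit_length() + 7) // 8, "big")
    return encode_head(6, 2 + major) + encode_head(2, len(body)) + body
```

   With this sketch, encode_integer(2**64 - 1) still yields a major
   type 0 item (0x1bffffffffffffffff), while encode_integer(2**64)
   switches to tag 2 on a nine-byte byte string
   (0xc249010000000000000000).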


   Note that there is no "signed type" in CBOR: as any specific number
   to be represented is either negative or not, it is represented as an
   unsigned integer or as a negative integer.  Major type 0 unsigned
   integers cover exactly the range of platform types such as uint64_t
   or u64.  Signed platform types such as int64_t or i64 can be
   represented in the lower half of the unsigned space and the upper
   half of the negative space.  Platforms typically have no nint64_t
   type that could take all negative numbers representable in major type
   1; generic decoders will therefore treat the lower half of the
   negative space in the same way they will treat bignums that do not
   fit the signed platform type.  Similarly, generic encoders for a
   platform with u128/i128 types will choose between major type 0/1 and
   tag 2/3 just like they would choose between the encoding sizes inside
   major type 0/1.
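
   As an illustration of the platform fit described above, a decoder
   that targets a signed 64-bit type might check a decoded head as
   sketched below.  The helper to_int64 is hypothetical; a generic
   decoder would more likely hand out-of-range values to its bignum
   handling instead of raising an error.

```python
INT64_MIN, INT64_MAX = -(1 << 63), (1 << 63) - 1


def to_int64(major: int, arg: int) -> int:
    """Map a major type 0/1 argument to a value that fits int64_t."""
    if major not in (0, 1):
        raise ValueError("not a basic integer")
    value = arg if major == 0 else -1 - arg
    # Only the lower half of the unsigned space and the upper half of
    # the negative space are representable in a signed 64-bit type.
    if not INT64_MIN <= value <= INT64_MAX:
        raise OverflowError("integer does not fit int64_t")
    return value
```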

   While additional representations of integers could be developed, the
   options already provided by [STD94] should be able to satisfy most
   applications.

3.  IEEE 754 Floating Point Numbers

   While integer numbers are relatively easy to represent, floating
   point numbers as a realization of rational or real numbers are a much
   more varied subject.  Many rational or real numbers require rounding
   before they can be encoded as a floating point number.

   There are many choices that can be made when designing a machine
   representation for floating point numbers.  After decades of vendor-
   specific formats, IEEE standardized [IEEE754], initially in 1985,
   updated in 2008 and then 2019 (ISO/IEC 60559, formerly IEC 559,
   mirrors IEEE 754).  This standard is widely adopted in hardware and
   software, offering choices such as binary vs. decimal floating point
   numbers, and different representation sizes.  Out of the large choice
   available, CBOR directly supports the three formats binary16,
   binary32, and binary64, i.e., the signed binary floating point
   formats in 16, 32, and 64 bits, colloquially known as half, single,
   and double precision.  Most platforms that support floating point
   computation support at least single precision and, except for the
   most constrained ones, also double precision; half precision is
   mostly used for storage and interchange and may be supported in
   software only.


3.1.  Integer vs. Floating Point

   Mathematically speaking, integer numbers are a subset of the rational
   or real numbers from which floating point numbers are drawn.  In many
   programming environments, however, integer numbers are clearly
   separated from floating point numbers (the most notable exception
   being the original JavaScript language, which only had one number
   type).

   For specific applications, it may be desirable to represent all
   numbers that can be represented as integers as such, even if they are
   used where floating point numbers are used for non-integers.
   [I-D.mcnally-deterministic-cbor] defines a CDE application profile
   that enforces this for a certain subset of the integers.

   Most CBOR applications so far have tended to get by with the kind of
   strong separation between the integer and floating point worlds that
   programming environments usually favor, so our focus will not be on
   approaches for intermingling them in this document.

3.2.  Considerations for non-finite numbers and non-numbers

   [IEEE754] distinguishes three kinds of floating point data items:

   *  finite floating-point number: A finite number that is
      representable in a floating-point format.  Note that these further
      divide into zero, subnormal, and normal; this distinction is
      usually not of interest in interchange, except that there are a
      few platforms with limited floating point support that may not
      support subnormal numbers.

   *  infinite floating-point number: One of the two values −Infinity
      and (positive) Infinity.  On many platforms, infinite numbers can
      be accessed via a floating point operation such as 1.0/0.0
      (positive infinity) or −1.0/0.0 (negative infinity); they react to
      comparisons as one would expect.

   *  NaN: a _floating point datum_ that is not a number (NaN), used to
      represent computations that didn't lead to a numeric result, not
      even an infinity.  A commonly implemented example for such a
      computation is 0.0/0.0.  The formats provide a way to include
      additional information with a NaN, such as its sign bit, whether
      operations on the NaN are intended to fail immediately (signaling)
      or just return another NaN (quiet), and some remaining bits that
      may carry additional information (intended as diagnostic).


      It can be surprising that according to [IEEE754], NaN values
      always compare as different even if they have the same NaN
      information (i.e., are identical).  (There is also a totalorder
      relation that does give NaNs a defined place, depending on their
      sign bits; this only recently has been standardized as part of
      std::strong_order in C++20 [Cplusplus20].)

   Not all platforms that use IEEE 754 provide all these kinds; e.g.,
   Erlang only provides finite floating-point numbers.  Platforms that
   do provide them vary widely in the way they provide access to
   non-finite numbers and NaNs beyond the floating point operations
   given above.  Usually there is an operation such as isnan() in C,
   which is needed as comparison to a NaN always yields inequality.
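
   The comparison behavior described above can be observed directly in
   most environments.  The following Python fragment is only a small
   demonstration; it also shows that Python is one of the platforms
   where the division route to an infinity is not available, so the
   values have to be obtained in another way (here via float()).

```python
import math

nan = float("nan")
pos_inf, neg_inf = float("inf"), float("-inf")

# A NaN compares as different from everything, including itself, so a
# dedicated predicate is needed to detect it.
nan_equals_itself = nan == nan          # False
nan_detected = math.isnan(nan)          # True

# Infinities take part in comparisons as one would expect.
infinities_ordered = neg_inf < -1e308 < 1e308 < pos_inf   # True

# In Python, 1.0/0.0 raises an exception instead of returning Infinity.
try:
    1.0 / 0.0
    division_yields_infinity = True
except ZeroDivisionError:
    division_yields_infinity = False
```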

3.2.1.  Protocol Design Considerations

   CBOR supports the interchange of all kinds of IEEE 754 data items,
   including non-finite numbers and non-numbers (NaNs).  For an
   application developer who is already using IEEE 754 floating point,
   there is little additional consideration required: Both infinities
   and NaN are widely supported in IEEE 754 hardware and software, by
   CPUs, operating systems, and programming environments.  CBOR protocol
   designs can generally rely on infinities and NaN as a concept being
   supported, but implementations may run into dark corners of their
   platforms when it comes to distinguishing and preserving NaN
   information in NaN values.

   However, for a protocol that wants to achieve good interoperability
   over a wide variety of platforms, the fact that platforms differ in
   their support of non-finite numbers and NaNs becomes relevant.  (See
   Section 3.2.2 below for reasons for such differences.)  Protocol
   designs aiming for the widest possible platform support may want to
   implement replacements for infinite numbers and NaNs, or at least not
   rely on NaN information being successfully preserved during
   interchange.

JSON Compatibility

   Note that JSON supports neither infinite numbers nor NaN.  For
   protocols that are intended to work in both CBOR and JSON
   representations and need an out-of-band indicator comparable to NaN,
   a protocol developer might consider this (in CDDL, where float is not
   intended to be a NaN value):

   float-with-null = float / null

   Additional choices can be added for the infinities (e.g., false and
   true, to stay within the CBOR simple values), if required.


   Since null, false and true have single-byte representations, the
   replacement of NaN, −Infinity, and (positive) Infinity by these
   values can save bytes even if JSON compatibility is not a
   consideration.
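
   A sender-side replacement along these lines might look like the
   following Python sketch.  The helper name to_json_compatible is
   hypothetical, and the assignment of false to −Infinity and true to
   (positive) Infinity is an assumption made for this example; a
   protocol would have to fix the mapping in its specification.

```python
import json
import math


def to_json_compatible(x: float):
    """Replace NaN and the infinities by null, true, and false."""
    if math.isnan(x):
        return None        # null stands in for NaN
    if math.isinf(x):
        return x > 0       # assumed: true = +Infinity, false = -Infinity
    return x


values = (1.5, float("nan"), float("inf"), float("-inf"))
encoded = json.dumps([to_json_compatible(v) for v in values])
# encoded is now the JSON text [1.5, null, true, false]
```

   The receiver needs to know from the data definition that null, false,
   and true in this position stand for non-finite values.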

   Applications that need to preserve the information in a NaN (sign
   bit, quiet bit, payload) may want to replace null with an
   application-oriented representation of that information, or simply
   with a (left-aligned, truncating trailing zero bytes) byte string
   representing those bits:

   float-with-nan-replacement = float / bytes

   For JSON, the byte string can be base16- or base64-encoded, or it can
   be represented by an integer, preserving its left-aligned nature, or
   even as a (tagged) floating point value with a different exponent.

3.2.2.  Implementation Considerations

   All floating-point numbers, including zeros and infinities, are
   signed.  A NaN also carries a sign bit.  Each of the three formats
   binary16, binary32, and binary64 defines a fixed assignment of bits
   in the representation to the sign bit, an exponent, and a
   "significand" (which represents the mantissa, with details sometimes
   depending on the specific exponent value).

             +==========+==========+==========+=============+
             | Format   | Sign bit | Exponent | Significand |
             +==========+==========+==========+=============+
             | binary16 | 1        | 5        | 10          |
             +----------+----------+----------+-------------+
             | binary32 | 1        | 8        | 23          |
             +----------+----------+----------+-------------+
             | binary64 | 1        | 11       | 52          |
             +----------+----------+----------+-------------+

                Table 1: Bit Allocation in Floating Point
                                 Formats

   Infinite numbers are represented in each format choice with a sign
   bit, the highest available exponent value (all ones) and all-zero
   significand.  NaN values are represented with a sign bit, the highest
   available exponent value (all ones) and a non-zero significand, which
   carries a leading quiet bit with the rest of the bits allocated to
   the NaN payload.
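
   Using the bit allocation from Table 1, the kind of a floating point
   data item can be read off its bit pattern.  The following Python
   sketch assumes the bit pattern is available as an unsigned integer;
   the names FORMATS and classify are hypothetical.

```python
# (exponent bits, significand bits) as given in Table 1
FORMATS = {"binary16": (5, 10), "binary32": (8, 23), "binary64": (11, 52)}


def classify(bits: int, fmt: str) -> str:
    """Classify a bit pattern as finite, infinite, or a kind of NaN."""
    exp_bits, sig_bits = FORMATS[fmt]
    exponent = (bits >> sig_bits) & ((1 << exp_bits) - 1)
    significand = bits & ((1 << sig_bits) - 1)
    if exponent != (1 << exp_bits) - 1:
        return "finite"              # exponent is not all ones
    if significand == 0:
        return "infinite"            # all-ones exponent, zero significand
    # The leading significand bit of a NaN is the quiet bit.
    quiet = significand >> (sig_bits - 1)
    return "quiet NaN" if quiet else "signaling NaN"
```

   For example, the binary16 patterns 0x7c00, 0x7e00, and 0x7c01 are
   classified as infinite, quiet NaN, and signaling NaN, respectively.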


   To qualify as a generic encoder or decoder, a CBOR library needs to
   implement as much of [IEEE754] support as reasonably possible on the
   platform it addresses.  What is reasonably possible depends on:

   *  Platform support for [IEEE754] numbers.  If there is no such
      support, the generic decoder may need to resort to offering the
      interchanged value to the application, suitably tagged.

   *  If there is partial support, it may be harder to find a good
      solution.  This is specifically a problem for platform support
      that works well in most cases, but exhibits some dark corners.
      E.g., the implementation may support a single NaN value
      consistently, but not preserve NaN information present in the NaN
      values.

   Where an implementation needs to convert between different floating
   point formats, e.g., because not all formats are fully supported by
   the platform, or to implement Preferred Serialization (as needed for
   Common Deterministic Encoding [I-D.ietf-cbor-cde]) in an encoder,
   conversion of NaNs in these formats is best done by operating on the
   bit patterns of the [IEEE754] number in the following way:

   *  Expansion (towards a larger size format):

      -  preserve the sign bit

      -  expand the (all-ones) exponent to the larger (all-ones)
         exponent

      -  fill up the significand with zero bits on the right

   *  Contraction (towards a smaller size format):

      -  preserve the sign bit

      -  truncate the (all-ones) exponent to the smaller (all-ones)
         exponent

      -  truncate the significand from the right; check if the removed
         bits were all zero.


   If the contraction is optional, e.g., for Preferred Serialization, do
   not perform the contraction if the removed bits in the significand
   truncation aren't all zero.  If the contraction is required to fit
   into limited platform types (e.g., binary32 only), a failed
   truncation check indicates the loss of information and should be
   signaled to the application.  We say a contraction "preserves the NaN
   information" if subsequent expansion to the original size format
   recreates the exact same NaN value.
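
   The expansion and contraction steps above can be sketched as
   operations on unsigned integers holding the bit patterns.  This
   Python sketch assumes its input is already known to be a NaN; the
   names expand_nan and contract_nan and the use of None to report a
   failed truncation check are choices made for this example.

```python
SIGNIFICAND_BITS = {16: 10, 32: 23, 64: 52}


def _parts(bits: int, width: int):
    """Split a NaN bit pattern into sign bit and significand."""
    sig_bits = SIGNIFICAND_BITS[width]
    return bits >> (width - 1), bits & ((1 << sig_bits) - 1)


def _nan(sign: int, significand: int, width: int) -> int:
    """Assemble a NaN: sign bit, all-ones exponent, significand."""
    sig_bits = SIGNIFICAND_BITS[width]
    exp_bits = width - 1 - sig_bits
    all_ones = (1 << exp_bits) - 1
    return (sign << (width - 1)) | (all_ones << sig_bits) | significand


def expand_nan(bits: int, src: int, dst: int) -> int:
    """Expand a NaN, filling the significand with zeros on the right."""
    sign, significand = _parts(bits, src)
    shift = SIGNIFICAND_BITS[dst] - SIGNIFICAND_BITS[src]
    return _nan(sign, significand << shift, dst)


def contract_nan(bits: int, src: int, dst: int):
    """Contract a NaN; return None if non-zero bits would be removed."""
    sign, significand = _parts(bits, src)
    shift = SIGNIFICAND_BITS[src] - SIGNIFICAND_BITS[dst]
    if significand & ((1 << shift) - 1):
        return None                  # truncation check failed
    return _nan(sign, significand >> shift, dst)
```

   For instance, contract_nan(0x7ff8000000000000, 64, 16) yields 0x7e00,
   and expanding that value back to 64 bits recreates the original
   pattern, so this contraction preserves the NaN information.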

   Appendix A.1 gives additional detailed considerations for
   implementations that aspire to provide full support for NaNs,
   preserving NaN information.

4.  Other Floating Point Numbers

   RFC 8949 [STD94] also defines tags 4 and 5 for a representation of
   decimal and binary floating point numbers that is not constrained by
   the types provided by IEEE 754.  These tags are very flexible, but
   this flexibility comes with a choice of ways they could be integrated
   into a generic encoder.  Because of this flexibility, tags 4 and 5 do
   not define a Preferred Serialization or a deterministic encoding.
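
   To illustrate the flexibility mentioned above: both tags carry an
   array of an exponent e and a mantissa m, with the value m × 10^e for
   tag 4 and m × 2^e for tag 5.  The following Python sketch computes
   these values exactly; the function names are hypothetical.

```python
from fractions import Fraction


def decimal_fraction_value(exponent: int, mantissa: int) -> Fraction:
    """Value of tag 4 applied to [exponent, mantissa]."""
    return mantissa * Fraction(10) ** exponent


def bigfloat_value(exponent: int, mantissa: int) -> Fraction:
    """Value of tag 5 applied to [exponent, mantissa]."""
    return mantissa * Fraction(2) ** exponent


# The same number has many representations, e.g., 1.5 as
# 4([-1, 15]), 4([-2, 150]), or 5([-1, 3]).
same_value = (decimal_fraction_value(-1, 15)
              == decimal_fraction_value(-2, 150)
              == bigfloat_value(-1, 3))
```

   An application that needs a single encoding per value would have to
   add its own rule for choosing among these representations.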

   Section 3.2 of [I-D.ietf-cbor-time-tag] uses representations derived
   from the tags 4 and 5 to represent timestamps.  Section 6.1 of
   [I-D.ietf-cbor-time-tag] lists various other tags that can be used
   for representing numbers for advanced arithmetic, including rational
   numbers in fraction form (tag 30).

5.  Tagged Arrays of Numbers

   [RFC8746] defines tags for typed arrays, i.e., arrays of numbers that
   all are represented in the same way.  The choices defined in
   [RFC8746] are all based on traditional platform number
   representations (unsigned integers, signed integers, IEEE 754
   floating point values) and even come in little-endian and big-endian
   variants, often removing the need to convert the numbers from an
   internal to an interchange form.  As conversion for interchange is
   not envisioned, considerations for a preferred serialization are not
   applicable.  As the recipient may need a conversion for ingestion of
   the arrays, some considerations from Section 3 may apply.
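
   As a small illustration, the following Python sketch wraps an array
   of binary32 values in little-endian byte order into a typed array
   using tag 85 of [RFC8746].  The helper names are hypothetical, and
   the head encoder is reduced to what this example needs.

```python
import struct

TAG_FLOAT32_LE = 85   # IEEE 754 binary32, little endian [RFC8746]


def head(major: int, arg: int) -> bytes:
    """Shortest-form CBOR head for arguments below 2^32."""
    if arg < 24:
        return bytes([major << 5 | arg])
    for ai, size in ((24, 1), (25, 2), (26, 4)):
        if arg < 1 << (8 * size):
            return bytes([major << 5 | ai]) + arg.to_bytes(size, "big")
    raise ValueError("argument too large for this sketch")


def encode_float32_le_array(values) -> bytes:
    """Tag 85 on a byte string holding the packed array elements."""
    body = struct.pack(f"<{len(values)}f", *values)
    return head(6, TAG_FLOAT32_LE) + head(2, len(body)) + body
```

   On a little-endian platform that already stores the array as packed
   binary32 values, the byte string body is simply the array's memory
   image, which is the point of offering both byte orders.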

6.  Security Considerations

   The general security considerations for representing data in common
   data representation formats apply, e.g., those in Section 10 of RFC
   8949 [STD94].

   (TODO)


7.  IANA Considerations

   (TODO: Add nan'' registration when that is ready)

8.  References

8.1.  Normative References

   [BCP14]    Best Current Practice 14,
              <https://www.rfc-editor.org/info/bcp14>.
              At the time of writing, this BCP comprises the following:

              Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

              Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

   [IEEE754]  IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE
              Std 754-2019, DOI 10.1109/IEEESTD.2019.8766229,
              <https://ieeexplore.ieee.org/document/8766229>.

   [RFC8610]  Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
              Definition Language (CDDL): A Notational Convention to
              Express Concise Binary Object Representation (CBOR) and
              JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
              June 2019, <https://www.rfc-editor.org/rfc/rfc8610>.

   [RFC8746]  Bormann, C., Ed., "Concise Binary Object Representation
              (CBOR) Tags for Typed Arrays", RFC 8746,
              DOI 10.17487/RFC8746, February 2020,
              <https://www.rfc-editor.org/rfc/rfc8746>.

   [STD94]    Internet Standard 94,
              <https://www.rfc-editor.org/info/std94>.
              At the time of writing, this STD comprises the following:

              Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", STD 94, RFC 8949,
              DOI 10.17487/RFC8949, December 2020,
              <https://www.rfc-editor.org/info/rfc8949>.

8.2.  Informative References


   [Cplusplus20]
              International Organization for Standardization,
              "Programming languages - C++", Sixth Edition, ISO/IEC
              JTC1 SC22 WG21 N 4860, March 2020,
              <https://isocpp.org/files/papers/N4860.pdf>.

   [I-D.bormann-cbor-det]
              Bormann, C., "CBOR: On Deterministic Encoding", Work in
              Progress, Internet-Draft, draft-bormann-cbor-det-02, 3
              March 2024, <https://datatracker.ietf.org/doc/html/draft-
              bormann-cbor-det-02>.

   [I-D.ietf-cbor-cde]
              Bormann, C., "CBOR Common Deterministic Encoding (CDE)",
              Work in Progress, Internet-Draft, draft-ietf-cbor-cde-02,
              3 March 2024, <https://datatracker.ietf.org/doc/html/
              draft-ietf-cbor-cde-02>.

   [I-D.ietf-cbor-time-tag]
              Bormann, C., Gamari, B., and H. Birkholz, "Concise Binary
              Object Representation (CBOR) Tags for Time, Duration, and
              Period", Work in Progress, Internet-Draft, draft-ietf-
              cbor-time-tag-12, 30 October 2023,
              <https://datatracker.ietf.org/doc/html/draft-ietf-cbor-
              time-tag-12>.

   [I-D.mcnally-deterministic-cbor]
              McNally, W., Allen, C., and C. Bormann, "dCBOR: A
              Deterministic CBOR Application Profile", Work in Progress,
              Internet-Draft, draft-mcnally-deterministic-cbor-10, 10
              June 2024, <https://datatracker.ietf.org/doc/html/draft-
              mcnally-deterministic-cbor-10>.

Appendix A.  Implementers' Checklists for Floating Point Values

   This checklist employs [BCP14] keywords to indicate interoperability
   requirements on implementations.

   The following considerations apply to encoding (emitting) floating
   point values in a generic encoder:

   *  The length of the argument is encoded in the lower 5 bits of the
      first byte ("ai"), which indicates half precision (binary16, ai =
      0x19), single precision (binary32, ai = 0x1a) and double precision
      (binary64, ai = 0x1b).


      For preferred serialization: if more than one of these encodings
      preserves the value to be encoded, the shortest of them MUST be
      emitted.  That implies that encoders MUST support half-precision
      and (if there is support for more than half precision on the
      platform) single-precision floating point.  Positive and negative
      infinity and zero MUST be represented in half-precision floating
      point.

   *  NaNs MUST be supported, for all values of NaN information allowed
      in [IEEE754].

      As with all floating point numbers, NaNs with payloads MUST be
      contracted to the shortest of double, single or half precision
      that preserves the NaN information.

      The reduction is performed by removing the rightmost N bits of the
      payload, where N is the difference in the number of bits in the
      significand (mantissa) between the original format and the reduced
      format.  The reduction is performed only (preserves the value
      only) if all the rightmost bits removed are zero.  (This will
      always reduce a double or single quiet NaN with an otherwise zero
      NaN payload, which is typically what is returned from an operation
      such as 0.0/0.0, to a half-precision quiet NaN encoded as
      0xf9 7e00.)
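
   The encoder side of this checklist can be sketched as follows in
   Python, whose struct module supports binary16 via the "e" format.
   This is an illustration under the assumption that the platform's
   float is binary64; encode_float_preferred is a hypothetical name.
   Finite and infinite values are tested by round-tripping through the
   shorter formats, while NaNs are handled on the bit pattern, so that
   the result does not depend on how the platform converts NaNs.

```python
import math
import struct


def encode_float_preferred(x: float) -> bytes:
    """Encode a binary64 value in the shortest value-preserving form."""
    if math.isnan(x):
        bits = struct.unpack(">Q", struct.pack(">d", x))[0]
        sign, sig = bits >> 63, bits & ((1 << 52) - 1)
        if sig & ((1 << 42) - 1) == 0:       # rightmost 42 bits zero
            half = sign << 15 | 0x7c00 | sig >> 42
            return b"\xf9" + half.to_bytes(2, "big")
        if sig & ((1 << 29) - 1) == 0:       # rightmost 29 bits zero
            single = sign << 31 | 0x7f800000 | sig >> 29
            return b"\xfa" + single.to_bytes(4, "big")
        return b"\xfb" + bits.to_bytes(8, "big")
    for initial, fmt in ((b"\xf9", ">e"), (b"\xfa", ">f")):
        try:
            packed = struct.pack(fmt, x)
        except OverflowError:                # too large for this format
            continue
        if struct.unpack(fmt, packed)[0] == x:
            return initial + packed          # value survives round trip
    return b"\xfb" + struct.pack(">d", x)
```

   With this sketch, 1.5 is encoded as 0xf93e00, 100000.0 as
   0xfa47c35000, and 1.1 as 0xfb3ff199999999999a, in line with the
   examples in Appendix A of RFC 8949.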

   The following considerations apply to decoding (ingesting) floating
   point values in a generic decoder that supports IEEE 754 floating-
   point numbers:

   *  Half-precision values MUST be accepted.

   *  Double- and single-precision values SHOULD be accepted; leaving
      these out is only foreseen for decoders that need to work in
      exceptionally constrained environments.

   *  If double-precision values are accepted, single-precision values
      MUST be accepted.

   *  NaNs MUST be accepted, preserving the NaN information for use by
      the application.

A.1.  NaN Payloads

   An IEEE 754 data item has up to 52 bits in the significand.  For a
   NaN, the first of these bits is used to indicate whether the NaN is
   signaling (0) or quiet (1).  The up to 51 bits in the rest of the
   significand are called the "NaN payload".


   The payload’s original purpose is diagnostic information to explain
   why a NaN was generated by a local computation.  There is no standard
   for the contents of a NaN payload.

   CBOR allows NaNs with non-zero payloads to be encoded.  (A NaN with a
   zero payload must always be a quiet NaN: with the quiet bit cleared
   as well, the significand would be all zero, which in [IEEE754]
   encodes an infinite number.)

   As a result, if a protocol design does not use NaNs with non-zero
   payloads and is using preferred serialization, then NaN must be
   encoded as a half-precision value with the quiet bit set and the
   payload set to 0, specifically 0xF97E00.  If a design does not use
   NaNs with non-zero payloads and preferred serialization is not used,
   then the single and double precision quiet NaNs, 0xFA7FC00000 and
   0xFB7FF8000000000000, may also be used.

   NaN payloads have been in the IEEE-754 standard since 2008, but
   programming environments often still do not provide facilities (e.g.,
   APIs) to make use of them.  For example, in C there is the isnan()
   API to check if a value is a NaN, but there are no APIs to construct
   or access the NaN payload.  The typical way to work with a NaN
   payload is to reinterpret the floating-point value as an unsigned
   integer and then use shifts and masks to unpack the IEEE-754
   representation.
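
   The reinterpret-and-mask approach can be sketched in Python, where
   the struct module plays the role of the reinterpreting cast.  The
   helper names nan_fields and make_quiet_nan are hypothetical, and the
   sketch assumes a platform that passes quiet NaN bit patterns through
   unchanged when a value is moved between these representations.

```python
import math
import struct


def nan_fields(x: float):
    """Return (sign, quiet, payload) of a binary64 NaN."""
    if not math.isnan(x):
        raise ValueError("not a NaN")
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63                    # bit 63
    quiet = (bits >> 51) & 1             # first bit of the significand
    payload = bits & ((1 << 51) - 1)     # remaining 51 bits
    return sign, quiet, payload


def make_quiet_nan(payload: int, sign: int = 0) -> float:
    """Build a binary64 quiet NaN carrying the given payload."""
    if not 0 <= payload < 1 << 51:
        raise ValueError("payload must fit into 51 bits")
    bits = sign << 63 | 0x7ff << 52 | 1 << 51 | payload
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

   For example, nan_fields(make_quiet_nan(0x1234)) is expected to return
   (0, 1, 0x1234) on such a platform.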

A.1.1.  NaN Implementation Details

   This section is primarily for CBOR library implementors.

   CBOR attempts to limit the MUSTs about CBOR implementations in order
   to allow its use in a large variety of constrained use cases.  For
   example, support for integers is not required because a protocol
   might need only strings.  Similarly, there is no MUST that requires
   support for NaNs, with or without non-zero payloads.  The
   recommendation here is that any generic CBOR library that supports
   floating-point values also support NaNs, preferably including NaNs
   with non-zero payloads.

   In most environments, there is little extra work to do to support NaN
   without payloads if floating-point is supported.  NaNs will usually
   flow through as any other floating-point value.

   Generic CBOR libraries are expected to support preferred
   serialization of floating-point including NaNs.  For NaNs with zero
   payloads, this requires reducing to a half-precision NaN without a
   payload.  This requires a few explicit extra lines of code.  See the
   sample half-precision implementation in Appendix D of RFC 8949.


   The implementation of preferred serialization of NaNs with non-zero
   payloads needs a few more lines.  As with preferred serialization
   of other floating-point values, a NaN must be reduced to a shorter
   format, but only if this can be done without losing any non-zero
   payload bits.  Floating-point hardware and software provided by the
   programming platform may or may not do this correctly for double to
   single conversion.  The sample half-precision implementation in
   Appendix D of RFC 8949 only supports NaNs without payloads.

   A double precision NaN payload contains 51 bits, a single 22 bits and
   a half 9 bits, in each case all but the first bit of the significand.
   A double precision NaN can be reduced to a single precision NaN only
   if the right-most 29 payload bits are zero.  A single precision NaN
   can be reduced to a half precision NaN only if the right-most 13
   payload bits are zero.  A double NaN can be reduced to a half
   precision NaN only if the right-most 42 payload bits are zero.  Note
   that the exponent is always all-ones for a NaN, so this is simpler
   than the equivalent contraction of regular, non-NaN floating-point
   values.
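
   The reduction rules above can be sketched in C as follows.  This is
   a minimal sketch, not taken from any CBOR library: the function name
   encode_nan_preferred and its buffer interface are hypothetical, the
   input is assumed to be the binary64 bit pattern of a NaN, and
   preserving the sign bit is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: write the preferred CBOR serialization of a NaN,
 * given as its binary64 bit pattern, into buf (at least 9 bytes).
 * Returns the number of bytes written (3, 5, or 9).
 */
static size_t encode_nan_preferred(uint64_t bits, uint8_t *buf)
{
    uint64_t sign = bits >> 63;
    uint64_t sig  = bits & ((UINT64_C(1) << 52) - 1); /* 52-bit significand */
    uint64_t out;
    size_t   n, i;

    /* Caller must pass a NaN: exponent all ones, significand non-zero. */
    assert(((bits >> 52) & 0x7ff) == 0x7ff && sig != 0);

    if ((sig & ((UINT64_C(1) << 42) - 1)) == 0) {
        /* Right-most 42 bits are zero: keep the top 10 bits (half). */
        buf[0] = 0xf9;
        out = (sign << 15) | UINT64_C(0x7c00) | (sig >> 42);
        n = 2;
    } else if ((sig & ((UINT64_C(1) << 29) - 1)) == 0) {
        /* Right-most 29 bits are zero: keep the top 23 bits (single). */
        buf[0] = 0xfa;
        out = (sign << 31) | UINT64_C(0x7f800000) | (sig >> 29);
        n = 4;
    } else {
        /* Cannot be reduced without losing non-zero payload bits. */
        buf[0] = 0xfb;
        out = bits;
        n = 8;
    }
    /* CBOR carries the floating-point bits in network byte order. */
    for (i = 0; i < n; i++)
        buf[1 + i] = (uint8_t)(out >> (8 * (n - 1 - i)));
    return 1 + n;
}
```

   With this sketch, the input 0x7ff8000000000000 yields the three
   bytes 0xf97e00, and the input 0x7fffffffe0000000 yields the five
   bytes 0xfa7fffffff.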

   To implement the above, most CBOR libraries will have to reinterpret
   the floating-point value as an unsigned integer and use shifts and
   masks, based on the internal representation defined in [IEEE754].

   Testing on some CPUs has shown them to do this correctly for
   conversion between single and double.  However, it may not be very
   useful to rely on platform libraries, for the following reasons.
   First, they may provide no support at all for half-precision, which
   is required for preferred serialization.  Second, NaN payloads are a
   relatively recent and very specialist feature that is not usually
   used in interchange, so platform handling of them cannot be taken
   for granted.

   If the platform implementation is relied upon, NaN payload reduction
   should be tested on each platform.  Open source libraries intended
   to run on multiple platforms may be better off not relying on the
   platform.
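
   A per-platform test of this kind could look like the following C
   sketch.  It is only an illustration under stated assumptions: double
   and float are assumed to be binary64 and binary32, and the names
   narrow_nan_bits and platform_keeps_nan_payload are hypothetical.
   The probe compares the platform's double to float conversion with
   the result obtained by shifts and masks.  The answer depends on the
   platform, and a signalling NaN is typically quieted by the
   conversion, so the probe is best run with quiet NaNs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Expected binary32 bits when a binary64 NaN is narrowed by bit
 * manipulation; only lossless if the right-most 29 bits are zero. */
static uint32_t narrow_nan_bits(uint64_t bits)
{
    uint32_t sign = (uint32_t)(bits >> 63) << 31;
    uint32_t sig  = (uint32_t)((bits >> 29) & 0x7fffff);
    return sign | UINT32_C(0x7f800000) | sig;
}

/* Hypothetical probe: does the platform's double-to-float conversion
 * produce the same bits as narrow_nan_bits() for this NaN? */
static bool platform_keeps_nan_payload(uint64_t bits)
{
    volatile double d;   /* volatile discourages compile-time folding */
    volatile float  f;
    double   dtmp;
    float    ftmp;
    uint32_t fbits;

    assert(sizeof(double) == sizeof(uint64_t));
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&dtmp, &bits, sizeof dtmp);
    d = dtmp;
    f = (float)d;        /* the platform conversion under test */
    ftmp = f;
    memcpy(&fbits, &ftmp, sizeof fbits);
    return fbits == narrow_nan_bits(bits);
}
```

   A library's test suite could call platform_keeps_nan_payload() with
   several quiet NaN bit patterns and fall back to the shift-and-mask
   path wherever the probe returns false.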

A.1.2.  NaN Test Examples

   The IEEE-754 numbers are given as a 64-bit (binary64) or 32-bit
   (binary32) unsigned integer in hex to show the bits that make up the
   floating-point value.  All of the following are NaNs.  In the
   comments, bit counts refer to significand bits, including the quiet
   bit.


    +====================+======================+=====================+
    | IEEE-754 Number    | CBOR Preferred       | Comment             |
    |                    | Serialization        |                     |
    +====================+======================+=====================+
    | 0x7ff8000000000000 | 0xf97e00             | qNaN contracted     |
    |                    |                      | from double to half |
    +--------------------+----------------------+---------------------+
    | 0x7ff8000000000001 | 0xfb7ff8000000000001 | Can't be contracted |
    |                    |                      | because of bit set  |
    |                    |                      | in right-side part  |
    |                    |                      | of payload          |
    +--------------------+----------------------+---------------------+
    | 0x7ffffc0000000000 | 0xf97fff             | 10-bit payload that |
    |                    |                      | can be contracted   |
    |                    |                      | to half             |
    +--------------------+----------------------+---------------------+
    | 0x7ff80000000003ff | 0xfb7ff80000000003ff | right-justified     |
    |                    |                      | payload can't be    |
    |                    |                      | contracted          |
    +--------------------+----------------------+---------------------+
    | 0x7fffffffe0000000 | 0xfa7fffffff         | 23-bit payload that |
    |                    |                      | reduces to single   |
    +--------------------+----------------------+---------------------+
    | 0x7ffffffff0000000 | 0xfb7ffffffff0000000 | 24-bit payload that |
    |                    |                      | can't be contracted |
    +--------------------+----------------------+---------------------+
    | 0x7fffffffffffffff | 0xfb7fffffffffffffff | All payload bits    |
    |                    |                      | set, can't be       |
    |                    |                      | contracted          |
    +--------------------+----------------------+---------------------+
    | 0x7fc00000         | 0xf97e00             | qNaN contracted     |
    |                    |                      | from single to half |
    +--------------------+----------------------+---------------------+
    | 0x7fffe000         | 0xf97fff             | single 10-bit       |
    |                    |                      | payload that can be |
    |                    |                      | contracted          |
    +--------------------+----------------------+---------------------+
    | 0x7fbff000         | 0xfa7fbff000         | single payload that |
    |                    |                      | can't be contracted |
    |                    |                      | to 10 bits          |
    +--------------------+----------------------+---------------------+

        Table 2: Examples for Preferred Serialization of NaN values

Acknowledgments

Contributors


   Laurence Lundblade
   Security Theory LLC
   Email: lgl@securitytheory.com


   Laurence wrote much of the initial text about NaN processing.

Author's Address

   Carsten Bormann
   Universität Bremen TZI
   Postfach 330440
   D-28359 Bremen
   Germany
   Phone: +49-421-218-63921
   Email: cabo@tzi.org

