Details of specifying Decimal numbers and algorithms #122

Open
waldemarhorwat opened this issue Apr 12, 2024 · 4 comments · Fixed by #138

@waldemarhorwat

Can we set up a video meeting to discuss how to formalize Decimal numbers and the algorithms involving those in the spec? I'd like to ensure that the spec is correct, clear, unambiguous, and easy to read and, if desired, am willing to provide extensive help to achieve that. I'm available next week Tue-Thu.

@waldemarhorwat
Author

waldemarhorwat commented Apr 15, 2024

I'd suggest representing Decimal128 values and algorithms analogously to how we represent Number and BigInt values.

Format

A Decimal128 value would be one of the following:

  • NaN𝔻
  • +∞𝔻
  • -∞𝔻
  • «v, q»𝔻, where v and q satisfy:
    • v is +0𝔻, -0𝔻, or a real number. Here +0𝔻 and -0𝔻 are symbols representing special values, not the real number 0.
    • q is an integer that satisfies -6176 ≤ q ≤ 6111
    • If v is a real number, v × 10^(-q) is an integer n that satisfies 0 < |n| < 10^34
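
As a rough illustration of the constraints on «v, q»𝔻 for a real-number v, here is a minimal Python sketch. The function name and the use of `Fraction` to stand in for mathematical real numbers are assumptions made for this example only; they are not part of the proposal.

```python
from fractions import Fraction

Q_MIN, Q_MAX = -6176, 6111      # allowed range of the quantum exponent q
COEFFICIENT_LIMIT = 10**34      # |n| must be strictly below this

def is_finite_nonzero_decimal128(v: Fraction, q: int) -> bool:
    """Check the constraints on «v, q» when v is a real number (hypothetical helper)."""
    if not (Q_MIN <= q <= Q_MAX):
        return False
    n = v * Fraction(10) ** (-q)            # v × 10^(-q), computed exactly
    return n.denominator == 1 and 0 < abs(n) < COEFFICIENT_LIMIT

# 2500 with q = 2 has n = 25, which is allowed.
print(is_finite_nonzero_decimal128(Fraction(2500), 2))
# 2500 with q = 3 has n = 2.5, which is not an integer.
print(is_finite_nonzero_decimal128(Fraction(2500), 3))
```

The real number 0 is rejected here because zeros are represented by the symbols +0𝔻 and -0𝔻 rather than by a real-number v.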

Some definitions:

  • A finite Decimal128 is a Decimal128 value of the form «v, q»𝔻.
  • A zero Decimal128 is a Decimal128 value of the form «v, q»𝔻 where v is +0𝔻 or -0𝔻.
  • A finite nonzero Decimal128 is a Decimal128 value of the form «v, q»𝔻 where v is a real number.

Cohort

The cohort of a Decimal128 value is the value without the q. Specifically:

  • cohort(NaN𝔻) is NaN𝔻
  • cohort(+∞𝔻) is +∞𝔻
  • cohort(-∞𝔻) is -∞𝔻
  • cohort(«v, q»𝔻) is v. Here v is +0𝔻, -0𝔻, or a real number.

The use of cohorts is pervasive in simplifying specifications of spec algorithm steps, the bulk of which do not depend on q.
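
A small Python model of the cohort function might look like the following. The `Finite` tuple and the use of plain strings for NaN𝔻, ±∞𝔻, and the two zero symbols are modelling assumptions for this sketch, not spec notation.

```python
from fractions import Fraction
from typing import NamedTuple, Union

class Finite(NamedTuple):
    """Models «v, q»: v is a Fraction, or the symbol "+0" / "-0"."""
    v: Union[Fraction, str]
    q: int

# Non-finite values are modelled as the strings "NaN", "+inf", and "-inf".
Decimal128 = Union[str, Finite]

def cohort(d: Decimal128) -> Union[Fraction, str]:
    """Drop q from finite values; non-finite values are their own cohort."""
    return d.v if isinstance(d, Finite) else d

# Two representations of 2500 that differ only in q share a cohort.
same = cohort(Finite(Fraction(2500), 2)) == cohort(Finite(Fraction(2500), -1))
print(same)
```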

Exponent and Significand

Every finite nonzero Decimal128 value «v, q»𝔻 has an exponent and a significand. The exponent is the unique integer e and the significand is the unique real number s that satisfy v = s × 10^e and 1 ≤ |s| < 10.
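
One way to compute e and s exactly is sketched below in Python, using `Fraction` as a stand-in for real numbers. The function name is hypothetical. The bit-length estimate is only a starting point, and the two loops correct it exactly.

```python
import math
from fractions import Fraction

def exponent_and_significand(v: Fraction):
    """Return (e, s) with v == s * 10**e and 1 <= |s| < 10, for nonzero v."""
    if v == 0:
        raise ValueError("only defined for nonzero values")
    a = abs(v)
    # Rough estimate of floor(log10(a)) from bit lengths.
    bits = a.numerator.bit_length() - a.denominator.bit_length()
    e = int(bits * math.log10(2))
    # Correct the estimate with exact rational comparisons.
    while Fraction(10) ** e > a:
        e -= 1
    while Fraction(10) ** (e + 1) <= a:
        e += 1
    return e, v / Fraction(10) ** e

print(exponent_and_significand(Fraction(2500)))   # e = 3, s = 5/2
```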

Normalized and Denormalized Values

The exponent e of a finite nonzero Decimal128 value will always be in the range 6144 ≥ e ≥ -6176.

  • A finite nonzero Decimal128 value is normalized if its exponent e is in the range 6144 ≥ e ≥ -6143.
  • A finite nonzero Decimal128 value is denormalized if its exponent e is in the range -6144 ≥ e ≥ -6176.

This corresponds to the IEEE notion of normalized and denormalized values. Denormalized values have progressively limited resolution as they approach zero.

Truncated Exponent

Rounding behavior of denormalized Decimal128 values diverges from that of normalized Decimal128 values. For the purposes of describing rounding algorithms it's useful to define the notion of a truncated exponent te of a finite Decimal128 value.

  • The truncated exponent te of a normalized Decimal128 value is the same as its exponent e.
  • The truncated exponent te of a zero or denormalized Decimal128 value is -6143.

Given this, we can define the scaled significand ss of a finite Decimal128 value as:

  • The scaled significand ss of a zero Decimal128 value is 0.
  • The scaled significand ss of a finite nonzero Decimal128 value «v, q»𝔻 is ss = v × 10^(33-te), where te is the truncated exponent of the Decimal128 value.

ss will always be an integer with an absolute value less than 10^34.

The main importance of a scaled significand ss is in rounding behavior, for which IEEE 754 arithmetic algorithms make different choices depending on whether ss is odd or even. The notion of scaled significands should not be exposed to users.
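
The truncated exponent and scaled significand can be sketched in Python as follows. This is an illustration under assumptions: `Fraction` models real numbers, the strings "+0" and "-0" model the zero symbols, and all function names are hypothetical.

```python
import math
from fractions import Fraction

MIN_NORMAL_EXPONENT = -6143

def exponent(v: Fraction) -> int:
    """Exact floor(log10(|v|)) for a nonzero rational v."""
    a = abs(v)
    bits = a.numerator.bit_length() - a.denominator.bit_length()
    e = int(bits * math.log10(2))       # rough estimate only
    while Fraction(10) ** e > a:        # correct the estimate exactly
        e -= 1
    while Fraction(10) ** (e + 1) <= a:
        e += 1
    return e

def truncated_exponent(v) -> int:
    """te is e for normalized values and -6143 for zeros and denormals."""
    if isinstance(v, str):              # "+0" or "-0"
        return MIN_NORMAL_EXPONENT
    return max(exponent(v), MIN_NORMAL_EXPONENT)

def scaled_significand(v) -> int:
    """ss = v × 10^(33 - te), or 0 for a zero value."""
    if isinstance(v, str):
        return 0
    ss = v * Fraction(10) ** (33 - truncated_exponent(v))
    if ss.denominator != 1:
        raise ValueError("v is not representable as a Decimal128 value")
    return ss.numerator

smallest = Fraction(1, 10**6176)        # smallest positive denormal
print(truncated_exponent(smallest), scaled_significand(smallest))
```

For the smallest denormal the scaled significand is 1, which is odd, so a tie-breaking rule that inspects the parity of ss would treat it differently from its even neighbors.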

Edits

  • Added 𝔻 subscript to +0𝔻 and -0𝔻 to reduce confusion with the real number 0.
  • Added cohort function.

@waldemarhorwat
Author

waldemarhorwat commented Apr 16, 2024

Some tentative items from the discussion on Tuesday, 16 April 2024. Please add comments if I missed anything.

Overall

  • Decimal should be easy to use and understand by folks without extensive numerics training. It should behave as one would expect a numeric class to work given everyday knowledge of math.
  • Decimal should be usable by numerics experts without surprises to them.
  • Decimal should provide enough common operations that folks don't need to convert a Decimal to Number and back just because the Decimal operation they want is missing.

Specification Techniques

  • We should precisely specify the results of Decimal operations, just as we do for Number and BigInt operations, by performing operations on mathematical values and explicitly listing the rounding steps.
    • It's desirable for the ECMAScript spec to be self-contained. The IEEE specs are hard to obtain, hard to read, contain numerous extraneous functions, modes, encoding formats, and details not used by ECMAScript, and differ in details and nomenclature depending on which version of the IEEE spec one is looking at.
  • Arithmetic operations should have the well-known IEEE numerical behavior, so we need ±0 and NaNs.
  • Reviewed and discussed the proposal above.

Terminology

  • IEEE 754 has several different notions of what an exponent and a significand are, and they're easy to confuse.
  • For any user-visible APIs, exponent should have the everyday scientific notation meaning of floor(log10(|v|)), or perhaps what's called truncated exponent above.
  • IEEE 754 already has definitions of normalized and denormalized values, which are as in the proposed formalism above. We should not overload these terms to also denote the presence or absence of trailing zeroes in a particular encoding.

Quanta

  • IEEE 754 tracks the number of trailing zeroes for some (but not all) Decimal values. The number of trailing zeros has no effect on the mathematical results of the common Decimal operations, but can be used when formatting Decimal values back to strings.
    • The number of trailing zeroes is encoded in the quantum q in the formalism above.
    • The default toString should provide strings independent of q.
    • We should provide an alternate quantum-preserving conversion to strings that distinguishes based on q. Note that this will render a Decimal value equal to 2500 as one of the thirty-three possible cohort members 25e2, 250e1, 2500, 2500.0, 2500.00, …, 2500.000000000000000000000000000000 depending on q.
    • q will have no effect on the results of arithmetic and comparison operations.
      • User-visible comparison operations (=, ≠, <, ≤, >, ≥) should follow the usual IEEE semantics of comparing mathematical values, NaN ≠ NaN, +0 = -0, etc.
      • There's no strong use case for totalOrder. We don't even support it on Numbers yet.
        • For using Decimal values as keys in a map, use either the standard toString or the quantum-preserving conversions to strings depending on whether trailing zeroes are significant or not in your use case.
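
The thirty-three cohort members of 2500 can be enumerated with Python's `decimal` module, which tracks a coefficient and exponent in a similar way. This is only an illustration using an existing library, not the proposed ECMAScript API. The helper name `member` is made up for this sketch.

```python
from decimal import Decimal

def member(k: int) -> Decimal:
    """Representation of 2500 with coefficient 25 * 10**k and exponent 2 - k."""
    digits = tuple(int(c) for c in str(25 * 10**k))
    return Decimal((0, digits, 2 - k))      # (sign, digits, exponent)

# Coefficients run from 2 digits (25e2) up to 34 digits.
members = [member(k) for k in range(33)]

all_equal = all(m == 2500 for m in members)           # comparison ignores q
distinct = len({m.as_tuple() for m in members})       # representations differ

print(all_equal, distinct)
print(str(members[0]), str(members[2]), str(members[3]))
```

Python's `str` is quantum-preserving, so the first member prints in scientific notation while later ones print with trailing zeroes. A default conversion that is independent of q would print all of them the same way.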

Naming

  • Temporal has a notion of typed time values and uses "equals" to mean the same time in the same time zone, whereas "compare" means the same time.
  • Numbers in math have a strong precedent that "equals" means numerical equality. Having it also compare quanta (which most folks would not even be aware of) would be confusing.

Formats

  • We briefly discussed the possibility of typed arrays containing Decimal128 values, perhaps as a future extension. To do this we'd need to define the binary encoding of Decimal128 values, with the IEEE standard offering four mutually incompatible ones. Without typed arrays the choice of representation would be private to an implementation of ECMAScript and not visible via user APIs.

Conversions

  • Conversions to and from Decimal128 should be explicit.
  • We should provide the necessary explicit conversions between the various number formats. Use cases where different inputs have different numeric types will often come up. If we don't support those, users will hack around them and create chaos.
  • The natural way of converting from Decimal128 to Number is to treat the Decimal128 value as a real number and round it to the nearest Number in the usual way.
    • This is the way Number literals work in the language today.
    • Throwing an exception if the conversion is inexact would have significant usability downsides because there are common use cases where such a conversion is needed.
  • There are two possible ways to convert from a Number x to Decimal128:
    • Return the Decimal128 value closest to the mathematical value of x; this corresponds to the decimal string that x.toPrecision(34) would return. Most users who are not numerical experts would consider this behavior, which converts the Number 3.1 to the Decimal128 3.100000000000000088817841970012523, to be weird, so this should not be the default mode of the conversion. We may wish to provide it as a separate function for math experts.
    • Return the Decimal128 value that would be obtained had we converted x to a String s and then s to a Decimal128. This would convert the Number 3.1 to the Decimal128 3.1.
      • The conversion to s may have a bit of implementation latitude in some cases, but there is a way to specify a unique best conversion here.
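
The two Number → Decimal128 strategies can be compared using Python's `decimal` module with a context configured for 34 digits. This is a sketch on an existing library, not the proposed API. Python's `repr` of a float is assumed here to play the role of the Number → String conversion, which holds for 3.1 but is not claimed for every value.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# 34 significant digits and the decimal128 exponent range; no trapped signals.
ctx = Context(prec=34, rounding=ROUND_HALF_EVEN, Emin=-6143, Emax=6144, traps=[])

x = 3.1  # a binary double, like an ECMAScript Number

# Option 1: nearest Decimal128 to the exact mathematical value of x.
nearest = ctx.create_decimal_from_float(x)

# Option 2: go through the shortest round-tripping string.
via_string = ctx.create_decimal(repr(x))

print(nearest)      # 3.100000000000000088817841970012523
print(via_string)   # 3.1
```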

@waldemarhorwat
Author

waldemarhorwat commented Apr 17, 2024

Here's my summary of the discussion on Wednesday, 17 April 2024. Please add comments if I missed anything.

Overall

  • Should we order the goals listed in yesterday's overall section? That would be helpful if some of them conflict.
    • Conflicts are hypothetical at this stage. In practice the goals listed yesterday for Decimal128 shouldn't conflict. Maybe wait until we run into an actual conflict?

Rounding Modes and Implementability

  • The spec currently proposes nine rounding modes and names them "ceil", "floor", "expand", "trunc", "halfEven", "halfExpand", "halfCeil", "halfFloor", and "halfTrunc".
  • IEEE 754 defines only five rounding modes and names them "roundTiesToEven", "roundTiesToAway", "roundTowardPositive", "roundTowardNegative", "roundTowardZero".
  • These rounding modes can be applied to coercions as well as arithmetic operations, where the arithmetic is done exactly and then the exact result is rounded to a representable Decimal128 value according to the rounding mode.

This disparity leads to questions of both nomenclature and implementability:

  • Which user nomenclature should we adopt to name the rounding modes? To answer this we should take a look at how these are named in existing practice.

  • The spec defines Decimal rounding modes not present in IEEE 754. This has the potential to create trouble for implementations using decimal hardware or existing high-performance libraries, which would generally not be written in ECMAScript. Hardware or libraries support the modes present in IEEE 754 for arithmetic. What should an implementation do if someone tries to take the sine or square root of a Decimal or divide two Decimals using a non-IEEE-754 rounding mode? Hardware or libraries won't provide the unrounded answer, only the already-rounded answer according to a chosen IEEE 754 rounding mode, so one can't just stick a different rounding step on the end of the algorithm. This becomes hard to implement.

    • Preferred solution for initial Decimal128 support: Keep rounding mode parameters for conversions, but omit rounding mode parameters from arithmetic operations; instead, just use roundTiesToEven for those like we do for Numbers.
    • We don't have folks clamoring for arithmetic operations on Numbers using one of the other IEEE 754 rounding modes, so it's likely we won't have folks clamoring for those on Decimal arithmetic operations either.
    • Side note: some arithmetic operations never round and always produce exact results, so they shouldn't take a rounding mode. Examples include comparisons, negation, absolute value, and remainder (this is true for both IEEE remainder and traditional remainder). It's not hard to mathematically prove that these operations are always exact.
    • If we do encounter good use cases, it's easy to add a rounding mode parameter to arithmetic operations later in an upwards compatible manner.
  • Depending on what conversions and rounding modes we support, their definitions could use some care. An example is toExponential-style rounding of the Decimal value 9.5𝔻 to one significant digit using the roundTiesToEven mode. The nearest Decimals with one significant digit are 9𝔻 and 10𝔻, they're equally near, and neither has an even least significant digit (9 and 1 respectively).

    • Note that this weirdness arises due to a conversion requesting rounding to one significant digit. It doesn't arise for rounding in, for example, IEEE 754 arithmetic because that's always done to 34 significant digits.
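
To make the five IEEE 754 modes and the 9.5 case concrete, here is a sketch using Python's `decimal` module. The mapping from IEEE names to Python's rounding constants is my reading of the two sets of definitions. The result for 9.5 shows what one existing library does and is not a recommendation for the spec.

```python
from decimal import (Context, Decimal, ROUND_CEILING, ROUND_DOWN,
                     ROUND_FLOOR, ROUND_HALF_EVEN, ROUND_HALF_UP)

# Assumed correspondence between IEEE 754 names and Python constants.
IEEE_TO_PYTHON = {
    "roundTiesToEven": ROUND_HALF_EVEN,
    "roundTiesToAway": ROUND_HALF_UP,
    "roundTowardPositive": ROUND_CEILING,
    "roundTowardNegative": ROUND_FLOOR,
    "roundTowardZero": ROUND_DOWN,
}

def round_to_integer(text: str, ieee_mode: str) -> Decimal:
    """Round a decimal string to an integer under the given IEEE mode."""
    return Decimal(text).quantize(Decimal(1), rounding=IEEE_TO_PYTHON[ieee_mode])

# Behavior of each mode on the ties 2.5 and -2.5.
ties = {mode: (round_to_integer("2.5", mode), round_to_integer("-2.5", mode))
        for mode in IEEE_TO_PYTHON}
for mode, pair in ties.items():
    print(mode, pair)

# Rounding 9.5 to ONE significant digit with ties-to-even.
one_digit = Context(prec=1, rounding=ROUND_HALF_EVEN)
nine_and_a_half = one_digit.create_decimal("9.5")
print(nine_and_a_half)   # 1E+1
```

Python resolves the 9.5 tie by checking the parity of the last kept digit before rounding (9, which is odd) and therefore rounds up to 10. That is one possible way to define the case described above, and a spec definition would need to state its choice explicitly.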

User Operations on Mantissa and Exponent

Some users will want to compose a Decimal out of a mantissa and exponent and decompose a Decimal back into a mantissa and exponent. That's desirable in various applications and we should make sure that how to do this correctly is readily apparent. Some options:

  • Rely only on Decimal↔String conversions.
    • Composition: If someone wants to create a Decimal with a given mantissa and exponent, they can convert the mantissa and exponent into strings, string-concatenate mantissa+"e"+exponent, and convert the resulting string into a Decimal. This works for simple use cases but has some rough edges (what if mantissa is +∞?).
    • Decomposition: Splitting a Decimal into a mantissa and exponent is harder and provides more opportunities for bugs to creep in. Users would have to parse the string and possibly deal with it being or not being in exponential notation.
  • To solve the decomposition problem, provide ways to extract the mantissa and exponent from a Decimal.
    • The easiest approach to understand is to provide mantissa(d) and exponent(d) methods (or accessors) with the following behavior, which matches the everyday concept of scientific notation:
      • exponent(d) returns the exponent (or truncated exponent) defined here. The result is an integer and its type should probably be Number, not Decimal.
      • mantissa(d) returns a Decimal whose absolute value is between 1 inclusive and 10 exclusive, except in the case of zero (or denormals if we choose truncated exponent for the exponent method).
      • If someone wants the mantissa as an integer instead, they can just compute mantissa(d) × 10^33. This will always be an exact integer. They can convert it to a BigInt if they like.
  • To help with the composition problem, we could provide a scale10 operation as specified in IEEE 754: Given a Decimal d and an integer Number n, scale10(d, n) is the Decimal d × 10^n rounded in the usual way. This provides multiplication and division by powers of 10.
    • If we don't provide such a function or equivalent, users will generate powers of 10 by creating a string starting with "1e" followed by the exponent and convert the string to a Decimal. This almost works but can cause intermediate overflow problems: scale10(0.02𝔻, 6145) will correctly return 2e6143𝔻, whereas the string hack will produce an infinity instead.
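
The intermediate overflow can be reproduced with Python's `decimal` module, whose `scaleb` operation plays the role of scale10 here. The context settings below are an assumption chosen to match the decimal128 precision and exponent range. Signals are left untrapped so that overflow yields an infinity rather than an exception.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

ctx = Context(prec=34, rounding=ROUND_HALF_EVEN, Emin=-6143, Emax=6144, traps=[])

d = Decimal("0.02")

# Direct scaling: 0.02 × 10^6145 = 2 × 10^6143, which is representable.
scaled = ctx.scaleb(d, 6145)

# String hack: 10^6145 itself exceeds the largest finite value.
power_of_ten = ctx.create_decimal("1e6145")
hacked = ctx.multiply(d, power_of_ten)

print(scaled == Decimal("2E+6143"))
print(power_of_ten, hacked)
```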

Explicit Conversions

Decimal ↔ String

Decimal → String

  • The default conversion of a Decimal to a String should be as discussed yesterday.
    • Just like the Number → String conversion, the default Decimal → String conversion should switch to scientific notation for values with sufficiently high or low magnitude. Given Decimal's greater number of significant digits, the thresholds for this switch should be greater than they are for Number.
  • We can also provide a conversion of a Decimal to a String that distinguishes cohort members (reproduces trailing zeroes) as either a separate function or a separate mode. This one is tricky to use due to it unexpectedly generating scientific notation for Decimals with negative numbers of trailing zeroes.
  • To support users explicitly wanting non-scientific or scientific notation strings, we should provide Decimal equivalents of toPrecision, toExponential, etc. Rounding details are important here, so these should take optional rounding modes.

String → Decimal

  • This should be straightforward, treating the string as a mathematical real number and then rounding it to a Decimal using roundTiesToEven.
  • Should we provide a version that can take a different rounding mode?
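
As a sketch of what String → Decimal rounding looks like at 34 significant digits, the following uses Python's `decimal` module. The context parameters are assumptions matching decimal128. The second context shows how a different rounding mode would change the result for the same input.

```python
from decimal import Context, Decimal, ROUND_DOWN, ROUND_HALF_EVEN

even = Context(prec=34, rounding=ROUND_HALF_EVEN, Emin=-6143, Emax=6144, traps=[])
trunc = Context(prec=34, rounding=ROUND_DOWN, Emin=-6143, Emax=6144, traps=[])

# 35 significant digits, exactly halfway between two 34-digit neighbors.
tie_low = "1." + "0" * 33 + "5"       # neighbors end in ...0 and ...1
tie_high = "1." + "0" * 32 + "15"     # neighbors end in ...1 and ...2

low_even = even.create_decimal(tie_low)      # tie goes to the even digit 0
high_even = even.create_decimal(tie_high)    # tie goes to the even digit 2
high_trunc = trunc.create_decimal(tie_high)  # truncation keeps the digit 1

print(low_even)
print(high_even)
print(high_trunc)
```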

Decimal ↔ Number

We discussed these yesterday and settled on the following choices:

  • Decimal → Number treats the Decimal as a mathematical real number and then rounds it to a Number in the usual way using roundTiesToEven.
  • Number → Decimal should be done as if the conversion were Number → String → Decimal for the reasons outlined yesterday. We want the Number 3.1 to produce the Decimal 3.1, not the Decimal 3.100000000000000088817841970012523.
    • We do not want to throw on inexact conversions here. There will be lots of cases where users need to do heterogeneous arithmetic and need to convert a Number to a Decimal.

Decimal ↔ BigInt

  • BigInt → Decimal treats the BigInt as a mathematical integer and then rounds it to a Decimal using roundTiesToEven.
  • Decimal → BigInt should follow the precedent of Number → BigInt and produce the BigInt if the Decimal is an integer or throw if not.
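
The BigInt conversions could behave roughly like this Python sketch, where Python's `int` stands in for BigInt and `ValueError` stands in for whatever exception the spec chooses. Both function names are hypothetical.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

ctx = Context(prec=34, rounding=ROUND_HALF_EVEN, Emin=-6143, Emax=6144, traps=[])

def bigint_to_decimal(n: int) -> Decimal:
    """Treat n as a mathematical integer and round it to 34 digits."""
    return ctx.create_decimal(n)

def decimal_to_bigint(d: Decimal) -> int:
    """Return the integer value of d, or raise if d is not an integer."""
    if not d.is_finite() or d != d.to_integral_value():
        raise ValueError("Decimal value is not an integer")
    return int(d)

# 10^34 + 1 has 35 digits, so it rounds to 10^34.
print(bigint_to_decimal(10**34 + 1) == Decimal(10**34))
# Trailing zeroes do not stop a value from being an integer.
print(decimal_to_bigint(Decimal("2500.00")))
```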

Conclusion

  • Waldemar is offering to help with writing spec text and algorithms for the July meeting

jessealama added a commit that referenced this issue May 2, 2024
jessealama added a commit that referenced this issue May 2, 2024
* Define a Makefile for basic editorial work

* Add a number of suggested definitions

Relates to #122.
jessealama added a commit that referenced this issue May 15, 2024
* Fix how we refer to exceptions.

+ For `Decimal128.prototype.round`, use `mark` to mark an
incomplete algorithm.

+ Use `*this* value` rather than `**this** value`.

Part of #122
@jessealama jessealama reopened this May 17, 2024
@jessealama
Collaborator

(This issue got unintentionally closed; reopening it.)

jessealama added a commit that referenced this issue May 30, 2024
…ion, and remainder (#143)

* Add `toFixed`, `toPrecision`, and `toExponential`

* Define AO to massage a potential Decimal128 value to an actual one

* Say that Decimal128 values aren't ECMAScript language values

* Inline AO used only in the constructor

* Fix references to undefined variables and mark an in-progres spot

* Flesh out `round` method

* Fix specification for exponent in `divide`

* Add checks for rounding mode

* Flesh out constructor

* Add brand checks

* Add links

* Add reference for rounding

* Add more IEEE 754 references

* Use "Otherwise"

* Add note about how we convert from Number

relates to #122
2 participants