diff --git a/AGENTS.md b/AGENTS.md index debde13..a8b4fb2 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -10,8 +10,8 @@ dashu is a library set of arbitrary precision numbers implemented in pure Rust, |---|---|---| | `dashu-base` | `base/` | Common trait definitions and utilities | | `dashu-int` | `integer/` | Arbitrary precision integers (`UBig`, `IBig`) | -| `dashu-float` | `float/` | Arbitrary precision floats (`FBig`) | -| `dashu-ratio` | `rational/` | Arbitrary precision rationals (`RBig`) | +| `dashu-float` | `float/` | Arbitrary precision floats (`FBig`, `DBig`, `CachedFBig`) | +| `dashu-ratio` | `rational/` | Arbitrary precision rationals (`RBig`, `Relaxed`) | | `dashu-macros` | `macros/` | Procedural macros for literal big numbers | | `dashu-python` | `python/` | PyO3 Python bindings (not in default members) | | *(benchmark)* | `benchmark/` | Profiling scratchpad, not a comprehensive benchmark suite | @@ -46,6 +46,7 @@ Note: always `--exclude dashu-python` when running workspace-wide commands, sinc - Third-party trait implementations go in a `third_party/` module per crate, feature-gated - When borrowing an algorithm idea from GMP (or any other library), do **not** reference its function names in our docstrings or comments. Describe the algorithm in our own terms and use our own function names (e.g. write `add_mul_dword_same_len_in_place`, never `addmul_2` / `mpn_addmul_2`). External function names must not appear anywhere in the repo. - Tests for a specific algorithm/kernel belong in the same source file as the implementation, as a `#[cfg(test)] mod tests` block at the bottom — not in a separate integration test file under `tests/`. Reserve `tests/` for cross-cutting or public-API tests. +- When debugging or writing test assertions, use `{:?}` (or `{:#?}` for the verbose form with digit/bit counts) to inspect arbitrary precision values. The [`Debug`] format prints a compact head‥tail representation (most significant digits `..` least significant digits) instead of dumping the entire number, making it readable even for thousand-digit integers. ## Feature flags diff --git a/float/CHANGELOG.md b/float/CHANGELOG.md index 79c62ae..d0805bd 100644 --- a/float/CHANGELOG.md +++ b/float/CHANGELOG.md @@ -1,5 +1,77 @@ # Changelog +## Unreleased + +### Add +- `FpError` now carries `Overflow(Sign)` and `Underflow(Sign)` variants. Repr-level arithmetic + functions (`mul_finite_reprs`, `repr_div`, `sqr`, `cubic`, `exp_internal`, `powi`) detect + exponent overflow/underflow and return these errors. At the `FBig`/`CachedFBig` convenience + layer they are converted to signed infinity or signed zero, matching IEEE 754 overflow/underflow + semantics. The `Context` layer returns the raw error, giving callers the choice. +- `exp` and `exp_m1` now accept infinite input, returning the IEEE-correct result (`exp(+inf) = +inf`, + `exp(-inf) = +0`, `exp_m1(+inf) = +inf`, `exp_m1(-inf) = -1`). +- `ConstCache` now also caches the base-free `√10005` isqrt used by π (`ConstCache::pi`), so a + repeat π call at the same or lower precision reuses it instead of recomputing. The isqrt is held + as a base-free integer (`floor(√10005 · 2^bits)`, computed via Karatsuba `UBig::sqrt`) and folded + into the π integer ratio, so no cross-base conversion is needed. `ConstCache::total_words()` + counts it; `total_terms()` is unchanged (the isqrt isn't a series). Measured warm-π speedup: + ~20× at 500 digits, ~1.3× at 5000. +- IEEE-754 signed zero (`-0`): operations now produce the sign of zero mandated by the standard + (e.g. `1 / -inf = -0`, `sqrt(-0) = -0`, `ceil(-0) = -0`, cancellation under round-toward-negative). + `+0` and `-0` compare equal; `-0.0` round-trips through `f32`/`f64`. +- `FpError` (`InfiniteInput`, `OutOfDomain`, `Indeterminate`) and `FpResult = Result, FpError>`. + Infinite *outputs* are returned as values inside `Ok` (`1/0 → +inf`, `ln(0) → -inf`, `exp(huge) → +inf`, + `tan(π/2) → ±inf`); infinite *inputs* are `Err(InfiniteInput)` (making infinities terminal, which + structurally avoids the NaN-producing indeterminate forms); domain errors (`0/0`, `sqrt(-x)`, `ln(-x)`, + `asin(|x|>1)`, `pow(neg, non-int)`) are `Err`. The `FBig`/`CachedFBig` convenience layers panic on error. + +### Change +- **Breaking (encoding):** infinities are re-encoded with sentinel exponents `isize::MAX`/`isize::MIN` + (was `1`/`-1`), and `-0` is encoded at exponent `-1`. `normalize()` preserves these special values + instead of clobbering them; `Repr`'s `PartialEq`/`Eq` are now manual so `+0 == -0`. +- **Breaking (result model):** `Context` arithmetic/transcendental/trig methods now return + `FpResult>` (= `Result>, FpError>`) instead of `Rounded>` + (arithmetic) / `FpResult` (the old trig enum). The old `FpResult` enum is **removed** (replaced by + the type alias). `FBig::tan`/`asin`/`acos`/`atan2` now return `Self` (panic on error) instead of the + enum, matching the other trig methods. +- `atan2(±finite, +inf)` now returns the signed zero of `y` (now that signed zero is supported). +- `powf(±0, y)` returns the *positive* result (`+0` for `y > 0`, `+inf` for `y < 0`) — matching the + common float-pow convention (a float exponent doesn't track parity). Use `powi` for the sign-correct + result (`pow(-0, odd) = -0`). +- Removed the unused `panic_overflow`/`panic_underflow`/`panic_infinite`/`panic_power_negative_base`/ + `panic_root_negative` helpers (their conditions are now `FpError`s). + +### Fix +- `exp(huge)` / `exp_m1(huge)` now return `+inf` (or `0` / `-1` for huge negative arguments) instead of + panicking when the scaled exponent overflows `isize`; `powi` likewise returns `±inf`/`0` on + astronomically large results. +- `exp` / `exp_m1` at high precision (≳ a few thousand digits) returned values wrong in the low bits. + The series working precision was sized `p + O(log p)`, but the final `Bⁿ` powering amplifies the + series' relative error by `Bⁿ`, so it must carry `≈ n ≈ √p` extra digits (cf. MPFR's + `q = precy + 2·K + …`, `K ≈ √precy`). The working precision is now `p + 2n` (`n = 2^⌊log₂ p / 2⌋`) + and the final powering runs at that same precision instead of `2p`. Verified correct against an + independent pure-Taylor reference up to 8192 bits / 4000 digits; also faster (roughly half the + multiply cost at large `p`). +- The `FBig` `+`/`-` operators now produce `-0` on exact cancellation under round-toward-negative + (`Down`), matching `Context::add`/`sub` (previously the equal-exponent fast path yielded `+0`). +- `ShrAssign` (`>>=`) for `FBig` previously subtracted the shift amount twice; it now shifts exactly once. + +### Add +- Add the `ConstCache` type and the `CachedFBig` wrapper. `ConstCache` caches exact binary-splitting tree state for mathematical constants (π, ln2, ln10, ln(B)) so that repeated calls at increasing precision *extend* prior work instead of recomputing from scratch. `CachedFBig` is an `FBig` carrying a shared `Rc>` handle: its transcendental operations (`ln`, `exp`, `sin`/`cos`/…, `pi`, base conversion) thread that handle through the `Context` methods, reusing/extending the cached state. `Context` and `FBig` stay `Copy` + `Send` + `Sync` + `no_std` (so `static_fbig!`/`static_dbig!` keep working); only `CachedFBig` is `!Send + !Sync` (sharing state via `Rc>`). Because `Context` accepts `Option<&mut ConstCache>`, users can build `Arc>`-based variants too. +- Add the `ConstCache` type and the `CachedFBig` wrapper. `ConstCache` caches exact binary-splitting tree state for mathematical constants (π, ln2, ln10, ln(B)) so that repeated calls at increasing precision *extend* prior work instead of recomputing from scratch. `CachedFBig` is an `FBig` carrying a shared `Rc>` handle: its transcendental operations (`ln`, `exp`, `sin`/`cos`/…, `pi`, base conversion) thread that handle through the `Context` methods, reusing/extending the cached state. `Context` and `FBig` stay `Copy` + `Send` + `Sync` + `no_std` (so `static_fbig!`/`static_dbig!` keep working); only `CachedFBig` is `!Send + !Sync` (sharing state via `Rc>`). Because `Context` accepts `Option<&mut ConstCache>`, users can build `Arc>`-based variants too. +- Mixed operators for `CachedFBig`: it now supports binary operations with `FBig` and with all primitive integer types (`u8`–`usize`, `i8`–`isize`, `UBig`, `IBig`), in both directions. The cache handle from the `CachedFBig` operand is preserved. `From for CachedFBig` and `From for FBig` are implemented for ergonomic conversion. +- `CachedFBig::cache()` provides read-only access to the shared `ConstCache`, with `ConstCache::total_terms()` and `total_words()` for cache size inspection, and `CachedFBig::clear_cache()` / `ConstCache::clear()` to free cached memory. + +### Change +- **Breaking (low-level `Context` API):** the `Context` constant-source methods (`ln`, `ln_1p`, `exp`, `exp_m1`, `powf`, `pi`, `sin`, `cos`, `sin_cos`, `tan`, `asin`, `acos`, `atan`, `atan2`, and the internal `ln2`/`ln10`/`ln_base`/`convert_base`) now take an additional `cache: Option<&mut ConstCache>` parameter, threading an optional shared cache. The high-level `FBig` API is unchanged (it passes `None`). +- Removed the `MathCache` type (subsumed by `ConstCache`, which is now public with `&mut self` methods). +- `Context::iacoth` (used internally by `ln`) now evaluates the series with binary splitting instead of an iterative loop, reusing the shared `iacoth_bs` helper. This keeps `Q` at O(p) digits and improves high-precision performance; behavior is unchanged (pinned by existing fixtures). +- `iacoth_bs` now skips its first several leaves via a compile-time constant basecase (the `L(6)`/`L(9)`/`L(99)` initial blocks). The precomputed `(P, Q, T)` values are kept within `u32` so the constants are portable across `Word = u16`/`u32`/`u64` (the `DoubleWord` constructor is `const` on every configuration). + +### Fix + +- Replace `f64::ceil()` in `ConstCache`'s precision/bit helpers with a `no_std`-safe integer ceiling (`ceil_usize`). `f64::ceil` is `std`-only on the crate's MSRV and broke the workspace `--all-features --tests` build, where `dashu-float` is compiled without `std` as a dependency of `dashu-ratio`. + ## 0.4.5 ### Add @@ -9,13 +81,11 @@ - Add π constant computation (`FBig::pi()` and `Context::pi()`) using the Chudnovsky algorithm with binary splitting ([#60](https://github.com/cmpute/dashu/pull/60)). - Add `FpResult` enum to handle non-finite math operation results (NaN, Infinite, Overflow, Underflow) without panicking ([#60](https://github.com/cmpute/dashu/pull/60)). - Add `panic_nan`, `panic_overflow`, `panic_underflow`, and `panic_infinite` helpers to the `error` module. -- Optional `rand_v09` (rand 0.9, MSRV 1.63) and `rand_v010` (rand 0.10, MSRV 1.85) features mirroring `rand_v08`. The default `rand` feature still maps to `rand_v08`. -- The random-float distributions (`Uniform01`, `UniformFBig`) and their sampling now live once in the version-agnostic `dashu_float::rand` module. The per-version modules are now private trait bindings. ### Fix -- Fix rounding issues in `to_f32()` and `to_f64()` ([#53](https://github.com/cmpute/dashu/issues/53), [#56](https://github.com/cmpute/dashu/issues/56)). -- Fix several rounding bugs in `FBig`/`Context` addition and subtraction: severe-cancellation collapse, spurious-ULP errors from negligible operands, the window-edge boundary, and `Context::sub` with a zero left operand under directed rounding modes. -- Fix `FBig::fract()` inflating context precision and `split_at_point_internal` using an incorrect fractional scale for values smaller than one. +- Fix rounding issues in `to_32()` and `to_f64()` (fixes [#53](https://github.com/cmpute/dashu/issues/53) and [#56](https://github.com/cmpute/dashu/issues/56)). +- Fix `FBig::fract()` inflating context precision for values smaller than one. +- Fix `split_at_point_internal` using incorrect fractional scale for numbers smaller than one, causing incorrect rounding results. ## 0.4.4 diff --git a/float/src/add.rs b/float/src/add.rs index aee7bfd..de3f1a0 100644 --- a/float/src/add.rs +++ b/float/src/add.rs @@ -1,5 +1,5 @@ use crate::{ - error::assert_finite_operands, + error::{assert_finite_operands, FpError, FpResult}, fbig::FBig, helper_macros, repr::{Context, Repr, Word}, @@ -14,6 +14,16 @@ use core::{ use dashu_base::Sign::{self, *}; use dashu_int::{IBig, UBig}; +/// Build a `Repr` from a cancellation result, producing `-0` (instead of `+0`) when the +/// significand is zero and the rounding mode is roundTowardNegative (IEEE 754 §6.3). +fn cancel_zero(sig: IBig, exp: isize) -> Repr { + if sig.is_zero() && R::IS_ROUND_TOWARD_NEGATIVE { + Repr::neg_zero() + } else { + Repr::new(sig, exp) + } +} + impl Add for FBig { type Output = Self; @@ -114,7 +124,7 @@ fn add_val_val( lhs.repr } else { match lhs.repr.exponent.cmp(&rhs.repr.exponent) { - Ordering::Equal => context.repr_round(Repr::new( + Ordering::Equal => context.repr_round(cancel_zero::( lhs.repr.significand + rhs.repr.significand, lhs.repr.exponent, )), @@ -147,7 +157,7 @@ fn add_val_ref( Positive => lhs.repr.significand + &rhs.repr.significand, Negative => lhs.repr.significand - &rhs.repr.significand, }; - context.repr_round(Repr::new(sum_signif, lhs.repr.exponent)) + context.repr_round(cancel_zero::(sum_signif, lhs.repr.exponent)) } Ordering::Greater => context.repr_add_large_small(lhs.repr, &rhs.repr, rhs_sign), Ordering::Less => context.repr_add_small_large(lhs.repr, &rhs.repr, rhs_sign), @@ -172,7 +182,7 @@ fn add_ref_val( lhs.repr.clone() } else { match lhs.repr.exponent.cmp(&rhs.repr.exponent) { - Ordering::Equal => context.repr_round(Repr::new( + Ordering::Equal => context.repr_round(cancel_zero::( &lhs.repr.significand + rhs.repr.significand, lhs.repr.exponent, )), @@ -200,7 +210,7 @@ fn add_ref_ref( lhs.repr.clone() } else { match lhs.repr.exponent.cmp(&rhs.repr.exponent) { - Ordering::Equal => context.repr_round(Repr::new( + Ordering::Equal => context.repr_round(cancel_zero::( &lhs.repr.significand + rhs_sign * rhs.repr.significand.clone(), lhs.repr.exponent, )), @@ -224,9 +234,20 @@ impl Context { mut low: (IBig, usize), is_sub: bool, ) -> Rounded> { + // A zero produced by exact cancellation is -0 only under roundTowardNegative (Down), + // +0 otherwise (IEEE 754 §6.3). + let neg_cancel = is_sub && R::IS_ROUND_TOWARD_NEGATIVE; + let make_repr = |sig: IBig, exp: isize| -> Repr { + if sig.is_zero() && neg_cancel { + Repr::neg_zero() + } else { + Repr::new(sig, exp) + } + }; + if !self.is_limited() { // short cut for unlimited precision - return Rounded::Exact(Repr::new(significand, exponent)); + return Rounded::Exact(make_repr(significand, exponent)); } // use one extra digit to prevent cancellation in rounding @@ -278,12 +299,12 @@ impl Context { // perform rounding if low.0.is_zero() { - Rounded::Exact(Repr::new(significand, exponent)) + Rounded::Exact(make_repr(significand, exponent)) } else { // By now significand should have at least full precision. After adjustment, the digits length // could be one more than the precision. We don't shrink the extra digit. let adjust = R::round_fract::(&significand, low.0, low.1); - Rounded::Inexact(Repr::new(significand + adjust, exponent), adjust) + Rounded::Inexact(make_repr(significand + adjust, exponent), adjust) } } @@ -485,11 +506,13 @@ impl Context { /// let context = Context::::new(2); /// let a = DBig::from_str("1.234")?; /// let b = DBig::from_str("6.789")?; - /// assert_eq!(context.add(&a.repr(), &b.repr()), Inexact(DBig::from_str("8.0")?, NoOp)); + /// assert_eq!(context.add(&a.repr(), &b.repr()), Ok(Inexact(DBig::from_str("8.0")?, NoOp))); /// # Ok::<(), ParseError>(()) /// ``` - pub fn add(&self, lhs: &Repr, rhs: &Repr) -> Rounded> { - assert_finite_operands(lhs, rhs); + pub fn add(&self, lhs: &Repr, rhs: &Repr) -> FpResult> { + if lhs.is_infinite() || rhs.is_infinite() { + return Err(FpError::InfiniteInput); + } let sum = if lhs.is_zero() { self.repr_round_ref(rhs) @@ -498,13 +521,14 @@ impl Context { } else { match lhs.exponent.cmp(&rhs.exponent) { Ordering::Equal => { - self.repr_round(Repr::new(&lhs.significand + &rhs.significand, lhs.exponent)) + let sig = &lhs.significand + &rhs.significand; + self.repr_round(cancel_zero::(sig, lhs.exponent)) } Ordering::Greater => self.repr_add_large_small(lhs.clone(), rhs, Positive), Ordering::Less => self.repr_add_small_large(lhs.clone(), rhs, Positive), } }; - sum.map(|v| FBig::new(v, *self)) + Ok(sum.map(|v| FBig::new(v, *self))) } /// Subtract two floating point numbers under this context. @@ -523,12 +547,14 @@ impl Context { /// let b = DBig::from_str("6.789")?; /// assert_eq!( /// context.sub(&a.repr(), &b.repr()), - /// Inexact(DBig::from_str("-5.6")?, SubOne) + /// Ok(Inexact(DBig::from_str("-5.6")?, SubOne)) /// ); /// # Ok::<(), ParseError>(()) /// ``` - pub fn sub(&self, lhs: &Repr, rhs: &Repr) -> Rounded> { - assert_finite_operands(lhs, rhs); + pub fn sub(&self, lhs: &Repr, rhs: &Repr) -> FpResult> { + if lhs.is_infinite() || rhs.is_infinite() { + return Err(FpError::InfiniteInput); + } let sum = if lhs.is_zero() { // Round `-rhs` directly rather than negating *after* rounding. For the asymmetric @@ -541,13 +567,14 @@ impl Context { } else { match lhs.exponent.cmp(&rhs.exponent) { Ordering::Equal => { - self.repr_round(Repr::new(&lhs.significand - &rhs.significand, lhs.exponent)) + let sig = &lhs.significand - &rhs.significand; + self.repr_round(cancel_zero::(sig, lhs.exponent)) } Ordering::Greater => self.repr_add_large_small(lhs.clone(), rhs, Negative), Ordering::Less => self.repr_add_small_large(lhs.clone(), rhs, Negative), } }; - sum.map(|v| FBig::new(v, *self)) + Ok(sum.map(|v| FBig::new(v, *self))) } } @@ -571,6 +598,7 @@ mod tests { // 1.00 - 0.99999999 = 1e-8 (exactly representable at precision 3) assert_eq!( ctx.sub(&r::<10>(100, -2), &r::<10>(99999999, -8)) + .unwrap() .value() .repr(), &r::<10>(1, -8) @@ -578,6 +606,7 @@ mod tests { // 1.00 - 0.99950001 = 4.9999e-4, rounds to 5.00e-4 (HalfAway) assert_eq!( ctx.sub(&r::<10>(100, -2), &r::<10>(99950001, -8)) + .unwrap() .value() .repr(), &r::<10>(500, -6) @@ -590,6 +619,7 @@ mod tests { // 2^20 - (2^20 - 1) = 1, with the operands 20 exponent positions apart assert_eq!( ctx.sub(&r::<2>(1, 20), &r::<2>((1i128 << 20) - 1, 0)) + .unwrap() .value() .repr(), &r::<2>(1, 0) @@ -597,6 +627,7 @@ mod tests { // same magnitude gap but the smaller-exponent operand is on the left assert_eq!( ctx.sub(&r::<2>((1i128 << 20) - 1, 0), &r::<2>(1, 20)) + .unwrap() .value() .repr(), &r::<2>(-1, 0) @@ -604,6 +635,7 @@ mod tests { // 2^30 - (2^30 - 1) = 1 assert_eq!( ctx.sub(&r::<2>(1, 30), &r::<2>((1i128 << 30) - 1, 0)) + .unwrap() .value() .repr(), &r::<2>(1, 0) @@ -618,6 +650,7 @@ mod tests { // 2^20 + (-(2^20 - 1)) = 1 assert_eq!( ctx.add(&r::<2>(1, 20), &r::<2>(-((1i128 << 20) - 1), 0)) + .unwrap() .value() .repr(), &r::<2>(1, 0) @@ -639,7 +672,13 @@ mod tests { fn sub_mild_unchanged() { let ctx = Context::::new(3); // 101 - 0.2 = 100.8, kept as 1008 * 10^-1 (one guard digit, as before) - assert_eq!(ctx.sub(&r::<10>(101, 0), &r::<10>(2, -1)).value().repr(), &r::<10>(1008, -1)); + assert_eq!( + ctx.sub(&r::<10>(101, 0), &r::<10>(2, -1)) + .unwrap() + .value() + .repr(), + &r::<10>(1008, -1) + ); } // Regression for the branch-1 signum-proxy bug (SUM-BUG.md §2c): when the larger @@ -653,18 +692,37 @@ mod tests { fn add_negligible_short_operand_no_spurious_ulp() { // base 2 + HalfAway: the exact tie case let ctx = Context::::new(10); - assert_eq!(ctx.add(&r::<2>(1, 0), &r::<2>(1, -100)).value().repr(), &r::<2>(1, 0)); - assert_eq!(ctx.sub(&r::<2>(1, 0), &r::<2>(1, -100)).value().repr(), &r::<2>(1, 0)); + assert_eq!( + ctx.add(&r::<2>(1, 0), &r::<2>(1, -100)) + .unwrap() + .value() + .repr(), + &r::<2>(1, 0) + ); + assert_eq!( + ctx.sub(&r::<2>(1, 0), &r::<2>(1, -100)) + .unwrap() + .value() + .repr(), + &r::<2>(1, 0) + ); // larger short operand (digits < precision), negligible addend let ctx = Context::::new(50); assert_eq!( ctx.add(&r::<2>(0x12345, 0), &r::<2>(1, -200)) + .unwrap() .value() .repr(), &r::<2>(0x12345, 0) ); // base 10 was never affected (1 < ½·10), but check it stays correct let ctx = Context::::new(10); - assert_eq!(ctx.add(&r::<10>(1, 0), &r::<10>(1, -100)).value().repr(), &r::<10>(1, 0)); + assert_eq!( + ctx.add(&r::<10>(1, 0), &r::<10>(1, -100)) + .unwrap() + .value() + .repr(), + &r::<10>(1, 0) + ); } } diff --git a/float/src/cmp.rs b/float/src/cmp.rs index 92dbaad..09f5700 100644 --- a/float/src/cmp.rs +++ b/float/src/cmp.rs @@ -70,8 +70,8 @@ fn repr_cmp_same_base( } }; - // case 3: compare with 0 - match (lhs.is_zero(), rhs.is_zero()) { + // case 3: compare with 0 (both +0 and -0 are zero) + match (lhs.significand.is_zero(), rhs.significand.is_zero()) { (true, true) => return Ordering::Equal, (true, false) => { // rhs must be positive, otherwise case 2 will return diff --git a/float/src/convert.rs b/float/src/convert.rs index 03066d8..e1d5a08 100644 --- a/float/src/convert.rs +++ b/float/src/convert.rs @@ -10,8 +10,9 @@ use dashu_base::{ use dashu_int::{IBig, UBig, Word}; use crate::{ - error::{assert_finite, panic_unlimited_precision}, + error::{assert_finite, panic_unlimited_precision, FpError}, fbig::FBig, + math::cache::{reborrow_cache, ConstCache}, repr::{Context, Repr}, round::{ mode::{HalfAway, HalfEven, Zero}, @@ -54,7 +55,11 @@ macro_rules! impl_from_float_for_fbig { fn try_from(f: $t) -> Result { match f.decode() { - Ok((man, exp)) => Ok(Repr::new(man.into(), exp as _)), + Ok((man, exp)) => Ok(if man == 0 && f.is_sign_negative() { + Self::neg_zero() + } else { + Repr::new(man.into(), exp as _) + }), Err(FpCategory::Infinite) => match f.sign() { Sign::Positive => Ok(Self::infinity()), Sign::Negative => Ok(Self::neg_infinity()), @@ -70,7 +75,12 @@ macro_rules! impl_from_float_for_fbig { fn try_from(f: $t) -> Result { match f.decode() { Ok((man, exp)) => { - let repr = Repr::new(man.into(), exp as _); + // preserve the sign of a signed zero (-0.0 -> Repr::neg_zero()) + let repr = if man == 0 && f.is_sign_negative() { + Repr::neg_zero() + } else { + Repr::new(man.into(), exp as _) + }; // The precision is inferenced from the mantissa, because the mantissa of // normal float is always normalized. This will produce correct precision @@ -354,7 +364,7 @@ impl FBig { ) -> Rounded> { let context = Context::::new(precision); context - .convert_base(self.repr) + .convert_base(self.repr, None) .map(|repr| FBig::new(repr, context)) } @@ -429,12 +439,12 @@ impl FBig { let context = Context::::new(24); context - .convert_base::(self.repr.clone()) + .convert_base::(self.repr.clone(), None) .and_then(|v| context.repr_round_ref(&v)) .and_then(|v| v.into_f32_internal()) } - /// Convert the float number to [f64] with [HalfEven] rounding mode regardless of the mode associated with this number. + /// Convert the float number to [f64] with the rounding mode associated with the type. /// /// Note that the conversion is inexact even if the number is infinite. /// @@ -456,7 +466,7 @@ impl FBig { let context = Context::::new(53); context - .convert_base::(self.repr.clone()) + .convert_base::(self.repr.clone(), None) .and_then(|v| context.repr_round_ref(&v)) .and_then(|v| v.into_f64_internal()) } @@ -465,7 +475,11 @@ impl FBig { impl Context { // Convert the [Repr] from base B to base NewB, with the precision under the target base from this context. #[allow(non_upper_case_globals)] - fn convert_base(&self, repr: Repr) -> Rounded> { + fn convert_base( + &self, + repr: Repr, + mut cache: Option<&mut ConstCache>, + ) -> Rounded> { // shortcut if NewB is the same as B if NewB == B { return Exact(Repr { @@ -547,19 +561,34 @@ impl Context { let signif = repr.significand * Repr::::BASE.pow(repr.exponent as usize); Exact(Repr::new(signif, 0)) } else { - let num = Repr::new(repr.significand, 0); - let den = Repr::new(Repr::::BASE.pow(-repr.exponent as usize).into(), 0); - self.repr_div(num, den) + let num: Repr = Repr::new(repr.significand, 0); + let den: Repr = + Repr::new(Repr::::BASE.pow(-repr.exponent as usize).into(), 0); + match self.repr_div(num, den) { + Ok(v) => v.map(|r: Repr| Repr { + significand: r.significand, + exponent: r.exponent, + }), + Err(FpError::Overflow(sign)) => { + Inexact(Repr::::infinity_with_sign(sign), Rounding::NoOp) + } + Err(FpError::Underflow(sign)) => { + Inexact(Repr::::zero_with_sign(sign), Rounding::NoOp) + } + Err(_) => unreachable!(), + } } } else { // if the exponent is large, then we first estimate the result exponent as floor(exponent * log(B) / log(NewB)), // then the fractional part is multiplied with the original significand let work_context = Context::::new(2 * self.precision); // double the precision to get the precise logarithm let new_exp = repr.exponent - * work_context - .ln(&Repr::new(Repr::::BASE.into(), 0)) - .value(); - let (exponent, rem) = new_exp.div_rem_euclid(work_context.ln_base::()); + * work_context.unwrap_fp(work_context.ln( + &Repr::new(Repr::::BASE.into(), 0), + reborrow_cache(&mut cache), + )); + let (exponent, rem) = + new_exp.div_rem_euclid(work_context.ln_base::(reborrow_cache(&mut cache))); let exponent: isize = exponent.try_into().unwrap(); let exp_rem = rem.exp(); let significand = repr.significand * exp_rem.repr.significand; @@ -577,6 +606,10 @@ impl Repr { debug_assert!(self.significand.bit_len() <= 24); let sign = self.sign(); + if self.is_neg_zero() { + // encode() would drop the sign of -0; preserve it exactly + return Exact(sign * 0f32); + } let man24: i32 = self.significand.try_into().unwrap(); if self.exponent >= 128 { // max f32 = 2^128 * (1 - 2^-24) @@ -617,7 +650,7 @@ impl Repr { let context = Context::::new(24); context - .convert_base::(self.clone()) + .convert_base::(self.clone(), None) .and_then(|v| context.repr_round_ref(&v)) .and_then(|v| v.into_f32_internal()) } @@ -629,6 +662,10 @@ impl Repr { debug_assert!(self.significand.bit_len() <= 53); let sign = self.sign(); + if self.is_neg_zero() { + // encode() would drop the sign of -0; preserve it exactly + return Exact(sign * 0f64); + } let man53: i64 = self.significand.try_into().unwrap(); if self.exponent >= 1024 { // max f64 = 2^1024 × (1 − 2^−53) @@ -669,7 +706,7 @@ impl Repr { let context = Context::::new(53); context - .convert_base::(self.clone()) + .convert_base::(self.clone(), None) .and_then(|v| context.repr_round_ref(&v)) .and_then(|v| v.into_f64_internal()) } diff --git a/float/src/div.rs b/float/src/div.rs index d3dcc72..e4c0aed 100644 --- a/float/src/div.rs +++ b/float/src/div.rs @@ -1,16 +1,75 @@ use crate::{ - error::{assert_finite_operands, assert_limited_precision}, + error::{assert_finite_operands, assert_limited_precision, FpError, FpResult}, fbig::FBig, helper_macros::{self, impl_binop_assign_by_taking}, repr::{Context, Repr, Word}, - round::{Round, Rounded}, + round::{Round, Rounded, Rounding}, utils::{digit_len, shl_digits_in_place, split_digits}, }; use core::ops::{Div, DivAssign, Rem, RemAssign}; -use dashu_base::{Approximation, DivEuclid, DivRem, DivRemEuclid, Inverse, RemEuclid}; +use dashu_base::{Approximation, DivEuclid, DivRem, DivRemEuclid, Inverse, RemEuclid, Sign}; use dashu_int::{fast_div::ConstDivisor, modular::IntoRing, IBig, UBig}; -macro_rules! impl_div_or_rem_for_fbig { +/// Attach the dividend/divisor XOR sign to a zero quotient: the raw quotient significand is +/// `+0`, so the sign of a zero result (`0/finite`, or a finite/finite that rounds to zero) is +/// `sign(lhs) XOR sign(rhs)`. +fn div_repr(sign_negative: bool, significand: IBig, exponent: isize) -> Repr { + if significand.is_zero() { + if sign_negative { + Repr::neg_zero() + } else { + Repr::zero() + } + } else { + Repr::new(significand, exponent) + } +} + +macro_rules! impl_div_for_fbig { + (impl $op:ident, $method:ident, $repr_method:ident) => { + impl $op> for FBig { + type Output = FBig; + fn $method(self, rhs: FBig) -> Self::Output { + let context = Context::max(self.context, rhs.context); + let rounded = context.unwrap_fp_repr(context.$repr_method(self.repr, rhs.repr)); + FBig::new(rounded, context) + } + } + + impl<'l, R: Round, const B: Word> $op> for &'l FBig { + type Output = FBig; + fn $method(self, rhs: FBig) -> Self::Output { + let context = Context::max(self.context, rhs.context); + let rounded = + context.unwrap_fp_repr(context.$repr_method(self.repr.clone(), rhs.repr)); + FBig::new(rounded, context) + } + } + + impl<'r, R: Round, const B: Word> $op<&'r FBig> for FBig { + type Output = FBig; + fn $method(self, rhs: &FBig) -> Self::Output { + let context = Context::max(self.context, rhs.context); + let rounded = + context.unwrap_fp_repr(context.$repr_method(self.repr, rhs.repr.clone())); + FBig::new(rounded, context) + } + } + + impl<'l, 'r, R: Round, const B: Word> $op<&'r FBig> for &'l FBig { + type Output = FBig; + fn $method(self, rhs: &FBig) -> Self::Output { + let context = Context::max(self.context, rhs.context); + let rounded = context.unwrap_fp_repr( + context.$repr_method(self.repr.clone(), rhs.repr.clone()), + ); + FBig::new(rounded, context) + } + } + }; +} + +macro_rules! impl_rem_for_fbig { (impl $op:ident, $method:ident, $repr_method:ident) => { impl $op> for FBig { type Output = FBig; @@ -50,8 +109,8 @@ macro_rules! impl_div_or_rem_for_fbig { } }; } -impl_div_or_rem_for_fbig!(impl Div, div, repr_div); -impl_div_or_rem_for_fbig!(impl Rem, rem, repr_rem); +impl_div_for_fbig!(impl Div, div, repr_div); +impl_rem_for_fbig!(impl Rem, rem, repr_rem); impl_binop_assign_by_taking!(impl DivAssign, div_assign, div); impl_binop_assign_by_taking!(impl RemAssign, rem_assign, rem); @@ -188,7 +247,7 @@ impl Inverse for FBig { #[inline] fn inv(self) -> Self::Output { - self.context.inv(&self.repr).value() + self.context.unwrap_fp(self.context.inv(&self.repr)) } } @@ -197,7 +256,7 @@ impl Inverse for &FBig { #[inline] fn inv(self) -> Self::Output { - self.context.inv(&self.repr).value() + self.context.unwrap_fp(self.context.inv(&self.repr)) } } @@ -214,17 +273,46 @@ fn align_as_int(lhs: FBig, rhs: FBig) -> (I } impl Context { - pub(crate) fn repr_div(&self, lhs: Repr, rhs: Repr) -> Rounded> { + pub(crate) fn repr_div( + &self, + lhs: Repr, + rhs: Repr, + ) -> FpResult> { assert_finite_operands(&lhs, &rhs); assert_limited_precision(self.precision); + let sign_negative = lhs.sign() != rhs.sign(); + let sign = if sign_negative { + Sign::Negative + } else { + Sign::Positive + }; + + if rhs.significand.is_zero() { + if lhs.significand.is_zero() { + // 0/0 is indeterminate; callers that can signal it (Context::div) check first, + // otherwise fall through to div_rem which panics on division by zero. + } else { + // finite / 0 = ±inf (sign = XOR), returned as a value + return Ok(Approximation::Exact(Repr::infinity_with_sign(sign))); + } + } + // this method don't deal with the case where lhs significand is too large debug_assert!(lhs.digits() <= self.precision + rhs.digits()); let (mut q, mut r) = lhs.significand.div_rem(&rhs.significand); - let mut e = lhs.exponent - rhs.exponent; + let mut e = lhs.exponent.checked_sub(rhs.exponent).ok_or({ + if lhs.exponent >= 0 { + FpError::Overflow(sign) + } else { + FpError::Underflow(sign) + } + })?; if r.is_zero() { - return Approximation::Exact(Repr::new(q, e)); + return Ok(Approximation::Exact( + div_repr(sign_negative, q, e).check_finite_exponent()?, + )); } let ddigits = digit_len::(&rhs.significand); @@ -233,7 +321,7 @@ impl Context { let rdigits = digit_len::(&r); // rdigits <= ddigits let shift = ddigits + self.precision - rdigits; shl_digits_in_place::(&mut r, shift); - e -= shift as isize; + e = e.checked_sub(shift as isize).ok_or(FpError::Underflow(sign))?; let (q0, r0) = r.div_rem(&rhs.significand); q = q0; r = r0; @@ -244,7 +332,7 @@ impl Context { let shift = ddigits + self.precision - ndigits; shl_digits_in_place::(&mut q, shift); shl_digits_in_place::(&mut r, shift); - e -= shift as isize; + e = e.checked_sub(shift as isize).ok_or(FpError::Underflow(sign))?; let (q0, r0) = r.div_rem(&rhs.significand); q += q0; @@ -252,17 +340,19 @@ impl Context { } } - if r.is_zero() { - Approximation::Exact(Repr::new(q, e)) + let repr = if r.is_zero() { + Approximation::Exact(div_repr(sign_negative, q, e)) } else { let adjust = R::round_ratio(&q, r, &rhs.significand); - Approximation::Inexact(Repr::new(q + adjust, e), adjust) - } + Approximation::Inexact(div_repr(sign_negative, q + adjust, e), adjust) + }; + Ok(repr) } pub(crate) fn repr_rem(&self, lhs: Repr, rhs: Repr) -> Rounded> { assert_finite_operands(&lhs, &rhs); + let lhs_is_neg_zero = lhs.is_neg_zero(); let (lhs_sign, lhs_signif) = lhs.significand.into_parts(); let (_, rhs_signif) = rhs.significand.into_parts(); @@ -320,9 +410,25 @@ impl Context { let exponent = lhs.exponent.min(rhs.exponent); if significand.is_zero() { - Approximation::Exact(Repr::zero()) + // the sign of a zero remainder follows the dividend (±0) + Approximation::Exact(if lhs_is_neg_zero { + Repr::neg_zero() + } else { + Repr::zero() + }) } else { - self.repr_round(Repr::new(significand, exponent)) + match Repr::new(significand, exponent).check_finite_exponent() { + Ok(repr) => self.repr_round(repr), + Err(e) => match e { + FpError::Overflow(sign) => { + Approximation::Inexact(Repr::infinity_with_sign(sign), Rounding::NoOp) + } + FpError::Underflow(sign) => { + Approximation::Inexact(Repr::zero_with_sign(sign), Rounding::NoOp) + } + _ => unreachable!(), + }, + } } } @@ -340,7 +446,7 @@ impl Context { /// let context = Context::::new(2); /// let a = DBig::from_str("-1.234")?; /// let b = DBig::from_str("6.789")?; - /// assert_eq!(context.div(&a.repr(), &b.repr()), Inexact(DBig::from_str("-0.18")?, NoOp)); + /// assert_eq!(context.div(&a.repr(), &b.repr()), Ok(Inexact(DBig::from_str("-0.18")?, NoOp))); /// # Ok::<(), ParseError>(()) /// ``` /// @@ -349,8 +455,13 @@ impl Context { /// To do euclidean division on the float numbers (get an integer quotient and remainder, equivalent to C99's /// `fmod` and `remquo`), please use the methods provided by traits [DivEuclid], [RemEuclid] and [DivRemEuclid]. /// - pub fn div(&self, lhs: &Repr, rhs: &Repr) -> Rounded> { - assert_finite_operands(lhs, rhs); + pub fn div(&self, lhs: &Repr, rhs: &Repr) -> FpResult> { + if lhs.is_infinite() || rhs.is_infinite() { + return Err(FpError::InfiniteInput); + } + if lhs.significand.is_zero() && rhs.significand.is_zero() { + return Err(FpError::Indeterminate); // 0/0 + } let lhs_repr = if !lhs.is_zero() && lhs.digits_ub() > rhs.digits_lb() + self.precision { // shrink lhs if it's larger than necessary @@ -360,8 +471,9 @@ impl Context { } else { lhs.clone() }; - self.repr_div(lhs_repr, rhs.clone()) - .map(|v| FBig::new(v, *self)) + Ok(self + .repr_div(lhs_repr, rhs.clone())? + .map(|v| FBig::new(v, *self))) } /// Calculate the remainder of `⌈lhs / rhs⌋`. @@ -381,13 +493,16 @@ impl Context { /// let context = Context::::new(3); /// let a = DBig::from_str("6.789")?; /// let b = DBig::from_str("-1.234")?; - /// assert_eq!(context.rem(&a.repr(), &b.repr()), Exact(DBig::from_str("-0.615")?)); + /// assert_eq!(context.rem(&a.repr(), &b.repr()), Ok(Exact(DBig::from_str("-0.615")?))); /// # Ok::<(), ParseError>(()) /// ``` - pub fn rem(&self, lhs: &Repr, rhs: &Repr) -> Rounded> { - assert_finite_operands(lhs, rhs); - self.repr_rem(lhs.clone(), rhs.clone()) - .map(|v| FBig::new(v, *self)) + pub fn rem(&self, lhs: &Repr, rhs: &Repr) -> FpResult> { + if lhs.is_infinite() || rhs.is_infinite() { + return Err(FpError::InfiniteInput); + } + Ok(self + .repr_rem(lhs.clone(), rhs.clone()) + .map(|v| FBig::new(v, *self))) } /// Compute the multiplicative inverse of an `FBig` @@ -401,12 +516,17 @@ impl Context { /// /// let context = Context::::new(2); /// let a = DBig::from_str("-1.234")?; - /// assert_eq!(context.inv(&a.repr()), Inexact(DBig::from_str("-0.81")?, NoOp)); + /// assert_eq!(context.inv(&a.repr()), Ok(Inexact(DBig::from_str("-0.81")?, NoOp))); /// # Ok::<(), ParseError>(()) /// ``` #[inline] - pub fn inv(&self, f: &Repr) -> Rounded> { - self.repr_div(Repr::one(), f.clone()) - .map(|v| FBig::new(v, *self)) + pub fn inv(&self, f: &Repr) -> FpResult> { + if f.is_infinite() { + return Err(FpError::InfiniteInput); + } + // inv(±0) = ±inf (produced as a value by repr_div) + Ok(self + .repr_div(Repr::one(), f.clone())? + .map(|v| FBig::new(v, *self))) } } diff --git a/float/src/error.rs b/float/src/error.rs index 55c7890..5d938c0 100644 --- a/float/src/error.rs +++ b/float/src/error.rs @@ -1,4 +1,78 @@ -use crate::repr::{Repr, Word}; +use dashu_base::Sign; +use dashu_int::Word; + +use crate::fbig::FBig; +use crate::repr::{Context, Repr}; +use crate::round::{Round, Rounded}; +use core::fmt::{self, Display, Formatter}; + +/// Error returned by floating-point operations that cannot produce a usable result. +/// +/// # Errors vs. special values +/// +/// Infinite *outputs* (e.g. `1/0 → +inf`, `ln(0) → -inf`) are **not** errors — they are +/// legitimate [`Exact`] values produced by operations whose mathematical result is genuinely +/// infinite. Overflow and underflow are distinct: the mathematical result is finite, but its +/// magnitude exceeds the representable exponent range. These are reported as +/// [`Overflow`](FpError::Overflow) / [`Underflow`](FpError::Underflow), and converted to +/// signed infinity / signed zero at the convenience layer via `Context::unwrap_fp` (or the +/// `Repr`-level counterpart `Context::unwrap_fp_repr`). Because the true result was finite, +/// the converted value is always [`Inexact`] with `Rounding::NoOp`. +/// +/// The remaining variants ([`InfiniteInput`](FpError::InfiniteInput), +/// [`OutOfDomain`](FpError::OutOfDomain), [`Indeterminate`](FpError::Indeterminate)) signal +/// that an operation could not proceed, and always panic at the convenience layer. +#[derive(Clone, Copy, Debug, PartialEq, Eq)] +pub enum FpError { + /// An operand was infinite. Infinities are terminal values: they can be produced and + /// compared, but not fed back into arithmetic. + InfiniteInput, + + /// The mathematical result is not a real number (domain error), e.g. `sqrt(-x)` for `x > 0`, + /// `ln(-x)`, `asin(|x| > 1)`, `pow(negative, non-integer)`, an even root of a negative value. + OutOfDomain, + + /// An indeterminate form, e.g. `0 / 0`. + Indeterminate, + + /// The result magnitude is too large to represent as a finite number. + /// + /// At the `FBig` convenience layer this is converted to a signed infinity via + /// `Context::unwrap_fp` (or to a signed [`Repr`] via `Context::unwrap_fp_repr`). + /// The converted result is always [`Inexact`]: the true result was a very large + /// finite number, and infinity is an approximation. + Overflow(Sign), + + /// The result magnitude is too small to represent as a finite non-zero number. + /// + /// At the `FBig` convenience layer this is converted to a signed zero via + /// `Context::unwrap_fp` (or to a signed [`Repr`] via `Context::unwrap_fp_repr`). + /// The converted result is always [`Inexact`]: the true result was a very small + /// non-zero number, and zero is an approximation. + Underflow(Sign), +} + +impl Display for FpError { + fn fmt(&self, f: &mut Formatter) -> fmt::Result { + match self { + FpError::InfiniteInput => { + f.write_str("arithmetic with an infinite input is not allowed") + } + FpError::OutOfDomain => f.write_str("the operation result is out of domain"), + FpError::Indeterminate => f.write_str("the operation result is an indeterminate form"), + FpError::Overflow(_) => f.write_str("overflow: the result is too large to represent"), + FpError::Underflow(_) => f.write_str("underflow: the result is too small to represent"), + } + } +} + +#[cfg(feature = "std")] +impl std::error::Error for FpError {} + +/// The result of a floating point operation: a correctly-rounded value (which may be an +/// infinity produced as a value), or an [`FpError`] when the operation cannot proceed. +pub type FpResult = Result, FpError>; + #[inline] pub const fn assert_finite(repr: &Repr) { @@ -31,16 +105,6 @@ pub const fn panic_unlimited_precision() -> ! { panic!("precision cannot be 0 (unlimited) for this operation!") } -/// Panics when the base of the power operation is negative -pub const fn panic_power_negative_base() -> ! { - panic!("powering on negative bases could result in complex number!") -} - -/// Panics when taking an even order root of an negative number -pub fn panic_root_negative() -> ! { - panic!("the root is a complex number!") -} - /// Panics when taking the zeroth root of a number pub fn panic_root_zeroth() -> ! { panic!("finding 0th root is not allowed!") @@ -51,17 +115,53 @@ pub fn panic_nan() -> ! { panic!("the result of the operation is NaN!") } -/// Panics when the result of an operation overflows -pub fn panic_overflow() -> ! { - panic!("the result of the operation overflowed!") +/// Panics when an operation is out of domain (e.g. sqrt of a negative number) +pub fn panic_out_of_domain() -> ! { + panic!("the operation result is out of domain!") } -/// Panics when the result of an operation underflows -pub fn panic_underflow() -> ! { - panic!("the result of the operation underflowed!") -} +impl Context { + /// Unwrap an [`FpResult`], returning the value directly. + /// + /// Converts [`Overflow`](FpError::Overflow) to a signed infinity and + /// [`Underflow`](FpError::Underflow) to a signed zero. All other error + /// variants panic (infinite input, out-of-domain, indeterminate). + #[inline] + pub(crate) fn unwrap_fp( + &self, + result: FpResult>, + ) -> FBig { + match result { + Ok(value) => value.value(), + Err(FpError::Overflow(sign)) => { + FBig::new(Repr::infinity_with_sign(sign), *self) + } + Err(FpError::Underflow(sign)) => { + FBig::new(Repr::zero_with_sign(sign), *self) + } + Err(FpError::InfiniteInput) => panic_operate_with_inf(), + Err(FpError::OutOfDomain) => panic_out_of_domain(), + Err(FpError::Indeterminate) => panic_nan(), + } + } + + /// Unwrap an [`FpResult`] at the [`Repr`] level, returning the [`Repr`] directly. + /// + /// Converts [`Overflow`](FpError::Overflow) / [`Underflow`](FpError::Underflow) to + /// signed infinity / signed zero; panics on all other error variants. + #[inline] + pub(crate) fn unwrap_fp_repr( + &self, + result: FpResult>, + ) -> Repr { + match result { + Ok(value) => value.value(), + Err(FpError::Overflow(sign)) => Repr::infinity_with_sign(sign), + Err(FpError::Underflow(sign)) => Repr::zero_with_sign(sign), + Err(FpError::InfiniteInput) => panic_operate_with_inf(), + Err(FpError::OutOfDomain) => panic_out_of_domain(), + Err(FpError::Indeterminate) => panic_nan(), + } + } -/// Panics when the result of an operation is an exact infinity -pub fn panic_infinite() -> ! { - panic!("the result of the operation is an exact infinity!") } diff --git a/float/src/exp.rs b/float/src/exp.rs index 489659b..70c3ecd 100644 --- a/float/src/exp.rs +++ b/float/src/exp.rs @@ -1,10 +1,12 @@ use core::convert::TryInto; use crate::{ - error::{assert_finite, assert_limited_precision, panic_power_negative_base}, + error::{assert_finite, assert_limited_precision, FpError, FpResult}, fbig::FBig, + math::cache::{reborrow_cache, ConstCache}, repr::{Context, Repr, Word}, - round::{Round, Rounded}, + round::Round, + utils::ceil_usize, }; use dashu_base::{AbsOrd, Approximation::*, BitTest, DivRemEuclid, EstimatedLog2, Sign}; use dashu_int::IBig; @@ -23,7 +25,7 @@ impl FBig { /// ``` #[inline] pub fn powi(&self, exp: IBig) -> FBig { - self.context.powi(&self.repr, exp).value() + self.context.unwrap_fp(self.context.powi(&self.repr, exp)) } /// Raise the floating point number to an floating point power. @@ -41,7 +43,7 @@ impl FBig { #[inline] pub fn powf(&self, exp: &Self) -> Self { let context = Context::max(self.context, exp.context); - context.powf(&self.repr, &exp.repr).value() + context.unwrap_fp(context.powf(&self.repr, &exp.repr, None)) } /// Calculate the exponential function (`eˣ`) on the floating point number. @@ -57,7 +59,7 @@ impl FBig { /// ``` #[inline] pub fn exp(&self) -> FBig { - self.context.exp(&self.repr).value() + self.context.unwrap_fp(self.context.exp(&self.repr, None)) } /// Calculate the exponential minus one function (`eˣ-1`) on the floating point number. @@ -73,7 +75,7 @@ impl FBig { /// ``` #[inline] pub fn exp_m1(&self) -> FBig { - self.context.exp_m1(&self.repr).value() + self.context.unwrap_fp(self.context.exp_m1(&self.repr, None)) } } @@ -92,7 +94,7 @@ impl Context { /// /// let context = Context::::new(2); /// let a = DBig::from_str_native("-1.234")?; - /// assert_eq!(context.powi(&a.repr(), 10.into()), Inexact(DBig::from_str_native("8.2")?, AddOne)); + /// assert_eq!(context.powi(&a.repr(), 10.into()), Ok(Inexact(DBig::from_str_native("8.2")?, AddOne))); /// # Ok::<(), ParseError>(()) /// ``` /// @@ -100,8 +102,10 @@ impl Context { /// /// Panics if the precision is unlimited and the exponent is negative. In this case, the exact /// result is likely to have infinite digits. - pub fn powi(&self, base: &Repr, exp: IBig) -> Rounded> { - assert_finite(base); + pub fn powi(&self, base: &Repr, exp: IBig) -> FpResult> { + if base.is_infinite() { + return Err(FpError::InfiniteInput); + } let (exp_sign, exp) = exp.into_parts(); if exp_sign == Sign::Negative { @@ -111,16 +115,48 @@ impl Context { let guard_bits = self.precision.bit_len() * 2; // heuristic let rev_context = Context::::new(self.precision + guard_bits); - let pow = rev_context.powi(base, exp.into()).value(); - let inv = rev_context.repr_div(Repr::one(), pow.repr); - let repr = inv.and_then(|v| self.repr_round(v)); - return repr.map(|v| FBig::new(v, *self)); + let pow = rev_context.unwrap_fp(rev_context.powi(base, exp.into())); + let inv = rev_context.unwrap_fp_repr(rev_context.repr_div(Repr::one(), pow.repr)); + let repr = self.repr_round(inv); + return Ok(repr.map(|v| FBig::new(v, *self))); } if exp.is_zero() { - return Exact(FBig::ONE); + return Ok(Exact(FBig::ONE)); } else if exp.is_one() { let repr = self.repr_round_ref(base); - return repr.map(|v| FBig::new(v, *self)); + return Ok(repr.map(|v| FBig::new(v, *self))); + } + + // Guard against exponent overflow for astronomically large results: the result + // magnitude has log2 ≈ exp·log2(base); if that exceeds the isize exponent range, + // return ±inf (|base| > 1) or 0 (|base| < 1) instead of overflowing mid-computation. + let base_log2 = base.log2_est() as f64; + let threshold = (isize::MAX as f64) * (B.log2_est() as f64); + let exp_f64 = i64::try_from(&exp).ok().map(|e| e as f64); + let overflows = match exp_f64 { + Some(e) => e * base_log2 > threshold, + None => base_log2 != 0.0, // exp doesn't fit i64: overflows unless |base| == 1 + }; + if overflows { + return if base_log2 > 0.0 { + Err(FpError::Overflow( + if base.sign() == Sign::Negative { + Sign::Negative + } else { + Sign::Positive + }, + )) + } else { + // |base| < 1 and exponent huge → underflow to signed zero + let underflow_sign = if base.sign() == Sign::Negative + && exp.bit(0) + { + Sign::Negative + } else { + Sign::Positive + }; + Err(FpError::Underflow(underflow_sign)) + }; } let work_context = if self.is_limited() { @@ -133,19 +169,19 @@ impl Context { // binary exponentiation from left to right let mut p = exp.bit_len() - 2; - let mut res = work_context.sqr(base); + let mut res = work_context.unwrap_fp(work_context.sqr(base)); loop { if exp.bit(p) { - res = res.and_then(|v| work_context.mul(v.repr(), base)); + res = work_context.unwrap_fp(work_context.mul(res.repr(), base)); } if p == 0 { break; } p -= 1; - res = res.and_then(|v| work_context.sqr(v.repr())); + res = work_context.unwrap_fp(work_context.sqr(res.repr())); } - res.and_then(|v| v.with_precision(self.precision)) + Ok(res.with_precision(self.precision)) } /// Raise the floating point number to an floating point power under this context. @@ -163,40 +199,58 @@ impl Context { /// let context = Context::::new(2); /// let x = DBig::from_str_native("1.23")?; /// let y = DBig::from_str_native("-4.56")?; - /// assert_eq!(context.powf(&x.repr(), &y.repr()), Inexact(DBig::from_str_native("0.39")?, AddOne)); + /// assert_eq!(context.powf(&x.repr(), &y.repr(), None), Ok(Inexact(DBig::from_str_native("0.39")?, AddOne))); /// # Ok::<(), ParseError>(()) /// ``` /// /// # Panics /// /// Panics if the precision is unlimited. - pub fn powf(&self, base: &Repr, exp: &Repr) -> Rounded> { - assert_finite(base); + pub fn powf( + &self, + base: &Repr, + exp: &Repr, + mut cache: Option<&mut ConstCache>, + ) -> FpResult> { + if base.is_infinite() || exp.is_infinite() { + return Err(FpError::InfiniteInput); + } assert_limited_precision(self.precision); // TODO: we can allow it if exp is integer // shortcuts if exp.is_zero() { - return Exact(FBig::ONE); + return Ok(Exact(FBig::ONE)); } else if exp.is_one() { let repr = self.repr_round_ref(base); - return repr.map(|v| FBig::new(v, *self)); - } else if base.is_zero() { - return Exact(FBig::ZERO); + return Ok(repr.map(|v| FBig::new(v, *self))); + } else if base.significand.is_zero() { + // With a *float* exponent the result on a zero base is the positive one — this + // matches the common float-pow convention (e.g. CPython: `(-0.0) ** y == 0.0`), + // which doesn't track the parity of the exponent: + // pow(±0, y > 0) = +0, pow(±0, y < 0) = +inf. + // For the sign-correct result (e.g. `pow(-0, odd) = -0`), use the integer-exponent + // [`powi`](Context::powi). Short-circuiting here also avoids the negative-base path. + return Ok(Exact(if exp.sign() == Sign::Negative { + FBig::new(Repr::infinity(), *self) + } else { + FBig::ZERO + })); } if base.sign() == Sign::Negative { // TODO: we should allow negative base when exp is an integer - panic_power_negative_base() + return Err(FpError::OutOfDomain); } // x^y = exp(y*ln(x)), use a simple rule for guard bits - let guard_digits = 10 + self.precision.log2_est() as usize; + let guard_digits = 10 + ceil_usize(self.precision.log2_est()); let work_context = Context::::new(self.precision + guard_digits); - let res = work_context - .ln(base) - .and_then(|v| work_context.mul(&v.repr, exp)) - .and_then(|v| work_context.exp(&v.repr)); - res.and_then(|v| v.with_precision(self.precision)) + // ln and exp each consult/extend the shared cache; reborrows are sequential. + let ln_val = work_context.unwrap_fp(work_context.ln(base, reborrow_cache(&mut cache))); + let mul_val = work_context.unwrap_fp(work_context.mul(ln_val.repr(), exp)); + let exp_val = + work_context.unwrap_fp(work_context.exp(mul_val.repr(), reborrow_cache(&mut cache))); + Ok(exp_val.with_precision(self.precision)) } /// Calculate the exponential function (`eˣ`) on the floating point number under this context. @@ -211,12 +265,25 @@ impl Context { /// /// let context = Context::::new(2); /// let a = DBig::from_str_native("-1.234")?; - /// assert_eq!(context.exp(&a.repr()), Inexact(DBig::from_str_native("0.29")?, NoOp)); + /// assert_eq!(context.exp(&a.repr(), None), Ok(Inexact(DBig::from_str_native("0.29")?, NoOp))); /// # Ok::<(), ParseError>(()) /// ``` #[inline] - pub fn exp(&self, x: &Repr) -> Rounded> { - self.exp_internal(x, false) + pub fn exp( + &self, + x: &Repr, + cache: Option<&mut ConstCache>, + ) -> FpResult> { + if x.is_infinite() { + return Ok(Exact(FBig::new( + match x.sign() { + Sign::Positive => Repr::infinity(), + Sign::Negative => Repr::zero(), + }, + *self, + ))); + } + self.exp_internal(x, false, cache) } /// Calculate the exponential minus one function (`eˣ-1`) on the floating point number under this context. @@ -231,26 +298,43 @@ impl Context { /// /// let context = Context::::new(2); /// let a = DBig::from_str_native("-0.1234")?; - /// assert_eq!(context.exp_m1(&a.repr()), Inexact(DBig::from_str_native("-0.12")?, SubOne)); + /// assert_eq!(context.exp_m1(&a.repr(), None), Ok(Inexact(DBig::from_str_native("-0.12")?, SubOne))); /// # Ok::<(), ParseError>(()) /// ``` #[inline] - pub fn exp_m1(&self, x: &Repr) -> Rounded> { - self.exp_internal(x, true) + pub fn exp_m1( + &self, + x: &Repr, + cache: Option<&mut ConstCache>, + ) -> FpResult> { + if x.is_infinite() { + return match x.sign() { + Sign::Positive => Ok(Exact(FBig::new(Repr::infinity(), *self))), + Sign::Negative => Ok(Exact(-FBig::ONE)), // exp_m1(−∞) = −1 + }; + } + self.exp_internal(x, true, cache) } // TODO: change reduction to (x - s log2) / 2ⁿ, so that the final powering is always base 2, and doesn't depends on powi. // the powering exp(r)^(2ⁿ) could be optimized by noticing (1+x)^2 - 1 = x^2 + 2x // consider this change after having a benchmark - fn exp_internal(&self, x: &Repr, minus_one: bool) -> Rounded> { + fn exp_internal( + &self, + x: &Repr, + minus_one: bool, + mut cache: Option<&mut ConstCache>, + ) -> FpResult> { assert_finite(x); assert_limited_precision(self.precision); + let input_sign = x.sign(); - if x.is_zero() { + if x.significand.is_zero() { + // exp(±0) = 1, exp_m1(±0) = +0 return match minus_one { - false => Exact(FBig::ONE), - true => Exact(FBig::ZERO), + false => Ok(Exact(FBig::ONE)), + true => Ok(Exact(FBig::ZERO)), }; } @@ -262,9 +346,17 @@ impl Context { // Maclaurin series: exp(r) = 1 + Σ(rⁱ/i!) // There will be about p/log_B(r) summations when calculating the series, to prevent - // loss of significant, we needs about log_B(p) guard digits. - let series_guard_digits = (self.precision.log2_est() / B.log2_est()) as usize + 2; - let pow_guard_digits = (self.precision.bit_len() as f32 * B.log2_est() * 2.) as usize; // heuristic + // loss of significance, we need about log_B(p) guard digits. + let series_guard_digits = ceil_usize(self.precision.log2_est() / B.log2_est()) + 2; + + // Reduction power: the series value is later raised to Bⁿ, which amplifies its + // relative error by a factor of Bⁿ. So the series (and the squarings) must carry + // about n extra base-B digits for the result to come out correct to p digits. We + // use 2n for safety — this mirrors MPFR's working precision q = precy + 2·K + … + // (K ≈ √precy is MPFR's squaring count, the analogue of our n). The log_B(p) + // summation/squaring rounding terms are already covered by series_guard_digits. + let n = 1usize << (self.precision.bit_len() / 2); + let pow_guard_digits = 2 * n; let work_precision; // When minus_one is true and |x| < 1/B, the input is fed into the Maclaurin series without scaling @@ -284,12 +376,23 @@ impl Context { work_precision = self.precision + series_guard_digits + pow_guard_digits; let context = Context::::new(work_precision); let x = FBig::new(context.repr_round_ref(x).value(), context); - let logb = context.ln_base::(); + let logb = context.ln_base::(reborrow_cache(&mut cache)); let (s, r) = x.div_rem_euclid(logb); - // here m is roughly equal to sqrt(self.precision) - let n = 1usize << (self.precision.bit_len() / 2); - let s: isize = s.try_into().expect("exponent is too large"); + let s: isize = match s.try_into() { + Ok(v) => v, + Err(_) => { + // |floor(x / ln B)| overflows isize — x is astronomically large, so the + // result is an infinity (x → +∞) or underflows to the limit (x → −∞). + return if input_sign == Sign::Positive { + Err(FpError::Overflow(Sign::Positive)) + } else if minus_one { + Ok(Exact(-FBig::ONE)) // exp_m1(−∞) = −1 (finite) + } else { + Err(FpError::Underflow(Sign::Positive)) // exp(−∞) = +0 + }; + } + }; (s, n, r) }; @@ -316,16 +419,19 @@ impl Context { } if no_scaling { - sum.with_precision(self.precision) + Ok(sum.with_precision(self.precision)) } else if minus_one { - // add extra digits to compensate for the subtraction - Context::::new(self.precision + self.precision / 8 + 1) // heuristic - .powi(sum.repr(), Repr::::BASE.pow(n).into()) - .map(|v| (v << s) - FBig::ONE) - .and_then(|v| v.with_precision(self.precision)) + // Power at the series' working precision (it already carries the 2n guard + // digits that the Bⁿ powering amplifies away). The final "−1" can cancel at + // most ~1 leading digit here (the |x| < 1/B case is handled by no_scaling), + // which the same guard digits comfortably absorb. + let pow_ctx = Context::::new(work_precision); + let v = pow_ctx.unwrap_fp(pow_ctx.powi(sum.repr(), Repr::::BASE.pow(n).into())); + Ok(((v << s) - FBig::ONE).with_precision(self.precision)) } else { - self.powi(sum.repr(), Repr::::BASE.pow(n).into()) - .map(|v| v << s) + let pow_ctx = Context::::new(work_precision); + let v = pow_ctx.unwrap_fp(pow_ctx.powi(sum.repr(), Repr::::BASE.pow(n).into())); + Ok((v << s).with_precision(self.precision)) } } } diff --git a/float/src/fbig.rs b/float/src/fbig.rs index c8e1e57..0e2233f 100644 --- a/float/src/fbig.rs +++ b/float/src/fbig.rs @@ -85,17 +85,21 @@ use dashu_int::{DoubleWord, IBig}; /// /// # IEEE 754 behavior compliance /// -/// The representation of the floating point number doesn't follows the IEEE 754 standard, as it's not +/// The representation of the floating point number doesn't follow the IEEE 754 standard, as it's not /// designed for arbitrary precision numbers. The key differences include: -/// * [FBig] doesn't support NaN values. In places where IEEE 754 operations generate NaNs, `FBig` will panic. +/// * [FBig] doesn't support NaN values. In places where IEEE 754 operations would generate NaNs +/// (e.g. `0/0`, `sqrt(-1)`), the [`Context`] layer returns an [`FpError`](crate::FpError) and the +/// [FBig] convenience methods panic. /// * [FBig] doesn't have subnormal values. -/// * [FBig] doesn't have negative zeros¹. There is only on zero value ([FBig::ZERO]). -/// * Division by zero and logarithm on zero panic instead of returning infinities. -/// * [FBig] operations will panic if the result overflows or underflows¹. -/// * [FBig] does support infinities, but currently infinities are not allowed to be operated with, except for -/// equality test and comparison¹. -/// -/// ¹ These behaviors are subject to changes in the future. +/// * [FBig] supports IEEE-754 signed zero (`-0`): operations that produce a zero result carry +/// the sign mandated by IEEE 754 (e.g. `1 / -inf = -0`, `sqrt(-0) = -0`, `ceil(-0) = -0`). +/// `+0` and `-0` compare equal. +/// * Infinities are **terminal values**: they can be produced (e.g. `1 / 0 → +inf`, `ln(0) → -inf`, +/// `exp(huge) → +inf`, `tan(π/2) → +inf`), compared, and printed, but feeding an infinity into a +/// further operation is an error at the [`Context`] layer ([`FpError::InfiniteInput`]) and panics +/// at the [FBig] layer. This structurally avoids the IEEE indeterminate forms (`inf − inf`, `inf/inf`, +/// `0·inf`). The only exceptions are `atan(±inf) = ±π/2` and the `atan2` signed-∞ quadrants, which +/// have well-defined finite results. /// /// # Convert from/to `f32`/`f64` /// diff --git a/float/src/fbig_cached.rs b/float/src/fbig_cached.rs new file mode 100644 index 0000000..9c13100 --- /dev/null +++ b/float/src/fbig_cached.rs @@ -0,0 +1,446 @@ +//! A cached floating-point number — [`FBig`] with a shared constant cache attached. + +use alloc::rc::Rc; +use core::cell::RefCell; + +use dashu_base::Sign; + +use crate::error::panic_unlimited_precision; +use crate::fbig::FBig; +use crate::math::cache::ConstCache; +use crate::repr::{Context, Repr, Word}; +use crate::round::{mode, Round, Rounded}; +use crate::utils::digit_len; + +/// A floating-point number that carries a shared handle to a [`ConstCache`]. +/// +/// It is functionally an [`FBig`]: same in-memory representation (`fbig`), +/// plus an [`Rc>`] handle. The difference is that the +/// transcendental operations (`ln`, `exp`, `sin`, `cos`, …, `pi`, base conversion) +/// thread that handle into the underlying [`Context`] methods, so they reuse and +/// progressively extend the cached exact binary-splitting state instead of +/// recomputing constants from scratch on every call. +/// +/// `Context`/`FBig` themselves stay `Copy` + `Send` + `Sync` + `no_std` (so +/// [`static_fbig!`](dashu_macros::static_fbig!) keeps working); only this cached +/// wrapper is `!Send + !Sync`, because it shares state through an `Rc>`. +/// To share one cache across threads, build an analogous type over +/// `Arc>` instead (the [`Context`] methods accept +/// `Option<&mut ConstCache>`, independent of the container). +/// +/// Every value-producing operation returns a `CachedFBig` that preserves the +/// handle, so `(a + b).ln().exp()` stays cached throughout — no silent cache loss. +/// When two `CachedFBig` values with different cache handles interact in a binary +/// operation, the LHS (left-hand-side) cache is preserved in the result. For +/// `FBig op CachedFBig`, the `CachedFBig` operand's cache is preserved. +/// +/// # Examples +/// +/// ``` +/// use core::cell::RefCell; +/// use core::str::FromStr; +/// use dashu_float::{CachedFBig, ConstCache, Context}; +/// use dashu_float::round::mode::HalfAway; +/// use std::rc::Rc; +/// +/// let cache = Rc::new(RefCell::new(ConstCache::new())); +/// // build a cached decimal number 1.234 +/// let x = CachedFBig::::with_cache( +/// dashu_float::Repr::new(1234.into(), -3), +/// Context::new(50), +/// ); +/// +/// // ln / exp reuse the same shared cache handle +/// let _ = x.clone().ln().exp(); +/// ``` +pub struct CachedFBig { + pub(crate) fbig: FBig, + pub(crate) cache: Rc>, +} + +impl CachedFBig { + /// Wrap an [`FBig`], sharing the given cache handle. + #[inline] + pub fn new(value: FBig, cache: Rc>) -> Self { + Self { fbig: value, cache } + } + + /// Build from raw parts, sharing the given cache handle. + #[inline] + pub fn from_repr(repr: Repr, context: Context, cache: Rc>) -> Self { + Self { + fbig: FBig::new(repr, context), + cache, + } + } + + /// Build from raw parts with a fresh, exclusive cache. + #[inline] + pub fn with_cache(repr: Repr, context: Context) -> Self { + Self::from_repr(repr, context, Rc::new(RefCell::new(ConstCache::new()))) + } + + /// Build a `CachedFBig` from an [`FBig`] result, re-attaching this value's + /// shared cache handle (cloned cheaply via `Rc`). + #[inline] + pub(crate) fn from_fbig(fbig: FBig, cache: &Rc>) -> Self { + Self { + fbig, + cache: Rc::clone(cache), + } + } + + /// Borrow the inner [`FBig`]. + #[inline] + pub fn as_fbig(&self) -> &FBig { + &self.fbig + } + + /// Drop the cache handle and return the underlying [`FBig`]. + #[inline] + pub fn into_fbig(self) -> FBig { + self.fbig + } + + /// Borrow the shared constant cache immutably. + /// + /// Use this to inspect cache state, e.g. `cached.cache().total_terms()`. + #[inline] + pub fn cache(&self) -> impl core::ops::Deref + '_ { + self.cache.borrow() + } + + /// Clear all cached constant state, freeing the underlying memory. + /// + /// The next transcendental operation will recompute constants from scratch. + #[inline] + pub fn clear_cache(&self) { + self.cache.borrow_mut().clear(); + } + + /// π at `precision` base-`B` digits, reusing/extending `cache`. + pub fn pi(precision: usize, cache: &Rc>) -> Self { + let fbig = { + let mut c = cache.borrow_mut(); + Context::::new(precision).pi::(Some(&mut *c)).value() + }; + Self::from_fbig(fbig, cache) + } + + // ----- accessors ----- + + /// Maximum precision set for the number (see [`FBig::precision`]). + #[inline] + pub const fn precision(&self) -> usize { + self.fbig.context.precision + } + + /// Number of significant digits (see [`FBig::digits`]). + #[inline] + pub fn digits(&self) -> usize { + self.fbig.repr.digits() + } + + /// The associated context. + #[inline] + pub const fn context(&self) -> Context { + self.fbig.context + } + + /// The underlying representation. + #[inline] + pub const fn repr(&self) -> &Repr { + &self.fbig.repr + } + + /// Consume and return the underlying representation. + #[inline] + pub fn into_repr(self) -> Repr { + self.fbig.repr + } + + /// Sign of the number (see [`FBig::sign`]). + #[inline] + pub const fn sign(&self) -> Sign { + self.fbig.repr.sign() + } + + /// Change precision, preserving the handle (see [`FBig::with_precision`]). + pub fn with_precision(&self, precision: usize) -> Rounded { + self.fbig + .clone() + .with_precision(precision) + .map(|f| Self::from_fbig(f, &self.cache)) + } + + /// Change rounding mode, preserving the handle (see [`FBig::with_rounding`]). + pub fn with_rounding(&self) -> CachedFBig { + CachedFBig::from_fbig(self.fbig.clone().with_rounding::(), &self.cache) + } +} + +impl CachedFBig { + /// ULP of the number (see [`FBig::ulp`]). + pub fn ulp(&self) -> Self { + if self.fbig.context.precision == 0 { + panic_unlimited_precision(); + } + let repr = Repr { + significand: dashu_int::IBig::ONE, + exponent: self.fbig.repr.exponent + self.fbig.repr.digits() as isize + - self.fbig.context.precision as isize, + }; + Self::from_repr(repr, self.fbig.context, Rc::clone(&self.cache)) + } + + /// Convert to an integer (see [`FBig::to_int`]). + pub fn to_int(&self) -> Rounded { + self.fbig.clone().to_int() + } + + /// Convert to `f32` (see [`FBig::to_f32`]). + pub fn to_f32(&self) -> Rounded { + self.fbig.clone().to_f32() + } + + /// Convert to `f64` (see [`FBig::to_f64`]). + pub fn to_f64(&self) -> Rounded { + self.fbig.clone().to_f64() + } + + /// Construct from significand + exponent, with a fresh cache (see [`FBig::from_parts`]). + pub fn from_parts(significand: dashu_int::IBig, exponent: isize) -> Self { + let precision = digit_len::(&significand).max(1); + let repr = Repr::new(significand, exponent); + Self::with_cache(repr, Context::new(precision)) + } +} + +// --------------------------------------------------------------------------- +// From / Into +// --------------------------------------------------------------------------- + +impl From> for CachedFBig { + #[inline] + fn from(fbig: FBig) -> Self { + Self::new(fbig, Rc::new(RefCell::new(ConstCache::new()))) + } +} + +impl From> for FBig { + #[inline] + fn from(cached: CachedFBig) -> Self { + cached.into_fbig() + } +} + +impl FBig { + /// Attach a shared cache handle, turning this [`FBig`] into a [`CachedFBig`]. + #[inline] + pub fn into_cached(self, cache: Rc>) -> CachedFBig { + CachedFBig::new(self, cache) + } +} + +// --------------------------------------------------------------------------- +// Clone / Default / Debug / comparisons +// --------------------------------------------------------------------------- + +impl Clone for CachedFBig { + #[inline] + fn clone(&self) -> Self { + Self { + fbig: self.fbig.clone(), + cache: Rc::clone(&self.cache), + } + } +} + +impl Default for CachedFBig { + /// Default value: 0 with a fresh cache. + #[inline] + fn default() -> Self { + Self::with_cache(Repr::zero(), Context::new(0)) + } +} + +impl core::fmt::Debug for CachedFBig { + fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result { + f.debug_struct("CachedFBig") + .field("repr", &self.fbig.repr) + .field("precision", &self.fbig.context.precision) + .finish() + } +} + +impl PartialEq> for CachedFBig { + #[inline] + fn eq(&self, other: &CachedFBig) -> bool { + // value equality, mirroring FBig (compares the representation only). + self.fbig.repr == other.fbig.repr + } +} + +impl Eq for CachedFBig {} + +#[cfg(test)] +mod tests { + use super::*; + use crate::round::mode; + use alloc::format; + + fn handle() -> Rc> { + Rc::new(RefCell::new(ConstCache::new())) + } + + /// An `FBig` with value `n` at the given precision (so inexact results match the + /// `CachedFBig` operands built at the same precision). + fn fbig(n: i32, prec: usize) -> FBig { + FBig::from_repr(Repr::new(n.into(), 0), Context::new(prec)) + } + + #[test] + fn test_pi_matches_fbig() { + for &precision in &[10usize, 50, 100] { + let h = handle(); + let cached = CachedFBig::::pi(precision, &h).into_fbig(); + let direct = FBig::::pi(precision); + assert_eq!(cached, direct, "pi mismatch at precision {precision}"); + } + } + + #[test] + fn test_transcendentals_match_fbig() { + let x = CachedFBig::::with_cache( + Repr::new(1234.into(), -3), // 1.234 + Context::new(50), + ); + let y = FBig::::from_repr(Repr::new(1234.into(), -3), Context::new(50)); + + assert_eq!(x.clone().ln().into_fbig(), y.clone().ln()); + assert_eq!(x.clone().exp().into_fbig(), y.clone().exp()); + assert_eq!(x.clone().sin().into_fbig(), y.clone().sin()); + assert_eq!(x.clone().cos().into_fbig(), y.clone().cos()); + assert_eq!(x.clone().exp_m1().into_fbig(), y.clone().exp_m1()); + assert_eq!(x.clone().ln_1p().into_fbig(), y.clone().ln_1p()); + assert_eq!(x.powf(&x.clone()).into_fbig(), y.clone().powf(&y)); + } + + #[test] + fn test_cache_extension_matches_scratch() { + // Extending π 100 -> 1000 through one shared handle must equal a from-scratch compute. + let h = handle(); + let _pi_100 = CachedFBig::::pi(100, &h); + let pi_1000 = CachedFBig::::pi(1000, &h).into_fbig(); + let direct = Context::::new(1000).pi::<10>(None).value(); + assert_eq!(pi_1000, direct); + } + + #[test] + fn test_cache_survives_arithmetic() { + // a and b share one cache handle; the sum must keep it so the subsequent + // ln() reuses the same shared cache. + let h = handle(); + let a = CachedFBig::::from_repr( + Repr::new(2.into(), 0), + Context::new(30), + h.clone(), + ); + let b = CachedFBig::::from_repr( + Repr::new(3.into(), 0), + Context::new(30), + h.clone(), + ); + let sum_ln = (a.clone() + b.clone()).ln().into_fbig(); + let expected = (fbig(2, 30) + fbig(3, 30)).ln(); + assert_eq!(sum_ln, expected); + } + + #[test] + fn test_arithmetic_matches_fbig() { + let a = + CachedFBig::::with_cache(Repr::new(2.into(), 0), Context::new(20)); + let b = + CachedFBig::::with_cache(Repr::new(3.into(), 0), Context::new(20)); + + assert_eq!((a.clone() + b.clone()).into_fbig(), fbig(2, 20) + fbig(3, 20)); + assert_eq!((a.clone() - b.clone()).into_fbig(), fbig(2, 20) - fbig(3, 20)); + assert_eq!((a.clone() * b.clone()).into_fbig(), fbig(2, 20) * fbig(3, 20)); + assert_eq!((a.clone() / b.clone()).into_fbig(), fbig(2, 20) / fbig(3, 20)); + } + + #[test] + fn test_debug_compiles() { + let x = CachedFBig::::with_cache( + Repr::new(1234.into(), -3), + Context::new(50), + ); + let s = format!("{:?}", x); + assert!(s.contains("CachedFBig")); + } + + #[test] + fn test_arithmetic_with_fbig() { + let a = + CachedFBig::::with_cache(Repr::new(2.into(), 0), Context::new(20)); + let b = fbig(3, 20); + + // CachedFBig op FBig — cache preserved (LHS) + let c = a.clone() + b.clone(); + assert_eq!(c.into_fbig(), fbig(2, 20) + fbig(3, 20)); + + // FBig op CachedFBig — cache preserved (RHS) + let d = b.clone() + a.clone(); + assert_eq!(d.into_fbig(), fbig(3, 20) + fbig(2, 20)); + + // Sub, Mul, Div + assert_eq!((a.clone() - b.clone()).into_fbig(), fbig(2, 20) - fbig(3, 20)); + assert_eq!((a.clone() * b.clone()).into_fbig(), fbig(2, 20) * fbig(3, 20)); + assert_eq!((a.clone() / b.clone()).into_fbig(), fbig(2, 20) / fbig(3, 20)); + } + + #[test] + fn test_arithmetic_with_primitives() { + let a = + CachedFBig::::with_cache(Repr::new(2.into(), 0), Context::new(20)); + + // CachedFBig op primitive + assert_eq!((a.clone() + 3u8).into_fbig(), fbig(2, 20) + 3u8); + assert_eq!((a.clone() - 3i32).into_fbig(), fbig(2, 20) - 3i32); + assert_eq!((a.clone() * 4u64).into_fbig(), fbig(2, 20) * 4u64); + + // Primitive op CachedFBig + assert_eq!((3u8 + a.clone()).into_fbig(), 3u8 + fbig(2, 20)); + assert_eq!((10i32 - a.clone()).into_fbig(), 10i32 - fbig(2, 20)); + } + + #[test] + fn test_cache_size() { + let x = CachedFBig::::with_cache( + Repr::new(1234.into(), -3), + Context::new(50), + ); + let _ = x.ln(); + // After computing ln(1.234), the cache should have some state + assert!(x.cache().total_terms() > 0); + assert!(x.cache().total_words() > 0); + } + + #[test] + fn test_cache_clear() { + let x = CachedFBig::::with_cache( + Repr::new(1234.into(), -3), + Context::new(50), + ); + let before_clear = x.ln().into_fbig(); + assert!(x.cache().total_terms() > 0); + + x.clear_cache(); + assert_eq!(x.cache().total_terms(), 0); + assert_eq!(x.cache().total_words(), 0); + + // After clearing, recomputation still produces the same result + let after_clear = x.ln().into_fbig(); + assert_eq!(after_clear, before_clear); + } +} diff --git a/float/src/fbig_cached_ops.rs b/float/src/fbig_cached_ops.rs new file mode 100644 index 0000000..8a3c589 --- /dev/null +++ b/float/src/fbig_cached_ops.rs @@ -0,0 +1,449 @@ +//! Operators for [`CachedFBig`] — all binary/unary operators with cache preservation. + +use core::ops::{ + Add, AddAssign, Div, DivAssign, Mul, MulAssign, Neg, Rem, RemAssign, Sub, SubAssign, +}; + +use dashu_base::Abs; + +use crate::fbig::FBig; +use crate::fbig_cached::CachedFBig; +use crate::repr::{Context, Word}; +use crate::round::Round; + +// --------------------------------------------------------------------------- +// CachedFBig op CachedFBig (preserves LHS cache) +// --------------------------------------------------------------------------- + +macro_rules! impl_cached_binop { + ($Op:ident, $op:ident) => { + impl $Op> for CachedFBig { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: CachedFBig) -> Self::Output { + CachedFBig::from_fbig($Op::$op(self.fbig, rhs.fbig), &self.cache) + } + } + }; +} +impl_cached_binop!(Add, add); +impl_cached_binop!(Sub, sub); +impl_cached_binop!(Mul, mul); +impl_cached_binop!(Div, div); +impl_cached_binop!(Rem, rem); + +macro_rules! impl_cached_binop_assign { + ($OpAssign:ident, $op_assign:ident, $Op:ident, $op:ident) => { + impl $OpAssign> for CachedFBig { + #[inline] + fn $op_assign(&mut self, rhs: CachedFBig) { + let res = $Op::$op(self.fbig.clone(), rhs.fbig); + self.fbig = res; + } + } + }; +} +impl_cached_binop_assign!(AddAssign, add_assign, Add, add); +impl_cached_binop_assign!(SubAssign, sub_assign, Sub, sub); +impl_cached_binop_assign!(MulAssign, mul_assign, Mul, mul); +impl_cached_binop_assign!(DivAssign, div_assign, Div, div); +impl_cached_binop_assign!(RemAssign, rem_assign, Rem, rem); + +// --------------------------------------------------------------------------- +// CachedFBig op FBig and FBig op CachedFBig (cache preserved from CachedFBig side) +// --------------------------------------------------------------------------- + +macro_rules! impl_cached_binop_one_way_with_fbig { + ($Op:ident, $op:ident) => { + impl $Op> for CachedFBig { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: FBig) -> Self::Output { + CachedFBig::from_fbig($Op::$op(self.fbig, rhs), &self.cache) + } + } + impl<'l, R: Round, const B: Word> $Op> for &'l CachedFBig { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: FBig) -> Self::Output { + CachedFBig::from_fbig($Op::$op(self.fbig.clone(), rhs), &self.cache) + } + } + impl<'r, R: Round, const B: Word> $Op<&'r FBig> for CachedFBig { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: &FBig) -> Self::Output { + CachedFBig::from_fbig($Op::$op(self.fbig, rhs.clone()), &self.cache) + } + } + impl<'l, 'r, R: Round, const B: Word> $Op<&'r FBig> for &'l CachedFBig { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: &FBig) -> Self::Output { + CachedFBig::from_fbig($Op::$op(self.fbig.clone(), rhs.clone()), &self.cache) + } + } + }; +} + +macro_rules! impl_cached_binop_reverse_with_fbig { + ($Op:ident, $op:ident) => { + impl $Op> for FBig { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: CachedFBig) -> Self::Output { + CachedFBig::from_fbig($Op::$op(self, rhs.fbig), &rhs.cache) + } + } + impl<'l, R: Round, const B: Word> $Op> for &'l FBig { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: CachedFBig) -> Self::Output { + CachedFBig::from_fbig($Op::$op(self.clone(), rhs.fbig), &rhs.cache) + } + } + impl<'r, R: Round, const B: Word> $Op<&'r CachedFBig> for FBig { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: &CachedFBig) -> Self::Output { + CachedFBig::from_fbig($Op::$op(self, rhs.fbig.clone()), &rhs.cache) + } + } + impl<'l, 'r, R: Round, const B: Word> $Op<&'r CachedFBig> for &'l FBig { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: &CachedFBig) -> Self::Output { + CachedFBig::from_fbig($Op::$op(self.clone(), rhs.fbig.clone()), &rhs.cache) + } + } + }; +} + +impl_cached_binop_one_way_with_fbig!(Add, add); +impl_cached_binop_one_way_with_fbig!(Sub, sub); +impl_cached_binop_one_way_with_fbig!(Mul, mul); +impl_cached_binop_one_way_with_fbig!(Div, div); +impl_cached_binop_one_way_with_fbig!(Rem, rem); + +impl_cached_binop_reverse_with_fbig!(Add, add); +impl_cached_binop_reverse_with_fbig!(Sub, sub); +impl_cached_binop_reverse_with_fbig!(Mul, mul); +impl_cached_binop_reverse_with_fbig!(Div, div); +impl_cached_binop_reverse_with_fbig!(Rem, rem); + +// assign: CachedFBig op= FBig + +macro_rules! impl_cached_binop_assign_with_fbig { + ($OpAssign:ident, $op_assign:ident, $Op:ident, $op:ident) => { + impl $OpAssign> for CachedFBig { + #[inline] + fn $op_assign(&mut self, rhs: FBig) { + self.fbig = $Op::$op(self.fbig.clone(), rhs); + } + } + impl $OpAssign<&FBig> for CachedFBig { + #[inline] + fn $op_assign(&mut self, rhs: &FBig) { + self.fbig = $Op::$op(self.fbig.clone(), rhs.clone()); + } + } + }; +} + +impl_cached_binop_assign_with_fbig!(AddAssign, add_assign, Add, add); +impl_cached_binop_assign_with_fbig!(SubAssign, sub_assign, Sub, sub); +impl_cached_binop_assign_with_fbig!(MulAssign, mul_assign, Mul, mul); +impl_cached_binop_assign_with_fbig!(DivAssign, div_assign, Div, div); +impl_cached_binop_assign_with_fbig!(RemAssign, rem_assign, Rem, rem); + +// --------------------------------------------------------------------------- +// CachedFBig op Primitive and Primitive op CachedFBig +// (delegate through the FBig-side impls above) +// --------------------------------------------------------------------------- + +macro_rules! impl_cached_binop_one_way_with_primitive { + ($Op:ident, $op:ident, $target:ty) => { + impl $Op<$target> for CachedFBig { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: $target) -> Self::Output { + self.$op(FBig::::from(rhs)) + } + } + impl<'l, R: Round, const B: Word> $Op<$target> for &'l CachedFBig { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: $target) -> Self::Output { + self.$op(FBig::::from(rhs)) + } + } + impl<'r, R: Round, const B: Word> $Op<&'r $target> for CachedFBig { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: &$target) -> Self::Output { + self.$op(FBig::::from(rhs.clone())) + } + } + impl<'l, 'r, R: Round, const B: Word> $Op<&'r $target> for &'l CachedFBig { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: &$target) -> Self::Output { + self.$op(FBig::::from(rhs.clone())) + } + } + }; +} + +macro_rules! impl_cached_binop_reverse_with_primitive { + ($Op:ident, $op:ident, $target:ty) => { + impl $Op> for $target { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: CachedFBig) -> Self::Output { + FBig::::from(self).$op(rhs) + } + } + impl<'l, R: Round, const B: Word> $Op> for &'l $target { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: CachedFBig) -> Self::Output { + FBig::::from(self.clone()).$op(rhs) + } + } + impl<'r, R: Round, const B: Word> $Op<&'r CachedFBig> for $target { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: &CachedFBig) -> Self::Output { + FBig::::from(self).$op(rhs) + } + } + impl<'l, 'r, R: Round, const B: Word> $Op<&'r CachedFBig> for &'l $target { + type Output = CachedFBig; + #[inline] + fn $op(self, rhs: &CachedFBig) -> Self::Output { + FBig::::from(self.clone()).$op(rhs) + } + } + }; +} + +macro_rules! impl_cached_binop_assign_with_primitive { + ($OpAssign:ident, $op_assign:ident, $Op:ident, $op:ident, $target:ty) => { + impl $OpAssign<$target> for CachedFBig { + #[inline] + fn $op_assign(&mut self, rhs: $target) { + self.$op_assign(FBig::::from(rhs)); + } + } + impl $OpAssign<&$target> for CachedFBig { + #[inline] + fn $op_assign(&mut self, rhs: &$target) { + self.$op_assign(FBig::::from(rhs.clone())); + } + } + }; +} + +macro_rules! impl_cached_binop_with_primitives { + ($Op:ident, $op:ident $(, $t:ty)*) => { + $( + impl_cached_binop_one_way_with_primitive!($Op, $op, $t); + impl_cached_binop_reverse_with_primitive!($Op, $op, $t); + )* + }; +} + +macro_rules! impl_cached_binop_assign_with_primitives { + ($OpAssign:ident, $op_assign:ident, $Op:ident, $op:ident $(, $t:ty)*) => { + $( + impl_cached_binop_assign_with_primitive!($OpAssign, $op_assign, $Op, $op, $t); + )* + }; +} + +// Unsigned +impl_cached_binop_with_primitives!(Add, add, u8, u16, u32, u64, u128, usize, dashu_int::UBig); +impl_cached_binop_with_primitives!(Sub, sub, u8, u16, u32, u64, u128, usize, dashu_int::UBig); +impl_cached_binop_with_primitives!(Mul, mul, u8, u16, u32, u64, u128, usize, dashu_int::UBig); +impl_cached_binop_with_primitives!(Div, div, u8, u16, u32, u64, u128, usize, dashu_int::UBig); +impl_cached_binop_with_primitives!(Rem, rem, u8, u16, u32, u64, u128, usize, dashu_int::UBig); + +// Signed +impl_cached_binop_with_primitives!(Add, add, i8, i16, i32, i64, i128, isize, dashu_int::IBig); +impl_cached_binop_with_primitives!(Sub, sub, i8, i16, i32, i64, i128, isize, dashu_int::IBig); +impl_cached_binop_with_primitives!(Mul, mul, i8, i16, i32, i64, i128, isize, dashu_int::IBig); +impl_cached_binop_with_primitives!(Div, div, i8, i16, i32, i64, i128, isize, dashu_int::IBig); +impl_cached_binop_with_primitives!(Rem, rem, i8, i16, i32, i64, i128, isize, dashu_int::IBig); + +// Assign variants +#[rustfmt::skip] +impl_cached_binop_assign_with_primitives!(AddAssign, add_assign, Add, add, + u8, u16, u32, u64, u128, usize, dashu_int::UBig, + i8, i16, i32, i64, i128, isize, dashu_int::IBig); +#[rustfmt::skip] +impl_cached_binop_assign_with_primitives!(SubAssign, sub_assign, Sub, sub, + u8, u16, u32, u64, u128, usize, dashu_int::UBig, + i8, i16, i32, i64, i128, isize, dashu_int::IBig); +#[rustfmt::skip] +impl_cached_binop_assign_with_primitives!(MulAssign, mul_assign, Mul, mul, + u8, u16, u32, u64, u128, usize, dashu_int::UBig, + i8, i16, i32, i64, i128, isize, dashu_int::IBig); +#[rustfmt::skip] +impl_cached_binop_assign_with_primitives!(DivAssign, div_assign, Div, div, + u8, u16, u32, u64, u128, usize, dashu_int::UBig, + i8, i16, i32, i64, i128, isize, dashu_int::IBig); +#[rustfmt::skip] +impl_cached_binop_assign_with_primitives!(RemAssign, rem_assign, Rem, rem, + u8, u16, u32, u64, u128, usize, dashu_int::UBig, + i8, i16, i32, i64, i128, isize, dashu_int::IBig); + +// --------------------------------------------------------------------------- +// Unary operators +// --------------------------------------------------------------------------- + +impl Neg for CachedFBig { + type Output = CachedFBig; + #[inline] + fn neg(self) -> Self::Output { + CachedFBig::from_fbig(-self.fbig, &self.cache) + } +} + +impl Abs for CachedFBig { + type Output = CachedFBig; + #[inline] + fn abs(self) -> Self::Output { + CachedFBig::from_fbig(Abs::abs(self.fbig), &self.cache) + } +} + +// --------------------------------------------------------------------------- +// Math functions (forward to Context / FBig, preserve cache handle) +// --------------------------------------------------------------------------- + +/// Forward a unary function to a [`Context`] method returning `FpResult`, panicking on +/// error and re-attaching the cache handle. Returns a bare `CachedFBig`. +macro_rules! forward_to_context { + ($name:ident) => { + #[doc = concat!("See [`FBig::", stringify!($name), "`].")] + #[inline] + pub fn $name(&self) -> CachedFBig { + let mut c = self.cache.borrow_mut(); + let fbig = self + .fbig + .context + .unwrap_fp(self.fbig.context.$name::(&self.fbig.repr, Some(&mut *c))); + CachedFBig::from_fbig(fbig, &self.cache) + } + }; +} + +/// Forward a unary function to a [`Context`] method returning `FpResult`, panicking on +/// error and discarding the rounding info. Returns a bare `CachedFBig`. +macro_rules! forward_to_context_unwrap { + ($name:ident) => { + #[doc = concat!("See [`FBig::", stringify!($name), "`].")] + #[inline] + pub fn $name(&self) -> CachedFBig { + let mut c = self.cache.borrow_mut(); + let fbig = self + .fbig + .context + .unwrap_fp(self.fbig.context.$name::(&self.fbig.repr, Some(&mut *c))); + CachedFBig::from_fbig(fbig, &self.cache) + } + }; +} + +/// Forward a unary function that delegates to the inner [`FBig`] (no cache needed). +macro_rules! forward_to_fbig { + ($name:ident) => { + #[doc = concat!("See [`FBig::", stringify!($name), "`].")] + #[inline] + pub fn $name(&self) -> CachedFBig { + CachedFBig::from_fbig(self.fbig.clone().$name(), &self.cache) + } + }; + ($name:ident($arg:ident: $arg_ty:ty)) => { + #[doc = concat!("See [`FBig::", stringify!($name), "`].")] + #[inline] + pub fn $name(&self, $arg: $arg_ty) -> CachedFBig { + CachedFBig::from_fbig(self.fbig.clone().$name($arg), &self.cache) + } + }; +} + +impl CachedFBig { + forward_to_context!(ln); + forward_to_context!(ln_1p); + forward_to_context!(exp); + forward_to_context!(exp_m1); + + /// Square root (see [`Context::sqrt`]). + #[inline] + pub fn sqrt(&self) -> Self { + let fbig = self + .fbig + .context + .unwrap_fp(self.fbig.context.sqrt::(&self.fbig.repr)); + Self::from_fbig(fbig, &self.cache) + } + + /// Multiplicative inverse (see [`Context::inv`]). + #[inline] + pub fn inv(&self) -> Self { + let fbig = self + .fbig + .context + .unwrap_fp(self.fbig.context.inv::(&self.fbig.repr)); + Self::from_fbig(fbig, &self.cache) + } + + forward_to_context_unwrap!(sin); + forward_to_context_unwrap!(cos); + forward_to_context_unwrap!(tan); + forward_to_context_unwrap!(asin); + forward_to_context_unwrap!(acos); + forward_to_context_unwrap!(atan); + + forward_to_fbig!(powi(exp: dashu_int::IBig)); + forward_to_fbig!(sqr); + forward_to_fbig!(cubic); + + /// `self^exp` (see [`FBig::powf`]). + pub fn powf(&self, exp: &Self) -> Self { + let context = Context::max(self.fbig.context, exp.fbig.context); + let mut c = self.cache.borrow_mut(); + let fbig = + context.unwrap_fp(context.powf::(&self.fbig.repr, &exp.fbig.repr, Some(&mut *c))); + Self::from_fbig(fbig, &self.cache) + } + + /// Sine and cosine together (see [`FBig::sin_cos`]). + pub fn sin_cos(&self) -> (Self, Self) { + let mut guard = self.cache.borrow_mut(); + let cache = Some(&mut *guard); + let (s, c) = self.fbig.context.sin_cos::(&self.fbig.repr, cache); + ( + Self::from_fbig(self.fbig.context.unwrap_fp(s), &self.cache), + Self::from_fbig(self.fbig.context.unwrap_fp(c), &self.cache), + ) + } + + /// `atan2(y, x)` (see [`FBig::atan2`]). + pub fn atan2(&self, x: &Self) -> Self { + let mut c = self.cache.borrow_mut(); + let fbig = self.fbig.context.unwrap_fp(self.fbig.context.atan2::( + &self.fbig.repr, + &x.fbig.repr, + Some(&mut *c), + )); + Self::from_fbig(fbig, &self.cache) + } + + /// Reciprocal `1/x` — alias for [`Self::inv`]. + #[inline] + pub fn reciprocal(&self) -> Self { + self.inv() + } +} diff --git a/float/src/lib.rs b/float/src/lib.rs index 1af2a3d..2a0e34b 100644 --- a/float/src/lib.rs +++ b/float/src/lib.rs @@ -53,7 +53,7 @@ //! use dashu_base::Approximation::*; //! use dashu_float::{Context, round::{mode::HalfAway, Rounding::*}}; //! let ctxt = Context::::new(6); -//! assert_eq!(ctxt.exp(DBig::ONE.repr()), Inexact(c, NoOp)); +//! assert_eq!(ctxt.exp(DBig::ONE.repr(), None), Ok(Inexact(c, NoOp))); //! # Ok::<(), ParseError>(()) //! ``` //! @@ -72,6 +72,8 @@ mod div; mod error; mod exp; mod fbig; +mod fbig_cached; +mod fbig_cached_ops; mod fmt; mod helper_macros; mod iter; @@ -94,7 +96,10 @@ mod utils; pub use third_party::*; pub use fbig::FBig; +pub use fbig_cached::CachedFBig; +pub use math::cache::ConstCache; pub use repr::{Context, Repr}; +pub use {crate::error::FpError, crate::math::FpResult}; /// Multi-precision float number with decimal exponent and [HalfAway][round::mode::HalfAway] rounding mode pub type DBig = FBig; @@ -102,6 +107,7 @@ pub type DBig = FBig; #[doc(hidden)] pub use dashu_int::Word; // for macros -// TODO: allow operations with inf, but only panic when the result is nan (inf - inf and inf / inf) -// for division with zero (and other functions that has different limits at zero), -// we might forbidden it because we don't want to support negative zero in this library. +// Infinities are now first-class *terminal* values: operations produce them as results +// (1/0 → +inf, ln(0) → -inf, …) but reject infinite *inputs* (FpError::InfiniteInput / panic), +// which structurally avoids the NaN-producing indeterminate forms (inf−inf, inf/inf, 0·inf). +// IEEE signed zero is supported (`-0`), so 1/-0 = -inf is well-defined. diff --git a/float/src/log.rs b/float/src/log.rs index f3c4d7c..5525e07 100644 --- a/float/src/log.rs +++ b/float/src/log.rs @@ -7,11 +7,14 @@ use dashu_base::{ use dashu_int::IBig; use crate::{ - error::{assert_finite, assert_limited_precision}, + error::{assert_finite, assert_limited_precision, FpError, FpResult}, fbig::FBig, + math::cache::{reborrow_cache, ConstCache}, repr::{Context, Repr, Word}, round::{Round, Rounded}, + utils::ceil_usize, }; +use core::cmp::Ordering; impl EstimatedLog2 for Repr { // currently a Word has at most 64 bits, so log2() < f32::MAX @@ -75,7 +78,7 @@ impl FBig { /// ``` #[inline] pub fn ln(&self) -> Self { - self.context.ln(&self.repr).value() + self.context.unwrap_fp(self.context.ln(&self.repr, None)) } /// Calculate the natural logarithm function (`log(x+1)`) on the float number @@ -92,7 +95,7 @@ impl FBig { /// ``` #[inline] pub fn ln_1p(&self) -> Self { - self.context.ln_1p(&self.repr).value() + self.context.unwrap_fp(self.context.ln_1p(&self.repr, None)) } } @@ -101,32 +104,41 @@ impl Context { /// /// The precision of the output will be larger than self.precision #[inline] - fn ln2(&self) -> FBig { + fn ln2(&self, cache: Option<&mut ConstCache>) -> FBig { + if let Some(c) = cache { + return c.ln2::(self.precision); + } // log(2) = 4L(6) + 2L(99) // see formula (24) from Gourdon, Xavier, and Pascal Sebah. // "The Logarithmic Constant: Log 2." (2004) 4 * self.iacoth(6.into()) + 2 * self.iacoth(99.into()) } - /// Calculate log(2) + /// Calculate log(10) /// /// The precision of the output will be larger than self.precision #[inline] - fn ln10(&self) -> FBig { + fn ln10(&self, cache: Option<&mut ConstCache>) -> FBig { + if let Some(c) = cache { + return c.ln10::(self.precision); + } // log(10) = log(2) + log(5) = 3log(2) + 2L(9) - 3 * self.ln2() + 2 * self.iacoth(9.into()) + 3 * self.ln2(None) + 2 * self.iacoth(9.into()) } /// Calculate log(B), for internal use only /// /// The precision of the output will be larger than self.precision #[inline] - pub(crate) fn ln_base(&self) -> FBig { + pub(crate) fn ln_base(&self, cache: Option<&mut ConstCache>) -> FBig { + if let Some(c) = cache { + return c.ln_base::(self.precision); + } match B { - 2 => self.ln2(), - 10 => self.ln10(), - i if i.is_power_of_two() => self.ln2() * i.trailing_zeros(), - _ => self.ln(&Repr::new(Repr::::BASE.into(), 0)).value(), + 2 => self.ln2(None), + 10 => self.ln10(None), + i if i.is_power_of_two() => self.ln2(None) * i.trailing_zeros(), + _ => self.unwrap_fp(self.ln(&Repr::new(Repr::::BASE.into(), 0), None)), } } @@ -134,50 +146,30 @@ impl Context { /// /// This method is intended to be used in logarithm calculation, /// so the precision of the output will be larger than desired precision. + /// + /// Evaluated by binary splitting (see [`iacoth_bs`][crate::math::consts::iacoth_bs]): + /// the exact integer tree state `(P, Q, T)` over `[1, N)` satisfies + /// `L(n) = (Q + T)/(n·Q)`, with `Q` kept at O(p) digits by the ratio-form + /// term recurrence. fn iacoth(&self, n: IBig) -> FBig { - /* - * use Maclaurin series: - * 1 1 n+1 1 - * atanh(—) = — log(———) = Σ ——————————— - * n 2 n-1 i≥0 n²ⁱ⁺¹(2i+1) - * - * Therefore to achieve precision B^p, the series should be stopped at - * n²ⁱ⁺¹(2i+1) / n = B^p - * => 2i*ln(n) + ln(2i+1) = p ln(B) - * ~> 2i*ln(n) = p ln(B) - * => 2i = p/log_B(n) - * - * There will be i summations when calculating the series, to prevent - * loss of significant, we needs about log_B(i) guard digits. - * log_B(i) - * <= log_B(p/2log_B(n)) - * = log_B(p/2) - log_B(log_B(n)) - * <= log_B(p/2) - */ - - // extras digits are added to ensure precise result - // TODO: test if we can use log_B(p/2log_B(n)) directly - let guard_digits = (self.precision.log2_est() / B.log2_est()) as usize; - let work_context = Self::new(self.precision + guard_digits + 2); + let n: u32 = (&n).try_into().expect("iacoth argument must fit in u32"); - let n = work_context.convert_int(n).value(); - let inv = FBig::ONE / n; - let inv2 = inv.sqr(); - let mut sum = inv.clone(); - let mut pow = inv; + // number of series terms until r_k < B^{-p}: (2k+1)·log_B(n) > p. + // The count is generously over-provisioned, so a truncating cast stands in + // for a ceiling. + let log_b_n = n.log2_est() / B.log2_est(); + let num_terms = (self.precision as f32 / (2.0 * log_b_n)) as usize + 10; - let mut k: usize = 3; - loop { - pow *= &inv2; + let (_p, q, t) = crate::math::consts::iacoth_bs(n, 1, num_terms + 1); - let increase = &pow / work_context.convert_int::(k.into()).value(); - if increase < sum.sub_ulp() { - return sum; - } + // L(n) = (Q + T) / (n·Q). Extra guard digits absorb the division's rounding + // (the binary-splitting state is exact, so only this single round loses anything). + let guard_digits = ceil_usize(self.precision.log2_est() / B.log2_est()); + let work_context = Self::new(self.precision + guard_digits + 2); - sum += increase; - k += 2; - } + let num = work_context.convert_int::(q.as_ibig() + &t).value(); + let denom = work_context.convert_int::(IBig::from(n) * &q).value(); + num / denom } /// Calculate the natural logarithm function (`log(x)`) on the float number under this context. @@ -193,12 +185,26 @@ impl Context { /// /// let context = Context::::new(2); /// let a = DBig::from_str("1.234")?; - /// assert_eq!(context.ln(&a.repr()), Inexact(DBig::from_str("0.21")?, NoOp)); + /// assert_eq!(context.ln(&a.repr(), None), Ok(Inexact(DBig::from_str("0.21")?, NoOp))); /// # Ok::<(), ParseError>(()) /// ``` #[inline] - pub fn ln(&self, x: &Repr) -> Rounded> { - self.ln_internal(x, false) + pub fn ln( + &self, + x: &Repr, + cache: Option<&mut ConstCache>, + ) -> FpResult> { + if x.is_infinite() { + return Err(FpError::InfiniteInput); + } + if x.significand.is_zero() { + // ln(±0) = -inf (a value, not an error) + return Ok(Exact(FBig::new(Repr::neg_infinity(), *self))); + } + if x.sign() == Sign::Negative { + return Err(FpError::OutOfDomain); + } + Ok(self.ln_internal(x, false, cache)) } /// Calculate the natural logarithm function (`log(x+1)`) on the float number under this context. @@ -214,26 +220,55 @@ impl Context { /// /// let context = Context::::new(2); /// let a = DBig::from_str("0.1234")?; - /// assert_eq!(context.ln_1p(&a.repr()), Inexact(DBig::from_str("0.12")?, AddOne)); + /// assert_eq!(context.ln_1p(&a.repr(), None), Ok(Inexact(DBig::from_str("0.12")?, AddOne))); /// # Ok::<(), ParseError>(()) /// ``` #[inline] - pub fn ln_1p(&self, x: &Repr) -> Rounded> { - self.ln_internal(x, true) + pub fn ln_1p( + &self, + x: &Repr, + cache: Option<&mut ConstCache>, + ) -> FpResult> { + if x.is_infinite() { + return Err(FpError::InfiniteInput); + } + // Domain of ln_1p is x > -1. x == -1 gives -inf; x < -1 is out of domain. + if x.sign() == Sign::Negative && !x.significand.is_zero() { + match FBig::::new(x.clone(), *self).abs_cmp(&FBig::ONE) { + Ordering::Greater => return Err(FpError::OutOfDomain), // x < -1 + Ordering::Equal => return Ok(Exact(FBig::new(Repr::neg_infinity(), *self))), + _ => {} + } + } + Ok(self.ln_internal(x, true, cache)) } - fn ln_internal(&self, x: &Repr, one_plus: bool) -> Rounded> { + fn ln_internal( + &self, + x: &Repr, + one_plus: bool, + mut cache: Option<&mut ConstCache>, + ) -> Rounded> { assert_finite(x); assert_limited_precision(self.precision); - if (one_plus && x.is_zero()) || (!one_plus && x.is_one()) { - return Exact(FBig::ZERO); + if !one_plus && x.is_one() { + return Exact(FBig::ZERO); // ln(1) = +0 + } + if one_plus && x.significand.is_zero() { + // ln_1p(±0) = ±0 + let zero = if x.is_neg_zero() { + FBig::new(Repr::neg_zero(), *self) + } else { + FBig::ZERO + }; + return Exact(zero); } // A simple algorithm: // - let log(x) = log(x/2^s) + slog2 where s = floor(log2(x)) // - such that x*2^s is close to but larger than 1 (and x*2^s < 2) - let guard_digits = (self.precision.log2_est() / B.log2_est()) as usize + 2; + let guard_digits = ceil_usize(self.precision.log2_est() / B.log2_est()) + 2; let mut work_precision = self.precision + guard_digits + one_plus as usize; let context = Context::::new(work_precision); let x = FBig::new(context.repr_round_ref(x).value(), context); @@ -298,7 +333,7 @@ impl Context { let result: FBig = if no_scaling { 2 * sum } else { - 2 * sum + s * work_context.ln2() + 2 * sum + s * work_context.ln2::(reborrow_cache(&mut cache)) }; result.with_precision(self.precision) } @@ -339,25 +374,25 @@ mod tests { #[test] fn test_ln2_ln10() { let context = Context::::new(45); - let decimal_ln2 = context.ln2::<10>().with_precision(45).value(); + let decimal_ln2 = context.ln2::<10>(None).with_precision(45).value(); assert_eq!( decimal_ln2.repr.significand, IBig::from_str_radix("693147180559945309417232121458176568075500134", 10).unwrap() ); - let decimal_ln10 = context.ln10::<10>().with_precision(45).value(); + let decimal_ln10 = context.ln10::<10>(None).with_precision(45).value(); assert_eq!( decimal_ln10.repr.significand, IBig::from_str_radix("230258509299404568401799145468436420760110148", 10).unwrap() ); let context = Context::::new(180); - let binary_ln2 = context.ln2::<2>().with_precision(180).value(); + let binary_ln2 = context.ln2::<2>(None).with_precision(180).value(); assert_eq!( binary_ln2.repr.significand, IBig::from_str_radix("1062244963371879310175186301324412638028404515790072203", 10) .unwrap() ); - let binary_ln10 = context.ln10::<2>().with_precision(180).value(); + let binary_ln10 = context.ln10::<2>(None).with_precision(180).value(); assert_eq!( binary_ln10.repr.significand, IBig::from_str_radix("882175346869410758689845931257775553286341791676474847", 10) @@ -365,3 +400,151 @@ mod tests { ); } } + +#[cfg(test)] +mod bench_iacoth { + use super::*; + use crate::round::mode; + use std::hint::black_box; + use std::time::Instant; + + /// Master's iterative iacoth: full-precision FBig Maclaurin series. + fn iacoth_iter(precision: usize, n: u32) -> FBig { + let guard = (precision.log2_est() / B.log2_est()) as usize; + let work = Context::::new(precision + guard + 2); + let nf = work.convert_int::(n.into()).value(); + let inv = FBig::::ONE / nf; + let inv2 = inv.sqr(); + let mut sum = inv.clone(); + let mut pow = inv; + let mut k: usize = 3; + loop { + pow *= &inv2; + let increase = &pow / work.convert_int::(k.into()).value(); + if increase < sum.sub_ulp() { + return sum; + } + sum += increase; + k += 2; + } + } + + /// ln(2) via the iterative series: 4·L(6) + 2·L(99). + fn ln2_iter(precision: usize) -> FBig { + let l6 = iacoth_iter::(precision, 6); + let l99 = iacoth_iter::(precision, 99); + (4u8 * l6 + 2u8 * l99).with_precision(precision).value() + } + + /// ln(2) via the current binary-splitting path. + fn ln2_bs(precision: usize) -> FBig { + Context::::new(precision) + .ln2::(None) + .with_precision(precision) + .value() + } + + #[test] + #[ignore] + fn bench_ln2_iter_vs_binary_splitting() { + let precisions: &[usize] = &[50, 200, 1_000, 5_000, 10_000]; + eprintln!("\nln(2) computation: iterative (master) vs binary splitting (this branch)"); + eprintln!("{:>8} {:>14} {:>14} {:>10}", "digits", "iterative", "bin-split", "speedup"); + for &p in precisions { + // correctness: agree to p-5 digits + let bs = ln2_bs::(p); + let it = ln2_iter::(p); + let check_digits = p.saturating_sub(5).max(1); + assert_eq!( + bs.clone().with_precision(check_digits), + it.clone().with_precision(check_digits), + "mismatch at p={p}" + ); + + let reps = if p <= 200 { + 50 + } else if p <= 1_000 { + 10 + } else { + 1 + }; + let t0 = Instant::now(); + for _ in 0..reps { + black_box(ln2_iter::(p)); + } + let t_iter = t0.elapsed() / reps as u32; + + let t0 = Instant::now(); + for _ in 0..reps { + black_box(ln2_bs::(p)); + } + let t_bs = t0.elapsed() / reps as u32; + + let speedup = t_iter.as_secs_f64() / t_bs.as_secs_f64(); + eprintln!("{:>8} {:>11.2?} {:>14.2?} {:>9.2}x", p, t_iter, t_bs, speedup); + } + } +} + +#[cfg(test)] +mod bench_pi_sqrt { + use super::*; + use crate::round::mode; + use std::hint::black_box; + use std::time::Instant; + + #[test] + #[ignore] + fn bench_pi_repeat_call() { + // First call computes the series + sqrt; second call reuses both. + use crate::math::cache::ConstCache; + let precisions: &[usize] = &[500, 5_000]; + eprintln!("\nπ repeat-call (ConstCache): cold vs warm (sqrt + series reused)"); + eprintln!("{:>8} {:>12} {:>12} {:>10}", "digits", "cold", "warm", "warm/cold"); + for &p in precisions { + let mut c = ConstCache::new(); + let t0 = std::time::Instant::now(); + black_box(c.pi::<10, mode::Zero>(p)); + let cold = t0.elapsed(); + let t0 = std::time::Instant::now(); + black_box(c.pi::<10, mode::Zero>(p)); + let warm = t0.elapsed(); + eprintln!( + "{:>8} {:>12.2?} {:>12.2?} {:>9.2}x", + p, + cold, + warm, + cold.as_secs_f64() / warm.as_secs_f64() + ); + } + } + #[test] + #[ignore] + fn bench_pi_vs_sqrt10005() { + let precisions: &[usize] = &[50, 500, 5_000]; + eprintln!("\nπ (Chudnovsky) vs its sqrt(10005) sub-computation"); + eprintln!("{:>8} {:>12} {:>12} {:>10}", "digits", "pi_total", "sqrt10005", "sqrt %"); + for &p in precisions { + // time the full pi computation + let reps = if p <= 500 { 10 } else { 1 }; + let t0 = Instant::now(); + for _ in 0..reps { + black_box(Context::::new(p).pi::<10>(None).value()); + } + let t_pi = t0.elapsed() / reps as u32; + + // time just sqrt(10005) at the same working precision pi uses + // (work precision ≈ p + guard; use p*2 bits of significand to mirror it) + let ctx = Context::::new(p); + let arg = ctx.convert_int::<10>(10005i32.into()).value(); + let t0 = Instant::now(); + for _ in 0..reps { + black_box(ctx.unwrap_fp(ctx.sqrt(&arg.repr))); + } + let t_sqrt = t0.elapsed() / reps as u32; + + let pct = 100.0 * t_sqrt.as_secs_f64() / t_pi.as_secs_f64(); + eprintln!("{:>8} {:>12.2?} {:>12.2?} {:>8.1}%", p, t_pi, t_sqrt, pct); + } + } +} diff --git a/float/src/math/cache.rs b/float/src/math/cache.rs new file mode 100644 index 0000000..7de7b76 --- /dev/null +++ b/float/src/math/cache.rs @@ -0,0 +1,539 @@ +//! Opt-in cache of mathematical constants, enabling progressive refinement. + +use core::fmt; + +use dashu_base::{BitTest, EstimatedLog2}; +use dashu_int::{IBig, UBig}; + +use crate::fbig::FBig; +use crate::math::consts::{chudnovsky_bs, merge}; +use crate::repr::{Context, Repr, Word}; +use crate::round::{Round, Rounded}; +use crate::utils::ceil_usize; +use crate::{error::assert_limited_precision, math::consts::iacoth_bs}; + +/// Binary-splitting tree state — exact integers, losslessly extensible. +/// +/// Represents `binary_split(start, num_terms)` as the universal triple +/// `(P, Q, T)`, where `start` is 0 for π and 1 for `L(n)` (whose `k=0` term +/// `1/n` is pulled out). These are pure integers: independent of base and +/// rounding mode. To extend to `new_terms > num_terms`, compute the right half +/// over the new range and merge with the universal `T' = T_l·Q_r + P_l·T_r`. +#[derive(Clone)] +pub(crate) struct CachedState { + pub p: UBig, + pub q: UBig, + pub t: IBig, + pub num_terms: usize, +} + +/// An opt-in cache of mathematical constants. +/// +/// Holds exact binary-splitting tree state so that repeated calls at increasing +/// precision *extend* prior work instead of recomputing from scratch. For +/// example, computing π at 100 digits and then at 1000 digits only pays for the +/// extra ~900 digits of work. +/// +/// The cache is **base-free**: a single [`ConstCache`] serves any base. The base +/// and rounding mode are specified on each method call (e.g. +/// `cache.pi::<10, HalfAway>(100)` for 100 decimal digits). +/// +/// `ConstCache` is a plain struct of big integers, so it is `Send + Sync`. The +/// methods take `&mut self` (they extend the cached state on a miss), so a caller +/// either owns one directly, or — to share it across many values and operations — +/// wraps it in `Rc>` as the +/// [`CachedFBig`](crate::CachedFBig) type does. To share one cache across +/// threads, wrap a `ConstCache` (or a `CachedFBig`) in `Arc>`. +/// +/// # Examples +/// +/// ``` +/// use dashu_float::ConstCache; +/// use dashu_float::round::mode::HalfAway; +/// +/// let mut cache = ConstCache::new(); +/// // first call computes from scratch +/// let _pi_100 = cache.pi::<10, HalfAway>(100).value(); +/// // second call at higher precision extends the cached state +/// let pi_1000 = cache.pi::<10, HalfAway>(1000).value(); +/// assert!(pi_1000.to_string().starts_with("3.141592653589793")); +/// ``` +pub struct ConstCache { + pi: Option, + /// `L(6)`, `L(9)`, `L(99)` — the sub-series used by ln2 / ln10. + iacoth_6: Option, + iacoth_9: Option, + iacoth_99: Option, + /// Base-free integer `floor(sqrt(10005) · 2^sqrt_10005_bits)`, reused by π. + /// Unlike the series slots this holds a plain value (not a `(P,Q,T)` triple) and + /// is extended by a fresh Karatsuba `UBig::sqrt` — Newton refinement would be no + /// faster, since `UBig::sqrt` is already O(M(n)). + sqrt_10005: Option, + sqrt_10005_bits: usize, +} + +impl ConstCache { + /// Create an empty cache. + pub const fn new() -> Self { + Self { + pi: None, + iacoth_6: None, + iacoth_9: None, + iacoth_99: None, + sqrt_10005: None, + sqrt_10005_bits: 0, + } + } + + /// `floor(sqrt(10005) · 2^bits)` as a base-free integer, cached and extended on + /// demand. Used by [`pi`](Self::pi). Computed via Karatsuba `UBig::sqrt` (O(M(n))). + /// Returns the value together with the number of bits it actually corresponds to + /// (which may be larger than requested, when a higher-precision value is reused). + fn sqrt_10005(&mut self, bits: usize) -> (UBig, usize) { + if bits > self.sqrt_10005_bits { + let n = UBig::from(10005u32) << (2 * bits); + self.sqrt_10005 = Some(dashu_base::SquareRoot::sqrt(&n)); + self.sqrt_10005_bits = bits; + } + (self.sqrt_10005.as_ref().unwrap().clone(), self.sqrt_10005_bits) + } + + /// π at `precision` base-`B` digits, rounded per `R`. Extends any prior π + /// state cached in `self`. + /// + /// # Panics + /// + /// Panics if `precision` is 0. + pub fn pi(&mut self, precision: usize) -> Rounded> { + assert_limited_precision(precision); + + let bits = bits_for_precision::(precision); + let num_terms = (bits * 100 / 4708) + 1; + + let (_p, q, t) = extend_or_compute(&mut self.pi, 0, num_terms, chudnovsky_bs); + + // Finalize: π = 426880·√10005·Q / T (identical to Context::pi) + let guard_bits = num_terms.bit_len() + 32; + let work_bits = bits + guard_bits; + let work_precision = precision_for_bits::(work_bits); + let work = Context::::new(work_precision); + + // Finalize: π = 426880·√10005·Q / T. With √10005 ≈ isqrt_val·2^(-isqrt_bits) + // from the base-free cached isqrt, this folds into a single integer ratio + // π = (426880 · isqrt_val · Q) / (T · 2^isqrt_bits), + // avoiding any cross-base conversion of √10005 (convert_int is the fast path, + // the same one used for Q and T). + let (isqrt_val, isqrt_bits) = self.sqrt_10005(work_bits); + let num = IBig::from(426_880) * IBig::from(isqrt_val) * IBig::from(q); + let den = t << isqrt_bits; + let num_f = work.convert_int::(num).value(); + let den_f = work.convert_int::(den).value(); + let pi = num_f / den_f; + pi.with_precision(precision) + } + + /// `L(n) = acoth(n)` at `precision` base-`B` digits, extending its cached + /// series state. Only `n ∈ {6, 9, 99}` are cached (the sub-series of ln2 / ln10). + fn iacoth(&mut self, precision: usize) -> FBig { + // terms until r_k < B^{-p}: (2k+1)·log_B(n) > p. The count is generously + // over-provisioned (extra terms only add precision), so a plain (truncating) + // cast suffices in place of a ceiling. + let log_b_n = N.log2_est() / B.log2_est(); + let required_terms = (precision as f32 / (2.0 * log_b_n)) as usize + 10; + + let slot = match N { + 6 => &mut self.iacoth_6, + 9 => &mut self.iacoth_9, + 99 => &mut self.iacoth_99, + _ => unreachable!("iacoth only caches n ∈ {{6, 9, 99}}"), + }; + let (_p, q, t) = extend_or_compute(slot, 1, required_terms, |a, b| iacoth_bs(N, a, b)); + + // L(n) = (Q + T) / (n·Q) + let guard = ceil_usize(precision.log2_est() / B.log2_est()) + 2; + let work = Context::::new(precision + guard); + let num = work.convert_int::(q.as_ibig() + &t).value(); + let denom = work.convert_int::(IBig::from(N) * &q).value(); + num / denom + } + + /// ln(2) at `precision` base-`B` digits, reusing the cached `L(6)` and + /// `L(99)` sub-series. + /// + /// # Panics + /// + /// Panics if `precision` is 0. + pub fn ln2(&mut self, precision: usize) -> FBig { + // log(2) = 4·L(6) + 2·L(99) (Gourdon & Sebah, "Log 2") + let work = precision + combine_guard::(precision); + let l6 = self.iacoth::<6, B, R>(work); + let l99 = self.iacoth::<99, B, R>(work); + (4u8 * l6 + 2u8 * l99).with_precision(precision).value() + } + + /// ln(10) at `precision` base-`B` digits, reusing the cached `L(6)`, `L(99)`, + /// and `L(9)` sub-series. + /// + /// # Panics + /// + /// Panics if `precision` is 0. + pub fn ln10(&mut self, precision: usize) -> FBig { + // log(10) = 3·log(2) + 2·L(9) = 3·(4·L(6) + 2·L(99)) + 2·L(9) + // = 12·L(6) + 6·L(99) + 2·L(9) + // Flattening avoids the intermediate rounding of ln2 inside the product. + let work = precision + combine_guard::(precision); + let l6 = self.iacoth::<6, B, R>(work); + let l99 = self.iacoth::<99, B, R>(work); + let l9 = self.iacoth::<9, B, R>(work); + (12u8 * l6 + 6u8 * l99 + 2u8 * l9) + .with_precision(precision) + .value() + } + + /// ln(B) at `precision` base-`B` digits, reusing the cached ln2 / ln10 where + /// possible. + /// + /// # Panics + /// + /// Panics if `precision` is 0. + pub fn ln_base(&mut self, precision: usize) -> FBig { + match B { + 2 => self.ln2::(precision), + 10 => self.ln10::(precision), + b if b.is_power_of_two() => { + // ln(2^k) = k·ln(2); evaluate ln2 at elevated precision so the + // k·ln2 product survives the final round. + let work = precision + combine_guard::(precision); + let bits = b.trailing_zeros() as usize; + (bits * self.ln2::(work)) + .with_precision(precision) + .value() + } + _ => { + // generic base: no cached L(n) sub-series applies, so compute + // ln(B) directly through Context::ln on the base literal. + let ctx = Context::::new(precision); + ctx.unwrap_fp(ctx.ln::( + &Repr::new(Repr::::BASE.into(), 0), + // no cache for the generic base (its L(n) isn't cached) + None, + )) + } + } + } + + /// Sum of `num_terms` across all populated cache slots. + #[inline] + pub fn total_terms(&self) -> usize { + let sum = |s: &Option| s.as_ref().map_or(0, |s| s.num_terms); + sum(&self.pi) + sum(&self.iacoth_6) + sum(&self.iacoth_9) + sum(&self.iacoth_99) + } + + /// Sum of word counts across all cached big integers (P, Q, T, and the cached + /// `√10005` isqrt). + /// + /// This reflects the underlying storage words used by the cached state. + #[inline] + pub fn total_words(&self) -> usize { + let slot_words = |s: &Option| { + s.as_ref().map_or(0, |s| { + s.p.as_words().len() + s.q.as_words().len() + s.t.as_sign_words().1.len() + }) + }; + slot_words(&self.pi) + + slot_words(&self.iacoth_6) + + slot_words(&self.iacoth_9) + + slot_words(&self.iacoth_99) + + self.sqrt_10005.as_ref().map_or(0, |s| s.as_words().len()) + } + + /// Clear all cached constant state, freeing the underlying memory. + /// + /// After calling `clear()`, the next constant computation will start from scratch + /// rather than extending the prior cached state. + #[inline] + pub fn clear(&mut self) { + self.pi = None; + self.iacoth_6 = None; + self.iacoth_9 = None; + self.iacoth_99 = None; + self.sqrt_10005 = None; + self.sqrt_10005_bits = 0; + } +} + +impl Default for ConstCache { + #[inline] + fn default() -> Self { + Self::new() + } +} + +/// Ensure `slot` holds state for at least `target` terms, then return `(P, Q, T)` +/// covering `target` terms (or more, when an existing higher-precision state +/// already covers `target` — finalize then rounds down to the requested precision). +/// +/// `range_bs(a, b)` computes the leaf-merged state over `[a, b)` and must handle +/// `a == b` by returning the identity `(1, 1, 0)`. +fn extend_or_compute( + slot: &mut Option, + start: usize, + target: usize, + range_bs: F, +) -> (UBig, UBig, IBig) +where + F: Fn(usize, usize) -> (UBig, UBig, IBig), +{ + match slot { + // Already have >= target terms: reuse (extra terms only add precision). + Some(s) if s.num_terms >= target => (s.p.clone(), s.q.clone(), s.t.clone()), + // Have fewer terms: extend the right half [num_terms, target) and merge. + Some(s) => { + let (pr, qr, tr) = range_bs(s.num_terms, target); + let (p, q, t) = merge(&s.p, &s.q, &s.t, &pr, &qr, &tr); + *slot = Some(CachedState { + p: p.clone(), + q: q.clone(), + t: t.clone(), + num_terms: target, + }); + (p, q, t) + } + // Cold: compute from `start`. + None => { + let (p, q, t) = range_bs(start, target); + *slot = Some(CachedState { + p: p.clone(), + q: q.clone(), + t: t.clone(), + num_terms: target, + }); + (p, q, t) + } + } +} + +/// Reborrow an `Option<&mut ConstCache>` so it can be threaded into several +/// sequential sub-calls. `as_deref_mut` is the natural reborrow here; clippy's +/// `needless_option_as_deref` flags it (the deref target equals the referent), +/// so the lint is allowed at this single centralized spot. +#[inline] +#[allow(clippy::needless_option_as_deref)] +pub(crate) fn reborrow_cache<'a>( + cache: &'a mut Option<&mut ConstCache>, +) -> Option<&'a mut ConstCache> { + cache.as_deref_mut() +} + +/// Number of bits needed to represent `precision` base-`B` digits exactly. +/// +/// For power-of-two bases this is exact; for arbitrary bases it uses the upper +/// bound from [`EstimatedLog2`], which is far tighter than `ilog2(B) + 1`. +fn bits_for_precision(precision: usize) -> usize { + if B.is_power_of_two() { + precision.saturating_mul(B.ilog2() as usize) + } else { + // ub ≥ log2(B) with error ≤ 2/256. Multiply in f64 so the product + // is exact for precision up to 2^53. +1 guards float rounding. + let ub = B.log2_bounds().1; + ceil_usize(precision as f32 * ub) + 1 + } +} + +/// Convert a work-precision expressed in bits back to base-`B` digits. +/// +/// For base 2 the identity holds; for power-of-two bases it uses ceiling +/// division; for arbitrary bases it inverts the lower bound from +/// [`EstimatedLog2`] to get a tight ceiling. +fn precision_for_bits(bits: usize) -> usize { + if B.is_power_of_two() { + let log2 = B.ilog2() as usize; + (bits + log2 - 1) / log2 + } else { + // lb ≤ log2(B), so 1/lb ≥ 1/log2(B). +1 guards float rounding. + let lb = B.log2_bounds().0; + ceil_usize(bits as f32 / lb) + 1 + } +} + +/// Guard digits added when combining sub-series, large enough that the linear +/// combination and its final round to `precision` are unaffected by summation +/// rounding (a few digits cover the constant multipliers and term count). +fn combine_guard(precision: usize) -> usize { + ceil_usize(precision.log2_est() / B.log2_est()) + 4 +} + +impl fmt::Debug for ConstCache { + fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { + f.debug_struct("ConstCache") + .field("pi", &DebugSlot(&self.pi)) + .field("iacoth_6", &DebugSlot(&self.iacoth_6)) + .field("iacoth_9", &DebugSlot(&self.iacoth_9)) + .field("iacoth_99", &DebugSlot(&self.iacoth_99)) + .finish() + } +} + +/// Newtype so we can implement `Debug` for `&Option` via the +/// big-integer `Debug` formatters. +struct DebugSlot<'a>(&'a Option); + +impl fmt::Debug for DebugSlot<'_> { + fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { + match self.0 { + Some(s) => f + .debug_struct("CachedState") + .field("num_terms", &s.num_terms) + .field("p", &s.p) + .field("q", &s.q) + .field("t", &s.t) + .finish(), + None => f.write_str("None"), + } + } +} + +#[cfg(test)] +mod tests { + use super::*; + use crate::round::mode; + use alloc::format; + + #[test] + fn test_pi_matches_context() { + // Cache miss must reproduce Context::pi exactly. + for &precision in &[10usize, 50, 100] { + let mut cache = ConstCache::new(); + let cached = cache.pi::<10, mode::HalfEven>(precision).value(); + let direct = Context::::new(precision) + .pi::<10>(None) + .value(); + assert_eq!(cached, direct, "pi mismatch at precision {precision}"); + } + } + + #[test] + fn test_pi_lower_precision_reuses() { + // Compute at high precision, then a lower-precision request must round + // down from the cached state and still be correct. + let mut cache = ConstCache::new(); + let _pi_high = cache.pi::<10, mode::HalfEven>(200).value(); + // the slot now holds >=200 terms; a 50-digit request reuses it + let pi_50 = cache.pi::<10, mode::HalfEven>(50).value(); + let direct = Context::::new(50).pi::<10>(None).value(); + assert_eq!(pi_50, direct); + } + + #[test] + fn test_pi_extension_matches_scratch() { + // Extending 100 -> 1000 must be bit-identical to a from-scratch 1000-digit compute. + let mut cache = ConstCache::new(); + let _pi_100 = cache.pi::<10, mode::HalfAway>(100).value(); + let pi_1000_extended = cache.pi::<10, mode::HalfAway>(1000).value(); + + let direct = Context::::new(1000).pi::<10>(None).value(); + assert_eq!(pi_1000_extended, direct); + } + + #[test] + fn test_iacoth_matches_context() { + let mut cache = ConstCache::new(); + // ln2 / ln10 via cache must match ln(2)/ln(10) computed independently + // through Context::ln (a different, atanh-based algorithm) at several precisions. + for &precision in &[20usize, 45, 80] { + let cached_ln2 = cache + .ln2::<10, mode::Zero>(precision) + .with_precision(precision) + .value(); + let ln2_ctx = Context::::new(precision); + let direct_ln2 = + ln2_ctx.unwrap_fp(ln2_ctx.ln::<10>(&Repr::new(2.into(), 0), None)); + assert_eq!(cached_ln2, direct_ln2, "ln2 mismatch at precision {precision}"); + + let cached_ln10 = cache + .ln10::<10, mode::Zero>(precision) + .with_precision(precision) + .value(); + let ln10_ctx = Context::::new(precision); + let direct_ln10 = + ln10_ctx.unwrap_fp(ln10_ctx.ln::<10>(&Repr::new(10.into(), 0), None)); + assert_eq!(cached_ln10, direct_ln10, "ln10 mismatch at precision {precision}"); + } + } + + #[test] + fn test_iacoth_extension_matches_scratch() { + // Extend ln2 from low to high precision; result must match from-scratch. + let mut cache = ConstCache::new(); + let _ln2_low = cache.ln2::<10, mode::HalfAway>(20); + let ln2_high = cache.ln2::<10, mode::HalfAway>(120); + + let mut fresh = ConstCache::new(); + let direct = fresh.ln2::<10, mode::HalfAway>(120); + assert_eq!(ln2_high, direct); + } + + #[test] + fn test_ln_base() { + // binary base: ln(base) == ln(2) + let mut cache = ConstCache::new(); + let ln_base = cache.ln_base::<2, mode::HalfAway>(50); + let ln2 = cache.ln2::<2, mode::HalfAway>(50); + assert_eq!(ln_base, ln2); + + // power-of-two base: ln(8) = 3·ln(2) + let ln8 = cache.ln_base::<8, mode::HalfAway>(50); + let expected = 3u8 * cache.ln2::<8, mode::HalfAway>(50); + assert_eq!(ln8.with_precision(50).value(), expected.with_precision(50).value()); + } + + #[test] + fn test_debug_shows_bigint_head_tail() { + let mut cache = ConstCache::new(); + let _ = cache.pi::<10, mode::HalfAway>(100); + let s = format!("{:?}", cache); + assert!(s.contains("pi")); + assert!(s.contains("num_terms")); + // UBig/IBig Debug prints head..tail, so the output stays compact + assert!(s.contains(".."), "Debug output should use head..tail truncation"); + assert!(s.len() < 512); + } + + #[test] + fn test_sqrt_10005_cached_and_counted() { + // Computing π caches the base-free √10005 isqrt; total_words counts it, and + // clear() frees it. + let mut cache = ConstCache::new(); + assert_eq!(cache.total_terms(), 0); + assert_eq!(cache.total_words(), 0); + + let _ = cache.pi::<10, mode::HalfAway>(200); + // the isqrt is now cached (total_terms stays series-only; words include isqrt) + assert!(cache.total_words() > 0); + + cache.clear(); + assert_eq!(cache.total_terms(), 0); + assert_eq!(cache.total_words(), 0); + + // after clear, π recomputes from scratch and still matches the direct value + let direct = Context::::new(50).pi::<10>(None).value(); + let after_clear = cache.pi::<10, mode::HalfAway>(50).value(); + assert_eq!(after_clear, direct); + } + + #[test] + fn test_sqrt_10005_reuse_higher_precision() { + // A high-precision π call caches a high-bit isqrt; a later lower-precision + // call must reuse it (no recompute) and still be correct. + let mut cache = ConstCache::new(); + let _high = cache.pi::<2, mode::HalfEven>(1000); + let words_after_high = cache.total_words(); + + let low = cache.pi::<2, mode::HalfEven>(100).value(); + // word count unchanged ⇒ isqrt (and series) were reused, not recomputed + assert_eq!(cache.total_words(), words_after_high); + + let direct = Context::::new(100).pi::<2>(None).value(); + assert_eq!(low, direct); + } +} diff --git a/float/src/math/consts.rs b/float/src/math/consts.rs index ee680b3..bdb0955 100644 --- a/float/src/math/consts.rs +++ b/float/src/math/consts.rs @@ -1,6 +1,7 @@ use crate::{ error::assert_limited_precision, fbig::FBig, + math::cache::ConstCache, repr::{Context, Word}, round::{Round, Rounded}, }; @@ -19,10 +20,12 @@ impl Context { /// terms into large products, it allows the library to leverage fast /// multiplication algorithms (like Toom-3 or FFT) as the numbers grow, /// leading to significant performance gains over simple iterative summation. - /// - /// // TODO: consider adding a static cache for π at common precisions. #[must_use] - pub fn pi(&self) -> Rounded> { + pub fn pi(&self, cache: Option<&mut ConstCache>) -> Rounded> { + if let Some(c) = cache { + return c.pi::(self.precision); + } + assert_limited_precision(self.precision); // Calculate required bits based on target precision in base B. @@ -54,9 +57,9 @@ impl Context { let q_f = work_context.convert_int::(q.into()).value(); let t_f = work_context.convert_int::(t).value(); - let sqrt_10005 = work_context - .sqrt(&work_context.convert_int::(10005.into()).value().repr) - .value(); + let sqrt_10005 = work_context.unwrap_fp(work_context.sqrt( + &work_context.convert_int::(10005.into()).value().repr, + )); let constant = work_context.convert_int::(426_880.into()).value(); let pi = (constant * sqrt_10005 * q_f) / t_f; @@ -64,20 +67,48 @@ impl Context { } } +/// Universal binary-splitting merge: +/// `combine((P_l,Q_l,T_l), (P_r,Q_r,T_r)) = (P_l·P_r, Q_l·Q_r, T_l·Q_r + P_l·T_r)`. +/// +/// This operation is associative, so the `(P, Q, T)` for a range is independent of +/// how the recursion splits it — which is exactly what lets a cached partial tree +/// state be extended by merging in a freshly computed right half. +pub(crate) fn merge( + pl: &UBig, + ql: &UBig, + tl: &IBig, + pr: &UBig, + qr: &UBig, + tr: &IBig, +) -> (UBig, UBig, IBig) { + let p = pl * pr; + let q = ql * qr; + // re-interpret the magnitudes as signed without cloning the big integers + let t = qr.as_ibig() * tl + pl.as_ibig() * tr; + (p, q, t) +} + /// Binary splitting implementation for the Chudnovsky series. -/// Returns (P, Q, T) for the range [a, b). -fn chudnovsky_bs(a: usize, b: usize) -> (UBig, UBig, IBig) { +/// Returns `(P, Q, T)` for the range `[a, b)`. An empty range `[a, a)` yields the +/// identity `(1, 1, 0)`, so callers may safely merge a cached left state with a +/// right half that starts exactly where the left one ended. +pub(crate) fn chudnovsky_bs(a: usize, b: usize) -> (UBig, UBig, IBig) { + if a >= b { + return (UBig::ONE, UBig::ONE, IBig::ZERO); + } if b - a == 1 { + const COEFF1: IBig = IBig::from_parts_const(Sign::Positive, 13591409); + const COEFF2: IBig = IBig::from_parts_const(Sign::Positive, 545140134); + // Base case: calculate single term if a == 0 { - return (UBig::ONE, UBig::ONE, IBig::from_parts_const(Sign::Positive, 13_591_409)); + return (UBig::ONE, UBig::ONE, COEFF1); } let k = a as u64; let p = UBig::from(6 * k - 5) * (2 * k - 1) * (6 * k - 1); - let q = UBig::from(k).pow(3) * UBig::from_u64(10_939_058_860_032_000); - let t_val = IBig::from_parts_const(Sign::Positive, 13_591_409) - + IBig::from_parts_const(Sign::Positive, 545_140_134) * k; + let q = UBig::from(k).pow(3) * 10_939_058_860_032_000u64; + let t_val = COEFF1 + COEFF2 * k; let t_abs = &p * t_val.unsigned_abs(); let t = IBig::from(t_abs) * Sign::from(a % 2 == 1); return (p, q, t); @@ -88,18 +119,136 @@ fn chudnovsky_bs(a: usize, b: usize) -> (UBig, UBig, IBig) { let (p_l, q_l, t_l) = chudnovsky_bs(a, mid); let (p_r, q_r, t_r) = chudnovsky_bs(mid, b); - let p = &p_l * &p_r; - let q = &q_l * &q_r; - // T = T_L * Q_R + T_R * P_L - let t = IBig::from(q_r) * t_l + IBig::from(p_l) * t_r; - (p, q, t) + // T = T_L * Q_R + T_R * P_L (the universal merge) + merge(&p_l, &q_l, &t_l, &p_r, &q_r, &t_r) +} + +/// Binary splitting for `L(n) = acoth(n) = Σ_{k≥0} 1/(n^{2k+1}(2k+1))` over `[1, b)`. +/// +/// Term ratio (k≥1): `r_k/r_{k-1} = p_k/q_k` with `p_k = 2k-1`, `q_k = (2k+1)·n²`. +/// The `k=0` term `r_0 = 1/n` is pulled out; over `[1, b)` the tree state satisfies +/// `T/Q = n · Σ_{k=1}^{b-1} 1/((2k+1)·n^{2k+1})`, hence `L(n) = (Q + T)/(n·Q)`. +/// +/// Using the ratio form (rather than independent `1/q_k` terms) keeps +/// `Q = Π(2k+1)·n²` at O(p) digits: each leaf multiplies only small integers +/// (with `n²` folded in), no `n.pow(2k+1)` per leaf. +pub(crate) fn iacoth_bs(n: u32, a: usize, b: usize) -> (UBig, UBig, IBig) { + debug_assert!(a >= 1, "iacoth_bs leaf index must be >= 1"); + if a >= b { + return (UBig::ONE, UBig::ONE, IBig::ZERO); // identity + } + // Precomputed initial block [1, 1+K): skip its K leaves on every fresh + // computation. Because the merge is associative, the constant triple is + // identical to the recursively computed state regardless of split order. + // It only applies at the series start (a == 1); recursive/extend calls have + // a >= 1 + K and never reach this branch. + if a == 1 { + if let Some((k, p0, q0, t0)) = iacoth_initial_block(n) { + // the precomputed block covers [1, 1+k); use it when [a, b) reaches + // past its end (b > k) + if b > k { + let (pr, qr, tr) = iacoth_bs(n, 1 + k, b); + return merge(&p0, &q0, &t0, &pr, &qr, &tr); + } + } + } + if b - a == 1 { + // leaf at k = a (a >= 1): (p_a, q_a, p_a), p_a = 2a-1, q_a = (2a+1)·n² + let pa = UBig::from(2 * a - 1); + let n2 = UBig::from(n).pow(2); + let qa = UBig::from(2 * a + 1) * n2; + let ta = IBig::from(pa.clone()); + return (pa, qa, ta); + } + let mid = (a + b) / 2; + let (pl, ql, tl) = iacoth_bs(n, a, mid); + let (pr, qr, tr) = iacoth_bs(n, mid, b); + merge(&pl, &ql, &tl, &pr, &qr, &tr) // universal merge +} + +/// Precomputed binary-splitting state for `L(n) = acoth(n)` over the first `K` +/// terms `[1, 1+K)`, stored as `(K, P, Q, T)`. `K` is chosen (per `n`) so that +/// `P`, `Q` and `|T|` each fit in a `u32`. Since `DoubleWord` is `u32`/`u64`/`u128` +/// for `Word = u16`/`u32`/`u64`, a `u32`-sized literal is accepted by +/// [`UBig::from_dword`] / [`IBig::from_parts_const`] on **every** configuration — +/// so this single set of constants is portable without needing to detect the +/// `Word` width (which is internal to `dashu-int`). The constants also use the +/// inline small-integer representation, so instantiating them never allocates. +/// +/// Only the sub-series that back ln2 / ln10 (`n ∈ {6, 9, 99}`) are precomputed; +/// π cannot use this trick because its 2-term `T` already overflows a `u32`. +fn iacoth_initial_block(n: u32) -> Option<(usize, UBig, UBig, IBig)> { + match n { + // L(6) over [1, 5): 4 leaves. + 6 => Some(IACOTH_6_INITIAL), + // L(9) over [1, 4): 3 leaves. + 9 => Some(IACOTH_9_INITIAL), + // L(99) over [1, 3): 2 leaves. + 99 => Some(IACOTH_99_INITIAL), + _ => None, + } } +/// `(K, P, Q, T)` for `L(6)` over `[1, 5)` (4 leaves). +const IACOTH_6_INITIAL: (usize, UBig, UBig, IBig) = ( + 4, + UBig::from_word(105), + UBig::from_dword(1587237120), + IBig::from_parts_const(Sign::Positive, 14946549), +); +/// `(K, P, Q, T)` for `L(9)` over `[1, 4)` (3 leaves). +const IACOTH_9_INITIAL: (usize, UBig, UBig, IBig) = ( + 3, + UBig::from_word(15), + UBig::from_dword(55801305), + IBig::from_parts_const(Sign::Positive, 231351), +); +/// `(K, P, Q, T)` for `L(99)` over `[1, 3)` (2 leaves). +const IACOTH_99_INITIAL: (usize, UBig, UBig, IBig) = ( + 2, + UBig::from_word(3), + UBig::from_dword(1440894015), + IBig::from_parts_const(Sign::Positive, 49008), +); + impl FBig { /// Calculate π with the given precision and the default rounding mode. #[inline] #[must_use] pub fn pi(precision: usize) -> Self { - Context::::new(precision).pi().value() + Context::::new(precision).pi(None).value() + } +} + +#[cfg(test)] +mod tests { + use super::*; + + /// Independently (left-fold) merge the first `k` leaves of `L(n)` and check + /// that the result matches the precomputed constant triple. Guards against + /// transcription errors in the `IACOTH_*_INITIAL` literals; correctness of the + /// finalized `L(n)` values is covered by `log::tests` and `cache::tests`. + #[test] + fn test_iacoth_initial_blocks() { + fn check(n: u32, expected: &(usize, UBig, UBig, IBig)) { + let k = expected.0; + // independently (left-fold) merge the first k leaves + let mut acc = (UBig::ONE, UBig::ONE, IBig::ZERO); + for kk in 1..=k { + let pa = UBig::from(2 * kk as u64 - 1); + let qa = UBig::from(2 * kk as u64 + 1) * UBig::from(n).pow(2); + let ta = IBig::from(pa.clone()); + let (p, q, t) = merge(&acc.0, &acc.1, &acc.2, &pa, &qa, &ta); + acc = (p, q, t); + } + assert_eq!(acc.0, expected.1, "P mismatch for n={n}"); + assert_eq!(acc.1, expected.2, "Q mismatch for n={n}"); + assert_eq!(acc.2, expected.3, "T mismatch for n={n}"); + // the seed branch must reproduce the same state as the full recursion + assert_eq!(iacoth_bs(n, 1, 1 + k), (acc.0, acc.1, acc.2)); + } + check(6, &IACOTH_6_INITIAL); + check(9, &IACOTH_9_INITIAL); + check(99, &IACOTH_99_INITIAL); } } diff --git a/float/src/math/mod.rs b/float/src/math/mod.rs index 1c0e431..1fda718 100644 --- a/float/src/math/mod.rs +++ b/float/src/math/mod.rs @@ -1,86 +1,7 @@ //! Advanced mathematical functions -use crate::{ - error::{panic_infinite, panic_nan, panic_overflow, panic_underflow}, - fbig::FBig, - repr::{Context, Repr, Word}, - round::{Round, Rounded}, -}; - +pub mod cache; pub mod consts; pub mod trig; -/// The result of an advanced mathematical operation. -/// -/// This enum is used to handle non-finite results (NaN, Infinite) and -/// boundary conditions (Overflow, Underflow) without panicking, -/// as the core [`FBig`] type only represents finite numbers. -/// -/// Finite results are wrapped in a [Rounded] to preserve rounding information. -#[derive(Clone, Debug, PartialEq, Eq)] -pub enum FpResult { - Normal(Rounded>), - Overflow, - Underflow, - NaN, - /// An exact infinite result is obtained from finite inputs, such as - /// divide by zero or logarithm of zero. - Infinite, -} - -impl FpResult { - /// Convert the result into an [`FBig`] with the given context. - /// - /// # Panics - /// Panics if the result is not `Normal`. - #[inline] - #[must_use] - pub fn value(self, context: &Context) -> FBig { - match self { - Self::Normal(rounded) => FBig::new(rounded.value(), *context), - Self::NaN => panic_nan(), - Self::Infinite => panic_infinite(), - Self::Overflow => panic_overflow(), - Self::Underflow => panic_underflow(), - } - } - - /// Convert the result into an optional [`FBig`] with the given context. - /// Returns `None` if the result is not `Normal`. - #[inline] - #[must_use] - pub fn ok(self, context: &Context) -> Option>> { - match self { - Self::Normal(rounded) => Some(rounded.map(|repr| FBig::new(repr, *context))), - _ => None, - } - } - - /// Returns `true` if the result is `NaN`. - #[inline] - #[must_use] - pub const fn is_nan(&self) -> bool { - matches!(self, Self::NaN) - } - - /// Returns `true` if the result is `Infinite`. - #[inline] - #[must_use] - pub const fn is_infinite(&self) -> bool { - matches!(self, Self::Infinite) - } - - /// Returns `true` if the result is a normal finite value. - #[inline] - #[must_use] - pub const fn is_normal(&self) -> bool { - matches!(self, Self::Normal(_)) - } - - /// Returns `true` if the result is a finite value (Normal, Overflow, or Underflow). - #[inline] - #[must_use] - pub const fn is_finite(&self) -> bool { - matches!(self, Self::Normal(_) | Self::Overflow | Self::Underflow) - } -} +pub use crate::error::{FpError, FpResult}; diff --git a/float/src/math/trig.rs b/float/src/math/trig.rs index 457e8cd..0bf7a27 100644 --- a/float/src/math/trig.rs +++ b/float/src/math/trig.rs @@ -1,9 +1,12 @@ use crate::{ - error::assert_limited_precision, + error::{assert_limited_precision, FpError}, fbig::FBig, - math::FpResult, + math::{ + cache::{reborrow_cache, ConstCache}, + FpResult, + }, repr::{Context, Repr, Word}, - round::Round, + round::{Round, Rounded}, }; use core::cmp::Ordering; use core::convert::TryFrom; @@ -18,6 +21,20 @@ enum Quadrant { Fourth, } +/// Build a `Normal` result equal to `±0`, preserving the sign of `x` (used by `sin`/`tan`/`sin_cos` +/// at zero input, where `sin(-0) = -0` and `tan(-0) = -0`). +fn signed_zero_normal( + ctx: &Context, + x: &Repr, +) -> FpResult> { + let zero = if x.is_neg_zero() { + Repr::neg_zero() + } else { + Repr::zero() + }; + Ok(FBig::::new(zero, *ctx).with_precision(ctx.precision)) +} + impl Context { /// Calculate the internal work context for trigonometric functions based on input magnitude. /// @@ -40,11 +57,15 @@ impl Context { /// Reduces the argument to the first quadrant for trigonometric evaluation. /// Returns the internal work context, the reduced argument `r`, and the quadrant `k % 4`. - fn reduce_to_quadrant(self, x: &Repr) -> (Self, FBig, Quadrant) { + fn reduce_to_quadrant( + self, + x: &Repr, + mut cache: Option<&mut ConstCache>, + ) -> (Self, FBig, Quadrant) { let work_context = self.compute_work_context_trig(x); let x_f = FBig::::new(work_context.repr_round(x.clone()).value(), work_context); - let pi = work_context.pi::().value(); + let pi = work_context.pi::(reborrow_cache(&mut cache)).value(); let half_pi = &pi / 2; let x_scaled: FBig = &x_f / &half_pi; let k_f = x_scaled.round(); @@ -71,19 +92,22 @@ impl Context { } /// Calculate the sine of the floating point representation. - #[must_use] - pub fn sin(&self, x: &Repr) -> FpResult { + pub fn sin( + &self, + x: &Repr, + mut cache: Option<&mut ConstCache>, + ) -> FpResult> { if x.is_infinite() { - return FpResult::NaN; + return Err(FpError::InfiniteInput); } assert_limited_precision(self.precision); - if x.is_zero() { - let res = FBig::::ZERO.with_precision(self.precision); - return FpResult::Normal(res.map(|v| v.repr)); + if x.significand.is_zero() { + // sin(±0) = ±0 + return signed_zero_normal(self, x); } - let (work_context, r, quadrant) = self.reduce_to_quadrant(x); + let (work_context, r, quadrant) = self.reduce_to_quadrant(x, reborrow_cache(&mut cache)); // 3. Evaluate the reduced series based on the quadrant let res = match quadrant { @@ -92,7 +116,7 @@ impl Context { Quadrant::Third => -work_context.sin_internal(&r), Quadrant::Fourth => -work_context.cos_internal(&r), }; - FpResult::Normal(res.with_precision(self.precision).map(|v| v.repr)) + Ok(res.with_precision(self.precision)) } /// Internal Taylor series for sine: S(x) = x - x^3/3! + x^5/5! - ... @@ -122,19 +146,22 @@ impl Context { } /// Calculate the cosine of the floating point representation. - #[must_use] - pub fn cos(&self, x: &Repr) -> FpResult { + pub fn cos( + &self, + x: &Repr, + mut cache: Option<&mut ConstCache>, + ) -> FpResult> { if x.is_infinite() { - return FpResult::NaN; + return Err(FpError::InfiniteInput); } assert_limited_precision(self.precision); - if x.is_zero() { - let res = FBig::::ONE.with_precision(self.precision); - return FpResult::Normal(res.map(|v| v.repr)); + if x.significand.is_zero() { + // cos(±0) = 1 + return Ok(FBig::::ONE.with_precision(self.precision)); } - let (work_context, r, quadrant) = self.reduce_to_quadrant(x); + let (work_context, r, quadrant) = self.reduce_to_quadrant(x, reborrow_cache(&mut cache)); // 3. Evaluate the reduced series based on the quadrant let res = match quadrant { @@ -143,7 +170,7 @@ impl Context { Quadrant::Third => -work_context.cos_internal(&r), Quadrant::Fourth => work_context.sin_internal(&r), }; - FpResult::Normal(res.with_precision(self.precision).map(|v| v.repr)) + Ok(res.with_precision(self.precision)) } /// Internal Taylor series for cosine: C(x) = 1 - x^2/2! + x^4/4! - ... @@ -175,20 +202,24 @@ impl Context { /// Calculate both the sine and cosine of the floating point representation. /// /// This is more efficient than calling `sin` and `cos` separately. - #[must_use] - pub fn sin_cos(&self, x: &Repr) -> (FpResult, FpResult) { + pub fn sin_cos( + &self, + x: &Repr, + mut cache: Option<&mut ConstCache>, + ) -> (FpResult>, FpResult>) { if x.is_infinite() { - return (FpResult::NaN, FpResult::NaN); + return (Err(FpError::InfiniteInput), Err(FpError::InfiniteInput)); } assert_limited_precision(self.precision); - if x.is_zero() { - let s = FBig::::ZERO.with_precision(self.precision); - let c = FBig::::ONE.with_precision(self.precision); - return (FpResult::Normal(s.map(|v| v.repr)), FpResult::Normal(c.map(|v| v.repr))); + if x.significand.is_zero() { + // sin(±0) = ±0, cos(±0) = 1 + let s = signed_zero_normal(self, x); + let c = Ok(FBig::::ONE.with_precision(self.precision)); + return (s, c); } - let (work_context, r, quadrant) = self.reduce_to_quadrant(x); + let (work_context, r, quadrant) = self.reduce_to_quadrant(x, reborrow_cache(&mut cache)); let (sin_r, cos_r) = work_context.sin_cos_internal(&r); @@ -199,10 +230,7 @@ impl Context { Quadrant::Fourth => (-cos_r, sin_r), }; - ( - FpResult::Normal(s.with_precision(self.precision).map(|v| v.repr)), - FpResult::Normal(c.with_precision(self.precision).map(|v| v.repr)), - ) + (Ok(s.with_precision(self.precision)), Ok(c.with_precision(self.precision))) } /// Simultaneously evaluate Taylor series for sine and cosine. @@ -247,20 +275,23 @@ impl Context { /// Calculate the tangent of the floating point representation. /// /// # Note - /// Near odd multiples of π/2, the result returns `Infinite`. - #[must_use] - pub fn tan(&self, x: &Repr) -> FpResult { + /// Near odd multiples of π/2, the result is an infinity (returned as a value, not an error). + pub fn tan( + &self, + x: &Repr, + mut cache: Option<&mut ConstCache>, + ) -> FpResult> { if x.is_infinite() { - return FpResult::NaN; + return Err(FpError::InfiniteInput); } assert_limited_precision(self.precision); - if x.is_zero() { - let res = FBig::::ZERO.with_precision(self.precision); - return FpResult::Normal(res.map(|v| v.repr)); + if x.significand.is_zero() { + // tan(±0) = ±0 + return signed_zero_normal(self, x); } - let (work_context, r, quadrant) = self.reduce_to_quadrant(x); + let (work_context, r, quadrant) = self.reduce_to_quadrant(x, reborrow_cache(&mut cache)); let (sin_r, cos_r) = work_context.sin_cos_internal(&r); let (s_f, c_f) = match quadrant { @@ -271,27 +302,37 @@ impl Context { }; if c_f.repr.is_zero() { - return FpResult::Infinite; + // tan hits a pole: the result is an infinity with the sign of the numerator. + let inf = if s_f.sign() == Sign::Negative { + Repr::neg_infinity() + } else { + Repr::infinity() + }; + return Ok(Rounded::Exact(FBig::new(inf, *self))); } - FpResult::Normal(self.div(&s_f.repr, &c_f.repr).map(|v| v.repr)) + self.div(&s_f.repr, &c_f.repr) + .map(|r| r.and_then(|f| f.with_precision(self.precision))) } /// Calculate the arcsine of the floating point representation. /// /// # Methodology /// Uses the identity: `asin(x) = atan(x / sqrt(1 - x^2))` - /// Returns `NaN` if `|x| > 1`. - #[must_use] - pub fn asin(&self, x: &Repr) -> FpResult { + /// Returns `Err(OutOfDomain)` if `|x| > 1`. + pub fn asin( + &self, + x: &Repr, + mut cache: Option<&mut ConstCache>, + ) -> FpResult> { if x.is_infinite() { - return FpResult::NaN; + return Err(FpError::InfiniteInput); } assert_limited_precision(self.precision); let x_orig = FBig::::new(x.clone(), *self); // Domain check: |x| must be <= 1 if x_orig.abs_cmp(&FBig::ONE).is_gt() { - return FpResult::NaN; + return Err(FpError::OutOfDomain); } let guard_digits = 50; @@ -300,17 +341,21 @@ impl Context { let x_f = FBig::::new(work_context.repr_round(x.clone()).value(), work_context); - let res = work_context.asin_internal(&x_f); - FpResult::Normal(res.with_precision(self.precision).map(|v| v.repr)) + let res = work_context.asin_internal(&x_f, reborrow_cache(&mut cache)); + Ok(res.with_precision(self.precision)) } - fn asin_internal(self, x_f: &FBig) -> FBig { + fn asin_internal( + self, + x_f: &FBig, + mut cache: Option<&mut ConstCache>, + ) -> FBig { let one = FBig::::ONE.with_precision(self.precision).value(); let x2 = x_f.sqr(); - let d = self.sqrt(&(one - x2).repr).value(); + let d = self.unwrap_fp(self.sqrt(&(one - x2).repr)); if d.repr.is_zero() { - let pi = self.pi::().value(); + let pi = self.pi::(reborrow_cache(&mut cache)).value(); let half_pi: FBig = pi / 2; if x_f.sign() == Sign::Positive { return half_pi; @@ -318,7 +363,7 @@ impl Context { return -half_pi; } - self.atan_with_reduction(&(x_f / d)) + self.atan_with_reduction(&(x_f / d), reborrow_cache(&mut cache)) } /// Calculate the arccosine of the floating point representation. @@ -326,17 +371,20 @@ impl Context { /// # Methodology /// Uses the identity: `acos(x) = pi/2 - asin(x)`. /// Higher precision is used internally to avoid catastrophic cancellation near x ≈ 1. - #[must_use] - pub fn acos(&self, x: &Repr) -> FpResult { + pub fn acos( + &self, + x: &Repr, + mut cache: Option<&mut ConstCache>, + ) -> FpResult> { if x.is_infinite() { - return FpResult::NaN; + return Err(FpError::InfiniteInput); } assert_limited_precision(self.precision); let x_orig = FBig::::new(x.clone(), *self); // Domain check: |x| must be <= 1 if x_orig.abs_cmp(&FBig::ONE).is_gt() { - return FpResult::NaN; + return Err(FpError::OutOfDomain); } let guard_digits = 50; @@ -345,32 +393,36 @@ impl Context { let x_f = FBig::::new(work_context.repr_round(x.clone()).value(), work_context); - let asin_x = work_context.asin_internal(&x_f); - let pi = work_context.pi::().value(); + let asin_x = work_context.asin_internal(&x_f, reborrow_cache(&mut cache)); + let pi = work_context.pi::(reborrow_cache(&mut cache)).value(); let half_pi: FBig = pi / 2; let res: FBig = half_pi - asin_x; - FpResult::Normal(res.with_precision(self.precision).map(|v| v.repr)) + Ok(res.with_precision(self.precision)) } /// Calculate the arctangent of the floating point representation. - #[must_use] - pub fn atan(&self, x: &Repr) -> FpResult { + pub fn atan( + &self, + x: &Repr, + mut cache: Option<&mut ConstCache>, + ) -> FpResult> { if x.is_infinite() { - let pi = self.pi::().value(); + // atan(±inf) = ±π/2 — preserved (a well-defined finite result for an infinite input) + let pi = self.pi::(reborrow_cache(&mut cache)).value(); let half_pi: FBig = pi / 2; let res: FBig = if x.sign() == Sign::Positive { half_pi } else { -half_pi }; - return FpResult::Normal(res.with_precision(self.precision).map(|v| v.repr)); + return Ok(res.with_precision(self.precision)); } assert_limited_precision(self.precision); - if x.is_zero() { - let res = FBig::::ZERO.with_precision(self.precision); - return FpResult::Normal(res.map(|v| v.repr)); + if x.significand.is_zero() { + // atan(±0) = ±0 + return signed_zero_normal(self, x); } let guard_digits = 50; @@ -378,19 +430,23 @@ impl Context { let work_context = Self::new(work_precision); let x_f = FBig::::new(work_context.repr_round(x.clone()).value(), work_context); - let res = work_context.atan_with_reduction(&x_f); - FpResult::Normal(res.with_precision(self.precision).map(|v| v.repr)) + let res = work_context.atan_with_reduction(&x_f, reborrow_cache(&mut cache)); + Ok(res.with_precision(self.precision)) } /// Internal arctangent that includes range reduction but no guard digit allocation. - fn atan_with_reduction(self, x_f: &FBig) -> FBig { + fn atan_with_reduction( + self, + x_f: &FBig, + mut cache: Option<&mut ConstCache>, + ) -> FBig { let sign = x_f.sign(); let mut x_abs = x_f.clone(); if sign == Sign::Negative { x_abs = -x_abs; } let mut res = if x_abs >= FBig::::ONE.with_precision(self.precision).value() { - let pi = self.pi::().value(); + let pi = self.pi::(reborrow_cache(&mut cache)).value(); let inv_x = FBig::::ONE.with_precision(self.precision).value() / x_abs; (pi / 2) - self.atan_internal(&inv_x) } else { @@ -429,11 +485,15 @@ impl Context { /// Calculate the arctangent of y / x. /// /// Handles signed infinities according to IEEE 754 standards. - /// Returns `NaN` if both arguments are zero. - #[must_use] - pub fn atan2(&self, y: &Repr, x: &Repr) -> FpResult { - if y.is_zero() && x.is_zero() { - return FpResult::NaN; + /// Returns `Err(OutOfDomain)` if both arguments are zero. + pub fn atan2( + &self, + y: &Repr, + x: &Repr, + mut cache: Option<&mut ConstCache>, + ) -> FpResult> { + if y.is_finite() && x.is_finite() && y.significand.is_zero() && x.significand.is_zero() { + return Err(FpError::OutOfDomain); } assert_limited_precision(self.precision); @@ -446,29 +506,49 @@ impl Context { if y.is_infinite() || x.is_infinite() { let (sy, sx) = (y.sign() == Sign::Positive, x.sign() == Sign::Positive); let res: FBig = match (y.is_infinite(), x.is_infinite(), sy, sx) { - (true, true, true, true) => work_context.pi::().value() / 4, - (true, true, true, false) => work_context.pi::().value() * 3 / 4, + (true, true, true, true) => { + work_context.pi::(reborrow_cache(&mut cache)).value() / 4 + } + (true, true, true, false) => { + work_context.pi::(reborrow_cache(&mut cache)).value() * 3 / 4 + } (true, true, false, true) => { - let pi4: FBig = work_context.pi::().value() / 4; + let pi4: FBig = + work_context.pi::(reborrow_cache(&mut cache)).value() / 4; -pi4 } (true, true, false, false) => { - let pi34: FBig = work_context.pi::().value() * 3 / 4; + let pi34: FBig = + work_context.pi::(reborrow_cache(&mut cache)).value() * 3 / 4; -pi34 } - (true, false, true, _) => work_context.pi::().value() / 2, + (true, false, true, _) => { + work_context.pi::(reborrow_cache(&mut cache)).value() / 2 + } (true, false, false, _) => { - let half_pi: FBig = work_context.pi::().value() / 2; + let half_pi: FBig = + work_context.pi::(reborrow_cache(&mut cache)).value() / 2; -half_pi } - (false, true, _, true) => FBig::::ZERO.with_precision(work_precision).value(), - (false, true, true, false) => work_context.pi::().value(), - (false, true, false, false) => -work_context.pi::().value(), + (false, true, _, true) => { + // atan2(±finite, +inf) = ±0 (signed zero of y) + if sy { + FBig::::ZERO.with_precision(work_precision).value() + } else { + FBig::::new(Repr::neg_zero(), work_context) + .with_precision(work_precision) + .value() + } + } + (false, true, true, false) => { + work_context.pi::(reborrow_cache(&mut cache)).value() + } + (false, true, false, false) => { + -work_context.pi::(reborrow_cache(&mut cache)).value() + } _ => unreachable!(), }; - // Note: atan2(finite, +inf) returns unsigned ZERO. IEEE 754 requires signed zero, - // but `Repr` does not distinguish signed zero. - return FpResult::Normal(res.with_precision(self.precision).map(|v| v.repr)); + return Ok(res.with_precision(self.precision)); } let y_f = FBig::::new(work_context.repr_round(y.clone()).value(), work_context); @@ -476,29 +556,31 @@ impl Context { match x_f.cmp(&FBig::::ZERO) { Ordering::Greater => { - let res = work_context.atan_with_reduction(&(y_f / x_f)); - FpResult::Normal(res.with_precision(self.precision).map(|v| v.repr)) + let res = + work_context.atan_with_reduction(&(y_f / x_f), reborrow_cache(&mut cache)); + Ok(res.with_precision(self.precision)) } Ordering::Less => { - let pi = work_context.pi::().value(); + let pi = work_context.pi::(reborrow_cache(&mut cache)).value(); let y_sign = y_f.sign(); - let atan_yx = work_context.atan_with_reduction(&(y_f / x_f)); + let atan_yx = + work_context.atan_with_reduction(&(y_f / x_f), reborrow_cache(&mut cache)); let res = if y_sign == Sign::Positive { atan_yx + pi } else { atan_yx - pi }; - FpResult::Normal(res.with_precision(self.precision).map(|v| v.repr)) + Ok(res.with_precision(self.precision)) } Ordering::Equal => { // x == 0 case - let pi = work_context.pi::().value(); + let pi = work_context.pi::(reborrow_cache(&mut cache)).value(); let half_pi: FBig = pi / 2; if y_f > FBig::::ZERO { - FpResult::Normal(half_pi.with_precision(self.precision).map(|v| v.repr)) + Ok(half_pi.with_precision(self.precision)) } else { let res = -half_pi; - FpResult::Normal(res.with_precision(self.precision).map(|v| v.repr)) + Ok(res.with_precision(self.precision)) } } } @@ -509,21 +591,19 @@ impl FBig { /// Calculate the sine of the floating point number. /// /// # Panics - /// Panics if the input is infinite or the result is not representable as a normal value. + /// Panics if the input is infinite. #[inline] - #[must_use] pub fn sin(&self) -> Self { - self.context.sin(&self.repr).value(&self.context) + self.context.unwrap_fp(self.context.sin(&self.repr, None)) } /// Calculate the cosine of the floating point number. /// /// # Panics - /// Panics if the input is infinite or the result is not representable as a normal value. + /// Panics if the input is infinite. #[inline] - #[must_use] pub fn cos(&self) -> Self { - self.context.cos(&self.repr).value(&self.context) + self.context.unwrap_fp(self.context.cos(&self.repr, None)) } /// Calculate both the sine and cosine of the floating point number. @@ -531,57 +611,54 @@ impl FBig { /// This is more efficient than calling `sin` and `cos` separately. /// /// # Panics - /// Panics if the input is infinite or the results are not representable as normal values. + /// Panics if the input is infinite. #[inline] - #[must_use] pub fn sin_cos(&self) -> (Self, Self) { - let (s, c) = self.context.sin_cos(&self.repr); - (s.value(&self.context), c.value(&self.context)) + let (s, c) = self.context.sin_cos(&self.repr, None); + (self.context.unwrap_fp(s), self.context.unwrap_fp(c)) } /// Calculate the tangent of the floating point number. /// - /// Returns `FpResult` to safely handle non-finite results (e.g., at singularities). + /// At odd multiples of π/2 the result is an infinity (returned as a value). + /// + /// # Panics + /// Panics if the input is infinite. #[inline] - #[must_use] - pub fn tan(&self) -> FpResult { - self.context.tan(&self.repr) + pub fn tan(&self) -> Self { + self.context.unwrap_fp(self.context.tan(&self.repr, None)) } /// Calculate the arcsine of the floating point number. /// - /// Returns `FpResult` to safely handle domain errors (e.g., |x| > 1). + /// # Panics + /// Panics if the input is infinite or `|self| > 1` (out of domain). #[inline] - #[must_use] - pub fn asin(&self) -> FpResult { - self.context.asin(&self.repr) + pub fn asin(&self) -> Self { + self.context.unwrap_fp(self.context.asin(&self.repr, None)) } /// Calculate the arccosine of the floating point number. /// - /// Returns `FpResult` to safely handle domain errors (e.g., |x| > 1). + /// # Panics + /// Panics if the input is infinite or `|self| > 1` (out of domain). #[inline] - #[must_use] - pub fn acos(&self) -> FpResult { - self.context.acos(&self.repr) + pub fn acos(&self) -> Self { + self.context.unwrap_fp(self.context.acos(&self.repr, None)) } - /// Calculate the arctangent of the floating point number. - /// - /// # Panics - /// Panics if the result is not representable as a normal value. + /// Calculate the arctangent of the floating point number. `atan(±inf) = ±π/2`. #[inline] - #[must_use] pub fn atan(&self) -> Self { - self.context.atan(&self.repr).value(&self.context) + self.context.unwrap_fp(self.context.atan(&self.repr, None)) } - /// Calculate the arctangent of y / x. + /// Calculate the arctangent of `self / x`. /// - /// Returns `FpResult` to safely handle special cases like (0,0) or infinities. + /// # Panics + /// Panics if both arguments are zero. #[inline] - #[must_use] - pub fn atan2(&self, x: &Self) -> FpResult { - self.context.atan2(&self.repr, &x.repr) + pub fn atan2(&self, x: &Self) -> Self { + self.context.unwrap_fp(self.context.atan2(&self.repr, &x.repr, None)) } } diff --git a/float/src/mul.rs b/float/src/mul.rs index 5b674d7..7423529 100644 --- a/float/src/mul.rs +++ b/float/src/mul.rs @@ -1,14 +1,62 @@ +use dashu_base::Sign; use dashu_int::{IBig, UBig}; use crate::{ - error::{assert_finite, assert_finite_operands}, + error::{assert_finite_operands, FpError, FpResult}, fbig::FBig, helper_macros, repr::{Context, Repr, Word}, - round::{Round, Rounded}, + round::Round, }; use core::ops::{Mul, MulAssign}; +/// Raw product of two finite reprs, attaching the XOR sign of the operands to a zero product +/// (the significand product alone is `+0`, losing the sign). +/// +/// Returns an error when the result exponent overflows or underflows `isize`. +fn mul_finite_reprs(lhs: &Repr, rhs: &Repr) -> Result, FpError> { + let significand = &lhs.significand * &rhs.significand; + if significand.is_zero() { + return Ok(if lhs.sign() != rhs.sign() { + Repr::neg_zero() + } else { + Repr::zero() + }); + } + let sign = if lhs.sign() != rhs.sign() { + Sign::Negative + } else { + Sign::Positive + }; + let exponent = lhs.exponent.checked_add(rhs.exponent).ok_or_else(|| { + debug_assert!( + lhs.exponent.is_positive() == rhs.exponent.is_positive(), + "checked_add overflow with mixed-sign exponents is impossible" + ); + if lhs.exponent > 0 { + FpError::Overflow(sign) + } else { + FpError::Underflow(sign) + } + })?; + Repr::new(significand, exponent).check_finite_exponent() +} + +macro_rules! unwrap_mul_repr { + ($result:expr, $context:expr) => { + match $result { + Ok(r) => r, + Err(FpError::Overflow(sign)) => { + return FBig::new(Repr::infinity_with_sign(sign), $context); + } + Err(FpError::Underflow(sign)) => { + return FBig::new(Repr::zero_with_sign(sign), $context); + } + Err(_) => unreachable!(), + } + }; +} + impl Mul<&FBig> for &FBig { type Output = FBig; @@ -17,10 +65,7 @@ impl Mul<&FBig> for &FBig { assert_finite_operands(&self.repr, &rhs.repr); let context = Context::max(self.context, rhs.context); - let repr = Repr::new( - &self.repr.significand * &rhs.repr.significand, - self.repr.exponent + rhs.repr.exponent, - ); + let repr = unwrap_mul_repr!(mul_finite_reprs(&self.repr, &rhs.repr), context); FBig::new(context.repr_round(repr).value(), context) } } @@ -33,10 +78,7 @@ impl Mul<&FBig> for FBig { assert_finite_operands(&self.repr, &rhs.repr); let context = Context::max(self.context, rhs.context); - let repr = Repr::new( - self.repr.significand * &rhs.repr.significand, - self.repr.exponent + rhs.repr.exponent, - ); + let repr = unwrap_mul_repr!(mul_finite_reprs(&self.repr, &rhs.repr), context); FBig::new(context.repr_round(repr).value(), context) } } @@ -49,10 +91,7 @@ impl Mul> for &FBig { assert_finite_operands(&self.repr, &rhs.repr); let context = Context::max(self.context, rhs.context); - let repr = Repr::new( - &self.repr.significand * rhs.repr.significand, - self.repr.exponent + rhs.repr.exponent, - ); + let repr = unwrap_mul_repr!(mul_finite_reprs(&self.repr, &rhs.repr), context); FBig::new(context.repr_round(repr).value(), context) } } @@ -65,10 +104,7 @@ impl Mul> for FBig { assert_finite_operands(&self.repr, &rhs.repr); let context = Context::max(self.context, rhs.context); - let repr = Repr::new( - self.repr.significand * rhs.repr.significand, - self.repr.exponent + rhs.repr.exponent, - ); + let repr = unwrap_mul_repr!(mul_finite_reprs(&self.repr, &rhs.repr), context); FBig::new(context.repr_round(repr).value(), context) } } @@ -98,7 +134,7 @@ impl FBig { /// ``` #[inline] pub fn sqr(&self) -> Self { - self.context.sqr(&self.repr).value() + self.context.unwrap_fp(self.context.sqr(&self.repr)) } /// Compute the cubic of this number (`self * self * self`) @@ -115,7 +151,7 @@ impl FBig { /// ``` #[inline] pub fn cubic(&self) -> Self { - self.context.cubic(&self.repr).value() + self.context.unwrap_fp(self.context.cubic(&self.repr)) } } @@ -136,12 +172,14 @@ impl Context { /// let b = DBig::from_str("6.789")?; /// assert_eq!( /// context.mul(&a.repr(), &b.repr()), - /// Inexact(DBig::from_str("-8.4")?, SubOne) + /// Ok(Inexact(DBig::from_str("-8.4")?, SubOne)) /// ); /// # Ok::<(), ParseError>(()) /// ``` - pub fn mul(&self, lhs: &Repr, rhs: &Repr) -> Rounded> { - assert_finite_operands(lhs, rhs); + pub fn mul(&self, lhs: &Repr, rhs: &Repr) -> FpResult> { + if lhs.is_infinite() || rhs.is_infinite() { + return Err(FpError::InfiniteInput); + } // at most double the precision is required to get a correct result // shrink the input operands if necessary @@ -167,11 +205,8 @@ impl Context { rhs }; - let repr = Repr::new( - &lhs_repr.significand * &rhs_repr.significand, - lhs_repr.exponent + rhs_repr.exponent, - ); - self.repr_round(repr).map(|v| FBig::new(v, *self)) + let repr = mul_finite_reprs(lhs_repr, rhs_repr)?; + Ok(self.repr_round(repr).map(|v| FBig::new(v, *self))) } /// Calculate the square of the floating point number under this context. @@ -187,11 +222,13 @@ impl Context { /// /// let context = Context::::new(2); /// let a = DBig::from_str("-1.234")?; - /// assert_eq!(context.sqr(&a.repr()), Inexact(DBig::from_str("1.5")?, NoOp)); + /// assert_eq!(context.sqr(&a.repr()), Ok(Inexact(DBig::from_str("1.5")?, NoOp))); /// # Ok::<(), ParseError>(()) /// ``` - pub fn sqr(&self, f: &Repr) -> Rounded> { - assert_finite(f); + pub fn sqr(&self, f: &Repr) -> FpResult> { + if f.is_infinite() { + return Err(FpError::InfiniteInput); + } // shrink the input operands if necessary let max_precision = if self.is_limited() { @@ -208,8 +245,17 @@ impl Context { f }; - let repr = Repr::new(f_repr.significand.sqr().into(), 2 * f_repr.exponent); - self.repr_round(repr).map(|v| FBig::new(v, *self)) + let exponent = f_repr.exponent.checked_mul(2).ok_or({ + // sqr always produces a non-negative result + if f_repr.exponent > 0 { + FpError::Overflow(Sign::Positive) + } else { + FpError::Underflow(Sign::Positive) + } + })?; + let repr = Repr::new(f_repr.significand.sqr().into(), exponent); + let repr = repr.check_finite_exponent()?; + Ok(self.repr_round(repr).map(|v| FBig::new(v, *self))) } /// Calculate the cubic of the floating point number under this context. @@ -225,11 +271,13 @@ impl Context { /// /// let context = Context::::new(2); /// let a = DBig::from_str("-1.234")?; - /// assert_eq!(context.cubic(&a.repr()), Inexact(DBig::from_str("-1.9")?, SubOne)); + /// assert_eq!(context.cubic(&a.repr()), Ok(Inexact(DBig::from_str("-1.9")?, SubOne))); /// # Ok::<(), ParseError>(()) /// ``` - pub fn cubic(&self, f: &Repr) -> Rounded> { - assert_finite(f); + pub fn cubic(&self, f: &Repr) -> FpResult> { + if f.is_infinite() { + return Err(FpError::InfiniteInput); + } // shrink the input operands if necessary let max_precision = if self.is_limited() { @@ -246,7 +294,25 @@ impl Context { f }; - let repr = Repr::new(f_repr.significand.cubic(), 3 * f_repr.exponent); - self.repr_round(repr).map(|v| FBig::new(v, *self)) + let repr = if f_repr.significand.is_zero() { + // cubic(±0) = ±0 (odd power preserves sign) + if f_repr.is_neg_zero() { + Repr::neg_zero() + } else { + Repr::zero() + } + } else { + let sign = f_repr.sign(); + let exponent = f_repr.exponent.checked_mul(3).ok_or({ + if f_repr.exponent > 0 { + FpError::Overflow(sign) + } else { + FpError::Underflow(sign) + } + })?; + let repr = Repr::new(f_repr.significand.cubic(), exponent); + repr.check_finite_exponent()? + }; + Ok(self.repr_round(repr).map(|v| FBig::new(v, *self))) } } diff --git a/float/src/repr.rs b/float/src/repr.rs index 1a52911..0ed5595 100644 --- a/float/src/repr.rs +++ b/float/src/repr.rs @@ -1,5 +1,5 @@ use crate::{ - error::assert_finite, + error::{assert_finite, FpError}, round::{Round, Rounded}, utils::{digit_len, split_digits, split_digits_ref}, }; @@ -21,25 +21,50 @@ use dashu_int::{IBig, UBig}; /// `base^(precision+1)`; the guard digit is what lets a much-smaller operand be reduced to a /// sign-only sticky bit during alignment without mis-rounding. /// -/// # Infinity +/// # Infinity and signed zero /// -/// This struct supports representing the infinity, but the infinity is only supposed to be used -/// as sentinels. That is, only equality test and comparison are implemented for the infinity. -/// Any other operations on the infinity will lead to panic. If an operation result is too large -/// or too small, the operation will **panic** instead of returning an infinity. +/// Special values are encoded with a zero significand and a sentinel exponent: +/// - value zero (`+0`): exponent = 0 +/// - negative zero (`-0`): exponent = -1 +/// - positive infinity (`+inf`): exponent = `isize::MAX` +/// - negative infinity (`-inf`): exponent = `isize::MIN` +/// +/// The infinities are only supposed to be consumed as sentinels: only equality test and +/// comparison are implemented for them, and any arithmetic operation that takes an infinity +/// as input will lead to panic (at the `FBig` layer) or return an error (at the `Context` +/// layer). If an operation result is too large or too small, the operation will return an +/// infinity (as a value) at the `Context` layer, or panic at the `FBig` layer. /// -#[derive(PartialEq, Eq)] pub struct Repr { - /// The significand of the floating point number. If the significand is zero, then the number is: - /// - Zero, if exponent = 0 - /// - Positive infinity, if exponent > 0 - /// - Negative infinity, if exponent < 0 + /// The significand of the floating point number. If the significand is zero, then the + /// number is a special value identified by the exponent (see the struct-level docs): + /// `+0`, `-0`, `+inf`, or `-inf`. pub(crate) significand: IBig, /// The exponent of the floating point number. pub(crate) exponent: isize, } +impl PartialEq for Repr { + /// Two representations are equal when they denote the same value. In particular `+0` + /// and `-0` compare equal, as do two infinities of the same sign. + #[inline] + fn eq(&self, other: &Self) -> bool { + if self.significand.is_zero() && other.significand.is_zero() { + let (self_inf, other_inf) = (self.is_infinite(), other.is_infinite()); + match (self_inf, other_inf) { + (true, true) => self.sign() == other.sign(), + (false, false) => true, // both are ±0 + _ => false, // one is zero, the other is infinite + } + } else { + self.significand == other.significand && self.exponent == other.exponent + } + } +} + +impl Eq for Repr {} + /// The context containing runtime information for the floating point number and its operations. /// /// The context currently consists of a *precision limit* and a *rounding mode*. All the operation @@ -77,6 +102,34 @@ pub struct Context { _marker: PhantomData, } +/// Flip the sign of a special-value exponent: `+0 (0) <-> -0 (-1)`, `+inf (MAX) <-> -inf (MIN)`. +/// For any other (non-canonical) exponent the plain negation is used, which is safe because such +/// values have magnitude strictly less than `isize::MAX`. +#[inline] +const fn negate_special_exponent(exp: isize) -> isize { + match exp { + 0 => -1, + -1 => 0, + isize::MAX => isize::MIN, + isize::MIN => isize::MAX, + other => -other, + } +} + +/// Build a `Repr` from a rounded significand, preserving the input sign when rounding +/// produces zero (`significand * B^exponent` where the significand collapsed to `+0`). +fn rounded_to_repr( + significand: IBig, + exponent: isize, + input_negative: bool, +) -> Repr { + if significand.is_zero() && input_negative { + Repr::neg_zero() + } else { + Repr::new(significand, exponent) + } +} + impl Repr { /// The base of the representation. It's exposed as an [IBig] constant. pub const BASE: UBig = UBig::from_word(B); @@ -110,28 +163,41 @@ impl Repr { pub const fn infinity() -> Self { Self { significand: IBig::ZERO, - exponent: 1, + exponent: isize::MAX, } } /// Create a [Repr] instance representing the negative infinity #[inline] pub const fn neg_infinity() -> Self { + Self { + significand: IBig::ZERO, + exponent: isize::MIN, + } + } + /// Create a [Repr] instance representing the negative zero (`-0`) + /// + /// Negative zero is produced by operations (e.g. `1 / -inf`, `ceil(-0)`, cancellation + /// under round-toward-negative) and is distinct from `+0` only in operations that are + /// sensitive to the sign of zero (e.g. `1 / -0 = -inf`). It compares equal to `+0`. + #[inline] + pub const fn neg_zero() -> Self { Self { significand: IBig::ZERO, exponent: -1, } } - // XXX: Add support for representing NEG_ZERO, but don't provide method to generate it. - // neg_zero: exponent -1, infinity: exponent: isize::MAX, neg_infinity: exponent: isize::MIN - /// Determine if the [Repr] represents zero /// + /// Note that this returns `true` only for `+0`; use [`Self::is_neg_zero`] to detect `-0`, + /// or check `self.significand.is_zero()` to detect either signed zero. + /// /// # Examples /// /// ``` /// # use dashu_float::Repr; /// assert!(Repr::<2>::zero().is_zero()); + /// assert!(!Repr::<10>::neg_zero().is_zero()); /// assert!(!Repr::<10>::one().is_zero()); /// ``` #[inline] @@ -139,6 +205,21 @@ impl Repr { self.significand.is_zero() && self.exponent == 0 } + /// Determine if the [Repr] represents the negative zero (`-0`) + /// + /// # Examples + /// + /// ``` + /// # use dashu_float::Repr; + /// assert!(Repr::<2>::neg_zero().is_neg_zero()); + /// assert!(!Repr::<10>::zero().is_neg_zero()); + /// assert!(!Repr::<10>::one().is_neg_zero()); + /// ``` + #[inline] + pub const fn is_neg_zero(&self) -> bool { + self.significand.is_zero() && self.exponent == -1 + } + /// Determine if the [Repr] represents one /// /// # Examples @@ -162,10 +243,11 @@ impl Repr { /// assert!(Repr::<2>::infinity().is_infinite()); /// assert!(Repr::<10>::neg_infinity().is_infinite()); /// assert!(!Repr::<10>::one().is_infinite()); + /// assert!(!Repr::<10>::neg_zero().is_infinite()); /// ``` #[inline] pub const fn is_infinite(&self) -> bool { - self.significand.is_zero() && self.exponent != 0 + self.significand.is_zero() && (self.exponent == isize::MAX || self.exponent == isize::MIN) } /// Determine if the [Repr] represents a finite number @@ -205,12 +287,15 @@ impl Repr { /// Get the sign of the number /// + /// Note that `-0` has a negative sign (so `1 / -0 = -inf`), while `+0` has a positive sign. + /// /// # Examples /// /// ``` /// # use dashu_base::Sign; /// # use dashu_float::Repr; /// assert_eq!(Repr::<2>::zero().sign(), Sign::Positive); + /// assert_eq!(Repr::<2>::neg_zero().sign(), Sign::Negative); /// assert_eq!(Repr::<2>::neg_one().sign(), Sign::Negative); /// assert_eq!(Repr::<10>::neg_infinity().sign(), Sign::Negative); /// ``` @@ -227,30 +312,96 @@ impl Repr { } } + /// Negate the number, correctly toggling the sign of `±0` and `±inf` by flipping the + /// special-value exponent (negating the significand alone is a no-op for zero). + #[inline] + pub(crate) fn neg(self) -> Self { + if self.significand.is_zero() { + Self { + significand: self.significand, + exponent: negate_special_exponent(self.exponent), + } + } else { + Self { + significand: -self.significand, + exponent: self.exponent, + } + } + } + + /// Check that a `Repr` with a non-zero significand has a valid finite exponent. + /// + /// Returns [`FpError::Overflow`] or [`FpError::Underflow`] when the exponent collides with + /// the `+inf`/`-inf` sentinels (`isize::MAX` / `isize::MIN`). Zero-significand reprs + /// (canonical special values) always pass. + pub(crate) fn check_finite_exponent(self) -> Result { + if !self.significand.is_zero() { + if self.exponent == isize::MAX { + Err(FpError::Overflow(self.sign())) + } else if self.exponent == isize::MIN { + Err(FpError::Underflow(self.sign())) + } else { + Ok(self) + } + } else { + Ok(self) + } + } + + /// Create the `Repr` for a signed infinity from the mathematical sign of a result that + /// overflowed. + #[inline] + pub(crate) const fn infinity_with_sign(sign: Sign) -> Self { + match sign { + Sign::Positive => Self::infinity(), + Sign::Negative => Self::neg_infinity(), + } + } + + /// Create the `Repr` for a signed zero from the mathematical sign of a result that + /// underflowed. + #[inline] + pub(crate) const fn zero_with_sign(sign: Sign) -> Self { + match sign { + Sign::Positive => Self::zero(), + Sign::Negative => Self::neg_zero(), + } + } + /// Normalize the float representation so that the significand is not divisible by the base. - /// Any floats with zero significand will be considered as zero value (instead of an `INFINITY`) + /// + /// A zero significand denotes a canonical special value (`+0`, `-0`, `+inf`, `-inf`) and is + /// returned unchanged; any other (non-canonical) zero significand is normalized to `+0`. pub(crate) fn normalize(self) -> Self { + if self.significand.is_zero() { + // Preserve the four canonical special-value encodings; collapse anything else to +0. + if self.exponent == 0 + || self.exponent == -1 + || self.exponent == isize::MAX + || self.exponent == isize::MIN + { + return self; + } + return Self::zero(); + } + let Self { mut significand, mut exponent, } = self; - if significand.is_zero() { - return Self::zero(); - } - if B == 2 { let shift = significand.trailing_zeros().unwrap(); significand >>= shift; - exponent += shift as isize; + exponent = exponent.saturating_add(shift as isize); } else if B.is_power_of_two() { let bits = B.trailing_zeros() as usize; let shift = significand.trailing_zeros().unwrap() / bits; significand >>= shift * bits; - exponent += shift as isize; + exponent = exponent.saturating_add(shift as isize); } else { let (sign, mut mag) = significand.into_parts(); let shift = mag.remove(&UBig::from_word(B)).unwrap(); - exponent += shift as isize; + exponent = exponent.saturating_add(shift as isize); significand = IBig::from_parts(sign, mag); } Self { @@ -491,9 +642,12 @@ impl Context { let digits = repr.digits(); if digits > self.precision { let shift = digits - self.precision; + let input_neg = repr.sign() == Sign::Negative; let (signif_hi, signif_lo) = split_digits::(repr.significand, shift); let adjust = R::round_fract::(&signif_hi, signif_lo, shift); - Inexact(Repr::new(signif_hi + adjust, repr.exponent + shift as isize), adjust) + let sig = signif_hi + adjust; + let result = rounded_to_repr(sig, repr.exponent + shift as isize, input_neg); + Inexact(result, adjust) } else { Exact(repr) } @@ -509,11 +663,64 @@ impl Context { let digits = repr.digits(); if digits > self.precision { let shift = digits - self.precision; + let input_neg = repr.sign() == Sign::Negative; let (signif_hi, signif_lo) = split_digits_ref::(&repr.significand, shift); let adjust = R::round_fract::(&signif_hi, signif_lo, shift); - Inexact(Repr::new(signif_hi + adjust, repr.exponent + shift as isize), adjust) + let sig = signif_hi + adjust; + let result = rounded_to_repr(sig, repr.exponent + shift as isize, input_neg); + Inexact(result, adjust) } else { Exact(repr.clone()) } } } + +#[cfg(test)] +mod tests { + use super::*; + use dashu_base::Sign; + + #[test] + fn infinity_encoding() { + assert_eq!(Repr::<2>::infinity().exponent, isize::MAX); + assert_eq!(Repr::<10>::neg_infinity().exponent, isize::MIN); + assert!(Repr::<2>::infinity().is_infinite()); + assert!(Repr::<10>::neg_infinity().is_infinite()); + assert!(!Repr::<2>::infinity().is_finite()); + assert_eq!(Repr::<2>::infinity().sign(), Sign::Positive); + assert_eq!(Repr::<10>::neg_infinity().sign(), Sign::Negative); + } + + #[test] + fn neg_zero_encoding() { + assert_eq!(Repr::<2>::neg_zero().exponent, -1); + assert!(Repr::<2>::neg_zero().is_neg_zero()); + assert!(!Repr::<2>::neg_zero().is_zero()); + assert!(!Repr::<2>::neg_zero().is_infinite()); + assert_eq!(Repr::<2>::neg_zero().sign(), Sign::Negative); + assert_eq!(Repr::<2>::zero().sign(), Sign::Positive); + } + + #[test] + fn normalize_preserves_specials() { + // infinities are preserved (the previous clobbering bug) + assert_eq!(Repr::<2>::infinity(), Repr::<2>::infinity().normalize()); + assert_eq!(Repr::<10>::neg_infinity(), Repr::<10>::neg_infinity().normalize()); + // +0 is preserved + assert_eq!(Repr::<2>::zero(), Repr::<2>::zero().normalize()); + // a stray zero significand with a non-sentinel exponent collapses to +0 + let stray: Repr<2> = Repr { + significand: IBig::ZERO, + exponent: 7, + }; + assert_eq!(Repr::<2>::zero(), stray.normalize()); + // non-zero significands are still normalized + let r: Repr<2> = Repr { + significand: IBig::from(0b10100i32), + exponent: 0, + }; + let r = r.normalize(); + assert_eq!(r.significand, IBig::from(0b101i32)); + assert_eq!(r.exponent, 2); + } +} diff --git a/float/src/root.rs b/float/src/root.rs index 3c0c480..a0ac229 100644 --- a/float/src/root.rs +++ b/float/src/root.rs @@ -2,10 +2,10 @@ use dashu_base::{Approximation, CubicRoot, Sign, SquareRoot, SquareRootRem, Unsi use dashu_int::{IBig, UBig}; use crate::{ - error::{assert_finite, assert_limited_precision, panic_root_negative, panic_root_zeroth}, + error::{assert_limited_precision, panic_root_zeroth, FpError, FpResult}, fbig::FBig, repr::{Context, Repr, Word}, - round::{Round, Rounded}, + round::Round, utils::{shl_digits, split_digits_ref}, }; @@ -13,7 +13,7 @@ impl SquareRoot for FBig { type Output = Self; #[inline] fn sqrt(&self) -> Self { - self.context.sqrt(self.repr()).value() + self.context.unwrap_fp(self.context.sqrt(self.repr())) } } @@ -21,7 +21,7 @@ impl CubicRoot for FBig { type Output = Self; #[inline] fn cbrt(&self) -> Self { - self.context.cbrt(self.repr()).value() + self.context.unwrap_fp(self.context.cbrt(self.repr())) } } @@ -50,7 +50,7 @@ impl FBig { /// Panics if `n` is zero, or if `n` is even and the number is negative. #[inline] pub fn nth_root(&self, n: usize) -> Self { - self.context.nth_root(n, self.repr()).value() + self.context.unwrap_fp(self.context.nth_root(n, self.repr())) } } @@ -68,18 +68,24 @@ impl Context { /// /// let context = Context::::new(2); /// let a = DBig::from_str("1.23")?; - /// assert_eq!(context.sqrt(&a.repr()), Inexact(DBig::from_str("1.1")?, NoOp)); + /// assert_eq!(context.sqrt(&a.repr()), Ok(Inexact(DBig::from_str("1.1")?, NoOp))); /// # Ok::<(), ParseError>(()) /// ``` /// /// # Panics /// /// Panics if the precision is unlimited. - pub fn sqrt(&self, x: &Repr) -> Rounded> { - assert_finite(x); + pub fn sqrt(&self, x: &Repr) -> FpResult> { + if x.is_infinite() { + return Err(FpError::InfiniteInput); + } assert_limited_precision(self.precision); + if x.significand.is_zero() { + // sqrt(+0) = +0, sqrt(-0) = -0 (preserve the sign of zero) + return Ok(Approximation::Exact(FBig::new(x.clone(), *self))); + } if x.sign() == Sign::Negative { - panic_root_negative() + return Err(FpError::OutOfDomain); } // adjust the signifcand so that the exponent is even @@ -107,9 +113,10 @@ impl Context { }); Approximation::Inexact(root + adjust, adjust) }; - res.map(|signif| Repr::new(signif, exp)) + Ok(res + .map(|signif| Repr::new(signif, exp)) .and_then(|v| self.repr_round(v)) - .map(|v| FBig::new(v, *self)) + .map(|v| FBig::new(v, *self))) } /// Calculate the cubic root of the floating point number. @@ -125,7 +132,7 @@ impl Context { /// /// let context = Context::::new(2); /// let a = DBig::from_str("8")?; - /// assert_eq!(context.cbrt(&a.repr()), Exact(DBig::from_str("2")?)); + /// assert_eq!(context.cbrt(&a.repr()), Ok(Exact(DBig::from_str("2")?))); /// # Ok::<(), ParseError>(()) /// ``` /// @@ -133,7 +140,7 @@ impl Context { /// /// Panics if the precision is unlimited. #[inline] - pub fn cbrt(&self, x: &Repr) -> Rounded> { + pub fn cbrt(&self, x: &Repr) -> FpResult> { self.nth_root(3, x) } @@ -150,15 +157,17 @@ impl Context { /// /// let context = Context::::new(2); /// let a = DBig::from_str("27")?; - /// assert_eq!(context.nth_root(3, &a.repr()), Exact(DBig::from_str("3")?)); + /// assert_eq!(context.nth_root(3, &a.repr()), Ok(Exact(DBig::from_str("3")?))); /// # Ok::<(), ParseError>(()) /// ``` /// /// # Panics /// /// Panics if `n` is zero, if the precision is unlimited, or if `n` is even and `x` is negative. - pub fn nth_root(&self, n: usize, x: &Repr) -> Rounded> { - assert_finite(x); + pub fn nth_root(&self, n: usize, x: &Repr) -> FpResult> { + if x.is_infinite() { + return Err(FpError::InfiniteInput); + } assert_limited_precision(self.precision); if n == 0 { panic_root_zeroth() @@ -166,14 +175,16 @@ impl Context { debug_assert!(n < isize::MAX as usize); let sign = x.sign(); if sign == Sign::Negative && n % 2 == 0 { - panic_root_negative() + return Err(FpError::OutOfDomain); } if n == 1 { - return self.repr_round_ref(x).map(|v| FBig::new(v, *self)); + return Ok(self.repr_round_ref(x).map(|v| FBig::new(v, *self))); } if x.significand.is_zero() { - // UBig::ZERO.nth_root(n) erroneously returns ONE, so short-circuit here - return Approximation::Exact(FBig::new(Repr::zero(), *self)); + // UBig::ZERO.nth_root(n) erroneously returns ONE, so short-circuit here. + // An even root of -0 already errored above, so reaching here the sign is + // preserved: odd root of ±0 is ±0. + return Ok(Approximation::Exact(FBig::new(x.clone(), *self))); } // operate on the magnitude so that shifting/splitting keep a clean sign; @@ -225,8 +236,9 @@ impl Context { }); Approximation::Inexact(signed_root.clone() + adjust, adjust) }; - res.map(|signif| Repr::new(signif, exp)) + Ok(res + .map(|signif| Repr::new(signif, exp)) .and_then(|v| self.repr_round(v)) - .map(|v| FBig::new(v, *self)) + .map(|v| FBig::new(v, *self))) } } diff --git a/float/src/round.rs b/float/src/round.rs index ee86205..2e59592 100644 --- a/float/src/round.rs +++ b/float/src/round.rs @@ -87,6 +87,11 @@ pub trait Round: Copy { /// The rounding operation that rounds to an opposite direction type Reverse: Round; + /// Whether this mode rounds toward negative infinity (IEEE roundTowardNegative). + /// Used to determine the sign of a zero produced by exact cancellation: per IEEE 754, + /// `x + (-x)` yields `-0` only under roundTowardNegative, `+0` otherwise. + const IS_ROUND_TOWARD_NEGATIVE: bool; + /// Calculate the rounding of the number (integer + rem), assuming rem != 0 and |rem| < 1. /// `low_half_test` should tell |rem|.cmp(0.5) fn round_low_part Ordering>( @@ -157,6 +162,7 @@ pub trait ErrorBounds: Round { impl Round for mode::Zero { type Reverse = mode::Away; + const IS_ROUND_TOWARD_NEGATIVE: bool = false; #[inline] fn round_low_part Ordering>( @@ -195,6 +201,7 @@ impl ErrorBounds for mode::Zero { impl Round for mode::Away { type Reverse = mode::Zero; + const IS_ROUND_TOWARD_NEGATIVE: bool = false; #[inline] fn round_low_part Ordering>( @@ -237,6 +244,7 @@ impl ErrorBounds for mode::Away { impl Round for mode::Down { type Reverse = mode::Up; + const IS_ROUND_TOWARD_NEGATIVE: bool = true; #[inline] fn round_low_part Ordering>( @@ -264,6 +272,7 @@ impl ErrorBounds for mode::Down { impl Round for mode::Up { type Reverse = mode::Down; + const IS_ROUND_TOWARD_NEGATIVE: bool = false; #[inline] fn round_low_part Ordering>( @@ -291,6 +300,7 @@ impl ErrorBounds for mode::Up { impl Round for mode::HalfAway { type Reverse = Self; + const IS_ROUND_TOWARD_NEGATIVE: bool = false; #[inline] fn round_low_part Ordering>( @@ -350,6 +360,7 @@ impl ErrorBounds for mode::HalfAway { impl Round for mode::HalfEven { type Reverse = Self; + const IS_ROUND_TOWARD_NEGATIVE: bool = false; #[inline] fn round_low_part Ordering>( diff --git a/float/src/round_ops.rs b/float/src/round_ops.rs index 31a831a..c8c8535 100644 --- a/float/src/round_ops.rs +++ b/float/src/round_ops.rs @@ -9,6 +9,27 @@ use dashu_base::{Approximation::*, Sign}; use dashu_int::{IBig, Word}; impl FBig { + /// Return `±0` carrying the same sign as `self` (used by trunc/round/fract where a zero + /// result inherits the sign of the input). The result has precision 0, matching `FBig::ZERO`. + fn signed_zero(&self) -> Self { + let zero = if self.repr.sign() == Sign::Negative { + Repr::neg_zero() + } else { + Repr::zero() + }; + FBig::new(zero, Context::new(0)) + } + + /// Build an integer `Repr` (exponent 0) from `value`, attaching the input's sign when + /// `value` is zero (so e.g. `trunc(-0.5)` yields `-0`). + fn signed_int(&self, value: IBig) -> Repr { + if value.is_zero() && self.repr.sign() == Sign::Negative { + Repr::neg_zero() + } else { + Repr::new(value, 0) + } + } + /// Get the integral part of the float /// /// See [FBig::round] for how the output precision is determined. @@ -36,13 +57,13 @@ impl FBig { if self.repr.exponent >= 0 { return self.clone(); } else if self.repr.smaller_than_one() { - return Self::ZERO; + return self.signed_zero(); } let shift = (-self.repr.exponent) as usize; let signif = shr_digits::(&self.repr.significand, shift); let context = Context::new(self.context.precision.saturating_sub(shift)); - FBig::new(Repr::new(signif, 0), context) + FBig::new(self.signed_int(signif), context) } // Split the float number at the radix point, assuming it exists (the number is not a integer). @@ -132,7 +153,7 @@ impl FBig { pub fn fract(&self) -> Self { assert_finite(&self.repr); if self.repr.exponent >= 0 { - return Self::ZERO; + return self.signed_zero(); } else if self.repr.smaller_than_one() { return self.clone(); } @@ -167,7 +188,7 @@ impl FBig { #[inline] pub fn ceil(&self) -> Self { assert_finite(&self.repr); - if self.repr.is_zero() || self.repr.exponent >= 0 { + if self.repr.significand.is_zero() || self.repr.exponent >= 0 { return self.clone(); } else if self.repr.smaller_than_one() { return match self.repr.sign() { @@ -207,7 +228,7 @@ impl FBig { #[inline] pub fn floor(&self) -> Self { assert_finite(&self.repr); - if self.repr.exponent >= 0 { + if self.repr.significand.is_zero() || self.repr.exponent >= 0 { return self.clone(); } else if self.repr.smaller_than_one() { return match self.repr.sign() { @@ -255,18 +276,18 @@ impl FBig { /// Panics if the number is infinte pub fn round(&self) -> Self { assert_finite(&self.repr); - if self.repr.exponent >= 0 { + if self.repr.significand.is_zero() || self.repr.exponent >= 0 { return self.clone(); } else if self.repr.exponent + (self.repr.digits_ub() as isize) < -2 { // to determine if the number rounds to zero, we need to make sure |self| < 0.5 // which is stricter than `self.repr.smaller_than_one()` - return Self::ZERO; + return self.signed_zero(); } let (hi, lo, precision) = self.split_at_point_internal(); let rounding = mode::HalfAway::round_fract::(&hi, lo, precision); let context = Context::new(self.context.precision.saturating_sub(precision)); - FBig::new(Repr::new(hi + rounding, 0), context) + FBig::new(self.signed_int(hi + rounding), context) } /// Round the number to the nearest multiple of `BASE^exp`. diff --git a/float/src/shift.rs b/float/src/shift.rs index 1ddc43b..92a694b 100644 --- a/float/src/shift.rs +++ b/float/src/shift.rs @@ -6,7 +6,7 @@ impl Shl for FBig { #[inline] fn shl(mut self, rhs: isize) -> Self::Output { assert_finite(&self.repr); - if !self.repr.is_zero() { + if !self.repr.significand.is_zero() { self.repr.exponent += rhs; } self @@ -17,7 +17,7 @@ impl ShlAssign for FBig { #[inline] fn shl_assign(&mut self, rhs: isize) { assert_finite(&self.repr); - if !self.repr.is_zero() { + if !self.repr.significand.is_zero() { self.repr.exponent += rhs; } } @@ -28,7 +28,7 @@ impl Shr for FBig { #[inline] fn shr(mut self, rhs: isize) -> Self::Output { assert_finite(&self.repr); - if !self.repr.is_zero() { + if !self.repr.significand.is_zero() { self.repr.exponent -= rhs; } self @@ -39,9 +39,8 @@ impl ShrAssign for FBig { #[inline] fn shr_assign(&mut self, rhs: isize) { assert_finite(&self.repr); - if !self.repr.is_zero() { + if !self.repr.significand.is_zero() { self.repr.exponent -= rhs; } - self.repr.exponent -= rhs; } } diff --git a/float/src/sign.rs b/float/src/sign.rs index 7568f2d..dd46c3f 100644 --- a/float/src/sign.rs +++ b/float/src/sign.rs @@ -8,7 +8,8 @@ use dashu_base::{Abs, Sign, Signed}; use dashu_int::IBig; impl FBig { - /// Get the sign of the number. Zero value has a positive sign. + /// Get the sign of the number. Positive zero has a positive sign, negative zero has a + /// negative sign. /// /// # Examples /// @@ -41,11 +42,12 @@ impl FBig { /// # Ok::<(), ParseError>(()) /// ``` pub const fn signum(&self) -> Self { - let significand = if self.repr.significand.is_zero() && self.repr.exponent != 0 { - if self.repr.exponent > 0 { - IBig::ONE - } else { - IBig::NEG_ONE + let significand = if self.repr.significand.is_zero() { + // distinguish infinities from signed zero; signum(±0) = +0 + match self.repr.exponent { + isize::MAX => IBig::ONE, + isize::MIN => IBig::NEG_ONE, + _ => IBig::ZERO, } } else { self.repr.significand.signum() @@ -61,9 +63,8 @@ impl FBig { impl Neg for Repr { type Output = Self; #[inline] - fn neg(mut self) -> Self::Output { - self.significand = -self.significand; - self + fn neg(self) -> Self::Output { + Repr::neg(self) } } @@ -71,7 +72,7 @@ impl Neg for FBig { type Output = Self; #[inline] fn neg(mut self) -> Self::Output { - self.repr.significand = -self.repr.significand; + self.repr = self.repr.neg(); self } } @@ -87,7 +88,17 @@ impl Neg for &FBig { impl Abs for FBig { type Output = Self; fn abs(mut self) -> Self::Output { - self.repr.significand = self.repr.significand.abs(); + // flip -0 -> +0 and -inf -> +inf by toggling the special-value exponent; + // finite values take the absolute value of their significand. + if self.repr.significand.is_zero() { + if self.repr.exponent == -1 { + self.repr.exponent = 0; + } else if self.repr.exponent == isize::MIN { + self.repr.exponent = isize::MAX; + } + } else { + self.repr.significand = self.repr.significand.abs(); + } self } } @@ -95,25 +106,31 @@ impl Abs for FBig { impl Mul> for Sign { type Output = FBig; #[inline] - fn mul(self, mut rhs: FBig) -> Self::Output { - rhs.repr.significand *= self; - rhs + fn mul(self, rhs: FBig) -> Self::Output { + match self { + Sign::Positive => rhs, + Sign::Negative => -rhs, + } } } impl Mul for FBig { type Output = FBig; #[inline] - fn mul(mut self, rhs: Sign) -> Self::Output { - self.repr.significand *= rhs; - self + fn mul(self, rhs: Sign) -> Self::Output { + match rhs { + Sign::Positive => self, + Sign::Negative => -self, + } } } impl MulAssign for FBig { #[inline] fn mul_assign(&mut self, rhs: Sign) { - self.repr.significand *= rhs; + if rhs == Sign::Negative { + self.repr = self.repr.clone().neg(); + } } } diff --git a/float/src/third_party/num_order.rs b/float/src/third_party/num_order.rs index 2905df5..11241d4 100644 --- a/float/src/third_party/num_order.rs +++ b/float/src/third_party/num_order.rs @@ -286,6 +286,13 @@ impl NumHash for Repr { const M127: i128 = i128::MAX; const M127U: u128 = M127 as u128; + // Zero and infinities have a zero significand, so their residue hash is 0. + // Short-circuit to also avoid overflow when negating the isize::MIN sentinel + // exponent that encodes -inf. + if self.significand.is_zero() { + return 0i128.num_hash(state); + } + let signif_residue = &self.significand % M127; let signif_hash = MInt::new(signif_residue.unsigned_abs(), &M127U); let exp_hash = if B == 2 { @@ -316,3 +323,201 @@ impl NumHash for FBig { self.repr.num_hash(state) } } + +#[cfg(test)] +mod tests { + use super::*; + use core::cmp::Ordering; + use crate::DBig; + use num_order::{NumHash, NumOrd}; + + /// Default binary FBig (Zero rounding, base 2). + type FBin = FBig; + + /// Hash a `NumHash` value to u64 for comparison. + fn num_hash(value: &T) -> u64 { + use std::collections::hash_map::DefaultHasher; + use std::hash::Hasher; + let mut hasher = DefaultHasher::new(); + value.num_hash(&mut hasher); + hasher.finish() + } + + // -- NumOrd for Repr (same base) -- + + #[test] + fn test_num_ord_repr_zero_vs_neg_zero() { + // +0 == -0 (IEEE 754) + assert_eq!( + Repr::<2>::zero().num_cmp(&Repr::<2>::neg_zero()), + Ordering::Equal + ); + assert_eq!( + Repr::<2>::neg_zero().num_cmp(&Repr::<2>::zero()), + Ordering::Equal + ); + } + + #[test] + fn test_num_ord_repr_neg_zero_vs_finite() { + let one = Repr::<2>::one(); + let neg_one = Repr::<2>::neg_one(); + // -0 < positive + assert_eq!(Repr::<2>::neg_zero().num_cmp(&one), Ordering::Less); + // -0 > negative + assert_eq!(Repr::<2>::neg_zero().num_cmp(&neg_one), Ordering::Greater); + } + + #[test] + fn test_num_ord_repr_infinities() { + // +inf > -inf + assert_eq!( + Repr::<2>::infinity().num_cmp(&Repr::<2>::neg_infinity()), + Ordering::Greater + ); + // -inf < +inf + assert_eq!( + Repr::<2>::neg_infinity().num_cmp(&Repr::<2>::infinity()), + Ordering::Less + ); + // +inf == +inf + assert_eq!( + Repr::<2>::infinity().num_cmp(&Repr::<2>::infinity()), + Ordering::Equal + ); + // -inf == -inf + assert_eq!( + Repr::<2>::neg_infinity().num_cmp(&Repr::<2>::neg_infinity()), + Ordering::Equal + ); + } + + #[test] + fn test_num_ord_repr_zero_vs_infinity() { + // +0 < +inf + assert_eq!( + Repr::<2>::zero().num_cmp(&Repr::<2>::infinity()), + Ordering::Less + ); + // -0 < +inf + assert_eq!( + Repr::<2>::neg_zero().num_cmp(&Repr::<2>::infinity()), + Ordering::Less + ); + // +0 > -inf + assert_eq!( + Repr::<2>::zero().num_cmp(&Repr::<2>::neg_infinity()), + Ordering::Greater + ); + // -0 > -inf + assert_eq!( + Repr::<2>::neg_zero().num_cmp(&Repr::<2>::neg_infinity()), + Ordering::Greater + ); + } + + // -- NumOrd for Repr (cross-base) -- + + #[test] + fn test_num_ord_repr_cross_base_zero() { + // Base-2 neg_zero == Base-10 zero + assert_eq!( + Repr::<2>::neg_zero().num_cmp(&Repr::<10>::zero()), + Ordering::Equal + ); + // Base-2 neg_zero == Base-10 neg_zero + assert_eq!( + Repr::<2>::neg_zero().num_cmp(&Repr::<10>::neg_zero()), + Ordering::Equal + ); + } + + #[test] + fn test_num_ord_repr_cross_base_infinity() { + // Base-2 +inf == Base-10 +inf + assert_eq!( + Repr::<2>::infinity().num_cmp(&Repr::<10>::infinity()), + Ordering::Equal + ); + // Base-2 +inf > Base-10 -inf + assert_eq!( + Repr::<2>::infinity().num_cmp(&Repr::<10>::neg_infinity()), + Ordering::Greater + ); + // Base-2 -inf == Base-10 -inf + assert_eq!( + Repr::<2>::neg_infinity().num_cmp(&Repr::<10>::neg_infinity()), + Ordering::Equal + ); + } + + // -- NumOrd for FBig -- + + #[test] + fn test_num_ord_fbig_neg_zero() { + let negz: FBin = FBig::from_repr_const(Repr::<2>::neg_zero()); + let posz = FBin::ZERO; + assert_eq!(negz.num_cmp(&posz), Ordering::Equal); + assert_eq!(posz.num_cmp(&negz), Ordering::Equal); + + // -0 < +1, -0 > -1 + assert_eq!(negz.num_cmp(&FBin::ONE), Ordering::Less); + assert_eq!(negz.num_cmp(&FBin::NEG_ONE), Ordering::Greater); + } + + #[test] + fn test_num_ord_fbig_cross_base_zero() { + let negz: FBin = FBig::from_repr_const(Repr::<2>::neg_zero()); + assert_eq!(negz.num_cmp(&DBig::ZERO), Ordering::Equal); + assert_eq!(DBig::ZERO.num_cmp(&negz), Ordering::Equal); + } + + // -- NumHash for Repr -- + + #[test] + fn test_num_hash_repr_zero_neg_zero_equal() { + // +0 and -0 compare equal, so they must hash the same + assert_eq!( + num_hash(&Repr::<2>::zero()), + num_hash(&Repr::<2>::neg_zero()) + ); + assert_eq!( + num_hash(&Repr::<10>::zero()), + num_hash(&Repr::<10>::neg_zero()) + ); + } + + #[test] + fn test_num_hash_repr_infinities_same_sign() { + // Same-sign infinities hash the same + assert_eq!( + num_hash(&Repr::<2>::infinity()), + num_hash(&Repr::<10>::infinity()) + ); + assert_eq!( + num_hash(&Repr::<2>::neg_infinity()), + num_hash(&Repr::<10>::neg_infinity()) + ); + } + + #[test] + fn test_num_hash_repr_zero_matches_integer_zero() { + // +0 and -0 should hash the same as integer zero + assert_eq!(num_hash(&Repr::<2>::zero()), num_hash(&0i128)); + assert_eq!(num_hash(&Repr::<2>::neg_zero()), num_hash(&0i128)); + } + + // -- NumHash for FBig -- + + #[test] + fn test_num_hash_fbig_neg_zero() { + let negz: FBin = FBig::from_repr_const(Repr::<2>::neg_zero()); + assert_eq!(num_hash(&negz), num_hash(&FBin::ZERO)); + } + + #[test] + fn test_num_hash_fbig_cross_base_zero() { + let negz: FBin = FBig::from_repr_const(Repr::<2>::neg_zero()); + assert_eq!(num_hash(&negz), num_hash(&DBig::ZERO)); + } +} diff --git a/float/src/third_party/num_traits.rs b/float/src/third_party/num_traits.rs index 0c93e7e..8b780a9 100644 --- a/float/src/third_party/num_traits.rs +++ b/float/src/third_party/num_traits.rs @@ -173,7 +173,7 @@ impl num_traits::Signed for FBig { #[inline] fn is_positive(&self) -> bool { - !self.repr.is_zero() && self.repr.sign() == Sign::Positive + self.repr.sign() == Sign::Positive } #[inline] diff --git a/float/src/third_party/rand.rs b/float/src/third_party/rand.rs index 01f2278..36aaeb2 100644 --- a/float/src/third_party/rand.rs +++ b/float/src/third_party/rand.rs @@ -77,10 +77,9 @@ impl UniformFBig { // so that we can ensure we don't reach the right bound. let unit: FBig = self.sampler.sample01::(rng); let context = unit.context(); - let scaled = context.mul(unit.repr(), &self.scale).value(); + let scaled = context.unwrap_fp(context.mul(unit.repr(), &self.scale)); context - .add(scaled.repr(), &self.offset) - .value() + .unwrap_fp(context.add(scaled.repr(), &self.offset)) .with_rounding() } } diff --git a/float/src/utils.rs b/float/src/utils.rs index 52b45e8..fcae8e3 100644 --- a/float/src/utils.rs +++ b/float/src/utils.rs @@ -169,6 +169,19 @@ pub const fn ilog_exact(n: Word, base: Word) -> u32 { } } +/// `ceil(x) as usize` without `f64::ceil`, which is `std`-only on this crate's +/// MSRV (the `f64` inherent methods only landed in `core` in Rust 1.85). Valid for +/// non-negative `x` within `usize` range. +#[inline] +pub(crate) fn ceil_usize(x: f32) -> usize { + let i = x as usize; + if x > i as f32 { + i + 1 + } else { + i + } +} + /// Factor `b` as `newb^a * r`, returning `(a, r)` where `gcd(r, newb) = 1`. /// Returns `(0, b)` if `newb` does not divide `b`. pub const fn factor_base(b: Word, newb: Word) -> (usize, Word) { diff --git a/float/tests/add.rs b/float/tests/add.rs index dd52cea..b266dc3 100644 --- a/float/tests/add.rs +++ b/float/tests/add.rs @@ -91,7 +91,7 @@ fn test_add_binary() { context.sub(c.repr(), a.repr()), context.sub(c.repr(), b.repr()), ) { - (Exact(vc), Exact(vb), Exact(va)) => { + (Ok(Exact(vc)), Ok(Exact(vb)), Ok(Exact(va))) => { assert_eq!(va, *a); assert_eq!(vb, *b); assert_eq!(vc, *c); @@ -113,7 +113,7 @@ fn test_add_binary() { test_add(a, b, c); test_add(b, a, c); - if let Inexact(v, e) = Context::max(a.context(), b.context()).add(a.repr(), b.repr()) { + if let Ok(Inexact(v, e)) = Context::max(a.context(), b.context()).add(a.repr(), b.repr()) { assert_eq!(v, *c); assert_eq!(e, *rnd); } else { @@ -154,7 +154,7 @@ fn test_add_decimal() { context.sub(c.repr(), a.repr()), context.sub(c.repr(), b.repr()), ) { - (Exact(vc), Exact(vb), Exact(va)) => { + (Ok(Exact(vc)), Ok(Exact(vb)), Ok(Exact(va))) => { assert_eq!(va, *a); assert_eq!(vb, *b); assert_eq!(vc, *c); @@ -190,7 +190,7 @@ fn test_add_decimal() { test_add(a, b, c); test_add(b, a, c); - if let Inexact(v, e) = Context::max(a.context(), b.context()).add(a.repr(), b.repr()) { + if let Ok(Inexact(v, e)) = Context::max(a.context(), b.context()).add(a.repr(), b.repr()) { assert_eq!(v, *c); assert_eq!(e, *rnd); } else { @@ -213,7 +213,7 @@ fn test_sub_binary() { for (a, b, c, rnd) in &inexact_cases { test_sub(a, b, c); - if let Inexact(v, e) = Context::max(a.context(), b.context()).sub(a.repr(), b.repr()) { + if let Ok(Inexact(v, e)) = Context::max(a.context(), b.context()).sub(a.repr(), b.repr()) { assert_eq!(v, *c); assert_eq!(e, *rnd); } else { @@ -240,7 +240,7 @@ fn test_sub_decimal() { for (a, b, c) in &exact_cases { test_sub(a, b, c); - if let Exact(v) = Context::max(a.context(), b.context()).sub(a.repr(), b.repr()) { + if let Ok(Exact(v)) = Context::max(a.context(), b.context()).sub(a.repr(), b.repr()) { assert_eq!(v, *c); } else { panic!("the result should be exact!") @@ -263,7 +263,7 @@ fn test_sub_decimal() { for (a, b, c, rnd) in &inexact_cases { test_sub(a, b, c); - if let Inexact(v, e) = Context::max(a.context(), b.context()).sub(a.repr(), b.repr()) { + if let Ok(Inexact(v, e)) = Context::max(a.context(), b.context()).sub(a.repr(), b.repr()) { assert_eq!(v, *c); assert_eq!(e, *rnd); } else { @@ -385,18 +385,46 @@ fn test_sub_zero_operand_directed_rounding() { // b = -1234 → 0 - b = +1234, rounded to precision 2: // Up (toward +∞): 1300 Down (toward −∞): 1200 let b = repr(-1234, 0); - assert_eq!(Context::::new(2).sub(&zero, &b).value().repr(), &repr(13, 2)); - assert_eq!(Context::::new(2).sub(&zero, &b).value().repr(), &repr(12, 2)); + assert_eq!(Context::::new(2).sub(&zero, &b).unwrap().value().repr(), &repr(13, 2)); + assert_eq!( + Context::::new(2) + .sub(&zero, &b) + .unwrap() + .value() + .repr(), + &repr(12, 2) + ); // b = +1234 → 0 - b = -1234, rounded to precision 2: // Up (toward +∞): -1200 Down (toward −∞): -1300 let b = repr(1234, 0); - assert_eq!(Context::::new(2).sub(&zero, &b).value().repr(), &repr(-12, 2)); - assert_eq!(Context::::new(2).sub(&zero, &b).value().repr(), &repr(-13, 2)); + assert_eq!(Context::::new(2).sub(&zero, &b).unwrap().value().repr(), &repr(-12, 2)); + assert_eq!( + Context::::new(2) + .sub(&zero, &b) + .unwrap() + .value() + .repr(), + &repr(-13, 2) + ); // The symmetric `Away` mode is unaffected by the negation order: ±1234 → ±1300. let b = repr(-1234, 0); - assert_eq!(Context::::new(2).sub(&zero, &b).value().repr(), &repr(13, 2)); + assert_eq!( + Context::::new(2) + .sub(&zero, &b) + .unwrap() + .value() + .repr(), + &repr(13, 2) + ); let b = repr(1234, 0); - assert_eq!(Context::::new(2).sub(&zero, &b).value().repr(), &repr(-13, 2)); + assert_eq!( + Context::::new(2) + .sub(&zero, &b) + .unwrap() + .value() + .repr(), + &repr(-13, 2) + ); } diff --git a/float/tests/div.rs b/float/tests/div.rs index 61bcfea..42ae6fd 100644 --- a/float/tests/div.rs +++ b/float/tests/div.rs @@ -75,7 +75,7 @@ fn test_div_binary() { for (a, b, c) in &exact_cases { test_div(a, b, c); - if let Exact(v) = Context::max(a.context(), b.context()).div(a.repr(), b.repr()) { + if let Ok(Exact(v)) = Context::max(a.context(), b.context()).div(a.repr(), b.repr()) { assert_eq!(v, *c); } else { panic!("the result should be exact!") @@ -95,7 +95,7 @@ fn test_div_binary() { for (a, b, c) in &inexact_cases { test_div(a, b, c); - if let Inexact(v, e) = Context::max(a.context(), b.context()).div(a.repr(), b.repr()) { + if let Ok(Inexact(v, e)) = Context::max(a.context(), b.context()).div(a.repr(), b.repr()) { assert_eq!(v, *c); assert_eq!(e, NoOp); } else { @@ -119,7 +119,7 @@ fn test_div_decimal() { for (a, b, c) in &exact_cases { test_div(a, b, c); - if let Exact(v) = Context::max(a.context(), b.context()).div(a.repr(), b.repr()) { + if let Ok(Exact(v)) = Context::max(a.context(), b.context()).div(a.repr(), b.repr()) { assert_eq!(v, *c); } else { panic!("the result should be exact!") @@ -141,7 +141,7 @@ fn test_div_decimal() { for (a, b, c, rnd) in &inexact_cases { test_div(a, b, c); - if let Inexact(v, e) = Context::max(a.context(), b.context()).div(a.repr(), b.repr()) { + if let Ok(Inexact(v, e)) = Context::max(a.context(), b.context()).div(a.repr(), b.repr()) { assert_eq!(v, *c); assert_eq!(e, *rnd); } else { diff --git a/float/tests/exp.rs b/float/tests/exp.rs index a442a4f..056d20c 100644 --- a/float/tests/exp.rs +++ b/float/tests/exp.rs @@ -23,7 +23,7 @@ fn test_powi_binary() { ]; for (base, exp, pow) in &exact_cases { assert_eq!(base.powi(exp.clone()), *pow); - if let Exact(v) = base.context().powi(base.repr(), exp.clone()) { + if let Ok(Exact(v)) = base.context().powi(base.repr(), exp.clone()) { assert_eq!(v, *pow); } else { panic!("the result should be exact!") @@ -45,7 +45,7 @@ fn test_powi_binary() { for (base, exp, pow) in &inexact_cases { assert_eq!(base.powi(exp.clone()), *pow); - if let Inexact(v, e) = base.context().powi(base.repr(), exp.clone()) { + if let Ok(Inexact(v, e)) = base.context().powi(base.repr(), exp.clone()) { assert_eq!(v, *pow); assert_eq!(e, NoOp); } else { @@ -72,7 +72,7 @@ fn test_powi_decimal() { ]; for (base, exp, pow) in &exact_cases { assert_eq!(base.powi(exp.clone()), *pow); - if let Exact(v) = base.context().powi(base.repr(), exp.clone()) { + if let Ok(Exact(v)) = base.context().powi(base.repr(), exp.clone()) { assert_eq!(v, *pow); } else { panic!("the result should be exact!") @@ -94,7 +94,7 @@ fn test_powi_decimal() { for (base, exp, pow, rnd) in &inexact_cases { assert_eq!(base.powi(exp.clone()), *pow); - if let Inexact(v, e) = base.context().powi(base.repr(), exp.clone()) { + if let Ok(Inexact(v, e)) = base.context().powi(base.repr(), exp.clone()) { assert_eq!(v, *pow); assert_eq!(e, *rnd); } else { @@ -157,7 +157,7 @@ fn test_exp_binary() { ]; for (exp, pow) in &inexact_cases { assert_eq!(exp.exp(), *pow); - if let Inexact(v, e) = exp.context().exp(exp.repr()) { + if let Ok(Inexact(v, e)) = exp.context().exp(exp.repr(), None) { assert_eq!(v, *pow); assert_eq!(e, NoOp); } else { @@ -206,7 +206,7 @@ fn test_exp_decimal() { ]; for (exp, pow, rnd) in &inexact_cases { assert_eq!(exp.exp(), *pow); - if let Inexact(v, e) = exp.context().exp(exp.repr()) { + if let Ok(Inexact(v, e)) = exp.context().exp(exp.repr(), None) { assert_eq!(v, *pow); assert_eq!(e, *rnd); } else { @@ -222,9 +222,10 @@ fn test_exp_unlimited_precision() { } #[test] -#[should_panic] fn test_exp_inf() { - let _ = DBig::INFINITY.exp(); + // exp(+inf) = +inf, exp(-inf) = +0 + assert_eq!(DBig::INFINITY.exp(), DBig::INFINITY); + assert_eq!(DBig::NEG_INFINITY.exp(), DBig::ZERO); } #[test] @@ -265,7 +266,7 @@ fn test_exp_m1_binary() { for (exp, pow) in &inexact_cases { assert_eq!(exp.exp_m1(), *pow); - if let Inexact(v, e) = exp.context().exp_m1(exp.repr()) { + if let Ok(Inexact(v, e)) = exp.context().exp_m1(exp.repr(), None) { assert_eq!(v, *pow); assert_eq!(e, NoOp); } else { @@ -315,7 +316,7 @@ fn test_exp_m1_decimal() { for (exp, pow, rnd) in &inexact_cases { assert_eq!(exp.exp_m1(), *pow); - if let Inexact(v, e) = exp.context().exp_m1(exp.repr()) { + if let Ok(Inexact(v, e)) = exp.context().exp_m1(exp.repr(), None) { assert_eq!(v, *pow); assert_eq!(e, *rnd); } else { @@ -331,9 +332,10 @@ fn test_exp_m1_unlimited_precision() { } #[test] -#[should_panic] fn test_exp_m1_inf() { - let _ = DBig::INFINITY.exp_m1(); + // exp_m1(+inf) = +inf, exp_m1(-inf) = -1 + assert_eq!(DBig::INFINITY.exp_m1(), DBig::INFINITY); + assert_eq!(DBig::NEG_INFINITY.exp_m1(), -DBig::ONE); } #[test] @@ -370,13 +372,13 @@ fn test_powf_binary() { for (x, pow, npow) in &xx_inexact_cases { assert_eq!(x.powf(x), *pow); assert_eq!(x.powf(&-x), *npow); - if let Inexact(v, e) = x.context().powf(x.repr(), x.repr()) { + if let Ok(Inexact(v, e)) = x.context().powf(x.repr(), x.repr(), None) { assert_eq!(v, *pow); assert_eq!(e, NoOp); } else { panic!("the result should be inexact!") } - if let Inexact(v, e) = x.context().powf(x.repr(), (-x).repr()) { + if let Ok(Inexact(v, e)) = x.context().powf(x.repr(), (-x).repr(), None) { assert_eq!(v, *npow); assert_eq!(e, NoOp); } else { @@ -412,13 +414,13 @@ fn test_powf_decimal() { for (x, pow, rnd, npow, nrnd) in &xx_inexact_cases { assert_eq!(x.powf(x), *pow); assert_eq!(x.powf(&-x), *npow); - if let Inexact(v, e) = x.context().powf(x.repr(), x.repr()) { + if let Ok(Inexact(v, e)) = x.context().powf(x.repr(), x.repr(), None) { assert_eq!(v, *pow); assert_eq!(e, *rnd); } else { panic!("the result should be inexact!") } - if let Inexact(v, e) = x.context().powf(x.repr(), (-x).repr()) { + if let Ok(Inexact(v, e)) = x.context().powf(x.repr(), (-x).repr(), None) { assert_eq!(v, *npow); assert_eq!(e, *nrnd); } else { diff --git a/float/tests/fpresult.rs b/float/tests/fpresult.rs new file mode 100644 index 0000000..c073617 --- /dev/null +++ b/float/tests/fpresult.rs @@ -0,0 +1,194 @@ +//! Tests for the `FpResult` contract: infinite inputs → `Err`, infinite outputs → `Ok(±inf)`, +//! and domain/indeterminate errors. + +use dashu_base::Approximation::*; +use dashu_base::Sign; +use dashu_float::{ + math::{FpError, FpResult}, + round::mode, + Context, FBig, Repr, +}; +use dashu_int::IBig; + +fn r2(sig: i32, exp: isize) -> Repr<2> { + Repr::new(sig.into(), exp) +} + +#[test] +fn test_div_by_zero_is_infinity() { + let ctx = Context::::new(53); + // finite / 0 = ±inf (a value, not an error); sign = XOR + let pos = ctx.div::<2>(&r2(1, 0), &Repr::<2>::zero()).unwrap().value(); + assert!(pos.repr().is_infinite()); + assert_eq!(pos.repr().sign(), Sign::Positive); + + let neg = ctx + .div::<2>(&r2(-1, 0), &Repr::<2>::zero()) + .unwrap() + .value(); + assert_eq!(neg.repr().sign(), Sign::Negative); + + // 1 / -0 = -inf + let neg2 = ctx + .div::<2>(&r2(1, 0), &Repr::<2>::neg_zero()) + .unwrap() + .value(); + assert_eq!(neg2.repr().sign(), Sign::Negative); +} + +#[test] +fn test_zero_over_zero_is_indeterminate() { + let ctx = Context::::new(53); + assert_eq!( + ctx.div::<2>(&Repr::<2>::zero(), &Repr::<2>::zero()), + Err(FpError::Indeterminate) + ); +} + +#[test] +fn test_inv_zero_is_infinity() { + let ctx = Context::::new(53); + let r = ctx.inv::<2>(&Repr::<2>::zero()).unwrap().value(); + assert!(r.repr().is_infinite()); + assert_eq!(r.repr().sign(), Sign::Positive); +} + +#[test] +fn test_ln_zero_is_neg_infinity() { + let ctx = Context::::new(53); + let r = ctx.ln::<2>(&Repr::<2>::zero(), None).unwrap().value(); + assert!(r.repr().is_infinite()); + assert_eq!(r.repr().sign(), Sign::Negative); +} + +#[test] +fn test_domain_errors() { + let ctx = Context::::new(53); + assert_eq!(ctx.sqrt::<2>(&r2(-1, 0)), Err(FpError::OutOfDomain)); + assert_eq!(ctx.ln::<2>(&r2(-1, 0), None), Err(FpError::OutOfDomain)); + assert_eq!(ctx.asin::<2>(&r2(2, 0), None), Err(FpError::OutOfDomain)); + assert_eq!( + ctx.atan2::<2>(&Repr::<2>::zero(), &Repr::<2>::zero(), None), + Err(FpError::OutOfDomain) + ); +} + +#[test] +fn test_infinite_input_is_error() { + let ctx = Context::::new(53); + let inf = Repr::<2>::infinity(); + assert_eq!(ctx.add::<2>(&inf, &r2(1, 0)), Err(FpError::InfiniteInput)); + assert_eq!(ctx.mul::<2>(&inf, &r2(1, 0)), Err(FpError::InfiniteInput)); + assert_eq!(ctx.sqrt::<2>(&inf), Err(FpError::InfiniteInput)); + // exp(+inf) = +inf, exp(-inf) = +0 + assert!(ctx.exp::<2>(&inf, None).unwrap().value().repr().is_infinite()); + assert_eq!(ctx.exp::<2>(&inf, None).unwrap().value().repr().sign(), Sign::Positive); + assert!(ctx.exp::<2>(&Repr::<2>::neg_infinity(), None).unwrap().value().repr().is_zero()); + assert_eq!(ctx.sin::<2>(&inf, None), Err(FpError::InfiniteInput)); +} + +#[test] +fn test_atan_infinity_is_preserved() { + let ctx = Context::::new(53); + // atan(±inf) = ±π/2 — a finite result, preserved (not an error) + let r = ctx.atan::<2>(&Repr::<2>::infinity(), None).unwrap().value(); + assert!(r.repr().sign() == Sign::Positive); + // it should be approximately π/2 + assert!(r > FBig::::ONE); +} + +#[test] +fn test_fbig_div_zero_produces_infinity() { + // FBig convenience layer: 1 / 0 yields an infinity-valued FBig (no panic). + let one = FBig::::try_from(1.0f64).unwrap(); + let zero = FBig::::try_from(0.0f64).unwrap(); + let inf = one / zero; + assert!(inf.repr().is_infinite()); +} + +#[test] +#[should_panic] +fn test_fbig_zero_over_zero_panics() { + // 0 / 0 is indeterminate; the FBig layer panics. + let zero = FBig::::try_from(0.0f64).unwrap(); + let _ = zero.clone() / zero; +} + +#[test] +#[should_panic] +fn test_fbig_sqrt_negative_panics() { + // sqrt(-1) is out of domain; the FBig layer panics. + let neg_one = FBig::::try_from(-1.0f64).unwrap(); + use dashu_base::SquareRoot; + let _ = neg_one.sqrt(); +} + +#[test] +fn test_fpresult_type_alias() { + // FpResult is Result, FpError>. + let r: FpResult = Ok(Exact(FBig::ZERO)); + assert!(r.is_ok()); +} + +#[test] +fn test_exp_overflow_is_infinity() { + let ctx = Context::::new(53); + // exp(huge) overflows the isize exponent range -> Overflow error at Context level, + // converted to +inf by the FBig convenience layer. + // Need x large enough that floor(x/ln2) > isize::MAX, i.e. x > ~2^62.5. + let huge = Repr::new(IBig::from(1) << 63, 0); + assert_eq!( + ctx.exp::<2>(&huge, None), + Err(FpError::Overflow(Sign::Positive)) + ); + + // exp(huge negative) underflows to +0 + let neg = Repr::new(-(IBig::from(1) << 63), 0); + assert_eq!( + ctx.exp::<2>(&neg, None), + Err(FpError::Underflow(Sign::Positive)) + ); + + // exp_m1(huge negative) -> -1 (a finite value, not an error) + let m1 = ctx.exp_m1::<2>(&neg, None).unwrap().value(); + assert_eq!(m1, -FBig::::ONE); +} + +#[test] +fn test_powf_zero_base() { + use dashu_float::DBig; + // powf with a float exponent returns the *positive* result on a zero base + // (matching the common float-pow convention); use powi for the signed result. + let ctx = Context::::new(53); + // powf(-0, 2.0) = +0 (NOT -0) + let r = ctx + .powf::<2>(&Repr::<2>::neg_zero(), &Repr::new(2.into(), 0), None) + .unwrap() + .value(); + assert!(r.repr().is_zero(), "expected +0"); + assert!(!r.repr().is_neg_zero(), "powf(-0, x) should be +0, not -0"); + // powf(0, -1) = +inf + let r = ctx + .powf::<2>(&Repr::<2>::zero(), &Repr::new((-1i32).into(), 0), None) + .unwrap() + .value(); + assert!(r.repr().is_infinite()); + assert_eq!(r.repr().sign(), Sign::Positive); + + // powi(-0, 3) = -0 (the sign-correct, integer-exponent variant) + let r = ctx + .powi::<2>(&Repr::<2>::neg_zero(), 3.into()) + .unwrap() + .value(); + assert!(r.repr().is_neg_zero()); + let _ = DBig::ZERO; +} + +#[test] +fn test_shr_assign_shifts_once() { + // Regression: shr_assign previously subtracted rhs twice. + let mut x = FBig::::try_from(8.0f64).unwrap(); // 2^3 + x >>= 1; // 2^2 = 4 + let y = FBig::::try_from(4.0f64).unwrap(); + assert_eq!(x, y); +} diff --git a/float/tests/log.rs b/float/tests/log.rs index e94b23f..05aac91 100644 --- a/float/tests/log.rs +++ b/float/tests/log.rs @@ -37,7 +37,7 @@ fn test_ln_binary() { ]; for (x, ln) in &inexact_cases { assert_eq!(x.ln(), *ln); - if let Inexact(v, e) = x.context().ln(x.repr()) { + if let Ok(Inexact(v, e)) = x.context().ln(x.repr(), None) { assert_eq!(v, *ln); assert_eq!(e, NoOp); } else { @@ -80,7 +80,7 @@ fn test_ln_decimal() { ]; for (x, ln, rnd) in &inexact_cases { assert_eq!(x.ln(), *ln); - if let Inexact(v, e) = x.context().ln(x.repr()) { + if let Ok(Inexact(v, e)) = x.context().ln(x.repr(), None) { assert_eq!(v, *ln); assert_eq!(e, *rnd); } else { @@ -134,7 +134,7 @@ fn test_ln_1p_binary() { ]; for (x, ln) in &inexact_cases { assert_eq!(x.ln_1p(), *ln); - if let Inexact(v, e) = x.context().ln_1p(x.repr()) { + if let Ok(Inexact(v, e)) = x.context().ln_1p(x.repr(), None) { assert_eq!(v, *ln); assert_eq!(e, NoOp); } else { @@ -178,7 +178,7 @@ fn test_ln_1p_decimal() { ]; for (x, ln, rnd) in &inexact_cases { assert_eq!(x.ln_1p(), *ln); - if let Inexact(v, e) = x.context().ln_1p(x.repr()) { + if let Ok(Inexact(v, e)) = x.context().ln_1p(x.repr(), None) { assert_eq!(v, *ln); assert_eq!(e, *rnd); } else { diff --git a/float/tests/mul.rs b/float/tests/mul.rs index 2aedfea..556cddb 100644 --- a/float/tests/mul.rs +++ b/float/tests/mul.rs @@ -50,7 +50,7 @@ fn test_mul_binary() { test_mul(a, b, c); test_mul(b, a, c); - if let Exact(v) = Context::max(a.context(), b.context()).mul(a.repr(), b.repr()) { + if let Ok(Exact(v)) = Context::max(a.context(), b.context()).mul(a.repr(), b.repr()) { assert_eq!(v, *c); } else { panic!("the result should be exact!") @@ -68,7 +68,7 @@ fn test_mul_binary() { test_mul(a, b, c); test_mul(b, a, c); - if let Inexact(v, e) = Context::max(a.context(), b.context()).mul(a.repr(), b.repr()) { + if let Ok(Inexact(v, e)) = Context::max(a.context(), b.context()).mul(a.repr(), b.repr()) { assert_eq!(v, *c); assert_eq!(e, NoOp); } else { @@ -96,7 +96,7 @@ fn test_mul_decimal() { test_mul(a, b, c); test_mul(b, a, c); - if let Exact(v) = Context::max(a.context(), b.context()).mul(a.repr(), b.repr()) { + if let Ok(Exact(v)) = Context::max(a.context(), b.context()).mul(a.repr(), b.repr()) { assert_eq!(v, *c); } else { panic!("the result should be exact!") @@ -115,7 +115,7 @@ fn test_mul_decimal() { test_mul(a, b, c); test_mul(b, a, c); - if let Inexact(v, e) = Context::max(a.context(), b.context()).mul(a.repr(), b.repr()) { + if let Ok(Inexact(v, e)) = Context::max(a.context(), b.context()).mul(a.repr(), b.repr()) { assert_eq!(v, *c); assert_eq!(e, *rnd); } else { diff --git a/float/tests/root.rs b/float/tests/root.rs index 9bb90d9..88a9d8c 100644 --- a/float/tests/root.rs +++ b/float/tests/root.rs @@ -19,7 +19,7 @@ fn test_sqrt_binary() { ]; for (x, sqrt) in &exact_cases { assert_eq!(x.sqrt(), *sqrt); - if let Exact(v) = x.context().sqrt(x.repr()) { + if let Ok(Exact(v)) = x.context().sqrt(x.repr()) { assert_eq!(v, *sqrt); } else { panic!("the result should be exact!") @@ -42,7 +42,7 @@ fn test_sqrt_binary() { for (x, root) in &inexact_cases { assert_eq!(x.sqrt(), *root); - if let Inexact(v, e) = x.context().sqrt(x.repr()) { + if let Ok(Inexact(v, e)) = x.context().sqrt(x.repr()) { assert_eq!(v, *root); assert_eq!(e, NoOp); } else { @@ -61,7 +61,7 @@ fn test_sqrt_decimal() { ]; for (x, sqrt) in &exact_cases { assert_eq!(x.sqrt(), *sqrt); - if let Exact(v) = x.context().sqrt(x.repr()) { + if let Ok(Exact(v)) = x.context().sqrt(x.repr()) { assert_eq!(v, *sqrt); } else { panic!("the result should be exact!") @@ -84,7 +84,7 @@ fn test_sqrt_decimal() { ]; for (x, sqrt, rnd) in &inexact_cases { assert_eq!(x.sqrt(), *sqrt); - if let Inexact(v, e) = x.context().sqrt(x.repr()) { + if let Ok(Inexact(v, e)) = x.context().sqrt(x.repr()) { assert_eq!(v, *sqrt); assert_eq!(e, *rnd); } else { @@ -105,7 +105,7 @@ fn test_cbrt_binary() { ]; for (x, cbrt) in &exact_cases { assert_eq!(x.cbrt(), *cbrt); - if let Exact(v) = x.context().cbrt(x.repr()) { + if let Ok(Exact(v)) = x.context().cbrt(x.repr()) { assert_eq!(v, *cbrt); } else { panic!("the result should be exact!") @@ -125,7 +125,7 @@ fn test_cbrt_binary() { ]; for (x, root) in &inexact_cases { assert_eq!(x.cbrt(), *root); - if let Inexact(v, e) = x.context().cbrt(x.repr()) { + if let Ok(Inexact(v, e)) = x.context().cbrt(x.repr()) { assert_eq!(v, *root); assert_eq!(e, NoOp); } else { @@ -146,7 +146,7 @@ fn test_cbrt_decimal() { ]; for (x, cbrt) in &exact_cases { assert_eq!(x.cbrt(), *cbrt); - if let Exact(v) = x.context().cbrt(x.repr()) { + if let Ok(Exact(v)) = x.context().cbrt(x.repr()) { assert_eq!(v, *cbrt); } else { panic!("the result should be exact!") @@ -169,7 +169,7 @@ fn test_cbrt_decimal() { ]; for (x, cbrt, rnd) in &inexact_cases { assert_eq!(x.cbrt(), *cbrt); - if let Inexact(v, e) = x.context().cbrt(x.repr()) { + if let Ok(Inexact(v, e)) = x.context().cbrt(x.repr()) { assert_eq!(v, *cbrt); assert_eq!(e, *rnd); } else { @@ -193,7 +193,7 @@ fn test_nth_root_exact_decimal() { for (x, n, expected) in cases { assert_eq!(x.nth_root(*n), *expected, "nth_root({n}) of {x:?}"); match x.context().nth_root(*n, x.repr()) { - Exact(v) => assert_eq!(v, *expected), + Ok(Exact(v)) => assert_eq!(v, *expected), _ => panic!("the result should be exact!"), } // the identity case returns the value unchanged @@ -214,7 +214,7 @@ fn test_nth_root_exact_decimal() { ]; for (x, n, root, rnd) in &inexact_cases { assert_eq!(x.nth_root(*n), *root); - if let Inexact(v, e) = x.context().nth_root(*n, x.repr()) { + if let Ok(Inexact(v, e)) = x.context().nth_root(*n, x.repr()) { assert_eq!(v, *root); assert_eq!(e, *rnd); } else { diff --git a/float/tests/signed_zero.rs b/float/tests/signed_zero.rs new file mode 100644 index 0000000..a80d0bc --- /dev/null +++ b/float/tests/signed_zero.rs @@ -0,0 +1,208 @@ +//! Tests for IEEE-754 signed zero (`-0`) propagation across operations. + +use core::str::FromStr; + +use dashu_base::{Abs, ParseError, Sign}; +use dashu_float::{round::mode, Context, DBig, FBig, Repr}; + +mod helper_macros; + +/// The default binary FBig (Zero rounding mode). +type F = FBig; + +fn r2(significand: i32, exponent: isize) -> Repr<2> { + Repr::new(significand.into(), exponent) +} + +/// Helper: assert the value is a negative zero. +fn assert_neg_zero(v: &FBig) { + assert!(v.repr().is_neg_zero(), "expected -0, got {:?}", v.repr()); +} +/// Helper: assert the value is a positive zero. +fn assert_pos_zero(v: &FBig) { + assert!(v.repr().is_zero(), "expected +0, got {:?}", v.repr()); +} + +#[test] +fn test_f64_round_trip() { + // -0.0 round-trips through FBig + let negz: F = FBig::try_from(-0.0f64).unwrap(); + assert_neg_zero(&negz); + let posz: F = FBig::try_from(0.0f64).unwrap(); + assert_pos_zero(&posz); + + // back to f64 preserves the sign + assert!(negz.to_f64().value().is_sign_negative()); + assert!(!posz.to_f64().value().is_sign_negative()); +} + +#[test] +fn test_equality_and_order() { + let negz: F = FBig::try_from(-0.0f64).unwrap(); + let posz: F = FBig::try_from(0.0f64).unwrap(); + assert_eq!(negz, posz); // -0 == +0 + assert!(negz >= posz); // total order: -0 is not less than +0 + assert!(negz <= posz); + // Repr equality too + assert_eq!(Repr::<2>::neg_zero(), Repr::<2>::zero()); +} + +#[test] +fn test_neg_and_abs() { + let negz: F = FBig::try_from(-0.0f64).unwrap(); + let posz: F = FBig::try_from(0.0f64).unwrap(); + assert_pos_zero(&-negz.clone()); // -(-0) = +0 + assert_neg_zero(&-posz.clone()); // -(+0) = -0 + assert_pos_zero(&negz.abs()); // abs(-0) = +0 + assert_pos_zero(&posz.abs()); +} + +#[test] +fn test_signum() { + let negz: F = FBig::try_from(-0.0f64).unwrap(); + let posz: F = FBig::try_from(0.0f64).unwrap(); + // signum(±0) = +0 + assert_pos_zero(&negz.signum()); + assert_pos_zero(&posz.signum()); + assert_eq!(negz.sign(), Sign::Negative); + assert_eq!(posz.sign(), Sign::Positive); +} + +#[test] +fn test_mul_signed_zero() { + let f = |x: f64| -> F { FBig::try_from(x).unwrap() }; + // -0 * 5 = -0 ; -0 * -5 = +0 ; +0 * 5 = +0 ; +0 * -5 = -0 + let r = f(-0.0) * f(5.0); + assert_neg_zero(&r); + let r = f(-0.0) * f(-5.0); + assert_pos_zero(&r); + let r = f(0.0) * f(5.0); + assert_pos_zero(&r); + let r = f(0.0) * f(-5.0); + assert_neg_zero(&r); +} + +#[test] +fn test_div_signed_zero() { + let ctx = Context::::new(53); + let negz = ctx + .div::<2>(&Repr::<2>::neg_zero(), &r2(5, 0)) + .unwrap() + .value(); + assert!(negz.repr().is_neg_zero()); // -0 / 5 = -0 + let posz = ctx.div::<2>(&Repr::<2>::zero(), &r2(5, 0)).unwrap().value(); + assert!(posz.repr().is_zero()); // +0 / 5 = +0 +} + +#[test] +fn test_sqrt_signed_zero() { + let ctx = Context::::new(53); + let negz = ctx.sqrt::<2>(&Repr::<2>::neg_zero()).unwrap().value(); + assert!(negz.repr().is_neg_zero()); // sqrt(-0) = -0 + let posz = ctx.sqrt::<2>(&Repr::<2>::zero()).unwrap().value(); + assert!(posz.repr().is_zero()); // sqrt(+0) = +0 +} + +#[test] +fn test_trig_signed_zero() { + let ctx = Context::::new(53); + let sin_neg0 = ctx.sin::<2>(&Repr::<2>::neg_zero(), None).unwrap().value(); + assert!(sin_neg0.repr().is_neg_zero()); // sin(-0) = -0 + let sin_pos0 = ctx.sin::<2>(&Repr::<2>::zero(), None).unwrap().value(); + assert!(sin_pos0.repr().is_zero()); // sin(+0) = +0 + let tan_neg0 = ctx.tan::<2>(&Repr::<2>::neg_zero(), None).unwrap().value(); + assert!(tan_neg0.repr().is_neg_zero()); // tan(-0) = -0 + let cos_neg0 = ctx.cos::<2>(&Repr::<2>::neg_zero(), None).unwrap().value(); + assert_eq!(cos_neg0, FBig::::ONE); // cos(±0) = 1 +} + +#[test] +fn test_rounding_ops_signed_zero() -> Result<(), ParseError> { + // trunc / round sign of zero + let half_neg = DBig::from_str("-0.5")?; + assert!(half_neg.trunc().repr().is_neg_zero(), "trunc(-0.5) = -0"); + let third_neg = DBig::from_str("-0.3")?; + assert!(third_neg.round().repr().is_neg_zero(), "round(-0.3) = -0"); + + // -0 passes through ceil/floor/trunc unchanged + let neg_zero_d = -DBig::ZERO; + assert!(neg_zero_d.repr().is_neg_zero()); + assert!(neg_zero_d.ceil().repr().is_neg_zero(), "ceil(-0) = -0"); + assert!(neg_zero_d.floor().repr().is_neg_zero(), "floor(-0) = -0"); + assert!(neg_zero_d.trunc().repr().is_neg_zero(), "trunc(-0) = -0"); + + // fract of a negative integer is -0 + let neg_five = DBig::from_str("-5")?; + assert!(neg_five.fract().repr().is_neg_zero(), "fract(-5) = -0"); + Ok(()) +} + +#[test] +fn test_cancellation_under_down() -> Result<(), ParseError> { + // x + (-x) yields -0 only under roundTowardNegative (Down); +0 otherwise. + let three = DBig::from_str("3")?; + let neg_three = DBig::from_str("-3")?; + + let down = Context::::new(10); + let sum_down = down + .add::<10>(three.repr(), neg_three.repr()) + .unwrap() + .value(); + assert!(sum_down.repr().is_neg_zero(), "(-3)+3 under Down = -0"); + + let up = Context::::new(10); + let sum_up = up + .add::<10>(three.repr(), neg_three.repr()) + .unwrap() + .value(); + assert!(sum_up.repr().is_zero(), "(-3)+3 under Up = +0"); + + // subtraction a - a likewise + let sub_down = down.sub::<10>(three.repr(), three.repr()).unwrap().value(); + assert!(sub_down.repr().is_neg_zero(), "3-3 under Down = -0"); + + // The FBig operator path (Add/Sub traits) must agree: a - a under Down = -0. + let a = FBig::::from_str("3")?; + let a2 = FBig::::from_str("3")?; + let diff = a - a2; + assert!(diff.repr().is_neg_zero(), "FBig 3-3 under Down = -0"); + Ok(()) +} + +#[test] +fn test_powi_signed_zero() { + let ctx = Context::::new(53); + let negz = ctx + .powi::<2>(&Repr::<2>::neg_zero(), 3.into()) + .unwrap() + .value(); + assert!(negz.repr().is_neg_zero()); // (-0)^3 = -0 + let posz = ctx + .powi::<2>(&Repr::<2>::neg_zero(), 2.into()) + .unwrap() + .value(); + assert!(posz.repr().is_zero()); // (-0)^2 = +0 +} + +#[test] +fn test_num_traits_sign() { + use dashu_base::Signed; + let negz: F = FBig::try_from(-0.0f64).unwrap(); + let posz: F = FBig::try_from(0.0f64).unwrap(); + // is_positive/is_negative follow the sign bit (matching Rust's f64::is_sign_*): + // -0 is negative-signed, +0 is positive-signed. + assert!(!negz.is_positive()); + assert!(negz.is_negative()); + assert!(posz.is_positive()); + assert!(!posz.is_negative()); +} + +#[test] +fn test_ln_1p_signed_zero() { + let ctx = Context::::new(53); + let r = ctx + .ln_1p::<2>(&Repr::<2>::neg_zero(), None) + .unwrap() + .value(); + assert!(r.repr().is_neg_zero()); // ln_1p(-0) = -0 +} diff --git a/float/tests/trig.rs b/float/tests/trig.rs index e7274b9..2a5a736 100644 --- a/float/tests/trig.rs +++ b/float/tests/trig.rs @@ -36,11 +36,11 @@ fn test_sin_cos() { #[test] fn test_tan() { let x = DBig::ZERO.with_precision(30).value(); - assert_eq!(x.tan().value(&x.context()), DBig::ZERO); + assert_eq!(x.tan(), DBig::ZERO); let pi = DBig::pi(30); let pi4: DBig = pi / 4; - let tan_pi4 = pi4.tan().value(&pi4.context()); + let tan_pi4 = pi4.tan(); assert!((tan_pi4 - DBig::ONE).abs() < DBig::from_parts(1.into(), -29)); } @@ -59,21 +59,25 @@ fn test_atan() { #[test] fn test_asin_acos() { let x = DBig::ZERO.with_precision(30).value(); - assert_eq!(x.asin().value(&x.context()), DBig::ZERO); + assert_eq!(x.asin(), DBig::ZERO); let pi = DBig::pi(30); let half_pi: DBig = &pi / 2; - assert!((x.acos().value(&x.context()) - half_pi).abs() < DBig::from_parts(1.into(), -29)); + assert!((x.acos() - half_pi).abs() < DBig::from_parts(1.into(), -29)); let half = DBig::from_parts(5.into(), -1).with_precision(30).value(); - let asin_half = half.asin().value(&half.context()); + let asin_half = half.asin(); // asin(0.5) = pi/6 let pi6: DBig = &pi / 6; assert!((asin_half - pi6).abs() < DBig::from_parts(1.into(), -29)); +} - // Domain error test +#[test] +#[should_panic] +fn test_asin_out_of_domain_panics() { + // asin(|x| > 1) is out of domain; the FBig convenience layer panics. let two = DBig::from_parts(2.into(), 0).with_precision(10).value(); - assert!(two.asin().is_nan()); + let _ = two.asin(); } #[test] @@ -84,30 +88,26 @@ fn test_atan2() { let pi = DBig::pi(30); // atan2(0, 1) = 0 - assert_eq!(zero.atan2(&one).value(&zero.context()), zero); + assert_eq!(zero.atan2(&one), zero); // atan2(1, 0) = pi/2 let half_pi: DBig = &pi / 2; - assert!( - (one.atan2(&zero).value(&one.context()) - half_pi.clone()).abs() - < DBig::from_parts(1.into(), -29) - ); + assert!((one.atan2(&zero) - half_pi.clone()).abs() < DBig::from_parts(1.into(), -29)); // atan2(0, -1) = pi - assert!( - (zero.atan2(&neg_one).value(&zero.context()) - &pi).abs() < DBig::from_parts(1.into(), -29) - ); + assert!((zero.atan2(&neg_one) - &pi).abs() < DBig::from_parts(1.into(), -29)); // atan2(-1, 0) = -pi/2 let m_half_pi: DBig = -half_pi; - assert!( - (neg_one.atan2(&zero).value(&neg_one.context()) - m_half_pi).abs() - < DBig::from_parts(1.into(), -29) - ); + assert!((neg_one.atan2(&zero) - m_half_pi).abs() < DBig::from_parts(1.into(), -29)); +} - // Undefined case +#[test] +#[should_panic] +fn test_atan2_zero_zero_panics() { + // atan2(0, 0) is indeterminate; the FBig convenience layer panics. let z0 = DBig::ZERO.with_precision(10).value(); - assert!(z0.atan2(&z0).is_nan()); + let _ = z0.atan2(&z0); } #[test] @@ -116,27 +116,27 @@ fn test_atan2_infinities() { let ctx = x.context(); let inf = Repr::infinity(); let neg_inf = Repr::neg_infinity(); - let pi = ctx.pi::<10>().value(); + let pi = ctx.pi::<10>(None).value(); let pi_4 = &pi / 4; let pi_3_4 = &pi * 3 / 4; // atan2(+inf, +inf) = pi/4 - let res: DBig = ctx.atan2(&inf, &inf).value(&ctx); + let res: DBig = ctx.atan2(&inf, &inf, None).unwrap().value(); let diff: DBig = res - &pi_4; assert!(diff.abs() < DBig::from_parts(1.into(), -29)); // atan2(+inf, -inf) = 3pi/4 - let res: DBig = ctx.atan2(&inf, &neg_inf).value(&ctx); + let res: DBig = ctx.atan2(&inf, &neg_inf, None).unwrap().value(); let diff: DBig = res - &pi_3_4; assert!(diff.abs() < DBig::from_parts(1.into(), -29)); // atan2(-inf, +inf) = -pi/4 - let res: DBig = ctx.atan2(&neg_inf, &inf).value(&ctx); + let res: DBig = ctx.atan2(&neg_inf, &inf, None).unwrap().value(); let diff: DBig = res + &pi_4; assert!(diff.abs() < DBig::from_parts(1.into(), -29)); // atan2(-inf, -inf) = -3pi/4 - let res: DBig = ctx.atan2(&neg_inf, &neg_inf).value(&ctx); + let res: DBig = ctx.atan2(&neg_inf, &neg_inf, None).unwrap().value(); let diff: DBig = res + &pi_3_4; assert!(diff.abs() < DBig::from_parts(1.into(), -29)); } diff --git a/guide/src/SUMMARY.md b/guide/src/SUMMARY.md index 6864d35..c7f866b 100644 --- a/guide/src/SUMMARY.md +++ b/guide/src/SUMMARY.md @@ -19,3 +19,4 @@ - [FAQ](./faq.md) - [Performance](./performance.md) - [Cheatsheet](./cheatsheet.md) +- [IEEE 754 Compliance](./ieee754.md) diff --git a/guide/src/construct.md b/guide/src/construct.md index c937a82..01a7a61 100644 --- a/guide/src/construct.md +++ b/guide/src/construct.md @@ -43,3 +43,82 @@ You can directly put numeric literals as the argument without quotes (e.g. `dbig When the number doesn't have a high precision, these macros can be used in a `const` environment, however this ability dependends on the precision and the machine word size. To create large constants, you can use the `static_*` macros (such as `static_ubig!`) in the crate. They have the same syntax as the normal macros, but the different is that the outputs of the macros are references to a static instance, rather than directly generating an instance. There are also other limitations about these macros for static creation. Please refer to [the docs of `dashu-macros`](https://docs.rs/dashu-macros/latest/dashu_macros/) for detailed usage of these macros. + +# Cached Arithmetic for FBig + +The [`CachedFBig`] type is an [`FBig`] that carries a shared handle to a +`Rc>`. The cache stores exact binary-splitting state for +mathematical constants (π, ln2, ln10), so that transcendental operations +(`ln`, `exp`, `sin`, `cos`, …, `pi`) reuse and progressively extend prior +work instead of recomputing from scratch. + +## Creation + +A `CachedFBig` is created by attaching a cache handle to an `FBig`: + +```rust +use std::rc::Rc; +use core::cell::RefCell; +use dashu_float::{CachedFBig, ConstCache, Repr, Context}; + +let cache = Rc::new(RefCell::new(ConstCache::new())); + +// From an FBig +let a = FBig::ONE.into_cached(cache.clone()); + +// From raw parts with a fresh cache +let b = CachedFBig::<_, 10>::with_cache( + Repr::new(1234.into(), -3), + Context::new(50), +); +``` + +Use `From for CachedFBig` for one-off conversions (it creates a fresh +empty cache): + +```rust +let c: CachedFBig = FBig::from(3u8).into(); +``` + +To drop the cache and get back a plain `FBig`, use `into_fbig()` or the +`From for FBig` trait: + +```rust +let plain: FBig = cached.into(); // or cached.into_fbig() +``` + +## Cache sharing + +Binary operations between `CachedFBig` values preserve the cache handle in +the result: `(a + b).ln().exp()` keeps extending the same cache throughout. +When two operands carry different cache handles, the **left-hand side** cache +is preserved. For `FBig op CachedFBig`, the `CachedFBig` operand's cache is +preserved regardless of which side it is on. + +Operations with plain `FBig` and primitives (`u8`, `i32`, `UBig`, etc.) also +work and preserve the `CachedFBig` operand's cache: + +```rust +let cached = CachedFBig::<_, 10>::with_cache( + Repr::new(2.into(), 0), Context::new(20), +); +let result = cached + 3u8; // CachedFBig, cache preserved +let result = 10i32 * cached; // CachedFBig, cache preserved +``` + +## Inspecting and clearing the cache + +Use `cache()` to borrow the cache read-only and inspect its size: + +```rust +let terms = cached.cache().total_terms(); +let words = cached.cache().total_words(); +``` + +Call `clear_cache()` to free all cached big-integer memory. The next +transcendental operation will recompute constants from scratch: + +```rust +cached.clear_cache(); +assert_eq!(cached.cache().total_terms(), 0); +``` diff --git a/guide/src/ieee754.md b/guide/src/ieee754.md new file mode 100644 index 0000000..3117d6f --- /dev/null +++ b/guide/src/ieee754.md @@ -0,0 +1,97 @@ +# IEEE 754-2008 Compliance of dashu-float + +This document describes where `dashu-float`'s `FBig` type is compliant and where it deviates +from IEEE 754-2008. The reference is IEEE Std 754™-2008 (ISO/IEC/IEEE 60559:2011). + +dashu-float is an **arbitrary-precision** floating-point library. Many IEEE 754 concepts +(e.g. fixed-width encoding, subnormals, NaN payloads) have no direct equivalent here. +Where infinite precision makes the standard's rules natural to satisfy, they are satisfied; +where they conflict with the arbitrary-precision model, the deviation is noted. + +## Data Model + +### Section 3 — Floating-point formats + +| IEEE 754 requirement | Compliance | Notes | +|---------------------|-----------|-------| +| Binary and decimal formats | ✅ Supported | `FBig` (binary) and `DBig` = `FBig` (decimal). Other bases are supported via the `const BASE: Word` parameter. | +| Finite non-zero numbers | ✅ | Represented as `significand × BASE^exponent` with unbounded significand. | +| Signed zero (`±0`) | ✅ | Encoded via exponent sentinels: `+0` ↔ exponent `0`, `-0` ↔ exponent `-1`. Produced by arithmetic, rounding, and cancellations per IEEE 754. | +| Signed infinity (`±∞`) | ✅ | Encoded via exponent sentinels: `+∞` ↔ `isize::MAX`, `-∞` ↔ `isize::MIN`. | +| NaN | ❌ Deviates | No NaN. Invalid operations panic (at the `FBig` convenience layer) or return `Err(FpError)` (at the `Context` layer). | +| Subnormals | N/A | Arbitrary-precision significands eliminate the need for subnormals. Any non-zero number is normalized. | +| Fixed-width encoding | N/A | No fixed bit widths; significands are unbounded `IBig` integers. | + +## Arithmetic Operations + +### Section 5 — Operations + +| IEEE 754 requirement | Compliance | Notes | +|---------------------|-----------|-------| +| `±0` compare equal | ✅ | `+0 == -0` in `PartialEq`, `Ord`, `NumOrd`. | +| `±∞` compare equal to same sign, ordered vs finite | ✅ | `+∞ == +∞`, `+∞ > finite`, `-∞ < finite`. | +| Overflow → `±∞` (with rounding-mode-dependent sign) | ✅ | Detected at the Repr level, returned as `Err(FpError::Overflow(sign))` at `Context`, converted to signed infinity at `FBig`. | +| Underflow → `±0` | ✅ | Same mechanism as overflow. | +| `finite / ±0` → `±∞` | ✅ | Produced as a value (not an error). Sign = XOR of operand signs. | +| `0 / 0` → NaN / error | ⚠️ Partial | Returns `Err(FpError::Indeterminate)`. Panics at the `FBig` layer. | +| `∞ ± finite`, `∞ × finite`, etc. → `±∞` | ❌ Deviates | Infinities are terminal: feeding them into arithmetic returns `Err(FpError::InfiniteInput)`. Operations on infinities require explicit handling at the `Context` layer or use special-case methods. | +| `exp(+∞)` = `+∞` | ✅ | `Context::exp` accepts infinite input. | +| `exp(-∞)` = `+0` | ✅ | Same. | +| `exp_m1(+∞)` = `+∞` | ✅ | | +| `exp_m1(-∞)` = `-1` | ✅ | | +| `ln(±0)` = `-∞` | ✅ | Produced as a value. | +| `sqrt(-0)` = `-0` | ✅ | | +| `sin(-0)` = `-0` | ✅ | | +| Cancellation under roundTowardNegative → `-0` | ✅ | `cancel_zero` in add.rs produces `-0` when `R::IS_ROUND_TOWARD_NEGATIVE`. | +| Exact subtraction cancels to `-0` only under directed rounding | ✅ | IEEE 754 §6.3: `(-3) + 3` = `+0` under roundTiesToEven/Up, `-0` under Down. | + +### Section 5.3 — Rounding + +| IEEE 754 requirement | Compliance | Notes | +|---------------------|-----------|-------| +| Rounding modes: roundTiesToEven, roundTiesToAway, roundTowardPositive, roundTowardNegative, roundTowardZero | ✅ | All five modes implemented as `HalfEven`, `HalfAway`, `Up`, `Down`, `Zero`. | +| Correct rounding to within 1 ulp | ✅ | All operations guarantee `|error| < 1 ulp`. The `Rounded` type distinguishes exact from inexact results. | +| Round-to-nearest preserves sign of zero | ✅ | `rounded_to_repr` preserves input sign when rounding collapses a non-zero to zero. | + +### Section 5.6 — Sign bit operations + +| IEEE 754 requirement | Compliance | Notes | +|---------------------|-----------|-------| +| `abs(x)` always non-negative | ✅ | `FBig::abs()` converts `-0` to `+0`. | +| `neg(x)` toggles sign of `±0` and `±∞` | ✅ | Correctly flips exponent sentinels via `negate_special_exponent`. | +| `signum(±0)` = `±0` | ✅ | Returns `+0` for both `+0` and `-0`. | +| `sign()` distinguishes `+0` from `-0` | ✅ | `Repr::sign()` returns `Negative` for `-0`. | + +## Conversions + +| IEEE 754 requirement | Compliance | Notes | +|---------------------|-----------|-------| +| `f32`/`f64` round-trip preserves `-0` | ✅ | `FBig::try_from(-0.0f64)` produces `-0`. | +| `f32`/`f64` round-trip preserves infinity | ✅ | | +| Overflow in conversion to `f32`/`f64` produces `±∞` | ✅ | `into_f32_internal` / `into_f64_internal` check exponent bounds. | +| Underflow in conversion to `f32`/`f64` produces `±0` | ✅ | Same. | +| Int-to-float conversion exact for representable integers | ✅ | | +| Float-to-int overflows saturate (per Rust convention) | N/A | Rust's `TryFrom` returns an error on overflow; `ToPrimitive` returns `None`. | + +## Exceptional Conditions + +| IEEE 754 requirement | Compliance | Notes | +|---------------------|-----------|-------| +| Invalid operation → NaN | ❌ | Panics (`FBig`) or returns `Err(FpError)` (`Context`). | +| Divide by zero → `±∞` (no trap) | ✅ | | +| Overflow → `±∞` (no trap) | ✅ | Detected and propagated. | +| Underflow → `±0` (no trap) | ✅ | Same. | +| Inexact flag | ⚠️ Partial | The `Rounded` type carries `Exact`/`Inexact(T, Rounding)` to signal whether rounding occurred, but there is no sticky flag mechanism. | + +## Summary + +| Category | Status | +|----------|--------| +| Signed zeros | ✅ Fully compliant | +| Signed infinities | ✅ Fully compliant | +| Overflow/underflow | ✅ Fully compliant | +| Directed rounding | ✅ Fully compliant | +| NaN handling | ❌ Panics (by design) | +| Infinite operands in arithmetic | ❌ Error (by design — infinities are terminal) | +| Subnormals | N/A (unbounded precision) | +| Exception flags | ⚠️ Rounded type signals exact/inexact, no sticky flags | diff --git a/guide/src/io/print.md b/guide/src/io/print.md index 784d3cb..ba9c962 100644 --- a/guide/src/io/print.md +++ b/guide/src/io/print.md @@ -1,5 +1,65 @@ # Standard Format API -# Debug Print +`UBig` and `IBig` support the full set of Rust standard formatter traits: +[`Display`], [`Debug`], [`Binary`], [`Octal`], [`LowerHex`], [`UpperHex`]. +All of them support the sign, width, fill, padding, and alignment options of +[`Formatter`]. For custom radices use [`InRadix`] (see below). -# Rational Number Formatting +TODO: describe the `in_radix` API + +## Debug Print + +The [`Debug`] implementation uses a compact **head‥tail** format for large +integers: it prints the most significant digits, a `..` separator, and the +least significant digits, omitting the middle. For small integers that fit in +a single [`Word`] or [`DoubleWord`] the full number is shown without +truncation. + +There are two forms, controlled by the formatter flags: + +### Simple form (`{:?}`) + +Shows the truncated head‥tail representation. + +```rust +use dashu_int::{UBig, IBig}; + +// Small integers print in full +assert_eq!(format!("{:?}", UBig::from(12345u16)), "12345"); +assert_eq!(format!("{:?}", IBig::from(-12345)), "-12345"); + +// Large integers show head..tail (example for 64-bit Word) +assert_eq!( + format!("{:?}", UBig::ONE << 1000), + "1071508607186267320..4386837205668069376" +); +assert_eq!( + format!("{:?}", IBig::NEG_ONE << 1000), + "-1071508607186267320..4386837205668069376" +); +``` + +The number of digits shown on each end depends on the [`Word`] size — +on 64-bit targets it is 19 decimal digits at each end (one word's worth), +on 32-bit targets it is 9 digits. + +### Verbose form (`{:#?}`) + +Adds `(digits: N, bits: M)` after the head‥tail representation, showing the +total digit count and bit length. + +```rust +use dashu_int::{UBig, Word}; + +let x = UBig::ONE << 1000; +if Word::BITS == 64 { + assert_eq!( + format!("{:#?}", x), + "1071508607186267320..4386837205668069376 (digits: 302, bits: 1001)" + ); +} +``` + +## Rational Number Formatting + +TODO: rational numbers have both in_radix and in_expanded functions, other than normal traits