Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let diagzero propagate information not stored in type where relevant #54440

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

MilesCranmer
Copy link
Sponsor Member

@MilesCranmer MilesCranmer commented May 10, 2024

Changes: diagzero: zero(T) -> haszero(T) ? zero(T) : zero(first(D)) to capture any relevant information in the elements.

(DynamicQuantities.jl (SymbolicML/DynamicQuantities.jl#75) currently overloads method this but I think it might be more robust for the ecosystem to just fix it at the source)

@MilesCranmer
Copy link
Sponsor Member Author

I'll also add this to fzero as it seems the same issue shows up.

@LilithHafner
Copy link
Member

Could you give an example of why this is necessary and add a test?

I see two cases:

  1. zero(x) is always the same for all x::T, in which case zero(T) should be defined.
  2. zero(x) is not always the same for all x::T, in which case zero(diag.diag[1]) may not be the same as zero(diag.diag[2]), so it feels weird to arbitrarily pick the first element to match. Especially because when someone mutates the first element it could then change all the zeros. An error feels like a better option here.

This PR changes the second case from error to match the top-left, which I'm not a fan of at first, but seeing a motivating example could change that.

@MilesCranmer
Copy link
Sponsor Member Author

MilesCranmer commented May 10, 2024

Those are good points. I guess what we really need is to permit an init function in Diagonal that can create the zeros dynamically, rather than rely on zero(eltype(D)) which seems much less extensible. (And perhaps the default init would simply be diagzero)

Perhaps an easy fix for right now would be to declare diagzero and fzero as part of the public API? Then I wouldn't feel as bad about overloading them.

Right now I have to overload them here in DynamicQuantities.jl: https://github.com/SymbolicML/DynamicQuantities.jl/blob/a7acc5a341e7cab68794c39a66862ea303b5a60f/ext/DynamicQuantitiesLinearAlgebraExt.jl#L112-L116

LA.diagzero(D::LA.Diagonal{T}, _, _) where {T<:Quantity} = zero(first(D))
LA.fzero(S::LA.Diagonal{T}) where {T<:Quantity} = zero(first(S))

this is guaranteed to work because QuantityArray ensures the units of all elements in an array are equal. I have to do this because DynamicQuantities stores units in the values, rather than the fields, so an additive identity (zero) needs to access the values to get the right units. In other words, zero(::Type{<:Quantity}) is disallowed.

What happens if I don't overload these is the following:

julia> using DynamicQuantities, LinearAlgebra

julia> qa = QuantityArray([1.0u"m", 2.0u"m", 3.0u"m"])
3-element QuantityArray(::Vector{Float64}, ::Quantity{Float64, Dimensions{FixedRational{Int32, 25200}}}):
 1.0 m
 2.0 m
 3.0 m

julia> Diagonal(qa)
3×3 Diagonal{Quantity{Float64, Dimensions{FixedRational{Int32, 25200}}}, QuantityArray{Float64, 1, Dimensions{FixedRational{Int32, 25200}}, Quantity{Float64, Dimensions{FixedRational{Int32, 25200}}}, Vector{Float64}}}:
Error showing value of type Diagonal{Quantity{Float64, Dimensions{FixedRational{Int32, 25200}}}, QuantityArray{Float64, 1, Dimensions{FixedRational{Int32, 25200}}, Quantity{Float64, Dimensions{FixedRational{Int32, 25200}}}, Vector{Float64}}}:
ERROR: Cannot create an additive identity from `Type{<:Quantity}`, as the dimensions are unknown. Please use `zero(::Quantity)` instead.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] zero(::Type{Quantity{Float64, Dimensions{FixedRational{Int32, 25200}}}})
    @ DynamicQuantities ~/PermaDocuments/SymbolicRegressionMonorepo/DynamicQuantities.jl/src/utils.jl:289
  [3] diagzero(::Diagonal{Quantity{Float64, Dimensions{FixedRational{Int32, 25200}}}, QuantityArray{Float64, 1, Dimensions{FixedRational{Int32, 25200}}, Quantity{Float64, Dimensions{FixedRational{}}}, Vector{Float64}}}, i::Int64, j::Int64)
    @ LinearAlgebra ~/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/LinearAlgebra/src/diagonal.jl:162
  [4] getindex(D::Diagonal{Quantity{Float64, Dimensions{FixedRational{}}}, QuantityArray{Float64, 1, Dimensions{FixedRational{}}, Quantity{Float64, Dimensions{}}, Vector{Float64}}}, i::Int64, j::Int64)
    @ LinearAlgebra ~/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/LinearAlgebra/src/diagonal.jl:158
  [5] alignment(io::IOContext{Base.TTY}, X::AbstractVecOrMat, rows::Vector{Int64}, cols::Vector{Int64}, cols_if_complete::Int64, cols_otherwise::Int64, sep::Int64, ncols::Int64)
    @ Base ./arrayshow.jl:69
  [6] _print_matrix(io::IOContext{Base.TTY}, X::AbstractVecOrMat, pre::String, sep::String, post::String, hdots::String, vdots::String, ddots::String, hmod::Int64, vmod::Int64, rowsA::UnitRange{Int64}, colsA::UnitRange{Int64})
    @ Base ./arrayshow.jl:207
  [7] print_matrix(io::IOContext{Base.TTY}, X::Diagonal{Quantity{…}, QuantityArray{…}}, pre::String, sep::String, post::String, hdots::String, vdots::String, ddots::String, hmod::Int64, vmod::Int64)
    @ Base ./arrayshow.jl:171
  [8] print_matrix(io::IO, X::AbstractVecOrMat, pre::AbstractString, sep::AbstractString, post::AbstractString, hdots::AbstractString, vdots::AbstractString, ddots::AbstractString, hmod::Integer, vmod::Integer)
    @ Base ./arrayshow.jl:171 [inlined]
  [9] print_array
    @ ./arrayshow.jl:358 [inlined]
 [10] show(io::IOContext{Base.TTY}, ::MIME{Symbol("text/plain")}, X::Diagonal{Quantity{Float64, Dimensions{FixedRational{}}}, QuantityArray{Float64, 1, Dimensions{FixedRational{}}, Quantity{Float64, Dimensions{}}, Vector{Float64}}})
    @ Base ./arrayshow.jl:399
 [11] (::OhMyREPL.var"#15#16"{REPL.REPLDisplay{REPL.LineEditREPL}, MIME{Symbol("text/plain")}, Base.RefValue{Any}})(io::IOContext{Base.TTY})
    @ OhMyREPL ~/.julia/packages/OhMyREPL/6UG7a/src/output_prompt_overwrite.jl:23
 [12] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL ~/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:569
 [13] display
    @ ~/.julia/packages/OhMyREPL/6UG7a/src/output_prompt_overwrite.jl:6 [inlined]
 [14] display(d::REPL.REPLDisplay, mime::MIME{Symbol("text/plain")}, x::Any)
    @ REPL ~/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:278 [inlined]
 [15] display(x::Any)
    @ Base.Multimedia ./multimedia.jl:340
 [16] print_response(errio::IO, response::Any, show_value::Bool, have_color::Bool, specialdisplay::Union{Nothing, AbstractDisplay})
    @ REPL ~/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:0
 [17] (::REPL.var"#57#58"{REPL.LineEditREPL, Pair{Any, Bool}, Bool, Bool})(io::Any)
    @ REPL ~/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:284
 [18] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL ~/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:569
 [19] print_response(repl::REPL.AbstractREPL, response::Any, show_value::Bool, have_color::Bool)
    @ REPL ~/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:282
 [20] (::REPL.var"#do_respond#80"{Bool, Bool, REPL.var"#93#103"{REPL.LineEditREPL, REPL.REPLHistoryProvider}, REPL.LineEditREPL, REPL.LineEdit.Prompt})(s::REPL.LineEdit.MIState, buf::Any, ok::Bool)
    @ REPL ~/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:911
 [21] #invokelatest#2
    @ ./essentials.jl:887 [inlined]
 [22] invokelatest
    @ ./essentials.jl:884 [inlined]
 [23] run_interface(terminal::REPL.Terminals.TextTerminal, m::REPL.LineEdit.ModalInterface, s::REPL.LineEdit.MIState)
    @ REPL.LineEdit ~/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/LineEdit.jl:2656
 [24] run_frontend(repl::REPL.LineEditREPL, backend::REPL.REPLBackendRef)
    @ REPL ~/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:1312
 [25] (::REPL.var"#62#68"{REPL.LineEditREPL, REPL.REPLBackendRef})()
    @ REPL ~/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/REPL/src/REPL.jl:386
Some type information was truncated. Use `show(err)` to see complete types.

this happens for many other operations too, because Diagonal is trying to create a zero at the off-diagonal elements, but is using the eltype rather than something which is value-based.

@jmichel7
Copy link

The type of situation where haszero(T) ? zero(T) : zero(first(D)) is much better than zero(T) is very widespread; such a change would broaden the applicability of many functions in LinearAlgebra. It happens whenever one has a type of object where all the information to make a proper zero element is not in the type domain. A typical example is modular arithmetic: if you work with matrices of integers modulo p, where p is a BigInt prime, you need to know p to make a proper zero element but you cannot put p in the type since it is a BigInt.

@LilithHafner
Copy link
Member

Correctness seems to rely on array constructors that check properties which make them always homogeneous in that their zeros are always equal.

A less surprising but terrible performance implementation would be haszero(T) ? zero(T) : only(unique(zero.(D))). Though this implementation could be optimized to O(1) by specializing on zero.(::QuantileArray).

@dkarrasch dkarrasch added the domain:linear algebra Linear algebra label May 12, 2024
@MilesCranmer
Copy link
Sponsor Member Author

@LilithHafner could we simply allow this to be overloaded by packages with custom arrays, via declaring this to be part of the public API? Otherwise I fear there may not be a good choice here.

@LilithHafner
Copy link
Member

LilithHafner commented May 13, 2024

That sounds like a great choice!

Though I would name it something other than diagzero because it could be used for TriDiagonal, Triangular, etc. Perhaps

element_zero(a::AbstractArray) = zero(eltype(a))

and

element_zero(q::QuantityArray) = zero(first(q))

@MilesCranmer
Copy link
Sponsor Member Author

Sounds good. And what would fzero be named?

@LilithHafner
Copy link
Member

Would it work to make fzero getindex for scalar broadcast styles, element_zero for StructuredMatrixStyle, and missing for all others? When would someone need to customize that function?

@MilesCranmer
Copy link
Sponsor Member Author

fzero is another function that was being used in a similar way to diagzero, and thus makes LinearAlgebra incompatible with quantities that store units as a value rather than a type.

@jmichel7
Copy link

Again it is more general: in many places LinearAlgebra is needlessly incompatible with quantities (numbers) which can compute a zero from a value, not a type. I gave another example: modular numbers where the modulus is a BigInt (thus cannot be stored in the type).

@LilithHafner
Copy link
Member

LilithHafner commented May 16, 2024

I'm proposing to rename diagzero to element_zero and make it public, and let users define new methods to handle those cases, and let fzero depend on element_zero (and getindex for scalars) so that once someone defines element_zero for QuantileArray both current usages of diagzero and current usages of fzero work.

Specifically, this would handle both QuantileArray and homogonous arrays of BigInt modulus integers.

@LilithHafner
Copy link
Member

An alternative would be diagzero(D, i, j) = only(unique(zero(D.diag[i]), zero(D.diag[j])))

Implemented as

function make_zero(a, b)
    za = zero(a)
    zb = zero(b)
    za == zb && typeof(za) == typeof(zb) || throw(ArgumentError("The value of a structural zero is ambiguous"))
    za
end
make_zero(a::M, b::M) where {T, M <: AbstractMatrix{T}} = zeros(T, size(a, 1), size(b, 2))

diagzero(D::Diagonal, i, j) = make_zero(D.diag[i], D.diag[j])

Which should be similarly efficient (equally efficient for the common case where zero(::T) where T = zero(T)) but does provide a performance footgun in some cases.

This follows the spirit of the existing handling of non-homogeneously sized matrices on the diagonal, but I think the performance footgun outweighs the reduced downstream labor of specializing on a custom element_zero, and the element_zero approach works for arbitrary sparsity patterns and for fzero while this post's approach would not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants