Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MPSMatrixRandom #321

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

christiangnrd
Copy link
Contributor

Adds https://developer.apple.com/documentation/metalperformanceshaders/mpsmatrixrandom?language=objc and the other classes that start with "MPSMatrixRandom".

Code to integrate with Random.jl functions is present but commented out because it seems to be slower than the current rand! functionality from GPUArrays.

@maleadt
Copy link
Member

maleadt commented Mar 27, 2024

Code to integrate with Random.jl functions is present but commented out because it seems to be slower than the current rand! functionality from GPUArrays.

The quality of random numbers produced by GPUArrays' RNG is probably lower though, so I would consider just using the one from MPS.

@christiangnrd
Copy link
Contributor Author

christiangnrd commented Mar 27, 2024

A few comments.

  • I did some stuff to get rand to work on all supported Integer types. Is this valid or does it affect the quality of random numbers?
  • Every buffer is now created to a multiple of 16 bytes as the random number generation generates 32-bit values in multiples of 4. Is this acceptable? Is there an better alternative?
  • I'm not sure how to approach seeding. Any ideas?
  • Float16 numbers are the only values that don't use MPS. Should there be a warning somewhere mentioning this?
  • MPSMatrixRandom requires buffer offset to also be a multiple of 4, how to handle rand! with views?

lib/mps/matrix.jl Outdated Show resolved Hide resolved
lib/mps/vector.jl Outdated Show resolved Hide resolved
@maleadt
Copy link
Member

maleadt commented Mar 29, 2024

I did some stuff to get rand to work on all supported Integer types. Is this valid or does it affect the quality of random numbers?

So you're just reinterpreting Float32-generated random numbers as integers? That doesn't seem valid. Naively I would try to map the generated [0, 1) numbers to the integer's range, but that may still result in a biased output. I'm no RNG expert, but you could try it out and comparing a histogram to Random.RNG and see if it looks somewhat acceptable.

I'm not sure how to approach seeding. Any ideas?

What is the problem? Doesn't it have a way to seed the generator (e.g. https://developer.apple.com/documentation/metalperformanceshaders/mpsmatrixrandomphilox/3242872-initwithdevice?language=objc)?

  • Float16 numbers are the only values that don't use MPS
  • MPSMatrixRandom requires buffer offset to also be a multiple of 4

I would just fall back to the GPUArrays RNG in those cases. CUDA.jl similarly tries to CURAND when possible, and falls back to the native generator in other cases.

@christiangnrd
Copy link
Contributor Author

So you're just reinterpreting Float32-generated random numbers as integers?

The default rng generator generates random UInt32 values so it's reinterpreting those as whatever Integer value was determined. Just took a look at histograms for all Integer datatypes and they all seem to be equivalent to the CPU versions. They also seem to be much better than the GPUArrays.

Ex:
blah

What is the problem?

I'm not sure how to tie it in with Metal.seed! to access the rng state and such.

I've marked this as a draft because the seeding test is currently marked as broken and I don't want to forget to fix it.

@maleadt
Copy link
Member

maleadt commented Mar 30, 2024

They also seem to be much better than the GPUArrays.

That's not surprising :-) FYI, if you really want to make sure the RNG is good, you can use https://github.com/JuliaRandom/RNGTest.jl. CUDA.jl's native RNG passes those, but it took a while to get there.

@christiangnrd
Copy link
Contributor Author

This is now ready for final review. It should not be squashed when merging because there are a few general improvements that are not directly related to the main focus of this PR. I made sure to have all the commits make sense on their own.

@christiangnrd
Copy link
Contributor Author

Added some API functions that I missed. I won't squash anymore until it's been reviewed to make things less confusing.

Copy link
Member

@maleadt maleadt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!

lib/mps/matrixrandom.jl Outdated Show resolved Hide resolved
docs/src/usage/array.md Outdated Show resolved Hide resolved
lib/mps/random.jl Outdated Show resolved Hide resolved
lib/mps/matrixrandom.jl Outdated Show resolved Hide resolved
lib/mps/random.jl Outdated Show resolved Hide resolved
lib/mps/vector.jl Outdated Show resolved Hide resolved
lib/mps/vector.jl Outdated Show resolved Hide resolved
lib/mtl/buffer.jl Outdated Show resolved Hide resolved
test/metal.jl Outdated Show resolved Hide resolved
src/random.jl Outdated Show resolved Hide resolved
@christiangnrd
Copy link
Contributor Author

I've fixed the easy to fix comments. I'll go over the rest over the weekend when I have more time.

test/random.jl Outdated
@@ -11,6 +11,7 @@ const OOPLACE_TUPLES = [[(Metal.rand, rand, T) for T in RAND_TYPES];
@testset "random" begin
# in-place
@testset "in-place" begin
Metal.seed!(123)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the global RNG need seeding if you're using an explicit RNG object below?

@christiangnrd
Copy link
Contributor Author

christiangnrd commented Apr 10, 2024

There seems to always be one random test that times out when running tests on 1.11. I can reliably reproduce on this branch on my device but I can't reproduce at all on main so it has to be from this PR (or this PR unmasked a bug?) but I have no idea where to start debugging this. I've seen it happen to linalg/norm, mps, metal, examples, and maybe others.

See #329

Comment on lines +16 to +19
# Seed the default generators to work around value of 0 being
# randomly generated in the size 1 Int8 Array test in 1.11
Metal.seed!(123)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, what? Can you elaborate?

@christiangnrd
Copy link
Contributor Author

christiangnrd commented Apr 16, 2024

I don't understand how the 1.9 failure could be related but it's the second time it happens. Any ideas?

Build 947
Build 949

@maleadt
Copy link
Member

maleadt commented Apr 17, 2024

That's a GPUCompiler issue, unrelated to this PR.

Comment on lines +123 to +122
# Even though `append_copy`` seems to work with any size or offset values, the documentation at
# https://developer.apple.com/documentation/metal/mtlblitcommandencoder/1400767-copyfrombuffer?language=objc
# mentions that both must be multiples of 4 bytes in MacOS so error when they are not
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should rather add a warning or assertion to append_copy directly.

@christiangnrd christiangnrd force-pushed the MPSMatrixRandom branch 2 times, most recently from 00ff88b to d61f381 Compare May 3, 2024 16:08
@christiangnrd christiangnrd force-pushed the MPSMatrixRandom branch 2 times, most recently from 0ce4936 to 8b1f274 Compare May 21, 2024 16:27
docs/src/usage/array.md Outdated Show resolved Hide resolved
lib/mps/matrixrandom.jl Outdated Show resolved Hide resolved
lib/mps/matrixrandom.jl Outdated Show resolved Hide resolved
lib/mps/matrixrandom.jl Outdated Show resolved Hide resolved
lib/mps/matrixrandom.jl Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request libraries Things about libraries and how we use them.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants