Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add .arc() method for types that are typically used as *Refs #5714

Open
boustrophedon opened this issue May 4, 2024 · 7 comments
Open

Add .arc() method for types that are typically used as *Refs #5714

boustrophedon opened this issue May 4, 2024 · 7 comments
Labels
enhancement Any new improvement worthy of a entry in the changelog

Comments

@boustrophedon
Copy link

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Disclaimer: I only started using arrow-rs the other day via lancedb.

It seems annoying to have to both remember to import Arc and then write out Arc::new(..) every time you want to use a FooRef type.

Describe the solution you'd like
Wouldn't it make more sense to implement a fn arc(self) -> Arc<Self> { Arc::new(self) } on types for which FooRef is commonly used?

Describe alternatives you've considered
Maybe I've just been looking at a lot of example code and typically you load schemas/arrays from things that already return them wrapped in Arc?

Also, to_arc is maybe more in line with Rust naming guidelines. I don't think to_ref or ref is the right choice since it's not returning a &T.

@boustrophedon boustrophedon added the enhancement Any new improvement worthy of a entry in the changelog label May 4, 2024
@tustvold
Copy link
Contributor

tustvold commented May 4, 2024

I believe you can already use .into() for this

@boustrophedon
Copy link
Author

boustrophedon commented May 4, 2024

That works with SchemaRef (thanks!) but not ArrayRef, because ArrayRef is Arc<dyn Array>. Here's an example in the playground but you get the same error with the actual types:

error[E0277]: the trait bound `Arc<dyn Array>: From<PrimitiveArray<Int32Type>>` is not satisfied
  --> src/main.rs:33:54
   |
33 |                 Int32Array::from_iter_values(0..256).into(),
   |                                                      ^^^^ the trait `From<PrimitiveArray<Int32Type>>` is not implemented for `Arc<dyn Array>`, which is required by `PrimitiveArray<Int32Type>: Into<_>`

I believe this is because Arc's generic From<T> impl is, well, only implemented for T and not dyn T

Would it be okay if I made a PR on PrimitiveArray to give it an arc or to_arc method?

@tustvold
Copy link
Contributor

tustvold commented May 4, 2024

I'm honestly not sure, it feels a little bit niche imo, to avoid one fairly common import...

FWIW methods that don't require ownership should take &dyn Array which will work for free.

@boustrophedon
Copy link
Author

boustrophedon commented May 4, 2024

The above example is from RecordBatch::try_new where it takes a Vec<ArrayRef> parameter.

More than avoiding an import it's about readability IMO. When reading code I think seeing the Arc::new is a distraction - hiding those sorts of wrapper types at the end with a into() is pretty common.

@tustvold
Copy link
Contributor

tustvold commented May 4, 2024

I think lets see what other maintainers think, I'm not sure about adding an additional member function to every array type just for this.

Perhaps you could use an extension trait in your codebase for this?

@boustrophedon
Copy link
Author

boustrophedon commented May 4, 2024

Sure, it's possible to write something like

trait ToArc {
  fn arc(self) -> Arc<Self> {
    Arc::new(self)
  }
}
impl ToArc for Int32Array {}
[...]

and impl it for the array types.

I took a quick look and it seems like there's 17 structs if we were to implement it directly.

map_array.rs:impl Array for MapArray {
boolean_array.rs:impl Array for BooleanArray {
run_array.rs:impl<T: RunEndIndexType> Array for RunArray<T> {
run_array.rs:impl<'a, R: RunEndIndexType, V: Sync> Array for TypedRunArray<'a, R, V> {
null_array.rs:impl Array for NullArray {
mod.rs:impl Array for ArrayRef {
mod.rs:impl<'a, T: Array> Array for &'a T {
struct_array.rs:impl Array for StructArray {
list_array.rs:impl<OffsetSize: OffsetSizeTrait> Array for GenericListArray<OffsetSize> {
byte_view_array.rs:impl<T: ByteViewType + ?Sized> Array for GenericByteViewArray<T> {
dictionary_array.rs:impl<T: ArrowDictionaryKeyType> Array for DictionaryArray<T> {
dictionary_array.rs:impl<'a, K: ArrowDictionaryKeyType, V: Sync> Array for TypedDictionaryArray<'a, K, V> {
byte_array.rs:impl<T: ByteArrayType> Array for GenericByteArray<T> {
fixed_size_list_array.rs:impl Array for FixedSizeListArray {
primitive_array.rs:impl<T: ArrowPrimitiveType> Array for PrimitiveArray<T> {
union_array.rs:impl Array for UnionArray {
fixed_size_binary_array.rs:impl Array for FixedSizeBinaryArray {

Although probably implementing it for ArrayRef is bad.


Additionally, into with Schema only kind of works because rust can't infer the type if you're cloning it (to increment the refcount) at the point of use, e.g. passing it into a function.

@paddyhoran
Copy link
Contributor

I think I agree with @tustvold here. I would be someone that theoretically would benefit from this but have never missed it. Also, there is a solution via the helper trait. Just my 2 cents.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Any new improvement worthy of a entry in the changelog
Projects
None yet
Development

No branches or pull requests

3 participants