s3: avoid array copies when dealing with ByteStrings #658

pjfanning · 2024-05-16T19:29:30Z

The 'unsafe' methods are OK as long as you know the data is not being mutated.
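To illustrate why mutation matters here, the following is a minimal sketch rather than code from this PR. It assumes `pekko-actor` is on the classpath; `UnsafeAliasSketch` and `demo` are hypothetical names. It compares `ByteString(array)`, which copies, with `ByteString.fromArrayUnsafe(array)`, which wraps the array without copying.

```scala
import org.apache.pekko.util.ByteString

// Hypothetical demo object, not part of the connector code.
object UnsafeAliasSketch {

  // Returns the first byte of each ByteString after the backing array is mutated.
  def demo(): (Byte, Byte) = {
    val backing = Array[Byte](1, 2, 3)

    val copied = ByteString(backing)                  // takes a defensive copy
    val wrapped = ByteString.fromArrayUnsafe(backing) // shares the array, no copy

    backing(0) = 9 // mutate the array after both ByteStrings were built

    (copied(0), wrapped(0))
  }
}
```

The copied ByteString is unaffected by the later write, while the wrapped one observes it. That is the trade-off the 'unsafe' methods make: no copy, but only safe when nothing writes to the shared array afterwards.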

pjfanning · 2024-05-16T19:36:46Z

@mdedetrich You are the most active contributor to the S3 code base. Do these changes look ok to you?

pjfanning · 2024-05-16T19:37:55Z

s3/src/main/scala/org/apache/pekko/stream/connectors/s3/impl/auth/package.scala

@@ -36,7 +36,16 @@ package object auth {
    new String(out)
  }

-  @InternalApi private[impl] def encodeHex(bytes: ByteString): String = encodeHex(bytes.toArray)
+  @InternalApi private[impl] def encodeHex(bytes: ByteString): String = {
+    val length = bytes.length


ByteString implements IndexedSeq[Byte] so we can iterate it without converting it to an array.

Looking at the ByteStrings code (a complex ByteString made up of underlying ByteStrings), the apply(Int) function and the iterator function are not performant.

https://github.com/apache/pekko/blob/c4805f1839ead0cfe31700d6c41928df9fb1056d/actor/src/main/scala-3/org/apache/pekko/util/ByteString.scala#L554-L570

So I'm now thinking of changing this to

def encodeHex(bytes: ByteString): String = encodeHex(bytes.toArrayUnsafe)
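A minimal sketch of how the two overloads could fit together, assuming `pekko-actor` is on the classpath. The array-based body is an illustrative reconstruction and not the connector's actual implementation; `HexSketch` and `Digits` are hypothetical names. It assumes `toArrayUnsafe` is declared with an empty parameter list, so it is called with parentheses here.

```scala
import org.apache.pekko.util.ByteString

// Hypothetical container object, not part of the connector code.
object HexSketch {
  private val Digits: Array[Char] = "0123456789abcdef".toCharArray

  // Encodes each byte as two lowercase hex characters.
  def encodeHex(bytes: Array[Byte]): String = {
    val out = new Array[Char](bytes.length * 2)
    var i = 0
    while (i < bytes.length) {
      val b = bytes(i) & 0xFF        // treat the byte as unsigned
      out(i * 2) = Digits(b >>> 4)   // high nibble
      out(i * 2 + 1) = Digits(b & 0x0F) // low nibble
      i += 1
    }
    new String(out)
  }

  // Delegates to the array version. The array is only read, never written,
  // which is the condition under which the unsafe accessor is acceptable.
  def encodeHex(bytes: ByteString): String = encodeHex(bytes.toArrayUnsafe())
}
```

Because the array overload only reads its input, sharing the ByteString's underlying array with it does not risk mutating the ByteString.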

nvollmar

lgtm

laglangyue · 2024-05-21T06:07:28Z

lgtm

pjfanning requested a review from mdedetrich May 16, 2024 19:36

pjfanning added 2 commits May 18, 2024 20:35

s3: avoid array copies when dealing with ByteStrings

43c5548

remove code that gets bytes directly from ByteString

18976e9

pjfanning force-pushed the s3-hex branch from cc1fbf3 to 18976e9 Compare May 18, 2024 19:40

nvollmar approved these changes May 21, 2024

pjfanning merged commit 473c5e1 into apache:main May 21, 2024
51 checks passed

pjfanning deleted the s3-hex branch May 21, 2024 09:46
