New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improvement: Add atomic update on TrieMap #5938
Conversation
def updateWith(key: K)(remappingFunc: Option[V] => Option[V]): Unit = { | ||
val computeFunction = new java.util.function.BiFunction[K, V, V] { | ||
override def apply(k: K, v: V): V = { | ||
trieMap.get(key) match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, but maybe if inside this function we referenced k
instead of key
(so input of computeFunction
rather than the updateWith
method) we could then extract it into a constant defined on the AtomicTrieMap
class level so that there's no need to allocate a new one each time an updateWith
is called? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see its possible, because we need to pass it remappingFunc
.
We could change trieMap
type to TrieMap[K, Set[V]]
, since we only use it like this and add a append
function, that would look the way you suggested 🤔
Edit: I think it won't work either, since we need to pass the value to append
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, you're absolutely right, somehow I missed the remappingFunc
. 👍
null.asInstanceOf[V] | ||
} | ||
} | ||
concurrentMap.compute(key, computeFunction) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this duplicate the amount of data we hold? Should we just use ConcurrentMap instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Size of concurrentMap is always 0, its only used to provide atomic updateWith for trieMap.
I don't think we should simply replace TrieMap with ConcurrentMap, because get
on TrieMap should be faster
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Size of concurrentMap is always 0, its only used to provide atomic updateWith for trieMap.
Can you ELI5 this? :D
get on TrieMap should be faster
Can you share some source about that, benchmark maybe? I did some reading, found https://medium.com/@igabaydulin/java-concurrenthashmap-vs-scala-concurrent-map-e185e8a0b798 which then led me to scala/scala#10027 and I need to read everything once again because I'm already lost. Best part - those are not even about TrieMap 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Size of concurrentMap is always 0, its only used to provide atomic updateWith for trieMap.
Can you ELI5 this? :D
Sure :D
We need an atomic way to update a TrieMap
with a remapping function, but its not provided in its API.
ConcurrentHashMap
has the ability to do that, with compute
function.
We don't to store any data in ConcurrentHashMap
and only use it as a sort of lock
on the updated value.
So our compute
function atomically calculates new value to be stored, but updates it only in TrieMap
- it inserts null
to ConcurrentHashMap
, which means no mapping
.
Can you share some source about that, benchmark maybe? I did some reading, found https://medium.com/@igabaydulin/java-concurrenthashmap-vs-scala-concurrent-map-e185e8a0b798 which then led me to scala/scala#10027 and I need to read everything once again because I'm already lost. Best part - those are not even about TrieMap 😅
To be honest I'm not sure anymore 😅 . I thought using a trie
would result in a better performance, but from what I've read it could be similar or slightly worse than ConcurrentHashMap
(TrieMap
biggest advantage are snapshots
, but we are not using them).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaning!
connected to #5072