Optimize Parallel instance in GenSpawnInstances to avoid allocation. #4599
Hi @durban, maybe you could review this one as well? In the same spirit as another one you did recently. There is also one more just like it right next to this one.
I'm slightly nervous about the Have you observed the reduced memory footprint and GC overhead you write about? (I believe we have
@durban I did not observe reduced memory; I wasn't trying to solve a performance problem. I found these by reading the cats effect code while trying to understand the framework better. Of all the ones you looked at, this is the one that could have the most impact, since we build a natural transformation for each element in the collection. But it's up to you. Maybe it would help convince you if I added a comment above the reference to the constant natural transformation?
Let's move the
…Caches the and FunctionK instances in a private companion object using an Id cast.
e97b43f to 496f744
@durban I made the changes you asked for (hopefully). Please take a look. We make 2 calls now instead of one when we call those functions but hopefully the magic of inlining will kick in and save the day.
durban left a comment
Thanks. Yes, I reckon a method call should be much, much cheaper than an allocation. As you say, inlining should take care of it.
Motivation
Currently in `GenSpawnInstances`, the `Parallel` instance provides `.parallel` and `.sequential` as `def`s that instantiate a `new (M ~> F)` and a `new (F ~> M)` on every invocation.

Because operations like `cats.Parallel.parTraverse` invoke `P.parallel(ma)` and `P.sequential(ma)` for every item in the collection, this creates a large number of `FunctionK` object allocations during a traversal, putting unnecessary pressure on the garbage collector.

Changes
This PR eliminates these allocations by applying the `Id` caching technique (similar to `Resource.liftK`).
- Moves the `FunctionK` instances into a `private object GenSpawnInstances` to act as true JVM-wide singletons (rather than trait fields).
- Uses `asInstanceOf` to cast the cached `Id` transformations to the required effect type at the call site.

Impact
Zero-allocation `parallel` and `sequential` transformations for all effect types that implement `GenSpawn`. This is expected to reduce the memory footprint and GC overhead on hot paths involving parallel collection processing, although the effect has not been measured.
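For readers unfamiliar with the `Id` caching technique, here is a minimal, dependency-free sketch of the idea. It is not the code from this PR: the `~>` trait is a stand-in for `cats.arrow.FunctionK`, and `IdCachingSketch`, `identityK`, `freshK` and `cachedK` are hypothetical names chosen for illustration. It assumes the transformation is an identity at runtime, which is what makes the cast safe.

```scala
// Stand-in for cats.arrow.FunctionK so the sketch runs without dependencies.
trait ~>[F[_], G[_]] {
  def apply[A](fa: F[A]): G[A]
}

object IdCachingSketch {
  type Id[A] = A

  // Allocated once, when the object is initialised. It only returns its argument.
  private val identityK: Id ~> Id = new (Id ~> Id) {
    def apply[A](fa: Id[A]): Id[A] = fa
  }

  // Allocating variant: builds a fresh transformation object on every call.
  def freshK[F[_]]: F ~> F = new (F ~> F) {
    def apply[A](fa: F[A]): F[A] = fa
  }

  // Cached variant: reuses the single instance for any F.
  // The cast is only sound because the transformation never inspects its argument.
  def cachedK[F[_]]: F ~> F = identityK.asInstanceOf[F ~> F]

  def main(args: Array[String]): Unit = {
    val forOption = cachedK[Option]
    val forList = cachedK[List]
    println(forOption(Some(1)))               // prints "Some(1)"
    println(forOption eq forList)             // prints "true": same object for every F
    println(freshK[Option] eq freshK[Option]) // prints "false": one allocation per call
  }
}
```

The trade-off discussed in the review shows up here as well: `cachedK` costs an extra method call and an unchecked cast, in exchange for not allocating per element.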