Improve speed for workflows with hundreds of items#264
Improve speed for workflows with hundreds of items#264rebrowning wants to merge 6 commits intoStackStorm:masterfrom
Conversation
guzzijones
left a comment
There was a problem hiding this comment.
some follow up needed.
orquesta/conducting.py
Outdated
| return self.staged | ||
|
|
||
| return [x for x in self.staged if x["ready"] and not x.get("completed", False)] | ||
| resp = [x for x in self.staged if x["ready"] and not x.get("completed", False)] |
There was a problem hiding this comment.
this change is not needed, correct?
There was a problem hiding this comment.
correct, I'll revert some of the additional changes that were made (just moved variables out of return statements to get timing on them, didn't move back)
orquesta/specs/native/v1/models.py
Outdated
|
|
||
| item_ctx_value = ctx_util.set_current_item(item_ctx_value, item) | ||
| action = expr_base.evaluate(self.action, item_ctx_value) | ||
| gen_input = expr_base.evaluate(getattr(self, "input", {}), item_ctx_value) |
There was a problem hiding this comment.
This change is not needed? were you just testing the variable value here?
There was a problem hiding this comment.
item_ctx_value is needed i see. 195 and 196 can be reverted.
orquesta/specs/native/v1/models.py
Outdated
| def render(self, in_ctx): | ||
| action_specs = [] | ||
|
|
||
| item_ctx_value = ctx_util.copy_context(in_ctx) |
There was a problem hiding this comment.
having the copy here will slow down non with items . move the copy in the else clause.
| if context and not isinstance(context, dict): | ||
| raise TypeError("The context is not type of dict.") | ||
|
|
||
| ctx = {**context} |
There was a problem hiding this comment.
This is a shallow instead of deep copy , correct?
There was a problem hiding this comment.
I would actually leave this function as is and just do the operation in render.
There was a problem hiding this comment.
Alright, I'll move it over. And yes it's a shallow copy, but intentional. The rest of the sub-objects that are copied in a deep copy don't actually change between iteration from what I saw in testing, but the top level is what needs to change which a shallow copy still facilitates.
There was a problem hiding this comment.
@guzzijones thoughts on undoing the changes to set_current_item, that function is not used anywhere else in orquesta. set_current_item would not work at the current location that copy_context is being used, so we could either remove the set_current_item function and move the logic into the render function, or leave set_current_item in the current form
had to resend with the correct profile, switched to my laptop and forgot to switch
|
we actually don't even need the shallow copy. I am working on an addition to my nocopy branch with added benchmarks. |
|
here is what I ended up with |
|
That's clever @guzzijones I forgot that evaluate was essentially doing it's own deep copy. Makes sense though since that was the next bottleneck in the render function. What do we need to get your branch moved forward? Is there anything I can help with there? I'm not sure I'd want to merge this branch if your will cover this change plus the rest of the deep copy clean up. I know you also have a branch in the st2 repo for performance improvements. Are any of your orquesta changes dependent on that st2 change? I think the next performance improvement in the render function (if it's worth the dev cycles) would need a redis connection. My thought is caching the generated object for a task that has many items. That object created by render is identical from render to render from what I saw. This would mean passing the redis configuration through though, and I haven't looked into how much of an effort that'd be. |
|
I will run another build of st2 with the orquesta improvements. If those tests pass that is a good sign. We still need approval from another maintainer, though. Redis caching is a good idea, but there are lots more deep copies that could be shallow copies before that. |
|
orquesta changes are not dependent on st2 changes. |
I have a test where I make a curl request to a simple endpoint that returns a json payload containing 500 items like:
{"data": f"this is a sentence that has a counter {count} as well as some other text"}. The changes toorquesta/specs/native/v1/models.pyshave about half the time of the render function by doing a deep copy of the full object before we iterate over every single item, since all we really care about is setting the item (the rest of the payload does not change), so far from my testing it appears we can safely perform this action. This may be something that works nicely when combined with the changes in this PR: #256 , though they fix different issues from what I can see.