Skip to content

WIP: Resolve types in processExpr#5224

Draft
ondrejmirtes wants to merge 11 commits into2.1.xfrom
expression-result-type
Draft

WIP: Resolve types in processExpr#5224
ondrejmirtes wants to merge 11 commits into2.1.xfrom
expression-result-type

Conversation

@ondrejmirtes
Copy link
Member

Hi, this is step ~7 out of 12 I have planned. The endgoal is to get rid of 17 % of performance penalty and repeated calls to MutatingScope::resolveType. Of course the performance improvement might not pan out (the new version might also have bottlenecks we'll not see until the very end), but I still like how the new code looks like so I think the rewrite is worth it.

Right now PHPStan traverses AST during analysis multiple times:

  • In NodeScopeResolver to go through the AST and update Scope with assignments etc.
  • In MutatingScope::resolveType to go through the AST of en entire expression to figure out its type
  • In TypeSpecifier to narrow types in Scope based on expressions like instanceof

There are many problems this causes. One of the most understandable examples is that MutatingScope actually also has to call NodeScopeResolver (and TypeSpecifier for that matter) when resolving type of BooleanAnd and others:

$leftResult = $this->nodeScopeResolver->processExprNode(new Node\Stmt\Expression($node->left), $node->left, $this, new ExpressionResultStorage(), new NoopNodeCallback(), ExpressionContext::createDeep());
- so there's definitely multiple invocations of the same code.

The ongoing refactoring aims to change that. I already introduced ExprHandler which consolidates code handling from NodeScopeResolver and MutatingScope into a single class for the same expression type. processExpr is what originally happened in NodeScopeResolver and resolveType is what happened in MutatingScope. But this didn't fundamentally change the code logic and performance.

My idea is to introduce getType into ExpressionResult - which means that after NodeScopeResolver::processExprNode finishes, we don't have just the updated Scope, we also have the Type of the expression. This should reduce the need to call NodeScopeResolver when resolving a type, because we already have the right scope to based our type resolution on!

Right now the refactoring duplicates and transforms the logic from resolveType to be suitably run in processExpr, specifically in typeCallback constructor parameter of ExpressionResult. This means that calling $scope->getType($expr) inside typeCallback is forbidden. Instead, the code should ask $exprResult->getType() from previously called processExprNode.

In cases where the $expr passed into getType is synthetic (virtual - is created with new in the code, doesn't come from the parsed AST), we need to call processExprNode and grab its type from ExpressionResult after.

Thanks to Fibers on PHP 8.1+, the code in resolveType is basically dead, MutatingScope::getType() should never be called. That's why I can live with the current duplication, in PHPStan 3.0 we'll be able to delete a lot of code. Currently it means that any changes to type resolution have to be edited twice - in resolveType and in typeCallback. If we fail to have equivalent logic, tests on PHP 7.4-8.0 or on PHP 8.1+ will fail.

This is starting to get some really nice benefits. Previously, NullsafeShortCircuitingHelper had to be called recursively for every eligible expression, to see if there's ?-> somewhere in the chain that would influence the final result. With the new way, this code only has to happen once, in NullsafePropertyFetchHandler + NullsafeMethodCallHandler.

I imagine this will have other benefits, like no longer needing to unwrap assigns in TypeSpecifier

if ($expr instanceof Expr\Assign) {
$specifiedExprs[] = $expr->var;
$specifiedExprs[] = $expr->expr;
while ($expr->expr instanceof Expr\Assign) {
$specifiedExprs[] = $expr->expr->var;
$expr = $expr->expr;
}
, or bugs regarding to assignments in the middle of expressions (Fibers helped in some cases but not everywhere). One example for all: https://phpstan.org/r/51fc6cea-be4d-47f5-a4b8-e13416fd3f62

In the next steps, I also will move the code from TypeSpecifier to respective ExprHandlers, in a new specifyTypes method. And then I will copy and adjust the code in processExpr so that ExpressionResult can inform about narrowed types in some form as well. And in PHPStan 3.0. we'll be able to delete all the specifyTypes methods as well.

I encourage you to wrap your head around this. I look forward to your feedback!

@ondrejmirtes ondrejmirtes changed the title Expression result type WIP: Resolve types in processExpr Mar 15, 2026
Copy link
Contributor

@staabm staabm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't say that I understood all the details and can oversee the upcoming changes, but I think...

In the next steps, I also will move the code from TypeSpecifier to respective ExprHandlers, in a new specifyTypes method. And then I will copy and adjust the code in processExpr so that ExpressionResult can inform about narrowed types in some form as well. And in PHPStan 3.0. we'll be able to delete all the specifyTypes methods as well.

sounds like a great direction - as I understood it, we will have logic specific to certain expressions (no matter whether its type-specifying or typ-resolving in one class instead of cluttered over multiple class) - sounds like a big plus.

this structural change alone fells like a good thing - no matter whether its getting faster, or we only stay on the speed we have atm.

return $this->cachedKeepVoidType = $this->withExpr($clonedExpr)->getType();
}

private function getTypeByScope(MutatingScope $scope): Type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just naming, but I feel it misleading this class has a getTypeForScope and a getTypeByScope method

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, one of them is meant to be used inside typeCallback, and another one is just a private method for code deduplication. Could be named better.

We can't actually use ExpressionResult::getType() inside typeCallback, that's what getTypeForScope is for. Because if someone asks for getNativeType(), we need to resolve native type for the whole chain, not just for the first shallow level.

For example, if we have $foo->bar() and a rule asks for getNativeType of this expression, it's not enough to look up the native return type of the bar() method. We also need to check what is it called on. If $foo (by a native typehint) is only object, the resolved type will be mixed and not whatever the bar() native return type is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we didn't have getType/getNativeType difference, this could be a little bit simple.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants