Increase the cutoff to inline interpreted targets #21875

Merged
1 commit merged into eclipse-openj9:master on May 16, 2025

@r30shah r30shah commented May 14, 2025

In hot and above compilations, if a target the inliner is looking to inline is still interpreted, increase the frequency cut-off for such targets. Currently this is not extended to warm compiles, but the changes in this commit also add an environment variable to apply the higher frequency cut-off to interpreted targets in warm compilations.
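The idea in the description can be illustrated with a minimal, standalone sketch. It is not the OpenJ9 code: the constant values, the `Hotness` enum and all function names are hypothetical, and `std::getenv` stands in for the JIT's own `feGetEnv`. Only the environment variable name is taken from the diff under review.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical values chosen for illustration only; the real constants
// are defined in the OpenJ9 inliner and may differ.
constexpr int32_t kDefaultCutoff = 5;          // assumed ordinary low-frequency cutoff
constexpr int32_t kInterpretedCutoffHot = 50;  // assumed higher cutoff for interpreted targets

enum class Hotness { Cold, Warm, Hot, Scorching };

// Reads an integer override from the environment, falling back to a default.
inline int32_t cutoffFromEnv(const char *name, int32_t fallback)
{
    const char *value = std::getenv(name);
    return value ? static_cast<int32_t>(std::atoi(value)) : fallback;
}

// Picks the frequency cutoff for a call target.
inline int32_t frequencyCutoff(bool calleeIsInterpreted, Hotness level)
{
    if (!calleeIsInterpreted)
        return kDefaultCutoff;
    if (level >= Hotness::Hot)
        return kInterpretedCutoffHot;  // hot and above: raised cutoff
    if (level == Hotness::Warm)        // warm: raised only when the variable is set
        return cutoffFromEnv("TR_FreqCutOffForInterpretedCalleeInWarm", kDefaultCutoff);
    return kDefaultCutoff;
}

// A target whose frequency falls below the cutoff is treated as too cold to inline.
inline bool tooColdToInline(int32_t frequency, bool calleeIsInterpreted, Hotness level)
{
    return frequency < frequencyCutoff(calleeIsInterpreted, level);
}
```

With these assumed values, a call with frequency 20 would still be considered for inlining when the callee is compiled, but would be rejected in a hot compilation when the callee is still interpreted.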

@r30shah r30shah marked this pull request as ready for review May 14, 2025 20:55
r30shah commented May 14, 2025

@vijaysun-omr Can I please get your review on this change?

{
static const char *cutOffForWarm = feGetEnv("TR_FreqCutOffForInterpretedCalleeInWarm");
static const int32_t cutOffFreqForWarm = cutOffForWarm ? atoi(cutOffForWarm) : FREQ_CUTOFF_INTERPRETED_WARM;
isInterpretedCallWithLowFrequency = profileManager->getCallGraphProfilingCount(targetCallee->_calleeMethod->getPersistentIdentifier(),
Contributor

Why are we using the getCallGraphProfilingCount API here, whereas on the other side of this conditional (i.e. for hot and above) we use the block frequency?

The reason I ask is that @mpirvu and I are currently running experiments to decide whether we should remove getCallGraphProfilingCount.

Contributor Author

I get the information from the call graph profiling count because the profileManager->isColdCall call made a couple of lines later looks at the call graph profiling count, and if that count is higher than the lowest CFG frequency (set to 5) it returns false, i.e. the call is not treated as cold. This change raises that cutoff for interpreted calls.

From what I understand, getCallGraphProfilingCount looks into the IProfiler data. If getCallGraphProfilingCount is removed, which interface are we looking to use for warm and cold compilations?

As a side note, I think this affects one of the issues I found with JProfiling: these APIs looked only into the IProfiler data and so could not find the JProfiling information, and one of the changes to fix that was to update the ValueProfileInfo manager so that these APIs can also look into the JProfiling data. I am bringing this up here because I am working on delivering the JProfiling changes in the coming weeks and want to make sure I account for this.

[1].

return (getCallGraphProfilingCount(calleeMethod, method, byteCodeIndex, comp) < (comp->getFlowGraph()->getLowFrequency()));
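The comparison quoted above can be reduced to a small standalone sketch. `FlowGraphSketch` and both functions are hypothetical stand-ins rather than OpenJ9 types, and the default of 5 is taken from the value mentioned in this thread.

```cpp
#include <cstdint>

// Hypothetical stand-in for the compilation's flow graph.
struct FlowGraphSketch
{
    int32_t lowFrequency = 5;  // the thread states the low CFG frequency is set to 5
    int32_t getLowFrequency() const { return lowFrequency; }
};

// Mirrors the quoted test: a call is cold when its profiled count is
// below the flow graph's low frequency.
inline bool isColdCallSketch(int32_t callGraphProfilingCount, const FlowGraphSketch &cfg)
{
    return callGraphProfilingCount < cfg.getLowFrequency();
}

// The same test with a caller-supplied cutoff, which is how a higher
// threshold for interpreted targets could be expressed (an assumption).
inline bool isColdInterpretedCallSketch(int32_t count, int32_t interpretedCutoff)
{
    return count < interpretedCutoff;
}
```

Note that the comparison is strict: a count equal to the low frequency is not treated as cold.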

Contributor

With removal of getCallGraphProfilingCount - for warm / cold compilations - what interface we are looking to utilize

Why isn't block frequency enough to determine the call count?

Contributor Author

Looking at the code and reading some log files, I do see that those values (block frequency and the call graph profiling count) are more or less in the same range, so I did consider this. For static and special calls, we already drive the decision in this part of the code by looking at the block frequency only [1]. The only reason I did not use the same block frequency is the slight possibility of a discrepancy for indirect calls, so I chose to match what we do for warm and lower compilations.

[1].

if (!callSites[i]->isIndirectCall())
{
profileManager->updateCallGraphProfilingCount( currentBlock, calltarget->_calleeMethod->getPersistentIdentifier(), i, comp());
heuristicTrace(tracer(),"Depth %d: Updating Call Graph Profiling Count for calltarget %p count = %d",_recursionDepth, calltarget,profileManager->getCallGraphProfilingCount(calltarget->_calleeMethod->getPersistentIdentifier(), i, comp()));
}

Contributor

At the conceptual level I think that all call counts should be based on block frequency derived from branch profiling.
CallGraph entries in the IProfiler should only be used for determining the most used target and not for call counts (to be consistent).
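A minimal sketch of that separation of roles, assuming hypothetical types that are not part of OpenJ9 or the IProfiler: the call count comes only from the block frequency, while the call-graph entries are consulted only to pick the most used target.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical call-graph entry: a receiver target and its observed weight.
struct CallGraphEntrySketch
{
    std::string target;
    uint32_t weight;
};

// Hypothetical call site: the frequency of the enclosing block plus the
// call-graph entries recorded for the site.
struct CallSiteSketch
{
    int32_t blockFrequency;
    std::vector<CallGraphEntrySketch> entries;
};

// The call count is derived from block frequency only.
inline int32_t callCount(const CallSiteSketch &site)
{
    return site.blockFrequency;
}

// Call-graph entries are used only to choose the dominant target.
// Returns nullptr when no entries were recorded.
inline const CallGraphEntrySketch *dominantTarget(const CallSiteSketch &site)
{
    const CallGraphEntrySketch *best = nullptr;
    for (const auto &entry : site.entries)
        if (!best || entry.weight > best->weight)
            best = &entry;
    return best;
}
```

In this arrangement the entry weights never feed into the cold-call decision, so direct and indirect calls are judged on the same scale.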

@vijaysun-omr vijaysun-omr May 15, 2025

That is the cleanest position to take, and @mpirvu and I are trying to take that position, for now at least, as we try to remove consumers of these "call graph profiling count" APIs. For example, we might be interested in removing the isColdCall API Rahil mentioned and replacing it with whatever amended block-frequency-based test is suitable.

Contributor Author

Thanks @mpirvu, I agree. As I mentioned, when I made the change and inspected the values, the block frequency and the call graph count were more or less in the same range. Since, as @vijaysun-omr mentioned, there is a plan to remove getCallGraphProfilingCount, we can simplify and use the block frequency for warm as well. Making the change.

Contributor Author

I have made the change to stop using getCallGraphProfilingCount and to use block frequency for the interpreted targets in https://github.com/eclipse-openj9/openj9/compare/58919c22da5348931f21048a9e30c0075d8d5efa..d386800e13f70f41f0ad960a8230a6bed31aa5a6

To make sure I am not missing anything: removing the isColdCall API usage is a separate set of changes we would be making, not part of this PR, right?

Contributor

interest in removing isColdCall API usage is a separate set of changes we would be making - not part of this PR, right?

Correct

In hot and above compilations, if a target the inliner is looking to inline is
still interpreted, increase the frequency cut-off for such targets. Currently
this is not extended to warm compiles, but the changes in this commit also add
an environment variable to apply the higher frequency cut-off to interpreted
targets in warm compilations.

Signed-off-by: Rahil Shah <[email protected]>
@r30shah r30shah force-pushed the inlinerIncreaseCutoff branch from 58919c2 to d386800 Compare May 15, 2025 16:18
@vijaysun-omr

Jenkins test sanity.functional,sanity.openjdk all jdk8,jdk21

@vijaysun-omr

Checks have passed, but I would prefer to get @mpirvu to review and approve formally before merging.

@mpirvu mpirvu left a comment

Looks good to me.

@vijaysun-omr vijaysun-omr merged commit e3780b7 into eclipse-openj9:master May 16, 2025
51 checks passed