-
Notifications
You must be signed in to change notification settings - Fork 767
Increase the cutoff to inline interpreted targets #21875
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase the cutoff to inline interpreted targets #21875
Conversation
@vijaysun-omr Can I please get your review on this change? |
{ | ||
static const char *cutOffForWarm = feGetEnv("TR_FreqCutOffForInterpretedCalleeInWarm"); | ||
static const int32_t cutOffFreqForWarm = cutOffForWarm ? atoi(cutOffForWarm) : FREQ_CUTOFF_INTERPRETED_WARM; | ||
isInterpretedCallWithLowFrequency = profileManager->getCallGraphProfilingCount(targetCallee->_calleeMethod->getPersistentIdentifier(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we using the getCallGraphProfilingCount
api here whereas we use the block frequency on the other side of this conditional, i.e. for hot and above ?
The reason I ask is that @mpirvu and I are considering if we should remove the getCallGraphProfilingCount
currently via experiments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reason I get the information from the callGraphProfilingCount is because profileManager->isColdCall
call that is made couple of lines after looks at the call graph profiling count and if it is higher than the lowest CFG freq (set to 5) it return false to isCold call - This change will increase that cutoff higher for interpreted call.
From what I understand - getCallGraphProfilingCount looks into the IProfiler data - With removal of getCallGraphProfilingCount
- for warm / cold compilations - what interface we are looking to utilize?
As a side note, I think this impacts one of the issue I found with JProfiling where it was unable to find the information with JProfiling as it was looking into the IProfiler data and one of the change to fix that was to update the ValueProfileInfo manager to allow to look into the JProfiling data for these APIs. I am bringing up this here as I am working on delivering the changes for the JProfiling in coming weeks - would want to make sure that I account for that.
[1].
return (getCallGraphProfilingCount(calleeMethod, method, byteCodeIndex, comp) < (comp->getFlowGraph()->getLowFrequency())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With removal of getCallGraphProfilingCount - for warm / cold compilations - what interface we are looking to utilize
Why isn't block frequency enough to determine the call count?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the code and reading some log files - I do see that those values (block frequnecy and the call graph profling count) are more or less in same range so I did think of this. For static and special calls, we do drive the decision in this part of code by looking at the block frequency only [1]. Only reason that prevented me from using the same block frequency is slight possibility of having discrepancy with indirect calls so I chose to match with what we do for <= warm compile.
[1].
openj9/runtime/compiler/optimizer/J9EstimateCodeSize.cpp
Lines 1715 to 1719 in 9372718
if (!callSites[i]->isIndirectCall()) | |
{ | |
profileManager->updateCallGraphProfilingCount( currentBlock, calltarget->_calleeMethod->getPersistentIdentifier(), i, comp()); | |
heuristicTrace(tracer(),"Depth %d: Updating Call Graph Profiling Count for calltarget %p count = %d",_recursionDepth, calltarget,profileManager->getCallGraphProfilingCount(calltarget->_calleeMethod->getPersistentIdentifier(), i, comp())); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the conceptual level I think that all call counts should be based on block frequency derived from branch profiling.
CallGraph entries in the IProfiler should only be used for determining the most used target and not for call counts (to be consistent).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is the cleanest position to take and @mpirvu and I are trying to take that position for now at least, as we try to remove consumers of these "call graph profiling count" apis. For example, we might be interested in removing the isColdCall
api Rahil mentioned and replace it with whatever amended block frequency based test is suitable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mpirvu - I do agree and as I mentioned, when making change, and inspecting what we get from block frequency and call graph count is more or less in same range, and as @vijaysun-omr mentioned, there is a plan to remove the getCallGraphProfilingCount
, we can simplify and use the block frequency for warm as well. Making the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made change to remove using getProfilingCallCount and use block frequency for the interpreted targets in https://github.com/eclipse-openj9/openj9/compare/58919c22da5348931f21048a9e30c0075d8d5efa..d386800e13f70f41f0ad960a8230a6bed31aa5a6
To ensure, I do not miss, interest in removing isColdCall
API usage is a separate set of changes we would be making - not part of this PR, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interest in removing isColdCall API usage is a separate set of changes we would be making - not part of this PR, right?
Correct
In hot and above compilatios, if a target it is looking to inline is still interpreted, increase the frequency cut-off for such targets. Currently this is not extended to warm compiles, but changes in this commit also adds environment variable to apply higher frequency cut-off for interpreted targets in warm compilations. Signed-off-by: Rahil Shah <[email protected]>
58919c2
to
d386800
Compare
Jenkins test sanity.functional,sanity.openjdk all jdk8,jdk21 |
Checks have passed, but I would prefer to get @mpirvu to review and approve formally before merging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
In hot and above compilatios, if a target it is looking to inline is still interpreted, increase the frequency cut-off for such targets. Currently this is not extended to warm compiles, but changes in this commit also adds environment variable to apply higher frequency cut-off for interpreted targets in warm compilations.