Skip to content

x86: Implement fma intrinsic #21118

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 27, 2025
Merged

Conversation

BradleyWood
Copy link
Member

@BradleyWood BradleyWood commented Feb 12, 2025

This PR implements Math.fma intrinsics on x86 with vfmadd instructions.

Issue: #7474

@BradleyWood
Copy link
Member Author

Requires fma, extensions.

Here are performance numbers,

Benchmark Compared to Baseline Compared to Hotspot
FMABench.benchFloatFMA 552x 1.19x
FMABench.benchDoubleFMA 1482x 1.16x

@BradleyWood BradleyWood force-pushed the fma branch 4 times, most recently from ff7bb16 to 229e619 Compare February 12, 2025 06:34
@BradleyWood
Copy link
Member Author

FYI @0xdaryl, @JamesKingdon

@hzongaro Would you mind reviewing?

@JamesKingdon
Copy link
Contributor

Thanks Brad, what a result!

@hzongaro hzongaro self-requested a review February 12, 2025 15:48
@hzongaro hzongaro self-assigned this Feb 12, 2025
Copy link
Member

@hzongaro hzongaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pulling this together so quickly! I just have a few comments and questions.

@BradleyWood
Copy link
Member Author

See the force-push

Copy link
Member

@hzongaro hzongaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the changes look good. I will wait for @0xdaryl to review before running tests.

Copy link
Contributor

@0xdaryl 0xdaryl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the logic behind choosing the best instruction format to use is sound. I just have a few issues with the approach behind that.

@BradleyWood
Copy link
Member Author

@hzongaro The only thing I changed is comments, but was forced to rebase and fix conflicts. I think everything is ready to run testing.

@hzongaro
Copy link
Member

@0xdaryl, may I ask you to confirm that your comments were all resolved to your satisfaction?

@hzongaro
Copy link
Member

Jenkins test sanity.functional,sanity.openjdk xlinux,win,xmac jdk11,jdk17,jdk21

@hzongaro
Copy link
Member

The many jvmti test failures in JDK 11 sanity.functional xlinux and Windows appear to be unrelated to this change. I see the same failures with another pull request test run.

The failures running cmdlineTester_jfr_0 for JDK 17 sanity.functional Windows and JDK 21 sanity.functional Windows are due to known issue #21181.

The failures in JDK running jdk_util_1 for JDK 21 sanity.openjdk x86-64 macOS appears to be known issue #17270.

I will rerun sanity.functional testing for JDK 11 x86-64 macOS

Jenkins test sanity.functional xmac jdk11

@hzongaro
Copy link
Member

@BradleyWood, it looks there are merge conflicts that you will need to resolve before I will be able to rerun the JDK 11 xmac testing.

Signed-off-by: Bradley Wood <[email protected]>
@BradleyWood
Copy link
Member Author

@hzongaro Fixed conflicts

@hzongaro
Copy link
Member

Jenkins test sanity.functional xmac jdk11

@hzongaro hzongaro merged commit 2dd07eb into eclipse-openj9:master Feb 27, 2025
5 checks passed
@BradleyWood
Copy link
Member Author

@hzongaro Should I double-deliver on monday if no issues arise by then?

@hzongaro
Copy link
Member

hzongaro commented Mar 3, 2025

Should I double-deliver on monday if no issues arise by then?

I this is a safe change. I would be OK with it if there are no objections.

@BradleyWood
Copy link
Member Author

Is this is a safe change.

I think its on the safer side as far as intrinsics go. The only concern I have is that the operands which may be loaded directly from memory could come in any of many possible combinations. I did my best to test this whilst I was working on it buy disabling a few optimizations and placing constants (such as 1f) in place of variables to force memory loads at specific spots.

On the other hand, this change has survived 4 or 5 days of builds/tests without issues, provides significant performance benefits, and has a workaround option to disable the intrinsic if needed.

FYI, @pshipton

@hzongaro
Copy link
Member

hzongaro commented Mar 4, 2025

I this is a safe change.

Sorry for that typo! I meant to say, "I think this is a safe change." I agree with @BradleyWood's analysis.

@pshipton
Copy link
Member

pshipton commented Mar 4, 2025

Getting it in today would be good, before we start any milestone builds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants