Ryujinx-uplift/ARMeilleure/Instructions
riperiperi 9db73f74cf
ARMeilleure: Respect FZ/RM flags for all floating point operations (#4618)
* ARMeilleure: Respect Fz flag for all floating point operations.

This is a change in strategy for emulating the Fz FPCR flag. Before, it was set before instructions that "needed it" and reset after. However, this missed a few hot instructions like the multiplication instruction, and the entirety of A32.

The new strategy is to set the Fz flag only in the following circumstances:

- Set to match FPCR before translated functions/loops are executed.
- Reset when calling SoftFloat methods, set when returning.
- Reset when exiting execution.

This allows us to remove the code around the existing Fz aware instructions, and get the accuracy benefits on all floating point instructions executed while in translated code.
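
The set/reset points listed above can be sketched as a small portable model. This is only an illustration, not the ARMeilleure implementation: `host_fz`, `host_mul`, `soft_mul`, `enter_translated_code`, `call_soft_float` and `exit_translated_code` are hypothetical names, and the host flush-to-zero control bit is simulated with a plain variable rather than a real control register. The one architectural detail assumed is that FZ is bit 24 of the guest FPCR.

```c
#include <float.h>
#include <stdbool.h>
#include <stdint.h>

/* FZ is bit 24 of the Arm FPCR. */
#define FPCR_FZ (1u << 24)

/* Hypothetical stand-in for the host's flush-to-zero control bit. */
static bool host_fz = false;

/* Models one host FP multiply: when host_fz is set, a subnormal
 * result is replaced by a zero of the same sign. */
static float host_mul(float a, float b)
{
    float r = a * b;
    if (host_fz && r != 0.0f && r > -FLT_MIN && r < FLT_MIN) {
        r = (r < 0.0f) ? -0.0f : 0.0f;
    }
    return r;
}

/* Stand-in for a soft-float helper that expects the default,
 * non-flushing host mode. */
static float soft_mul(float a, float b)
{
    return host_mul(a, b);
}

/* Set the host flag to match the guest FPCR before translated
 * code runs. */
static void enter_translated_code(uint32_t guest_fpcr)
{
    host_fz = (guest_fpcr & FPCR_FZ) != 0;
}

/* Reset the flag around a soft-float call, set it again on return. */
static float call_soft_float(float (*fn)(float, float), float a, float b)
{
    bool saved = host_fz;
    host_fz = false;
    float r = fn(a, b);
    host_fz = saved;
    return r;
}

/* Reset the flag when leaving translated execution. */
static void exit_translated_code(void)
{
    host_fz = false;
}
```

In this model, every multiply issued between `enter_translated_code` and `exit_translated_code` sees the guest's FZ setting without any per-instruction wrapping, while a helper reached through `call_soft_float` still runs in the default mode.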

Single step executions now need to be called with a context wrapper - right now it just contains the Fz flag initialization, and won't actually do anything on ARM.

This fixes a bug in Breath of the Wild where some physics interactions could randomly crash the game due to subnormal values not flushing to zero.

This is a draft right now because I need to answer the questions:
- Does dotnet avoid changing the value of Mxcsr?
- Is it a good idea to assume that? Or should the flag set/restore be done on every managed method call, not just softfloat?
- If we assume that, do we want a unit test to verify the behaviour?

I recommend testing a bunch of games, especially games affected when this was originally added, such as #1611.

* Remove unused method

* Use FMA for Fmadd, Fmsub, Fnmadd, Fnmsub, Fmla, Fmls

...when available.

Similar implementation to A32

* Use FMA for Frecps, Frsqrts

* Don't set DAZ.

* Add round mode to ARM FP mode

* Fix mistakes

* Add test for FP state when calling managed methods

* Add explanatory comment to test.

* Cleanup

* Add A64 FPCR flags

* Vrintx_S A32 fast path on A64 backend

* Address feedback 1, re-enable DAZ

* Fix FMA instructions By Elem

* Address feedback
2023-04-10 12:22:58 +02:00
CryptoHelper.cs Use ReadOnlySpan<byte> compiler optimization for static data (#3130) 2022-02-17 21:38:50 +01:00
InstEmitAlu32.cs Implement PLD and SUB (imm16) on T32, plus UADD8, SADD8, USUB8 and SSUB8 on both A32 and T32 (#3693) 2022-09-13 19:51:40 -03:00
InstEmitAlu.cs Reduce JIT GC allocations (#2515) 2021-08-17 15:08:34 -03:00
InstEmitAluHelper.cs T32: Add Vfp instructions (#3690) 2022-09-10 23:03:14 -03:00
InstEmitBfm.cs Reduce JIT GC allocations (#2515) 2021-08-17 15:08:34 -03:00
InstEmitCcmp.cs Reduce JIT GC allocations (#2515) 2021-08-17 15:08:34 -03:00
InstEmitCsel.cs Reduce JIT GC allocations (#2515) 2021-08-17 15:08:34 -03:00
InstEmitDiv.cs Reduce JIT GC allocations (#2515) 2021-08-17 15:08:34 -03:00
InstEmitException32.cs Removed unused usings. (#3593) 2022-08-18 18:04:54 +02:00
InstEmitException.cs Decoder: Exit on trapping instructions, and resume execution at trapping instruction (#3153) 2022-03-04 23:16:58 +01:00
InstEmitFlow32.cs T32: Add Vfp instructions (#3690) 2022-09-10 23:03:14 -03:00
InstEmitFlow.cs Reduce JIT GC allocations (#2515) 2021-08-17 15:08:34 -03:00
InstEmitFlowHelper.cs Fix return type mismatch on 32-bit titles (#3000) 2022-01-16 08:39:43 -03:00
InstEmitHash32.cs Minor code formatting (#4498) 2023-03-04 14:43:08 +01:00
InstEmitHash.cs Add SSE4.2 Path for CRC32, add A32 variant, add tests for non-castagnoli variants. (#1328) 2020-07-13 20:48:14 +10:00
InstEmitHashHelper.cs Minor code formatting (#4498) 2023-03-04 14:43:08 +01:00
InstEmitHelper.cs T32: Implement B, B.cond, BL, BLX (#3155) 2022-03-04 23:05:08 +01:00
InstEmitMemory32.cs Add ADD (zx imm12), NOP, MOV (rs), LDA, TBB, TBH, MOV (zx imm16) and CLZ thumb instructions (#3683) 2022-09-09 22:09:11 -03:00
InstEmitMemory.cs Fix return type mismatch on 32-bit titles (#3000) 2022-01-16 08:39:43 -03:00
InstEmitMemoryEx32.cs Implement Thumb (32-bit) memory (ordered), multiply, extension and bitfield instructions (#3687) 2022-09-10 22:51:00 -03:00
InstEmitMemoryEx.cs A32/T32/A64: Implement Hint instructions (CSDB, SEV, SEVL, WFE, WFI, YIELD) (#3694) 2022-09-14 18:18:15 -03:00
InstEmitMemoryExHelper.cs Reduce JIT GC allocations (#2515) 2021-08-17 15:08:34 -03:00
InstEmitMemoryHelper.cs Add ADD (zx imm12), NOP, MOV (rs), LDA, TBB, TBH, MOV (zx imm16) and CLZ thumb instructions (#3683) 2022-09-09 22:09:11 -03:00
InstEmitMove.cs Reduce JIT GC allocations (#2515) 2021-08-17 15:08:34 -03:00
InstEmitMul32.cs Implement Thumb (32-bit) memory (ordered), multiply, extension and bitfield instructions (#3687) 2022-09-10 22:51:00 -03:00
InstEmitMul.cs Add a new JIT compiler for CPU code (#693) 2019-08-08 21:56:22 +03:00
InstEmitSimdArithmetic32.cs Implement JIT Arm64 backend (#4114) 2023-01-10 19:16:59 -03:00
InstEmitSimdArithmetic.cs ARMeilleure: Respect FZ/RM flags for all floating point operations (#4618) 2023-04-10 12:22:58 +02:00
InstEmitSimdCmp32.cs Implement JIT Arm64 backend (#4114) 2023-01-10 19:16:59 -03:00
InstEmitSimdCmp.cs Implement JIT Arm64 backend (#4114) 2023-01-10 19:16:59 -03:00
InstEmitSimdCrypto32.cs Add Profiled Persistent Translation Cache. (#769) 2020-06-16 20:28:02 +02:00
InstEmitSimdCrypto.cs Add Profiled Persistent Translation Cache. (#769) 2020-06-16 20:28:02 +02:00
InstEmitSimdCvt32.cs ARMeilleure: Respect FZ/RM flags for all floating point operations (#4618) 2023-04-10 12:22:58 +02:00
InstEmitSimdCvt.cs Implement JIT Arm64 backend (#4114) 2023-01-10 19:16:59 -03:00
InstEmitSimdHash32.cs ARMeilleure: Hardware accelerate SHA256 (#3585) 2022-08-25 10:12:13 +00:00
InstEmitSimdHash.cs ARMeilleure: Hardware accelerate SHA256 (#3585) 2022-08-25 10:12:13 +00:00
InstEmitSimdHashHelper.cs ARMeilleure: Hardware accelerate SHA256 (#3585) 2022-08-25 10:12:13 +00:00
InstEmitSimdHelper32.cs ARMeilleure: Respect FZ/RM flags for all floating point operations (#4618) 2023-04-10 12:22:58 +02:00
InstEmitSimdHelper32Arm64.cs Implement JIT Arm64 backend (#4114) 2023-01-10 19:16:59 -03:00
InstEmitSimdHelper.cs ARMeilleure: Respect FZ/RM flags for all floating point operations (#4618) 2023-04-10 12:22:58 +02:00
InstEmitSimdHelperArm64.cs Implement JIT Arm64 backend (#4114) 2023-01-10 19:16:59 -03:00
InstEmitSimdLogical32.cs ARMeilleure: Add initial support for AVX512 (EVEX encoding) (cont) (#4147) 2023-03-20 16:09:24 -03:00
InstEmitSimdLogical.cs ARMeilleure: Add initial support for AVX512 (EVEX encoding) (cont) (#4147) 2023-03-20 16:09:24 -03:00
InstEmitSimdMemory32.cs Fix increment on Arm32 NEON VLDn/VSTn instructions with regs > 1 (#3695) 2022-09-13 08:24:09 +02:00
InstEmitSimdMemory.cs Reduce JIT GC allocations (#2515) 2021-08-17 15:08:34 -03:00
InstEmitSimdMove32.cs ARMeilleure: Add initial support for AVX512 (EVEX encoding) (cont) (#4147) 2023-03-20 16:09:24 -03:00
InstEmitSimdMove.cs Reduce JIT GC allocations (#2515) 2021-08-17 15:08:34 -03:00
InstEmitSimdShift32.cs Fpsr and Fpcr freed. (#3701) 2022-09-20 18:55:13 -03:00
InstEmitSimdShift.cs Implement JIT Arm64 backend (#4114) 2023-01-10 19:16:59 -03:00
InstEmitSystem32.cs ARMeilleure: Respect FZ/RM flags for all floating point operations (#4618) 2023-04-10 12:22:58 +02:00
InstEmitSystem.cs ARMeilleure: Respect FZ/RM flags for all floating point operations (#4618) 2023-04-10 12:22:58 +02:00
InstName.cs A32/T32/A64: Implement Hint instructions (CSDB, SEV, SEVL, WFE, WFI, YIELD) (#3694) 2022-09-14 18:18:15 -03:00
NativeInterface.cs Remove use of GetFunctionPointerForDelegate to get JIT cache function pointer (#4337) 2023-01-23 22:37:53 +00:00
SoftFallback.cs Fpsr and Fpcr freed. (#3701) 2022-09-20 18:55:13 -03:00
SoftFloat.cs Fpsr and Fpcr freed. (#3701) 2022-09-20 18:55:13 -03:00