FICTURE7 69093cf2d6
Optimize LSRA (#2563)
* Optimize `TryAllocateRegWithoutSpill` a bit

* Add a fast path for when all registers are live.
* Do not query `GetOverlapPosition` if the register is already in use
  (i.e., its free position is 0).

* Do not allocate the child split list if the interval is not a parent

* Turn `LiveRange` into a reference struct

`LiveRange` is now a reference wrapping struct like `Operand` and
`Operation`.

It has also been changed into a singly linked list. In micro-benchmarks,
traversing the linked list was faster than binary search on `List<T>`,
surprisingly even for quite large input sizes (e.g., 1,000,000).

This could be because the code gen for traversing the linked list is
much cleaner and there is no virtual dispatch happening when checking
whether intervals overlap.

* Turn `LiveInterval` into an iterator

The LSRA allocates in forward order and never inspects previous
`LiveInterval`s once they have expired. Something similar can be done
for the `LiveRange`s within the `LiveInterval`s themselves.

The `LiveInterval` is turned into an iterator which expires the
`LiveRange`s within it. The iterator is moved forward along with the
interval walking code, i.e., `AllocateInterval(context, interval, cIndex)`.

* Remove `LinearScanAllocator.Sources`

Local functions are less likely to cause allocations than lambdas.

* Optimize `GetOverlapPosition(interval)` a bit

Time complexity should be in O(n+m) instead of O(nm) now.
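
As a rough illustration of how an O(n+m) overlap check can work (not the
actual C# implementation), here is a two-pointer walk over two sorted,
non-overlapping range lists. The half-open `(start, end)` tuples and the
name `first_overlap` are assumptions made for this sketch.

```python
from typing import List, Optional, Tuple

# Assumed representation: half-open ranges [start, end), sorted by start,
# with no overlaps inside a single list.
Range = Tuple[int, int]


def first_overlap(a: List[Range], b: List[Range]) -> Optional[int]:
    """Return the first position where the two lists overlap, or None."""
    i = j = 0
    while i < len(a) and j < len(b):
        a_start, a_end = a[i]
        b_start, b_end = b[j]
        if a_start < b_end and b_start < a_end:
            # The overlap begins at the later of the two starts.
            return max(a_start, b_start)
        # The range that ends first cannot overlap anything further along
        # the other list, so it is safe to skip it.
        if a_end <= b_end:
            i += 1
        else:
            j += 1
    return None
```

Each step advances one of the two cursors, so the loop runs at most n+m
times, rather than comparing every pair of ranges.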

* Optimize `NumberLocals` a bit

Use the same idea as in `HybridAllocator` to store the visited state
in the MSB of the Operand's value instead of using a `HashSet<T>`.
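
A minimal sketch of the visited-bit idea, assuming each local carries an
integer value that fits in 31 bits so the top bit of a 32-bit word is
free. The `Local` class and `collect_unique` function are hypothetical
names for illustration, not the ones used in the allocator.

```python
VISITED_FLAG = 1 << 31  # assumed: bit 31 is otherwise unused by the value


class Local:
    """Hypothetical stand-in for an operand with an integer value field."""

    def __init__(self, value: int):
        self.value = value


def collect_unique(uses):
    """Return each local once, in first-seen order, without a hash set."""
    unique = []
    for local in uses:
        if (local.value & VISITED_FLAG) == 0:
            local.value |= VISITED_FLAG  # mark as visited in place
            unique.append(local)
    # Clear the flag again so the stored values are left unchanged.
    for local in unique:
        local.value &= ~VISITED_FLAG
    return unique
```

The membership test becomes a single bit check on data that is already
being read, at the cost of temporarily mutating the value.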

* Optimize `InsertSplitCopies` a bit

Avoid allocating a redundant `CopyResolver`.

* Optimize `InsertSplitCopiesAtEdges` a bit

Avoid redundant allocations of `CopyResolver`.

* Use stack allocation for `freePositions`

Avoid redundant computations.

* Add `UseList`

Replace `SortedIntegerList` with an even more specialized data
structure. It allocates memory on the arena allocators and does not
require copying use positions when splitting it.

* Turn `LiveInterval` into a reference struct

`LiveInterval` is now a reference wrapping struct like `Operand` and
`Operation`.

The rationale behind turning this into a reference wrapping struct is
that a `LiveInterval` is associated with each local variable, and
these intervals may themselves be split further. I've seen translations
having up to 8000 local variables.

To make the `LiveInterval` unmanaged, a new data structure called
`LiveIntervalList` was added to store child splits. This differs from
`SortedList<,>` because it can contain intervals with the same start
position.

Really wish we had something more like C++ templates in C#. :^(

* Optimize `GetChildSplit` a bit

No need to inspect the remaining ranges if we've reached a range which
starts after position, since the split list is ordered.
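
The early exit can be sketched as follows, assuming the splits are
half-open `(start, end)` tuples sorted by start. The function name mirrors
the commit but the signature and representation are illustrative only.

```python
def get_child_split(splits, position):
    """Find the split covering `position` in a list sorted by start."""
    for start, end in splits:
        if start > position:
            # Every later split starts even further right, so stop here.
            break
        if position < end:  # start <= position < end
            return (start, end)
    return None
```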

* Optimize `CopyResolver` a bit

Lazily allocate the fill, spill and parallel copy structures since most
of the time only one of them is needed.
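
A small sketch of the lazy allocation pattern, under the assumption that
each structure is a simple list created on first use. `LazyCopyResolver`
and its method names are hypothetical and do not match the real
`CopyResolver` API.

```python
class LazyCopyResolver:
    """Hypothetical resolver that allocates its lists only when needed."""

    def __init__(self):
        # Nothing is allocated up front.
        self.fills = None
        self.spills = None
        self.copies = None

    def add_fill(self, item):
        if self.fills is None:
            self.fills = []
        self.fills.append(item)

    def add_spill(self, item):
        if self.spills is None:
            self.spills = []
        self.spills.append(item)

    def add_copy(self, item):
        if self.copies is None:
            self.copies = []
        self.copies.append(item)

    @property
    def has_data(self):
        """True once any of the three structures has been created."""
        return not (self.fills is None
                    and self.spills is None
                    and self.copies is None)
```

A resolver that only ever records spills never pays for the fill or
parallel copy lists.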

* Optimize `BitMap.Enumerator` a bit

Marking `MoveNext` as `AggressiveInlining` allows RyuJIT to promote the
`Enumerator` struct into registers completely, reducing load/store code
a lot since it does not have to store the struct on the stack for ABI
purposes.

* Use stack allocation for `use/blockedPositions`

* Optimize `AllocateWithSpill` a bit

* Address feedback

* Make `LiveInterval.AddRange(,)` more conservative

Produces no diff against master, but just for good measure.
2021-10-08 18:15:44 -03:00

Ryujinx

An experimental Switch emulator written in C#

As of September 2021, Ryujinx has been tested on nearly 3,400 titles: ~3,000 boot past menus and into gameplay, with approximately 2,400 of those being considered playable. See the compatibility list here.

Usage

To run this emulator, we recommend that your PC have at least 8GB of RAM; less than this amount can result in unpredictable behavior and may cause crashes or unacceptable performance.

See our Setup & Configuration Guide on how to set up the emulator.

Latest build

These builds are compiled automatically for each commit on the master branch. While we strive to ensure optimal stability and performance prior to pushing an update, our automated builds may be unstable or completely broken.

The latest automatic build for Windows, macOS, and Linux can be found on the Official Website.

Building

If you wish to build the emulator yourself you will need to:

Step one: Install the x64 version of the .NET 5.0 (or higher) SDK.

Step two (choose one):
(Variant one)

After the .NET SDK is installed, copy the clone link from GitHub (via Clone or Download --> Copy HTTPS Link). You can then clone the repo using Git Bash or Git CMD.

(Variant two):

Download the ZIP Tarball. Then extract it to a directory of your choice.

Step three:

Build the app using a command prompt in the project directory. You can quickly open one by holding Shift in Explorer (in the Ryujinx directory) and right-clicking.
Then run dotnet build -c Release inside the Ryujinx project folder to build the Ryujinx binaries.

Ryujinx system files are stored in the Ryujinx folder. This folder is located in the user folder, which can be accessed by clicking Open Ryujinx Folder under the File menu in the GUI.

Features

  • Audio

    Audio output is entirely supported; audio input (microphone) isn't supported. We use C# wrappers for OpenAL, with SDL2 & libsoundio as fallbacks.

  • CPU

    The CPU emulator, ARMeilleure, emulates an ARMv8 CPU and currently has support for most 64-bit ARMv8 and some of the ARMv7 (and older) instructions, including partial 32-bit support. It translates the ARM code to a custom IR, performs a few optimizations, and turns that into x86 code.
    There are three memory manager options available depending on the user's preference, leveraging both software-based (slower) and host-mapped modes (much faster). The fastest option (host, unchecked) is set by default. Ryujinx also features an optional Profiled Persistent Translation Cache, which essentially caches translated functions so that they do not need to be translated every time the game loads. The net result is a significant reduction in load times (the amount of time between launching a game and arriving at the title screen) for nearly every game. NOTE: this feature is enabled by default in the Options menu > System tab. You must launch the game at least twice to the title screen or beyond before performance improvements are unlocked on the third launch! These improvements are permanent and do not require any extra launches going forward.

  • GPU

    The GPU emulator emulates the Switch's Maxwell GPU using the OpenGL API (version 4.5 minimum) through a custom build of OpenTK. There are currently four graphics enhancements available to the end user in Ryujinx: disk shader caching, resolution scaling, aspect ratio adjustment and anisotropic filtering. These enhancements can be adjusted or toggled as desired in the GUI.

  • Input

    We currently have support for keyboard, mouse, touch input, JoyCon input, and nearly all controllers. Motion controls are natively supported in most cases; for dual-JoyCon motion support, DS4Windows or BetterJoy are currently required. In all scenarios, you can set up everything inside the input configuration menu.

  • DLC & Modifications

    Ryujinx is able to manage add-on content/downloadable content through the GUI. Mods (romfs, exefs, and runtime mods such as cheats) are also supported; the GUI contains a shortcut to open the respective mods folder for a particular game.

  • Configuration

    The emulator has settings for enabling or disabling some logging, remapping controllers, and more. You can configure all of them through the graphical interface or manually through the config file, Config.json, found in the user folder which can be accessed by clicking Open Ryujinx Folder under the File menu in the GUI.

Compatibility

You can check out the compatibility list here. Anyone is free to submit an updated test on an existing game entry; simply follow the new issue template and testing guidelines, and post as a reply to the applicable game issue.

Don't hesitate to open a new issue if a game isn't already on there!

Help

If you are having problems launching homebrew or a particular game marked status-playable or status-ingame in our compatibility list, you can contact us through our Discord server. We'll take note of whatever is causing the app/game to not work, put it on the watch list and fix it at a later date.

If you need help with setting up Ryujinx, you can ask questions in the #support channel of our Discord server.

Contact

If you have contributions, need support, have suggestions, or just want to get in touch with the team, join our Discord server!

If you'd like to donate, please take a look at our Patreon.

License

This software is licensed under the terms of the MIT license. The Ryujinx.Audio project is licensed under the terms of the LGPLv3 license. This project makes use of code authored by the libvpx project, licensed under BSD, and the ffmpeg project, licensed under LGPLv3. See LICENSE.txt and THIRDPARTY.md for more details.

Credits
