docs/templates/pages/getting_started.html

{% extends "base.html" %}
{% block title %}Following along{% endblock %}
{% block body %}
<h1>Following along</h1>
<p>I'd highly recommend following along with the details on these pages yourself. While my aim is to document as much as
    I can to the best of my ability, there will be things I miss, get wrong, or that are outright newer than these
    pages! Knowing where the information here came from is key to being able to reproduce the findings yourself. It's
    also just generally quite fun, and a useful skill.</p>
<p>With that out of the way, you might then ask <i>how</i> to follow along. We're going to be getting nitty and gritty
    with some games, bemani specifically, so the very first step is to get your hands on one of those. Because we're
    going to want to poke around, we need a version of the game running on our PC (or in a VM), rather than on a
    cabinet. If you feel like starting with a real cabinet, <a href="https://mon.im/2017/12/konami-arcade-drm.html">mon
        has a great blog post</a> about that.</p>
<p>The majority of direct references to code are based on Sound Voltex 4. The specific build I'm using in most snippets
    is KFC-2019020600; no need to be on private websites to be able to make use of that information.</p>
<hr>
<p>Depending on what you have, you may be staring at a working game at this point, or a big network error. Either way,
    you're sorted.</p>

<h2>Static vs dynamic analysis</h2>
<p>Quick detour here. In reverse engineering (what we're doing!) you'll often hear these two terms used.</p>
<p>Static analysis is when we have a copy of the content, be that custom file formats, executable files, you name it,
    and we aim to identify how they work without running them. This can be very powerful, as it allows us to reverse
    engineer things we either can't or don't want to run. For example, we can perform static analysis of <i>any</i>
    program on a modern desktop PC, even a program written for an old games console. If you're sat staring at a network
    error right now, that's also a great example of the sorts of problems static analysis allows us to work around.</p>
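<p>As a small taste of static analysis, the sketch below pulls printable ASCII strings out of a file's bytes without
    ever running it, much like the Unix <code>strings</code> utility. The function name and the four-character minimum
    are arbitrary choices for illustration, not anything the game or Ghidra defines.</p>

```python
import re


def ascii_strings(data: bytes, min_len: int = 4):
    """Return every run of printable ASCII in data that is at least min_len bytes long."""
    # Printable ASCII is the range from space (0x20) to tilde (0x7e).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return pattern.findall(data)
```

<p>Feeding it the contents of a DLL, for example <code>ascii_strings(open("libavs-win32.dll", "rb").read())</code>,
    will typically turn up log messages and names that make handy search terms later on.</p>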
<p>Dynamic analysis, as you may now have guessed, is when we start the program in question, and poke around while it's
    running. This poking can vary wildly; you might be curious about the values in memory during the execution of a
    function identified during static analysis, maybe you want to look at network traffic being created while the
    program runs, or maybe you just want to use the program normally to understand how it's intended to function.</p>
<p>We're going to be doing a lot of both, so strap in!</p>

<h2>Setting up our workspace</h2>
<p>There are a few essential tools every reverse engineer should have in their toolbox:</p>
<ul>
    <li>A <dfn title="Interprets machine code, converting it to human-readable assembly">disassembler</dfn></li>
    <li>A <dfn title="Allows reading of binary files">hex editor</dfn></li>
    <li>A <dfn title="Able to connect to a running program and show us internal information about it">debugger</dfn>
    </li>
    <li>A <dfn
            title="Takes assembly from a disassembler, and attempts to guess what the original source code may have looked like">decompiler</dfn>
        (incredibly useful but not essential)</li>
</ul>
<p>I'm going to be using:</p>
<ul>
    <li><a href="https://ghidra-sre.org/">Ghidra</a>: Disassembler and decompiler</li>
    <li><a href="https://hex-rays.com/ida-pro/">IDA</a>: Decompiler (for a second opinion; the decompiler isn't in the
        free version)</li>
    <li><a href="http://www.flexhex.com/">FlexHex</a>: Hex editor (there are <i>so</i> many free options here, so shop
        around)</li>
    <li><a href="https://visualstudio.microsoft.com/vs/community/">Visual Studio</a>: Debugger</li>
    <li><a href="https://www.wireshark.org/">Wireshark</a>: Network captures</li>
</ul>
<p><small><i>(Ghidra has a debugger now, but I'm yet to play around with it enough to ditch VS)</i></small></p>

<h2>Setting up Ghidra</h2>

<p>When you start Ghidra for the very first time, you will be presented with an empty screen. You'll need to create a
    new project; the name and location aren't especially important, but try and keep them sensible. After that, you can
    drag a file (libavs-win32.dll from your game is a good choice here) into the window. It will ask a series of
    questions; just accept the defaults for everything. Once it's loaded, double click on the file to open the code
    browser. You will be asked if you would like Ghidra to automatically analyse the file for you. Yes!</p>
<p>The interface can be pretty intimidating to start with, but there are a few important parts to note. Your window
    likely looks different to mine here, but the general layout will be roughly the same.</p>

<img src="{{ROOT}}/images/ghidra.png">

<p>Everything in the interface is a draggable window, and can be popped out of the main window, so don't be afraid to
    move things around if that helps. For example, I added the bookmarks window below my disassembler and decompiler,
    because I use it quite frequently.</p>

<h3>Key things to know in Ghidra:</h3>
<ul>
    <li>Double click on any label, function, or address to jump to that item. Alt+left and alt+right navigate through
        your location history.</li>
    <li>Middle click on any item to highlight all occurrences of it (can be rebound to left click if you prefer it as
        default)</li>
    <li><code>L</code> will rename the item the cursor is over, and <code>Ctrl+L</code> will change the type of the item
        (in the decompiler).</li>
    <li><code>G</code> will open the jump popup. You can type an address, function name, label, etc. here</li>
    <li><code>S</code> to open search. If at first you aren't seeing results, you may need to switch to searching
        <code>All Blocks</code>.
    </li>
    <li>There are a bunch of useful tools in the <code>Window</code> dropdown at the top! Have a play around; you can't
        break anything.</li>
    <li><code>;</code> allows you to add a comment to any line</li>
    <li>In the disassembly: <code>T</code> to change the type of the data at the cursor, <code>D</code> to disassemble
        at the cursor, <code>F</code> to create a function at the cursor, <code>Del</code> to delete a function, and
        <code>C</code> to clear the selected data, returning it back to unknown bytes.
    </li>
</ul>

<h2>Setting up Wireshark</h2>
<p>While less conventional as a dynamic analysis tool, Wireshark is invaluable when working on network-related
    tasks.</p>
<p>Either by editing <code>prop/ea3-config.xml</code>, or using spicecfg, pick a totally bogus service URL, with a
    distinct port. I'm going to use <code>http://127.0.0.1:54321</code>. Now start Wireshark, click once on the "adapter
    for loopback traffic capture", then in the capture filter enter <code>port 54321</code> (edit as required). Hit
    enter, and you'll start capturing. When you now start the game, some things will pop up, but because nothing is
    listening on that port (hopefully!), every attempt at communication is an error.</p>
<p>To remedy this, let's run something on that port! It can be quite literally anything. <code>nc -lvp 54321</code>
    will do, if you have netcat. With Wireshark still running, restart the game. This time something interesting should
    appear! If all went to plan, a green <code>HTTP</code> packet should show up.</p>
<img src="{{ROOT}}/images/wireshark.png">
<p>Clicking on it, we can see additional details. If we expand the blue HTTP section, and then the <code>Data</code>
    section at the bottom of that, we can view the raw data that was included in this HTTP POST request.</p>
<p>Wireshark is surprisingly flexible. Notice how in my screenshot the packet was identified as <code>XRPC</code>? I
    wrote a relatively simple protocol dissector, which allows me to view the contents of XRPC packets directly within
    Wireshark. While I might share it if I clean it up, it only took an hour or so in an evening to write; my aim is
    that these documents provide everything you could ever need to be able to quickly write your own too.</p>
<img src="{{ROOT}}/images/wireshark2.png">
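<p>If netcat isn't to hand, a few lines of Python can stand in for the listener used earlier. This is a minimal sketch
    using only the standard <code>socket</code> module; <code>listen</code> and <code>capture_once</code> are
    illustrative names, and it only returns the first chunk of bytes the client sends rather than speaking HTTP
    back.</p>

```python
import socket


def listen(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Open a TCP socket that accepts connections on host:port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts without waiting for the old socket to time out.
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    return server


def capture_once(server: socket.socket, max_bytes: int = 65536) -> bytes:
    """Accept a single connection and return the first chunk of data it sends."""
    conn, _addr = server.accept()
    with conn:
        return conn.recv(max_bytes)
```

<p>Calling <code>capture_once(listen(54321))</code> before starting the game should hand back the start of the raw
    request. The game will still report an error, since nothing ever replies, but that's enough for Wireshark to have
    something to show.</p>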

<h2>Setting up Visual Studio</h2>
<p>Saved the worst for last, I'm afraid. Once Visual Studio starts, drag the exe you use to start the game into it. Odds
    are this is <code>spice.exe</code>. Visual Studio, in stark contrast to Ghidra, is totally barren.</p>
<img src="{{ROOT}}/images/vs.png">
<p>When you press the start button, VS will likely ask you to restart it in elevated mode; go ahead and do that.</p>
<img src="{{ROOT}}/images/vs2.png">
<p>Wow. That's a lot more stuff, but it all seems a bit empty? As a debugger, VS only allows you to poke around while
    the program is paused. We can manually pause using the pause icon at the top, which would normally be sufficient.
    Unfortunately, in our case, we're looking at a far bigger project. Odds are when you pause the program you will get a
    message that it's running "external" code, or you end up somewhere totally random.</p>
<p>To solve this, we can set up VS to automatically pause for us.
    <code>Debug -&gt; New Breakpoint -&gt; Function Breakpoint</code> is the option we use to do this. VS will then
    allow us to enter a... function name? Aah. The expectation being made here is that we are debugging our own program,
    and have the full source code. Thankfully, we can instead enter an address here, by prefixing its address with
    <code>0x</code>. This is where both static and dynamic analysis work together.
</p>
<p>If you run the program again now (stop it if it's still running) Visual Studio will know to automatically pau- not so
    fast. The addresses we can see listed in Ghidra are the addresses we would expect if the program was loaded
    into memory at its "normal" location. Unfortunately for us, predictable addresses make life easier for genuinely
    malicious code, so a system called
    <a href="https://en.wikipedia.org/wiki/Address_space_layout_randomization">ASLR</a> is used to
    randomise the addresses the program will use. This really sucks for dynamic analysis.</p>
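<p>For reference, the relationship between an address in Ghidra and the same location in a relocated module is plain
    arithmetic: the offset from the image base stays the same, only the base moves. The sketch below assumes Ghidra
    loaded the DLL at <code>0x10000000</code> (check the image base Ghidra reports for your file), and the runtime
    base in the usage note is a made-up value.</p>

```python
GHIDRA_IMAGE_BASE = 0x10000000  # assumption: the base address Ghidra used when loading the DLL


def rebase(ghidra_address: int, runtime_base: int, ghidra_base: int = GHIDRA_IMAGE_BASE) -> int:
    """Translate an address seen in Ghidra to where it lives in the running process."""
    # The distance from the start of the module is unaffected by relocation.
    offset = ghidra_address - ghidra_base
    return runtime_base + offset
```

<p>With a hypothetical runtime base of <code>0x6F4A0000</code>, <code>hex(rebase(0x1000A920, 0x6F4A0000))</code> gives
    <code>0x6f4aa920</code>. The base changes between runs, which is why doing this by hand gets old quickly.</p>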
<p>Thankfully, we don't need to turn it off for our whole computer. We're going to use a tool called <a
        href="https://petoolse.github.io/petools/">PE Tools</a>. After starting the program, drag the DLL we're curious
    about onto it, <code>libavs-win32.dll</code>, for example. We need to lie to Windows that this DLL is not
    actually able to handle having its addresses randomised, which involves turning off <code>DLL can move</code>. This
    is going to directly edit the DLL file, so if you happen to be seeding it, consider this your warning to copy
    everything over to a different folder before continuing.</p>
<img src="{{ROOT}}/images/petools.png">
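<p>Under the hood this is a single bit in the PE header. The sketch below clears the
    <code>IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE</code> flag (<code>0x0040</code>), which is presumably what the
    <code>DLL can move</code> checkbox toggles; that mapping is an assumption rather than something PE Tools documents
    here. The function names are illustrative, and the header checksum is left untouched.</p>

```python
DYNAMIC_BASE = 0x0040  # IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE from the PE format


def dll_characteristics_offset(image: bytes) -> int:
    """Return the file offset of the DllCharacteristics field in a PE image."""
    # The DOS header stores the offset of the PE signature at 0x3C.
    e_lfanew = int.from_bytes(image[0x3C:0x40], "little")
    if image[:2] != b"MZ" or image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file")
    # Skip the 4-byte signature and the 20-byte COFF header to reach the optional header.
    optional_header = e_lfanew + 4 + 20
    # DllCharacteristics sits 70 bytes into the optional header for both PE32 and PE32+.
    return optional_header + 70


def clear_dynamic_base(image: bytearray) -> bool:
    """Clear the dynamic-base flag in place. Returns True if it was previously set."""
    offset = dll_characteristics_offset(image)
    flags = int.from_bytes(image[offset:offset + 2], "little")
    image[offset:offset + 2] = (flags & ~DYNAMIC_BASE & 0xFFFF).to_bytes(2, "little")
    return bool(flags & DYNAMIC_BASE)
```

<p>To use it on a real file, read the DLL into a <code>bytearray</code>, call <code>clear_dynamic_base</code>, and
    write the result to a copy rather than over the original.</p>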
<p>At this point, we can return to Visual Studio and add our breakpoint as previously. If you've been following along,
    <code>0x1000A920</code> is a good breakpoint to test. It's quite likely however that the breakpoint won't be hit.
    This is, to the best of my knowledge, an issue in VS. Delete the breakpoint, and this time start the program then
    hit the pause button immediately. Only once paused, re-add the breakpoint, then continue execution.
</p>
<img src="{{ROOT}}/images/vs3.png">
<p>The breakpoint should be hit almost right away. This is because that address is one of the logging functions :). In
    the bottom left, a list of registers is shown. This particular function takes its values via the stack, so paste
    the ESP register's value into the address box of the memory viewer. Right clicking, we can switch to
    <code>4-byte</code> mode, and can now see the stack clearly. The second number you see (ESP+0x04) is, in this case,
    the first argument to the function. Jumping to that value, we can see what it was about to log. In my case it was
    simply <code>ea3-boot</code>, but expect it to be different for you.
</p>
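<p>The manual steps above (read ESP, skip the return address, follow the pointer) can be sketched against a fake
    memory snapshot. Everything here is illustrative: the helper names, the base address, and the bytes are made up,
    and it assumes a 32-bit little-endian process paused at the very first instruction of the function.</p>

```python
def u32(memory: bytes, base: int, address: int) -> int:
    """Read a little-endian 4-byte value at address from a snapshot that starts at base."""
    start = address - base
    return int.from_bytes(memory[start:start + 4], "little")


def c_string(memory: bytes, base: int, address: int) -> str:
    """Read a NUL-terminated ASCII string at address."""
    start = address - base
    end = memory.index(b"\x00", start)
    return memory[start:end].decode("ascii", errors="replace")


def first_argument_string(memory: bytes, base: int, esp: int) -> str:
    """Follow the first stack argument as a string pointer."""
    # At function entry [ESP] holds the return address, so the first argument is at ESP+4.
    pointer = u32(memory, base, esp + 4)
    return c_string(memory, base, pointer)
```

<p>In a real session the memory viewer is doing exactly this for you; the sketch is only meant to make the
    ESP+0x04 step less magical.</p>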

{% endblock %}