commit 546df244b0e6fd9aad19880b8af1d7a22406a534
Author: Bottersnike
+
+ Why?
+
+ I was curious how these APIs work, yet could find little to nothing on Google. There are a number of
+ closed-source projects, with presumably similarly closed-source internal documentation, and a scattering of
+ implementations of things, yet I couldn't find a site that actually just documents how the API works. If I'm
+ going to have to reverse engineer an open source project (or a closed source one, for that matter), I might as
+ well just go reverse engineer an actual game (or its stdlib, as most of my time has been spent currently).
+
+ These pages are very much a work in progress, and are being written as I reverse engineer parts of the
+ protocol. I've been asserting all my assumptions by writing my own implementation as I go, however it currently
+ isn't sharable quality code and, more importantly, the purpose of these pages is to make implementation of one's
+ own code hopefully trivial.
+
+ Sharing annotated sources for all of the games' stdlibs would be both impractical and unwise. Where relevant
+ however I try to include snippets to illustrate concepts, and have included their locations in the source for if
+ you feel like taking a dive too.
+
+ If you're here because you work on one of those aforementioned closed source projects, hello! Feel free to share
+ knowledge with the rest of the world, or point out corrections. Or don't; you do you.
+
+ This site intentionally looks not-great. I don't feel like changing that, and honestly quite like the aesthetic.
+
+ eAmuse uses XML for its application layer payloads*. This XML is either verbatim, or in a custom packed binary
+ format. Each tag that contains a value has a It is perhaps simpler to illustrate with an example, so: Arrays are encoded by concatenating every value together, with spaces between them. Data types that have multiple
+ values, are serialized similarly. Therefore, an element storing an array of Besides this, this is otherwise a rather standard XML. Many packets, rather than using a string-based XML format, use a custom binary packed format instead. While it
+ can be a little confusing, remembering that this is encoding an XML tree can make it easier to parse. To start with, let's take a look at the overall structure of the packets. Every packet starts with the magic byte Currently known possible values for the content byte are: Decompressed packets contain an XML string. Compressed packets are what we're interested in here. The encoding flag indicates the encoding for all string types in the packet (more on those later). Possible
+ values are: The full table for these values can be found in libavs. A second table exists just before this one in the source, responsible for the
+ This is indexed using the following function, which maps the above encoding IDs to 1, 2, 3, 4 and 5
+ respectively. While validating Following the 4 byte header, is a 4 byte integer containing the length of the next part of the header (this is
+ technically made redundant as this structure is also terminated). This part of the header defines the schema that the main payload uses. A tag definition looks like: Structure names are encoded as densely packed 6 bit values, length prefixed ( The children can be a combination of either attribute names, or child tags. Attribute names are represented by
+ the byte Attributes (type All valid IDs, and their respective type, are listed in the following table. The bucket column here will be
+ used later when unpacking the main data, so we need not worry about it for now, but be warned it exists and is
+ possibly the least fun part of this format. Strings should be encoded and decoded according to the encoding specified in the packet header. Null termination is optional, however should be stripped during decoding. All of these IDs are The full table for these values can be found in libavs. This table contains the names of every tag, along
+ with additional information such as how many bytes that data type requires, and which parsing function
+ should be used. While I'm not totally sure, I have a suspicion this type is used internally as a pseudo-type. Trying to
+ identify its function as a parsable type has some obvious blockers: All of the types have convenient If we have a look inside the function that populates node sizes ( In the same function, however, we can find a second (technically first) check for the array type. This seems to suggest that internally arrays are represented as a normal node, with the Also of interest from this snippet is the fact that This is where all the actual packet data is. For the most part, parsing this is the easy part. We traverse our
+ schema, and read values out of the packet according to the value indicated in the schema. Unfortunately, Konami
+ decided all data should be aligned very specifically, and that gaps left during alignment should be backfilled
+ later. This makes both reading and writing somewhat more complicated, however the system can be fairly easily
+ understood. Firstly, we divide the payload up into 4 byte chunks. Each chunk can be allocated to either store individual
+ bytes, shorts, or ints (these are the buckets in the table above). When reading or writing a value, we first
+ check if a chunk allocated to the desired type's bucket is available and has free/as-yet-unread space within it.
+ If so, we will store/read our data to/from there. If there is no such chunk, we claim the next unclaimed chunk
+ for our bucket. For example, imagine we write the sequence While this might seem a silly system compared to just not aligning values, it is at least possible to intuit that it helps reduce wasted space. It should be noted that any variable-length structure, such as a string or an array, claims all chunks it encroaches on for the While the intuitive way to understand the packing algorithm is via chunks and buckets, a far more efficient implementation can be made that uses three pointers. Rather than try to explain in words, hopefully this python implementation should suffice as explanation:
+
+
+
+
+ Contents
+ Transport layer
+ Packet format
+ Bemani/Konami eAmuse API
+ Contents
+
+
+
+
+
+
+
+
+
+
+ Packet format
+
+
*Newer games use JSON, but this page is about XML.
The XML format
+
+ __type
attribute that identifies what type it is. Array types
+ have a __count
attribute indicating how many items are in the array. Binary blobs additionally have
+ a __size
attribute indicating their length (this is notably not present on strings, however).
+ <?xml version='1.0' encoding='UTF-8'?>
+<call model="KFC:J:A:A:2019020600" srcid="1000" tag="b0312077">
+ <eventlog method="write">
+ <retrycnt __type="u32" />
+ <data>
+ <eventid __type="str">G_CARDED</eventid>
+ <eventorder __type="s32">5</eventorder>
+ <pcbtime __type="u64">1639669516779</pcbtime>
+ <gamesession __type="s64">1</gamesession>
+ <strdata1 __type="str" />
+ <strdata2 __type="str" />
+ <numdata1 __type="s64">1</numdata1>
+ <numdata2 __type="s64" />
+ <locationid __type="str">ea</locationid>
+ </data>
+ </eventlog>
+</call>
3u8
([(1, 2, 3), (4, 5, 6)]
) would look like
+ this
+ <demo __type="3u8" __count="2">1 2 3 4 5 6</demo>
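The array convention above can be sketched in a few lines. This is only an illustration, not how any game parses it: the helper name `parse_array` is made up here, it handles integer element types only, and it assumes a missing `__count` means a single item.

```python
import re
import xml.etree.ElementTree as ET


def parse_array(xml_text):
    """Split a space-separated value list into tuples, per __type and __count."""
    node = ET.fromstring(xml_text)
    # A type such as "3u8" is an optional width prefix plus a scalar type name.
    match = re.fullmatch(r"(\d*)([a-z]+\d*)", node.get("__type"))
    if match is None:
        raise ValueError("unrecognised __type")
    width = int(match.group(1) or 1)
    # Assumption: an element without __count holds exactly one item.
    count = int(node.get("__count", "1"))
    values = [int(v) for v in (node.text or "").split()]
    if len(values) != width * count:
        raise ValueError("value count does not match __type and __count")
    return [tuple(values[i:i + width]) for i in range(0, len(values), width)]


print(parse_array('<demo __type="3u8" __count="2">1 2 3 4 5 6</demo>'))
```

For the `demo` element above this yields the two 3-tuples the element was built from.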
Packed binary overview
+ Packet layout, in order:
+ Byte 0: A0 (magic)
+ Byte 1: C (content)
+ Byte 2: E (encoding)
+ Byte 3: ~E
+ Bytes 4-7: Head length
+ Schema definition
+ FF, then Align
+ Data length
+ Payload
+ Align
+ 0xA0
. Following this is the content byte, the encoding byte,
+ and then the bitwise complement (one's complement) of the encoding byte.
+ C     Content
+ 0x42  Compressed data
+ 0x43  Compressed, no data
+ 0x45  Decompressed data
+ 0x46  Decompressed, no data
+
+ E     ~E    Encoding name(s)
+ 0x20  0xDF  ASCII
+ 0x40  0xBF  ISO-8859-1, ISO_8859-1
+ 0x60  0x9F  EUC-JP, EUCJP, EUC_JP
+ 0x80  0x7F  SHIFT-JIS, SHIFT_JIS, SJIS
+ 0xA0  0x5F  UTF-8, UTF8
+
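As a rough sketch of how the four header bytes might be checked: the function name `parse_header` is invented for this example, and the Python codec names are my own mapping of the encoding table, not something taken from libavs.

```python
# Content byte values from the table above.
CONTENT_TYPES = {
    0x42: "compressed data",
    0x43: "compressed, no data",
    0x45: "decompressed data",
    0x46: "decompressed, no data",
}

# Encoding byte -> Python codec name (assumed mapping for illustration).
ENCODINGS = {
    0x20: "ascii",
    0x40: "iso-8859-1",
    0x60: "euc_jp",
    0x80: "shift_jis",
    0xA0: "utf-8",
}


def parse_header(packet):
    """Validate the 4 byte header; return (content byte, codec name)."""
    if len(packet) < 4:
        raise ValueError("packet too short")
    magic, content, encoding, inverse = packet[:4]
    if magic != 0xA0:
        raise ValueError("bad magic byte")
    if content not in CONTENT_TYPES:
        raise ValueError("unknown content byte")
    # ~E must be the bitwise complement of E.
    if encoding ^ inverse != 0xFF:
        raise ValueError("encoding byte and its complement disagree")
    if encoding not in ENCODINGS:
        raise ValueError("unknown encoding byte")
    return content, ENCODINGS[encoding]


print(parse_header(bytes([0xA0, 0x42, 0x80, 0x7F])))
```

Checking `~E` costs one XOR and catches most garbage input early, which is presumably why it is there.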
+ Source code details
+ <?xml version='1.0' encoding='??'?>
line in XML files.
+
+ char* xml_get_encoding_name(uint encoding_id) {
+     return ENCODING_NAME_TABLE[((encoding_id & 0xe0) >> 5) * 4];
+ }
~E
isn't technically required, it acts as a useful assertion that the packet being
+ parsed is valid.
The packet schema header
+ Tag definition layout, in order: Type, nlen, Tag name, Attributes and children, FE
+ nlen
). The acceptable
+ alphabet is 0123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz
, and the packed values
+ are indices within this alphabet.
0x2E
followed by a length prefixed name as defined above. Child tags follow the above
+ format. Type 0x2E
must therefore be considered reserved as a possible structure type.0x2E
) represent a string attribute. Any other attribute must be defined as a child
+ tag. It is notable that 0 children is allowable, which is how the majority of values are encoded.
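A sketch of the 6 bit name packing, under assumptions the text above does not confirm: that the length prefix is a single byte counting characters, that values are packed most significant bit first, and that the final byte is zero padded. The function names are invented for this example.

```python
ALPHABET = "0123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"


def pack_name(name):
    """Length prefix, then each character as a 6 bit index into ALPHABET."""
    bits = 0
    for ch in name:
        bits = (bits << 6) | ALPHABET.index(ch)  # ValueError if not allowed
    nbytes = (6 * len(name) + 7) // 8
    # Assumption: unused low bits of the last byte are zero.
    bits <<= nbytes * 8 - 6 * len(name)
    return bytes([len(name)]) + bits.to_bytes(nbytes, "big")


def unpack_name(data):
    """Return (name, bytes consumed) for a packed name at the start of data."""
    length = data[0]
    nbytes = (6 * length + 7) // 8
    bits = int.from_bytes(data[1:1 + nbytes], "big") >> (nbytes * 8 - 6 * length)
    chars = [ALPHABET[(bits >> (6 * (length - 1 - i))) & 0x3F] for i in range(length)]
    return "".join(chars), 1 + nbytes


packed = pack_name("eventlog")
print(packed.hex(), unpack_name(packed))
```

Eight characters fit in six bytes this way, against eight for plain ASCII.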
+
+
+
+
+
+
+ ID
+ Bytes
+ C type
+ Bucket
+ XML names
+
+ ID
+ Bytes
+ C type
+ Bucket
+ XML names
+
+
+ 0x01
+ 0
+ void
+ -
+ void
+
+
+ 0x21
+ 24
+ uint64[3]
+ int
+ 3u64
+
+
+
+ 0x02
+ 1
+ int8
+ byte
+ s8
+
+
+ 0x22
+ 12
+ float[3]
+ int
+ 3f
+
+
+
+ 0x03
+ 1
+ uint8
+ byte
+ u8
+
+
+ 0x23
+ 24
+ double[3]
+ int
+ 3d
+
+
+
+ 0x04
+ 2
+ int16
+ short
+ s16
+
+
+ 0x24
+ 4
+ int8[4]
+ int
+ 4s8
+
+
+
+ 0x05
+ 2
+ uint16
+ short
+ u16
+
+
+ 0x25
+ 4
+ uint8[4]
+ int
+ 4u8
+
+
+
+ 0x06
+ 4
+ int32
+ int
+ s32
+
+
+ 0x26
+ 8
+ int16[4]
+ int
+ 4s16
+
+
+
+ 0x07
+ 4
+ uint32
+ int
+ u32
+
+
+ 0x27
+ 8
+ uint16[4]
+ int
+ 4u16
+
+
+
+ 0x08
+ 8
+ int64
+ int
+ s64
+
+
+ 0x28
+ 16
+ int32[4]
+ int
+ 4s32
+ vs32
+
+
+ 0x09
+ 8
+ uint64
+ int
+ u64
+
+
+ 0x29
+ 16
+ uint32[4]
+ int
+ 4u32
+ vu32
+
+
+ 0x0a
+ prefix
+ char[]
+ int
+ bin
+ binary
+
+ 0x2a
+ 32
+ int64[4]
+ int
+ 4s64
+
+
+
+ 0x0b
+ prefix
+ char[]
+ int
+ str
+ string
+
+ 0x2b
+ 32
+ uint64[4]
+ int
+ 4u64
+
+
+
+ 0x0c
+ 4
+ uint8[4]
+ int
+ ip4
+
+
+ 0x2c
+ 16
+ float[4]
+ int
+ 4f
+ vf
+
+
+ 0x0d
+ 4
+ uint32
+ int
+ time
+
+
+ 0x2d
+ 32
+ double[4]
+ int
+ 4d
+
+
+
+ 0x0e
+ 4
+ float
+ int
+ float
+ f
+
+ 0x2e
+ prefix
+ char[]
+ int
+ attr
+
+
+
+ 0x0f
+ 8
+ double
+ int
+ double
+ d
+
+ 0x2f
+ 0
+
+ -
+ array
+
+
+
+ 0x10
+ 2
+ int8[2]
+ short
+ 2s8
+
+
+ 0x30
+ 16
+ int8[16]
+ int
+ vs8
+
+
+
+ 0x11
+ 2
+ uint8[2]
+ short
+ 2u8
+
+
+ 0x31
+ 16
+ uint8[16]
+ int
+ vu8
+
+
+
+ 0x12
+ 4
+ int16[2]
+ int
+ 2s16
+
+
+ 0x32
+ 16
+ int16[8]
+ int
+ vs16
+
+
+
+ 0x13
+ 4
+ uint16[2]
+ int
+ 2u16
+
+
+ 0x33
+ 16
+ uint16[8]
+ int
+ vu16
+
+
+
+ 0x14
+ 8
+ int32[2]
+ int
+ 2s32
+
+
+ 0x34
+ 1
+ bool
+ byte
+ bool
+ b
+
+
+ 0x15
+ 8
+ uint32[2]
+ int
+ 2u32
+
+
+ 0x35
+ 2
+ bool[2]
+ short
+ 2b
+
+
+
+ 0x16
+ 16
+ int64[2]
+ int
+ 2s64
+ vs64
+
+ 0x36
+ 3
+ bool[3]
+ int
+ 3b
+
+
+
+ 0x17
+ 16
+ uint64[2]
+ int
+ 2u64
+ vu64
+
+ 0x37
+ 4
+ bool[4]
+ int
+ 4b
+
+
+
+ 0x18
+ 8
+ float[2]
+ int
+ 2f
+
+
+ 0x38
+ 16
+ bool[16]
+ int
+ vb
+
+
+
+ 0x19
+ 16
+ double[2]
+ int
+ 2d
+ vd
+
+ 0x38
+
+
+
+
+
+
+
+ 0x1a
+ 3
+ int8[3]
+ int
+ 3s8
+
+
+ 0x39
+
+
+
+
+
+
+
+ 0x1b
+ 3
+ uint8[3]
+ int
+ 3u8
+
+
+ 0x3a
+
+
+
+
+
+
+
+ 0x1c
+ 6
+ int16[3]
+ int
+ 3s16
+
+
+ 0x3b
+
+
+
+
+
+
+
+ 0x1d
+ 6
+ uint16[3]
+ int
+ 3u16
+
+
+ 0x3c
+
+
+
+
+
+
+
+ 0x1e
+ 12
+ int32[3]
+ int
+ 3s32
+
+
+ 0x3d
+
+
+
+
+
+
+
+ 0x1f
+ 12
+ uint32[3]
+ int
+ 3u32
+
+
+ 0x3e
+
+
+
+
+
+
+
+ 0x20
+ 24
+ int64[3]
+ int
+ 3s64
+
+
+ 0x3f
+
+
+
+
+
+ & 0x3F
. Any value can be turned into an array by setting the 7th bit
+ high (| 0x40
). Arrays of this form, in the data section, will be an aligned size: u32
+ immediately followed by size
bytes' worth of (unaligned!) values of the unmasked type.
Source code details
+ Note about the
+ array
type:printf
-using helper functions that are used to emit them when
+ serializing XML. All except one.libavs-win32.dll:0x1000cf00
),
+ it has an explicit case, however is the same fallback as the default case.array
+ type, however when serializing it's converted into the array types we're used to (well, will be after the
+ next sections) by masking 0x40 onto the contained type.void
, bin
, str
,
+ and attr
cannot be arrays. void
and attr
make sense, however
+ str
and bin
are more interesting. I suspect this is because Konami want to be able
+ to preallocate the memory, which wouldn't be possible with these variable length structures.
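To make the type byte handling concrete, here is a small sketch that splits a schema type byte into its base ID and array flag, and rejects the four types noted above as unable to be arrays. The name `split_type` is hypothetical.

```python
ARRAY_FLAG = 0x40
TYPE_MASK = 0x3F

# Types that cannot be arrays, per the note above.
NON_ARRAY_TYPES = {0x01: "void", 0x0A: "bin", 0x0B: "str", 0x2E: "attr"}


def split_type(type_byte):
    """Return (base type ID, is_array) for a schema type byte."""
    base = type_byte & TYPE_MASK
    is_array = bool(type_byte & ARRAY_FLAG)
    if is_array and base in NON_ARRAY_TYPES:
        raise ValueError(NON_ARRAY_TYPES[base] + " cannot be an array")
    return base, is_array


print(split_type(0x06))  # plain s32
print(split_type(0x46))  # array of s32
```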
+ The data section
+
+ byte, int, byte, short, byte, int, short
. The final output should look like:
+ Offset 0-2: byte, byte, byte
+ Offset 3: (unused)
+ Offset 4-7: int
+ Offset 8-9: short
+ Offset 10-11: short
+ Offset 12-15: int
+ int
bucket, disallowing the storage of bytes or shorts within them.
Implementing a packer
import math


class Packer:
    def __init__(self, offset=0):
        # One cursor per bucket; all start at the beginning of the data.
        self._word_cursor = offset
        self._short_cursor = offset
        self._byte_cursor = offset
        # Chunk boundaries are measured relative to the starting offset.
        self._boundary = offset % 4

    def _next_block(self):
        # Claim the next unclaimed 4 byte chunk and return its start.
        self._word_cursor += 4
        return self._word_cursor - 4

    def request_allocation(self, size):
        if size == 0:
            return self._word_cursor
        elif size == 1:
            # On a boundary, the current byte chunk is full or not yet claimed.
            if self._byte_cursor % 4 == self._boundary:
                self._byte_cursor = self._next_block() + 1
            else:
                self._byte_cursor += 1
            return self._byte_cursor - 1
        elif size == 2:
            # Same idea for the short bucket, two bytes at a time.
            if self._short_cursor % 4 == self._boundary:
                self._short_cursor = self._next_block() + 2
            else:
                self._short_cursor += 2
            return self._short_cursor - 2
        else:
            # Everything else takes whole chunks from the int bucket.
            old_cursor = self._word_cursor
            for _ in range(math.ceil(size / 4)):
                self._word_cursor += 4
            return old_cursor

    def notify_skipped(self, no_bytes):
        # Variable-length data claims every chunk it touches.
        for _ in range(math.ceil(no_bytes / 4)):
            self.request_allocation(4)
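To check the algorithm against the worked example from the data section, here is a compact restatement of the three-cursor idea. It is a simplified sketch (fixed start offset of 0, invented class name `ThreeCursorPacker`), not the implementation above.

```python
class ThreeCursorPacker:
    """Simplified three-cursor allocator, starting at offset 0."""

    def __init__(self):
        self.word = 0   # next unclaimed 4 byte chunk
        self.short = 0  # next free slot in the current short chunk
        self.byte = 0   # next free slot in the current byte chunk

    def allocate(self, size):
        if size == 1:
            if self.byte % 4 == 0:  # byte chunk full or not yet claimed
                self.byte = self.word
                self.word += 4
            offset = self.byte
            self.byte += 1
            return offset
        if size == 2:
            if self.short % 4 == 0:  # short chunk full or not yet claimed
                self.short = self.word
                self.word += 4
            offset = self.short
            self.short += 2
            return offset
        offset = self.word
        self.word += -(-size // 4) * 4  # round up to whole chunks
        return offset


packer = ThreeCursorPacker()
# byte, int, byte, short, byte, int, short
offsets = [packer.allocate(size) for size in (1, 4, 1, 2, 1, 4, 2)]
print(offsets)
```

The three bytes land at offsets 0 to 2, the first int at 4, both shorts share the chunk at 8, and the second int sits at 12, which matches the layout shown in the data section.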
Contents | +Transport layer | +Packet format | +
eAmuse packets are sent and received over HTTP (no S), with requests being in the body of POST
requests, and replies being in the, well, reply.
The packets are typically both encrypted and compressed. The compression format used is indicated by the X-Compress
header, and valid values are
none
lz77
Encryption is performed after compression, and uses RC4. RC4 is symmetric, so decryption is performed the same as encryption. That is, packet = encrypt(compress(data))
and data = decompress(decrypt(packet))
.
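Since RC4 is symmetric, a single function covers both directions. The following is a plain textbook RC4 written for illustration (the name `rc4` is mine), checked against the widely published "Key" / "Plaintext" test vector.

```python
def rc4(key, data):
    """Textbook RC4: the same call encrypts and decrypts."""
    # Key scheduling.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    # Keystream generation, XORed onto the data.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())
print(rc4(b"Key", ciphertext))
```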
Encryption is not performed using a single static key. Instead, each request and response has its own key that is generated.
+These keys are generated based on the X-Eamuse-Info
header.
This header loosely follows the format 1-[0-9a-f]{8}-[0-9a-f]{4}
. This corresponds to [version]-[serial]-[salt]
. TODO: Confirm this
Our per-packet key is then generated using md5(serial | salt | KEY)
. Identifying KEY
is left as an exercise for the reader, however should not be especially challenging.
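A sketch of the key derivation, with several assumptions flagged: that `|` means byte concatenation, that the serial and salt hex fields are decoded to raw bytes before hashing, and that `PLACEHOLDER_KEY` is a dummy value standing in for the real `KEY`, which is deliberately not given here.

```python
import hashlib
import re

# Header format as described above: 1-[0-9a-f]{8}-[0-9a-f]{4}
INFO_RE = re.compile(r"1-([0-9a-f]{8})-([0-9a-f]{4})")

# Hypothetical stand-in; the real KEY is not documented on this page.
PLACEHOLDER_KEY = b"not-the-real-key"


def derive_key(info_header, key=PLACEHOLDER_KEY):
    """md5(serial | salt | KEY), assuming hex-decoded fields and concatenation."""
    match = INFO_RE.fullmatch(info_header)
    if match is None:
        raise ValueError("unexpected X-Eamuse-Info format")
    serial, salt = match.groups()
    material = bytes.fromhex(serial) + bytes.fromhex(salt) + key
    return hashlib.md5(material).digest()


print(derive_key("1-01234567-89ab").hex())
```

The 16 byte digest would then be used as the RC4 key for that one packet.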
Packets are compressed using lzss. The compressed data structure is a repeating cycle of an 8 bit flags byte, followed by 8 values. Each value is either a single literal byte, if the corresponding bit in the preceding flag is high, or is a two byte lookup into the window.
+The lookup bytes are structured as pppppppp ppppllll
where p
is a 12 bit index in the window, and l
is a 4 bit integer that determines how many times to repeat the value located at that index in the window.
The exact algorithm used for compression is not especially important, as long as it follows this format. One can feasibly perform no compression at all, and instead insert 0xFF
every 8 bytes (starting at index 0), to indicate that all values are literals. While obviously poor for compression, this is an easy way to test without first implementing a compressor.
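The literal-only trick can be sketched as below, alongside a small decoder to round-trip it. The decoder makes assumptions this page does not state, so treat them as guesses: flag bits are read least significant bit first, the 12 bit index is a distance back from the end of the output, the 4 bit length is offset by 3, and an index of 0 ends the stream. The function names are invented.

```python
def encode_literals(data):
    """No real compression: a 0xFF flags byte before every 8 literal bytes."""
    out = bytearray()
    for start in range(0, len(data), 8):
        out.append(0xFF)
        out += data[start:start + 8]
    return bytes(out)


def decode(packed):
    """Decode the flags/values stream under the assumptions listed above."""
    out = bytearray()
    pos = 0
    while pos < len(packed):
        flags = packed[pos]
        pos += 1
        for bit in range(8):
            if pos >= len(packed):
                break
            if (flags >> bit) & 1:  # assumed: LSB-first flag order
                out.append(packed[pos])
                pos += 1
                continue
            if pos + 1 >= len(packed):
                return bytes(out)  # truncated lookup; stop quietly
            word = (packed[pos] << 8) | packed[pos + 1]  # pppppppp ppppllll
            pos += 2
            distance = word >> 4
            length = (word & 0x0F) + 3  # assumed length offset
            if distance == 0:  # assumed end-of-stream marker
                return bytes(out)
            for _ in range(length):
                # Copy byte by byte so overlapping runs repeat correctly.
                out.append(out[-distance] if distance <= len(out) else 0)
    return bytes(out)


sample = b"<call><eventlog/></call>"
assert decode(encode_literals(sample)) == sample
print(encode_literals(sample).hex())
```

Because every flag bit is set in the literal-only stream, the round trip does not depend on the guessed bit order or lookup layout.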