TXTH is a simple text file that uses text commands to simulate a header for files unsupported by vgmstream, mainly headerless audio.
When an unsupported file is loaded (for instance "bgm01.snd"), vgmstream tries to find a TXTH header in the same dir, in this order:
- `(filename.ext).txth`
- `.(ext).txth`
- `.txth`
If found and parsed correctly (the .txth may be rejected if incorrect commands are found) vgmstream will try to play the file as described. Extension must be accepted/added to vgmstream (plugins like foobar2000 only load extensions from a whitelist in formats.c), or one could rename to any supported extension (like .vgmstream), or leave the file extensionless.
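As a rough illustration of the lookup order above, here is a minimal Python sketch. The function name `txth_candidates` is hypothetical and this is not vgmstream's actual code; it only builds the three names listed, in the same order.

```python
import os

def txth_candidates(filename):
    """Return the TXTH names to try for `filename`, in the order listed above."""
    # ".snd" for "bgm01.snd"; empty string for an extensionless file
    ext = os.path.splitext(filename)[1]
    return [
        filename + ".txth",  # (filename.ext).txth
        ext + ".txth",       # .(ext).txth
        ".txth",             # .txth
    ]

print(txth_candidates("bgm01.snd"))
```

For an extensionless file the second and third entries are the same name, which is harmless for a lookup like this.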
You can also use `.(sub).(ext).txth` (if the file is `filename.sub.ext`), to allow mixing slightly different files in the same folder. The `sub` part doesn't need to be an extension, for example:
- `001.1ch.str`, `002.1ch.str` may use `.1ch.txth`
- `003.2ch.str`, `004.2ch.str` may use `.2ch.txth`
- etc
## Example of a TXTH file
For an unsupported `bgm01.vag` this would be a simple TXTH for it:
```
codec = PSX                 #data uses PS-ADPCM
sample_rate = @0x10$2       #get sample rate at offset 0x10, 16 bit value
channels = @0x14            #get number of channels at offset 0x14
interleave = 0x1000         #fixed value
start_offset = 0x100        #data starts after exactly this value
num_samples = data_size     #find automatically number of samples in the file
```
A text file with the above commands must be saved as `.vag.txth` or `.txth` (preferably the former). Notice that the name starts with a "." (dot). On some Windows versions, files starting with a dot need to be created by appending a dot at the end when renaming: `.txth.`
While the main point is playing the file, many of TXTH's features are aimed towards keeping original data intact, for documentation and preservation purposes; try leaving data as untouched as possible and consider how the game plays the file, as there is a good chance some feature can mimic it.
## Available commands
The file is made of lines with `key = value` commands describing a header. Commands are all case sensitive and spaces are optional: `key=value`, `key = value`, and so on are all ok. Comments start with `#` and can be inlined.
The parser is fairly simple and may be buggy or unexpected in some cases. The order of keys is variable but some things won't work if others aren't defined (ex. bytes-to-samples may not work without channels or interleave) or need to be done in a certain order (due to technical reasons) as explained below.
To get a file playing you need to correctly set, at least: `codec` and sometimes `interleave`, `sample_rate`, `channels` and `num_samples`, or use the "subfile" feature.
### VALUES
The following can be used in place of `(value)` for `(key) = (value)` commands.
- `(number)`: constant number in dec/hex, unsigned (no +10 or -10).
  * Examples: `44100, 40, 0x40 (decimal=64)`
- `(offset)`: read a value at offset inside the file, format being `@(number)[:LE|BE][$1|2|3|4]`
  * `@(number)`: offset of the value (required)
    * if `base_offset` is defined this value is modified (see later)
  * `:LE|BE`: value is little/big endian (optional, defaults to LE)
  * `$1|2|3|4`: value has size of 8/16/24/32 bit (optional, defaults to 4)
  * Example: `@0x10:BE$2` means `get big endian 16b value at 0x10`
- `(field)`: uses current value of some fields. Accepted strings:
```
# - PCFX: 0=standard, 1='buggy encoder' mode, 2/3=same as 0/1 but with double volume
# - PCM4|PCM4_U: 0=low nibble first, 1=high nibble first
# - others: ignored
codec_mode = (variation)
```
#### (deprecated) VALUE MODIFIERS
*Use inline math instead of this.*
Changes next read to: `(key) = (value) */+- value_(op)`. Set to 0 when done using, as it affects ANY value. Priority is as listed.
```
value_mul|value_* = (value)
value_div|value_/ = (value)
value_add|value_+ = (value)
value_sub|value_- = (value)
```
#### INTERLEAVE / FRAME SIZE [REQUIRED depending on codec]
This value changes how data is read depending on the codec:
- For mono/interleaved codecs it's the amount of data between channels, and while optional (defaults described in the "codec" section) you'll often need to set it to get proper sound.
- For codecs with custom frame sizes (MSADPCM, MS-IMA, ATRAC3/plus) it means the frame size and is required.
- Interleave 0 means "stereo mode" for codecs marked as "mono/stereo", and setting it will usually force mono-interleaved mode.
Special values:
- `half_size`: sets interleave as data_size / channels automatically
```
interleave = (value)|half_size
```
#### INTERLEAVE IN THE LAST BLOCK
In some files with interleaved data the last block (`interleave * channels`) of data is smaller than normal, so `interleave` is smaller for that block. Setting this fixes decoding glitches at the end.
Note that this doesn't affect files with padding data in the last block (as the `interleave` itself is constant).
Special values:
- `auto`: calculate based on channels, interleave and data_size/start_offset
```
interleave_last = (value)|auto
```
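The exact `auto` formula isn't given here. One plausible reading, assuming the leftover bytes after the last full block are split evenly between channels, is sketched below in Python. The name `interleave_last_auto` is hypothetical and this is not taken from vgmstream's source.

```python
def interleave_last_auto(data_size, interleave, channels):
    """Guess the per-channel interleave of a short last block (assumed formula)."""
    block_size = interleave * channels       # size of one full interleaved block
    leftover = data_size % block_size        # bytes in the short last block, 0 if none
    return leftover // channels              # per-channel share of that last block

# 0x9000 bytes of stereo data with interleave 0x4000:
# one full 0x8000 block, then a short 0x1000 block
print(hex(interleave_last_auto(0x9000, 0x4000, 2)))
```

A result of 0 here simply means the data divides into full blocks and no special last-block handling would be needed.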
#### ID VALUES
Validates that `id_value` (normally set as a constant value) matches the value read at `id_offset`. The file will be rejected and won't play if the values don't match.
Can be redefined several times; the check runs whenever a new `id_offset` is found.
```
id_value = (value)
id_offset = (value)
```
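The check itself is simple to picture. Below is a minimal Python sketch, treating `id_offset` as a plain file position for simplicity and assuming unsigned reads; `id_matches` and its parameters are hypothetical names, not vgmstream's API.

```python
def id_matches(data, id_value, id_offset, size=4, big_endian=False):
    """Compare the unsigned value stored at id_offset against id_value."""
    raw = data[id_offset:id_offset + size]
    if len(raw) < size:
        return False  # not enough data to read the id, treat as a mismatch
    found = int.from_bytes(raw, "big" if big_endian else "little")
    return found == id_value

header = b"RIFF" + bytes(12)
# "RIFF" read as a big endian 32-bit value is 0x52494646
print(id_matches(header, 0x52494646, 0x00, big_endian=True))
```

Reading the same bytes as little endian gives a different number, so endianness has to match how the constant was written.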
#### NUMBER OF CHANNELS [REQUIRED]
```
channels = (value)
```
#### MUSIC FREQUENCY [REQUIRED]
```
sample_rate = (value)
```
#### DATA START
Where encoded data actually starts, after the header part. Defaults to 0.
```
start_offset = (value)
```
#### DATA SIZE
Special variable that can be used in sample values. Defaults to `(file_size - start_offset)`, re-calculated when `start_offset` is set. When multiple subsongs, `block_size` or padding are set, it's recalculated as well.
If data_size is manually set it stays constant and won't be auto changed.
```
data_size = (value)
```
#### DATA PADDING
Some files have extra padding at the end that is meant to be ignored. This adjusts the padding in `data_size`, manually or auto-calculated.
Special values (for PS-ADPCM only):
- `auto`: discards null frames
- `auto-empty`: discards null and 'empty' frames (for games with weird padding)
```
padding_size = (value)|auto|auto-empty
```
#### SAMPLE MEANINGS
Modifies the meaning of sample fields when set *before* them.
Accepted values:
- `samples`: exact sample (default)
- `bytes`: automatically converts bytes/offset to samples (applies after */+-& modifiers)
- `blocks`: same as bytes, but value is given in blocks/frames
  * Value is internally converted from blocks to bytes first: `bytes = (value * interleave*channels)`
Some codecs can't convert bytes-to-samples at the moment: `FFMPEG`. For XMA1/2, bytes does special parsing, with loop values being bit offsets within data (as XMA has a peculiar way to loop).
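To make the `blocks` and `bytes` conversions concrete, here is a small Python sketch for PS-ADPCM only, where each 0x10-byte frame decodes to 28 samples. The function names are hypothetical, and other codecs use different frame sizes and ratios.

```python
PSX_FRAME_BYTES = 0x10    # size of one PS-ADPCM frame
PSX_FRAME_SAMPLES = 28    # samples decoded from one frame

def blocks_to_bytes(value, interleave, channels):
    """bytes = value * interleave * channels, as described above."""
    return value * interleave * channels

def psx_bytes_to_samples(num_bytes, channels):
    """Samples per channel for `num_bytes` of PS-ADPCM data covering all channels."""
    return num_bytes // channels // PSX_FRAME_BYTES * PSX_FRAME_SAMPLES

total_bytes = blocks_to_bytes(4, 0x800, 2)        # 4 blocks of stereo data
print(hex(total_bytes), psx_bytes_to_samples(total_bytes, 2))
```

This also shows why `channels` and `interleave` need to be set before sample fields when using `bytes` or `blocks`.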
Force loop on or off, as loop start/end may be defined but not used. If not set, by default it loops when loop_end_sample is defined and less than num_samples.
For XMA1/2 + sample_type=bytes it means loop subregion, if read after loop values.
For other codecs it's added to loop start/end, if read before loop values (a format may rarely have rough loop offset/bytes, then a loop adjust in samples).
```
loop_adjust = (value)
```
#### ENCODER DELAY
Beginning samples to skip, a.k.a. priming samples or encoder delay, that some codecs use to "warm up" the decoder. This is needed for proper gapless support.
DSP needs a "coefs" list to decode correctly. These are 8*2 16-bit values per channel, starting from `coef_offset`.
Usually each channel uses its own list, so we may need to set separation per channel, usually 0x20 (16 values * 2 bytes). So channel N coefs are read at `coef_offset + coef_spacing * N`
While the coef table is almost always included per-file, some games have their coef table in the executable or precalculated somehow. You can set inline coefs instead of coef_offset. Format is a long string of bytes (optionally space-separated) like `coef_table = 0x1E02DE01 3C0C0EFA ...`. You still need to set `coef_spacing` and `coef_endianness` though.
```
coef_offset = (value)
coef_spacing = (value)
coef_endianness = BE|LE|(value)
coef_table = (string)
```
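The per-channel layout described above (`coef_offset + coef_spacing * N`) can be sketched in Python as follows. This assumes channel numbering starts at 0 and that coefs are signed 16-bit values; `read_dsp_coefs` is a hypothetical helper, not part of vgmstream.

```python
import struct

def read_dsp_coefs(data, coef_offset, coef_spacing, channels, big_endian=True):
    """Read 16 coefs (8*2 16-bit values) per channel, one list per channel."""
    fmt = (">" if big_endian else "<") + "16h"   # 16 signed 16-bit values (assumed signed)
    coefs = []
    for ch in range(channels):
        # channel N coefs start at coef_offset + coef_spacing * N
        coefs.append(list(struct.unpack_from(fmt, data, coef_offset + coef_spacing * ch)))
    return coefs

# Build a fake stereo header: coefs at 0x1C, one 0x60-byte header per channel
fake = bytearray(0x1C + 0x60 * 2)
struct.pack_into(">16h", fake, 0x1C, *range(16))
struct.pack_into(">16h", fake, 0x1C + 0x60, *range(-16, 0))
print(read_dsp_coefs(fake, 0x1C, 0x60, 2))
```

When the tables are packed back to back, `coef_spacing` would be 0x20, matching the "16 values * 2 bytes" figure above.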
#### ADPCM STATE
Some ADPCM codecs need to set up their initial or "history" state, normally one or two 16-bit PCM samples per channel, starting from `hist_offset`.
Usually each channel uses its own state, so we may need to set separation per channel.
State values can be little or big endian (usually BE for DSP), set `hist_endianness` directly or in an offset value where `0=LE, >0=BE`.
Normally audio starts with silence or hist samples are set to zero and can be ignored, but they do slightly affect the resulting output.
Currently used by DSP.
```
hist_offset = (value)
hist_spacing = (value)
hist_endianness = BE|LE|(value)
```
#### HEADER/BODY SETTINGS
Changes internal header/body representation to external files.
TXTH commands are done on a "header", and decoding on "body". When loading an unsupported file it becomes the "base" file that loads the .txth, and is both header and body.
You can alter those, mainly for files that split header and body in separate files (load base file and txth sets header on another file). It's also possible to load the .txth directly with a set body, as a sort of "reverse TXTH" (useful with bigfiles, as you could have one .txth per song).
Allowed values:
- (filename): open any file, subdirs also work (dir/filename)
- *.(extension): opens with same name as the "base" file (the one you open, not the .txth) plus another extension
- null: unloads file and goes back to defaults (body/header = base file).
Sets the number of subsongs in the file, adjusting reads per subsong N: `value = @(offset) + subsong_spacing*N`. Number/constants values aren't adjusted though.
Instead of `subsong_spacing` you can use `subsong_offset` (older alias).
Mainly for bigfiles with consecutive headers per subsong. Set `subsong_offset` to 0 when done, as it affects any reads. The current subsong number is handled externally by plugins or TXTP.
`name_offset` can be a (number) value, but being an offset it's also adjusted by `subsong_spacing`. If you need to point to some absolute offset (for example a subsong pointing to a name in another table) that doesn't depend on the subsong (must not be changed by `subsong_spacing`), use `name_offset_absolute`.
Tells TXTH to parse a full file (ex. an Ogg) at `subfile_offset`, with size of `subfile_size` (defaults to `file size - subfile_offset` if not set). This is useful for files that are just containers for other files, so you don't have to remove the extra data (since it could contain useful stuff like loop info).
Internal subfile extension can be changed to `subfile_extension` if needed, as vgmstream won't accept unknown extensions (for example if your file uses .vgmstream or .pogg you may need to set subfile_extension = ogg).
Setting any of those three will trigger this mode (it's ok to set offset 0). Once triggered most fields are ignored, but not all, explained later. This will also set some values like `channels` or `sample_rate` if not set for calculations/convenience.
```
subfile_offset = (value)
subfile_size = (value)
subfile_extension = (string)
```
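Conceptually, the subfile is just a byte range of the outer file. A minimal Python sketch of that idea, using the default size described above (`extract_subfile` is a hypothetical name, and vgmstream reads the range in place rather than copying it):

```python
def extract_subfile(data, subfile_offset=0, subfile_size=None):
    """Return the byte range a subfile would cover inside `data`."""
    if subfile_size is None:
        # default: everything from subfile_offset to the end of the file
        subfile_size = len(data) - subfile_offset
    return data[subfile_offset:subfile_offset + subfile_size]

container = b"HEADERxx" + b"OggS-rest-of-stream"
print(extract_subfile(container, 0x08))          # whole embedded stream
print(extract_subfile(container, 0x08, 4))       # only the first 4 bytes
```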
#### CHUNK DEINTERLEAVING
Some files interleave data chunks, for example 3 stereo songs pasted together, alternating 0x10000 bytes of data each. These settings allow vgmstream to play one of the chunks while ignoring the rest (read 0x10000 data, skip 0x10000*2).
File is first "dechunked", then played using other settings (`start_offset` would point within the internal "dechunked" file). It can be used to remove garbage data that affects decoding, too.
You need to set:
- `chunk_count`: total number of interleaved chunks (ex. 3=3 interleaved songs)
- `chunk_number`: first chunk to start (ex. 1=0x00000, 2=0x10000, 3=0x20000...)
  * If you set `subsong_count` first `chunk_number` will be auto-set per subsong (subsong 1 starts from chunk number 1, subsong 2 from chunk 2, etc)
- `chunk_start`: absolute offset where chunks start (normally 0x00)
- `chunk_size`: amount of data in a single chunk (ex. 0x10000)
For fine-tuning you can optionally set (before `chunk_size`, for reasons):
- `chunk_header_size`: header to skip before chunk data (part of chunk_size)
- `chunk_data_size`: actual data size (part of chunk_size, rest is header/padding)
So, if you set size to 0x1000 and header_size to 0x100, data_size is implicitly 0xF00; or if size is 0x1000 and data_size is 0x800, the last 0x800 is ignored padding. Use combinations of the above to make vgmstream "see" only actual codec data.
```
chunk_count = (value)
chunk_number = (value)
chunk_start = (value)
chunk_header_size = (value)
chunk_data_size = (value)
chunk_size = (value)
```
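The "dechunking" step can be pictured with a short Python sketch. It assumes `chunk_number` is 1-based (as in the list above) and that the implicit `chunk_data_size` is `chunk_size - chunk_header_size`; `dechunk` is a hypothetical helper that copies data, whereas vgmstream does this on the fly.

```python
def dechunk(data, chunk_start, chunk_size, chunk_count, chunk_number,
            chunk_header_size=0, chunk_data_size=None):
    """Collect only the data of one interleaved chunk stream."""
    if chunk_data_size is None:
        chunk_data_size = chunk_size - chunk_header_size  # rest of the chunk is data
    out = bytearray()
    offset = chunk_start + chunk_size * (chunk_number - 1)  # first chunk of this stream
    stride = chunk_size * chunk_count                       # distance to its next chunk
    while offset < len(data):
        begin = offset + chunk_header_size                  # skip the per-chunk header
        out += data[begin:begin + chunk_data_size]          # keep data, drop padding
        offset += stride
    return bytes(out)

# Three "songs" interleaved in 4-byte chunks
mixed = b"AAAABBBBCCCCaaaabbbbcccc"
print(dechunk(mixed, 0, 4, 3, 2))
```

With `chunk_header_size=1` the first byte of every chunk is dropped, and adding `chunk_data_size=2` also drops the trailing byte as padding.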
#### NAME TABLE
Some games have headers for all files pasted together separate from the actual data, but this order may be hard-coded or even alphabetically ordered by filename. In those cases you can set a "name table" that assigns constant values (one or many) to filenames. This table is loaded from an external text file (for clarity) and can be set to any name, for example `name_table = .names.txt`
```
name_table = (filename)
```
Inside the table you define lines mapping a filename to a bunch of values, in this format:
```
# base definition
(filename1): (value)
...
# may put multiple comma-separated values, spaces are ok
```
vgmstream then looks up the current file name in the table, and you can reference its numbers from the list as a `name_value` field, like `base_offset = name_value`, `start_offset = 0x1000 + name_value1`, `interleave = name_value5`, etc. `(filename)` can be with or without extension (like `bgm01.vag` or just `bgm01`). If the file's name isn't found it'll use default values, and if those aren't defined you'll get 0 instead. Being "values" they can use math or offsets too (`bgm05: 5*0x010`).
While you can put anything in the values, this feature is meant to be used to store some number that points to the actual data inside a real multi-header, that could be set with `header_file`. If you feel the need to store many constant values per file, there is a good chance it can be done in some better, simpler way.
You can set a default offset that affects next `@(offset)` reads making them `@(offset + base_offset)`, for cleaner parsing.
This is particularly interesting when combined with offsets to some long value. For example instead of `channels = @0x714` you could set `base_offset = 0x710, channels = @0x04`. Or values from the `name_table`, like `base_offset = name_value, channels = @0x04`.
It also allows parsing formats that set offsets to another offset, by "chaining" `base_offset`. With `base_offset = @0x10` (pointing to `0x40`) then `base_offset = @0x20`, it reads value at `0x60`. Set to 0 when you want to disable/reset the chain: `base_offset = @0x10` then `base_offset = 0` then `base_offset = @0x20` reads value at `0x20`
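The chaining described above can be traced with a small Python sketch. It models an `@(offset)` read as an unsigned value, little endian and 4 bytes by default (the defaults listed for offset values); `read_value` is a hypothetical helper and the file contents are made up.

```python
def read_value(data, offset, base_offset=0, big_endian=False, size=4):
    """Model of an @(offset) read: position is offset + base_offset."""
    pos = offset + base_offset
    return int.from_bytes(data[pos:pos + size], "big" if big_endian else "little")

# Made-up file: 0x10 holds 0x40, 0x60 holds 0x1234, 0x20 holds 0x5678
data = bytearray(0x80)
data[0x10:0x14] = (0x40).to_bytes(4, "little")
data[0x60:0x64] = (0x1234).to_bytes(4, "little")
data[0x20:0x24] = (0x5678).to_bytes(4, "little")

base_offset = 0
base_offset = read_value(data, 0x10, base_offset)   # base_offset = @0x10, gives 0x40
chained = read_value(data, 0x20, base_offset)       # @0x20 now reads at 0x60

base_offset = 0                                     # reset the chain
direct = read_value(data, 0x20, base_offset)        # @0x20 reads at 0x20 again
print(hex(chained), hex(direct))
```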
Most commands are evaluated and calculated immediately, every time they are found. This is by design, as it can be used to adjust and trick certain calculations.
It makes TXTHs a bit harder to follow, as they are order dependent, but otherwise it's hard to accomplish some things, or others become ambiguous.
For example, normally you are given a data_size in bytes, that can be used to calculate num_samples for all channels.
```
channels = 2
sample_type = bytes
num_samples = @0x10 #calculated from data_size
```
But sometimes this size is for a single channel only (even though the file may be stereo). You can temporarily change the channel number to force a correct calculation.
```
channels = 1 #not the actual number of channels
sample_type = bytes
num_samples = @0x10 #calculated from channel_size
channels = 2 #change once calculations are done
```
You can also use:
```
channels = 2
sample_type = bytes
num_samples = @0x10 * channels # resulting bytes is transformed to samples
```
Do note that when using special values/strings like `data_size` in `num_samples` and `loop_end_sample`, they must be alone to trigger.
```
data_size = @0x100
num_samples = data_size * 2 # doesn't transform bytes-to-samples (do it before? after?)
sample_rate = 0x04 # sample rate is the same for all subsongs
# Nth subsong ch: 0x04+0x00*N: 0x08
```
### Math
Sometimes header values are in "sectors" or similar concepts (typical in DVD games), and need to be adjusted to a real value using some complex math:
```
sample_type = bytes
start_offset = @0x10 * 0x800 # 0x15 * DVD sector size, for example
```
You can use `+-*/&` operators, and also certain fields' values:
```
num_samples = @0x10 * channels # byte-to-samples of channel_size
```
`data_size` is a special value for `num_samples` and `loop_end_sample` and will always convert as bytes-to-samples, though.
Priority is strictly left-to-right. Do add brackets though: they are accounted for, and if standard operator priority is implemented in the future, a .txth that relies on left-to-right order *will* break.
```
# normal priority
data_size = @0x10 * 0x800 + 0x800
# also works
data_size = (@0x10 + 1) * 0x800
# same as above but don't do this
# (may become @0x10 + (1 * 0x800) in the future)
data_size = @0x10 + 1 * 0x800
# doesn't work at the moment, so reorder as (1 * 0x800) + @0x10
data_size = @0x10 + (1 * 0x800)
# fails, wrong bracket count
data_size = (@0x10 + 1 * 0x800
# fails, wrong bracket count
data_size = )@0x10 + 1 * 0x800
```
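To see how left-to-right priority differs from conventional math, here is a minimal Python sketch of a bracket-less evaluator. It is only an illustration: `eval_left_to_right` is a hypothetical name, integer division for `/` is an assumption, and the value 0x15 stands in for whatever `@0x10` would read.

```python
import operator

# Operators listed above; integer division for "/" is an assumption
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "/": operator.floordiv, "&": operator.and_}

def eval_left_to_right(tokens):
    """Evaluate [value, op, value, op, value...] with no operator priority."""
    result = tokens[0]
    for i in range(1, len(tokens), 2):
        result = OPS[tokens[i]](result, tokens[i + 1])
    return result

header_value = 0x15  # pretend this was read from @0x10
ltr = eval_left_to_right([header_value, "+", 1, "*", 0x800])  # (0x15 + 1) * 0x800
conventional = header_value + 1 * 0x800                       # 0x15 + (1 * 0x800)
print(hex(ltr), hex(conventional))
```

The two results differ, which is why explicit brackets are the safer way to write such expressions.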
If a TXTH needs too many calculations it may be better to implement the format directly in vgmstream; consider reporting it.
### Modifiers
Remnant of simpler math (priority is fixed to */+-), *shouldn't be needed anymore*.
```
value_mul = 0x800
start_offset = @0x10
value_mul = 0
```
```
value_add = 1
channels = @0x08
value_add = 0
value_mul = channels
sample_type = bytes
num_samples = @0x10
value_mul = 0
```
```
value_add = 0x10
value_mul = 0x800
start_offset = @0x10
```
### Subfiles
Sometimes a file is just a wrapper for another common format. In those cases you can tell TXTH to just play the internal format:
```
subfile_offset = 0x20 # tell TXTH to parse a full file (ex. .ogg) at this offset
subfile_size = @0x10 # defaults to (file size - subfile_offset) if not set
subfile_extension = ogg # may be omitted if subfile extension is the same
# many fields are ignored
codec = PCM16LE
interleave = 0x1000
channels = 2
# a few fields are applied
sample_rate = @0x08
num_samples = @0x10
loop_start_sample = @0x14
loop_end_sample = @0x18
```
Most fields can't be changed after parsing since it doesn't make much sense technically, as the parsed subfile should supply them. You can set them to use bytes-to-samples conversions, though.
```
# parses subfile at start with some num_samples
subfile_offset = 0x20
# force recalculation of num_samples
codec = PSX
start_offset = 0x40
num_samples = data_size
```
### Chunks
Chunks affect some values (padding size, data size, etc) and are a bit sensitive to order at the moment, due to technical complexities:
```
# Street Fighter EX3 (PS2)
# base config is defined normally
codec = PSX
sample_rate = 44100
channels = 2
interleave = 0x8000
# set subsong number instead of chunk_number for subsongs
subsong_count = 26
#chunk_number = 1
chunk_start = 0
chunk_size = 0x10000
chunk_count = 26
# after setting chunks (sizes vary when 'dechunking')