Section 5: SFe file format structure
5.1 File format extensions
The file format extension to use for SFe files is generally .sf4
:
.sf2
is avoided because SFe files are not SoundFonts, but simply banks that use formatting that is very similar to legacy SF2.04..sf3
is avoided because some Werner SF3 bank players may not support SFe features.
Despite .sf4
also being used by cognitone-formatted banks, these banks never existed due to a (fatal bug)[https://github.com/cognitone/sf2convert/issues/1] in cognitone’s sf2convert program.
The presence of a legacy SF file extension such as .sf2
or .sf3
does not necessarily denote a legacy SF bank! SFe-compatible programs are expected to parse the ifil
value and ISFe-list
sub-chunk to properly load the bank, regardless of the extension.
The file type should be referred to as SFe bank
and should not be referred to as SoundFont
or anything containing SoundFont
. SFSPEC24.PDF
states that files with additional chunks don’t conform to the legacy SF2.04 standard.
SFe currently does not use a MIME type.
5.2 General RIFF-type format structures (Update 17)
RIFF-type formats are the file format used in legacy SF2.04, Werner SF3 and SFe standards. There are a few different RIFF-type format structures:
- RIFF is the basic version with 32-bit chunk headers, and is used in legacy SF2.04 and Werner SF3.
- RIFS is a simple extension to RIFF to allow 64-bit chunk headers, and is simpler than RIFF64. It has these changes from RIFF:
- Initial FourCC is
RIFS
instead ofRIFF
. - 8 byte chunk size instead of 4 byte.
- Initial FourCC is
- Both RIFF and RIFS are little-endian formats. (Update 17)
RIFF-type formats are created in building blocks known as “chunks.”
Chunks are defined using this structure:
ckID
: type of data in chunk, equal to a unique 4-character code (FourCC), listed above.ckSize
: size of chunk, 4 bytes in RIFF and 8 bytes in RIFS (Update 17)ckDATA[ckSize]
: the data inside the chunks, including pad bytes.
Chunks can be further divided into “sub-chunks.”
Orders of chunks in all SFe banks are strictly defined, as in legacy SF2.04, and should be kept to, except for TSC mode.
5.3 Chunk header types
In SFe, there are different chunk header types that are used in the format. These correspond to different RIFF-type formats. Currently, there are two defined chunk header types:
- 32-bit static
- This is the same as legacy SF
- This corresponds to RIFF.
- The FourCC used is
RIFF
.
- 64-bit static (Update 17)
- This corresponds to RIFS.
- The FourCC used is
RIFS
. - To prevent loading by incompatible players, the
sfbk
fourcc is replaced withsfen
(SF-enhanced)
Future versions of SFe may define different chunk header types.
Figure 4: 32-bit static versus 64-bit static headers.
5.4 RIFF error checking features
RIFF-type formats have error checking features about:
- The size of the file
- The length of the chunks
- The length of the sub-chunks
Using this information, it is possible to check for damage to an SF(e) file.
5.5 Structure of the SFe 4 file format
5.5.1 SFe 4 file format structure outline
An SFe 4 file consists of:
RIFF
chunk (main chunk) - this changes depending on the chunk header type to be used.sfbk
ascii string - usesfen
with 64-bit chunk headersLIST
INFO
ascii string- Sub-chunks inside
INFO-list
in legacy SF2.04 -ifil
,isng
, etc. LIST
ISFe
ascii string- Chunks listed in section 5.5.2
LIST
sdta
ascii string- Sub-chunks inside
sdta-list
in legacy SF2.04 -smpl
,sm24
LIST
pdta
ascii string- Sub-chunks inside
pdta-list
in legacy SF2.04
Only SFe-specific chunks are listed for brevity. In this section, assume that any non-listed chunk is identical to SF2.04.
Figure 5: Legacy SF2.04 vs SFe 4.0 structures.
5.5.2 ISFe-list information
The ISFe-list
sub-chunk includes many different sub-chunks to show information about SFe-specific features. Generally, we use the ISFe-list
sub-chunk to make it clearer that this kind of information is SFe-specific.
Due to possible compatibility constraints, the ISFe-list
sub-chunk is found inside the INFO-list
sub-chunk, rather than as a fourth RIFF chunk. At least a few legacy sound cards (notably the SB X-Fi) do not error out on the inclusion of a fourth separate chunk. Officially, according to SFSPEC24.PDF
, additional sub-chunks mean that an SFe file is not conformant to the legacy SF2.04 standard.
The ISFe-list
sub-chunk currently contains these sub-chunks as of version 4.0:
SFty
chunk (UTF-8 string)SFvx
chunk (46 bytes)wSFeSpecMajorVersion
(WORD
)wSFeSpecMinorVersion
(WORD
)achSFeSpecType[20]
(CHAR
)wSFeDraftMilestone
(WORD
)achSFeFullVersion[20]
(CHAR
)
flag
chunk (multiple of 6 bytes)byBranch
(BYTE
)byLeaf
(BYTE
)dwFlags
(DWORD
)
5.5.3 Changed and removed chunks
In the pdta-list
sub-chunk, the wBank
chunk has been replaced by byBankMSB
and byBankLSB
. These are functionally the same, but expressed in a different way to make the specification more readable.
5.5.4 String encoding
For most string fields, the encoding to use is now UTF-8 instead of ASCII. Mojibake may result on legacy SF players when using characters unsupported by ASCII. Because some characters use multiple bytes in UTF-8, you may not be able to use as many multibyte characters compared to single-byte characters.
This applies to the isng
, INAM
, irom
, ICRD
, IENG
, IPRD
, ICOP
, ICMT
, ISFT
chunks from legacy SF2.04, as well as the achPresetName
(PHDR
), achInstrumentName
(INST
) and achSampleName
(SHDR
) fields.
5.6 INFO-list sub-chunk
5.6.1 ifil sub-chunk
The value of the ifil
sub-chunk is equal to 2.1024
or 3.1024
when using a 32-bit header, and 4.0
when using a 64-bit header. 2.1024
and 3.1024
are interchangeable. (Update 20)
The size must be exactly four bytes. Reject files with an ifil
sub-chunk that isn’t four bytes as Structurally Unsound.
If the ifil
sub-chunk is missing, either:
- Assume version
2.1024
,3.1024
or4.0
. (Update 8) - Reject the file as Structurally Unsound.
5.6.2 Versioning rules
In SFe 4.0, new versioning rules are used to replace the old ones.
The value of wMajor
increases every time a change is made to the format that makes it incompatible with existing players.
- There will be at least 6 months between the first draft milestone of a new
wMajor
version and the release of the final specification. - Older
wMajor
versions must be supported, either directly or via translation to the latest version. - We strive to minimize the number of these updates whenever possible in favor of updates that are backward compatible.
The value of wMinor
increases every time a change is made to the SFvx
sub-chunk while retaining backwards compatibility.
- These updates generally have only one or two draft milestones before the final specification releases.
- Feature updates in these versions are smaller.
The specification type used is found in the ISFe-list
sub-chunk.
5.6.3 Specification versions to ifil values
wSFeSpecMajorVersion | wSFeSpecMinorVersion | ifil (wMajor) | ifil (wMinor) |
---|---|---|---|
4 | 0 | 2 or 3 (32-bit header) 4 (64-bit header) | 1024 (32-bit header) 0 (64-bit header) |
(Changed for Update 8)
5.6.4 isng sub-chunk
A new default isng sub-chunk value is used in SFe: SFe 4
.
- SFe 4.0 players should recognize this and remove the default velocity related filter used in legacy SF2.04.
- In the case of a missing isng chunk, files with an ifil sub-chunk with
wMajor
= 2 or 3 andwMinor
>=1024
, orwMajor
>= 4, assume an isng sub-chunk value ofSFe 4
. Don’t assumeEMU8000
.
Additionally, UTF-8 is now used instead of ASCII, and the length limit is removed.
- The
isng
sub-chunk contains a UTF-8 string of any length. - Example of value:
SFe 4
(with appropriate zero bytes).
Reject anything not terminated with a zero byte, and assume the value SFe 4
. Do NOT assume EMU8000
by default.
5.6.5 List of sound engines
Creative/E-mu
Sound engine name | isng value | Creative/E-mu SF version | Bit depth | Sound cards |
EMU8000 | EMU8000 | 1.0, 1.5, 2.00 | 16 bit | AWE32, AWE64 |
EMU10K1 | E-mu 10K1 | 2.01 | 16 bit | SB Live! |
EMU10K2 | E-mu 10K2 | 2.01 | 16 bit | SB Audigy |
EMU20K1, EMU20K2 | X-Fi | 2.04 | 24 bit | SB X-Fi |
SFe
Sound engine name | isng value | SFe version | Bit depth |
SFe 4 | SFe 4 | 4.0 | 8 bit, 16 bit, 24 bit, 32 bit, 64 bit (Update 11) |
5.6.6 ICRD sub-chunk
To ease the creation of library management systems that are compatible with multiple languages, the naming convention for the ICRD
sub-chunk has been changed.
The value of ICRD
must now be compliant with the ISO-8601 standard. There are two valid formats:
-
Date only: for example
2025-02-08
-
Date and time: for example
2025-02-08T02:28:00Z
Library management systems should be able to read the value of the ICRD
sub-chunk and show the date (and time if applicable) in the correct language in a field that can be sorted.
If the value of the ICRD
sub-chunk is missing or not in any of the above two valid formats, the program may either:
- attempt to parse the value (if the chunk is present)
- ignore the value (if present) and show an “unknown” error on the date (and time) field
- overwrite the value with the current date (and time) if the program is an editor
The program must NOT reject a file with a missing or invalid ICRD
sub-chunk as Structurally Unsound.
5.6.7 INAM, IENG, IPRD, ICOP, ICMT and ISFT sub-chunks
These sub-chunks are mostly the same as in legacy SF2.04, but UTF-8 is now used instead of ASCII, and the length limit is removed.
Reject anything not terminated with a zero byte. Do NOT reject the file as Structurally Unsound.
5.6.8 irom and iver sub-chunks
Read the legacy SF2.04 specification for info on how to use ROM samples.
The ROM emulator should be implemented in SFe programs.
5.6.9 SFty sub-chunk
The SFty
sub-chunk is nested inside the ISFe-list
sub-chunk. It is required and contains a case-sensitive UTF-8 string with even length identifying the type of format used in SFe. Its value is used by SFe-compatible players to assist in loading banks by telling the program what variant of SFe to load a bank as. (Update 13)
The defined values of the SFty
chunk are:
- the 14 bytes representing
SFe standard
as 12 UTF-8 characters followed by two zero bytes. (Update 12) - the 22 bytes representing
SFe standard with TSC
as 21 UTF-8 characters followed by one zero byte. (Update 12)
The field should not be longer than 22 bytes in SFe 4.0. (Update 12)
If the SFty
sub-chunk is missing or its contents are an undefined value or in an invalid format, other properties of the structure should be used to determine the variant of SFe that is in use. Do not assume SFe standard
; only use such a value when it is evident beyond a reasonable doubt that the file used is in the SFe standard
format. (Update 12)
5.6.10 SFvx sub-chunk
The SFvx
sub-chunk is nested inside the ISFe-list
sub-chunk. It is required and contains extended SFe version attributes. (Update 13)
It is always 46 bytes in length, containing data in the structure below:
struct SFeExtendedVersion
{
WORD wSFeSpecMajorVersion;
WORD wSFeSpecMinorVersion;
CHAR achSFeSpecType[20];
WORD wSFeDraftMilestone;
CHAR achSFeFullVersion[20];
};
The WORD
values wSFeSpecMajorVersion
and wSFeSpecMinorVersion
contain the SFe specification version, and are used to differentiate between different SFe versions as the value of ifil
only changes when the format of the SFvx
sub-chunk does so.
The case-sensitive UTF-8 character field achSFeSpecType
contains a specification type in UTF-8. For the purposes of this specification, the defined values are:
Final
for final specifications.Release Candidate
for release candidate specifications.Milestone
for draft specification milestones.Dev
for rolling draft specifications.
Assume Final
if contents are unknown.
The WORD
value wSFeDraftMilestone
contains the draft specification milestone or release candidate number that a bank was created to. This varies depending on the value of achSFeSpecType
.
The case-sensitive UTF-8 character field achSFeFullVersion
contains the full version string of the specification used, for example 4.0u3
.
If the SFvx
sub-chunk is missing or of an incorrect size, assume these values:
wSFeSpecMajorVersion
andwSFeSpecMinorVersion
correspond to the highest version declared in theflag
sub-chunk- If there is no valid
flag
sub-chunk, then assume the highest SFe version supported by the program.
- If there is no valid
achSFeSpecType=Final
wSFeDraftMilestone=0
achSFeFullVersion
corresponds to the other assumed values
The file may optionally be rejected as Structurally Unsound.
5.6.11 flag sub-chunk
The flag
sub-chunk is nested inside the ISFe-list
sub-chunk. It is required and contains the feature flags used by a bank. (Update 13)
It is always a multiple of 6 bytes in length, and contains at least 2 records (1 feature flag and a record at the end) according to the structure:
The BYTE
value byBranch
represents the branch of the feature. Branches correspond to types of features.
The BYTE
value byLeaf
represents the leaf of the feature. Leaves correspond to specific features.
The DWORD
value dwFlags
represents the feature flags themselves, which represent different parts of the feature. Depending on the byLeaf
value, it can be a number, a series of bytes, etc.
A tree value is a combination of a branch value and a leaf value, and is conventionally written in the format [branch]:[leaf]
with hexadecimal values, for example “feature flag 03:01
” refers to the feature flag with branch number 3
and leaf number 1
(SFe Compression sample compression formats). While the flag
sub-chunk uses a tree structure, it should be noted that no branch includes sub-branches; the branches only include leaves.
Branch numbers between 240 (F0
) and 255 (FF
) are private-use branches that will not be defined in the SFe specification itself, and are free to be used by programs.
An exhaustive list of feature flags and their corresponding tree values can be found in section 6.2.
The final record should never be accessed in normal usage, but its value of byBranch
and byLeaf
have strict values depending on the specification version. Any records after the terminal record or with a higher tree value combination (except for the defined private-use area) should be ignored.
If the flag
sub-chunk is missing or an incorrect size, then an effort should be made to recover the data. If data is not recoverable, then it can be rebuilt from the properties of the data in the rest of the bank. Do not reject the file as Structurally Unsound.
5.6.12 DMOD sub-chunk (Update 15)
The DMOD
sub-chunk is nested inside the INFO-list
sub-chunk. It is optional and contains the redefined default modulators of an SFe bank.
It is always a multiple of 10 bytes in length, and contains at least 2 records (1 feature flag and a record at the end) according to a structure identical to that of a PMOD
or IMOD
modulator list:
struct sfModList
{
SFModulator sfModSrcOper;
SFGenerator sfModDestOper;
SHORT modAmount;
SFModulator sfModAmtSrcOper;
SFTransform sfModTransOper;
};
The DMOD
sub-chunk replaces all default modulators at load time, and acts exactly like the default modulator list in legacy SF2.04.
If the DMOD
sub-chunk is present but without any modulators, then there are no default modulators. The legacy SF2.04 default modulator list is not reloaded.
Default modulator changes
The default modulators list used by SFe 4.0 is similar to that of legacy SF2.04, with these changes:
- Default modulator 2 (MIDI note-on velocity to filter cutoff) is optional.
- The use of the SF2.04 version of default modulator 2 is required.
- However, you may use the SF2.01 version if a legacy SF2.01 bank is detected.
- Default modulators 8 (MIDI CC91 to reverb effects send) and 9 (MIDI CC93 to chorus effects send) have an increased amount.
- Instead of 20.0%, 100.0% is used.
- You may use 20.0% when loading a legacy SF2.0x bank.
- Default modulators 11-16 are added.
Default Modulator 11 (MIDI poly pressure to vibrato LFO pitch depth)
Source enumeration: 0x000a - type: 0 (linear) - polarity: 0 (unipolar) - direction: 0 (forward) - control change: 0 (false) - index: 10 (poly pressure) Destination enumeration: vibrato LFO to pitch Amount: 50 cents / max excursion Amount source enumeration: 0x0 (no controller) Transform enumeration: 0 (linear)
Same as default modulator 3 but for poly pressure.
Default modulator 12 (MIDI CC92 to modulator LFO volume depth)
Source enumeration: 0x00dc - type: 0 (linear) - polarity: 0 (unipolar) - direction: 0 (forward) - control change: 1 (true) - index: 92 (tremolo depth) Destination enumeration: vibrato LFO to pitch Amount: 24 cB Amount source enumeration: 0x0 (no controller) Transform enumeration: 0 (linear)
This implements the MIDI tremolo control change function.
Default modulator 13 (MIDI CC73 to volume envelope attack)
Source enumeration: 0x0ac9 - type: 2 (convex) - polarity: 1 (bipolar) - direction: 0 (forward) - control change: 1 (true) - index: 73 (attack time) Destination enumeration: volume envelope attack Amount: 6000 timecents Amount source enumeration: 0x0 (no controller) Transform enumeration: 0 (linear)
This implements the MIDI attack time function.
Default modulator 14 (MIDI CC72 to volume envelope attack)
Source enumeration: 0x02c8 - type: 0 (linear) - polarity: 1 (bipolar) - direction: 0 (forward) - control change: 1 (true) - index: 72 (release time) Destination enumeration: volume envelope release Amount: 3600 timecents Amount source enumeration: 0x0 (no controller) Transform enumeration: 0 (linear)
This implements the MIDI release time function.
Default modulator 15 (MIDI CC74 to filter cutoff)
Source enumeration: 0x02ca - type: 0 (linear) - polarity: 1 (bipolar) - direction: 0 (forward) - control change: 1 (true) - index: 74 (brightness) Destination enumeration: filter cutoff Amount: 6000 absolute cents Amount source enumeration: 0x0 (no controller) Transform enumeration: 0 (linear)
This implements the MIDI brightness function.
Default modulator 16 (MIDI CC71 to filter resonance)
Source enumeration: 0x02c7 - type: 0 (linear) - polarity: 1 (bipolar) - direction: 0 (forward) - control change: 1 (true) - index: 71 (filter resonance) Destination enumeration: filter resonance Amount: 250 cB Amount source enumeration: 0x0 (no controller) Transform enumeration: 0 (linear)
This implements the MIDI filter resonance function.
5.6.13 xdta-list sub-chunk (Update 16)
The xdta-list
sub-chunk is nested inside the INFO-list
sub-chunk. It is optional and contains a replication of the pdta-list
sub-chunks in the same order. Refer to section 5.8 for more information.
Information about pdta-list
sub-chunk behaviour in xdta-list
is described in the corresponding sections.
SFe programs should only write an xdta-list
sub-chunk if standard pdta-list
limits are exceeded, but if it is unclear whether this is the case, this sub-chunk should be written anyway.
If the xdta-list
sub-chunk is present but doesn’t match the pdta-list
chunk, then the pdta-list
chunk takes precedence and the xdta-list
sub-chunk is ignored.
If the xdta-list
sub-chunk is missing, then the pdta-list
chunk should be loaded.
5.7 sdta-list chunk
5.7.1 smpl sub-chunk
This sub-chunk will now be present in most SFe files, as there is likely to be no ROM where samples can be read from. This does not include AWE ROM emulation. It works in an almost identical manner to legacy SF2.04, with these important differences:
- This contains one or more samples of audio in linearly coded 16-bit, signed words. These words are little-endian. (Update 5)
- No more leeway of 46 zero-valued samples is required after each sample.
- Before saving, SFe editors should insert this leeway. Otherwise, they might give a warning telling the user that loop and interpolation quality may be affected.
- If ROM samples are detected in SFe files, attempt to load them, even if this sub-chunk is missing.
- If this sub-chunk is missing, and no ROM samples are found, show a suitable error message.
- This sub-chunk requires containerisation with SFe files, but can use non-containerised samples in legacy SF2.0x. (Update 20)
5.7.2 About sdta structure modes (Update 19)
In this specification, four different types of sample data structures are described:
- SFe Compression (SFeC) mode (containerised)
- Uncompressed containerised (UCC) mode (containerised)
- Legacy 24-bit (sm24) mode (non-containerised)
- Legacy 16-bit mode (non-containerised)
Only containerised modes are allowed in SFe 4 banks. Programs must write containerised samples with no exceptions.
Non-containerised modes are described here only for legacy SF2.0x compatibility.
5.7.3 Containerised modes (Update 9)
What is a containerised mode? (Update 9)
Containerised sample data structure modes use containerisation, which is where each sample has metadata stored in a “container”. This is in contrast to a non-containerised format, which stores wave data from each sample directly next to each other, as seen in the legacy SoundFont format.
Such containers include information such as sample rate, bit depth, compression format used, audio information and more. These are already used by the earlier Werner SF3 system widely used by the open source community, to store information about compression that a Werner SF3-compatible player could use to decompress the sample.
Containerised modes provide many other advantages such as variable bit depths, conserving sample quality while reducing wasted space, and detailed sample data can also be included directly. Because of said advantages, containerised samples are mandatory for use with all programs that output files in the SFe 4 format. (Update 19)
[Diagram goes here]
What are SFe Compression and uncompressed containerised modes? (Update 9)
SFe Compression is the encoding system for compressed samples used by SFe, based on Werner SF3.
By standardising on Werner SF3 in the form of SFe Compression, we will hopefully ensure that everyone uses the same compression formats. Due to this, we will mostly make small changes to SFe Compression which correspond to updates to the Werner SF3 specification by other SF player programs. We may also make other standardisation changes to streamline the format. To achieve this, all SFe players should implement SFe Compression.
Uncompressed containerised mode provides the containerisation of SFe Compression without actually compressing the samples.
File identification
The wMajor
value in the ifil
sub-chunk is set to 3 instead of 2. The value of the SFvx
sub-chunk remains unchanged. Therefore, SFe players should not use the ifil
value to determine the SFe version, but rather the SFvx
sub-chunk.
However, containerised samples in an SFe bank may continue to use an wMajor
value of 2. (Update 20)
sfSampleType in shdr sub-chunk (Update 16)
Bit 5 of the sfSampleType
field indicates a containerised sample that is not necessarily compressed. This is used to retain backwards compatibility with Werner SF3.
Along with bit 5, SFe Compression defines other bits to help programs determine sample format:
- Bit 6 clear, bit 7 clear: ogg (vorbis)
- Bit 6 set, bit 7 clear: flac
- Bit 6 set, bit 7 clear: opus
- Bit 6 set, bit 7 set: wav
This means that Werner SF3 banks work properly with SFe.
Additionally, bit 8 can be used to force a player to detect the sample format used, as in WernerSF3. Bit 8 should only be used if the player can read formats beyond the supported compression formats below.
Therefore, in uncompressed SFe banks, the typical sfSampleType
values are increased by 112 compared with SF2.04 banks.
Supported compression formats for samples (Update 8)
Currently, SFe Compression requires any compressed samples to be in these formats:
- wav (uncompressed)
- ogg
- opus
- flac
Considerations for compressed samples
For a compressed sample to be valid:
- the type of compression and/or encoding must be recognised and supported by the program.
- the compression and/or encoding format must be valid.
- the compressed sample must only have one channel.
- the compressed sample must be compressed using only supported formats. (Update 5)
A sample is not valid if any of these conditions are not true. If a sample is not valid, then it should be ignored. (Update 8)
If a sample has more than one channel, then the first channel is taken. (Update 19)
Considerations for uncompressed (wav) samples
For a wav sample to be valid:
- the
wFormatTag
value must equal1
(PCM Integer) or3
(IEEE 754 Float) - the sample must only have one channel.
- the sample rates in the metadata and
dwSampleRate
must match.
A sample is not valid if any of these conditions are not true. If a sample is not valid, then it should be ignored. (Update 8)
If a sample has more than one channel, then the first channel is taken. (Update 19)
Interpretation of sample data index fields in shdr sub-chunk
If bit 4 of the sfSampleType
field is set, then the interpretation of the four sample data index fields changes:
dwStart
points to the first byte of the compressed byte stream relative to the beginning of thesmpl
sub-chunk.dwEnd
points to the last byte of the compressed byte stream, instead of the first zero-valued sample data point after the sample data in legacy SF2.04.dwStartLoop
anddwEndLoop
specify loop points relative to the start of the individual uncompressed sample data, in sample data points.
If bit 4 of the sfSampleType
field is clear, then the sample data index fields should be interpreted as in legacy SF2.04.
Using both compressed and uncompressed samples in the same file
You can use both compressed and uncompressed samples with SFe Compression. The preferred method of doing this is to use wav containers for uncompressed samples.
The legacy Werner SF3 practice of placing uncompressed PCM samples at the beginning of the smpl
sub-chunk before the compressed sample byte stream is still supported, but deprecated. Because each sample is compressed individually, the resulting byte streams of all encoded samples are written to the smpl
sub-chunk. The smpl
sub-chunk may also contain uncompressed little-endian PCM samples.
For compressed byte streams, it is not necessary to add forty-six zero-valued sample data points after each sample. The length of the smpl
sub-chunk is not required to be a multiple of two for compressed banks, and its surrounding LIST
chunk is also not padded to a multiple of two as a consequence.
Sample links are not used in SFe Compression
Sample links are not used in banks compressed with SFe Compression. The value of wSampleLink
should be read and written as zero. This is because the file size of two compressed samples that have the same length (as used in linked samples) may differ. (Update 9)
However, when uncompressed samples are used, sample links are still usable. Therefore, stereo samples remain usable in uncompressed containerised mode. Please refer to section 5.7.6 for more information about using stereo samples. (Update 9)
Incompatible compression formats
The only supported compression system for SFe is the Werner SF3-compatible SFe Compression. Proprietary SF compression formats (.sfark
, .sfpack
, .sf2pack
, .sfogg
, .sfq
, .sf4
) must not be used, but programs can remain compatible with existing legacy SF2.0x banks compressed in such formats. Because Cognitone SF4-formatted banks are not valid Werner SF3 banks, they are also incompatible with SFe Compression. (Update 14)
5.7.4 Non-containerised modes (Update 9)
The legacy non-containerised 16-bit and 24-bit modes were used in the legacy SoundFont format, but should never be used for SFe banks. (Update 19)
If the ifil
version is 2.04
or greater, and there an sm24
sub-chunk is present, then the sdta
structure mode is legacy 24-bit (sm24
) mode. Legacy 24-bit mode can only function with uncompressed samples due to the segmented structure of samples stored in this way, and we strongly recommend against using legacy 24-bit mode, because containerisation support is a requirement for SFe level 1 and is easier to work with. (Update 9)
Note that if the ifil
version is below 2.04
(signifying legacy SF2.01 or earlier), then the sdta
structure mode is legacy 16-bit mode, and sm24
is ignored.
In an SFe 4 bank, non-containerised modes can either be rejected as Structurally Unsound, or can optionally be read (if the program supports legacy SF2.0x). Programs must not write non-containerised samples into SFe 4 banks. (Update 19)
5.7.5 Looping rules
- No more leeway of eight samples is required.
- Before saving, SFe editors might give a warning about this leeway telling the user that loop and interpolation quality may be affected.
5.7.6 Implementation of stereo samples
For avoidance of doubt, stereo samples are implemented in the same way as in legacy SF2.04.
This means that stereo samples cannot be directly used in SFe 4. Instead, two mono samples are to be linked together using wSampleLink
to create a stereo sample, which is handled in the same way as in legacy SF2.04.
5.8 pdta-list chunk
5.8.1 phdr sub-chunk
Its size is a multiple of 38 bytes, and its structure is the same as in legacy SF2.04.
The last sfPresetHeader
entry shouldn’t need to be accessed, apart from the uses described in SFSPEC24.PDF
.
The phdr
sub-chunk is required; files without a phdr
sub-chunk are Structurally Unsound.
UTF-8 in achPresetName (Update 16)
The value of achPresetName is now UTF-8 instead of ASCII.
phdr in xdta-list (Update 16)
The values in phdr
are parsed slightly differently in xdta-list
:
achPresetName[20]
inxdta-list
represents the second half of the preset name.- This is combined with the
achPresetName
inpdta-list
for a total of 40 characters.
- This is combined with the
wPreset
is unused.- SFe editors must write a value of zero for
wPreset
inxdta-list
.
- SFe editors must write a value of zero for
wBank
is unused.- SFe editors must write a value of zero for
wBank
inxdta-list
- However, SFe players may use
wBank
inxdta-list
for internal uses.
- SFe editors must write a value of zero for
wPresetBagNdx
represents the upper 16 bits of the preset bag index.fullIndex = (xdtaWord << 16) | pdtaWord
dwLibrary
,dwGenre
anddwMorphology
remain unused.
5.8.2 New Bank System
In SFe 4.0, the bank system has been completely overhauled. Please read this section carefully to ensure that you correctly implement bank selects in your program.
Using MIDI Control Change #32 (Bank Select LSB)
In legacy SF2.04, the wBank
field stores the bank that the preset can be found in. Due to a forward-thinking decision by E-mu, it is a WORD
(16-bit) instead of a BYTE
(8-bit). This means that it could theoretically store values for both bank select instructions found in MIDI 1.0.
Bank select LSB support is added by using the unused 8 bits of wBank
according to the figure below. Bits 2–8 of both byBankMSB
and byBankLSB
are now used to set a bank change.
Figure 8: How the bank select logic differs from legacy SF2.04.
Introducing byBankMSB and byBankLSB
In the above figure, wBank
has been replaced with byBankMSB
and byBankLSB
.
This splits the one WORD
in legacy SF2.04 into two BYTE
values, one for each bank. byBankMSB
goes before byBankLSB
due to RIFF being a little-endian format. (Update 5)
Using more than one percussion bank
Legacy SF2.04 allows bank developers to define one bank of percussion kits for use in channel 10 that can be switched between using MIDI Program Change instructions by using the wBank
number 128
. In other words, if bit 7 is set, bits 0-6 must be clear - you cannot use bank select instructions with channel 10.
SFe 4.0 now allows users to set bit 7 with any value for bits 0-6. The result is that there are 128 percussion banks available when using byBankMSB
, as shown by the figure below.
Figure 9: How the percussion bank listing differs from legacy SF2.04. When byte 7 is set for byBankMSB
, byBankLSB
may also be used. Therefore, a total of 16384 (128×128) banks of percussion kits may be used.
Flowchart for correct handling of bank select instructions
Figure 10: The flowchart for bank select instructions in legacy SF2.04.
Figure 11: The flowchart for bank select instructions in SFe 4.0.
Notice that not only are extra steps added for bank select LSB and percussion bank select handling, but extra configuration information used by the player is added to determine the correct preset to use.
5.8.3 pbag sub-chunk (Update 16)
Its size is a multiple of 4 bytes, and its structure is the same as in legacy SF2.04.
The pbag
sub-chunk is required; files without a pbag
sub-chunk are Structurally Unsound.
pbag in xdta-list (Update 16)
The values in pbag
are parsed slightly differently in xdta-list
:
wGenNdx
represents the upper 16 bits of the generator indexfullIndex = (xdtaWord << 16) | pdtaWord
wModNdx
represents the upper 16 bits of the modulator indexfullIndex = (xdtaWord << 16) | pdtaWord
5.8.4 pmod sub-chunk (Update 16)
Its size is a multiple of 10 bytes, and its structure is the same as in legacy SF2.04.
The pmod
sub-chunk is required; files without a pmod
sub-chunk are Structurally Unsound.
pmod in xdta-list (Update 16)
In xdta-list
, pmod
contains only a terminal modulator record.
5.8.5 pgen sub-chunk (Update 16)
Its size is a multiple of 4 bytes, and its structure is the same as in legacy SF2.04.
The pgen
sub-chunk is required; files without a pgen
sub-chunk are Structurally Unsound.
pgen in xdta-list (Update 16)
In xdta-list
, pgen
contains only a terminal modulator record, similarly to pmod
.
5.8.6 inst sub-chunk (Update 16)
Its size is a multiple of 22 bytes, and its structure is the same as in legacy SF2.04.
The inst
sub-chunk is required; files without an inst
sub-chunk are Structurally Unsound.
UTF-8 in achInstName (Update 16)
The value of achInstName is now UTF-8 instead of ASCII.
inst in xdta-list (Update 16)
The values in inst
are parsed slightly differently in xdta-list
:
achInstrumentName[20]
inxdta-list
represents the second half of the instrument name.- This is combined with the
achInstrumentName
inpdta-list
for a total of 40 characters.
- This is combined with the
wInstBagNdx
inxdta-list
represents the upper 16 bits of the instrument bag index.fullIndex = (xdtaWord << 16) | pdtaWord
5.8.7 ibag sub-chunk (Update 16)
Its size is a multiple of 4 bytes, and its structure is the same as in legacy SF2.04.
The ibag
sub-chunk is required; files without an ibag
sub-chunk are Structurally Unsound.
ibag in xdta-list (Update 16)
The values in ibag
are parsed slightly differently in xdta-list
:
wGenNdx
represents the upper 16 bits of the generator indexfullIndex = (xdtaWord << 16) | pdtaWord
wModNdx
represents the upper 16 bits of the modulator indexfullIndex = (xdtaWord << 16) | pdtaWord
5.8.8 imod sub-chunk (Update 16)
Its size is a multiple of 4 bytes, and its structure is the same as in legacy SF2.04.
The imod
sub-chunk is required; files without an imod
sub-chunk are Structurally Unsound.
imod in xdta-list (Update 16)
In xdta-list
, imod
contains only a terminal modulator record, similarly to pmod
.
5.8.9 igen sub-chunk
Its size is a multiple of 4 bytes, and its structure is the same as in legacy SF2.04.
The igen
sub-chunk is required; files without a igen
sub-chunk are Structurally Unsound.
igen in xdta-list (Update 16)
In xdta-list
, igen
contains only a terminal modulator record, similarly to pmod
.
5.8.10 shdr sub-chunk
Its size is a multiple of 46 bytes, and its structure is the same as in legacy SF2.04.
The shdr
sub-chunk is required; files without a shdr
sub-chunk are Structurally Unsound.
Sample Rate Limit Changes
- In SFe, sample rates (
dwSampleRate
) are stored as a 32-bit integer. This is the same behavior as seen in the legacy SF2.04 format. This results in a theoretical maximum sample rate of 4,294,967,295 Hz. - In the legacy SF2.04 specification, E-mu suggested that sample rates of below 400 Hz or above 50,000 Hz should be avoided as some legacy hardware platforms may not be able to reproduce these sounds. This is not a limitation of the specification, but rather a limitation of legacy sound cards.
- Despite this, Creative did not use 16-bit integers for sample rate in legacy SF2.04. It is thus safe to use sample rates in excess of 50,000 Hz. If a sample rate of below 400 Hz or above 50,000 Hz is encountered, no attempt should be made to change the sample rate.
- A zero sample rate should be reset.
UTF-8 in achSampleName (Update 16)
The value of achSampleName is now UTF-8 instead of ASCII.
sfSampleType Changes (Update 16)
In legacy SF2.04, sfSampleType
is treated as an enum, with eight fixed values. This worked fine when there were only a few possible bits, however it could become a limitation for future expansion.
Therefore, the specification for sfSampleType
discourages the use of fixed enums, in favour of bit flags (as described in SFSPEC24.PDF
):
- If bit 1 is set, a sample is mono (has one channel).
- If bit 2 is set, a sample is the right part of a stereo sample.
- If bit 3 is set, a sample is the left part of a stereo sample.
- If bit 4 is set, a sample is a linked sample.
- If bit 5 is set, a sample is compressed using SFe Compression.
- Read section 6.2 for more information on SFe Compression.
- If bit 6 is clear and bit 7 clear, all samples are in ogg format.
- If bit 6 is set and bit 7 clear, all samples are in flac format.
- If bit 6 is clear and bit 7 set, all samples are in opus format.
- If bit 6 is set and bit 7 set, all samples are in containerised wav format.
- If bit 16 is set, a sample is stored in ROM.
- Read section 9 for more information on ROM samples.
Note that all unused bits are reserved and should not be used by SFe implementations.
List of valid sfSampleType values (Update 16)
Value | Name | Description |
---|---|---|
1 | monoSample | Mono sample |
2 | rightSample | Right part of a stereo sample |
4 | leftSample | Left part of a stereo sample |
8 | linkedSample | Linked sample |
17 | vorbisMonoSample | Mono sample compressed with OGG Vorbis |
49 | flacMonoSample | Mono sample compressed with FLAC |
81 | opusMonoSample | Mono sample compressed with Opus |
113 | wavMonoSample | WAV containerised mono sample |
114 | wavRightSample | WAV containerised right part of a stereo sample |
116 | wavLeftSample | WAV containerised left part of a stereo sample |
120 | wavLinkedSample | WAV containerised linked sample |
32769 | RomMonoSample | monoSample stored in ROM |
32770 | RomRightSample | rightSample stored in ROM |
32772 | RomLeftSample | leftSample stored in ROM |
32776 | RomLinkedSample | linkedSample stored in ROM |
32785 | RomVorbisMonoSample | vorbisMonoSample stored in ROM |
32817 | RomFlacMonoSample | flacMonoSample stored in ROM |
32849 | RomOpusMonoSample | opusMonoSample stored in ROM |
32881 | RomWavMonoSample | wavMonoSample stored in ROM |
32882 | RomWavRightSample | wavRightSample stored in ROM |
32884 | RomWavLeftSample | wavLeftSample stored in ROM |
32888 | RomWavLinkedSample | wavLinkedSample stored in ROM |
All other values are invalid.
Note that SFe 4 does not permit the use of non-containerised samples, so corresponding values are used only for legacy SF2.04 support.
shdr in xdta-list (Update 16)
The values in shdr
are parsed slightly differently in xdta-list
:
achSampleName[20]
inxdta-list
represents the second half of the sample name.- This is combined with the
achSampleName
inpdta-list
for a total of 40 characters.
- This is combined with the
dwStart
,dwEnd
,dwStartloop
anddwEndloop
inxdta-list
may or may not be used:- With 32-bit chunk headers, these values are unused, as they are already 32-bit.
- With 64-bit chunk headers, these values represent the upper 32 bits of the sample data field index.
fullIndex = (xdtaDword << 32) | pdtaDword
dwSampleRate
inxdta-list
is unused, because it is already a 32-bit integer.- SFe software should write a value of zero.
byOriginalPitch
andchPitchCorrection
are unused.- SFe software should write a value of zero.
wSampleLink
inxdta-list
represents the upper 16 bits of the sample link index.fullIndex = (xdtaWord << 16) | pdtaWord
sfSampleType
inxdta-list
is unused.- SFe software should write a value of zero.