Difference between revisions of "Map Format"

From wiki.vg
(Add explanation of LZ4 compression scheme added in 24W04A)
(Added mention of xxhash32 seed value for lz4 compressed chunks)
Line 200:
  =====Checksum=====
- Block checksums use the XXHash32 algorithm [https://github.com/lz4/lz4-java/blob/master/src/java/net/jpountz/xxhash/StreamingXXHash32.java as implemented in lz4-java].
+ Block checksums use the XXHash32 algorithm with the seed value <code>0x9747b28c</code> [https://github.com/lz4/lz4-java/blob/master/src/java/net/jpountz/xxhash/StreamingXXHash32.java as implemented in lz4-java].
  When finalizing the hash value, this implementation [https://github.com/lz4/lz4-java/blob/7c931bef32d179ec3d3286ee71638b23ebde3459/src/java/net/jpountz/xxhash/StreamingXXHash32.java#L106 removes the last nibble from the result], which is something that might need to be done manually when using other implementations.

Revision as of 20:15, 25 January 2024

This page covers world information from 1.2.5 to the present. See Alpha Map Format for Alpha information.


Heads up!

Some information in this article may be outdated due to the new world generation introduced in the 1.17 update.

General Information

Worlds are represented as a series of regions, each containing a number of columns, which in turn contain chunks. Each region is 32x1x32 columns and each chunk is 16x16x16 blocks. A column was originally 1x16x1 chunks (256 blocks tall); the overall block height of a column is now 384, which corresponds to 24 chunks.

Each chunk stores four (or five) things: block IDs (8-bit), block metadata (4-bit), block light (4-bit), and sky light (4-bit). The optional fifth value is "add" data, which is four bits to be added to block IDs for additional block ID support (not used in vanilla Minecraft).

Block light is light cast by things like torches and glowstone, and is calculated with a 3D flood fill algorithm. Sky light is the light cast by the sky, and is calculated by starting at the top of a column and working downward: each semi-transparent block you pass through decreases the lighting value, until you hit an opaque block. The opaque block is the last block whose skylight has a nonzero value. Lighting starts at 0xF (brightest) and works down to 0x0 (dimmest).

Columns store biome information. Each 1x256x1 cuboid of blocks has the same biome value, for a total of 256 biome values per column (16x16). Biome values are stored as bytes.

Biome Values

As of 1.18.1 there are 61 unique biomes (50 for the Overworld, 5 for the Nether, 5 for the End, and minecraft:the_void). Here you can find a mapping of biome to its respective numerical ID.

Storage

This section documents the Anvil format for servers.

Maps are stored as a specific directory structure with several NBT files within. For the sake of examples, the world we'll be working with is stored in a folder called "/world".

Directory Structure

  • /world/data: Unused
  • /world/DIM-1: Nether world
  • /world/DIM-1/region: Nether world regions
  • /world/DIM1: End world
  • /world/DIM1/region: End world regions
  • /world/players: Player data
  • /world/region: Overworld regions

level.dat

In the root directory is a level.dat file. Its structure is as follows:

  • NBTCompound(Data)
    • NBTByte(hardcore)
    • NBTByte(MapFeatures): Set to 1 if structures are generated, such as villages
    • NBTByte(raining): Set to 1 if currently raining
    • NBTByte(thundering): Set to 1 if currently thundering (only if there is potential for thunder, not if a thunderbolt is currently in progress)
    • NBTInt(GameType)
    • NBTInt(generatorVersion): 0 for 1.2.5
    • NBTInt(rainTime): The ticks remaining until rain stops?
    • NBTInt(SpawnX)
    • NBTInt(SpawnY)
    • NBTInt(SpawnZ)
    • NBTInt(thunderTime)
    • NBTInt(version): 19133 for 1.2.5
    • NBTLong(LastPlayed)
    • NBTLong(RandomSeed)
    • NBTLong(SizeOnDisk): Always 0 for 1.2.5
    • NBTLong(Time)
    • NBTString(generatorName)
    • NBTString(LevelName)

[playeruuid].dat

Each player that has ever connected is given a [playeruuid].dat file.

  • NBTCompound
    • NBTByte(OnGround)
    • NBTByte(Sleeping)
    • NBTShort(Air)
    • NBTShort(AttackTime)
    • NBTShort(DeathTime)
    • NBTShort(Fire): Ticks until the player is no longer on fire, or zero
    • NBTShort(Health)
    • NBTShort(HurtTime)
    • NBTShort(SleepTimer)
    • NBTInt(Dimension)
    • NBTInt(foodLevel)
    • NBTInt(foodTickTimer)
    • NBTInt(playerGameType)
    • NBTInt(XpLevel)
    • NBTInt(XpTotal)
    • NBTFloat(FallDistance)
    • NBTFloat(foodExhaustionLevel)
    • NBTFloat(foodSaturationLevel)
    • NBTFloat(XpP)
    • NBTList(Inventory)
      • NBTCompound
        • NBTByte(Count)
        • NBTByte(Slot)
        • NBTShort(Damage): Damage -or- metadata
        • NBTShort(id)
    • NBTList(Motion)
      • NBTDouble
      • NBTDouble
      • NBTDouble
    • NBTList(Pos)
      • NBTDouble
      • NBTDouble
      • NBTDouble
    • NBTList(Rotation)
      • NBTFloat
      • NBTFloat

[region].mca

Each region file is named "r.x.z.mca", where x and z are the region coordinates. Given chunk column coordinates, divide them by 32 and round down to get the region coordinates. Region files are not raw NBT files and must therefore be parsed differently; see Region files. Every generated chunk has a 5-byte header: the first four bytes are the big-endian length of the compressed chunk in bytes, and the fifth byte is the compression scheme.

                   Length of the compressed chunk in bytes   Compression scheme
Decoded            5033                                      zlib
On disk (in hex)   00 00 13 A9                               02
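
The naming rule and the chunk header described above can be sketched as follows. This is only an illustration: the function names are made up, the length is assumed to be an unsigned big-endian 32-bit integer, and the arithmetic shift is used because it rounds toward negative infinity, which matters for negative chunk coordinates.

```python
import struct


def region_file_name(chunk_x, chunk_z):
    """Return the region file name for a chunk column.

    Shifting right by 5 is a floor division by 32, so chunk -1
    lands in region -1 rather than region 0.
    """
    return f"r.{chunk_x >> 5}.{chunk_z >> 5}.mca"


def parse_chunk_header(header):
    """Split the 5-byte chunk header into (length, compression scheme).

    Assumes a big-endian unsigned 32-bit length followed by one byte.
    """
    if len(header) != 5:
        raise ValueError("chunk header must be exactly 5 bytes")
    length, scheme = struct.unpack(">IB", header)
    return length, scheme
```

For the sample header, parse_chunk_header(bytes.fromhex("00 00 13 A9 02")) yields a length of 5033 and scheme 2.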

Compression schemes

The compression scheme can have four values:

Compression scheme   Value
LZ4                  4
none                 3
zlib                 2
gzip                 1

The Notchian implementation never writes uncompressed or gzip-compressed chunks, but can read them if they are present.
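
A reader that wants to accept everything the Notchian implementation accepts could dispatch on the scheme value roughly like this. It is a sketch using only the Python standard library; the function name is hypothetical, and LZ4 is deliberately left unimplemented because it uses the block stream format described in the next section.

```python
import gzip
import zlib


def decompress_chunk(scheme, payload):
    """Return the raw NBT bytes for a chunk payload.

    scheme is the value of the fifth header byte.
    """
    if scheme == 1:
        return gzip.decompress(payload)
    if scheme == 2:
        return zlib.decompress(payload)
    if scheme == 3:
        return payload  # stored uncompressed
    if scheme == 4:
        # LZ4 chunks use the lz4-java block stream format, not handled here
        raise NotImplementedError("LZ4 block streams need a separate decoder")
    raise ValueError(f"unknown compression scheme {scheme}")
```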

LZ4 Compression

LZ4-compressed data is written using the LZ4BlockOutputStream implementation from lz4-java. This format is not compatible with the standard LZ4 Frame format. Block stream data consists of multiple blocks of data, each of which is preceded by a 21-byte header.

                   Magic                     Token   Compressed length   Decompressed length   XXH32 checksum
Decoded            LZ4Block                  38      489                 2911                  71030836
On disk (in hex)   4C 5A 34 42 6C 6F 63 6B   26      E9 01 00 00         5F 0B 00 00           34 D8 3B 04

Token value

The token field encodes the compression method and compression level for the block:

compressionLevelBase = 10;

compressionMethod = token & 0xf0;
compressionLevel = compressionLevelBase + (token & 0x0f);

There are two different compression methods available:

Method Value
Raw/uncompressed 0x10
LZ4 0x20

An uncompressed block should be used when LZ4 compression would make the block data larger.
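
The header layout and the token rules above can be combined into a small parser. This sketch assumes the three integer fields are little-endian, which is what the sample bytes in the header table suggest; the function and field names are illustrative only.

```python
import struct

MAGIC = b"LZ4Block"
# 8-byte magic, 1-byte token, then three 32-bit fields: 21 bytes in total
HEADER = struct.Struct("<8sBiiI")
COMPRESSION_LEVEL_BASE = 10
METHOD_RAW = 0x10
METHOD_LZ4 = 0x20


def parse_block_header(header):
    """Decode one 21-byte block header into a dict of its fields."""
    if len(header) != HEADER.size:
        raise ValueError("block header must be exactly 21 bytes")
    magic, token, compressed, decompressed, checksum = HEADER.unpack(header)
    if magic != MAGIC:
        raise ValueError("missing LZ4Block magic")
    return {
        "method": token & 0xF0,  # METHOD_RAW or METHOD_LZ4
        "level": COMPRESSION_LEVEL_BASE + (token & 0x0F),
        "compressed_length": compressed,
        "decompressed_length": decompressed,
        "checksum": checksum,
    }
```

Applied to the sample bytes from the table, the token 0x26 decodes to the LZ4 method with compression level 16.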

Checksum

Block checksums use the XXHash32 algorithm with the seed value 0x9747b28c, as implemented in lz4-java. When finalizing the hash value, this implementation clears the most significant nibble of the result (only the low 28 bits are kept), which might need to be done manually when using other implementations.

checksum = xxh32(blockData, 0x9747b28c) & 0xFFFFFFF; // seven Fs, not eight: only the low 28 bits are kept
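
For environments without an XXHash library, the checksum can be computed with a plain XXH32 implementation followed by the 28-bit mask. The sketch below follows the public XXH32 algorithm and is not taken from lz4-java; the function names are illustrative. It also assumes that block_data is the block's decompressed bytes, which this page does not state, so verify that against lz4-java before relying on it.

```python
import struct

MASK32 = 0xFFFFFFFF
PRIME1, PRIME2, PRIME3, PRIME4, PRIME5 = (
    0x9E3779B1, 0x85EBCA77, 0xC2B2AE3D, 0x27D4EB2F, 0x165667B1,
)
LZ4_BLOCK_SEED = 0x9747B28C  # seed value given in the text above


def _rotl(value, bits):
    """Rotate a 32-bit value left by the given number of bits."""
    return ((value << bits) | (value >> (32 - bits))) & MASK32


def _round(acc, lane):
    """Mix one 32-bit input lane into an accumulator."""
    acc = (acc + lane * PRIME2) & MASK32
    return (_rotl(acc, 13) * PRIME1) & MASK32


def xxh32(data, seed=0):
    """Compute the standard 32-bit XXH32 hash of a bytes-like object."""
    length = len(data)
    pos = 0
    if length >= 16:
        v1 = (seed + PRIME1 + PRIME2) & MASK32
        v2 = (seed + PRIME2) & MASK32
        v3 = seed & MASK32
        v4 = (seed - PRIME1) & MASK32
        while pos + 16 <= length:  # consume 16-byte stripes
            a, b, c, d = struct.unpack_from("<4I", data, pos)
            v1, v2 = _round(v1, a), _round(v2, b)
            v3, v4 = _round(v3, c), _round(v4, d)
            pos += 16
        acc = (_rotl(v1, 1) + _rotl(v2, 7) + _rotl(v3, 12) + _rotl(v4, 18)) & MASK32
    else:
        acc = (seed + PRIME5) & MASK32
    acc = (acc + length) & MASK32
    while pos + 4 <= length:  # remaining 4-byte words
        (lane,) = struct.unpack_from("<I", data, pos)
        acc = (_rotl((acc + lane * PRIME3) & MASK32, 17) * PRIME4) & MASK32
        pos += 4
    while pos < length:  # remaining single bytes
        acc = (_rotl((acc + data[pos] * PRIME5) & MASK32, 11) * PRIME1) & MASK32
        pos += 1
    acc ^= acc >> 15  # final avalanche
    acc = (acc * PRIME2) & MASK32
    acc ^= acc >> 13
    acc = (acc * PRIME3) & MASK32
    acc ^= acc >> 16
    return acc


def lz4_block_checksum(block_data):
    """Seeded XXH32 with the top nibble cleared (28-bit result)."""
    return xxh32(block_data, LZ4_BLOCK_SEED) & 0x0FFFFFFF
```

A stored checksum can then be compared against lz4_block_checksum(block_data); when a library implementation is available, only the final mask needs to be applied by hand.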

Chunk format

Note: The following example is outdated. It is accurate for (at least) 1.4, but the format has since changed. However, you will still need to support this version if you want backward compatibility with old save files.

TAG_Compound(''): 2 entries
{
  TAG_Compound('Level'): 11 entries
  {
    TAG_List('Entities'): List of entities in the chunk column
    {
      TAG_Compound(): Each entity has a TAG_Compound
      {
        ...
      }
    }
    TAG_List('Sections'): 5 entries
    {
      TAG_Compound(''): 5 entries
      {
        TAG_Byte('Y'): 0
        TAG_Byte_Array('BlockLight'): 2048 bytes
        TAG_Byte_Array('Blocks'): 4096 bytes
        TAG_Byte_Array('Data'): 2048 bytes
        TAG_Byte_Array('SkyLight'): 2048 bytes
      }
      TAG_Compound(''): 5 entries
      {
        TAG_Byte('Y'): 1
        TAG_Byte_Array('BlockLight'): 2048 bytes
        TAG_Byte_Array('Blocks'): 4096 bytes
        TAG_Byte_Array('Data'): 2048 bytes
        TAG_Byte_Array('SkyLight'): 2048 bytes
      }
      ...
    }
    TAG_List('TileEntities'):
    TAG_Long('InhabitedTime'): 16
    TAG_Long('LastUpdate'): Last time a block changed in this column
    TAG_Byte('LightPopulated'): Is the light calculated
    TAG_Byte('TerrainPopulated'): Has the terrain been generated
    TAG_Int('xPos'): X position of the chunk column. Each single increment to x is 16 blocks
    TAG_Int('zPos'): Z position of the chunk column. Each single increment to z is 16 blocks
    TAG_Byte_Array('Biomes'): 256 bytes. Biomes affect blocks in 1x256x1 columns
    TAG_Int_Array('HeightMap'): 256 ints
  }
  TAG_Int('DataVersion'): 1343
}
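
Looking up a single block in one of the sections above requires an index into the 4096-byte Blocks array and the matching half-byte in the 2048-byte arrays. The sketch below assumes YZX ordering (x changes fastest) and that the even index occupies the low four bits of each byte; these are the conventions commonly used for this format but are not spelled out on this page. The helper names are hypothetical, and section is taken to be a dict of byte arrays keyed by tag name.

```python
def block_index(x, y, z):
    """Index of a block within a 16x16x16 section (assumed YZX order)."""
    return (y * 16 + z) * 16 + x


def get_nibble(array, index):
    """Read a 4-bit value; even indices are assumed to use the low nibble."""
    byte = array[index >> 1]
    return byte & 0x0F if index % 2 == 0 else (byte >> 4) & 0x0F


def block_at(section, x, y, z):
    """Collect the per-block values of one section at local coordinates."""
    i = block_index(x, y, z)
    return {
        "id": section["Blocks"][i] & 0xFF,
        "data": get_nibble(section["Data"], i),
        "block_light": get_nibble(section["BlockLight"], i),
        "sky_light": get_nibble(section["SkyLight"], i),
    }
```

The section that holds a given world height would be the one whose Y tag equals the block's y coordinate divided by 16, with y inside the section taken modulo 16.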

See also

  • Region Files
  • Tile entity format