User:Pokechu22/Chunk Format

From wiki.vg
Revision as of 18:59, 2 February 2017 by Pokechu22 (talk | contribs) (Add in samples, among other things)

v2 of SMP Map Format, work in progress.

This article describes the format of the Chunk Data packet in more detail.

Concepts

  • Chunk Section: a 16×16×16 area of blocks, sometimes also called a chunk.
  • Chunk Column: 16 chunk sections aligned vertically (totaling 16×256×16).
  • Global palette: the numeric ID given to every block state, formed by combining the block ID with its damage value.

Packet structure

Packet ID: 0x20
State: Play
Bound To: Client

Field Name | Field Type | Notes
Chunk X | Int | Chunk coordinate (block coordinate divided by 16, rounded down)
Chunk Z | Int | Chunk coordinate (block coordinate divided by 16, rounded down)
Ground-Up Continuous | Boolean | This is true if the packet represents all chunk sections in this vertical chunk column. If true, the chunk that was previously there should be replaced with this chunk. If false, this packet is instead modifying the given chunk sections, but leaves the other sections alone.
Primary Bit Mask | VarInt | Bitmask with bits set to 1 for every 16×16×16 chunk section whose data is included in Data. The least significant bit represents the chunk section at the bottom of the chunk column (from y=0 to y=15).
Size | VarInt | Size of Data in bytes
Data | Byte array | See data structure below
Number of block entities | VarInt | Length of the following array
Block entities | Array of NBT Tag | All block entities in the chunk. Use the x, y, and z tags in the NBT to determine their positions.
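
As a minimal sketch of how Primary Bit Mask can be interpreted, the following Python snippet lists which sections a mask covers. The function name `sections_in_mask` and the default of 16 sections are assumptions made for illustration; they are not part of the protocol.

```python
def sections_in_mask(primary_bit_mask, section_count=16):
    """Return the section indices (0 = bottom) whose data is present in Data."""
    return [y for y in range(section_count) if primary_bit_mask & (1 << y)]


# Example: the three lowest sections are included.
mask = 0b0000000000000111
for section_y in sections_in_mask(mask):
    print("section", section_y, "covers y =", section_y * 16, "to", section_y * 16 + 15)
```

The length of the returned list is also the number of Chunk Section entries to expect in Data.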

Data structure

The data section of the packet contains most of the useful data for the chunk.

Field Name | Field Type | Notes
Data | Array of Chunk Section | The length of the array is equal to the number of bits set in Primary Bit Mask. Sections are sent bottom-to-top, i.e. the first section, if sent, extends from Y=0 to Y=15.
Biomes | Optional Byte Array | Only sent if Ground-Up Continuous is true; 256 bytes if present

Chunk Section

A Chunk Section is defined in terms of other data types. A Chunk Section consists of the following fields:

Field Name | Field Type | Notes
Bits Per Block | Unsigned Byte | Determines how many bits are used to encode a block. Note that not all numbers are valid here. This also changes whether the palette is present.
Palette Length | VarInt | Length of the following array. May be 0, in which case the following palette is not sent.
Palette | Optional Array of VarInt | Mapping of block state IDs in the global palette to indices of this array
Data Array Length | VarInt | Number of longs in the following array
Data Array | Array of Long | Compacted list of 4096 indices pointing to state IDs in the Palette
Block Light | Byte Array | Half byte per block
Sky Light | Optional Byte Array | Only if in the Overworld; half byte per block

In half-byte arrays, two values are packed into each byte. Even-indexed items are packed into the low bits, odd-indexed into the high bits.
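
The nibble packing described above can be sketched in Python as follows. The helper names `unpack_nibbles` and `pack_nibbles` are hypothetical and chosen only for this illustration.

```python
def unpack_nibbles(packed):
    """Expand a half-byte array into one value (0-15) per item."""
    values = []
    for byte in packed:
        values.append(byte & 0x0F)         # even-indexed item: low 4 bits
        values.append((byte >> 4) & 0x0F)  # odd-indexed item: high 4 bits
    return values


def pack_nibbles(values):
    """Pack an even-length list of 4-bit values into a half-byte array."""
    if len(values) % 2 != 0:
        raise ValueError("half-byte arrays hold an even number of items")
    return bytes(
        (values[i] & 0x0F) | ((values[i + 1] & 0x0F) << 4)
        for i in range(0, len(values), 2)
    )
```

For a full section, a light array of 4096 values packs into 2048 bytes.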

Data Array, Block Light, and Sky Light are given for each block with increasing x coordinates, within rows of increasing z coordinates, within layers of increasing y coordinates.
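
Put differently, x changes fastest and y slowest. A small Python sketch of that ordering, assuming 16-block sections (the names `block_index` and `block_coords` are hypothetical):

```python
SECTION_WIDTH = 16  # blocks along x and z within a section


def block_index(x, y, z):
    """Position of a block within the per-section arrays."""
    return (y * SECTION_WIDTH + z) * SECTION_WIDTH + x


def block_coords(index):
    """Inverse of block_index: returns (x, y, z)."""
    x = index % SECTION_WIDTH
    z = (index // SECTION_WIDTH) % SECTION_WIDTH
    y = index // (SECTION_WIDTH * SECTION_WIDTH)
    return x, y, z
```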

The Data Array varies in length with the bits per block value, but it never needs padding: a section always holds 4096 blocks, and 4096 is evenly divisible by 64, the number of bits in a long.
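
The arithmetic can be checked with a short Python sketch (the function name `data_array_length` is an assumption for illustration):

```python
BLOCKS_PER_SECTION = 16 * 16 * 16  # 4096


def data_array_length(bits_per_block):
    """Number of longs needed to hold one section's Data Array."""
    total_bits = BLOCKS_PER_SECTION * bits_per_block
    # 4096 is a multiple of 64, so this division never leaves a remainder.
    return total_bits // 64


for bits in (4, 8, 13):
    print(bits, "bits per block ->", data_array_length(bits), "longs")
```

This works out to 64 longs for every bit used per block.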

There are several values that can be used for the bits per block value. In most cases, invalid values will be interpreted as a different value when parsed by the Notchian client, meaning that chunk data will be parsed incorrectly if you use an invalid bits per block. Servers must make sure that the bits per block value is correct.

  • up to 4: Blocks are encoded as 4 bits. The palette is used and sent.
  • 5 to 8: Blocks are encoded with the given number of bits. The palette is used and sent.
  • 9 and above: The palette is not sent. Blocks are encoded by their whole ID in the global palette, with bits per block being set as the base 2 logarithm of the number of block states, rounded up. For the current vanilla release, this is 13 bits per block.
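
A minimal Python sketch of these rules, as a reader might apply them when parsing. The constant and function names are hypothetical, and the value 13 is the one stated above for the current vanilla release.

```python
FULL_SIZE_BITS_PER_BLOCK = 13  # assumption: value given in this article; liable to change


def effective_bits_per_block(bits_per_block):
    """Bits actually used per entry for a given Bits Per Block field value."""
    if bits_per_block <= 4:
        return 4
    if bits_per_block <= 8:
        return bits_per_block
    return FULL_SIZE_BITS_PER_BLOCK


def uses_section_palette(bits_per_block):
    """True if the section carries its own palette (8 bits or fewer)."""
    return bits_per_block <= 8
```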

The global palette encodes a block as 13 bits. It uses the block ID for the first 9 bits, and the block damage value for the last 4 bits. For example, Diorite (block ID 1 for minecraft:stone with damage 3) would be encoded as 000000001 0011. If a block is not found in the global palette (either due to not having a valid damage value or due to not being a valid ID), it will be treated as air.
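
The encoding above amounts to a shift and an OR. A short Python sketch, with hypothetical helper names:

```python
def to_global_palette_id(block_id, damage):
    """Combine a block ID and a 4-bit damage value into a global palette ID."""
    return (block_id << 4) | (damage & 0xF)


def from_global_palette_id(state_id):
    """Split a global palette ID back into (block_id, damage)."""
    return state_id >> 4, state_id & 0xF


diorite = to_global_palette_id(1, 3)
print(format(diorite, "013b"))  # prints 0000000010011
```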

If Minecraft Forge is installed and a sufficiently large number of blocks are added, the bits per block value for the global palette will be increased to compensate for the increased ID count. This increase can go up to 16 bits per block (for a total of 4096 block IDs; when combined with the 16 damage values, there are 65536 total states). You can get the number of blocks with the "Number of ids" field found in the RegistryData packet in the Forge Handshake.

The data array stores several entries within a single long, and an entry sometimes straddles two longs. For a bits per block value of 13, the data is stored such that bits 1 through 13 are the first entry, 14 through 26 are the second, and so on. Note that bit 1 is the least significant bit in this case, not the most significant bit. The same behavior applies when a value stretches between two longs: for instance, block 5 occupies bits 53 through 64 of the first long, followed by the least significant bit of the second long (bit 65 overall).
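
As an illustration of this layout, the following Python sketch packs a list of entries into 64-bit longs. It is a simplified stand-in rather than the Notchian implementation, and the function name `pack_entries` is hypothetical.

```python
LONG_MASK = 0xFFFFFFFFFFFFFFFF  # keep each element within 64 bits


def pack_entries(entries, bits_per_block):
    """Pack entries into longs, least significant bits first."""
    entry_mask = (1 << bits_per_block) - 1
    long_count = (len(entries) * bits_per_block + 63) // 64
    longs = [0] * long_count
    for index, entry in enumerate(entries):
        bit_position = index * bits_per_block
        start_long = bit_position // 64
        start_offset = bit_position % 64
        end_long = (bit_position + bits_per_block - 1) // 64
        value = entry & entry_mask
        longs[start_long] |= (value << start_offset) & LONG_MASK
        if end_long != start_long:
            # The entry straddles two longs: its high bits go at the
            # bottom of the next long.
            longs[end_long] |= value >> (64 - start_offset)
    return longs
```

With 13 bits per block, the fifth entry starts at offset 52 of the first long, so its top bit lands in bit 0 of the second long.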

Example

13 bits per block, using the global palette.

The following two longs represent 9 blocks, with the start of a 10th (which would be finished in the next long):

0x01001880C0060020 = 0000000100000000000110001000000011000000000001100000000000100000
0x0200D0068004C020 = 0000001000000000110100000000011010000000000001001100000000100000


  1. Grass, 2:0
  2. Dirt, 3:0
  3. Dirt, 3:0
  4. Coarse dirt, 3:1
  5. Stone, 1:0
  6. Stone, 1:0
  7. Diorite, 1:3
  8. Gravel, 13:0
  9. Gravel, 13:0
  10. Stone, 1:0 (or potentially emerald ore, 129:0)
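
The list above can be reproduced with a short Python sketch that reads 13-bit entries out of the two longs and splits each into block ID and damage value. The function name `read_entry` is hypothetical; the 10th entry is skipped because its upper bits live in a third long that is not shown.

```python
def read_entry(longs, index, bits_per_block):
    """Read the entry at the given index from a packed array of longs."""
    entry_mask = (1 << bits_per_block) - 1
    bit_position = index * bits_per_block
    start_long = bit_position // 64
    start_offset = bit_position % 64
    end_long = (bit_position + bits_per_block - 1) // 64
    value = longs[start_long] >> start_offset
    if end_long != start_long:
        # Pull in the high bits stored at the bottom of the next long.
        value |= longs[end_long] << (64 - start_offset)
    return value & entry_mask


longs = [0x01001880C0060020, 0x0200D0068004C020]
blocks = []
for i in range(9):
    state = read_entry(longs, i, 13)
    blocks.append((state >> 4, state & 0xF))  # (block ID, damage value)
print(blocks)
# prints [(2, 0), (3, 0), (3, 0), (3, 1), (1, 0), (1, 0), (1, 3), (13, 0), (13, 0)]
```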

Biomes

The biomes array is only present when ground-up continuous is set to true. Biomes cannot be changed unless a chunk is re-sent.

The structure is an array of 256 bytes, each representing a Biome ID (using 127 for "Void" is recommended if there is no set biome). The array is indexed by (z * 16) | x, which is the same as z * 16 + x because x is always less than 16.
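
A small Python sketch of the biome indexing, assuming the 256-byte array has already been read. The names `biome_index`, `biome_at`, and `VOID_BIOME_ID` are hypothetical.

```python
VOID_BIOME_ID = 127  # recommended above when no biome is set


def biome_index(x, z):
    """Index into the 256-byte biome array for a column at (x, z), each 0-15."""
    return (z * 16) | x


def biome_at(biomes, x, z):
    return biomes[biome_index(x, z)]


# A chunk column with no biome information.
biomes = bytearray([VOID_BIOME_ID]) * 256
print(biome_at(biomes, 5, 9))  # prints 127
```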

Tips

There are several things that can make it easier to implement this format.

  • The 13 value for full bits per block is likely to change in the future, so it should not be hardcoded (instead, it should either be calculated or left as a constant).
  • Servers do not need to implement the palette initially (instead always using 13 bits per block), although it is an important optimization later on.
  • The Notchian server implementation does not send values that are out of bounds for the palette. If such a value is received, the format is being parsed incorrectly.

Sample implementations

How the chunk format is best implemented depends largely on how you want to read and write it. It is often easier to read/write the data long-by-long instead of pre-creating the data to write; however, storing the chunk data arrays in their packed form can be far more efficient in both memory and performance. These implementations are simple versions that can work as a base (especially for dealing with the bit shifting), but are not ideal.

Deserializing

When deserializing, it is easy to read to a buffer (since length information is present). A basic example:

public Chunk ReadChunkDataPacket(Buffer data) {
    int x = ReadInt(data);
    int z = ReadInt(data);
    bool full = ReadBool(data);
    Chunk chunk;
    if (full) {
        chunk = new Chunk(x, z);
    } else {
        chunk = GetExistingChunk(x, z);
    }
    int mask = ReadVarInt(data);
    int size = ReadVarInt(data);
    ReadChunkColumn(chunk, full, mask, data.ReadByteArray(size));

    int blockEntityCount = ReadVarInt(data);
    for (int i = 0; i < blockEntityCount; i++) {
        CompoundTag tag = ReadCompoundTag(data);
        chunk.AddBlockEntity(tag.GetInt("x"), tag.GetInt("y"), tag.GetInt("z"), tag);
    }

    return chunk;
}

private void ReadChunkColumn(Chunk chunk, bool full, int mask, Buffer data) {
    for (int sectionY = 0; sectionY < CHUNK_HEIGHT / SECTION_HEIGHT; sectionY++) {
        if ((mask & (1 << sectionY)) != 0) {  // Is the given bit set in the mask?
            byte bitsPerBlock = ReadByte(data);

            // Excessively specific format that exactly matches the client logic
            // This extra checking makes sense on the server side, but client
            // side it only is needed when dealing with servers sending incorrect packets
            // (the notchian server will not send such packets)
            if (bitsPerBlock < 4) {
                bitsPerBlock = 4;
            }
            if (bitsPerBlock > 8) {
                bitsPerBlock = FULL_SIZE_BITS_PER_BLOCK;  // 13, currently, but liable to eventually change
            }

            bool usePalette = (bitsPerBlock <= 8);

            int[] palette = null;
            if (usePalette) {
                int numPaletteEntries = ReadVarInt(data);
                palette = new int[numPaletteEntries];
                for (int i = 0; i < numPaletteEntries; i++) {
                    palette[i] = ReadVarInt(data);
                }
            } else {
                ReadVarInt(data);  // Should always be 0
            }

            // A bitmask that contains bitsPerBlock set bits
            uint individualValueMask = (uint)((1 << bitsPerBlock) - 1);

            UInt64[] dataArray = ReadUInt64Array(data);  // Reads a VarInt length prefix and then that many UInt64

            ChunkSection section = new ChunkSection();

            for (int y = 0; y < SECTION_HEIGHT; y++) {
                for (int z = 0; z < SECTION_WIDTH; z++) {
                    for (int x = 0; x < SECTION_WIDTH; x++) {
                        int blockNumber = (((y * SECTION_HEIGHT) + z) * SECTION_WIDTH) + x;
                        int startLong = (blockNumber * bitsPerBlock) / 64;
                        int startOffset = (blockNumber * bitsPerBlock) % 64;
                        int endLong = ((blockNumber + 1) * bitsPerBlock - 1) / 64;

                        // Named "value" so that it does not shadow the "data" buffer parameter
                        uint value;
                        if (startLong == endLong) {
                            value = (uint)(dataArray[startLong] >> startOffset);
                        } else {
                            // The entry straddles two longs; combine both parts
                            int endOffset = 64 - startOffset;
                            value = (uint)(dataArray[startLong] >> startOffset | dataArray[endLong] << endOffset);
                        }
                        value &= individualValueMask;

                        if (usePalette) {
                            // value should always be within the palette length
                            // If you're reading a power of 2 minus one (15, 31, 63, 127, etc...) that's out of bounds,
                            // you're probably reading light data instead
                            value = (uint)palette[value];
                        }

                        byte metadata = (byte)(value & 0xF);
                        uint id = value >> 4;

                        section.SetBlock(x, y, z, id, metadata);
                    }
                }
            }

            for (int y = 0; y < SECTION_HEIGHT; y++) {
                for (int z = 0; z < SECTION_WIDTH; z++) {
                    for (int x = 0; x < SECTION_WIDTH; x += 2) {
                        // Note: x += 2 above; we read 2 values along x each time
                        byte value = ReadByte(data);

                        section.SetBlockLight(x, y, z, value & 0xF);
                        section.SetBlockLight(x + 1, y, z, (value >> 4) & 0xF);
                    }
                }
            }

            if (currentDimension.HasSkylight()) { // IE, current dimension is overworld / 0
                for (int y = 0; y < SECTION_HEIGHT; y++) {
                    for (int z = 0; z < SECTION_WIDTH; z++) {
                        for (int x = 0; x < SECTION_WIDTH; x += 2) {
                            // Note: x += 2 above; we read 2 values along x each time
                            byte value = ReadByte(data);

                            section.SetSkyLight(x, y, z, value & 0xF);
                            section.SetSkyLight(x + 1, y, z, (value >> 4) & 0xF);
                        }
                    }
                }
            }

            // May replace an existing section or a null one
            chunk.Sections[sectionY] = section;
        }
    }

    if (full) {
        // Biomes are only present when Ground-Up Continuous is true
        for (int z = 0; z < SECTION_WIDTH; z++) {
            for (int x = 0; x < SECTION_WIDTH; x++) {
                chunk.SetBiome(x, z, ReadByte(data));
            }
        }
    }
}

Serializing

Serializing the packet is more complicated because of the palette. It is easy to implement with the full bits per block value; implementing it with a compacting palette is much harder, since algorithms to generate and resize the palette must be written. As such, this example does not generate a palette. The palette is a good performance improvement (as it can significantly reduce the amount of data sent), but managing it is much harder and there are a variety of ways of implementing it.

Also note that this implementation doesn't handle situations where full is false (i.e., making a large change to one section); it is only good for serializing a full chunk.

public void WriteChunkDataPacket(Chunk chunk, Buffer data) {
    WriteInt(data, chunk.GetX());
    WriteInt(data, chunk.GetZ());
    WriteBool(data, true);  // Full

    int mask = 0;
    Buffer columnBuffer = new Buffer();
    for (int sectionY = 0; sectionY < CHUNK_HEIGHT / SECTION_HEIGHT; sectionY++) {
        if (!chunk.IsSectionEmpty(sectionY)) {
            mask |= (1 << sectionY);  // Set that bit to true in the mask
            WriteChunkSection(chunk.Sections[sectionY], columnBuffer);
        }
    }
    for (int z = 0; z < SECTION_WIDTH; z++) {
        for (int x = 0; x < SECTION_WIDTH; x++) {
            WriteByte(columnBuffer, chunk.GetBiome(x, z));  // Use 127 for 'void' if your server doesn't support biomes
        }
    }

    WriteVarInt(data, mask);
    WriteVarInt(data, columnBuffer.Size);
    WriteByteArray(data, columnBuffer);

    // If you don't support block entities yet, use 0
    // If you need to implement it by sending block entities later with the update block entity packet,
    // do it that way and send 0 as well.  (Note that 1.10.1 (not 1.10 or 1.10.2) will not accept that)

    WriteVarInt(data, chunk.BlockEntities.Length);
    foreach (CompoundTag tag in chunk.BlockEntities) {
        WriteCompoundTag(data, tag);
    }
}

private void WriteChunkSection(ChunkSection section, Buffer data) {
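    byte bitsPerBlock = FULL_SIZE_BITS_PER_BLOCK;  // 13

    WriteByte(data, bitsPerBlock);  // Bits Per Block comes first in a section
    WriteVarInt(data, 0);  // Palette size is 0

    // Data Array Length: 4096 blocks always fill a whole number of longs
    int dataLength = (SECTION_HEIGHT * SECTION_WIDTH * SECTION_WIDTH) * bitsPerBlock / 64;
    WriteVarInt(data, dataLength);

    // A bitmask that contains bitsPerBlock set bits
    uint individualValueMask = (uint)((1 << bitsPerBlock) - 1);

    UInt64 workLong = 0;
    int currentLong = 0;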

    for (int y = 0; y < SECTION_HEIGHT; y++) {
        for (int z = 0; z < SECTION_WIDTH; z++) {
            for (int x = 0; x < SECTION_WIDTH; x++) {
                int blockNumber = (((y * SECTION_HEIGHT) + z) * SECTION_WIDTH) + x;
                int startLong = (blockNumber * bitsPerBlock) / 64;
                int startOffset = (blockNumber * bitsPerBlock) % 64;
                int endLong = ((blockNumber + 1) * bitsPerBlock - 1) / 64;

                if (startLong != currentLong) {
                    // We've finished one long at the border.  Write it and start another.
                    WriteUInt64(data, workLong);
                    workLong = 0;
                    currentLong = startLong;
                }

                byte metadata = section.GetMetadata(x, y, z);
                uint id = section.GetBlockID(x, y, z);

                // 64-bit so that shifting by startOffset does not drop bits
                UInt64 value = id << 4 | metadata;
                value &= individualValueMask;

                workLong |= (value << startOffset);

                if (startLong != endLong) {
                    // We've finished part of one long; write it and start the next.
                    WriteUInt64(data, workLong);
                    currentLong = endLong;

                    workLong = (value >> (64 - startOffset));
                }
            }
        }
    }

    // The last long is still pending after the loop; write it out
    WriteUInt64(data, workLong);

    for (int y = 0; y < SECTION_HEIGHT; y++) {
        for (int z = 0; z < SECTION_WIDTH; z++) {
            for (int x = 0; x < SECTION_WIDTH; x += 2) {
                // Note: x += 2 above; we write 2 values along x each time
                byte value = section.GetBlockLight(x, y, z) | (section.GetBlockLight(x + 1, y, z) << 4);
                WriteByte(data, value);
            }
        }
    }

    if (currentDimension.HasSkylight()) { // IE, current dimension is overworld / 0
        for (int y = 0; y < SECTION_HEIGHT; y++) {
            for (int z = 0; z < SECTION_WIDTH; z++) {
                for (int x = 0; x < SECTION_WIDTH; x += 2) {
                    // Note: x += 2 above; we write 2 values along x each time
                    byte value = section.GetSkyLight(x, y, z) | (section.GetSkyLight(x + 1, y, z) << 4);
                    WriteByte(data, value);
                }
            }
        }
    }
}
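
For readers who later want to add a section palette, one possible approach is sketched below in Python. This is an assumption-laden illustration, not the Notchian algorithm: the function name `build_section_palette` is hypothetical, it rebuilds the palette from scratch for every section, and it falls back to the global palette (13 bits, per this article) when more than 256 distinct states are present.

```python
def build_section_palette(state_ids, full_bits=13):
    """Return (bits_per_block, palette, entries) for one section.

    state_ids: global palette IDs of the section's blocks, in array order.
    palette is empty when the global palette is used directly.
    """
    palette = []
    index_of = {}
    for state_id in state_ids:
        if state_id not in index_of:
            index_of[state_id] = len(palette)
            palette.append(state_id)

    # Smallest bit count that can index every palette entry, minimum 4.
    bits = max(4, (len(palette) - 1).bit_length())
    if bits > 8:
        # Too many distinct states: send global IDs and no palette.
        return full_bits, [], list(state_ids)
    return bits, palette, [index_of[s] for s in state_ids]


bits, palette, entries = build_section_palette([32, 48, 48, 16])
print(bits, palette, entries)  # prints 4 [32, 48, 16] [0, 1, 1, 2]
```

The returned entries would then be packed into longs using the returned bits per block, and the palette written as a VarInt length followed by VarInt entries.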

Full implementations

The following implement the previous (before 1.9) format: