Difference between revisions of "Chunk Format"

(→‎Paletted Container structure: Remove redundant and incorrect (typo?) information on data array format; add links to field descriptions to make extended specifications easier to find.)
(Make sectioning flatter and more consistent. The Paletted Container section is very large and deep, so it gets its own top level section. The Data section is moved under Packet to make it adjacent with Heightmaps.)
Revision as of 09:36, 26 November 2023

This article describes in additional detail the format of the Chunk Data packet.

Concepts

Chunk columns and chunk sections

You've probably heard the term "chunk" before. Minecraft uses chunks to store and transfer world data. However, there are actually 2 different concepts that are both called "chunks" in different contexts: chunk columns and chunk sections.


A chunk column is a collection of blocks with a horizontal size of 16×16, spanning the entire buildable area on the vertical axis. This is what most players think of when they hear the term "chunk". However, these are not the smallest unit in which the game stores data; chunk columns are vertically divided into chunk sections, each 16 blocks tall.

Chunk columns store biomes, block entities, entities, tick data, and an array of sections.


A chunk section is a 16×16×16 collection of blocks (chunk sections are cubic). This is the actual area that blocks are stored in, and is often the concept Mojang refers to via "chunk". Breaking columns into sections wouldn't be useful, except that you don't need to send all chunk sections in a column: If a section is empty, then it doesn't need to be sent (more on this later).

Chunk sections store blocks and light data (both block light and sky light). Additionally, they can be associated with a section palette. A chunk section can contain at most 4096 (16×16×16, or 2^12) unique IDs, although it is highly unlikely that such a section will occur in normal circumstances.

Chunk columns and chunk sections are both displayed when chunk border rendering is enabled (F3+G). Chunk column borders are indicated by the red vertical lines, while chunk section borders are indicated by the blue lines.

Global and section palettes

Illustration of an indexed palette (Source)

Minecraft also uses palettes. A palette maps numeric IDs to block states. The concept is more commonly used with colors in an image; Wikipedia's articles on color look-up tables, indexed colors, and palettes in general may be helpful for fully grokking it.

There are 2 palettes that are used in the game: the global palette and the section palette.


The global palette is the standard mapping of IDs to block states. Block state IDs are created in a linear fashion based on order of assignment. One block state ID is allocated for each unique block state of a block; if a block has multiple properties, the number of allocated states is the product of the number of values for each property. Note that the global palette is currently represented by 15 bits per entry[concept note 1]. If a block is not found in the global palette, it will be treated as air. The Data Generators system can be used to generate a list of all values in the current global palette.
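As a small illustration of the product rule above, the following sketch counts the states of one block. The block and its property names are hypothetical and are not taken from the real registry; only the multiplication is the point.

```python
from math import prod

# Hypothetical block with three properties; names and value counts are
# illustrative only and do not come from the actual block registry.
properties = {
    "facing": ["north", "south", "west", "east"],
    "half": ["top", "bottom"],
    "open": [True, False],
}

def state_count(props):
    """Number of block state IDs allocated for a single block."""
    return prod(len(values) for values in props.values())

print(state_count(properties))  # 4 * 2 * 2 states
```

A block without any properties gets exactly one state, since the empty product is 1.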

Warning: Don't assume that the global palette will always be like this. The format of it was changed in 1.13 as part of the flattening, and might change further in the future (although it's less likely) now that the 1.13 changes have been made. Furthermore, the size of the palette can and will change based on the number of IDs, which can also happen as a result of modding; see #Direct for more details.


A section palette is used to map IDs within a chunk section to global palette IDs. Other than skipping empty sections, correct use of the section palette is the biggest place where data can be saved. Given that most sections contain only a few blocks, using 15 bits to represent a chunk section that is only stone, gravel, and air would be extremely wasteful. Instead, a list of IDs is sent mapping indexes to global palette IDs (for instance, 0x10 0xD0 0x00), and indexes within the section palette are used (so stone would be sent as 0, gravel 1, and air 2)[concept note 2]. The number of bits per ID in the section palette varies from 4 to 8; if fewer than 4 bits would be needed it's increased to 4[concept note 3] and if more than 8 would be needed, the section palette is not used and instead global palette IDs are used[concept note 4].
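The stone, gravel, and air example can be sketched as follows. This is a minimal illustration that assigns palette indexes in order of first appearance; the function name is made up for this example, and the Notchian implementation may build its palette differently.

```python
def build_section_palette(global_ids):
    """Map global palette IDs to small section-local indexes.

    Returns (palette, indices): palette[i] is the global ID for index i,
    and indices holds one palette index per input block.
    """
    palette = []    # index -> global palette ID
    index_of = {}   # global palette ID -> index
    indices = []
    for gid in global_ids:
        if gid not in index_of:
            index_of[gid] = len(palette)
            palette.append(gid)
        indices.append(index_of[gid])
    return palette, indices

# Stone (0x10), gravel (0xD0) and air (0x00), as in the text above.
palette, indices = build_section_palette([0x10, 0x10, 0xD0, 0x00, 0x10])
print([hex(p) for p in palette], indices)
```

Because indexes are handed out consecutively, the resulting palette has no gaps.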

Warning: The notchian client (and server) store their chunk data within the compacted, paletted format. Sending non-compacted data not only wastes bandwidth, but also leads to increased memory use clientside; while this is OK for an initial implementation it is strongly encouraged that one compacts the block data as soon as possible.

Notes

  1. The number of bits in the global palette is calculated as the ceiling of the base-2 logarithm of the highest value in the palette.
  2. There is no requirement for IDs in a section palette to be monotonic; the order within the list is entirely arbitrary and often has to do with how the palette is built (if it finds a stone block before an air block, stone can come first). (However, although the order of the section palette entries can be arbitrary, it can theoretically be optimized to ensure the maximum possible DEFLATE compression. This optimization offers little to no gain, so generally do not attempt it.) However, there shouldn't be any gaps in the section palette, as gaps would increase the size of the section palette when it is sent.
  3. Most likely, sizes smaller than 4 are not used in the section palette because it would require the palette to be resized several times as it is built in the majority of cases; the processing cost would be higher than the data saved. Note that were the palette being built at once from existing data, a more optimal approach would be to iterate over the data twice: first to determine the required palette size, then to write out the compacted representation. This is indeed what the Notchian server does when saving chunks to disk. However, since Notchian Minecraft stores chunks internally in the paletted form, it seems plausible that world generation performance would be affected by a lower minimum.
  4. Most likely, sizes larger than 8 use the global palette because otherwise, the amount of data used to transmit the palette would exceed the savings that the section palette would grant.

Packet structure

Packet ID: 0x20 | State: Play | Bound To: Client

Field Name | Field Type | Notes
Chunk X | Int | Chunk coordinate (block coordinate divided by 16, rounded down).
Chunk Z | Int | Chunk coordinate (block coordinate divided by 16, rounded down).
Heightmaps | NBT | See heightmaps structure below.
Size | VarInt | Size of Data in bytes; in some cases this is larger than it needs to be (e.g. MC-131684, MC-247438) in which case extra bytes should be skipped before reading fields after Data.
Data | Byte array | See data structure below.
Additional Data | Various | See protocol docs.

Heightmaps structure

A nameless NBT TAG_Compound containing two entries of TAG_Long_Array named MOTION_BLOCKING and WORLD_SURFACE (in upper case exactly as shown).

The purpose of WORLD_SURFACE is unknown, but it is not required for the chunk to be accepted. In the superflat world example below, the values transmitted in the WORLD_SURFACE array are identical to the values transmitted in the MOTION_BLOCKING array.

MOTION_BLOCKING encoding

MOTION_BLOCKING is a heightmap for the highest solid block at each position in the chunk. It's encoded as a compacted long array with 256 entries at 9 bits per entry totalling 37 longs.

The heightmap values are encoded as 9-bit unsigned values, seven entries per long, occupying 63 bits of each long. Entries are packed starting from the least significant bit, and the single remaining most significant bit is zero padding, so each long is self-contained and the encoded values do not overflow into the next long.

There are 36 fully filled longs and one partially filled long with only four 9-bit entries. The partially filled long is padded the same way as the fully filled ones.

For the classic superflat world the heightmap contains 36 long values with a hex value of 0100804020100804, which in binary (most significant bit first) is one padding bit followed by seven entries:

0 000000100 000000100 000000100 000000100 000000100 000000100 000000100

This encodes the heightmap of the superflat world, with value "4" for each column, which is consistent with the four block layers of the classic superflat preset.

This is followed by one more long, with a hex value of 0000000020100804, which translates into:

0 000000000 000000000 000000000 000000100 000000100 000000100 000000100

This adds up to 256 values in total for the 256 columns of the chunk. The three most significant 9-bit entries of the last long are unused.

The values are ordered with x increasing fastest: z increases by one after every 16 values, at which point x wraps back to 0. For example, if you have the y values

1, 1, 2, 2, 3, 1, 2, 3, 2, 4, 6, 1, 2, 5, 3, 4, 7, ...

The first 16 encode the y values for x, z = (0, 0), (1, 0), (2, 0) ... (relative to the chunk) and when it hits the 16th number (4) it wraps back around to x = 0 but with an increased z value: x, z = (0, 1), (1, 1), (2, 1), ...
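The ordering described above can be sketched as a simple index calculation. The function name is an assumption made for this example; the sample values are the ones listed in the text.

```python
def heightmap_position(index):
    """Return the chunk-relative (x, z) of a heightmap entry index (0..255)."""
    return index % 16, index // 16

# The sample y values from the text: 16 values for z = 0, then one for z = 1.
heights = [1, 1, 2, 2, 3, 1, 2, 3, 2, 4, 6, 1, 2, 5, 3, 4, 7]
columns = {heightmap_position(i): y for i, y in enumerate(heights)}

print(columns[(15, 0)])  # the 16th value, last column of the first row
print(columns[(0, 1)])   # the 17th value, first column of the second row
```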

Data structure

The data section of the packet contains most of the useful data for the chunk.

Field Name | Field Type | Notes
Data | Array of Chunk Section | This array is NOT length-prefixed. The number of elements in the array is calculated based on the world's height. Sections are sent bottom-to-top. Starting with 1.18, the world height changes based on the dimension. The height of each dimension is assigned by the server in its corresponding registry data entry. For example, the vanilla overworld is 384 blocks tall, meaning 24 chunk sections will be included in this array.

Chunk Section structure

A Chunk Section is defined in terms of other data types. A Chunk Section consists of the following fields:

Field Name | Field Type | Notes
Block count | Short | Number of non-air blocks present in the chunk section. "Non-air" is defined as any fluid and block other than air, cave air, and void air. The client will keep count of the blocks as they are broken and placed, and, if the block count reaches 0, the whole chunk section is not rendered, even if it still has blocks.
Block states | Paletted Container | Consists of 4096 entries, representing all the blocks in the chunk section.
Biomes | Paletted Container | Consists of 64 entries, representing 4x4x4 biome regions in the chunk section.

Paletted Container structure

A Paletted Container is a palette-based storage of entries. Paletted Containers have an associated global palette (either block states or biomes as of now), where values are mapped from. A Paletted Container consists of the following fields:

Field Name | Field Type | Notes
Bits Per Entry | Unsigned Byte | Determines how many bits are used to encode entries. Note that not all numbers are valid here.
Palette | Varies | See below for the format.
Data Array Length | VarInt | Number of longs in the following array. This value isn't entirely respected by the Notchian client. If it is smaller than expected, it will be overridden by the correct size calculated from Bits Per Entry. If too large, the client will read the specified number of longs, but silently discard all of them afterwards, resulting in a chunk filled with palette entry 0 (which appears to have been unintentional). Present but equal to 0 when Bits Per Entry is 0.
Data Array | Array of Long | Compacted list of indices pointing to entry IDs in the Palette. See #Compacted data array for the format. When Bits Per Entry is 0, this array is empty (see Single valued palette).

Data Array is given for each entry with increasing x coordinates, within rows of increasing z coordinates, within layers of increasing y coordinates.
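The ordering above corresponds to the following index calculation. This is a sketch with made-up function names; the `size` parameter is 16 for block states, and using 4 for biomes assumes that the 64 biome entries follow the same x, z, y ordering.

```python
def entry_index(x, y, z, size=16):
    """Index into the unpacked data array: x fastest, then z, then y."""
    return (y * size + z) * size + x

def entry_position(index, size=16):
    """Inverse of entry_index; returns section-relative (x, y, z)."""
    x = index % size
    z = (index // size) % size
    y = index // (size * size)
    return x, y, z

print(entry_index(0, 1, 0))   # first entry of the second layer
print(entry_position(4095))   # last block in the section
```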

Palettes

The bits per entry value determines what format is used for the palette. In most cases, invalid values will be interpreted as a different value when parsed by the notchian client, meaning that chunk data will be parsed incorrectly if you use an invalid bits per entry. Servers must make sure that the bits per entry value is correct. There are currently three types of palettes:

Single valued

This format is used when bits per entry is equal to 0, and signifies that the palette contains a single value.

When this palette is used, the Data Array sent/received is empty, since entries can be inferred from the palette's single value. However, the length of the Data Array is still included, even if it's always 0.

The format is as follows:

Field Name | Field Type | Notes
Value | VarInt | ID of the corresponding entry in its global palette.

Here is an example of the use of single-valued palettes within a completely empty chunk (filled with air).

00 00 00 00 00 01 02 27 03 01 CC FF CC FF CC FF CC FF

The first bytes 00 00 are the number of non-air blocks in the chunk section. They are followed by the Bits per Entry 00, which is zero so we know the palette will have one element (not prefixed with length). This single element is the code for air, 00. Next we have the length of the long array, which is always 00 for single-valued palettes.

The second part of the packet is for biomes. First we have their Bits per Entry 01, followed by the length of the palette 02 and its two elements 27 03. The indexed data for the biomes has a length of 01 long (8 bytes), giving the long CC FF CC FF CC FF CC FF.
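The walkthrough above can be reproduced with a small parser. This is a sketch under stated assumptions, not the Notchian implementation: all function names are made up for this example, VarInts are decoded without sign or length checks, and the direct thresholds (9 for block states, 4 for biomes) are taken from the Direct section of this page.

```python
import struct

def read_varint(buf, pos):
    """Decode a VarInt starting at pos; returns (value, new_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7

def read_paletted_container(buf, pos, direct_threshold):
    """Read Bits Per Entry, the palette, and the long array."""
    bits = buf[pos]
    pos += 1
    if bits == 0:
        # Single valued: one VarInt, no length prefix.
        value, pos = read_varint(buf, pos)
        palette = [value]
    elif bits < direct_threshold:
        # Indirect: length-prefixed array of VarInts.
        length, pos = read_varint(buf, pos)
        palette = []
        for _ in range(length):
            value, pos = read_varint(buf, pos)
            palette.append(value)
    else:
        # Direct: no palette fields at all.
        palette = None
    count, pos = read_varint(buf, pos)
    longs = list(struct.unpack_from(f">{count}Q", buf, pos))
    pos += 8 * count
    return {"bits": bits, "palette": palette, "longs": longs}, pos

def read_chunk_section(buf, pos=0):
    (block_count,) = struct.unpack_from(">h", buf, pos)
    blocks, pos = read_paletted_container(buf, pos + 2, direct_threshold=9)
    biomes, pos = read_paletted_container(buf, pos, direct_threshold=4)
    return {"block_count": block_count, "blocks": blocks, "biomes": biomes}, pos

# The example bytes from the text above.
example = bytes.fromhex("0000 00 00 00 01 02 27 03 01 CCFFCCFFCCFFCCFF")
section, end = read_chunk_section(example)
print(section, end)
```

The returned position equals the length of the example, which confirms that every byte of the section was consumed.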

Indirect

There are three variants of this:

  • For block states with bits per entry <= 4, 4 bits are used to represent a block.
  • For block states and bits per entry between 5 and 8, the given value is used.
  • For biomes the given value is always used, and will be <= 3

This is an actual palette which lists the entries used. Values in the chunk section's data array are indices into the palette, which in turn gives a proper entry.

The format is as follows:

Field Name | Field Type | Notes
Palette Length | VarInt | Number of elements in the following array.
Palette | Array of VarInt | Mapping of entry IDs in the global palette to indices of this array.

Direct

This format is used for bits per entry values greater than or equal to a threshold (9 for block states, 4 for biomes). The number of bits used to represent an entry is the base 2 logarithm of the number of entries in the global palette, rounded up. For the current vanilla release, this is 15 bits per entry for block states, and 6 bits per entry for biomes.

The "palette" uses the following format:

Field Name Field Type Notes
no fields

The Notchian client calculates the bits per entry values for the global palettes at runtime based on the sizes of the block state and biome registries. If a sufficiently large number of blocks or biomes is added with mods, the value will be increased to compensate for the increased ID count. This increase can go up to 31 bits per entry (since registry IDs are signed integers). In case of Minecraft Forge, you can get the number of blocks with the "Number of ids" field found in the RegistryData packet in the Forge Handshake.
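The rules from the three palette sections can be summarised in a short sketch. The function names and the `LIMITS` table are assumptions made for this example, and the registry sizes passed in below are placeholders rather than real vanilla counts.

```python
# (minimum, maximum) bits per entry for the indirect format, per container.
LIMITS = {"blocks": (4, 8), "biomes": (1, 3)}

def direct_bits(registry_size):
    """Base-2 logarithm of the registry size, rounded up (integer math)."""
    return (registry_size - 1).bit_length()

def palette_format(bits_per_entry, container, registry_size):
    """Return (format name, bits actually used per entry)."""
    low, high = LIMITS[container]
    if bits_per_entry == 0:
        return "single", 0
    if bits_per_entry <= high:
        # Block states below 4 bits are raised to 4; biomes use the value as is.
        return "indirect", max(bits_per_entry, low)
    return "direct", direct_bits(registry_size)

# Placeholder registry sizes, chosen only to land on 15 and 6 bits.
print(palette_format(2, "blocks", 20000))
print(palette_format(9, "blocks", 20000))
print(palette_format(4, "biomes", 64))
```

Computing the direct width from the registry size, rather than hardcoding it, keeps the logic valid when mods enlarge the registries.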

Compacted data array

The data array stores several entries within a single long, and sometimes overlaps one entry between multiple longs. For a bits per block value of 15, the data is stored such that bits 1 through 15 are the first entry, 16 through 30 are the second, and so on. Note that bit 1 is the least significant bit in this case, not the most significant bit. The same behavior applies when a value stretches between two longs: for instance, with 15 bits per block, block 5 would be bits 61 through 64 of the first long and then bits 1 through 11 of the second long.

The Data Array, although varying in length, will never be padded due to the number of blocks being evenly divisible by 64, which is the number of bits in a long.

However, the compacted array format has been adjusted between MC 1.15 and MC 1.16 so that individual entries no longer span across multiple longs.

Example (Old)

Format used up to Minecraft 1.15.2

5 bits per block, containing the following references to blocks in a palette (not shown): 1, 2, 2, 3, 4, 4, 5, 6, 6, 4, 8, 0, 7, 4, 3, 13, 15, 16, 9, 14, 10, 12, 0, 2, 11, 4 (although note that the final 4 could instead be any other value ending in those bits)

7020863148418841 0111000000100000100001100011000101001000010000011000100001000001
8B1018A7260F68C8 1000101100010000000110001010011100100110000011110110100011001000
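A minimal sketch of decoding this older layout follows, treating all longs as one continuous bit stream that starts at the least significant bit of the first long. The function name is an assumption made for this example.

```python
def unpack_spanning(longs, bits, count):
    """Decode `count` entries from the pre-1.16 layout.

    Entries may straddle two longs, so the longs are first joined into a
    single arbitrary-precision integer.
    """
    stream = 0
    for i, value in enumerate(longs):
        stream |= (value & 0xFFFFFFFFFFFFFFFF) << (64 * i)
    mask = (1 << bits) - 1
    return [(stream >> (bits * i)) & mask for i in range(count)]

# The two longs from the example above hold 25 complete 5-bit entries.
old_example = [0x7020863148418841, 0x8B1018A7260F68C8]
print(unpack_spanning(old_example, 5, 25))
```

The 13th entry (value 7) is the one that straddles both longs: its low four bits come from the top of the first long and its high bit from the bottom of the second.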

Example (New)

Format used since Minecraft 1.16.0

5 bits per block, containing the following references to blocks in a palette (not shown): 1, 2, 2, 3, 4, 4, 5, 6, 6, 4, 8, 0, 7, 4, 3, 13, 15, 16, 9, 14, 10, 12, 0, 2

0020863148418841 0000000000100000100001100011000101001000010000011000100001000001
01018A7260F68C87 0000000100000001100010100111001001100000111101101000110010000111
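The newer layout can be sketched as below: each long holds a whole number of entries starting at its least significant bit, and any leftover high bits stay zero. The function names are assumptions made for this example.

```python
def pack_padded(values, bits):
    """Encode entries so that none of them spans two longs (1.16+ layout)."""
    per_long = 64 // bits
    mask = (1 << bits) - 1
    longs = []
    for start in range(0, len(values), per_long):
        word = 0
        for slot, value in enumerate(values[start:start + per_long]):
            word |= (value & mask) << (slot * bits)
        longs.append(word)
    return longs

def unpack_padded(longs, bits, count):
    """Decode `count` entries from the 1.16+ layout."""
    per_long = 64 // bits
    mask = (1 << bits) - 1
    return [(longs[i // per_long] >> ((i % per_long) * bits)) & mask
            for i in range(count)]

values = [1, 2, 2, 3, 4, 4, 5, 6, 6, 4, 8, 0,
          7, 4, 3, 13, 15, 16, 9, 14, 10, 12, 0, 2]
packed = pack_padded(values, 5)
print([f"{word:016X}" for word in packed])
```

With 5 bits per entry, 12 entries fit into each long and the top 4 bits are padding, which matches the leading zeros in both example longs.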


The following information needs to be added to this page:
Numeric IDs are outdated in this example, though the format is still correct

A second older example: 13 bits per block, using the global palette.

The following two longs would represent...

01001880C0060020 = 0000000100000000000110001000000011000000000001100000000000100000
0200D0068004C020 = 0000001000000000110100000000011010000000000001001100000000100000

9 blocks, with the start of a 10th (that would be finished in the next long).

  1. Grass, 2:0 (0x020)
  2. Dirt, 3:0 (0x030)
  3. Dirt, 3:0 (0x030)
  4. Coarse dirt, 3:1 (0x031)
  5. Stone, 1:0 (0x010)
  6. Stone, 1:0 (0x010)
  7. Diorite, 1:3 (0x013)
  8. Gravel, 13:0 (0x0D0)
  9. Gravel, 13:0 (0x0D0)
  10. Stone, 1:0 (or potentially emerald ore, 129:0) (0x010 or 0x810)

Tips and notes

There are several things that can make it easier to implement this format.

  • The 15 value for full bits per block is likely to change in the future, so it should not be hardcoded (instead, it should either be calculated or left as a constant).
  • Servers do not need to implement the palette initially (instead always using 15 bits per block), although it is an important optimization later on.
  • The Notchian server implementation does not send values that are out of bounds for the palette. If such a value is received, the format is being parsed incorrectly. In particular, if you're reading a number with all bits set (15, 31, etc), you might be reading skylight data (or you may have a sign error and you're reading negative numbers).
  • NOTE: This only applies to the old format! The number of longs needed for the data array can be calculated as ((16×16×16 blocks)×Bits per block)÷64 bits per long (which simplifies to 64×Bits per block). For instance, 14 bits per block requires 896 longs.
  • The Notchian client generally does not render chunks that lack neighbors. (As of 1.20.2 such chunks appear to sporadically become visible anyway, and do so consistently when interacted with.) This means that if you only send a fixed set of chunks with no empty chunks around them, then some of them will not be visible, although you can still interact with them. This is intended behavior, so that lighting and connected blocks can be handled correctly.
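The data array length from the tips above can be computed as follows. The first function is the formula given for the old format; the second is a sketch for the format used since 1.16, derived from the rule that entries no longer span multiple longs, and should be read as an assumption rather than a quoted formula. Both function names are made up for this example.

```python
def longs_needed_old(bits, entries=4096):
    """Pre-1.16: entries are packed back to back across long boundaries."""
    return -(-(entries * bits) // 64)  # ceiling division

def longs_needed_new(bits, entries=4096):
    """1.16+: each long holds floor(64 / bits) whole entries."""
    per_long = 64 // bits
    return -(-entries // per_long)  # ceiling division

print(longs_needed_old(14))               # the 14-bit case from the tips
print(longs_needed_new(15))               # four 15-bit entries per long
print(longs_needed_new(9, entries=256))   # heightmap: 256 entries at 9 bits
```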

Sample implementations

The following information needs to be added to this page:
This sample code is missing the heightmap, biome changes and the changes from 1.16

How the chunk format can be implemented varies largely by how you want to read/write it. It is often easier to read/write the data long-by-long instead of pre-creating the data to write; however, storing the chunk data arrays in their packed form can be far more efficient memory- and performance-wise. These implementations are simple versions that can work as a base (especially for dealing with the bit shifting), but are not ideal.

Shared code

This is some basic pseudocode that shows the various types of palettes. It does not handle actually populating the palette based on data in a chunk section; handling this is left to the implementer since there are many ways of doing so. (This does not apply to the direct version.)

private uint GetGlobalPaletteIDFromState(BlockState state) {
    // Implementation left to the user; see Data Generators for more info on the values
}

private BlockState GetStateFromGlobalPaletteID(uint value) {
    // Implementation left to the user; see Data Generators for more info on the values
}

public interface Palette {
    uint IdForState(BlockState state);
    BlockState StateForId(uint id);
    byte GetBitsPerBlock();
    void Read(Buffer data);
    void Write(Buffer data);
}

public class IndirectPalette : Palette {
    Map<uint, BlockState> idToState;
    Map<BlockState, uint> stateToId;
    byte bitsPerBlock;

    public IndirectPalette(byte palBitsPerBlock) {
        bitsPerBlock = palBitsPerBlock;
    }

    public uint IdForState(BlockState state) {
        return stateToId.Get(state);
    }

    public BlockState StateForId(uint id) {
        return idToState.Get(id);
    }

    public byte GetBitsPerBlock() {
        return bitsPerBlock;
    }

    public void Read(Buffer data) {
        idToState = new Map<>();
        stateToId = new Map<>();
        // Palette Length
        int length = ReadVarInt();
        // Palette
        for (int id = 0; id < length; id++) {
            uint stateId = ReadVarInt();
            BlockState state = GetStateFromGlobalPaletteID(stateId);
            idToState.Set(id, state);
            stateToId.Set(state, id);
        }
    }

    public void Write(Buffer data) {
        Assert(idToState.Size() == stateToId.Size()); // both should be equivalent
        // Palette Length
        WriteVarInt(data, idToState.Size());
        // Palette
        for (int id = 0; id < idToState.Size(); id++) {
            BlockState state = idToState.Get(id);
            uint stateId = GetGlobalPaletteIDFromState(state);
            WriteVarInt(data, stateId);
        }
    }
}

public class DirectPalette : Palette {
    public uint IdForState(BlockState state) {
        return GetGlobalPaletteIDFromState(state);
    }

    public BlockState StateForId(uint id) {
        return GetStateFromGlobalPaletteID(id);
    }

    public byte GetBitsPerBlock() {
        return Ceil(Log2(BlockState.TotalNumberOfStates)); // currently 15
    }

    public void Read(Buffer data) {
        // No Data
    }

    public void Write(Buffer data) {
        // No Data
    }
}

public Palette ChoosePalette(byte bitsPerBlock) {
    if (bitsPerBlock <= 4) {
        return new IndirectPalette(4);
    } else if (bitsPerBlock <= 8) {
        return new IndirectPalette(bitsPerBlock);
    } else {
        return new DirectPalette();
    }
}
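
The pseudocode above does not populate the palette. One possible way to do so is sketched below in Python, under some assumptions: block states are represented directly by their global palette IDs, the thresholds mirror `ChoosePalette`, and the direct width of 15 bits is taken from the comment in `DirectPalette`. The names `bits_needed` and `build_palette` are hypothetical.

```python
def bits_needed(count):
    """Smallest number of bits that can index `count` distinct values (at least 1)."""
    return max(1, (count - 1).bit_length())


def build_palette(state_ids, direct_bits=15):
    """Return (bits_per_block, palette, values) for one section.

    `palette` is a list mapping local index -> global state ID, or None when
    the direct palette is used. `values` is what goes into the data array.
    """
    palette = []    # local index -> global state ID
    index_of = {}   # global state ID -> local index
    indices = []
    for state_id in state_ids:
        if state_id not in index_of:
            index_of[state_id] = len(palette)
            palette.append(state_id)
        indices.append(index_of[state_id])

    bits = bits_needed(len(palette))
    if bits <= 4:
        return 4, palette, indices      # indirect, minimum width
    if bits <= 8:
        return bits, palette, indices   # indirect
    # Too many distinct states: store global IDs directly, with no palette.
    return direct_bits, None, list(state_ids)
```

This builds the palette in a single pass over a complete section. A server that edits sections in place would also need to grow the palette (and re-pack the data array) when a new state appears, which this sketch does not attempt.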

Deserializing

When deserializing, it is easy to read to a buffer (since length information is present). A basic example:

public Chunk ReadChunkDataPacket(Buffer data) {
    int x = ReadInt(data);
    int z = ReadInt(data);
    bool full = ReadBool(data);
    Chunk chunk;
    if (full) {
        chunk = new Chunk(x, z);
    } else {
        chunk = GetExistingChunk(x, z);
    }
    int mask = ReadVarInt(data);
    int size = ReadVarInt(data);
    ReadChunkColumn(chunk, full, mask, data.ReadByteArray(size));

    int blockEntityCount = ReadVarInt(data);
    for (int i = 0; i < blockEntityCount; i++) {
        CompoundTag tag = ReadCompoundTag(data);
        chunk.AddBlockEntity(tag.GetInt("x"), tag.GetInt("y"), tag.GetInt("z"), tag);
    }

    return chunk;
}

private void ReadChunkColumn(Chunk chunk, bool full, int mask, Buffer data) {
    for (int sectionY = 0; sectionY < (CHUNK_HEIGHT / SECTION_HEIGHT); sectionY++) {
        if ((mask & (1 << sectionY)) != 0) {  // Is the given bit set in the mask?
            byte bitsPerBlock = ReadByte(data);
            Palette palette = ChoosePalette(bitsPerBlock);
            palette.Read(data);

            // A bitmask that contains bitsPerBlock set bits
            uint individualValueMask = (uint)((1 << bitsPerBlock) - 1);

            int dataArrayLength = ReadVarInt(data);
            UInt64[] dataArray = ReadUInt64Array(data, dataArrayLength);

            ChunkSection section = new ChunkSection();

            for (int y = 0; y < SECTION_HEIGHT; y++) {
                for (int z = 0; z < SECTION_WIDTH; z++) {
                    for (int x = 0; x < SECTION_WIDTH; x++) {
                        int blockNumber = (((y * SECTION_WIDTH) + z) * SECTION_WIDTH) + x;
                        int startLong = (blockNumber * bitsPerBlock) / 64;
                        int startOffset = (blockNumber * bitsPerBlock) % 64;
                        int endLong = ((blockNumber + 1) * bitsPerBlock - 1) / 64;

                        // Named "value" rather than "data" so it does not shadow the buffer parameter
                        uint value;
                        if (startLong == endLong) {
                            value = (uint)(dataArray[startLong] >> startOffset);
                        } else {
                            int endOffset = 64 - startOffset;
                            value = (uint)(dataArray[startLong] >> startOffset | dataArray[endLong] << endOffset);
                        }
                        value &= individualValueMask;

                        // value should always be valid for the palette
                        // If you're reading a power of 2 minus one (15, 31, 63, 127, etc...) that's out of bounds,
                        // you're probably reading light data instead

                        BlockState state = palette.StateForId(value);
                        section.SetState(x, y, z, state);
                    }
                }
            }

            for (int y = 0; y < SECTION_HEIGHT; y++) {
                for (int z = 0; z < SECTION_WIDTH; z++) {
                    for (int x = 0; x < SECTION_WIDTH; x += 2) {
                        // Note: x += 2 above; we read 2 values along x each time
                        byte value = ReadByte(data);

                        section.SetBlockLight(x, y, z, value & 0xF);
                        section.SetBlockLight(x + 1, y, z, (value >> 4) & 0xF);
                    }
                }
            }

            if (currentDimension.HasSkylight()) { // i.e., the current dimension is the overworld / 0
                for (int y = 0; y < SECTION_HEIGHT; y++) {
                    for (int z = 0; z < SECTION_WIDTH; z++) {
                        for (int x = 0; x < SECTION_WIDTH; x += 2) {
                            // Note: x += 2 above; we read 2 values along x each time
                            byte value = ReadByte(data);

                            section.SetSkyLight(x, y, z, value & 0xF);
                            section.SetSkyLight(x + 1, y, z, (value >> 4) & 0xF);
                        }
                    }
                }
            }

            // May replace an existing section or a null one
            chunk.Sections[sectionY] = section;
        }
    }

    for (int z = 0; z < SECTION_WIDTH; z++) {
        for (int x = 0; x < SECTION_WIDTH; x++) {
            chunk.SetBiome(x, z, ReadInt(data));
        }
    }
}
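
The light arrays read above store two 4-bit values per byte, with the lower nibble holding the value for the even x coordinate. A small Python sketch of that layout follows, together with its inverse for the write side. The function names are hypothetical.

```python
def unpack_nibbles(raw):
    """Split each byte into two 4-bit values, low nibble first."""
    values = []
    for byte in raw:
        values.append(byte & 0x0F)         # value for x
        values.append((byte >> 4) & 0x0F)  # value for x + 1
    return values


def pack_nibbles(values):
    """Inverse of unpack_nibbles: combine pairs of 4-bit values into bytes."""
    assert len(values) % 2 == 0
    return bytes(
        (values[i] & 0x0F) | ((values[i + 1] & 0x0F) << 4)
        for i in range(0, len(values), 2)
    )
```

For a 16×16×16 section this turns 2048 bytes into 4096 light values, in the same y, z, x order as the loops above.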

Serializing

Serializing the packet is more complicated, because of the palette. It is easy to implement with the full bits per block value; implementing it with a compacting palette is much harder since algorithms to generate and resize the palette must be written. As such, this example does not generate a palette. The palette is a good performance improvement (as it can significantly reduce the amount of data sent), but managing that is much harder and there are a variety of ways of implementing it.

Also note that this implementation doesn't handle situations where full is false (i.e., making a large change to one section); it is only suitable for serializing a full chunk.

public void WriteChunkDataPacket(Chunk chunk, Buffer data) {
    WriteInt(data, chunk.GetX());
    WriteInt(data, chunk.GetZ());
    WriteBool(data, true);  // Full

    int mask = 0;
    Buffer columnBuffer = new Buffer();
    for (int sectionY = 0; sectionY < (CHUNK_HEIGHT / SECTION_HEIGHT); sectionY++) {
        if (!chunk.IsSectionEmpty(sectionY)) {
            mask |= (1 << sectionY);  // Set that bit to true in the mask
            WriteChunkSection(chunk.Sections[sectionY], columnBuffer);
        }
    }
    for (int z = 0; z < SECTION_WIDTH; z++) {
        for (int x = 0; x < SECTION_WIDTH; x++) {
            WriteInt(columnBuffer, chunk.GetBiome(x, z));  // Use 127 for 'void' if your server doesn't support biomes
        }
    }

    WriteVarInt(data, mask);
    WriteVarInt(data, columnBuffer.Size);
    WriteByteArray(data, columnBuffer);

    // If you don't support block entities yet, use 0
    // If you need to implement it by sending block entities later with the update block entity packet,
    // do it that way and send 0 as well.  (Note that 1.10.1 (not 1.10 or 1.10.2) will not accept that)

    WriteVarInt(data, chunk.BlockEntities.Length);
    foreach (CompoundTag tag in chunk.BlockEntities) {
        WriteCompoundTag(data, tag);
    }
}

private void WriteChunkSection(ChunkSection section, Buffer buf) {
    Palette palette = section.palette;
    byte bitsPerBlock = palette.GetBitsPerBlock();

    WriteByte(buf, bitsPerBlock);
    palette.Write(buf);

    int dataLength = (16*16*16) * bitsPerBlock / 64; // See tips section for an explanation of this calculation
    UInt64[] data = new UInt64[dataLength];

    // A bitmask that contains bitsPerBlock set bits
    uint individualValueMask = (uint)((1 << bitsPerBlock) - 1);

    for (int y = 0; y < SECTION_HEIGHT; y++) {
        for (int z = 0; z < SECTION_WIDTH; z++) {
            for (int x = 0; x < SECTION_WIDTH; x++) {
                int blockNumber = (((y * SECTION_WIDTH) + z) * SECTION_WIDTH) + x;
                int startLong = (blockNumber * bitsPerBlock) / 64;
                int startOffset = (blockNumber * bitsPerBlock) % 64;
                int endLong = ((blockNumber + 1) * bitsPerBlock - 1) / 64;

                BlockState state = section.GetState(x, y, z);

                UInt64 value = palette.IdForState(state);
                value &= individualValueMask;

                data[startLong] |= (value << startOffset);

                if (startLong != endLong) {
                    data[endLong] |= (value >> (64 - startOffset));
                }
            }
        }
    }

    WriteVarInt(buf, dataLength);
    WriteLongArray(buf, data);

    for (int y = 0; y < SECTION_HEIGHT; y++) {
        for (int z = 0; z < SECTION_WIDTH; z++) {
            for (int x = 0; x < SECTION_WIDTH; x += 2) {
                // Note: x += 2 above; we write 2 values along x each time
                byte value = (byte)(section.GetBlockLight(x, y, z) | (section.GetBlockLight(x + 1, y, z) << 4));
                WriteByte(buf, value);
            }
        }
    }

    if (currentDimension.HasSkylight()) { // i.e., the current dimension is the overworld / 0
        for (int y = 0; y < SECTION_HEIGHT; y++) {
            for (int z = 0; z < SECTION_WIDTH; z++) {
                for (int x = 0; x < SECTION_WIDTH; x += 2) {
                    // Note: x += 2 above; we write 2 values along x each time
                    byte value = (byte)(section.GetSkyLight(x, y, z) | (section.GetSkyLight(x + 1, y, z) << 4));
                    WriteByte(buf, value);
                }
            }
        }
    }
}
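
Two small calculations from the serializer above can be checked in isolation: the section bit mask and the length of the data array. The Python sketch below assumes 16 sections per column and the packed layout used by the pseudocode in this section. The function names are hypothetical.

```python
SECTION_BLOCKS = 16 * 16 * 16  # 4096 entries per section


def primary_bit_mask(section_is_empty):
    """Set bit N for every non-empty section N (index 0 is the lowest section)."""
    mask = 0
    for section_y, empty in enumerate(section_is_empty):
        if not empty:
            mask |= 1 << section_y
    return mask


def data_array_length(bits_per_block):
    """Number of 64-bit longs needed for one section's packed data array."""
    # 4096 is a multiple of 64, so this division is exact for any bit width.
    return SECTION_BLOCKS * bits_per_block // 64
```

For example, a column in which only sections 0 and 2 contain blocks yields a mask of `0b101`, and a section using 4 bits per block needs 256 longs.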

Full implementations

Sample data

Old format

The following implement the previous (before 1.9) format: