Difference between revisions of "Chunk Format"

From wiki.vg
(→‎Sample implementations: Fix writing code)
 
(146 intermediate revisions by 28 users not shown)
Line 1: Line 1:
This article describes in additional detail the format of the [[Protocol#Chunk Data|Chunk Data]] packet.
+
This article describes in additional detail the format of the [[Protocol#Chunk_Data_and_Update_Light|Chunk Data]] packet.
  
 
== Concepts ==
 
Line 7: Line 7:
 
You've probably heard the term "chunk" before.  Minecraft uses chunks to store and transfer world data.  However, there are actually 2 different concepts that are both called "chunks" in different contexts: chunk columns and chunk sections.
 
  
{{Anchor|Chunk column}}A '''chunk column''' is a 16×256×16 collection of blocks, and is what most players think of when they hear the term "chunk".  However, this is not the smallest unit in which the game stores data; chunk columns are actually 16 chunk sections aligned vertically.
+
{{Anchor|Chunk column}}A '''chunk column''' is a collection of blocks with a horizontal size of 16×16, spanning the entire buildable area on the vertical axis.  This is what most players think of when they hear the term "chunk".  However, this is not the smallest unit in which the game stores data; chunk columns are vertically divided into chunk sections, each 16 blocks tall.
  
Chunk columns store biomes, block entities, entities, tick data, and an array of sections.
+
Chunk columns store block entities, entities, tick data, and an array of sections.
  
 
{{Anchor|Chunk section}}A '''chunk section''' is a 16×16×16 collection of blocks (chunk sections are cubic).  This is the actual area that blocks are stored in, and is often the concept Mojang refers to via "chunk".  Breaking columns into sections wouldn't be useful, except that you don't need to send all chunk sections in a column: If a section is empty, then it doesn't need to be sent (more on this later).
 
  
Chunk sections store blocks and light data (both block light and sky light).  Additionally, they can be associated with a [[#Section palette|section palette]].  A chunk section can contain at maximum 4096 (16&times;16&times;16, or 2<sup>12</sup>) unique IDs (but, it is highly unlikely that such a section will occur in normal circumstances).
+
Chunk sections store blocks, biomes and light data (both block light and sky light).  Additionally, they can be associated with at most two [[#Palettes|palettes]]&mdash;one for blocks, one for biomes.  A chunk section can contain at maximum 4096 (16&times;16&times;16, or 2<sup>12</sup>) unique block state IDs, and 64 (4&times;4&times;4) unique biome IDs (but, it is highly unlikely that such a section will occur in normal circumstances).
  
 
Chunk columns and chunk sections are both displayed when chunk border rendering is enabled (<kbd>F3</kbd>+<kbd>G</kbd>).  Chunk column borders are indicated by the red vertical lines, while chunk section borders are indicated by the blue lines.
 
  
=== Empty sections and the primary bit mask ===
+
=== Registries ===
  
As previously mentioned, chunk sections can be '''empty'''.  Sections which contain no useful data are treated as empty<ref group="concept note">Empty is defined by the notchian server as being composed of all air, but this can result in lighting issues ([https://bugs.mojang.com/browse/MC-80966 MC-80966]).  Custom servers should consider defining empty to mean something like "completely air and without lighting data" or "completely air and with no blocks in the neighboring sections that need to be lit by light from this section".</ref>, and are not sent to the client, as the client is able to infer the contents<ref group="concept note">Generally meaning, "it's all air".  Of course, lighting is an issue as with before - the notchian client assumes 0 block light and 15 sky light, even when that's not valid (underground sections shouldn't be skylit, and sections near light sources should be lit).</ref>.  For the average world, this means around 60% of the world's data doesn't need to be sent, since it's all air; this is a significant save.
+
The registries are the primary, protocol-wide mappings from block states and biomes to numeric identifiers.
  
It is important to note that a chunk composed entirely of empty sections is different from an empty (i.e., unloaded) chunk column.  When a block is changed in an empty section, the section is created (as all air), and the block is set.  When a block is changed in an empty chunk, the behavior is undefined (but generally, nothing happens).
+
==== Block state registry ====
  
The '''primary bit mask''' simply determines which sections are being sent.  The least significant bit is for the lowest section (y=0 to y=15).  Only 16 bits can be set in it (with the 16th bit controlling the y=240 to y=255 section); sections above y=255 are not valid for the notchian client.  To check whether a section is included, use <span style="white-space:nowrap"><syntaxhighlight lang="c" inline>((mask & (1 << sectionY)) != 0)</syntaxhighlight></span>.
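The bit test above can be extended into a small helper that lists the included sections or builds a mask from per-section emptiness flags. This is only a sketch: the class and method names are hypothetical, and it assumes the 16-section limit described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of any real client or server code.
final class PrimaryBitMask {
    /** Returns the section indices (0 = lowest section, y=0 to y=15) whose bit is set. */
    static List<Integer> includedSections(int mask) {
        List<Integer> sections = new ArrayList<>();
        for (int sectionY = 0; sectionY < 16; sectionY++) {
            if ((mask & (1 << sectionY)) != 0) {
                sections.add(sectionY);
            }
        }
        return sections;
    }

    /** Builds a mask from "is empty" flags; index 0 is the lowest section. */
    static int buildMask(boolean[] sectionIsEmpty) {
        int mask = 0;
        for (int sectionY = 0; sectionY < sectionIsEmpty.length && sectionY < 16; sectionY++) {
            if (!sectionIsEmpty[sectionY]) {
                mask |= 1 << sectionY;
            }
        }
        return mask;
    }
}
```

For example, a mask of <code>0b101</code> means that only the sections at y=0 to y=15 and y=32 to y=47 are present in the packet.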
+
The block state registry is hardcoded into Minecraft, and can only be changed via modding. Such changes break protocol compatibility, and as such, modding frameworks typically include protocol extensions to negotiate which IDs the client and server have in common.
  
=== Global and section palettes ===
+
One block state ID is allocated for each unique block state of a block; if a block has multiple properties then the number of allocated states is the product of the number of values for each property. The block state IDs belonging to a given block are always consecutive. Other than that, the ordering of block states is hardcoded, and somewhat arbitrary.
  
[[File:Indexed_palette.png|thumb|Illustration of an indexed palette ([[Commons:File:Indexed_palette.png|Source]])]]
+
The [[Data Generators]] system can be used to generate a list of all block state IDs.
  
Minecraft also uses palettes.  A palette maps numeric IDs to block states.  The concept is more commonly used with colors in an image; Wikipedia's articles on [[Wikipedia:Color look-up table|color look-up tables]], [[Wikipedia:Indexed color|indexed colors]], and [[Wikipedia:Palette (computing)|palettes in general]] may be helpful for fully grokking it.
+
==== Biome registry ====
  
There are 2 palettes that are used in the game: the global palette and the section palette.
+
The biome registry is defined at runtime in a [[Registry Data]] packet sent by the server during the Configuration phase.
  
{{Anchor|Global palette}}The '''global palette''' is the standard mapping of IDs to block states.  Currently, it is a combination of block ID and metadata <span style="white-space:nowrap">(<syntaxhighlight lang="c" inline>(blockId << 4) | metadata</syntaxhighlight>)</span>.  Note that, as a result, the global palette is not continuous<ref group="concept note">The global palette is not continuous in more ways than one.  The more obvious manner is that not all blocks have metadata: for instance, dirt (ID 3) has only 3 states (dirt, coarse dirt, and podzol), so the palette surrounding it is <code>000000011&nbsp;0000; 000000011&nbsp;0001; 000000011&nbsp;0010; 000000100&nbsp;0000</code>.  The second way is that structure blocks have an ID of 255, even though there is currently no block with ID 254; thus, there is a large gap.</ref>.  Entries not defined within the global palette are treated as air (even if the block ID itself is known, if the metadata is not known, the state is treated as air).  Note that the global palette is currently represented by 13 bits per entry<ref group="concept note">The number of bits in the global palette is calculated via the ceiling of the base-2 logarithm of the highest value in the palette.</ref>, with 9 bits for the block ID and 4 bits for the metadata.
+
The Notchian server pulls these biome definitions {{Minecraft Wiki|Custom biome|from data packs}}.
  
The basic implementation looks like this:
+
=== Palettes ===
  
<syntaxhighlight lang="java">
long getGlobalPaletteIDFromState(BlockState state) {
    if (state.isValid()) {
        // Block ID in the upper bits, metadata in the low 4 bits
        return (state.getId() << 4) | state.getMetadata();
    } else {
        return 0; // Air
    }
}

BlockState getStateFromGlobalPaletteID(long id) {
    // Explicit casts are needed because id is a long
    int blockID = (int) (id >> 4);
    byte metadata = (byte) (id & 0x0F);
    BlockState state = new BlockState(blockID, metadata);
    if (state.isValid()) {
        return state;
    } else {
        return new BlockState(0, 0); // Air
    }
}
</syntaxhighlight>
+
[[File:Indexed_palette.png|thumb|Illustration of an indexed palette ([[Commons:File:Indexed_palette.png|Source]])]]
+
A palette maps a smaller set of IDs within a [[#Chunk section|chunk section]] to registry IDs. Other than skipping empty sections, correct use of palettes is the biggest place where data can be saved. For example, encoding any of the IDs in the block state registry as of vanilla 1.20.2 requires 15 bits. Given that most sections contain only a few different blocks, using 15 bits per block to represent a chunk section that is only stone, gravel, and air would be extremely wasteful.  Instead, a list of registry IDs is sent (for instance, <code>40 57 0</code>), and indices into that list&mdash;the palette&mdash;are sent as the block state or biome values within the chunk (so <code>40</code> would be sent as <code>0</code>, <code>57</code> as <code>1</code>, and <code>0</code> as <code>2</code>).<ref group="concept note">There is no requirement for IDs in a palette to be [[Wikipedia:Monotonic|monotonic]]; the order within the list is entirely arbitrary and often has to do with how the palette is built (if it finds a stone block before an air block, stone can come first).  (However, although the order of the palette entries can be arbitrary, it can theoretically be optimized to ensure the maximum possible DEFLATE compression. This optimization offers little to no gain, so generally do not attempt it.)  However, there shouldn't be any gaps in the palette, as gaps would increase the size of the palette when it is sent.</ref>
 
  
{{Warning2|Don't assume that the global palette will always be like this; keep it in a separate function.  Mojang has stated that they plan to change the global palette to avoid increasing the total size. Equally so, though, do not hardcode the total size of the palette; keep it in a constant.}}
+
The number of bits used to encode palette indices varies based on the number of indices, and the registry in question. If a threshold on the number of unique IDs in the section is exceeded, a palette is not used, and registry IDs are used directly instead.
  
{{Anchor|Section palette}}A '''section palette''' is used to map IDs within a [[#Chunk section|chunk section]] to global palette IDs.  Other than skipping empty sections, correct use of the section palette is the biggest place where data can be saved.  Given that most sections contain only a few different blocks, using 13 bits per block to represent a chunk section that is only stone, gravel, and air would be extremely wasteful.  Instead, a list of IDs is sent, mapping indices to global palette IDs (for instance, <code>0x10 0xD0 0x00</code>), and indices within the section palette are used (so stone would be sent as <code>0</code>, gravel <code>1</code>, and air <code>2</code>)<ref group="concept note">There is no requirement for IDs in a section palette to be [[Wikipedia:Monotonic|monotonic]]; the order within the list is entirely arbitrary and often has to do with how the palette is built (if it finds a stone block before an air block, stone can come first).  (However, although the order of the section palette entries can be arbitrary, it can theoretically be optimized to ensure the maximum possible GZIP compression.  This optimization offers little to no gain, so generally do not attempt it.)  However, there shouldn't be any gaps in the section palette, as gaps would increase the size of the section palette when it is sent.</ref>.  
The number of bits per ID in the section palette varies from 4 to 8; if fewer than 4 bits would be needed it's increased to 4<ref group="concept note">Most likely, sizes smaller than 4 are not used in the section palette because it would require the palette to be resized several times as it is built in the majority of cases; the processing cost would be higher than the data saved.</ref> and if more than 8 would be needed, the section palette is not used and instead global palette IDs are used<ref group="concept note">Most likely, sizes larger than 8 use the global palette because otherwise, the amount of data used to transmit the palette would exceed the savings that the section palette would grant.</ref>.
+
The concept of palettes is more commonly used with colors in an image; Wikipedia's articles on [[Wikipedia:Color look-up table|color look-up tables]], [[Wikipedia:Indexed color|indexed colors]], and [[Wikipedia:Palette (computing)|palettes in general]] may be helpful for fully grokking it.
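To make the palette idea concrete, the following sketch builds a palette from a list of registry IDs in first-seen order and records one palette index per entry. The class name and its fields are hypothetical, and the code ignores the wire encoding entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of palette building; not taken from any real implementation.
final class PaletteSketch {
    final List<Integer> palette = new ArrayList<>(); // palette index -> registry ID
    final int[] indices;                             // one palette index per entry

    PaletteSketch(int[] registryIds) {
        Map<Integer, Integer> indexOf = new HashMap<>();
        indices = new int[registryIds.length];
        for (int i = 0; i < registryIds.length; i++) {
            Integer index = indexOf.get(registryIds[i]);
            if (index == null) {
                // First time this ID is seen: append it, so the palette has no gaps
                index = palette.size();
                palette.add(registryIds[i]);
                indexOf.put(registryIds[i], index);
            }
            indices[i] = index;
        }
    }

    /** Smallest number of bits that can hold every palette index (0 for a single entry). */
    int minimumBits() {
        if (palette.size() <= 1) {
            return 0;
        }
        return 32 - Integer.numberOfLeadingZeros(palette.size() - 1);
    }
}
```

With the IDs <code>40 57 0 40 0</code> this yields the palette <code>40 57 0</code> and the indices <code>0 1 2 0 2</code>. Note that <code>minimumBits()</code> is only a lower bound: the number of bits actually used on the wire is rounded to one of the permitted values.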
  
 
{{Warning2|Note that the notchian client (and server) store their chunk data within the compacted, paletted format.  Sending non-compacted data not only wastes bandwidth, but also leads to increased memory use clientside; while this is OK for an initial implementation it is strongly encouraged that one compacts the block data as soon as possible.}}
 
 
=== Ground-up continuous ===
 
 
The '''ground-up continuous''' value (tentative name) is one of the more confusing properties of the chunk data packet, simply because there's no good name for it.  It controls two different behaviors of the chunk data packet, one that most people need, and one that most don't.
 
 
When ground-up continuous is set, the chunk data packet is used to create a ''new'' chunk.  This includes biome data, and all (non-empty) sections in the chunk.  Sections not specified in the primary bit mask are empty sections.
 
 
{{Warning2|Sending a packet with ground-up continuous enabled over a chunk that already exists will '''leak memory''' clientside.
 
 
Make sure to unload chunks before overwriting them with the [[Protocol#Unload Chunk|Unload Chunk]] packet.  That packet can always be sent even on unloaded chunks, so in situations where the chunk might or might not be loaded already, it's valid to send it again (but avoid sending it in excess).
 
 
The <code>MultiplayerChunkCache</code> values in F3 show the number of chunks in the client's 2 chunk storage mechanisms; if the numbers aren't equal, you've leaked chunks.}}
 
 
When ground-up continuous is not set, then the chunk data packet acts as a large [[Protocol#Multi Block Change|Multi Block Change]] packet, changing all of the blocks in the given section at once.  This can have some performance benefits, especially for lighting purposes.  Biome data is ''not'' sent when ground-up continuous is not set; that means that biomes can't be changed once a chunk is loaded.  Sections not specified in the primary bit mask are not changed and should be left as-is.
 
 
{{Warning2|As with [[Protocol#Multi Block Change|Multi Block Change]] and [[Protocol#Block Change|Block Change]], it is not safe to send this packet in unloaded chunks, as it can corrupt the notchian client's shared empty chunk.  Clients should ''ignore'' such packets, and servers should not send non-ground-up continuous chunk data packets into unloaded chunks.}}
 
  
 
=== Notes ===
 
Line 94: Line 61:
 
  ! Notes
  |-
  |rowspan="9"| 0x20
+
  |rowspan="6"| 0x20
  |rowspan="9"| Play
+
  |rowspan="6"| Play
  |rowspan="9"| Client
+
  |rowspan="6"| Client
 
  | Chunk X
  | Int
+
  | {{Type|Int}}
  | Chunk coordinate (block coordinate divided by 16, rounded down)
+
  | Chunk coordinate (block coordinate divided by 16, rounded down).
 
  |-
  | Chunk Z
  | Int
+
  | {{Type|Int}}
  | Chunk coordinate (block coordinate divided by 16, rounded down)
+
  | Chunk coordinate (block coordinate divided by 16, rounded down).
  |-
+
  |-  
  | Ground-Up Continuous
+
  | Heightmaps
  | Boolean
+
  | {{Type|NBT}}
  | See [[#Ground-up continuous]]
+
  | See [[#Heightmaps structure]] below.
|-
 
| Primary Bit Mask
 
| VarInt
 
| Bitmask with bits set to 1 for every 16×16×16 chunk section whose data is included in Data. The least significant bit represents the chunk section at the bottom of the chunk column (from y=0 to y=15).
 
 
  |-
  | Size
  | VarInt
+
  | {{Type|VarInt}}
  | Size of Data in bytes
+
  | Size of Data in bytes; in some cases this is larger than it needs to be (e.g. [https://bugs.mojang.com/browse/MC-131684 MC-131684], [https://bugs.mojang.com/browse/MC-247438 MC-247438]) in which case extra bytes should be skipped before reading fields after Data.
 
  |-
  | Data
  | Byte array
+
  | {{Type|Byte Array}}
  | See [[#Data structure|data structure]] below
+
  | See [[#Data structure]] below.
 +
|-
 +
| Additional Data
 +
| Various
 +
| See [[Protocol#Chunk Data and Update Light]].
 +
|}
 +
 
 +
=== Heightmaps structure ===
 +
 
 +
Minecraft uses heightmaps to optimize various operations on both the server and the client. All heightmaps share the basic structure of encoding the position of the highest "occupied" block in each column of blocks within a chunk column. The differences have to do with which blocks are considered to be "occupied".
 +
 
 +
Rather than calculating them from the chunk data, the client receives the initial heightmaps it needs from the server. This trades an increase in network usage for a decrease in client-side processing. Once a chunk is loaded, the client updates its heightmaps based on block changes independently from the server.
 +
 
 +
No heightmaps are strictly required for the client to accept a chunk. If a heightmap is missing from a Chunk Data packet, the client will initialize it with all heights set to their minimum values. However, block changes will still cause the corresponding height values to be updated as normal.
 +
 
 +
The Heightmaps structure is an NBT [[NBT#Specification:compound_tag|Compound Tag]] containing a [[NBT#Specification:long_array_tag|Long Array Tag]] element for each heightmap. The name of each Long Array is the name of the corresponding heightmap.
 +
 
 +
The height values of a heightmap are packed into the long array in the same manner described in [[#Data Array format]], and ordered such that the fastest-increasing coordinate is x. (However, there are only 256 entries&mdash;one for each block column.) The Bits Per Entry value used is calculated as ceil(log2(world height + 1)). This is because the number of possible height values is one more than the world height&mdash;ranging from 0 (completely blank column; not even bedrock) to world height (highest position is occupied). Note that this means, for example, that a world with height 256 will use a Bits Per Entry of 9.
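The Bits Per Entry arithmetic above can be checked with a few lines of code. This is a sketch with hypothetical names; it covers only the bit width and the column index, not the packing of the long array.

```java
// Hypothetical helper illustrating the heightmap arithmetic described above.
final class HeightmapBits {
    /**
     * ceil(log2(worldHeight + 1)). Heights range from 0 to worldHeight inclusive,
     * so this is the number of bits needed to represent worldHeight itself.
     */
    static int bitsPerEntry(int worldHeight) {
        return 32 - Integer.numberOfLeadingZeros(worldHeight);
    }

    /** Index of a block column among the 256 entries; x is the fastest-increasing coordinate. */
    static int columnIndex(int x, int z) {
        return z * 16 + x;
    }
}
```

A world height of 256 gives 9 bits, as noted above; a height of 384 also fits in 9 bits, since 385 possible values is still below 512.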
 +
 
 +
The following heightmaps are currently used by the client:
 +
 
 +
{| class="wikitable"
 +
! Name
 +
! Considers Occupied
 +
! Purposes
 
  |-
  | Number of block entities
+
  | MOTION_BLOCKING
  | VarInt
+
  | "Solid" blocks, except bamboo saplings and cactuses; fluids.
  | Length of the following array
+
  | To determine where to display rain and snow.
 
  |-
  | Block entities
+
  | WORLD_SURFACE
  | Array of [[NBT|NBT Tag]]
+
  | All blocks other than air, cave air and void air.
  | All block entities in the chunk.  Use the x, y, and z tags in the NBT to determine their positions.
+
  | To determine if a beacon beam is obstructed.
 
  |}
 
  |}
  
== Data structure ==
+
This list appears to be exhaustive as of 1.20.2.
 +
 
 +
=== Data structure ===
  
 
The data section of the packet contains most of the useful data for the chunk.
 
Line 141: Line 130:
 
  |-
  | Data
  | Array of [[#Chunk Section structure|Chunk Section]]
+
  | {{Type|Array}} of [[#Chunk Section structure|Chunk Section]]
  | The length of the array is equal to the number of bits set in Primary Bit Mask. Sections are sent bottom-to-top, i.e. the first section, if sent, extends from Y=0 to Y=15.
+
  | This array is NOT length-prefixed. The number of elements in the array is calculated based on the world's height. Sections are sent bottom-to-top. Starting with 1.18, the world height changes based on the dimension. The height of each dimension is assigned by the server in its corresponding [[Registry Data#Dimension Type|registry data]] entry. For example, the vanilla overworld is 384 blocks tall, meaning 24 chunk sections will be included in this array.
 +
|}
 +
 
 +
==== Chunk Section structure ====
 +
 
 +
{{Need Info|How do biomes work now?  The biome change happened at the same time as the seed change, but it's not clear how/if biomes could be computed given that it's not the actual seed...  ([https://www.reddit.com/r/Mojira/comments/e5at6i/a_discussion_for_the_changes_to_how_biomes_are/ /r/mojira discussion] which notes that it seems to be some kind of interpolation)}}
 +
 
 +
A Chunk Section is defined in terms of other [[data types]]. A Chunk Section consists of the following fields:
 +
 
 +
{| class="wikitable"
 +
|-
 +
! Field Name
 +
! Field Type
 +
! Notes
 +
|-
 +
| Block count
 +
| {{Type|Short}}
 +
| Number of non-air blocks present in the chunk section. "Non-air" is defined as any fluid and block other than air, cave air, and void air. The client will keep count of the blocks as they are broken and placed, and, if the block count reaches 0, the whole chunk section is not rendered, even if it still has blocks.
 +
|-
 +
| Block states
 +
| [[#Paletted Container structure|Paletted Container]]
 +
| Consists of 4096 entries, representing all the blocks in the chunk section.
 
  |-
  | Biomes
  | Optional Byte Array
+
  | [[#Paletted Container structure|Paletted Container]]
  | Only sent if Ground-Up Continuous is true; 256 bytes if present
+
  | Consists of 64 entries, representing 4&times;4&times;4 biome regions in the chunk section.
 
  |}
 
  |}
  
=== Chunk Section structure ===
+
== Paletted Container structure ==
  
A Chunk Section is defined in terms of other [[data types]]. A Chunk Section consists of the following fields:
+
A Paletted Container is a palette-based storage of entries. Paletted Containers have an associated registry (either block states or biomes as of now), where values are mapped from. A Paletted Container consists of the following fields:
  
 
{| class="wikitable"
 
{| class="wikitable"
Line 159: Line 169:
 
  ! Notes
  |-
  | Bits Per Block
+
  | Bits Per Entry
  | Unsigned Byte
+
  | {{Type|Unsigned Byte}}
  | Determines how many bits are used to encode a block. Note that not all numbers are valid here. This also changes whether the palette is present.
+
  | Determines how many bits are used to encode entries. Note that not all numbers are valid here.
|-
 
| Palette Length
 
| VarInt
 
| Length of the following array. May be 0, in which case the following palette is not sent.
 
 
  |-
  | Palette
  | Optional Array of VarInt
+
  | Varies
  | Mapping of block state IDs in the global palette to indices of this array
+
  | See [[#Palette formats]] below.
 
  |-
  | Data Array Length
  | VarInt
+
  | {{Type|VarInt}}
  | Number of longs in the following array
+
  | Number of longs in the following array. This value isn't entirely respected by the Notchian client. If it is smaller than expected, it will be overridden by the correct size calculated from Bits Per Entry. If too large, the client will read the specified number of longs, but silently discard all of them afterwards, resulting in a chunk filled with palette entry 0 (which appears to have been unintentional). Present but equal to 0 when Bits Per Entry is 0.
 
  |-
  | Data Array
  | Array of Long
+
  | {{Type|Array}} of {{Type|Long}}
  | Compacted list of 4096 indices pointing to state IDs in the Palette
+
  | See [[#Data Array format]] below.
 +
|}
 +
 
 +
=== Palette formats ===
 +
 
 +
The Bits Per Entry value determines what format is used for the palette.
 +
 
 +
{{Warning2|Values not listed in the following table are rounded upwards to the next one specified, or downwards if larger than the value for Direct. Therefore such values will lead to unexpected results, and should not be used.}}
 +
 
 +
There are currently three possible palette formats:
 +
 
 +
{| class="wikitable"
 +
|-
 +
! <abbr title="Bits Per Entry">BPE</abbr> (blocks)
 +
! <abbr title="Bits Per Entry">BPE</abbr> (biomes)
 +
! Palette Format
 
  |-
  | Block Light
+
  | 0
  | Byte Array
+
  | 0
  | Half byte per block
+
  | [[#Single valued|Single valued]]
 
  |-
  | Sky Light
+
  | 4-8
  | Optional Byte Array
+
  | 1-3
  | Only if in the Overworld; half byte per block
+
  | [[#Indirect|Indirect]]
 +
|-
 +
| 15**
 +
| 6*
 +
| [[#Direct|Direct]]
 
  |}
 
  |}
  
Data Array, Block Light, and Sky Light are given for each block with increasing x coordinates, within rows of increasing z coordinates, within layers of increasing y coordinates. For the half-byte light arrays, even-indexed items (those with an even x coordinate, starting at 0) are packed into the ''low bits'', odd-indexed into the ''high bits''.
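The ordering and the half-byte packing described above can be sketched as follows. The class and method names are hypothetical; the light arrays are assumed to hold 2048 bytes (4096 half-bytes) per section.

```java
// Hypothetical helper illustrating the index order and nibble packing described above.
final class SectionIndexing {
    /** Index of a block within a section: x increases fastest, then z, then y. */
    static int blockIndex(int x, int y, int z) {
        return (y << 8) | (z << 4) | x;
    }

    /** Reads a 4-bit light value: even indices use the low bits, odd indices the high bits. */
    static int getNibble(byte[] light, int index) {
        int b = light[index >> 1] & 0xFF;
        return (index & 1) == 0 ? (b & 0x0F) : (b >> 4);
    }

    /** Writes a 4-bit light value without disturbing the other half of the byte. */
    static void setNibble(byte[] light, int index, int value) {
        int i = index >> 1;
        if ((index & 1) == 0) {
            light[i] = (byte) ((light[i] & 0xF0) | (value & 0x0F));
        } else {
            light[i] = (byte) ((light[i] & 0x0F) | ((value & 0x0F) << 4));
        }
    }
}
```

Since x occupies the lowest bit of the index, an even index always corresponds to an even x coordinate, which matches the description of the light arrays.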
+
<nowiki>*</nowiki>The Notchian client calculates the Bits Per Entry values for the Direct palette format at runtime based on the sizes of the block state and biome registries. As such, the value used for biomes is entirely dependent on the contents of the biome registry sent in the [[Registry Data]] packet; the value shown is only valid for vanilla servers with no custom data packs. If the BPE requirement for Direct is less than or equal to the maximum for Indirect, Direct will never be used given BPE values within the valid range.
 +
 
 +
<nowiki>**</nowiki>Similarly, if a sufficiently large number of blocks is added with mods, the value will be increased to compensate for the increased ID count. This increase can go up to 31 bits per entry (since registry IDs are signed integers). In case of Minecraft Forge, you can get the number of blocks with the "Number of ids" field found in the [[Minecraft Forge Handshake#RegistryData|RegistryData packet in the Forge Handshake]].
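One way to read the table and the rounding rule above is sketched below. All names are hypothetical, and the Direct width is assumed to be the base-2 logarithm of the registry size, rounded up; the thresholds are passed in rather than hardcoded, since they differ between blocks and biomes.

```java
// Hypothetical sketch of palette format selection; thresholds come from the table above.
final class PaletteFormats {
    enum Format { SINGLE_VALUED, INDIRECT, DIRECT }

    /** Assumed Direct width: ceil(log2(registrySize)). */
    static int directBits(int registrySize) {
        return registrySize <= 1 ? 0 : 32 - Integer.numberOfLeadingZeros(registrySize - 1);
    }

    /** Rounds a received Bits Per Entry value to the width that is actually used. */
    static int effectiveBits(int bitsPerEntry, int minIndirect, int maxIndirect, int directBits) {
        if (bitsPerEntry == 0) {
            return 0;
        }
        if (bitsPerEntry <= maxIndirect) {
            return Math.max(bitsPerEntry, minIndirect); // rounded upwards to the Indirect minimum
        }
        return directBits; // anything above the Indirect range is treated as Direct
    }

    static Format formatFor(int bitsPerEntry, int maxIndirect) {
        if (bitsPerEntry == 0) {
            return Format.SINGLE_VALUED;
        }
        return bitsPerEntry <= maxIndirect ? Format.INDIRECT : Format.DIRECT;
    }
}
```

For blocks the calls would use <code>minIndirect = 4</code>, <code>maxIndirect = 8</code> and <code>directBits = 15</code>; for biomes on a vanilla server, <code>1</code>, <code>3</code> and <code>6</code>.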
 +
 
 +
==== Single valued ====
 +
 
 +
When this palette is used, the Data Array sent/received is empty, since entries can be inferred from the palette's single value.
 +
However, the length of the Data Array is still included, even though it's always 0.
  
The Data Array, although varying in length, never needs padding, because the number of blocks in a section (4096) is evenly divisible by 64, the number of bits in a long.
+
{| class="wikitable"
 +
|- {{added}}
 +
! Field Name
 +
! Field Type
 +
! Notes
 +
|-
 +
| Value
 +
| {{Type|VarInt}}
 +
| ID of the corresponding entry in its registry.
 +
|}
  
There are several values that can be used for the bits per block value. In most cases, invalid values will be interpreted as a different value when parsed by the notchian client, meaning that chunk data will be parsed incorrectly if you use an invalid bits per block.  Servers must make sure that the bits per block value is correct.
+
==== Indirect ====
  
* up to 4: Blocks are encoded as 4 bits. The palette is used and sent.
+
This is an actual palette which lists the entries used. Values in the Data Array are indices into the palette, which in turn gives a proper registry ID.
* 5 to 8: Blocks are encoded with the given number of bits. The palette is used and sent.
 
* 9 and above: The palette is not sent. Blocks are encoded by their whole ID in the global palette, with bits per block being set as the base 2 logarithm of the number of block states, rounded up. For the current vanilla release, this is 13 bits per block.
 
  
The global palette encodes a block as 13 bits. It uses the {{Minecraft Wiki|Data values#Block IDs|block ID}} for the first 9 bits, and the block damage value for the last 4 bits. For example, Diorite (block ID <code>1</code> for <code>minecraft:stone</code> with damage <code>3</code>) would be encoded as <code>000000001&nbsp;0011</code>. If a block is not found in the global palette (either due to not having a valid damage value or due to not being a valid ID), it will be treated as air.
+
{| class="wikitable"
 +
|-
 +
! Field Name
 +
! Field Type
 +
! Notes
 +
|-
 +
| Palette Length
 +
| {{Type|VarInt}}
 +
| Number of elements in the following array.
 +
|-
 +
| Palette
 +
| {{Type|Array}} of {{Type|VarInt}}
 +
| Mapping of IDs in the registry to indices of this array.
 +
|}
  
If Minecraft Forge is installed and a sufficiently large number of blocks are added, the bits per block value for the global palette will be increased to compensate for the increased ID count.  This increase can go up to 16 bits per block (for a total of 4096 block IDs; when combined with the 16 damage values, there are 65536 total states).  You can get the number of blocks with the "Number of ids" field found in the [[Minecraft Forge Handshake#RegistryData|RegistryData packet in the Forge Handshake]].
+
==== Direct ====
  
The data array stores several entries within a single long, and sometimes overlaps one entry between multiple longs.  For a bits per block value of 13, the data is stored such that bits 1 through 13 are the first entry, 14 through 26 are the second, and so on. Note that bit 1 is the ''least'' significant bit in this case, not the most significant bit. The same behavior applies when a value stretches between two longs: for instance, block 5 would be bits 53 through 64 of the first long and then bit 65, which is the least significant bit of the second long.
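The layout just described (entries packed back to back, least significant bit first, and allowed to straddle two longs) can be sketched as follows. The class and method names are hypothetical, and indices are zero-based, so "block 5" in the text above is index 4 here.

```java
// Hypothetical sketch of the packed layout in which entries may span two longs.
final class SpanningDataArray {
    static int get(long[] data, int bitsPerBlock, int index) {
        int bitIndex = index * bitsPerBlock;
        int startLong = bitIndex >>> 6;   // which long the entry starts in
        int startOffset = bitIndex & 63;  // bit position within that long
        int endLong = (bitIndex + bitsPerBlock - 1) >>> 6;
        long mask = (1L << bitsPerBlock) - 1;
        long value = data[startLong] >>> startOffset;
        if (startLong != endLong) {
            // The remaining high bits live in the low bits of the next long
            value |= data[endLong] << (64 - startOffset);
        }
        return (int) (value & mask);
    }

    static void set(long[] data, int bitsPerBlock, int index, int value) {
        int bitIndex = index * bitsPerBlock;
        int startLong = bitIndex >>> 6;
        int startOffset = bitIndex & 63;
        int endLong = (bitIndex + bitsPerBlock - 1) >>> 6;
        long mask = (1L << bitsPerBlock) - 1;
        long v = value & mask;
        data[startLong] = (data[startLong] & ~(mask << startOffset)) | (v << startOffset);
        if (startLong != endLong) {
            int written = 64 - startOffset; // bits already stored in the first long
            data[endLong] = (data[endLong] & ~(mask >>> written)) | (v >>> written);
        }
    }
}
```

With 13 bits per block, a full section needs 4096 × 13 / 64 = 832 longs. Setting index 4 to <code>0x1FFF</code> in an otherwise empty array fills the top 12 bits of the first long and the lowest bit of the second.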
+
Registry IDs are stored directly as entries in the Data Array.
 +
 
 +
{| class="wikitable"
 +
  |-
 +
! Field Name
 +
! Field Type
 +
! Notes
 +
|-
 +
  |colspan="3"| ''no fields''
 +
  |}
  
 
==== Example ====
 
  
13 bits per block, using the global palette.
+
Here is an example showing a Chunk Section using a single-valued palette for block states, and an indirect palette with 2 indices for biomes:
 +
 
 +
<code
 +
><span style="border: 2px solid red">00 00</span
 +
><span style="border: 2px solid orange">00</span
 +
><span style="border: 2px solid lime">00</span
 +
><span style="border: 2px solid green">00</span
 +
><span style="border: 2px solid orange">01</span
 +
><span style="border: 2px solid yellow">02</span
 +
><span style="border: 2px solid lime">27 03</span
 +
><span style="border: 2px solid green">01</span
 +
><span style="border: 2px solid aqua">CC FF CC FF CC FF CC FF</span
 +
></code>
 +
 
 +
The first two bytes <span style="border: 2px solid red">00 00</span> are the number of non-air blocks in the chunk section.
 +
They are followed by the Bits Per Entry <span style="border: 2px solid orange">00</span>, which is zero so we know the palette will have one element (not prefixed with length). This single element is the block state ID of air, <span style="border: 2px solid lime">00</span>. Next there is the length of the long array, which is always <span style="border: 2px solid green">00</span> for single-valued palettes.
 +
 
 +
The second part of the section is for biomes. The first byte is their Bits Per Entry <span style="border: 2px solid orange">01</span>, followed by the length of the palette <span style="border: 2px solid yellow">02</span> and the two elements <span style="border: 2px solid lime">27 03</span>. The biome data array contains <span style="border: 2px solid green">01</span> long (8 bytes), namely <span style="border: 2px solid aqua">CC FF CC FF CC FF CC FF</span>.
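
As a rough illustration, the example bytes can be decoded with a short Python sketch. The helper names <code>read_varint</code> and <code>read_paletted_container</code> are hypothetical, and the sketch assumes only the single-valued and indirect palette formats that appear in this example; the direct format is not handled.

```python
import struct

def read_varint(buf, pos):
    """Decode a protocol VarInt starting at pos; return (value, new_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7

def read_paletted_container(buf, pos):
    """Read one container (single-valued or indirect palettes only).

    Returns (bits_per_entry, palette, longs, new_pos).
    """
    bits = buf[pos]
    pos += 1
    if bits == 0:
        # Single-valued: one VarInt, not prefixed with a length
        value, pos = read_varint(buf, pos)
        palette = [value]
    else:
        # Indirect: VarInt length followed by that many VarInt entries
        length, pos = read_varint(buf, pos)
        palette = []
        for _ in range(length):
            value, pos = read_varint(buf, pos)
            palette.append(value)
    count, pos = read_varint(buf, pos)
    # Longs are big endian on the wire
    longs = list(struct.unpack_from(f">{count}Q", buf, pos))
    pos += 8 * count
    return bits, palette, longs, pos

# The example section, field by field
section = bytes.fromhex("0000" "00" "00" "00" "01" "02" "2703" "01" "CCFFCCFFCCFFCCFF")

(block_count,) = struct.unpack_from(">h", section, 0)
_, block_palette, block_longs, pos = read_paletted_container(section, 2)
biome_bits, biome_palette, biome_longs, pos = read_paletted_container(section, pos)

# 4x4x4 = 64 biome entries, read from the least significant bit upward
mask = (1 << biome_bits) - 1
biomes = [biome_palette[(biome_longs[0] >> (i * biome_bits)) & mask]
          for i in range(64)]
```

Here every block resolves to the single palette entry (air), while each biome entry is a 1-bit index selecting one of the two biome palette entries.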
 +
 
 +
=== Data Array format ===
  
The following two longs would represent...
+
The Data Array stores entries as Bits Per Entry&ndash;bit integers, corresponding to either palette indices or registry IDs depending on the palette format in use. If Bits Per Entry is 0, it is empty.
  
<code>1001880C0060020</code> =
+
Entries are stored in order of increasing x coordinate, within rows at increasing z coordinates, within layers at increasing y coordinates. In other words, if the Data Array were a multidimensional array in C (modulo the packed encoding), it would be indexed <code style="white-space: pre">array[y][z][x]</code>.
<code><span style="outline: solid 2px hsl(160, 90%, 60%); outline-left-style: dashed">00000001</span><span style="outline: solid 2px hsl(160, 90%, 70%)">0000</span><span style="outline: solid 2px hsl(120, 90%, 60%)">000000011</span><span style="outline: solid 2px hsl(120, 90%, 70%)">0001</span><span style="outline: solid 2px hsl(80, 90%, 60%)">000000011</span><span style="outline: solid 2px hsl(80, 90%, 70%)">0000</span><span style="outline: solid 2px hsl(40, 90%, 60%)">000000011</span><span style="outline: solid 2px hsl(50, 90%, 70%)">0000</span><span style="outline: solid 2px hsl(0, 90%, 65%)">000000010</span><span style="outline: solid 2px hsl(0, 90%, 75%)">0000</span></code><br>
 
<code>200D0068004C020</code> = <code><span style="outline: solid 2px rgb(60%, 60%, 60%); outline-left-style: dashed">0000001</span><span style="outline: solid 2px rgb(70%, 70%, 70%)">0000</span><span style="outline: solid 2px hsl(320, 90%, 60%)">000001101</span><span style="outline: solid 2px hsl(320, 90%, 70%)">0000</span><span style="outline: solid 2px hsl(280, 90%, 60%)">000001101</span><span style="outline: solid 2px hsl(280, 90%, 75%)">0000</span><span style="outline: solid 2px hsl(240, 90%, 65%)">000000001</span><span style="outline: solid 2px hsl(240, 90%, 75%)">0011</span><span style="outline: solid 2px hsl(200, 90%, 65%)">000000001</span><span style="outline: solid 2px hsl(200, 90%, 70%)">0000</span><span style="outline: solid 2px hsl(160, 90%, 60%); outline-right-style: dashed">0</span></code>
 
  
9 blocks, with the start of a 10th (that would be finished in the next long).
+
A single long of the array holds several entries. The entries are tightly packed within the long, with the first entry on the least significant bits. An entry cannot span across multiple longs; instead, padding is inserted as required, starting from the most significant bits.
  
#Grass, <span style="border: solid 2px hsl(0, 90%, 65%)">2</span>:<span style="border: solid 2px hsl(0, 90%, 75%)">0</span>
+
For example, assuming a bits per block value of 15, and that bit 0 is the least significant bit, the data is stored such that bits 0 through 14 are the first entry, 15 through 29 are the second, and so on. The fourth entry ends on bit 59, and since only 4 bits are left, they become padding, and the fifth entry starts on the next long.
#Dirt, <span style="border: solid 2px hsl(40, 90%, 60%)">3</span>:<span style="border: solid 2px hsl(40, 90%, 70%)">0</span>
 
#Dirt, <span style="border: solid 2px hsl(80, 90%, 60%)">3</span>:<span style="border: solid 2px hsl(80, 90%, 70%)">0</span>
 
#Coarse dirt, <span style="border: solid 2px hsl(120, 90%, 60%)">3</span>:<span style="border: solid 2px hsl(120, 90%, 70%)">1</span>
 
#Stone, <span style="border: solid 2px hsl(160, 90%, 60%)">1</span>:<span style="border: solid 2px hsl(160, 90%, 70%)">0</span>
 
#Stone, <span style="border: solid 2px hsl(200, 90%, 60%)">1</span>:<span style="border: solid 2px hsl(200, 90%, 70%)">0</span>
 
#Diorite, <span style="border: solid 2px hsl(240, 90%, 65%)">1</span>:<span style="border: solid 2px hsl(240, 90%, 75%)">3</span>
 
#Gravel, <span style="border: solid 2px hsl(280, 90%, 65%)">13</span>:<span style="border: solid 2px hsl(280, 90%, 75%)">0</span>
 
#Gravel, <span style="border: solid 2px hsl(320, 90%, 60%)">13</span>:<span style="border: solid 2px hsl(320, 90%, 70%)">0</span>
 
#Stone, <span style="border: solid 2px rgb(60%, 60%, 60%)">1</span>:<span style="border: solid 2px rgb(70%, 70%, 70%)">0</span> (or potentially emerald ore, <span style="border: solid 2px rgb(60%, 60%, 60%)">129</span>:<span style="border: solid 2px rgb(70%, 70%, 70%)">0</span>)
 
  
=== Biomes ===
+
Note that since longs are sent in big endian order, the least significant bit of the first entry in a long will be on the ''last'' byte of the long on the wire.
  
The biomes array is only present when ground-up continuous is set to true. Biomes cannot be changed unless a chunk is re-sent.
+
{{Warning2|This format was changed in Minecraft 1.16. In prior versions, entries could cross long boundaries, and there was no padding.}}
  
The structure is an array of 256 bytes, each representing a {{Minecraft Wiki|Biome/ID|Biome ID}} (it is recommended that 127 for "Void" is used if there is no set biome).  The array is indexed by <code>z * 16 | x</code>.
+
==== Visual example ====
  
== Tips ==
+
5 bits per block, containing the following references to entries in a palette (not shown):
 +
<code
 +
><span style="border: solid 2px hsl(  0, 90%, 60%); margin-left: -2px; padding: 0 1px">1</span
 +
><span style="border: solid 2px hsl( 30, 90%, 60%); margin-left: -2px; padding: 0 1px">2</span
 +
><span style="border: solid 2px hsl( 60, 90%, 60%); margin-left: -2px; padding: 0 1px">2</span
 +
><span style="border: solid 2px hsl( 90, 90%, 60%); margin-left: -2px; padding: 0 1px">3</span
 +
><span style="border: solid 2px hsl(120, 90%, 60%); margin-left: -2px; padding: 0 1px">4</span
 +
><span style="border: solid 2px hsl(150, 90%, 60%); margin-left: -2px; padding: 0 1px">4</span
 +
><span style="border: solid 2px hsl(180, 90%, 60%); margin-left: -2px; padding: 0 1px">5</span
 +
><span style="border: solid 2px hsl(210, 90%, 60%); margin-left: -2px; padding: 0 1px">6</span
 +
><span style="border: solid 2px hsl(240, 90%, 60%); margin-left: -2px; padding: 0 1px">6</span
 +
><span style="border: solid 2px hsl(270, 90%, 60%); margin-left: -2px; padding: 0 1px">4</span
 +
><span style="border: solid 2px hsl(300, 90%, 60%); margin-left: -2px; padding: 0 1px">8</span
 +
><span style="border: solid 2px hsl(330, 90%, 60%); margin-left: -2px; padding: 0 1px">0</span
 +
><span style="border: solid 2px hsl(  0, 90%, 30%); margin-left: -2px; padding: 0 1px">7</span
 +
><span style="border: solid 2px hsl( 30, 90%, 30%); margin-left: -2px; padding: 0 1px">4</span
 +
><span style="border: solid 2px hsl( 60, 90%, 30%); margin-left: -2px; padding: 0 1px">3</span
 +
><span style="border: solid 2px hsl( 90, 90%, 30%); margin-left: -2px; padding: 0 1px">13</span
 +
><span style="border: solid 2px hsl(120, 90%, 30%); margin-left: -2px; padding: 0 1px">15</span
 +
><span style="border: solid 2px hsl(150, 90%, 30%); margin-left: -2px; padding: 0 1px">16</span
 +
><span style="border: solid 2px hsl(180, 90%, 30%); margin-left: -2px; padding: 0 1px">9</span
 +
><span style="border: solid 2px hsl(210, 90%, 30%); margin-left: -2px; padding: 0 1px">14</span
 +
><span style="border: solid 2px hsl(240, 90%, 30%); margin-left: -2px; padding: 0 1px">10</span
 +
><span style="border: solid 2px hsl(270, 90%, 30%); margin-left: -2px; padding: 0 1px">12</span
 +
><span style="border: solid 2px hsl(300, 90%, 30%); margin-left: -2px; padding: 0 1px">0</span
 +
><span style="border: solid 2px hsl(330, 90%, 30%); margin:    0 -2px; padding: 0 1px">2</span
 +
></code>
 +
 
 +
<code>0020863148418841</code><code
 +
><span style="border: dashed 2px black;            margin-left: -2px">0000</span
 +
><span style="border: solid 2px hsl(330, 90%, 60%); margin-left: -2px">00000</span
 +
><span style="border: solid 2px hsl(300, 90%, 60%); margin-left: -2px">01000</span
 +
><span style="border: solid 2px hsl(270, 90%, 60%); margin-left: -2px">00100</span
 +
><span style="border: solid 2px hsl(240, 90%, 60%); margin-left: -2px">00110</span
 +
><span style="border: solid 2px hsl(210, 90%, 60%); margin-left: -2px">00110</span
 +
><span style="border: solid 2px hsl(180, 90%, 60%); margin-left: -2px">00101</span
 +
><span style="border: solid 2px hsl(150, 90%, 60%); margin-left: -2px">00100</span
 +
><span style="border: solid 2px hsl(120, 90%, 60%); margin-left: -2px">00100</span
 +
><span style="border: solid 2px hsl( 90, 90%, 60%); margin-left: -2px">00011</span
 +
><span style="border: solid 2px hsl( 60, 90%, 60%); margin-left: -2px">00010</span
 +
><span style="border: solid 2px hsl( 30, 90%, 60%); margin-left: -2px">00010</span
 +
><span style="border: solid 2px hsl(  0, 90%, 60%); margin:    0 -2px">00001</span
 +
></code></br>
 +
<code>01018A7260F68C87</code><code
 +
><span style="border: dashed 2px black;            margin-left: -2px">0000</span
 +
><span style="border: solid 2px hsl(330, 90%, 30%); margin-left: -2px">00010</span
 +
><span style="border: solid 2px hsl(300, 90%, 30%); margin-left: -2px">00000</span
 +
><span style="border: solid 2px hsl(270, 90%, 30%); margin-left: -2px">01100</span
 +
><span style="border: solid 2px hsl(240, 90%, 30%); margin-left: -2px">01010</span
 +
><span style="border: solid 2px hsl(210, 90%, 30%); margin-left: -2px">01110</span
 +
><span style="border: solid 2px hsl(180, 90%, 30%); margin-left: -2px">01001</span
 +
><span style="border: solid 2px hsl(150, 90%, 30%); margin-left: -2px">10000</span
 +
><span style="border: solid 2px hsl(120, 90%, 30%); margin-left: -2px">01111</span
 +
><span style="border: solid 2px hsl( 90, 90%, 30%); margin-left: -2px">01101</span
 +
><span style="border: solid 2px hsl( 60, 90%, 30%); margin-left: -2px">00011</span
 +
><span style="border: solid 2px hsl( 30, 90%, 30%); margin-left: -2px">00100</span
 +
><span style="border: solid 2px hsl(  0, 90%, 30%); margin:    0 -2px">00111</span
 +
></code>
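
The packing shown in this visual example can be sketched in Python as follows. This is a minimal sketch of the 1.16+ layout, in which entries never span two longs; the function names are hypothetical, and the code assumes every entry already fits in <code>bits_per_entry</code> bits.

```python
import struct

def pack_entries(entries, bits_per_entry):
    """Pack entries into 64-bit longs without letting an entry span two longs."""
    per_long = 64 // bits_per_entry
    longs = []
    for start in range(0, len(entries), per_long):
        value = 0
        for i, entry in enumerate(entries[start:start + per_long]):
            # The first entry goes on the least significant bits
            value |= entry << (i * bits_per_entry)
        longs.append(value)
    return longs

def unpack_entries(longs, bits_per_entry, count):
    """Inverse of pack_entries; unused high bits of each long are padding."""
    per_long = 64 // bits_per_entry
    mask = (1 << bits_per_entry) - 1
    return [
        (longs[i // per_long] >> ((i % per_long) * bits_per_entry)) & mask
        for i in range(count)
    ]

def longs_needed(entry_count, bits_per_entry):
    """Number of longs required for entry_count entries, padding included."""
    per_long = 64 // bits_per_entry
    return -(-entry_count // per_long)  # ceiling division

# The 24 palette indices from the visual example, at 5 bits per entry
entries = [1, 2, 2, 3, 4, 4, 5, 6, 6, 4, 8, 0,
           7, 4, 3, 13, 15, 16, 9, 14, 10, 12, 0, 2]
longs = pack_entries(entries, 5)

# Longs are big endian on the wire, so the first entry sits in the last byte
wire = struct.pack(">2Q", *longs)
```

With 5 bits per entry, 12 entries fit in each long and the top 4 bits are padding, so <code>longs</code> matches the two hexadecimal values shown above. For a full 4096-entry section at 15 bits per entry, <code>longs_needed</code> gives 1024.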
 +
 
 +
== Tips and notes ==
  
 
There are several things that can make it easier to implement this format.
  
* The <code>13</code> value for full bits per block is likely to change in the future, so it should not be hardcoded (instead, it should either be calculated or left as a constant).
+
* Servers do <em>not</em> need to implement the palette initially (instead always using 15 bits per block), although it is an important optimization later on.
* Servers do <em>not</em> need to implement the palette initially (instead always using 13 bits per block), although it is an important optimization later on.
+
* The Notchian server implementation does not send values that are out of bounds for the palette.  If such a value is received, the format is being parsed incorrectly.  In particular, if you're reading a number with all bits set (15, 31, etc), you might be reading skylight data (or you may have a sign error and you're reading negative numbers).
* The Notchian server implementation does not send values that are out of bounds for the palette.  If such a value is received, the format is being parsed incorrectly.  In particular, if you're reading a number with all bits set (15, 31, etc), then you're probably reading sky light data.
+
* The Notchian client generally does not render chunks that lack neighbors.  (As of 1.20.2 such chunks appear to sporadically become visible anyway, and do so consistently when interacted with.) This means that if you only send a fixed set of chunks with no empty chunks around them, then some of them will not be visible, although you can still interact with them.  This is intended behavior, so that lighting and connected blocks can be handled correctly.
* The number of longs needed for the data array can be calculated as ((16&times;16&times;16 blocks)&times;Bits per block)&divide;64 bits per long (which simplifies to 64&times;Bits per block).  For instance, 13 bits per block requires 832 longs.
 
  
 
== Sample implementations ==
 +
 +
{{Need Info|This sample code is missing the heightmap, biome changes and the changes from 1.16}}
  
 
How the chunk format can be implemented varies largely by how you want to read/write it.  It is often easier to read/write the data long-by-long instead of pre-creating the data to write; however, storing the chunk data arrays in their packed form can be far more efficient memory- and performance-wise.  These implementations are simple versions that can work as a base (especially for dealing with the bit shifting), but are not ideal.
 +
 +
=== Shared code ===
 +
 +
This is some basic pseudocode that shows the various types of palettes.  It does not handle actually populating the palette based on data in a chunk section; this is left to the implementer, since there are many ways of doing so.  (This does not apply to the direct version.)
 +
 +
<syntaxhighlight lang="csharp">
 +
private uint GetGlobalPaletteIDFromState(BlockState state) {
 +
    // Implementation left to the user; see Data Generators for more info on the values
 +
}
 +
 +
private BlockState GetStateFromGlobalPaletteID(uint value) {
 +
    // Implementation left to the user; see Data Generators for more info on the values
 +
}
 +
 +
public interface Palette {
 +
    uint IdForState(BlockState state);
 +
    BlockState StateForId(uint id);
 +
    byte GetBitsPerBlock();
 +
    void Read(Buffer data);
 +
    void Write(Buffer data);
 +
}
 +
 +
public class IndirectPalette : Palette {
 +
    Map<uint, BlockState> idToState;
 +
    Map<BlockState, uint> stateToId;
 +
    byte bitsPerBlock;
 +
 +
    public IndirectPalette(byte palBitsPerBlock) {
 +
        bitsPerBlock = palBitsPerBlock;
 +
    }
 +
 +
    public uint IdForState(BlockState state) {
 +
        return stateToId.Get(state);
 +
    }
 +
 +
    public BlockState StateForId(uint id) {
 +
        return idToState.Get(id);
 +
    }
 +
 +
    public byte GetBitsPerBlock() {
 +
        return bitsPerBlock;
 +
    }
 +
 +
    public void Read(Buffer data) {
 +
        idToState = new Map<>();
 +
        stateToId = new Map<>();
 +
        // Palette Length
 +
        int length = ReadVarInt(data);
 +
        // Palette
 +
        for (int id = 0; id < length; id++) {
 +
            uint stateId = ReadVarInt(data);
 +
            BlockState state = GetStateFromGlobalPaletteID(stateId);
 +
            idToState.Set(id, state);
 +
            stateToId.Set(state, id);
 +
        }
 +
    }
 +
 +
    public void Write(Buffer data) {
 +
        Assert(idToState.Size() == stateToId.Size()); // both should be equivalent
 +
        // Palette Length
 +
        WriteVarInt(data, idToState.Size());
 +
        // Palette
 +
        for (int id = 0; id < idToState.Size(); id++) {
 +
            BlockState state = idToState.Get(id);
 +
            uint stateId = GetGlobalPaletteIDFromState(state);
 +
            WriteVarInt(data, stateId);
 +
        }
 +
    }
 +
}
 +
 +
public class DirectPalette : Palette {
 +
    public uint IdForState(BlockState state) {
 +
        return GetGlobalPaletteIDFromState(state);
 +
    }
 +
 +
    public BlockState StateForId(uint id) {
 +
        return GetStateFromGlobalPaletteID(id);
 +
    }
 +
 +
    public byte GetBitsPerBlock() {
 +
        return Ceil(Log2(BlockState.TotalNumberOfStates)); // currently 15
 +
    }
 +
 +
    public void Read(Buffer data) {
 +
        // No Data
 +
    }
 +
 +
    public void Write(Buffer data) {
 +
        // No Data
 +
    }
 +
}
 +
 +
public Palette ChoosePalette(byte bitsPerBlock) {
 +
    if (bitsPerBlock <= 4) {
 +
        return new IndirectPalette(4);
 +
    } else if (bitsPerBlock <= 8) {
 +
        return new IndirectPalette(bitsPerBlock);
 +
    } else {
 +
        return new DirectPalette();
 +
    }
 +
}
 +
</syntaxhighlight>
  
 
=== Deserializing ===
Line 275: Line 506:
  
 
private void ReadChunkColumn(Chunk chunk, bool full, int mask, Buffer data) {
 
private void ReadChunkColumn(Chunk chunk, bool full, int mask, Buffer data) {
     for (int sectionY = 0; sectionY < CHUNK_HEIGHT / SECTION_HEIGHT; y++) {
+
     for (int sectionY = 0; sectionY < (CHUNK_HEIGHT / SECTION_HEIGHT); sectionY++) {
 
         if ((mask & (1 << sectionY)) != 0) {  // Is the given bit set in the mask?
 
         if ((mask & (1 << sectionY)) != 0) {  // Is the given bit set in the mask?
 
             byte bitsPerBlock = ReadByte(data);
 
             byte bitsPerBlock = ReadByte(data);
 
+
             Palette palette = ChoosePalette(bitsPerBlock);
             // Excessively specific format that exactly matches the client logic
+
             palette.Read(data);
            // This extra checking makes sense on the server side, but client
 
            // side it only is needed when dealing with servers sending incorrect packets
 
            // (the notchian server will not send such packets)
 
            if (bitsPerBlock < 4) {
 
                bitsPerBlock = 4;
 
            }
 
            if (bitsPerBlock > 8) {
 
                bitsPerBlock = FULL_SIZE_BITS_PER_BLOCK; // 13, currently, but liable to eventually change
 
            }
 
 
 
             bool usePalette = (bitsPerBlock <= 8)
 
 
 
            int[] palette = null;
 
            if (usePalette) {
 
                int numPaletteEntries = ReadVarInt(data);
 
                palette = new int[numPaletteEntries];
 
                for (int i = 0; i < numPaletteEntries; i++) {
 
                    palette[i] = ReadVarInt(data);
 
                }
 
            } else {
 
                ReadVarInt(data);  // Should always be 0
 
            }
 
  
 
             // A bitmask that contains bitsPerBlock set bits
 
             // A bitmask that contains bitsPerBlock set bits
Line 314: Line 523:
 
                 for (int z = 0; z < SECTION_WIDTH; z++) {
 
                 for (int z = 0; z < SECTION_WIDTH; z++) {
 
                     for (int x = 0; x < SECTION_WIDTH; x++) {
 
                     for (int x = 0; x < SECTION_WIDTH; x++) {
                         int blockNumber = (((blockY * SECTION_HEIGHT) + blockZ) * SECTION_WIDTH) + blockX;
+
                         int blockNumber = (((y * SECTION_HEIGHT) + z) * SECTION_WIDTH) + x;
 
                         int startLong = (blockNumber * bitsPerBlock) / 64;
 
                         int startLong = (blockNumber * bitsPerBlock) / 64;
 
                         int startOffset = (blockNumber * bitsPerBlock) % 64;
 
                         int startOffset = (blockNumber * bitsPerBlock) % 64;
Line 324: Line 533:
 
                         } else {
 
                         } else {
 
                             int endOffset = 64 - startOffset;
 
                             int endOffset = 64 - startOffset;
                             blockId = (uint)(dataArray[startLong] >> startOffset | dataArray[endLong] << endOffset);
+
                             data = (uint)(dataArray[startLong] >> startOffset | dataArray[endLong] << endOffset);
 
                         }
 
                         }
 
                         data &= individualValueMask;
 
                         data &= individualValueMask;
  
                         if (usePalette) {
+
                         // data should always be valid for the palette
                            // data should always be within the palette length
+
                        // If you're reading a power of 2 minus one (15, 31, 63, 127, etc...) that's out of bounds,
                            // If you're reading a power of 2 minus one (15, 31, 63, 127, etc...) that's out of bounds,
+
                        // you're probably reading light data instead
                            // you're probably reading light data instead
 
                            data = palette[data];
 
                        }
 
  
                         BlockState state = GetStateFromGlobalPaletteID(data);
+
                         BlockState state = palette.StateForId(data);
 
                         section.SetState(x, y, z, state);
 
                         section.SetState(x, y, z, state);
 
                     }
 
                     }
Line 374: Line 580:
 
     for (int z = 0; z < SECTION_WIDTH; z++) {
 
     for (int z = 0; z < SECTION_WIDTH; z++) {
 
         for (int x = 0; x < SECTION_WIDTH; x++) {
 
         for (int x = 0; x < SECTION_WIDTH; x++) {
             chunk.SetBiome(x, z, ReadByte(data));
+
             chunk.SetBiome(x, z, ReadInt(data));
 
         }
 
         }
 
     }
 
     }
}
 
 
// Value should already have gone through the section palette
 
private BlockState GetStateFromGlobalPaletteID(uint value) {
 
    // This method is subject to change in future MC versions
 
 
    byte metadata = data & 0xF;
 
    uint id = data >> 4;
 
 
    return BlockState.ForIDAndMeta(id, metadata);
 
 
}
 
}
 
</syntaxhighlight>
 
</syntaxhighlight>
Line 404: Line 600:
 
     int mask = 0;
 
     int mask = 0;
 
     Buffer columnBuffer = new Buffer();
 
     Buffer columnBuffer = new Buffer();
     for (int sectionY = 0; sectionY < CHUNK_HEIGHT / SECTION_HEIGHT; y++) {
+
     for (int sectionY = 0; sectionY < (CHUNK_HEIGHT / SECTION_HEIGHT); sectionY++) {
 
         if (!chunk.IsSectionEmpty(sectionY)) {
 
         if (!chunk.IsSectionEmpty(sectionY)) {
 
             mask |= (1 << sectionY);  // Set that bit to true in the mask
Line 412: Line 608:
 
     for (int z = 0; z < SECTION_WIDTH; z++) {
 
     for (int z = 0; z < SECTION_WIDTH; z++) {
 
         for (int x = 0; x < SECTION_WIDTH; x++) {
 
         for (int x = 0; x < SECTION_WIDTH; x++) {
             WriteByte(columnBuffer, chunk.GetBiome(x, z));  // Use 127 for 'void' if your server doesn't support biomes
+
             WriteInt(columnBuffer, chunk.GetBiome(x, z));  // Use 127 for 'void' if your server doesn't support biomes
 
         }
 
         }
 
     }
 
     }
Line 430: Line 626:
 
}
 
}
  
private void WriteChunkSection(ChunkSection section, Buffer data) {
+
private void WriteChunkSection(ChunkSection section, Buffer buf) {
     byte bitsPerBlock = FULL_SIZE_BITS_PER_BLOCK; // 13
+
    Palette palette = section.palette;
 +
     byte bitsPerBlock = palette.GetBitsPerBlock();
  
 
     WriteByte(buf, bitsPerBlock);
     WriteVarInt(data, 0); // Palette size is 0
+
     palette.Write(buf);
  
 
     int dataLength = (16*16*16) * bitsPerBlock / 64; // See tips section for an explanation of this calculation
 
     int dataLength = (16*16*16) * bitsPerBlock / 64; // See tips section for an explanation of this calculation
Line 445: Line 642:
 
         for (int z = 0; z < SECTION_WIDTH; z++) {
 
         for (int z = 0; z < SECTION_WIDTH; z++) {
 
             for (int x = 0; x < SECTION_WIDTH; x++) {
 
             for (int x = 0; x < SECTION_WIDTH; x++) {
                 int blockNumber = (((blockY * SECTION_HEIGHT) + blockZ) * SECTION_WIDTH) + blockX;
+
                 int blockNumber = (((y * SECTION_HEIGHT) + z) * SECTION_WIDTH) + x;
 
                 int startLong = (blockNumber * bitsPerBlock) / 64;
 
                 int startLong = (blockNumber * bitsPerBlock) / 64;
 
                 int startOffset = (blockNumber * bitsPerBlock) % 64;
 
                 int startOffset = (blockNumber * bitsPerBlock) % 64;
Line 452: Line 649:
 
                 BlockState state = section.GetState(x, y, z);
 
                 BlockState state = section.GetState(x, y, z);
  
                 uint value = GetGlobalPaletteIDFromState(state);
+
                 UInt64 value = palette.IdForState(state);
 
                 value &= individualValueMask;
 
                 value &= individualValueMask;
  
                 data[startLong] |= (Value << startOffset);
+
                 data[startLong] |= (value << startOffset);
  
 
                 if (startLong != endLong) {
 
                 if (startLong != endLong) {
Line 463: Line 660:
 
         }
 
         }
 
     }
 
     }
 +
 +
    WriteVarInt(buf, dataLength);
 +
    WriteLongArray(buf, data);
  
 
     for (int y = 0; y < SECTION_HEIGHT; y++) {
 
     for (int y = 0; y < SECTION_HEIGHT; y++) {
Line 473: Line 673:
 
         }
 
         }
 
     }
 
     }
 
    WriteVarInt(dataLength);
 
    WriteLongArray(data);
 
  
 
     if (currentDimension.HasSkylight()) { // IE, current dimension is overworld / 0
 
     if (currentDimension.HasSkylight()) { // IE, current dimension is overworld / 0
Line 488: Line 685:
 
         }
 
         }
 
     }
 
     }
}
 
 
private uint GetGlobalPaletteIDFromState(BlockState state) {
 
    // NOTE: This method will probably change in new versions
 
    byte metadata = state.getMetadata();
 
    uint id = section.GetBlockID(x, y, z);
 
 
    return id << 4 | metadata;
 
 
}
 
}
 
</syntaxhighlight>
 
</syntaxhighlight>
Line 501: Line 690:
 
== Full implementations ==
 
== Full implementations ==
  
* [https://github.com/GlowstoneMC/Glowstone/blob/master/src/main/java/net/glowstone/chunk/ChunkSection.java Java, 1.11.2, writing only, with palette]
+
* [https://github.com/GlowstoneMC/Glowstone/blob/dev/src/main/java/net/glowstone/chunk/ChunkSection.java Java, 1.12.2, writing only, with palette]
 +
* [https://github.com/feather-rs/feather/blob/main/feather/base/src/chunk.rs Rust, 1.16.5, with palette]
 
* [https://github.com/Steveice10/MCProtocolLib/blob/4ed72deb75f2acb0a81d641717b7b8074730f701/src/main/java/org/spacehq/mc/protocol/data/game/chunk/BlockStorage.java#L42 Java, 1.9, both sides]
 
* [https://github.com/Steveice10/MCProtocolLib/blob/4ed72deb75f2acb0a81d641717b7b8074730f701/src/main/java/org/spacehq/mc/protocol/data/game/chunk/BlockStorage.java#L42 Java, 1.9, both sides]
 +
* [https://github.com/barneygale/quarry Python, 1.7 through 1.13]. Read/write, paletted/unpaletted, [https://github.com/barneygale/quarry/blob/master/quarry/types/buffer/v1_7.py#L403 packets]/[https://github.com/barneygale/quarry/blob/master/quarry/types/chunk.py arrays]
 
* [https://github.com/SpockBotMC/SpockBot/blob/0535c31/spockbot/plugins/tools/smpmap.py#L144-L183 Python, 1.9, reading only]
 
* [https://github.com/SpockBotMC/SpockBot/blob/0535c31/spockbot/plugins/tools/smpmap.py#L144-L183 Python, 1.9, reading only]
 
* [https://github.com/Protryon/Osmium/blob/fdd61b9/MinecraftClone/src/ingame.c#L512-L632 C, 1.9, reading only]
 
* [https://github.com/Protryon/Osmium/blob/fdd61b9/MinecraftClone/src/ingame.c#L512-L632 C, 1.9, reading only]
 
* [https://github.com/Protryon/Basin/blob/master/basin/src/packet.c#L1124 C, 1.11.2, writing only]
 
* [https://github.com/Protryon/Basin/blob/master/basin/src/packet.c#L1124 C, 1.11.2, writing only]
* [https://github.com/cuberite/cuberite/blob/master/src/Protocol/ChunkDataSerializer.cpp#L140 C++, 1.11.2, writing only]
+
* [https://github.com/cuberite/cuberite/blob/master/src/Protocol/ChunkDataSerializer.cpp#L190 C++, 1.12.2, writing only]
 +
* [https://github.com/PrismarineJS/prismarine-chunk Node.js, 1.8->1.18]
 +
 
 +
== Sample data ==
 +
 
 +
* [https://gist.github.com/Pokechu22/0b89f928b381dede0387fe5f88faf8c0 some sample data] from 1.13.2, with both complete packets and just the data structures
 +
* [https://github.com/PrismarineJS/prismarine-chunk/tree/master/test prismarine test data] chunks from 1.8 to 1.20 used as testing data, generated using automated [https://github.com/PrismarineJS/minecraft-chunk-dumper chunk-dumper]
  
 
=== Old format ===
 
=== Old format ===
Line 514: Line 711:
 
* [https://github.com/GlowstoneMC/Glowstone/blob/d3ed79ea7d284df1d2cd1945bf53d5652962a34f/src/main/java/net/glowstone/GlowChunk.java#L640 Java, 1.8]
 
* [https://github.com/GlowstoneMC/Glowstone/blob/d3ed79ea7d284df1d2cd1945bf53d5652962a34f/src/main/java/net/glowstone/GlowChunk.java#L640 Java, 1.8]
 
* [https://github.com/barneygale/smpmap Python, 1.4]
 
* [https://github.com/barneygale/smpmap Python, 1.4]
* [https://github.com/PrismarineJS/prismarine-chunk Node.js, 1.8]
+
 
 +
 
 +
[[Category:Protocol Details]]
 +
[[Category:Minecraft Modern]]

Latest revision as of 07:45, 3 August 2024

This article describes in additional detail the format of the Chunk Data packet.

Concepts

Chunk columns and chunk sections

You've probably heard the term "chunk" before. Minecraft uses chunks to store and transfer world data. However, there are actually 2 different concepts that are both called "chunks" in different contexts: chunk columns and chunk sections.


A chunk column is a collection of blocks with a horizontal size of 16×16, spanning the entire buildable area on the vertical axis. This is what most players think of when they hear the term "chunk". However, these are not the smallest unit data is stored in in the game; chunk columns are vertically divided into chunk sections, each 16 blocks tall.

Chunk columns store block entities, entities, tick data, and an array of sections.


A chunk section is a 16×16×16 collection of blocks (chunk sections are cubic). This is the actual area that blocks are stored in, and is often the concept Mojang refers to via "chunk". Breaking columns into sections wouldn't be useful, except that you don't need to send all chunk sections in a column: If a section is empty, then it doesn't need to be sent (more on this later).

Chunk sections store blocks, biomes and light data (both block light and sky light). Additionally, they can be associated with at most two palettes—one for blocks, one for biomes. A chunk section can contain at maximum 4096 (16×16×16, or 2^12) unique block state IDs, and 64 (4×4×4) unique biome IDs (but, it is highly unlikely that such a section will occur in normal circumstances).

Chunk columns and chunk sections are both displayed when chunk border rendering is enabled (F3+G). Chunk column borders are indicated by the red vertical lines, while chunk section borders are indicated by the blue lines.

Registries

The registries are the primary, protocol-wide mappings from block states and biomes to numeric identifiers.

Block state registry

The block state registry is hardcoded into Minecraft, and can only be changed via modding. Such changes break protocol compatibility, and as such, modding frameworks typically include protocol extensions to negotiate which IDs the client and server have in common.

One block state ID is allocated for each unique block state of a block; if a block has multiple properties then the number of allocated states is the product of the number of values for each property. The block state IDs belonging to a given block are always consecutive. Other than that, the ordering of block states is hardcoded, and somewhat arbitrary.

The Data Generators system can be used to generate a list of all block state IDs.

Biome registry

The biome registry is defined at runtime in a Registry Data packet sent by the server during the Configuration phase.

The Notchian server pulls these biome definitions from data packs.

Palettes

Illustration of an indexed palette (Source)

A palette maps a smaller set of IDs within a chunk section to registry IDs. Other than skipping empty sections, correct use of palettes is the biggest place where data can be saved. For example, encoding any of the IDs in the block state registry as of vanilla 1.20.2 requires 15 bits. Given that most sections contain only a few different blocks, using 15 bits per block to represent a chunk section that is only stone, gravel, and air would be extremely wasteful. Instead, a list of registry IDs is sent (for instance, 40 57 0), and indices into that list—the palette—are sent as the block state or biome values within the chunk (so 40 would be sent as 0, 57 as 1, and 0 as 2).[concept note 1]
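The stone, gravel and air example above can be sketched as follows. This is a minimal illustration, assuming the palette is built in first-seen order; `build_palette` is a hypothetical helper name, not something used by the Notchian implementation.

```python
def build_palette(registry_ids):
    """Return (palette, indices) for a sequence of registry IDs.

    The palette lists each distinct registry ID once, in first-seen order.
    Any order is valid as long as the palette has no gaps.
    """
    palette = []
    index_of = {}
    indices = []
    for registry_id in registry_ids:
        if registry_id not in index_of:
            index_of[registry_id] = len(palette)
            palette.append(registry_id)
        indices.append(index_of[registry_id])
    return palette, indices

palette, indices = build_palette([40, 40, 57, 0, 40, 0])
print(palette)  # [40, 57, 0]
print(indices)  # [0, 0, 1, 2, 0, 2]
```

Resolving an entry back to a registry ID is then a plain list lookup: `palette[index]`.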

The number of bits used to encode palette indices varies based on the number of indices, and the registry in question. If a threshold on the number of unique IDs in the section is exceeded, a palette is not used, and registry IDs are used directly instead.

The concept of palettes is more commonly used with colors in an image; Wikipedia's articles on color look-up tables, indexed colors, and palettes in general may be helpful for fully grokking it.

Warning: The Notchian client (and server) store their chunk data in the compacted, paletted format. Sending non-compacted data not only wastes bandwidth but also increases client-side memory use. This is acceptable for an initial implementation, but compacting the block data as soon as possible is strongly encouraged.

Notes

  1. There is no requirement for IDs in a palette to be monotonic; the order within the list is entirely arbitrary and often depends on how the palette is built (if a stone block is found before an air block, stone can come first). In theory the order could be tuned for maximum DEFLATE compression, but this offers little to no gain and is generally not worth attempting. There shouldn't be any gaps in the palette, however, as gaps would increase its size when it is sent.

Packet structure

Packet ID: 0x20. State: Play. Bound To: Client.

| Field Name | Field Type | Notes |
|---|---|---|
| Chunk X | Int | Chunk coordinate (block coordinate divided by 16, rounded down). |
| Chunk Z | Int | Chunk coordinate (block coordinate divided by 16, rounded down). |
| Heightmaps | NBT | See #Heightmaps structure below. |
| Size | VarInt | Size of Data in bytes; in some cases this is larger than it needs to be (e.g. MC-131684, MC-247438) in which case extra bytes should be skipped before reading fields after Data. |
| Data | Byte Array | See #Data structure below. |
| Additional Data | Various | See Protocol#Chunk Data and Update Light. |
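"Divided by 16, rounded down" means floor division, which matters for negative coordinates. A minimal sketch (the function name `block_to_chunk` is hypothetical):

```python
def block_to_chunk(block_x, block_z):
    """Convert block coordinates to chunk coordinates.

    An arithmetic right shift by 4 is floor division by 16, so negative
    coordinates round toward negative infinity rather than toward zero.
    """
    return block_x >> 4, block_z >> 4

print(block_to_chunk(17, -1))  # (1, -1)
```

Truncating division (as `/` on integers behaves in C, Java or C#) would map block -1 to chunk 0 instead of chunk -1, which is a common source of off-by-one chunk errors.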

Heightmaps structure

Minecraft uses heightmaps to optimize various operations on both the server and the client. All heightmaps share the basic structure of encoding the position of the highest "occupied" block in each column of blocks within a chunk column. The differences have to do with which blocks are considered to be "occupied".

Rather than calculating them from the chunk data, the client receives the initial heightmaps it needs from the server. This trades an increase in network usage for a decrease in client-side processing. Once a chunk is loaded, the client updates its heightmaps based on block changes independently from the server.

No heightmaps are strictly required for the client to accept a chunk. If a heightmap is missing from a Chunk Data packet, the client will initialize it with all heights set to their minimum values. However, block changes will still cause the corresponding height values to be updated as normal.

The Heightmaps structure is an NBT Compound Tag containing a Long Array Tag element for each heightmap. The name of each Long Array is the name of the corresponding heightmap.

The height values of a heightmap are packed into the long array in the same manner described in #Data Array format, and ordered such that the fastest-increasing coordinate is x. (However, there are only 256 entries—one for each block column.) The Bits Per Entry value used is calculated as ceil(log2(world height + 1)). This is because the number of possible height values is one more than the world height—ranging from 0 (completely blank column; not even bedrock) to world height (highest position is occupied). Note that this means, for example, that a world with height 256 will use a Bits Per Entry of 9.
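The Bits Per Entry rule and the resulting array size can be sketched like this. It is only an illustration under the packing rules described in #Data Array format; the function names are hypothetical. For positive integers, `n.bit_length()` equals ceil(log2(n + 1)), which avoids floating-point rounding.

```python
def heightmap_bits_per_entry(world_height):
    """ceil(log2(world_height + 1)): heights range from 0 to world_height inclusive."""
    return world_height.bit_length()

def heightmap_long_count(world_height):
    """Number of longs needed for the 256 columns of one chunk column."""
    bits = heightmap_bits_per_entry(world_height)
    entries_per_long = 64 // bits        # entries never span two longs
    return -(-256 // entries_per_long)   # ceiling division

def heightmap_index(x, z):
    """Entry index for a block column; x is the fastest-increasing coordinate."""
    return z * 16 + x

print(heightmap_bits_per_entry(256))  # 9
print(heightmap_long_count(384))      # 37
```

With 9 bits per entry, 7 entries fit in each long, so 256 entries need 37 longs.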

The following heightmaps are currently used by the client:

| Name | Considers Occupied | Purposes |
|---|---|---|
| MOTION_BLOCKING | "Solid" blocks, except bamboo saplings and cactuses; fluids. | To determine where to display rain and snow. |
| WORLD_SURFACE | All blocks other than air, cave air and void air. | To determine if a beacon beam is obstructed. |

This list appears to be exhaustive as of 1.20.2.

Data structure

The data section of the packet contains most of the useful data for the chunk.

| Field Name | Field Type | Notes |
|---|---|---|
| Data | Array of Chunk Section | This array is NOT length-prefixed. The number of elements in the array is calculated based on the world's height. Sections are sent bottom-to-top. Starting with 1.18, the world height changes based on the dimension. The height of each dimension is assigned by the server in its corresponding registry data entry. For example, the vanilla overworld is 384 blocks tall, meaning 24 chunk sections will be included in this array. |

Chunk Section structure

The following information needs to be added to this page:
How do biomes work now? The biome change happened at the same time as the seed change, but it's not clear how/if biomes could be computed given that it's not the actual seed... (/r/mojira discussion which notes that it seems to be some kind of interpolation)

A Chunk Section is defined in terms of other data types. A Chunk Section consists of the following fields:

| Field Name | Field Type | Notes |
|---|---|---|
| Block count | Short | Number of non-air blocks present in the chunk section. "Non-air" is defined as any fluid and block other than air, cave air, and void air. The client will keep count of the blocks as they are broken and placed, and, if the block count reaches 0, the whole chunk section is not rendered, even if it still has blocks. |
| Block states | Paletted Container | Consists of 4096 entries, representing all the blocks in the chunk section. |
| Biomes | Paletted Container | Consists of 64 entries, representing 4×4×4 biome regions in the chunk section. |

Paletted Container structure

A Paletted Container is a palette-based storage of entries. Paletted Containers have an associated registry (either block states or biomes as of now), where values are mapped from. A Paletted Container consists of the following fields:

| Field Name | Field Type | Notes |
|---|---|---|
| Bits Per Entry | Unsigned Byte | Determines how many bits are used to encode entries. Note that not all numbers are valid here. |
| Palette | Varies | See #Palette formats below. |
| Data Array Length | VarInt | Number of longs in the following array. This value isn't entirely respected by the Notchian client. If it is smaller than expected, it will be overridden by the correct size calculated from Bits Per Entry. If too large, the client will read the specified number of longs, but silently discard all of them afterwards, resulting in a chunk filled with palette entry 0 (which appears to have been unintentional). Present but equal to 0 when Bits Per Entry is 0. |
| Data Array | Array of Long | See #Data Array format below. |
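The "correct size calculated from Bits Per Entry" can be estimated as below. This is a sketch that assumes the padded packing described under #Data Array format (no entry spans two longs); the function name is hypothetical.

```python
def expected_data_array_length(bits_per_entry, entry_count):
    """Number of longs needed for entry_count entries of bits_per_entry bits each."""
    if bits_per_entry == 0:
        return 0                              # single-valued: the array is empty
    entries_per_long = 64 // bits_per_entry   # leftover high bits are padding
    return -(-entry_count // entries_per_long)  # ceiling division

print(expected_data_array_length(4, 4096))   # 256  (block states, 4 bits)
print(expected_data_array_length(15, 4096))  # 1024 (block states, 15 bits)
print(expected_data_array_length(1, 64))     # 1    (biomes, 1 bit)
```

Use 4096 entries for block states and 64 for biomes.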

Palette formats

The Bits Per Entry value determines what format is used for the palette.

Warning: Values not listed in the following table are rounded upwards to the next one specified, or downwards if larger than the value for Direct. Therefore such values will lead to unexpected results, and should not be used.

There are currently three possible palette formats:

| BPE (blocks) | BPE (biomes) | Palette Format |
|---|---|---|
| 0 | 0 | Single valued |
| 4-8 | 1-3 | Indirect |
| 15** | 6* | Direct |

*The Notchian client calculates the Bits Per Entry values for the Direct palette format at runtime based on the sizes of the block state and biome registries. As such, the value used for biomes is entirely dependent on the contents of the biome registry sent in the Registry Data packet; the value shown is only valid for vanilla servers with no custom data packs. If the BPE requirement for Direct is less than or equal to the maximum for Indirect, Direct will never be used given BPE values within the valid range.

**Similarly, if a sufficiently large number of blocks is added with mods, the value will be increased to compensate for the increased ID count. This increase can go up to 31 bits per entry (since registry IDs are signed integers). In case of Minecraft Forge, you can get the number of blocks with the "Number of ids" field found in the RegistryData packet in the Forge Handshake.
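The table and footnotes above can be summarised in a small sketch for block states. The function names are hypothetical, and the default of 15 direct bits is taken from the table; a modded registry may need more.

```python
def direct_bits_for_registry(registry_size):
    """ceil(log2(registry_size)), computed with integers to avoid float rounding."""
    return (registry_size - 1).bit_length()

def block_format_for_received_bits(bits_per_entry, direct_bits=15):
    """Reader side: map a received Bits Per Entry to (format, effective bits).

    Unlisted values are rounded up to the next listed one, or down to Direct.
    """
    if bits_per_entry == 0:
        return "single valued", 0
    if bits_per_entry <= 8:
        return "indirect", max(4, bits_per_entry)
    return "direct", direct_bits

def block_bits_for_unique_count(unique_count, direct_bits=15):
    """Writer side: smallest valid Bits Per Entry for a section's distinct states."""
    if unique_count == 1:
        return 0
    needed = (unique_count - 1).bit_length()
    return max(4, needed) if needed <= 8 else direct_bits

print(block_bits_for_unique_count(3))     # 4
print(block_format_for_received_bits(9))  # ('direct', 15)
```

The same structure applies to biomes with the indirect range 1-3 and a Direct value derived from the biome registry size.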

Single valued

When this palette is used, the Data Array sent/received is empty, since entries can be inferred from the palette's single value. However, the length of the Data Array is still included, even though it's always 0.

| Field Name | Field Type | Notes |
|---|---|---|
| Value | VarInt | ID of the corresponding entry in its registry. |

Indirect

This is an actual palette which lists the entries used. Values in the Data Array are indices into the palette, which in turn gives a proper registry ID.

| Field Name | Field Type | Notes |
|---|---|---|
| Palette Length | VarInt | Number of elements in the following array. |
| Palette | Array of VarInt | Registry IDs; the position of each ID in this array is the index used for it in the Data Array. |

Direct

Registry IDs are stored directly as entries in the Data Array.

| Field Name | Field Type | Notes |
|---|---|---|
| no fields | | |

Example

Here is an example showing a Chunk Section using a single-valued palette for block states, and an indirect palette with 2 indices for biomes:

00 00 00 00 00 01 02 27 03 01 CC FF CC FF CC FF CC FF

The first bytes 00 00 are the number of non-air blocks in the chunk section. They are followed by the Bits Per Entry 00, which is zero, so we know the palette has a single element (not prefixed with a length). This single element is the block state ID of air, 00. Next comes the length of the long array, which is always 00 for single-valued palettes.

The second part of the section is for biomes. The first byte is their Bits Per Entry 01, followed by the length of the palette 02 and the two elements 27 03. The data array for the biomes has a length of 01 long (8 bytes), giving the long CC FF CC FF CC FF CC FF.
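The example bytes can be walked through with a small parser. This is a sketch rather than a complete implementation: the helper names are hypothetical, VarInt decoding does no overflow checking, and the Direct threshold is passed in as `max_indirect_bits` (8 for block states, 3 for biomes, per the table above).

```python
import struct

def read_varint(buf, pos):
    """Decode a VarInt starting at pos; return (value, new_pos)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7

def read_paletted_container(buf, pos, max_indirect_bits):
    """Return ((bits, palette, longs), new_pos); palette is None for Direct."""
    bits = buf[pos]
    pos += 1
    if bits == 0:                      # single valued: one VarInt, no length
        value, pos = read_varint(buf, pos)
        palette = [value]
    elif bits <= max_indirect_bits:    # indirect: length-prefixed VarInt array
        length, pos = read_varint(buf, pos)
        palette = []
        for _ in range(length):
            value, pos = read_varint(buf, pos)
            palette.append(value)
    else:                              # direct: no palette fields
        palette = None
    long_count, pos = read_varint(buf, pos)
    longs = list(struct.unpack_from(f">{long_count}Q", buf, pos))
    return (bits, palette, longs), pos + 8 * long_count

def read_chunk_section(buf, pos=0):
    (block_count,) = struct.unpack_from(">h", buf, pos)
    blocks, pos = read_paletted_container(buf, pos + 2, 8)
    biomes, pos = read_paletted_container(buf, pos, 3)
    return block_count, blocks, biomes, pos

raw = bytes.fromhex("00 00 00 00 00 01 02 27 03 01 CC FF CC FF CC FF CC FF")
block_count, blocks, biomes, end = read_chunk_section(raw)
print(block_count, blocks, biomes[:2], end)  # 0 (0, [0], []) (1, [39, 3]) 18
```

The parser consumes all 18 bytes: an all-air block container followed by a two-entry biome palette with one long of indices.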

Data Array format

The Data Array stores entries as Bits Per Entry–bit integers, corresponding to either palette indices or registry IDs depending on the palette format in use. If Bits Per Entry is 0, it is empty.

Entries are stored in order of increasing x coordinate, within rows at increasing z coordinates, within layers at increasing y coordinates. In other words, if the Data Array were a multidimensional array in C (modulo the packed encoding), it would be indexed array[y][z][x].
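For block states, the array[y][z][x] ordering corresponds to the following index arithmetic. This is a minimal sketch with hypothetical function names; coordinates are relative to the chunk section (0 to 15).

```python
def entry_index(x, y, z):
    """Position of a block's entry within the 4096 block state entries."""
    return (y * 16 + z) * 16 + x   # same as (y << 8) | (z << 4) | x

def entry_coords(index):
    """Inverse of entry_index; returns (x, y, z)."""
    return index & 15, index >> 8, (index >> 4) & 15

print(entry_index(1, 0, 0), entry_index(0, 0, 1), entry_index(0, 1, 0))  # 1 16 256
```

Biomes use the same ordering on a 4×4×4 grid, so the multiplier there is 4 instead of 16.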

A single long of the array holds several entries. The entries are tightly packed within the long, with the first entry on the least significant bits. An entry cannot span across multiple longs; instead, padding is inserted as required, starting from the most significant bits.

For example, assuming a bits per block value of 15, and that bit 0 is the least significant bit, the data is stored such that bits 0 through 14 are the first entry, 15 through 29 are the second, and so on. The fourth entry ends on bit 59, and since only 4 bits are left, they become padding, and the fifth entry starts on the next long.

Note that since longs are sent in big endian order, the least significant bit of the first entry in a long will be on the last byte of the long on the wire.
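The packing rules above (first entry on the least significant bits, no entry spanning two longs, big-endian longs on the wire) can be sketched as follows. The function names are hypothetical, and entries are assumed to already fit in `bits_per_entry` bits.

```python
import struct

def pack_entries(entries, bits_per_entry):
    """Pack entries into 64-bit values using the 1.16+ padded layout."""
    per_long = 64 // bits_per_entry          # leftover high bits stay zero (padding)
    mask = (1 << bits_per_entry) - 1
    longs = []
    for start in range(0, len(entries), per_long):
        value = 0
        for i, entry in enumerate(entries[start:start + per_long]):
            value |= (entry & mask) << (i * bits_per_entry)
        longs.append(value)
    return longs

def longs_to_wire(longs):
    """Serialize as big-endian unsigned 64-bit integers."""
    return struct.pack(f">{len(longs)}Q", *longs)

longs = pack_entries([1, 2, 3, 4, 5], 15)
print(len(longs))                 # 2: four entries fit in the first long
print(longs_to_wire([1]).hex())   # 0000000000000001
```

With 15 bits per entry the fifth entry starts a new long, and the least significant bit of the first entry ends up in the last byte on the wire.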

Warning: This format was changed in Minecraft 1.16. In prior versions, entries could cross long boundaries, and there was no padding.

Visual example

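5 bits per block, containing the following references to entries in a palette (not shown): 1 2 2 3 4 4 5 6 6 4 8 0 7 4 3 13 15 16 9 14 10 12 0 2

Each long is shown in hexadecimal, then in binary as 4 padding bits followed by twelve 5-bit entries; the first entry is the rightmost group.

0020863148418841 = 0000 00000 01000 00100 00110 00110 00101 00100 00100 00011 00010 00010 00001
01018A7260F68C87 = 0000 00010 00000 01100 01010 01110 01001 10000 01111 01101 00011 00100 00111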
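The two longs of the visual example can be decoded with a short sketch (hypothetical function name; it assumes the 1.16+ layout in which entries never span longs):

```python
def unpack_entries(longs, bits_per_entry, count):
    """Extract count entries, starting from the least significant bits of each long."""
    per_long = 64 // bits_per_entry
    mask = (1 << bits_per_entry) - 1
    entries = []
    for value in longs:
        for i in range(per_long):
            if len(entries) == count:
                return entries
            entries.append((value >> (i * bits_per_entry)) & mask)
    return entries

longs = [0x0020863148418841, 0x01018A7260F68C87]
print(unpack_entries(longs, 5, 24))
# [1, 2, 2, 3, 4, 4, 5, 6, 6, 4, 8, 0, 7, 4, 3, 13, 15, 16, 9, 14, 10, 12, 0, 2]
```

The `count` argument matters for real data: the last long usually holds fewer entries than it has room for, and the unused slots must not be read as entries.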

Tips and notes

There are several things that can make it easier to implement this format.

  • Servers do not need to implement the palette initially (instead always using 15 bits per block), although it is an important optimization later on.
  • The Notchian server implementation does not send values that are out of bounds for the palette. If such a value is received, the format is being parsed incorrectly. In particular, if you're reading a number with all bits set (15, 31, etc), you might be reading skylight data (or you may have a sign error and you're reading negative numbers).
  • The Notchian client generally does not render chunks that lack neighbors. (As of 1.20.2 such chunks appear to sporadically become visible anyway, and do so consistently when interacted with.) This means that if you only send a fixed set of chunks with no empty chunks around them, then some of them will not be visible, although you can still interact with them. This is intended behavior, so that lighting and connected blocks can be handled correctly.

Sample implementations

The following information needs to be added to this page:
This sample code is missing the heightmap, biome changes and the changes from 1.16

How the chunk format is implemented depends largely on how you want to read and write it. It is often easier to read/write the data long-by-long instead of pre-creating the data to write; however, storing the chunk data arrays in their packed form can be far more efficient in both memory and performance. These implementations are simple versions that can work as a base (especially for dealing with the bit shifting), but are not ideal.

Shared code

This is some basic pseudocode that shows the various types of palettes. It does not handle actually populating the palette based on data in a chunk section; handling this is left to the implementer since there are many ways of doing so. (This does not apply to the direct version.)

private uint GetGlobalPaletteIDFromState(BlockState state) {
    // Implementation left to the user; see Data Generators for more info on the values
}

private BlockState GetStateFromGlobalPaletteID(uint value) {
    // Implementation left to the user; see Data Generators for more info on the values
}

public interface Palette {
    uint IdForState(BlockState state);
    BlockState StateForId(uint id);
    byte GetBitsPerBlock();
    void Read(Buffer data);
    void Write(Buffer data);
}

public class IndirectPalette : Palette {
    Map<uint, BlockState> idToState;
    Map<BlockState, uint> stateToId;
    byte bitsPerBlock;

    public IndirectPalette(byte palBitsPerBlock) {
        bitsPerBlock = palBitsPerBlock;
    }

    public uint IdForState(BlockState state) {
        return stateToId.Get(state);
    }

    public BlockState StateForId(uint id) {
        return idToState.Get(id);
    }

    public byte GetBitsPerBlock() {
        return bitsPerBlock;
    }

    public void Read(Buffer data) {
        idToState = new Map<>();
        stateToId = new Map<>();
        // Palette Length
        int length = ReadVarInt(data);
        // Palette
        for (int id = 0; id < length; id++) {
            uint stateId = (uint)ReadVarInt(data);
            BlockState state = GetStateFromGlobalPaletteID(stateId);
            idToState.Set((uint)id, state);
            stateToId.Set(state, (uint)id);
        }
    }

    public void Write(Buffer data) {
        Assert(idToState.Size() == stateToId.Size()); // both should be equivalent
        // Palette Length
        WriteVarInt(data, idToState.Size());
        // Palette
        for (int id = 0; id < idToState.Size(); id++) {
            BlockState state = idToState.Get((uint)id);
            uint stateId = GetGlobalPaletteIDFromState(state);
            WriteVarInt(data, (int)stateId);
        }
    }
}

public class DirectPalette : Palette {
    public uint IdForState(BlockState state) {
        return GetGlobalPaletteIDFromState(state);
    }

    public BlockState StateForId(uint id) {
        return GetStateFromGlobalPaletteID(id);
    }

    public byte GetBitsPerBlock() {
        return Ceil(Log2(BlockState.TotalNumberOfStates)); // currently 15
    }

    public void Read(Buffer data) {
        // No Data
    }

    public void Write(Buffer data) {
        // No Data
    }
}

public Palette ChoosePalette(byte bitsPerBlock) {
    if (bitsPerBlock <= 4) {
        return new IndirectPalette(4);
    } else if (bitsPerBlock <= 8) {
        return new IndirectPalette(bitsPerBlock);
    } else {
        return new DirectPalette();
    }
}

Deserializing

When deserializing, it is easy to read to a buffer (since length information is present). A basic example:

public Chunk ReadChunkDataPacket(Buffer data) {
    int x = ReadInt(data);
    int z = ReadInt(data);
    bool full = ReadBool(data);
    Chunk chunk;
    if (full) {
        chunk = new Chunk(x, z);
    } else {
        chunk = GetExistingChunk(x, z);
    }
    int mask = ReadVarInt(data);
    int size = ReadVarInt(data);
    ReadChunkColumn(chunk, full, mask, data.ReadByteArray(size));

    int blockEntityCount = ReadVarInt(data);
    for (int i = 0; i < blockEntityCount; i++) {
        CompoundTag tag = ReadCompoundTag(data);
        chunk.AddBlockEntity(tag.GetInt("x"), tag.GetInt("y"), tag.GetInt("z"), tag);
    }

    return chunk;
}

private void ReadChunkColumn(Chunk chunk, bool full, int mask, Buffer data) {
    for (int sectionY = 0; sectionY < (CHUNK_HEIGHT / SECTION_HEIGHT); sectionY++) {
        if ((mask & (1 << sectionY)) != 0) {  // Is the given bit set in the mask?
            byte bitsPerBlock = ReadByte(data);
            Palette palette = ChoosePalette(bitsPerBlock);
            palette.Read(data);

            // A bitmask that contains bitsPerBlock set bits
            uint individualValueMask = (uint)((1 << bitsPerBlock) - 1);

            int dataArrayLength = ReadVarInt(data);
            UInt64[] dataArray = ReadUInt64Array(data, dataArrayLength);

            ChunkSection section = new ChunkSection();

            for (int y = 0; y < SECTION_HEIGHT; y++) {
                for (int z = 0; z < SECTION_WIDTH; z++) {
                    for (int x = 0; x < SECTION_WIDTH; x++) {
                        int blockNumber = (((y * SECTION_WIDTH) + z) * SECTION_WIDTH) + x;
                        int startLong = (blockNumber * bitsPerBlock) / 64;
                        int startOffset = (blockNumber * bitsPerBlock) % 64;
                        int endLong = ((blockNumber + 1) * bitsPerBlock - 1) / 64;

                        // Named "entry" rather than "data" to avoid shadowing the Buffer parameter
                        uint entry;
                        if (startLong == endLong) {
                            entry = (uint)(dataArray[startLong] >> startOffset);
                        } else {
                            int endOffset = 64 - startOffset;
                            entry = (uint)(dataArray[startLong] >> startOffset | dataArray[endLong] << endOffset);
                        }
                        entry &= individualValueMask;

                        // entry should always be valid for the palette
                        // If you're reading a power of 2 minus one (15, 31, 63, 127, etc...) that's out of bounds,
                        // you're probably reading light data instead

                        BlockState state = palette.StateForId(entry);
                        section.SetState(x, y, z, state);
                    }
                }
            }

            for (int y = 0; y < SECTION_HEIGHT; y++) {
                for (int z = 0; z < SECTION_WIDTH; z++) {
                    for (int x = 0; x < SECTION_WIDTH; x += 2) {
                        // Note: x += 2 above; we read 2 values along x each time
                        byte value = ReadByte(data);

                        section.SetBlockLight(x, y, z, value & 0xF);
                        section.SetBlockLight(x + 1, y, z, (value >> 4) & 0xF);
                    }
                }
            }

            if (currentDimension.HasSkylight()) { // IE, current dimension is overworld / 0
                for (int y = 0; y < SECTION_HEIGHT; y++) {
                    for (int z = 0; z < SECTION_WIDTH; z++) {
                        for (int x = 0; x < SECTION_WIDTH; x += 2) {
                            // Note: x += 2 above; we read 2 values along x each time
                            byte value = ReadByte(data);

                            section.SetSkyLight(x, y, z, value & 0xF);
                            section.SetSkyLight(x + 1, y, z, (value >> 4) & 0xF);
                        }
                    }
                }
            }

            // May replace an existing section or a null one
            chunk.Sections[sectionY] = section;
        }
    }

    for (int z = 0; z < SECTION_WIDTH; z++) {
        for (int x = 0; x < SECTION_WIDTH; x++) {
            chunk.SetBiome(x, z, ReadInt(data));
        }
    }
}

Serializing

Serializing the packet is more complicated, because of the palette. It is easy to implement with the full bits per block value; implementing it with a compacting palette is much harder since algorithms to generate and resize the palette must be written. As such, this example does not generate a palette. The palette is a good performance improvement (as it can significantly reduce the amount of data sent), but managing that is much harder and there are a variety of ways of implementing it.

Also note that this implementation doesn't handle situations where full is false (i.e., making a large change to one section); it's only good for serializing a full chunk.

public void WriteChunkDataPacket(Chunk chunk, Buffer data) {
    WriteInt(data, chunk.GetX());
    WriteInt(data, chunk.GetZ());
    WriteBool(data, true);  // Full

    int mask = 0;
    Buffer columnBuffer = new Buffer();
    for (int sectionY = 0; sectionY < (CHUNK_HEIGHT / SECTION_HEIGHT); sectionY++) {
        if (!chunk.IsSectionEmpty(sectionY)) {
            mask |= (1 << sectionY);  // Set that bit to true in the mask
            WriteChunkSection(chunk.Sections[sectionY], columnBuffer);
        }
    }
    for (int z = 0; z < SECTION_WIDTH; z++) {
        for (int x = 0; x < SECTION_WIDTH; x++) {
            WriteInt(columnBuffer, chunk.GetBiome(x, z));  // Use 127 for 'void' if your server doesn't support biomes
        }
    }

    WriteVarInt(data, mask);
    WriteVarInt(data, columnBuffer.Size);
    WriteByteArray(data, columnBuffer);

    // If you don't support block entities yet, use 0
    // If you need to implement it by sending block entities later with the update block entity packet,
    // do it that way and send 0 as well.  (Note that 1.10.1 (not 1.10 or 1.10.2) will not accept that)

    WriteVarInt(data, chunk.BlockEntities.Length);
    foreach (CompoundTag tag in chunk.BlockEntities) {
        WriteCompoundTag(data, tag);
    }
}

private void WriteChunkSection(ChunkSection section, Buffer buf) {
    Palette palette = section.palette;
    byte bitsPerBlock = palette.GetBitsPerBlock();

    WriteByte(buf, bitsPerBlock);
    palette.Write(buf);

    int dataLength = (16*16*16) * bitsPerBlock / 64; // See tips section for an explanation of this calculation
    UInt64[] data = new UInt64[dataLength];

    // A bitmask that contains bitsPerBlock set bits
    uint individualValueMask = (uint)((1 << bitsPerBlock) - 1);

    for (int y = 0; y < SECTION_HEIGHT; y++) {
        for (int z = 0; z < SECTION_WIDTH; z++) {
            for (int x = 0; x < SECTION_WIDTH; x++) {
                int blockNumber = (((y * SECTION_WIDTH) + z) * SECTION_WIDTH) + x;
                int startLong = (blockNumber * bitsPerBlock) / 64;
                int startOffset = (blockNumber * bitsPerBlock) % 64;
                int endLong = ((blockNumber + 1) * bitsPerBlock - 1) / 64;

                BlockState state = section.GetState(x, y, z);

                UInt64 value = palette.IdForState(state);
                value &= individualValueMask;

                data[startLong] |= (value << startOffset);

                if (startLong != endLong) {
                    data[endLong] = (value >> (64 - startOffset));
                }
            }
        }
    }

    WriteVarInt(buf, dataLength);
    WriteLongArray(buf, data);

    for (int y = 0; y < SECTION_HEIGHT; y++) {
        for (int z = 0; z < SECTION_WIDTH; z++) {
            for (int x = 0; x < SECTION_WIDTH; x += 2) {
                // Note: x += 2 above; we write 2 values along x each time
                byte value = (byte)(section.GetBlockLight(x, y, z) | (section.GetBlockLight(x + 1, y, z) << 4));
                WriteByte(buf, value);
            }
        }
    }

    if (currentDimension.HasSkylight()) { // IE, current dimension is overworld / 0
        for (int y = 0; y < SECTION_HEIGHT; y++) {
            for (int z = 0; z < SECTION_WIDTH; z++) {
                for (int x = 0; x < SECTION_WIDTH; x += 2) {
                    // Note: x += 2 above; we write 2 values along x each time
                    byte value = (byte)(section.GetSkyLight(x, y, z) | (section.GetSkyLight(x + 1, y, z) << 4));
                    WriteByte(buf, value);
                }
            }
        }
    }
}

Full implementations

Sample data

Old format

The following implement the previous (before 1.9) format: