Difference between revisions of "Data types"
Nickelpro1 (talk | contribs) Tag: Undo |
(→Definitions: Add Sound Event. This is most likely a temporary solution; would be better to have these in Registry Data.) |
Line 12: | Line 12: | ||
! Notes | ! Notes | ||
|- | |- | ||
− | ! Boolean | + | ! id=Type:Boolean | {{Type|Boolean}} |
| 1 | | 1 | ||
| Either false or true | | Either false or true | ||
| True is encoded as <code>0x01</code>, false as <code>0x00</code>. | | True is encoded as <code>0x01</code>, false as <code>0x00</code>. | ||
|- | |- | ||
− | ! Byte | + | ! id=Type:Byte | {{Type|Byte}} |
| 1 | | 1 | ||
| An integer between -128 and 127 | | An integer between -128 and 127 | ||
| Signed 8-bit integer, [[wikipedia:Two's complement|two's complement]] | | Signed 8-bit integer, [[wikipedia:Two's complement|two's complement]] | ||
|- | |- | ||
− | ! Unsigned Byte | + | ! id=Type:Unsigned_Byte | {{Type|Unsigned Byte}} |
| 1 | | 1 | ||
| An integer between 0 and 255 | | An integer between 0 and 255 | ||
| Unsigned 8-bit integer | | Unsigned 8-bit integer | ||
|- | |- | ||
− | ! Short | + | ! id=Type:Short | {{Type|Short}} |
| 2 | | 2 | ||
| An integer between -32768 and 32767 | | An integer between -32768 and 32767 | ||
| Signed 16-bit integer, two's complement | | Signed 16-bit integer, two's complement | ||
|- | |- | ||
− | ! Unsigned Short | + | ! id=Type:Unsigned_Short | {{Type|Unsigned Short}} |
| 2 | | 2 | ||
| An integer between 0 and 65535 | | An integer between 0 and 65535 | ||
| Unsigned 16-bit integer | | Unsigned 16-bit integer | ||
|- | |- | ||
− | ! Int | + | ! id=Type:Int | {{Type|Int}} |
| 4 | | 4 | ||
| An integer between -2147483648 and 2147483647 | | An integer between -2147483648 and 2147483647 | ||
| Signed 32-bit integer, two's complement | | Signed 32-bit integer, two's complement | ||
|- | |- | ||
− | ! Long | + | ! id=Type:Long | {{Type|Long}} |
| 8 | | 8 | ||
| An integer between -9223372036854775808 and 9223372036854775807 | | An integer between -9223372036854775808 and 9223372036854775807 | ||
| Signed 64-bit integer, two's complement | | Signed 64-bit integer, two's complement | ||
|- | |- | ||
− | ! Float | + | ! id=Type:Float | {{Type|Float}} |
| 4 | | 4 | ||
| A [[wikipedia:Single-precision floating-point format|single-precision 32-bit IEEE 754 floating point number]] | | A [[wikipedia:Single-precision floating-point format|single-precision 32-bit IEEE 754 floating point number]] | ||
| | | | ||
|- | |- | ||
− | ! Double | + | ! id=Type:Double | {{Type|Double}} |
| 8 | | 8 | ||
| A [[wikipedia:Double-precision floating-point format|double-precision 64-bit IEEE 754 floating point number]] | | A [[wikipedia:Double-precision floating-point format|double-precision 64-bit IEEE 754 floating point number]] | ||
| | | | ||
|- | |- | ||
− | ! String (n) | + | ! id=Type:String | {{Type|String}} (n) |
− | | ≥ 1 <br />≤ (n× | + | | ≥ 1 <br />≤ (n×3) + 3 |
| A sequence of [[wikipedia:Unicode|Unicode]] [http://unicode.org/glossary/#unicode_scalar_value scalar values] | | A sequence of [[wikipedia:Unicode|Unicode]] [http://unicode.org/glossary/#unicode_scalar_value scalar values] | ||
− | | [[wikipedia:UTF-8|UTF-8]] string prefixed with its size in bytes as a VarInt. Maximum length of <code>n</code> characters, which varies by context | + | | [[wikipedia:UTF-8|UTF-8]] string prefixed with its size in bytes as a VarInt. Maximum length of <code>n</code> characters, which varies by context. The encoding used on the wire is regular UTF-8, ''not'' [https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/io/DataInput.html#modified-utf-8 Java's "slight modification"]. However, the length of the string for purposes of the length limit is its number of [[wikipedia:UTF-16|UTF-16]] code units, that is, scalar values > U+FFFF are counted as two. Up to <code>n × 3</code> bytes can be used to encode a UTF-8 string comprising <code>n</code> code units when converted to UTF-16, and both of those limits are checked. Maximum <code>n</code> value is 32767. The + 3 is due to the max size of a valid length VarInt. |
|- | |- | ||
− | ! | + | ! id=Type:Text_Component | {{Type|Text Component}} |
− | | | + | | Varies |
− | | See [[ | + | | See [[Text formatting#Text components]] |
− | | Encoded as a | + | | Encoded as a [[NBT|NBT Tag]], with the type of tag used depending on the case: |
+ | * As a [[NBT#Specification:string_tag|String Tag]]: For components only containing text (no styling, no events etc.). | ||
+ | * As a [[NBT#Specification:compound_tag|Compound Tag]]: Every other case. | ||
|- | |- | ||
− | ! Identifier | + | ! id=Type:JSON_Text_Component | {{Type|JSON Text Component}} |
− | | ≥ 1 <br />≤ (32767× | + | | ≥ 1 <br />≤ (262144×3) + 3 |
+ | | See [[Text formatting#Text components]] | ||
+ | | The maximum permitted length when decoding is 262144, but the Notchian server since 1.20.3 refuses to encode longer than 32767. This may be a bug. | ||
+ | |- | ||
+ | ! id=Type:Identifier | {{Type|Identifier}} | ||
+ | | ≥ 1 <br />≤ (32767×3) + 3 | ||
| See [[#Identifier|Identifier]] below | | See [[#Identifier|Identifier]] below | ||
| Encoded as a String with max length of 32767. | | Encoded as a String with max length of 32767. | ||
|- | |- | ||
− | ! VarInt | + | ! id=Type:VarInt | {{Type|VarInt}} |
| ≥ 1 <br />≤ 5 | | ≥ 1 <br />≤ 5 | ||
| An integer between -2147483648 and 2147483647 | | An integer between -2147483648 and 2147483647 | ||
| Variable-length data encoding a two's complement signed 32-bit integer; more info in [[#VarInt and VarLong|their section]] | | Variable-length data encoding a two's complement signed 32-bit integer; more info in [[#VarInt and VarLong|their section]] | ||
|- | |- | ||
− | ! VarLong | + | ! id=Type:VarLong | {{Type|VarLong}} |
| ≥ 1 <br />≤ 10 | | ≥ 1 <br />≤ 10 | ||
| An integer between -9223372036854775808 and 9223372036854775807 | | An integer between -9223372036854775808 and 9223372036854775807 | ||
| Variable-length data encoding a two's complement signed 64-bit integer; more info in [[#VarInt and VarLong|their section]] | | Variable-length data encoding a two's complement signed 64-bit integer; more info in [[#VarInt and VarLong|their section]] | ||
|- | |- | ||
− | ! Entity Metadata | + | ! id=Type:Entity_Metadata | {{Type|Entity Metadata}} |
| Varies | | Varies | ||
| Miscellaneous information about an entity | | Miscellaneous information about an entity | ||
| See [[Entity_metadata#Entity Metadata Format]] | | See [[Entity_metadata#Entity Metadata Format]] | ||
|- | |- | ||
− | ! Slot | + | ! id=Type:Slot | {{Type|Slot}} |
| Varies | | Varies | ||
| An item stack in an inventory or container | | An item stack in an inventory or container | ||
| See [[Slot Data]] | | See [[Slot Data]] | ||
|- | |- | ||
− | ! NBT | + | ! id=Type:NBT | {{Type|NBT}} |
| Varies | | Varies | ||
| Depends on context | | Depends on context | ||
| See [[NBT]] | | See [[NBT]] | ||
|- | |- | ||
− | ! Position | + | ! id=Type:Position | {{Type|Position}} |
| 8 | | 8 | ||
− | | An integer/block position: x (-33554432 to 33554431), | + | | An integer/block position: x (-33554432 to 33554431), z (-33554432 to 33554431), y (-2048 to 2047) |
− | | x as a 26-bit integer, followed by | + | | x as a 26-bit integer, followed by z as a 26-bit integer, followed by y as a 12-bit integer (all signed, two's complement). See also [[#Position|the section below]]. |
|- | |- | ||
− | ! Angle | + | ! id=Type:Angle | {{Type|Angle}} |
| 1 | | 1 | ||
| A rotation angle in steps of 1/256 of a full turn | | A rotation angle in steps of 1/256 of a full turn | ||
| Whether or not this is signed does not matter, since the resulting angles are the same. | | Whether or not this is signed does not matter, since the resulting angles are the same. | ||
|- | |- | ||
− | ! UUID | + | ! id=Type:UUID | {{Type|UUID}} |
| 16 | | 16 | ||
| A [[wikipedia:Universally_unique_identifier|UUID]] | | A [[wikipedia:Universally_unique_identifier|UUID]] | ||
| Encoded as an unsigned 128-bit integer (or two unsigned 64-bit integers: the most significant 64 bits and then the least significant 64 bits) | | Encoded as an unsigned 128-bit integer (or two unsigned 64-bit integers: the most significant 64 bits and then the least significant 64 bits) | ||
|- | |- | ||
− | ! Optional X | + | ! id=Type:BitSet | {{Type|BitSet}} |
+ | | Varies | ||
+ | | See [[#BitSet]] below | ||
+ | | A length-prefixed bit set. | ||
+ | |- | ||
+ | ! id=Type:Fixed_BitSet | {{Type|Fixed BitSet}} (n) | ||
+ | | ceil(n / 8) | ||
+ | | See [[#Fixed BitSet]] below | ||
+ | | A bit set with a fixed length of <var>n</var> bits. | ||
+ | |- | ||
+ | ! id=Type:Optional | {{Type|Optional}} X | ||
| 0 or size of X | | 0 or size of X | ||
| A field of type X, or nothing | | A field of type X, or nothing | ||
| Whether or not the field is present must be known from the context. | | Whether or not the field is present must be known from the context. | ||
|- | |- | ||
− | ! Array of X | + | ! id=Type:Array | {{Type|Array}} of X |
| count times size of X | | count times size of X | ||
| Zero or more fields of type X | | Zero or more fields of type X | ||
| The count must be known from the context. | | The count must be known from the context. | ||
|- | |- | ||
− | ! X Enum | + | ! id=Type:Enum | X {{Type|Enum}} |
| size of X | | size of X | ||
| A specific value from a given list | | A specific value from a given list | ||
| The list of possible values and how each is encoded as an X must be known from the context. An invalid value sent by either side will usually result in the client being disconnected with an error or even crashing. | | The list of possible values and how each is encoded as an X must be known from the context. An invalid value sent by either side will usually result in the client being disconnected with an error or even crashing. | ||
|- | |- | ||
− | ! Byte Array | + | ! id=Type:Byte_Array | {{Type|Byte Array}} |
| Varies | | Varies | ||
| Depends on context | | Depends on context | ||
| This is just a sequence of zero or more bytes, its meaning should be explained somewhere else, e.g. in the packet description. The length must also be known from the context. | | This is just a sequence of zero or more bytes, its meaning should be explained somewhere else, e.g. in the packet description. The length must also be known from the context. | ||
+ | |- | ||
+ | ! id=Type:ID_or | {{Type|ID or}} X | ||
+ | | size of {{Type|VarInt}} + (size of X or 0) | ||
+ | | See [[#ID or X]] below | ||
+ | | Either a registry ID or an inline data definition of type X. | ||
+ | |- | ||
+ | ! id=Type:ID_Set | {{Type|ID Set}} | ||
+ | | Varies | ||
+ | | See [[#ID Set]] below | ||
+ | | Set of registry IDs specified either inline or as a reference to a tag. | ||
+ | |- | ||
+ | ! id=Type:Sound_Event | {{Type|Sound Event}} | ||
+ | | Varies | ||
+ | | See [[#Sound Event]] below | ||
+ | | Parameters for a sound event. | ||
|} | |} | ||
<noinclude>== Identifier ==</noinclude><includeonly>=== Identifier ===</includeonly>

Identifiers are a namespaced location, in the form of <code>minecraft:thing</code>. If the namespace is not provided, it defaults to <code>minecraft</code> (i.e. <code>thing</code> is <code>minecraft:thing</code>). Custom content should always be in its own namespace, not the default one. Both the namespace and value can use all lowercase alphanumeric characters (a-z and 0-9), dot (<code>.</code>), dash (<code>-</code>), and underscore (<code>_</code>). In addition, values can use slash (<code>/</code>). The naming convention is <code>lower_case_with_underscores</code>. [https://minecraft.net/en-us/article/minecraft-snapshot-17w43a More information].

For ease of determining whether a namespace or value is valid, here are regular expression character classes matching the characters allowed in each (the dash is placed last so that it is not read as a range):
* Namespace: <code>[a-z0-9._-]</code>
* Value: <code>[a-z0-9._/-]</code>
<noinclude>== VarInt and VarLong ==</noinclude><includeonly>=== VarInt and VarLong ===</includeonly> | <noinclude>== VarInt and VarLong ==</noinclude><includeonly>=== VarInt and VarLong ===</includeonly> | ||
<noinclude>== Position ==</noinclude><includeonly>=== Position ===</includeonly>

<b>Note:</b> What you are seeing here is the latest version of the [[Data types]] article, but the position type was [https://wiki.vg/index.php?title=Data_types&oldid=14345#Position different before 1.14].

64-bit value split into three '''signed''' integer parts:
* x: 26 MSBs
* z: 26 middle bits
* y: 12 LSBs

For example, a 64-bit position can be broken down as follows:

Example value (big endian): <code><span style="outline: solid 2px rgb(255, 0, 0)">01000110000001110110001100</span> <span style="outline: solid 2px rgb(0, 0, 255)">10110000010101101101001000</span> <span style="outline: solid 2px rgb(0, 255, 0)">001100111111</span></code><br>
* The red value is the X coordinate, which is <code>18357644</code> in this example.<br>
* The blue value is the Z coordinate, which is <code>-20882616</code> in this example.<br>
* The green value is the Y coordinate, which is <code>831</code> in this example.<br>

Encoded as follows:

 ((x & 0x3FFFFFF) << 38) | ((z & 0x3FFFFFF) << 12) | (y & 0xFFF)

And decoded as:

 val = read_long();
 x = val >> 38;
 y = val << 52 >> 52;
 z = val << 26 >> 38;

Note: The above assumes that the right shift operator sign extends the value (this is called an [https://en.wikipedia.org/wiki/Arithmetic_shift arithmetic shift]), so that the signedness of the coordinates is preserved. In many languages, this requires the integer type of <code>val</code> to be signed. In the absence of such an operator, the following may be useful:

 if x >= 1 << 25 { x -= 1 << 26 }
 if y >= 1 << 11 { y -= 1 << 12 }
 if z >= 1 << 25 { z -= 1 << 26 }
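
The packing and unpacking described in this section can be sketched in Java, whose <code>>></code> operator on <code>long</code> is an arithmetic shift. The class and method names (<code>PositionCodec</code>, <code>encode</code>, <code>x</code>, <code>y</code>, <code>z</code>) are illustrative assumptions, not part of the protocol or the Notchian code.

```java
/** Illustrative sketch of the Position bit layout: x (26 bits), z (26 bits), y (12 bits). */
final class PositionCodec {
    /** Packs the three coordinates into one 64-bit value. */
    static long encode(int x, int y, int z) {
        return ((long) (x & 0x3FFFFFF) << 38)
             | ((long) (z & 0x3FFFFFF) << 12)
             | (long) (y & 0xFFF);
    }

    /** The top 26 bits; the arithmetic shift keeps the sign. */
    static int x(long val) {
        return (int) (val >> 38);
    }

    /** The low 12 bits; shifting left first moves the field's sign bit to bit 63. */
    static int y(long val) {
        return (int) (val << 52 >> 52);
    }

    /** The middle 26 bits, sign-extended the same way. */
    static int z(long val) {
        return (int) (val << 26 >> 38);
    }
}
```

For the example value above, <code>PositionCodec.encode(18357644, 831, -20882616)</code> yields the 64-bit pattern shown, and the three accessors recover the original coordinates.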
<noinclude>== Fixed-point numbers ==</noinclude><includeonly>=== Fixed-point numbers ===</includeonly>

Some fields may be stored as [https://en.wikipedia.org/wiki/Fixed-point_arithmetic fixed-point numbers], where a certain number of bits represent the signed integer part (number to the left of the decimal point) and the rest represent the fractional part (to the right). Floating point numbers (float and double), in contrast, keep the number itself (mantissa) in one chunk, while the location of the decimal point (exponent) is stored beside it. Essentially, while fixed-point numbers have lower range than floating point numbers, their fractional precision is greater for higher values.

Prior to version 1.9 a fixed-point format with 5 fraction bits and 27 integer bits was used to send entity positions to the client. Some uses of fixed point remain in modern versions, but they differ from that format.

Most programming languages lack direct support for fixed-point numbers, but you can represent them as integers. The following C or Java-like pseudocode converts a double to a fixed-point integer with <var>n</var> fraction bits:

 x_fixed = (int)(x_double * (1 << n));

And back again:

 x_double = (double)x_fixed / (1 << n);
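
As a minimal runnable sketch of the two conversions above, the following Java class takes the number of fraction bits as a parameter. The names <code>FixedPoint</code>, <code>toFixed</code>, and <code>fromFixed</code> are hypothetical; note that the <code>(int)</code> cast truncates toward zero, which is a property of this sketch rather than something the protocol specifies.

```java
/** Illustrative fixed-point helpers; n is the number of fraction bits. */
final class FixedPoint {
    /** Scales by 2^n and truncates toward zero. */
    static int toFixed(double value, int n) {
        return (int) (value * (1 << n));
    }

    /** Divides by 2^n to recover the fractional value. */
    static double fromFixed(int fixed, int n) {
        return (double) fixed / (1 << n);
    }
}
```

With 5 fraction bits, <code>FixedPoint.toFixed(1.5, 5)</code> gives 48, and <code>FixedPoint.fromFixed(48, 5)</code> gives 1.5 again.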
<noinclude>== Bit sets ==</noinclude><includeonly>=== Bit sets ===</includeonly>

The types {{Type|BitSet}} and {{Type|Fixed BitSet}} represent packed lists of bits. The Notchian implementation uses Java's [https://docs.oracle.com/javase/8/docs/api/java/util/BitSet.html <code>BitSet</code>] class.

<noinclude>=== BitSet ===</noinclude><includeonly>==== BitSet ====</includeonly>

Bit sets of type BitSet are prefixed by their length in longs.

{| class="wikitable"
! Field Name
! Field Type
! Meaning
|-
| Length
| {{Type|VarInt}}
| Number of longs in the following array. May be 0 (if no bits are set).
|-
| Data
| {{Type|Array}} of {{Type|Long}}
| A packed representation of the bit set as created by [https://docs.oracle.com/javase/8/docs/api/java/util/BitSet.html#toLongArray-- <code>BitSet.toLongArray</code>].
|}

The <var>i</var>th bit is set when <code>(Data[i / 64] & (1 << (i % 64))) != 0</code>, where <var>i</var> starts at 0 and the shifted <code>1</code> is a 64-bit value.
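
The bit test above can be sketched in Java as follows. The class and method names (<code>BitSetReader</code>, <code>isSet</code>) are illustrative assumptions. The literal <code>1L</code> matters in Java: shifting an <code>int</code> only uses the low 5 bits of the shift count, so bits 32 to 63 of each long would be tested incorrectly.

```java
/** Illustrative bit lookup in the long array of a BitSet field. */
final class BitSetReader {
    /** Returns whether bit i is set; bits past the end of the array count as unset. */
    static boolean isSet(long[] data, int i) {
        int word = i / 64;
        if (word >= data.length) {
            return false; // toLongArray() omits trailing all-zero longs
        }
        return (data[word] & (1L << (i % 64))) != 0;
    }
}
```

An array produced by <code>java.util.BitSet.toLongArray()</code> can be passed directly; alternatively, <code>java.util.BitSet.valueOf(long[])</code> rebuilds a <code>BitSet</code> from the same array.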
+ | |||
+ | <noinclude>=== Fixed BitSet ===</noinclude><includeonly>==== Fixed BitSet ====</includeonly> | ||
+ | |||
+ | Bit sets of type Fixed BitSet (n) have a fixed length of <var>n</var> bits, encoded as <code>ceil(n / 8)</code> bytes. Note that this is different from BitSet, which uses longs. | ||
{| class="wikitable" | {| class="wikitable" | ||
+ | ! Field Name | ||
+ | ! Field Type | ||
+ | ! Meaning | ||
|- | |- | ||
− | ! | + | | Data |
− | ! | + | | {{Type|Byte Array}} (n) |
− | ! | + | | A packed representation of the bit set as created by [https://docs.oracle.com/javase/8/docs/api/java/util/BitSet.html#toByteArray-- <code>BitSet.toByteArray</code>], padded with zeroes at the end to fit the specified length. |
+ | |} | ||
+ | |||
+ | The <var>i</var>th bit is set when <code>(Data[i / 8] & (1 << (i % 8))) != 0</code>, where <var>i</var> starts at 0. This encoding is ''not'' equivalent to the long array in BitSet. | ||
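
A minimal Java sketch of the Fixed BitSet encoding, assuming a <code>java.util.BitSet</code> as the in-memory form. The names <code>FixedBitSetCodec</code>, <code>encode</code>, and <code>isSet</code> are hypothetical. <code>BitSet.toByteArray()</code> drops trailing zero bytes, so the result is padded to <code>ceil(n / 8)</code> bytes with <code>Arrays.copyOf</code>.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Illustrative encoder and bit lookup for Fixed BitSet (n). */
final class FixedBitSetCodec {
    /** Encodes the bit set as exactly ceil(n / 8) bytes, zero-padded at the end. */
    static byte[] encode(BitSet bits, int n) {
        if (bits.length() > n) {
            throw new IllegalArgumentException("bit set has bits beyond index " + (n - 1));
        }
        return Arrays.copyOf(bits.toByteArray(), (n + 7) / 8);
    }

    /** Returns whether bit i is set in the encoded byte array. */
    static boolean isSet(byte[] data, int i) {
        return (data[i / 8] & (1 << (i % 8))) != 0;
    }
}
```

For example, a set containing bits 0 and 9 with <var>n</var> = 20 encodes to the three bytes <code>01 02 00</code>.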
+ | |||
<noinclude>== Registry references ==</noinclude><includeonly>=== Registry references ===</includeonly>

<noinclude>=== ID or X ===</noinclude><includeonly>==== ID or X ====</includeonly>

Represents a data record of type X, either inline, or by reference to a registry implied by context.

{| class="wikitable"
! Field Name
! Field Type
! Meaning
|-
| ID
| {{Type|VarInt}}
| 0 if value of type X is given inline; otherwise registry ID + 1.
− | |||
− | |||
− | |||
− | |||
− | |||
− | | | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | | | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
|-
| Value
| {{Type|Optional}} X
| Only present if ID is 0.
|}
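
A reader for this structure might look like the following Java sketch. Everything named here (<code>IdOr</code>, its fields, <code>read</code>, and the <code>readInline</code> callback that decodes the type X) is a hypothetical illustration and not taken from the Notchian implementation; input is assumed to come from a <code>java.nio.ByteBuffer</code>.

```java
import java.nio.ByteBuffer;
import java.util.function.Function;

/** Illustrative holder for a decoded "ID or X" field. */
final class IdOr<X> {
    final int registryId; // -1 when the value was sent inline
    final X inlineValue;  // null when a registry ID was sent

    private IdOr(int registryId, X inlineValue) {
        this.registryId = registryId;
        this.inlineValue = inlineValue;
    }

    /** Reads the ID, then either the inline value (ID 0) or nothing (ID is registry ID + 1). */
    static <X> IdOr<X> read(ByteBuffer in, Function<ByteBuffer, X> readInline) {
        int id = readVarInt(in);
        if (id < 0) {
            throw new IllegalArgumentException("negative ID");
        }
        if (id == 0) {
            return new IdOr<>(-1, readInline.apply(in));
        }
        return new IdOr<>(id - 1, null);
    }

    /** Reads a VarInt of at most 5 bytes. */
    static int readVarInt(ByteBuffer in) {
        int value = 0;
        for (int position = 0; position < 32; position += 7) {
            byte b = in.get();
            value |= (b & 0x7F) << position;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new IllegalArgumentException("VarInt is too big");
    }
}
```

A wire value of 5 therefore refers to registry ID 4, while a wire value of 0 is followed by the inline data.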
<noinclude>=== ID Set ===</noinclude><includeonly>==== ID Set ====</includeonly>

Represents a set of IDs in a certain registry (implied by context), either directly (enumerated IDs) or indirectly (tag name).

{| class="wikitable"
! Field Name
! Field Type
! Meaning
|-
| Type
| {{Type|VarInt}}
| Value used to determine the data that follows. It can be either:
* 0 - Represents a named set of IDs defined by a tag.
* Anything else - Represents an ad-hoc set of IDs enumerated inline.
|-
| Tag Name
| {{Type|Optional}} {{Type|Identifier}}
| The registry tag defining the ID set. Only present if Type is 0.
|-
| IDs
| {{Type|Optional}} {{Type|Array}} of {{Type|VarInt}}
| An array of registry IDs. Only present if Type is not 0.<br>The size of the array is equal to <code>Type - 1</code>.
|}
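
The branching on Type can be sketched in Java as below. The class <code>IdSet</code>, its fields, and its helper methods are hypothetical names chosen for illustration; the sketch assumes the input is a <code>java.nio.ByteBuffer</code> and that the tag name is an Identifier encoded as a VarInt-prefixed UTF-8 String, as described in the Definitions table.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative decoded form of an ID Set. */
final class IdSet {
    final String tagName; // non-null only when Type is 0
    final int[] ids;      // non-null only when Type is not 0

    private IdSet(String tagName, int[] ids) {
        this.tagName = tagName;
        this.ids = ids;
    }

    static IdSet read(ByteBuffer in) {
        int type = readVarInt(in);
        if (type < 0) {
            throw new IllegalArgumentException("negative Type");
        }
        if (type == 0) {
            // Tag reference: an Identifier, sent as a length-prefixed UTF-8 string.
            byte[] utf8 = new byte[readVarInt(in)];
            in.get(utf8);
            return new IdSet(new String(utf8, StandardCharsets.UTF_8), null);
        }
        // Inline list: Type - 1 registry IDs follow.
        int[] ids = new int[type - 1];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = readVarInt(in);
        }
        return new IdSet(null, ids);
    }

    /** Reads a VarInt of at most 5 bytes. */
    static int readVarInt(ByteBuffer in) {
        int value = 0;
        for (int position = 0; position < 32; position += 7) {
            byte b = in.get();
            value |= (b & 0x7F) << position;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new IllegalArgumentException("VarInt is too big");
    }
}
```

Note that a Type of 1 denotes an empty inline set, since the array size is <code>Type - 1</code>.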
<noinclude>== Registry data ==</noinclude><includeonly>=== Registry data ===</includeonly>

These types are commonly used in conjunction with {{Type|ID or}} X to specify custom data inline.

<noinclude>=== Sound Event ===</noinclude><includeonly>==== Sound Event ====</includeonly>

Describes a sound that can be played.

{| class="wikitable"
! Name
! Type
! Description
|-
| Sound Name
| {{Type|Identifier}}
|
|-
| Has Fixed Range
| {{Type|Boolean}}
| Whether this sound has a fixed range, as opposed to a variable volume based on distance.
|-
| Fixed Range
| {{Type|Optional}} {{Type|Float}}
| The maximum range of the sound. Only present if Has Fixed Range is true.
|}
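
Reading the three fields in order could look like this Java sketch. The class <code>SoundEvent</code>, its field names, and the helper methods are hypothetical; the sketch assumes a <code>java.nio.ByteBuffer</code> as input (big-endian by default, matching the protocol) and an Identifier encoded as a VarInt-prefixed UTF-8 String.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative decoded form of a Sound Event. */
final class SoundEvent {
    final String soundName;
    final Float fixedRange; // null when Has Fixed Range is false

    private SoundEvent(String soundName, Float fixedRange) {
        this.soundName = soundName;
        this.fixedRange = fixedRange;
    }

    static SoundEvent read(ByteBuffer in) {
        // Sound Name: Identifier, sent as a length-prefixed UTF-8 string.
        byte[] utf8 = new byte[readVarInt(in)];
        in.get(utf8);
        String name = new String(utf8, StandardCharsets.UTF_8);
        // Has Fixed Range: Boolean, 0x01 for true and 0x00 for false.
        boolean hasFixedRange = in.get() != 0;
        // Fixed Range: big-endian Float, present only when the flag is true.
        Float range = null;
        if (hasFixedRange) {
            range = in.getFloat();
        }
        return new SoundEvent(name, range);
    }

    /** Reads a VarInt of at most 5 bytes. */
    static int readVarInt(ByteBuffer in) {
        int value = 0;
        for (int position = 0; position < 32; position += 7) {
            byte b = in.get();
            value |= (b & 0x7F) << position;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new IllegalArgumentException("VarInt is too big");
    }
}
```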
<noinclude> | <noinclude> |
Latest revision as of 09:26, 21 November 2024
This article defines the data types used in the protocol. All data sent over the network (except for VarInt and VarLong) is big-endian, that is, the bytes are sent from most significant byte to least significant byte. Most everyday computers are little-endian, so it may be necessary to change the endianness before sending data over the network.
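
As a small illustration of big-endian byte order, the following Java sketch encodes and decodes an Int with <code>java.nio.ByteBuffer</code>, whose byte order is big-endian by default (it is set explicitly here for clarity). The class and method names are illustrative only.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative big-endian encoding of a 32-bit Int. */
final class BigEndianExample {
    /** Returns the 4 bytes of value, most significant byte first. */
    static byte[] encodeInt(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt(value).array();
    }

    /** Reads a big-endian Int from the first 4 bytes of the array. */
    static int decodeInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).getInt();
    }
}
```

For instance, <code>encodeInt(0x12345678)</code> produces the bytes <code>12 34 56 78</code> in that order.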
Definitions
Name | Size (bytes) | Encodes | Notes |
---|---|---|---|
Boolean | 1 | Either false or true | True is encoded as 0x01, false as 0x00. |
Byte | 1 | An integer between -128 and 127 | Signed 8-bit integer, two's complement |
Unsigned Byte | 1 | An integer between 0 and 255 | Unsigned 8-bit integer |
Short | 2 | An integer between -32768 and 32767 | Signed 16-bit integer, two's complement |
Unsigned Short | 2 | An integer between 0 and 65535 | Unsigned 16-bit integer |
Int | 4 | An integer between -2147483648 and 2147483647 | Signed 32-bit integer, two's complement |
Long | 8 | An integer between -9223372036854775808 and 9223372036854775807 | Signed 64-bit integer, two's complement |
Float | 4 | A single-precision 32-bit IEEE 754 floating point number | |
Double | 8 | A double-precision 64-bit IEEE 754 floating point number | |
String (n) | ≥ 1, ≤ (n×3) + 3 | A sequence of Unicode scalar values | UTF-8 string prefixed with its size in bytes as a VarInt. Maximum length of n characters, which varies by context. The encoding used on the wire is regular UTF-8, not Java's "slight modification". However, the length of the string for purposes of the length limit is its number of UTF-16 code units, that is, scalar values > U+FFFF are counted as two. Up to n × 3 bytes can be used to encode a UTF-8 string comprising n code units when converted to UTF-16, and both of those limits are checked. Maximum n value is 32767. The + 3 is due to the max size of a valid length VarInt. |
Text Component | Varies | See Text formatting#Text components | Encoded as an NBT Tag, with the type of tag used depending on the case: a String Tag for components only containing text (no styling, no events etc.), and a Compound Tag in every other case. |
JSON Text Component | ≥ 1, ≤ (262144×3) + 3 | See Text formatting#Text components | The maximum permitted length when decoding is 262144, but the Notchian server since 1.20.3 refuses to encode strings longer than 32767. This may be a bug. |
Identifier | ≥ 1, ≤ (32767×3) + 3 | See Identifier below | Encoded as a String with max length of 32767. |
VarInt | ≥ 1, ≤ 5 | An integer between -2147483648 and 2147483647 | Variable-length data encoding a two's complement signed 32-bit integer; more info in their section |
VarLong | ≥ 1, ≤ 10 | An integer between -9223372036854775808 and 9223372036854775807 | Variable-length data encoding a two's complement signed 64-bit integer; more info in their section |
Entity Metadata | Varies | Miscellaneous information about an entity | See Entity_metadata#Entity Metadata Format |
Slot | Varies | An item stack in an inventory or container | See Slot Data |
NBT | Varies | Depends on context | See NBT |
Position | 8 | An integer/block position: x (-33554432 to 33554431), z (-33554432 to 33554431), y (-2048 to 2047) | x as a 26-bit integer, followed by z as a 26-bit integer, followed by y as a 12-bit integer (all signed, two's complement). See also the section below. |
Angle | 1 | A rotation angle in steps of 1/256 of a full turn | Whether or not this is signed does not matter, since the resulting angles are the same. |
UUID | 16 | A UUID | Encoded as an unsigned 128-bit integer (or two unsigned 64-bit integers: the most significant 64 bits and then the least significant 64 bits) |
BitSet | Varies | See #BitSet below | A length-prefixed bit set. |
Fixed BitSet (n) | ceil(n / 8) | See #Fixed BitSet below | A bit set with a fixed length of n bits. |
Optional X | 0 or size of X | A field of type X, or nothing | Whether or not the field is present must be known from the context. |
Array of X | count times size of X | Zero or more fields of type X | The count must be known from the context. |
X Enum | size of X | A specific value from a given list | The list of possible values and how each is encoded as an X must be known from the context. An invalid value sent by either side will usually result in the client being disconnected with an error or even crashing. |
Byte Array | Varies | Depends on context | This is just a sequence of zero or more bytes, its meaning should be explained somewhere else, e.g. in the packet description. The length must also be known from the context. |
ID or X | size of VarInt + (size of X or 0) | See #ID or X below | Either a registry ID or an inline data definition of type X. |
ID Set | Varies | See #ID Set below | Set of registry IDs specified either inline or as a reference to a tag. |
Sound Event | Varies | See #Sound Event below | Parameters for a sound event. |
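
To make the String (n) row more concrete, here is a hedged Java sketch of an encoder that applies both length limits described in the table. The names StringCodec and encodeString are assumptions made for this example. Java's String.length() already counts UTF-16 code units, which is the unit the limit is defined in.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Illustrative encoder for the String (n) type. */
final class StringCodec {
    /** Encodes s as a VarInt byte length followed by regular UTF-8 bytes. */
    static byte[] encodeString(String s, int maxLength) {
        // Limit 1: number of UTF-16 code units.
        if (s.length() > maxLength) {
            throw new IllegalArgumentException("string longer than " + maxLength + " code units");
        }
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        // Limit 2: number of UTF-8 bytes. Redundant when encoding from a Java String,
        // but kept to mirror the two checks described in the table.
        if (utf8.length > maxLength * 3) {
            throw new IllegalArgumentException("string longer than " + (maxLength * 3) + " bytes");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int length = utf8.length;
        while ((length & ~0x7F) != 0) { // VarInt prefix, 7 bits per byte
            out.write((length & 0x7F) | 0x80);
            length >>>= 7;
        }
        out.write(length);
        out.write(utf8, 0, utf8.length);
        return out.toByteArray();
    }
}
```

For example, encodeString("hi", 16) yields the three bytes 02 68 69.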
Identifier
Identifiers are a namespaced location, in the form of minecraft:thing. If the namespace is not provided, it defaults to minecraft (i.e. thing is minecraft:thing). Custom content should always be in its own namespace, not the default one. Both the namespace and value can use all lowercase alphanumeric characters (a-z and 0-9), dot (.), dash (-), and underscore (_). In addition, values can use slash (/). The naming convention is lower_case_with_underscores. More information.
For ease of determining whether a namespace or value is valid, here are regular expression character classes matching the characters allowed in each (the dash is placed last so that it is not read as a range):
- Namespace: [a-z0-9._-]
- Value: [a-z0-9._/-]
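
The rules above can be sketched as a small Java validator. The class name Identifiers and the method normalize are hypothetical, and the sketch makes one assumption that the text does not state: an empty namespace or value is rejected. The dash is written last in each character class so that it is treated literally.

```java
import java.util.regex.Pattern;

/** Illustrative parser and validator for Identifiers. */
final class Identifiers {
    private static final Pattern NAMESPACE = Pattern.compile("[a-z0-9._-]+");
    private static final Pattern VALUE = Pattern.compile("[a-z0-9._/-]+");

    /** Returns the identifier in namespace:value form, defaulting the namespace to minecraft. */
    static String normalize(String id) {
        int colon = id.indexOf(':');
        String namespace = colon < 0 ? "minecraft" : id.substring(0, colon);
        String value = colon < 0 ? id : id.substring(colon + 1);
        // Assumption of this sketch: empty parts are invalid (the + quantifier).
        if (!NAMESPACE.matcher(namespace).matches() || !VALUE.matcher(value).matches()) {
            throw new IllegalArgumentException("invalid identifier: " + id);
        }
        return namespace + ":" + value;
    }
}
```

With this sketch, normalize("thing") returns "minecraft:thing", while an identifier containing uppercase letters is rejected.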
VarInt and VarLong
Variable-length format such that smaller numbers use fewer bytes. These are very similar to Protocol Buffer Varints: the 7 least significant bits are used to encode the value and the most significant bit indicates whether there's another byte after it for the next part of the number. The least significant group is written first, followed by each of the more significant groups; thus, VarInts are effectively little endian (however, groups are 7 bits, not 8).
VarInts are never longer than 5 bytes, and VarLongs are never longer than 10 bytes. Within these limits, unnecessarily long encodings (e.g. 81 00 to encode 1) are allowed.
Pseudocode to read and write VarInts and VarLongs:
    private static final int SEGMENT_BITS = 0x7F;
    private static final int CONTINUE_BIT = 0x80;

    public int readVarInt() {
        int value = 0;
        int position = 0;
        byte currentByte;

        while (true) {
            currentByte = readByte();
            value |= (currentByte & SEGMENT_BITS) << position;

            if ((currentByte & CONTINUE_BIT) == 0) break;

            position += 7;

            if (position >= 32) throw new RuntimeException("VarInt is too big");
        }

        return value;
    }

    public long readVarLong() {
        long value = 0;
        int position = 0;
        byte currentByte;

        while (true) {
            currentByte = readByte();
            value |= (long) (currentByte & SEGMENT_BITS) << position;

            if ((currentByte & CONTINUE_BIT) == 0) break;

            position += 7;

            if (position >= 64) throw new RuntimeException("VarLong is too big");
        }

        return value;
    }

    public void writeVarInt(int value) {
        while (true) {
            if ((value & ~SEGMENT_BITS) == 0) {
                writeByte(value);
                return;
            }

            writeByte((value & SEGMENT_BITS) | CONTINUE_BIT);

            // Note: >>> means that the sign bit is shifted with the rest of the number rather than being left alone
            value >>>= 7;
        }
    }

    public void writeVarLong(long value) {
        while (true) {
            if ((value & ~((long) SEGMENT_BITS)) == 0) {
                writeByte(value);
                return;
            }

            writeByte((value & SEGMENT_BITS) | CONTINUE_BIT);

            // Note: >>> means that the sign bit is shifted with the rest of the number rather than being left alone
            value >>>= 7;
        }
    }
Note that Minecraft's VarInts are identical to LEB128, with the slight difference that an exception is thrown if the encoding exceeds a set number of bytes.
Note that Minecraft's VarInts are not encoded using Protocol Buffers; the format is merely similar. If you use a Protocol Buffers Varint implementation to read or write Minecraft's VarInts, you'll get incorrect results in some cases. The major differences:
- Minecraft's VarInts are all signed, but do not use the ZigZag encoding. Protocol Buffers have 3 types of Varints: uint32 (normal encoding, unsigned), sint32 (ZigZag encoding, signed), and int32 (normal encoding, signed). Minecraft's are the int32 variety. Because Minecraft uses the normal encoding instead of ZigZag encoding, negative values always use the maximum number of bytes.
- Minecraft's VarInts are never longer than 5 bytes and its VarLongs are never longer than 10 bytes, while Protocol Buffer Varints always use 10 bytes when encoding negative numbers, even for an int32.
Sample VarInts:
Value | Hex bytes | Decimal bytes |
---|---|---|
0 | 0x00 | 0 |
1 | 0x01 | 1 |
2 | 0x02 | 2 |
127 | 0x7f | 127 |
128 | 0x80 0x01 | 128 1 |
255 | 0xff 0x01 | 255 1 |
25565 | 0xdd 0xc7 0x01 | 221 199 1 |
2097151 | 0xff 0xff 0x7f | 255 255 127 |
2147483647 | 0xff 0xff 0xff 0xff 0x07 | 255 255 255 255 7 |
-1 | 0xff 0xff 0xff 0xff 0x0f | 255 255 255 255 15 |
-2147483648 | 0x80 0x80 0x80 0x80 0x08 | 128 128 128 128 8 |
Sample VarLongs:
Value | Hex bytes | Decimal bytes |
---|---|---|
0 | 0x00 | 0 |
1 | 0x01 | 1 |
2 | 0x02 | 2 |
127 | 0x7f | 127 |
128 | 0x80 0x01 | 128 1 |
255 | 0xff 0x01 | 255 1 |
2147483647 | 0xff 0xff 0xff 0xff 0x07 | 255 255 255 255 7 |
9223372036854775807 | 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0x7f | 255 255 255 255 255 255 255 255 127 |
-1 | 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0x01 | 255 255 255 255 255 255 255 255 255 1 |
-2147483648 | 0x80 0x80 0x80 0x80 0xf8 0xff 0xff 0xff 0xff 0x01 | 128 128 128 128 248 255 255 255 255 1 |
-9223372036854775808 | 0x80 0x80 0x80 0x80 0x80 0x80 0x80 0x80 0x80 0x01 | 128 128 128 128 128 128 128 128 128 1 |
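The sample values above can be reproduced with a self-contained variant of the pseudocode. This is a minimal sketch, not the game's implementation: it writes into a byte array, and the class name VarNumCodec and its helpers (including toHex) are hypothetical.

```java
import java.io.ByteArrayOutputStream;

final class VarNumCodec {
    private static final int SEGMENT_BITS = 0x7F;
    private static final int CONTINUE_BIT = 0x80;

    static byte[] encodeVarInt(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Emit 7 bits at a time, least significant group first.
        while ((value & ~SEGMENT_BITS) != 0) {
            out.write((value & SEGMENT_BITS) | CONTINUE_BIT);
            value >>>= 7; // unsigned shift, so negative values terminate
        }
        out.write(value);
        return out.toByteArray();
    }

    static byte[] encodeVarLong(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~((long) SEGMENT_BITS)) != 0) {
            out.write((int) (value & SEGMENT_BITS) | CONTINUE_BIT);
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    static int decodeVarInt(byte[] data) {
        int value = 0;
        int position = 0;
        for (byte b : data) {
            value |= (b & SEGMENT_BITS) << position;
            if ((b & CONTINUE_BIT) == 0) return value;
            position += 7;
            if (position >= 32) throw new IllegalArgumentException("VarInt is too big");
        }
        throw new IllegalArgumentException("VarInt is truncated");
    }

    // Formats bytes in the same style as the "Hex bytes" column.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("0x%02x", b & 0xFF));
        }
        return sb.toString();
    }
}
```

For example, toHex(encodeVarInt(25565)) yields 0xdd 0xc7 0x01, matching the table row for 25565.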
Position
Note: What you are seeing here is the latest version of the Data types article, but the position type was different before 1.14.
64-bit value split into three signed integer parts:
- x: 26 MSBs
- z: 26 middle bits
- y: 12 LSBs
For example, a 64-bit position can be broken down as follows:
Example value (big endian): 01000110000001110110001100 10110000010101101101001000 001100111111
- The first group (26 bits) is the X coordinate, which is 18357644 in this example.
- The second group (26 bits) is the Z coordinate, which is -20882616 in this example.
- The third group (12 bits) is the Y coordinate, which is 831 in this example.
Encoded as follows:
((x & 0x3FFFFFF) << 38) | ((z & 0x3FFFFFF) << 12) | (y & 0xFFF)
And decoded as:
val = read_long();
x = val >> 38;
y = val << 52 >> 52;
z = val << 26 >> 38;
Note: The above assumes that the right shift operator sign extends the value (this is called an arithmetic shift), so that the signedness of the coordinates is preserved. In many languages, this requires the integer type of val to be signed. In the absence of such an operator, the following may be useful:
if x >= 1 << 25 { x -= 1 << 26 }
if y >= 1 << 11 { y -= 1 << 12 }
if z >= 1 << 25 { z -= 1 << 26 }
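The packing and unpacking expressions above can be sketched in Java, where >> on a long is an arithmetic shift, so no extra sign correction is needed. The class name BlockPosition and its members are hypothetical and only serve the illustration.

```java
final class BlockPosition {
    final int x;
    final int y;
    final int z;

    BlockPosition(int x, int y, int z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    // Packs x into the 26 MSBs, z into the 26 middle bits and y into the 12 LSBs.
    long encode() {
        return ((long) (x & 0x3FFFFFF) << 38)
             | ((long) (z & 0x3FFFFFF) << 12)
             | (long) (y & 0xFFF);
    }

    // Shifting left first moves each field's top bit into the sign position,
    // so the following arithmetic right shift restores the field's sign.
    static BlockPosition decode(long val) {
        int x = (int) (val >> 38);
        int y = (int) (val << 52 >> 52);
        int z = (int) (val << 26 >> 38);
        return new BlockPosition(x, y, z);
    }
}
```

Decoding the example value from this section gives x = 18357644, z = -20882616 and y = 831, and encoding those coordinates gives the same 64 bits back.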
Fixed-point numbers
Some fields may be stored as fixed-point numbers, where a certain number of bits represent the signed integer part (number to the left of the decimal point) and the rest represent the fractional part (to the right). Floating point numbers (float and double), in contrast, keep the number itself (mantissa) in one chunk, while the location of the decimal point (exponent) is stored beside it. Essentially, while fixed-point numbers have lower range than floating point numbers, their fractional precision is greater for higher values.
Prior to version 1.9 a fixed-point format with 5 fraction bits and 27 integer bits was used to send entity positions to the client. Some uses of fixed point remain in modern versions, but they differ from that format.
Most programming languages lack support for fractional integers directly, but you can represent them as integers. The following C or Java-like pseudocode converts a double to a fixed-point integer with n fraction bits:
x_fixed = (int)(x_double * (1 << n));
And back again:
x_double = (double)x_fixed / (1 << n);
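A small Java sketch of the two conversions follows, with the number of fraction bits n passed as a parameter. The class and method names are hypothetical. Note that the cast to int truncates toward zero, so values smaller than one step (1 / 2^n) become 0.

```java
final class FixedPoint {
    // Converts a double to a fixed-point integer with n fraction bits.
    static int toFixed(double x, int n) {
        return (int) (x * (1 << n));
    }

    // Converts a fixed-point integer with n fraction bits back to a double.
    static double fromFixed(int fixed, int n) {
        return (double) fixed / (1 << n);
    }
}
```

With 5 fraction bits, 1.5 is stored as 48 (1.5 * 32), and 48 converts back to 1.5.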
Bit sets
The types BitSet and Fixed BitSet represent packed lists of bits. The Notchian implementation uses Java's BitSet class.
BitSet
Bit sets of type BitSet are prefixed by their length in longs.
Field Name | Field Type | Meaning |
---|---|---|
Length | VarInt | Number of longs in the following array. May be 0 (if no bits are set). |
Data | Array of Long | A packed representation of the bit set as created by BitSet.toLongArray. |
The ith bit is set when (Data[i / 64] & (1 << (i % 64))) != 0, where i starts at 0.
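The bit test can be sketched in Java as shown below. The class name and the bounds check are assumptions made for the example; the mask uses 1L because in Java an int shift only uses the low 5 bits of the shift distance, which would give wrong results for bit positions 32 to 63 within a long.

```java
import java.util.BitSet;

final class BitSetCheck {
    // Produces the Data array as described above, via BitSet.toLongArray.
    static long[] toData(BitSet bits) {
        return bits.toLongArray();
    }

    // Tests bit i of the packed long array; bits beyond the array are unset.
    static boolean isSet(long[] data, int i) {
        int word = i / 64;
        if (word >= data.length) return false;
        return (data[word] & (1L << (i % 64))) != 0;
    }
}
```

A bit set with only bits 0 and 65 set is packed into two longs, so its Length field would be 2.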
Fixed BitSet
Bit sets of type Fixed BitSet (n) have a fixed length of n bits, encoded as ceil(n / 8) bytes. Note that this is different from BitSet, which uses longs.
Field Name | Field Type | Meaning |
---|---|---|
Data | Byte Array (n) | A packed representation of the bit set as created by BitSet.toByteArray, padded with zeroes at the end to fit the specified length. |
The ith bit is set when (Data[i / 8] & (1 << (i % 8))) != 0, where i starts at 0. This encoding is not equivalent to the long array in BitSet.
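As a sketch of the padding rule, the following Java code builds the ceil(n / 8)-byte array from a java.util.BitSet and tests individual bits. The class name, method names and the range check are hypothetical choices for the example.

```java
import java.util.Arrays;
import java.util.BitSet;

final class FixedBitSetCodec {
    // Encodes the bit set into exactly ceil(n / 8) bytes, zero-padded at the end.
    static byte[] encode(BitSet bits, int n) {
        if (bits.length() > n) {
            throw new IllegalArgumentException("bit set does not fit in " + n + " bits");
        }
        // toByteArray omits trailing zero bytes; copyOf pads them back with zeroes.
        return Arrays.copyOf(bits.toByteArray(), (n + 7) / 8);
    }

    // Tests bit i of the packed byte array.
    static boolean isSet(byte[] data, int i) {
        return (data[i / 8] & (1 << (i % 8))) != 0;
    }
}
```

For n = 20 with bits 0 and 9 set, the result is the three bytes 0x01 0x02 0x00.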
Registry references
ID or X
Represents a data record of type X, either inline, or by reference to a registry implied by context.
Field Name | Field Type | Meaning |
---|---|---|
ID | VarInt | 0 if value of type X is given inline; otherwise registry ID + 1. |
Value | Optional X | Only present if ID is 0. |
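Reading such a field might look like the following sketch. It assumes the data is available in a java.nio.ByteBuffer; the class name, the registryLookup callback (mapping a registry ID to a value) and the inlineReader callback (reading an inline X) are hypothetical and stand in for whatever the surrounding context provides.

```java
import java.nio.ByteBuffer;
import java.util.function.Function;
import java.util.function.IntFunction;

final class IdOrReader {
    // Reads a VarInt, 7 bits per byte, least significant group first.
    static int readVarInt(ByteBuffer buf) {
        int value = 0;
        int position = 0;
        while (true) {
            byte current = buf.get();
            value |= (current & 0x7F) << position;
            if ((current & 0x80) == 0) return value;
            position += 7;
            if (position >= 32) throw new IllegalStateException("VarInt is too big");
        }
    }

    // ID 0 means the value follows inline; any other ID is registry ID + 1.
    static <T> T readIdOr(ByteBuffer buf,
                          IntFunction<T> registryLookup,
                          Function<ByteBuffer, T> inlineReader) {
        int id = readVarInt(buf);
        if (id == 0) return inlineReader.apply(buf);
        return registryLookup.apply(id - 1);
    }
}
```

An ID field of 3 therefore refers to registry entry 2, while an ID field of 0 hands the remaining bytes to the inline reader.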
ID Set
Represents a set of IDs in a certain registry (implied by context), either directly (enumerated IDs) or indirectly (tag name).
Field Name | Field Type | Meaning |
---|---|---|
Type | VarInt | Value used to determine the data that follows. It can be either 0 (the set is named by a tag) or any other value (the IDs are enumerated inline). |
Tag Name | Optional Identifier | The registry tag defining the ID set. Only present if Type is 0. |
IDs | Optional Array of VarInt | An array of registry IDs. Only present if Type is not 0. The size of the array is equal to Type - 1. |
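A reader for this structure could be sketched as follows, again assuming a java.nio.ByteBuffer as the data source. The class IdSet and the identifierReader callback (which reads an Identifier from the buffer) are hypothetical; how an Identifier is decoded is left to the caller.

```java
import java.nio.ByteBuffer;
import java.util.function.Function;

final class IdSet {
    final String tagName; // non-null only when Type was 0
    final int[] ids;      // non-null only when Type was not 0

    private IdSet(String tagName, int[] ids) {
        this.tagName = tagName;
        this.ids = ids;
    }

    static int readVarInt(ByteBuffer buf) {
        int value = 0;
        int position = 0;
        while (true) {
            byte current = buf.get();
            value |= (current & 0x7F) << position;
            if ((current & 0x80) == 0) return value;
            position += 7;
            if (position >= 32) throw new IllegalStateException("VarInt is too big");
        }
    }

    static IdSet read(ByteBuffer buf, Function<ByteBuffer, String> identifierReader) {
        int type = readVarInt(buf);
        if (type < 0) throw new IllegalStateException("negative Type");
        if (type == 0) {
            // Indirect form: the set is named by a registry tag.
            return new IdSet(identifierReader.apply(buf), null);
        }
        // Direct form: Type - 1 registry IDs follow.
        int[] ids = new int[type - 1];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = readVarInt(buf);
        }
        return new IdSet(null, ids);
    }
}
```

A Type of 3 followed by the VarInts 5 and 128 yields the enumerated set {5, 128}; a Type of 1 yields an empty enumerated set.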
Registry data
These types are commonly used in conjunction with ID or X to specify custom data inline.
Sound Event
Describes a sound that can be played.
Name | Type | Description |
---|---|---|
Sound Name | Identifier | |
Has Fixed Range | Boolean | Whether this sound has a fixed range, as opposed to a variable volume based on distance. |
Fixed Range | Optional Float | The maximum range of the sound. Only present if Has Fixed Range is true. |