Difference between revisions of "VarInt And VarLong"

From wiki.vg
m (Made my previous code more readable and added the error cases)
(Undo revision 16775 by Pv6q (talk) - Reverted because it unnecessarily removed a helpful comment and needlessly changed variable names. The old methods have worked fine for literally years, there's no point in changing something that isn't broken.)
Tag: Undo
Line 6:
 <syntaxhighlight lang="java">
-public int readVarInt() {
-    int decodedInt = 0;
-    int bitOffset = 0;
-    byte currentByte;
+public static int readVarInt() {
+    int numRead = 0;
+    int result = 0;
+    byte read;
     do {
-        currentByte = readByte();
-        decodedInt |= (currentByte & 0b01111111) << bitOffset;
+        read = readByte();
+        int value = (read & 0b01111111);
+        result |= (value << (7 * numRead));
 
-        if (bitOffset == 35) throw new RuntimeException("VarInt is too big");
+        numRead++;
+        if (numRead > 5) {
+            throw new RuntimeException("VarInt is too big");
+        }
+    } while ((read & 0b10000000) != 0);
 
-        bitOffset += 7;
-    } while ((currentByte & 0b10000000) != 0);
-
-    return decodedInt;
+    return result;
 }
 </syntaxhighlight>
 <syntaxhighlight lang="java">
-public long readVarLong() {
-    int decodedLong = 0;
-    int bitOffset = 0;
-    byte currentByte;
+public static long readVarLong() {
+    int numRead = 0;
+    long result = 0;
+    byte read;
     do {
-        currentByte = readByte();
-        decodedLong |= (currentByte & 0b01111111) << bitOffset;
+        read = readByte();
+        long value = (read & 0b01111111);
+        result |= (value << (7 * numRead));
 
-        if (bitOffset == 70) throw new RuntimeException("VarLong is too big");
+        numRead++;
+        if (numRead > 10) {
+            throw new RuntimeException("VarLong is too big");
+        }
+    } while ((read & 0b10000000) != 0);
 
-        bitOffset += 7;
-    } while ((currentByte & 0b10000000) != 0);
-
-    return decodedLong;
+    return result;
 }
 </syntaxhighlight>
 <syntaxhighlight lang="java">
-public void writeVarInt(int value) {
+public static void writeVarInt(int value) {
     do {
-        byte currentByte = (byte) (value & 0b01111111);
-
+        byte temp = (byte)(value & 0b01111111);
+        // Note: >>> means that the sign bit is shifted with the rest of the number rather than being left alone
         value >>>= 7;
-        if (value != 0) currentByte |= 0b10000000;
-
-        writeByte(currentByte);
+        if (value != 0) {
+            temp |= 0b10000000;
+        }
+        writeByte(temp);
     } while (value != 0);
 }
 </syntaxhighlight>
 <syntaxhighlight lang="java">
-public void writeVarLong(long value) {
+public static void writeVarLong(long value) {
     do {
-        byte currentByte = (byte) (value & 0b01111111);
-
+        byte temp = (byte)(value & 0b01111111);
+        // Note: >>> means that the sign bit is shifted with the rest of the number rather than being left alone
         value >>>= 7;
-        if (value != 0) currentByte |= 0b10000000;
-
-        writeByte(currentByte);
+        if (value != 0) {
+            temp |= 0b10000000;
+        }
+        writeByte(temp);
     } while (value != 0);
 }

Revision as of 20:25, 26 June 2021

VarInt and VarLong are variable-length formats in which smaller numbers use fewer bytes. They are very similar to Protocol Buffer Varints: in each byte, the 7 least significant bits encode part of the value, and the most significant bit indicates whether another byte follows with the next part of the number. The least significant group is written first, followed by each of the more significant groups; thus, VarInts are effectively little-endian (although the groups are 7 bits, not 8).

VarInts are never longer than 5 bytes, and VarLongs are never longer than 10 bytes.
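The 5-byte and 10-byte limits follow from the group size: a 32-bit value needs at most ceil(32 / 7) = 5 groups, and a 64-bit value at most ceil(64 / 7) = 10. As a rough sketch, the encoded length can be computed without encoding anything. The class and method names here (`VarIntSize`, `varIntSize`, `varLongSize`) are hypothetical and not part of any protocol library.

```java
final class VarIntSize {
    // Number of bytes the VarInt encoding of value occupies (1 to 5).
    static int varIntSize(int value) {
        // A negative int has its top bit set, so all 32 bits count as significant.
        int significantBits = 32 - Integer.numberOfLeadingZeros(value);
        // Zero still takes one byte; otherwise round up to whole 7-bit groups.
        return Math.max(1, (significantBits + 6) / 7);
    }

    // Number of bytes the VarLong encoding of value occupies (1 to 10).
    static int varLongSize(long value) {
        int significantBits = 64 - Long.numberOfLeadingZeros(value);
        return Math.max(1, (significantBits + 6) / 7);
    }

    public static void main(String[] args) {
        System.out.println(varIntSize(127));              // 1
        System.out.println(varIntSize(128));              // 2
        System.out.println(varIntSize(-1));               // 5
        System.out.println(varLongSize(Long.MAX_VALUE));  // 9
        System.out.println(varLongSize(-1L));             // 10
    }
}
```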

Pseudocode to read and write VarInts and VarLongs:

public static int readVarInt() {
    int numRead = 0;
    int result = 0;
    byte read;
    do {
        read = readByte();
        int value = (read & 0b01111111);
        result |= (value << (7 * numRead));

        numRead++;
        if (numRead > 5) {
            throw new RuntimeException("VarInt is too big");
        }
    } while ((read & 0b10000000) != 0);

    return result;
}

public static long readVarLong() {
    int numRead = 0;
    long result = 0;
    byte read;
    do {
        read = readByte();
        long value = (read & 0b01111111);
        result |= (value << (7 * numRead));

        numRead++;
        if (numRead > 10) {
            throw new RuntimeException("VarLong is too big");
        }
    } while ((read & 0b10000000) != 0);

    return result;
}

public static void writeVarInt(int value) {
    do {
        byte temp = (byte)(value & 0b01111111);
        // Note: >>> means that the sign bit is shifted with the rest of the number rather than being left alone
        value >>>= 7;
        if (value != 0) {
            temp |= 0b10000000;
        }
        writeByte(temp);
    } while (value != 0);
}

public static void writeVarLong(long value) {
    do {
        byte temp = (byte)(value & 0b01111111);
        // Note: >>> means that the sign bit is shifted with the rest of the number rather than being left alone
        value >>>= 7;
        if (value != 0) {
            temp |= 0b10000000;
        }
        writeByte(temp);
    } while (value != 0);
}
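
The pseudocode above leaves `readByte()` and `writeByte()` undefined. One possible way to make it runnable is to back it with a `java.nio.ByteBuffer`, as in the sketch below. The class name `VarIntBuffers` and the choice of `ByteBuffer` are assumptions made for illustration; a real client or server would more likely read from its own network buffer type.

```java
import java.nio.ByteBuffer;

final class VarIntBuffers {
    static int readVarInt(ByteBuffer in) {
        int numRead = 0;
        int result = 0;
        byte read;
        do {
            read = in.get();
            int value = read & 0b01111111;       // low 7 bits carry data
            result |= value << (7 * numRead);
            numRead++;
            if (numRead > 5) {
                throw new RuntimeException("VarInt is too big");
            }
        } while ((read & 0b10000000) != 0);      // high bit set: another byte follows
        return result;
    }

    static long readVarLong(ByteBuffer in) {
        int numRead = 0;
        long result = 0;
        byte read;
        do {
            read = in.get();
            long value = read & 0b01111111;      // widen to long before shifting
            result |= value << (7 * numRead);
            numRead++;
            if (numRead > 10) {
                throw new RuntimeException("VarLong is too big");
            }
        } while ((read & 0b10000000) != 0);
        return result;
    }

    static void writeVarInt(ByteBuffer out, int value) {
        do {
            byte temp = (byte) (value & 0b01111111);
            value >>>= 7;                        // unsigned shift, so negative values terminate
            if (value != 0) {
                temp |= 0b10000000;
            }
            out.put(temp);
        } while (value != 0);
    }

    static void writeVarLong(ByteBuffer out, long value) {
        do {
            byte temp = (byte) (value & 0b01111111);
            value >>>= 7;
            if (value != 0) {
                temp |= 0b10000000;
            }
            out.put(temp);
        } while (value != 0);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(32);
        writeVarInt(buffer, 300);
        writeVarInt(buffer, -1);
        writeVarLong(buffer, Long.MIN_VALUE);
        buffer.flip();                           // switch from writing to reading

        System.out.println(readVarInt(buffer));  // 300
        System.out.println(readVarInt(buffer));  // -1
        System.out.println(readVarLong(buffer)); // -9223372036854775808
    }
}
```

`ByteBuffer.get()` throws an unchecked `BufferUnderflowException` when the input ends in the middle of a value, so truncated data fails loudly instead of returning a partial result.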

Note: Minecraft's VarInts are identical to LEB128, with the slight change that an exception is thrown if the value runs over a set number of bytes.

Note that Minecraft's VarInts are not encoded using Protocol Buffers; the format is merely similar. If you try to use Protocol Buffers Varints with Minecraft's VarInts, you'll get incorrect results in some cases. The major differences:

  • Minecraft's VarInts are all signed, but do not use the ZigZag encoding. Protocol buffers have 3 types of Varints: uint32 (normal encoding, unsigned), sint32 (ZigZag encoding, signed), and int32 (normal encoding, signed). Minecraft's are the int32 variety. Because Minecraft uses the normal encoding instead of ZigZag encoding, negative values always use the maximum number of bytes.
  • Minecraft's VarInts are never longer than 5 bytes and its VarLongs will never be longer than 10 bytes, while Protocol Buffer Varints will always use 10 bytes when encoding negative numbers, even if it's an int32.
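
To make the two differences listed above concrete, the following sketch counts how many bytes -1 occupies under each scheme. It only computes lengths and does not use the Protocol Buffers library; the class and method names are hypothetical, and the Protocol Buffers behaviour is modelled from its documented wire format (int32 sign-extended to 64 bits, sint32 ZigZag-mapped).

```java
final class VarIntVsProtobuf {
    // Byte count of a plain base-128 varint holding the given 64-bit pattern.
    static int varintLength(long bits) {
        int length = 1;
        while ((bits >>>= 7) != 0) {
            length++;
        }
        return length;
    }

    // Minecraft VarInt: the int is treated as a 32-bit pattern, so at most 5 bytes.
    static int minecraftVarIntLength(int value) {
        return varintLength(value & 0xFFFFFFFFL);
    }

    // Protocol Buffers int32: sign-extended to 64 bits before encoding.
    static int protobufInt32Length(int value) {
        return varintLength((long) value);
    }

    // ZigZag mapping used by Protocol Buffers sint32: 0, -1, 1, -2 become 0, 1, 2, 3.
    static int zigZag32(int value) {
        return (value << 1) ^ (value >> 31);
    }

    // Protocol Buffers sint32: ZigZag-mapped, then encoded as an unsigned varint.
    static int protobufSint32Length(int value) {
        return varintLength(zigZag32(value) & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        System.out.println(minecraftVarIntLength(-1)); // 5
        System.out.println(protobufInt32Length(-1));   // 10
        System.out.println(protobufSint32Length(-1));  // 1
    }
}
```

For non-negative values the Minecraft and int32 lengths agree, which is why a Protocol Buffers codec can appear to work until the first negative number shows up.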

Sample VarInts:

Value         Hex bytes                  Decimal bytes
0             0x00                       0
1             0x01                       1
2             0x02                       2
127           0x7f                       127
128           0x80 0x01                  128 1
255           0xff 0x01                  255 1
2097151       0xff 0xff 0x7f             255 255 127
2147483647    0xff 0xff 0xff 0xff 0x07   255 255 255 255 7
-1            0xff 0xff 0xff 0xff 0x0f   255 255 255 255 15
-2147483648   0x80 0x80 0x80 0x80 0x08   128 128 128 128 8
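
The sample rows can double as test vectors. The sketch below parses a "Hex bytes" cell as written in the table and decodes it, so an implementation can be checked against the listed values. `VarIntSamples`, `parseHexBytes`, and `decodeVarInt` are hypothetical names chosen for this example.

```java
final class VarIntSamples {
    // Parses a table cell such as "0x80 0x01" into raw bytes.
    static byte[] parseHexBytes(String cell) {
        String[] parts = cell.trim().split("\\s+");
        byte[] bytes = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // Drop the "0x" prefix and parse the remaining two hex digits.
            bytes[i] = (byte) Integer.parseInt(parts[i].substring(2), 16);
        }
        return bytes;
    }

    // Decodes one VarInt that starts at the beginning of the array.
    static int decodeVarInt(byte[] bytes) {
        int result = 0;
        for (int i = 0; i < bytes.length; i++) {
            if (i >= 5) {
                throw new IllegalArgumentException("VarInt is too big");
            }
            result |= (bytes[i] & 0x7F) << (7 * i);
            if ((bytes[i] & 0x80) == 0) {
                return result;                   // continuation bit clear: last group
            }
        }
        throw new IllegalArgumentException("VarInt is truncated");
    }

    public static void main(String[] args) {
        System.out.println(decodeVarInt(parseHexBytes("0x80 0x01")));                // 128
        System.out.println(decodeVarInt(parseHexBytes("0xff 0xff 0x7f")));           // 2097151
        System.out.println(decodeVarInt(parseHexBytes("0xff 0xff 0xff 0xff 0x0f"))); // -1
        System.out.println(decodeVarInt(parseHexBytes("0x80 0x80 0x80 0x80 0x08"))); // -2147483648
    }
}
```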

Sample VarLongs:

Value                  Hex bytes                                           Decimal bytes
0                      0x00                                                0
1                      0x01                                                1
2                      0x02                                                2
127                    0x7f                                                127
128                    0x80 0x01                                           128 1
255                    0xff 0x01                                           255 1
2147483647             0xff 0xff 0xff 0xff 0x07                            255 255 255 255 7
9223372036854775807    0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0x7f        255 255 255 255 255 255 255 255 127
-1                     0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0x01   255 255 255 255 255 255 255 255 255 1
-2147483648            0x80 0x80 0x80 0x80 0xf8 0xff 0xff 0xff 0xff 0x01   128 128 128 128 248 255 255 255 255 1
-9223372036854775808   0x80 0x80 0x80 0x80 0x80 0x80 0x80 0x80 0x80 0x01   128 128 128 128 128 128 128 128 128 1