I'm a bit confused about the name attached to tags such as TAG_String. I understand that it is used to identify each tag, but the article implies that not all tags carry a name: "Note that ONLY Named Tags carry the name and tagType data. Explicitly identified Tags (such as TAG_String above) only contains the payload."
Now, my question is, what is considered a named tag, and what is not?
- "Tagged" implies: tag type, name, payload. "Untagged" implies just the payload. If you had a TAG_List of TAG_Shorts, each element of the list is *just* the short payload, i.e. 2 bytes per element. As such, the children of TAG_Lists are unnamed. TAG_Lists define a "tagId" at the start of their payload so you know what kind of tag you're reading. Named tags exist in only two circumstances: 1) as children of a TAG_Compound, 2) as the root node. Barneygale 17:52, 25 November 2011 (MST)
- Thanks! Understood.
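The distinction described in the reply above can be sketched in Python. This is only an illustration, not a full NBT reader: it handles just TAG_Short, TAG_String, TAG_List and TAG_Compound, the function names `read_payload` and `read_named_tag` are made up for this example, and it decodes names as plain UTF-8 and assumes big-endian numbers.

```python
import struct

# Tag type IDs for the subset of tags this sketch handles.
TAG_END, TAG_SHORT, TAG_STRING, TAG_LIST, TAG_COMPOUND = 0, 2, 8, 9, 10


def read_payload(buf, pos, tag_type):
    """Read only the payload of a tag (no type byte, no name)."""
    if tag_type == TAG_SHORT:
        (value,) = struct.unpack_from(">h", buf, pos)
        return value, pos + 2
    if tag_type == TAG_STRING:
        (length,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        return buf[pos:pos + length].decode("utf-8"), pos + length
    if tag_type == TAG_LIST:
        # The element type and count are stored once, then bare payloads follow.
        elem_type, count = struct.unpack_from(">bi", buf, pos)
        pos += 5
        items = []
        for _ in range(count):
            item, pos = read_payload(buf, pos, elem_type)
            items.append(item)
        return items, pos
    if tag_type == TAG_COMPOUND:
        # Children of a compound are named tags, terminated by TAG_End.
        children = {}
        while True:
            name, value, pos, child_type = read_named_tag(buf, pos)
            if child_type == TAG_END:
                return children, pos
            children[name] = value
    raise ValueError(f"tag type {tag_type} is not handled in this sketch")


def read_named_tag(buf, pos):
    """Read a named tag: type byte, then name, then payload."""
    tag_type = buf[pos]
    pos += 1
    if tag_type == TAG_END:
        return None, None, pos, tag_type  # TAG_End has no name or payload
    name, pos = read_payload(buf, pos, TAG_STRING)
    value, pos = read_payload(buf, pos, tag_type)
    return name, value, pos, tag_type


# Hand-built example: a root compound "root" holding a list "nums" of 3 shorts.
data = (
    bytes([TAG_COMPOUND]) + b"\x00\x04root"
    + bytes([TAG_LIST]) + b"\x00\x04nums"
    + bytes([TAG_SHORT]) + struct.pack(">i", 3)
    + struct.pack(">3h", 1, 2, 3)  # 6 bytes total: no per-element type or name
    + bytes([TAG_END])
)
root_name, root_value, end_pos, _ = read_named_tag(data, 0)
print(root_name, root_value)
```

Note that the three shorts occupy exactly six bytes in `data`, while the list itself and the root compound each carry a type byte and a name.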
I have altered the bigtest.nbt example because the example readout did not correctly represent the UTF-8 encoding actually contained in the document.
The string was shown as "HELLO WORLD THIS IS A TEST STRING !" but is actually "HELLO WORLD THIS IS A TEST STRING ÅÄÖ!"
This can be verified by examining the bytes which make up the string:
72 69 76 76 79 32 87 79 82 76 68 32 84 72 73 83 32 73 83 32 65 32 84 69 83 84 32 83 84 82 73 78 71 32 195 133 195 132 195 150 33
I decoded them using a UTF-8 decoder in "Freeflow Numeric" mode.
~Drainedsoul 15:06, 29 November 2011 (PST)
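The byte check above can also be reproduced without an online decoder. A minimal Python sketch, using the decimal byte values exactly as listed in the comment (the variable names are just for illustration):

```python
# Decimal byte values copied from the comment above.
raw = bytes([
    72, 69, 76, 76, 79, 32, 87, 79, 82, 76, 68, 32, 84, 72, 73, 83, 32,
    73, 83, 32, 65, 32, 84, 69, 83, 84, 32, 83, 84, 82, 73, 78, 71, 32,
    195, 133, 195, 132, 195, 150, 33,
])

# Each of the pairs 195 133, 195 132 and 195 150 is one two-byte UTF-8 sequence.
text = raw.decode("utf-8")
print(text)
print(len(raw), "bytes,", len(text), "characters")
```

The byte count is larger than the character count because each of the three non-ASCII letters takes two bytes in UTF-8.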
The table in the specification section states that the length part of the header for string values is **little-endian**, whereas in the hello_world.nbt file it is clearly big-endian. I have not tested my code thoroughly, so I have not submitted an edit, but it seems to me that all numerical values, regardless of their role (either as payloads or as length values in headers), are encoded in network byte order (big-endian). --Homelessrobot (talk) 06:16, 30 May 2016 (UTC)
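To make the difference concrete, here is a small Python sketch of a length-prefixed string written and read with a big-endian unsigned short, as the comment above suggests the files actually do. The helpers `write_string` and `read_string` are hypothetical names for this example, and the bytes are built in the script rather than taken from hello_world.nbt.

```python
import struct


def write_string(text):
    """Encode text as a 2-byte big-endian length followed by UTF-8 bytes."""
    encoded = text.encode("utf-8")
    return struct.pack(">H", len(encoded)) + encoded


def read_string(buf, pos=0):
    """Decode a length-prefixed string; returns (text, position after it)."""
    (length,) = struct.unpack_from(">H", buf, pos)
    start = pos + 2
    return buf[start:start + length].decode("utf-8"), start + length


payload = write_string("hello world")
print(payload[:2].hex())  # the two prefix bytes

# Misreading the same prefix as little-endian gives a length far longer
# than the data that follows, which is how the mismatch shows up in practice.
(little,) = struct.unpack_from("<H", payload)
print(little, "vs", len(payload) - 2)
print(read_string(payload))
```

A reader that assumed little-endian would try to consume thousands of bytes for an 11-byte string, so a wrong byte order tends to fail quickly on real files.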
Re: Annoying Format
Wikipedia is not the place to voice personal opinions, in this case about whether or not creating a new file format was necessary.
There are many reasons why Mojang may have chosen to use this format (none of these are endorsed by Mojang AB):
- They felt that other data structures were inadequate
- They wished to use all of their own code for the data structure (allows greater modification/control etc.)
- They wanted it to be more difficult for people to edit data files
- They believed that it would save space
- They have some future use planned where it may be necessary
Even if the only reason was that they wanted to set themselves apart from the crowd, voice your opinion on a forum or blog (or on this discussion page). Wikipedia is supposed to be a place where unbiased information is presented for public consumption.
String payload is 2 bytes little endian
Hello. The wiki says that "The prefix [of a string] is an unsigned short (thus 2 bytes) in little-endian". However, the official Minecraft wiki says it is big-endian. Checking NBT files written by Minecraft also confirms that it is big-endian. So why does this page say it is little-endian?