Talk:NBT format

Suggestions?
I just created this page because it was sorely needed. Does anyone have any suggestions? LB 01:17, 22 June 2012 (UTC)
 * Sure: you reverted my changes about the compression method, but this detail is essential for developers. If you don't want it in the "File Format" section, then please put it somewhere else. --80.134.26.27 09:12, 12 September 2012 (UTC)
 * You can start a new section (e.g. "Other Formats") and explain when zlib is used. I believe all the scenarios are: GZip-compressed or uncompressed files, GZip- or zlib-compressed data within region files, and another kind of compression for multiplayer chunk sending. The reason I removed it from the File Format section was that the only NBT files we're aware of (this doesn't include region files) either use GZip or are uncompressed. I know it's lazy of me to revert your edit, but I'm pressed for time (and honestly shouldn't be writing this). I may do it later...  LB ( T 01:37, 13 September 2012 (UTC)
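
For developers following this thread, here is a minimal Python sketch of how a reader might tell the containers mentioned above apart. It assumes the input is a bare NBT payload whose root tag is a TAG_Compound (type id 0x0A); it does not parse region-file headers or the multiplayer chunk format, and the function names are made up for illustration.

```python
import gzip
import zlib
from typing import Tuple

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of every GZip stream
TAG_COMPOUND = 0x0A        # assumed type id of the root tag in an uncompressed file


def looks_like_zlib(data: bytes) -> bool:
    """Check the two-byte zlib header described in RFC 1950."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    # Low nibble 8 means "deflate"; a valid header is a multiple of 31
    # when the two bytes are read as one big-endian 16-bit number.
    return (cmf & 0x0F) == 8 and ((cmf << 8) | flg) % 31 == 0


def decompress_nbt(data: bytes) -> Tuple[str, bytes]:
    """Return (container kind, raw NBT bytes) for a bare NBT payload."""
    if data[:2] == GZIP_MAGIC:
        return "gzip", gzip.decompress(data)
    if looks_like_zlib(data):
        return "zlib", zlib.decompress(data)
    if data[:1] == bytes([TAG_COMPOUND]):
        return "uncompressed", data
    raise ValueError("unrecognized NBT container")
```

The GZip check runs first because its magic number is unambiguous; the zlib check is only a header plausibility test, so a caller should still be prepared for `zlib.error`.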

Differences between NBT Versions
How can I check which program supports which NBT version? --Sfan5 16:28, 10 December 2012 (UTC)
 * There are only two versions of the NBT format - 19132 and 19133. An easy way is to have a firework in your inventory and try to load your level.dat file with the program. If it can't load it, it probably only supports 19132. If it can, it supports 19133. NBTEdit supports 19132, and NBTExplorer supports 19133.  LB ( T 02:43, 11 December 2012 (UTC)
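
To see which of the two numbers a given level.dat declares before trying it in an editor, something like the following Python sketch may help. It rests on assumptions not stated in this thread: that level.dat is GZip-compressed and that the number is stored in a TAG_Int named "version". It searches for that tag's byte pattern rather than parsing the whole tree, so treat it as a heuristic; the function names are hypothetical.

```python
import gzip
import struct

# Encoded header of a TAG_Int named "version":
# type id 3, big-endian 2-byte name length (7), then the name itself.
VERSION_TAG_HEADER = b"\x03" + struct.pack(">H", 7) + b"version"


def declared_version(nbt_bytes: bytes):
    """Return the int stored in the first TAG_Int named "version", or None."""
    pos = nbt_bytes.find(VERSION_TAG_HEADER)
    if pos < 0:
        return None
    start = pos + len(VERSION_TAG_HEADER)
    if start + 4 > len(nbt_bytes):
        return None  # header found, but the 4-byte payload is cut off
    (value,) = struct.unpack_from(">i", nbt_bytes, start)
    return value


def level_dat_version(path: str):
    """Decompress a level.dat file (assumed GZip) and report its declared version."""
    with gzip.open(path, "rb") as f:
        return declared_version(f.read())
```

A result of 19132 or 19133 corresponds to the two values mentioned above; `None` means the pattern was not found.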

Code point
32,767 UTF-8 Code Points (see UTF-8 format; most commonly-used characters are a single code point).

Not sure if other definitions exist, but as the term is normally used, a Unicode code point (see Wikipedia) is the same thing as a character. What is likely meant is "32767 bytes in UTF-8 (see UTF-8 format; most commonly used characters are encoded by a single byte)." Arancaytar 15:10, 26 March 2013 (UTC)
 * Actually, no - some characters require multiple code points. Check out these for instance:


 * กิิิิิิิิิิิิิิิิิิิิ ก้้้้้้้้้้้้้้้้้้้้ ก็็็็็็็็็็็็็็็็็็็็ ก็็็็็็็็็็็็็็็็็็็็ กิิิิิิิิิิิิิิิิิิิิ ก้้้้้้้้้้้้้้้้้้้้ ก็็็็็็็็็็็็็็็็็็็็ กิิิิิิิิิิิิิิิิิิิิ ก้้้้้้้้้้้้้้้้้้้้ กิิิิิิิิิิิิิิิิิิิิ ก้้้้้้้้้้้้้้้้้้้้ ก็็็็็็็็็็็็็็็็็็็็ ก็็็็็็็็็็็็็็็็็็็็ กิิิิิิิิิิิิิิิิิิิิ ก้้้้้้้้้้้้้้้้้้้้ ก็็็็็็็็็็็็็็็็็็็็ กิิิิิิิิิิิิิิิิิิิิ ก้้้้้้้้้้้้้้้้้้้้


 *  LB ( T 23:16, 26 March 2013 (UTC)


 * Oh, okay; I didn't think of clusters. But can the string then truly contain 32767 code points of any kind (up to 4 octets each), rather than 32767 octets? Arancaytar 14:35, 27 March 2013 (UTC)


 * I'm not sure, I'll wait for an expert to respond to this one. Sorry for this useless response...  LB ( T 21:14, 27 March 2013 (UTC)


 * Looking at the code, I don't think the page is currently accurate. Writing and reading NBT strings uses the java.io.DataOutput.writeUTF and java.io.DataInput.readUTF methods, which employ a slight modification of the UTF-8 standard. The length of the stored data is an unsigned 16-bit integer, which lets it contain up to 65,535 bytes. Each char is encoded as 1-3 bytes; supplementary characters (above U+FFFF) use two chars, so each code point can use up to 6 bytes. Worst-case scenario is as few as 10,922 code points; best case is 65,535 ASCII characters. -- Orthotope 04:11, 28 March 2013 (UTC)
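
The arithmetic above can be checked with a short Python sketch. It does not call Java; it re-implements the byte-count rules of the modified UTF-8 used by writeUTF as described in this thread (1 to 3 bytes per char, two chars for a supplementary character, and U+0000 written as two bytes). The function names are illustrative only.

```python
MAX_UTF_BYTES = 0xFFFF  # largest value of the unsigned 16-bit length prefix


def modified_utf8_length(text: str) -> int:
    """Count the bytes writeUTF-style modified UTF-8 would need for text."""
    total = 0
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            total += 6  # surrogate pair: two chars, 3 bytes each
        elif cp >= 0x0800:
            total += 3
        elif cp >= 0x0080 or cp == 0:
            total += 2  # U+0000 is written as two bytes, never as a raw NUL
        else:
            total += 1  # U+0001..U+007F
    return total


def fits_in_nbt_string(text: str) -> bool:
    """True if the encoded form fits under the 65,535-byte limit."""
    return modified_utf8_length(text) <= MAX_UTF_BYTES
```

For example, `fits_in_nbt_string("\U0001F600" * 10922)` is true (65,532 bytes), while one more supplementary character pushes the total to 65,538 bytes and over the limit.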

UTF vs binary
The strings are UTF-8, but does the 2-byte size specify the number of bytes or the number of Unicode code points? I am guessing bytes? 114.77.189.179 02:24, 21 December 2013 (UTC)


 * Minecraft's implementation uses writeUTF, so the size is the number of bytes, not the number of UTF-8 code points.  LB ( T 04:40, 21 December 2013 (UTC)
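
A small Python sketch of reading such a string shows what a byte-count prefix means in practice. It assumes a big-endian unsigned 16-bit length followed by that many bytes, and it decodes with standard UTF-8, which is only an approximation: the modified encoding written by writeUTF differs for U+0000 and for characters above U+FFFF. The function name is hypothetical.

```python
import struct


def read_nbt_string(buf: bytes, offset: int = 0):
    """Read a length-prefixed string; return (text, offset of the next byte)."""
    # The prefix counts encoded bytes, not characters or code points.
    (byte_len,) = struct.unpack_from(">H", buf, offset)
    start = offset + 2
    end = start + byte_len
    if end > len(buf):
        raise ValueError("truncated string payload")
    # Assumption: plain UTF-8 decoding is adequate for this input.
    text = buf[start:end].decode("utf-8")
    return text, end
```

For the payload `b"\x00\x02\xc3\xa9"`, the prefix is 2 even though the decoded text "é" is a single code point.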