• vON: Vercas' Object Notation [Release for developers]
    112 replies, posted
  • Avatar of vercas
  • [B][URL="https://github.com/vercas/vON"]GitHub Repository[/URL][/B]
[H2]Why?[/H2]
Well, the current serialization method available in GMod - GLON - is slow, and some trials with (extremely) complex data structures have failed miserably. I also don't like the idea of using non-printable characters.
What I was using before writing vON was JSON. But JSON had a problem: I couldn't find a solid parser. The best one that I found had some miserable failures in the end. Also, the JSON implementations are bound to the JSON standard (obviously), so they don't support things like booleans or even tables as keys.
[HR][/HR]
[H2]Specifications[/H2]
As in Lua, the "main" data structure is the table. But, unlike Lua tables, vON tables are separated into two parts: the numeric (array) component and the key:value pairs (dictionary) component. Tables start with [i]{[/i] and end with [i]}[/i], and the two components are separated by a [i]~[/i] (tilde) character, which may be absent if the table is a pure array.
[b]Examples[/b]: Format: [b]{ ... ~ ... }[/b] and data: [b]{ "lolv" "maov" ~ "lolv":"maov" }[/b], which would be [b]{"lol","mao",lol="mao"}[/b] in Lua.
The first table in the data, the [i]chunk[/i], has no initial and final characters (they are useless). [b]Example[/b]: [b]"lolv""maov"~"lolv":"maov"[/b], which means the same thing as the example above enclosed in [i]{ }[/i].
Also, for the sake of parsing speed, some types (such as booleans and numbers) are prefixed with a character. If no valid prefix character is found, the last type will be automatically used. Inside tables, spaces, tabs and newlines are simply ignored when deserializing.
Numbers are currently declared like this: [b]n...[/b] ([i]...[/i] represents the value in base 10). Each number ends in one of [i];, }, \n (newline), : or ~[/i] (at least one must be present). [b]Example[/b]: [b]n1;2;3~4:4;[/b] and [b]{n1;2;"intruder!v"n4}[/b].
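To make the number format above a bit more concrete, here is a minimal sketch of how a run of numbers could be written with a single [i]n[/i] prefix. This is not the vON source: the function name [i]encode_number_run[/i] is made up for illustration, and it simply ends every value with [i];[/i], which is one of the terminators listed above.

```lua
-- Hypothetical helper, NOT taken from vON: encodes a pure array of numbers
-- as one run. The "n" prefix is written once; later numbers reuse the
-- "last type" rule described in the post.
local function encode_number_run(values)
	if #values == 0 then
		return ""
	end

	local parts = {}
	for i = 1, #values do
		parts[i] = tostring(values[i])
	end

	-- Assumption: every number, including the last one, is closed with ";".
	return "n" .. table.concat(parts, ";") .. ";"
end

print(encode_number_run({ 1, 2, 3 }))        --> n1;2;3;
print(encode_number_run({ -1337, -99.99 }))  --> n-1337;-99.99;
```

The real serializer may pick a different terminator depending on what follows (for example [i]~[/i] or [i]}[/i]), so treat this only as a picture of the prefix-once idea.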
[b]If I get enough moral support, I will add a compressor for the numbers![/b]
Booleans are prefixed by [b]b[/b] and are represented either by a [b]1[/b] [i](true)[/i] or [b]0[/b] [i](false)[/i]. Each one is a single character, so they don't need a delimiter. A run of boolean flags usually looks like this: [b]b101101001[/b].
Strings start and end with double quotes ("). Quotes inside strings are escaped with a "\". Only quotes are escaped for now. A "v" is added at the end of every string to make sure the string doesn't end in a "\" (which would break the deserializer). That's 1 character more per string, but it should be a fair sacrifice considering the speed at which it deserializes strings (especially strings that are kilobytes in size).
This was written in pure Lua on Windows. (It will obviously work on other OSes too!) GMod-specific types are included in a different version.
[quote="Note"]I'm not sure how "human-readable" vON is. I'm not the right person to judge that. And I honestly don't care. All that matters is that the code understands the syntax.[/quote]
In vON, keys and values can be [b]anything[/b]. You can even have [b]booleans and tables[/b] as keys!
[h2]Code[/h2]
I'm not going to dump the code here. A code file for comparing this with GLON and dkJSON is available [B][URL="https://dl.dropbox.com/s/ori49higxyufol5/von%20vs%20glon%20vs%20dkjson.lua?dl=1"]here[/URL][/B]. (It might be outdated, so replace the vON code in that one with the latest release.)
Latest stable versions:
- [B][URL="https://dl.dropbox.com/u/1217587/GMod/Lua/von.lua?dl=1"]Pure Lua[/URL][/B]
- [B][URL="https://dl.dropbox.com/u/1217587/GMod/Lua/von%20for%20GMOD.lua?dl=1"]GLua[/URL][/B]
Development versions:
- [B][URL="https://dl.dropbox.com/u/1217587/GMod/Lua/von%20-%20development.lua?dl=1"]Pure Lua[/URL][/B]
- [B][URL="https://dl.dropbox.com/u/1217587/GMod/Lua/von%20for%20GMOD%20-%20development.lua?dl=1"]GLua[/URL][/B]
These are [B]not[/B] to be used in any projects.
They're the files I work and test on before updating the stable versions. They're here for the curious to peek at.
[h2]Comparison[/h2]
It's more than twice as fast and occupies less space than GLON. Unlike JSON, it's not bound to a standard with restrictions such as all keys in tables being strings. I've tested vON with the following piece of code:
[lua]local test6 = {
	1, -1337, -99.99, 2, 3, 100, 101, 121, 143, 144,
	"ma\"ra", "are", "mere",
	{
		500, 600, 700, 800, 900, 9001,
		TROLOLOLOLOLOOOO = 666,
		[true] = false,
		[false] = "lol?",
		pere = true,
		[1997] = "vasile",
		[{ [true] = false, [false] = true }] = { [true] = "true", ["false"] = false }
	},
	true, false, false, true, false, true, true, false, true,
	[1337] = 1338,
	mara = "are",
	mere = false,
	[true] = false,
	[{ [true] = false, [false] = true }] = { [true] = "true", ["false"] = false }
}

local last = test6

-- Serialize and deserialize the same data 5000 times in a row.
for i = 1, 5000 do
	local s = von.serialize(last)
	--print(to_string(t, 0))
	--print(s)
	last = von.deserialize(s)
end

print(von.serialize(last))[/lua]
vON produced a [b]flawless[/b] result. GLON produced a disaster... JSON [b]cannot[/b] encode that. The code in the linked comparison output the following for me:
[code]GLON: 5000 encoding/decoding successions took 4.2862453460693 seconds to finish. Length of final (probably overmutilated) data: 597.
JSON: 5000 encoding/decoding successions took 2.1321220397949 seconds to finish. Length of the final (100% mutilated) data: 545.
vON: 5000 encoding/decoding successions took 1.4760837554932 seconds to finish. Length of the final (100% healthy) data: 583.
[Finished in 7.9s][/code]
[h2]Examples[/h2]
You must be aware of the ugliness of the code already. The code above produces:
[code]n1;-1337;-99.99;2;3;100;101;121;143;144;"ma\"ra""are""mere"{n500;600;700;800;900;9001~"TROLOLOLOLOLOOOO":n666;b1:0{~b0:11:0}:{~b1:"true""false":b0}"pere":b1n1997:"vasile"b0:"lol?"}b100101101~1:0{~b0:11:0}:{~b1:"true""false":b0}n1337:1338;"mara":"are""mere":b0[/code]
Yeah, it looks like s**t.
Previously, vON featured a "nice mode" for formatting the code for human readability. Well, it's no longer supported. It was slow and a pain in the arse to maintain.
[h2]Usage[/h2]
[lua]von -- The global table of the library.

von.serialize -- A "functable" containing the serialization procedures.
-- The internal functions are exposed because someone might find a use for 'em.
von.serialize(table data) -- Serializes the table into a string.

von.deserialize -- A "functable" containing the deserialization procedures.
-- The internal functions are, again, exposed, because someone might find them useful.
von.deserialize(string data) -- Deserializes the specified data into a table.
-- (Non-numeric) key order might not be preserved, but it shouldn't matter.[/lua]
[h2]Final thoughts[/h2]
Credits and appreciations are in the file. This is not intended to replace or work with anything in particular. I made this for myself, for my gamemode, to store data in a quick, flexible and, thus, efficient way. If you don't like it, either suggest an improvement or leave the thread. I don't want or need hateful opinions.
Also, I apologize for the bad-ish formatting of the thread... And I apologize if my text sounds hateful/mean. It's just my way of writing, it's not on purpose. :smile:
[b]By publishing this I'm getting no profit. My only intention is to help, to make someone's day a little brighter.[/b]
Please read the notice in the code file and do as it says.
[h2]If you have ideas for improvement, optimizations or bug reports, please post them in this thread![/h2]
[h2]Changelog[/h2]
All times are [URL="http://www.worldtimeserver.com/current_time_in_UTC.aspx"]GMT[/URL].
[i]2012.07.06 8:20[/i] - Version 1.0.0
- Started making a changelog.
- Declared version 1.0.0
[i]2012.08.02 8:55[/i] - Version 1.1.0
- Fixed errors on Angle and Vector deserialization saying they're entities.
- Added errors when trying to (de)serialize the wrong types.
- Fixed GLua version's distribution link pointing to the pure Lua version.
- Added Player data type to the GLua version. Players are save
  • Avatar of vercas
  • [QUOTE=rebel1324;36538065]No more string.Explode![/QUOTE] Well, uh, yeah? It's written in pure Lua. :wink:
  • Looks excellent, will be using this in my gamemode as well. Also add "?dl=1" to your download links so they autodownload when someone clicks them
  • Avatar of vercas
  • [QUOTE=Remscar;36538125]Looks excellent, will be using this in my gamemode as well.[/QUOTE] [QUOTE=_Chewgum;36537831]Going to use this in my gamemode :wink:[/QUOTE] [QUOTE=Deadman123;36537466]This looks pretty cool. I'd definitely use it over glon any day.[/QUOTE] I'm so glad you like it! I'll add GMod-specific types tomorrow, and I'm looking into a way to compress the output string, or at least the numeric types. [QUOTE=Remscar;36538125]Also add "?dl=1" to your download links so they autodownload when someone clicks them[/QUOTE] Thank you!
  • Avatar of vercas
  • [QUOTE=rebel1324;36538215]You did great job, vercas. I lov you. No homo.[/QUOTE] Thanks! :smile: [hr] [/hr] I'm looking for some input now. So, how should I store angles? Should I reduce them to [0, 360] or [-180, 180]? Should I convert them to a single number like pitch*360*360 + yaw*360 + roll? In [0, 360] the number will vary from 0 to 46,785,960. In [-180, 180] the number will vary from -23,392,980 to 23,392,980. Using the interval [0, 360] will cut down on one character (the minus sign). Most of the time, the packed number will be shorter than the three values written out together: 46785960 is shorter than 360,360,360 by 3 characters. [editline]asd[/editline] I just realized what a dumb thing I suggested above.
  • Avatar of vercas
  • [QUOTE=rebel1324;36538348]Angle value is 360 to 0 as i know.[/QUOTE] Well, many functions actually return angles in the [-180; 180] range. (At least in E2 :v:)
  • Avatar of vercas
  • Hm. Now that I've refreshed my mind, it's not such a good idea to add up the angles to a number because of the decimals. It was actually a very dumb idea... Damn, angles and vectors are going to be big... I think I will clamp all numbers to 3 decimals. Precision decreases dramatically after the 2nd decimal anyway.
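For the curious, the packing idea and the reason decimals break it can be sketched like this. Everything here is an assumption made for illustration (the names [i]pack_angle[/i], [i]unpack_angle[/i] and [i]round3[/i] are made up, and whole-degree components are assumed to lie in [0, 360)); none of it is code from vON.

```lua
-- Sketch only: assumes whole-degree components in [0, 360).
local BASE = 360

local function pack_angle(pitch, yaw, roll)
	return pitch * BASE * BASE + yaw * BASE + roll
end

local function unpack_angle(n)
	local pitch = math.floor(n / (BASE * BASE))
	local rest = n - pitch * BASE * BASE
	local yaw = math.floor(rest / BASE)
	local roll = rest - yaw * BASE
	return pitch, yaw, roll
end

-- One possible way to keep only 3 decimals: round to the nearest thousandth.
local function round3(x)
	return math.floor(x * 1000 + 0.5) / 1000
end

-- Whole degrees survive the round trip.
print(unpack_angle(pack_angle(10, 20, 30)))

-- A fractional pitch spills into the yaw "digit": yaw comes back as 200, not 20.
print(unpack_angle(pack_angle(10.5, 20, 30)))

-- Rounding a component to 3 decimals before storing it.
print(round3(12.34567))
```

With fractional components the three values can no longer be separated again, which is why storing each component on its own (rounded if needed) is the safer route.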
  • Avatar of vercas
  • [QUOTE=rebel1324;36538541]Oops, sorry then.[/QUOTE] Sorry for what? Anyway, does anyone fancy [URL="http://oss.digirati.com.br/luabignum/bn/index.htm"]big numbers[/URL]? I've been using this library for a while and it's actually awesome. I could easily implement them (as strings...) in vON as a separate type.
  • Avatar of Divran
  • Nice work. For some big speed increases, I suggest using T[#T+1] = Val instead of insert(T,val) Also at the top of the file it says "authot" EDIT: Oh and this [code]local stuff = { ["\\"] = "\\\\", ["\""] = "\\\"" } gsub(data, stuff)[/code] might be faster than [code]gsub(gsub(data, "\\", "\\\\"), "\"", "\\\"")[/code]
  • Avatar of Chief Tiger
  • Very interesting, I may have to rewrite a lot of the code in the gamemode I'm working on because of this new discovery...
  • Avatar of vercas
  • [QUOTE=Divran;36541789]Nice work. For some big speed increases, I suggest using T[#T+1] = Val instead of insert(T,val) Also at the top of the file it says "authot"[/QUOTE] Thanks! I fixed the typo and used T[#T + 1] as you suggested. It cut about 0.1 seconds in the benchmark. Now it's definitely faster than JSON! [QUOTE=Divran;36541789]EDIT: Oh and this [code]local stuff = { ["\\"] = "\\\\", ["\""] = "\\\"" } gsub(data, stuff)[/code] might be faster than [code]gsub(gsub(data, "\\", "\\\\"), "\"", "\\\"")[/code][/QUOTE] Um... This doesn't even work. It doesn't accept tables as the second argument.
  • Avatar of vercas
  • [QUOTE=pennerlord;36547935]Maybe this could help you: [url]http://trac.caspring.org/wiki/LuaPerformance[/url][/QUOTE] Well, one of my benchmarks shows the opposite of one of those things there. But the last one is definitely useful. I'll use a counter instead of [i]#T+1[/i] and see the results.
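For reference, the three ways of appending to a table discussed in this thread look roughly like the sketch below. The function names and the timing helper are hypothetical, and no claim is made about which one wins: os.clock measures CPU time and the numbers will differ between machines and Lua versions, so it is worth measuring on your own setup.

```lua
-- Three ways to append to an array-like table; all build the same result.
local insert = table.insert

local function fill_insert(n)
	local t = {}
	for i = 1, n do
		insert(t, i * 2)
	end
	return t
end

local function fill_length(n)
	local t = {}
	for i = 1, n do
		t[#t + 1] = i * 2
	end
	return t
end

local function fill_counter(n)
	local t, count = {}, 0
	for i = 1, n do
		count = count + 1
		t[count] = i * 2
	end
	return t
end

-- Rough timing helper (hypothetical): prints the label and elapsed CPU seconds.
local function time_it(label, fn, n)
	local start = os.clock()
	fn(n)
	print(label, os.clock() - start)
end

time_it("table.insert", fill_insert, 1000000)
time_it("t[#t + 1]", fill_length, 1000000)
time_it("counter", fill_counter, 1000000)
```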
  • Avatar of MakeR
  • "premature optimization is the root of all evil" The performance you gain from these tiny optimizations will dramatically reduce the readability of your code.
  • Avatar of vercas
  • [QUOTE=MakeR;36548047]"premature optimization is the root of all evil" The performance you gain from these tiny optimizations will dramatically reduce the readability of your code.[/QUOTE] Yeah... :v: You poor humans. Honestly, the code is ugly. I will eventually comment it to make it more beautiful (or just less ugly). And I'd rather have speed than readability. Also, if you want more readable code, use an editor with folding (Like Sublime Text 2) and fold the unnecessary functions. That's how I manage to maintain the code.
  • Avatar of Divran
  • [QUOTE=vercas;36547102]Um... This doesn't even work. It doesn't accept tables as the second argument.[/QUOTE] I remembered wrong. According to the reference manual, it was the third argument that can be a table. [quote]string.gsub (s, pattern, repl [, n]) ... If repl is a table, then the table is queried for every match, using the first capture as the key; if the pattern specifies no captures, then the whole match is used as the key.[/quote] [url]http://www.lua.org/manual/5.1/manual.html#pdf-string.gsub[/url]
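A minimal sketch of what the manual describes, assuming the goal is to escape both backslashes and double quotes. The names [i]escapes[/i] and [i]escape[/i] are made up for this example; the pattern has no captures, so the whole match (a single character) is used as the key into the table.

```lua
-- Lookup table: matched character -> replacement text.
local escapes = {
	["\\"] = "\\\\", -- backslash becomes two backslashes
	["\""] = "\\\"", -- double quote becomes backslash + quote
}

local function escape(data)
	-- The character class matches either a backslash or a double quote.
	-- string.gsub returns two values (string and match count);
	-- the extra parentheses keep only the string.
	return (string.gsub(data, "[\\\"]", escapes))
end

print(escape('say "hi"')) --> say \"hi\"
print(escape("a\\b"))     --> a\\b
```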
  • Avatar of vercas
  • [QUOTE=Divran;36548252]I remembered wrong. According to the reference manual, it was the third argument that can be a table. [url]http://www.lua.org/manual/5.1/manual.html#pdf-string.gsub[/url][/QUOTE] And how do I use it? Honestly I know almost nothing about the patterns.
  • Avatar of MakeR
  • IIRC each capture gets replaced by the corresponding value in the table, ie the table gets indexed with the capture, and the capture in the string gets replaced by the result.
  • Avatar of vercas
  • [QUOTE=MakeR;36548388]IIRC each capture gets replaced by the corresponding value in the table, ie the table gets indexed with the capture, and the capture in the string gets replaced by the result.[/QUOTE] And how do I write a pattern to detect either \ or "? Also, if it replaces " with \", won't it later detect the \ in \" and replace it by \\"? :tinfoil: Again, my knowledge of patterns is extremely... absent.
  • Avatar of MakeR
  • [QUOTE=vercas;36548405]And how do I write a pattern to detect either \ or "? Also, if it replaces " with \", won't it later detect the \ in \" and replace it by \\"? :tinfoil: Again, my knowledge of patterns is extremely... absent.[/QUOTE] I haven't touched Lua in such a long time, honestly I can't remember. [editline]29th June 2012[/editline] I'm sure one of the regulars here can help.
  • Avatar of vercas
  • I've managed to come up with [b][\"\\][/b] and it seems to work. It saves about 0.05 seconds in the benchmark.
I'm commenting the code and I'll be making more improvements, such as two different serializers: one for normal code and the other for "nice" code.
Interesting. When I'm on battery power, vON seems to be 20-25% slower.
[b]Edit[/b]: Believe it or not, but using a counter is actually slower than [i]#T + 1[/i]...
[b]Edit2[/b]:
Old:
[code]GLON: 5000 encoding/decoding successions took 4.3212471008301 seconds to finish. Length of final (probably overmutilated) data: 597.
JSON: 5000 encoding/decoding successions took 2.1121196746826 seconds to finish. Length of the final (100% mutilated) data: 545.
vON: 5000 encoding/decoding successions took 2.0471172332764 seconds to finish. Length of the final (100% healthy) data: 583.
[Finished in 8.5s][/code]
New:
[code]GLON: 5000 encoding/decoding successions took 4.3902492523193 seconds to finish. Length of final (probably overmutilated) data: 597.
JSON: 5000 encoding/decoding successions took 2.1211223602295 seconds to finish. Length of the final (100% mutilated) data: 545.
vON: 5000 encoding/decoding successions took 2.0071144104004 seconds to finish. Length of the final (100% healthy) data: 583.
[Finished in 8.6s][/code]
Well, this is an improvement.
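On the earlier worry about a backslash being "detected again": string.gsub walks the string once and does not rescan the text it has just inserted, so a single call with a character class is safe. The double-escaping problem only shows up when two separate gsub calls are chained in the wrong order. A small sketch, with made-up function names and assuming both \ and " should be escaped:

```lua
local escapes = { ["\\"] = "\\\\", ["\""] = "\\\"" }

-- Single pass: each \ or " is looked up in the table exactly once.
local function escape_single(data)
	return (data:gsub("[\"\\]", escapes))
end

-- Two passes, backslashes first: gives the same result as the single pass.
local function escape_chained(data)
	local step1 = data:gsub("\\", "\\\\")
	local step2 = step1:gsub("\"", "\\\"")
	return step2
end

-- Two passes, quotes first: the backslash added for each quote is then
-- doubled by the second pass, which is the failure described above.
local function escape_wrong_order(data)
	local step1 = data:gsub("\"", "\\\"")
	local step2 = step1:gsub("\\", "\\\\")
	return step2
end

local sample = 'a"b\\c'
print(escape_single(sample))
print(escape_chained(sample))
print(escape_wrong_order(sample))
```

The first two lines printed are identical; the third has an extra backslash in front of the quote.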