• LuaJIT for Garry's Mod
    101 replies, posted
  • Avatar of gamerpaddy
  • Can anyone test my Script with LuaJit? I see no difference...
Without LuaJit on my slow PC:
[CODE]
PRIME: 100k digits after 0.128662109375
PRIME: 1 million digits after 2.92138671875
PRIME: 78494 numbers found
[/CODE]
With LuaJit on my slow PC:
[CODE]
PRIME: 100k digits after 0.108662109375
PRIME: 1 million digits after 2.72138671875
PRIME: 78494 numbers found
[/CODE]
Luacode:
[CODE]
local t = SysTime()
local Primes = 0
local I = 0

function prime(N)
    if (N % 2 == 0 or N % 3 == 0 or N % 7 == 0 or N % 5 == 0) then return false end
    R = math.sqrt(N)
    F = 11
    if (R % 1 == 0) then return false end
    while (F <= R) do
        if (N % F == 0) then return false end
        F = F + 2
        if (N % F == 0) then return false end
        F = F + 4
    end
    return true
end

local I = 0
while I < 100000 do
    I = I + 1
    if prime(I) then Primes = Primes + 1 end
end
print("PRIME: 100k digits after "..SysTime() - t)

local t = SysTime()
while I < 1000000 do
    I = I + 1
    if prime(I) then Primes = Primes + 1 end
end
print("PRIME: 1 million digits after "..SysTime() - t)
print("PRIME: "..Primes.." numbers found") -- fail D:
[/CODE]
  • [video=youtube;4rSZWAeAi7k]http://www.youtube.com/watch?v=4rSZWAeAi7k[/video] used test: [url]http://dl.dropbox.com/u/1285798/metatable.lua[/url] original test: [url]http://lubyk.org/en/post313.html[/url] to gamerpaddy: Without JIT: PRIME: 100k digits after 0.177734375 PRIME: 1 million digits after 4.020263671875 PRIME: 78494 numbers found With it: PRIME: 100k digits after 0.04248046875 PRIME: 1 million digits after 1.005615234375 PRIME: 78494 numbers found ps: tested on Core2Quad Q9400 @ 2.66ghz, 8gb RAM
  • Avatar of AzuiSleet
  • It looks like your bitshift operators are missing the fallback implementation in luaV_arith (eg coercing a string into an int), but that's only a one or two line fix. I'm still skeptical whether this actually has any benefit to per-frame times, or if it is stable enough and easy enough to maintain. The last test, with LuaJIT 2, there was a rather serious issue with pcall. It would also be interesting to evaluate the generational GC in 5.2, as it's perfectly suited for the type of allocations in gmod. I believe it's on the roadmap but not yet implemented in LuaJIT.
  • Avatar of Johncw87
  • It turns out that adding new ops to LuaJIT 2.0.0 is much harder than I thought it would be. I added all of GLua's syntax to it except for the bitwise operators. I've uploaded a new *.rar that has my current work on LuaJIT 2.0.0 in 'gmod_luajit2.dll' as well as a fix for LuaJIT 1.1.7 for the issue that AzuiSleet pointed out (thanks). I'll have to go over the LuaJIT 2.0.0 code a lot more before I can change it further. In the meantime, gmod_luajit2.dll adds the 'bit' library, so you can (if you really want to) replace all instances of '|', '&', '<<', and '>>' with their equivalents from the 'bit' library.
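A minimal sketch of that replacement, assuming the `bit` module behaves like the one bundled with LuaJIT (Lua BitOp). The GLua forms in the comments are the operators named in the post; the values are made up for illustration, and the shift mapping is only claimed for non-negative numbers.

```lua
-- LuaJIT bundles the 'bit' module; plain Lua 5.1 would need Lua BitOp installed.
local bit = require("bit")

local flags = 0
flags = bit.bor(flags, 4)         -- GLua: flags = flags | 4
local has = bit.band(flags, 4)    -- GLua: flags & 4
local shl = bit.lshift(1, 3)      -- GLua: 1 << 3
local shr = bit.rshift(256, 4)    -- GLua: 256 >> 4

print(flags, has, shl, shr)
```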
  • Avatar of Jellyman
  • [quote="Flapadar"][code]E:\lua>luajit.exe prime_2.lua Testing 1000000 numbers Testing method 1 Method 1: 78494 primes; Time taken: 0.418 seconds E:\lua>lua5.1.exe prime_2.lua Testing 1000000 numbers Testing method 1 Method 1: 78494 primes; Time taken: 3.444 seconds[/code] Used a similar method to the Sieve of Eratosthenes. Luajit is 8x faster.[/quote] I was asked to post this.
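For comparison, here is a minimal sketch of a plain Sieve of Eratosthenes in Lua. It is not the quoted prime_2.lua (that script is not shown in the thread); `count_primes` is a hypothetical name. A complete sieve also counts 2, 3, 5 and 7, so for a limit of 1,000,000 it reports 78498 rather than the 78494 seen above.

```lua
-- Count the primes in [2, limit] with a basic sieve.
local function count_primes(limit)
  local composite = {}            -- composite[n] == true once n is crossed out
  local count = 0
  for n = 2, limit do
    if not composite[n] then
      count = count + 1
      -- Cross out multiples of n, starting at n*n (smaller ones are already marked).
      for m = n * n, limit, n do
        composite[m] = true
      end
    end
  end
  return count
end

print(count_primes(1000000))
```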
  • Avatar of Johncw87
  • I managed to load the module into the Garry's Mod client by loading it as a shader. You can get the updated module in the first post. I am not sure if VAC will care about it. My guess is no, but I'm not making any promises.
  • [QUOTE=JustSoFaded;34999005]if it's a plugin, VAC will not care[/QUOTE] Are you implying VAC won't detect plugins?
  • Avatar of garry
  • Anyone done any kind of timedemo comparison - to compare real world performance gains?
  • Avatar of OldFusion
  • [lua]
if GAMEMODE.Name != "Trouble in Terrorist Town" then return end
local i = 0;
hook.Add("Think", "FrameTimeLog", function()
    -- Log the per-player frame time roughly every 30 frames
    if i > 30 then
        file.Append("Performance.txt", ( FrameTime() / #player.GetAll() ) .. "\n")
        i = 0;
    end
    i = i + 1;
end)
[/lua]
Today normal, tomorrow JIT. Unless anyone has a better idea.
  • Avatar of initrd.gz
  • Too bad LuaJIT FFI doesn't support C++ yet, would make all those engine calls blazing fast. Try using LuaJIT with glon, since I've heard that is pretty resource-intensive and purely done in Lua. EDIT: Code: [code]function LuaJit_test() local tbl = {} local t, dt for i=1,100000 do tbl[i] = i end t = SysTime() local encoded = glon.encode(tbl) dt = SysTime() - t print(string.format("Encoded in %f seconds",dt)) t = SysTime() local decoded = glon.decode(encoded) dt = SysTime() - t print(string.format("Decoded in %f seconds",dt)) end[/code] With LuaJit2: [code] > LuaJit_test()... Encoded in 17.816620 seconds Decoded in 0.402542 seconds [/code] Without: [code]> LuaJit_test()... Encoded in 20.596542 seconds Decoded in 0.777443 seconds [/code]
  • Avatar of Levybreak
  • Alright, so I ran some trials using my SimplexNoise module (the Lua one) and got the following results for comparison between vanilla garrysmod and the LuaJIT 2.0 version (which breaks a surprisingly large number of things): Vanilla: [img]http://dl.dropbox.com/u/99862/wo_jit.png[/img] LuaJIT2.0: [img]http://dl.dropbox.com/u/99862/w_jit.png[/img] "Writing to a file" includes encoding the table of generated values with glon and writing it to a file (which generally took longer than the pure math of the noise by quite a large amount). Generating a 1000x1000 grid of 2D noise saw a 1.3063420786133 second improvement, or about 56.662%. Writing it to a file saw a 3.619976043701 second improvement, or about 17.200%. Generating a 100x100x100 array of turbulent noise on 3 iterations (detail levels, kinda) saw an improvement of 8.003707885742 seconds (!!!), or about 40.735%. Writing that to a file saw an improvement of 2.4279327392575 seconds, or about 22.138%. The test was done on my Intel i5-2500k clocked to 3.60 GHz, with 16GB of RAM. A 50% performance increase on what is essentially pure math and table manipulation is a HUGE saving. With LuaJIT 2.0 I'd be inclined to say that I could very nearly calculate small amounts of the noise in real time, which would be awesome (with just pure Lua).
  • Avatar of OldFusion
  • [QUOTE=Gran PC;35010171]Are you really writing to a file every 30 frames?[/QUOTE] It's comparing JIT frametime against non-JIT frametime.
  • Avatar of Hoffa1337
  • Only thing I've found so far [code] lua_run print( 128 & 384 == 128 ) lua_run:1: attempt to perform arithmetic on a boolean value [/code] Not a big thing really just annoying that I have to enclose my bitmask checks in ( ).
  • Avatar of Johncw87
  • [QUOTE=Hoffa1337;35018972]Only thing I've found so far [code] lua_run print( 128 & 384 == 128 ) lua_run:1: attempt to perform arithmetic on a boolean value [/code] Not a big thing really just annoying that I have to enclose my bitmask checks in ( ).[/QUOTE] Funny thing, last night I did some more disassembly of lua_shared.dll to check how Garry set the operator priority for the bitwise operators. I've set them in the same manner as Garry set them, but I haven't uploaded the changes yet. If you want the fix now, open up lparser.c and search for the 'priority' array. There should be a list of value pairs. Change the last 4 to '{6, 5}, {6, 5}, {6, 5}, {6, 5},'
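The error in the quoted post is consistent with `==` binding tighter than `&`, so that `128 & 384 == 128` is grouped as `128 & (384 == 128)`. A minimal sketch of both groupings using `bit.band` from the `bit` library (assumed to be the LuaJIT / Lua BitOp one), where the function call makes the grouping explicit:

```lua
local bit = require("bit")   -- bundled with LuaJIT; Lua BitOp on plain Lua 5.1

-- Intended grouping: mask first, then compare.
local masked = bit.band(128, 384)
print(masked == 128)

-- Grouping implied by the error message: the comparison runs first,
-- so the bitwise op receives a boolean and fails.
local ok = pcall(function() return bit.band(128, 384 == 128) end)
print(ok)
```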
  • Avatar of Grocel
  • I heard that the speed increases only come when your CPU supports SSE2, this may explain the lack of the performance improvements (in some tests on the first page) on some machines.
  • Avatar of initrd.gz
  • [QUOTE=Grocel;35026589]I heard that the speed increases only come when your CPU supports SSE2, this may explain the lack of the performance improvements (in some tests on the first page) on some machines.[/QUOTE] SSE2 has been out for a while according to Wikipedia. I kind of doubt this is the issue.
  • Avatar of COBRAa
  • [QUOTE=initrd.gz;35028142]SSE2 has been out for awhile according to Wikipedia. I kindof doubt this is the issue.[/QUOTE] It was when implemented in GMOD properly.
  • Is the current gmod server binary compiled with SSE2? TF2 and CS:S already use it.
  • Avatar of Johncw87
  • I just managed to get bitwise operators working in LuaJIT 2.0.0. With that out of the way, I've noticed that LuaJIT 2.0.0 is way more picky about how you use escape chars in strings (it doesn't like it when you 'escape' something that doesn't have a meaning). I've most frequently seen this problem in pattern matching strings. Fixing it is likely just a matter of removing the offending '\' char. The toolgun will most certainly be broken due to line 123 in stool.lua: [lua]local char1,char2,toolmode = string.find( val, "([%w_]*)\.lua" )[/lua] I've tested the pattern without the slash and it gave me the proper result. Just think of it as a syntax error that normal Lua never bitched about.
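One detail worth spelling out: in standard Lua 5.1 the string literal `"\."` is just `"."`, so the pattern behaves the same with or without the slash, and an unescaped `.` in a pattern matches any character. If the intent is a literal dot, Lua patterns escape it as `%.`. A minimal sketch (the file names are hypothetical):

```lua
local val = "weld.lua"   -- hypothetical tool file name

-- '%.' matches only a literal dot.
local s, e, toolmode = string.find(val, "([%w_]*)%.lua")
print(toolmode)

-- An unescaped '.' matches any character, so this also matches "weld_lua".
local loose = string.find("weld_lua", "([%w_]*).lua") ~= nil
local strict = string.find("weld_lua", "([%w_]*)%.lua") ~= nil
print(loose, strict)
```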
  • Avatar of COBRAa
  • I never understood why they did that (with the backslash complaining). If it was in lua51 then it would make sense to have it like that but it isn't.
  • Avatar of Johncw87
  • [QUOTE=COBRAa;35062404]I never understood why they done that (with the backslash complaining). If it was in lua51 then it would make sense to have it like that but it isn't.[/QUOTE] I think the standard interpreter is wrong to NOT say anything about it. I would rather it throw me an error telling me exactly what is wrong right away than spend 20 minutes trying to figure out why my Windows-style path string isn't working properly. Oh, I forgot to explicitly mention this in my previous post. The updated LuaJIT 2.0.0 (with working bitwise operators) is available for download in the first post.
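A minimal sketch of the Windows-path pitfall described above (the path is made up). Valid escapes such as `\t` and `\n` are silently interpreted by every Lua version, while an escape with no meaning, such as `\p`, is accepted by standard Lua 5.1 (the backslash is dropped) but rejected by LuaJIT 2. The `loadstring or load` line is only there so the sketch compiles the literal at run time on whichever interpreter is used.

```lua
local broken = "C:\temp\new.txt"     -- \t becomes a tab, \n a newline
local fixed  = "C:\\temp\\new.txt"   -- doubled backslashes
local long   = [[C:\temp\new.txt]]   -- long strings ignore escapes

print(#broken, #fixed, fixed == long)

-- Compile a literal containing the meaningless escape "\p" at run time.
local compile = loadstring or load
local chunk, err = compile('return "C:\\path"')
if chunk then
  print("accepted, value is: " .. chunk())
else
  print("rejected: " .. err)
end
```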
  • Avatar of Zeh Matt
  • I finally had some time to test this. Huge loops are epic fast with LuaJIT 2 and all scripts seem to work just fine; garry should consider going to LuaJIT 2. Edit: Oops, forgot the results
[lua]
LuaJIT2:
] lua_run_cl local s = SysTime() local x = 0 for i=0,1000000 do x = x * i end print(SysTime() - s)
0.002105712890625
] lua_run_cl local s = SysTime() local x = 0 for i=0,1000000 do x = x * i end print(SysTime() - s)
0.00201416015625
] lua_run_cl local s = SysTime() local x = 0 for i=0,1000000 do x = x * i end print(SysTime() - s)
0.00201416015625
] lua_run_cl local s = SysTime() local x = 0 for i=0,1000000 do x = x * i end print(SysTime() - s)
0.002288818359375
GLua:
] lua_run_cl local s = SysTime() local x = 0 for i=0,1000000 do x = x * i end print(SysTime() - s)
0.01654052734375
] lua_run_cl local s = SysTime() local x = 0 for i=0,1000000 do x = x * i end print(SysTime() - s)
0.01641845703125
] lua_run_cl local s = SysTime() local x = 0 for i=0,1000000 do x = x * i end print(SysTime() - s)
0.01654052734375
] lua_run_cl local s = SysTime() local x = 0 for i=0,1000000 do x = x * i end print(SysTime() - s)
0.016448974609375
[/lua]
I know it's some basic testing, but it shows off quite well how fast it is.
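A minimal sketch of a repeatable version of that kind of micro-benchmark. It uses `os.clock` (CPU time) so it runs outside the game; inside Garry's Mod `SysTime` could be swapped in. `best_time` and `workload` are hypothetical names, and the loop body differs from the one above because `x = x * i` with `x = 0` never leaves zero.

```lua
-- Run fn several times and return the fastest run in seconds.
local function best_time(fn, runs)
  local best = math.huge
  for _ = 1, runs do
    local start = os.clock()
    fn()
    local elapsed = os.clock() - start
    if elapsed < best then best = elapsed end
  end
  return best
end

-- A loop whose result actually depends on every iteration.
local function workload()
  local x = 0
  for i = 0, 1000000 do
    x = x + i % 7
  end
  return x
end

print(best_time(workload, 5))
```

Taking the best of several runs is one common way to reduce noise from warm-up and other processes; an average or median would be equally reasonable.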
  • Avatar of Zeh Matt
  • [QUOTE=Drakehawke;35079504]Did you get GLua and LuaJIT 2 the wrong way round in that?[/QUOTE] Oops. Edit: Fixed
  • Avatar of AzuiSleet
  • In your LuaJIT2 implementation, I don't think your bit ops are rigged up to the trace recorder to emit the IR_* bitop instructions. Since they're already implemented (because the bit library becomes IR_ in traces), you might want to check another op like BC_POW to see how IR_ ops are being emitted in lj_record.c