L1VM - math speedup
I have now developed a compiler optimization flag: “(no-var-pull-on)”. It can be used with my new math expressions inside the { } brackets. The flag tells the parser not to emit the “pull” opcodes, so the target variable stays in a register instead of being pulled back into its variable after every expression. “(no-var-pull-off)” switches the normal behavior back on. If more than one math expression is used in a row, as in my double numbers benchmark, this gives a huge speedup:
The optimized program runs in about 2 seconds. The normal program needs about 7 seconds! See my updated benchmark post: L1VM-benchmark-double-add-04
Now my L1VM is a bit faster than Node.js!
Here is my modified benchmark program:
// double-test-optimized.l1com
//
// (no-var-pull-on) (no-var-pull-off) optimization demo
//
(main func)
(set int64 1 zero 0)
(set double 1 zerod 0.0)
(set double 1 x 23.0)
(set double 1 y 42.0)
(set int64 1 max 10000000Q)
(set int64 1 loop 0)
(set int64 1 one 1)
(set double 1 count 1.0)
(set int64 1 f 0)
(optimize-if)
(:loop)
(no-var-pull-on)
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
(no-var-pull-off)
// pulls are enabled again: adding 0.0 pulls count back into its variable
((count zerod +d) count =)
((loop one +) loop =)
(((loop max <=) f =) f if)
(:loop jmp)
(endif)
// print the double result, print a newline, then exit
(5 count 0 0 intr0)
(7 0 0 0 intr0)
(255 0 0 0 intr0)
(funcend)
And here is the optimized assembly program without the “pull” opcodes after each “addd”:
.data
Q, 1, zero
@, 0Q, 0
F, 1, zerod
@, 8Q, 0.0
F, 1, x
@, 16Q, 23.0
F, 1, y
@, 24Q, 42.0
Q, 1, max
@, 32Q, 10000000Q
Q, 1, loop
@, 40Q, 0
Q, 1, one
@, 48Q, 1
F, 1, count
@, 56Q, 1.0
Q, 1, f
@, 64Q, 0
.dend
.code
:main
loada zero, 0, 0
:loop
loadd count, 0, 1
loadd x, 0, 2
addd 1, 2, 3
addd 3, 2, 1
addd 1, 2, 3
addd 3, 2, 1
addd 1, 2, 3
addd 3, 2, 1
addd 1, 2, 3
addd 3, 2, 1
addd 1, 2, 3
addd 3, 2, 1
addd 1, 2, 3
addd 3, 2, 1
addd 1, 2, 3
addd 3, 2, 1
addd 1, 2, 3
addd 3, 2, 1
addd 1, 2, 3
addd 3, 2, 1
addd 1, 2, 3
addd 3, 2, 1
loadd y, 0, 3
addd 1, 3, 4
addd 4, 3, 1
addd 1, 3, 4
addd 4, 3, 1
addd 1, 3, 4
addd 4, 3, 1
addd 1, 3, 4
addd 4, 3, 1
addd 1, 3, 4
addd 4, 3, 1
loadd zerod, 0, 4
addd 1, 4, 5
load count, 0, 6
pulld 5, 6, 0
loada loop, 0, 1
loada one, 0, 2
addi 1, 2, 3
load loop, 0, 4
pullqw 3, 4, 0
loada loop, 0, 1
loada max, 0, 5
lseqi 1, 5, 6
jmpi 6, :if_0
jmp :endif_0
:if_0
jmp :loop
:endif_0
loadd count, 0, 1
intr0 5, 1, 0, 0
intr0 7, 0, 0, 0
intr0 255, 0, 0, 0
rts
.cend
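The alternating register pattern in this listing (“addd 1, 2, 3” followed by “addd 3, 2, 1”) can be imitated with a small toy emitter. This is only a minimal Python sketch, not the code of the L1VM compiler: the function name emit_repeated_add, the fixed register numbers and the shape of the unoptimized code (store and reload after each “addd”) are assumptions made for illustration, and it handles only one operand variable.

```python
def emit_repeated_add(target, operand, n, no_var_pull):
    """Emit assembly-like lines for n times `target = target + operand`.

    Toy model only: register numbers and the unoptimized shape are assumptions.
    """
    # load both variables into registers once
    lines = [f"loadd {target}, 0, 1", f"loadd {operand}, 0, 2"]
    src, dst = 1, 3
    for _ in range(n):
        lines.append(f"addd {src}, 2, {dst}")
        if no_var_pull:
            # keep the sum in a register: the result register is the next source
            src, dst = dst, src
        else:
            # assumed unoptimized shape: pull the result into the variable,
            # then load it again for the next expression
            lines.append(f"load {target}, 0, 4")
            lines.append(f"pulld {dst}, 4, 0")
            lines.append(f"loadd {target}, 0, {src}")
    if no_var_pull:
        # a single pull at the very end writes the sum back
        lines.append(f"load {target}, 0, 4")
        lines.append(f"pulld {src}, 4, 0")
    return lines


fast = emit_repeated_add("count", "x", 4, no_var_pull=True)
slow = emit_repeated_add("count", "x", 4, no_var_pull=False)
print("\n".join(fast))
print(len(fast), "lines with the flag,", len(slow), "lines without")
```

In this toy model the optimized variant needs one “pulld” for the whole run of expressions, while the other variant needs one per expression. The longer the run of math expressions, the more opcodes are saved inside the loop.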
Update: I have now written a JIT-compiled version of this benchmark. This is the first time I did not need inline assembly! I wrote it in Brackets. Here it is:
// double-test-optimized-jit.l1com
//
// (no-var-pull-on) (no-var-pull-off) optimization demo
#include <intr.l1h>
(main func)
(set int64 1 zero 0)
(set double 1 zerod 0.0)
(set double 1 x 23.0)
(set double 1 y 42.0)
(set int64 1 max 10000000Q)
(set int64 1 loop 0)
(set int64 1 one 1)
(set double 1 count 1.0)
(set int64 1 f 0)
(set int64 1 jit_start 0)
(set int64 1 jit_end 0)
(optimize-if)
(:jit_start jit_start loadl)
(:jit_end jit_end loadl)
run_jit_comp (jit_start, jit_end)
{x = (x + zerod)}
{y = (y + zerod)}
{count = (count + zerod)}
(:loop)
run_jit_code (zero)
(:next jmp)
(:jit_start)
(no-var-pull-on)
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
(:jit_end)
{count = (count + y)}
(:next)
(no-var-pull-off)
((count zerod +d) count =)
((loop one +) loop =)
(((loop max <=) f =) f if)
(:loop jmp)
(endif)
(5 count 0 0 intr0)
(7 0 0 0 intr0)
(255 0 0 0 intr0)
(funcend)
The runtime was:
$ time l1vm prog/double-test-optimized-jit -q
8800000881.0000000000
real 0m1,111s
user 0m1,100s
sys 0m0,008s
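The printed result can be cross-checked with a closed formula. The following Python sketch is only a sanity check and not part of L1VM; the function name expected_count is made up for this example. It assumes what the benchmark source shows: count starts at 1.0, each pass through the loop adds x twenty times and y ten times, and the body runs max + 1 times because the loop counter starts at 0 and is compared with “<=” after being incremented.

```python
def expected_count(max_loop=10_000_000, x=23.0, y=42.0, start=1.0):
    """Closed-form value of count after the benchmark loop has finished."""
    iterations = max_loop + 1          # counter starts at 0, test is loop <= max
    per_iteration = 20 * x + 10 * y    # 20 additions of x, 10 additions of y
    return start + iterations * per_iteration


print(f"{expected_count():.10f}")
```

All intermediate values are whole numbers far below 2^53, so the double arithmetic is exact and the formula gives 8800000881.0000000000, the same value the benchmark printed.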