## L1VM - math speedup

I have developed a new compiler optimization flag: “(no-var-pull-on)”. It can be used with my new math expressions inside { } braces. The flag tells the parser not to emit the “pull” opcodes, so the target variable stays in a register instead of being pulled after every expression. When several math expressions are used in a row, as in my double numbers benchmark, this gives a huge speedup:

The optimized program runs in about 2 seconds, while the normal program needs about 7 seconds. That is roughly a 3.5x speedup! See my updated benchmark post: L1VM-benchmark-double-add-04

Now my L1VM is a bit faster than Node.js!
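To illustrate why dropping the “pull” opcodes helps, here is a small toy model in Python. It is only a sketch and not L1VM's real code generator: the pseudo-instructions `load`, `addd` and `pull`, the functions `compile_exprs` and `run`, and the way registers are cached are all assumptions made for this example. It assumes that “pull” stores a register value back into the variable. The toy counts how many instructions one pass of the benchmark body needs with and without the per-expression pull.

```python
# Toy model only: this is NOT L1VM's real code generator or opcode encoding.

def compile_exprs(exprs, no_var_pull=False):
    """Turn (target, a, b) additions into a list of pseudo-instructions."""
    code = []
    regs = {}  # variable name -> register number (used when no_var_pull is on)

    def cached(var):
        # Load a variable into a register only the first time it is seen.
        if var not in regs:
            regs[var] = len(regs)
            code.append(("load", var, regs[var]))
        return regs[var]

    for target, a, b in exprs:
        if no_var_pull:
            ra, rb, rt = cached(a), cached(b), cached(target)
            code.append(("addd", ra, rb, rt))  # result stays in the register
        else:
            code.append(("load", a, 0))
            code.append(("load", b, 1))
            code.append(("addd", 0, 1, 2))
            code.append(("pull", 2, target))   # store after every expression
    return code, regs


def run(code, memory):
    """Execute the pseudo-instructions against a dict of variables."""
    regs = {}
    for ins in code:
        if ins[0] == "load":
            regs[ins[2]] = memory[ins[1]]
        elif ins[0] == "addd":
            regs[ins[3]] = regs[ins[1]] + regs[ins[2]]
        elif ins[0] == "pull":
            memory[ins[2]] = regs[ins[1]]
    return memory


# Same shape as the benchmark body: 20 additions of x, 10 additions of y.
exprs = [("count", "count", "x")] * 20 + [("count", "count", "y")] * 10

normal, _ = compile_exprs(exprs)
optimized, regs = compile_exprs(exprs, no_var_pull=True)
optimized.append(("pull", regs["count"], "count"))  # one final store

start = {"count": 1.0, "x": 23.0, "y": 42.0}
normal_result = run(normal, dict(start))["count"]
optimized_result = run(optimized, dict(start))["count"]
print(len(normal), len(optimized), normal_result, optimized_result)
```

Both variants compute the same value, but the toy version without per-expression pulls needs far fewer instructions. The single store at the end plays the role that the `((count zerod +d) count =)` line after `(no-var-pull-off)` presumably plays in the benchmark program.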

Here is my modified benchmark program:

``````
// double-test-optimized.l1com
//
// (no-var-pull-on) (no-var-pull-off) optimization demo
//
(main func)
(set int64 1 zero 0)
(set double 1 zerod 0.0)
(set double 1 x 23.0)
(set double 1 y 42.0)
(set int64 1 max 10000000Q)
(set int64 1 loop 0)
(set int64 1 one 1)
(set double 1 count 1.0)
(set int64 1 f 0)
(optimize-if)
(:loop)
(no-var-pull-on)

{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}

{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}

{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}

{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}

(no-var-pull-off)
((count zerod +d) count =)

((loop one +) loop =)
(((loop max <=) f =) f if)
(:loop jmp)
(endif)

(5 count 0 0 intr0)
(7 0 0 0 intr0)
(255 0 0 0 intr0)
(funcend)
``````

And here is the optimized assembly program without the “pull” opcodes after each “addd”:

``````
.data
Q, 1, zero
@, 0Q, 0
F, 1, zerod
@, 8Q, 0.0
F, 1, x
@, 16Q, 23.0
F, 1, y
@, 24Q, 42.0
Q, 1, max
@, 32Q, 10000000Q
Q, 1, loop
@, 40Q, 0
Q, 1, one
@, 48Q, 1
F, 1, count
@, 56Q, 1.0
Q, 1, f
@, 64Q, 0
.dend
.code
:main
:loop
pulld 5, 6, 0
pullqw 3, 4, 0
lseqi 1, 5, 6
jmpi 6, :if_0
jmp :endif_0
:if_0
jmp :loop
:endif_0
intr0 5, 1, 0, 0
intr0 7, 0, 0, 0
intr0 255, 0, 0, 0
rts
.cend
``````

Update: I have now written a JIT-compiled version of this benchmark. This is the first time I did not need inline assembly! It is written entirely in Brackets. Here it is:

``````
// double-test-optimized-jit.l1com
//
// (no-var-pull-on) (no-var-pull-off) optimization demo
#include <intr.l1h>
(main func)
(set int64 1 zero 0)
(set double 1 zerod 0.0)
(set double 1 x 23.0)
(set double 1 y 42.0)
(set int64 1 max 10000000Q)
(set int64 1 loop 0)
(set int64 1 one 1)
(set double 1 count 1.0)
(set int64 1 f 0)
(set int64 1 jit_start 0)
(set int64 1 jit_end 0)
(optimize-if)
run_jit_comp (jit_start, jit_end)
{x = (x + zerod)}
{y = (y + zerod)}
{count = (count + zerod)}
(:loop)
run_jit_code (zero)
(:next jmp)
(:jit_start)
(no-var-pull-on)

{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}

{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}
{count = (count + x)}

{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}

{count = (count + y)}
{count = (count + y)}
{count = (count + y)}
{count = (count + y)}

(:jit_end)
{count = (count + y)}

(:next)
(no-var-pull-off)
((count zerod +d) count =)

((loop one +) loop =)
(((loop max <=) f =) f if)
(:loop jmp)
(endif)

(5 count 0 0 intr0)
(7 0 0 0 intr0)
(255 0 0 0 intr0)
(funcend)
``````

The runtime was:

``````
$ time l1vm prog/double-test-optimized-jit -q
8800000881.0000000000

real	0m1,111s
user	0m1,100s
sys	0m0,008s
``````
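The printed result can be checked by hand. The following Python sketch redoes the arithmetic of the benchmark: 20 additions of `x` and 10 additions of `y` per pass, starting from `count = 1.0`. The function names `expected_count` and `simulate` are made up for this check. It assumes the loop body runs `max + 1` times, because `loop` is incremented before the `loop <= max` test; the small simulation is there to confirm that assumption.

```python
def expected_count(start=1.0, x=23.0, y=42.0, n_x=20, n_y=10,
                   max_loop=10_000_000):
    """Closed-form result of the benchmark loop."""
    per_pass = n_x * x + n_y * y   # 20 * 23.0 + 10 * 42.0 = 880.0
    passes = max_loop + 1          # body runs once more than max_loop
    return start + per_pass * passes


def simulate(max_loop, start=1.0, x=23.0, y=42.0, n_x=20, n_y=10):
    """Step-by-step version of the same loop, for small max_loop values."""
    count = start
    loop = 0
    while True:
        for _ in range(n_x):
            count += x
        for _ in range(n_y):
            count += y
        loop += 1
        if not loop <= max_loop:   # mirrors the (loop max <=) test
            break
    return count


# The closed form agrees with the step-by-step loop on a small case.
assert simulate(5) == expected_count(max_loop=5)

print(f"{expected_count():.10f}")
```

With the benchmark's values this prints `8800000881.0000000000`, which matches the output of the JIT run above.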