L1VM - loop benchmark

I found an interesting video on YouTube: Youtube-Python-loops

The video explains how to make Python loops run faster through code optimization.

Here is a direct comparison between Python and Brackets:

#!/usr/bin/python
#loop-benchm
import timeit

def for_loop (n = 100_000_000):
    s = 0
    for i in range (n):
        s += i
    return s

def main ():
    print ('for loop: ', timeit.timeit (for_loop, number = 1))

main ()
[stefan@tuxmobile python]$ python loop-benchm.py
for loop:  7.246488967000914
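For context, this is the sort of loop optimization commonly applied to such code in Python. It is only a sketch: `builtin_sum` and `closed_form` are names chosen here for illustration and are not necessarily the variants shown in the video. The default `n` is smaller than in the benchmark above so that it finishes quickly.

```python
import timeit


def for_loop(n=1_000_000):
    # Same loop as in the benchmark, only with a smaller default n.
    s = 0
    for i in range(n):
        s += i
    return s


def builtin_sum(n=1_000_000):
    # Lets the built-in sum() iterate instead of a Python-level for loop.
    return sum(range(n))


def closed_form(n=1_000_000):
    # Sum of 0 .. n-1 without any loop at all.
    return n * (n - 1) // 2


if __name__ == "__main__":
    for fn in (for_loop, builtin_sum, closed_form):
        print(fn.__name__, timeit.timeit(fn, number=1))
```

All three functions return the same value, so only the timing differs.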
// loops-benchm.l1com
//
#include <intr.l1h>
(main func)
	(set int64 1 zero 0)
	(set int64 1 one 1)
	(set int64 1 loop 0)
	(set int64 1 maxloop 100000000)
	(set int64 1 sum 0)
	(set int64 1 f 0)
	(set int64 1 time 0)
	(set string s timestr "time in ticks: ")
	(zero loop =)
	start_timer
	(:for_loop)
	(((loop maxloop <) f =) f if)
		((sum loop +) sum =)
		((loop one +) loop =)
		(:for_loop jmp)
	(endif)
	stop_timer (time)
	print_s (timestr)
	print_i (time)
	print_n
	(255 zero 0 0 intr0)
(funcend)
[stefan@tuxmobile l1vm-2021-06-02-work]$ l1vm prog/loops-benchm -q
TIMER ms: 9483.5460000000
time in ticks: 9484

So Python is faster here: about 7.2 seconds versus about 9.5 seconds for the L1VM Brackets program. What can we do to speed up the Brackets program? We can write the code directly in assembly! I made some changes by hand to the original assembly output. Note: you can also use inline assembly in Brackets!

// loops-benchm-asm.l1asm
.data
Q, 1, zero
@, 0Q, 0
Q, 1, one
@, 8Q, 1
Q, 1, loop
@, 16Q, 0
Q, 1, maxloop
@, 24Q, 100000000
Q, 1, sum
@, 32Q, 0
Q, 1, f
@, 40Q, 0
Q, 1, time
@, 48Q, 0
B, 16, timestr
@, 56Q, "time in ticks: "
Q, 1, timestraddr
@, 72Q, 56Q
.dend
.code
:main
loada zero, 0, 0
loada one, 0, 1
loada loop, 0, 2
loada maxloop, 0, 3
loada sum, 0, 4
loada time, 0, 5
intr0 24, 0, 0, 0
:for_loop
eqi 2, 3, 10
jmpi 10, :loop_end
addi 2, 1, 2
addi 4, 2, 4
jmp :for_loop
:loop_end
intr0 25, 5, 0, 0
intr0 255, 0, 0, 0
rts
.cend
[stefan@tuxmobile l1vm-2021-06-02-work]$ l1vm prog/loops-benchm-asm -q
TIMER ms: 952.4900000000

Not bad! :) That is about ten times faster than the Brackets version.

With the JIT-compiler it gets even faster!

// loops-benchm-asm-jit.l1asm
.data
Q, 1, zero
@, 0Q, 0
Q, 1, one
@, 8Q, 1
Q, 1, loop
@, 16Q, 0
Q, 1, maxloop
@, 24Q, 100000000
Q, 1, sum
@, 32Q, 0
Q, 1, f
@, 40Q, 0
Q, 1, time
@, 48Q, 0
B, 16, timestr
@, 56Q, "time in ticks: "
Q, 1, timestraddr
@, 72Q, 56Q
.dend
.code
:main
loada zero, 0, 0
loada one, 0, 1
loada loop, 0, 2
loada maxloop, 0, 3
loada sum, 0, 4
loada time, 0, 5
loadl :jit_start, 20
loadl :jit_end, 21
intr0 253, 20, 21, 0
intr0 24, 0, 0, 0
intr0 254, 0, 0, 0
jmp :loop_end
:jit_start
:for_loop
addi 2, 1, 2
addi 4, 2, 4
lsi 2, 3, 10
:jit_end
jmpi 10, :for_loop
:loop_end
intr0 25, 5, 0, 0
intr0 255, 0, 0, 0
rts
.cend
[stefan@tuxmobile l1vm-2021-06-02-work]$ l1vm prog/loops-benchm-asm-jit -q
TIMER ms: 254.8630000000
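The two assembly listings also differ in loop shape. The first tests at the top of the loop with eqi and needs an extra jmp back (five instructions per iteration). The JIT variant tests at the bottom with lsi (four instructions per iteration). Below is a rough Python model of both loops. It assumes the usual reading of the mnemonics: `addi a, b, c` stores a + b in register c, eqi/lsi store the comparison result, and jmpi jumps when that register is non-zero. How the VM dispatches into the JIT-compiled region is not modeled, and the function names are made up for this sketch.

```python
def asm_loop(maxloop):
    # Model of loops-benchm-asm: test at the top of the loop.
    loop, total = 0, 0
    while True:                    # jmp :for_loop
        if loop == maxloop:        # eqi 2, 3, 10 / jmpi 10, :loop_end
            break
        loop += 1                  # addi 2, 1, 2
        total += loop              # addi 4, 2, 4
    return total


def asm_jit_loop(maxloop):
    # Model of loops-benchm-asm-jit: test at the bottom of the loop.
    loop, total = 0, 0
    while True:
        loop += 1                  # addi 2, 1, 2
        total += loop              # addi 4, 2, 4
        if not loop < maxloop:     # lsi 2, 3, 10 / jmpi 10, :for_loop
            break
    return total


def python_loop(n):
    # The Python and Brackets versions add the counter before incrementing it.
    return sum(range(n))


if __name__ == "__main__":
    n = 10
    print(asm_loop(n), asm_jit_loop(n), python_loop(n))
```

Under these assumptions both assembly loops add 1 to n, while the Python and Brackets versions add 0 to n-1. The number of iterations, which is what the benchmark measures, is the same in all four programs.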

This is a substantial speedup even compared with the plain assembly code: 254.9 ms versus 952.5 ms, roughly 3.7 times faster. It shows how well the JIT-compiler can optimize the code!
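The speedup factors can be worked out from the timings reported above. This is a small sketch; the dictionary keys and the `speedup` helper are names invented here. Each timing comes from a single run, so the ratios are approximations.

```python
# Timings copied from the runs above, converted to milliseconds.
TIMINGS_MS = {
    "python": 7246.488967000914,  # 7.246488967000914 s reported by timeit
    "brackets": 9483.546,
    "assembly": 952.49,
    "assembly_jit": 254.863,
}


def speedup(slower, faster):
    """Return how many times faster `faster` ran than `slower`."""
    return TIMINGS_MS[slower] / TIMINGS_MS[faster]


if __name__ == "__main__":
    pairs = [
        ("brackets", "assembly"),
        ("assembly", "assembly_jit"),
        ("brackets", "assembly_jit"),
        ("python", "assembly_jit"),
    ]
    for slower, faster in pairs:
        print(f"{faster} vs {slower}: {speedup(slower, faster):.2f}x")
```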