2014-08-19

Performance tuning - I/O

The R7RS benchmark showed that I/O was slow in Sagittarius. Well, not quite: in the R7RS library, read-line is defined in Scheme whilst get-line is defined in C. This is one of the reasons why it's slow. There is another reason that makes I/O slow, which is port locking.

Sagittarius guarantees that when an object is passed to a port, it is written as a whole and in order, even in a multi-threaded script. For example:
(import (rnrs) (srfi :18) (sagittarius threads))

(let-values (((out extract) (open-string-output-port)))
  (let ((threads (map (lambda (v)
                        (make-thread (lambda ()
                                       (sys-nanosleep 3)
                                       (put-string out v)
                                       (newline out))))
                      '("hello world"
                        "bye bye world"
                        "the red fox bla bla"
                        "now what?"))))
    (for-each thread-start! threads)
    (for-each thread-join! threads)
    (display (extract))))
This script never prints interleaved fragments; each sentence comes out whole, although the order of the sentences may vary.

To achieve this, each I/O call from Scheme locks the given port. However, if the value being read or written is a single byte, the lock is not needed. Now we need to consider two things: characters and custom ports. Reading or writing a character may involve multiple I/O operations because we need to handle Unicode, and we can't know ahead of time what a custom port would do. Thus a binary port doesn't need to be locked unless it's a custom port, and among textual ports, a string port can be used without the lock.
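The rule above can be sketched as a small C predicate. This is only an illustration of the decision described in the paragraph, not Sagittarius's actual code: the `port_info` struct, the enums and the name `needs_lock` are all hypothetical, and the sketch assumes the lock-free cases apply to single-byte or single-character operations.

```c
#include <stdbool.h>

/* Hypothetical port description. These names are illustrative only and
   are NOT the structures used inside Sagittarius. */
typedef enum { PORT_BINARY, PORT_TEXTUAL } port_kind;
typedef enum { BACKEND_FILE, BACKEND_BYTEVECTOR, BACKEND_STRING, BACKEND_CUSTOM } port_backend;

typedef struct {
    port_kind    kind;
    port_backend backend;
} port_info;

/* Returns true when a one-unit read or write (one byte or one character)
   has to take the port lock, following the rule in the text. */
bool needs_lock(const port_info *p)
{
    /* A custom port runs user code, so we cannot know what it will do. */
    if (p->backend == BACKEND_CUSTOM) return true;

    /* One byte on a built-in binary port is a single I/O step. */
    if (p->kind == PORT_BINARY) return false;

    /* Textual port: one character may need several I/O steps (Unicode),
       so only a string port can skip the lock. */
    return p->backend != BACKEND_STRING;
}
```

With this sketch, a file-backed binary port and a string port both take the lock-free path, while a transcoded file port or any custom port still locks.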

Now, how much performance impact does this change have? The following are the results for the current version and the HEAD version:
% ./bench sagittarius tail
Testing tail under Sagittarius
Compiling...
Running...
Running tail:10

real    0m26.155s
user    0m25.568s
sys     0m0.936s

% env SAGITTARIUS=../../build/sagittarius ./bench sagittarius tail

Testing tail under Sagittarius
Compiling...
Running...
Running tail:10

real    0m19.417s
user    0m18.703s
sys     0m0.904s
Well, not too bad. Plus, this change is not specific to this particular benchmark, which uses read-line, but is a generic performance improvement. Now we can finally change the implementation of the procedure in question. The difference between get-line and read-line is how they handle end of line: R7RS decided to treat '\r', '\n' and '\r\n' as end of line for convenience, whilst R6RS only needs to handle '\n'. The following is the result of implementing read-line in C.
% env SAGITTARIUS="../../build/sagittarius" ./bench -o -L../../sitelib sagittarius tail

Testing tail under Sagittarius
Compiling...
Running...
Running tail:10

real    0m5.031s
user    0m4.492s
sys     0m0.795s

Well, it's as I expected, so no surprise.
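For reference, the R7RS end-of-line rule mentioned above ('\r', '\n' and '\r\n' all terminate a line) can be sketched in plain C on top of stdio. This is a minimal sketch, not the code that went into Sagittarius: the function name `read_line_r7rs` is hypothetical, it reads bytes from a `FILE *` rather than characters from a Scheme port, and it silently drops characters that don't fit in the caller's buffer.

```c
#include <stdio.h>
#include <stddef.h>

/* Reads one line from fp into buf (at most cap - 1 chars, NUL-terminated).
   "\n", "\r" and "\r\n" each count as one line terminator, which is not
   stored. Returns the number of chars stored, or -1 at end of file when
   nothing could be read (or when cap is 0).
   Simplification: chars beyond cap - 1 are consumed but dropped. */
long read_line_r7rs(FILE *fp, char *buf, size_t cap)
{
    size_t n = 0;
    int c;

    if (cap == 0) return -1;
    c = getc(fp);
    if (c == EOF) return -1;

    while (c != EOF) {
        if (c == '\n') break;
        if (c == '\r') {
            /* Look one char ahead: swallow the '\n' of a "\r\n" pair,
               otherwise push the char back for the next call. */
            int next = getc(fp);
            if (next != '\n' && next != EOF) ungetc(next, fp);
            break;
        }
        if (n + 1 < cap) buf[n++] = (char)c;
        c = getc(fp);
    }
    buf[n] = '\0';
    return (long)n;
}
```

An R6RS-style get-line would only need the '\n' branch; the one-character lookahead after '\r' is the extra work that the R7RS rule adds.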
