Optimize IOSource#read_until method by using `StringScanner#check_until(string)` (#226)

## Why?
`StringScanner#check_until(string)` is faster than
`StringScanner#check_until(regex)`.

See:
- ruby/strscan#106
- ruby/strscan#111
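
As a rough illustration of the two call forms being compared, the sketch below runs `check_until` once with a Regexp and once with a String terminator. The input string and variable names are made up for the example, and the String form is only attempted when the installed strscan is at least 3.1.1, the same cutoff this commit uses.

```ruby
require "strscan"

# Illustrative input, not taken from REXML.
input = "name='value' rest"

# Regexp terminator: accepted by every strscan version.
regex_scanner = StringScanner.new(input)
by_regex = regex_scanner.check_until(/'/)

# String terminator: only attempted when the installed strscan is new enough
# (assumption: 3.1.1 is the cutoff, mirroring the check in this commit).
string_supported =
  Gem::Version.new(StringScanner::Version) >= Gem::Version.new("3.1.1")
string_scanner = StringScanner.new(input)
by_string = string_supported ? string_scanner.check_until("'") : by_regex

puts by_regex.inspect
puts by_string.inspect

# check_until only looks ahead; neither scanner has moved.
puts regex_scanner.pos
```

Both calls should return the text up to and including the first quote, so switching the pattern type is not expected to change what `read_until` sees, only how the search is carried out.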

## Benchmark
```
RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/naitoh/.rbenv/versions/3.3.4/bin/ruby -v -S benchmark-driver /Users/naitoh/ghq/github.com/naitoh/rexml/benchmark/parse.yaml
ruby 3.3.4 (2024-07-09 revision be1089c8ec) [arm64-darwin22]
Calculating -------------------------------------
                         before       after  before(YJIT)  after(YJIT)
                 dom     19.459      19.840        35.035       35.786 i/s -     100.000 times in 5.139034s 5.040369s 2.854304s 2.794367s
                 sax     30.057      30.026        52.986       53.716 i/s -     100.000 times in 3.326998s 3.330499s 1.887303s 1.861652s
                pull     33.777      34.415        62.294       64.020 i/s -     100.000 times in 2.960622s 2.905668s 1.605284s 1.562002s
              stream     33.789      34.003        60.174       60.411 i/s -     100.000 times in 2.959521s 2.940916s 1.661845s 1.655334s

Comparison:
                              dom
         after(YJIT):        35.8 i/s
        before(YJIT):        35.0 i/s - 1.02x  slower
               after:        19.8 i/s - 1.80x  slower
              before:        19.5 i/s - 1.84x  slower

                              sax
         after(YJIT):        53.7 i/s
        before(YJIT):        53.0 i/s - 1.01x  slower
              before:        30.1 i/s - 1.79x  slower
               after:        30.0 i/s - 1.79x  slower

                             pull
         after(YJIT):        64.0 i/s
        before(YJIT):        62.3 i/s - 1.03x  slower
               after:        34.4 i/s - 1.86x  slower
              before:        33.8 i/s - 1.90x  slower

                           stream
         after(YJIT):        60.4 i/s
        before(YJIT):        60.2 i/s - 1.00x  slower
               after:        34.0 i/s - 1.78x  slower
              before:        33.8 i/s - 1.79x  slower

```

- YJIT=ON : 1.00x - 1.03x faster
- YJIT=OFF : 1.00x - 1.02x faster
naitoh authored Dec 19, 2024
1 parent a1d875b commit bb0bedd
Showing 1 changed file with 8 additions and 2 deletions.
10 changes: 8 additions & 2 deletions lib/rexml/source.rb
    @@ -68,8 +68,14 @@ module Private
       SCANNER_RESET_SIZE = 100000
       PRE_DEFINED_TERM_PATTERNS = {}
       pre_defined_terms = ["'", '"', "<"]
    -  pre_defined_terms.each do |term|
    -    PRE_DEFINED_TERM_PATTERNS[term] = /#{Regexp.escape(term)}/
    +  if StringScanner::Version < "3.1.1"
    +    pre_defined_terms.each do |term|
    +      PRE_DEFINED_TERM_PATTERNS[term] = /#{Regexp.escape(term)}/
    +    end
    +  else
    +    pre_defined_terms.each do |term|
    +      PRE_DEFINED_TERM_PATTERNS[term] = term
    +    end
       end
     end
     private_constant :Private
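
A minimal sketch of how such a table might be consulted, assuming the caller falls back to an escaped Regexp for terminators that are not pre-defined. `term_patterns` and `peek_through` are hypothetical names for this example and are not REXML's actual `IOSource#read_until`.

```ruby
require "strscan"

# Build the lookup table the way the diff does: Regexp values for strscan
# older than 3.1.1, plain String values otherwise (same String comparison
# as in the diff).
term_patterns = ["'", '"', "<"].to_h do |term|
  [term, StringScanner::Version < "3.1.1" ? /#{Regexp.escape(term)}/ : term]
end

# Hypothetical helper: look ahead through the next occurrence of `term`
# without advancing. Terms outside the table fall back to an escaped Regexp.
peek_through = lambda do |scanner, term|
  pattern = term_patterns[term] || /#{Regexp.escape(term)}/
  scanner.check_until(pattern)
end

scanner = StringScanner.new(%q(attr="a<b" tail))
quoted  = peek_through.call(scanner, '"')   # pre-defined terminator
angle   = peek_through.call(scanner, "<")   # pre-defined terminator
missing = peek_through.call(scanner, "]]>") # fallback Regexp, no match

puts quoted.inspect
puts angle.inspect
puts missing.inspect
```

Keeping the choice of pattern type inside the table means the lookup code stays the same on old and new strscan versions; only the stored values differ.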