Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use StringScanner#peek_byte to get double or single quotation mark #227

Merged
merged 1 commit into from
Dec 20, 2024

Conversation

naitoh
Copy link
Contributor

@naitoh naitoh commented Dec 19, 2024

Why?

StringScanner#peek_byte is fast, because it does not generate String object.

Benchmark

RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/naitoh/.rbenv/versions/3.3.4/bin/ruby -v -S benchmark-driver /Users/naitoh/ghq/github.com/naitoh/rexml/benchmark/parse.yaml
ruby 3.3.4 (2024-07-09 revision be1089c8ec) [arm64-darwin22]
Calculating -------------------------------------
                         before       after  before(YJIT)  after(YJIT)
                 dom     19.753      19.888        35.641       35.928 i/s -     100.000 times in 5.062402s 5.028121s 2.805792s 2.783339s
                 sax     30.349      30.978        53.485       57.885 i/s -     100.000 times in 3.295012s 3.228103s 1.869671s 1.727567s
                pull     34.170      35.436        61.713       66.534 i/s -     100.000 times in 2.926534s 2.821955s 1.620404s 1.502996s
              stream     33.121      35.268        60.751       63.276 i/s -     100.000 times in 3.019222s 2.835443s 1.646065s 1.580374s

Comparison:
                              dom
         after(YJIT):        35.9 i/s
        before(YJIT):        35.6 i/s - 1.01x  slower
               after:        19.9 i/s - 1.81x  slower
              before:        19.8 i/s - 1.82x  slower

                              sax
         after(YJIT):        57.9 i/s
        before(YJIT):        53.5 i/s - 1.08x  slower
               after:        31.0 i/s - 1.87x  slower
              before:        30.3 i/s - 1.91x  slower

                             pull
         after(YJIT):        66.5 i/s
        before(YJIT):        61.7 i/s - 1.08x  slower
               after:        35.4 i/s - 1.88x  slower
              before:        34.2 i/s - 1.95x  slower

                           stream
         after(YJIT):        63.3 i/s
        before(YJIT):        60.8 i/s - 1.04x  slower
               after:        35.3 i/s - 1.79x  slower
              before:        33.1 i/s - 1.91x  slower

  • YJIT=ON : 1.01x - 1.08x faster
  • YJIT=OFF : 1.00x - 1.06x faster

@naitoh naitoh marked this pull request as ready for review December 19, 2024 14:25
@naitoh naitoh requested a review from kou December 19, 2024 23:53
lib/rexml/parsers/baseparser.rb Outdated Show resolved Hide resolved
lib/rexml/parsers/baseparser.rb Outdated Show resolved Hide resolved
lib/rexml/parsers/baseparser.rb Outdated Show resolved Hide resolved
## Why?
`StringScanner#peek_byte` is fast, because it does not generate String object.

## Benchmark
```
RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/naitoh/.rbenv/versions/3.3.4/bin/ruby -v -S benchmark-driver /Users/naitoh/ghq/github.com/naitoh/rexml/benchmark/parse.yaml
ruby 3.3.4 (2024-07-09 revision be1089c8ec) [arm64-darwin22]
Calculating -------------------------------------
                         before       after  before(YJIT)  after(YJIT)
                 dom     19.753      19.888        35.641       35.928 i/s -     100.000 times in 5.062402s 5.028121s 2.805792s 2.783339s
                 sax     30.349      30.978        53.485       57.885 i/s -     100.000 times in 3.295012s 3.228103s 1.869671s 1.727567s
                pull     34.170      35.436        61.713       66.534 i/s -     100.000 times in 2.926534s 2.821955s 1.620404s 1.502996s
              stream     33.121      35.268        60.751       63.276 i/s -     100.000 times in 3.019222s 2.835443s 1.646065s 1.580374s

Comparison:
                              dom
         after(YJIT):        35.9 i/s
        before(YJIT):        35.6 i/s - 1.01x  slower
               after:        19.9 i/s - 1.81x  slower
              before:        19.8 i/s - 1.82x  slower

                              sax
         after(YJIT):        57.9 i/s
        before(YJIT):        53.5 i/s - 1.08x  slower
               after:        31.0 i/s - 1.87x  slower
              before:        30.3 i/s - 1.91x  slower

                             pull
         after(YJIT):        66.5 i/s
        before(YJIT):        61.7 i/s - 1.08x  slower
               after:        35.4 i/s - 1.88x  slower
              before:        34.2 i/s - 1.95x  slower

                           stream
         after(YJIT):        63.3 i/s
        before(YJIT):        60.8 i/s - 1.04x  slower
               after:        35.3 i/s - 1.79x  slower
              before:        33.1 i/s - 1.91x  slower

```
- YJIT=ON : 1.01x - 1.08x faster
- YJIT=OFF : 1.00x - 1.06x faster

Co-authored-by: Sutou Kouhei <[email protected]>
@naitoh naitoh requested a review from kou December 20, 2024 14:27
@kou kou merged commit b70388c into ruby:master Dec 20, 2024
61 checks passed
@kou
Copy link
Member

kou commented Dec 20, 2024

Thanks.

@naitoh naitoh deleted the use_peek_byte branch December 20, 2024 23:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants