Age | Commit message (Collapse) | Author |
---|
| |
| ``` ../string.c:660:38: warning: comparison of integers of different signs: 'rb_atomic_t' (aka 'unsigned int') and 'int' [-Wsign-compare] 660 | RUBY_ASSERT(table->count < table->capacity / 2); ``` Notes: Merged: https://github.com/ruby/ruby/pull/13160 |
| |
| The fstring table size used to be reported as part of the VM size, but since it was refactored to be lock-less it was no longer reported. Since it's now wrapped by a `T_DATA`, we can implement its `dsize` function and get a valuable insight into the size of the table. ``` {"address":"0x100ebff18", "type":"DATA", "shape_id":0, "slot_size":80, "struct":"VM/fstring_table", "memsize":131176, ... ``` Notes: Merged: https://github.com/ruby/ruby/pull/13138 |
| Notes: Merged: https://github.com/ruby/ruby/pull/13137 |
| This implements a hash set which is wait-free for lookup and lock-free for insert (unless resizing) to use for fstring de-duplication. As highlighted in https://bugs.ruby-lang.org/issues/19288, heavy use of fstrings (frozen interned strings) can significantly reduce the parallelism of Ractors. I tried a few other approaches first: using an RWLock, striping a series of RWlocks (partitioning the hash N-ways to reduce lock contention), and putting a cache in front of it. All of these improved the situation, but were unsatisfying as all still required locks for writes (and granular locks are awkward, since we run the risk of needing to reach a vm barrier) and this table is somewhat write-heavy. My main reference for this was Cliff Click's talk on a lock free hash-table for java https://www.youtube.com/watch?v=HJ-719EGIts. It turns out this lock-free hash set is made easier to implement by a few properties: * We only need a hash set rather than a hash table (we only need keys, not values), and so the full entry can be written as a single VALUE * As a set we only need lookup/insert/delete, no update * Delete is only run inside GC so does not need to be atomic (It could be made concurrent) * I use rb_vm_barrier for the (rare) table rebuilds (It could be made concurrent) We VM lock (but don't require other threads to stop) for table rebuilds, as those are rare * The conservative garbage collector makes deferred replication easy, using a T_DATA object Another benefits of having a table specific to fstrings is that we compare by value on lookup/insert, but by identity on delete, as we only want to remove the exact string which is being freed. This is faster and provides a second way to avoid the race condition in https://bugs.ruby-lang.org/issues/21172. This is a pretty standard open-addressing hash table with quadratic probing. Similar to our existing st_table or id_table. Deletes (which happen on GC) replace existing keys with a tombstone, which is the only type of update which can occur. Tombstones are only cleared out on resize. Unlike st_table, the VALUEs are stored in the hash table itself (st_table's bins) rather than as a compact index. This avoids an extra pointer dereference and is possible because we don't need to preserve insertion order. The table targets a load factor of 2 (it is enlarged once it is half full). Notes: Merged: https://github.com/ruby/ruby/pull/12921 |
| This allows more flexibility in how we deal with the fstring table Notes: Merged: https://github.com/ruby/ruby/pull/12921 |
| [Feature #20877] Notes: Merged: https://github.com/ruby/ruby/pull/11975 |
| Notes: Merged: https://github.com/ruby/ruby/pull/13030 Merged-By: peterzhu2118 <peter@peterzhu.ca> |
| [Feature #21109] By always freezing when setting the global rb_rs variable, we can ensure it is not modified and can be accessed from a ractor. We're also making sure it's an instance of String and does not have any instance variables. Of course, if $/ is changed at runtime, it may cause surprising behavior but doing so is deprecated already anyway. Co-authored-by: Jean Boussier <jean.boussier@gmail.com> Notes: Merged: https://github.com/ruby/ruby/pull/12975 |
| `rb_str_hash` doesn't include the encoding for ASCII only strings because ASCII only strings are equal regardless of their encoding. But in the case if the `fstring_table`, two identical ASCII strings with different encodings aren't equal. Given it's common to have both `:foo` (or `def foo`) and `"foo"` in the same source code, this causes a lot of collisions in the `fstring_table`. Notes: Merged: https://github.com/ruby/ruby/pull/12881 |
| [Bug #21172] This fixes a rare CI failure. The timeline of the race condition is: - A `"foo" oid=1` string is interned. - `"foo" oid=1` is no longer referenced and will be swept in the future. - Another `"foo" oid=2` string is interned. - `register_fstring` finds `"foo" oid=1`, but since it is about to be swept, removes it from `fstring_table` and insert `"foo" oid=2` instead. - `"foo" oid=1` is swept, since it has the `RSTRING_FSTR` flag, a `st_delete` is issued in `fstring_table` which removes `"foo" oid=2`. I don't know how to reproduce this bug consistently in a single test case. Notes: Merged: https://github.com/ruby/ruby/pull/12857 |
| In gsub is used with a string replacement or a map that doesn't have a default proc, we know for sure no code can cause the MatchData to escape the `gsub` call. In such case, we still have to allocate a new MatchData because we don't know what is the lifetime of the backref, but for any subsequent match we can re-use the MatchData we allocated ourselves, reducing allocations significantly. This partially fixes [Misc #20652], except when a block is used, and partially reduce the performance impact of abc0304cb28cb9dcc3476993bc487884c139fd11 / [Bug #17507] ``` compare-ruby: ruby 3.5.0dev (2025-02-24T09:44:57Z master 5cf146399f) +PRISM [arm64-darwin24] built-ruby: ruby 3.5.0dev (2025-02-24T10:58:27Z gsub-elude-match da966636e9) +PRISM [arm64-darwin24] warming up.... | |compare-ruby|built-ruby| |:----------------|-----------:|---------:| |escape | 3.577k| 3.697k| | | -| 1.03x| |escape_bin | 5.869k| 6.743k| | | -| 1.15x| |escape_utf8 | 3.448k| 3.738k| | | -| 1.08x| |escape_utf8_bin | 6.361k| 7.267k| | | -| 1.14x| ``` Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com> |
| If the provided Hash doesn't have a default proc, we know for sure that we'll never call into user provided code, hence the string we allocate to access the Hash can't possibly escape. So we don't actually have to allocate it, we can use a fake_str, AKA a stack allocated string. ``` compare-ruby: ruby 3.5.0dev (2025-02-10T13:47:44Z master 3fb455adab) +PRISM [arm64-darwin23] built-ruby: ruby 3.5.0dev (2025-02-10T17:09:52Z opt-gsub-alloc ea5c28958f) +PRISM [arm64-darwin23] warming up.... | |compare-ruby|built-ruby| |:----------------|-----------:|---------:| |escape | 3.374k| 3.722k| | | -| 1.10x| |escape_bin | 5.469k| 6.587k| | | -| 1.20x| |escape_utf8 | 3.465k| 3.734k| | | -| 1.08x| |escape_utf8_bin | 5.752k| 7.283k| | | -| 1.27x| ``` Notes: Merged: https://github.com/ruby/ruby/pull/12730 |
| Notes: Merged: https://github.com/ruby/ruby/pull/12608 |
| Lots of documentation examples still use encoding APIs with encoding names rather than encoding constants. I think it would be preferable to direct users toward constants as it can help with auto-completion, static analysis and such. Notes: Merged: https://github.com/ruby/ruby/pull/12552 |
| Notes: Merged: https://github.com/ruby/ruby/pull/12496 |
| Notes: Merged: https://github.com/ruby/ruby/pull/12341 |
| While profiling `strscan`, I noticed `rb_must_asciicompat` was quite slow, as more than 5% of the benchmark was spent in it: https://share.firefox.dev/49bOcTn By checking for the common 3 ASCII compatible encoding index first, we can skip a lot of expensive operations in the happy path. Notes: Merged: https://github.com/ruby/ruby/pull/12180 |
| Notes: Merged: https://github.com/ruby/ruby/pull/12169 |
| Notes: Merged: https://github.com/ruby/ruby/pull/12169 |
| So that it doesn't get included in the generated binaries for builds that don't support loading shared GC modules Co-Authored-By: Peter Zhu <peter@peterzhu.ca> Notes: Merged: https://github.com/ruby/ruby/pull/12149 |
| |
| We can't pass `nil` as the second parameter of `String#split`. Therefore, descriptions like "if limit is nil, ..." are not appropriate. Notes: Merged: https://github.com/ruby/ruby/pull/12109 |
| * YJIT: Specialize `String#[]` (`String#slice`) with fixnum arguments String#[] is in the top few C calls of several YJIT benchmarks: liquid-compile rubocop mail sudoku This speeds up these benchmarks by 1-2%. * YJIT: Try harder to get type info for `String#[]` In the large generated code of the mail gem the context doesn't have the type info. In that case if we peek at the stack and add a guard we can still apply the specialization and it speeds up the mail benchmark by 5%. Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com> --------- Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com> Co-authored-by: Takashi Kokubun (k0kubun) <takashikkbn@gmail.com> Notes: Merged-By: maximecb <maximecb@ruby-lang.org> |
| * Use FL_USER0 for ELTS_SHARED This makes space in RString for two bits for chilled strings. * Mark strings returned by `Symbol#to_s` as chilled [Feature #20350] `STR_CHILLED` now spans on two user flags. If one bit is set it marks a chilled string literal, if it's the other it marks a `Symbol#to_s` chilled string. Since it's not possible, and doesn't make much sense to include debug info when `--debug-frozen-string-literal` is set, we can't include allocation source, but we can safely include the symbol name in the warning message, making it much easier to find the source of the issue. Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com> --------- Co-authored-by: Étienne Barrié <etienne.barrie@gmail.com> Co-authored-by: Jean Boussier <jean.boussier@gmail.com> |
| Since `str_do_hash` will most likely scan the string to compute the coderange, we might as well copy it over in the interned string in case it's useful later. Notes: Merged: https://github.com/ruby/ruby/pull/12077 |
| While profiling msgpack-ruby I noticed a very substantial amout of time spent in `rb_enc_associate_index`, called by `rb_utf8_str_new`. On that benchmark, `rb_utf8_str_new` is 33% of the total runtime, in big part because it cause GC to trigger often, but even then `5.3%` of the total runtime is spent in `rb_enc_associate_index` called by `rb_utf8_str_new`. After closer inspection, it appears that it's performing a lot of safety check we can assert we don't need, and other extra useless operations, because strings are first created and filled as ASCII-8BIT and then later reassociated to the desired encoding. By directly allocating the string with the right encoding, it allow to skip a lot of duplicated and useless operations. After this change, the time spent in `rb_utf8_str_new` is down to `28.4%` of total runtime, and most of that is GC. Notes: Merged: https://github.com/ruby/ruby/pull/12076 |
| This allows to declare it as leaf just like `Symbol#to_s`. Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com> |
| Co-authored-by: Jean Boussier <byroot@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/11990 |
| When a fake string is interned, use the capa field to store the string hash. This lets us compute it once for hash lookup and embedding the hash in the interned string. Co-authored-by: Jean Boussier <byroot@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/11989 |
| and make it support `search_len == 0`, just for the case Ref [Bug #20796] Notes: Merged: https://github.com/ruby/ruby/pull/11923 |
| [Feature #20205] The warning now suggests running with --debug-frozen-string-literal: ``` test.rb:3: warning: literal string will be frozen in the future (run with --debug-frozen-string-literal for more information) ``` When using --debug-frozen-string-literal, the location where the string was created is shown: ``` test.rb:3: warning: literal string will be frozen in the future test.rb:1: info: the string was created here ``` When resurrecting strings and debug mode is not enabled, the overhead is a simple FL_TEST_RAW. When mutating chilled strings and deprecation warnings are not enabled, the overhead is a simple warning category enabled check. Co-authored-by: Jean Boussier <byroot@ruby-lang.org> Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org> Co-authored-by: Jean Boussier <byroot@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/11893 |
| Notes: Merged: https://github.com/ruby/ruby/pull/11700 Merged-By: nobu <nobu@ruby-lang.org> |
| UNREACHABLE uses __builtin_unreachable which is not intended to be used as an assertion. Notes: Merged: https://github.com/ruby/ruby/pull/11678 |
| The UNREACHABLE macro calls __builtin_unreachable, which according to the [GCC docs](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005funreachable): > If control flow reaches the point of the __builtin_unreachable, the > program is undefined. But it can reach this point with the following script: "123".append_as_bytes("123") This can crash on some platforms with a `Trace/BPT trap: 5`. Notes: Merged: https://github.com/ruby/ruby/pull/11678 |
| This is a static function only called in two places (rb_to_id and rb_to_symbol), and in both places, both symbols and strings are allowed. This makes the error message consistent with rb_check_id and rb_check_symbol. Fixes [Bug #20607] Notes: Merged: https://github.com/ruby/ruby/pull/11097 |
| [Feature #20594] A handy method to construct a string out of multiple chunks. Contrary to `String#concat`, it doesn't do any encoding negociation, and simply append the content as bytes regardless of whether this result in a broken string or not. It's the caller responsibility to check for `String#valid_encoding?` in cases where it's needed. When passed integers, only the lower byte is considered, like in `String#setbyte`. Notes: Merged: https://github.com/ruby/ruby/pull/11552 |
| Notes: Merged: https://github.com/ruby/ruby/pull/11540 |
| |
| Profiling of `JSON.dump` shows a significant amount of time spent in `rb_enc_str_asciionly_p`, in large part because it fetches the encoding. It can be made twice as fast in this scenario by first checking the coderange and only falling back to fetching the encoding if the coderange is unknown. Additionally we can skip fetching the encoding for the common popular encodings. Notes: Merged: https://github.com/ruby/ruby/pull/11533 |
| On OSX, String#rindex is slow due to the lack of `memrchr`. The fallback implementation finds a match by instead doing a `memcmp` on every single character in the search string looking for a substring match. For OSX hosts, this changeset introduces a simple `memrchr` implementation, `rb_memrchr`, that can be used instead. An example benchmark below demonstrates an 8000 char long search string with a 10 char substring near the end. ``` ruby-master | substring near the end | osx UTF-8 user system total real index 0.000111 0.000000 0.000111 ( 0.000110) rindex 0.000446 0.000005 0.000451 ( 0.000454) ``` ``` ruby-patched | substring near the end | osx UTF-8 user system total real index 0.000112 0.000000 0.000112 ( 0.000111) rindex 0.000057 0.000001 0.000058 ( 0.000057) ``` Notes: Merged: https://github.com/ruby/ruby/pull/11519 |
| If both strings have the same encoding, all this work is useless. Notes: Merged: https://github.com/ruby/ruby/pull/11353 |
| If the string only contain single byte characters we can skips all the costly checks. Notes: Merged: https://github.com/ruby/ruby/pull/11353 |
| `rb_enc_from_index` is a costly operation so it is worth avoiding to call it for the common encodings. Also in the case of UTF-8, it's more efficient to scan the coderange if it is unknown that to fallback to the slower algorithms. Notes: Merged: https://github.com/ruby/ruby/pull/11353 |
| `STR_EMBED_P` uses `FL_TEST_RAW` meaning we already assume `str` isn't an immediate, so we can use `FL_TEST_RAW` here too. Notes: Merged: https://github.com/ruby/ruby/pull/11350 |
| If we assume that most strings we modify are not frozen and are independent, then we can optimize this case by replacing multiple flag checks by a single mask check. Notes: Merged: https://github.com/ruby/ruby/pull/11350 |
| codepoint values. (#11032) * Document why we need to explicitly spill registers. * Simplify passing a byte value to `str_buf_cat`. * YJIT: Enhance the `String#<<` method substitution to handle integer codepoint values. * YJIT: Move runtime type check into YJIT. Performing the check in YJIT means we can make assumptions about the type. It also improves correctness of stack traces in cases where the codepoint argument is not a String or a Fixnum. Notes: Merged-By: maximecb <maximecb@ruby-lang.org> |
| [Bug #20585] This was changed in 36a06efdd9f0604093dccbaf96d4e2cb17874dc8 because `String.new(1024)` would end up allocating `1025` bytes, but the problem with this change is that the caller may be trying to right size a String. So instead, we should just better document the behavior of `capacity:`. |
| strings. |