ruby.git - The Ruby Programming Language


Age	Commit message (Collapse)	Author
2025-03-18	Do not break within certain combinations with Indic_Conjunct_Break (InCB)=Linker	Mari Imaizumi
	https://www.unicode.org/reports/tr29/tr29-43.html#GB9c Notes: Merged: https://github.com/ruby/ruby/pull/12798
2025-03-18	Fix case folding in single byte encoding	Mari Imaizumi
	Notes: Merged: https://github.com/ruby/ruby/pull/12889
2025-02-28	Use mbuf instead of bitset for character class for small UTF. Fixes #16145	Maciej Rzasa
	Notes: Merged: https://github.com/ruby/ruby/pull/12787
2024-09-26	regparse possible memory leak fix proposal	David Carlier
	Notes: Merged: https://github.com/ruby/ruby/pull/5700
2024-01-19	Remove null checks for xfree	Peter Zhu
	xfree can handle null values, so we don't need to check it.
2024-01-11	Remove an unused variable in i_print_name_entry (#9468)	Hiroya Fujinami
	A warning for this is shown when `ONIG_DEBUG_COMPILE` is enabled.
2024-01-08	Fix memory leak in regexp grapheme clusters	Peter Zhu
	[Bug #20161] The cc->mbuf gets overwritten, so we need to free it to not leak memory. For example:
	    str = "hello world".encode(Encoding::UTF_32LE)
	    10.times do
	      1_000.times do
	        str.grapheme_clusters
	      end
	      puts `ps -o rss= -p #{$$}`
	    end
	Before: 15536 15760 15920 16144 16304 16480 16640 16784 17008 17280
	After: 15584 15584 15760 15824 15888 15888 15888 15888 16048 16112
2023-11-08	Improve error and memory handling	Adam Hess
	Apply Nobu's suggestions which improve style, memory handling and error correction. Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2023-11-08	fix regex from regex memory corruption	Adam Hess
	Before this change, creating a regex from a regex with a named capture, e.g. Regexp.new(/(?<name>)/), causes memory to be shared between the two named capture groups, which can cause a segfault if the original is GCed.
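The scenario described in this commit message can be sketched as follows. This is only an illustration, assuming a Ruby build that already contains the fix; the pattern and the input string are made up for the example, and `GC.start` is a request, not a guarantee that the original object is collected.

```ruby
# Build the original from a string so it is not pinned by a regexp literal.
original = Regexp.new('(?<name>\w+)')

# Create a second Regexp from the first one, as in the commit message.
copy = Regexp.new(original)

# Drop the only reference to the original and ask the GC to run.
original = nil
GC.start

# With the fix, the copy has its own name table and remains usable.
match = copy.match("ruby")
puts copy.names.inspect
puts match[:name]
```

On an affected build, the same steps could leave the copy pointing at memory owned by the collected original.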
2023-11-03	Fix onigmo name table without st	Nobuyoshi Nakada
	Co-authored-by: Adam Hess <HParker@github.com>
2023-11-02	Fix functions for name tables as `st_foreach_callback_func`	Nobuyoshi Nakada

2023-06-30	Don't check for null pointer in calls to free	Peter Zhu
	According to the C99 specification section 7.20.3.2 paragraph 2: > If ptr is a null pointer, no action occurs. So we do not need to check that the pointer is a null pointer. Notes: Merged: https://github.com/ruby/ruby/pull/8004
2022-10-25	Prevent potential buffer overrun in onigmo	Yusuke Endoh
	A code pattern `p + enclen(enc, p, pend)` may lead to a buffer overrun if incomplete bytes of a UTF-8 character are placed at the end of a string. Because this pattern is used in several places in onigmo, this change fixes the issue on the side of `enclen`: the function should not return a number that is larger than `pend - p`. Co-Authored-By: Nobuyoshi Nakada <nobu@ruby-lang.org> Notes: Merged: https://github.com/ruby/ruby/pull/6628
2022-10-25	Prevent buffer overrun in regparse.c	Yusuke Endoh
	A regexp that ends with an escape following an incomplete UTF-8 char might cause buffer overrun. Found by OSS-Fuzz.
	```
	$ valgrind ./miniruby -e 'Regexp.new("\\u2d73\\0\\0\\0\\0 \\\xE6".b)'
	==296213== Memcheck, a memory error detector
	==296213== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
	==296213== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
	==296213== Command: ./miniruby -e Regexp.new("\\\\u2d73\\\\0\\\\0\\\\0\\\\0\ \ \ \ \ \ \ \ \ \ \\\\\\xE6".b)
	==296213==
	==296213== Warning: client switching stacks? SP change: 0x1ffe8020e0 --> 0x1ffeffff10
	==296213== to suppress, use: --max-stackframe=8379952 or greater
	==296213== Invalid read of size 1
	==296213==    at 0x484EA10: memmove (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
	==296213==    by 0x339568: memcpy (string_fortified.h:29)
	==296213==    by 0x339568: onig_strcpy (regparse.c:271)
	==296213==    by 0x339568: onig_node_str_cat (regparse.c:1413)
	==296213==    by 0x33CBA0: parse_exp (regparse.c:6198)
	==296213==    by 0x33EDE4: parse_branch (regparse.c:6511)
	==296213==    by 0x33EEA2: parse_subexp (regparse.c:6544)
	==296213==    by 0x34019C: parse_regexp (regparse.c:6593)
	==296213==    by 0x34019C: onig_parse_make_tree (regparse.c:6638)
	==296213==    by 0x32782D: onig_compile_ruby (regcomp.c:5779)
	==296213==    by 0x313EFA: onig_new_with_source (re.c:876)
	==296213==    by 0x313EFA: make_regexp (re.c:900)
	==296213==    by 0x313EFA: rb_reg_initialize (re.c:3136)
	==296213==    by 0x318555: rb_reg_initialize_str (re.c:3170)
	==296213==    by 0x318555: rb_reg_init_str (re.c:3205)
	==296213==    by 0x31A669: rb_reg_initialize_m (re.c:3856)
	==296213==    by 0x3E5165: vm_call0_cfunc_with_frame (vm_eval.c:150)
	==296213==    by 0x3E5165: vm_call0_cfunc (vm_eval.c:164)
	==296213==    by 0x3E5165: vm_call0_body (vm_eval.c:210)
	==296213==    by 0x3E89BD: vm_call0_cc (vm_eval.c:87)
	==296213==    by 0x3E89BD: rb_call0 (vm_eval.c:551)
	==296213== Address 0x9d45b10 is 0 bytes after a block of size 32 alloc'd
	==296213==    at 0x4844899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
	==296213==    by 0x20FA7B: objspace_xmalloc0 (gc.c:12146)
	==296213==    by 0x35F8C9: str_buf_cat4.part.0 (string.c:3132)
	==296213==    by 0x31359D: unescape_escaped_nonascii (re.c:2690)
	==296213==    by 0x313A9D: unescape_nonascii (re.c:2869)
	==296213==    by 0x313A9D: rb_reg_preprocess (re.c:2992)
	==296213==    by 0x313DFC: rb_reg_initialize (re.c:3109)
	==296213==    by 0x318555: rb_reg_initialize_str (re.c:3170)
	==296213==    by 0x318555: rb_reg_init_str (re.c:3205)
	==296213==    by 0x31A669: rb_reg_initialize_m (re.c:3856)
	==296213==    by 0x3E5165: vm_call0_cfunc_with_frame (vm_eval.c:150)
	==296213==    by 0x3E5165: vm_call0_cfunc (vm_eval.c:164)
	==296213==    by 0x3E5165: vm_call0_body (vm_eval.c:210)
	==296213==    by 0x3E89BD: vm_call0_cc (vm_eval.c:87)
	==296213==    by 0x3E89BD: rb_call0 (vm_eval.c:551)
	==296213==    by 0x3E957B: rb_call (vm_eval.c:877)
	==296213==    by 0x3E957B: rb_funcallv_kw (vm_eval.c:1074)
	==296213==    by 0x2A4123: rb_class_new_instance_pass_kw (object.c:1991)
	==296213==
	==296213==
	==296213== HEAP SUMMARY:
	==296213==    in use at exit: 35,476,538 bytes in 9,489 blocks
	==296213==    total heap usage: 14,893 allocs, 5,404 frees, 37,517,821 bytes allocated
	==296213==
	==296213== LEAK SUMMARY:
	==296213==    definitely lost: 316,081 bytes in 2,989 blocks
	==296213==    indirectly lost: 136,808 bytes in 2,361 blocks
	==296213==    possibly lost: 1,048,624 bytes in 3 blocks
	==296213==    still reachable: 33,975,025 bytes in 4,136 blocks
	==296213==    suppressed: 0 bytes in 0 blocks
	==296213== Rerun with --leak-check=full to see details of leaked memory
	==296213==
	==296213== For lists of detected and suppressed errors, rerun with: -s
	==296213== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
	```
	Notes: Merged: https://github.com/ruby/ruby/pull/6625 Merged-By: mame <mame@ruby-lang.org>
2022-07-12	Fix some UBSAN false positives (#6115)	Kevin Backhouse
	* Fix some UBSAN false positives. * ruby tool/update-deps --fix Notes: Merged-By: jhawthorn <john@hawthorn.email>
2022-06-21	regparse.c: Suppress false-positive warnings of GCC 12.1	Yusuke Endoh
	http://rubyci.s3.amazonaws.com/arch/ruby-master/log/20220613T030003Z.log.html.gz
	```
	regparse.c:264:15: warning: array subscript 56 is outside array bounds of ‘Node[1]’ {aka ‘struct _Node[1]’} [-Warray-bounds]
	```
	and
	```
	/usr/include/bits/string_fortified.h:29:10: warning: ‘__builtin_memcpy’ pointer overflow between offset 32 and size [9223372036854775792, 9223372036854775807] [-Warray-bounds]
	```
	Notes: Merged: https://github.com/ruby/ruby/pull/6013
2021-09-27	Add printf-style format attribute to oniguruma functions	Nobuyoshi Nakada
	Also make the format string compatible with literal strings which are const arrays of "plain" chars. Notes: Merged: https://github.com/ruby/ruby/pull/4899 Merged-By: nobu <nobu@ruby-lang.org>
2020-12-02	Do not reduce quantifiers if it affects which text will be matched	Jeremy Evans
	Quantifier reduction when using +?)* and +?)+ should not be done as it affects which text will be matched. This removes the need for the RQ_PQ_Q ReduceType, so remove the enum entry and related switch case. Test that these are the only two patterns affected by testing all quantifier reduction tuples for both the captured and uncaptured cases and making sure the matched text is the same for both. Fixes [Bug #17341] Notes: Merged: https://github.com/ruby/ruby/pull/3808
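As a small illustration of the two patterns named above, the following sketch compares the captured and uncaptured forms on a sample input. It assumes a Ruby version that includes this change; the input string is arbitrary.

```ruby
text = "aaa"

# The inner quantifier is lazy, but the outer one is greedy, so the
# overall match should still consume every "a".
uncaptured_star = text[/(?:a+?)*/]
captured_star   = text[/(a+?)*/]
uncaptured_plus = text[/(?:a+?)+/]
captured_plus   = text[/(a+?)+/]

puts [uncaptured_star, captured_star, uncaptured_plus, captured_plus].inspect
```

The point of the check is that wrapping the inner expression in a capturing or a non-capturing group should not change which text is matched.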
2020-11-24	Detect the premature end of char property in regexp	Jeremy Evans
	Default to ONIGERR_INVALID_CHAR_PROPERTY_NAME in fetch_char_property_to_ctype and only set otherwise if an ending } is found. Fixes [Bug #17340] Notes: Merged: https://github.com/ruby/ruby/pull/3807
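Seen from Ruby, the effect can be sketched like this. The pattern below is a made-up example of a character property that is never closed with `}`; the exact error message text is not asserted here because it may vary between versions.

```ruby
# A character property that is opened but never closed.
source = '\p{Alpha'

result =
  begin
    Regexp.new(source)
    :compiled
  rescue RegexpError => e
    puts e.message
    :rejected
  end

puts result
```

A well-formed property such as `\p{Alpha}` compiles normally, so only the truncated form is rejected.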
2019-12-20	Fixed misspellings	Nobuyoshi Nakada
	Fixed misspellings reported at [Bug #16437], only in ruby and rubyspec.
2019-08-27	st_foreach now free from ANYARGS	卜部昌平
	After 5e86b005c0f2ef30df2f9906c7e2f3abefe286a2, I now think ANYARGS is dangerous and should be extinct. This commit deletes ANYARGS from st_foreach. I strongly believe that this commit should have come with b0af0592fdd9e9d4e4b863fde006d67ccefeac21, which added an extra parameter to st_foreach callbacks.
2019-06-29	Fixed String#grapheme_clusters with wide encodings	Nobuyoshi Nakada
	* string.c (get_reg_grapheme_cluster): make regexp from properly encoded sources for wide-char encodings. [Bug #15965] * regparse.c (node_extended_grapheme_cluster): suppress false duplicated range warning for the time being.
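A quick way to exercise this behavior is to compare the clusters of the same text in UTF-8 and in a wide encoding. This is a sketch on a Ruby version that includes the fix; the sample string is an arbitrary choice.

```ruby
# "e" followed by COMBINING ACUTE ACCENT forms one cluster, then "x".
utf8 = "e\u0301x"
wide = utf8.encode(Encoding::UTF_16LE)

utf8_clusters = utf8.grapheme_clusters
wide_clusters = wide.grapheme_clusters

# Convert the wide clusters back to UTF-8 so the two lists are comparable.
round_trip = wide_clusters.map { |cluster| cluster.encode(Encoding::UTF_8) }

puts utf8_clusters.size
puts round_trip == utf8_clusters
```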
2018-12-07	convert check for array length to assertion and comment out	duerst
	In regparse.c, in function node_extended_grapheme_cluster, we used a raw if() with exit(1) as a cross-check for our length calculations for the common node array. Convert this to an assertion and comment it out because it is not needed for active code. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66269 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-07	remove code duplication and put everything into forward order	duerst
	In file regparse.c, in function node_extended_grapheme_cluster(), eliminate code duplication of CRLF and '.' (any character). This uses the fact that both for Unicode encodings and for non-Unicode encodings, the first alternative is CRLF, and the last alternative is '.' (any character). This puts all of the pieces into forward order (the order of the code follows the order of the syntax definition). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66267 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
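The observable consequence of having CRLF as the first alternative and "any character" as the last can be sketched with `\X` from Ruby. The input string is made up for illustration.

```ruby
# CR LF stays together; LF followed by CR is not a CRLF pair and splits.
text = "a\r\nb\n\r"

clusters = text.scan(/\X/)

puts clusters.inspect
```

Here `"\r\n"` is kept as a single cluster by the CRLF alternative, while every other character falls through to the single-character case.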
2018-12-06	remove an unused variable	duerst
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66240 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-06	make sure all nodes are freed on error in node_extended_grapheme_cluster()	duerst
	regparse.c: In function node_extended_grapheme_cluster(), use function-global array node_common and use it for list and alternate construction. This is done so that in case of error, all nodes that have already been constructed can be correctly freed in a single for loop. Document the layout structure. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66239 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-06	remove code duplication and streamline identifiers	duerst
	In regparse.c: * Reduce code duplication by merging the almost identical functions create_sequence_node and create_alternate_node into a new function create_node_from_array, adding a parameter that distinguishes between creating a list and creating an alternative. * Streamline variable/function naming. Unicode UAX #29 uses 'sequence', but the regular expression library uses 'list' for the same concept. Keep 'sequence' in the comments that are taken from UAX #29, but use 'list' in variable names. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66234 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-06	remove obsolete data from unicode.c	duerst
	* unicode.c: Remove the arrays onigenc_unicode_GCB_ranges_GAZ, onigenc_unicode_GCB_ranges_E_Base, and onigenc_unicode_GCB_ranges_Emoji, because they are not needed anymore for Unicode 11.0.0. * regparse.c: Remove external declarations for above arrays. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66232 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-05	remove unused variables in node_extended_grapheme_cluster()	duerst
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66218 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-05	tweak/remove comments [ci skip]	duerst
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66217 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-05	adjust some comments in node_extended_grapheme_cluster() [ci skip]	duerst
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66214 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-05	update to Unicode 11.0.0 (main step, not complete yet)	duerst
	- common.mk: Change Unicode version to 11.0.0, and Emoji version to 11.0 - test/ruby/enc/test_emoji_breaks.rb: update hard-coded Emoji version - enc/unicode/11.0.0, enc/unicode/11.0.0/casefold.h, enc/unicode/name2ctype.h: Add generated files. Files for Unicode 10.0.0 will be removed once we are sure 11.0.0 works. - lib/unicode_normalize/tables.rb: Updated table. - regparse.c: Almost completely reimplement grapheme cluster detection in function node_extended_grapheme_cluster(). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66213 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-02	remove unnecessary settings with NULL_NODE in \X implementation	duerst
	Remove unnecessary settings of node_array elements to NULL_NODE. We can do this because we initialize the whole array to NULL_NODEs and set everything again to NULL_NODEs when creating a sequence or alternative node. Also, fix an index error in the initialization of node_array. (issue #15343) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66139 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-02	fix order of declarations and code at start of node_extended_grapheme_cluster()	duerst
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66138 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-02	fix last commit (r66135)	ko1
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66137 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-02	make sure all nodes are freed on error in node_extended_grapheme_cluster()	duerst
	regparse.c: In function node_extended_grapheme_cluster(), introduce function-global array node_array and use it for sequence and alternate construction. This is done so that in case of error, all nodes that have already been constructed can be correctly freed. (issue #15343) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66135 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-02	expand a small comment [ci skip]	duerst
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66132 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-02	add/change some comments in node_extended_grapheme_cluster() [ci skip]	duerst
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66123 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-02	reformat code [ci skip]	duerst
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66122 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-01	remove unnecessary code removing CR/LF from range	duerst
	Remove code that tries to remove CR and LF from Grapheme_Cluster_Break=Control. This code is unnecessary because Grapheme_Cluster_Break=Control already excludes CR and LF. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66116 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-01	* remove trailing spaces.	svn
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66115 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-01	introduce and use create_alternate_node()	duerst
	Introduce new function create_alternate_node() to create an alternative node from a list of nodes in one go. Use it once (two more uses expected). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66114 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-01	eliminate a list with only one element	duerst
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66113 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-28	remove two unnecessary variables (np2 and np3)	duerst
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66072 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-28	eliminate intermediate variable in very short block (3 times)	duerst
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66071 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-28	use create_sequence_node() four more times	duerst
	Four more uses of create_sequence_node() in node_extended_grapheme_cluster (a few more to come). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66070 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-28	use create_sequence_node() once more	duerst
	One more use of create_sequence_node() in node_extended_grapheme_cluster (several more to come). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66063 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-28	introduce macro R_ERR to reduce repetitive code	duerst
	Introduce a new preprocessor macro R_ERR to visually reduce repetitive code checking for return values and going to the err: label at the end of the function node_extended_grapheme_cluster(). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66057 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-28	reduce number of arguments on quantify_property_node()	duerst
	There are only four patterns of the last two arguments to quantify_property_node(). By replacing the lower/upper arguments with a single char, we get more expressive calls, the last argument directly corresponding to the quantifier that we want to use (except for '2', which means exactly two). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66052 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
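The idea of replacing a lower/upper pair with a single quantifier character can be modeled in a few lines of Ruby. This is not the C code from regparse.c: the constant and method names are hypothetical, and the set of characters (`?`, `*`, `+`, `2`) is an assumption based on the description above, which only confirms that there are four patterns and that `'2'` means exactly two.

```ruby
# Hypothetical model: one character selects a (lower, upper) repeat range.
INFINITE = Float::INFINITY

QUANTIFIER_BOUNDS = {
  "?" => [0, 1],         # zero or one
  "*" => [0, INFINITE],  # zero or more
  "+" => [1, INFINITE],  # one or more
  "2" => [2, 2],         # exactly two (the one non-quantifier character)
}.freeze

# Look up the bounds for a quantifier character, rejecting unknown ones.
def quantifier_bounds(code)
  QUANTIFIER_BOUNDS.fetch(code) do
    raise ArgumentError, "unknown quantifier code: #{code.inspect}"
  end
end

lower, upper = quantifier_bounds("2")
puts "lower=#{lower} upper=#{upper}"
```

A call site then reads as the quantifier it applies, instead of carrying two numeric arguments whose meaning has to be decoded.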
2018-11-27	fix order of subexpressions for Hangul	duerst
	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66048 b2dd03c8-39d4-4d8f-98ff-823fe69b080e