
Optimize JSON string encoding #17734


Open · wants to merge 2 commits into base: master
Conversation

nielsdos
Member

@nielsdos commented Feb 8, 2025

  • Split out a specialized function to decode multibyte UTF-8 sequences

    Decoding purely multibyte UTF-8 sequences is common, for example in
    JSON. Furthermore, we want to avoid the switch on the character set in
    such hot code. Finally, we also add UNEXPECTED markers to move code to
    the cold section, which reduces pressure on the µop and instruction
    caches.

  • Optimize JSON string encoding

    There are a couple of optimizations that work together:

    • We now use the specialized php_next_utf8_char_mb() helper function to
      avoid pressure on the µop and instruction cache.
    • It no longer emits UTF-8 bytes under PHP_JSON_UNESCAPED_UNICODE until
      it actually has to. By emitting in bulk, this improves performance.
    • Code layout tweaks
      • Use a specialized php_json_append() and assertions to avoid
        allocating the initial buffer, as this is already done upfront.
      • Factor out the call to smart_str_extend() to above the UTF-16 check
        to avoid code bloat.
    • Use SIMD, either with SSE2 or SSE4.2. A resolver is used when SSE4.2
      is not configured at compile time.
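
The SIMD part can be sketched roughly as follows. This is a minimal standalone sketch and not the PR's actual code; the helper name `json_needs_escape_mask` is made up for illustration. The idea is to classify 16 input bytes at once and only fall back to per-byte handling when the resulting mask is non-zero:

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: bit i of the result is set when byte i of the
 * 16-byte block needs special handling in a JSON string: '"', '\\',
 * a control byte (< 0x20), or a non-ASCII byte (>= 0x80). */
static uint32_t json_needs_escape_mask(const char *p)
{
    __m128i block  = _mm_loadu_si128((const __m128i *)p);
    __m128i quote  = _mm_cmpeq_epi8(block, _mm_set1_epi8('"'));
    __m128i bslash = _mm_cmpeq_epi8(block, _mm_set1_epi8('\\'));
    /* Signed compare: control bytes are < 0x20, and bytes >= 0x80 are
     * negative as signed chars, so both groups compare below 0x20. */
    __m128i low    = _mm_cmplt_epi8(block, _mm_set1_epi8(0x20));
    __m128i any    = _mm_or_si128(_mm_or_si128(quote, bslash), low);
    return (uint32_t)_mm_movemask_epi8(any);
}
```

When the mask is zero, all 16 bytes can be copied to the output in bulk; otherwise the positions of the set bits tell the scalar path which bytes need escaping or UTF-8 validation.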

Here's a graph where I test different input strings.
The tests take an input string and repeat it using str_repeat, and then perform 200000 calls to json_encode with no flags. If (no escape) is in the label, then we pass JSON_UNESCAPED_UNICODE as a flag.
The first four bars show the very short strings where performance improvements won't be able to do much and no SIMD is possible. The performance difference and overhead of this approach is IMO negligible in those cases.
Next we test some extreme cases. The best case happens when a string doesn't need escaping or anything special. We can see when we need to do escaping or UTF-8 validation/escaping that the performance win is still there, but not as pronounced.
For one extreme UTF-8 case (éa*200) the performance is slightly lower, but compared to the total execution time this shouldn't be noticeable. The last two bars show that performance when using JSON_UNESCAPED_UNICODE is improved as well, which is hopefully the common way people work with UTF-8 in JSON.

Benchmarks done on an i7-4790 on Linux with the SSE4.2 resolver (so not the theoretical optimum, but a likely default).

[benchmark graph image]

The benchmark gist is here: https://gist.github.com/nielsdos/51f32ca9d4610b802d0ea65e7af6ae16

@bukka
Member

bukka commented Feb 8, 2025

Nice! The benchmark is quite promising. I would prefer a bit of clean-up of the code so the ifdefs are not mixed up. That means creating some abstraction and having the ifdefs just at the top or in a different file.

@bukka
Member

bukka commented Feb 8, 2025

But it's not a big deal if you can't be bothered to do the abstraction - certainly not a blocker - just thought that it might be nicer. If you are too busy and all tests pass, feel free to merge it and I might refactor it later.

@bukka
Member

bukka commented Feb 8, 2025

btw I think people are more likely to use the default (including frameworks like Laravel), so I'm not sure JSON_UNESCAPED_UNICODE is used that much...

@nielsdos
Member (Author)

I'll try to split it off / move some code more together soon-ish.

btw I think people are more likely to use the default (including frameworks like Laravel), so I'm not sure JSON_UNESCAPED_UNICODE is used that much...

Should be easy to check, I can check this.

@nielsdos
Member (Author)

so I'm not sure if JSON_UNESCAPED_UNICODE is used that much...

Indeed, most uses within Laravel and Symfony don't use it, or let the user configure it, and only a handful of places use it. A bit unfortunate, as they would miss out on a huge performance win (but necessary of course if they want interop with different character encodings).

@nielsdos marked this pull request as ready for review February 9, 2025 17:25
Comment on lines +683 to +685
pos = 0;
us = php_next_utf8_char_mb((unsigned char *)cur, us, len, &pos);
len -= pos;
pos += pos_old;
Contributor

@mvorisek commented Feb 9, 2025

Suggested change
-pos = 0;
-us = php_next_utf8_char_mb((unsigned char *)cur, us, len, &pos);
-len -= pos;
-pos += pos_old;
+us = php_next_utf8_char_mb((unsigned char *)cur, us, len, &pos);
+len -= pos - pos_old;
Member (Author)

We need pos to be 0 when the function call happens because len is the remaining length instead of the total length.
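
That calling convention can be sketched like this (a standalone sketch with a simplified stand-in for php_next_utf8_char_mb() that only computes sequence lengths; the names and details are illustrative, not the PR's code):

```c
#include <stddef.h>

/* Simplified stand-in for php_next_utf8_char_mb(): consumes one UTF-8
 * sequence from str[*pos .. len) and advances *pos by the number of
 * bytes consumed. `len` is the *remaining* length, so *pos must be 0
 * on entry, exactly as discussed above. */
static void consume_utf8_seq(const unsigned char *str, size_t len, size_t *pos)
{
    unsigned char c = str[*pos];
    size_t n = c < 0x80 ? 1 : c < 0xE0 ? 2 : c < 0xF0 ? 3 : 4;
    if (n > len - *pos)
        n = len - *pos; /* clamp on truncated input */
    *pos += n;
}

/* Caller-side bookkeeping mirroring the snippet above: pos is reset to 0
 * before each call, len tracks the remaining bytes, and pos_old keeps the
 * absolute offset. Returns the total number of bytes walked. */
static size_t walk_multibyte(const unsigned char *input, size_t total)
{
    const unsigned char *cur = input;
    size_t len = total, pos_old = 0, pos;

    while (len > 0) {
        pos = 0;                  /* index into the remaining slice */
        consume_utf8_seq(cur, len, &pos);
        len -= pos;               /* pos is bytes consumed this call */
        pos += pos_old;           /* back to an absolute offset */
        cur = input + pos;
        pos_old = pos;
    }
    return pos_old;
}
```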

Comment on lines +250 to +252
/* It's more important to keep this loop tight than to optimize this with
* a trailing zero count. */
for (; mask; mask >>= 1, *s += 1) {
Contributor

abcd*400 takes 0.06 s (1600 bytes)
"longenclosedstring"*50 takes 0.22 s (1000 bytes)

So when " is present, the encoding is 6x slower. This is actually not an uncommon case: think of HTML/XML/JSON (or basically any code) encoded in JSON.

The worst unoptimized case would be a mask like 10..0..01. Is a loop with zend_ulong_ntz really slower here?

Member (Author)

The ulong_ntz approach was in my other branch. It was significantly slower for denser bitsets; perhaps there's some heuristic to switch between the two. I also found that the smaller code size of this approach was very important for the other inputs as well (even the ones without any escaping).
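
For illustration, the two approaches under discussion can be sketched as follows (a standalone sketch; `__builtin_ctz` stands in for zend_ulong_ntz, and the function names are made up):

```c
#include <stdint.h>

/* Approach kept in the PR: shift the mask one bit at a time.
 * Very small straight-line code, and branch-friendly when the set
 * bits are dense. Writes each set-bit position to out, returns count. */
static int set_bit_positions_shift(uint32_t mask, int *out)
{
    int n = 0;
    for (int pos = 0; mask; mask >>= 1, pos++) {
        if (mask & 1)
            out[n++] = pos;
    }
    return n;
}

/* Alternative discussed above: jump straight to each set bit with a
 * trailing-zero count. Fewer iterations when the set bits are sparse
 * (e.g. a mask like 10..0..01), but larger code. */
static int set_bit_positions_ctz(uint32_t mask, int *out)
{
    int n = 0;
    while (mask) {
        out[n++] = __builtin_ctz(mask);
        mask &= mask - 1; /* clear the lowest set bit */
    }
    return n;
}
```

Both produce identical results; the trade-off described above is purely about code size and how iteration count scales with mask density.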

Contributor

It surprises me, but I saw your experiments prior to this PR and I trust them. Probably too much branching is simply too much, even for the latest super-insanely-fast CPUs :)

I have nothing to add to this PR except one more huge thank you!

int options
)
{
while (*len >= sizeof(__m128i)) {
Contributor

Encoding one "0123456789abcde" string repeated 1000 times will have completely different performance than an array of 1000 "0123456789abcde" strings.

How feasible is something like what mb_fast_check_utf8 does? - https://github.com/php/php-src/blob/php-8.4.3/ext/mbstring/mbstring.c#L5447
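
For reference, the ASCII fast path that SIMD validators such as mb_fast_check_utf8 build on can be sketched portably with a word-at-a-time (SWAR) check. This is an illustrative sketch, not mbstring's implementation, and the function name is made up:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns true when the whole block is pure ASCII, i.e. no byte has the
 * high bit set. Checks 8 bytes per iteration; on non-ASCII input the
 * caller would fall back to full UTF-8 validation. */
static bool block_is_ascii(const char *p, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        uint64_t w;
        memcpy(&w, p + i, 8); /* safe unaligned load */
        if (w & UINT64_C(0x8080808080808080))
            return false; /* some byte is >= 0x80 */
    }
    for (; i < n; i++)
        if ((unsigned char)p[i] & 0x80)
            return false;
    return true;
}
```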

Member (Author)

The blow-up in code size may optimize that case but will likely deoptimize other cases.

@nielsdos force-pushed the json-encode-optimize branch from 0b110d9 to f4a2bc9 on March 17, 2025 20:36
@mvorisek
Contributor

Can this PR be merged?

@nielsdos
Member (Author)

I wanted to do some extra test comparing the thing referred to in this comment (#17734 (comment)) on both an older and newer machine as I now have access to some new hardware too. So I'll try that today.

@nielsdos force-pushed the json-encode-optimize branch from f4a2bc9 to 860b11f on April 18, 2025 14:59
@bukka
Member

I want to actually play with it and go properly through the logic before merging so please wait a bit.

Member

@bukka left a comment

I looked quickly again and I'm not sure I like bundling all those things together. It's then not clear what actually gives the improvements. I would prefer to split it into multiple PRs. I'm not sure if we should bother with that UTF-8 validation as it's still not optimal. It would be better to eventually look at a variant of https://github.com/lemire/fastvalidate-utf-8 . IIRC it's based on a variant of Bjoern Hoehrmann's validator that I used in jsond: https://github.com/bukka/php-jsond/blob/587a413ee44579b9422ba36eec63bfd1734a5d16/php_jsond_utf8_decoder.h . It wasn't actually giving me better results (mainly because I didn't use an ASCII check) so I didn't look further, but a SIMD variant should be better and it might be worth looking into. Unless this one already gives better results, which should however be measured independently.

Also, that php_json_append seems a bit like churn, but maybe I'm misunderstanding something.

Comment on lines +54 to +56
/* dest has a minimum size of the input length,
* this avoids generating initial allocation code */
ZEND_ASSERT(dest->s);
Member

I don't see exactly how an assert can avoid generating the initial allocation if it's removed in production code..?

Member (Author)

if it's removed in production code..?

Because it's not removed: it's changed to a ZEND_ASSUME on compilers that support this, and it's removed on compilers that don't. So it can be used as an optimization hint.
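
The pattern can be sketched like this (an illustrative sketch: `MY_ASSUME`, `struct buf`, and `buf_append` are made-up stand-ins for ZEND_ASSUME and smart_str, using the `__builtin_unreachable()` hint that GCC and Clang support):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* In a release build (NDEBUG) the condition is handed to the compiler
 * as an assumption instead of being checked, so dead branches that
 * contradict it can be eliminated. */
#ifdef NDEBUG
# define MY_ASSUME(c) do { if (!(c)) __builtin_unreachable(); } while (0)
#else
# define MY_ASSUME(c) assert(c)
#endif

struct buf { char *s; size_t len, cap; };

static void buf_append(struct buf *b, const char *src, size_t n)
{
    /* The caller guarantees the buffer was sized upfront; the hint lets
     * the compiler drop any "allocate on first use" path here. */
    MY_ASSUME(b->s != NULL && b->len + n <= b->cap);
    memcpy(b->s + b->len, src, n);
    b->len += n;
}
```

In debug builds the same line still acts as a regular assertion, so the invariant is verified during testing and exploited for code generation in release builds.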

Member

Ah ok, it might make sense to integrate it into smart_str if it has some positive impact, as the same case applies to serialization and possibly other places.

@nielsdos
Member (Author)

I looked quickly again and I'm not sure I like bundling all those things together. It's then not clear what actually gives the improvements. I would prefer to split it into multiple PRs. I'm not sure if we should bother with that UTF-8 validation as it's still not optimal.

You know what, if it's like that then I don't care anymore, someone else can pick this up.

@bukka
Member

You probably confused a comment with a change request - this was just a comment (purely an initial thought after a quick review). Basically I just thought that it's better not to bundle unrelated changes together. But if this is the preferred way, I wouldn't really try to block it, as I can see it's still an improvement.

@mvorisek
Contributor

Niels, please do not let this PR go stale. I was following your experiments in nielsdos#120 and I trust that you went with the fastest solution. Even if for some inputs the improvement could maybe be a little better, this PR is a huge improvement as it is now.

@nielsdos reopened this Apr 25, 2025