\$\begingroup\$

We have a graphics library for the TI-84 Plus CE, which uses the 24-bit eZ80. It has a 16-bit 1555 screen, so we have a gfx_Darken function that darkens a 16-bit 1555 color by some amount.
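
For readers who have not met the format, here is a minimal sketch of the 1555 bit layout. It assumes the field positions implied by the masks in the code later in this post: bits 0-4, bits 5-9 with bit 15 as the extra low green bit, and bits 10-14. The names `fields1555`, `unpack1555`, `pack1555`, and `check_roundtrip` are hypothetical helpers for illustration, not part of the library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the three channel fields of a 1555 color. */
typedef struct {
    uint8_t low5;   /* bits 0-4 */
    uint8_t green6; /* bits 5-9 are the upper five bits, bit 15 is the lowest bit */
    uint8_t high5;  /* bits 10-14 */
} fields1555;

/* Split a 16-bit value into its fields (layout assumed from the post's masks). */
fields1555 unpack1555(uint16_t color) {
    fields1555 f;
    f.low5 = (uint8_t)(color & 0x1F);
    f.green6 = (uint8_t)(((color & 0x3E0) >> 4) | ((color & 0x8000) ? 1 : 0));
    f.high5 = (uint8_t)((color & 0x7C00) >> 10);
    return f;
}

/* Reassemble the fields into the positions they were read from. */
uint16_t pack1555(fields1555 f) {
    return (uint16_t)(((f.green6 & 1) ? 0x8000 : 0x0000)
                      | (f.high5 << 10)
                      | ((f.green6 >> 1) << 5)
                      | f.low5);
}

/* Every 16-bit value should survive an unpack/pack round trip. */
void check_roundtrip(void) {
    for (uint32_t c = 0; c <= 0xFFFF; ++c) {
        assert(pack1555(unpack1555((uint16_t)c)) == (uint16_t)c);
    }
}
```

Calling `check_roundtrip()` on a host machine confirms that the four masks cover all 16 bits without overlap.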

To conserve RAM and improve performance, the graphics library is written entirely in assembly with a C interface. However, because the assembly routine is out of line, Clang cannot constant-fold expressions such as gfx_Darken(0x1234, 56). We could provide inline or static inline functions to allow constant folding, but that would increase binary size whenever the expression is not a compile-time constant, because Clang doesn't generate efficient code for the eZ80. C versus assembly: https://godbolt.org/z/n83j3MsW1

In summary, my goal is for gfx_Darken to be evaluated at compile time when possible, and to fall back to the out-of-line function call otherwise. This is similar to how I would want sin(1.0) to be evaluated at compile time, without having the 100 lines of code for sin inlined when it can't be.

The solution I came up with uses __builtin_constant_p to check whether the inputs are compile-time constants. If they are, the C implementation is used so that the compiler can fold it; otherwise it generates an out-of-line call to gfx_Darken. This solution appears to work in C99/C++11 in Clang 15.0.7.

    #ifdef __cplusplus
    extern "C" {
    #endif

    // out of line assembly function
    uint16_t gfx_Darken(uint16_t color, uint8_t amount) __attribute__((__const__));

    #ifdef __cplusplus
    }
    #endif

    // darkens a 16bit 1555 color
    static inline __attribute__((__always_inline__))
    #ifdef __cplusplus
    constexpr
    #endif
    uint16_t gfx_Darken_Inline(uint16_t color, uint8_t amount) {
        if (__builtin_constant_p(color) && __builtin_constant_p(amount)) {
            uint8_t r = (uint8_t)(color & 0x1F);
            uint8_t g = (uint8_t)((color & 0x3E0) >> 4) + ((color & 0x8000) ? 1 : 0);
            uint8_t b = (uint8_t)((color & 0x7C00) >> 10);
            r = (r * amount + 128) / 256;
            g = (g * amount + 128) / 256;
            b = (b * amount + 128) / 256;
            return ((g & 0x1) ? 0x8000 : 0x0000) | (r << 10) | ((g >> 1) << 5) | b;
        }
        return gfx_Darken(color, amount);
    }

    #define gfx_Darken(color, amount) gfx_Darken_Inline(color, amount)

Are there any caveats to this approach? The main one I can think of is that I would have to guarantee that the assembly and C implementations match exactly.
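
One way to gain confidence that two implementations match is an exhaustive comparison, since the input space is small. The following is a minimal host-side sketch for the per-channel rounding step only. `scale_div` mirrors the `(c * amount + 128) / 256` expression from the C version above, while `scale_shift` is a hypothetical stand-in for what an assembly routine might do (taking the high byte of the sum); it is not the library's actual code. A real check would compare the C version against the actual assembly routine over all 2^24 (color, amount) pairs.

```c
#include <assert.h>
#include <stdint.h>

/* Per-channel scaling as written in the C version: round to nearest, divide by 256. */
uint8_t scale_div(uint8_t channel, uint8_t amount) {
    return (uint8_t)((channel * amount + 128) / 256);
}

/* Hypothetical stand-in for an assembly routine: same sum, but keep only the
 * high byte of the 16-bit result. Assumed for illustration only. */
uint8_t scale_shift(uint8_t channel, uint8_t amount) {
    uint16_t sum = (uint16_t)(channel * amount + 128);
    return (uint8_t)(sum >> 8);
}

/* Count the inputs on which the two versions disagree.
 * Channels are at most 6 bits wide (0..63), amounts are 0..255. */
unsigned count_scale_mismatches(void) {
    unsigned mismatches = 0;
    for (unsigned channel = 0; channel < 64; ++channel) {
        for (unsigned amount = 0; amount < 256; ++amount) {
            if (scale_div((uint8_t)channel, (uint8_t)amount) !=
                scale_shift((uint8_t)channel, (uint8_t)amount)) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}

/* Sanity checks on the rounding formula itself. */
void check_scale_endpoints(void) {
    for (unsigned channel = 0; channel < 64; ++channel) {
        assert(scale_div((uint8_t)channel, 0) == 0);         /* amount 0 gives black */
        assert(scale_div((uint8_t)channel, 255) == channel); /* amount 255 is identity */
    }
}
```

If `count_scale_mismatches()` returns anything other than zero, a call with constant arguments could produce a different color than the same call with runtime arguments, which is exactly the failure mode this caveat is about.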

Also, is it necessary to put static inline __attribute__((__always_inline__)) on the function prototype, or should I only put inline? Additionally, should gfx_Darken_Inline have C or C++ linkage in C++?

Compiled as C: https://godbolt.org/z/413zhsjeG

Compiled as C++: https://godbolt.org/z/P66PMcfnT

\$\endgroup\$
3
  • 2
    \$\begingroup\$There is std::is_constant_evaluated(), or just if consteval.\$\endgroup\$
    – indi
    Commented Apr 20 at 20:52
  • 1
    \$\begingroup\$Oof, it seems Clang doesn't see that it can use the mlt instruction, and calls some function to do 16x16-bit multiplication instead. This then also affects register allocation, and the whole thing becomes quite inefficient. Ideally you'd fix the compiler, so you can write C or C++ code and not have to hand-write assembly anymore.\$\endgroup\$ Commented Apr 20 at 21:41
  • 2
    \$\begingroup\$@indi std::is_constant_evaluated() and if consteval only take the compile-time path when the expression is manifestly constant-evaluated. So constexpr int x = func(3) will be constant folded while int x = func(3) won't be. This causes test2 to emit a function call to gfx_Darken even though both arguments are constants: godbolt.org/z/Woz64K3hP\$\endgroup\$ Commented Apr 21 at 2:04
