\$\begingroup\$

This is a follow-up to "C++20 compile-time string utility", posted as a new question, as suggested by G. Sliepen, so that it can be reviewed on its own.

The following code integrates the improvements suggested in the original question, plus another round of polishing and a bit extra.

substr is added as an exercise, so that the "string utility" lives up to its name, but mostly to highlight the main limitation of this particular implementation: it's not "transparently constexpr" anymore.

The problem arises from the fact that the string length is a non-type template parameter, whereas the substr bounds (pos and count) should preferably be function parameters, if we are sticking to the constexpr paradigm. But then it is not easy to return meta::string<new_length> from substr: function arguments are not constant expressions inside the function body, so they cannot be used to instantiate the return type. Ultimately, this bumps us from the function-parameter (constexpr) level up to the template-parameter level.

Right now substr is effectively a generated family of functions, bound by an implicit in-memory map to their respective values with static storage duration. (Isn't that "tag dispatch" in a nutshell?)

That's a huge gap between the "constexpr" world and "conventional" C++ template metaprogramming. Is there some clever trick to get it "back to constexpr" in this case? Or should we wait for something like a "constexpr function parameter qualifier" or similar? Or something else?
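The limitation described above can be reproduced in isolation. The following is a minimal sketch, not part of meta::string: first_chars and idx are hypothetical names introduced only for illustration. It shows that a plain function parameter cannot be used as a template argument, while a value carried in the type of the argument (here via std::integral_constant) can.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Hypothetical helper (illustration only): the bound is carried in the *type*
// of the second argument, so it can be used as a template argument inside.
template <std::size_t Count>
constexpr auto first_chars(const char* s, std::integral_constant<std::size_t, Count>)
{
    std::array<char, Count + 1> out{};   // value-initialized, so out[Count] == '\0'
    for (std::size_t i = 0; i < Count; ++i)
        out[i] = s[i];
    return out;
}

// Hypothetical shorthand for spelling the constant at the call site.
template <std::size_t V>
inline constexpr std::integral_constant<std::size_t, V> idx{};

// The "transparent" version does not compile, because a function parameter is
// not a constant expression inside the function body:
//
// constexpr auto bad(const char* s, std::size_t count)
// {
//     return std::array<char, count + 1>{};   // error: 'count' is not a constant expression
// }

static_assert(first_chars("01234", idx<2>).size() == 3);
static_assert(first_chars("01234", idx<2>)[1] == '1');
```

This is essentially the same trick the substr constructor already uses; it only moves the template argument from the function name to the argument list.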

Live @godbolt [with comments coming back from previous question]

Live @godbolt [distilled, without comments]

Please ignore the excessive use of this-> and std::endl. There is some reasoning behind it, though it is irrelevant to this question.

#ifndef META_STRING_H_INCLUDED
#define META_STRING_H_INCLUDED

#include <cstddef>
#include <type_traits>
#include <algorithm>
#include <functional>
#include <tuple>
#include <stdexcept>

namespace meta
{
    struct string_base
    {
        using size_type = std::size_t;
        using char_type = char;

        static constexpr size_type npos = size_type(-1);
    };

    template <string_base::size_type N> requires (N >= 1)
    class string;

    template <typename>
    struct is_string : std::false_type {};

    template <string_base::size_type N>
    struct is_string<meta::string<N>> : std::true_type {};

    template <typename>
    struct is_string_constructible : std::false_type{};

    // why do we get here??? template argument type decays during deduction from a constructor?
    template <string_base::size_type N>
    struct is_string_constructible<string_base::char_type [N]> : std::true_type {};

    // shouldn't it be like this? looks like a possible bug
    // struct is_string_constructible<const string_base::char_type (&)[N]> : std::true_type {};
    // anyway we construct from a const&, so it shouldn't be big of a deal?

    template <string_base::size_type N>
    struct is_string_constructible<meta::string<N>> : std::true_type {};

    template <template <typename...> typename container, typename... T>
    requires ((is_string_constructible<T>::value && ...))
    struct is_string_constructible<container<T...>> : std::true_type {};

    template <typename T>
    inline constexpr bool is_string_constructible_v = is_string_constructible<T>::value;

    template <typename T>
    concept string_constructible = is_string_constructible_v<T>;

    template <string_base::size_type N> requires (N >= 1)
    class string : public string_base
    {
    public:
        char elems[N];

        // string() { elems[N - 1] = '\0'; } // was used for CTAD guide for tuples. now we avoid object construction there
        string() = delete;

        constexpr string(const char_type (&s)[N])
        {
            std::copy_n(s, N, this->elems);
        }

        template <size_type Ni, size_type pos = 0, size_type count = npos>
        constexpr string(const string<Ni> (&s), std::integral_constant<size_type, pos>, std::integral_constant<size_type, count>)
        {
            *std::copy_n(
                &s.elems[std::min(pos, Ni - 1)],
                std::min(count, Ni - 1 - std::min(pos, Ni - 1)),
                this->elems
            ) = '\0';
        }

        // removed
        // constexpr string(const std::array<char_type, N> (&s))
        // {
        //     std::copy_n(s.data(), N, this->elems);
        // }

        template <string_constructible... T>
        constexpr string(const T&... input)
        {
            // how silly of me was to enjoy this awkward symmetry between invoke and apply
            // while completely overlooking the main goal of the expression
            // std::invoke([this](const auto&... s) constexpr { this->copy_from(s...); }, detail::to_string(input)...);
            this->copy_from(meta::string(input)...);
        }

        template <template <typename...> typename container, string_constructible... T>
        constexpr string(const container<T...>& input)
        {
            // will not always compile without this-> inside the lambda
            // e.g. @godbolt x86-64 clang 13.0.0 with --std=c++20 -O3 -pedantic -Wall -Wextra -Werror
            // std::apply([this](const auto&... s) constexpr { copy_from(meta::string(s)...); }, input);
            std::apply([this](const auto&... s) constexpr { this->copy_from(meta::string(s)...); }, input);
        }

        constexpr auto operator + (const auto& rhs) const
        {
            return meta::string(*this, meta::string(rhs));
        }

        static constexpr size_type size_static() noexcept { return N; }
        constexpr size_type size() const noexcept { return N; }
        constexpr bool empty() const noexcept { return N == 1; }

        constexpr const char_type* data() const noexcept { return this->elems; }
        constexpr operator const char_type* () const noexcept { return this->elems; }
        constexpr operator std::string_view () const { return std::string_view{ this->elems, N }; }

        constexpr const char_type& at(size_type pos) const
        {
            if(pos >= N - 1) throw std::out_of_range("out of bounds");
            return this->elems[pos];
        }

        constexpr const char_type& operator [] (size_type pos) const { return this->at(pos); }
        constexpr const char_type& front() const noexcept { return this->elems[0]; }
        constexpr const char_type& back() const noexcept { return this->elems[N - 1]; }

        template <size_type pos = 0, size_type count = npos>
        constexpr auto substr() const
        {
            return meta::string(*this, std::integral_constant<size_type, pos>{}, std::integral_constant<size_type, count>{});
        }

        constexpr size_type copy(char_type* dest, size_type count = npos, size_type pos = 0) const
        {
            if(pos >= N) throw std::out_of_range("out of bounds");
            return std::copy_n(&this->elems[pos], std::min(count, N - 1 - std::min(pos, N - 1)), dest) - dest;
        }

    private:
        template <size_type... Ni>
        constexpr void copy_from(const string<Ni> (&... input))
        {
            auto pos = this->elems;
            ((pos = std::copy_n(input.elems, Ni - 1, pos)), ...);
            *pos = 0;
        }
    };

    namespace detail
    {
        // constexpr auto to_string(const auto& input) { return string(input); } // was mostly used to escape infinite recursion in CTAD...
        // ...now replaced with
        template <string_constructible... T>
        constexpr inline string_base::size_type string_length = ((decltype(string(std::declval<T>()))::size_static() - 1) + ... + 1);
    } // namespace detail

    template <string_base::size_type N>
    string(const string_base::char_type (&)[N])
        -> string<N>;

    template <string_base::size_type Ni, string_base::size_type pos, string_base::size_type count>
    string(const string<Ni> (&), std::integral_constant<string_base::size_type, pos>, std::integral_constant<string_base::size_type, count>)
        -> string<std::min(count, Ni - 1 - std::min(pos, Ni - 1)) + 1>;

    // removed
    // template <string_base::size_type N>
    // string(const std::array<string_base::char_type, N>& input)
    //     -> string<N>;

    template <string_constructible... T>
    string(const T&...) // input)
    //  -> string<((sizeof(detail::to_string(input).elems) - 1) + ... + 1)>; // original
    //  -> string<((decltype(detail::to_string(input))::size_static() - 1) + ... + 1)>; // avoiding external sizeof and accessing class member variable
        -> string<detail::string_length<T...>>; // with detail::to_string removed

    template <template <typename...> typename container, string_constructible... T>
    string(const container<T...>&)
    //  -> string<((sizeof(detail::to_string(T()).elems) - 1) + ... + 1)>; // original
    //  -> string<((sizeof(T) - 1) + ... + 1)>; // @G. Sliepen's suggestion, will not work for nested tuples, and unfortunately dangerous on its own
    //  -> string<((decltype(detail::to_string(T()))::size_static() - 1) + ... + 1)>; // avoiding external sizeof and accessing class member variable
    //  -> string<((decltype(detail::to_string(std::declval<T>()))::size_static() - 1) + ... + 1)>; // finally deleted default constructor
        -> string<detail::string_length<T...>>; // with detail::to_string removed

    inline namespace meta_string_literals
    {
        template <string ms>
        inline constexpr auto operator"" _ms() noexcept
        {
            return ms;
        }
    } // inline namespace meta_string_literals
} // namespace meta

#endif // META_STRING_H_INCLUDED

//////////////////////////////////////////////////////////////////////

// #include "meta_string.h"
#include <iostream>

template<meta::string str>
struct X
{
    static constexpr auto value = str;
    operator const char* () { return str.elems; }
};

template <auto value>
constexpr inline auto constant = value;

int main()
{
    using namespace meta::meta_string_literals;

    X<"a message"> xxx;
    X<"a massage"> yyy;
    X<meta::string(xxx.value, " is not ", yyy.value)> zzz;
    X<"a message"_ms + " is " + "a massage"> zzz2;

    std::cout << xxx << std::endl;
    std::cout << yyy << std::endl;
    std::cout << zzz << std::endl;
    std::cout << zzz2 << std::endl;

    static constexpr auto x = meta::string("1"_ms, "22");
    static constexpr auto y = meta::string("11", "22");
    static constexpr auto z = meta::string(std::tuple{"1xx1"_ms, "2qqq2"_ms});
    static constexpr auto z2 = meta::string(std::tuple{meta::string(std::tuple{"1xx1"_ms, "2qqq2"_ms}), "2qqq2"_ms});
    static constexpr auto z3 = meta::string(std::tuple{std::tuple{"1xx1"_ms, "2qqq2"_ms}, "2qqq2"_ms});
    // static constexpr auto zx = meta::string(std::tuple{"1xx1"_ms, std::array<char, 6>{"2qqq2"}}); // construction from array is removed

    std::cout << sizeof(x.elems) << ": " << x << std::endl;
    std::cout << sizeof(y.elems) << ": " << y << std::endl;
    std::cout << sizeof(z.elems) << ": " << z << std::endl;
    std::cout << sizeof(z2.elems) << ": " << z2 << std::endl;
    std::cout << sizeof(z3.elems) << ": " << z3 << std::endl;

    static constexpr auto a = "1"_ms;
    static constexpr auto b = a + "22"_ms;
    std::cout << b << std::endl;

    // TODO: Can't the next line be implicitly forced to constexpr?
    std::cout << meta::string("this one "_ms, "is not ", "constant evaluated"_ms) << std::endl;
    std::cout << constant<meta::string("this one "_ms, "is ", "constant evaluated"_ms)> << std::endl;

    std::cout << constant<"0123456789"_ms[9]> << std::endl;
    // std::cout << constant<"0123456789"_ms[-1]> << std::endl; // will throw std::out_of_range
    // std::cout << constant<"0123456789"_ms[10]> << std::endl; // will throw std::out_of_range

    static constexpr auto sv_test = "string_view cast"_ms;
    static constexpr std::string_view sv1 { sv_test.elems, sv_test.size() };
    static constexpr std::string_view sv2 = sv_test;
    static constexpr std::string_view sv3 = constant<"string_view cast in-place"_ms>;
    // static constexpr std::string_view sv4 = "error: is not a constant expression"_ms; // will not compile

    std::cout << sv1 << std::endl;
    std::cout << sv2 << std::endl;
    std::cout << sv3 << std::endl;

    char copy_test_dest[30] = {}; // to ensure zero-termination
    static constexpr auto copy_test_src = "01234"_ms;
    std::size_t char_copy_count =
        copy_test_src.copy(&copy_test_dest[0], 1, 1) +
        copy_test_src.copy(&copy_test_dest[1], -1, 3)
    ;
    // copy_test_src.copy(&copy_test_dest[1], -1 , 6); // will throw std::out_of_range
    std::cout << char_copy_count << ": " << copy_test_dest << std::endl;

    static constexpr auto test_substr_src = "01234"_ms;
    static constexpr auto test_substr_1 = test_substr_src.substr<2, 0>();
    static constexpr auto test_substr_2 = test_substr_src.substr<2, 1>();
    static constexpr auto test_substr_3 = test_substr_src.substr<2, 2>();
    static constexpr auto test_substr_4 = test_substr_src.substr<2, 3>();
    static constexpr auto test_substr_5 = test_substr_src.substr<2, 222>();

    std::cout << test_substr_1.size() << ": " << test_substr_1 << std::endl;
    std::cout << test_substr_2.size() << ": " << test_substr_2 << std::endl;
    std::cout << test_substr_3.size() << ": " << test_substr_3 << std::endl;
    std::cout << test_substr_4.size() << ": " << test_substr_4 << std::endl;
    std::cout << test_substr_5.size() << ": " << test_substr_5 << std::endl;

    std::cout << (meta::is_string_constructible<decltype("100"_ms)>::value ? "true" : "false") << std::endl;

    return 0;
}
\$\endgroup\$
  • will not always compile without this-> inside the lambda: That is only because of -Werror. Clang is emitting a false-positive warning, which you are turning into an error. Clang bug for the false positive here. Maybe disabling that warning flag on Clang is a good idea. It certainly is not important whether or not this-> is used, but I just wanted to make sure there is no confusion about whether this-> is needed.
    Commented Dec 31, 2021 at 0:52

1 Answer

\$\begingroup\$

N and the terminating NUL character

While it is very convenient to ensure there is always a terminating NUL character at the end of the string, the question is whether N should count that NUL or not. Consider that std::string does not do that in its API, and also that strlen() will report the length minus the NUL character.

Ideally, I would make it so N does not count the NUL character. But if it does, then make sure size() and size_static() return N - 1. Also consider that your operator std::string_view() currently behaves in a surprising way (the resulting view has length N, so it includes the NUL), and your back() always returns '\0'.

operator[] should not call at()

The standard semantics for operator[] is that it doesn't do bounds checking, and hence is noexcept, whereas at() does bounds checking and can throw.
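A minimal sketch of that split, using a hypothetical char_buffer type rather than meta::string:

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical minimal container (illustration only).
template <std::size_t N>
struct char_buffer
{
    char elems[N]{};

    // Unchecked access: the caller guarantees pos < N.
    constexpr const char& operator[](std::size_t pos) const noexcept { return elems[pos]; }

    // Checked access: throws on a bad index.
    constexpr const char& at(std::size_t pos) const
    {
        if (pos >= N)
            throw std::out_of_range("out of bounds");
        return elems[pos];
    }
};

inline constexpr char_buffer<3> buf{{'a', 'b', 'c'}};

static_assert(buf[1] == 'b');
static_assert(buf.at(2) == 'c');
static_assert(noexcept(buf[0]) && !noexcept(buf.at(0)));
```

Note that during constant evaluation an out-of-bounds array access is not a constant expression, so a bad index passed to the unchecked operator[] is still rejected by the compiler in that context; the missing check only matters at run time.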

elems[] should be an array of char_type

Since you use char_type everywhere, it should also be used when declaring the array elems[].

Throw std::out_of_range in the constructors

std::string will throw a std::out_of_range exception in its constructors that take a pos argument, if pos is out of range. You can do the same in your code, and then it will catch those errors at compile time. You should be able to get rid of all the std::min(pos, ...) calls this way.

On a related note, don't use std::min() in the deduction guides. These should just forward all the parameters unchanged to the class itself, so that the constructors can do all the error checking.
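The compile-time behaviour of such a throw can be sketched with a free function. checked_count is a hypothetical helper, not part of the reviewed class; it mirrors the bounds rule of std::string::substr, where pos == size is allowed and pos > size throws.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical helper (illustration only): validates pos and clamps count.
constexpr std::size_t checked_count(std::size_t size, std::size_t pos, std::size_t count)
{
    if (pos > size)
        throw std::out_of_range("pos out of range");
    return count < size - pos ? count : size - pos;
}

static_assert(checked_count(5, 2, 100) == 3);
static_assert(checked_count(5, 5, 1) == 0);

// Reaching the throw during constant evaluation makes the expression
// non-constant, so the following line would fail to compile:
// static_assert(checked_count(5, 6, 1) == 0);
```

At run time the same function throws normally, so one definition covers both cases.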

Go all in on concepts?

I see you use SFINAE template classes to provide traits like is_string and is_string_constructible. It might be possible to rewrite them purely as concepts (see this StackOverflow question), although there is nothing wrong with keeping it as it is.

You can however use concepts in regular code as well, so instead of using meta::is_string_constructible<...>::value, you can use meta::string_constructible<...>, like so:

std::cout << (meta::string_constructible<decltype("100"_ms)> ? "true" : "false") << '\n'; 

Unnecessary use of this->

It is almost never necessary to write this-> in C++. I would remove its use everywhere. As discussed before and as mentioned by user17732522 in his comment above, some compilers might warn when it is omitted inside the lambda, but that is a bug in the compiler.

Avoid using std::endl

Prefer using \n instead of std::endl; the latter is equivalent to the former, but also forces the output to be flushed, which is usually unnecessary and hurts performance.

\$\endgroup\$
  • cout << someBool can print true/false on its own. There is a formatting option for printing bools as words or 0/1.
    – JDługosz
    Commented Jan 3, 2022 at 16:16
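The formatting option referred to here is std::boolalpha. A minimal sketch, where bool_as_word is a hypothetical helper that writes to a string stream so the result can be inspected:

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper (illustration only): formats a bool as a word.
inline std::string bool_as_word(bool value)
{
    std::ostringstream out;
    out << std::boolalpha << value;   // "true"/"false" instead of 1/0
    return out.str();
}
```

The same manipulator works on std::cout, for example std::cout << std::boolalpha << meta::string_constructible<decltype("100"_ms)> << '\n';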
