Basic Encoding Rules (BER) class for embedded system

Question

I'm implementing a communications protocol in an embedded system and the protocol uses a subset of the ASN.1 BER (Basic Encoding Rules).

In a nutshell, this class simply represents a TLV triplet (Tag, Length, Value) in which the Tag is some label identifying the type of data, the Length is the length of the Value (in bytes) and the Value is simply a set of bytes.

Because it's going to be running on an embedded system, it will not be using any of the following:

exceptions
Runtime Type Identification (RTTI)
most of the Standard Template Library (STL)
any of the Boost library components
C++14 (the compiler's not yet smart enough)

Additionally, there are some restrictions built into the project which are important for understanding the context of this class:

all Tag values fit in a single 8-bit byte
the maximum message size is 32k; the user of this class must assure this
minimizing total memory footprint is very important
assuring adequate performance (speed) is important
the underlying microprocessor may be 16-bit, or 32-bit or 64-bit
the underlying microprocessor may be little-endian or big-endian
there is a macro (not shown) that correctly defines SHORT_IS_MORE_THAN_TWO_BYTES for the target architecture.
the Ber class will not be derived from (which is why the destructor is not virtual)

I'm interested in a general review, but particularly for clarity and performance of the code. The test code is only for convenience on a computer; the embedded system has no iostream support.

bertest.cpp

#include <iostream> #include <iomanip> #include "ber.h" std::ostream& operator<<(std::ostream &out, const Ber &b) { unsigned short len = b.totallength(b.msglength()); out << "totallength = " << std::dec << len << '\n'; for (unsigned short i=0; i < len; ++i) { out << std::setfill('0') << std::setw(2) << std::hex << (unsigned short)b[i] << ' '; if (0xf == (i & 0xf)) out << '\n'; } return out; } int main() { uint8_t m1[]{ 71, 72, 73, 74, 75 }; Ber message(0x22, sizeof(m1), m1); std::cout << message << '\n'; Ber message2('e'); Ber message3 = Ber(0x60) + message; std::cout << message3 << '\n'; for (int i=0; i < 100; ++i) { message2 += message; std::cout << message2 << '\n'; } std::cout << "tag for message2 is " << (int)message2[0] << "\n"; std::cout << "out-of-bounds byte for message2 is " << (int)message2[1000] << "\n"; }

ber.h

#ifndef BER_H #define BER_H /*! * Basic Encoding Rules class for TLV (Tag, Length, Value) messages */ class Ber { public: //! constructor Ber(uint8_t tag, unsigned short length=0, const uint8_t *message=nullptr); //! copy constructor Ber(const Ber& b); //! move constructor Ber(Ber &&b); //! destructor ~Ber(); //! returns the number of bytes required to encode the length static unsigned short encodedlen(unsigned short n); //! returns the total number of bytes required to store a Ber object with the indicated message length static unsigned short totallength(unsigned short len); //! create a new Ber object from an existing one Ber operator=(const Ber& rhs); //! append the passed Ber data to the existing Ber object Ber &operator+=(const Ber& rhs); //! return the length of the message for this Ber object unsigned short msglength() const; //! return the byte at the indicated offset or NUL if out of range uint8_t operator[](size_t idx) const; private: //! encodes and stores the length at **ptr and updates and returns pointer static uint8_t *lencode(uint8_t **ptr, unsigned short length); //! internal data pointer uint8_t * _data; }; inline Ber operator+(Ber lhs, const Ber& rhs) { lhs += rhs; return lhs; } #endif

ber.cpp

#include <algorithm> #include "ber.h" Ber::Ber(uint8_t tag, unsigned short length, const uint8_t *message) : _data(new uint8_t[totallength(length)]) { uint8_t *ptr = _data; *ptr++ = tag; std::copy(message, message+length, lencode(&ptr, length)); } Ber::Ber(const Ber& b) : _data(new uint8_t[totallength(b.msglength())]) { std::copy(b._data, b._data+totallength(b.msglength()), _data); } Ber::Ber(Ber &&b) : _data(nullptr) { std::swap(_data, b._data); } Ber::~Ber() { delete _data; } uint8_t *Ber::lencode(uint8_t **ptr, unsigned short length) { switch(encodedlen(length)) { case 1: *(*ptr)++ = (length & 0x7fu); break; case 2: *(*ptr)++ = 0x81u; *(*ptr)++ = (length & 0xffu); break; case 3: *(*ptr)++ = 0x82u; *(*ptr)++ = ((length >> 8) & 0xffu); *(*ptr)++ = (length & 0xffu); break; } return *ptr; } unsigned short Ber::encodedlen(unsigned short n) { if (n <= 0x7fu) return 1; if (n <= 0xffu) return 2; #if SHORT_IS_MORE_THAN_TWO_BYTES // assert(sizeof(unsigned short)>2); if (n <= 0xffffu) return 3; return 0; // error -- encoded length can't be > 3 #else // assert(sizeof(unsigned short)==2); return 3; #endif } unsigned short Ber::totallength(unsigned short len) { return 1u+len+encodedlen(len); } Ber Ber::operator=(const Ber& rhs) { return Ber(rhs); } Ber& Ber::operator+=(const Ber& rhs) { unsigned short datalen = msglength(); unsigned short rhslen = totallength(rhs.msglength()); unsigned short mynewlen = totallength(datalen+rhslen); uint8_t *data_copy(new uint8_t[mynewlen]); uint8_t *ptr = data_copy; *ptr++ = *_data; uint8_t *dataptr = &_data[1+encodedlen(datalen)]; ptr = std::copy(dataptr, dataptr + datalen, lencode(&ptr, datalen+rhslen)); std::copy(rhs._data, rhs._data+rhslen, ptr); std::swap(data_copy, _data); delete data_copy; return *this; } unsigned short Ber::msglength() const { unsigned short len = _data[1] & 0x7f; if (_data[1] != len) { switch(len) { case 2: len = (_data[2] << 8) | _data[3]; break; case 1: len = _data[2]; } } return len; } uint8_t Ber::operator[](size_t idx) const { return (idx > totallength(msglength())) ? '\0' : _data[idx]; }

To understand it properly, your operator += is creating some packet chain? It seems to produce [tag1][len1][load1][tag2][len2][load2]. Did I get it right? — user52292, CommentedOct 19, 2014 at 11:10
@firda: almost right. It produces nested Ber instances. So it would be [tag1][len1+len2][msg1][tag2][len2][msg2]. Typical use would be to start with an empty Ber instance (just a tag) and then add multiple Ber instances to it. — Edward, CommentedOct 19, 2014 at 12:53
And what about big/little endian conversion and huge structs (as the size can go upto 32KiB)? Maybe I got it wrong downthere in my review and this is not a packet, but really some array/tree of tagged values (but the size confused me). — user52292, CommentedOct 19, 2014 at 12:58

Community · Accepted Answer · 2017-04-13 12:40:41Z

You said Ber will not be derived from. Consider writing class Ber final{...} to make inheriting from Ber a compilation error.
Remove useless comments such as //! constructor. Either people already know that from looking at the code or they do not understand the comment.
Flesh out existing comments such as //! append the passed Ber data to the existing Ber object for operator +=. What is Ber data? The message? If you append data from one Ber-object to another, did you effectively concatenate the messages or replace them? If it is not totally obvious what a user defined operator does it is better to use a named member function instead.
Ber(uint8_t tag, unsigned short length=0, const uint8_t *message=nullptr); allows calling it like Ber ber(42, 8); which should be a compilation error since one should not have a nonzero length and nullptr as the message. You can fix that with minimal code duplication by writing 2 constructors, Ber(uint8_t tag, unsigned short length, const uint8_t *message); and Ber(uint8_t tag) : Ber(tag, 0, nullptr){}.
Your operator + behaves different than normal which is bad. Usually you have an expression such as a = b + c; where a gets a value and b and c are not changed. Your operator, however, changes b, so you could just write b + c; which looks like it has no effect, but instead it does the same as b += c;. My bad. Thanks to glampert for pointing it out.
new [] goes together with delete [], you only use delete in your destructor. Consider using std::unique_ptr<uint8_t[]> to manage the memory for you.
Replace // assert(sizeof(unsigned short)>2); with something like static_assert(sizeof(unsigned short)>2, "Macro SHORT_IS_MORE_THAN_TWO_BYTES failed");.
lencode will have a less scary and error prone implementation if you change the declaration of ptr to uint8 *&ptr.

I did not look at encodedlen, msglength and totallength in detail, but feel that it can be expressed simpler. Also without really checking I have a feeling that if you append a message and then make a copy, the appended message is gone.

operator + will work properly. If you look closely, the first param is passed by value, so only the local copy is changed. — glampert, CommentedOct 19, 2014 at 1:52
The comments are intended at least in part for automatic documentation generation using Doxygen which is configured to complain about undocumented public members; hence the short comments such as //! constructor. — Edward, CommentedOct 22, 2014 at 2:21

Community · Accepted Answer · 2017-05-23 12:41:06Z

You have used an instance of Yoda notation in your test code (inside operator <<):

if (0xf == (i & 0xf)) out << '\n';

This is an arcane habit that should be avoided.

You also have uses of C-style casts in your test code. Don't use those. Use the appropriate C++ cast operators, i.e: static_cast, reinterpret_cast, etc.

Names starting with _. Take a read on this thread. _data is not technically illegal, but I would just avoid using that notation altogether.

Your use of new and delete is wrong! Data allocated with new is freed with delete. Data allocated with new[] is freed with delete[]. Those are two different functions. In some runtimes, you should get a crash if trying to free data allocated with new[] using delete. In your case, however, you should just switch to a smart pointer and don't bother managing memory by hand.

Personal suggestion:

I find tightlypacked names a lot harder to read. Your methods encodedlen(), totallength() and others would be easier on the eyes if written using camelCase or snake_case notation:

encoded_length(); total_length(); etc...

As per request in the comments, here is one way to adjust operator += to use std::unique_ptr. This assumes that _data is also a unique_ptr:

Ber& Ber::operator+=(const Ber& rhs) { unsigned short datalen = msglength(); unsigned short rhslen = totallength(rhs.msglength()); unsigned short mynewlen = totallength(datalen + rhslen); std::unique_ptr<uint8_t[]> data_copy(new uint8_t[mynewlen]); uint8_t * ptr = data_copy.get(); *ptr++ = *_data; uint8_t *dataptr = &_data[1 + encodedlen(datalen)]; ptr = std::copy(dataptr, dataptr + datalen, lencode(&ptr, datalen + rhslen)); std::copy(rhs._data.get(), rhs._data.get() + rhslen, ptr); std::swap(data_copy, _data); return *this; }

If there are any silly mistakes in there I apologise, this was not compiled. The changes are minimal, since unique_ptr overloads the subscript operator [] and the dereference operator *, so that remains unchanged. If you need to grab the underlaying pointer to the base of the array, the get() method is provided. Cleanup is automatic once leaving the scope. So the explicit call to delete is removed.

Good call on delete[] -- it was an error I overlooked. As for the "arcane habit," putting the constant before a == operator avoids a silent but erroneous x = 5 when x == 5 was intended. IMHO, it should be encouraged rather than avoided. As for a smart pointer, I had used std::unique_ptr but found it unwieldy within the operator+=. Maybe you could show how you'd write it? I could be overlooking something. — Edward, CommentedOct 19, 2014 at 4:03
@Edward, If at the current date your compiler is not emitting a warning for something like if (x = 5), then it is time to re-evaluate your choice ;). Both GCC and Clang will warn for that in the default warning level. — glampert, CommentedOct 19, 2014 at 4:11
I don't see any reason to avoid Yoda notation, and you haven't provided any justification for doing so. — 200_success, CommentedOct 19, 2014 at 8:47
@glampert: Sure, <= comparisions are something different, I use Yoda strictly for equality or non-equality (and most often with memcmp or strcmp). — user52292, CommentedOct 19, 2014 at 13:59

Community · Accepted Answer · 2020-06-10 13:24:26Z

Not much left to review

Your static totallength and encodedlen looks like helpers that should in my opinion be private, have a bit different names (already pointed out in other review) and there could possibly be non-static public members taking no parametr for this.

A bit about the design, you have mentioned:

the underlying microprocessor may be 16-bit, or 32-bit or 64-bit
the underlying microprocessor may be little-endian or big-endian

How are you going to solve that? What I am missing is some operator << (or push_back or append) that would be used for building the packet. That could requiere a whole different class if you really need to minimize the total memory footprint, but I would not care for one more int or size_t member called e.g. position.

Packet& operator << (unsigned v) { reserve(4); _data[position++] = v >> 24; _data[position++] = v >> 16; _data[position++] = v >> 8; _data[position++] = v; }

Note: Maybe I did not get it right and BER is not designed for complex packets, but for simple tagged values. In that case, all could be done in constructor.

Here you can see I have renamed your class from Ber to Packet, while using some reserve method that would handle possible reallocation. The data is always encoded as big-endian (and packed), independent on the MCU architecture. (Note: Some prefer to use data_ instead of _data, but the later is allowed within classes, forbidden in global space).

The same could be done for reading, possibly creating two other classes: PacketReader and PacketWriter, but you can have it both embedded in the Packet (or class Ber).

The way you have it now, you have to create all the data at once (while handling the big/little endian) and pass to constructor:

Ber(uint8_t tag, unsigned short length=0, const uint8_t *message=nullptr)

What I am proposing is to have it this way:

auto pkt = Packet(MSG_TAG) << (byte)data1 << (word)data2 << (uint)data3; auto pkt2 = Packet(MSG_TAG, 1+2+4) << ....the same as above - space reserved ahead auto pkt3 = Packet::construct(MSG_TAG, (byte)data1, ....);

Much easier to work with (the last one needs variadic templates, may not be accessible in your compiler).

Finally, my proposals for naming:

Ber -> Packet
totallength -> total_length or full_length
msglength -> length (most used I think) or payload_length / data_length or message_length
lencode -> encode_length
encodedlen -> length_size (well, this one is hard to properly name)

Your guess about the Ber class only handling simple tagged values is correct. Its only (possibly) multibyte value responsibility is the length encoding. Other classes are responsible for creating the message portion, and so are responsible for the endian-ness of their data. As for naming, I actually originally had msglength called length but found that the ambiguity led to misuse. Also encodedlen was originally lenlen which was awful. It is hard to properly name! I like your other naming suggestions and will probably use them. Thanks! — Edward, CommentedOct 19, 2014 at 13:11
The question here is which one you use more? full_length or data_length? One of them could be shortened, but if it is not clear, use longer name for both. ... still I would think again about the coding-responsibility. It would be nice to have it inside (and make it real packet) with all the formatting I have suggested (operator <<). — user52292, CommentedOct 19, 2014 at 13:19

Stack Exchange Network

Basic Encoding Rules (BER) class for embedded system

bertest.cpp

ber.h

ber.cpp

3 Answers 3

Not much left to review

A bit about the design, you have mentioned:

Finally, my proposals for naming:

Linked

Hot Network Questions

Basic Encoding Rules (BER) class for embedded system

bertest.cpp

ber.h

ber.cpp

3 Answers 3

Not much left to review

A bit about the design, you have mentioned:

Finally, my proposals for naming:

Linked

Related

Hot Network Questions