\$\begingroup\$

For decoding binary data (in my case, delivered by a Bluetooth device), I've written this struct:

```swift
public struct ConsumableByteArray {

    private let bytes: [UInt8]
    private var idx = 0

    enum Error: Swift.Error {
        case notEnoughBytes
    }

    init(data: Data) {
        bytes = [UInt8](data)
    }

    init(bytes: [UInt8]) {
        self.bytes = bytes
    }

    mutating func consume() throws -> UInt8 {
        guard idx < bytes.count else { throw Error.notEnoughBytes }
        defer { idx += 1 }
        return bytes[idx]
    }

    mutating func consume() throws -> UInt16 {
        guard idx+1 < bytes.count else { throw Error.notEnoughBytes }
        defer { idx += 2 }
        return UInt16(bytes[idx+1]) << 8 + UInt16(bytes[idx])
    }

    mutating func consume() throws -> Int16 {
        guard idx+1 < bytes.count else { throw Error.notEnoughBytes }
        defer { idx += 2 }
        return Int16(bytes[idx+1]) << 8 + Int16(bytes[idx])
    }

    mutating func consume() throws -> UInt32 {
        guard idx+3 < bytes.count else { throw Error.notEnoughBytes }
        defer { idx += 4 }
        // Swift compiler insists on splitting this expression up
        let b3 = UInt32(bytes[idx+3]) << 24
        let b2 = UInt32(bytes[idx+2]) << 16
        let b1 = UInt32(bytes[idx+1]) << 8
        let b0 = UInt32(bytes[idx+0]) << 0
        return b3 + b2 + b1 + b0
    }
}
```

Given some data buffer, likely containing ints of varying widths packed together, it allows those fields to be read out:

```swift
// `var` is required because consume() is a mutating method
var buffer = ConsumableByteArray(data: someData)
let header: UInt8 = try buffer.consume()
let word1: UInt16 = try buffer.consume()
let word2: UInt16 = try buffer.consume()
let crc32: UInt32 = try buffer.consume()
```

Values in the early data may alter the structure of the later data (e.g. whether a feature is supported or not), hence the need for the flexibility to extract data progressively.
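
To make that concrete, here is a minimal sketch of progressive decoding. It uses a trimmed copy of the struct above; the packet layout, the flag bit, and the names `Status`, `batteryMillivolts` and `decodeStatus` are hypothetical and chosen only for illustration.

```swift
// Trimmed copy of the struct above: only the overloads this sketch needs.
struct ConsumableByteArray {
    private let bytes: [UInt8]
    private var idx = 0

    enum Error: Swift.Error { case notEnoughBytes }

    init(bytes: [UInt8]) { self.bytes = bytes }

    mutating func consume() throws -> UInt8 {
        guard idx < bytes.count else { throw Error.notEnoughBytes }
        defer { idx += 1 }
        return bytes[idx]
    }

    mutating func consume() throws -> UInt16 {
        guard idx + 1 < bytes.count else { throw Error.notEnoughBytes }
        defer { idx += 2 }
        return (UInt16(bytes[idx + 1]) << 8) | UInt16(bytes[idx])
    }

    mutating func consume() throws -> UInt32 {
        guard idx + 3 < bytes.count else { throw Error.notEnoughBytes }
        defer { idx += 4 }
        let high = (UInt32(bytes[idx + 3]) << 24) | (UInt32(bytes[idx + 2]) << 16)
        let low = (UInt32(bytes[idx + 1]) << 8) | UInt32(bytes[idx])
        return high | low
    }
}

// Hypothetical packet: a flags byte, an optional UInt16, then a UInt32.
struct Status {
    let flags: UInt8
    let batteryMillivolts: UInt16?
    let counter: UInt32
}

func decodeStatus(_ raw: [UInt8]) throws -> Status {
    var buffer = ConsumableByteArray(bytes: raw)
    let flags: UInt8 = try buffer.consume()

    // Assumption: bit 0 of the flags byte says whether the UInt16 field follows.
    var battery: UInt16? = nil
    if flags & 0x01 != 0 {
        let value: UInt16 = try buffer.consume()
        battery = value
    }

    let counter: UInt32 = try buffer.consume()
    return Status(flags: flags, batteryMillivolts: battery, counter: counter)
}
```

Because each read advances the internal index, the optional field can simply be skipped without any offset bookkeeping at the call site.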

Would you write the implementation any differently, or change the API?

\$\endgroup\$

    1 Answer

    \$\begingroup\$

    Instead of implementing consume() for each integer type separately, you can implement a single generic method:

    ```swift
    mutating func consume<T: FixedWidthInteger & UnsignedInteger>() throws -> T {
        let size = MemoryLayout<T>.size
        guard idx + size <= bytes.count else { throw Error.notEnoughBytes }
        defer { idx += size }
        return bytes[idx..<idx + size].enumerated().reduce(0) {
            $0 + T($1.element) << (8 * $1.offset)
        }
    }
    ```

    which can be used for all unsigned integer types UInt, UInt8, ..., UInt64, e.g.

    let header: UInt8 = try buffer.consume() 

    and for the signed integer types via the bitPattern: initializer, e.g.

    let word = try! Int16(bitPattern: buffer.consume()) 
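
    As an illustration of the signed case, the sketch below wraps the generic method in a hypothetical `UnsignedReader` type (the type name and the sample bytes are assumptions, not part of the original code) and reads little-endian two's-complement values through `bitPattern:`.

    ```swift
    // Hypothetical minimal reader holding only the generic unsigned method.
    struct UnsignedReader {
        private let bytes: [UInt8]
        private var idx = 0

        enum ReadError: Error { case notEnoughBytes }

        init(bytes: [UInt8]) { self.bytes = bytes }

        mutating func consume<T: FixedWidthInteger & UnsignedInteger>() throws -> T {
            let size = MemoryLayout<T>.size
            guard idx + size <= bytes.count else { throw ReadError.notEnoughBytes }
            defer { idx += size }
            // Little-endian: the byte at offset n is shifted left by 8n bits.
            return bytes[idx..<idx + size].enumerated().reduce(0 as T) { partial, item in
                partial + (T(item.element) << (8 * item.offset))
            }
        }
    }

    var reader = UnsignedReader(bytes: [0xFE, 0xFF, 0x80])
    // consume() is inferred as UInt16 and UInt8 from the bitPattern: parameter type.
    let word = try Int16(bitPattern: reader.consume())  // raw value 0xFFFE
    let byte = try Int8(bitPattern: reader.consume())   // raw value 0x80
    print(word, byte)  // prints "-2 -128"
    ```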

    Another option is to copy the bytes into a value of the desired type instead of bit shifting and adding:

    ```swift
    mutating func consume<T: FixedWidthInteger>() throws -> T {
        let size = MemoryLayout<T>.size
        guard idx + size <= bytes.count else { throw Error.notEnoughBytes }
        var value: T = 0
        bytes.withUnsafeBytes {
            _ = memcpy(&value, $0.baseAddress! + idx, size)
        }
        idx += size
        return T(littleEndian: value)
    }
    ```
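
    A shift-based variant can probably be made to accept signed types as well. The sketch below is one possible approach, not part of the original code: it assumes `T(truncatingIfNeeded:)` to convert each byte without a range check and combines the pieces with `|` so that no arithmetic overflow can occur. The type name `ByteReader` is hypothetical.

    ```swift
    // Hypothetical reader whose single generic method covers signed and unsigned types.
    struct ByteReader {
        private let bytes: [UInt8]
        private var idx = 0

        enum ReadError: Error { case notEnoughBytes }

        init(bytes: [UInt8]) { self.bytes = bytes }

        mutating func consume<T: FixedWidthInteger>() throws -> T {
            let size = MemoryLayout<T>.size
            guard idx + size <= bytes.count else { throw ReadError.notEnoughBytes }
            defer { idx += size }
            return bytes[idx..<idx + size].enumerated().reduce(0 as T) { partial, item in
                // truncatingIfNeeded keeps the bit pattern instead of trapping when
                // the byte does not fit (e.g. 0xC8 into Int8); << discards bits that
                // are shifted out, so the top byte may set the sign bit.
                partial | (T(truncatingIfNeeded: item.element) << (8 * item.offset))
            }
        }
    }

    var reader = ByteReader(bytes: [0xC8, 0xFE, 0xFF, 0x78, 0x56, 0x34, 0x12])
    let small: Int8 = try reader.consume()     // bit pattern 0xC8
    let word: Int16 = try reader.consume()     // bit pattern 0xFFFE
    let crc32: UInt32 = try reader.consume()   // bit pattern 0x12345678
    print(small, word, crc32)
    ```

    With this variant the caller would not need `bitPattern:` for signed fields, at the cost of a slightly less obvious closure body.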

    Instead of letting the compiler infer the return type from the context, one can alternatively pass it as a parameter:

    mutating func consume<T: FixedWidthInteger>(_: T.Type) throws -> T { ... } 

    which is then – for example – called as

    let crc32 = try buffer.consume(UInt32.self) 
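
    One way to offer both spellings is to let the explicit-type overload forward to the inferred one. This is only a sketch: the elided body above may well be implemented differently, and the type name `TypedReader` is hypothetical.

    ```swift
    // Hypothetical reader exposing both call styles.
    struct TypedReader {
        private let bytes: [UInt8]
        private var idx = 0

        enum ReadError: Error { case notEnoughBytes }

        init(bytes: [UInt8]) { self.bytes = bytes }

        // Return type inferred from the call site.
        mutating func consume<T: FixedWidthInteger & UnsignedInteger>() throws -> T {
            let size = MemoryLayout<T>.size
            guard idx + size <= bytes.count else { throw ReadError.notEnoughBytes }
            defer { idx += size }
            return bytes[idx..<idx + size].enumerated().reduce(0 as T) { partial, item in
                partial + (T(item.element) << (8 * item.offset))
            }
        }

        // Type passed explicitly; forwards to the inferred version.
        mutating func consume<T: FixedWidthInteger & UnsignedInteger>(_: T.Type) throws -> T {
            let value: T = try consume()
            return value
        }
    }

    var buffer = TypedReader(bytes: [0x2A, 0x78, 0x56, 0x34, 0x12])
    let header = try buffer.consume(UInt8.self)   // no type annotation needed
    let crc32 = try buffer.consume(UInt32.self)
    print(header, crc32)
    ```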

    I would probably call the method get() or read() instead of consume().


    Defining a local enum Error type which conforms to the (global) Error protocol is possible, but might be confusing to the reader. I would use a different name for the concrete error type, for example:

    ```swift
    enum ReadError: Error {
        case notEnoughBytes
    }
    ```

    Now let's have a look at how an error would be reported. The caller does not know the actual error type, so a typical calling sequence is:

    ```swift
    do {
        let someData = Data(bytes: [1])
        var buffer = ConsumableByteArray(data: someData)
        let crc32: UInt32 = try buffer.consume()
        print(crc32)
    } catch {
        print(error.localizedDescription)
    }
    ```

    This produces the output:

     The operation couldn’t be completed. (MyProg.ConsumableByteArray.ReadError error 0.) 

    This can be improved by adopting the LocalizedError protocol (see, for example, "How to provide a localized description with an Error type in Swift?" on Stack Overflow):

    ```swift
    enum ReadError: Error, LocalizedError {
        case notEnoughBytes

        public var errorDescription: String? {
            switch self {
            case .notEnoughBytes:
                return "Not enough bytes in buffer"
            }
        }
    }
    ```

    Now the error output of the above program becomes

     Not enough bytes in buffer 

    You can even store additional information about the error in associated values:

    ```swift
    enum ReadError: Error, LocalizedError {
        case notEnoughBytes(available: Int, needed: Int)

        public var errorDescription: String? {
            switch self {
            case .notEnoughBytes(let available, let needed):
                return "Not enough bytes in buffer (available: \(available), needed: \(needed))"
            }
        }
    }
    ```

    Then by throwing

     throw ReadError.notEnoughBytes(available: bytes.count - idx, needed: size) 

    an error message like

     Not enough bytes in buffer (available: 1, needed: 4) 

    is produced.
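
    The associated values are also available to callers that want to react to the error rather than just print it. A minimal sketch, assuming a hypothetical `Reader` type that throws the error shown above:

    ```swift
    import Foundation  // for LocalizedError

    enum ReadError: Error, LocalizedError {
        case notEnoughBytes(available: Int, needed: Int)

        var errorDescription: String? {
            switch self {
            case .notEnoughBytes(let available, let needed):
                return "Not enough bytes in buffer (available: \(available), needed: \(needed))"
            }
        }
    }

    // Hypothetical minimal reader that reports how many bytes were missing.
    struct Reader {
        private let bytes: [UInt8]
        private var idx = 0

        init(bytes: [UInt8]) { self.bytes = bytes }

        mutating func consume<T: FixedWidthInteger & UnsignedInteger>() throws -> T {
            let size = MemoryLayout<T>.size
            guard idx + size <= bytes.count else {
                throw ReadError.notEnoughBytes(available: bytes.count - idx, needed: size)
            }
            defer { idx += size }
            return bytes[idx..<idx + size].enumerated().reduce(0 as T) { partial, item in
                partial + (T(item.element) << (8 * item.offset))
            }
        }
    }

    var buffer = Reader(bytes: [1])
    var report = ""
    do {
        let crc32: UInt32 = try buffer.consume()
        report = "read \(crc32)"
    } catch ReadError.notEnoughBytes(let available, let needed) {
        // Pattern matching exposes the associated values directly.
        report = "short by \(needed - available) byte(s)"
    } catch {
        report = "unexpected error: \(error)"
    }
    print(report)  // prints "short by 3 byte(s)"
    ```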


    Finally, note that there is a ByteBuffer type as part of the SwiftNIO framework, which can do all this and more. Even if you decide not to use it, having a look at its documentation and interface might be instructive.

    \$\endgroup\$
    • Some great suggestions in there - I particularly like the generic one-size-fits-all change. (As you may have noticed, I'd only implemented for the types I'd encountered so far). I think I picked the enum Error: Swift.Error up from swiftbysundell.com/posts/…, but I will rethink/revisit the whole error reporting aspect.
      – Chris
      Commented Jun 13, 2018 at 20:33
    • Your suggestions and the style in which you presented them are greatly appreciated - I've adopted the shift and add version.
      – Chris
      Commented Jun 14, 2018 at 9:03
    • The .reduce(0) { $0 + T($1.element) << (8 * $1.offset) } version (my preference as it's more Swift-y) fails for signed Int8's with a value in the byte buffer > 127 as "Not enough bits to represent a signed value". If there's a neat way of resolving this, it's not apparent to me.
      – Chris
      Commented Aug 21, 2018 at 13:05
    • @Chris: You are right, that generic method works properly only with unsigned integer types. I have updated the answer accordingly.
      – Martin R
      Commented Aug 21, 2018 at 18:18
    • Thanks for the prompt update. It's a shame the caller has to do extra/different work in the signed case, but I can see no alternative.
      – Chris
      Commented Aug 21, 2018 at 18:42
