Dynamically-expanding array in C

Question

I'm learning C, specifically OOP in C. As practice, I wrote a dynamically-expanding array.

dynarray.h

#ifndef DYNARRAY_H #define DYNARRAY_H typedef struct DynArray { int length, capacity, *arr; void (*extend)(struct DynArray *dynArr); void (*insert)(struct DynArray *dynArr, int val); void (*print) (struct DynArray *dynArr); } DynArray; void DynArray_extend(DynArray* dynArr); void DynArray_insert(DynArray* dynArr, int value); void DynArray_print(DynArray* dynArr); #endif

dynarray.c

#include <stdlib.h> #include "dynarray.h" DynArray* createDynArray() { DynArray* newDynArray = malloc(sizeof(DynArray)); newDynArray->capacity = 10; newDynArray->arr = malloc(sizeof(int)*newDynArray->capacity); newDynArray->length = 0; newDynArray->extend = DynArray_extend; newDynArray->insert = DynArray_insert; newDynArray->print = DynArray_print; } void DynArray_extend(DynArray* dynArr) { int newCapacity = dynArr->capacity * 2, *newArr = malloc(sizeof(int)*newCapacity), i; dynArr->capacity = newCapacity; for (i = 0; i < dynArr->length; i++) { newArr[i] = dynArr->arr[i]; } } void DynArray_insert(DynArray* dynArr, int val) { if (dynArr->length == dynArr->capacity) { dynArr->extend(dynArr); } dynArr->arr[dynArr->length] = val; dynArr->length = dynArr->length + 1; } void DynArray_print(DynArray* dynArr) { int i; for (i = 0; i < dynArr->length; i++) { printf("%d\n", dynArr->arr[i]); } }

And this is an example application of the dynamic array (program.c):

#include <stdio.h> #include "dynarray.c" int main( void ) { int i = 0; DynArray* someDynArray = createDynArray(); printf("Length : %d\n", someDynArray->length); printf("Capacity: %d\n", someDynArray->capacity); for (i = 0; i < 80; i++) { someDynArray->insert(someDynArray, i); } someDynArray->print(someDynArray); }

I would like to know if there are any flaws in the style of implementing the abstract data type of the dynamically expanding array.

Ron Klein · Accepted Answer · 2012-09-22 00:31:02Z

My 2 cents:

I believe createDynArray should return something.
The returned value of createDynArray might have a special value of NULL in case that one of the mallocs calls failed (this scenario should be covered by the code).
In the implementation of extend - consider using realloc instead of malloc and manual copy. If you insist on malloc perhaps using memcpy might be more elegant than a manual copy.
I'd move magic numbers to #defines: Initial Capacity (hard-coded to 10) and Growth Factor (2). Alternatively, they could be parameters to the createDynArray function
How would you free this struct? Perhaps you should add a "destructor" to be called before calling free with the struct pointer as a parameter.
In terms of functionality, other than printing this "array", how would the consuming programmer could get (or set) a specific item in this "array"?

Woah, I just realized that I forgot to call return. Thanks for pointing that out in point 1. But I have no clue how it still manages to run without it. — Sal Rahman, CommentedSep 22, 2012 at 1:45

Corbin · Accepted Answer · 2012-09-22 07:56:02Z

Ron Klein covered it pretty well, but here's a few more things:

Consider the struct an implementation detail, not part of the API

The consumer of your vector shouldn't need to be concerned with what the struct looks like. Nothing in the struct should ever be directly accessed by the user. In fact, I woudln't even expose the struct to the user. I would just forward declare it in the header, and not fully declare it except for in the implementation file.

.h:

struct DynArray; typedef struct DynArray DynArray;

.c:

struct DynArray { ... };

A consumer of the API should never need to know what is in the struct.

Stop trying to emulate objects

someDynArray->insert(someDynArray, i); is silly. Just use DynArray_insert (and others) directly:

DynArray_insert(someDynArray, i);

Note that storing function pointers in the struct is fine, but the user should never be calling them directly.

Your print function shouldn't exist

There's nothing wrong with a print function, but it shouldn't be part of the library. Imagine if you were to bundle this code up and distribute it. Do you think everyone who used it would want exactly the same format? There's no point in having a bundled print function.

DynArray_extend is wrong

You never have dynArr->arr = newArr;.

Multi-Variable Declaration

int newCapacity = dynArr->capacity * 2, *newArr = malloc(sizeof(int)*newCapacity), i;

There's nothing technically wrong with this, but it's fairly standard to only have one variable per declaration. Multi-variable declarations can get quite nasty and confusing when values/pointers/const/non-const all come into play in the same statement.

Consistent Naming

Why is createDynArray named how it is, yet everything else is named DynArray_*? I would go with DynArray_create.

There's no reason to dynamically allocate the DynArray

There will of course be situations where it's necessary, but there's no reason to force dynamic allocation.

Consider instead a usage like this:

DynArray da; if (DynArray_Init(&da)) { //Success (or you could alternatively return 0 on success and then use non-zero for error codes) }

Raw memory functions should only be paired with each other

This is completely opinion, but I believe that the low(ish) level memory functions should only be used together.

What I mean by this is that you shouldn't have a function for creating your struct and no function for freeing it. You actually need a function to free it for correctness, but even if you did not, it would be a good idea.

Abstracting away the freeing allows you to change the free-ing method at a later time seamlessly.

Consider Extending it to be Generic

You could easily extend this to be generic. It would make the API a bit more awkward to use, but it would be worth the reuse.

If you don't know where to start on that, this video should help. About 37 or 38 minutes in is when it becomes relevant (though the professor is doing a stack, not a vector). (I actually quite love that entire lecture series... :D)

@RonKlein Whoops! Sorry, seems my vision is starting to go :) — Corbin, CommentedSep 22, 2012 at 7:55

asveikau · Accepted Answer · 2012-09-23 19:46:10Z

In addition to what others have mentioned (good points raised by others), I'd like to talk about function pointers and vtables for a moment.

The first suggestion would be to not use them in this case. I would avoid using function pointers in the struct unless you really want the behavior to change at runtime from one buffer to another (i.e. the allocation strategy of buffer subclass A is different from B -- seems unlikely that you'd want this). Otherwise using the function pointers will cost you in CPU time, memory overhead, and source code readability.

If you did want to go the "function pointer in the struct" route, you can save memory by using more of a "vtable" approach. This pattern is used extensively by the Linux kernel, by COM, and by many C++ compilers when they implement virtual methods.

Instead of:

typedef struct DynArray { // ... void (*extend)(struct DynArray *dynArr); void (*insert)(struct DynArray *dynArr, int val); void (*print) (struct DynArray *dynArr); } DynArray;

You can do:

typedef struct { void (*extend)(struct DynArray *dynArr); void (*insert)(struct DynArray *dynArr, int val); void (*print) (struct DynArray *dynArr); } DynArrayOps; typedef struct DynArray { // ... DynArrayOps *ops; } DynArray;

This means that there can be only one instance of DynArrayOps (maybe make it a static object in your .c file), and the memory overhead per DynArray is 1 pointer. (3 times less than your code).

Stack Exchange Network

Dynamically-expanding array in C

3 Answers 3

Linked

Hot Network Questions

Dynamically-expanding array in C

3 Answers 3

Linked

Related

Hot Network Questions