6
\$\begingroup\$

I saw quite a few posts about making a dynamic array so I tried making my own for learning purposes.

Here it is:

#ifndef VECTOR_H #define VECTOR_H #include <stdlib.h> #include <stdbool.h> #include <string.h> #ifndef VECTOR_RESERVE_SIZE #define VECTOR_RESERVE_SIZE 4 #endif //Declaration typedef struct { int size_of_each; int reserved; int used; void* data; } vector_t; bool vector_init(vector_t *, int); void vector_append(vector_t *, void *); void* vector_get(vector_t *, int ); void vector_set(vector_t *, int, void *); void vector_free(vector_t *); //Implementation bool vector_init(vector_t *p, int s) { p->size_of_each = s; p->reserved = VECTOR_RESERVE_SIZE; p->used = 0; p->data = malloc(VECTOR_RESERVE_SIZE * p->size_of_each); if (p->data != NULL) { return false; } else { return true; } } void vector_append(vector_t *p, void *d) { if (p->used == p->reserved) { p->data = realloc(p->data, VECTOR_RESERVE_SIZE); //printf("reallocated \n"); p->reserved += VECTOR_RESERVE_SIZE; } memcpy((void*)((char*)p->data + p->size_of_each * p->used), // copy to back of the vector d, p->size_of_each); p->used += 1; } void* vector_get(vector_t *p, int i) { if (!(i > p->used)) { return (void*)((char*)p->data + p->size_of_each * i); } else { return NULL; } } void vector_set(vector_t *p, int i, void *d) { if (!(i > p->used)) { memcpy((void*)((char*)p->data + p->size_of_each * i), d, p->size_of_each); } } void vector_free(vector_t *p) { free(p->data); } #endif 

Here is an example of its usage:

#include <stdio.h> #define VECTOR_RESERVE_SIZE 3 #include "vector.h" int main() { int a = 321; int b = 45; int c = 21; int d = 31; int e = 71; vector_t v; vector_init(&v, sizeof (int)); vector_append(&v, &a); vector_append(&v, &b); vector_append(&v, &c); vector_append(&v, &d); vector_append(&v, &e); printf("1st element is %d \n", *(int*)vector_get(&v, 0)); printf("2nd element is %d \n", *(int*)vector_get(&v, 1)); printf("3th element is %d \n", *(int*)vector_get(&v, 2)); printf("4th element is %d \n", *(int*)vector_get(&v, 3)); printf("5th element is %d \n\n", *(int*)vector_get(&v, 4)); vector_set(&v, 4, &a); printf("4th element after setting is %d \n\n", *(int*)vector_get(&v, 4)); //contiguous printf("size of int is: %d \n", sizeof (int)); printf("address of 1st element: %p \n", vector_get(&v, 0)); printf("address of 2nd element: %p \n", vector_get(&v, 1)); printf("address of 3rd element: %p \n", vector_get(&v, 2)); printf("address of 4th element: %p \n", vector_get(&v, 3)); printf("address of 5th element: %p \n", vector_get(&v, 4)); vector_free(&v); } 

I've written some C++ before and this was one of my first few projects in C.

I'd like to know overall how good my code is and whether I'm using good practices / conventions.

Edit:

this: p->data = realloc(p->data, VECTOR_RESERVE_SIZE);

should be this: p->data = realloc(p->data, (p->reserved + VECTOR_RSERVE_SIZE) * P->size_of_each);

Edit: Maybe it's too late but, if people are still reviewing this I would really like to know how does this approach compare to std::vector.

\$\endgroup\$

    2 Answers 2

    2
    \$\begingroup\$

    Self document

    Consider users do not want to see or may not have access to the implementation and are left to declarations.

    These deserve parameters names and at least a comment about what each function does and how to use. (e.g. Call vector_init() first. It overwrites all vector_t members.)

    // Deserve for documentation here. bool vector_init(vector_t *, int); void vector_append(vector_t *, void *); void* vector_get(vector_t *, int ); void vector_set(vector_t *, int, void *); void vector_free(vector_t *); 

    Think big

    Why limit to sizes up to INT_MAX when C readily handles sizes up to SIZE_MAX. That could be billions times more.

    Use size_t for array indexing and sizing.

     // int size_of_each; // int used; size_t size_of_each; size_t used; // vector_get(vector_t *, int ); vector_get(vector_t *, size_t); 

    Design: Smaller empty impact

    Consider an app making 100s or 1,000,000s instance of this vector (e.g. a vector of vectors). Many of them will could be empty and never used. This is no advantage in doing that first malloc() at vector_init() time and a wasted allocation for all those unused instances. Allocate when needed. IMO, vector_init() should not allocate anything - it does make then that function error free - an important attribute to an initialization function.

    Design: Slow Linear growth

    Rather than linearly grow the array (incurring O(n*n) re-allocation time), scale the new size by maybe doubling (or some factor).

    Design: Error handling

    Detect and report (re-)allocation failures.

    // void vector_append(vector_t *p, void *d) bool vector_append(vector_t *p, void *d) //return true on error 

    Design: range check

    vector_get(vector_t *p, int i), vector_set(vector_t *p, int i, void *d) perform no range check on i. Live as dangerously as you please, yet I'd definitely, at least, include parameters check on vector_set().

    Design: const

    I'd expect const to show that the data state has not changed.

    // void* vector_get(vector_t *p, int i) void* vector_get(const vector_t *p, int i) 

    Bug: Code in header file

    Code is apparently designed for different VECTOR_RESERVE_SIZE depending on which .c includes it.

    Unfortunately, if 2 .c files each include this, the there will be 2 vector_init(), vector_free(), etc, creating linker conflict.

    Instead, divide code into vector.h and vector.c files. I suspect though this causes trouble with the VECTOR_RESERVE_SIZE scheme. IMO, VECTOR_RESERVE_SIZE is unnecessary.

    Corner bug

    No need to assume vector_free() is not called again. Leave data in a diminished, but correct state.

    void vector_free(vector_t *p) { free(p->data); p->data = NULL; // add p->used = 0; // add } 

    else not needed

     if (!(i > p->used)) { return (void*)((char*)p->data + p->size_of_each * i); } // else { return NULL; } return NULL; 

    Odds & Ends

    p->size_of_each * p->used may overflow int math leading to UB. Best to ensure at least size_t math.

    void * not needed:

    // memcpy((void*)((char*)p->data + p->size_of_each * p->used), d, p->size_of_each); memcpy((char*)p->data + (size_t)1 * p->size_of_each * p->used, d, p->size_of_each); // return (void*)((char*)p->data + p->size_of_each * i); return (char*)p->data + (size_t)1 * p->size_of_each * i; 

    New functions

    int vector_apply(vector, int (*function)(void *state, void *element), void *state) ⟶ Apply the function to each element in the vector. Return early if not zero returned from function.

    vector_right_size(vector) ⟶ Shrink the vector to the minimum needed size.

    \$\endgroup\$
    1
    • 1
      \$\begingroup\$It might be worth emphasising that the documentation should indicate what the return values mean - I certainly had trouble inferring that from the code. Also, vector_right_size() could be vector_shrink_to_fit to help anyone familiar with C++ standard containers.\$\endgroup\$CommentedMar 24, 2021 at 7:57
    3
    \$\begingroup\$

    In C, as in C++, object pointers can be freely converted to void*, so quite a few casts can be removed.

    It's probably convenient to store data as a char*, since that's how we use it. We should still use void* as external interface, of course.

    This is a dangerous anti-pattern:

     p->data = realloc(p->data, VECTOR_RESERVE_SIZE); 

    If the realloc() fails, it returns a null pointer. By overwriting p->data before we check for null, we leak the old value of p->data, which is no longer accessible. And the null-check is also missing, so subsequent access is Undefined Behaviour.

    Style-wise, I think I'd find (!(i > p->used)) easier to read as (i <= p->used). And if (p->data != NULL) { return false; } else { return true; } as return !p->data;.

    The example code should set a good example, and check the return value of vector_init() (and also vector_append() once you realise it needs to indicate success/failure) and react appropriately. Otherwise, you'll find you're encouraging slapdash usage.

    \$\endgroup\$
    8
    • \$\begingroup\$Thanks for the feedback,I have problem when I'm trying to implement those safety checks at vector_append(), I get a segfault when i try to just assert the reallocated pointer and i can't figure out why\$\endgroup\$
      – user237990
      CommentedMar 22, 2021 at 15:50
    • \$\begingroup\$It's obviously wrong to assert, since we know the pointer might be null.\$\endgroup\$CommentedMar 22, 2021 at 16:49
    • \$\begingroup\$I don't get it, by asserting I mean this : assert(ptr != NULL);\$\endgroup\$
      – user237990
      CommentedMar 22, 2021 at 17:08
    • 1
      \$\begingroup\$You need something more like char *ptr = realloc(p->data, new_size); if (!ptr) return false; p->data = ptr; …; return true;.\$\endgroup\$CommentedMar 22, 2021 at 17:32
    • 1
      \$\begingroup\$@Roland, you're probably thinking of conversion fromvoid*, which is allowed in C, but requires a cast in C++. The conversion to void* is a widening conversion and fine in both languages.\$\endgroup\$CommentedMar 23, 2021 at 10:09