8

I have the following code in many places in a large application:

if (datastruct.data_ok && cur_time > datastruct.start_time && cur_time < datastruct.end_time) { do_something(); } 

Now, I wanted to eliminate repeated code by defining a function time_ok(...) which would simplify the repeated code to:

if (time_ok(&datastruct, cur_time)) { do_something(); } 

In my opinion, this is a well-made refactoring and should be therefore done. If the definition of the condition changes in the future, code maintenance becomes easier as only one location needs to be changed.

However, several team members noticed that the function can be abused by giving it a NULL argument, which crashes the program. They also complained that the return value is a boolean and not an error code, and all functions should according to them return an error code.

So, he refactoring would according to them be done in the following way:

bool time_ok_condition; int err; err = time_ok(&datastruct, &time_ok_condition, cur_time); if (err != 0) { fprintf(stderr, "Error %d at file %s line %d", err, __FILE__, __LINE__); exit(1); } if (time_ok_condition) { do_something(); } 

However, in my opinion this results in excessively long code and if I really have to do all this, I'd much rather have the original non-refactored code.

I invented a way around this rule by using a macro:

if (TIME_OK(datastruct, cur_time)) { do_something(); } 

...which can never crash the program on NULL arguments as the macro takes the structure name and not a pointer to the structure as the argument. However, I don't particularly like this approach of using a macro because the macro evaluates its arguments multiple times.

I noticed that although C was developed for Unix and Unix system calls traditionally return error codes, the standard C library functions such as strstr, strchr, strspn do not return an error code and therefore handle NULL arguments by crashing the program.

Now, it is an entirely different question whether the program should crash with coredump or print a meaningful error message for the theoretical error that never happens in this program (see the question How should “this can never happen” style of errors in C be handled? on Stack Overflow).

However, this question is about whether it should be allowed to return something else than an error code. A positive answer would in practice necessitate the use of "crash with coredump" strategy in this very specific case as the range of a boolean return cannot represent errors, but does not necessarily mean that a "crash with coredump" strategy is optimal in all cases.

Who is correct? Should C functions always return an error code like Unix system calls, or should it be allowed to write functions that return something else if there is justified reason for it such as code simplicity?

5
  • 2
    how is this different from question you posted at Stack Overflow?
    – gnat
    CommentedMar 17, 2015 at 9:51
  • Yes, I agree there is a lot of overlap between these two questions and I came up with both in the same discussion with team members. Nevertheless, I still think this is a different question. This question is about whether it is a good idea to return something else than an error code, whereas the other question is about whether I should crash the program with coredump or print a meaningful error message if a "can't never happen" error happens. I think if you read the questions carefully, you'll note that they are indeed different questions.
    – juhist
    CommentedMar 17, 2015 at 9:59
  • I see, thanks. Consider editing the question here to help readers find related one at SO and better understand how it is different from one posted here
    – gnat
    CommentedMar 17, 2015 at 10:03
  • 1
    If you're using GCC there is a nonnull attribute which may help you catch the error you're talking about (passing a null pointer when this is not allowed by the function). And in general you can use assert to catch errors that logically shouldn't happen (i.e. if an assert failed it's because the programmer messed up)
    – Brandin
    CommentedMar 17, 2015 at 10:34
  • Side note - time ranges, like all positive contiguous-range types, should have an inclusive lower-bound (that is, >=). The upper-bound should still be exclusive, though.CommentedMar 17, 2015 at 13:58

2 Answers 2

5

The requirement for an error code arises from the fact the new function now can have 3 outcomes instead of 2: the time struct is ok, it is not ok, or the passed argument is NULL. The contract you have in mind for your function is "do not pass a NULL value in, otherwise the program will crash or show some undefined behaviour", whilst your coworkers believe programs become safer without such functions.

According to this former PSE post, and the topmost answer, the viewpoint of your coworkers is an extremely pedantic one, and in the context you have in mind the function will be called with something extremly unlikely to be NULL. Thus it will be most probably safe (maybe safer!) not to handle NULL by the return of an error code (as long as the program will crash or exit immediately with an error message, so it does not hide any severe error).

Of course, when your "time_ok" function is going to become part of a library, which might be reused in lots of different contexts (and maybe in a context where calling exit is not a good option), then it is indeed better to return an error code for the NULL argument to let the caller decide how to handle that problem.

1
  • The time_ok function is part of a 7000-line long component of the system which is a runnable program and not a general-purpose library and there are no plans to make it a general-purpose library. This conflict was eventually solved by printing a clear error message in the time_ok function if argument is NULL and exiting using exit(1), as the coworkers don't like core dumps. Of course, the error condition currently never happens in the software.
    – juhist
    CommentedMar 17, 2015 at 13:11
1

All functions should return an error code?

I can't see a good reason change a function like

int Add( int lhs, int rhs ) {...} 

to

// Out parameter: result error_t Add( int lhs, int rhs, int * result ) {...} 

especially as now you are having to deal with the possibility that result is null.


In your specific example, I would use something similar to the first form without the pointer, leaving to the compiler the decision between inlining to save the copying vs the increase in code size (for something this small I expect it to always pick inlining, but it is in a better position to judge)

bool time_ok( data_t datastruct, time_t cur_time ) { return datastruct.data_ok && ( cur_time > datastruct.start_time ) && ( cur_time < datastruct.end_time ) } 

where data_t and time_t are the types of datastruct and cur_time

This relies on the fact that time_ok is a Total Function (i.e. not a Partial Function), equivalently that for all data_t and for all time_t there is a defined bool that time_ok can return. This is easiest to prove in cases such as this, where the only thing occurring is values are compared and combined.

    Start asking to get answers

    Find the answer to your question by asking.

    Ask question

    Explore related questions

    See similar questions with these tags.