4
\$\begingroup\$

Explanation

I have done another command-line parser here -- I am kinda obsessed. It was really verbose for the user of the parser, it grew complicated quickly with option parsing and a regular command-line parser. This one should have them all, be easier to add and change things, and be more readable. My worry is that it is too complicated/clever and bloated, making the comments are too verbose and having to note oddities.

This parser deals with as much as it can. It only returns to the user whether there was a failure or if a special option was given (--version or --help), or equivalent. It supports deeper subcommands (e.g. create partition primary) and treats non-final commands as dummies, similarly to Windows diskpart. If a help option is given or a version option is given before an error happened, the last subcommand's help message is printed; if none was given yet, the main help message is printed. And if a dummy command is given with no parameters afterwards, its help message is also written. So, the syntax is something like this:

[--help | -h | -? | /?] [--version] [<ops>] [(<cmd> [--help | ...] [<ops>] [(...)]] [<args>] 

Its interface is verbose to prepare, but afterwards, everything is handled by the parser.

Code

There are 3 parts:

  1. The main code: args.h/c, defining the interface and implementing the command-line parser.

  2. A fat example: example.c.

  3. A dependency: stringi.h/c, which writes string functions that are case insensitive.

. I hope the interface is "obvious" by the example and args.h.

args.h

/* This defines the interface and some helpful/needed macros for the * command-line parsing interface. It is not thread-safe! * * Note: the help options include (possibly outdated): "--help", "-h", "/?", and * "-?". * * The following macros are not needed, but they should be used. It has a * function to parse through subcommands until the final command, and a pair of * functions: one two parse the command and anther to iterate through the * non-option arguments. * * The interface definition is flexible regarding the structs "args_subcommand", * "args_option". As long as the necessary elements are kept with the * appropriate conditions, the rest is informative, and the parsing will be done * equally well. */ #ifndef ARGS_H #define ARGS_H #define ARGS_DEF_SCMD(_scmd, _name, _help, _ops, _scmds, _fn)\ {\ (_scmd).name = _name;\ (_scmd).help = _help;\ (_scmd).ops = _ops;\ (_scmd).scmds = _scmds;\ (_scmd).fn = _fn;\ } #define ARGS_DEF_SCMD_END(_scmd) {_scmd.name = NULL;} #define ARGS_DEF_OPT(_opt, _name, _type, _form, _r)\ {\ (_opt).name = _name;\ (_opt).type = ARGS_OPTION_TYPE_##_type;\ (_opt).form = ARGS_OPTION_FORM_##_form;\ (_opt).r = _r;\ } #define ARGS_DEF_OPT_END(_opt) \ {\ (_opt).name = NULL;\ } enum args_option_type { ARGS_OPTION_TYPE_FLAG, ARGS_OPTION_TYPE_RAW }; enum args_option_form { ARGS_OPTION_FORM_LONG, ARGS_OPTION_FORM_SHORT }; /* If the option is LONG, name is any length (>0); if it SHORT, only its first * character is considered, the rest is ignored. * * If the option form is FLAG, then "r" is seen as a pointer to an int, and it * is set to 1. If the option form is RAW, then "r" is seen as a pointer to a * char pointer, and it is set to the beggining address of the option argument, * in "argv". */ struct args_option { char *name; enum args_option_type type; enum args_option_form form; void *r; }; struct args_subcommand { char *name; void (*help)(void); /* Terminate by "->name = NULL". */ struct args_subcommand *scmds; /* Terminated by "->name = NULL". This is ignored when "scmds" = NULL. */ struct args_option *ops; /* The parameters above are needed and used by "args", but the * parameters below are not, so they can be whatever. */ int (*fn)(int, char **); }; /* Parses the command-line for a subcommand or subcommands. The two dash ("--") * is supported. The behaviour is the following. * * 1. If a help option is given: if any, the last subcommand's "help" function * is called and return 1, otherwise the function "h()" is called. This all * independently of any error, but still parsing until the last possible * subcommand. * 2. If a version option is given and no help option, then "v" is called and * return 1. * 3. If an unknown option is given and no special option (see above) is given, * an error is logged and return -1. * 4. If a known option is found, it is set as described in the comments for * "struct args_option". * 5. If "--" is found, the parsing stops. * 6. If an unknown subcommand is found and no special option is given (see * above), an error is logged and return -1. * 7. If the subcommand is known: if it has no subcommands, then the parsing * stops, otherwise the parse continue following its own options and * sub-subcommands. The parsing of help continues the same and if it was set * before this subcommand or another, it is still set. The version option is * only parsed before the first subcommand. * 8. If a subcommand is given and nothing after it, is given, then its help * function is called and return 0. * * @param [inout] argc The pointer to argc as input, outputs the argc * for the subcommand. * @param [inout] argv The pointer to argv as input. Outputs the argv * for the subcommand. * @param [in] h The function of that prints the general help * message. * @param [in] scmds A list of the subcommands, terminated with * ".name" being NULL. * @param [out] scmd The pointer to the subcommand given by the user. * @param [in] ops A list of the options, terminated with ".name" * being NULL. * @param [in] v The function that prinst the version message. * @return Returns -1 when a parsing error occurs, returns 1 when * only the help option was printed without error. */ int args_parse_subcommand( int *argc, char ***argv, void (*h)(void), void (*v)(void), struct args_subcommand *scmds, struct args_subcommand **scmd, struct args_option *ops ); /* Parses the command-line. The two dash ("--") is support. The behaviour is the * following. * * 1. If a help option is found, the function 'h' is called and return 1. * 2. If an unknown option is found, an error is logged and return -1. * 3. Non-options arguments are ignored. * 4. If no argument is given after argv[0], it calls "h" and return 0. * * If none of the above exits happen, then it returns 0. * * @param argc argc. * @param [in] argv argv. * @param [in] h The function that prints the help message. * @param [in] ops A list of the options, terminated with ".name" being * NULL. * @return Returns -1 when a parsing occured, returns 1 when only the help * message was printed without error. Returns 0 otherwise. */ int args_parse(int argc, char **argv, void (*h)(void), struct args_option *ops); /* Returns next non-option argument, after "init". It assumes the function * "args_parse()" returned or would return 0. * @param [in] argv argv. * @param [in] ops The command-line options. This is neede so an * option-argument is skipped. * @param [in] init The pointer to the previous non-option argument. If * there is not a previous, use "argv[0]". If "init" or * prev[0] is NULL, NULL is returned. * @return The pointer to the next non-option argument, or NULL if * there is no next one. */ char **args_next_argument(char **argv, struct args_option *ops, char **init); #endif /* ARGS_H */ 

args.c

/* This implements "args.h", and in a bit of a complicated manner. * * The thing is that everything is parsed normally, but then an error is found. * With reasonable headed people, this would just log the error and exit * failure. I am not reasonable: if a help option or a version option was given, * the program simply cannot error and something will be printed. So, how is * this handled? * * It uses a pseudo-exception system with "setjmp". If an error is found, it * logs if no special option was given, and then "longjmp"s to "g_exception". * And since it only "raises" when no more parsing can be done and a special * option was given, we have all the information to handle it. This is made * easier through "RAISE_BEG_LOG()"/"RAISE_END_LOG()", and also * "SPECIAL_OPT_ON()". There are also other behavioural issues: for example, * 'args_next_argument()' does not want for the option results to be output, so * a "dry" run has to be done. This information could be passed through a * parameter, but this means other functions may need to receive it as a * parameter, and this would scale poorly, so global variables are used. To * check all the global variables, see below. * * NOTE: Yeah, whatever, I could just use a struct to hold the parser condition * and it would even be prettier, but whatever. * * This means all global variables have to be initialized so not undefined * behaviour would happen. To make this less error-prone, "init_globals()" is * used; if a new global is added, this function should fail in case the call * has not adapted to the new declaration. */ #include "args.h" #include <assert.h> #include <stddef.h> #include <stdlib.h> #include <stdio.h> #include <setjmp.h> #include <string.h> #include "stringi.h" /* These are QoL macro. */ #define SPECIAL_OPT_ON() (g_help | g_version) #define RAISE_BEG_LOG(condition)\ if (condition) {\ if (SPECIAL_OPT_ON()) {\ longjmp(g_exception, 1);\ } #define RAISE_END_LOG()\ longjmp(g_exception, 1);\ } #define SET_GHELP()\ {\ g_help = 1;\ if (g_help_exception) {\ longjmp(g_exception, 1);\ }\ } #define FIND_OPTION(_n, _nLen, _ops, _form) (\ find_option(\ _n,\ _nLen,\ _ops,\ ARGS_OPTION_FORM_##_form\ )\ ) /** State of the parser. **/ int g_help; int g_version; jmp_buf g_exception; /** Options that affect the behaviour of the parsing. **/ /* When true, if g_help is set, it "raises" an exception. */ int g_help_exception; /* When true, ignores version options. */ int g_version_ignore; /* If true, (op->r) is never set. */ int g_dry; static void init_globals( int hExc, int vIgn, int dry ) { g_help = 0; g_version = 0; g_help_exception = hExc; g_version_ignore = vIgn; g_dry = dry; } static int verify_options(const struct args_option *ops) { const struct args_option *op; if (ops == NULL) return 1; for (op = ops; op->name != NULL; op++) { if (op->name[0] == 0) return 0; if (strchr(op->name, '=') != NULL) return 0; if ( op->type != ARGS_OPTION_TYPE_RAW && op->type != ARGS_OPTION_TYPE_FLAG ) { return 0; } if ( op->form != ARGS_OPTION_FORM_LONG && op->form != ARGS_OPTION_FORM_SHORT ) { return 0; } if (op->r == NULL) return 0; } return 1; } /* Returns whether subcommands are correct. */ static int verify_subcommands(const struct args_subcommand *scmds) { const struct args_subcommand *scmd; if (scmds == NULL) return 1; for (scmd = scmds; scmd->name != NULL; scmd++) { assert(scmd->help != NULL); if (scmd->scmds == NULL) continue; if (!verify_subcommands(scmd->scmds)) return 0; if (!verify_options(scmd->ops)) return 0; } return 1; } static int is_option_long(const char *arg) { assert(arg != NULL); return arg[0] == '-' && arg[1] == '-' && arg[2] != 0; } static int is_option_short(const char *arg) { assert(arg != NULL); return arg[0] == '-' && arg[1] != '-' && arg[1] != 0; } static int is_2dash(const char *arg) { return stringi_equal(arg, "--"); } /* If it is a help option with non-standard option syntax. */ static int is_option_special_help(const char *arg) { return stringi_equal(arg, "/?") || stringi_equal(arg, "-?"); } /* ops = NULL is allowed, in which case NULL is returned. */ static struct args_option *find_option( const char *n, size_t nLen, struct args_option *ops, enum args_option_form f ) { struct args_option *op; if (ops == NULL) return NULL; for (op = ops; op-> name != NULL; op++) { if (op->form != f) continue; if (!stringin_equal(op->name, n, nLen)) continue; return op; } return NULL; } static void set_option_result(struct args_option *op, char *arg) { assert(op != NULL); if (g_dry) return; if (op->type == ARGS_OPTION_TYPE_FLAG) { *((int*)(op->r)) = 1; } else { assert(op->type == ARGS_OPTION_TYPE_RAW); assert(arg != NULL); *((char**)(op->r)) = arg; } } /* Returns the number of args to skip. */ static int parse_long_option(struct args_option *ops, char *arg0, char *arg1) { struct args_option *op; char *eq; size_t nLen; char *n; char *opArg; assert(arg0 != NULL); n = arg0 + 2; eq = strchr(n, '='); if (eq == NULL) nLen = strlen(n); else nLen = eq - n; /* This means we allow "--help=" and "--version=". This is a worthy * sacrifice. */ if (stringi_equal(n, "help")) { SET_GHELP() return 1; } if (!g_version_ignore && stringi_equal(n, "version")) { g_version = 1; return 1; } op = FIND_OPTION(n, nLen, ops, LONG); RAISE_BEG_LOG(op == NULL) fprintf(stderr, "Unknown option: '--%.*s'\n", nLen, n); RAISE_END_LOG() if (op->type == ARGS_OPTION_TYPE_FLAG) { RAISE_BEG_LOG(eq != NULL) fprintf(stderr, "Option takes no argument: '--%.*s'\n", nLen, n); RAISE_END_LOG() set_option_result(op, NULL); return 1; } assert(op->type == ARGS_OPTION_TYPE_RAW); if (eq != NULL) opArg = eq[1] != 0 ? eq + 1 : arg1; else opArg = arg1; RAISE_BEG_LOG(opArg == NULL) fprintf(stderr, "No argument given for '--%.*s'\n", nLen, n); RAISE_END_LOG() set_option_result(op, opArg); if (opArg == arg1) return 2; return 1; } /* Returns the number of args to skip. */ static int parse_short_option(struct args_option *ops, char *arg0, char *arg1) { char *n; struct args_option *op; char *opArg; assert(arg0 != NULL); n = arg0; while (1) { n++; if (*n == 0) break; if (*n == 'h') { SET_GHELP() continue; } op = FIND_OPTION(n, 1, ops, SHORT); RAISE_BEG_LOG(op == NULL) fprintf(stderr, "Unknown option: '-%c'\n", *n); RAISE_END_LOG() if (op->type == ARGS_OPTION_TYPE_FLAG) { RAISE_BEG_LOG(n[1] == '=') fprintf(stderr, "Option takes no argument: " "'-%c'\n", *n); RAISE_END_LOG() set_option_result(op, NULL); continue; } assert(op->type == ARGS_OPTION_TYPE_RAW); if (n[1] == 0) opArg = arg1; else if (n[1] == '=') opArg = n[2] == 0 ? arg1 : n + 2; else opArg = n + 1; RAISE_BEG_LOG(opArg == NULL) fprintf(stderr, "No argument given for '-%c'\n", *n); RAISE_END_LOG() set_option_result(op, opArg); if (opArg == arg1) return 2; return 1; } return 1; } /* If no error, returns the pointer to the subcommand in "scmds" of the * subcommand in the command-line, otherwise return NULL. If NULL is returned, * there are two possibilities: * 1) If g_help, then no error was printed. * 2) If !g_help, then the error message was printed. * * @param [inout] argc Vide "args_parse_subcommand()". * @param [inout] argv Idem. * @param [in] scmds Idem. */ /* Sets "g_exception". */ static struct args_subcommand *parse_subcommand( int *argc, char ***argv, struct args_subcommand *scmds, struct args_option *ops ) { int i; struct args_subcommand *scmd; int cmdI = 0; assert(argv != NULL); assert(*argv != NULL); assert(scmds != NULL); if (setjmp(g_exception) != 0) return NULL; if (*argc <= 1 || ((*argv)[0] == NULL && (*argv)[1] == NULL)) { SET_GHELP() return NULL; } for (i = 1; i < *argc && (*argv)[i] != NULL; i++) { char *arg = (*argv)[i]; char *nArg = (*argv)[i + 1]; if (is_option_special_help(arg)) { SET_GHELP() continue; } if (is_2dash(arg)) { break; } if (is_option_long(arg)) { i += parse_long_option(ops, arg, nArg) - 1; continue; } if (is_option_short(arg)) { i += parse_short_option(ops, arg, nArg) - 1; continue; } cmdI = i; break; } if (cmdI == 0) { if (SPECIAL_OPT_ON()) return NULL; (void)fprintf(stderr, "No command given\n"); return NULL; } for (scmd = scmds; scmd->name != NULL; scmd++) { if (stringi_equal(scmd->name, (*argv)[cmdI])) break; } if (scmd->name == NULL) { if (SPECIAL_OPT_ON()) return NULL; (void)fprintf(stderr, "Unknown command: '%s'\n", (*argv)[cmdI]); return NULL; } *argc -= cmdI; *argv += cmdI; return scmd; } /* Sets "g_help", "g_dry",, "g_help_exception", and "g_version". */ int args_parse_subcommand( int *argc, char ***argv, void (*h)(void), void (*v)(void), struct args_subcommand *scmds, struct args_subcommand **scmd, struct args_option *ops ) { struct args_subcommand *lstScmd = NULL; struct args_subcommand *cScmd; assert(argc != NULL); assert(argv != NULL); assert(*argv != NULL); assert(h != NULL); assert(scmds != NULL); assert(scmd != NULL); assert(verify_subcommands(scmds)); assert(verify_options(ops)); init_globals(0, 0, 0); while (1) { cScmd = parse_subcommand(argc, argv, scmds, ops); if (cScmd == NULL) break; g_version_ignore = 1; lstScmd = cScmd; if (lstScmd->scmds == NULL) break; scmds = lstScmd->scmds; ops = lstScmd->ops; } if (cScmd == NULL && !SPECIAL_OPT_ON()) return -1; if (g_help) { if (lstScmd == NULL) h(); else lstScmd->help(); return 1; } if (g_version) { v(); return 1; } *scmd = cScmd; return 0; } /* Sets "g_help", "g_help_exception", "g_dry", "g_exception", and "g_version". */ int args_parse(int argc, char **argv, void (*h)(void), struct args_option *ops) { int i; assert(argv != NULL); assert(h != NULL); assert(verify_options(ops)); init_globals(1, 1, 0); if (argc <= 1 || (argv[0] == NULL && argv[1] == NULL)) { h(); return 1; } if (setjmp(g_exception) != 0) { if (g_help) goto help; return -1; } for (i = 1; i < argc && argv[i] != NULL; i++) { char *arg = argv[i]; char *nArg = argv[i + 1]; if (is_option_special_help(arg)) goto help; if (is_2dash(arg)) break; if (is_option_long(arg)) { i += parse_long_option(ops, arg, nArg) - 1; continue; } if (is_option_short(arg)) { i += parse_short_option(ops, arg, nArg) - 1; continue; } /* We ignore non-option arguments. */ } if (g_help) goto help; return 0; help: h(); return 1; } char **args_next_argument(char **argv, struct args_option *ops, char **init) { int dash2; char **arg; char **r; init_globals(0, 0, 1); assert(setjmp(g_exception) == 0); dash2 = 0; arg = argv; while (1) { arg++; if (arg > init) { break; } assert(*arg != NULL); if (is_2dash(arg[0])) { dash2 = 1; break; } } if (dash2) { r = init + 1; goto end; } arg = init; while (1) { arg++; if (arg[0] == NULL) return NULL; if (is_option_special_help(arg[0])) { continue; } if (is_2dash(arg[0])) { r = arg + 1; break; } if (is_option_long(arg[0])) { arg += parse_long_option(ops, arg[0], arg[1]) - 1; continue; } if (is_option_short(arg[0])) { arg += parse_long_option(ops, arg[0], arg[1]) - 1; continue; } r = arg; break; } end: if (r[0] == NULL) return NULL; return r; } 

example.c

#include <stdio.h> #include "args.h" #define VERSION_MAJOR (1) #define VERSION_MINOR (0) #define VERSION_PATCH (3) static const char *g_program_name; static void help(void) { printf("Usage: %s\n", g_program_name); } static void help_create(void) { const char **cmd; const char *cmds[2] = { "volume", NULL }; printf("Usage: %s create <command>\n", g_program_name); printf("Commands:\n"); for (cmd = cmds; *cmd != NULL; cmd++) printf("\t%s\n", *cmd); } static void help_create_volume(void) { printf("Usage: %s create volume\n", g_program_name); } static int builtin_create_volume(int argc, char **argv) { int r = args_parse(argc, argv, help_create_volume, NULL); if (r == 1) return 0; else if (r == -1) return 1; printf("Create volume!\n"); return 0; } static void help_read(void) { printf("Read: %s read\n", g_program_name); } static int builtin_read(int argc, char **argv) { struct args_option ops[3]; char *raw = NULL; int flag = 0; char **arg; ARGS_DEF_OPT(ops[0], "r", RAW, LONG, &raw) ARGS_DEF_OPT(ops[1], "f", FLAG, LONG, &flag) ARGS_DEF_OPT_END(ops[2]) int r = args_parse(argc, argv, help_read, ops); if (r == 1) return 0; else if (r == -1) return 1; printf("Read!\n"); printf("Raw = %s\n", raw); printf("Flag = %d\n", flag); arg = argv; while (1) { arg = args_next_argument(argv, ops, arg); if (arg == NULL) break; printf("Arg = '%s'\n", *arg); } return 0; } static void version(void) { printf("Version %d.%d.%d\n", VERSION_MAJOR, VERSION_MINOR, VERSION_PATCH ); } int main(int argc, char **argv) { struct args_subcommand scmds[3]; struct args_subcommand *scmd; struct args_subcommand create_scmds[2]; struct args_option ops0[5]; struct args_option ops10[5]; char *op0_raw = NULL; int op0_flag = 0; char *op0_r = NULL; int op0_f = 0; char *op10_raw = NULL; int op10_flag = 0; char *op10_r = NULL; int op10_f = 0; int r; if (argc >= 1 && argv[0] != NULL) g_program_name = argv[0]; else g_program_name = "example"; /*** Configuring main command. ***/ ARGS_DEF_SCMD(scmds[0], "create", help_create, ops10, create_scmds, NULL) ARGS_DEF_SCMD(scmds[1], "read", help_read, NULL, NULL, builtin_read) ARGS_DEF_SCMD_END(scmds[2]) ARGS_DEF_OPT(ops0[0], "raw", RAW, LONG, &op0_raw) ARGS_DEF_OPT(ops0[1], "flag", FLAG, LONG, &op0_flag) ARGS_DEF_OPT(ops0[2], "r", RAW, SHORT, &op0_r) ARGS_DEF_OPT(ops0[3], "f", FLAG, SHORT, &op0_f) ARGS_DEF_OPT_END(ops0[4]) /** Configuring "create" **/ ARGS_DEF_SCMD(create_scmds[0], "volume", help_create_volume, NULL, NULL, builtin_create_volume) ARGS_DEF_SCMD_END(create_scmds[1]) ARGS_DEF_OPT(ops10[0], "raw", RAW, LONG, &op10_raw) ARGS_DEF_OPT(ops10[1], "flag", FLAG, LONG, &op10_flag) ARGS_DEF_OPT(ops10[2], "r", RAW, SHORT, &op10_r) ARGS_DEF_OPT(ops10[3], "f", FLAG, SHORT, &op10_f) ARGS_DEF_OPT_END(ops10[4]) r = args_parse_subcommand( &argc, &argv, help, version, scmds, &scmd, ops0 ); if (r == 1) return 0; else if (r == -1) return 1; printf("Raw0 = %s\n", op0_raw); printf("Flag0 = %d\n", op0_flag); printf("R0 = %s\n", op0_r); printf("F0 = %d\n", op0_f); printf("Raw10 = %s\n", op10_raw); printf("Flag10 = %d\n", op10_flag); printf("R10 = %s\n", op10_r); printf("F10 = %d\n", op10_f); return scmd->fn(argc, argv); } 

Dependency

stringi.h

/* String functions ignoring case. */ #ifndef STRINGI_H #define STRINGI_H #include <stddef.h> /* Returns whether two strings are equal ignoring case. */ int stringi_equal(const char *s1, const char *s2); /* Returns whether a strings starts with another string. */ int stringi_starts(const char *s, const char *with); /* Returns whether the character array are equal up to the first 'n' character, * and stopping at the first null character. */ int stringin_equal(const char *s1, const char *s2, size_t n); #endif /* STRINGI_H */ 

stringi.c

/* The implementation of "stringi.h". */ #include "stringi.h" #include <ctype.h> #include <stddef.h> #include <assert.h> #include <string.h> static int chari_equal(char c1, char c2) { return toupper(c1) == toupper(c2); } static int charin_equal(const char *s1, const char *s2, size_t n) { size_t i; assert(s1 != NULL); assert(s2 != NULL); for (i = 0; i < n; i++) { if (!chari_equal(s1[i], s2[i])) return 0; } return 1; } int stringin_equal(const char *s1, const char *s2, size_t n) { size_t i; assert(s1 != NULL); assert(s2 != NULL); for (i = 0; i < n;i++) { if (s1[i] == 0 || s2[i] == 0) return 0; if (!chari_equal(s1[i], s2[i])) return 0; } return 1; } int stringi_equal(const char *s1, const char *s2) { size_t s1Len, s2Len; assert(s1 != NULL); assert(s2 != NULL); s1Len = strlen(s1); s2Len = strlen(s2); if (s1Len != s2Len) return 0; return charin_equal(s1, s2, s1Len); } int stringi_starts(const char *s, const char *with) { size_t sLen, withLen; assert(s != NULL); assert(with != NULL); sLen = strlen(s); withLen = strlen(with); if (sLen < withLen) return 0; return charin_equal(s, with, withLen); } 
\$\endgroup\$
0

    1 Answer 1

    4
    \$\begingroup\$

    Some errors can be found simply by enabling more warning options when compiling:

    • args_option.name and args_subcommand.name are both assigned from string literals, so would be better as char const*, to ensure no attempt is made to modify read-only data.
    • nLen = eq - n assigns unsigned type size_t from a subtraction (which yields signed type ptrdiff_t).
    • fprintf(stderr, "Unknown option: '--%.*s'\n", nLen, n); expects an int for the first conversion, but we provide a size_t. Similarly for the other prints in parse_long_option().

    That's the robo-review done; let's go through it in detail. I'll work from bottom up, starting with stringi.h.

    stringi.h/stringi.c

    Functions from <ctype.h> take positive integer arguments, so passing (possibly negative) char values invites Undefined behaviour. We need to convert to unsigned char before conversion to int:

    static int chari_equal(char c1, char c2) { return toupper((unsigned char)c1) == toupper((unsigned char)c2); } 

    It's locale-dependent whether that's a valid case-insensitive equality test. Some platforms provide a function (e.g. strcasecmp() in POSIX), which is likely more correct, so consider using that where it's available.

    The identifiers stringin_equal, stringi_equal and stringi_starts are all in the set that's reserved for future library expansion (i.e. beginning with str followed by a letter), so I recommend changing those.

    The assertions in those functions (and indeed elsewhere in the code) don't appear to be justified. The pointers come from outside the functions, so I don't believe we can be certain they are not null. It looks like this is assert() being a misguided attempt at error checking.

    stringi_equal() traverses both strings twice (once in strlen() and once in charin_equal(). That's less efficient than just iterating over the contents once.

    args.h

    I don't much like the macros that could be functions - especially when they expand arguments multiple times, preventing callers from doing reasonable things such as

     ARGS_DEF_SCMD(scmds[i++], "create", help_create, ops10, create_scmds, NULL) 

    It's concerning that most of the arguments in the expansion aren't enclosed in (…). Although assignment is (almost) the lowest precedence, that's just something we expect to see in a macro.

    I think that a more helpful approach is to have macros that can be used as structure initialisers:

    #define ARGS_SUBCOMMAND(name_, help_, ops_, scmds_, fn_) \ { \ .name = (name_), \ .help = (help_), \ .ops = (ops_), \ .scmds = (scmds_), \ .fn = (fn_), \ } #define ARGS_OPTION(name_, type_, form_, r_) \ { \ .name = (name_), \ .type = ARGS_OPTION_TYPE_##type_, \ .form = ARGS_OPTION_FORM_##form_, \ .r = (r_), \ } #define ARGS_END { .name = NULL } 

    The advantage this gives is that the user doesn't need to keep count of the number of subcommands or options, and can easily add, remove or reorder the lists:

     struct args_option ops0[] = { ARGS_OPTION("raw", RAW, LONG, &op0_raw), ARGS_OPTION("flag", FLAG, LONG, &op0_flag), ARGS_OPTION("r", RAW, SHORT, &op0_r), ARGS_OPTION("f", FLAG, SHORT, &op0_f), ARGS_END }; struct args_option ops10[] = { ARGS_OPTION("raw", RAW, LONG, &op10_raw), ARGS_OPTION("flag", FLAG, LONG, &op10_flag), ARGS_OPTION("r", RAW, SHORT, &op10_r), ARGS_OPTION("f", FLAG, SHORT, &op10_f), ARGS_END }; struct args_subcommand create_scmds[] = { ARGS_SUBCOMMAND("volume", help_create_volume, NULL, NULL, builtin_create_volume), ARGS_END }; struct args_subcommand scmds[] = { ARGS_SUBCOMMAND("create", help_create, ops10, create_scmds, NULL), ARGS_SUBCOMMAND("read", help_read, NULL, NULL, builtin_read), ARGS_END }; 

    This is a pattern that's well known; for example, the Linux kernel uses it extensively for building tables (especially in subsystems such as PCI and USB where modules present a table of vendor and product identifiers which they support).

    I'm not sure that help really needs to be a function. With just a string, the argument processor could enumerate the subcommands and options, rather than the user having to do that.

    It's not clear why the functions need modifiable structures to work with. In fact, just changing all the pointers to point to const structures works perfectly, so we should do that. That allows users to declare their options const, further reducing scope for error.

    stringi.c

    I don't like the large number of globals - I'm surprised we need so many. I'm also surprised that we couldn't write the optimistic parsing without longjmp() (which most of us avoid because it complicates reasoning about program state).

    It seems strange to use stringi_equal() in these cases, where standard strcmp() should obtain the same results for non-alphabetic strings:

    static int is_2dash(const char *arg) { return stringi_equal(arg, "--"); } /* If it is a help option with non-standard option syntax. */ static int is_option_special_help(const char *arg) { return stringi_equal(arg, "/?") || stringi_equal(arg, "-?"); } 

    I'd probably write this loop without the continues:

     for (op = ops; op-> name != NULL; op++) { if (op->form != f) continue; if (!stringin_equal(op->name, n, nLen)) continue; return op; } 

    I reckon it's clearer with tests in the positive sense:

     for (op = ops; op-> name; op++) { if (op->form == f && stringin_equal(op->name, n, nLen)) { return op; } } 

    It seems harmful to me to compare options case-insensitively. That means we can't have options that differ only in case, such as the very different ls -l and ls -L, or cp -t and cp -T. Not what most people would expect.

    example.c

    This is redundant:

     if (argc >= 1 && argv[0] != NULL) 

    argv[argc] is always the only null element in the array, so argc!argv[0]. No need to test both.

    My very first test (invoke program with no arguments) produced the usage message, but then it exited with a success status. That seems wrong, since the usage message was a consequence of erroneous invocation rather than specifically requested.

    \$\endgroup\$
    5
    • \$\begingroup\$Thank you a lot for the review! I forgot to say I am using C89, so the macro thing cannot be done (unfortunately), but I learnt something! I compiled with MinGW GCC with all warnings on (-Wall -Wextra -pedantic) and it didn't report any problems.; with WSL, it complained about nLen not being int. Also, if eq - n <= SIZE_MAX , wouldn't it be just an implicit conversion from ptrdiff_t to size_t? And, now thinking, I can just remove the use argc, since it's non-negative and argv[argc] = NULL. Thank you!\$\endgroup\$
      – Schilive
      CommentedSep 13, 2024 at 19:34
    • \$\begingroup\$I am actually using assert() for checking for internal errors. The idea is to use signal.h to change SIGBART's handler to one which will tell the user of an internal error, close the files, and then exit error. Also, it is all case-insensitive because of an old decision, which I now think is dumb. And it's meant to be ASCII-only, but then it should return an error if it weren't ASCII-only. Again, thanks for the taking the time to write it all.\$\endgroup\$
      – Schilive
      CommentedSep 13, 2024 at 19:38
    • \$\begingroup\$In case it's useful, I used gcc-14 -std=c23 -Wall -Wextra -Wwrite-strings -Wno-parentheses -Wpedantic -Warray-bounds -Wmissing-braces -Wconversion -Wstrict-prototypes -fanalyzer.\$\endgroup\$CommentedSep 13, 2024 at 19:43
    • \$\begingroup\$Toby, it is very useful. I thought -Wall -Wextra -pedantic was enough, but... I guess not. I'll still read GCC's manual one day.\$\endgroup\$
      – Schilive
      CommentedSep 13, 2024 at 19:46
    • \$\begingroup\$Also, the reason it returns 0 when only example or a subcommand (example create) is executed is because the parser interprets you are asking for the help message for that command/subcommand, like GitHub's gh.\$\endgroup\$
      – Schilive
      CommentedSep 14, 2024 at 2:30

    Start asking to get answers

    Find the answer to your question by asking.

    Ask question

    Explore related questions

    See similar questions with these tags.