Changes between Initial Version and Version 1 of Developer/3aCodingStyle

Jul 11, 2020, 11:56:07 AM (4 years ago)



  • Developer/3aCodingStyle

    v1 v1  
     1= Coding Style Guidelines =
     3OML is written in C. Currently the code exhibits a mix of different styles, but we'd like to move it towards a clean, simple C style.
     5== Whitespace ==
     6Whitespace at the end of lines is annoying. Git provides a default pre-commit hook that prevents a commit that violates some sensible whitespace rules. Please enable it by doing this immediately after cloning the repository:
     81$ git clone git://
     92$ cd oml
     103$ cp .git/hooks/pre-commit.sample .git/hooks/pre-commit
     114$ chmod 755 .git/hooks/pre-commit
     13Tabs-vs-spaces is a religious war, which can only be solved by a deliberate dictatorial decision. In this case, I choose to base our choices on jwz. There should be no tab character in source files, and shift width is 2. The link has good advice for emacs and vim users. The source code already has emacs local variables to set the correct style (in particular, indent-tabs-mode: nil).
     15== Identation and Braces ==
     16The basic indentation style should be K&R style. For functions, opening brace in column 0; for control structures, opening brace on the first line of the statement. If braces aren't required, use them anyway (Sonar insists on this). Descriptive comments should be Doxygen -formatted, as a block before the function implementation by default, or the prototype in the headers for user-visible or generic API.
     18 1/** Doxygen brief comment.
     19 2 *
     20 3 * Longer Summary
     21 4 * \param a first parameter
     22 5 * \param b second parameter
     23 6 * \return an integer
     24 7 *
     25 8 * \see doxygen(1)
     26 9 */
     2811foo (int a, char *b)
     3013  char *p = b;
     3114  if (a < 0 || b == NULL) {
     3215    return -1;
     3316  }
     3518  while (*p && (p - b) < a) {
     3619    *p = toupper (*p);
     3720    if (*p == '\t') {
     3821      *p = ',';
     3922    }
     4023    p++;
     4124  }
     4326  return 0;
     47Switch statements should be indented with the subordinate "case" labels lined up with the "switch" keyword:
     49 1  switch(v->type) {
     50 2  case OML_LONG_VALUE:   v->value.longValue   = 0;    break;
     51 3  case OML_INT32_VALUE:  v->value.int32Value  = 0;    break;
     52 4  case OML_UINT32_VALUE: v->value.uint32Value = 0;    break;
     53 5  case OML_INT64_VALUE:  v->value.int64Value  = 0LL;  break;
     54 6  case OML_UINT64_VALUE: v->value.uint64Value = 0ULL; break;
     55 7  case OML_DOUBLE_VALUE: v->value.doubleValue = 0;    break;
     56 8  case OML_STRING_VALUE: {
     57 9    if (v->value.stringValue.is_const) {
     5810      v->value.stringValue.ptr = NULL;
     5911    } else {
     6012      v->value.stringValue.size = 0;
     6113      if (v->value.stringValue.length > 0) {
     6214        *v->value.stringValue.ptr = '\0';
     6315      }
     6416    }
     6517    break;
     6618  }
     6719  default:
     6820    logerror("Copy for type '%d' not implemented'\n", v->type);
     6921    return -1;
     7022  }
     72Do-while loops should look like this:
     741  do {
     752    i *= 2;
     763    j++;
     774  } while (i < n);
     79If-elseif-else constructs should look like this:
     811  if (x == y) {
     822    ..
     833  } else if (x > y) {
     844    ...
     855  } else {
     866    ....
     877  }
     89== Functions ==
     90Function prototypes should have the return type on a first line, and the function name and parameters signature on the following one:
     932oml_value_from_typed_s (OmlValue *value, const char *type_s, const char *value_s)
     99This makes source files easier to grep. Prototype that is a declaration (not part of the function definition) should however be all on one line.
     100int oml_value_from_typed_s (OmlValue *value, const char *type_s, const char *value_s);
     102This makes header files easier to read. Note that OML is written in C, not C++: declarations of pointer variables attach the asterisk to the variable name, not the type. However, if your function returns a pointer to something, in the function definition it is permissible to skip the space:
     1052oml_type_to_s(OmlValueT type)
     111But note:
     113char *oml_type_to_s(OmlValueT type);
     115is proper for the declaration prototype in the header file.
     116Note that the following is an abomination, and whilst it does exist in some places in the OML source code currently, they will be subject to ruthless repression when found:
     118 1int
     119 2omlc_init(
     120 3  const char* appName,
     121 4  int* argcPtr,
     122 5  const char** argv,
     123 6  o_log_fn custom_oml_log
     124 7) {
     125 8...
     126 9}
     128== Identifiers ==
     129OML is written in C. As a general rule, identifiers should be short and concise. Where this is not possible, they should be as short as possible whilst still being descriptive. If they consist of multiple words, then the words should be separated by underscores. OML is written in C, not Java: there should be no capital letters -- mixed case identifiers are an abomination.
     131The one exception to this is for typedef names. Because there is only one exception, they stand out, and this is a good thing (tm). For instance:
     133 1static int
     134 2out(OmlWriter *writer, OmlValue *values, int value_count)
     135 3{
     136 4  int i;
     137 5  OmlValue *v = values;
     138 6  OmlFileWriter *self = (OmlFileWriter*)writer;
     139 7  FILE *f = self->f;
     140 8  if (f == NULL) {
     141 9    return 1;
     14210  }
     14412  for (i = 0; i < value_count; i++, v++) {
     14513    switch (v->type) {
     14614    case OML_LONG_VALUE:
     14715      fprintf(f, "\t%" PRId32, oml_value_clamp_long (v->value.longValue));
     14816      break;
     14917    case OML_INT32_VALUE:  fprintf(f, "\t%" PRId32,  v->value.int32Value);  break;
     15018    case OML_UINT32_VALUE: fprintf(f, "\t%" PRIu32,  v->value.uint32Value); break;
     15119    case OML_INT64_VALUE:  fprintf(f, "\t%" PRId64,  v->value.int64Value);  break;
     15220    case OML_UINT64_VALUE: fprintf(f, "\t%" PRIu64,  v->value.uint64Value); break;
     15321    case OML_DOUBLE_VALUE: fprintf(f, "\t%f",  v->value.doubleValue); break;
     15422    case OML_STRING_VALUE: fprintf(f, "\t%s",  v->value.stringValue.ptr); break;
     15523    default:
     15624      logerror ("Unsupported value type '%d'\n", v->type);
     15725      return 0;
     15826    }
     15927  }
     16028  return 1;
     163This example is instructive. Above I have claimed that mixed case is not allowed except for in typedef names, but the OmlValueU union has members that have mixed-case names (e.g. v->value.int32Value). Doesn't that break the rules? Yes, in fact it does. However, these rules postdate the OML2 API, and we have to preserve backwards compatibility within the 2.x series, so for the moment those names have to stay.
     165Although OML doesn't follow the Linux kernel coding style, that document does have much sage advice, such as this from Chapter 4:
     167C is a Spartan language, and so should your naming be. Unlike Modula-2
     168and Pascal programmers, C programmers do not use cute names like
     169ThisVariableIsATemporaryCounter. A C programmer would call that
     170variable "tmp", which is much easier to write, and not the least more
     171difficult to understand.
     173GLOBAL variables (to be used only if you really need them) need to
     174have descriptive names, as do global functions. If you have a function
     175that counts the number of active users, you should call that
     176"count_active_users()" or similar, you should not call it "cntusr()".
     178Encoding the type of a function into the name (so-called Hungarian
     179notation) is brain damaged - the compiler knows the types anyway and can
     180check those, and it only confuses the programmer. No wonder MicroSoft
     181makes buggy programs.
     183LOCAL variable names should be short, and to the point. If you have
     184some random integer loop counter, it should probably be called "i".
     185Calling it "loop_counter" is non-productive, if there is no chance of it
     186being mis-understood. Similarly, "tmp" can be just about any type of
     187variable that is used to hold a temporary value.
     189If you are afraid to mix up your local variable names, you have another
     190problem, which is called the function-growth-hormone-imbalance syndrome.
     193== Transparent and Opaque Types ==
     195Traditionally OML2 has used a lot of typedefs for internal structures and types. I am starting to move away from this. The guidance in the Linux kernel coding style guide influenced my thinking (see Chapter 5).
     197Typedefs should be used for types that are opaque. Such types should only be manipulated via a well defined API. For instance, the MBuffer and MString types meet this criterion. In this case, the typedef is used to actively hide what the object is, and especially to hide its implementation details.
     199Types in the API should be opaque, but unfortunately we already put the full structure definitions in the <oml2/omlc.h> header file. This was a mistake because it causes problems if we want to modify the internal implementation details whilst maintaining binary and source backwards compatibility. Adding a member to a struct breaks binary compatibility unless it is added at the end; and even then, if it is used in an array (such as OmlValueU and OmlMPDef) it probably still breaks binary compatibility. Removing a member from a struct can break both source and binary compatibility. For the OML3 API (still only a twinkle in my eye) I am planning to hide almost all of the implementation-specific details behind opaque types (essentially void* pointers).
     201Internal data structures that are meant to be transparent should not be given typedefs. The Database typedef is an example in the server source code that should probably be removed in favour of referring to the underlying struct directly; we already access its internals directly in the code. The schema.c file is an example of the new policy. It is used to represent measurement tuple and database table schemas, and it is not typedef'd. That is because it is more convenient for client code to manipulate the internals directly. It is used in several ways and places, so a "well defined" API is difficult to, well, define.
     203Typedefs should also not be used for types that are easily represented using native C types. If a quantity could be an int on one platform and a long on another, then a typedef is appropriate, but otherwise not. The C standard defines portable typedefs for integers of pre-determined widths. If the number of bits (and signedness) used to represent a quantity are important, use the types from <stddint.h> (uint8_t, int32_t, etc.). Otherwise just use simple types. For instance, loop counters should be int, strings should be char*, etc.
     205Struct and union names for transparent types should be all lowercase with words separated by underscores, just like all other identifiers. For instance:
     207 1struct schema_field
     208 2{
     209 3  char *name;
     210 4  OmlValueT type;
     211 5};
     212 6
     213 7struct schema
     214 8{
     215 9  char *name;
     21610  struct schema_field *fields;
     21711  int nfields;
     21812  int index;
     22115struct schema *schema_from_meta (char *meta);
     225== Code Documentation ==
     226Doxygen documentation of the code is desirable. Important functions should be well documented, with both brief and longer descriptions, along with documentation of parameters and return. Short, straightforward functions can have a brief description only. C-style comments are obviously preferred, with a slash ('\') for Doxygen commands.
     228Other comments are useful, but overcommenting is not. Be reasonable.
     230The following should summarise the expectations.
     232 1int frob; /** Frobnication counter */
     233 2
     234 3/** Frobnicate */
     235 4void
     236 5frobnicate(void) {
     237 6  frob++;
     238 7}
     239 8
     240 9/** Do something complex.
     24110 *
     24211 * This function fiddles with i and c, and frobnicates some more.
     24413 * \param c character string of size i
     24514 * \param i size of c
     24615 * \return the number of times we have frobnicated
     24716 *
     24817 * \see frobnicate
     24918 */
     25120something_complex(char *c, int i) {
     25221  /* Skip on the actual work */
     25423  frobnicate();
     25524  return frob;
     258== License boilerplate ==
     259OML is MIT licensed by Nicta. The full license comes in the COPYING file at the root of the tree. However, we want to make sure each file is clearly labelled as MIT-licensed and copyrighted by Nicta. In the interest of space, the following boilerplate is to be used at the beginning of each file:
     262 * Copyright 20XX-20XX National ICT Australia Limited (NICTA)
     263 *
     264 * This software may be used and distributed solely under the terms of
     265 * the MIT license (License).  You should find a copy of the License in
     266 * COPYING or at By downloading or
     267 * using this software you accept the terms and the liability disclaimer
     268 * in the License.
     269 */
     271Several copyright lines, listing the previous owners/contributors should be kept.
     273== Modelines ==
     275To help support a coherent style, let's instruct our favorite, and less favorite editors, by adding the following modelines at the end of each file.
     278 Local Variables:
     279 mode: C
     280 tab-width: 2
     281 indent-tabs-mode: nil
     282 End:
     283 vim: sw=2:sts=2:expandtab
     287== Other languages ==
     288=== Python ===
     289Use a four-space indentation.
     291=== Ruby ===
     292Follow the Design_Documents OMF practice (