Embedded C code doesn't have to be ugly

Preface

A contractor engineer we employed years ago was tasked with implementing a UVC-compliant functionality for the SUB2r camera we were working on. Around a year ago it became clear that this implementation was becoming increasingly unstable with the addition of new exciting capabilities and it was at that time that I had a chance to open the source files. Only to get disgusted at the dreadful state of the code. Proverbial “spaghetti code” was surely based on text in those files!

That UVC implementation was heavily based on an example implementation provided by the hardware manufacturer which was taken more or less verbatim by that said engineer and slightly masaged around to do “what the customer asked” (oh, there was one other major “improvement” - that engineer has made sure that there was a big banner with his name at the beginning of each and every file).

It was obvious that before continuing developing new code there was an obligation to pay the technical debt accumulated over all this time… This text is the observations made, lessons learnt, and guidelines emerged from that project of bringing the C code into the XXI century.

0. Understand the code you are modifying

This one is not specific to the language of choice but is a more general best practice about anything an engineer of any kind should always keep in mind when working on a project: understand the problem and the current implementation before venturing into changing things.

It may not be obvious, especially at first, why certain decisions have been made that resulted in the choices made for the code you are about to modify. For example: the original implementation was producing MPEG frames programmatically, which required a synchronization between threads (which I'm not going to dig into right now). That, now unnecessary, code was left behind and built on top of. Which eventually resulted in stability issues caused by race conditions introduced by adding more code without understanding why the code was what it was in the first place.

Needless to say that debugging race conditions in embedded code is about as much “fun” as anyone should ever indure. Ever.

Even if unit tests are not something that pop to mind when working on firmware (embedded) code, one must always make sure to fully understand the code before modifying it.

1. Use YOUR favorite IDE

The default SDK provided an IDE based around Eclipse. My personal preference is MS Visual Studio with its slew of features that I can no longer live without. Namely:

My favorite keyboard shortcuts (A mouse? It often just slows me down when I'm working with text!)
Code browsing database
Mouse-over function/variable comment display
Datatype information (especially useful for C++'s auto variables, IMHO)
“Find all references”
Search text in: block, file, all open files, all files for the project/solution (that last one will include all the #include files as well)
Jump to declaration/definition (including a “quick view”)
View call hierarchy (both “from” and “to” up and down the call stack)
Refactoring support
Intelligent renaming (which includes renaming identifiers in comments and disabled code blocks)
(A very intelligent!) syntax error check while editing text
Code block comment/uncomment with a keyboard shortcut
Style suggestions
Clang-style formatting support (via .clang-format file)

In the end all that was needed to allow MSVS to work with the code was to add a single “settings” file named CppProperties.json to reside in the same directory as the rest of the source files and use the "Open Folder" feature to start editing code.

The plan is to eventually add “actions” to that setup so that it is possible to build without switching to that Eclipse's IDE. But for now this is what the CppProperties.json file looks like (with a future-use compiler flag for when the code is eventually transitioned into C++, but that part can easily be changed back to c99 or even a more advanced c17 if needed):

CppProperties.json

{
  "configurations": [
    {
      "name": "IoT-Release",
      "inheritEnvironments": [
        "linux_arm"
      ],
      "includePath": [
        "${env.FX3_INSTALL_PATH}/firmware/u3p_firmware/inc",
        "${projectRoot}",
        "${env.ARMGCC_INSTALL_PATH}/arm-none-eabi/include",
        "${env.ARMGCC_INSTALL_PATH}/lib/gcc/arm-none-eabi/${env.ARMGCC_VERSION}/include",
        "${workspaceRoot}\\**"
      ],
      "defines": [
        "__CYU3P_TX__=1"
      ],
      "intelliSenseMode": "linux-gcc-arm",
      "compilerSwitches": "-std=c++17"
    }
  ]
}

N.B.: There are quite a few compiler options for the actual compilation but as they don't affect the syntax analysis (like -fabi-version=0 -fno-exceptions -fno-rtti) I opted for not including them in the settings file for now.

2. Language standard

Many smart individuals are working on improving the languages we use, providing better ways to express our goals and ideas in code, improving the code's readability and maintainability, reducing chances of making stupid mistakes. Allowing compilers to generate faster, smaller, more optimized code.

As such (barring external dependencies) I always advise on using the latest stable language standard supported by the toolchain that your organization is comfortable with. And yes, that means that engineers must continually improve their grasp of the language and be on top of the latest stable standard to efficiently take advantage of the improvements provided by that standard.

3. #define (and const) vs. enum and magic numbers

So much has been said about the many advantages of pushing as much work as possible away from the preprocessor and into the compiler that is amazes me to still see tons of #define's in the modern code where language standard features would provide numerous benefits over the plain old text-based code processing that knows nothing of the code structure or the data types used.

Consider a list of unique identifiers used for something, say event IDs. Specifying those with the preprocessor puts the burden of making sure the IDs are unique onto the developer. And let's be honest - the potential for silly mistakes¹⁾ is enormous here. The solution? Simple: use unnamed enums!

enum{
    event_start   // acquire resources and begin processing
  , event_end     // stop the processing and free up the resources
  , event_pause   // stop processing but keep the resources
  , event_resume
};

As an added benefit it helps with the code annotation²⁾ which, being on the same line as the code itself, prevents comments to get shifted around and get attached to an unrelated line of code³⁾.

Same goes for bitmap values, quite honestly:

typedef enum UsbIsoSyncAttr{
    transfer_type_iso  = (0x01 << 0)
  , sync_type_async    = (0x01 << 2)
  , sync_type_adaptive = (0x02 << 2)
  , sync_type_sync     = (0x03 << 2)
} UsbIsoSyncAttr;

That very suggestion (using enum's) is also applicable for places which are normally thought of a “const variables' space” but avoided to save some RAM runtime space⁴⁾. Using enum's of a single value makes sure the value is calculated during compile time, is appropriately type-cast, and doesn't get an address in RAM associated with it during runtime execution:

enum{ s2r_uvc_stream_buf_size = s2r_ep_bulk_video_pkts_count * s2r_ep_bulk_video_pkt_size };

Magic numbers

You know, that kind:

if(interface == 3){

Compare that to this code and tell me - which one makes you understand what that code does?

if(interface == s2r_id_audio_stream_ifc){

4. Bit manipulations vs. structured data

Working with the low-level code⁵⁾ quite often requires direct bit manipulation. And as much as every C programmer is familiar and is comfortable with bit operations in the language, using those results in the ugliest code - it is unnecessarily verbose, prone to errors⁶⁾, makes the code that much harder to port⁷⁾.

Here's a more concrete example: working with 16/32 bit integers, where depending on the context it is either a CPU-addressable word, or a stream of bytes on the wire (be it USB or I²C). Traditionally that would result in the 1970's style code like this:

uint32_t = pageAddress * glSpiPageSize;
uint8_t location[4];
location[1] = (byteAddress >> 16) & 0xFF; /* MS byte */
location[2] = (byteAddress >> 8) & 0xFF;
location[3] = byteAddress & 0xFF; /* LS byte */

Do you see an obvious bug in that code⁸⁾?

Here's a much more modernized version of this, which also (automagically!!!) takes care of that idiotic bug:

typedef union PackedUint16{
    uint16_t v;
    struct{ uint8_t l, h; };
    uint8_t a[2];
} PackedUint16;
typedef union PackedInt16{
    int16_t v;
    struct{ uint8_t l, h; };
    uint8_t a[2];
} PackedInt16;
typedef union PackedUint32{
    uint32_t v;
    struct{ PackedUint16 l, h; };
    uint8_t a[4];
} PackedUint32;
 
///////////////////////////////////
 
PackedUint32 byteAddress = {pageAddress * glSpiPageSize};
uint8_t location[4] = {};      /* uninitialized memory is evil */
location[1] = byteAddress.h.l; /* compiler error, of course - must be PackedUint32, not 16 */
location[2] = byteAddress.l.h;
location[3] = byteAddress.l.l; /* LS byte */
 
/* alternatively the code above can be written much more to-the-point, clearly expressing the intent  */
const uint8_t * const location_arr = pageAddress.a;

A much more "complicated" example

Incoming USB requests have a standard “setup” header which results in old-style code looking something like the following⁹⁾:

static CyBool_t CyFxUVCApplnUSBSetupCB(uint32_t setupdat0, /* SETUP Data 0 */
                                       uint32_t setupdat1 /* SETUP Data 1 */
) {
    CyU3PIoMatrixConfig_t io_cfg;
    CyBool_t uvcHandleReq = CyFalse;
    uint32_t status;
    CyU3PReturnStatus_t apiRetStatus = CY_U3P_SUCCESS;
    uint8_t address, data;
 
    /* Obtain Request Type and Request */
    bmReqType = (uint8_t) (setupdat0 & CY_FX_USB_SETUP_REQ_TYPE_MASK );
    bRequest = (uint8_t) ((setupdat0 & CY_FX_USB_SETUP_REQ_MASK ) >> 8);
    bType = (bmReqType & CY_U3P_USB_TYPE_MASK);
    bTarget = (bmReqType & CY_U3P_USB_TARGET_MASK);
    wValue = (uint16_t) ((setupdat0 & CY_FX_USB_SETUP_VALUE_MASK ) >> 16);
    wIndex = (uint16_t) (setupdat1 & CY_FX_USB_SETUP_INDEX_MASK );
    wLength = (uint16_t) ((setupdat1 & CY_FX_USB_SETUP_LENGTH_MASK ) >> 16);
 
    /* Handle USB vendor requests */
    if (bType == CY_U3P_USB_VENDOR_RQT) {

The function goes on for HUNDREDS of lines of deep nesting!!!¹⁰⁾.

Don't know about you, but all this bit shifting and byte extraction just screams “fruitful field for errors!” to me. Consider having abstracted all this ugliness into a structure like the following:

enum UsbRequestTarget{usb_tgt_dev, usb_tgt_ifc, usb_tgt_ep, usb_tgt_other};
enum UsbRequestType{usb_rt_std, usb_rt_class, usb_rt_vendor, usb_rt_reserved};
typedef struct UsbSetupPacket{
    union{
        uint32_t d0;
        struct{
            struct{
                struct{
                    enum UsbRequestTarget bTarget : 5;
                    enum UsbRequestType bType : 2;
                    uint8_t dirIn : 1; // "IN" is the direction towards USB host (reading data)
                };
                uint8_t bRequest;
            };
            union{
                PackedUint16 wValue;
                uint16_t featureSelector;
            };
        };
    };
    union{
        uint32_t d1;
        struct{
            union{
                PackedUint16 wIndex;
                struct{
                    union{
                        uint8_t all;
                        struct{
                            uint8_t num : 4;
                            uint8_t res3 : 3;
                            uint8_t dirIn : 1;
                        };
                    };
                    uint8_t res8 : 8;
                } ep;
                struct Infc{
                    uint8_t num;
                    uint8_t res;
                } infc;
            };
            PackedUint16 wLength;
        };
    };
} UsbSetupPacket;

With that little prep work the same code becomes so much more readable and maintainable:

static CyBool_t CyFxUVCApplnUSBSetupCB(uint32_t setupdat0, uint32_t setupdat1)
{
    UsbSetupPacket req = {};
    // the next 2 lines is all it takes to distribute the bits into proper NAMED variables
    req.d0 = setupdat0;
    req.d1 = setupdat1;
 
    // dispatch to appropriate handler based on request type and return status of request handling to the USB driver
    switch(req.bType){
        case usb_rt_vendor: return handleVendorRqt(&req);
        case usb_rt_std:    return handleUsbStdRqt(&req);
        case usb_rt_class:  return handleUvcRqt(&req);
        default: return CyFalse;
    }
}

Not only the code becomes that much more readable - it also opens up opportunities for compilers to do platform-specific optimizations which are otherwise just not available to them as the intent of the code is not clearly expressed with all that manual bit shifting…

5. Globals

Having global variables is absolutely the major contributor to entangling unrelated blocks of code and making sure no human mind can fully comprehend your text without first truly understanding all of it. “A variable is declared in one file, set in another, accessed in yet another… what fun it is to trace its lifetime!” especially on an embedded system that lacks “usual” debugging tools like stepping through the lines or examining the state of variables.

There are a few kinds of globals, IMO:

"Global" globals

These are the ones declared at file level and mentioned with an extern keyword in header files. In my experience almost all of these can be safely made non-globals with little to no impact on the overall code structure and quite often moved into function scopes and passed around as parameters on a stack¹¹⁾.

See if some of those globals can be made file-scoped by grouping related functions together¹²⁾. Some const¹³⁾ globals can even be turned into enum's¹⁴⁾. And most often those globals can, and should, be safely made function-scope.

Function-level globals

A relic of K&R style C language, that required all the function variables be declared up-front, still finds its proponents even in the modern world of not-dumb-anymore-compilers that are able to deduce a lot more information from the code's text than some 50 years ago.

These same compilers are capable of making quite intelligent choices on how to translate your code into zeroes and ones, so that you should truly be not afraid of expressing your functionality in the language of abstraction, rather than writing in C like it is a macro-Assembler¹⁵⁾.

Stack allocation is extremely cheap, even on the most low-end down-to-earth hardware, so concerning yourself with the cost of that operation by grouping all-the-possible-function-variables together at the top should be reserved only for the most critical code¹⁶⁾.

Cross-function globals

Oh, these are absolutely the worst, as they tend to sneak up on you at the least expected time and place! They get declared as globals and then are only set in one place while being accessed in plenty of other places. Especially dangerous ones cross thread boundaries, making for extra-nasty race condition type bugs.

Even if one doesn't do anything about the first two kinds, one must always strive to eliminate these ones. In the long run it never makes sense to have that kind of (questionable) optimization benefit at the expense of destabilizing your code so badly.

6. Code blocks' nesting

Somewhat related to the general question of “how many LoC in a function is too many?” this one goes back to gut feeling that cramming all the code into a single function will somehow have positive optimization benefits overall. Let me be clear here: it doesn't. That perception is stemming from the fact that function calls in C, while being very cheap, are not zero-overhead operations.

This perception completely ignores all the advances made in CPU manufacturing in the past 60 years or so¹⁷⁾.

Just write the code in manageable (one-two page length) chunks, packaged as individual functions. And if there really is a concern for stack depth you always have the option of inlining those functions¹⁸⁾.

If you have a long-ass switch statement - make it into a function. If the branches of your if or case statements are large - make them into functions. If there's an overgrown block of code that is responsible for one clearly defined operation - make it into a function. If you find yourself writing an if inside a case or a similar situation - extract the code block into a function.

And as a general rule: if you see a block of code that doesn't fit onto one screen (or your prefrontal cortex, for that matter) - make it into a function!

Think of functions as text paragraphs: if you read some text that consists of one huge paragraph chances are you'll give up soon enough and look for some other source of information. One that is made for human consumption.

Stack depth

Speaking of stack depth: partitioning your code into smaller functions may indeed reduce your stack requirement as you won't be needing to allocate every-single-variable-ever-used-in-any-branch in one chunk, but instead only use up as much stack as needed for each individual function.

7. Code comments

CyBool_t isUsbConnected = CyFalse; /* Whether USB connection is active. */
 
int i = 1;   // set i to 1
 
i++;         // increment i by 1

The comments above are utterly useless. Not only they don't anything to what is already expressed in the code¹⁹⁾, there's a high potential for these types of comments to become stale and ultimately very misleading.

¹⁾

including mistypes, code merges, code copy-paste during refactoring(!)

²⁾

if needed, of course, which is definitely not every line of code, despite what some people think

³⁾

ensuring hours of some poor soul's chasing around the code base to answer an obvious “WTF?” question that usually arises from such shifts

⁴⁾

whether that saves space or not is a separate topic

⁵⁾

which is, basically, pretty much all the embedded code

⁶⁾

operator precedence is often to blame

⁷⁾

little- vs. big-endian porting, anyone?

⁸⁾

yes, this is what was in the code! An uninitialized first byte of the location buffer

⁹⁾

there's a separate discussion on the use of globals later on

¹⁰⁾

over 600 LoC to handle the standard USB requests

¹¹⁾