The Nature of the C Programming Language

This is part of a larger series titled “How To Program Anything: C Programming.”

Preface

This article refers to and makes use of information presented in my “Programming Crash Course” article.  Perusal of that article is recommended.

C was the second programming language I learned.  I had cut my teeth on BASIC as provided by my TRS-80 Color Computer II 16k (from Tandy RadioShack).  I would’ve encountered another programming language such as Pascal earlier if it weren’t for the fact that my Macintosh IIsi had, as installed by my father, a QuickBASIC compiler/interpreter on it, which enabled me to continue my BASIC ways.  I tried to program a lot of things in BASIC, from a simple text adventure, to a 300-page magnum opus (written in one weekend) for a top-down role-playing game, to a real-time side-scroller (which didn’t work at all).  It wasn’t until I got my own computer (unfortunately I don’t remember its tech specs as well as I do my older brother’s Quadra) that I encountered, and installed, CodeWarrior.  In CodeWarrior we used C, C++, and Assembly.  Now I was in the big leagues!

It was also a good time to upgrade my programming skills, because the books I was encountering on my brief and intermittent visits to city bookstores (I lived in a semi-rural town in the mountains that, really, had no bookstore) were using C and C++.  These included Secrets of the Game Programming Wizards and other such titles for Macintosh.  So I started programming C for System 7, 8, and 9 (yes, I was an Apple computer programmer).  This is where I first wrapped my brain around functions, structs, classes, objects, pointers, addresses, and more.  I always wanted to create some kind of linkable library for my QuickBASIC installation so that I could program machine-intensive stuff like graphics in C but code the rest of the game in BASIC.  Unfortunately I could never figure it out: the manual said it could be done but, of course, gave very little explanation.

Moving from a “high-level” language like BASIC (that dates me) to a “lower-level” language like C (more on these levels to come) actually honed my understanding of the machines I was programming, and improved my programming skills.  Now I had access to the entire machine, and I could implement anything I wanted on it, not just whatever part of the system toolbox the BASIC interpreter decided it would let me try out.  For anyone stuck in a higher-level language but wishing to learn more about their machine and about computer science in general (algorithms and the like), learning a “middle-level” language like C is a must.  It can surely open many doors in the future.

Where Does C Come From?

A man named Dennis Ritchie, then working for a place called Bell Labs, came up with the seed for what we now know as the C programming language sometime between 1969 and 1973.  C is actually the third in a line of early programming languages, the others being BCPL and B.  BCPL (Basic Combined Programming Language) was developed by Martin Richards at the University of Cambridge in 1966.  Its main concern at the time was creating compilers for other programming languages.  In fact, it introduced the concept of “brace programming,” something which many later programming languages still use.  BCPL influenced the development of B, a language meant for non-numeric, possibly recursive, cross-machine programs, which was developed at Bell Labs around 1969 by Ken Thompson and Dennis Ritchie.

To know the history of C we actually get to know a small sliver of the early history of the Unix operating system.  The original Unix operating system was implemented in assembly language on a PDP-7, also by Dennis Ritchie and Ken Thompson (man, those guys were busy, weren’t they?).  When the PDP-11 came out it led to the idea of porting the system to the new machine, and Ritchie and Thompson at first considered using B.  However, B wasn’t really written to take advantage of the PDP-11, and so they came up with a new language that addressed these concerns, the main one being the ability to address bytes.  Thus, C was brought to life.  C was the name given to the language presumably because C comes after B.

As a compilable language, C made its debut appearance in Version 2 Unix, and soon, in 1972, a large part of Unix was rewritten in C.  Funnily enough, C was not purposefully designed to be executable on many different machines, but partly because it was part of Unix, and partly because of its ease and simplicity, it started to be run on many different machines.  By 1973 the C language had enjoyed enough development that it was powerful enough to comprise most of the Unix operating system kernel (the inner core of the operating system’s code).  Around 1977 Ritchie and Stephen Johnson continued development of the language to facilitate adoption of the Unix operating system, and Johnson wrote a “portable C compiler” which increased the ease of developing with C on new machines.

In 1978 Ritchie and Brian Kernighan published the first edition of The C Programming Language.  This was in fact the very book I “received” in my ANSI C programming course when I attended the university.  The first edition of this book was not ANSI compliant (explained in a bit), since the standard didn’t exist yet, and it served as an informal specification.  This version of C is often called “K&R C.”  Once more widespread adoption began, C spread, so to speak, like wildfire and has since become one of the most popular and widely utilized programming languages to ever hit the mainframe.  In fact, the American National Standards Institute (ANSI) created a standardized version of C in 1989, and in 1990 that standard was adopted by the International Organization for Standardization (ISO).  The first standard implemented many additional features that weren’t present in K&R C, and was informally dubbed C89.  The seminal book was published again in a second edition and dubbed “ANSI C” compliant.  C89 served as the “basis” or foundation upon which the C++ standard was formalized, and this standard remained relatively stable for several years.  That is, until 1995, when an amendment was made to the 1990 standard (known as C95) that corrected some details and included more support for international character sets.

This is not the end of the story of C.  Oh no, in fact, the standard was further revised in 1999.  This version became known informally as C99.  It was also amended three times.  C99 introduced several new features including inline functions, complex numbers, variable-length arrays, flexible array members, and most notably support for one-line comments (as in BCPL back in the day).  Then, in 2007, work began on another revision, which was published in 2011.  This version came to be known as C11, and it added another slew of new features including type-generic macros, anonymous structures, atomic operations, and more.  Each version sought to be as backwards compatible as it could, but each also includes a way of testing which version is in use so that programs can make any needed adjustments.
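To make that version testing a little more concrete, here is a minimal sketch of how a program might check which standard its compiler reports.  The macro __STDC_VERSION__ comes from the C standard itself (C89/C90 compilers simply don’t define it), but the function name c_standard_name is just something I made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: returns a short label for the C standard the
   compiler claims to follow.  __STDC_VERSION__ is defined from C95
   onward; plain C89/C90 compilers leave it undefined. */
const char *c_standard_name(void)
{
#if !defined(__STDC_VERSION__)
    return "C89/C90";
#elif __STDC_VERSION__ >= 201112L
    return "C11 or later";
#elif __STDC_VERSION__ >= 199901L
    return "C99";
#elif __STDC_VERSION__ >= 199409L
    return "C95";
#else
    return "unknown";
#endif
}

/* Prints the label; call this from your own main(). */
void print_c_standard(void)
{
    printf("Compiled as: %s\n", c_standard_name());
}
```

The same #if tests can be used to switch a feature on or off, for example only using one-line comments or inline functions when the compiler reports C99 or newer.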

Of particular note here is that C is often used in embedded environments.  What this means is that a program is written and compiled to execute in the environment of a microcontroller or specialized processor with direct access to hardware such as GPIO pins.  You might think of some of the more popular “embedded” systems, such as the Raspberry Pi or Arduino, but embedded can imply any kind of processor meant to run an RTOS (real-time operating system) “embedded” onto a circuit board.  However, oftentimes there are nonstandard requirements (nonstandard with respect to the C programming language standard) in these types of systems, including fixed-point arithmetic, multiple memory banks, access to hardware pins and interrupts, and various unusual I/O operations.  To address this the C Standards Committee has attempted to formalize what is known as Embedded C: a set of language extensions that address these common issues.

C is a thriving, living language used all over the world in all sorts of applications.  It offers the ease of programming in more abstract terms than assembly without sacrificing the ability to deal in addresses, bits, and bytes.  We shall see what the future holds for C, and what kinds of revisions and enhancements it might enjoy in the next version.

How Does C Relate To Other Languages?

C is a middle-level language, meaning that it combines the best elements of high-level languages with the best of lower-level languages.  The “level” of a language is no reflection on the quality of the language: it does not mean a language is more or less powerful, nor that it’s easier or more difficult to use.  It is more a reflection of the abstraction utilized in the process of programming in the language.  I shall explain what I mean by this.

Take a low-level language like assembly for instance.  At this level you program directly with the processor, controlling its cycles, its registers, its addresses; every little bit is carefully shuttled around.  The problem with something like assembly, lacking any kind of true abstraction beyond mnemonics for processor commands, is that anything large or nuanced becomes cumbersome and fraught with all sorts of indices, jumps, and hard-to-understand algorithms.  Also, as a side effect of working so closely with a particular processor, the eventual program lacks portability to other processor architectures.

So you might try a high-level language, such as BASIC (like me).  You are no longer moving each little bit and byte around, but instead talking about variables, arrays, encapsulated flow control (see my programming crash course post on imperative programming), and very abstract methods of getting things done.  On top of that, because these abstract methods are programmed for you, they can be programmed for other platforms, making your code itself portable to those platforms.  But, in some ways, though you can now approach the problem the program is trying to solve without having to concretize the movement of each byte, you lack the ability to achieve that level of control with the processor when you desire to.

Enter the middle-level language: it offers the ease and simplicity of higher program abstraction (with variables, data types, flow control), but also gives you access to and control over the more detailed bits, which renders it as flexible and powerful (and fast) as something like assembly.  On top of that, with a little careful preparation, C is easily portable to other machines.  This is where much of the power of the C programming language lies.
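As a small illustration of that middle ground, here is a sketch that uses ordinary high-level pieces (a function, a variable, a while loop) to poke at the individual bits of a byte.  The function names count_set_bits and set_bit are hypothetical, chosen just for this example.

```c
/* Hypothetical helper: counts how many bits in a byte are set to 1.
   High-level flow control (a while loop) wrapped around low-level
   bit operations (& and >>). */
unsigned count_set_bits(unsigned char byte)
{
    unsigned count = 0;

    while (byte != 0) {
        count += byte & 1u;   /* look at the lowest bit only */
        byte >>= 1;           /* shift the next bit into place */
    }
    return count;
}

/* Hypothetical helper: returns the byte with the bit at `position`
   (0 = lowest) switched on.  Assumes position is between 0 and 7. */
unsigned char set_bit(unsigned char byte, unsigned position)
{
    return (unsigned char)(byte | (1u << position));
}
```

For instance, count_set_bits(0xB5) looks at the bit pattern 10110101 and returns 5, and set_bit(0x00, 3) returns 0x08.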

What Does C Do?

A fundamental notion of the C programming language is that of the function.  Functions allow the programmer to encapsulate algorithms and groups of instructions into chunks that can run on their own, with their own input and output and their own variables, apart from other pieces of code.  Ideally one function has no impact on the operation of another function, and because of this you can write what are in essence mini-programs that work together towards some solution.  The best part is that, if you know what a function does, you can use it without necessarily knowing how it was programmed.  This is known as sharing sections of code, and code re-use.  A language that implements this compartmentalization, such as C, is known as a structured language.  Structured languages discourage the use of global variables and other such notions, because having something that any part of the code can affect at any time is very volatile and goes against the idea of functional encapsulation: through a global, one function can indirectly impact the state and operation of another.  Likewise, the use of a jump command (known as goto from BASIC) is discouraged; C does have a goto statement, but structured code is expected to avoid it.  Jumping from one random piece of code, or functionality, to another interrupts the program flow and would adversely affect compartmentalization.  To move from one part of the program to another you should go through the gateways of function calls.  By the way, collections of these functions, and their “gateways”, are called interfaces.
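Here is a minimal sketch of what that encapsulation looks like in practice.  Both function names (square and sum_of_squares) are made up for illustration; the point is only that each function keeps its own variables and that the caller relies on what a function does, not how it does it.

```c
/* Hypothetical example: a tiny self-contained function. */
int square(int n)
{
    int result = n * n;   /* local: visible only inside square() */
    return result;
}

/* Hypothetical example: builds on square() through its interface
   (one int in, one int out) without knowing how it works inside. */
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}
```

Calling sum_of_squares(3, 4) gives 25, and nothing inside sum_of_squares can reach the variable result that lives inside square.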

Note: on a more “miniature” scale structuring also occurs through what are known as code blocks.  These are statements enclosed inside braces, such as in a loop or if-clause.  These are treated as a unit of code in the sense that all statements in the block are executed together (linearly).
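A quick sketch of code blocks, using a hypothetical function named sum_of_evens: the braces after the for and the if each mark off a block, and a variable declared inside a block only exists within it.

```c
/* Hypothetical example: adds up the even numbers from 1 to limit. */
int sum_of_evens(int limit)
{
    int total = 0;
    int n;

    for (n = 1; n <= limit; n++) {   /* the loop body is one block */
        if (n % 2 == 0) {            /* the if body is a nested block */
            int even = n;            /* exists only inside this block */
            total += even;
        }
    }
    return total;
}
```

With a limit of 10 the function adds 2 + 4 + 6 + 8 + 10 and returns 30.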

C allows us to use common parts of imperative programming with many flow control statements like if-clauses, while loops, and for loops.  However, of more import is the system of pointers and addresses.  C employs a couple of operators that enable us to refer either to a piece of data by its value, as in normal variable access, or to that data’s memory address.  A variable that holds a piece of data’s memory address is known as a pointer in C.  The pointer and addressing system in C is at once both simple and complex: simple in how it’s designed (in my opinion), complex in how it is usually employed in real-world programming.  This system of memory manipulation is part of what gives C its “low-level” power and its ability to be compared in some cases to assembly programming.
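The two operators in question are & (address of) and * (the value at an address).  Here is a minimal sketch of both; the function names pointer_demo and swap_ints are hypothetical and only there for illustration.

```c
/* Hypothetical example: changes a variable through a pointer to it. */
int pointer_demo(void)
{
    int x = 10;
    int *p = &x;   /* & yields the memory address of x */

    *p = 42;       /* * writes to whatever p points at, which is x */
    return x;      /* x is now 42 */
}

/* Hypothetical example: swaps two ints by way of their addresses.
   The caller passes &a and &b, so the originals are changed. */
void swap_ints(int *a, int *b)
{
    int tmp = *a;  /* read the value stored at address a */

    *a = *b;
    *b = tmp;
}
```

Without pointers swap_ints could only shuffle its own private copies of the two numbers; by receiving addresses it can reach back into the caller’s variables.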

Another thing C is known for is its lack of strong data typing.  What that means is that while C does employ data types for variables (if you remember from the crash course, data types prescribe what kind of values variables can hold), it doesn’t strictly enforce these data types in operations.  For example, in C you can use integer, character, and floating-point variables all in the same mathematical expression.  This is accomplished by what is called implicit type conversion.  When compiling C the compiler converts disparate data types to the required formats for computation to occur.  It is even possible in some instances to pass different data types to functions than what they define for their arguments; C auto-converts the data types to fit.

C is a bit of a terse language as well.  What I mean by that is that C utilizes a much smaller number of keywords (character strings that are interpreted by the compiler as part of the language) than many other languages.  C89 employs 32 keywords, whereas C99 adds 5, and C11 adds only a few more.  BASIC, as a point of comparison, can have over 50 keywords.
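Here is a small sketch of implicit type conversion at work.  The function names are made up, and the arithmetic in the comments assumes an ASCII character set, where 'A' has the value 65.

```c
/* Hypothetical example: an int, a char, and a double in one expression. */
double mixed_expression(void)
{
    int    count  = 3;
    char   letter = 'A';   /* 65, assuming ASCII */
    double scale  = 0.5;

    /* letter is promoted to int (3 + 65 = 68), then the int sum is
       converted to double before the multiply: 68 * 0.5 = 34.0 */
    return (count + letter) * scale;
}

/* Hypothetical example: assigning a double to an int silently
   drops the fractional part; no cast is required. */
int truncate_to_int(double value)
{
    int whole = value;

    return whole;
}
```

Notice that the compiler accepts both functions without a single explicit cast, and truncate_to_int(3.9) quietly hands back 3.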

What Does C Not Do?

Because of implicit type conversion C does not require or mandate strict type compatibility between variables, functions, and expressions.  As well, because of the influence of its use by “real world programmers” (more on that below), it pretty much lets the programmer do what he wants and gets out of the way.  This means that some things that are taken for granted in higher-level languages don’t exist in C, such as run-time error checking.  In other words, no checking for potential errors is done while the program is being executed.  Practically, this translates to a lack of checks or notices to ensure that array boundaries are not overrun, that type conversions aren’t malicious, that pointers pointing to nothing or garbage aren’t used, etc.  This is the responsibility of the programmer writing the program.  I can’t tell you how many times, when I worked as a full-time professional programmer, a null pointer caused my program to seg-fault.

C does not perform any kind of run-time (meaning while the given program is executing) garbage collection.  In many higher-level languages such as Python, whose interpreter (we’ll cover the difference between compilers and interpreters in another post) is written in C, data objects are stored in memory until they are no longer referenced by any other parts or variables of the program.  Then they are promptly removed from memory automatically; no action is required by the programmer.  This is known as automatic garbage collection, and can be found in many languages including JavaScript, Java, Python, and more.  Because C allows the direct, purposeful manipulation of memory and “gets out of the programmer’s way”, it is up to the programmer to provide their own means of garbage collection.  Memory will stay in use until the programmer frees it up himself.
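Since C won’t do these checks for you, you write them yourself.  Below is a minimal sketch of a hand-rolled bounds and null check; get_element is a hypothetical helper, and returning 1 or 0 for success or failure is just one convention among many.

```c
#include <stddef.h>

/* Hypothetical helper: reads values[index] only after checking that
   it is safe to do so.  Returns 1 and stores the element in *out on
   success, or returns 0 and leaves *out alone on failure. */
int get_element(const int *values, size_t length, size_t index, int *out)
{
    if (values == NULL || out == NULL) {
        return 0;   /* C itself would happily dereference NULL */
    }
    if (index >= length) {
        return 0;   /* C itself would happily read past the end */
    }
    *out = values[index];
    return 1;
}
```

With an array of three ints, asking for index 1 succeeds while asking for index 3 returns 0 instead of wandering off into whatever memory happens to sit after the array.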
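In practice, doing your own clean-up usually means pairing every malloc with a free.  Here is a minimal sketch using those two standard library calls; the helper names make_sequence and sum_and_release are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates an array holding 0, 1, ..., n-1.
   Returns NULL if n is 0 or the allocation fails.  The caller owns
   the memory and must free() it. */
int *make_sequence(size_t n)
{
    int *seq;
    size_t i;

    if (n == 0) {
        return NULL;
    }
    seq = malloc(n * sizeof *seq);
    if (seq == NULL) {
        return NULL;   /* out of memory: nothing to clean up */
    }
    for (i = 0; i < n; i++) {
        seq[i] = (int)i;
    }
    return seq;
}

/* Hypothetical helper: uses the array, then hands the memory back.
   Returns -1 if no array could be made. */
int sum_and_release(size_t n)
{
    int *seq = make_sequence(n);
    int total = 0;
    size_t i;

    if (seq == NULL) {
        return -1;
    }
    for (i = 0; i < n; i++) {
        total += seq[i];
    }
    free(seq);   /* without this line the memory stays in use */
    return total;
}
```

Calling sum_and_release(5) adds 0 + 1 + 2 + 3 + 4 and returns 10, and the array is gone again by the time the function returns.  Forget the free and nothing complains; the memory simply stays allocated.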

Conclusion: What Influenced C?

C was created for programmers.  This may seem redundant, but remember, during the time of C’s creation many languages were being created to ease the trouble non-technical individuals had telling the computer what to do.  Sophisticated end-user programs that wrapped everything up in nice user interfaces for just about any field you can think of didn’t exist yet.  Using a computer oftentimes meant programming it to do what you wanted, and there were many non-technical people who, despite this, needed to accomplish this task.  Languages such as BASIC, Pascal, and COBOL were invented to try to alleviate this difficulty.  BASIC and COBOL, for instance, were not designed to be used by highly technical people.  They did little to nothing to improve the task of programming a computer for those who already did it professionally.  They were designed to enable non-programmers to define things the computer should do.

C was created out of the sea of assembly programming by programmers who wanted to do highly technical things.  It was meant to improve the very process in which professional programmers were already engaged in the real world.  It was meant as a much faster and more reliable way to achieve the same types of things you could do with assembly programming, without the headache of assembly programming.  In fact, that C can be used in place of assembly language, but with the power of assembly language, was a huge key to its success.

In its early days C was employed to achieve highly technical purposes, in particular the programming of operating systems and mainframe control systems.  These types of programs needed to schedule processes, deal with memory in a very intimate fashion, offer support services to other programs, and generally required assembly-level machine access.  C’s interchangeability with assembly, and in fact its superiority over assembly as a structured language, improved these fields considerably.

Nowadays many people can achieve quite a bit with higher-level languages, and many use them in place of a middle-level language like C due to their accessibility.  Languages like C have been used to build very sophisticated operating system environments that allow higher-level languages like Python or JavaScript to accomplish many of the same things C has in the past; however, C is still the foundation.  And you can find that foundation still in practical use today in the realm of embedded processors and microcontrollers.  There are single-board computers that offer up an entire operating system, giving us the power of Python or JavaScript (such as the Raspberry Pi or the BeagleBone), but there are a myriad of smaller “embedded” ARM processor boards, such as those offered by Texas Instruments, that require the use of C to control them.  These boards have a bare-bones RTOS (real-time operating system) and offer the programmer near full control of the given circuitry.  C has served as the foundation, funnily enough, for more sophisticated and accessible forms of programming, but at the same time it is here to stay for low-level programming operations.  Being almost one step away from assembly, it is an excellent language for any aspiring programmer to learn and master.

photo credit: Ubahnverleih RoboCup 2016 Leipzig – Debugging via photopin (license)

kadar

I'm just a wunk, trying to enjoy life. I am a cofounder of http//originalpursuitssoc.com/ and I like computers, code, creativity, and friends.
