Programming in C: Structures

This article is part of a series – How to Program Anything: C Programming

Preface

So far we’ve been dealing with pieces of data that are separate from each other.  We can declare an int in one variable, a char in another variable, and arrays of many different types.  But what happens when pieces of data are related to each other?  For example, let us say that we have a program that keeps track of a list of employees.  We need to know each employee’s name, hourly rate, and how many hours they’ve worked this pay period.  We could set up something like the following:
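A minimal sketch of that setup, assuming three parallel arrays with room for 100 employees and names of up to 24 characters (the array names and sizes are illustrative assumptions):

```c
/* One array per piece of employee data.  Index i in each array
   is assumed to describe the same employee. */
char  name[100][25];   /* 100 names, each up to 24 characters plus '\0' */
float rate[100];       /* hourly rate for each employee */
int   hours[100];      /* hours worked this pay period */
```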

Then when we wanted to access a particular employee, we could use a specific index number.  So, for instance, if we wanted to know about employee number 23 we would write:
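With parallel arrays, looking up employee 23 means using the same index in every array.  Here is a sketch, wrapped in a small function of our own naming (print_employee_23 is hypothetical) so that it compiles on its own:

```c
#include <stdio.h>

char  name[100][25];
float rate[100];
int   hours[100];

/* Prints the three separate pieces of data stored for employee 23
   and returns that employee's pay for the period. */
float print_employee_23(void)
{
    printf("%s: %.2f per hour, %d hours\n", name[23], rate[23], hours[23]);
    return rate[23] * hours[23];
}
```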

It may appear to work, but this approach has major drawbacks.  First, though we haven’t discussed functions yet, each of these arrays would have to be a global variable if we didn’t want to pass all three of them to every function that operates on this data.  Second, the arrays are separate from each other.  What happens if there’s a typo in the code, or some other mistake, and some of the information for employee 23 is accidentally read from or written to employee 25?  On top of this, if we want to change the number of employees the program can hold, we have to edit three places in the code.  And lastly, let us say we use a for loop with pointer addressing.  To access all the data for a particular employee we’d have to update three separate pointers into their respective arrays.  All of these issues make the approach clunky and error-prone, and fortunately there is a better way.

Structures

There is a way to combine, so to speak, all the pertinent information for a given employee into one data type: the structure.  A structure is an aggregate data type, meaning that it gathers disparate pieces of data and puts them under one heading.  In our case we can gather all the pertinent elements of an employee, namely name, rate, and hours, and put them in one place as one data type.  It might look something like this:
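A minimal sketch of such a declaration, assuming the field names name, rate, and hours and the 25 element name array described in this article:

```c
/* A template describing what every employee record contains.
   No memory is set aside until a variable of this type is declared. */
struct employee {
    char  name[25];   /* up to 24 characters plus the terminating '\0' */
    float rate;       /* hourly rate */
    int   hours;      /* hours worked this pay period */
};
```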

Now our program has a new data type, struct employee, and can create new variables using this data type (which we’ll do in a moment).  This snippet of code demonstrates a structure declaration.  Note that we didn’t actually create any new data; we simply specified a template of fields that can be used to create new variables.  We create, or instantiate, particular employee records by declaring variables whose type is the structure.  Here, for example, we create an employee variable named robert:
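As a sketch, with the declaration repeated so the snippet stands alone (the field layout is the one assumed throughout this article):

```c
struct employee {
    char  name[25];
    float rate;
    int   hours;
};

/* The type is written "struct employee"; robert is the variable. */
struct employee robert;
```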

Now we have a variable named robert that contains a name array, a rate, and a number of hours.  Like an array, when robert is created the compiler automatically sets aside all the memory the structure needs.  It lays out the 25 element char array, the float variable, and the int variable in that order within one block of memory, although the compiler may insert unused padding bytes between fields to keep each one properly aligned.  You can imagine it in memory like this:
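One way to see the layout for yourself is to ask the compiler where each field begins.  The sketch below uses the standard offsetof macro from stddef.h; print_employee_layout is a hypothetical helper name, and the numbers it prints depend on your compiler and platform:

```c
#include <stddef.h>
#include <stdio.h>

struct employee {
    char  name[25];
    float rate;
    int   hours;
};

/* Prints the byte position at which each field starts inside the structure. */
void print_employee_layout(void)
{
    printf("name  starts at byte %zu\n", offsetof(struct employee, name));
    printf("rate  starts at byte %zu\n", offsetof(struct employee, rate));
    printf("hours starts at byte %zu\n", offsetof(struct employee, hours));
    printf("whole structure: %zu bytes\n", sizeof(struct employee));
}
```

If rate does not start at byte 25, the gap after name is padding that the compiler added for alignment.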

Note that if we were to make another employee variable, say john, he would have his own memory allocation.  The name field in john would not be the same name field as in robert.  Both john and robert have their own memory and their own field values.

You can actually declare variables of a struct type right in the struct declaration itself:
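A sketch of that combined form, using the three variable names from this article:

```c
/* Declares the template and three variables of that type in one statement. */
struct employee {
    char  name[25];
    float rate;
    int   hours;
} robert, john, andrea;
```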

In this case, robert, john, and andrea would all be separate variables, each laid out according to the employee template.  So you can see, the generalized struct declaration goes like this:
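The general pattern is spelled out below as a comment, followed by a second, made-up structure that follows the same pattern (point, origin, and corner are hypothetical names chosen only for illustration):

```c
/*
 * struct tag_name {
 *     type field_name;
 *     type field_name;
 *     ...
 * } optional_variable, optional_variable;
 */
struct point {
    int x;
    int y;
} origin, corner;
```

The list of variables after the closing brace is optional, but the semicolon is always required.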

One nice thing about structures is that you can assign them to each other (provided the structures are of the same type) and all the data gets copied from one structure to the other.  For example, in our employee structure example, I could write:
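Here is a sketch of that assignment.  The starting values given to john and the function name copy_john_to_robert are assumptions made so the snippet can stand alone:

```c
struct employee {
    char  name[25];
    float rate;
    int   hours;
};

/* john starts with some sample values, listed in field order. */
struct employee john = { "John", 18.50f, 40 };
struct employee robert;

/* Copies every field of john, including the whole name array,
   into robert with a single assignment. */
void copy_john_to_robert(void)
{
    robert = john;
}
```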

And all the data held in the john structure variable would get copied into the robert structure variable.  In this way I don’t have to copy each field over to robert one at a time; it all happens in one go.  But how would I access the fields of the struct if, say, I only wanted the name field?

Accessing Structure Fields

Like pointers, which had their operators * and &, structures have their own operators: . (dot) and -> (arrow).  We’ll examine the dot operator first.

When you have a variable that is of a type of struct, the fields of the struct can be accessed by appending a dot (.) to the variable name and then specifying the name of the field given in the struct.  In our employee example, say I wanted to reference the hourly rate of our robert variable.  I would write something like this:
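A sketch of the dot operator used for both writing and reading a field (the rate value and the function name set_and_get_rate are illustrative assumptions):

```c
struct employee {
    char  name[25];
    float rate;
    int   hours;
};

struct employee robert;

/* Stores an hourly rate in robert, then reads it back. */
float set_and_get_rate(void)
{
    robert.rate = 19.75f;   /* write to one field */
    return robert.rate;     /* read the same field */
}
```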

This would access the rate field of robert.  Remember that john, andrea, and robert all take up different memory slots, and each has its own values for the fields defined in the structure.  The generalized template for accessing a field of a structure variable is the variable name, then a dot, then the field name, as in variable.field.

However, what if our variable is a pointer to a struct?  (We covered pointers in a previous article.)  To declare a pointer to a structure, place an asterisk after the data type and before the variable name, just as when creating pointers to basic data types.  Let us say we have the following in our program:
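A sketch using the names employeeVariable and empPointer from this article:

```c
struct employee {
    char  name[25];
    float rate;
    int   hours;
};

struct employee  employeeVariable;               /* an actual record */
struct employee *empPointer = &employeeVariable; /* holds its address */
```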

In this example empPointer is a pointer to employeeVariable.  We can still access the elements of the structure that empPointer points to, without having to resolve the pointer or use the original employeeVariable.  This is done using the arrow ( -> ) operator.  That is a dash followed by a greater than sign.  The arrow operator is used just like the dot operator above except it’s used on pointers to structures.  So for example, say we wanted to access the hourly rate for the employee record pointed to by empPointer.  We would write something like this:
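A sketch of the arrow operator at work (the sample values and the function name rate_via_pointer are assumptions for illustration):

```c
struct employee {
    char  name[25];
    float rate;
    int   hours;
};

struct employee  employeeVariable = { "Ada", 30.0f, 8 };
struct employee *empPointer = &employeeVariable;

/* Reads the hourly rate of the record that empPointer points to. */
float rate_via_pointer(void)
{
    return empPointer->rate;
}
```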

In this way, pointers to structures can be passed to and from functions and we are still able to access the fields.  The arrow is a shorthand: empPointer->rate means the same thing as (*empPointer).rate, so we never have to write out the dereference ourselves before reaching a field.
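As a small preview of passing a structure pointer to a function, here is a sketch.  The function add_hours is hypothetical and not part of this series:

```c
struct employee {
    char  name[25];
    float rate;
    int   hours;
};

/* Adds extra hours to whichever employee record emp points to.
   The caller's structure is modified directly; no copy is made. */
void add_hours(struct employee *emp, int extra)
{
    emp->hours += extra;
}
```

Calling add_hours(&robert, 3) would raise the hours field of robert by three.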

Field Arrays vs Structure Arrays

It is perfectly valid to define an array of structure variables, but we must make sure we know what we’re indexing.  Let’s take our employee example further and suppose we wanted an array of employees (we discussed arrays in a previous article).  We might write something like the following:
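A sketch of that array, with the structure declaration repeated so the snippet stands alone:

```c
struct employee {
    char  name[25];
    float rate;
    int   hours;
};

/* 100 complete employee records, stored one after another. */
struct employee employees[100];
```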

Here we define an array named employees as a hundred element long sequence of employee structures.  Now, the employee structure from above has an array inside of it in the form of name (a 25 element char array).  Say we want to access employee number 25 in our array, or even the hourly rate of employee number 25.  We would write:
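A sketch showing both forms, each wrapped in a small function of our own naming so the snippet compiles by itself:

```c
struct employee {
    char  name[25];
    float rate;
    int   hours;
};

struct employee employees[100];

/* employees[25] is one whole structure from the array. */
struct employee get_employee_25(void)
{
    return employees[25];
}

/* employees[25].rate is a single field of that structure. */
float get_rate_of_employee_25(void)
{
    return employees[25].rate;
}
```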

Note here that we are indexing the array of structure elements first.  Suppose then that we wanted to access the sixth letter of the name of employee number 15 in the array.  We index the structure element first, and then the field array second (remembering that the sixth letter sits at index 5):
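A sketch of that double indexing (the function name sixth_letter_of_employee_15 is hypothetical):

```c
struct employee {
    char  name[25];
    float rate;
    int   hours;
};

struct employee employees[100];

/* Outer index picks the employee, inner index picks the character. */
char sixth_letter_of_employee_15(void)
{
    return employees[15].name[5];
}
```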

An array of structures doesn’t change any arrays that may be contained inside the structure; you simply need to access each array from the outside in.  That is, index the outer array of structures first, and then the inner array after the dot operator.

Structures Within Structures

It’s actually possible to build structures that utilize other structures inside their template.  This is often referred to as a nested structure.  Say for example, we wanted to define an address structure, and then use that address structure in each record of our employee structure.  We might consider writing the following:
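A sketch of the two declarations.  Only the zip field is named in this article; the other address fields, their sizes, and the member name home are assumptions:

```c
struct address {
    char street[40];
    char city[25];
    char state[3];    /* two-letter code plus '\0' */
    char zip[10];
};

struct employee {
    char           name[25];
    float          rate;
    int            hours;
    struct address home;   /* a complete address lives inside each employee */
};
```

The address structure has to be declared first so the compiler already knows its size when it reaches the employee structure.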

Here we have an address structure 'embedded', or included, in an employee structure.  In memory, each structure keeps its own layout, and every employee variable simply has enough space to hold a complete address structure as one of its fields.  The address structure acts just like a regular structure in assignments and other operations.  The only special notation that this kind of situation requires is an extra dot when accessing fields.  We have to start at the outermost structure and move into the innermost embedded structure.  A code snippet illustrates:
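A sketch of nested access, both directly and through a pointer.  The member name home, the address fields other than zip, and both function names are assumptions made for illustration:

```c
#include <string.h>

struct address {
    char street[40];
    char city[25];
    char state[3];
    char zip[10];
};

struct employee {
    char           name[25];
    float          rate;
    int            hours;
    struct address home;
};

struct employee robert;

/* Two dots: first into robert's home field, then into its zip array. */
void set_robert_zip(const char *zip)
{
    strncpy(robert.home.zip, zip, sizeof robert.home.zip - 1);
    robert.home.zip[sizeof robert.home.zip - 1] = '\0';
}

/* Starting from a pointer: arrow to reach home, then dot to reach zip. */
char first_zip_character(const struct employee *emp)
{
    return emp->home.zip[0];
}
```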

You can see that in order to access the zip array of the address struct included in the employee struct we had to use the dot operator twice.  Keep in mind that when we start from a pointer to the outer struct, the arrow operator replaces only the first dot.  The embedded address is still a plain struct rather than a pointer, so its fields are reached with a dot.

C99 Flexible Array Structure Members

C99, one of the later versions of the C language, allows you to specify an unsized array as the last member of a structure.  Remember, it must be the last element, otherwise the compiler wouldn’t be sure where to put the rest of the elements since it wouldn’t know how large the array was.  This unsized array is known as a flexible array member.

You would define such a field like so:
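A sketch of such a declaration.  The structure and field names here (timesheet, day_count, daily_hours) are hypothetical:

```c
struct timesheet {
    int   day_count;      /* how many entries the array is meant to hold */
    float daily_hours[];  /* flexible array member: no size, and it must be last */
};
```

Note that the structure also needs at least one ordinary named field before the flexible array member.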

This causes a number of things we haven’t covered yet to operate differently, particularly the sizeof operator and the way we dynamically allocate memory for the structure.  Those caveats will be addressed as they come up.

C99 Designated Initializers

With C99 you can also, as with arrays, designate specific fields of a structure to receive a value when you declare a variable of the struct type.  Each field is named with a leading dot and followed by its value, in the form .field = value.

The designators are placed in a set of curly brackets after the variable declaration, separated by commas.  So for example:
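A sketch using the employee structure from this article (the values themselves are made up):

```c
struct employee {
    char  name[25];
    float rate;
    int   hours;
};

/* Each field is named explicitly, so the order does not matter. */
struct employee robert = {
    .name  = "Robert",
    .hours = 40,
    .rate  = 22.50f
};
```

Any field left out of the list is set to zero.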

This initializes the fields of the structure to the given values as soon as the variable is declared.  Remember, however, that designated initializers arrived with C99, so a compiler limited to the older C89/C90 standard will not accept them.

Conclusion

Structures are a way to encapsulate related data under one data type.  This allows us to organize data that belongs together into one place.  Otherwise, all of our data would have to exist in separate variables, and as we programmed we’d have to remember how it was all connected ourselves.  That is prone to serious errors and makes the program harder to follow.  Particularly when we get to functions, we’d either have to pass a large number of variables around all the time, or have a lot of global variables.  Structures allow us to put like data together.  An address structure can hold address fields, and an employee structure can hold information pertinent to an employee.  Structures become even more important when we get into data structures, as they enable us to attach additional fields to data, such as pointers to the next or previous structure.


photo credit: Kᵉⁿ Lᵃⁿᵉ Ray and Maria Stata Center at MIT (Cambridge MA) via photopin (license)

kadar

I'm just a wunk, trying to enjoy life. I am a cofounder of http//originalpursuitssoc.com/ and I like computers, code, creativity, and friends.
