Introduction
When we declare/define a symbol (e.g. variable, constant, function, etc.), where and how we declare it determines its scope. Let's look at a simple example that everyone should understand:
/* file1.c */
#include <stdio.h>
/* This function is visible/accessible to other functions in this file */
/* as well as functions in other files. It is a global function. */
int add(int a, int b) /* a and b are visible only in this function */
{
  int local_var1 = a + b; /* visible only in this function */
  return local_var1;
}
/* add's local_var1, a and b are not accessible here. */
/* main is global and MUST be global because it is called from elsewhere */
int main(void)
{
  int local_var1;     /* visible anywhere within main, uninitialized */
  int local_var2 = 5; /* visible anywhere within main, initialized to 5 */
  for (local_var1 = 0; local_var1 < 5; local_var1++)
  {
    int local_var3; /* visible only in this loop, uninitialized */
    local_var3 = add(local_var1, local_var2);
    printf("%i + %i is %i\n", local_var1, local_var2, local_var3);
  }
  /* local_var3 is not accessible here */
  return 0;
}
/* main's local_var1 and local_var2 are not accessible here. */
/* The global function add is still accessible here. */
Compiling and executing:
gcc -Wall -Wextra -ansi -pedantic file1.c -o file1 && ./file1
0 + 5 is 5
1 + 5 is 6
2 + 5 is 7
3 + 5 is 8
4 + 5 is 9
Putting the add function and main into separate files:
/* main.c */
#include <stdio.h>
/* prototype, add is defined in another file (functions.c) */
int add(int a, int b);
/* main is global and MUST be global because it is called from elsewhere */
int main(void)
{
  int local_var1;     /* visible anywhere within main, uninitialized */
  int local_var2 = 5; /* visible anywhere within main, initialized to 5 */
  for (local_var1 = 0; local_var1 < 5; local_var1++)
  {
    int local_var3; /* visible only in this loop, uninitialized */
    local_var3 = add(local_var1, local_var2);
    printf("%i + %i is %i\n", local_var1, local_var2, local_var3);
  }
  /* local_var3 is not accessible here */
  return 0;
}
/* functions.c */
/* This function is visible/accessible to other functions in this file */
/* as well as functions in other files. It is a global function. */
int add(int a, int b)
{
  int local_var1 = a + b; /* visible only in this function */
  return local_var1;
}
Now, we have to specify both files when building the program:
gcc -Wall -Wextra -ansi -pedantic main.c functions.c -o prog
None of this information is new. We've been doing things like this for a while. Only the most trivial programs will have all of their code in a single file. Most of the time, we will have multiple files and we need to be able to access the functions across the files.
Linkage: External vs. Internal vs. None
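The technical term for the accessibility of a symbol (e.g. function, variable, constant, etc.) is its linkage, which is either external, internal, or none. A symbol with external linkage is accessible from every file in the program (global), a symbol with internal linkage is accessible only within the file where it is defined (file-scope), and a symbol with no linkage is accessible only within the scope where it is defined (local).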
It's interesting to note that C does not allow local functions, i.e. functions defined inside other functions. Some languages, like Pascal, Ada, and Python do allow this. (Newer versions of C# also allow this.) However, the GNU C compiler (not C++) has an extension that supports nested functions. For example:
/* nested.c Compile without -pedantic */
#include <stdio.h>
int main(void)
{
  int factor = 10; /* Visible everywhere in main */
  /* Local or nested function, not allowed in standard C. */
  /* The gcc compiler does support nested functions. */
  /* Neither Clang nor Microsoft's compiler supports them. */
  int calculate(int a, int b)
  {
    int c = a + b; /* Local to this nested function */
    return c * factor;
  }
  /* Call local function and print result */
  printf("Calculated: %i\n", calculate(3, 5));
  return 0;
}
Here's an example of how nested functions could be useful. This is especially useful
if the nested function is never going to be used by any other code.
#include <stdio.h>  /* printf */
#include <stdlib.h> /* qsort  */
void PrintInts(int array[], int size)
{
  int i;
  for (i = 0; i < size; i++)
    printf("%i ", array[i]);
  printf("\n");
}
void TestInts(void)
{
  int array[] = {5, 12, 8, 4, 23, 13, 15, 2, 13, 20};
  /* This comparison function is "private" to TestInts */
  int compare_int1(const void *arg1, const void *arg2)
  {
    return *(int *)arg1 - *(int *)arg2;
  }
  PrintInts(array, 10);                        /* print the array */
  qsort(array, 10, sizeof(int), compare_int1); /* sort the array */
  PrintInts(array, 10);                        /* print the sorted array */
}
Output:
5 12 8 4 23 13 15 2 13 20
2 4 5 8 12 13 13 15 20 23
This document won't spend any more time on local variables, i.e. variables with no linkage, because
they are straightforward and easily understood. What will be covered in more detail is the other
two types: external and internal.
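As a quick recap before moving on, the three kinds of linkage can be sketched side by side in a single file. This is only an illustration: the file name and the names global_counter, file_counter, and bump are made up for this sketch and do not appear elsewhere in these notes.

```c
/* linkage_kinds.c (hypothetical file name, for illustration only) */

int global_counter = 0;      /* external linkage: other files can access it */
static int file_counter = 0; /* internal linkage: only this file can access it */

/* External linkage, which is the default for functions. */
int bump(void)
{
  int step = 1; /* no linkage: only visible inside bump */
  global_counter += step;
  file_counter += step;
  return global_counter + file_counter;
}
```

Another file could reach global_counter and bump (given suitable declarations), but it could never reach file_counter or step.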
Changing the Linkage with the extern and static Keywords
Going back to our add function:
/* This function has external linkage and is accessible to all */
/* files/functions in the program. It is a global function. */
int add(int a, int b)
{
  return a + b;
}
This is the default behavior for functions. Unless otherwise specified, functions have external linkage. There is a keyword, extern, which you can use to make this explicit:
/* The extern keyword explicitly marks this function */
/* as having external linkage, which is the default. */
extern int add(int a, int b)
{
  return a + b;
}
C programmers rarely, if ever, use this keyword with functions because it is redundant. The default for all functions is extern, so the keyword is generally omitted. So, all functions have external linkage by default. The question is, how do you specify that a function should have internal linkage? With the intern keyword? Sadly, as obvious as that sounds, that doesn't exist. It's the static keyword:
/* The static keyword marks this function as having internal */
/* linkage. Only functions in this file can access the function. */
static int add(int a, int b)
{
  return a + b;
}
Building the program:
gcc -Wall -Wextra -ansi -pedantic main.c functions.c -o prog
leads to this linker error:
/tmp/cck78WC2.o: In function 'main':
main.c:(.text+0x23): undefined reference to 'add'
collect2: error: ld returned 1 exit status
The exact error message will vary depending on the linker and platform, but the one thing that will be the same is the "undefined reference" to the add function. This is because, with the static keyword, the function has internal linkage, making it only visible within the file (functions.c) where it is defined.
So, what's the main purpose of marking a function static? It's used when you don't intend for other files to access the function. Think helper functions.
Helper functions are functions that are not meant to be called from outside of the file they are defined in. They are only meant for other functions within the same file. Sure, they don't have to be made static, but, if they have external linkage (global), there's a higher chance that the name of the helper function will conflict with other global functions.
With small programs (e.g. those written by beginning programmers), this is not usually a big deal. But, when you start having thousands or tens of thousands of functions accessible from your program (not that unlikely), you will get yourself into trouble. So, there is a simple rule-of-thumb:
If you ONLY need to access the function from within the file it is defined, mark it with the static keyword to keep it hidden/private to the file. If you intend to access the function from within the entire program (i.e. other files), don't use the static keyword.
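Here is a minimal sketch of that rule-of-thumb. The file name and the functions square and sum_of_squares are hypothetical and exist only for this illustration: the helper is marked static, and the one function meant for the rest of the program is not.

```c
/* helpers.c (hypothetical file name, for illustration only) */

/* Helper function with internal linkage. Only code in this file can */
/* call it, so its name cannot conflict with a square function that  */
/* might be defined in some other file.                               */
static int square(int x)
{
  return x * x;
}

/* External linkage (the default). Other files can call this function */
/* once they have seen a prototype for it.                            */
int sum_of_squares(int a, int b)
{
  return square(a) + square(b);
}
```

A different file that tried to call square directly would fail to link with the same kind of "undefined reference" error shown earlier, while calls to sum_of_squares would work.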
Linkage and Non-Local Variables
As stated earlier, local variables have no linkage and are only accessible within the scope where they are defined. However, it is possible to have variables with external linkage (global) and internal linkage (file-scope). Let's talk about file-scope variables first, as they are a little easier to understand.
Here's the simple header file:
/* geometry.h */
typedef struct GeometryResults
{
  double circle_area;
  double circle_circumference;
  double sphere_volume;
}GeometryResults;
Here's a file that calculates a few geometrical values:
/* file2.c */
#include "geometry.h" /* struct GeometryResults */
/* external linkage (global) */
const double PI = 3.1415926;
/* internal linkage (file-scope) */
static double area_of_circle(double radius)
{
  return PI * radius * radius;
}
/* internal linkage (file-scope) */
static double circumference_of_circle(double radius)
{
  return 2 * PI * radius;
}
/* internal linkage (file-scope) */
static double volume_of_sphere(double radius)
{
  return 4.0 / 3.0 * PI * radius * radius * radius;
}
/* external linkage (global) */
struct GeometryResults calculate_values(double radius)
{
  struct GeometryResults results;
  results.circle_area = area_of_circle(radius);
  results.circle_circumference = circumference_of_circle(radius);
  results.sphere_volume = volume_of_sphere(radius);
  return results;
}
And this is how we might want to use it:
/* file1.c */
#include <stdio.h> /* printf */
#include "geometry.h" /* GeometryResults */
/* external linkage (global) */
const double PI = 3.14;
/* prototype, defined in file2.c */
GeometryResults calculate_values(double radius);
/* helper function, not for use outside of this file */
static void print_results(const GeometryResults *results, double radius)
{
  printf("With a radius of %.2f:\n", radius);
  printf("----------------------\n");
  printf("Area of a circle is %.2f\n", results->circle_area);
  printf("Circumference of a circle is %.2f\n", results->circle_circumference);
  printf("Volume of a sphere is %.2f\n", results->sphere_volume);
}
int main(void)
{
  double radius;           /* radius used in all calculations */
  double height;           /* used for volume of a cone */
  double cone_volume;      /* volume of a cone */
  GeometryResults results; /* other geometric calculations */
  radius = 5.5;
  height = 10.0;
  cone_volume = PI * radius * radius * height / 3;
  printf("A cone with radius %.2f and height of %.2f has volume %.2f\n\n",
         radius, height, cone_volume);
  results = calculate_values(radius);
  print_results(&results, radius);
  return 0;
}
Attempting to build the program:
gcc -Wall -Wextra -ansi -pedantic -g file1.c file2.c -o prog
results in this cryptic linker error message:
/tmp/ccsSdszg.o:(.rodata+0x0): multiple definition of 'PI'
/tmp/cc42WWRP.o:(.rodata+0x0): first defined here
collect2: error: ld returned 1 exit status
It's telling us that PI is defined twice, which was done on purpose to demonstrate how to avoid this message. These are the duplicated definitions:
In file1.c
/* external linkage (global) */
const double PI = 3.14;
In file2.c
/* external linkage (global) */
const double PI = 3.1415926;
This often happens when one programmer creates a global symbol in one file, and another programmer (maybe the same programmer?) creates a duplicate definition of the same symbol in another file. Also, it would still be an error even if the values assigned to PI were identical. By now, we know what the solution is: static
In both files, add the static keyword to the definition to give PI internal linkage:
In file1.c
/* internal linkage (file-scope) */
static const double PI = 3.14;
In file2.c
/* internal linkage (file-scope) */
static const double PI = 3.1415926;
Now, building and running works as expected:
gcc -Wall -Wextra -ansi -pedantic -g file1.c file2.c -o prog && ./prog
A cone with radius 5.50 and height of 10.00 has volume 316.62

With a radius of 5.50:
----------------------
Area of a circle is 95.03
Circumference of a circle is 34.56
Volume of a sphere is 696.91
But, it seems that there's something not right with this. We have multiple definitions for PI. Even though each definition is internal to the file it is defined within, it just seems bad. Especially, since one file's definition has more precision than the other. That will likely lead to odd results at some point.
What we really want is to have just one definition and be able to use that one definition for all files. Let's go back to global functions to see what the solution is.
All functions (as well as all symbols) must obey the One Definition Rule (ODR). This is something we learned from the beginning and everyone should understand it pretty well. There can be exactly one definition of a function. That definition will exist in exactly one file.
All other files that wish to call the function must have a prototype (i.e. declaration) for that function. Remember, a function prototype does not include the body (i.e. curly braces and code). There are no limits to how many times you can prototype/declare a function, as long as they are all identical.
Going back to our original example with the add function, the definition was in a file called functions.c
/* This function is visible/accessible to other functions in this file */
/* as well as functions in other files. It is a global function. */
int add(int a, int b)
{
  return a + b;
}
and in main.c we have a prototype:
/* add is defined in another file */
int add(int a, int b);
This is pretty straightforward and we've been doing this for a while now. So, the question becomes, "How do I create a single definition of a variable in one file and then prototype that variable in other files so I can use it?"
The answer is: with the extern keyword.
Here's a sample to demonstrate:
/* fileA.c */
#include <stdio.h>
/* definition, external linkage (global) */
int a = 5;
/* prototype, it's defined in fileB.c */
void foo(void);
int main(void)
{
  printf("The value of a in main is %i\n", a);
  foo();
  return 0;
}
/* fileB.c */
#include <stdio.h>
/* definition, external linkage (global) */
void foo(void)
{
  /* Try to access a from fileA.c */
  printf("The value of a in foo is %i\n", a);
}
Attempting to build the program:
gcc -Wall -Wextra -ansi -pedantic -g fileA.c fileB.c -o prog
leads to this compiler error:
fileB.c: In function 'foo':
fileB.c:7:43: error: 'a' undeclared (first use in this function)
   printf("The value of a in foo is %i\n", a);
                                           ^
fileB.c:7:43: note: each undeclared identifier is reported only once for each function it appears in
This error makes complete sense. The compiler sees only one file at a time (fileB.c) and is unaware of the global variable a in fileA.c. The linker doesn't even get a chance to do its magic because the compilation fails.
Of course, if we try to "declare" the variable a in the foo function, it actually ends up hiding the global a from the other file:
void foo(void)
{
  int a; /* Try to "prototype" a from fileA.c */
  printf("The value of a in foo is %i\n", a);
}
Not only does it hide the one we want, it's an uninitialized local variable that the compiler warns about:
fileB.c: In function 'foo':
fileB.c:9:3: warning: 'a' is used uninitialized in this function [-Wuninitialized]
   printf("The value of a in foo is %i\n", a);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is the proper way to "declare" (NOT define) a variable that is defined in another file:
void foo(void)
{
  /* This is NOT a definition, it's a declaration. No space is       */
  /* allocated. The linker will figure out where the definition is.  */
  /* This tells the compiler that the variable a exists elsewhere    */
  /* and to not emit any errors.                                     */
  extern int a;
  printf("The value of a in foo is %i\n", a);
}
The reason we need the extern keyword is so that the compiler can distinguish between a declaration and a definition. With functions, it's easy. If the function has a body, then it is a definition. If there is no body, it's a declaration. There is no ambiguity. The extern used with functions has no bearing on the declaration/definition difference.
If a variable has the extern keyword, then it is a declaration. It's just telling the compiler that the variable is defined elsewhere (so don't give an error message) and that the linker will figure out where it is. If there is no extern keyword, then it is a definition.
Like functions, you must have exactly one definition of the variable, but you can have as many declarations (using the extern keyword) as you want.
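The one-definition, many-declarations idea can even be seen inside a single file. The following is just a sketch: the file name and the names total and read_total are invented for this example.

```c
/* declarations.c (hypothetical file name, for illustration only) */

extern int total; /* declaration: no storage is allocated here */
extern int total; /* repeating an identical declaration is allowed */

int read_total(void)
{
  return total; /* refers to the single definition further down */
}

int total = 42; /* the one and only definition, storage is allocated here */
```

The compiler accepts the use of total inside read_total because the extern declaration promises that a definition exists somewhere. Here it happens to be later in the same file, but it could just as well be in another file.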
Unfortunately, there is a slight caveat in C:
/* fileA.c */
#include <stdio.h>
/*
 * External linkage (global). With no initializer this is
 * considered extern in C, but error in C++. If more than
 * one are initialized, it's an error in C, as well. The
 * solution is to use extern on all but one. There is still
 * only one a in the program and the linker will make sure
 * that there is only one definition.
 */
int a;
void foo(void); /* prototype, it's defined in fileB.c */
int main(void)
{
  printf("The value of a in main is %i and address is %p\n", a, (void *)&a);
  foo();
  return 0;
}
/* fileB.c */
#include <stdio.h>
/*
 * External linkage (global). With no initializer this is
 * considered extern in C, but error in C++. If more than
 * one are initialized, it's an error in C, as well. The
 * solution is to use extern on all but one. There is still
 * only one a in the program and the linker will make sure
 * that there is only one definition.
 */
int a;
/* definition, external linkage (global) */
void foo(void)
{
  printf("The value of a in foo is %i and address is %p\n", a, (void *)&a);
}
The output shows that there is only one a in the program:
The value of a in main is 0 and address is 0x601044
The value of a in foo is 0 and address is 0x601044
Note that this behavior depends on the toolchain. GCC 10 and later default to -fno-common, so building this example with a recent GCC produces a "multiple definition" linker error unless -fcommon is specified.
By the way, what is the value of a? Why didn't the compiler complain about using an uninitialized variable?
Going back to our original problem:
In file1.c
/* internal linkage (file-scope) */
static const double PI = 3.14;
In file2.c
/* internal linkage (file-scope) */
static const double PI = 3.1415926;
we want to put the extern keyword on one of these. We'll assume that the definition is in file2.c and that file1.c will have the extern keyword:
In file1.c
/* Declaration. Keeps the compiler happy. PI is defined in another file. */
/* Do not initialize it with any value, or you will get a compiler error. */
extern const double PI;
In file2.c
/* Definition. External linkage (global), visible to entire program */
const double PI = 3.1415926;
With global variables, only one occurrence (the variable without the extern keyword) is allowed to have an initializer. All of the others (with the extern keyword) cannot have any initializer.
file1.c              file2.c              file3.c              file4.c
int somevar = 10;    extern int somevar;  extern int somevar;  extern int somevar;
Be careful not to do this:
/* fileA.c */
#include <stdio.h>
extern int a;   /* Not a definition */
void foo(void); /* Not a definition */
int main(void)
{
  printf("The value of a in main is %i and address is %p\n", a, (void *)&a);
  foo();
  return 0;
}
/* fileB.c */
#include <stdio.h>
extern int a; /* Not a definition */
/* definition, external linkage (global) */
void foo(void)
{
  printf("The value of a in foo is %i and address is %p\n", a, (void *)&a);
}
This will lead to this helpful (linker) error message:
/tmp/ccIKnPSd.o: In function 'main':
fileA1.c:(.text+0x6): undefined reference to 'a'
fileA1.c:(.text+0xb): undefined reference to 'a'
/tmp/ccp26QUD.o: In function 'foo':
fileB1.c:(.text+0x6): undefined reference to 'a'
fileB1.c:(.text+0xb): undefined reference to 'a'
collect2: error: ld returned 1 exit status
Tip: If many files need to access a global variable, you should put the extern declaration in a header file and include that in the files that need access to it. Remember, you can't put definitions in header files, only declarations, so this is a valid technique.
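A rough sketch of how that tip might look in practice is shown here. The file names constants.h and constants.c and the function circle_area are assumptions made for this illustration. The two files are shown one after the other in a single listing, with the #include line written as a comment, so that the listing can be compiled as one unit.

```c
/* ---- constants.h (hypothetical header file) ---- */
#ifndef CONSTANTS_H
#define CONSTANTS_H

extern const double PI;            /* declaration only, no storage allocated */
double circle_area(double radius); /* prototype */

#endif

/* ---- constants.c (hypothetical source file) ---- */
/* In a real project this file would start with: #include "constants.h" */

const double PI = 3.1415926; /* the one definition in the entire program */

double circle_area(double radius)
{
  return PI * radius * radius;
}
```

Every other file that includes the header sees only the extern declaration, so there is still exactly one definition of PI for the linker to find.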
Storage Classes
The discussion above focused on visibility and scope. Now, we're going to talk about storage classes. There are two parts to a storage class:
The lifetime of an auto object is until the end of the scope.
The lifetime of an extern object is until the program ends.
The lifetime of a static object is until the program ends.
The lifetime of a register object is until the end of the scope.
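To make the difference in lifetimes concrete, here is a small sketch that contrasts a static local variable with an automatic one. The function names next_id and always_one are hypothetical and used only for this illustration.

```c
/* lifetimes.c (hypothetical file name, for illustration only) */

/* id has static storage: it is initialized once and keeps its */
/* value between calls, until the program ends.                */
int next_id(void)
{
  static int id = 0;
  return ++id;
}

/* n has automatic storage: it is created and initialized on  */
/* every call and is gone when the function returns.          */
int always_one(void)
{
  int n = 0;
  return ++n;
}
```

Calling next_id repeatedly returns 1, 2, 3, and so on, while always_one returns 1 every time.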
/* storage.c */
#include <stdio.h>
void foo(void)
{
  static int count = 1; /* not stored on the stack, value persists between function calls */
  printf("The function has been called %i times.\n", count++);
}
int main(void)
{
  auto int a;     /* a is uninitialized on the stack, auto is redundant */
  int b;          /* b is uninitialized and is also on the stack (same as auto) */
  register int c; /* c will be put in a register, if possible (no guarantee) */
  extern int d;   /* d is defined elsewhere, this is just a declaration */
  for (a = 0; a < 5; a++)
  {
    static int e = 5; /* not stored on the stack, value persists in loops */
    static int f;     /* not stored on the stack, f will be initialized to 0 */
    int g = 10;       /* on the stack, g will be initialized to 10 each time */
    printf("e is %i, f is %i, g is %i\n", e++, f++, g++);
    foo();
  }
  return 0;
}
Output:
e is 5, f is 0, g is 10
The function has been called 1 times.
e is 6, f is 1, g is 10
The function has been called 2 times.
e is 7, f is 2, g is 10
The function has been called 3 times.
e is 8, f is 3, g is 10
The function has been called 4 times.
e is 9, f is 4, g is 10
The function has been called 5 times.
Notes: