First C Programs (The Chicken and The Egg)

The general form of a C program looks like this: (The parts in bold are required)
include files

function declarations (prototypes)

function definitions (implementations)

data declarations/definitions (global)

main function header
  data declarations/definitions (local)

other functions
Therefore, the simplest C program you can write:
int main(void)
{
}
Technically, you should have a return statement:
int main(void)
{
  return 0;
}
It looks simple because it is. It does nothing of interest. But, nevertheless, it produces a "functional" program. This simple program demonstrates many characteristics of a C program.

Students that think that int main(void) can be changed to int main() are required to read this, as I'm not going to spend any more time on it. (Yes, I know that some good textbooks interchange them, but you shouldn't.)

A real example of a function that takes two integers, adds them together, and returns the result:

int add(int a, int b)
{
  return a + b;
}
Sample usage:
x = add(1, 2);
y = add(5, 10);
z = add(-4, 4);
A second program in C that actually does something (Diagram):
#include <stdio.h>

/* Say hi to the world */
int main(void)
{
  printf("Hello, World!\n");
  return 0;
}
Output:
Hello, World!
Adding line numbers for clarity. They are not (and cannot be) present in the actual code.
1. #include <stdio.h>
3. /* Say hi to the world */
4. int main(void)
5. {
6.   printf("Hello, World!\n");
7.   return 0;
8. }
Each non-blank line above has significant meaning to the C compiler. Look at stdio.h to see what the pre-processor adds.

Editing, Preprocessing, Compiling, Linking, and Executing

[Diagrams: Edit/Preprocess/Compile/Execute Loop; Edit/Preprocess/Compile/Execute Loop Extended]

Contrasting Languages at 3 Levels

Let's say we have 4 variables (just like in algebra), where a = 1, b = 2, c = 3, d = 4. We want to set a fifth one, e, to ab(c + d). We can rewrite the multiplication explicitly as

a ⋅ b ⋅ (c + d)
This will result in e having the value 14.

C code (partial program)

int a = 1;
int b = 2;
int c = 3;
int d = 4;
int e = a * b * (c + d);

Assembly language (16-bit Intel x86)

Assembly language (compiler-generated), followed by the same assembly language with comments:
_main proc  far
   mov word ptr [bp-2],1  
   mov word ptr [bp-4],2  
   mov word ptr [bp-6],3  
   mov word ptr [bp-8],4  
   mov ax,word ptr [bp-2] 
   imul  word ptr [bp-4]  
   mov dx,word ptr [bp-6] 
   add dx,word ptr [bp-8] 
   imul  dx               
   mov word ptr [bp-10],ax 
_main endp
_main proc  far
   mov word ptr [bp-2],1  ;the address of a is bp-2
                          ;the value is 1

   mov word ptr [bp-4],2  ;the address of b is bp-4
                          ;the value is 2

   mov word ptr [bp-6],3  ;the address of c is bp-6
                          ;the value is 3

   mov word ptr [bp-8],4  ;the address of d is bp-8  
                          ;the value is 4 

   mov ax,word ptr [bp-2] ;put a's value in ax reg

   imul  word ptr [bp-4]  ;multiply ax reg by b, 
                          ;put result back in ax

   mov dx,word ptr [bp-6] ;put c's value in dx reg 
   add dx,word ptr [bp-8] ;add d to the dx reg
                          ;put result back in dx

   imul  dx               ;multiply ax reg by dx
                          ;put result back in ax

   mov word ptr [bp-10],ax ;the address of e is bp-10
                           ;the value of e is 14
_main endp

Hand-coded assembler (32-bit 80386 using GNU's assembler; comments start with #)

.section .data
a: .long 1
b: .long 2
c: .long 3
d: .long 4
e: .long 0

.section .text
.globl _start

_start:
  movl a, %eax      #     a -> %eax
  movl b, %ebx      #     b -> %ebx
  imull %eax, %ebx  # a * b -> %ebx

  movl c, %eax      #     c -> %eax
  movl d, %ecx      #     d -> %ecx
  addl %eax, %ecx   # c + d -> %ecx

  imull %ebx, %ecx  # (a * b) * (c + d) -> %ecx
  movl %ecx, e      # put result in e
Here are two other assembler samples output from Microsoft's compiler and GNU's compiler.

Machine language (Intel 80386)

Hexadecimal dump (base 16) and Octal dump (base 8)
Now, which language would you rather work with?

Simple calculation:

int x = 5;
int y = 3;
int z = x + y;
[Diagram: simplified view at runtime showing 3 variables in memory, a CPU with 4 registers and an Arithmetic-Logic Unit, drawn in two possible layouts]
Assembly code (generated by the GNU C compiler, comments added):
movl  $5, -12(%rbp)    # put 5 into memory location x
movl  $3, -8(%rbp)     # put 3 into memory location y
movl  -12(%rbp), %eax  # put contents of x into eax register
movl  -8(%rbp), %ebx   # put contents of y into ebx register
addl  %ebx, %eax       # add ebx and eax registers, put result in eax register
movl  %eax, -4(%rbp)   # put contents of eax register into memory location z

Hopefully, you can see why people prefer to program in C rather than assembly. The end result is the same, but the amount of effort required from the C programmer is significantly less.

Putting It All Together

Step 1: Edit

Create a text file for this C code named simple.c. The size of this file is about 120 bytes (although this depends on the operating system and the type of whitespace used). You can use any text editor, such as Notepad++ (Windows only) or Geany (Windows, OS X, Linux, and others).
int main(void)
{
  int a = 1;
  int b = 2;
  int c = 3;
  int d = 4;
  int e = a * b * (c + d);

  return 0;
}

Step 2: Compile (with Preprocess)

The source code (text) is compiled into object or machine code (binary) and saved in a file named simple.o. The size of this file will probably be between 500 and 2000 bytes (depending on the compiler, version, and options used). Use:
gcc -c simple.c -o simple.o

Step 3: Link

The object file is linked (combined) with other object code and saved in a file named simple.exe. (The .exe is a Windows requirement as other operating systems don't require it.) The size of this file is going to be tens of thousands of bytes. (Again, depends on the compiler.)
gcc simple.o -o simple.exe

Step 4: Execute

Run the executable file by simply typing the name of the executable file (providing the .exe extension when executing it is optional under Windows):
simple.exe
Of course, nothing appears to happen. The program did run and it did perform the calculations. There just aren't any instructions in the program that display anything for you to see.

We can modify it to use the printf (print formatted) function and display the value of e after the calculations:

#include <stdio.h> /* printf */

/* Calculate some values */ 
int main(void)
{
  int a = 1;
  int b = 2;
  int c = 3;
  int d = 4;
  int e = a * b * (c + d);

    /* Display the result as an integer value */ 
  printf("%i\n", e);

  return 0;
}
The C program above is a complete program that, when executed, will print the value 14 on the screen.

Here's a more detailed diagram of what's happening. Assume that the program has been written in C and is being built with the GNU tools.

This is the Lifetime of a Program:

The source code is text. This is the only part of the
development process that involves the programmer.

The preprocessor program is called cpp (C Pre-Processor).

The output from the preprocessor is also C (text) and is
called a translation unit.

The compiler is called gcc.

The output from the compiler is text and is called
assembly code.

The assembler program is called as.

The output from the assembler is binary and called object
or machine code.

The linker program is called ld. It also gets input
from the libraries.

The output from the linker is also binary and it's ready
to be executed.

The loader is part of the operating system that loads the
binary from the disk, translates addresses, and places it
in memory.

The translated executable code is then fed into the CPU for execution.

The program (now called a process) will continue to
execute until it receives an instruction to stop.


The commands above were really shortcuts for the entire process. Let's look at each step individually to gain a clearer understanding of what is actually taking place. Remember that the C programming language (and all of its associated tools) is case sensitive. 99% of the time, everything should be in lowercase. Exceptions will be noted below.

Step 1 - Creating/Editing the source file (.c):

From the command line, invoke a text editor (e.g. notepad++) and create/edit the source file:
notepad++ simple.c
Just type the code above (or copy/paste) and save the file.

Step 2 - Preprocessing the source file (.i):

cpp simple.c -o simple.i
The output file (simple.i) contains the original source code, plus a lot of code from the stdio.h header file. The -o option tells cpp what to name the output file. The output file name, simple.i, must follow immediately after the -o option. (The file extension .i is the conventional way to name a preprocessed file.)

Note: The cpp command above stands for C pre-processor, NOT C plus plus.

Step 3 - Compiling to an assembly file (.s):

Use the GNU C compiler, gcc, to translate the preprocessed file into assembly code.
gcc -fpreprocessed -S simple.i -o simple.s

Step 4 - Assembling into an object file (.o):

Use the GNU assembler (as) to assemble/convert the assembly file to object code.
as simple.s -o simple.o
The -o option tells the assembler, as, to name the object file simple.o. (The file extension .o is the conventional way to name an object file.) To view the disassembled code, use either of these:
objdump -d simple.o (with code comments)
dumpbin /disasm simple.o

Step 5 - Linking to an executable file (.exe):

Use the GNU linker, ld (that's a lowercase 'L'), to link the object file with code from the standard libraries and create an executable named simple.exe

Do NOT copy and paste the text below. The command that you type must be on a single line and the text below is broken into multiple lines. This link here: linker command on one line, has the command as one long line. Copy that text instead.

ld -plugin C:/mingw64/bin/../libexec/gcc/x86_64-w64-mingw32/8.1.0/liblto_plugin-0.dll
-plugin-opt=-pass-through=-lmingw32 -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_eh
-plugin-opt=-pass-through=-lmoldname -plugin-opt=-pass-through=-lmingwex -plugin-opt=-pass-through=-lmsvcrt
-plugin-opt=-pass-through=-lpthread -plugin-opt=-pass-through=-ladvapi32 -plugin-opt=-pass-through=-lshell32
-plugin-opt=-pass-through=-luser32 -plugin-opt=-pass-through=-lkernel32 -plugin-opt=-pass-through=-liconv
-plugin-opt=-pass-through=-lmingw32 -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_eh
-plugin-opt=-pass-through=-lmoldname -plugin-opt=-pass-through=-lmingwex -plugin-opt=-pass-through=-lmsvcrt
--sysroot=C:/mingw710/x86_64-710-posix-seh-rt_v5-rev1/mingw64 -m i386pep
-Bdynamic C:/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.1.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o
-LC:/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.1.0 -LC:/mingw64/bin/../lib/gcc
-LC:/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.1.0/../../.. simple.o -lmingw32 -lgcc
-lgcc_eh -lmoldname -lmingwex -lmsvcrt -lpthread -ladvapi32 -lshell32 -luser32 -lkernel32 -liconv -lmingw32 -lgcc
-lgcc_eh -lmoldname -lmingwex -lmsvcrt C:/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.1.0/crtend.o -o simple.exe

Step 6 - Executing the program:

At the command prompt you just need to type the name of the executable file. On Windows type:
simple
On Mac or Linux type (or if you're using the Windows Power Shell):
./simple
Usually, you don't need to supply the .exe file extension. The linker will add that automatically (on Windows) and leave it off for Mac and Linux. By omitting the extension:
-o simple
the linker will "do the right thing" based on the operating system.


The above step-by-step tutorial was just to show you what's going on behind-the-scenes. Normally for a simple project (meaning 99% of the time) you will simply do this:
gcc simple.c -o simple
and that will perform all of the steps above.


Additional Compile Options

The compiler accepts many, many (hundreds? thousands?) of options that guide its behavior. To get some information on gcc, type:
man gcc
and it will display a 332 page help file (manual page) on everything you would want to know about gcc. You can also browse the options online.

Setting the Warning Level

To enable the compiler to perform a more thorough check of your source code, use the -Wall command line option like so:
gcc -Wall -c simple.c -o simple.o
Now, if the compiler detects any potential problems or misuse of the C language, it will alert you with a warning message. For example, if I add the variable f like this:
  int a = 1;
  int b = 2;
  int c = 3;
  int d = 4;
  int f = 5; /* Add this variable, but don't use it anywhere */ 

  int e = a * b * (c + d);
  printf("%i\n", e);
  return 0;
I get this warning from the compiler:
simple.c: In function 'main':
simple.c:10: warning: unused variable 'f'
or, if I add something like this:
a + b - c;   /* Perform some calculation, but discard the value */ 
I get this warning from the compiler:
simple.c: In function 'main':
simple.c:11: warning: statement with no effect

Compile and Link in One Step

If you want to perform both compile and link steps with one command, don't provide the -c option. This will compile and then link the program:
gcc simple.c -o simple.exe
However, if the compile step fails for any reason, the link step is skipped.

Other Useful Options

If you want to generate the assembly output as above, use -S option:
gcc -S simple.c 
This will produce a text file called simple.s which you can view with any text editor.

If you want to generate the preprocessor output directly from gcc, use -E option:

gcc -E simple.c 
This will produce a ton of information to the screen. To capture the output so you can view it more easily, redirect the output to a file:
gcc -E simple.c > simple.out
The > simple.out part of the command causes the output to be written to a file (named simple.out) instead of to the screen. This file is a text file that you can view with any text editor.

Modern compilers have two primary goals:

1. To verify that your code is valid C code
2. To convert/translate the C (text) code into machine (binary) code

The compiler will verify that this code is valid, legal C code, although it is incorrect. The formula to calculate the volume of a sphere is:
V = (4/3)πr³
Here's an attempt to calculate it based on a radius of 2:
#include <stdio.h>

#define PI 3.1415926

int main(void)
{
  int radius = 2;
  double volume = 4 / 3 * PI * radius * radius * radius;

  printf("The volume of a sphere with radius %i is %f\n", radius, volume);

  return 0;
}
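The expected output is:
The volume of a sphere with radius 2 is 33.510321
but the actual output is:
The volume of a sphere with radius 2 is 25.132741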
As far as the compiler is concerned, there is nothing wrong with the code. But, the use of the data is incorrect. That's because the compiler sees this:
1 * 3.1415926 * 2 * 2 * 2
(4/3 is 1 using integer division, meaning no fractional portion) when it should see this:
1.333333 * 3.1415926 * 2 * 2 * 2
It's kind of like doing this:
printf("The sum of 3 and 4 is %i\n", 3 - 4);
The compiler can't check that you are subtracting when you meant to be adding.

So, the moral of the story is:

Just because your code compiles, links, (and even runs), doesn't mean it is correct.

Another Example

This example will show constructs such as identifiers, literal constants, defines, expressions, and several others.

The C code: marathon.c: (with line numbers for clarity)

 1. #include <stdio.h> /* printf */
 3. /* Convenient definitions */
 4. #define YARDS_PER_MILE 1760
 5. #define KILOS_PER_MILE 1.609
 7. /* A marathon is 26 miles, 385 yards               */
 8. /* Prints the distance of a marathon in kilometers */
 9. int main(void)
10. {
11.   int miles = 26;    /* Miles in a marathon                 */
12.   int yards = 385;   /* Yards in a marathon                 */
13.   double kilometers; /* Calculated kilometers in a marathon */
15.     /* Convert miles and yards into kilometers */
16.   kilometers = (miles + (double)yards / YARDS_PER_MILE) * KILOS_PER_MILE;
18.     /* Display the result on the screen */
19.   printf("A marathon is %f kilometers\n", kilometers);
21.     /* Return successful value to OS */
22.   return 0;
23. }
The program will output: A marathon is 42.185969 kilometers

Without line numbers (to copy and paste):

#include <stdio.h> /* printf */

/* Convenient definitions */
#define YARDS_PER_MILE 1760
#define KILOS_PER_MILE 1.609

/* A marathon is 26 miles, 385 yards               */
/* Prints the distance of a marathon in kilometers */
int main(void)
{
  int miles = 26;    /* Miles in a marathon                 */
  int yards = 385;   /* Yards in a marathon                 */
  double kilometers; /* Calculated kilometers in a marathon */

    /* Convert miles and yards into kilometers */
  kilometers = (miles + (double)yards / YARDS_PER_MILE) * KILOS_PER_MILE;

    /* Display the result on the screen */
  printf("A marathon is %f kilometers\n", kilometers);

    /* Return successful value to OS */
  return 0;
}

The preprocessed file named marathon.i (generated by: gcc -E marathon.c -o marathon.i)

The assembly file named marathon.s (generated by: gcc -S marathon.c -o marathon.s)

The code above uses a few arithmetic operators. Many operators correspond with the ones you've seen in algebra. Here are a few binary operators:

Operator Meaning
+ Add
- Subtract
* Multiply
/ Divide
% Modulo
A full list of operators including their precedence and associativity.

A note on casting: The integer values are stored differently than the floating point values:

integer 1: 00000000000000000000000000000001
float 1.0: 00111111100000000000000000000000

integer 50: 00000000000000000000000000110010
float 50.0: 01000010010010000000000000000000
The IEEE-754 notation is really showing the bit pattern for 1.0 x 2^0, which is 1.0, and 1.5625 x 2^5, which is 50.0.

You can see that the format of the binary 1.0 is very different than the integer version. The integer format is in two's complement and the floating point value is in IEEE Standard for Floating-Point Arithmetic notation.

Computer Data Storage (Refresher)

This section is just a short refresher on the binary, decimal, and hexadecimal number systems. Relationship between hex and binary numbers:
0 0000 4 0100 8 1000 C 1100
1 0001 5 0101 9 1001 D 1101
2 0010 6 0110 A 1010 E 1110
3 0011 7 0111 B 1011 F 1111

Character representations

The binary number 1011101000010101000111100011 translates into hex (BA151E3) and decimal (195,121,635) as:

Binary 1011 1010 0001 0101 0001 1110 0011
Hexadecimal B A 1 5 1 E 3
Decimal 195121635
More information on Binary numbers.

Binary/Decimal converter (BinConverter.exe)

Lexical Conventions

C programs are typically stored in one or more files on the disk. These files are given a rather fancy name: translation units. After the pre-processor has removed the directives and performed the appropriate action, the compiler starts its job. The first thing the compiler needs to do is to parse through all of the tokens in the files.

There are different classes of tokens (lexical elements). In no particular order they are:

  1. keywords
  2. identifiers
  3. constants
  4. operators (Lots!)
  5. punctuators and separators
The standard actually names these 6:
  1. keywords
  2. identifiers
  3. constants
  4. string literals
  5. operators
  6. punctuators
White space includes things like blank spaces, tab, newlines, etc. Comments are a form of whitespace since they are stripped out (by the preprocessor) and replaced by a single space.

We will spend the entire course studying these aspects of the C language.


Some reasons an identifier is invalid:
  1. It starts with a digit (it must start with a letter or underscore).
  2. It contains $, which is an illegal character.
  3. It contains -, which is an illegal character (for an identifier).
  4. It contains spaces, as in foo bar.
  5. It is a keyword, such as int.
Good Identifier Names vs. Bad Identifier Names

Good:
  int rate;
  int time;
  int distance;

  rate = 60;
  time = 20;
  distance = rate * time;
Bad:
  int x;
  int y;
  int z;

  x = 60;
  y = 20;
  z = x * y;

Good:
  #define PI 3.1416F
  float radius = 5.25F;
  float sphere_volume;
  sphere_volume = 4.0F / 3.0F * PI * radius * radius * radius;
Bad:
  #define A 3.1416F
  float id1 = 5.25F;
  float id2;
  id2 = 4.0F / 3.0F * A * id1 * id1 * id1;

Good:
  double base = 2.75;
  double height = 4.8;
  double area_of_triangle = 0.5 * base * height;
Bad:
  double table = 2.75;
  double chair = 4.8;
  double couch = 0.5 * table * chair;


Keywords are identifiers that are reserved for the compiler. You can't use any of these as identifiers:
auto const double float int short struct unsigned
break continue else for long signed switch void
case default enum goto register sizeof typedef volatile
char do extern if return static union while
Some of these keywords are not used in this course as they are somewhat more advanced. The auto and register keywords are deprecated and shouldn't be used at all. (The auto keyword means something completely different in modern C++.) There are also about 92 keywords in C++, which is an enormous language!

Also, remember that C is case-sensitive so these keywords must be typed exactly as shown. int is not the same as Int or INT.


A literal value is a constant just as you type it in the code:
 int a = 1;
 float f = 3.14F;
 double d = 23.245;
Constant Type
5 int
3.14 double
3.14F float
'A' int
"hello" string


A large part of your programming career will deal with algorithms.

Euclid's algorithm in English (with some algebra thrown in):

Step  Actions to be Performed
1     Assign the larger number to M, and the smaller number to N.
2     Divide M by N (M/N) and assign the remainder to R.
3     If R is not 0, then assign the value of N to M, assign the value of R to N, and return to step 2.
      If R = 0, then the GCD is N and the algorithm terminates.

Notice that in Step 3 there are two different possibilities. This is typically how algorithms work. There usually needs to be some terminating condition, otherwise the algorithm (program) runs forever.

How about the numbers 101 and 27?

Here is how we might write the algorithm in pseudo-code:

  1. Assign larger value to M
  2. Assign smaller value to N
  3. Divide M by N and assign remainder to R
  4. While remainder, R, is not 0
    1. Assign N to M
    2. Assign R to N
    3. Divide M by N and assign remainder to R
  5. End While
  6. The algorithm has terminated and the GCD is N
Coding this algorithm in an assembler language might look like the code below. This version is using memory locations that are named M and N for easier understanding.


.section .data
M: .long 45        # put 45 in location named M
N: .long 12        # put 12 in location named N

.section .text
.globl main

main:
  movl M, %eax     # put M in %eax
  movl N, %ecx     # put N in %ecx
  movl $0, %edx    # zero out for idivl
  idivl %ecx       # divide M/N, (%edx:%eax/%ecx)
                   # result -> %eax, remainder -> %edx

start_loop:
  cmpl $0, %edx    # is remainder 0?
  je loop_exit     # if 0, we're done

  movl %ecx, %eax  # put N in M
  movl %edx, %ecx  # put R in N
  movl $0, %edx    # zero out for idivl
  idivl %ecx       # divide M/N, (%edx:%eax/%ecx)
                   # result -> %eax, remainder -> %edx
  jmp start_loop   # check again

loop_exit:         # the GCD is now in %ecx (N)
This second version doesn't use any memory locations (like M and N above). All values are stored directly in the registers on the CPU.
.section .data
.section .text
.globl main

main:
  movl $45, %eax    # put 45 in eax (M)
  movl $0, %edx     # set to 0 (high word of dividend)
  movl $12, %ecx    # put 12 in ecx (N)
  idivl %ecx        # divide M/N, (%edx:%eax/%ecx)
                    # result -> %eax, remainder -> %edx

start_loop:
  cmpl $0, %edx     # Is remainder 0?
  je loop_exit      # if 0, we're done

  movl %ecx, %eax   # put N into M
  movl %edx, %ecx   # put R into N
  movl $0, %edx     # set high word
  idivl %ecx        # divide M/N, (%edx:%eax/%ecx)
                    # result -> %eax, remainder -> %edx
  jmp start_loop    # continue algorithm

loop_exit:          # the GCD is now in %ecx (N)

Euclid's GCD algorithm as high-level computer programs (assuming that M and N are integers and already have values and that M > N)

C:
R = M % N;
while (R != 0)
{
  M = N;
  N = R;
  R = M % N;
}

Perl:
$R = $M % $N;
while ($R != 0)
{
  $M = $N;
  $N = $R;
  $R = $M % $N;
}

Python:
R = M % N
while R != 0:
  M = N
  N = R
  R = M % N

Pascal:
R := M Mod N;
While R <> 0 Do
Begin
  M := N;
  N := R;
  R := M Mod N;
End;

BASIC:
R = M MOD N
WHILE R <> 0
  M = N
  N = R
  R = M MOD N
WEND

The algorithm reprinted:

Step  Actions to be Performed
1     Assign the larger number to M, and the smaller number to N.
2     Divide M by N (M/N) and assign the remainder to R.
3     If R is not 0, then assign the value of N to M, assign the value of R to N, and return to step 2.
      If R = 0, then the GCD is N and the algorithm terminates.

Another way to state step #3:
3     While R is not 0, assign the value of N to M, assign the value of R to N, and go to step 2.
      If R = 0, then the GCD is N and the algorithm terminates.

This is how you would begin to implement this algorithm. Use the pseudocode to guide your implementation:

#include <stdio.h> /* printf */

int main(void)
{
  /* 1. Assign larger value to M  */
  /* 2. Assign smaller value to N */
  /* 3. Divide M by N and assign remainder to R */
  /* 4. While remainder, R, is not 0 */
    /* a. Assign N to M */
    /* b. Assign R to N */
    /* c. Divide M by N and assign remainder to R */
  /* 5. End While */
  /* 6. The algorithm has terminated and the GCD is N */

  return 0;
}