The Steps of Compilation with the GNU Compiler Collection (GCC).

Gustavo Hornedo
2 min read · Dec 2, 2020


By now you have probably been through the unofficial initiation ritual of any programming language: printing the “Hello World” message. If you went through this ritual in a Linux terminal, you most likely used the GNU Compiler Collection (GCC). You created a file with a ‘.c’ extension, most likely called ‘main.c’, and used the command <gcc main.c> to compile it.

In the image above we first show the contents of the current directory with <ls>, then display the contents of the only file in that directory with <cat main.c>, compile it using the GNU Compiler Collection (GCC) with <gcc main.c>, and finally show the contents of the current directory once again with <ls>.

Well, well, well: it turns out that, by default, on a successful compilation GCC saves the output (a.k.a. the program) to the current directory under the name “a.out”. Careful: if a file with that name already exists, it will be replaced by the new one without warning. But how did that compilation work? How does it separate the code from the comments? How does it know where to look for the ‘printf’ function? When is it translated into something the computer understands?

The Four Steps in the Compilation Process

When you invoke GCC, a series of steps is performed to generate an executable from the C source file(s) you indicated. These steps are as follows:

The Compilation Process
  • Preprocessor : Removes comments from the source code and interprets preprocessor directives, which are the statements that begin with #. In our main.c file, the preprocessor replaces “#include <stdio.h>” with the contents of stdio.h and removes the comments describing the function.
  • Compiler : Translates the preprocessed C code into assembly language, a low-level, human-readable form of the instructions that manipulate the memory and processor directly. The output is assembly code, conventionally stored in a file with the ‘.s’ extension, in this case “main.s”.
  • Assembler : Takes the assembly code from the compiler and turns it into binary machine code. The output is an object code file with the ‘.o’ extension, in this case “main.o”.
  • Linker : Links the object code from the source file with the pre-compiled libraries that come with the compiler (such as the C standard library, where ‘printf’ lives) and any other libraries we specified, and gives us the executable file, which in this case is called “a.out” by default.

PS: You can specify the desired filename of the executable using the “-o” option at compile time: <gcc main.c -o desiredName>
