The Steps of Compilation with the GNU Compiler Collection (GCC)
By now you must have already been through the unofficial ritual of initiation when taking on any programming language: the printing of the “Hello World” message. If you went through this ritual using the Linux terminal, you most likely used the GNU Compiler Collection (GCC). You created a file with a ‘.c’ extension, most likely called ‘main.c’, and used the command <gcc main.c> to compile it.
Well, well, well. It turns out that by default, on a successful compilation, GCC saves the output (a.k.a. the program) to the current directory under the name “a.out”. Careful: if a file with that name already exists, it will be replaced by the new one without warning. But how does said compilation work? How does it separate the code from the comments? How does it know where to look for the ‘printf’ function? When is the code translated into something the computer understands?
The Four Steps in the Compilation Process
When you invoke GCC, a series of steps is performed to generate an executable from the C source file(s) you indicated. These steps are as follows:
- Preprocessor : Removes comments from the source code and interprets preprocessor directives, which are the statements that begin with # in our main.c file. The preprocessor replaces the “#include <stdio.h>” line with the contents of stdio.h and removes the comments describing the function.
- Compiler : Translates the preprocessed C code into assembly language, a human-readable form of the machine-level instructions that manipulate the memory and processor directly. The output is a file with the ‘.s’ extension, in this case “main.s”.
- Assembler : Takes in the assembly code from the compiler and turns it into binary machine code. The output is an object code file with the ‘.o’ extension, in this case “main.o”.
- Linker : Links the object code from the source file with pre-compiled libraries such as the standard C library (this is where the ‘printf’ call gets connected to its implementation) and with any other library we specified. The result is the executable file, which in this case is called “a.out” by default.
PS: You can specify the desired filename of the executable with the “-o” option at the moment of compilation: <gcc main.c -o desiredName>