C++ Basics
This post is about C++ and the way we can compile a C++ program. I remember in the early days of my PhD in physics that I used to write all my code into one main.cpp
and compile it into an executable. Over the months and years I understood that it is more efficient to write in different files serving different purposes and compile and link your libraries. I write this post as a reminder for me but also for these people that are starting in C++ as a guide.
TLDR
1
2
3
4
5
6
7
8
9
# step by step compilation
g++ -E main.cpp -o main.i # preprocessing
g++ -S main.cpp -o main.s # compilation: assembly
g++ -c main.s -o main.o # compilation: object file
g++ main.o -o hello_world # linking
# compilation all at once (one file)
g++ main.cpp -o hello_world
Compilation
Compilation is a process by which we translate from human language (code) to machine language, or instructions that the machine can understand. There are several steps, let’s explain it using a file example saved as main.cpp
1
2
3
4
5
#include <iostream>
int main() {
std::cout << "Hello, World!" << std::endl; return 0;
}
Installing the compiler
We will isntall gcc
(GNU compiler collection), the compiler for C and g++
, the compiler for C++. In linux install with
1
2
sudo apt update
sudo apt install gcc g++
In MacOS install using brew:
1
brew install gcc
The brew formulae alrady has gcc
and g++
, check both compilers have been successfully installed with
1
2
gcc --version
g++ --version
Preprocessing
The preprocessing step has several functions:
- File inclusion: When encountering
#include
, the preprocessor replaces it with the actual content of the included file - Macro expansion: Macros defined with
#define
are replaced wherever they appear in the code. E.g.#define PI 3.14159; double area = PI * r * r;
is after preprocessingdouble area = 3.14159 * r * r;
on the variablearea
. - Conditional compilation: IFDEFs are included/not included in this step, for instance
#ifdef DEBUG std::cout << "Debugging enabled" << std::endl; #endif
. ifDEBUG
is added in the flag-DDEBUG
it will add these lines. - Removes comments: For instance comments in
//this is a comment
won’t be included after the source file is processed ` The command to generate amain.i
preprocessed file is:
1
g++ -E main.cpp -o main.i
Try to take a look at the file. There’s many things modified. The preprocessing is useful to add all the necessary code in the preprocessed file, also the compiler will get a unique file with code to compile to an object. It is useful to add or delete configurations and other directives using ifdef
and other commands, this way we can compile for DEBUG
or maybe CUDA
only code.
Compilation: Generate assembly code
This step instructs the compiler to translate the preprocessed C++ source file (main.i
) into assembly code, which is a low-level, human-readable representation of the instructions for the target machine’s CPU.
The following command precompiles and then generates the assembly code for your specific CPU:
1
g++ -S main.cpp -o main.s
The compiler reads the preprocessed file and checks for errors in variables, types correctness, compliance of the C++ standard. Then it constructs the abstract syntax tree, that is a tree structure of the program’s execution. Then it optimizes the instructions by removing unnecessary operations or simplifying operations. Finally writes the hardware specific instructions in human readable code to the file.
Compilation: Generate object file
The following step is produced by the assembler, that translates assembly code produced in the previous step to object code or compiled code that the CPU can execute. However it is not yet a complete executable.
The code for this step is
1
g++ -c main.s -o main.o
We create a source binary file that is not linked to any executable. This is useful for modular projects with multiple source files.
Linking: Executable generation
Now that we have machine interpretable code we need to link all the objects to generate an executable that can be called from the machine and produces an output or calculation. In our simple case
1
g++ main.o -o hello_world
The linking phase is simple here because we are just dealing with a single object file but in general we have many. We will see how to deal with that in a follow-up post. What we end up having from the command is a hello_world
file that can be executed as ./hello_world
to output Hello World! in your screen.