<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://agramunt.me/feed.xml" rel="self" type="application/atom+xml" /><link href="https://agramunt.me/" rel="alternate" type="text/html" hreflang="en" /><updated>2026-02-09T22:48:41-08:00</updated><id>https://agramunt.me/feed.xml</id><title type="html">Sebastia Agramunt Puig - Blog</title><subtitle>A blog about software engineering, maths, physics and cryptography.</subtitle><author><name>Sebastia Agramunt Puig</name></author><entry><title type="html">CUDA Python package</title><link href="https://agramunt.me/posts/cuda-python/" rel="alternate" type="text/html" title="CUDA Python package" /><published>2025-12-29T18:28:00-08:00</published><updated>2026-01-24T14:40:09-08:00</updated><id>https://agramunt.me/posts/cuda-python</id><content type="html" xml:base="https://agramunt.me/posts/cuda-python/"><![CDATA[<p>In this post we will learn how to expose CUDA functionality in Python, so that you can call your custom CUDA code from Python without much hassle.</p>

<h2 id="introduction">Introduction</h2>

<p>CUDA is not the simplest framework to learn: first you need to know C++, and then you need to understand Nvidia GPU internals. Only then can you write kernels and parallelize your calculations. People like to code in Python because it is an easy interpreted language to learn, but its simplicity is sometimes a drawback: you can’t write very specific instructions in Python that interact with the GPU, right? Well, the Nvidia community is putting a lot of effort into creating Python tools for writing efficient GPU code; at least, that’s one of the main takeaways I got from the <a href="https://www.nvidia.com/gtc/">Nvidia GTC conference 2025</a>. For Python you have <a href="https://cupy.dev/">CuPy</a>, <a href="https://numba.pydata.org/">Numba</a>, <a href="https://docs.jax.dev/en/latest/">JAX</a>, <a href="https://openai.com/index/triton/">Triton</a>, and the <a href="https://docs.rapids.ai/">RAPIDS</a> ecosystem, which contains cuDF, cuML, cuGraph, etc. At that conference they also spoke a lot about <a href="https://docs.nvidia.com/cuda/cutile-python/">cuTile</a>, which has finally been released this month!</p>

<p>All these libraries are very promising, especially cuTile (I will write some posts about them soon!). However, sometimes you need full control of the CUDA code and have to write the kernels directly. In my current position at Eikon Therapeutics I wrote CUDA kernels for detecting and localizing proteins in images, which reduced the calculation time from 3 minutes (CPU) to 5 seconds (GPU). Apart from the technical difficulty of coding the kernels and handling memory, one hard part was exposing this functionality to a regular user who codes in Python. For that I learned how to compile the CUDA code with <code class="language-plaintext highlighter-rouge">nvcc</code> and create the Python bindings, as well as how to package everything in a wheel for a specific architecture and run the tests in an automated pipeline.</p>

<p>In this post we will learn how to expose CUDA functionality in Python, so that you can call your custom CUDA code from Python without much hassle. The code for this project lives in my <a href="https://github.com/SebastiaAgramunt/python-cuda">python-cuda</a> GitHub repository, not in <a href="https://github.com/SebastiaAgramunt/blogging-code">blogging-code</a>, where I usually publish the code for this blog.</p>

<h2 id="the-structure">The structure</h2>

<p>Our project will have this structure:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>.
├── MANIFEST.in
├── README.md
├── cuda
│   ├── include
│   │   ├── cuBLASMultiply.h
│   │   ├── tiledMultiply.h
│   │   └── utils.h
│   └── src
│       ├── cuBLASMultiply.cu
│       └── tiledMultiply.cu
├── pyproject.toml
├── scripts
│   ├── build.sh
│   ├── launch.sh
│   └── script.py
├── setup.py
├── src
│   └── bindings.cpp
└── tests
    └── test_matmul.py
</pre></td></tr></tbody></table></code></pre></div></div>

<p>We will expose two functions: <code class="language-plaintext highlighter-rouge">matmul</code>, a tiled matrix multiplication, and <code class="language-plaintext highlighter-rouge">matmul_cublas</code>, a matrix multiplication that uses the <a href="https://docs.nvidia.com/cuda/cublas/">cuBLAS</a> library.</p>

<h2 id="cuda-specific-files">CUDA specific files</h2>

<p>The <code class="language-plaintext highlighter-rouge">tiledMultiply.cu</code> file has these contents:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">"tiledMultiply.h"</span><span class="cp">
</span>
<span class="n">__global__</span> <span class="kt">void</span> <span class="nf">tiledMultiply</span><span class="p">(</span><span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">A</span><span class="p">,</span> <span class="c1">// M x K</span>
                              <span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">B</span><span class="p">,</span> <span class="c1">// K x N</span>
                              <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">C</span><span class="p">,</span>       <span class="c1">// M x N</span>
                              <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">M</span><span class="p">,</span>
                              <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">K</span><span class="p">,</span>
                              <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">N</span><span class="p">)</span> <span class="p">{</span>

    <span class="kt">int</span> <span class="n">by</span> <span class="o">=</span> <span class="n">blockIdx</span><span class="p">.</span><span class="n">y</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">bx</span> <span class="o">=</span> <span class="n">blockIdx</span><span class="p">.</span><span class="n">x</span><span class="p">;</span>

    <span class="kt">int</span> <span class="n">ty</span> <span class="o">=</span> <span class="n">threadIdx</span><span class="p">.</span><span class="n">y</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">tx</span> <span class="o">=</span> <span class="n">threadIdx</span><span class="p">.</span><span class="n">x</span><span class="p">;</span>

    <span class="c1">// global row/col this thread is responsible for</span>
    <span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="n">by</span> <span class="o">*</span> <span class="n">TILE</span> <span class="o">+</span> <span class="n">ty</span><span class="p">;</span>  <span class="c1">// row in C</span>
    <span class="kt">int</span> <span class="n">j</span> <span class="o">=</span> <span class="n">bx</span> <span class="o">*</span> <span class="n">TILE</span> <span class="o">+</span> <span class="n">tx</span><span class="p">;</span>  <span class="c1">// col in C</span>

    <span class="n">__shared__</span> <span class="kt">float</span> <span class="n">As</span><span class="p">[</span><span class="n">TILE</span><span class="p">][</span><span class="n">TILE</span><span class="p">];</span>
    <span class="n">__shared__</span> <span class="kt">float</span> <span class="n">Bs</span><span class="p">[</span><span class="n">TILE</span><span class="p">][</span><span class="n">TILE</span><span class="p">];</span>

    <span class="kt">float</span> <span class="n">value</span> <span class="o">=</span> <span class="mf">0.0f</span><span class="p">;</span>

    <span class="c1">// number of tiles along K</span>
    <span class="kt">int</span> <span class="n">numTiles</span> <span class="o">=</span> <span class="p">(</span><span class="n">K</span> <span class="o">+</span> <span class="n">TILE</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="n">TILE</span><span class="p">;</span>

    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">ph</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">ph</span> <span class="o">&lt;</span> <span class="n">numTiles</span><span class="p">;</span> <span class="o">++</span><span class="n">ph</span><span class="p">)</span> <span class="p">{</span>
        <span class="c1">// column in A, row in B that this thread wants to load</span>
        <span class="kt">int</span> <span class="n">aCol</span> <span class="o">=</span> <span class="n">ph</span> <span class="o">*</span> <span class="n">TILE</span> <span class="o">+</span> <span class="n">tx</span><span class="p">;</span>  <span class="c1">// along K</span>
        <span class="kt">int</span> <span class="n">bRow</span> <span class="o">=</span> <span class="n">ph</span> <span class="o">*</span> <span class="n">TILE</span> <span class="o">+</span> <span class="n">ty</span><span class="p">;</span>  <span class="c1">// along K</span>

        <span class="c1">// load A tile (row = i, col = aCol)</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">i</span> <span class="o">&lt;</span> <span class="n">M</span> <span class="o">&amp;&amp;</span> <span class="n">aCol</span> <span class="o">&lt;</span> <span class="n">K</span><span class="p">)</span>
            <span class="n">As</span><span class="p">[</span><span class="n">ty</span><span class="p">][</span><span class="n">tx</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="n">i</span> <span class="o">*</span> <span class="n">K</span> <span class="o">+</span> <span class="n">aCol</span><span class="p">];</span>
        <span class="k">else</span>
            <span class="n">As</span><span class="p">[</span><span class="n">ty</span><span class="p">][</span><span class="n">tx</span><span class="p">]</span> <span class="o">=</span> <span class="mf">0.0f</span><span class="p">;</span>

        <span class="c1">// load B tile (row = bRow, col = j)</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">bRow</span> <span class="o">&lt;</span> <span class="n">K</span> <span class="o">&amp;&amp;</span> <span class="n">j</span> <span class="o">&lt;</span> <span class="n">N</span><span class="p">)</span>
            <span class="n">Bs</span><span class="p">[</span><span class="n">ty</span><span class="p">][</span><span class="n">tx</span><span class="p">]</span> <span class="o">=</span> <span class="n">B</span><span class="p">[</span><span class="n">bRow</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">j</span><span class="p">];</span>
        <span class="k">else</span>
            <span class="n">Bs</span><span class="p">[</span><span class="n">ty</span><span class="p">][</span><span class="n">tx</span><span class="p">]</span> <span class="o">=</span> <span class="mf">0.0f</span><span class="p">;</span>

        <span class="c1">// sync all threads to make sure the tiles are loaded</span>
        <span class="n">__syncthreads</span><span class="p">();</span>

        <span class="cp">#pragma unroll
</span>        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">t</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">t</span> <span class="o">&lt;</span> <span class="n">TILE</span><span class="p">;</span> <span class="o">++</span><span class="n">t</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">value</span> <span class="o">+=</span> <span class="n">As</span><span class="p">[</span><span class="n">ty</span><span class="p">][</span><span class="n">t</span><span class="p">]</span> <span class="o">*</span> <span class="n">Bs</span><span class="p">[</span><span class="n">t</span><span class="p">][</span><span class="n">tx</span><span class="p">];</span>
        <span class="p">}</span>

        <span class="c1">// sync before loading the next tile</span>
        <span class="n">__syncthreads</span><span class="p">();</span>
    <span class="p">}</span>

    <span class="c1">// write back only if in-bounds</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">i</span> <span class="o">&lt;</span> <span class="p">(</span><span class="kt">int</span><span class="p">)</span><span class="n">M</span> <span class="o">&amp;&amp;</span> <span class="n">j</span> <span class="o">&lt;</span> <span class="p">(</span><span class="kt">int</span><span class="p">)</span><span class="n">N</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">C</span><span class="p">[</span><span class="n">i</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">value</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">tiledMultiply_call</span><span class="p">(</span><span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">A</span><span class="p">,</span>
                    <span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">B</span><span class="p">,</span>
                    <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">C</span><span class="p">,</span>
                    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">M</span><span class="p">,</span>
                    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">K</span><span class="p">,</span>
                    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">N</span><span class="p">){</span>
    <span class="n">dim3</span> <span class="n">threads</span><span class="p">(</span><span class="n">TILE</span><span class="p">,</span> <span class="n">TILE</span><span class="p">);</span>
    <span class="n">dim3</span> <span class="n">blocks</span><span class="p">((</span><span class="n">N</span> <span class="o">+</span> <span class="n">TILE</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="n">TILE</span><span class="p">,</span> <span class="p">(</span><span class="n">M</span> <span class="o">+</span> <span class="n">TILE</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="n">TILE</span><span class="p">);</span>
    <span class="n">tiledMultiply</span><span class="o">&lt;&lt;&lt;</span><span class="n">blocks</span><span class="p">,</span> <span class="n">threads</span><span class="o">&gt;&gt;&gt;</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">,</span> <span class="n">C</span><span class="p">,</span> <span class="n">M</span><span class="p">,</span> <span class="n">K</span><span class="p">,</span> <span class="n">N</span><span class="p">);</span>
    <span class="n">cudaDeviceSynchronize</span><span class="p">();</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>That is, a CUDA kernel <code class="language-plaintext highlighter-rouge">tiledMultiply</code> and a C++ function <code class="language-plaintext highlighter-rouge">tiledMultiply_call</code> that launches it. The header <code class="language-plaintext highlighter-rouge">tiledMultiply.h</code> declares only the latter, along with the <code class="language-plaintext highlighter-rouge">TILE</code> size:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="cp">#ifndef TILEDMULTIPLY_H
#define TILEDMULTIPLY_H
</span>
<span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;cuda_runtime.h&gt;</span><span class="cp">
</span>
<span class="cp"># define TILE 16
</span><span class="kt">void</span> <span class="nf">tiledMultiply_call</span><span class="p">(</span><span class="k">const</span> <span class="kt">float</span> <span class="o">*</span> <span class="n">__restrict__</span> <span class="n">A</span><span class="p">,</span>
    <span class="k">const</span> <span class="kt">float</span> <span class="o">*</span> <span class="n">__restrict__</span> <span class="n">B</span><span class="p">,</span>
    <span class="kt">float</span> <span class="o">*</span> <span class="n">__restrict__</span> <span class="n">C</span><span class="p">,</span>
    <span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">M</span><span class="p">,</span>
    <span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">K</span><span class="p">,</span>
    <span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">N</span><span class="p">);</span>

<span class="cp">#endif
</span></pre></td></tr></tbody></table></code></pre></div></div>
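<p>To make the tiling concrete, here is a minimal CPU-only sketch in plain Python that mirrors the structure of <code class="language-plaintext highlighter-rouge">tiledMultiply</code>: the two outer loops stand in for the block grid, the <code class="language-plaintext highlighter-rouge">ph</code> loop walks the tiles along K, and out-of-range entries are zero-padded just like in the kernel. The names <code class="language-plaintext highlighter-rouge">ceil_div</code>, <code class="language-plaintext highlighter-rouge">tiled_matmul</code> and <code class="language-plaintext highlighter-rouge">naive_matmul</code> are made up for this illustration and are not part of the repository, and the tile size is set to 4 only to keep the example small (the header uses 16).</p>

```python
TILE = 4  # assumption: small tile for illustration; the CUDA header defines TILE as 16


def ceil_div(a, b):
    """Number of size-b tiles needed to cover a elements, like (a + b - 1) / b in the launcher."""
    return (a + b - 1) // b


def tiled_matmul(A, B, M, K, N, tile=TILE):
    """Multiply row-major flat lists A (M x K) and B (K x N), one tile of K at a time."""
    C = [0.0] * (M * N)
    # The two outer loops play the role of the block grid (blockIdx.y, blockIdx.x).
    for by in range(ceil_div(M, tile)):
        for bx in range(ceil_div(N, tile)):
            # One accumulator per "thread" (ty, tx) of the block.
            acc = [[0.0] * tile for _ in range(tile)]
            for ph in range(ceil_div(K, tile)):
                # "Shared memory" tiles, zero-padded outside the matrices.
                As = [[0.0] * tile for _ in range(tile)]
                Bs = [[0.0] * tile for _ in range(tile)]
                for ty in range(tile):
                    for tx in range(tile):
                        i = by * tile + ty      # row in C
                        j = bx * tile + tx      # column in C
                        a_col = ph * tile + tx  # position along K for A
                        b_row = ph * tile + ty  # position along K for B
                        if i < M and a_col < K:
                            As[ty][tx] = A[i * K + a_col]
                        if b_row < K and j < N:
                            Bs[ty][tx] = B[b_row * N + j]
                # Partial dot products for this tile (the inner loop of the kernel).
                for ty in range(tile):
                    for tx in range(tile):
                        for t in range(tile):
                            acc[ty][tx] += As[ty][t] * Bs[t][tx]
            # Write back only the in-bounds results.
            for ty in range(tile):
                for tx in range(tile):
                    i = by * tile + ty
                    j = bx * tile + tx
                    if i < M and j < N:
                        C[i * N + j] = acc[ty][tx]
    return C


def naive_matmul(A, B, M, K, N):
    """Straightforward triple loop, used only as a reference."""
    return [
        sum(A[i * K + k] * B[k * N + j] for k in range(K))
        for i in range(M)
        for j in range(N)
    ]
```

<p>Since <code class="language-plaintext highlighter-rouge">ceil_div(17, 16)</code> is 2, a 17-row matrix needs two blocks along that axis, and the bounds checks make the second, mostly empty block harmless. The same rounding is what <code class="language-plaintext highlighter-rouge">(M + TILE - 1) / TILE</code> does in <code class="language-plaintext highlighter-rouge">tiledMultiply_call</code>.</p>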
<p>The kernel and its launcher are explained in detail in the <a href="../cuda-matrix-multiplication">CUDA Matrix Multiplication</a> post. The project also includes a <code class="language-plaintext highlighter-rouge">cuBLAS</code> equivalent, <code class="language-plaintext highlighter-rouge">cuBLASMultiply.cu</code>:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">"cuBLASMultiply.h"</span><span class="cp">
</span>
<span class="kt">void</span> <span class="nf">cuBLASmultiply_call</span><span class="p">(</span><span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">A</span><span class="p">,</span>
                    <span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">B</span><span class="p">,</span>
                    <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">C</span><span class="p">,</span>
                    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">M</span><span class="p">,</span>
                    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">K</span><span class="p">,</span>
                    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">N</span><span class="p">,</span>
                    <span class="n">cudaStream_t</span> <span class="n">stream</span><span class="p">){</span>

    
    <span class="n">cublasHandle_t</span> <span class="n">handle</span><span class="p">;</span>
    <span class="n">CHECK_CUDA_ERROR</span><span class="p">(</span><span class="n">cudaStreamCreate</span><span class="p">(</span><span class="o">&amp;</span><span class="n">stream</span><span class="p">));</span>
    <span class="n">CHECK_CUBLAS_ERROR</span><span class="p">(</span><span class="n">cublasCreate</span><span class="p">(</span><span class="o">&amp;</span><span class="n">handle</span><span class="p">));</span>
    <span class="n">CHECK_CUBLAS_ERROR</span><span class="p">(</span><span class="n">cublasSetStream</span><span class="p">(</span><span class="n">handle</span><span class="p">,</span> <span class="n">stream</span><span class="p">));</span>

    <span class="k">const</span> <span class="kt">float</span> <span class="n">alpha</span> <span class="o">=</span> <span class="mf">1.0f</span><span class="p">;</span>
    <span class="k">const</span> <span class="kt">float</span> <span class="n">beta</span>  <span class="o">=</span> <span class="mf">0.0f</span><span class="p">;</span>

    <span class="c1">// A: M x K (row-major)</span>
    <span class="c1">// B: K x N (row-major)</span>
    <span class="c1">// C: M x N (row-major)</span>

    <span class="c1">// We ask cuBLAS to compute: C^T = (B^T) * (A^T)</span>
    <span class="n">CHECK_CUBLAS_ERROR</span><span class="p">(</span>
        <span class="n">cublasSgemm</span><span class="p">(</span><span class="n">handle</span><span class="p">,</span>
            <span class="n">CUBLAS_OP_N</span><span class="p">,</span>
            <span class="n">CUBLAS_OP_N</span><span class="p">,</span>
            <span class="n">N</span><span class="p">,</span>               <span class="c1">// m = rows of C^T</span>
            <span class="n">M</span><span class="p">,</span>               <span class="c1">// n = cols of C^T</span>
            <span class="n">K</span><span class="p">,</span>               <span class="c1">// k</span>
            <span class="o">&amp;</span><span class="n">alpha</span><span class="p">,</span>
            <span class="n">B</span><span class="p">,</span> <span class="n">N</span><span class="p">,</span>            <span class="c1">// matrix A is B, leading dimension N</span>
            <span class="n">A</span><span class="p">,</span> <span class="n">K</span><span class="p">,</span>            <span class="c1">// matrix B is A, leading dimension K</span>
            <span class="o">&amp;</span><span class="n">beta</span><span class="p">,</span>
            <span class="n">C</span><span class="p">,</span> <span class="n">N</span><span class="p">)</span>
        <span class="p">);</span>
    <span class="n">CHECK_CUDA_ERROR</span><span class="p">(</span><span class="n">cudaStreamSynchronize</span><span class="p">(</span><span class="n">stream</span><span class="p">));</span>
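    <span class="n">CHECK_CUBLAS_ERROR</span><span class="p">(</span><span class="n">cublasDestroy</span><span class="p">(</span><span class="n">handle</span><span class="p">));</span>
    <span class="c1">// release the stream created at the top of this function so it does not leak</span>
    <span class="n">CHECK_CUDA_ERROR</span><span class="p">(</span><span class="n">cudaStreamDestroy</span><span class="p">(</span><span class="n">stream</span><span class="p">));</span>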
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>And the corresponding header <code class="language-plaintext highlighter-rouge">cuBLASMultiply.h</code>:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="cp">#ifndef CUBLASMATMULTIPLY_H
#define CUBLASMATMULTIPLY_H
</span>
<span class="cp">#include</span> <span class="cpf">"utils.h"</span><span class="cp">
#include</span><span class="cpf">&lt;cublas_v2.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;cuda_runtime.h&gt;</span><span class="cp">
</span>

<span class="kt">void</span> <span class="nf">cuBLASmultiply_call</span><span class="p">(</span><span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">A</span><span class="p">,</span>
                    <span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">B</span><span class="p">,</span>
                    <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">C</span><span class="p">,</span>
                    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">M</span><span class="p">,</span>
                    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">K</span><span class="p">,</span>
                    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">N</span><span class="p">,</span>
                    <span class="n">cudaStream_t</span> <span class="n">stream</span><span class="p">);</span>

<span class="cp">#endif
</span></pre></td></tr></tbody></table></code></pre></div></div>
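<p>The operand swap in the <code class="language-plaintext highlighter-rouge">cublasSgemm</code> call can look odd at first. cuBLAS assumes column-major storage, and a row-major M x K buffer read as column-major is the K x M transpose, so asking for B^T A^T produces C^T in column-major order, which is exactly C in row-major order. The sketch below checks this in plain Python; <code class="language-plaintext highlighter-rouge">gemm_col_major</code> is a hypothetical stand-in that only imitates the column-major indexing of <code class="language-plaintext highlighter-rouge">cublasSgemm</code> with no transposes, alpha = 1 and beta = 0, and it is not the cuBLAS API.</p>

```python
def gemm_col_major(m, n, k, A, lda, B, ldb, ldc):
    """Reference column-major GEMM: C (m x n) = A (m x k) * B (k x n).

    Hypothetical helper that follows the column-major convention where
    element (row, col) of a matrix lives at index row + col * ld.
    """
    C = [0.0] * (ldc * n)
    for col in range(n):
        for row in range(m):
            acc = 0.0
            for p in range(k):
                acc += A[row + p * lda] * B[p + col * ldb]
            C[row + col * ldc] = acc
    return C


def matmul_row_major(A, B, M, K, N):
    """Row-major C = A @ B using only the column-major routine, with operands swapped."""
    # Row-major B (K x N) read as column-major is B^T (N x K), leading dimension N.
    # Row-major A (M x K) read as column-major is A^T (K x M), leading dimension K.
    # The column-major result C^T (N x M), leading dimension N, is row-major C (M x N).
    return gemm_col_major(N, M, K, B, N, A, K, N)
```

<p>With A = [[1, 2, 3], [4, 5, 6]] and B = [[7, 8], [9, 10], [11, 12]] stored as flat row-major lists, <code class="language-plaintext highlighter-rouge">matmul_row_major(A, B, 2, 3, 2)</code> returns <code class="language-plaintext highlighter-rouge">[58.0, 64.0, 139.0, 154.0]</code>, the row-major product, without any explicit transpose.</p>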

<p>The <code class="language-plaintext highlighter-rouge">utils.h</code> header contains a set of very useful helpers for flagging errors in CUDA runtime calls. I won’t explain it here, but it is a great resource to copy and paste into any CUDA project.</p>

<h2 id="the-bindings">The bindings</h2>

<p>Of the two exposed functions, I’m only going to explain the tiled one; the cuBLAS binding is equivalent.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="n">py</span><span class="o">::</span><span class="n">array_t</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;</span> <span class="n">matmul_tiled</span><span class="p">(</span>
    <span class="n">py</span><span class="o">::</span><span class="n">array_t</span><span class="o">&lt;</span><span class="kt">float</span><span class="p">,</span> <span class="n">py</span><span class="o">::</span><span class="n">array</span><span class="o">::</span><span class="n">c_style</span> <span class="o">|</span> <span class="n">py</span><span class="o">::</span><span class="n">array</span><span class="o">::</span><span class="n">forcecast</span><span class="o">&gt;</span> <span class="n">A</span><span class="p">,</span>
    <span class="n">py</span><span class="o">::</span><span class="n">array_t</span><span class="o">&lt;</span><span class="kt">float</span><span class="p">,</span> <span class="n">py</span><span class="o">::</span><span class="n">array</span><span class="o">::</span><span class="n">c_style</span> <span class="o">|</span> <span class="n">py</span><span class="o">::</span><span class="n">array</span><span class="o">::</span><span class="n">forcecast</span><span class="o">&gt;</span> <span class="n">B</span><span class="p">)</span>
<span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">A</span><span class="p">.</span><span class="n">ndim</span><span class="p">()</span> <span class="o">!=</span> <span class="mi">2</span> <span class="o">||</span> <span class="n">B</span><span class="p">.</span><span class="n">ndim</span><span class="p">()</span> <span class="o">!=</span> <span class="mi">2</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">throw</span> <span class="n">std</span><span class="o">::</span><span class="n">runtime_error</span><span class="p">(</span><span class="s">"A and B must be 2D arrays"</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="k">const</span> <span class="k">auto</span> <span class="n">M</span>  <span class="o">=</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="kt">size_t</span><span class="o">&gt;</span><span class="p">(</span><span class="n">A</span><span class="p">.</span><span class="n">shape</span><span class="p">(</span><span class="mi">0</span><span class="p">));</span>
    <span class="k">const</span> <span class="k">auto</span> <span class="n">K</span>  <span class="o">=</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="kt">size_t</span><span class="o">&gt;</span><span class="p">(</span><span class="n">A</span><span class="p">.</span><span class="n">shape</span><span class="p">(</span><span class="mi">1</span><span class="p">));</span>
    <span class="k">const</span> <span class="k">auto</span> <span class="n">Kb</span> <span class="o">=</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="kt">size_t</span><span class="o">&gt;</span><span class="p">(</span><span class="n">B</span><span class="p">.</span><span class="n">shape</span><span class="p">(</span><span class="mi">0</span><span class="p">));</span>
    <span class="k">const</span> <span class="k">auto</span> <span class="n">N</span>  <span class="o">=</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="kt">size_t</span><span class="o">&gt;</span><span class="p">(</span><span class="n">B</span><span class="p">.</span><span class="n">shape</span><span class="p">(</span><span class="mi">1</span><span class="p">));</span>

    <span class="k">if</span> <span class="p">(</span><span class="n">K</span> <span class="o">!=</span> <span class="n">Kb</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">throw</span> <span class="n">std</span><span class="o">::</span><span class="n">runtime_error</span><span class="p">(</span><span class="s">"Inner dimensions must match: A(M,K) @ B(K,N)"</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="n">py</span><span class="o">::</span><span class="n">array_t</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;</span> <span class="n">C</span><span class="p">({</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="n">py</span><span class="o">::</span><span class="kt">ssize_t</span><span class="o">&gt;</span><span class="p">(</span><span class="n">M</span><span class="p">),</span>
                          <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">py</span><span class="o">::</span><span class="kt">ssize_t</span><span class="o">&gt;</span><span class="p">(</span><span class="n">N</span><span class="p">)});</span>

    <span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">hA</span> <span class="o">=</span> <span class="n">A</span><span class="p">.</span><span class="n">data</span><span class="p">();</span>
    <span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">hB</span> <span class="o">=</span> <span class="n">B</span><span class="p">.</span><span class="n">data</span><span class="p">();</span>
    <span class="kt">float</span><span class="o">*</span> <span class="n">hC</span>       <span class="o">=</span> <span class="n">C</span><span class="p">.</span><span class="n">mutable_data</span><span class="p">();</span>

    <span class="kt">float</span> <span class="o">*</span><span class="n">dA</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">,</span> <span class="o">*</span><span class="n">dB</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">,</span> <span class="o">*</span><span class="n">dC</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>

    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">bytesA</span> <span class="o">=</span> <span class="n">M</span> <span class="o">*</span> <span class="n">K</span> <span class="o">*</span> <span class="nf">sizeof</span><span class="p">(</span><span class="kt">float</span><span class="p">);</span>
    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">bytesB</span> <span class="o">=</span> <span class="n">K</span> <span class="o">*</span> <span class="n">N</span> <span class="o">*</span> <span class="nf">sizeof</span><span class="p">(</span><span class="kt">float</span><span class="p">);</span>
    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">bytesC</span> <span class="o">=</span> <span class="n">M</span> <span class="o">*</span> <span class="n">N</span> <span class="o">*</span> <span class="nf">sizeof</span><span class="p">(</span><span class="kt">float</span><span class="p">);</span>

    <span class="n">cuda_check</span><span class="p">(</span><span class="n">cudaMalloc</span><span class="p">(</span><span class="o">&amp;</span><span class="n">dA</span><span class="p">,</span> <span class="n">bytesA</span><span class="p">),</span> <span class="s">"cudaMalloc dA failed"</span><span class="p">);</span>
    <span class="n">cuda_check</span><span class="p">(</span><span class="n">cudaMalloc</span><span class="p">(</span><span class="o">&amp;</span><span class="n">dB</span><span class="p">,</span> <span class="n">bytesB</span><span class="p">),</span> <span class="s">"cudaMalloc dB failed"</span><span class="p">);</span>
    <span class="n">cuda_check</span><span class="p">(</span><span class="n">cudaMalloc</span><span class="p">(</span><span class="o">&amp;</span><span class="n">dC</span><span class="p">,</span> <span class="n">bytesC</span><span class="p">),</span> <span class="s">"cudaMalloc dC failed"</span><span class="p">);</span>

    <span class="n">cuda_check</span><span class="p">(</span><span class="n">cudaMemcpy</span><span class="p">(</span><span class="n">dA</span><span class="p">,</span> <span class="n">hA</span><span class="p">,</span> <span class="n">bytesA</span><span class="p">,</span> <span class="n">cudaMemcpyHostToDevice</span><span class="p">),</span>
               <span class="s">"cudaMemcpy A failed"</span><span class="p">);</span>
    <span class="n">cuda_check</span><span class="p">(</span><span class="n">cudaMemcpy</span><span class="p">(</span><span class="n">dB</span><span class="p">,</span> <span class="n">hB</span><span class="p">,</span> <span class="n">bytesB</span><span class="p">,</span> <span class="n">cudaMemcpyHostToDevice</span><span class="p">),</span>
               <span class="s">"cudaMemcpy B failed"</span><span class="p">);</span>

    <span class="n">tiledMultiply_call</span><span class="p">(</span><span class="n">dA</span><span class="p">,</span> <span class="n">dB</span><span class="p">,</span> <span class="n">dC</span><span class="p">,</span> <span class="n">M</span><span class="p">,</span> <span class="n">K</span><span class="p">,</span> <span class="n">N</span><span class="p">);</span>

    <span class="n">cuda_check</span><span class="p">(</span><span class="n">cudaGetLastError</span><span class="p">(),</span> <span class="s">"Kernel launch failed"</span><span class="p">);</span>
    <span class="n">cuda_check</span><span class="p">(</span><span class="n">cudaDeviceSynchronize</span><span class="p">(),</span> <span class="s">"cudaDeviceSynchronize failed"</span><span class="p">);</span>

    <span class="n">cuda_check</span><span class="p">(</span><span class="n">cudaMemcpy</span><span class="p">(</span><span class="n">hC</span><span class="p">,</span> <span class="n">dC</span><span class="p">,</span> <span class="n">bytesC</span><span class="p">,</span> <span class="n">cudaMemcpyDeviceToHost</span><span class="p">),</span>
               <span class="s">"cudaMemcpy C failed"</span><span class="p">);</span>

    <span class="n">cudaFree</span><span class="p">(</span><span class="n">dA</span><span class="p">);</span>
    <span class="n">cudaFree</span><span class="p">(</span><span class="n">dB</span><span class="p">);</span>
    <span class="n">cudaFree</span><span class="p">(</span><span class="n">dC</span><span class="p">);</span>

    <span class="k">return</span> <span class="n">C</span><span class="p">;</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>In this function we allocate device memory for the matrices <code class="language-plaintext highlighter-rouge">A</code>, <code class="language-plaintext highlighter-rouge">B</code> and <code class="language-plaintext highlighter-rouge">C</code>:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="n">cuda_check</span><span class="p">(</span><span class="n">cudaMalloc</span><span class="p">(</span><span class="o">&amp;</span><span class="n">dA</span><span class="p">,</span> <span class="n">bytesA</span><span class="p">),</span> <span class="s">"cudaMalloc dA failed"</span><span class="p">);</span>
<span class="n">cuda_check</span><span class="p">(</span><span class="n">cudaMalloc</span><span class="p">(</span><span class="o">&amp;</span><span class="n">dB</span><span class="p">,</span> <span class="n">bytesB</span><span class="p">),</span> <span class="s">"cudaMalloc dB failed"</span><span class="p">);</span>
<span class="n">cuda_check</span><span class="p">(</span><span class="n">cudaMalloc</span><span class="p">(</span><span class="o">&amp;</span><span class="n">dC</span><span class="p">,</span> <span class="n">bytesC</span><span class="p">),</span> <span class="s">"cudaMalloc dC failed"</span><span class="p">);</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>then we copy the input matrices from host to device:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre><span class="n">cuda_check</span><span class="p">(</span><span class="n">cudaMemcpy</span><span class="p">(</span><span class="n">dA</span><span class="p">,</span> <span class="n">hA</span><span class="p">,</span> <span class="n">bytesA</span><span class="p">,</span> <span class="n">cudaMemcpyHostToDevice</span><span class="p">),</span>
               <span class="s">"cudaMemcpy A failed"</span><span class="p">);</span>
<span class="n">cuda_check</span><span class="p">(</span><span class="n">cudaMemcpy</span><span class="p">(</span><span class="n">dB</span><span class="p">,</span> <span class="n">hB</span><span class="p">,</span> <span class="n">bytesB</span><span class="p">,</span> <span class="n">cudaMemcpyHostToDevice</span><span class="p">),</span>
               <span class="s">"cudaMemcpy B failed"</span><span class="p">);</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>launch the computation on the GPU:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="n">tiledMultiply_call</span><span class="p">(</span><span class="n">dA</span><span class="p">,</span> <span class="n">dB</span><span class="p">,</span> <span class="n">dC</span><span class="p">,</span> <span class="n">M</span><span class="p">,</span> <span class="n">K</span><span class="p">,</span> <span class="n">N</span><span class="p">);</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and copy the computed matrix <code class="language-plaintext highlighter-rouge">C</code> back to the host:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="n">cuda_check</span><span class="p">(</span><span class="n">cudaMemcpy</span><span class="p">(</span><span class="n">hC</span><span class="p">,</span> <span class="n">dC</span><span class="p">,</span> <span class="n">bytesC</span><span class="p">,</span> <span class="n">cudaMemcpyDeviceToHost</span><span class="p">),</span>
               <span class="s">"cudaMemcpy C failed"</span><span class="p">);</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>before freeing the device memory. We decided to encapsulate all the memory-management logic here in the bindings file. We could instead have written a separate C++ function holding this code and called it from the bindings, but I’m trying to keep things simple for this example.</p>

<h2 id="python-project-specifics">Python project specifics</h2>

<p>The Python project first needs a <code class="language-plaintext highlighter-rouge">pyproject.toml</code> to declare the build system:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>[build-system]
requires = ["setuptools==70.3.0", "wheel", "pybind11&gt;=2.6", "numpy"]
build-backend = "setuptools.build_meta"
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then we need a <code class="language-plaintext highlighter-rouge">setup.py</code>, which specifies how to compile and build the code. It is the traditional setuptools build script. Let’s go through the file in parts. At the beginning we have:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
</pre></td><td class="rouge-code"><pre><span class="kn">import</span> <span class="n">os</span>
<span class="kn">import</span> <span class="n">sys</span>
<span class="kn">import</span> <span class="n">subprocess</span>
<span class="kn">import</span> <span class="n">sysconfig</span>
<span class="kn">from</span> <span class="n">pathlib</span> <span class="kn">import</span> <span class="n">Path</span>

<span class="kn">from</span> <span class="n">setuptools</span> <span class="kn">import</span> <span class="n">setup</span><span class="p">,</span> <span class="n">Extension</span>
<span class="kn">from</span> <span class="n">setuptools.command.build_ext</span> <span class="kn">import</span> <span class="n">build_ext</span>
<span class="kn">import</span> <span class="n">pybind11</span>

<span class="n">REPO_PATH</span> <span class="o">=</span> <span class="nc">Path</span><span class="p">(</span><span class="n">__file__</span><span class="p">).</span><span class="nf">resolve</span><span class="p">().</span><span class="n">parent</span>

<span class="n">python_include_path</span> <span class="o">=</span> <span class="n">sysconfig</span><span class="p">.</span><span class="nf">get_path</span><span class="p">(</span><span class="sh">"</span><span class="s">include</span><span class="sh">"</span><span class="p">)</span>

<span class="n">CUDA_HOME</span> <span class="o">=</span> <span class="sh">"</span><span class="s">/usr/local/cuda</span><span class="sh">"</span>
<span class="n">CUDA_INCLUDE_DIR</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">CUDA_HOME</span><span class="p">,</span> <span class="sh">"</span><span class="s">include</span><span class="sh">"</span><span class="p">)</span>
<span class="n">CUDA_LIB_DIR</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">CUDA_HOME</span><span class="p">,</span> <span class="sh">"</span><span class="s">lib64</span><span class="sh">"</span><span class="p">)</span>
<span class="n">PACKAGE_NAME</span> <span class="o">=</span> <span class="sh">"</span><span class="s">matmul</span><span class="sh">"</span>

<span class="n">INCLUDE_DIRS</span> <span class="o">=</span> <span class="p">[</span>
    <span class="sh">"</span><span class="s">cuda/include</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">CUDA_INCLUDE_DIR</span><span class="p">,</span>
    <span class="n">python_include_path</span><span class="p">,</span>
    <span class="n">pybind11</span><span class="p">.</span><span class="nf">get_include</span><span class="p">(),</span>
<span class="p">]</span>

<span class="n">LIBRARY_DIRS</span> <span class="o">=</span> <span class="p">[</span><span class="n">CUDA_LIB_DIR</span><span class="p">]</span>
<span class="n">LIBRARIES</span> <span class="o">=</span> <span class="p">[</span><span class="sh">"</span><span class="s">cudart</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">cublas</span><span class="sh">"</span><span class="p">]</span>

<span class="n">CXX_FLAGS</span> <span class="o">=</span> <span class="p">[</span><span class="sh">"</span><span class="s">-std=c++17</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">-O3</span><span class="sh">"</span><span class="p">]</span>
<span class="n">NVCC_FLAGS</span> <span class="o">=</span> <span class="p">[</span>
    <span class="sh">"</span><span class="s">-std=c++17</span><span class="sh">"</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">-O3</span><span class="sh">"</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">-Xcompiler</span><span class="sh">"</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">-fPIC</span><span class="sh">"</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">-arch=sm_80</span><span class="sh">"</span><span class="p">,</span>
<span class="p">]</span>

<span class="n">SRC_FILES</span> <span class="o">=</span> <span class="p">[</span>
    <span class="sh">"</span><span class="s">src/bindings.cpp</span><span class="sh">"</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">cuda/src/tiledMultiply.cu</span><span class="sh">"</span><span class="p">,</span>
    <span class="sh">"</span><span class="s">cuda/src/cuBLASMultiply.cu</span><span class="sh">"</span><span class="p">,</span>
<span class="p">]</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">python_include_path</code> is the path to the Python headers (basically <code class="language-plaintext highlighter-rouge">Python.h</code>). <code class="language-plaintext highlighter-rouge">CUDA_HOME</code> is the path to the CUDA libraries and includes; it can change depending on the system, so adjust it accordingly (you can also use CMake if you are up for it). We then collect the include directories in the <code class="language-plaintext highlighter-rouge">INCLUDE_DIRS</code> list, the library directories in <code class="language-plaintext highlighter-rouge">LIBRARY_DIRS</code>, the libraries to link against in <code class="language-plaintext highlighter-rouge">LIBRARIES</code>, and the flags for the C++ and nvcc compilers in <code class="language-plaintext highlighter-rouge">CXX_FLAGS</code> and <code class="language-plaintext highlighter-rouge">NVCC_FLAGS</code>. Note that <code class="language-plaintext highlighter-rouge">-arch=sm_80</code> targets compute capability 8.0, so change it to match your GPU. Finally, <code class="language-plaintext highlighter-rouge">SRC_FILES</code> lists the sources that we are going to compile.</p>
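
<p>As a minimal sketch of how the CUDA location could be adjusted without editing the file, the path can be read from the environment and fall back to the hard-coded default. The helper name <code class="language-plaintext highlighter-rouge">resolve_cuda_home</code> and the use of the <code class="language-plaintext highlighter-rouge">CUDA_HOME</code> and <code class="language-plaintext highlighter-rouge">CUDA_PATH</code> environment variables are assumptions for illustration; they are not part of the original <code class="language-plaintext highlighter-rouge">setup.py</code>:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import os


def resolve_cuda_home(environ=None, default="/usr/local/cuda"):
    """Return the CUDA root: the first non-empty env variable, else the default."""
    environ = os.environ if environ is None else environ
    # CUDA_HOME / CUDA_PATH are common conventions, assumed here for illustration.
    for name in ("CUDA_HOME", "CUDA_PATH"):
        value = environ.get(name)
        if value:
            return value
    return default


CUDA_HOME = resolve_cuda_home()
CUDA_INCLUDE_DIR = os.path.join(CUDA_HOME, "include")
CUDA_LIB_DIR = os.path.join(CUDA_HOME, "lib64")
</code></pre></div></div>

<p>With something like this in place, a machine with CUDA installed elsewhere could be handled by exporting the variable before running <code class="language-plaintext highlighter-rouge">pip install</code>.</p>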

<p>Next we need to check that the machine has the <code class="language-plaintext highlighter-rouge">nvcc</code> compiler:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="k">try</span><span class="p">:</span>
    <span class="n">subprocess</span><span class="p">.</span><span class="nf">check_call</span><span class="p">([</span><span class="sh">"</span><span class="s">nvcc</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">--version</span><span class="sh">"</span><span class="p">])</span>
<span class="k">except</span> <span class="nb">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="nf">print</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">nvcc compiler for CUDA not found: </span><span class="si">{</span><span class="n">e</span><span class="si">}</span><span class="s">; exiting</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">sys</span><span class="p">.</span><span class="nf">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>
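
<p>A possible variation on this check, sketched under the assumption that it is enough to find <code class="language-plaintext highlighter-rouge">nvcc</code> on the <code class="language-plaintext highlighter-rouge">PATH</code>: <code class="language-plaintext highlighter-rouge">shutil.which</code> looks the executable up without running it, and raising an exception leaves the caller to decide how to report the failure. The helper name <code class="language-plaintext highlighter-rouge">require_tool</code> is hypothetical:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import shutil


def require_tool(name, which=shutil.which):
    """Return the absolute path of an executable, or raise if it is missing."""
    path = which(name)
    if path is None:
        raise RuntimeError(f"{name} not found on PATH; is the CUDA toolkit installed?")
    return path
</code></pre></div></div>

<p>In <code class="language-plaintext highlighter-rouge">setup.py</code> this could be used as <code class="language-plaintext highlighter-rouge">nvcc = require_tool("nvcc")</code> before calling <code class="language-plaintext highlighter-rouge">setup</code>.</p>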

<p>And now we need to write the compilation logic:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
</pre></td><td class="rouge-code"><pre><span class="k">class</span> <span class="nc">BuildExtCUDA</span><span class="p">(</span><span class="n">build_ext</span><span class="p">):</span>
    <span class="sh">"""</span><span class="s">Compile .cu files with nvcc, others with the normal C++ compiler.</span><span class="sh">"""</span>

    <span class="k">def</span> <span class="nf">build_extensions</span><span class="p">(</span><span class="n">self</span><span class="p">):</span>
        <span class="kn">from</span> <span class="n">distutils.sysconfig</span> <span class="kn">import</span> <span class="n">customize_compiler</span>

        <span class="n">compiler</span> <span class="o">=</span> <span class="n">self</span><span class="p">.</span><span class="n">compiler</span>
        <span class="nf">customize_compiler</span><span class="p">(</span><span class="n">compiler</span><span class="p">)</span>

        <span class="c1"># Let distutils know about .cu files
</span>        <span class="k">if</span> <span class="sh">"</span><span class="s">.cu</span><span class="sh">"</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">compiler</span><span class="p">.</span><span class="n">src_extensions</span><span class="p">:</span>
            <span class="n">compiler</span><span class="p">.</span><span class="n">src_extensions</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sh">"</span><span class="s">.cu</span><span class="sh">"</span><span class="p">)</span>

        <span class="n">default_compile</span> <span class="o">=</span> <span class="n">compiler</span><span class="p">.</span><span class="n">_compile</span>
        <span class="n">nvcc</span> <span class="o">=</span> <span class="sh">"</span><span class="s">nvcc</span><span class="sh">"</span>

        <span class="k">def</span> <span class="nf">_compile</span><span class="p">(</span><span class="n">obj</span><span class="p">,</span> <span class="n">src</span><span class="p">,</span> <span class="n">ext</span><span class="p">,</span> <span class="n">cc_args</span><span class="p">,</span> <span class="n">extra_postargs</span><span class="p">,</span> <span class="n">pp_opts</span><span class="p">):</span>
            <span class="k">if</span> <span class="n">src</span><span class="p">.</span><span class="nf">endswith</span><span class="p">(</span><span class="sh">"</span><span class="s">.cu</span><span class="sh">"</span><span class="p">):</span>
                <span class="c1"># nvcc compile
</span>                <span class="n">cmd</span> <span class="o">=</span> <span class="p">[</span><span class="n">nvcc</span><span class="p">,</span> <span class="sh">"</span><span class="s">-c</span><span class="sh">"</span><span class="p">,</span> <span class="n">src</span><span class="p">,</span> <span class="sh">"</span><span class="s">-o</span><span class="sh">"</span><span class="p">,</span> <span class="n">obj</span><span class="p">]</span> <span class="o">+</span> <span class="n">NVCC_FLAGS</span>
                <span class="k">for</span> <span class="n">inc</span> <span class="ow">in</span> <span class="n">INCLUDE_DIRS</span><span class="p">:</span>
                    <span class="n">cmd</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">-I</span><span class="si">{</span><span class="n">inc</span><span class="si">}</span><span class="sh">"</span><span class="p">)</span>
                <span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="s">NVCC:</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s"> </span><span class="sh">"</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="n">cmd</span><span class="p">))</span>
                <span class="n">self</span><span class="p">.</span><span class="nf">spawn</span><span class="p">(</span><span class="n">cmd</span><span class="p">)</span>
            <span class="k">else</span><span class="p">:</span>
                <span class="c1"># normal C++ compile
</span>                <span class="n">extra_postargs</span> <span class="o">=</span> <span class="nf">list</span><span class="p">(</span><span class="n">extra_postargs</span> <span class="ow">or</span> <span class="p">[])</span> <span class="o">+</span> <span class="n">CXX_FLAGS</span>
                <span class="nf">default_compile</span><span class="p">(</span><span class="n">obj</span><span class="p">,</span> <span class="n">src</span><span class="p">,</span> <span class="n">ext</span><span class="p">,</span> <span class="n">cc_args</span><span class="p">,</span> <span class="n">extra_postargs</span><span class="p">,</span> <span class="n">pp_opts</span><span class="p">)</span>

        <span class="n">compiler</span><span class="p">.</span><span class="n">_compile</span> <span class="o">=</span> <span class="n">_compile</span>
        <span class="nf">super</span><span class="p">().</span><span class="nf">build_extensions</span><span class="p">()</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This is a subclass of the <code class="language-plaintext highlighter-rouge">build_ext</code> class. We override <code class="language-plaintext highlighter-rouge">build_extensions</code> and, inside it, replace the compiler’s <code class="language-plaintext highlighter-rouge">_compile</code> method with our own wrapper. If the source file ends with <code class="language-plaintext highlighter-rouge">.cu</code>, we build the command <code class="language-plaintext highlighter-rouge">cmd = [nvcc, "-c", src, "-o", obj] + NVCC_FLAGS</code> and then append the include directories one by one in a for loop. Finally this command is run as a subprocess through <code class="language-plaintext highlighter-rouge">self.spawn</code>. If the file does not end with <code class="language-plaintext highlighter-rouge">.cu</code>, we assume it is C++ and compile it with the default compiler (usually <code class="language-plaintext highlighter-rouge">g++</code> or <code class="language-plaintext highlighter-rouge">gcc</code>).</p>
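
<p>To make the shape of that command concrete, here is a small standalone sketch that assembles the same argument list outside of setuptools. The function name <code class="language-plaintext highlighter-rouge">build_nvcc_command</code> and the object path <code class="language-plaintext highlighter-rouge">build/tiledMultiply.o</code> are hypothetical; in the real build the object path is chosen by setuptools:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def build_nvcc_command(src, obj, flags, include_dirs, nvcc="nvcc"):
    """Return the argument list that compiles one .cu file into an object file."""
    cmd = [nvcc, "-c", src, "-o", obj] + list(flags)
    # One -I option per include directory, mirroring the for loop in _compile.
    cmd += [f"-I{inc}" for inc in include_dirs]
    return cmd


cmd = build_nvcc_command(
    "cuda/src/tiledMultiply.cu",
    "build/tiledMultiply.o",  # hypothetical output path
    ["-std=c++17", "-O3", "-Xcompiler", "-fPIC", "-arch=sm_80"],
    ["cuda/include", "/usr/local/cuda/include"],
)
print(" ".join(cmd))
</code></pre></div></div>

<p>Printing the joined command, as the <code class="language-plaintext highlighter-rouge">NVCC:</code> line in the build log does, is a handy way to reproduce a failing compile step by hand.</p>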

<p>Now we define <code class="language-plaintext highlighter-rouge">ext_modules</code> and call <code class="language-plaintext highlighter-rouge">setup</code>:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
</pre></td><td class="rouge-code"><pre><span class="n">ext_modules</span> <span class="o">=</span> <span class="p">[</span>
    <span class="nc">Extension</span><span class="p">(</span>
        <span class="n">PACKAGE_NAME</span><span class="p">,</span>
        <span class="n">sources</span><span class="o">=</span><span class="n">SRC_FILES</span><span class="p">,</span>
        <span class="n">include_dirs</span><span class="o">=</span><span class="n">INCLUDE_DIRS</span><span class="p">,</span>
        <span class="n">library_dirs</span><span class="o">=</span><span class="n">LIBRARY_DIRS</span><span class="p">,</span>
        <span class="n">libraries</span><span class="o">=</span><span class="n">LIBRARIES</span><span class="p">,</span>
        <span class="n">language</span><span class="o">=</span><span class="sh">"</span><span class="s">c++</span><span class="sh">"</span><span class="p">,</span>
    <span class="p">)</span>
<span class="p">]</span>

<span class="nf">setup</span><span class="p">(</span>
    <span class="n">name</span><span class="o">=</span><span class="n">PACKAGE_NAME</span><span class="p">,</span>
    <span class="n">version</span><span class="o">=</span><span class="sh">"</span><span class="s">0.1.0</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">description</span><span class="o">=</span><span class="sh">"</span><span class="s">CUDA tiled matrix multiplication exposed to Python via pybind11</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">author</span><span class="o">=</span><span class="sh">"</span><span class="s">Sebastia Agramunt Puig</span><span class="sh">"</span><span class="p">,</span>
    <span class="n">ext_modules</span><span class="o">=</span><span class="n">ext_modules</span><span class="p">,</span>
    <span class="n">cmdclass</span><span class="o">=</span><span class="p">{</span><span class="sh">"</span><span class="s">build_ext</span><span class="sh">"</span><span class="p">:</span> <span class="n">BuildExtCUDA</span><span class="p">},</span>
    <span class="n">zip_safe</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
    <span class="n">install_requires</span><span class="o">=</span><span class="p">[</span>
        <span class="sh">"</span><span class="s">numpy</span><span class="sh">"</span><span class="p">,</span>
    <span class="p">],</span>
<span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

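<p>In <code class="language-plaintext highlighter-rouge">setup</code> we specify that the extension modules are built through the <code class="language-plaintext highlighter-rouge">BuildExtCUDA</code> class by registering it in <code class="language-plaintext highlighter-rouge">cmdclass</code>. That’s it! This makes the package compilable and pip-installable.</p>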

<h2 id="the-manifest">The manifest</h2>

<p>The <code class="language-plaintext highlighter-rouge">MANIFEST.in</code> file is crucial when creating wheels. In it we tell the Python build which additional files to include in the source distribution:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>recursive-include cuda/include *.h
recursive-include cuda/src *.cu
recursive-include src *.h *.cpp
</pre></td></tr></tbody></table></code></pre></div></div>

<p>We especially need the headers; otherwise the build from the source distribution fails, because the sources need the function declarations they contain.</p>
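
<p>One way to check that the manifest does its job is to look inside the built source distribution. The following is only a sketch: the helper name <code class="language-plaintext highlighter-rouge">missing_from_sdist</code> is hypothetical, and it assumes the sdist is a gzip-compressed tarball:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import tarfile


def missing_from_sdist(sdist_path, required_suffixes=(".h", ".cu", ".cpp")):
    """Return the suffixes from required_suffixes that no file in the sdist has."""
    with tarfile.open(sdist_path, "r:gz") as archive:
        names = archive.getnames()
    return [s for s in required_suffixes if not any(n.endswith(s) for n in names)]
</code></pre></div></div>

<p>After a build, calling it on the tarball in <code class="language-plaintext highlighter-rouge">dist</code> (the exact file name depends on the package name and version) should return an empty list if the headers and sources were picked up.</p>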

<h2 id="install-the-package">Install the package</h2>

<p>Just create a new environment and <code class="language-plaintext highlighter-rouge">pip install</code> the package:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-rf</span> .venv
python <span class="nt">-m</span> venv .venv
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">--upgrade</span> pip
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nb">.</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now you can run the script to test your code:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv/bin/python scripts/script.py
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="testing">Testing</h2>

<p>As usual I include some tests. It’s very important to always test your code! The tests are very simple: they just check that <code class="language-plaintext highlighter-rouge">matmul</code> and <code class="language-plaintext highlighter-rouge">matmul_cublas</code> yield the same result. To execute the tests, run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>pytest
.venv/bin/pytest <span class="nt">-v</span> <span class="nb">.</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>after installing the Python environment.</p>

<h2 id="building-a-wheel">Building a wheel</h2>

<p>For convenience I included a bash script, <code class="language-plaintext highlighter-rouge">scripts/build.sh</code>, to build the wheels. Just execute the following:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="nv">TASK</span><span class="o">=</span>install_environment  ./scripts/build.sh
<span class="nv">TASK</span><span class="o">=</span>run_tests  ./scripts/build.sh
<span class="nv">TASK</span><span class="o">=</span>build_wheel  ./scripts/build.sh
<span class="nv">TASK</span><span class="o">=</span>test_install_wheel  ./scripts/build.sh
<span class="nv">TASK</span><span class="o">=</span>cleanup  ./scripts/build.sh
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">build_wheel</code> task builds the wheel and places it in the <code class="language-plaintext highlighter-rouge">wheelhouse</code> directory. Let’s inspect this function:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
</pre></td><td class="rouge-code"><pre>build_wheel<span class="o">(){</span>
    
    <span class="nb">rm</span> <span class="nt">-rf</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/dist
    <span class="nb">rm</span> <span class="nt">-rf</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build

    <span class="c"># create blank environment</span>
    <span class="nb">rm</span> <span class="nt">-rf</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/<span class="k">${</span><span class="nv">ENV_NAME</span><span class="k">}</span>
    python <span class="nt">-m</span> venv <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/<span class="k">${</span><span class="nv">ENV_NAME</span><span class="k">}</span>
    <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/<span class="k">${</span><span class="nv">ENV_NAME</span><span class="k">}</span>/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">--upgrade</span> pip

    <span class="c"># activate, install pkgs and build wheel</span>
    <span class="nb">source</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/<span class="k">${</span><span class="nv">ENV_NAME</span><span class="k">}</span>/bin/activate
    pip <span class="nb">install </span>wheel pybind11 auditwheel repairwheel patchelf build
    pip <span class="nb">install </span><span class="nv">setuptools</span><span class="o">==</span>70.3.0
    python <span class="nt">-m</span> build

    <span class="k">if</span> <span class="o">[</span> <span class="si">$(</span><span class="nb">arch</span><span class="si">)</span> <span class="o">=</span> <span class="s2">"x86_64"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
        </span><span class="nv">platform</span><span class="o">=</span><span class="s2">"manylinux_2_34_x86_64"</span>
    <span class="k">elif</span> <span class="o">[</span> <span class="si">$(</span><span class="nb">arch</span><span class="si">)</span> <span class="o">=</span> <span class="s2">"aarch64"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
        </span><span class="nv">platform</span><span class="o">=</span><span class="s2">"manylinux_2_34_aarch64"</span>
    <span class="k">else
        </span><span class="nb">echo</span> <span class="s2">"ERROR: Unknown architecture"</span>
        <span class="nb">exit </span>1<span class="p">;</span>
    <span class="k">fi

    </span>auditwheel repair <span class="nt">--exclude</span> libcu<span class="k">*</span> <span class="se">\</span>
                      <span class="si">$(</span><span class="nb">ls </span>dist/<span class="k">*</span>.whl | <span class="nb">head</span> <span class="nt">-n</span> 1<span class="si">)</span> <span class="se">\</span>
                      <span class="nt">--plat</span> <span class="k">${</span><span class="nv">platform</span><span class="k">}</span> <span class="se">\</span>
                      <span class="nt">-w</span> wheelhouse
<span class="o">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The function removes the <code class="language-plaintext highlighter-rouge">dist</code> and <code class="language-plaintext highlighter-rouge">build</code> directories to start fresh. It then creates a new Python environment, activates it, and installs <code class="language-plaintext highlighter-rouge">wheel</code>, <code class="language-plaintext highlighter-rouge">pybind11</code>, <code class="language-plaintext highlighter-rouge">auditwheel</code>, <code class="language-plaintext highlighter-rouge">repairwheel</code>, <code class="language-plaintext highlighter-rouge">patchelf</code>, <code class="language-plaintext highlighter-rouge">build</code> and <code class="language-plaintext highlighter-rouge">setuptools</code> (pinned to version 70.3.0).</p>

<p>The step that actually builds the wheel is <code class="language-plaintext highlighter-rouge">python -m build</code>, which writes the wheel to the <code class="language-plaintext highlighter-rouge">dist</code> directory. At this point we need to repair the wheel: Linux systems ship different versions of the <code class="language-plaintext highlighter-rouge">GLIBC</code> library, and in this case we want the wheel to be compatible with version 2.34 (see <a href="https://ftp.gnu.org/gnu/glibc/">list of versions</a>) and later. For that we set the platform tag, for example <code class="language-plaintext highlighter-rouge">platform="manylinux_2_34_x86_64"</code> on x86_64. The next instruction “repairs” the wheel</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre>auditwheel repair <span class="nt">--exclude</span> libcu<span class="k">*</span> <span class="se">\</span>
                    <span class="si">$(</span><span class="nb">ls </span>dist/<span class="k">*</span>.whl | <span class="nb">head</span> <span class="nt">-n</span> 1<span class="si">)</span> <span class="se">\</span>
                    <span class="nt">--plat</span> <span class="k">${</span><span class="nv">platform</span><span class="k">}</span> <span class="se">\</span>
                    <span class="nt">-w</span> wheelhouse
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and excludes all libraries whose names start with <code class="language-plaintext highlighter-rouge">libcu</code>. This is key: by default auditwheel copies the external shared libraries the extension links against into the wheel, so that the wheel is self-contained and there’s no need to install them separately. We exclude the CUDA libraries because any machine that runs a CUDA program must already have them installed, so bundling them would only duplicate them. They are also quite large, so they would bloat the wheel.</p>

<p>After running <code class="language-plaintext highlighter-rouge">auditwheel</code>, the repaired wheel appears in the directory we passed with <code class="language-plaintext highlighter-rouge">-w</code>, that is <code class="language-plaintext highlighter-rouge">wheelhouse</code>.</p>
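<p>To double-check the result, a wheel can be opened as a regular zip archive. The following is a minimal Python sketch, not part of the repository: the helper names are made up for illustration, and it assumes the repaired wheels sit in <code class="language-plaintext highlighter-rouge">wheelhouse</code>. It reads the platform tag from the file name and lists any bundled library whose file name starts with <code class="language-plaintext highlighter-rouge">libcu</code>, which should be none after using <code class="language-plaintext highlighter-rouge">--exclude</code>.</p>

```python
import zipfile
from pathlib import Path


def wheel_platform_tag(wheel_path):
    """Return the platform tag: the last dash-separated field of the wheel file name."""
    return Path(wheel_path).stem.split("-")[-1]


def bundled_shared_objects(wheel_path):
    """List every shared object (.so or versioned .so.N) stored inside the wheel."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            name for name in wheel.namelist()
            if name.endswith(".so") or ".so." in name
        )


def bundled_cuda_libraries(wheel_path):
    """Return the bundled libraries whose file name starts with 'libcu'."""
    return [
        name for name in bundled_shared_objects(wheel_path)
        if Path(name).name.startswith("libcu")
    ]


if __name__ == "__main__":
    # Hypothetical usage: inspect whatever auditwheel wrote to ./wheelhouse
    for wheel in sorted(Path("wheelhouse").glob("*.whl")):
        print(wheel.name)
        print("  platform tag:", wheel_platform_tag(wheel))
        print("  bundled CUDA libraries:", bundled_cuda_libraries(wheel))
```

<p>If the last list is not empty, the exclude pattern did not match and the wheel is carrying CUDA libraries it does not need.</p>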

<p>In the official documentation of <a href="https://github.com/pypa/auditwheel">auditwheel</a> the developers use docker images listed in <a href="https://quay.io/organization/pypa">https://quay.io</a>. Those images work well for C++-only code but not for CUDA code, which is why I had to come up with a manual way to build and repair the wheel.</p>

<p>Another problem with CUDA extensions is that, for now, the standard GitHub-hosted runners don’t have GPUs (i.e. they can’t run CUDA code), so if you want to implement proper CI/CD you need to create your own pipeline on a custom machine. <a href="https://www.jenkins.io/">Jenkins</a> could be a good tool for that. I used commercial software that my employer provided, but it’s essentially the same (bash scripts here and there).</p>

<h2 id="conclusions">Conclusions</h2>

<p>This is a very simple example on how to create a Python package that uses CUDA in the backend. You can complicate this further, add other C++ implementation, more functions, more tests… But I hope by reading this you could understand the basics and have the tooling to build your first python bindings for CUDA.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="C++" /><category term="Python" /><category term="CUDA" /><category term="computer science" /><summary type="html"><![CDATA[In this post we will learn how to expose CUDA functionality in Python so effectively calling your custom CUDA code from Python without much hassle.]]></summary></entry><entry><title type="html">C++ Python package boilerplate</title><link href="https://agramunt.me/posts/cpp-python-boilerplate/" rel="alternate" type="text/html" title="C++ Python package boilerplate" /><published>2025-11-22T18:28:00-08:00</published><updated>2025-11-22T18:28:00-08:00</updated><id>https://agramunt.me/posts/cpp-python-boilerplate</id><content type="html" xml:base="https://agramunt.me/posts/cpp-python-boilerplate/"><![CDATA[<p>Previously in <a href="../../posts/cpp-python-extension">C++ basic Python extension</a> we learned the basic mechanism on building a C++ extension for Python. Here in this post we will be more practical and we will create a full end to end package that is fully tested and builds the wheels for different platforms and architectures. As before we will use pybind11 to create the bindings. This entire repository <a href="https://github.com/SebastiaAgramunt/python-boilerplate">python-boilerplate</a> lives in my GitHub account and not in the <a href="https://github.com/SebastiaAgramunt/blogging-code">blogging-code</a> where I usually publish.</p>

<h2 id="project-structure">Project structure</h2>

<p>The files and directories are the following, as shown by the <code class="language-plaintext highlighter-rouge">tree</code> command</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
</pre></td><td class="rouge-code"><pre><span class="nb">.</span>
├── Docker
│   ├── Dockerfile-python-3.13
│   └── build-run.sh
├── MANIFEST.in
├── README.md
├── include
│   └── matmul.h
├── pyproject.toml
├── scripts
│   ├── build_wheel.sh
│   └── example.py
├── setup.py
├── src
│   ├── bindings.cpp
│   ├── matmul.cpp
│   └── package_example
│       ├── __init__.py
│       └── operations.py
└── tests
    └── test_matmul.py
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">src</code> directory holds all the <code class="language-plaintext highlighter-rouge">*.cpp</code> files, including <code class="language-plaintext highlighter-rouge">bindings.cpp</code>, which is written with pybind11. It also contains the Python package files under the package name directory <code class="language-plaintext highlighter-rouge">package_example</code>. The <code class="language-plaintext highlighter-rouge">include</code> directory has all the C++ headers. The <code class="language-plaintext highlighter-rouge">Docker</code> directory contains a Dockerfile and a bash script to build and run the image. Then there is the usual <code class="language-plaintext highlighter-rouge">README.md</code> documenting the build, publication, etc. The <code class="language-plaintext highlighter-rouge">tests</code> directory is where we place the tests. It is sometimes recommended to also test the C++ code before binding it to Python, but to keep this boilerplate simple we only write Python tests that go through the bindings. The package build is driven by <code class="language-plaintext highlighter-rouge">setup.py</code> and <code class="language-plaintext highlighter-rouge">pyproject.toml</code>.</p>

<h2 id="building-the-package">Building the package</h2>

<p>The code in <code class="language-plaintext highlighter-rouge">matmul.cpp</code>, <code class="language-plaintext highlighter-rouge">matmul.h</code> and <code class="language-plaintext highlighter-rouge">bindings.cpp</code> is simply a matrix multiplication and its bindings in C++, so we won’t comment on it here; go to <a href="../../posts/cpp-python-extension">C++ basic Python extension</a> to learn more. Compared to that post the build is different. Let’s start with the <code class="language-plaintext highlighter-rouge">pyproject.toml</code> file:</p>

<div class="language-toml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
</pre></td><td class="rouge-code"><pre><span class="k">[</span><span class="n">build-system</span><span class="k">]</span>
<span class="n">requires</span> <span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="s">"setuptools&gt;=64"</span><span class="p">,</span> <span class="s">"wheel"</span><span class="p">,</span> <span class="s">"pybind11&gt;=2.10"</span><span class="p">]</span>
<span class="n">build-backend</span> <span class="o">=</span><span class="w"> </span><span class="s">"setuptools.build_meta"</span>


<span class="k">[</span><span class="n">tool</span><span class="k">.</span><span class="n">ruff</span><span class="k">]</span>
<span class="n">line-length</span> <span class="o">=</span><span class="w"> </span><span class="mi">99</span>

<span class="k">[</span><span class="n">tool</span><span class="k">.</span><span class="n">ruff</span><span class="k">.</span><span class="n">lint</span><span class="k">]</span>
<span class="n">select</span> <span class="o">=</span><span class="w"> </span><span class="p">[</span>
    <span class="c"># Pyflakes</span>
    <span class="s">"F"</span><span class="p">,</span>
    <span class="c"># Pycodestyle &amp; Warnings</span>
    <span class="s">"E"</span><span class="p">,</span>
    <span class="s">"W"</span><span class="p">,</span>
    <span class="c"># isort for unsorted imports</span>
    <span class="s">"I001"</span><span class="p">,</span>
<span class="p">]</span>

<span class="k">[</span><span class="n">tool</span><span class="k">.</span><span class="n">ruff</span><span class="k">.</span><span class="n">format</span><span class="k">]</span>
<span class="n">quote-style</span> <span class="o">=</span><span class="w"> </span><span class="s">"single"</span>
<span class="n">indent-style</span> <span class="o">=</span><span class="w"> </span><span class="s">"space"</span>
<span class="n">docstring-code-format</span> <span class="o">=</span><span class="w"> </span><span class="kc">true</span>
<span class="n">docstring-code-line-length</span> <span class="o">=</span><span class="w"> </span><span class="mi">20</span>

<span class="k">[</span><span class="n">tool</span><span class="k">.</span><span class="n">mypy</span><span class="k">]</span>
<span class="n">python_version</span> <span class="o">=</span><span class="w"> </span><span class="s">"3.13"</span>
<span class="n">ignore_missing_imports</span> <span class="o">=</span><span class="w"> </span><span class="kc">true</span>
<span class="n">exclude</span> <span class="o">=</span><span class="w"> </span><span class="s">"^(build/|</span><span class="se">\\</span><span class="s">.venv/)"</span>

<span class="k">[</span><span class="n">tool</span><span class="k">.</span><span class="n">pytest</span><span class="k">.</span><span class="n">ini_options</span><span class="k">]</span>
<span class="n">testpaths</span> <span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="s">"tests"</span><span class="p">]</span>
<span class="n">addopts</span> <span class="o">=</span><span class="w"> </span><span class="s">"-v"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The only packaging-related part of this file is the build system. We use setuptools, the default build backend in Python, although it is not part of the Python standard library. Other tools we use in this project, like pybind11 and cibuildwheel, work with setuptools. Aside from the build-system we have configuration for <a href="https://github.com/astral-sh/ruff">ruff</a>, <a href="https://github.com/python/mypy">mypy</a> and <a href="https://docs.pytest.org/en/stable/">pytest</a>.</p>

<p>The file <code class="language-plaintext highlighter-rouge">setup.py</code> contains the real bread and butter on the compilation of the project, something that in the previous post we did manually.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
</pre></td><td class="rouge-code"><pre><span class="kn">from</span> <span class="n">pybind11.setup_helpers</span> <span class="kn">import</span> <span class="n">Pybind11Extension</span><span class="p">,</span> <span class="n">build_ext</span>
<span class="kn">from</span> <span class="n">setuptools</span> <span class="kn">import</span> <span class="n">setup</span><span class="p">,</span> <span class="n">find_packages</span>
<span class="kn">from</span> <span class="n">pathlib</span> <span class="kn">import</span> <span class="n">Path</span>
<span class="kn">import</span> <span class="n">sysconfig</span>

<span class="n">__version__</span> <span class="o">=</span> <span class="sh">'</span><span class="s">0.0.1</span><span class="sh">'</span>

<span class="n">REPO_PATH</span> <span class="o">=</span> <span class="nc">Path</span><span class="p">(</span><span class="n">__file__</span><span class="p">).</span><span class="nf">resolve</span><span class="p">().</span><span class="n">parent</span>

<span class="n">PACKAGE_NAME</span> <span class="o">=</span> <span class="sh">'</span><span class="s">package_example</span><span class="sh">'</span>

<span class="n">PYTHON_LIB_INCLUDES</span> <span class="o">=</span> <span class="n">sysconfig</span><span class="p">.</span><span class="nf">get_path</span><span class="p">(</span><span class="sh">'</span><span class="s">include</span><span class="sh">'</span><span class="p">)</span>
<span class="n">PACKAGE_LIB_INCLUDES</span> <span class="o">=</span> <span class="n">REPO_PATH</span> <span class="o">/</span> <span class="sh">'</span><span class="s">include</span><span class="sh">'</span>

<span class="n">SRC_FILES</span> <span class="o">=</span> <span class="p">[</span>
    <span class="nf">str</span><span class="p">(</span><span class="n">REPO_PATH</span> <span class="o">/</span> <span class="sh">'</span><span class="s">src</span><span class="sh">'</span> <span class="o">/</span> <span class="sh">'</span><span class="s">matmul.cpp</span><span class="sh">'</span><span class="p">),</span>
    <span class="nf">str</span><span class="p">(</span><span class="n">REPO_PATH</span> <span class="o">/</span> <span class="sh">'</span><span class="s">src</span><span class="sh">'</span> <span class="o">/</span> <span class="sh">'</span><span class="s">bindings.cpp</span><span class="sh">'</span><span class="p">),</span>
<span class="p">]</span>

<span class="n">EXTRA_COMPILE_ARGS</span> <span class="o">=</span> <span class="p">[</span><span class="sh">'</span><span class="s">-O3</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">-std=c++17</span><span class="sh">'</span><span class="p">]</span>

<span class="n">ext_modules</span> <span class="o">=</span> <span class="p">[</span>
    <span class="nc">Pybind11Extension</span><span class="p">(</span>
        <span class="n">PACKAGE_NAME</span> <span class="o">+</span> <span class="sh">'</span><span class="s">._core</span><span class="sh">'</span><span class="p">,</span>
        <span class="n">SRC_FILES</span><span class="p">,</span>
        <span class="n">include_dirs</span><span class="o">=</span><span class="p">[</span><span class="n">PYTHON_LIB_INCLUDES</span><span class="p">,</span> <span class="nf">str</span><span class="p">(</span><span class="n">PACKAGE_LIB_INCLUDES</span><span class="p">)],</span>
        <span class="n">extra_compile_args</span><span class="o">=</span><span class="n">EXTRA_COMPILE_ARGS</span><span class="p">,</span>
        <span class="n">define_macros</span><span class="o">=</span><span class="p">[(</span><span class="sh">'</span><span class="s">VERSION_INFO</span><span class="sh">'</span><span class="p">,</span> <span class="n">__version__</span><span class="p">)],</span>
    <span class="p">),</span>
<span class="p">]</span>

<span class="nf">setup</span><span class="p">(</span>
    <span class="n">name</span><span class="o">=</span><span class="n">PACKAGE_NAME</span><span class="p">,</span>
    <span class="n">version</span><span class="o">=</span><span class="n">__version__</span><span class="p">,</span>
    <span class="n">author</span><span class="o">=</span><span class="sh">'</span><span class="s">Sebastia Agramunt Puig</span><span class="sh">'</span><span class="p">,</span>
    <span class="n">author_email</span><span class="o">=</span><span class="sh">'</span><span class="s">contact@agramunt.me</span><span class="sh">'</span><span class="p">,</span>
    <span class="n">url</span><span class="o">=</span><span class="sh">'</span><span class="s">https://github.com/SebastiaAgramunt/python-boilerplate</span><span class="sh">'</span><span class="p">,</span>
    <span class="n">description</span><span class="o">=</span><span class="sh">'</span><span class="s">Example package with C++ extension</span><span class="sh">'</span><span class="p">,</span>
    <span class="n">long_description</span><span class="o">=</span><span class="nf">open</span><span class="p">(</span><span class="sh">'</span><span class="s">README.md</span><span class="sh">'</span><span class="p">).</span><span class="nf">read</span><span class="p">(),</span>
    <span class="n">long_description_content_type</span><span class="o">=</span><span class="sh">'</span><span class="s">text/markdown</span><span class="sh">'</span><span class="p">,</span>
    <span class="n">packages</span><span class="o">=</span><span class="nf">find_packages</span><span class="p">(</span><span class="n">where</span><span class="o">=</span><span class="sh">'</span><span class="s">src</span><span class="sh">'</span><span class="p">),</span>
    <span class="n">package_dir</span><span class="o">=</span><span class="p">{</span><span class="sh">''</span><span class="p">:</span> <span class="sh">'</span><span class="s">src</span><span class="sh">'</span><span class="p">},</span>
    <span class="n">ext_modules</span><span class="o">=</span><span class="n">ext_modules</span><span class="p">,</span>
    <span class="n">cmdclass</span><span class="o">=</span><span class="p">{</span><span class="sh">'</span><span class="s">build_ext</span><span class="sh">'</span><span class="p">:</span> <span class="n">build_ext</span><span class="p">},</span>
    <span class="n">zip_safe</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
    <span class="n">python_requires</span><span class="o">=</span><span class="sh">'</span><span class="s">&gt;=3.9,&lt;3.14</span><span class="sh">'</span><span class="p">,</span>
    <span class="n">install_requires</span><span class="o">=</span><span class="p">[</span>
        <span class="sh">'</span><span class="s">numpy&gt;=1.20</span><span class="sh">'</span><span class="p">,</span>
    <span class="p">],</span>
    <span class="n">extras_require</span><span class="o">=</span><span class="p">{</span><span class="sh">'</span><span class="s">test</span><span class="sh">'</span><span class="p">:</span> <span class="p">[</span><span class="sh">'</span><span class="s">pytest</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">ruff</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">mypy</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">pre-commit</span><span class="sh">'</span><span class="p">]},</span>
<span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Starting from the top, the variable <code class="language-plaintext highlighter-rouge">__version__</code> is the version of our package; change it on every release. For convenience we define a few more variables: <code class="language-plaintext highlighter-rouge">PACKAGE_NAME</code> is the name we give to our package, and <code class="language-plaintext highlighter-rouge">PYTHON_LIB_INCLUDES</code> is where the Python header files live (i.e. <code class="language-plaintext highlighter-rouge">Python.h</code>), which are needed to compile the bindings. We define the include directory of our project in <code class="language-plaintext highlighter-rouge">PACKAGE_LIB_INCLUDES</code> and the source files in the <code class="language-plaintext highlighter-rouge">SRC_FILES</code> list. For the compiler I added the <code class="language-plaintext highlighter-rouge">-O3</code> optimization flag and selected the <code class="language-plaintext highlighter-rouge">c++17</code> standard. Then we define the extension modules in the <code class="language-plaintext highlighter-rouge">ext_modules</code> variable: a single <code class="language-plaintext highlighter-rouge">Pybind11Extension</code> named <code class="language-plaintext highlighter-rouge">package_example._core</code>, built from the source files with the include directories and compile flags above. Finally, everything is passed to the <code class="language-plaintext highlighter-rouge">setup</code> function from setuptools, where we also specify the supported Python version range, the required packages and, as a bonus, the extra requirements that we may want to use for testing.</p>

<p>A file that is sometimes disregarded is the <code class="language-plaintext highlighter-rouge">MANIFEST.in</code> file:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre><span class="c"># MANIFEST.in needed to include non-Python files in the package to build wheel distributions</span>
<span class="c"># e.g. C++ source files, headers, README, pyproject.toml, etc.</span>

include pyproject.toml
include README.md

recursive-include src <span class="k">*</span>.py <span class="k">*</span>.cpp <span class="k">*</span>.hpp <span class="k">*</span>.h
recursive-include include <span class="k">*</span>.hpp <span class="k">*</span>.h
</pre></td></tr></tbody></table></code></pre></div></div>

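<p>This file is key if you want to release wheels. It tells setuptools which additional files to include in the source distribution (sdist). Since <code class="language-plaintext highlighter-rouge">python -m build</code> by default builds the wheel from the sdist, the C++ sources and headers have to be in there, otherwise the extension cannot be compiled.</p>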
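<p>One way to check that <code class="language-plaintext highlighter-rouge">MANIFEST.in</code> picked up the C++ files is to look inside the source distribution. The following is a small sketch under some assumptions: the function name is hypothetical, and it expects a <code class="language-plaintext highlighter-rouge">.tar.gz</code> sdist in <code class="language-plaintext highlighter-rouge">dist</code>, for instance one produced by <code class="language-plaintext highlighter-rouge">python -m build --sdist</code>.</p>

```python
import tarfile
from pathlib import Path, PurePosixPath


def sdist_files_with_suffix(sdist_path, suffixes=(".h", ".hpp", ".cpp")):
    """Return the members of a .tar.gz sdist whose suffix is in `suffixes`."""
    with tarfile.open(sdist_path, "r:gz") as archive:
        return sorted(
            name for name in archive.getnames()
            if PurePosixPath(name).suffix in suffixes
        )


if __name__ == "__main__":
    # Hypothetical usage: inspect every sdist found in ./dist
    for sdist in sorted(Path("dist").glob("*.tar.gz")):
        print(sdist.name)
        for member in sdist_files_with_suffix(sdist):
            print("  ", member)
```

<p>For this project we would expect to see <code class="language-plaintext highlighter-rouge">matmul.h</code>, <code class="language-plaintext highlighter-rouge">matmul.cpp</code> and <code class="language-plaintext highlighter-rouge">bindings.cpp</code> in the listing. If one is missing, the corresponding pattern in <code class="language-plaintext highlighter-rouge">MANIFEST.in</code> is the first place to look.</p>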

<p>To install from source just create a new environment</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-rf</span> .venv
python3 <span class="nt">-m</span> venv .venv
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">--upgrade</span> pip
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and then use pip to install it</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nb">.</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This will compile C++, the bindings and add your python code. After installing you should be able to see the compiled file and python sources:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">ls</span> <span class="nt">-lhat</span> .venv/lib/python3.13/site-packages/package_example 
</pre></td></tr></tbody></table></code></pre></div></div>

<p>In my case (running this on macOS) I find the file <code class="language-plaintext highlighter-rouge">_core.cpython-313-darwin.so</code>, which is our compiled C++ shared library, along with <code class="language-plaintext highlighter-rouge">operations.py</code> and <code class="language-plaintext highlighter-rouge">__init__.py</code>.</p>
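<p>The suffix of that file depends on the interpreter and the platform. As a quick sketch (the helper name is made up for illustration), the standard library can tell you which file name the current Python expects for the <code class="language-plaintext highlighter-rouge">_core</code> extension declared in <code class="language-plaintext highlighter-rouge">setup.py</code>:</p>

```python
import sysconfig


def extension_filename(module_name):
    """File name the running interpreter expects for a compiled extension module."""
    # EXT_SUFFIX encodes the implementation, version and platform of the interpreter
    return module_name + sysconfig.get_config_var("EXT_SUFFIX")


if __name__ == "__main__":
    # With CPython 3.13 on macOS this is the _core.cpython-313-darwin.so seen above
    print(extension_filename("_core"))
```

<p>This is also why a wheel with compiled code is tied to a Python version and platform, and why we need one wheel per combination.</p>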

<p>To really confirm the package is installed and working, run the example script</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-c</span> <span class="s2">"import package_example"</span>
.venv/bin/python scripts/example.py
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="building-the-wheel-locally">Building the wheel locally</h2>

<p>Instead of building from source each time we can build a wheel and pull this wheel into other projects. We will do this manually and also with GitHub Actions. In this section we will learn the manual way of generating a wheel.</p>

<p>In <code class="language-plaintext highlighter-rouge">scripts/build_wheel.sh</code> you will find a bash script that creates the package. The build is different for Linux and macOS (we won’t cover Windows here). A function called <code class="language-plaintext highlighter-rouge">crate_venv</code> creates the environment and is executed in both cases; then, depending on the operating system, we use <code class="language-plaintext highlighter-rouge">build_wheel_linux</code> or <code class="language-plaintext highlighter-rouge">build_wheel_macos</code>. The wheel is built with the command</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>python <span class="nt">-m</span> build <span class="s2">"</span><span class="nv">$PROJECT_DIR</span><span class="s2">"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and the wheels are placed in <code class="language-plaintext highlighter-rouge">dist</code> directory.</p>

<p>For the Linux wheel we do some extra work. First we identify which architecture we are on, <code class="language-plaintext highlighter-rouge">x86_64</code> or <code class="language-plaintext highlighter-rouge">aarch64</code>, and set the platform tag to <code class="language-plaintext highlighter-rouge">manylinux_2_28_x86_64</code> for the former and <code class="language-plaintext highlighter-rouge">manylinux_2_28_aarch64</code> for the latter. This tag will be used to repair the wheel.</p>

<p>Manylinux is a Linux compatibility standard for Python wheels. Its purpose is to allow developers to build binary wheels (wheels that contain compiled C/C++ code) that work on most Linux distributions, even very old ones. Linux distributions vary a lot: different glibc versions, compiler versions, system libraries… Manylinux solves this problem. Specifically, in this case we will repair the wheel so that it is compatible with <code class="language-plaintext highlighter-rouge">glibc</code> version 2.28 and above for the two architectures.</p>
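<p>To make the meaning of the tag concrete, here is a minimal Python sketch of how a tag such as <code class="language-plaintext highlighter-rouge">manylinux_2_28_x86_64</code> encodes a minimum glibc version and an architecture. The function names are hypothetical and the check is deliberately simplified; installers such as pip apply more complete rules.</p>

```python
import platform
import re

SUPPORTED_MACHINES = ("x86_64", "aarch64")


def manylinux_tag(machine, glibc=(2, 28)):
    """Build a tag such as 'manylinux_2_28_x86_64' for a supported architecture."""
    if machine not in SUPPORTED_MACHINES:
        raise ValueError(f"unsupported architecture: {machine}")
    return f"manylinux_{glibc[0]}_{glibc[1]}_{machine}"


def parse_manylinux_tag(tag):
    """Split a tag into ((glibc_major, glibc_minor), architecture)."""
    match = re.fullmatch(r"manylinux_(\d+)_(\d+)_(\w+)", tag)
    if match is None:
        raise ValueError(f"not a manylinux_X_Y tag: {tag}")
    major, minor, arch = match.groups()
    return (int(major), int(minor)), arch


def host_can_install(tag, host_glibc, host_machine):
    """Simplified check: same architecture and host glibc at least the tag's glibc."""
    required, arch = parse_manylinux_tag(tag)
    return arch == host_machine and tuple(host_glibc) >= required


if __name__ == "__main__":
    lib, version = platform.libc_ver()  # ('glibc', '2.x') on glibc-based Linux
    machine = platform.machine()
    if lib == "glibc" and machine in SUPPORTED_MACHINES:
        host = tuple(int(part) for part in version.split(".")[:2])
        tag = manylinux_tag(machine)
        print(tag, "installable here:", host_can_install(tag, host, machine))
```

<p>The lower the glibc version in the tag, the more systems can install the wheel, but the build itself has to run against a glibc that is at least that old for the tag to be honest.</p>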

<p>We use <a href="https://github.com/pypa/auditwheel">auditwheel</a> as mentioned to repair the wheel.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>auditwheel repair <span class="s2">"</span><span class="nv">$WHEEL_FILE</span><span class="s2">"</span> <span class="se">\</span>
<span class="nt">--plat</span> <span class="s2">"</span><span class="nv">$PLATFORM_TAG</span><span class="s2">"</span> <span class="se">\</span>
<span class="nt">-w</span> <span class="s2">"</span><span class="nv">$PROJECT_DIR</span><span class="s2">/wheelhouse"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This program copies the external shared libraries (shared objects) that your package depends on into the wheel. The problem with this is that it could potentially add very large libraries like <code class="language-plaintext highlighter-rouge">libcuda</code> if your project contains compiled CUDA code. You really don’t need to ship this library because it will already be installed on the machine where the code runs (otherwise how could you talk to the GPU?). To exclude libraries and make your wheel smaller use the <code class="language-plaintext highlighter-rouge">--exclude</code> flag, e.g. <code class="language-plaintext highlighter-rouge">--exclude libcu* --exclude libnvcomp*</code>.</p>

<p>That’s it, your repaired Linux wheel will be saved in the <code class="language-plaintext highlighter-rouge">wheelhouse</code> directory.</p>

<h2 id="cicd-on-github-actions">CI/CD on GitHub Actions</h2>

<p>In <code class="language-plaintext highlighter-rouge">.github/workflows/build-wheels.yml</code> we placed a workflow that builds (compiles) the wheels, runs the tests and publishes the wheels when tagging a release. Let’s inspect the file <code class="language-plaintext highlighter-rouge">build-wheels.yml</code>:</p>

<div class="language-yml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre><span class="na">name</span><span class="pi">:</span> <span class="s">Build wheels</span>

<span class="na">on</span><span class="pi">:</span>
  <span class="na">push</span><span class="pi">:</span>
    <span class="na">branches</span><span class="pi">:</span> <span class="pi">[</span> <span class="nv">main</span><span class="pi">,</span> <span class="nv">master</span> <span class="pi">]</span>
    <span class="na">tags</span><span class="pi">:</span> <span class="pi">[</span> <span class="s2">"</span><span class="s">v*"</span> <span class="pi">]</span> <span class="c1"># only build/publish on version tags</span>
  <span class="na">pull_request</span><span class="pi">:</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This indicates that the workflow is triggered on pushes to the <code class="language-plaintext highlighter-rouge">main</code> and <code class="language-plaintext highlighter-rouge">master</code> branches (usually you just have one of these), on all pull requests and on tags starting with <code class="language-plaintext highlighter-rouge">v</code> (we will name our versions like <code class="language-plaintext highlighter-rouge">v0.0.1</code>).</p>

<p>Then we define two jobs, <code class="language-plaintext highlighter-rouge">build_wheels</code> and <code class="language-plaintext highlighter-rouge">publish_release_assets</code>. The first job starts with</p>

<div class="language-yml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre><span class="na">build_wheels</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">Build wheels on ${{ matrix.os }}</span>
  <span class="na">runs-on</span><span class="pi">:</span> <span class="s">${{ matrix.os }}</span>

  <span class="na">strategy</span><span class="pi">:</span>
    <span class="na">fail-fast</span><span class="pi">:</span> <span class="kc">false</span>
    <span class="na">matrix</span><span class="pi">:</span>
      <span class="na">os</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">ubuntu-latest</span><span class="pi">,</span> <span class="nv">macos-latest</span><span class="pi">,</span> <span class="nv">windows-latest</span><span class="pi">]</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This tells GitHub Actions to run the job on three operating systems.</p>

<p>Then the steps to follow for each of the operating systems are</p>

<div class="language-yml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
</pre></td><td class="rouge-code"><pre><span class="na">steps</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/checkout@v4</span>

  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Set up Python</span>
    <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/setup-python@v5</span>
    <span class="na">with</span><span class="pi">:</span>
      <span class="na">python-version</span><span class="pi">:</span> <span class="s2">"</span><span class="s">3.11"</span>

  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Install cibuildwheel</span>
    <span class="na">run</span><span class="pi">:</span> <span class="pi">|</span>
      <span class="s">python -m pip install --upgrade pip</span>
      <span class="s">python -m pip install cibuildwheel</span>

  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Build wheels with cibuildwheel</span>
    <span class="na">env</span><span class="pi">:</span>
      <span class="na">CIBW_BUILD</span><span class="pi">:</span> <span class="s2">"</span><span class="s">cp3{10,11,12,13}-*"</span>
      <span class="na">CIBW_SKIP</span><span class="pi">:</span> <span class="s2">"</span><span class="s">pp*</span><span class="nv"> </span><span class="s">*-musllinux_*"</span>
      <span class="na">CIBW_ARCHS_MACOS</span><span class="pi">:</span> <span class="s2">"</span><span class="s">x86_64</span><span class="nv"> </span><span class="s">arm64"</span>
      <span class="na">CIBW_TEST_REQUIRES</span><span class="pi">:</span> <span class="s2">"</span><span class="s">pytest</span><span class="nv"> </span><span class="s">numpy"</span>
      <span class="na">CIBW_TEST_COMMAND</span><span class="pi">:</span> <span class="s2">"</span><span class="s">pytest</span><span class="nv"> </span><span class="s">-q</span><span class="nv"> </span><span class="s">{project}/tests"</span>
    <span class="na">run</span><span class="pi">:</span> <span class="pi">|</span>
        <span class="s">cibuildwheel --output-dir wheelhouse</span>

  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">List built wheels</span>
    <span class="na">run</span><span class="pi">:</span> <span class="s">ls wheelhouse</span>

  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Upload wheels as artifact</span>
    <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/upload-artifact@v4</span>
    <span class="na">with</span><span class="pi">:</span>
      <span class="na">name</span><span class="pi">:</span> <span class="s">wheels-${{ matrix.os }}</span>
      <span class="na">path</span><span class="pi">:</span> <span class="s">wheelhouse/*.whl</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The first step uses <a href="https://github.com/actions/checkout">actions/checkout@v4</a>, which checks out your repository under <code class="language-plaintext highlighter-rouge">$GITHUB_WORKSPACE</code> so that the GitHub runner can access it. The next step sets up Python, which is needed by the following step, the <code class="language-plaintext highlighter-rouge">cibuildwheel</code> installation. Then we build the wheels using <code class="language-plaintext highlighter-rouge">cibuildwheel</code>, specifying architectures, Python versions and testing requirements. This builds all the wheels and runs our tests against them. Then, as a sanity check, we list the <code class="language-plaintext highlighter-rouge">wheelhouse</code> directory, where we placed the wheels in the previous step. Finally we upload the wheels using <a href="https://github.com/actions/upload-artifact">actions/upload-artifact@v4</a>, which stores them in GitHub’s artifact storage. And that’s it, after this job has run we will have a bunch of tested wheels for different platforms and architectures.</p>

<p>The second job is <code class="language-plaintext highlighter-rouge">publish_release_assets</code>, which starts with</p>

<div class="language-yml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="na">publish_release_assets</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">Attach wheels to GitHub Release</span>
  <span class="na">needs</span><span class="pi">:</span> <span class="s">build_wheels</span>
  <span class="na">runs-on</span><span class="pi">:</span> <span class="s">ubuntu-latest</span>
  <span class="na">if</span><span class="pi">:</span> <span class="s">startsWith(github.ref, 'refs/tags/v')</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This specifies that the job only needs to run on <code class="language-plaintext highlighter-rouge">ubuntu-latest</code>, and only after <code class="language-plaintext highlighter-rouge">build_wheels</code> has finished successfully. It is also only triggered for a tag starting with <code class="language-plaintext highlighter-rouge">v</code>. The steps are the following</p>

<div class="language-yml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
</pre></td><td class="rouge-code"><pre><span class="na">steps</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Download wheel artifacts</span>
    <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/download-artifact@v4</span>
    <span class="na">with</span><span class="pi">:</span>
      <span class="na">pattern</span><span class="pi">:</span> <span class="s">wheels-*</span>
      <span class="na">path</span><span class="pi">:</span> <span class="s">./artifacts</span>
      <span class="na">merge-multiple</span><span class="pi">:</span> <span class="kc">true</span>

  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">List downloaded wheels</span>
    <span class="na">run</span><span class="pi">:</span> <span class="pi">|</span>
      <span class="s">echo "Contents of ./artifacts:"</span>
      <span class="s">ls -R ./artifacts || echo "No artifacts found"</span>

  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Create / update GitHub Release and upload wheels</span>
    <span class="na">uses</span><span class="pi">:</span> <span class="s">softprops/action-gh-release@v2</span>
    <span class="na">with</span><span class="pi">:</span>
      <span class="na">files</span><span class="pi">:</span> <span class="s">artifacts/*.whl</span>
    <span class="na">env</span><span class="pi">:</span>
      <span class="na">GITHUB_TOKEN</span><span class="pi">:</span> <span class="s">${{ secrets.GITHUB_TOKEN }}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The first step downloads the wheels stored as artifacts into our <code class="language-plaintext highlighter-rouge">./artifacts</code> directory. The second just lists the wheels and the third uses the <code class="language-plaintext highlighter-rouge">softprops/action-gh-release@v2</code> action to publish the wheels in the tagged release.</p>
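<p>The <code class="language-plaintext highlighter-rouge">if:</code> guard on this job decides which git refs end up publishing wheels. A rough Python sketch of that condition is shown below; <code class="language-plaintext highlighter-rouge">should_publish</code> is a hypothetical name of mine, and the lowercase conversion reflects my understanding that the <code class="language-plaintext highlighter-rouge">startsWith</code> expression in GitHub Actions ignores case, so treat it as an assumption.</p>

```python
def should_publish(github_ref):
    """Approximate `startsWith(github.ref, 'refs/tags/v')` in Python.

    Assumption: the GitHub Actions expression is case-insensitive,
    hence the lower() call before comparing.
    """
    return github_ref.lower().startswith("refs/tags/v")


# A version tag, a branch push and a pull request ref
for ref in ("refs/tags/v0.0.5", "refs/heads/main", "refs/pull/12/merge"):
    print(ref, should_publish(ref))
```

<p>Only the first ref, the version tag, passes the check, which matches the idea that pushes to branches and pull requests build and test the wheels but do not publish them.</p>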

<p>To trigger the <code class="language-plaintext highlighter-rouge">publish_release_assets</code> job we tag the release once we have merged a PR to master. This can be done by executing the following locally in your master branch:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="nv">VERSION</span><span class="o">=</span>0.0.5
git tag <span class="nt">-a</span> v<span class="k">${</span><span class="nv">VERSION</span><span class="k">}</span> <span class="nt">-m</span> <span class="s2">"v</span><span class="k">${</span><span class="nv">VERSION</span><span class="k">}</span><span class="s2">"</span>
git push origin v<span class="k">${</span><span class="nv">VERSION</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The job will be triggered and you need to wait a bit to see all the artifacts in <a href="https://github.com/SebastiaAgramunt/python-boilerplate/releases/tag/v0.0.5">repository/releases/tag/v0.0.5</a>, in our example. That’s it, you have all the wheels attached to the release.</p>

<h2 id="install-the-package-from-source">Install the package from source</h2>

<p>You can build the package locally</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-rf</span> .venv
<span class="c"># create a virtual environment with uv (yes, my new favorite tool)</span>
uv venv .venv <span class="nt">-p</span> 3.13
uv pip <span class="nb">install</span> <span class="nb">.</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now you can try and run the tests</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>uv pip <span class="nb">install </span>pytest
uv run pytest <span class="nb">.</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>If you want to use <code class="language-plaintext highlighter-rouge">pyenv</code> instead you can do</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>pyenv shell 3.13
python <span class="nt">-m</span> venv .venv
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nb">.</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and run the tests to try it</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>pytest
.venv/bin/pytest <span class="nb">.</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="install-the-pacakge-from-the-wheel">Install the package from the wheel</h2>

<p>Once your wheel is uploaded, it’s very easy to pull it from GitHub and install it in your environment. Let’s create a new virtual environment with Python 3.13 on a Mac with an ARM64 CPU.</p>

<p>As before create the virtual environment (I use <code class="language-plaintext highlighter-rouge">uv</code> now)</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-rf</span> .venv
uv venv .venv <span class="nt">-p</span> 3.13
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now you can install the wheel that we created in CI/CD on GitHub Actions</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="nv">PKG_VERSION</span><span class="o">=</span><span class="s2">"0.0.5"</span>
<span class="nv">PKG_NAME</span><span class="o">=</span><span class="s2">"package_example-</span><span class="k">${</span><span class="nv">PKG_VERSION</span><span class="k">}</span><span class="s2">-cp313-cp313-macosx_11_0_arm64.whl"</span>
<span class="nv">PKG_URL</span><span class="o">=</span><span class="s2">"https://github.com/SebastiaAgramunt/python-boilerplate/releases/download/v</span><span class="k">${</span><span class="nv">PKG_VERSION</span><span class="k">}</span><span class="s2">"</span>
<span class="nv">WHEEL</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">PKG_URL</span><span class="k">}</span><span class="s2">/</span><span class="k">${</span><span class="nv">PKG_NAME</span><span class="k">}</span><span class="s2">"</span>
uv pip <span class="nb">install</span> <span class="k">${</span><span class="nv">WHEEL</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and just import the package to see if it has been installed</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv/bin/python -c "import package_example"
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And that’s it, you can tell your friends to install your wheel directly on their system: all compiled, no problems!</p>
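<p>The wheel file name used above encodes which interpreter and platform the wheel targets. As a minimal sketch, assuming the standard <code class="language-plaintext highlighter-rouge">name-version-python-abi-platform.whl</code> layout with an optional build tag, the pieces can be pulled apart like this (<code class="language-plaintext highlighter-rouge">parse_wheel_filename</code> is a hypothetical helper, not part of <code class="language-plaintext highlighter-rouge">uv</code> or <code class="language-plaintext highlighter-rouge">pip</code>):</p>

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its tag components (sketch)."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError("not a wheel: " + filename)

    parts = filename[:-len(suffix)].split("-")
    # 5 parts without a build tag, 6 parts with one
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel name: " + filename)

    # The last three fields are always python, abi and platform tags
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_filename(
    "package_example-0.0.5-cp313-cp313-macosx_11_0_arm64.whl"
)
print(info)
```

<p>For the wheel in this post the Python and ABI tags are both <code class="language-plaintext highlighter-rouge">cp313</code> and the platform tag is <code class="language-plaintext highlighter-rouge">macosx_11_0_arm64</code>, which is why it only installs on CPython 3.13 on an ARM64 Mac.</p>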

<h2 id="using-the-package">Using the package</h2>

<p>Now, how do we use the package? In <code class="language-plaintext highlighter-rouge">scripts/example.py</code> we have an example script that uses the code:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
</pre></td><td class="rouge-code"><pre><span class="kn">import</span> <span class="n">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="n">package_example</span> <span class="k">as</span> <span class="n">pe</span>  <span class="c1"># this uses your __init__.py exports
</span>

<span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
    <span class="c1"># Create a random matrix A of shape (M, N)
</span>    <span class="n">M</span><span class="p">,</span> <span class="n">N</span> <span class="o">=</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">3</span>
    <span class="n">A</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">randn</span><span class="p">(</span><span class="n">M</span><span class="p">,</span> <span class="n">N</span><span class="p">).</span><span class="nf">astype</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">float32</span><span class="p">)</span>

    <span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">A:</span><span class="sh">'</span><span class="p">)</span>
    <span class="nf">print</span><span class="p">(</span><span class="n">A</span><span class="p">)</span>

    <span class="n">B</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">randn</span><span class="p">(</span><span class="n">N</span><span class="p">,</span> <span class="n">M</span><span class="p">).</span><span class="nf">astype</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">float32</span><span class="p">)</span>

    <span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="se">\n</span><span class="s">B:</span><span class="sh">'</span><span class="p">)</span>
    <span class="nf">print</span><span class="p">(</span><span class="n">B</span><span class="p">)</span>

    <span class="c1"># Prepare output matrix C
</span>    <span class="n">C</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">zeros</span><span class="p">((</span><span class="n">M</span><span class="p">,</span> <span class="n">M</span><span class="p">),</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">float32</span><span class="p">)</span>

    <span class="c1"># multiply A and B using the C++ extension
</span>    <span class="n">pe</span><span class="p">.</span><span class="nf">matmul</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">,</span> <span class="n">C</span><span class="p">)</span>

    <span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="se">\n</span><span class="s">A * B:</span><span class="sh">'</span><span class="p">)</span>
    <span class="nf">print</span><span class="p">(</span><span class="n">C</span><span class="p">)</span>

    <span class="c1"># Verify correctness using pure NumPy
</span>    <span class="n">C_np</span> <span class="o">=</span> <span class="n">A</span> <span class="o">@</span> <span class="n">B</span>

    <span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="se">\n</span><span class="s">NumPy result:</span><span class="sh">'</span><span class="p">)</span>
    <span class="nf">print</span><span class="p">(</span><span class="n">C_np</span><span class="p">)</span>

    <span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="se">\n</span><span class="s">Difference (should be near zero):</span><span class="sh">'</span><span class="p">)</span>
    <span class="nf">print</span><span class="p">(</span><span class="n">C</span> <span class="o">-</span> <span class="n">C_np</span><span class="p">)</span>


<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="sh">'</span><span class="s">__main__</span><span class="sh">'</span><span class="p">:</span>
    <span class="nf">main</span><span class="p">()</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Install the package using the environment and then run the script</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv/bin/python scripts/example.py
</pre></td></tr></tbody></table></code></pre></div></div>

<p>In the script we just calculate the product using the exposed function <code class="language-plaintext highlighter-rouge">matmul</code> in Python (backend in pure C++) and the same with the usual <code class="language-plaintext highlighter-rouge">numpy</code>. We print out the difference of the two results, which should be near zero (up to floating-point rounding). I haven’t tested the speedup but my implementation should be slower than numpy. After all, numpy already uses <code class="language-plaintext highlighter-rouge">BLAS</code> and <code class="language-plaintext highlighter-rouge">LAPACK</code> libraries, which are highly optimized for numerical computing. This post is just an example of how to create C++ bindings.</p>

<h2 id="final-remarks">Final remarks</h2>

<p>I hope this end-to-end Python project for C++ bindings has been useful to you. I tried to add most of the basic ingredients to create it. Hope you can create amazing Python packages with a C++ backend and obviously share them with the community. Have fun coding.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="C++" /><category term="Python" /><category term="computer science" /><summary type="html"><![CDATA[Previously in C++ basic Python extension we learned the basic mechanism on building a C++ extension for Python. Here in this post we will be more practical and we will create a full end to end package that is fully tested and builds the wheels for different platforms and architectures. As before we will use pybind11 to create the bindings. This entire repository python-boilerplate lives in my GitHub account and not in the blogging-code where I usually publish.]]></summary></entry><entry><title type="html">Matrix Multiplication in CUDA</title><link href="https://agramunt.me/posts/cuda-matrix-multiplication/" rel="alternate" type="text/html" title="Matrix Multiplication in CUDA" /><published>2025-11-01T18:10:00-07:00</published><updated>2025-11-01T18:10:00-07:00</updated><id>https://agramunt.me/posts/cuda-matrix-multiplication</id><content type="html" xml:base="https://agramunt.me/posts/cuda-matrix-multiplication/"><![CDATA[<p>In this post we dive a little bit deeper into CUDA and GPU parallelization with a more practical case: matrix multiplication. Matrices are used everywhere, in convolutions, solving linear systems of equations, neural networks, transformers etc… And in every possible mathematics application you can think of, in physics, mechanics, computer vision etc. It is therefore interesting to be able to calculate matrix multiplications as fast as possible, since it is a compute-heavy operation.</p>

<p>The code for this post is in my <a href="https://github.com/SebastiaAgramunt/blogging-code">GitHub Blogging Code Repository</a> in the <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/cuda-matrix-multiplication">cuda-matrix-multiplication</a> subsection.</p>

<p>I thank <a href="https://lambda.ai/">Lambda AI</a> for providing free credit to run the experiments described in the post. Throughout this post we will be using <code class="language-plaintext highlighter-rouge">gpu_1x_a100_sxm4</code>.</p>

<h2 id="matrix-multiplication">Matrix Multiplication</h2>

<p>Consider two matrices $A_{M \times K}$ and $B_{K \times N}$ that multiplied give a matrix $C_{M \times N}$. The first element of the size is the number of rows and the second the number of columns. Each element $c_{i,j}$ of the matrix $C$ is calculated as</p>

\[c_{i,j}=\sum_{p=1}^{p=K}a_{i,p} \cdot b_{p,j}\]

<p>So, for each element $c_{i,j}$ we perform $K$ multiplications and $K-1$ additions. Since there are $M \times N$ elements in the $C$ matrix, there will be a total of $M \times N \times K$ multiplications and $M \times N \times (K -1)$ additions. So the number of floating-point operations is approximately</p>

\[\textbf{FLOPS} \approx 2 M \times N \times K\]

<p>And the time complexity goes as</p>

\[\mathcal{O}(M \times N \times K)\]

<p>To simplify our calculations we will consider square matrices of size $N$, so the time complexity will go as $\mathcal{O} (N^3)$, which grows very quickly with $N$.</p>
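<p>To get a feeling for these numbers, here is a small Python sketch that counts the operations exactly and compares them with the $2N^3$ approximation. The function name <code class="language-plaintext highlighter-rouge">matmul_flops</code> and the chosen sizes are just illustrative assumptions.</p>

```python
def matmul_flops(M, K, N):
    """Exact operation count for C = A * B with A (M x K) and B (K x N)."""
    multiplications = M * N * K
    additions = M * N * (K - 1)
    return multiplications + additions


# Square matrices: compare the exact count with the 2 * N^3 estimate
for n in (256, 1024, 4096):
    exact = matmul_flops(n, n, n)
    approx = 2 * n**3
    print(n, exact, approx, exact / approx)
```

<p>The ratio approaches 1 as $N$ grows, so the $2 M \times N \times K$ estimate is a good one for the matrix sizes where performance matters.</p>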

<h2 id="cpu-implementation-of-matrix-multiplication">CPU implementation of matrix multiplication</h2>

<p>We will use a very simple single-threaded function that will be optimized by the compiler using the appropriate flags</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
</pre></td><td class="rouge-code"><pre><span class="kt">void</span> <span class="nf">simpleMatrixMultiplication_cpp</span><span class="p">(</span><span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">A</span><span class="p">,</span>
    <span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">B</span><span class="p">,</span>
    <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">C</span><span class="p">,</span>
    <span class="k">const</span> <span class="kt">size_t</span> <span class="n">M</span><span class="p">,</span>
    <span class="k">const</span> <span class="kt">size_t</span> <span class="n">K</span><span class="p">,</span>
    <span class="k">const</span> <span class="kt">size_t</span> <span class="n">N</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">size_t</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">M</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">for</span> <span class="p">(</span><span class="kt">size_t</span> <span class="n">j</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">j</span> <span class="o">&lt;</span> <span class="n">N</span><span class="p">;</span> <span class="o">++</span><span class="n">j</span><span class="p">)</span> <span class="p">{</span>
            <span class="kt">float</span> <span class="n">sum</span> <span class="o">=</span> <span class="mf">0.0f</span><span class="p">;</span>
            <span class="k">for</span> <span class="p">(</span><span class="kt">size_t</span> <span class="n">k</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">k</span> <span class="o">&lt;</span> <span class="n">K</span><span class="p">;</span> <span class="o">++</span><span class="n">k</span><span class="p">)</span> <span class="p">{</span>
                <span class="n">sum</span> <span class="o">+=</span> <span class="n">A</span><span class="p">[</span><span class="n">i</span> <span class="o">*</span> <span class="n">K</span> <span class="o">+</span> <span class="n">k</span><span class="p">]</span> <span class="o">*</span> <span class="n">B</span><span class="p">[</span><span class="n">k</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">j</span><span class="p">];</span>
            <span class="p">}</span>
            <span class="n">C</span><span class="p">[</span><span class="n">i</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">sum</span><span class="p">;</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Here it is implicit that the matrices are stored in row-major order, i.e. for row-column position $i$,$j$, the matrix element of $A$ is $a_{i,j}=A[j+i\times K]$. This is not the most performant code, it’s just an example. If you want a faster version of this multiplication, use a library like <a href="https://www.netlib.org/lapack/lapacke.html">LAPACKE</a>, as we did in <a href="../blas-lapack">the BLAS and LAPACK post</a>.</p>
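<p>The row-major indexing is easier to see on a tiny example. The following pure Python sketch mirrors the triple loop of the C++ function on flat lists; it is only meant to illustrate the indexing, and the name <code class="language-plaintext highlighter-rouge">matmul_row_major</code> is my own.</p>

```python
def matmul_row_major(A, B, M, K, N):
    """Multiply A (M x K) by B (K x N), both stored as flat row-major lists."""
    C = [0.0] * (M * N)
    for i in range(M):
        for j in range(N):
            total = 0.0
            for k in range(K):
                # a_{i,k} lives at A[i*K + k], b_{k,j} at B[k*N + j]
                total += A[i * K + k] * B[k * N + j]
            C[i * N + j] = total
    return C


A = [1.0, 2.0, 3.0,
     4.0, 5.0, 6.0]    # 2 x 3
B = [7.0, 8.0,
     9.0, 10.0,
     11.0, 12.0]       # 3 x 2
print(matmul_row_major(A, B, 2, 3, 2))  # [58.0, 64.0, 139.0, 154.0]
```

<p>The result is the flat row-major form of the $2 \times 2$ matrix with rows $(58, 64)$ and $(139, 154)$.</p>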

<h2 id="gpu-simple-kernel">GPU simple kernel</h2>

<p>The most basic kernel for matrix multiplication is very similar to the CPU implementation above. Each thread computes two auxiliary variables, <code class="language-plaintext highlighter-rouge">row</code> and <code class="language-plaintext highlighter-rouge">col</code>, that identify the element of $C$ it is responsible for.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
</pre></td><td class="rouge-code"><pre>
<span class="n">__global__</span> <span class="kt">void</span> <span class="nf">simpleMatrixMultiplication</span><span class="p">(</span><span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">A</span><span class="p">,</span>
    <span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">B</span><span class="p">,</span>
    <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">C</span><span class="p">,</span>
    <span class="k">const</span> <span class="kt">size_t</span> <span class="n">M</span><span class="p">,</span>
    <span class="k">const</span> <span class="kt">size_t</span> <span class="n">K</span><span class="p">,</span>
    <span class="k">const</span> <span class="kt">size_t</span> <span class="n">N</span><span class="p">)</span> <span class="p">{</span>

    <span class="kt">int</span> <span class="n">row</span> <span class="o">=</span> <span class="n">blockIdx</span><span class="p">.</span><span class="n">y</span> <span class="o">*</span> <span class="n">blockDim</span><span class="p">.</span><span class="n">y</span> <span class="o">+</span> <span class="n">threadIdx</span><span class="p">.</span><span class="n">y</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">col</span> <span class="o">=</span> <span class="n">blockIdx</span><span class="p">.</span><span class="n">x</span> <span class="o">*</span> <span class="n">blockDim</span><span class="p">.</span><span class="n">x</span> <span class="o">+</span> <span class="n">threadIdx</span><span class="p">.</span><span class="n">x</span><span class="p">;</span>

    <span class="k">if</span> <span class="p">(</span><span class="n">row</span> <span class="o">&lt;</span> <span class="n">M</span> <span class="o">&amp;&amp;</span> <span class="n">col</span> <span class="o">&lt;</span> <span class="n">N</span><span class="p">)</span> <span class="p">{</span>
        <span class="kt">float</span> <span class="n">sum</span> <span class="o">=</span> <span class="mf">0.0f</span><span class="p">;</span>
        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">k</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">k</span> <span class="o">&lt;</span> <span class="n">K</span><span class="p">;</span> <span class="o">++</span><span class="n">k</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">sum</span> <span class="o">+=</span> <span class="n">A</span><span class="p">[</span><span class="n">row</span> <span class="o">*</span> <span class="n">K</span> <span class="o">+</span> <span class="n">k</span><span class="p">]</span> <span class="o">*</span> <span class="n">B</span><span class="p">[</span><span class="n">k</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">col</span><span class="p">];</span>
        <span class="p">}</span>
        <span class="n">C</span><span class="p">[</span><span class="n">row</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">col</span><span class="p">]</span> <span class="o">=</span> <span class="n">sum</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This kernel computes the correct matrix product, but it can be improved a lot. For starters, we aren’t using any <code class="language-plaintext highlighter-rouge">__shared__</code> memory. This kind of memory is shared among all threads in a block and is much faster than global memory. Here we read the same elements from global memory over and over, which makes the kernel slow. We will fix this in the next kernel and comment on other improvements.</p>

<h2 id="gpu-tiled-multiplication">GPU tiled multiplication</h2>

<p>The following kernel uses <code class="language-plaintext highlighter-rouge">__shared__</code> memory to stage tiles of the matrices A and B so that the individual threads of the block can access them faster.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
</pre></td><td class="rouge-code"><pre><span class="cp"># define TILE 16
</span><span class="n">__global__</span> <span class="kt">void</span> <span class="nf">tiledMultiply</span><span class="p">(</span><span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">A</span><span class="p">,</span> <span class="c1">// M x K</span>
                              <span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">B</span><span class="p">,</span> <span class="c1">// K x N</span>
                              <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">C</span><span class="p">,</span>       <span class="c1">// M x N</span>
                              <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">M</span><span class="p">,</span>
                              <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">K</span><span class="p">,</span>
                              <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">N</span><span class="p">)</span> <span class="p">{</span>

    <span class="kt">int</span> <span class="n">by</span> <span class="o">=</span> <span class="n">blockIdx</span><span class="p">.</span><span class="n">y</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">bx</span> <span class="o">=</span> <span class="n">blockIdx</span><span class="p">.</span><span class="n">x</span><span class="p">;</span>

    <span class="kt">int</span> <span class="n">ty</span> <span class="o">=</span> <span class="n">threadIdx</span><span class="p">.</span><span class="n">y</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">tx</span> <span class="o">=</span> <span class="n">threadIdx</span><span class="p">.</span><span class="n">x</span><span class="p">;</span>

    <span class="c1">// global row/col this thread is responsible for</span>
    <span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="n">by</span> <span class="o">*</span> <span class="n">TILE</span> <span class="o">+</span> <span class="n">ty</span><span class="p">;</span>  <span class="c1">// row in C</span>
    <span class="kt">int</span> <span class="n">j</span> <span class="o">=</span> <span class="n">bx</span> <span class="o">*</span> <span class="n">TILE</span> <span class="o">+</span> <span class="n">tx</span><span class="p">;</span>  <span class="c1">// col in C</span>

    <span class="n">__shared__</span> <span class="kt">float</span> <span class="n">As</span><span class="p">[</span><span class="n">TILE</span><span class="p">][</span><span class="n">TILE</span><span class="p">];</span>
    <span class="n">__shared__</span> <span class="kt">float</span> <span class="n">Bs</span><span class="p">[</span><span class="n">TILE</span><span class="p">][</span><span class="n">TILE</span><span class="p">];</span>

    <span class="kt">float</span> <span class="n">value</span> <span class="o">=</span> <span class="mf">0.0f</span><span class="p">;</span>

    <span class="c1">// number of tiles along K</span>
    <span class="kt">int</span> <span class="n">numTiles</span> <span class="o">=</span> <span class="p">(</span><span class="n">K</span> <span class="o">+</span> <span class="n">TILE</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="n">TILE</span><span class="p">;</span>

    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">ph</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">ph</span> <span class="o">&lt;</span> <span class="n">numTiles</span><span class="p">;</span> <span class="o">++</span><span class="n">ph</span><span class="p">)</span> <span class="p">{</span>
        <span class="c1">// column in A, row in B that this thread wants to load</span>
        <span class="kt">int</span> <span class="n">aCol</span> <span class="o">=</span> <span class="n">ph</span> <span class="o">*</span> <span class="n">TILE</span> <span class="o">+</span> <span class="n">tx</span><span class="p">;</span>  <span class="c1">// along K</span>
        <span class="kt">int</span> <span class="n">bRow</span> <span class="o">=</span> <span class="n">ph</span> <span class="o">*</span> <span class="n">TILE</span> <span class="o">+</span> <span class="n">ty</span><span class="p">;</span>  <span class="c1">// along K</span>

        <span class="c1">// load A tile (row = i, col = aCol)</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">i</span> <span class="o">&lt;</span> <span class="n">M</span> <span class="o">&amp;&amp;</span> <span class="n">aCol</span> <span class="o">&lt;</span> <span class="n">K</span><span class="p">)</span>
            <span class="n">As</span><span class="p">[</span><span class="n">ty</span><span class="p">][</span><span class="n">tx</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="n">i</span> <span class="o">*</span> <span class="n">K</span> <span class="o">+</span> <span class="n">aCol</span><span class="p">];</span>
        <span class="k">else</span>
            <span class="n">As</span><span class="p">[</span><span class="n">ty</span><span class="p">][</span><span class="n">tx</span><span class="p">]</span> <span class="o">=</span> <span class="mf">0.0f</span><span class="p">;</span>

        <span class="c1">// load B tile (row = bRow, col = j)</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">bRow</span> <span class="o">&lt;</span> <span class="n">K</span> <span class="o">&amp;&amp;</span> <span class="n">j</span> <span class="o">&lt;</span> <span class="n">N</span><span class="p">)</span>
            <span class="n">Bs</span><span class="p">[</span><span class="n">ty</span><span class="p">][</span><span class="n">tx</span><span class="p">]</span> <span class="o">=</span> <span class="n">B</span><span class="p">[</span><span class="n">bRow</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">j</span><span class="p">];</span>
        <span class="k">else</span>
            <span class="n">Bs</span><span class="p">[</span><span class="n">ty</span><span class="p">][</span><span class="n">tx</span><span class="p">]</span> <span class="o">=</span> <span class="mf">0.0f</span><span class="p">;</span>

        <span class="c1">// sync all threads to make sure the tiles are loaded</span>
        <span class="n">__syncthreads</span><span class="p">();</span>

        <span class="cp">#pragma unroll
</span>        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">t</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">t</span> <span class="o">&lt;</span> <span class="n">TILE</span><span class="p">;</span> <span class="o">++</span><span class="n">t</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">value</span> <span class="o">+=</span> <span class="n">As</span><span class="p">[</span><span class="n">ty</span><span class="p">][</span><span class="n">t</span><span class="p">]</span> <span class="o">*</span> <span class="n">Bs</span><span class="p">[</span><span class="n">t</span><span class="p">][</span><span class="n">tx</span><span class="p">];</span>
        <span class="p">}</span>

        <span class="c1">// sync before loading the next tile</span>
        <span class="n">__syncthreads</span><span class="p">();</span>
    <span class="p">}</span>

    <span class="c1">// write back only if in-bounds</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">i</span> <span class="o">&lt;</span> <span class="p">(</span><span class="kt">int</span><span class="p">)</span><span class="n">M</span> <span class="o">&amp;&amp;</span> <span class="n">j</span> <span class="o">&lt;</span> <span class="p">(</span><span class="kt">int</span><span class="p">)</span><span class="n">N</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">C</span><span class="p">[</span><span class="n">i</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="n">value</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>We launch this kernel with a C++ wrapper function:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
</pre></td><td class="rouge-code"><pre><span class="kt">void</span> <span class="nf">tiledMultiply_call</span><span class="p">(</span><span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">A</span><span class="p">,</span>
                    <span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">B</span><span class="p">,</span>
                    <span class="kt">float</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">C</span><span class="p">,</span>
                    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">M</span><span class="p">,</span>
                    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">K</span><span class="p">,</span>
                    <span class="n">std</span><span class="o">::</span><span class="kt">size_t</span> <span class="n">N</span><span class="p">){</span>
    <span class="n">dim3</span> <span class="n">threads</span><span class="p">(</span><span class="n">TILE</span><span class="p">,</span> <span class="n">TILE</span><span class="p">);</span>
    <span class="n">dim3</span> <span class="n">blocks</span><span class="p">((</span><span class="n">N</span> <span class="o">+</span> <span class="n">TILE</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="n">TILE</span><span class="p">,</span> <span class="p">(</span><span class="n">M</span> <span class="o">+</span> <span class="n">TILE</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="n">TILE</span><span class="p">);</span>
    <span class="n">tiledMultiply</span><span class="o">&lt;&lt;&lt;</span><span class="n">blocks</span><span class="p">,</span> <span class="n">threads</span><span class="o">&gt;&gt;&gt;</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">,</span> <span class="n">C</span><span class="p">,</span> <span class="n">M</span><span class="p">,</span> <span class="n">K</span><span class="p">,</span> <span class="n">N</span><span class="p">);</span>
    <span class="n">cudaDeviceSynchronize</span><span class="p">();</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Let’s understand this code. An excellent visual representation of the calculation can be found in <a href="https://www.youtube.com/watch?v=Q3GgbfGTnVc">this</a> YouTube video. For a detailed explanation, check the book <a href="https://www.oreilly.com/library/view/programming-massively-parallel/9780323984638/">Programming Massively Parallel Processors</a>.</p>

<p>In matrix multiplication we use the same elements of A and B over and over, so the idea is to move those elements into shared memory in small tiles and reuse them from there. Shared memory is much faster than global memory, and since this reduces the number of global memory reads, it will most likely reduce our total calculation time.</p>
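
<p>To get a rough feeling for the savings, the sketch below counts the global memory loads issued by each kernel. It assumes that M, K and N are multiples of the tile size and ignores hardware caches, and the function names are invented for this illustration.</p>

```cpp
#include <cassert>
#include <cstddef>

// Global-memory loads issued by the simple kernel: every one of the M * N
// threads reads K elements of A and K elements of B.
inline std::size_t simpleKernelLoads(std::size_t M, std::size_t K, std::size_t N) {
    return 2 * M * N * K;
}

// Global-memory loads issued by the tiled kernel, assuming M, K and N are
// multiples of `tile`: each thread reads one element of A and one element
// of B per tile along K.
inline std::size_t tiledKernelLoads(std::size_t M, std::size_t K, std::size_t N,
                                    std::size_t tile) {
    assert(tile > 0 && M % tile == 0 && K % tile == 0 && N % tile == 0);
    return 2 * M * N * (K / tile);
}
```

<p>For M = K = N = 64 and a tile of 16, the simple kernel issues 524288 loads and the tiled one 32768, i.e. a factor of TILE fewer.</p>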

<p>In the kernel <code class="language-plaintext highlighter-rouge">tiledMultiply</code>, each block calculates a <code class="language-plaintext highlighter-rouge">TILE x TILE</code> tile of <code class="language-plaintext highlighter-rouge">C</code>. Think of defining your blocks so that together they cover all the elements of matrix <code class="language-plaintext highlighter-rouge">C</code>. In the code we first calculate how many tiles we need: since the multiplication dimension is <code class="language-plaintext highlighter-rouge">K</code> (columns of A and rows of B), each block needs <code class="language-plaintext highlighter-rouge">K / TILE</code> tiles (or, to cover all the elements when K is not divisible by TILE, <code class="language-plaintext highlighter-rouge">(K + TILE - 1) / TILE</code>). For every tile, each thread loads one element of A and one element of B into shared memory, and inside that same for loop we sync all the threads in the block. Indeed, inside a for loop you can wait until all threads have loaded their data. Then each thread multiplies the tile elements of A and B as usual and accumulates the partial sum, and we sync the threads again so that no thread overwrites a tile that is still in use. After the last tile, the thread writes its value to the element <code class="language-plaintext highlighter-rouge">i * N + j</code> of <code class="language-plaintext highlighter-rouge">C</code>.</p>
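
<p>The same tile-by-tile accumulation can be emulated on the CPU, which may help to see the order of the loops without any CUDA machinery. This is only a sketch: <code class="language-plaintext highlighter-rouge">tiledMultiplyCpu</code> is a hypothetical name, there is no shared memory or synchronization here, and the loop bounds are clamped instead of zero-padding the tiles as the kernel does.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// CPU emulation of the tiled scheme: C (M x N) = A (M x K) * B (K x N),
// all row-major. The two outer loops play the role of the block indices
// (by, bx) and the third loop is the phase `ph` along K.
inline std::vector<float> tiledMultiplyCpu(const std::vector<float>& A,
                                           const std::vector<float>& B,
                                           std::size_t M, std::size_t K,
                                           std::size_t N, std::size_t tile) {
    assert(tile > 0 && A.size() == M * K && B.size() == K * N);
    std::vector<float> C(M * N, 0.0f);
    for (std::size_t i0 = 0; i0 < M; i0 += tile) {          // block row (by)
        for (std::size_t j0 = 0; j0 < N; j0 += tile) {      // block col (bx)
            for (std::size_t k0 = 0; k0 < K; k0 += tile) {  // phase (ph)
                const std::size_t iEnd = std::min(i0 + tile, M);
                const std::size_t jEnd = std::min(j0 + tile, N);
                const std::size_t kEnd = std::min(k0 + tile, K);
                // accumulate the partial products of this pair of tiles
                for (std::size_t i = i0; i < iEnd; ++i) {
                    for (std::size_t j = j0; j < jEnd; ++j) {
                        float value = 0.0f;
                        for (std::size_t k = k0; k < kEnd; ++k)
                            value += A[i * K + k] * B[k * N + j];
                        C[i * N + j] += value;
                    }
                }
            }
        }
    }
    return C;
}
```

<p>Each element of <code class="language-plaintext highlighter-rouge">C</code> receives one partial sum per phase, exactly like <code class="language-plaintext highlighter-rouge">value</code> in the kernel.</p>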

<p>When launching this kernel we use square blocks of <code class="language-plaintext highlighter-rouge">TILE x TILE</code> threads. Usually the tile size is 16; remember that shared memory is quite limited on GPUs. We then need to launch <code class="language-plaintext highlighter-rouge">(N + TILE - 1) / TILE</code> blocks in the <code class="language-plaintext highlighter-rouge">x</code> dimension (columns of <code class="language-plaintext highlighter-rouge">C</code>) and <code class="language-plaintext highlighter-rouge">(M + TILE - 1) / TILE</code> blocks in the <code class="language-plaintext highlighter-rouge">y</code> dimension (rows of <code class="language-plaintext highlighter-rouge">C</code>) to cover all elements of <code class="language-plaintext highlighter-rouge">C</code>. This is conveniently wrapped in the function <code class="language-plaintext highlighter-rouge">tiledMultiply_call</code>.</p>
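
<p>The grid size is a plain ceiling division, which can be checked on the host without a GPU. In the sketch below <code class="language-plaintext highlighter-rouge">Grid2D</code> is a hypothetical stand-in for CUDA’s <code class="language-plaintext highlighter-rouge">dim3</code>, and <code class="language-plaintext highlighter-rouge">ceilDiv</code> and <code class="language-plaintext highlighter-rouge">gridFor</code> are illustrative names.</p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for CUDA's dim3, so the sketch compiles without
// the CUDA toolkit.
struct Grid2D {
    std::size_t x;  // blocks along the columns of C
    std::size_t y;  // blocks along the rows of C
};

// Smallest number of chunks of size `b` needed to cover `a` elements.
inline std::size_t ceilDiv(std::size_t a, std::size_t b) {
    assert(b > 0);
    return (a + b - 1) / b;
}

// Grid needed to cover an M x N output matrix with tile x tile blocks.
inline Grid2D gridFor(std::size_t M, std::size_t N, std::size_t tile) {
    return Grid2D{ceilDiv(N, tile), ceilDiv(M, tile)};
}
```

<p>For example, M = 100, N = 50 and a tile of 16 give a grid of 4 blocks in <code class="language-plaintext highlighter-rouge">x</code> and 7 blocks in <code class="language-plaintext highlighter-rouge">y</code>.</p>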

<h2 id="matrix-multiplication-performance">Matrix multiplication performance</h2>

<p>So far we went very deep into the coding. Let’s run some benchmarks. To execute this part, go to the <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/cuda-matrix-multiplication">cuda-matrix-multiplication</a> directory, then compile and execute the code with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>./scripts/build.sh
./scripts/execute.sh
</pre></td></tr></tbody></table></code></pre></div></div>
<p>This will produce a CSV file that can then be analyzed in Python. To produce the plots, install the Python environment and run the analysis script:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>./scripts/install_env.sh
.venv/bin/python scripts/analyze.py
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The first graph shows the time taken to multiply two square matrices of size $N \times N$. The green line corresponds to the GPU and the black line to the CPU.</p>

<figure style="text-align: center;">
  <img src="/assets/img/posts/2025-11-02-cuda-matrix-mutliplication/gpu_cpu_performance.png" alt="" width="700" />
  <figcaption><strong>Figure 1.</strong> GPU vs CPU calculation time as a function of the matrix size for square matrices of size $N$. For the GPU we include the time to load the data onto the device, the calculation time and the time to copy the result back to the host.</figcaption>
</figure>

<p>We can see that for small matrices the time taken is constant (independent of the size of the matrix) because most of the time is spent initializing the process and loading the data. For small matrix sizes the CPU is much faster: for matrices smaller than $N=55$ it is better to use the CPU, whilst for larger ones the GPU is faster. For larger matrix sizes we make the following plot:</p>

<figure style="text-align: center;">
  <img src="/assets/img/posts/2025-11-02-cuda-matrix-mutliplication/gpu_cpu_performance_large.png" alt="" width="700" />
  <figcaption><strong>Figure 2.</strong> Same as Fig. 1 but with larger matrix sizes</figcaption>
</figure>

<p>It can be seen that the CPU time goes off the scale whilst the GPU time increases slowly. For $N \approx 10^4$ the multiplication time is around 0.85 seconds on the GPU.</p>

<p>So far we have only considered the slow approach for the GPU, the one that reads every element directly from global memory. Still, we get huge speedups compared to the CPU. In the following plot we compare three methods of matrix multiplication: the simple approach, the tiled matrix multiplication and finally a CUDA library called <a href="https://developer.nvidia.com/cublas">cuBLAS</a>, the CUDA equivalent of BLAS.</p>

<figure style="text-align: center;">
  <img src="/assets/img/posts/2025-11-02-cuda-matrix-mutliplication/gpu_comparison_times_large.png" alt="" width="700" />
  <figcaption><strong>Figure 3.</strong> Matrix multiplication times for two square matrices as a function of the matrix size. The green line corresponds to the simple matrix multiplication (the same as in Fig. 1 and Fig. 2), the blue line to our custom tiled matrix multiplication implementation and the gray line to the cuBLAS implementation.</figcaption>
</figure>

<p>Our tiled matrix multiplication clearly improves on the simple one initially considered: it seems to be around 2x faster. However, the cuBLAS implementation is much faster still, around 4x, which is impressive. cuBLAS has been optimized for many years, so it is expected to run much faster than any custom implementation.</p>

<p>cuBLAS can use specialized matrix multiplication hardware, the tensor cores. Those can deliver up to 8-16x more FLOPs than the standard FP32 cores used in our custom implementation. cuBLAS also has deeper tiling: it uses large block tiles and each thread computes multiple elements. These are the main optimizations, but there are more, and describing them all would probably take another entire post.</p>

<p>This last graph shows that, as expected, it is better to use the cuBLAS library to multiply any two matrices on the GPU. However, calling these library functions has a cost: you cannot fuse them with something custom, e.g. a matrix multiplication followed by another custom operation in a self-programmed kernel. The moment you want to fuse operations, you will probably need to implement your own multiplication together with the other operations inside the same CUDA kernel.</p>

<h2 id="conclusions">Conclusions</h2>

<p>We have seen how to multiply two matrices in CUDA. Specifically, we learned how to use shared memory to gain some extra speedup on the GPU. Finally, we showed that experience is a plus and that using cuBLAS is the easiest route to a highly optimized matrix multiplication. Do not reinvent the wheel by implementing your own version of CUDA matrix multiplication unless you need it for a very specific application that involves fusing kernels with other custom operations.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="C++" /><category term="CUDA" /><category term="computer science" /><category term="GPU" /><summary type="html"><![CDATA[In this post we dive a little bit deeper into CUDA and GPU parallelization with a more practical case: Matrix multiplication. Matrices are used everywhere, in convolutions, solving linear systems of equations, neural networks, transformers etc… And in every possible mathematics application you can think of, in physics, mechanics, computer vision etc. Therefore it is interesting to be able to calculate matrix multiplications as fast as possible, it’s a heavy compute calculation.]]></summary></entry><entry><title type="html">BLAS and LAPACK for Linear Algebra</title><link href="https://agramunt.me/posts/blas-lapack/" rel="alternate" type="text/html" title="BLAS and LAPACK for Linear Algebra" /><published>2025-09-13T21:35:00-07:00</published><updated>2025-12-04T13:20:05-08:00</updated><id>https://agramunt.me/posts/blas-lapack</id><content type="html" xml:base="https://agramunt.me/posts/blas-lapack/"><![CDATA[<p><a href="https://www.netlib.org/blas/">BLAS</a> is the Basic Linear Algebra Subprograms library. It’s a well tested library for basic algebraic operations, for instance vector operations, dot products, matrix transpositions, multiplications etc.
The <a href="https://www.netlib.org/lapack/">LAPACK</a> library (Linear Algebra PACKage) is a higher level library built on top of BLAS: while BLAS provides low-level building blocks (vector, matrix operations), LAPACK implements full algorithms for solving core linear algebra problems. For instance, with LAPACK we can solve systems of linear equations (LU decomposition), least squares problems and eigenvalue problems, and compute matrix factorizations (LU, Cholesky, QR, Schur). LAPACK was originally written in Fortran, so arrays are stored in column-major order instead of the usual row-major order in C. That might be a source of errors for an avid C developer; however, there exists a C wrapper for LAPACK called <a href="https://www.netlib.org/lapack/">LAPACKE</a> (see <a href="https://www.netlib.org/lapack/lapacke.html">the user guide</a>).</p>

<p>In this tutorial we will show how to install BLAS, LAPACK and LAPACKE and compile a simple program that uses one function of each. Specifically, we will first install <a href="https://github.com/OpenMathLib/OpenBLAS">OpenBLAS</a>, which is a concrete implementation of the BLAS standard with optimizations and extensions that also bundles LAPACK, but not LAPACKE, which will also be installed in this post. Find the <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/blas-lapacke">code</a> in the GitHub <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main">blogging-code</a> repository.</p>

<h2 id="install-openblas-and-lapacke-in-macos">Install OpenBlas and Lapacke in MacOS</h2>

<p>On macOS simply use <a href="https://brew.sh/">Homebrew</a>. If you don’t have it, install it with the command</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>/bin/bash <span class="nt">-c</span> <span class="s2">"</span><span class="si">$(</span>curl <span class="nt">-fsSL</span> https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh<span class="si">)</span><span class="s2">"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then you can install the two libraries with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>brew <span class="nb">install </span>openblas lapack
</pre></td></tr></tbody></table></code></pre></div></div>

<p>These are installed in</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nv">OPENBLAS_INSTALL_DIR</span><span class="o">=</span><span class="si">$(</span>brew <span class="nt">--prefix</span> openblas<span class="si">)</span>
<span class="nv">LAPACK_INSTALL_DIR</span><span class="o">=</span><span class="si">$(</span>brew <span class="nt">--prefix</span> lapack<span class="si">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>There you will find the subdirectories <code class="language-plaintext highlighter-rouge">include</code> and <code class="language-plaintext highlighter-rouge">lib</code> that will be needed to compile and link your program that uses BLAS and LAPACK.</p>

<h2 id="install-openblas-and-lapack-in-ubuntu">Install OpenBlas and Lapack in Ubuntu</h2>

<p>To install in Ubuntu use <code class="language-plaintext highlighter-rouge">apt-get</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>apt-get update
apt-get <span class="nb">install</span> <span class="nt">-y</span> libopenblas-dev liblapacke-dev
</pre></td></tr></tbody></table></code></pre></div></div>

<p>I tried the above in an Ubuntu Docker container on macOS:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>docker pull ubuntu
docker run <span class="nt">--rm</span> <span class="nt">-it</span> <span class="nt">--entrypoint</span> bash ubuntu
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Find the headers in</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">/usr/include</code> for <code class="language-plaintext highlighter-rouge">lapack.h</code>, <code class="language-plaintext highlighter-rouge">lapacke.h</code></li>
  <li><code class="language-plaintext highlighter-rouge">/usr/include/x86_64-linux-gnu</code> for <code class="language-plaintext highlighter-rouge">cblas.h</code></li>
</ul>

<p>And libraries in</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">/usr/lib/x86_64-linux-gnu/</code> for <code class="language-plaintext highlighter-rouge">libblas.a</code>, <code class="language-plaintext highlighter-rouge">libblas.so</code>, <code class="language-plaintext highlighter-rouge">libopenblas.a</code>, <code class="language-plaintext highlighter-rouge">libopenblas.so</code> (static and dynamic libraries for BLAS).</li>
  <li><code class="language-plaintext highlighter-rouge">/usr/lib/x86_64-linux-gnu/</code> for <code class="language-plaintext highlighter-rouge">liblapack.a</code>, <code class="language-plaintext highlighter-rouge">liblapack.so</code>, <code class="language-plaintext highlighter-rouge">liblapacke.a</code>, <code class="language-plaintext highlighter-rouge">liblapacke.so</code> (static and dynamic libraries for lapack and lapacke).</li>
</ul>

<p>To double check the flags, use <code class="language-plaintext highlighter-rouge">pkg-config</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre>apt-get <span class="nb">install </span>pkg-config

<span class="c"># includes</span>
<span class="nb">echo</span> <span class="si">$(</span>pkg-config <span class="nt">--cflags</span> openblas lapacke<span class="si">)</span>

<span class="c"># libraries</span>
<span class="nb">echo</span> <span class="si">$(</span>pkg-config <span class="nt">--libs</span>   openblas lapacke<span class="si">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="compile-and-install-openblas-and-lapack">Compile and install Openblas and Lapack</h2>

<p>This is my favourite way to install libraries: download the source code and install it inside your project directory. It takes more time, but if you only use these libraries in one project it may be worth keeping them in a single directory. The project subdirectory structure for this install is</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre><span class="nb">.</span>
├── README.md
├── scripts
│   ├── build-run.sh
│   └── install-external-libraries.sh
└── src
    ├── cblas_example.cpp
    └── lapacke_example.cpp
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Let’s first build the libraries</p>

<h3 id="download-and-build-openblas">Download and build OpenBlas</h3>

<p>We can download the source code from the official GitHub repository. We will install the most recent version at the time of writing, <code class="language-plaintext highlighter-rouge">0.3.30</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
</pre></td><td class="rouge-code"><pre>
<span class="c"># create a directory where you will build the software</span>
<span class="nb">mkdir </span>external
<span class="nb">cd </span>external

<span class="c"># select openblas version</span>
<span class="nv">OPENBLAS_VERSION</span><span class="o">=</span><span class="s2">"0.3.30"</span>
<span class="nv">OPENBLAS_URL</span><span class="o">=</span><span class="s2">"https://github.com/OpenMathLib/OpenBLAS/releases/download/v</span><span class="k">${</span><span class="nv">OPENBLAS_VERSION</span><span class="k">}</span><span class="s2">/OpenBLAS-</span><span class="k">${</span><span class="nv">OPENBLAS_VERSION</span><span class="k">}</span><span class="s2">.tar.gz"</span>

<span class="c"># install dir, change this to wherever you want</span>
<span class="nv">INSTALL_DIR</span><span class="o">=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/libs

<span class="c"># download</span>
wget <span class="k">${</span><span class="nv">OPENBLAS_URL</span><span class="k">}</span>

<span class="c"># untar</span>
<span class="nb">tar</span> <span class="nt">-xvzf</span> OpenBLAS-<span class="k">${</span><span class="nv">OPENBLAS_VERSION</span><span class="k">}</span>.tar.gz

<span class="c"># go to the untarred directory</span>
<span class="nb">cd </span>OpenBLAS-<span class="k">${</span><span class="nv">OPENBLAS_VERSION</span><span class="k">}</span>

<span class="c"># compile the library</span>
<span class="c"># we ran this in Ubuntu x86_64 architecture</span>
<span class="c"># can be different in other OSs and arch.</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> build <span class="o">&amp;&amp;</span> <span class="nb">cd </span>build
cmake <span class="nt">-DCMAKE_INSTALL_PREFIX</span><span class="o">=</span><span class="k">${</span><span class="nv">INSTALL_DIR</span><span class="k">}</span> <span class="se">\</span>
      <span class="nt">-DCMAKE_BUILD_TYPE</span><span class="o">=</span>Release <span class="se">\</span>
      <span class="nt">-DBUILD_SHARED_LIBS</span><span class="o">=</span>ON <span class="se">\</span>
      ..

make <span class="nt">-j</span> 64
make <span class="nb">install</span>
<span class="nb">cd</span> ../../..
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now you will find the headers and libraries in the installation directory. List the include directory first:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">ls</span> <span class="nv">$INSTALL_DIR</span>/include/openblas
</pre></td></tr></tbody></table></code></pre></div></div>

<p>where you will find <code class="language-plaintext highlighter-rouge">cblas.h</code>, <code class="language-plaintext highlighter-rouge">lapack.h</code>, <code class="language-plaintext highlighter-rouge">lapacke.h</code>.</p>

<p>Also the libraries</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">ls</span> <span class="nv">$INSTALL_DIR</span>/lib
</pre></td></tr></tbody></table></code></pre></div></div>

<p>to find <code class="language-plaintext highlighter-rouge">libopenblas.so</code> on Linux (<code class="language-plaintext highlighter-rouge">libopenblas.dylib</code> on macOS).</p>

<h3 id="download-and-build-lapacklapacke">Download and build Lapack/Lapacke</h3>

<p>Lapack and Lapacke expose essentially the same routines. Lapack is the original library, written in Fortran, which stores matrices in column-major order. If you come from the C/C++ world like me, you will probably prefer Lapacke, the C interface to the standard Lapack library. Its routines take a <code class="language-plaintext highlighter-rouge">matrix_layout</code> argument, so you can pass row-major arrays directly. Let’s install both libraries.</p>
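<p>To make the difference between the two storage orders concrete, here is a minimal sketch in plain C++ with no BLAS or Lapack calls. The helper names <code class="language-plaintext highlighter-rouge">at_row_major</code>, <code class="language-plaintext highlighter-rouge">at_col_major</code> and <code class="language-plaintext highlighter-rouge">to_col_major</code> are hypothetical and exist only for this illustration.</p>

```cpp
#include <cassert>
#include <vector>

// Element (i, j) of a matrix with `cols` columns stored row-major (C style).
inline double at_row_major(const std::vector<double>& m, int cols, int i, int j) {
    return m[i * cols + j];
}

// Element (i, j) of a matrix with `rows` rows stored column-major (Fortran style).
inline double at_col_major(const std::vector<double>& m, int rows, int i, int j) {
    return m[i + j * rows];
}

// Copy a rows x cols row-major matrix into column-major storage.
std::vector<double> to_col_major(const std::vector<double>& m, int rows, int cols) {
    assert(static_cast<int>(m.size()) == rows * cols);
    std::vector<double> out(m.size());
    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < cols; ++j) {
            out[i + j * rows] = m[i * cols + j];
        }
    }
    return out;
}
```

<p>For the 2 x 3 matrix with rows (1, 2, 3) and (4, 5, 6), the row-major array is 1, 2, 3, 4, 5, 6 while the column-major array is 1, 4, 2, 5, 3, 6. Both describe the same matrix.</p>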

<p>Let’s begin by downloading the source files</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre><span class="nv">LAPACK_VERSION</span><span class="o">=</span><span class="s2">"3.12.1"</span>
<span class="nv">LAPACK_URL</span><span class="o">=</span><span class="s2">"https://github.com/Reference-LAPACK/lapack/archive/refs/tags/v</span><span class="k">${</span><span class="nv">LAPACK_VERSION</span><span class="k">}</span><span class="s2">.tar.gz"</span>

<span class="c"># download, unpack and change directory</span>
wget <span class="k">${</span><span class="nv">LAPACK_URL</span><span class="k">}</span>
<span class="nb">tar</span> <span class="nt">-xvzf</span> v<span class="k">${</span><span class="nv">LAPACK_VERSION</span><span class="k">}</span>.tar.gz
<span class="nb">cd </span>lapack-<span class="k">${</span><span class="nv">LAPACK_VERSION</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and define the installation dir</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nv">INSTALL_DIR</span><span class="o">=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/libs
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Assuming again we are on Linux:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
</pre></td><td class="rouge-code"><pre><span class="nb">mkdir</span> <span class="nt">-p</span> build <span class="o">&amp;&amp;</span> <span class="nb">cd </span>build
cmake <span class="nt">-DCMAKE_INSTALL_PREFIX</span><span class="o">=</span><span class="k">${</span><span class="nv">INSTALL_DIR</span><span class="k">}</span>/lapack <span class="se">\</span>
    <span class="nt">-DCBLAS</span><span class="o">=</span>ON <span class="se">\</span>
    <span class="nt">-DBUILD_SHARED_LIBS</span><span class="o">=</span>ON <span class="se">\</span>
    <span class="nt">-DLAPACKE</span><span class="o">=</span>ON <span class="se">\</span>
    <span class="nt">-DBLAS_LIBRARIES</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">INSTALL_DIR</span><span class="k">}</span><span class="s2">/lib/libopenblas.so"</span> <span class="se">\</span>
    <span class="nt">-DCMAKE_BUILD_TYPE</span><span class="o">=</span>Release <span class="se">\</span>
    ..

<span class="c"># build and install</span>
make <span class="nt">-j</span> 64
make <span class="nb">install</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Finally you can check that both libraries are installed. With the commands above, <code class="language-plaintext highlighter-rouge">openblas</code> ends up directly under <code class="language-plaintext highlighter-rouge">${INSTALL_DIR}</code> (headers in <code class="language-plaintext highlighter-rouge">include/openblas</code>, libraries in <code class="language-plaintext highlighter-rouge">lib</code>) and <code class="language-plaintext highlighter-rouge">lapacke</code> under <code class="language-plaintext highlighter-rouge">${INSTALL_DIR}/lapack</code>. There you should find the headers and the compiled libraries.</p>

<h2 id="code-examples">Code examples</h2>

<h3 id="blas-examples">BLAS examples</h3>

<p>Now it is time to use the libraries. Please check the <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/blas-lapack-install">code</a> in the main GitHub repository for full details; we will be writing the file <code class="language-plaintext highlighter-rouge">cblas_example.cpp</code> from there. We will first do a basic dot product of two vectors. Take a look at the <a href="https://www.netlib.org/blas/">BLAS documentation</a> and at your <code class="language-plaintext highlighter-rouge">cblas.h</code> header to see the definitions of the functions. We use <code class="language-plaintext highlighter-rouge">cblas_ddot</code>, which is the <code class="language-plaintext highlighter-rouge">cblas</code> implementation of the double-precision <code class="language-plaintext highlighter-rouge">dot</code> (<code class="language-plaintext highlighter-rouge">ddot</code>) function. In the OpenBLAS <code class="language-plaintext highlighter-rouge">cblas.h</code> header the signature is <code class="language-plaintext highlighter-rouge">double cblas_ddot(OPENBLAS_CONST blasint n, OPENBLAS_CONST double *x, OPENBLAS_CONST blasint incx, OPENBLAS_CONST double *y, OPENBLAS_CONST blasint incy);</code> where</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">n</code>: number of elements of the vectors</li>
  <li><code class="language-plaintext highlighter-rouge">x</code>: pointer to the first array</li>
  <li><code class="language-plaintext highlighter-rouge">incx</code>: stride for first array</li>
  <li><code class="language-plaintext highlighter-rouge">y</code>: pointer to the second array</li>
  <li><code class="language-plaintext highlighter-rouge">incy</code>: stride for second array.</li>
</ul>

<p>The code is something like</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;cblas.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;vector&gt;</span><span class="cp">
</span>
<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">double</span><span class="o">&gt;</span> <span class="n">x</span> <span class="o">=</span> <span class="p">{</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">};</span>
<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">double</span><span class="o">&gt;</span> <span class="n">y</span> <span class="o">=</span> <span class="p">{</span><span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">};</span>

<span class="c1">// BLAS double dot operation</span>
<span class="kt">double</span> <span class="n">dot</span> <span class="o">=</span> <span class="n">cblas_ddot</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="n">x</span><span class="p">.</span><span class="n">data</span><span class="p">(),</span> <span class="mi">1</span><span class="p">,</span> <span class="n">y</span><span class="p">.</span><span class="n">data</span><span class="p">(),</span> <span class="mi">1</span><span class="p">);</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Both vectors are stored contiguously, so we set both strides to 1. The result of this operation is $1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = 32$.</p>
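<p>To see what the stride arguments mean, here is a minimal reference sketch of a strided dot product in plain C++. It is not how OpenBLAS implements <code class="language-plaintext highlighter-rouge">cblas_ddot</code>; the names <code class="language-plaintext highlighter-rouge">dot_strided</code> and <code class="language-plaintext highlighter-rouge">column_dot</code> are hypothetical, and the sketch assumes positive strides only.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference (non-optimized) strided dot product: uses elements
// x[0], x[incx], x[2*incx], ... and y[0], y[incy], y[2*incy], ...
// Hypothetical helper; only positive strides are handled here.
double dot_strided(int n, const double* x, int incx, const double* y, int incy) {
    assert(incx > 0 && incy > 0);
    double acc = 0.0;
    for (int i = 0; i < n; ++i) {
        const std::size_t ix = static_cast<std::size_t>(i) * incx;
        const std::size_t iy = static_cast<std::size_t>(i) * incy;
        acc += x[ix] * y[iy];
    }
    return acc;
}

// Dot product of column `col` of a row-major matrix `a` (rows x cols)
// with a contiguous vector `v` of length `rows`. Walking down a column
// of a row-major matrix means jumping `cols` elements at a time.
double column_dot(const std::vector<double>& a, int rows, int cols, int col,
                  const std::vector<double>& v) {
    return dot_strided(rows, a.data() + col, cols, v.data(), 1);
}
```

<p>A stride larger than 1 is useful when the vector is embedded in a bigger array, for example a column of a row-major matrix.</p>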

<p>For a second operation we will do a matrix multiplication using the function <code class="language-plaintext highlighter-rouge">dgemm</code>, which is the double-precision version of the GEneral Matrix Multiplication. Checking the <a href="https://www.netlib.org/blas/">BLAS documentation</a> (you can also check <code class="language-plaintext highlighter-rouge">cblas.h</code>) we see the signature of the function is</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
</pre></td><td class="rouge-code"><pre><span class="kt">void</span> <span class="nf">cblas_dgemm</span><span class="p">(</span>
  <span class="n">OPENBLAS_CONST</span> <span class="k">enum</span> <span class="n">CBLAS_ORDER</span>     <span class="n">Order</span><span class="p">,</span>   <span class="c1">// memory layout: row/col-major</span>
  <span class="n">OPENBLAS_CONST</span> <span class="k">enum</span> <span class="n">CBLAS_TRANSPOSE</span> <span class="n">TransA</span><span class="p">,</span>  <span class="c1">// op on A: NoTrans / Trans / ConjTrans</span>
  <span class="n">OPENBLAS_CONST</span> <span class="k">enum</span> <span class="n">CBLAS_TRANSPOSE</span> <span class="n">TransB</span><span class="p">,</span>  <span class="c1">// op on B: NoTrans / Trans / ConjTrans</span>
  <span class="n">OPENBLAS_CONST</span> <span class="n">blasint</span> <span class="n">M</span><span class="p">,</span>                    <span class="c1">// rows of op(A) and C</span>
  <span class="n">OPENBLAS_CONST</span> <span class="n">blasint</span> <span class="n">N</span><span class="p">,</span>                    <span class="c1">// cols of op(B) and C</span>
  <span class="n">OPENBLAS_CONST</span> <span class="n">blasint</span> <span class="n">K</span><span class="p">,</span>                    <span class="c1">// cols of op(A) and rows of op(B)</span>
  <span class="n">OPENBLAS_CONST</span> <span class="kt">double</span> <span class="n">alpha</span><span class="p">,</span>                 <span class="c1">// scales A*B</span>
  <span class="n">OPENBLAS_CONST</span> <span class="kt">double</span> <span class="o">*</span><span class="n">A</span><span class="p">,</span>                    <span class="c1">// pointer to A</span>
  <span class="n">OPENBLAS_CONST</span> <span class="n">blasint</span> <span class="n">lda</span><span class="p">,</span>                  <span class="c1">// leading dimension of A</span>
  <span class="n">OPENBLAS_CONST</span> <span class="kt">double</span> <span class="o">*</span><span class="n">B</span><span class="p">,</span>                    <span class="c1">// pointer to B</span>
  <span class="n">OPENBLAS_CONST</span> <span class="n">blasint</span> <span class="n">ldb</span><span class="p">,</span>                  <span class="c1">// leading dimension of B</span>
  <span class="n">OPENBLAS_CONST</span> <span class="kt">double</span> <span class="n">beta</span><span class="p">,</span>                  <span class="c1">// scales existing C</span>
  <span class="kt">double</span> <span class="o">*</span><span class="n">C</span><span class="p">,</span>                                   <span class="c1">// pointer to C (in/out)</span>
  <span class="n">OPENBLAS_CONST</span> <span class="n">blasint</span> <span class="n">ldc</span>                   <span class="c1">// leading dimension of C</span>
<span class="p">);</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and the operation is</p>

\[C \leftarrow \alpha\,\mathrm{op}(A)\,\mathrm{op}(B) + \beta C\]
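<p>Written element by element, with $\mathrm{op}(A)$ standing for $A$ or its transpose as selected by <code class="language-plaintext highlighter-rouge">TransA</code> (and likewise for $B$ and <code class="language-plaintext highlighter-rouge">TransB</code>), each entry of the $M \times N$ result follows the standard matrix product definition:</p>

\[C_{ij} \leftarrow \alpha \sum_{k=1}^{K} \mathrm{op}(A)_{ik}\,\mathrm{op}(B)_{kj} + \beta\, C_{ij}, \qquad i = 1,\dots,M, \quad j = 1,\dots,N\]

<p>With $\alpha = 1$ and $\beta = 0$ this reduces to the plain matrix product.</p>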

<p>The parameters are explained in the signature above. One thing that may be confusing is the leading dimension of the matrices. It is the distance in memory, counted in elements, between the start of two consecutive rows (row-major) or columns (column-major). For a matrix stored without padding this means that if the matrix is row-major the leading dimension is the number of columns and, vice versa, if the matrix is column-major the leading dimension is the number of rows. This is important because we are passing flat arrays, so the routine needs to know each dimension and how the matrices are laid out. As an example, let’s multiply a matrix <code class="language-plaintext highlighter-rouge">A</code> of size <code class="language-plaintext highlighter-rouge">M x K = 2 x 3</code> and a matrix <code class="language-plaintext highlighter-rouge">B</code> of size <code class="language-plaintext highlighter-rouge">K x N = 3 x 2</code> and store the result in <code class="language-plaintext highlighter-rouge">C</code> of size <code class="language-plaintext highlighter-rouge">M x N = 2 x 2</code> using the function <code class="language-plaintext highlighter-rouge">cblas_dgemm</code>:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;cblas.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;vector&gt;</span><span class="cp">
</span>
<span class="k">const</span> <span class="kt">int</span> <span class="n">M</span> <span class="o">=</span> <span class="mi">2</span><span class="p">,</span> <span class="n">K</span> <span class="o">=</span> <span class="mi">3</span><span class="p">,</span> <span class="n">N</span> <span class="o">=</span> <span class="mi">2</span><span class="p">;</span>

<span class="c1">// Row-major layout</span>
<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">double</span><span class="o">&gt;</span> <span class="n">A</span> <span class="o">=</span> <span class="p">{</span>
    <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span>
    <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span>
<span class="p">};</span> <span class="c1">// 2x3=MxK</span>

<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">double</span><span class="o">&gt;</span> <span class="n">B</span> <span class="o">=</span> <span class="p">{</span>
    <span class="mi">7</span><span class="p">,</span>  <span class="mi">8</span><span class="p">,</span>
    <span class="mi">9</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span>
    <span class="mi">11</span><span class="p">,</span> <span class="mi">12</span>
<span class="p">};</span> <span class="c1">// 3x2=KxN</span>

<span class="c1">// C: matrix of zeroes, we save the result there</span>
<span class="c1">// A (MxK), B (KxN) -&gt; C (MxN)</span>
<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">double</span><span class="o">&gt;</span> <span class="n">C</span><span class="p">(</span><span class="n">M</span> <span class="o">*</span> <span class="n">N</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">);</span> <span class="c1">// 2x2</span>

<span class="c1">// C := alpha * Op(A) * Op(B) + beta * C</span>
<span class="n">cblas_dgemm</span><span class="p">(</span>
    <span class="n">CblasRowMajor</span><span class="p">,</span>    <span class="c1">// Matrix order, our case is Row major</span>
    <span class="n">CblasNoTrans</span><span class="p">,</span>     <span class="c1">// op(A) = A, do not transpose A</span>
    <span class="n">CblasNoTrans</span><span class="p">,</span>     <span class="c1">// op(B) = B, do not transpose B</span>
    <span class="n">M</span><span class="p">,</span>                <span class="c1">// number of rows of op(A) and C</span>
    <span class="n">N</span><span class="p">,</span>                <span class="c1">// number of columns of op(B) and C</span>
    <span class="n">K</span><span class="p">,</span>                <span class="c1">// number of columns of op(A) and rows of op(B)</span>
    <span class="mf">1.0</span><span class="p">,</span>              <span class="c1">// alpha</span>
    <span class="n">A</span><span class="p">.</span><span class="n">data</span><span class="p">(),</span>         <span class="c1">// A</span>
    <span class="n">K</span><span class="p">,</span>                <span class="c1">// for row-major, lda = #cols of A</span>
    <span class="n">B</span><span class="p">.</span><span class="n">data</span><span class="p">(),</span>         <span class="c1">// B</span>
    <span class="n">N</span><span class="p">,</span>                <span class="c1">// for row-major, ldb = #cols of B</span>
    <span class="mf">0.0</span><span class="p">,</span>              <span class="c1">// beta</span>
    <span class="n">C</span><span class="p">.</span><span class="n">data</span><span class="p">(),</span>         <span class="c1">// C</span>
    <span class="n">N</span>                 <span class="c1">// for row-major, ldc = #cols of C</span>
<span class="p">);</span>

<span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"C = A*B:</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">M</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">j</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">j</span> <span class="o">&lt;</span> <span class="n">N</span><span class="p">;</span> <span class="o">++</span><span class="n">j</span><span class="p">)</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">C</span><span class="p">[</span><span class="n">i</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">j</span><span class="p">]</span> <span class="o">&lt;&lt;</span> <span class="s">" "</span><span class="p">;</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// Expected:</span>
<span class="c1">// [ 58  64 ]</span>
<span class="c1">// [139 154 ]</span>
</pre></td></tr></tbody></table></code></pre></div></div>
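<p>To see how the leading dimensions are used for indexing, here is a naive reference sketch of the same row-major product in plain C++, without transposes. The function name <code class="language-plaintext highlighter-rouge">gemm_row_major</code> is hypothetical and this triple loop is only an illustration, not the optimized algorithm OpenBLAS runs.</p>

```cpp
#include <cassert>
#include <vector>

// Naive reference for C := alpha * A * B + beta * C with row-major storage
// and no transposes. A is M x K, B is K x N, C is M x N.
// lda, ldb, ldc are the distances (in elements) between consecutive rows.
// Hypothetical helper, not part of BLAS.
void gemm_row_major(int M, int N, int K, double alpha,
                    const double* A, int lda,
                    const double* B, int ldb,
                    double beta, double* C, int ldc) {
    assert(lda >= K && ldb >= N && ldc >= N);
    for (int i = 0; i < M; ++i) {
        for (int j = 0; j < N; ++j) {
            double acc = 0.0;
            for (int k = 0; k < K; ++k) {
                acc += A[i * lda + k] * B[k * ldb + j];
            }
            // When beta is zero the previous content of C is ignored.
            const double prev = (beta == 0.0) ? 0.0 : beta * C[i * ldc + j];
            C[i * ldc + j] = alpha * acc + prev;
        }
    }
}
```

<p>Called with the matrices above and <code class="language-plaintext highlighter-rouge">lda = 3</code>, <code class="language-plaintext highlighter-rouge">ldb = 2</code>, <code class="language-plaintext highlighter-rouge">ldc = 2</code>, it produces the same four values. If each row of <code class="language-plaintext highlighter-rouge">A</code> were padded with one extra element, only <code class="language-plaintext highlighter-rouge">lda</code> would change to 4.</p>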

<p>The output of the <code class="language-plaintext highlighter-rouge">cblas_dgemm</code> example matches the expected values. In the <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/blas-lapack-install">code</a> you will find a file named <code class="language-plaintext highlighter-rouge">src/cblas_example.cpp</code>. To compile it, use the bash script <code class="language-plaintext highlighter-rouge">scripts/build-run.sh</code>, which contains this code:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
</pre></td><td class="rouge-code"><pre>
<span class="nv">OPENBLAS_INC</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span><span class="s2">/external/lib/openblas/include/openblas"</span>
<span class="nv">OPENBLAS_LIB</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span><span class="s2">/external/lib/openblas/lib"</span>

<span class="c"># OPENBLAS example</span>
<span class="c"># compile object</span>
g++ <span class="nt">-O3</span> <span class="se">\</span>
    <span class="nt">-std</span><span class="o">=</span>c++17 <span class="se">\</span>
    <span class="nt">-c</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/src/cblas_example.cpp <span class="se">\</span>
    <span class="nt">-I</span><span class="k">${</span><span class="nv">OPENBLAS_INC</span><span class="k">}</span> <span class="se">\</span>
    <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/cblas_example.o

<span class="c"># compile binary</span>
g++ <span class="nt">-O3</span> <span class="se">\</span>
    <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/cblas_example.o <span class="se">\</span>
    <span class="nt">-L</span><span class="k">${</span><span class="nv">OPENBLAS_LIB</span><span class="k">}</span> <span class="se">\</span>
    <span class="nt">-lopenblas</span> <span class="se">\</span>
    <span class="nt">-Wl</span>,-rpath,<span class="k">${</span><span class="nv">OPENBLAS_LIB</span><span class="k">}</span> <span class="se">\</span>
    <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/bin/cblas_example
</pre></td></tr></tbody></table></code></pre></div></div>

<p>where we first compile the source file to an object file using the include path and then link it against the openblas library. This code assumes that your openblas library has been installed in the repository directory <code class="language-plaintext highlighter-rouge">external/lib/openblas</code>. If this is not the case and you have installed it elsewhere, just change the variables <code class="language-plaintext highlighter-rouge">OPENBLAS_INC</code> and <code class="language-plaintext highlighter-rouge">OPENBLAS_LIB</code>. It also assumes that the directories <code class="language-plaintext highlighter-rouge">build/obj</code> and <code class="language-plaintext highlighter-rouge">build/bin</code> already exist inside the project to store the object files and binaries.</p>

<p>If you follow the bash script, just execute <code class="language-plaintext highlighter-rouge">scripts/build-run.sh</code> and after compilation the executables will be in <code class="language-plaintext highlighter-rouge">build/bin</code> in the same project directory.</p>

<h3 id="lapacke-examples">LAPACKE examples</h3>

<p>For Lapacke we will do just one example, in a source file that we will name <code class="language-plaintext highlighter-rouge">lapacke_example.cpp</code>. As before, all signatures for this library (Lapacke) are in <code class="language-plaintext highlighter-rouge">lapacke.h</code> in your installation directory. In this example we will use the function <a href="https://netlib.org/lapack/explore-html-3.6.1/d7/d3b/group__double_g_esolve_ga5ee879032a8365897c3ba91e3dc8d512.html">dgesv</a> to calculate the solution to a real system of linear equations.</p>

\[A \times X = B\]

<p>where $A$ is an $N \times N$ matrix, and $X$ and $B$ are matrices with $N$ rows and one column per right-hand side. With a single right-hand side they are plain vectors of size $N$. The Lapacke version of <code class="language-plaintext highlighter-rouge">dgesv</code> has the signature:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
</pre></td><td class="rouge-code"><pre><span class="n">lapack_int</span> <span class="nf">LAPACKE_dgesv</span><span class="p">(</span>
    <span class="kt">int</span> <span class="n">matrix_layout</span><span class="p">,</span>     <span class="c1">// LAPACK_ROW_MAJOR or LAPACK_COL_MAJOR</span>
    <span class="n">lapack_int</span> <span class="n">n</span><span class="p">,</span>          <span class="c1">// order of A (A is n×n)</span>
    <span class="n">lapack_int</span> <span class="n">nrhs</span><span class="p">,</span>       <span class="c1">// number of right-hand sides (columns of B/X)</span>
    <span class="kt">double</span><span class="o">*</span> <span class="n">a</span><span class="p">,</span>             <span class="c1">// in: A (n×n); out: combined L and U factors</span>
    <span class="n">lapack_int</span> <span class="n">lda</span><span class="p">,</span>        <span class="c1">// leading dimension of A</span>
    <span class="n">lapack_int</span><span class="o">*</span> <span class="n">ipiv</span><span class="p">,</span>      <span class="c1">// out: pivot indices (size n, 1-based)</span>
    <span class="kt">double</span><span class="o">*</span> <span class="n">b</span><span class="p">,</span>             <span class="c1">// in: B (n×nrhs); out: solution X (n×nrhs)</span>
    <span class="n">lapack_int</span> <span class="n">ldb</span>         <span class="c1">// leading dimension of B</span>
<span class="p">);</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This function computes the LU factorization of $A$ with partial pivoting and uses it to solve $A \times X = B$. We already explained leading dimensions in the previous section; what is new here are the pivot indices. They record the row interchanges performed during the factorization: row $i$ of the matrix was interchanged with row <code class="language-plaintext highlighter-rouge">ipiv[i]</code>. This is part of the <a href="https://en.wikipedia.org/wiki/LU_decomposition">LU decomposition</a> algorithm, which we won’t explain in detail here.</p>

<p>We define the following system of equations for our example</p>

\[\left[ {\begin{array}{ccc}
3 &amp; 1 &amp; 2 \\
6 &amp; 3 &amp; 4\\
3 &amp; 1 &amp; 5\\
\end{array} } \right]
\times
\left[ {\begin{array}{c}
x \\
y \\
z \\
\end{array}} \right]
=
\left[ {\begin{array}{c}
0 \\
1 \\
3 \\
\end{array}} \right]\]

<p>with solution</p>

\[\left[ {\begin{array}{c}
x \\
y \\
z \\
\end{array}} \right]
=
\left[ {\begin{array}{c}
-1 \\
1 \\
1 \\
\end{array}} \right]\]
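<p>To get a feel for what the solver does and what the pivot indices record, here is a hand-written sketch of Gaussian elimination with partial pivoting in plain C++. It is an illustration under simplifying assumptions (row-major storage, a single right-hand side, no blocking) and not Lapack’s implementation; the function name <code class="language-plaintext highlighter-rouge">lu_solve</code> is hypothetical.</p>

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Sketch of Gaussian elimination with partial pivoting for a row-major
// n x n system A x = b (hypothetical helper, not LAPACK code).
// On return A holds the L and U factors, b holds the solution and
// ipiv[k] is the 1-based row that was swapped with row k + 1.
// Returns 0 on success, or k + 1 if the pivot of step k is exactly zero.
int lu_solve(int n, std::vector<double>& A, std::vector<int>& ipiv,
             std::vector<double>& b) {
    assert(static_cast<int>(A.size()) == n * n);
    assert(static_cast<int>(b.size()) == n);
    ipiv.assign(n, 0);
    for (int k = 0; k < n; ++k) {
        // Pick the row with the largest absolute value in column k.
        int p = k;
        for (int i = k + 1; i < n; ++i) {
            if (std::fabs(A[i * n + k]) > std::fabs(A[p * n + k])) p = i;
        }
        ipiv[k] = p + 1;
        if (A[p * n + k] == 0.0) return k + 1;  // singular matrix
        if (p != k) {
            for (int j = 0; j < n; ++j) std::swap(A[k * n + j], A[p * n + j]);
            std::swap(b[k], b[p]);
        }
        // Eliminate column k below the diagonal, keeping the multipliers in place.
        for (int i = k + 1; i < n; ++i) {
            const double l = A[i * n + k] / A[k * n + k];
            A[i * n + k] = l;
            for (int j = k + 1; j < n; ++j) A[i * n + j] -= l * A[k * n + j];
            b[i] -= l * b[k];
        }
    }
    // Back substitution with the upper-triangular factor U.
    for (int i = n - 1; i >= 0; --i) {
        for (int j = i + 1; j < n; ++j) b[i] -= A[i * n + j] * b[j];
        b[i] /= A[i * n + i];
    }
    return 0;
}
```

<p>For the system above, the first step swaps rows 1 and 2 because 6 is the largest entry in the first column, and the sketch ends with the solution $(-1, 1, 1)$ stored in <code class="language-plaintext highlighter-rouge">b</code>.</p>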

<p>Let’s write the source file example to solve this equation using <code class="language-plaintext highlighter-rouge">dgesv</code>:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;cstdio&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;vector&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;lapacke.h&gt;</span><span class="cp">
</span>
<span class="c1">// Solve A x = b for x, overwriting b with the solution.</span>
<span class="c1">// Uses LAPACKE_dgesv (LU factorization with partial pivoting).</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// Example 3x3 system</span>
    <span class="c1">// A =</span>
    <span class="c1">// [ 3  1  2 ]</span>
    <span class="c1">// [ 6  3  4 ]</span>
    <span class="c1">// [ 3  1  5 ]</span>
    <span class="c1">// b = [ 0, 1, 3 ]^T</span>
    <span class="k">const</span> <span class="kt">int</span> <span class="n">n</span> <span class="o">=</span> <span class="mi">3</span><span class="p">;</span>          <span class="c1">// order of A</span>
    <span class="k">const</span> <span class="kt">int</span> <span class="n">nrhs</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>       <span class="c1">// number of right-hand sides</span>
    <span class="k">const</span> <span class="kt">int</span> <span class="n">lda</span> <span class="o">=</span> <span class="n">n</span><span class="p">;</span>        <span class="c1">// leading dimension of A (row-major -&gt; lda = n)</span>
    <span class="k">const</span> <span class="kt">int</span> <span class="n">ldb</span> <span class="o">=</span> <span class="n">nrhs</span><span class="p">;</span>     <span class="c1">// leading dimension of B (row-major -&gt; ldb = nrhs)</span>

    <span class="c1">// Row-major storage (C style)</span>
    <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">double</span><span class="o">&gt;</span> <span class="n">A</span> <span class="o">=</span> <span class="p">{</span>
        <span class="mf">3.0</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span> <span class="mf">2.0</span><span class="p">,</span>
        <span class="mf">6.0</span><span class="p">,</span> <span class="mf">3.0</span><span class="p">,</span> <span class="mf">4.0</span><span class="p">,</span>
        <span class="mf">3.0</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span> <span class="mf">5.0</span>
    <span class="p">};</span>
    <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">double</span><span class="o">&gt;</span> <span class="n">b</span> <span class="o">=</span> <span class="p">{</span> <span class="mf">0.0</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">,</span> <span class="mf">3.0</span> <span class="p">};</span>

    <span class="c1">// Pivot indices</span>
    <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">lapack_int</span><span class="o">&gt;</span> <span class="n">ipiv</span><span class="p">(</span><span class="n">n</span><span class="p">);</span>

    <span class="c1">// Call LAPACKE (row-major)</span>
    <span class="n">lapack_int</span> <span class="n">info</span> <span class="o">=</span> <span class="n">LAPACKE_dgesv</span><span class="p">(</span><span class="n">LAPACK_ROW_MAJOR</span><span class="p">,</span>
                                    <span class="n">n</span><span class="p">,</span> <span class="n">nrhs</span><span class="p">,</span>
                                    <span class="n">A</span><span class="p">.</span><span class="n">data</span><span class="p">(),</span> <span class="n">lda</span><span class="p">,</span>
                                    <span class="n">ipiv</span><span class="p">.</span><span class="n">data</span><span class="p">(),</span>
                                    <span class="n">b</span><span class="p">.</span><span class="n">data</span><span class="p">(),</span> <span class="n">ldb</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">info</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">fprintf</span><span class="p">(</span><span class="n">stderr</span><span class="p">,</span> <span class="s">"U(%d,%d) is exactly zero; singular matrix.</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">info</span><span class="p">,</span> <span class="n">info</span><span class="p">);</span>
        <span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">else</span> <span class="nf">if</span> <span class="p">(</span><span class="n">info</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">fprintf</span><span class="p">(</span><span class="n">stderr</span><span class="p">,</span> <span class="s">"Argument %d to dgesv had an illegal value.</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="o">-</span><span class="n">info</span><span class="p">);</span>
        <span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="c1">// b now contains the solution x</span>
    <span class="n">std</span><span class="o">::</span><span class="n">printf</span><span class="p">(</span><span class="s">"Solution x:</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">n</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">printf</span><span class="p">(</span><span class="s">"x[%d] = %.2f</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">b</span><span class="p">[</span><span class="n">i</span><span class="p">]);</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This code prints the solution on screen. Admittedly, we don’t need <code class="language-plaintext highlighter-rouge">double</code> or even <code class="language-plaintext highlighter-rouge">float</code> precision for this calculation (the solution is <code class="language-plaintext highlighter-rouge">(-1, 1, 1)</code>), but LAPACKE provides no routines for <code class="language-plaintext highlighter-rouge">int</code> data. For more functions check the <a href="https://www.netlib.org/lapack/lug/">LAPACK</a> documentation and take a look at your <code class="language-plaintext highlighter-rouge">lapacke.h</code> header, where you will find all the declarations.</p>

<p>To compile the above code, run the following, assuming you have installed <code class="language-plaintext highlighter-rouge">openblas</code> and <code class="language-plaintext highlighter-rouge">lapacke</code> inside the current project directory:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
</pre></td><td class="rouge-code"><pre><span class="nv">OPENBLAS_INC</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span><span class="s2">/external/lib/openblas/include/openblas"</span>
<span class="nv">OPENBLAS_LIB</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span><span class="s2">/external/lib/openblas/lib"</span>

<span class="nv">LAPACKE_INC</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span><span class="s2">/external/lib/lapack/include"</span>
<span class="nv">LAPACKE_LIB</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span><span class="s2">/external/lib/lapack/lib"</span>

<span class="c"># compile object</span>
g++ <span class="nt">-O3</span> <span class="se">\</span>
    <span class="nt">-std</span><span class="o">=</span>c++17 <span class="se">\</span>
    <span class="nt">-c</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/src/lapacke_example.cpp <span class="se">\</span>
    <span class="nt">-I</span><span class="k">${</span><span class="nv">LAPACKE_INC</span><span class="k">}</span> <span class="se">\</span>
    <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/lapacke_example.o

<span class="c"># link binary</span>
g++ <span class="nt">-O3</span> <span class="se">\</span>
<span class="s2">"</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span><span class="s2">/build/obj/lapacke_example.o"</span> <span class="se">\</span>
<span class="nt">-L</span><span class="s2">"</span><span class="k">${</span><span class="nv">LAPACKE_LIB</span><span class="k">}</span><span class="s2">"</span> <span class="se">\</span>
<span class="nt">-L</span><span class="s2">"</span><span class="k">${</span><span class="nv">OPENBLAS_LIB</span><span class="k">}</span><span class="s2">"</span> <span class="se">\</span>
<span class="nt">-llapacke</span> <span class="nt">-lopenblas</span> <span class="se">\</span>
<span class="nt">-Wl</span>,-rpath,<span class="s2">"</span><span class="k">${</span><span class="nv">LAPACKE_LIB</span><span class="k">}</span><span class="s2">"</span> <span class="se">\</span>
<span class="nt">-Wl</span>,-rpath,<span class="s2">"</span><span class="k">${</span><span class="nv">OPENBLAS_LIB</span><span class="k">}</span><span class="s2">"</span> <span class="se">\</span>
<span class="nt">-o</span> <span class="s2">"</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span><span class="s2">/build/bin/lapacke_example"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>where <code class="language-plaintext highlighter-rouge">ROOT_DIR</code> is the root directory of the repository. If you have installed the libraries elsewhere, just change the include and library directories above. In any case there is a bash script in the <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/blas-lapack-install">GitHub repository</a> that compiles the executable.</p>

<h2 id="final-remarks">Final remarks</h2>

<p>I hope you have enjoyed this tutorial. I believe it’s important to use these two libraries for numerical computing; knowing them well can save you a lot of time (not only your own, but computational time too!). Besides, these libraries are really well optimized, and although they are old they are still well maintained: they are the standard.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="C++" /><category term="computer science" /><category term="mathematics" /><summary type="html"><![CDATA[BLAS (Basic Linear Algebra Subprograms) is a well-tested library for basic linear algebra operations, for instance vector operations, dot products, matrix transpositions and multiplications. The LAPACK library (Linear Algebra PACKage) is a higher level library built on top of BLAS: while BLAS provides low-level building blocks (vector, matrix operations), LAPACK implements full algorithms for solving core linear algebra problems. For instance, with LAPACK we can solve systems of linear equations (LU decomposition), least squares problems, eigenvalue problems and matrix factorizations (LU, Cholesky, QR, Schur). 
LAPACK was originally written in Fortran, so arrays are stored in column-major order instead of the row-major order usual in C. That can be a source of errors for an avid C developer; however, there is a C wrapper for LAPACK called LAPACKE (see the user guide).]]></summary></entry><entry><title type="html">Audio Transcription using Whisper from OpenAI</title><link href="https://agramunt.me/posts/audio-transcription/" rel="alternate" type="text/html" title="Audio Transcription using Whisper from OpenAI" /><published>2025-08-21T02:22:00-07:00</published><updated>2025-10-24T21:59:49-07:00</updated><id>https://agramunt.me/posts/audio-transcription</id><content type="html" xml:base="https://agramunt.me/posts/audio-transcription/"><![CDATA[<p>Recently I found myself with the need to transcribe an entire <a href="https://www.youtube.com/watch?v=3GL64FIqgtg&amp;t=912s">YouTube interview</a>. The purpose of this post is to use AI to transcribe the audio to text and then translate it from Spanish to English.</p>

<p>The interview was hosted by people from <a href="https://opground.com/">Opground.com</a>. I would like to acknowledge <a href="https://www.linkedin.com/in/eduardteixidoviladrich/">Eduard Teixidó</a> and <a href="https://www.linkedin.com/in/marcelgozalbobaro/">Marcel Gozalbo</a> from Opground for the interview. I also thank <a href="https://lambda.ai/">Lambda AI</a> for providing free credit to run inference with the AI models described in this post.</p>

<h2 id="introduction">Introduction</h2>

<p>Transcription is the process of converting speech or audio into written text. As an example, the Spanish Congress of Deputies employs stenographers: people who write down on paper everything that is said in the chamber so that it can later be saved and published officially. These stenographers do their job perfectly, transcribing exactly what is said. Technology, however, can help us accelerate the transcription of audio that is already recorded so that we can work with the text.</p>

<p>One of the first audio-to-text systems was <a href="https://www.bbc.com/future/article/20170214-the-machines-that-learned-to-listen">Audrey by Bell Labs</a>, developed in the 1950s. The system was able to recognize phonemes, not words or sentences, and “the huge machine occupied a six-foot-high relay rack, consumed substantial power and had streams of cables”.</p>

<p>Audrey was a great breakthrough but clearly not feasible for practical use. Luckily, the field of AI has made huge progress in the last two decades, and one of the models with great performance is <a href="https://openai.com/index/whisper/">Whisper from OpenAI</a>. Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. The large model has around 1.55B parameters, or about 6.2 GB when stored as 32-bit floating point numbers. Find more details of the model in the paper <a href="https://cdn.openai.com/papers/whisper.pdf">Robust Speech Recognition via Large-Scale Weak Supervision</a> and the GitHub repository <a href="https://github.com/openai/whisper/tree/main?tab=readme-ov-file">openai/whisper</a>. This model is complex, so I refer the reader to the references provided to understand the architecture and the background. We will run inference on Spanish audio, and according to <a href="https://github.com/openai/whisper/tree/main">the Readme</a> of the repository, the <code class="language-plaintext highlighter-rouge">large-v3</code> model has a <a href="https://en.wikipedia.org/wiki/Word_error_rate">Word Error Rate</a> (WER) of around 4.7, which is an impressive metric. In this post we will use the Whisper <code class="language-plaintext highlighter-rouge">large-v3</code> model to transcribe the interview.</p>
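<p>To make the WER metric concrete, here is a minimal sketch of how it is commonly computed: the word-level edit distance between a reference transcript and the model output (substitutions, deletions and insertions), divided by the number of words in the reference. The function name <code class="language-plaintext highlighter-rouge">word_error_rate</code> and the two example sentences are my own illustration; this is not the evaluation code used for Whisper, which also normalizes the text before scoring.</p>

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed prefix of ref
    # and the first j words of hyp.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word and one dropped word.
reference = "hola buenos días a todos"
hypothesis = "hola buenas días todos"
print(word_error_rate(reference, hypothesis))
```

<p>In this toy example two of the five reference words are wrong, so the rate is 2/5. WER is usually reported as a percentage, and because insertions count as errors it can exceed 100%.</p>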

<h2 id="instance-in-lambda-ai">Instance in Lambda AI</h2>

<p>As mentioned, I’m using <a href="https://lambda.ai/">Lambda AI</a> as the cloud service that provides the GPU. Yes, I tried running the model on my Mac and… surprise, I had to cancel it because it was taking too long. I’m using a <code class="language-plaintext highlighter-rouge">gpu_1x_a100_sxm4</code> machine. Once logged in to the machine, I run <code class="language-plaintext highlighter-rouge">gpu_info</code> (a CLI tool I built in another post) to get the characteristics of the GPU.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
</pre></td><td class="rouge-code"><pre>Detected 1 CUDA Capable Device(s)

Device 0: NVIDIA A100-SXM4-40GB
  PCI Domain/Bus/Device ID: 0/7/0
  Compute capability: 8.0
  Total global memory: 40442.4 MB
  Free memory (current): 40019.6 MB
  Total allocatable memory (current): 40442.4 MB
  Memory clock rate: 1215 MHz
  Memory bus width: 5120 bits
  L2 cache size: 40960 KB
  Max shared memory per block: 48 KB
  Total constant memory: 64 KB
  Warp size: 32
  Max threads per block: 1024
  Max threads per multiprocessor: 2048
  Multiprocessor count: 108
  Max grid dimensions: [2147483647, 65535, 65535]
  Max block dimensions: [1024, 1024, 64]
  Clock rate: 1410 MHz
  Concurrent kernels: Yes
  ECC enabled: Yes
  Integrated device: No
  Can map host memory: Yes
  Compute mode: Default
  Unified addressing: Yes
  Async engines: 3
  Device overlap: Yes
  PCI bus ID: 7
  PCI device ID: 0
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This is an NVIDIA A100 GPU (Ampere architecture) with 40GB of memory, a great GPU for our purpose: inference. Running <code class="language-plaintext highlighter-rouge">lscpu</code> gives the information about the CPU of the machine. I won’t go into detail here, but it is an <code class="language-plaintext highlighter-rouge">x86_64</code> architecture, model <code class="language-plaintext highlighter-rouge">AMD EPYC 7J13 64-Core Processor</code> (check specs <a href="https://www.cpubenchmark.net/cpu.php?cpu=AMD+EPYC+7J13&amp;id=4300">here</a>). Pretty nice machine indeed!</p>
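<p>Two numbers from the listing above are worth a quick back-of-the-envelope calculation: the peak memory bandwidth implied by the memory clock and bus width, and how much of the 40GB the Whisper weights would occupy. The sketch below is only an estimate. It assumes the HBM2 memory transfers data twice per clock cycle (double data rate), and it counts the model weights only, using the roughly 1.55B parameters mentioned in the introduction; activations and framework overhead come on top of that.</p>

```python
# Values copied from the gpu_info output above.
MEMORY_CLOCK_HZ = 1215e6   # 1215 MHz
BUS_WIDTH_BITS = 5120

# Assumption: double data rate memory, two transfers per clock cycle.
TRANSFERS_PER_CLOCK = 2

# Approximate parameter count of the large Whisper model.
N_PARAMS = 1.55e9


def peak_bandwidth_gb_s(clock_hz, bus_width_bits, transfers_per_clock):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return clock_hz * transfers_per_clock * bytes_per_transfer / 1e9


def weights_size_gb(n_params, bytes_per_param):
    """Memory needed to store the weights alone, in GB."""
    return n_params * bytes_per_param / 1e9


bandwidth = peak_bandwidth_gb_s(MEMORY_CLOCK_HZ, BUS_WIDTH_BITS, TRANSFERS_PER_CLOCK)
fp32_gb = weights_size_gb(N_PARAMS, 4)  # 32-bit floats, 4 bytes each
fp16_gb = weights_size_gb(N_PARAMS, 2)  # 16-bit floats, 2 bytes each

print(f"peak memory bandwidth: {bandwidth:.1f} GB/s")
print(f"weights in fp32: {fp32_gb:.1f} GB, in fp16: {fp16_gb:.1f} GB")
```

<p>Under these assumptions the bandwidth comes out at about 1555 GB/s, and the weights take a small fraction of the available memory in either precision, which is why a single A100 is comfortable for this job.</p>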

<h2 id="downloading-audio-from-youtube">Downloading audio from YouTube</h2>

<p>First we need to download the audio, which you can do directly from YouTube. The tool we will use to download it normally relies on the cookies from your browser. That makes things hard, as remote machines are normally command-line only and don’t have a browser. It is more convenient to download the audio on your local machine and then copy it to the remote machine.</p>

<p>Use the following script, and name it <code class="language-plaintext highlighter-rouge">download_audio.py</code>:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
</pre></td><td class="rouge-code"><pre><span class="kn">import</span> <span class="n">argparse</span>
<span class="kn">from</span> <span class="n">pathlib</span> <span class="kn">import</span> <span class="n">Path</span>
<span class="kn">from</span> <span class="n">yt_dlp</span> <span class="kn">import</span> <span class="n">YoutubeDL</span>

<span class="k">def</span> <span class="nf">download_audio</span><span class="p">(</span><span class="n">url</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">out_dir</span><span class="p">:</span> <span class="n">Path</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">Path</span><span class="p">:</span>
    <span class="n">out_dir</span><span class="p">.</span><span class="nf">mkdir</span><span class="p">(</span><span class="n">parents</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">exist_ok</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">ydl_opts</span> <span class="o">=</span> <span class="p">{</span>
        <span class="sh">"</span><span class="s">outtmpl</span><span class="sh">"</span><span class="p">:</span> <span class="nf">str</span><span class="p">(</span><span class="n">out_dir</span> <span class="o">/</span> <span class="sh">"</span><span class="s">%(title)s.%(ext)s</span><span class="sh">"</span><span class="p">),</span>
        <span class="sh">"</span><span class="s">format</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">bestaudio/best</span><span class="sh">"</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">noplaylist</span><span class="sh">"</span><span class="p">:</span> <span class="bp">True</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">quiet</span><span class="sh">"</span><span class="p">:</span> <span class="bp">True</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">no_warnings</span><span class="sh">"</span><span class="p">:</span> <span class="bp">True</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">postprocessors</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span>
            <span class="p">{</span><span class="sh">"</span><span class="s">key</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">FFmpegExtractAudio</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">preferredcodec</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">wav</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">preferredquality</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">192</span><span class="sh">"</span><span class="p">}</span>
        <span class="p">],</span>
    <span class="p">}</span>
    <span class="k">with</span> <span class="nc">YoutubeDL</span><span class="p">(</span><span class="n">ydl_opts</span><span class="p">)</span> <span class="k">as</span> <span class="n">ydl</span><span class="p">:</span>
        <span class="n">info</span> <span class="o">=</span> <span class="n">ydl</span><span class="p">.</span><span class="nf">extract_info</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="n">download</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="c1"># Try to resolve final .wav
</span>    <span class="n">expected</span> <span class="o">=</span> <span class="n">out_dir</span> <span class="o">/</span> <span class="sa">f</span><span class="sh">"</span><span class="si">{</span><span class="n">info</span><span class="p">.</span><span class="nf">get</span><span class="p">(</span><span class="sh">'</span><span class="s">title</span><span class="sh">'</span><span class="p">,</span><span class="sh">'</span><span class="s">audio</span><span class="sh">'</span><span class="p">)</span><span class="si">}</span><span class="s">.wav</span><span class="sh">"</span>
    <span class="k">if</span> <span class="n">expected</span><span class="p">.</span><span class="nf">exists</span><span class="p">():</span>
        <span class="k">return</span> <span class="n">expected</span>
    <span class="c1"># Fallback: newest wav in folder
</span>    <span class="n">wavs</span> <span class="o">=</span> <span class="nf">list</span><span class="p">(</span><span class="n">out_dir</span><span class="p">.</span><span class="nf">glob</span><span class="p">(</span><span class="sh">"</span><span class="s">*.wav</span><span class="sh">"</span><span class="p">))</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">wavs</span><span class="p">:</span>
        <span class="k">raise</span> <span class="nc">RuntimeError</span><span class="p">(</span><span class="sh">"</span><span class="s">No WAV file produced. Check ffmpeg/yt-dlp output.</span><span class="sh">"</span><span class="p">)</span>
    <span class="k">return</span> <span class="nf">max</span><span class="p">(</span><span class="n">wavs</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">p</span><span class="p">:</span> <span class="n">p</span><span class="p">.</span><span class="nf">stat</span><span class="p">().</span><span class="n">st_mtime</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
    <span class="n">ap</span> <span class="o">=</span> <span class="n">argparse</span><span class="p">.</span><span class="nc">ArgumentParser</span><span class="p">(</span><span class="n">description</span><span class="o">=</span><span class="sh">"</span><span class="s">Download YouTube audio as WAV.</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">ap</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">"</span><span class="s">url</span><span class="sh">"</span><span class="p">,</span> <span class="nb">help</span><span class="o">=</span><span class="sh">"</span><span class="s">YouTube URL</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">ap</span><span class="p">.</span><span class="nf">add_argument</span><span class="p">(</span><span class="sh">"</span><span class="s">-o</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">--outdir</span><span class="sh">"</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="sh">"</span><span class="s">outputs/_tmp</span><span class="sh">"</span><span class="p">,</span> <span class="nb">help</span><span class="o">=</span><span class="sh">"</span><span class="s">Output dir (default: outputs/_tmp)</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">args</span> <span class="o">=</span> <span class="n">ap</span><span class="p">.</span><span class="nf">parse_args</span><span class="p">()</span>

    <span class="n">out_dir</span> <span class="o">=</span> <span class="nc">Path</span><span class="p">(</span><span class="n">args</span><span class="p">.</span><span class="n">outdir</span><span class="p">).</span><span class="nf">resolve</span><span class="p">()</span>
    <span class="n">wav</span> <span class="o">=</span> <span class="nf">download_audio</span><span class="p">(</span><span class="n">args</span><span class="p">.</span><span class="n">url</span><span class="p">,</span> <span class="n">out_dir</span><span class="p">)</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="sh">"</span><span class="s">__main__</span><span class="sh">"</span><span class="p">:</span>
    <span class="nf">main</span><span class="p">()</span>
</pre></td></tr></tbody></table></code></pre></div></div>

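<p>Now create a virtual environment and install <a href="https://github.com/yt-dlp/yt-dlp">yt-dlp</a>; I’m using Python version <code class="language-plaintext highlighter-rouge">3.12</code>. The WAV conversion in the script relies on ffmpeg, so it needs to be installed on your system as well.</p>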

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>rm -rf .venv
python -m venv .venv
.venv/bin/python -m pip install -U yt-dlp
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Run the command with your video URL:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nv">PYTHONWARNINGS</span><span class="o">=</span>ignore .venv/bin/python download_audio.py <span class="s2">"https://www.youtube.com/watch?v=VIDEO_ID"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now find your <code class="language-plaintext highlighter-rouge">*.wav</code> file in <code class="language-plaintext highlighter-rouge">outputs/_tmp/</code>, relative to where you ran the script. It will have the same name as the original video; you can rename it to <code class="language-plaintext highlighter-rouge">audio.wav</code> to keep things simple.</p>
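<p>If you prefer not to rename the file by hand, the step can be scripted. The following is a small sketch using only the standard library; the helper name <code class="language-plaintext highlighter-rouge">rename_latest_wav</code> is hypothetical and not part of the download script. It assumes the most recently modified WAV file in the folder is the one you just downloaded.</p>

```python
from pathlib import Path


def rename_latest_wav(out_dir, new_name="audio.wav"):
    """Rename the most recently modified .wav file in out_dir to new_name."""
    wavs = list(Path(out_dir).glob("*.wav"))
    if not wavs:
        raise FileNotFoundError(f"no .wav files found in {out_dir}")
    latest = max(wavs, key=lambda p: p.stat().st_mtime)
    target = latest.with_name(new_name)
    if latest != target:
        # Path.replace overwrites an existing file with the same name.
        latest.replace(target)
    return target


if __name__ == "__main__":
    print(rename_latest_wav("outputs/_tmp"))
```

<p>Note that an existing <code class="language-plaintext highlighter-rouge">audio.wav</code> from a previous download is overwritten, so move it elsewhere first if you want to keep it.</p>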

<h2 id="copy-audio-to-remote-machine">Copy audio to remote machine</h2>

<p>Now we need to copy the audio we downloaded to the remote machine, with something like:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>scp <span class="nt">-i</span> <span class="nv">$HOME</span>/.ssh/id_lambda ~/transcription/outputs/_tmp/audio.wav ubuntu@PUBLIC_IP:/home/ubuntu/audio.wav
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Replace <code class="language-plaintext highlighter-rouge">PUBLIC_IP</code> with the public IP provided by the cloud service; it’s pretty straightforward to get it from the Lambda instances webpage. The <code class="language-plaintext highlighter-rouge">-i</code> argument is followed by the path to the private key used to SSH into the remote machine.</p>

<h2 id="run-inference-on-a-machine-with-gpu">Run Inference on a machine with GPU</h2>

<p>Finally the hard part: using the Whisper model to run inference. For that, on the remote machine we create a new Python environment with:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-rf</span> .venv
python <span class="nt">-m</span> venv .venv
.venv/bin/pip <span class="nb">install</span> <span class="nt">-U</span> openai-whisper
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The default system Python on my instance is <code class="language-plaintext highlighter-rouge">3.10.12</code>, which is a relatively recent version. Then activate the environment and check that the <code class="language-plaintext highlighter-rouge">whisper</code> executable is installed:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">source</span> .venv/bin/activate
which whisper
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Finally run inference on the A100 GPU with the command:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
</pre></td><td class="rouge-code"><pre>whisper audio.wav <span class="se">\</span>
  <span class="nt">--model</span> large-v3 <span class="se">\</span>
  <span class="nt">--language</span> es <span class="se">\</span>
  <span class="nt">--task</span> transcribe <span class="se">\</span>
  <span class="nt">--device</span> cuda <span class="se">\</span>
  <span class="nt">--fp16</span> True <span class="se">\</span>
  <span class="nt">--temperature</span> 0 <span class="se">\</span>
  <span class="nt">--beam_size</span> 1 <span class="se">\</span>
  <span class="nt">--output_format</span> txt <span class="se">\</span>
  <span class="nt">--output_dir</span> large-v3
</pre></td></tr></tbody></table></code></pre></div></div>

<p>which will create a directory <code class="language-plaintext highlighter-rouge">large-v3</code> containing the file <code class="language-plaintext highlighter-rouge">audio.txt</code>. For the interview, I get the first 10 lines with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">head</span> <span class="nt">-n</span> 10 large-v3/audio.txt
</pre></td></tr></tbody></table></code></pre></div></div>

<p>as</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
</pre></td><td class="rouge-code"><pre>a las historias de las personas que hacen realidad esta evolución tecnológica, los techies.
Y nada de esto sería posible sin el soporte de Upground, el primer reclutador virtual.
Un sistema basado en inteligencia artificial que replica entrevistas virtuales
y con solo una única entrevista con su chatbot, busca, aplica y gestiona
todas las oportunidades del sector tech por ti.
¿Hay algo por lo que aceptarías un nuevo reto profesional?
No sacrifiques tu tiempo libre, que Upground es tu aliado.
Y con esto empezamos el día de hoy. Hola Marcel.
Hola, ¿qué tal Eduard? Buenos días, buen día. ¿Cómo estamos?
Muy bien, aquí estamos. Hoy por la mañana que tenemos un invitado muy interesante
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="translate-to-english">Translate to English</h2>

<p>Whisper can also translate to English. All parameters are the same except the task, which is <code class="language-plaintext highlighter-rouge">translate</code> this time.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
</pre></td><td class="rouge-code"><pre>whisper audio.wav <span class="se">\</span>
  <span class="nt">--model</span> large-v3 <span class="se">\</span>
  <span class="nt">--language</span> es <span class="se">\</span>
  <span class="nt">--task</span> translate <span class="se">\</span>
  <span class="nt">--device</span> cuda <span class="se">\</span>
  <span class="nt">--fp16</span> True <span class="se">\</span>
  <span class="nt">--temperature</span> 0 <span class="se">\</span>
  <span class="nt">--beam_size</span> 1 <span class="se">\</span>
  <span class="nt">--output_format</span> txt <span class="se">\</span>
  <span class="nt">--output_dir</span> large-v3_en
</pre></td></tr></tbody></table></code></pre></div></div>

<p>with the first 10 lines</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
</pre></td><td class="rouge-code"><pre>to the stories of the people who make this technological evolution a reality, the techies.
And none of this would be possible without the support of UpGround, the first virtual recruiter.
A system based on artificial intelligence that replicates virtual interviews
and with only one interview with its chatbot,
searches, applies and manages all opportunities in the tech sector for you.
Is there something you would accept as a new professional challenge?
Don't waste your free time, because UpGround is your ally.
And with this we begin today. Hello Marcel.
Hello, how are you Eduard? Good morning, how are you?
Very well, here we are. Today in the morning we have a very interesting guest
</pre></td></tr></tbody></table></code></pre></div></div>

<p>which seems pretty close to the Spanish version. We made it!</p>
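<p>Since the transcription and translation commands differ only in the task and the output directory, the shared flags can be wrapped in a small shell function. This is only a sketch: the names <code class="language-plaintext highlighter-rouge">run_whisper</code>, <code class="language-plaintext highlighter-rouge">WHISPER_BIN</code> and <code class="language-plaintext highlighter-rouge">AUDIO_FILE</code> are made up for this example, and the flags are exactly the ones used above.</p>

```shell
# Sketch: share the Whisper flags between transcription and translation.
# WHISPER_BIN, AUDIO_FILE and run_whisper are hypothetical names for this example.
WHISPER_BIN="${WHISPER_BIN:-whisper}"
AUDIO_FILE="${AUDIO_FILE:-audio.wav}"

# run_whisper TASK OUTPUT_DIR
#   TASK is "transcribe" (Spanish text) or "translate" (English text).
run_whisper() {
  whisper_task="$1"
  whisper_output_dir="$2"
  "$WHISPER_BIN" "$AUDIO_FILE" \
    --model large-v3 \
    --language es \
    --task "$whisper_task" \
    --device cuda \
    --fp16 True \
    --temperature 0 \
    --beam_size 1 \
    --output_format txt \
    --output_dir "$whisper_output_dir"
}
```

<p>With this in place, <code class="language-plaintext highlighter-rouge">run_whisper transcribe large-v3</code> and <code class="language-plaintext highlighter-rouge">run_whisper translate large-v3_en</code> should reproduce the two runs above. Setting <code class="language-plaintext highlighter-rouge">WHISPER_BIN=echo</code> turns the call into a dry run that only prints the arguments.</p>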

<h2 id="conclusions-and-future-analysis">Conclusions and future analysis</h2>

<p>This has been a quick job, a quick translation. It ran fine, although I did have to go through the English version and edit parts of the text: most words were predicted correctly, but the context was sometimes hard to follow. I don’t think the model is at fault; I just didn’t have the time to investigate further. Perhaps my audio quality wasn’t good? Maybe I needed to adjust other parameters, like the temperature, when running inference? Anyway, it was a fun exercise that is also of practical use to me. If you reached this part, thank you for reading the post!</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Misc" /><category term="computer science" /><category term="speech recognition" /><category term="AI" /><summary type="html"><![CDATA[Recently I found myself with the need to transcribe an entire YouTube interview. The purpose of this post is to use AI to transcribe the audio to text and then translate from Spanish to English.]]></summary></entry><entry><title type="html">Veracrypt</title><link href="https://agramunt.me/posts/veracrypt/" rel="alternate" type="text/html" title="Veracrypt" /><published>2025-07-26T20:42:00-07:00</published><updated>2025-07-26T20:42:00-07:00</updated><id>https://agramunt.me/posts/veracrypt</id><content type="html" xml:base="https://agramunt.me/posts/veracrypt/"><![CDATA[<p><a href="https://veracrypt.io/en/Home.html">Veracrypt</a> is a free, open-source encryption software used to:</p>

<ul>
  <li>Create encrypted volumes (containers) to securely store files.</li>
  <li>Encrypt entire disks or partitions, including system drives.</li>
  <li>Protect sensitive data with strong encryption algorithms like AES, Serpent, and Twofish.</li>
  <li>Support hidden volumes, adding plausible deniability.</li>
</ul>

<p>It’s commonly used for securing data on laptops, USB drives, or external disks. I personally use it to encrypt my backups before saving them to an external cloud or physical devices. In this post we show how to install VeraCrypt and use it from the command line.</p>

<h2 id="tldr">TLDR</h2>

<p>Setting up: Create a directory to store the <code class="language-plaintext highlighter-rouge">hc</code> files and a <code class="language-plaintext highlighter-rouge">keyfile.bin</code> file</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
</pre></td><td class="rouge-code"><pre><span class="c"># create dir</span>
<span class="nv">VERACRYPT_STORE</span><span class="o">=</span>~/.VeracryptVolumes/
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="k">${</span><span class="nv">VERACRYPT_STORE</span><span class="k">}</span>

<span class="c"># create keyfile.bin random file</span>
<span class="nv">KEYFILES</span><span class="o">=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/.ssh/keyfile.bin
<span class="nb">dd </span><span class="k">if</span><span class="o">=</span>/dev/urandom <span class="nv">of</span><span class="o">=</span><span class="k">${</span><span class="nv">KEYFILES</span><span class="k">}</span> <span class="nv">bs</span><span class="o">=</span>512 <span class="nv">count</span><span class="o">=</span>1

<span class="c"># visually check the file</span>
<span class="nb">cat</span> ~/.ssh/keyfile.bin |  hexdump <span class="nt">-C</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>For an interactive session (where you are prompted for every parameter), run <code class="language-plaintext highlighter-rouge">veracrypt -t -c</code> to create a volume. Otherwise, define the parameters as variables first:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre><span class="nv">SIZE</span><span class="o">=</span><span class="s2">"20MiB"</span>
<span class="nv">ENCRYPTION</span><span class="o">=</span><span class="s2">"AES"</span>
<span class="nv">HASH</span><span class="o">=</span><span class="s2">"SHA-512"</span>
<span class="nv">FILESYSTEM</span><span class="o">=</span><span class="s2">"exFAT"</span>
<span class="nv">PIM</span><span class="o">=</span>120
<span class="nv">VOLUME_NAME</span><span class="o">=</span><span class="s2">"secret_volume"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now create the volume (use a long, secure password along with the keyfile for better security):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
</pre></td><td class="rouge-code"><pre>veracrypt <span class="nt">--text</span> <span class="nt">--create</span> <span class="se">\</span>
<span class="nt">--size</span><span class="o">=</span><span class="k">${</span><span class="nv">SIZE</span><span class="k">}</span> <span class="se">\</span>
<span class="nt">--volume-type</span><span class="o">=</span>normal <span class="se">\</span>
<span class="nt">--encryption</span><span class="o">=</span><span class="k">${</span><span class="nv">ENCRYPTION</span><span class="k">}</span> <span class="se">\</span>
<span class="nt">--hash</span><span class="o">=</span><span class="k">${</span><span class="nv">HASH</span><span class="k">}</span> <span class="se">\</span>
<span class="nt">--filesystem</span><span class="o">=</span><span class="k">${</span><span class="nv">FILESYSTEM</span><span class="k">}</span> <span class="se">\</span>
<span class="nt">--pim</span><span class="o">=</span><span class="k">${</span><span class="nv">PIM</span><span class="k">}</span> <span class="se">\</span>
<span class="nt">--keyfiles</span><span class="o">=</span><span class="k">${</span><span class="nv">KEYFILES</span><span class="k">}</span> <span class="se">\</span>
<span class="nt">--random-source</span> /dev/urandom <span class="se">\</span>
<span class="k">${</span><span class="nv">VERACRYPT_STORE</span><span class="k">}</span>/<span class="k">${</span><span class="nv">VOLUME_NAME</span><span class="k">}</span>.hc
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Important note: it is best practice not to set <code class="language-plaintext highlighter-rouge">--random-source</code> and instead be prompted to type 320 characters to generate the entropy. We use the <code class="language-plaintext highlighter-rouge">/dev/urandom</code> device here only because it is more practical.</p>

<p>Mount the volume</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre>veracrypt <span class="nt">--text</span> <span class="nt">--mount</span> <span class="se">\</span>
<span class="k">${</span><span class="nv">VERACRYPT_STORE</span><span class="k">}</span>/<span class="k">${</span><span class="nv">VOLUME_NAME</span><span class="k">}</span>.hc <span class="se">\</span>
<span class="nt">--pim</span><span class="o">=</span><span class="k">${</span><span class="nv">PIM</span><span class="k">}</span> <span class="se">\</span>
<span class="nt">--protect-hidden</span><span class="o">=</span>no <span class="se">\</span>
<span class="nt">--keyfiles</span><span class="o">=</span><span class="k">${</span><span class="nv">KEYFILES</span><span class="k">}</span> <span class="se">\</span>
/Volumes/<span class="k">${</span><span class="nv">VOLUME_NAME</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Add a file to your volume as an example</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">echo</span> <span class="s2">"Hey, this file is going to be encrypted"</span> <span class="o">&gt;</span> /Volumes/<span class="k">${</span><span class="nv">VOLUME_NAME</span><span class="k">}</span>/encrypted_file.txt
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And unmount the volume with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>veracrypt <span class="nt">--text</span> <span class="nt">--unmount</span> <span class="k">${</span><span class="nv">VERACRYPT_STORE</span><span class="k">}</span>/<span class="k">${</span><span class="nv">VOLUME_NAME</span><span class="k">}</span>.hc
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="install-veracrypt">Install VeraCrypt</h2>

<p>On macOS, just use brew</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>brew <span class="nb">install</span> <span class="nt">--cask</span> veracrypt 
</pre></td></tr></tbody></table></code></pre></div></div>

<p>for other OSs, please download from the <a href="https://veracrypt.io/en/Downloads.html">downloads page</a> and install manually. Once installed type</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>veracrypt <span class="nt">--text</span> <span class="nt">--help</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>to see the options. The <code class="language-plaintext highlighter-rouge">--text</code> flag selects the command-line (text) interface. If you wish to open the UI, just type <code class="language-plaintext highlighter-rouge">veracrypt</code>. If you are on a Mac, <code class="language-plaintext highlighter-rouge">macFUSE</code> is also needed. Go to the <a href="https://github.com/macfuse/macfuse/releases">macfuse releases</a> page, then download and install the latest stable release. As of today it is <a href="https://github.com/macfuse/macfuse/releases/tag/macfuse-4.10.2">macFUSE 4.10.2</a>. After installation you may need to restart your Mac.</p>

<h2 id="cli-usage">CLI usage</h2>

<p>In this section we won’t explain the UI usage; for that there is a very nice <a href="https://veracrypt.io/en/Beginner%27s%20Tutorial.html">beginner’s tutorial</a> from VeraCrypt.</p>

<h3 id="create-a-volume">Create a volume</h3>

<p>In veracrypt, a volume is a virtual encrypted disk. It behaves like a real disk once mounted, but all data stored on it is automatically encrypted.</p>

<p>There are two main types of veracrypt volumes:</p>

<p>A file container is a single encrypted file that acts like a virtual drive. You mount it with veracrypt, and it appears as a new drive. Inside, you can store files and folders just like on a regular disk.</p>

<p>A partition or disk encryption in veracrypt encrypts an entire partition or physical disk (e.g., USB, external drive, or system drive). The whole drive is protected, and access requires a password at boot or when mounting.</p>

<p>Normally I work with the former, file containers. Let’s create one, but first print the available options on screen:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>veracrypt <span class="nt">--text</span> <span class="nt">--create</span> <span class="nt">--help</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The options I use</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
</pre></td><td class="rouge-code"><pre>--size=SIZE[K|KiB|M|MiB|G|GiB|T|TiB] or --size=max
 Use specified size when creating a new volume. If no suffix is indicated,
 then SIZE is interpreted in bytes. Suffixes K, M, G or T can be used to
 indicate a value in KiB, MiB, GiB or TiB respectively.
 If max is specified, the new volume will use all available free disk space.

--volume-type=TYPE
 Use specified volume type when creating a new volume. TYPE can be 'normal'
 or 'hidden'. See option -c for more information on creating hidden volumes.

 --encryption=ENCRYPTION_ALGORITHM
 Use specified encryption algorithm when creating a new volume. When cascading
 algorithms, they must be separated by a dash. For example: AES-Twofish.

 --hash=HASH
 Use specified hash algorithm when creating a new volume or changing password
 and/or keyfiles. This option also specifies the mixing PRF of the random
 number generator.

 --filesystem=TYPE
 Filesystem type to mount. The TYPE argument is passed to mount(8) command
 with option -t. Default type is 'auto'. When creating a new volume, this
 option specifies the filesystem to be created on the new volume.
 Filesystem type 'none' disables mounting or creating a filesystem.

 --pim=PIM
 Use specified PIM to mount/open a volume. Note that passing a PIM on the
 command line is potentially insecure as the PIM may be visible in the process
 list (see ps(1)) and/or stored in a command history file or system logs.

 -k, --keyfiles=KEYFILE1[,KEYFILE2,KEYFILE3,...]
 Use specified keyfiles when mounting a volume or when changing password
 and/or keyfiles. When a directory is specified, all files inside it will be
 used (non-recursively). Multiple keyfiles must be separated by comma.
 Use double comma (,,) to specify a comma contained in keyfile's name.
 Keyfile stored on a security token must be specified as
 token://slot/SLOT_NUMBER/file/FILENAME for a security token keyfile
 and emv://slot/SLOT_NUMBER for an EMV token keyfile.
 An empty keyfile (-k "") disables
 interactive requests for keyfiles. See also options --import-token-keyfiles,
 --list-token-keyfiles, --list-securitytoken-keyfiles, --list-emvtoken-keyfiles,
 --new-keyfiles, --protection-keyfiles.

 --random-source=FILE
 Use FILE as a source of random data (e.g., when creating a volume) instead
 of requiring the user to type random characters.
</pre></td></tr></tbody></table></code></pre></div></div>

<p>If you prefer to create the volume interactively, with the options shown on screen, run <code class="language-plaintext highlighter-rouge">veracrypt -t -c</code>; sometimes this is easier than predefining a configuration. In the following sections I pass the options explicitly on the command line instead: if we use this in a bash script we don’t want to be prompted too much, so I try to automate as much as I can here.</p>

<h4 id="create-a-password-protected-volume">Create a password protected volume</h4>

<p>Here is my command to create a <code class="language-plaintext highlighter-rouge">20MiB</code> volume in a file named <code class="language-plaintext highlighter-rouge">my_first_volume.hc</code> inside the <code class="language-plaintext highlighter-rouge">~/.VeracryptVolumes</code> directory:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
</pre></td><td class="rouge-code"><pre><span class="nb">mkdir</span> <span class="nt">-p</span> ~/.VeracryptVolumes/
veracrypt <span class="nt">--text</span> <span class="nt">--create</span> <span class="se">\</span>
<span class="nt">--size</span><span class="o">=</span>20MiB <span class="se">\</span>
<span class="nt">--volume-type</span><span class="o">=</span>normal <span class="se">\</span>
<span class="nt">--encryption</span><span class="o">=</span>AES <span class="se">\</span>
<span class="nt">--hash</span><span class="o">=</span>SHA-512 <span class="se">\</span>
<span class="nt">--filesystem</span><span class="o">=</span>exFAT <span class="se">\</span>
<span class="nt">--pim</span><span class="o">=</span>120 <span class="se">\</span>
<span class="nt">--keyfiles</span><span class="o">=</span><span class="s2">""</span> <span class="se">\</span>
<span class="nt">--random-source</span> /dev/urandom <span class="se">\</span>
~/.VeracryptVolumes/my_first_volume.hc
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Let’s inspect this call</p>

<ul>
  <li>Encryption uses the <code class="language-plaintext highlighter-rouge">AES</code> algorithm; see the <a href="https://veracrypt.io/en/Encryption%20Algorithms.html">encryption algorithms available</a>.</li>
  <li><code class="language-plaintext highlighter-rouge">exFAT</code> filesystem for maximum compatibility with Windows, macOS and modern Linux distributions.</li>
  <li><code class="language-plaintext highlighter-rouge">PIM</code> (Personal Iterations Multiplier) of 120. The PIM controls the number of hash iterations used to derive the key from the password when mounting a volume: the higher the value, the more secure the volume, but the longer it takes to mount.</li>
  <li><code class="language-plaintext highlighter-rouge">keyfiles</code> is empty if you just want password protection. Pass a random file for better protection.</li>
  <li><code class="language-plaintext highlighter-rouge">random-source</code> is set to <code class="language-plaintext highlighter-rouge">/dev/urandom</code>, the operating system’s pseudo-random number generator (see 100 random bytes in the terminal with <code class="language-plaintext highlighter-rouge">head -c 100 /dev/urandom | hexdump -C</code>). Normally you would omit this parameter and be asked to type 320 random characters at volume creation time to add entropy to the encryption.</li>
  <li>The last parameter is the path of the volume file to create.</li>
</ul>
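<p>To get a feel for what a PIM of 120 means, the arithmetic can be sketched in shell. This assumes the formula given in the VeraCrypt PIM documentation for file containers and non-system volumes (iterations = 15000 + PIM × 1000) and its default PIM of 485; both come from that documentation rather than from this post, so double check them against your VeraCrypt version. <code class="language-plaintext highlighter-rouge">pim_iterations</code> is a made-up helper name.</p>

```shell
# Iteration count implied by a PIM for file containers and non-system volumes,
# assuming the formula from the VeraCrypt PIM documentation:
#   iterations = 15000 + PIM * 1000
# pim_iterations is a hypothetical helper name.
pim_iterations() {
  echo $(( 15000 + $1 * 1000 ))
}

pim_iterations 120   # the PIM used in this post
pim_iterations 485   # assumed default PIM for these volumes
```

<p>Under that assumption, a PIM of 120 gives 135000 iterations versus 500000 for the default, so mounting is faster but key derivation is also cheaper for an attacker. The same documentation states that a PIM below the default is only accepted with a password of at least 20 characters, which is worth keeping in mind when picking the password.</p>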

<p>Once the above command is executed, it will prompt you to enter your desired password twice and will then create the volume. Make sure you use a strong password (15 characters minimum, combining uppercase letters, numbers and symbols); a good page to generate one is <a href="https://www.strongpasswordgenerator.org/">https://www.strongpasswordgenerator.org/</a>. After the command completes successfully, check that the file has been created by executing <code class="language-plaintext highlighter-rouge">ls -lhat ~/.VeracryptVolumes</code>, which prints something like:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nt">-rw-------</span>    1 sebas  staff    20M 27 Jul 13:32 my_first_volume.hc
</pre></td></tr></tbody></table></code></pre></div></div>

<h4 id="create-a-password-and-key-protected-volume">Create a password and key protected volume</h4>

<p>A more secure way to create a volume is to use two-factor authentication: a password plus a keyfile. A <code class="language-plaintext highlighter-rouge">keyfile</code> can be any file: a picture, a file of completely random bytes, etc. Its contents are simply mixed with your password to derive the final encryption key. I chose to first create a random file of 512 bytes with:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">dd </span><span class="k">if</span><span class="o">=</span>/dev/urandom <span class="nv">of</span><span class="o">=</span><span class="nv">$HOME</span>/.ssh/keyfile.bin <span class="nv">bs</span><span class="o">=</span>512 <span class="nv">count</span><span class="o">=</span>1
</pre></td></tr></tbody></table></code></pre></div></div>

<p>whose content in hexadecimal can be checked with <code class="language-plaintext highlighter-rouge">cat ~/.ssh/keyfile.bin |  hexdump -C</code>. Now use the keyfile together with a password to create the encrypted volume:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
</pre></td><td class="rouge-code"><pre>veracrypt <span class="nt">--text</span> <span class="nt">--create</span> <span class="se">\</span>
<span class="nt">--size</span><span class="o">=</span>20MiB <span class="se">\</span>
<span class="nt">--volume-type</span><span class="o">=</span>normal <span class="se">\</span>
<span class="nt">--encryption</span><span class="o">=</span>AES <span class="se">\</span>
<span class="nt">--hash</span><span class="o">=</span>SHA-512 <span class="se">\</span>
<span class="nt">--filesystem</span><span class="o">=</span>FAT <span class="se">\</span>
<span class="nt">--pim</span><span class="o">=</span>120 <span class="se">\</span>
<span class="nt">--random-source</span> /dev/urandom <span class="se">\</span>
<span class="nt">--keyfiles</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span><span class="s2">/.ssh/keyfile.bin"</span> <span class="se">\</span>
~/.VeracryptVolumes/my_second_volume.hc
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Keep the <code class="language-plaintext highlighter-rouge">keyfile</code> safe: it is needed to mount the volume, and without it you lose access to your encrypted data.</p>
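<p>Because overwriting or leaking the keyfile is costly, a slightly more defensive way to generate it is sketched below. The helper name <code class="language-plaintext highlighter-rouge">make_keyfile</code> is made up for this example, and it uses <code class="language-plaintext highlighter-rouge">head -c</code> instead of <code class="language-plaintext highlighter-rouge">dd</code>; the idea is just to refuse to replace an existing keyfile and to create the new one readable only by its owner.</p>

```shell
# make_keyfile PATH [BYTES]: hypothetical helper that writes BYTES (default 512)
# random bytes to PATH, readable only by the owner, and refuses to overwrite.
make_keyfile() {
  keyfile_path="$1"
  keyfile_bytes="${2:-512}"
  if [ -e "$keyfile_path" ]; then
    echo "refusing to overwrite existing keyfile: $keyfile_path" >&2
    return 1
  fi
  (
    umask 077   # files created in this subshell get mode 600
    head -c "$keyfile_bytes" /dev/urandom > "$keyfile_path"
  )
}
```

<p>For example, <code class="language-plaintext highlighter-rouge">make_keyfile "$HOME/.ssh/keyfile.bin"</code> creates the same kind of 512 byte file as the <code class="language-plaintext highlighter-rouge">dd</code> command above, but fails instead of silently replacing a keyfile that an existing volume depends on.</p>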

<h3 id="mount--unmount-a-volume">Mount &amp; unmount a volume</h3>

<p>Mounting a volume means making it accessible in the filesystem. Just as you mount a USB drive in Linux when you plug it in, you can mount a VeraCrypt volume. See the options with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>veracrypt <span class="nt">--text</span> <span class="nt">--mount</span> <span class="nt">--help</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>To unmount</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>veracrypt <span class="nt">--text</span> <span class="nt">--unmount</span> <span class="nt">--help</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>In the next subsections we will mount and unmount the drives we created before.</p>

<h4 id="mount--unmount-a-volume-with-password">Mount &amp; unmount a volume with password</h4>

<p>Once the file is created, we use VeraCrypt to decrypt the volume and mount it in our filesystem. That way we can start saving files in the volume. Let’s mount the first volume we created:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre>veracrypt <span class="nt">--text</span> <span class="nt">--mount</span> <span class="se">\</span>
~/.VeracryptVolumes/my_first_volume.hc <span class="se">\</span>
<span class="nt">--pim</span><span class="o">=</span>120 <span class="se">\</span>
<span class="nt">--protect-hidden</span><span class="o">=</span>no <span class="se">\</span>
<span class="nt">--keyfiles</span><span class="o">=</span><span class="s2">""</span>  <span class="se">\</span>
/Volumes/my_first_volume
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The volume will be mounted at <code class="language-plaintext highlighter-rouge">/Volumes/my_first_volume</code>, so to see its contents execute <code class="language-plaintext highlighter-rouge">ls -lhat /Volumes/my_first_volume</code>. Or run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>diskutil list
</pre></td></tr></tbody></table></code></pre></div></div>
<p>with a result like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre>...
/dev/disk2 (disk image):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:                                                   +20.7 MB    disk2
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Check in Finder on macOS that the volume is there under <code class="language-plaintext highlighter-rouge">my_first_volume</code> (that’s possible because we use <code class="language-plaintext highlighter-rouge">exFAT</code>; if we used <code class="language-plaintext highlighter-rouge">FAT</code> it would appear as <code class="language-plaintext highlighter-rouge">NO NAME</code>) and you can start moving data to your volume! When you finish adding data, just unmount the volume with:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>veracrypt <span class="nt">--text</span> <span class="nt">--unmount</span> ~/.VeracryptVolumes/my_first_volume.hc
</pre></td></tr></tbody></table></code></pre></div></div>

<h4 id="mount--unmount-a-volume-with-password-and-a-keyfile">Mount &amp; unmount a volume with password and a keyfile</h4>

<p>Similarly, for the volume we created with the <code class="language-plaintext highlighter-rouge">keyfile</code>, we just need to run:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre>veracrypt <span class="nt">--text</span> <span class="nt">--mount</span> <span class="se">\</span>
~/.VeracryptVolumes/my_second_volume.hc <span class="se">\</span>
<span class="nt">--pim</span><span class="o">=</span>120 <span class="se">\</span>
<span class="nt">--protect-hidden</span><span class="o">=</span>no <span class="se">\</span>
<span class="nt">--keyfiles</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span><span class="s2">/.ssh/keyfile.bin"</span> <span class="se">\</span>
/Volumes/my_second_volume
</pre></td></tr></tbody></table></code></pre></div></div>
<p>and unmount with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>veracrypt <span class="nt">--text</span> <span class="nt">--unmount</span> ~/.VeracryptVolumes/my_second_volume.hc
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="backup-and-save-your-volumes">Backup and save your volumes</h2>

<p>Now you have your encrypted volumes with your encrypted files inside. That’s great, but you still need to keep this secure.</p>

<ul>
  <li>Keep both your encryption password and your keyfiles in a password manager like LastPass, KeePass (free and open source), Bitwarden, 1Password, ProtonPass, MegaPass… Any of these work.</li>
  <li>Copy the encrypted volumes to physical hard drives and to cloud services like Google Drive. This is safe because the volumes are encrypted, so these providers won’t be able to see the content.</li>
</ul>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Tools" /><category term="computer science" /><summary type="html"><![CDATA[Veracrypt is a free, open-source encryption software used to:]]></summary></entry><entry><title type="html">Phockup</title><link href="https://agramunt.me/posts/phockup/" rel="alternate" type="text/html" title="Phockup" /><published>2025-07-26T20:42:00-07:00</published><updated>2025-07-31T08:37:48-07:00</updated><id>https://agramunt.me/posts/phockup</id><content type="html" xml:base="https://agramunt.me/posts/phockup/"><![CDATA[<p><a href="https://github.com/ivandokov/phockup">Phockup</a> (in case this link disappears, I also <a href="https://github.com/SebastiaAgramunt/phockup">forked</a> it to my own account) is a photo backup tool, a useful command-line program to organize your pictures. In this post we’ll show how to install and use it. The software seems to be a bit old and not actively updated, but it just works.</p>

<h2 id="install-phockup">Install Phockup</h2>

<p>On macOS the easiest option would be to install it with brew:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>brew tap ivandokov/homebrew-contrib
brew <span class="nb">install </span>phockup
</pre></td></tr></tbody></table></code></pre></div></div>

<p>but this doesn’t quite work: it seems to install correctly, but when you type <code class="language-plaintext highlighter-rouge">phockup</code> you get an error about a missing package (the famous <code class="language-plaintext highlighter-rouge">tqdm</code> in this case):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre>Traceback <span class="o">(</span>most recent call last<span class="o">)</span>:
  File <span class="s2">"/usr/local/bin/phockup"</span>, line 11, <span class="k">in</span> &lt;module&gt;
    from src.phockup import Phockup
  File <span class="s2">"/usr/local/Cellar/phockup/1.13.0/src/phockup.py"</span>, line 11, <span class="k">in</span> &lt;module&gt;
    from tqdm import tqdm
ModuleNotFoundError: No module named <span class="s1">'tqdm'</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>We need to install the dependencies. We will do it by cloning the phockup repository, creating a virtual environment in it, and then installing the packages listed in <code class="language-plaintext highlighter-rouge">requirements.txt</code>. First clone the repository:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">mkdir</span> <span class="nt">-p</span> ~/.venvs
git clone git@github.com:ivandokov/phockup.git ~/.venvs/phockup
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And select a recent Python version (here with pyenv)</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>pyenv <span class="nb">install </span>3.12
pyenv shell 3.12
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then create the environment in the cloned repository</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>python <span class="nt">-m</span> venv ~/.venvs/phockup/.venv
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And install the requirements in the environment</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="nb">source</span> ~/.venvs/phockup/.venv/bin/activate

python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">--upgrade</span> pip
pip <span class="nb">install</span> <span class="nt">-r</span> ~/.venvs/phockup/requirements.txt
deactivate
</pre></td></tr></tbody></table></code></pre></div></div>

<p>If you want to execute phockup, just activate the environment and run the <code class="language-plaintext highlighter-rouge">phockup</code> command installed by brew.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">source</span> ~/.venvs/phockup/.venv/bin/activate
phockup <span class="nt">--help</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Note that we didn’t install any extra CLI (there’s no <code class="language-plaintext highlighter-rouge">phockup</code> binary in the <code class="language-plaintext highlighter-rouge">bin</code> directory of the environment). Check with <code class="language-plaintext highlighter-rouge">which phockup</code> and you will find it at <code class="language-plaintext highlighter-rouge">/usr/local/bin/phockup</code>.</p>

<p>I know, it’s a weird installation, but the brew tap is broken as of now and, as I mentioned, the project doesn’t seem to be maintained anymore, so we had to be a little hacky here.</p>

<h2 id="run-phockup">Run phockup</h2>

<p>In general you can run phockup with the commands</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">source</span> ~/.venvs/phockup/.venv/bin/activate
phockup <span class="nt">--help</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Let’s run <code class="language-plaintext highlighter-rouge">phockup</code> to back up our images and movies. We have a folder of pictures and movies that we want to organise (<code class="language-plaintext highlighter-rouge">${SOURCE}</code>) and a place where we want to save them (<code class="language-plaintext highlighter-rouge">${DESTINATION}</code>). This last folder may be empty or may already contain other pictures.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre><span class="nb">source</span> ~/.venvs/phockup/.venv/bin/activate

<span class="c"># the source (where movies and photos are) and empty destination</span>
<span class="nv">SOURCE</span><span class="o">=</span>~/Downloads/new_pictures
<span class="nv">DESTINATION</span><span class="o">=</span>~/Downloads/organised_new_pictures

<span class="c"># assuming we haven't created the new destination directory</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="k">${</span><span class="nv">DESTINATION</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now start the process by running</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre>phockup <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span> <span class="k">${</span><span class="nv">DESTINATION</span><span class="k">}</span> <span class="se">\</span>
    <span class="nt">--progress</span> <span class="se">\</span>
    <span class="nt">-d</span> YYYY.MM  <span class="se">\</span>
    <span class="nt">--date-field</span> <span class="s2">"DateTimeOriginal CreateDate FileModifyDate"</span>
</pre></td></tr></tbody></table></code></pre></div></div>
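
<p>Before letting phockup copy thousands of files it can be worth doing a trial run first. The sketch below assumes the <code class="language-plaintext highlighter-rouge">--dry-run</code> flag described in the phockup README (please check <code class="language-plaintext highlighter-rouge">phockup --help</code> on your install); <code class="language-plaintext highlighter-rouge">phockup_preview</code> is just a hypothetical helper name, not something that ships with phockup.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: same options as the real run plus --dry-run,
# which is assumed to report what would be done without touching any file.
phockup_preview() {
    phockup "$1" "$2" \
        --dry-run \
        -d YYYY.MM \
        --date-field "DateTimeOriginal CreateDate FileModifyDate"
}

# usage, with the variables defined above:
# phockup_preview "${SOURCE}" "${DESTINATION}"
</code></pre></div></div>

<p>If the preview looks right, run the same command without the flag.</p>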

<h2 id="practical-case-backup-photos-from-android-phone">Practical case: Backup photos from Android phone</h2>

<p>I downloaded pictures from my phone using <code class="language-plaintext highlighter-rouge">adb</code> (install with <code class="language-plaintext highlighter-rouge">brew install android-platform-tools</code>), a command line tool that allows you to interact with your Android device from the terminal. Once it is installed, and before plugging your phone into the USB port, you need to enable developer options (go to Settings -&gt; About phone and tap Build number 7 times). Then enable USB debugging (go to Settings -&gt; System -&gt; Developer options, scroll down and activate the USB debugging option). Finally, plug the phone into a USB port and you will see a popup on your phone asking “Allow USB Debugging?” with a fingerprint code; just allow it and you will be good to go for the next step.</p>
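
<p>A quick way to confirm that the phone is connected and authorized is <code class="language-plaintext highlighter-rouge">adb devices</code>, which lists each device followed by its state (<code class="language-plaintext highlighter-rouge">device</code> when it is ready, <code class="language-plaintext highlighter-rouge">unauthorized</code> if the popup hasn’t been accepted yet). A small sketch, where <code class="language-plaintext highlighter-rouge">count_ready_devices</code> is a hypothetical helper that counts the ready ones:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: reads the output of "adb devices" on stdin and
# prints how many devices are in the "device" (ready) state.
count_ready_devices() {
    awk '$2 == "device" { n++ } END { print n + 0 }'
}

# usage (needs the phone plugged in):
# adb devices | count_ready_devices
</code></pre></div></div>

<p>If this prints 0, unplug the phone, check the popup and try again.</p>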

<p>Let’s open a shell on the device by running <code class="language-plaintext highlighter-rouge">adb shell</code>, then go to the camera directory and list the pictures there.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>adb shell
<span class="nb">cd</span> /sdcard/DCIM/Camera
<span class="nb">ls</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Let’s open another terminal and pull the data to our computer</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
</pre></td><td class="rouge-code"><pre><span class="c"># camera pictures</span>
<span class="nv">SOURCE</span><span class="o">=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/Downloads/PixelPhotos
adb pull /sdcard/DCIM/Camera <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>

<span class="c"># WhatsApp images</span>
<span class="nv">SOURCE</span><span class="o">=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/Downloads/WhatsappImages
adb pull <span class="s2">"/sdcard/Android/media/com.whatsapp/WhatsApp/Media/WhatsApp Images/"</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>

<span class="nb">mv</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>/Private/<span class="k">*</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>/
<span class="nb">mv</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>/Sent/<span class="k">*</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>/

<span class="nb">rm</span> <span class="nt">-rf</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>/Private/
<span class="nb">rm</span> <span class="nt">-rf</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>/Sent/

<span class="c"># WhatsApp video</span>
<span class="nv">SOURCE</span><span class="o">=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/Downloads/WhatsappVideo
adb pull <span class="s2">"/sdcard/Android/media/com.whatsapp/WhatsApp/Media/WhatsApp Video/"</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>

<span class="nb">mv</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>/Private/<span class="k">*</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>/
<span class="nb">mv</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>/Sent/<span class="k">*</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>/

<span class="nb">rm</span> <span class="nt">-rf</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>/Private/
<span class="nb">rm</span> <span class="nt">-rf</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>/Sent/

<span class="c"># and consolidate all photos and videos to PixelPhotos</span>
<span class="nv">SOURCE</span><span class="o">=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/Downloads/WhatsappImages
<span class="nb">mv</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>/<span class="k">*</span> <span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/Downloads/PixelPhotos
<span class="nb">rm</span> <span class="nt">-rf</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>

<span class="nv">SOURCE</span><span class="o">=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/Downloads/WhatsappVideo
<span class="nb">mv</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>/<span class="k">*</span> <span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/Downloads/PixelPhotos
<span class="nb">rm</span> <span class="nt">-rf</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span>

<span class="c"># sanity check: count the files now consolidated in PixelPhotos</span>
<span class="nv">SOURCE</span><span class="o">=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/Downloads/PixelPhotos
<span class="nb">ls</span> <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span> | <span class="nb">wc</span> <span class="nt">-l</span>
</pre></td></tr></tbody></table></code></pre></div></div>
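
<p>The WhatsApp steps above repeat the same move-then-delete pattern for each subfolder. A minimal sketch of how that could be wrapped in a function; <code class="language-plaintext highlighter-rouge">flatten_subdirs</code> is a hypothetical name, and unlike the commands above it uses <code class="language-plaintext highlighter-rouge">mv -n</code> and <code class="language-plaintext highlighter-rouge">rmdir</code> so that nothing is overwritten or deleted by accident if two files happen to share a name.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: move the content of the given subfolders one level up
# into the parent directory and remove the subfolders once they are empty.
flatten_subdirs() {
    local parent sub
    parent="$1"
    shift
    for sub in "$@"; do
        if [ -d "${parent}/${sub}" ]; then
            # -n: never overwrite a file that already exists in the parent
            mv -n "${parent}/${sub}"/* "${parent}/"
            # rmdir only succeeds on an empty folder, so leftovers are kept
            rmdir "${parent}/${sub}"
        fi
    done
}

# usage:
# flatten_subdirs "${HOME}/Downloads/WhatsappImages" Private Sent
</code></pre></div></div>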

<p>Now define the destination and copy all files while changing their names with phockup.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="nv">SOURCE</span><span class="o">=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/Downloads/PixelPhotos
<span class="nv">DESTINATION</span><span class="o">=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/Downloads/organized_PixelPhotos
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="k">${</span><span class="nv">DESTINATION</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="nb">source</span> ~/.venvs/phockup/.venv/bin/activate
phockup <span class="k">${</span><span class="nv">SOURCE</span><span class="k">}</span> <span class="k">${</span><span class="nv">DESTINATION</span><span class="k">}</span> <span class="se">\</span>
    <span class="nt">--progress</span> <span class="se">\</span>
    <span class="nt">-d</span> YYYY.MM  <span class="se">\</span>
    <span class="nt">--date-field</span> <span class="s2">"DateTimeOriginal CreateDate FileModifyDate"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Here <code class="language-plaintext highlighter-rouge">--progress</code> shows a progress bar (powered by <code class="language-plaintext highlighter-rouge">tqdm</code>, the package we were missing), <code class="language-plaintext highlighter-rouge">-d</code> sets the format of the output directory names, and <code class="language-plaintext highlighter-rouge">--date-field</code> lists the metadata fields, read through <code class="language-plaintext highlighter-rouge">exiftool</code>, that phockup tries in order to find the date of each file. If we run <code class="language-plaintext highlighter-rouge">ls -lhat ~/Downloads/organized_PixelPhotos</code> we will see the directory structure:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
</pre></td><td class="rouge-code"><pre>drwxr-xr-x sebas staff 151 KB Tue Jul 29 00:58:20 2025  2025.07
drwxr-xr-x sebas staff 3.9 KB Tue Jul 29 00:58:06 2025  2024.12
drwxr-xr-x sebas staff 2.8 KB Tue Jul 29 00:57:58 2025  2024.11
drwxr-xr-x sebas staff 6.0 KB Tue Jul 29 00:57:49 2025  2024.10
drwxr-xr-x sebas staff 6.5 KB Tue Jul 29 00:57:14 2025  2024.08
drwxr-xr-x sebas staff 6.3 KB Tue Jul 29 00:56:27 2025  2025.06
drwxr-xr-x sebas staff 2.7 KB Tue Jul 29 00:55:34 2025  2025.05
drwxr-xr-x sebas staff 2.3 KB Tue Jul 29 00:55:05 2025  2025.04
drwxr-xr-x sebas staff  10 KB Tue Jul 29 00:54:39 2025  2025.03
drwxr-xr-x sebas staff 1.2 KB Tue Jul 29 00:52:36 2025  2025.02
drwxr-xr-x sebas staff 896 B  Tue Jul 29 00:52:22 2025  <span class="nb">.</span>
drwxr-xr-x sebas staff 5.0 KB Tue Jul 29 00:52:21 2025  2025.01
drwxr-xr-x sebas staff 4.4 KB Tue Jul 29 00:48:02 2025  2024.09
drwxr-xr-x sebas staff  10 KB Tue Jul 29 00:46:01 2025  2024.07
drwxr-xr-x sebas staff 3.3 KB Tue Jul 29 00:44:43 2025  2024.06
drwxr-xr-x sebas staff 2.3 KB Tue Jul 29 00:44:18 2025  2024.05
drwxr-xr-x sebas staff 2.1 KB Tue Jul 29 00:44:01 2025  2024.04
drwxr-xr-x sebas staff 4.6 KB Tue Jul 29 00:43:44 2025  2024.03
drwxr-xr-x sebas staff 8.1 KB Tue Jul 29 00:43:05 2025  2024.02
drwxr-xr-x sebas staff 2.4 KB Tue Jul 29 00:41:35 2025  2024.01
drwxr-xr-x sebas staff 2.9 KB Tue Jul 29 00:41:00 2025  2023.12
drwxr-xr-x sebas staff 2.9 KB Tue Jul 29 00:40:25 2025  2023.11
drwxr-xr-x sebas staff 3.1 KB Tue Jul 29 00:40:01 2025  2023.10
drwxr-xr-x sebas staff 2.8 KB Tue Jul 29 00:39:29 2025  2023.09
drwxr-xr-x sebas staff  11 KB Tue Jul 29 00:39:05 2025  2023.08
drwxr-xr-x sebas staff 1.8 KB Tue Jul 29 00:37:39 2025  2023.07
drwx------ sebas staff 6.9 KB Tue Jul 29 00:25:32 2025  ..
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The directories follow the <code class="language-plaintext highlighter-rouge">YYYY.MM</code> format. If we inspect the filenames inside, we see something like <code class="language-plaintext highlighter-rouge">20250317-132507.jpg</code>, that is, a photo taken on March 17th, 2025 at 13:25:07.</p>
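
<p>Since the timestamp is encoded in the filename, it can be recovered with a one-line <code class="language-plaintext highlighter-rouge">sed</code> expression. A small sketch, assuming the <code class="language-plaintext highlighter-rouge">YYYYMMDD-HHMMSS</code> naming seen above; <code class="language-plaintext highlighter-rouge">phockup_name_to_date</code> is a hypothetical helper name:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: turn a name like 20250317-132507.jpg into a readable
# date. Names that do not follow the pattern are printed unchanged.
phockup_name_to_date() {
    basename "$1" | sed -E 's/^([0-9]{4})([0-9]{2})([0-9]{2})-([0-9]{2})([0-9]{2})([0-9]{2}).*/\1-\2-\3 \4:\5:\6/'
}

phockup_name_to_date 20250317-132507.jpg
# prints 2025-03-17 13:25:07
</code></pre></div></div>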

<h2 id="closing-remarks">Closing remarks</h2>

<p>Once you have your photo backup, it is best to create a VeraCrypt volume and store your photos encrypted; then you can save the volume to an external drive and a cloud service. Check out my VeraCrypt guide.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Tools" /><category term="computer science" /><summary type="html"><![CDATA[Phockup (in case this link disappears, I also forked it to my own repository) is a photo backup utility, a useful command line tool to organize your pictures. In this post we’ll show how to install and use it. The software seems to be a bit old and no longer updated, but it just works.]]></summary></entry><entry><title type="html">CUDA utils</title><link href="https://agramunt.me/posts/cuda-utils/" rel="alternate" type="text/html" title="CUDA utils" /><published>2025-07-12T21:35:00-07:00</published><updated>2025-07-12T21:35:00-07:00</updated><id>https://agramunt.me/posts/cuda-utils</id><content type="html" xml:base="https://agramunt.me/posts/cuda-utils/"><![CDATA[<p>This is my first post on CUDA. I have been working with this technology for a while and want to share some utilities that can be useful for newcomers to the field. All the code is available in my <a href="https://github.com/SebastiaAgramunt/blogging-code">GitHub repository</a>, in the subdirectory <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/cuda-utils">cuda-utils</a>. I would like to acknowledge <a href="https://cloud.lambda.ai/">Lambda.ai</a> for providing me with free credits for my blog. I will be testing this code on a <code class="language-plaintext highlighter-rouge">gpu_1x_a100_sxm4</code> machine, which has an A100 GPU based on the Ampere architecture. This GPU is a bit old these days, but we won’t be doing any heavy compute so it will suffice.</p>

<h2 id="project-structure">Project structure</h2>

<p>Files in this project will be structured as follows</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre>.
├── CMakeLists.txt
├── README.md
└── src
    ├── gpu_allocate.cu
    └── gpu_info.cu
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="gpu-info-tool">GPU info tool</h2>

<p>The first tool just displays some basic information about the GPUs available in the system. Create a file <code class="language-plaintext highlighter-rouge">gpu_info.cu</code> in <code class="language-plaintext highlighter-rouge">src</code> with the code:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;cuda_runtime.h&gt;</span><span class="cp">
</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="kt">int</span> <span class="n">deviceCount</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="n">cudaError_t</span> <span class="n">err</span> <span class="o">=</span> <span class="n">cudaGetDeviceCount</span><span class="p">(</span><span class="o">&amp;</span><span class="n">deviceCount</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">err</span> <span class="o">!=</span> <span class="n">cudaSuccess</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cerr</span> <span class="o">&lt;&lt;</span> <span class="s">"Failed to get device count: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">cudaGetErrorString</span><span class="p">(</span><span class="n">err</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
        <span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Detected "</span> <span class="o">&lt;&lt;</span> <span class="n">deviceCount</span> <span class="o">&lt;&lt;</span> <span class="s">" CUDA Capable Device(s)</span><span class="se">\n\n</span><span class="s">"</span><span class="p">;</span>

    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">dev</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">dev</span> <span class="o">&lt;</span> <span class="n">deviceCount</span><span class="p">;</span> <span class="o">++</span><span class="n">dev</span><span class="p">)</span> <span class="p">{</span>
        <span class="c1">// Select device</span>
        <span class="n">cudaSetDevice</span><span class="p">(</span><span class="n">dev</span><span class="p">);</span>

        <span class="c1">// Query device properties</span>
        <span class="n">cudaDeviceProp</span> <span class="n">prop</span><span class="p">;</span>
        <span class="n">cudaGetDeviceProperties</span><span class="p">(</span><span class="o">&amp;</span><span class="n">prop</span><span class="p">,</span> <span class="n">dev</span><span class="p">);</span>

        <span class="c1">// Query memory info</span>
        <span class="kt">size_t</span> <span class="n">freeBytes</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="n">totalBytes</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
        <span class="n">cudaMemGetInfo</span><span class="p">(</span><span class="o">&amp;</span><span class="n">freeBytes</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">totalBytes</span><span class="p">);</span>

        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Device "</span> <span class="o">&lt;&lt;</span> <span class="n">dev</span> <span class="o">&lt;&lt;</span> <span class="s">": "</span> <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">name</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  PCI Domain/Bus/Device ID: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">pciDomainID</span> <span class="o">&lt;&lt;</span> <span class="s">"/"</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">pciBusID</span>    <span class="o">&lt;&lt;</span> <span class="s">"/"</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">pciDeviceID</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Compute capability: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">major</span> <span class="o">&lt;&lt;</span> <span class="s">"."</span> <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">minor</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Total global memory: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="p">(</span><span class="n">prop</span><span class="p">.</span><span class="n">totalGlobalMem</span>  <span class="o">/</span> <span class="p">(</span><span class="mf">1024.0</span> <span class="o">*</span> <span class="mf">1024.0</span><span class="p">))</span> <span class="o">&lt;&lt;</span> <span class="s">" MB</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Free memory (current): "</span> 
                  <span class="o">&lt;&lt;</span> <span class="p">(</span><span class="n">freeBytes</span>  <span class="o">/</span> <span class="p">(</span><span class="mf">1024.0</span> <span class="o">*</span> <span class="mf">1024.0</span><span class="p">))</span> <span class="o">&lt;&lt;</span> <span class="s">" MB</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Total allocatable memory (current): "</span> 
                  <span class="o">&lt;&lt;</span> <span class="p">(</span><span class="n">totalBytes</span> <span class="o">/</span> <span class="p">(</span><span class="mf">1024.0</span> <span class="o">*</span> <span class="mf">1024.0</span><span class="p">))</span> <span class="o">&lt;&lt;</span> <span class="s">" MB</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Memory clock rate: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="p">(</span><span class="n">prop</span><span class="p">.</span><span class="n">memoryClockRate</span> <span class="o">*</span> <span class="mf">1e-3</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="s">" MHz</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Memory bus width: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">memoryBusWidth</span> <span class="o">&lt;&lt;</span> <span class="s">" bits</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  L2 cache size: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">l2CacheSize</span> <span class="o">/</span> <span class="mi">1024</span> <span class="o">&lt;&lt;</span> <span class="s">" KB</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Max shared memory per block: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">sharedMemPerBlock</span> <span class="o">/</span> <span class="mi">1024</span> <span class="o">&lt;&lt;</span> <span class="s">" KB</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Total constant memory: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">totalConstMem</span> <span class="o">/</span> <span class="mi">1024</span> <span class="o">&lt;&lt;</span> <span class="s">" KB</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Warp size: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">warpSize</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Max threads per block: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">maxThreadsPerBlock</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Max threads per multiprocessor: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">maxThreadsPerMultiProcessor</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Multiprocessor count: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">multiProcessorCount</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Max grid dimensions: ["</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">maxGridSize</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">&lt;&lt;</span> <span class="s">", "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">maxGridSize</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">&lt;&lt;</span> <span class="s">", "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">maxGridSize</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">&lt;&lt;</span> <span class="s">"]</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Max block dimensions: ["</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">maxThreadsDim</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">&lt;&lt;</span> <span class="s">", "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">maxThreadsDim</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">&lt;&lt;</span> <span class="s">", "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">maxThreadsDim</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">&lt;&lt;</span> <span class="s">"]</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Clock rate: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="p">(</span><span class="n">prop</span><span class="p">.</span><span class="n">clockRate</span> <span class="o">*</span> <span class="mf">1e-3</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="s">" MHz</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Concurrent kernels: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="p">(</span><span class="n">prop</span><span class="p">.</span><span class="n">concurrentKernels</span> <span class="o">?</span> <span class="s">"Yes"</span> <span class="o">:</span> <span class="s">"No"</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  ECC enabled: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="p">(</span><span class="n">prop</span><span class="p">.</span><span class="n">ECCEnabled</span> <span class="o">?</span> <span class="s">"Yes"</span> <span class="o">:</span> <span class="s">"No"</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Integrated device: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="p">(</span><span class="n">prop</span><span class="p">.</span><span class="n">integrated</span> <span class="o">?</span> <span class="s">"Yes"</span> <span class="o">:</span> <span class="s">"No"</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Can map host memory: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="p">(</span><span class="n">prop</span><span class="p">.</span><span class="n">canMapHostMemory</span> <span class="o">?</span> <span class="s">"Yes"</span> <span class="o">:</span> <span class="s">"No"</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Compute mode: "</span><span class="p">;</span>
        <span class="k">switch</span> <span class="p">(</span><span class="n">prop</span><span class="p">.</span><span class="n">computeMode</span><span class="p">)</span> <span class="p">{</span>
            <span class="k">case</span> <span class="n">cudaComputeModeDefault</span><span class="p">:</span>      <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Default</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span> <span class="k">break</span><span class="p">;</span>
            <span class="k">case</span> <span class="n">cudaComputeModeExclusive</span><span class="p">:</span>    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Exclusive</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span> <span class="k">break</span><span class="p">;</span>
            <span class="k">case</span> <span class="n">cudaComputeModeProhibited</span><span class="p">:</span>   <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Prohibited</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span> <span class="k">break</span><span class="p">;</span>
            <span class="k">case</span> <span class="n">cudaComputeModeExclusiveProcess</span><span class="p">:</span>
                                              <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Exclusive Process</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span> <span class="k">break</span><span class="p">;</span>
            <span class="nl">default:</span>                          <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Unknown</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span> <span class="k">break</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Unified addressing: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="p">(</span><span class="n">prop</span><span class="p">.</span><span class="n">unifiedAddressing</span> <span class="o">?</span> <span class="s">"Yes"</span> <span class="o">:</span> <span class="s">"No"</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Async engines: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">asyncEngineCount</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  Device overlap: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="p">(</span><span class="n">prop</span><span class="p">.</span><span class="n">deviceOverlap</span> <span class="o">?</span> <span class="s">"Yes"</span> <span class="o">:</span> <span class="s">"No"</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  PCI bus ID: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">pciBusID</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  PCI device ID: "</span> 
                  <span class="o">&lt;&lt;</span> <span class="n">prop</span><span class="p">.</span><span class="n">pciDeviceID</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The main ingredient is <code class="language-plaintext highlighter-rouge">cudaDeviceProp</code>, a struct available through <code class="language-plaintext highlighter-rouge">cuda_runtime.h</code> (see the documentation <a href="https://docs.nvidia.com/cuda/cuda-runtime-api/structcudaDeviceProp.html">here</a>) that holds the properties of a device. Before printing anything on screen we count the devices, then loop over all of them and print the properties stored in each device’s property struct. Let’s see the output for an A100 GPU from lambda.ai:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>Detected 1 CUDA Capable Device<span class="o">(</span>s<span class="o">)</span>

Device 0: NVIDIA A100-PCIE-40GB
  PCI Domain/Bus/Device ID: 0/7/0
  Compute capability: 8.0
  Total global memory: 40442.4 MB
  Free memory <span class="o">(</span>current<span class="o">)</span>: 40019.6 MB
  Total allocatable memory <span class="o">(</span>current<span class="o">)</span>: 40442.4 MB
  Memory clock rate: 1215 MHz
  Memory bus width: 5120 bits
  L2 cache size: 40960 KB
  Max shared memory per block: 48 KB
  Total constant memory: 64 KB
  Warp size: 32
  Max threads per block: 1024
  Max threads per multiprocessor: 2048
  Multiprocessor count: 108
  Max grid dimensions: <span class="o">[</span>2147483647, 65535, 65535]
  Max block dimensions: <span class="o">[</span>1024, 1024, 64]
  Clock rate: 1410 MHz
  Concurrent kernels: Yes
  ECC enabled: Yes
  Integrated device: No
  Can map host memory: Yes
  Compute mode: Default
  Unified addressing: Yes
  Async engines: 3
  Device overlap: Yes
  PCI bus ID: 7
  PCI device ID: 0
</pre></td></tr></tbody></table></code></pre></div></div>

<p>It tells us that the global memory is around 40 GB and mostly free. The warp size is 32, as on every Nvidia architecture to date. The maximum number of threads per block is 1024, also very common, and the maximum block dimensions are [1024, 1024, 64]. I like this tool because it tells me my limits when I write kernels in CUDA (Nvidia’s API for programming the GPU).</p>
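
<p>As a rough sketch of how these limits get used, the helpers below check a block shape against the numbers printed above and compute how many blocks a 1D launch needs. The names <code class="language-plaintext highlighter-rouge">blockFits</code> and <code class="language-plaintext highlighter-rouge">blocksFor</code> are hypothetical, and the constants are hard-coded from the A100 output purely for illustration; in real code you would read them from <code class="language-plaintext highlighter-rouge">cudaDeviceProp</code>. Note that [1024, 1024, 64] are per-dimension maxima: the product of the three dimensions must still stay within the 1024 threads per block.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cstddef&gt;
#include &lt;stdexcept&gt;

// Limits copied from the A100 output above (assumption: other cards differ,
// so real code should take these values from cudaDeviceProp).
constexpr std::size_t kMaxThreadsPerBlock = 1024;           // prop.maxThreadsPerBlock
constexpr std::size_t kMaxBlockDim[3]     = {1024, 1024, 64}; // prop.maxThreadsDim
constexpr std::size_t kMaxGridDimX        = 2147483647;     // prop.maxGridSize[0]

// Hypothetical helper: true if a block of shape (x, y, z) respects both the
// per-dimension limits and the total threads-per-block limit.
bool blockFits(std::size_t x, std::size_t y, std::size_t z) {
    if (x == 0 || y == 0 || z == 0) return false;
    if (x &gt; kMaxBlockDim[0] || y &gt; kMaxBlockDim[1] || z &gt; kMaxBlockDim[2]) return false;
    return x * y * z &lt;= kMaxThreadsPerBlock;
}

// Hypothetical helper: number of blocks needed to cover n elements with one
// thread per element in a 1D launch (ceiling division).
std::size_t blocksFor(std::size_t n, std::size_t threadsPerBlock) {
    if (threadsPerBlock == 0 || threadsPerBlock &gt; kMaxThreadsPerBlock) {
        throw std::invalid_argument("threadsPerBlock outside the device limit");
    }
    std::size_t blocks = n / threadsPerBlock + (n % threadsPerBlock != 0 ? 1 : 0);
    if (blocks &gt; kMaxGridDimX) {
        throw std::out_of_range("grid too large for a 1D launch");
    }
    return blocks;
}
</code></pre></div></div>

<p>For example, covering 1,000,000 elements with 256 threads per block takes <code class="language-plaintext highlighter-rouge">blocksFor(1000000, 256)</code> = 3907 blocks, and a (32, 32, 1) block fits while a (32, 32, 2) block does not.</p>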

<h2 id="gpu-allocate-tool">GPU allocate tool</h2>

<p>This tool is a bit different: it can be used to reserve a chunk of GPU memory, and it serves as a hello world example of how to write basic CUDA code. Write the following into <code class="language-plaintext highlighter-rouge">gpu_allocate.cu</code>:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;cuda_runtime.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;string&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;cstdlib&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;chrono&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;thread&gt;</span><span class="cp">
</span>
<span class="c1">// Helper to parse size strings like 1024, 100M, 2G, etc.</span>
<span class="kt">size_t</span> <span class="nf">parseSize</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">string</span><span class="o">&amp;</span> <span class="n">s</span><span class="p">)</span> <span class="p">{</span>
    <span class="kt">char</span> <span class="n">unit</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">back</span><span class="p">();</span>
    <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">num</span> <span class="o">=</span> <span class="n">s</span><span class="p">;</span>
    <span class="kt">size_t</span> <span class="n">multiplier</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">unit</span> <span class="o">==</span> <span class="sc">'K'</span> <span class="o">||</span> <span class="n">unit</span> <span class="o">==</span> <span class="sc">'k'</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">multiplier</span> <span class="o">=</span> <span class="mi">1024ULL</span><span class="p">;</span>
        <span class="n">num</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">substr</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">size</span><span class="p">()</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
    <span class="p">}</span> <span class="k">else</span> <span class="nf">if</span> <span class="p">(</span><span class="n">unit</span> <span class="o">==</span> <span class="sc">'M'</span> <span class="o">||</span> <span class="n">unit</span> <span class="o">==</span> <span class="sc">'m'</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">multiplier</span> <span class="o">=</span> <span class="mi">1024ULL</span> <span class="o">*</span> <span class="mi">1024ULL</span><span class="p">;</span>
        <span class="n">num</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">substr</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">size</span><span class="p">()</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
    <span class="p">}</span> <span class="k">else</span> <span class="nf">if</span> <span class="p">(</span><span class="n">unit</span> <span class="o">==</span> <span class="sc">'G'</span> <span class="o">||</span> <span class="n">unit</span> <span class="o">==</span> <span class="sc">'g'</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">multiplier</span> <span class="o">=</span> <span class="mi">1024ULL</span> <span class="o">*</span> <span class="mi">1024ULL</span> <span class="o">*</span> <span class="mi">1024ULL</span><span class="p">;</span>
        <span class="n">num</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">substr</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">size</span><span class="p">()</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="kt">size_t</span><span class="o">&gt;</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">stoull</span><span class="p">(</span><span class="n">num</span><span class="p">)</span> <span class="o">*</span> <span class="n">multiplier</span><span class="p">);</span>
<span class="p">}</span>

<span class="c1">// Helper to parse time strings like 10s, 5m, 1h, or raw seconds</span>
<span class="kt">long</span> <span class="nf">parseTime</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">string</span><span class="o">&amp;</span> <span class="n">s</span><span class="p">)</span> <span class="p">{</span>
    <span class="kt">char</span> <span class="n">unit</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">back</span><span class="p">();</span>
    <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">num</span> <span class="o">=</span> <span class="n">s</span><span class="p">;</span>
    <span class="kt">long</span> <span class="n">multiplier</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">unit</span> <span class="o">==</span> <span class="sc">'s'</span> <span class="o">||</span> <span class="n">unit</span> <span class="o">==</span> <span class="sc">'S'</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">multiplier</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
        <span class="n">num</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">substr</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">size</span><span class="p">()</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
    <span class="p">}</span> <span class="k">else</span> <span class="nf">if</span> <span class="p">(</span><span class="n">unit</span> <span class="o">==</span> <span class="sc">'m'</span> <span class="o">||</span> <span class="n">unit</span> <span class="o">==</span> <span class="sc">'M'</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">multiplier</span> <span class="o">=</span> <span class="mi">60</span><span class="p">;</span>
        <span class="n">num</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">substr</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">size</span><span class="p">()</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
    <span class="p">}</span> <span class="k">else</span> <span class="nf">if</span> <span class="p">(</span><span class="n">unit</span> <span class="o">==</span> <span class="sc">'h'</span> <span class="o">||</span> <span class="n">unit</span> <span class="o">==</span> <span class="sc">'H'</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">multiplier</span> <span class="o">=</span> <span class="mi">3600</span><span class="p">;</span>
        <span class="n">num</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">substr</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">size</span><span class="p">()</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="kt">long</span><span class="o">&gt;</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">stol</span><span class="p">(</span><span class="n">num</span><span class="p">)</span> <span class="o">*</span> <span class="n">multiplier</span><span class="p">);</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">(</span><span class="kt">int</span> <span class="n">argc</span><span class="p">,</span> <span class="kt">char</span><span class="o">*</span> <span class="n">argv</span><span class="p">[])</span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">argc</span> <span class="o">!=</span> <span class="mi">4</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cerr</span> <span class="o">&lt;&lt;</span> <span class="s">"Usage: "</span> <span class="o">&lt;&lt;</span> <span class="n">argv</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">&lt;&lt;</span> <span class="s">" &lt;gpu_id&gt; &lt;memory_amount (e.g., 512M, 1G, or bytes)&gt; &lt;duration (e.g., 10s, 5m, 1h)&gt;"</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
        <span class="k">return</span> <span class="n">EXIT_FAILURE</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="kt">int</span> <span class="n">gpuId</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">stoi</span><span class="p">(</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">]);</span>
    <span class="kt">size_t</span> <span class="n">bytes</span> <span class="o">=</span> <span class="n">parseSize</span><span class="p">(</span><span class="n">argv</span><span class="p">[</span><span class="mi">2</span><span class="p">]);</span>
    <span class="kt">long</span> <span class="n">duration</span> <span class="o">=</span> <span class="n">parseTime</span><span class="p">(</span><span class="n">argv</span><span class="p">[</span><span class="mi">3</span><span class="p">]);</span>

    <span class="n">cudaError_t</span> <span class="n">err</span> <span class="o">=</span> <span class="n">cudaSetDevice</span><span class="p">(</span><span class="n">gpuId</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">err</span> <span class="o">!=</span> <span class="n">cudaSuccess</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cerr</span> <span class="o">&lt;&lt;</span> <span class="s">"Error setting GPU device "</span> <span class="o">&lt;&lt;</span> <span class="n">gpuId</span> <span class="o">&lt;&lt;</span> <span class="s">": "</span> <span class="o">&lt;&lt;</span> <span class="n">cudaGetErrorString</span><span class="p">(</span><span class="n">err</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
        <span class="k">return</span> <span class="n">EXIT_FAILURE</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="kt">void</span><span class="o">*</span> <span class="n">d_ptr</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
    <span class="n">err</span> <span class="o">=</span> <span class="n">cudaMalloc</span><span class="p">(</span><span class="o">&amp;</span><span class="n">d_ptr</span><span class="p">,</span> <span class="n">bytes</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">err</span> <span class="o">!=</span> <span class="n">cudaSuccess</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cerr</span> <span class="o">&lt;&lt;</span> <span class="s">"Error allocating "</span> <span class="o">&lt;&lt;</span> <span class="n">bytes</span> <span class="o">&lt;&lt;</span> <span class="s">" bytes on GPU "</span> <span class="o">&lt;&lt;</span> <span class="n">gpuId</span>
                  <span class="o">&lt;&lt;</span> <span class="s">": "</span> <span class="o">&lt;&lt;</span> <span class="n">cudaGetErrorString</span><span class="p">(</span><span class="n">err</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
        <span class="k">return</span> <span class="n">EXIT_FAILURE</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Successfully allocated "</span> <span class="o">&lt;&lt;</span> <span class="n">bytes</span> <span class="o">&lt;&lt;</span> <span class="s">" bytes on GPU "</span> <span class="o">&lt;&lt;</span> <span class="n">gpuId</span>
              <span class="o">&lt;&lt;</span> <span class="s">", holding for "</span> <span class="o">&lt;&lt;</span> <span class="n">duration</span> <span class="o">&lt;&lt;</span> <span class="s">" seconds..."</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>

    <span class="c1">// Keep the allocation alive for the specified duration</span>
    <span class="n">std</span><span class="o">::</span><span class="n">this_thread</span><span class="o">::</span><span class="n">sleep_for</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">chrono</span><span class="o">::</span><span class="n">seconds</span><span class="p">(</span><span class="n">duration</span><span class="p">));</span>

    <span class="c1">// Free the allocation and exit</span>
    <span class="n">cudaFree</span><span class="p">(</span><span class="n">d_ptr</span><span class="p">);</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Freed memory and exiting."</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
    <span class="k">return</span> <span class="n">EXIT_SUCCESS</span><span class="p">;</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Let’s take a look at <code class="language-plaintext highlighter-rouge">main</code>. It is a command line tool that takes three inputs through the <code class="language-plaintext highlighter-rouge">argv</code> argument: the <code class="language-plaintext highlighter-rouge">gpuId</code>, the number of <code class="language-plaintext highlighter-rouge">bytes</code> and the <code class="language-plaintext highlighter-rouge">duration</code> in seconds. Basically, we want to allocate a number of bytes on a specific GPU for a certain amount of time. The next part selects the GPU with the <code class="language-plaintext highlighter-rouge">cudaSetDevice</code> function and allocates the memory with <code class="language-plaintext highlighter-rouge">cudaMalloc(&amp;d_ptr, bytes)</code>, where <code class="language-plaintext highlighter-rouge">d_ptr</code> is a pointer to void. Then, on the CPU side, we sleep for the number of seconds we selected with <code class="language-plaintext highlighter-rouge">std::this_thread::sleep_for(std::chrono::seconds(duration))</code>, and once that time has elapsed we deallocate the memory with <code class="language-plaintext highlighter-rouge">cudaFree</code> and exit the program with a success code. The functions <code class="language-plaintext highlighter-rouge">parseSize</code> and <code class="language-plaintext highlighter-rouge">parseTime</code> are just two helpers that convert sizes given in kilobytes, megabytes or gigabytes to bytes, and times given in hours, minutes or seconds to seconds.</p>
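
<p>The size helper assumes well-formed input: an empty argument makes <code class="language-plaintext highlighter-rouge">s.back()</code> undefined behaviour, <code class="language-plaintext highlighter-rouge">std::stoull</code> stops silently at the first non-digit of something like <code class="language-plaintext highlighter-rouge">12abc</code>, and a negative number wraps around to a huge value. A possible hardened variant could look like the sketch below. The name <code class="language-plaintext highlighter-rouge">parseSizeChecked</code> is hypothetical and not part of the tool above; the same idea would apply to the time helper.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cstddef&gt;
#include &lt;limits&gt;
#include &lt;stdexcept&gt;
#include &lt;string&gt;

// Hypothetical variant of the size helper that rejects malformed input
// with an exception instead of crashing or silently misparsing.
std::size_t parseSizeChecked(const std::string&amp; s) {
    if (s.empty()) {
        throw std::invalid_argument("empty size");
    }

    std::size_t multiplier = 1;
    std::string num = s;
    switch (s.back()) {
        case 'K': case 'k': multiplier = 1024ULL; break;
        case 'M': case 'm': multiplier = 1024ULL * 1024ULL; break;
        case 'G': case 'g': multiplier = 1024ULL * 1024ULL * 1024ULL; break;
        default: break;  // no suffix: the value is already in bytes
    }
    if (multiplier != 1) {
        num.pop_back();  // drop the unit suffix
    }

    // Accept digits only, so "-1", "12abc" and a bare "G" are all rejected.
    if (num.empty() || num.find_first_not_of("0123456789") != std::string::npos) {
        throw std::invalid_argument("invalid size: " + s);
    }

    unsigned long long value = std::stoull(num);  // throws std::out_of_range if too large
    if (value &gt; std::numeric_limits&lt;std::size_t&gt;::max() / multiplier) {
        throw std::out_of_range("size overflows size_t: " + s);
    }
    return static_cast&lt;std::size_t&gt;(value * multiplier);
}
</code></pre></div></div>

<p>With this variant <code class="language-plaintext highlighter-rouge">512M</code> maps to 536870912 bytes, and <code class="language-plaintext highlighter-rouge">main</code> could catch the exception and print the usage message instead of aborting.</p>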

<h2 id="the-cmakeliststxt-file">The CMakeLists.txt file</h2>

<p>CMake is a powerful command line tool that generates the build files (for example a Makefile) for your project. It is very convenient for C++ and CUDA projects. Write a <code class="language-plaintext highlighter-rouge">CMakeLists.txt</code> file with this content:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>cmake_minimum_required<span class="o">(</span>VERSION 3.10<span class="o">)</span>
project<span class="o">(</span>GPUTools LANGUAGES CXX CUDA<span class="o">)</span>

<span class="c"># default exec names</span>
<span class="nb">set</span><span class="o">(</span>GPU_INFO_OUT_NAME    <span class="s2">"gpu_info"</span>
    CACHE STRING <span class="s2">"Name of the gpu_info executable"</span><span class="o">)</span>
<span class="nb">set</span><span class="o">(</span>GPU_ALLOC_OUT_NAME   <span class="s2">"gpu_allocate"</span>
    CACHE STRING <span class="s2">"Name of the gpu_allocate executable"</span><span class="o">)</span>

<span class="c"># Restore old FindCUDA behavior if needed</span>
<span class="k">if</span><span class="o">(</span>POLICY CMP0146<span class="o">)</span>
  cmake_policy<span class="o">(</span>SET CMP0146 OLD<span class="o">)</span>
endif<span class="o">()</span>

<span class="c"># Language standards</span>
<span class="nb">set</span><span class="o">(</span>CMAKE_CXX_STANDARD      14<span class="o">)</span>
<span class="nb">set</span><span class="o">(</span>CMAKE_CXX_STANDARD_REQUIRED ON<span class="o">)</span>

<span class="nb">set</span><span class="o">(</span>CMAKE_CUDA_ARCHITECTURES 80 CACHE STRING
    <span class="s2">"List of CUDA architectures to build for (e.g. 61;70;75;86)"</span><span class="o">)</span>

<span class="c"># Find CUDA (for older CMake) or you can use find_package(CUDAToolkit) in 3.17+</span>
find_package<span class="o">(</span>CUDA REQUIRED<span class="o">)</span>
include_directories<span class="o">(</span><span class="k">${</span><span class="nv">CUDA_INCLUDE_DIRS</span><span class="k">}</span><span class="o">)</span>

add_executable<span class="o">(</span>gpu_info
  src/gpu_info.cu
<span class="o">)</span>
target_link_libraries<span class="o">(</span>gpu_info
  PRIVATE <span class="k">${</span><span class="nv">CUDA_CUDART_LIBRARY</span><span class="k">}</span>
<span class="o">)</span>
set_target_properties<span class="o">(</span>gpu_info
  PROPERTIES OUTPUT_NAME <span class="k">${</span><span class="nv">GPU_INFO_OUT_NAME</span><span class="k">}</span>
<span class="o">)</span>

add_executable<span class="o">(</span>gpu_allocate
  src/gpu_allocate.cu
<span class="o">)</span>
target_link_libraries<span class="o">(</span>gpu_allocate
  PRIVATE <span class="k">${</span><span class="nv">CUDA_CUDART_LIBRARY</span><span class="k">}</span>
<span class="o">)</span>
set_target_properties<span class="o">(</span>gpu_allocate
  PROPERTIES OUTPUT_NAME <span class="k">${</span><span class="nv">GPU_ALLOC_OUT_NAME</span><span class="k">}</span>
<span class="o">)</span>

<span class="c"># (Optional) If you want to give a different on-disk name:</span>
<span class="c"># set(EXE_NAME alloc_mem)</span>
<span class="c"># set_target_properties(gpu_allocate PROPERTIES OUTPUT_NAME ${EXE_NAME})</span>

<span class="c"># Installation</span>
<span class="nb">install</span><span class="o">(</span>TARGETS
  gpu_info
  gpu_allocate
  RUNTIME DESTINATION bin
<span class="o">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Two variables have default values that can be overridden from the cmake command line: <code class="language-plaintext highlighter-rouge">GPU_INFO_OUT_NAME</code> and <code class="language-plaintext highlighter-rouge">GPU_ALLOC_OUT_NAME</code>, the names of the two executables. We also set the <code class="language-plaintext highlighter-rouge">C++</code> standard, and <code class="language-plaintext highlighter-rouge">CMAKE_CUDA_ARCHITECTURES</code> selects the GPU architecture we are compiling for:</p>

<table>
  <thead>
    <tr>
      <th>GPU Architecture</th>
      <th>NVCC Arch Flag</th>
      <th>Compute Capability</th>
      <th>Example GPUs</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Kepler</td>
      <td><code class="language-plaintext highlighter-rouge">sm_30</code></td>
      <td>3.0</td>
      <td>GTX 680, Tesla K10</td>
    </tr>
    <tr>
      <td>Maxwell</td>
      <td><code class="language-plaintext highlighter-rouge">sm_50</code></td>
      <td>5.0</td>
      <td>GTX 750 Ti, GTX 750</td>
    </tr>
    <tr>
      <td>Pascal</td>
      <td><code class="language-plaintext highlighter-rouge">sm_60</code></td>
      <td>6.0</td>
      <td>Tesla P100, Quadro GP100</td>
    </tr>
    <tr>
      <td>Volta</td>
      <td><code class="language-plaintext highlighter-rouge">sm_70</code></td>
      <td>7.0</td>
      <td>Tesla V100</td>
    </tr>
    <tr>
      <td><strong>Turing</strong></td>
      <td><code class="language-plaintext highlighter-rouge">sm_75</code></td>
      <td>7.5</td>
      <td>RTX 2080, Quadro RTX 6000</td>
    </tr>
    <tr>
      <td>Ampere (A100)</td>
      <td><code class="language-plaintext highlighter-rouge">sm_80</code></td>
      <td>8.0</td>
      <td>A100, A30</td>
    </tr>
    <tr>
      <td><strong>Ampere (GA10x)</strong></td>
      <td><code class="language-plaintext highlighter-rouge">sm_86</code></td>
      <td>8.6</td>
      <td>RTX 3090, 3080, 3070, A10</td>
    </tr>
    <tr>
      <td>Ada Lovelace</td>
      <td><code class="language-plaintext highlighter-rouge">sm_89</code></td>
      <td>8.9</td>
      <td>RTX 4090, 4080</td>
    </tr>
    <tr>
      <td>Hopper</td>
      <td><code class="language-plaintext highlighter-rouge">sm_90</code></td>
      <td>9.0</td>
      <td>H100</td>
    </tr>
  </tbody>
</table>
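
<p>As a side note, the number in the flag column is just the compute capability with the dot removed. A minimal host-only C++ sketch of that mapping follows; the helper name <code class="language-plaintext highlighter-rouge">cuda_arch_number</code> is made up for illustration and is not part of the repository.</p>

```cpp
// Hypothetical helper (not part of the repository): turn a compute
// capability such as 8.0 into the number used in the table above,
// e.g. 80 for sm_80 and for CMAKE_CUDA_ARCHITECTURES=80.
// On a real device, major and minor would come from the `major` and
// `minor` fields of cudaDeviceProp.
constexpr int cuda_arch_number(int major, int minor) {
    return major * 10 + minor;
}
```

<p>For instance, <code class="language-plaintext highlighter-rouge">cuda_arch_number(8, 6)</code> gives <code class="language-plaintext highlighter-rouge">86</code>, the value to use for an RTX 3090.</p>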

<p>In our case the architecture is given by <code class="language-plaintext highlighter-rouge">Compute capability: 8.0</code> in the information printed on screen in the previous section, that is, our GPU is an A100. A general solution is <code class="language-plaintext highlighter-rouge">set(CMAKE_CUDA_ARCHITECTURES all CACHE STRING "Target all architectures")</code> (the <code class="language-plaintext highlighter-rouge">all</code> value requires CMake 3.23 or newer), which makes the code compatible with any supported card but increases compilation time and binary size. For modern GPUs you can do <code class="language-plaintext highlighter-rouge">set(CMAKE_CUDA_ARCHITECTURES 75 80 86 89 CACHE STRING "Target common modern architectures")</code>. Then we need to find the CUDA libraries with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>find_package<span class="o">(</span>CUDA REQUIRED<span class="o">)</span>
include_directories<span class="o">(</span><span class="k">${</span><span class="nv">CUDA_INCLUDE_DIRS</span><span class="k">}</span><span class="o">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This also adds the CUDA headers to the project. For CMake 3.17 or newer, declaring <code class="language-plaintext highlighter-rouge">project(GPUTools LANGUAGES CXX CUDA)</code> would be enough, without even adding the CUDA headers through <code class="language-plaintext highlighter-rouge">include_directories</code>. Finally we tell cmake which executables to compile, the libraries to link and the name of each executable. The last instruction installs the two executables.</p>

<p>To compile the two executables we need to run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-rf</span> build <span class="o">&amp;&amp;</span> <span class="nb">mkdir </span>build <span class="o">&amp;&amp;</span> <span class="nb">cd </span>build
cmake ..
cmake <span class="nt">--build</span> <span class="nb">.</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>which creates them in the <code class="language-plaintext highlighter-rouge">build</code> directory. If you also want to be able to run them from anywhere in your shell, you can install them in <code class="language-plaintext highlighter-rouge">$HOME/.local/bin</code> with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-rf</span> build <span class="o">&amp;&amp;</span> <span class="nb">mkdir </span>build <span class="o">&amp;&amp;</span> <span class="nb">cd </span>build
cmake <span class="se">\</span>
  <span class="nt">-DGPU_INFO_OUT_NAME</span><span class="o">=</span>gpu_info <span class="se">\</span>
  <span class="nt">-DGPU_ALLOC_OUT_NAME</span><span class="o">=</span>gpu_allocate <span class="se">\</span>
  <span class="nt">-DCMAKE_CUDA_ARCHITECTURES</span><span class="o">=</span><span class="s2">"70;75;80"</span> <span class="se">\</span>
  <span class="nt">-DCMAKE_INSTALL_PREFIX</span><span class="o">=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/.local <span class="se">\</span>
  ..
cmake <span class="nt">--build</span> <span class="nb">.</span>
cmake <span class="nt">--install</span> <span class="nb">.</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>where <code class="language-plaintext highlighter-rouge">GPU_INFO_OUT_NAME</code> and <code class="language-plaintext highlighter-rouge">GPU_ALLOC_OUT_NAME</code> are the names of the executables (should we want to change them), <code class="language-plaintext highlighter-rouge">CMAKE_CUDA_ARCHITECTURES</code> lists the GPU architectures we want to compile for, and <code class="language-plaintext highlighter-rouge">CMAKE_INSTALL_PREFIX</code> is the install directory. Even though we set the latter to <code class="language-plaintext highlighter-rouge">${HOME}/.local</code>, the binaries are installed in <code class="language-plaintext highlighter-rouge">${HOME}/.local/bin</code> because of the <code class="language-plaintext highlighter-rouge">RUNTIME DESTINATION bin</code> clause in the install command of the <code class="language-plaintext highlighter-rouge">CMakeLists.txt</code>.</p>

<h2 id="bonus-nvitop">Bonus: nvitop</h2>

<p><a href="https://github.com/XuehaiPan/nvitop">nvitop</a> is a great tool to monitor your GPU. I personally like it better than <code class="language-plaintext highlighter-rouge">nvidia-smi</code>, which is the default Nvidia “top”. The tool comes as a Python package, so it’s best to install it in a new Python virtual environment. We have covered this before in this blog so I won’t go into detail. Just create a virtual environment called <code class="language-plaintext highlighter-rouge">nvitop</code> in <code class="language-plaintext highlighter-rouge">$HOME/.venvs</code> and pip install the tool:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="nb">mkdir</span> <span class="nt">-p</span> <span class="nv">$HOME</span>/.venvs
python <span class="nt">-m</span> venv <span class="nv">$HOME</span>/.venvs/nvitop
<span class="nv">$HOME</span>/.venvs/nvitop/bin/pip <span class="nb">install </span>nvitop
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now you can execute <code class="language-plaintext highlighter-rouge">nvitop</code> with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nv">$HOME</span>/.venvs/nvitop/bin/nvitop
</pre></td></tr></tbody></table></code></pre></div></div>

<p>or activating your environment and running <code class="language-plaintext highlighter-rouge">nvitop</code> in the command line. It is better to create a symlink to <code class="language-plaintext highlighter-rouge">${HOME}/.local/bin</code> directory so that the command line is in your <code class="language-plaintext highlighter-rouge">$PATH</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">ln</span> <span class="nt">-s</span>  <span class="nv">$HOME</span>/.venvs/nvitop/bin/nvitop <span class="nv">$HOME</span>/.local/bin/nvitop
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The nvitop package does not seem to be platform specific: the wheel I currently find for the most recent version <code class="language-plaintext highlighter-rouge">1.5.1</code> on <a href="https://pypi.org/project/nvitop/1.5.1/#files">PyPI</a> is <code class="language-plaintext highlighter-rouge">nvitop-1.5.1-py3-none-any.whl</code>. So this wheel should work on ARM64 (the new generation of Grace Hopper and Blackwell systems with integrated GPU and CPU) as well as on x86 CPU architectures.</p>

<p>Now we have the execs in <code class="language-plaintext highlighter-rouge">${HOME}/.local/bin</code> that should be in our <code class="language-plaintext highlighter-rouge">$PATH</code>.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Hope you liked these tools I built, so far they have been useful for me. I will probably build more in the near future so I will post again about this.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="C++" /><category term="CUDA" /><category term="computer science" /><category term="GPU" /><summary type="html"><![CDATA[This is my first post in CUDA, I have been working for a while using this technology and want to share some utilities that can be useful for newcomers to the field. All the code will be available in my github repository subdirectory cuda-utils. I would like to acknowledge Lambda.ai for providing me with free credits for my blog. I will be testing this code with a machine gpu_1x_a100_sxm4 which has an A100 GPU, the Ampere GPU architecture. This GPU is a bit old these days but we won’t be doing any heavy compute so this will suffice.]]></summary></entry><entry><title type="html">Nvidia-GPU Performance</title><link href="https://agramunt.me/posts/cuda-performance/" rel="alternate" type="text/html" title="Nvidia-GPU Performance" /><published>2025-07-12T21:35:00-07:00</published><updated>2025-10-24T22:26:33-07:00</updated><id>https://agramunt.me/posts/cuda-performance</id><content type="html" xml:base="https://agramunt.me/posts/cuda-performance/"><![CDATA[<p>GPUs or Graphics Processing Units have become essential in high performance computing these days. They are efficient hardware that can parallelize small calculations. For instance, in Machine Learning (ML) and Artificial Intelligence (AI), GPUs are ubiquitous, as almost all operations are matrix multiplications, convolutions, max pooling… Operations that can be parallelized easily. However, GPUs are not suitable for every kind of calculation, in this post we will understand when it pays off to bring the calculation to GPU for a very simple example.</p>

<p>I thank <a href="https://lambda.ai/">Lambda AI</a> for providing free credit to run the experiments described in the post. Throughout this post we will be using <code class="language-plaintext highlighter-rouge">gpu_1x_a100_sxm4</code>, an A100 GPU (Ampere architecture), the same one used in the post <a href="../cuda-utils">CUDA Utils</a>. As always the code can be found in my <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/cuda-performance">github repository</a>.</p>

<h2 id="the-problem">The problem</h2>

<p>Given two arrays of <code class="language-plaintext highlighter-rouge">float</code>s of length <code class="language-plaintext highlighter-rouge">N</code>, we compute their element-wise sum <code class="language-plaintext highlighter-rouge">m</code> times. That is, for each element <code class="language-plaintext highlighter-rouge">a</code> of $\vec{a}$ and the corresponding element <code class="language-plaintext highlighter-rouge">b</code> of $\vec{b}$ we make the sum <code class="language-plaintext highlighter-rouge">m</code> times, a total of $N \times m$ floating point sums. We can write the CPU code in C++ as</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
</pre></td><td class="rouge-code"><pre><span class="c1">// Create host a, b, c</span>
<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;</span> <span class="n">h_a</span><span class="p">(</span><span class="n">N</span><span class="p">),</span> <span class="n">h_b</span><span class="p">(</span><span class="n">N</span><span class="p">),</span> <span class="n">h_c_cpu</span><span class="p">(</span><span class="n">N</span><span class="p">);</span>

<span class="c1">// Fill a and b with random floats 0 to 1</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">N</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">h_a</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">rand</span><span class="p">())</span> <span class="o">/</span> <span class="n">RAND_MAX</span><span class="p">;</span>
    <span class="n">h_b</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">rand</span><span class="p">())</span> <span class="o">/</span> <span class="n">RAND_MAX</span><span class="p">;</span>
<span class="p">}</span>

<span class="c1">// Sum each element of the arrays i...N, a total of m times</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">N</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
    <span class="kt">double</span> <span class="n">acc</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="kt">double</span> <span class="n">s</span> <span class="o">=</span> <span class="n">h_a</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">h_b</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">j</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">j</span> <span class="o">&lt;</span> <span class="n">m</span><span class="o">-</span><span class="mi">1</span><span class="p">;</span> <span class="o">++</span><span class="n">j</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">acc</span> <span class="o">=</span> <span class="n">acc</span> <span class="o">+</span> <span class="n">s</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="n">h_c_cpu</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">acc</span><span class="p">;</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>I agree the code seems a bit useless, why don’t we do…</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="kt">double</span> <span class="n">s</span> <span class="o">=</span> <span class="p">(</span><span class="n">h_a</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">+</span> <span class="n">h_b</span><span class="p">[</span><span class="n">i</span><span class="p">])</span> <span class="o">*</span> <span class="n">m</span><span class="p">;</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>So that it would be only <code class="language-plaintext highlighter-rouge">N</code> additions? Because we want to explicitly perform the sum <code class="language-plaintext highlighter-rouge">m</code> times per array element. We also want to avoid adding constants in this loop, e.g. <code class="language-plaintext highlighter-rouge">acc = acc + 1</code>. That could let the compiler optimize the loop and make the code run much faster, whereas the point of this calculation is to perform <code class="language-plaintext highlighter-rouge">m</code> sums for each array element.</p>
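
<p>To make the operation count explicit, here is a small host-only sketch of the work done for a single pair of elements. The function name <code class="language-plaintext highlighter-rouge">repeated_sum</code> is an assumption for illustration; the body mirrors the CPU loop above: one addition to form <code class="language-plaintext highlighter-rouge">s</code> plus <code class="language-plaintext highlighter-rouge">m-1</code> accumulations, i.e. <code class="language-plaintext highlighter-rouge">m</code> additions in total, which leaves <code class="language-plaintext highlighter-rouge">(m-1)*(a+b)</code> in the accumulator.</p>

```cpp
// Illustrative helper (the name is hypothetical): the per-element work of
// the CPU loop above, written as a C++14 constexpr function so that it can
// be checked at compile time.
constexpr double repeated_sum(float a, float b, int m) {
    double acc = 0;
    double s = a + b;                  // 1 addition
    for (int j = 0; j < m - 1; ++j) {
        acc = acc + s;                 // m-1 additions
    }
    return acc;                        // equals (m-1) * (a + b)
}
```

<p>With <code class="language-plaintext highlighter-rouge">a = 0.25</code>, <code class="language-plaintext highlighter-rouge">b = 0.5</code> and <code class="language-plaintext highlighter-rouge">m = 5</code> the loop accumulates <code class="language-plaintext highlighter-rouge">0.75</code> four times and returns <code class="language-plaintext highlighter-rouge">3.0</code>.</p>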

<h2 id="gpu-kernel-for-vector-addition">GPU kernel for vector addition</h2>

<p>This is one of the simplest CUDA kernels one can write, but before explaining it, if you are new to CUDA programming, please take a look at the <a href="https://developer.nvidia.com/blog/cuda-refresher-cuda-programming-model/">CUDA programming model</a>. Make sure you understand what a <a href="https://modal.com/gpu-glossary/device-software/thread">thread</a>, a <a href="https://modal.com/gpu-glossary/device-software/thread-block">block</a> and a <a href="https://modal.com/gpu-glossary/device-software/thread-block-grid">grid</a> are. A good visualization of the model can be found <a href="https://harmanani.github.io/classes/csc447/Notes/Lecture15.pdf">here</a>. Another great resource is <a href="https://www.goodreads.com/book/show/7659954-programming-massively-parallel-processors">Programming Massively Parallel Processors: A Hands-on Approach</a>, which is my reference book for GPU programming.</p>

<p>Our CUDA kernel for the addition of two vectors should look like this:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
</pre></td><td class="rouge-code"><pre><span class="k">template</span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="n">__global__</span> <span class="kt">void</span> <span class="nf">vectorAdd</span><span class="p">(</span>
    <span class="k">const</span> <span class="n">T</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">a</span><span class="p">,</span>        <span class="c1">// vector a</span>
    <span class="k">const</span> <span class="n">T</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">b</span><span class="p">,</span>        <span class="c1">// vector b</span>
    <span class="n">T</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">c</span><span class="p">,</span>              <span class="c1">// vector c</span>
    <span class="kt">int</span> <span class="n">n</span><span class="p">,</span>                          <span class="c1">// number of elements of a, b, c</span>
    <span class="kt">int</span> <span class="n">m</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>                        <span class="c1">// number of additions</span>
    <span class="p">{</span>
    <span class="kt">int</span> <span class="n">idx</span> <span class="o">=</span> <span class="n">blockDim</span><span class="p">.</span><span class="n">x</span> <span class="o">*</span> <span class="n">blockIdx</span><span class="p">.</span><span class="n">x</span> <span class="o">+</span> <span class="n">threadIdx</span><span class="p">.</span><span class="n">x</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">idx</span> <span class="o">&gt;=</span> <span class="n">n</span><span class="p">)</span> <span class="k">return</span><span class="p">;</span>
    <span class="n">T</span> <span class="n">s</span> <span class="o">=</span> <span class="n">a</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span> <span class="o">+</span> <span class="n">b</span><span class="p">[</span><span class="n">idx</span><span class="p">];</span>
    <span class="n">T</span> <span class="n">acc</span> <span class="o">=</span> <span class="n">T</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">j</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">j</span> <span class="o">&lt;</span> <span class="n">m</span><span class="o">-</span><span class="mi">1</span><span class="p">;</span> <span class="o">++</span><span class="n">j</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">acc</span> <span class="o">=</span> <span class="n">acc</span> <span class="o">+</span> <span class="n">s</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="n">c</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span> <span class="o">=</span> <span class="n">acc</span><span class="p">;</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>We define a template in case we want to test other types like <code class="language-plaintext highlighter-rouge">int</code> or <code class="language-plaintext highlighter-rouge">double</code> instead of <code class="language-plaintext highlighter-rouge">float</code>. Every CUDA kernel has to be declared with <code class="language-plaintext highlighter-rouge">__global__</code>, which tells <code class="language-plaintext highlighter-rouge">nvcc</code> (the compiler) that this code is to be executed on the GPU. Inside the function we first compute the global index of the thread, <code class="language-plaintext highlighter-rouge">idx</code>. We will launch 1D blocks in a 1D grid, so we only work with the <code class="language-plaintext highlighter-rouge">x</code> dimension; in this case <code class="language-plaintext highlighter-rouge">idx</code> is the block dimension times the block index plus the thread index within the block, <code class="language-plaintext highlighter-rouge">blockDim.x * blockIdx.x + threadIdx.x</code>. Then comes the conditional on the global thread index, a check that is not mandatory for the code to compile but is strongly recommended: the index must not exceed the number of elements of the array, otherwise the thread would read and write out of bounds. If the index is too large, no worries, that thread simply leaves the function without doing anything. Finally we have the sum $m$ times.</p>
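
<p>The index arithmetic and the guard can be tried out on the host without a GPU. The sketch below is an emulation under my own naming (<code class="language-plaintext highlighter-rouge">active_threads</code> is hypothetical, not a CUDA API): it walks over every (block, thread) pair of a 1D launch, computes the same <code class="language-plaintext highlighter-rouge">idx</code> as the kernel and counts how many threads pass the <code class="language-plaintext highlighter-rouge">idx &gt;= n</code> check.</p>

```cpp
// Host-only emulation of the kernel's index arithmetic (no GPU needed).
// Counts how many (block, thread) pairs pass the `idx >= n` guard.
constexpr int active_threads(int n, int blocks, int threads_per_block) {
    int active = 0;
    for (int block = 0; block < blocks; ++block) {
        for (int thread = 0; thread < threads_per_block; ++thread) {
            // Same formula as blockDim.x * blockIdx.x + threadIdx.x
            int idx = threads_per_block * block + thread;
            if (idx >= n) continue;  // surplus thread: returns without work
            ++active;
        }
    }
    return active;
}
```

<p>With <code class="language-plaintext highlighter-rouge">n = 10</code>, 3 blocks and 4 threads per block, 12 threads are launched, 10 of them pass the guard and the remaining 2 return immediately.</p>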

<p>To launch this kernel from any C++ file I normally write a C++ wrapper function:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
</pre></td><td class="rouge-code"><pre><span class="k">template</span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="kt">void</span> <span class="nf">vectorAdd_wrapper</span><span class="p">(</span>
    <span class="k">const</span> <span class="n">T</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">d_a</span><span class="p">,</span>
    <span class="k">const</span> <span class="n">T</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">d_b</span><span class="p">,</span>
    <span class="n">T</span><span class="o">*</span> <span class="n">__restrict__</span> <span class="n">d_c</span><span class="p">,</span>
    <span class="kt">int</span> <span class="n">N</span><span class="p">,</span>
    <span class="kt">int</span> <span class="n">m</span><span class="p">,</span>
    <span class="kt">int</span> <span class="n">ThreadsPerBlock</span><span class="p">){</span>
    
    <span class="kt">int</span> <span class="n">blocksPerGrid</span> <span class="o">=</span> <span class="p">(</span><span class="n">N</span> <span class="o">+</span> <span class="n">ThreadsPerBlock</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="n">ThreadsPerBlock</span><span class="p">;</span>
    <span class="n">vectorAdd</span><span class="o">&lt;&lt;&lt;</span><span class="n">blocksPerGrid</span><span class="p">,</span> <span class="n">ThreadsPerBlock</span><span class="o">&gt;&gt;&gt;</span><span class="p">(</span><span class="n">d_a</span><span class="p">,</span> <span class="n">d_b</span><span class="p">,</span> <span class="n">d_c</span><span class="p">,</span> <span class="n">N</span><span class="p">,</span> <span class="n">m</span><span class="p">);</span>
    <span class="n">CHECK_LAST_CUDA_ERROR</span><span class="p">();</span>
    <span class="n">CHECK_CUDA_ERROR</span><span class="p">(</span><span class="n">cudaDeviceSynchronize</span><span class="p">());</span> 
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Here I specify how many <code class="language-plaintext highlighter-rouge">ThreadsPerBlock</code> to use and from that infer the number of blocks needed to cover the <code class="language-plaintext highlighter-rouge">N</code> elements. Then I launch the kernel and check for errors with <code class="language-plaintext highlighter-rouge">CHECK_LAST_CUDA_ERROR</code> (checks the last error raised by the kernel launch) and <code class="language-plaintext highlighter-rouge">CHECK_CUDA_ERROR</code> (checks the value returned by functions that return the type <code class="language-plaintext highlighter-rouge">cudaError_t</code>, for instance <a href="https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DEVICE.html#group__CUDART__DEVICE_1g10e20b05a95f638a4071a655503df25d">cudaDeviceSynchronize</a>). Using these two error-checking helpers is a good pattern to actively catch every error that may happen on your GPU.</p>
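
<p>The expression for <code class="language-plaintext highlighter-rouge">blocksPerGrid</code> is an integer division that rounds up, so that a partially filled last block is still launched. A minimal sketch of just that arithmetic (the function name <code class="language-plaintext highlighter-rouge">blocks_per_grid</code> is mine, for illustration):</p>

```cpp
// Same rounding-up integer division used in the wrapper above:
// the smallest number of blocks such that blocks * threads_per_block >= n.
constexpr int blocks_per_grid(int n, int threads_per_block) {
    return (n + threads_per_block - 1) / threads_per_block;
}
```

<p>For 512 threads per block, 1024 elements need 2 blocks while 1025 elements need 3; the surplus threads of the last block are the ones filtered out by the index check in the kernel.</p>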

<h2 id="cpu-vs-gpu-in-vector-addition">CPU vs GPU in vector addition</h2>

<p>With all this and the code in the <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/cuda-performance">github repository</a> we can start calculating vector additions on the GPU and compare them to the CPU. First let’s measure how long it takes to add two vectors as a function of their size $N$.</p>

<figure style="text-align: center;">
  <img src="/assets/img/posts/2025-09-20-cuda-performance/gpu_cpu_performance_1.png" alt="" width="700" />
  <figcaption><strong>Figure 1.</strong> time for vector addition for two vectors of size $N$. The x axis is in logarithmic scale. Green line corresponds to the GPU calculation (device gpu_1x_a100_sxm4). Black line is the CPU, an AMD EPYC 7J13 64-Core Processor. </figcaption>
</figure>

<p>The first thing we notice is that for $\log_{10}N$ smaller than ~6.5 the CPU is faster. That may be surprising if you are new to the GPU world. How can the CPU be faster in a parallel task? Well, there are two reasons. First, there is the overhead of allocating memory on the GPU, transferring the data, moving the result back to the CPU and freeing the GPU memory, that’s a lot of steps. Second, a CPU core is in general faster than an individual GPU thread. As $N$ grows the GPU calculation time becomes smaller than the CPU time; this is the regime where it pays off (computationally speaking) to use the GPU.</p>

<p>But the above calculation is only for one addition, what if we do $m$ additions once the data is uploaded to the GPU?</p>

<figure style="text-align: center;">
  <img src="/assets/img/posts/2025-09-20-cuda-performance/gpu_cpu_performance.png" alt="" width="700" />
  <figcaption><strong>Figure 2.</strong> time for vector addition for two vectors of size $N$. The x axis is in logarithmic scale. The different curves show the number of additions we perform on the vectors. The gray and black thick lines are the CPU calculations.</figcaption>
</figure>

<p>In this figure you can see that the CPU times for 1 addition and for $2^8$ additions are very different; the latter takes much longer, as expected. If we compare the same numbers of additions on the GPU, the curves are much closer, i.e. parallelizing the calculation makes the two cases take almost the same time on the GPU. For higher numbers of additions the GPU time does increase (see the yellow curve for $2^{14}$), but it is impressive that the difference is not that large.</p>

<p>From the above we can already say something (rather obvious) about GPUs, if your calculation can be parallelized and has a lot of operations it probably pays off to bring it to the GPU.</p>

<h2 id="gpu-times">GPU times</h2>

<p>It’s time to dive deeper into the total time of the GPU. As mentioned before, there are five times that contribute to the GPU calculation time:</p>

<ul>
  <li>Allocation</li>
  <li>Data transfer from host to device</li>
  <li>Calculation</li>
  <li>Data transfer from device to host</li>
  <li>Free memory</li>
</ul>

<p>We can look into this in the following graph</p>

<figure style="text-align: center;">
  <img src="/assets/img/posts/2025-09-20-cuda-performance/gpu_times_stacked_1024.png" alt="" width="700" />
  <figcaption><strong>Figure 3.</strong> time for vector addition for two vectors of size $N$. The x axis is in logarithmic scale. The different curves show the times of each GPU step. The gray and black thick lines are the CPU calculations.</figcaption>
</figure>

<p>Here we show in colors all the times, including the total GPU time, in a cumulative way. The allocation time seems nonexistent, and indeed it is very small compared to the rest. That is only true if we have already done a previous allocation on the GPU, because the initial allocation always takes a significant amount of time. I preallocated memory before allocating the bytes needed at every step in the graph. The copy time increases as $N$ increases. That is logical: we are transferring bigger vectors to the GPU. The compute time also increases, and this could be surprising; after all, each thread in the GPU calculates the same number of additions, 1024. However, if the number of elements in the vector is very large the GPU may be batching our calculations, i.e. first it calculates $K$ vector elements, then the next $K$… in sequence. That may be increasing the total time for the calculation. Here we just fixed the threads per block to 512. After the calculation we have to copy the data back from the GPU to the host. Notice that this takes less time than the copy from the host to the GPU. The reason is that when copying from CPU to GPU we need to copy the $a$ and $b$ vectors, whereas from GPU to CPU we only copy the result, $c$. That’s half of the bytes, therefore approximately half of the time. Finally, freeing the memory is almost unnoticeable when the vector size is large.</p>
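
<p>To make the bookkeeping concrete, here is a minimal Python sketch of the five contributions as a toy model. Everything in it is an assumption for illustration: the names <code class="language-plaintext highlighter-rouge">GpuTimes</code> and <code class="language-plaintext highlighter-rouge">model_vector_add</code> are hypothetical, and the bandwidth, overhead and per-element constants are made-up placeholders, not the measurements behind the figures. The only thing it takes from the discussion above is that we copy two vectors to the device and only one back.</p>

```python
from dataclasses import dataclass


@dataclass
class GpuTimes:
    """Seconds spent in each of the five GPU steps."""
    alloc: float
    h2d: float      # host -> device copy
    compute: float
    d2h: float      # device -> host copy
    free: float

    @property
    def total(self):
        return self.alloc + self.h2d + self.compute + self.d2h + self.free

    def percentages(self):
        # Share of the total time taken by each step, in percent.
        return {name: 100.0 * value / self.total for name, value in vars(self).items()}


def model_vector_add(n, bytes_per_element=4, bandwidth=10e9,
                     fixed_overhead=20e-6, time_per_element=1e-11):
    """Toy model of c = a + b. All constants are placeholders, not measurements."""
    h2d = 2 * n * bytes_per_element / bandwidth   # copy a and b to the device
    d2h = 1 * n * bytes_per_element / bandwidth   # copy only c back
    compute = n * time_per_element
    return GpuTimes(alloc=fixed_overhead, h2d=h2d, compute=compute,
                    d2h=d2h, free=fixed_overhead)


small = model_vector_add(10**2)
large = model_vector_add(10**9)
print(small.percentages())
print(large.percentages())
```

<p>In this toy model the fixed allocation and free costs dominate for small $N$ and become negligible for large $N$, which is the qualitative behaviour described above. The actual numbers depend entirely on the placeholder constants.</p>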

<p>Let’s compare the explicit times for 2048 additions at a small and large vector size:</p>

<div style="display: flex; justify-content: center; gap: 20px;">
  <figure style="text-align: center;">
    <img src="/assets/img/posts/2025-09-20-cuda-performance/percentages_performance_N_512additions_2048.png" width="350" alt="Percentage of GPU time spent in each operation for N=10^2 with 2048 additions" />
    <figcaption><strong>Fig. 4.</strong> Percentage of time spent in each operation for $\log_{10}N$=2 and 2048 additions per vector element. Notice that the total time is very small and that the free and allocation times take a significant share of it.</figcaption>
  </figure>

  <figure style="text-align: center;">
    <img src="/assets/img/posts/2025-09-20-cuda-performance/percentages_performance_N_1073741824additions_2048.png" width="350" alt="Percentage of GPU time spent in each operation for N=10^9 with 2048 additions" />
    <figcaption><strong>Fig. 5.</strong> Percentage of time spent in each operation for $\log_{10}N$=9 and 2048 additions per vector element. The compute time percentage has increased because not all threads are launched at the same time.</figcaption>
  </figure>
</div>

<p>In the limit of small vector size, the time it takes to allocate and deallocate the memory and to transfer the data from host to device is large compared to the calculation. In this limit it is clearly not worth using the GPU. For large vector sizes the percentage of compute time increases, and that is what we want when using a GPU: to maximize the time of actual calculation with respect to allocation and data transfer.</p>

<p>But something interesting happens here. By design of our kernels each thread computes $m$ additions. There is a limit to how many threads can run in parallel, so when we launch more threads than that the GPU internally runs the calculations in batches. To be more specific, the GPU has a finite number of <a href="https://modal.com/gpu-glossary/device-hardware/streaming-multiprocessor">Streaming Multiprocessors</a> (SMs), and a finite number of cores per SM. So the time increase in this particular calculation is actually expected to be linear (as we see in Figure 3, keeping in mind that the x axis is logarithmic).</p>
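
<p>A rough way to picture the batching is to count how many “waves” of thread blocks the GPU needs. The Python sketch below is only a back-of-the-envelope estimate under stated assumptions: one thread per vector element, 512 threads per block as in this post, and an occupancy limited only by the number of resident threads per SM (registers and shared memory are ignored). The 108 SMs and 2048 resident threads per SM are the figures I am assuming for an A100, and <code class="language-plaintext highlighter-rouge">count_waves</code> is a hypothetical helper name.</p>

```python
import math


def count_waves(n_elements, threads_per_block=512, num_sms=108, max_threads_per_sm=2048):
    """Estimate how many batches ("waves") of blocks are needed for n_elements threads."""
    # One thread per element (assumption), grouped in blocks.
    blocks = math.ceil(n_elements / threads_per_block)
    # Blocks that can be resident at once, limited only by threads per SM (assumption).
    blocks_per_sm = max_threads_per_sm // threads_per_block
    resident_blocks = num_sms * blocks_per_sm
    waves = math.ceil(blocks / resident_blocks)
    return blocks, waves


for exponent in (2, 6, 9):
    blocks, waves = count_waves(10**exponent)
    print(f"N = 10^{exponent}: {blocks} blocks, {waves} wave(s)")
```

<p>Once the vector needs more than one wave, the number of waves grows proportionally to $N$, which is consistent with the linear growth of the compute time discussed above.</p>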

<h2 id="takeaways">Takeaways</h2>

<p>GPUs are great but they aren’t a free lunch. In some cases they may make your calculation slower than using a CPU. The vector addition example is very basic, but it’s useful for extracting some general guidelines when coding in CUDA:</p>

<ul>
  <li>Keep data transfer between host and device (GPU) to the minimum possible.</li>
  <li>Use multiple threads per block without exceeding the maximum your device allows (usually 1024 threads per block).</li>
  <li>Use the GPU to do many operations, ideally many operations per thread.</li>
</ul>

<p>GPU optimization is a large topic that we can’t cover entirely in this post, but Nvidia has already compiled the <a href="https://docs.nvidia.com/cuda/cuda-c-programming-guide/">Nvidia CUDA-C programming guide</a>. Some of the highlights are:</p>

<ul>
  <li>Use the memory hierarchy: there are different levels of memory, including global, shared and constant memory and registers.</li>
  <li>Use asynchronous operations: it’s possible to overlap data transfers between host and device with kernel execution on the GPU. We can do that with <a href="https://developer.download.nvidia.com/CUDA/training/StreamsAndConcurrencyWebinar.pdf">streams</a>.</li>
  <li>Avoid excessive branching in threads. Threads that execute together (in groups known as warps) should take more or less the same amount of time, otherwise the calculation time increases to that of the worst performer.</li>
  <li>Keep threads busy at all times: optimize so that threads always have data available to work on.</li>
</ul>

<p>Hope you enjoyed this very simple CUDA demonstration. There’s going to be more on CUDA soon!.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="C++" /><category term="CUDA" /><category term="computer science" /><category term="GPU" /><summary type="html"><![CDATA[GPUs or Graphical Processing Units have become essential in high performance computing these days. They are efficient hardware that can parallelize small calculations. For instance, in Machine Learning (ML) and Artificial Intelligence (AI), GPUs are ubiquitous, as almost all operations are matrix multiplications, convolutions, max pooling… Operations that can be paralellized easily. However, GPUs are not suitable for any kind of calculation, in this post we will understand when it pays off to bring the calculation to GPU for a very simple example.]]></summary></entry><entry><title type="html">Introduction to Secure Communication</title><link href="https://agramunt.me/posts/introduction-to-secure-communication/" rel="alternate" type="text/html" title="Introduction to Secure Communication" /><published>2025-06-27T15:10:00-07:00</published><updated>2025-06-27T15:10:00-07:00</updated><id>https://agramunt.me/posts/introduction-to-secure-communication</id><content type="html" xml:base="https://agramunt.me/posts/introduction-to-secure-communication/"><![CDATA[<p>In this post we will see the most basic example for encrypting your messages, then we will show that this by no means is secure and finally we will introduce what it means to have a perfect secrecy scheme and why is not practical. Then, we’ll get to see what are stream ciphers for symmetric encryption. Python code for this post can be found <a href="https://github.com/SebastiaAgramunt/Cryptography/tree/master/notebooks">here</a> in my <a href="https://github.com/SebastiaAgramunt/Cryptography">cryptography repository</a>.</p>

<p>One of the main objectives of cryptography is to enable secure communication between a sender and a receiver. This means that if someone is intercepting (eavesdropping) the ciphertexts (encrypted messages) sent between parties A and B he is not able to get any information. To introduce nomenclature I will show the most basic cipher, the shift cipher.</p>

<h2 id="shift-cipher">Shift Cipher</h2>

<p>In the shift cipher (a.k.a. Caesar’s cipher), two parties A and B agree on a common key, a number between 0 and 25. This key is secret and they don’t share it with anybody else. The key is the number of positions each letter is shifted along the alphabet. Let’s take a key (shift) of 3 as an example. The substitution of letters <code class="language-plaintext highlighter-rouge">abcdefghijklmnopqrstuvwxyz</code> would be <code class="language-plaintext highlighter-rouge">defghijklmnopqrstuvwxyzabc</code>.</p>

<p>Now imagine that A wants to send B the message “let the people vote” (a.k.a. the message in plaintext). Obviously A cannot send this message as is through an insecure communication channel, so he has to transform the message into the encrypted space as “ohw wkh shrsoh yrwh” (this is known as the ciphertext). That is, he substitutes “l” by “o”, “e” by “h” and so on, shifting every letter three positions forward for key=3. Then he can send the ciphertext through the insecure channel, and people eavesdropping on it are not able to decrypt (reveal) the original message unless they have the secret key.</p>
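
<p>As a quick illustration, here is a minimal Python sketch of the shift cipher for lowercase text. The function names <code class="language-plaintext highlighter-rouge">shift_encrypt</code> and <code class="language-plaintext highlighter-rouge">shift_decrypt</code> are my own choice for this example (they are not taken from the notebooks), and anything that is not a lowercase letter, like spaces, is left untouched.</p>

```python
import string

ALPHABET = string.ascii_lowercase  # the 26 letters a..z


def shift_encrypt(plaintext, key):
    """Replace every lowercase letter by the one `key` positions later in the alphabet."""
    k = key % 26
    shifted = ALPHABET[k:] + ALPHABET[:k]
    return plaintext.translate(str.maketrans(ALPHABET, shifted))


def shift_decrypt(ciphertext, key):
    """Decrypting is just shifting in the opposite direction."""
    return shift_encrypt(ciphertext, -key)


ciphertext = shift_encrypt("let the people vote", 3)
print(ciphertext)                    # ohw wkh shrsoh yrwh
print(shift_decrypt(ciphertext, 3))  # let the people vote
```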

<p>This cipher is very simple. To crack it the eavesdropper just needs to try all the possible secret keys (shifts) from 0 to 25. For instance, if an eavesdropper C tried to decrypt the previous ciphertext using key=1 he would get the message <code class="language-plaintext highlighter-rouge">ngv vjg rgqrng xqvg</code> (you can try it easily on <a href="https://cryptii.com/pipes/caesar-cipher">this webpage</a>). Obviously this message does not make any sense in English, so he’d conclude that this is not the correct key and would try a new one. It is easy to see that it won’t take him very long to find that the original encryption key was 3, and therefore to decrypt all the messages that he was able to intercept between A and B.</p>
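
<p>The brute-force attack is short enough to sketch in Python. This is only an illustration under a strong assumption: the attacker recognizes the right plaintext by counting words from a tiny hypothetical dictionary (<code class="language-plaintext highlighter-rouge">KNOWN_WORDS</code>), whereas a real attacker would use a proper word list or simply read the candidates.</p>

```python
import string

ALPHABET = string.ascii_lowercase
KNOWN_WORDS = {"let", "the", "people", "vote"}  # hypothetical mini-dictionary


def shift_decrypt(ciphertext, key):
    """Shift every lowercase letter `key` positions back in the alphabet."""
    k = -key % 26
    shifted = ALPHABET[k:] + ALPHABET[:k]
    return ciphertext.translate(str.maketrans(ALPHABET, shifted))


def brute_force(ciphertext):
    """Try every shift and keep the candidate with the most dictionary words."""
    candidates = []
    for key in range(26):
        guess = shift_decrypt(ciphertext, key)
        hits = sum(word in KNOWN_WORDS for word in guess.split())
        candidates.append((hits, key, guess))
    return max(candidates)


print(shift_decrypt("ohw wkh shrsoh yrwh", 1))  # ngv vjg rgqrng xqvg
print(brute_force("ohw wkh shrsoh yrwh"))       # (4, 3, 'let the people vote')
```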

<h2 id="better-security-by-increasing-the-key-space-mono-alphabetic-cipher">Better security by increasing the key space (mono-alphabetic cipher)</h2>

<p>We’ve seen that if the key space (number of possible secret keys) is small it is not difficult to find the correct key and then decrypt all the messages. So, we can increase this key size and then the probability for the attacker to find the correct key is reduced (or, he’ll have to invest a lot of time to find it).</p>

<p>In the practical example I will show in this section (have a look at the code in the <a href="https://github.com/SebastiaAgramunt/Cryptography/blob/master/notebooks/06_Classical_cipher_mono_alphabetic.ipynb">notebook</a>) we are going to work with an improved cipher, called the mono-alphabetic substitution cipher. In this cipher we substitute the letters “a, b, c,…, z” with a random permutation of those. For instance a key can be <code class="language-plaintext highlighter-rouge">abcdefghijklmnopqrstuvwxyz</code> -&gt; <code class="language-plaintext highlighter-rouge">avsboircylxmpgkhjwqdtzefun</code>. In this case the letters “a”, “b”, “c”… from the message are mapped to the corresponding values “a”, “v”, “s”… It is a simple substitution. The key space of this cipher is much larger than that of the shift cipher: there are as many keys as permutations of the 26 letters, that is $26! = 26 \times 25 \times \dots \times 2 \times 1 = 403291461126605635584000000 \approx 4 \times 10^{26}$ possible keys. Looks pretty secure, right? Brute-force trial and error would take very long, since the probability of guessing the correct key in one trial is $1/26! \approx 2.5 \times 10^{-27}$.</p>
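
<p>Here is a small Python sketch of the mono-alphabetic cipher using the example key from the text. The helper names <code class="language-plaintext highlighter-rouge">random_key</code> and <code class="language-plaintext highlighter-rouge">substitute</code> are hypothetical, and the size of the key space is computed as the number of permutations of the alphabet in use. Note that the default <code class="language-plaintext highlighter-rouge">random</code> module is fine for a demonstration but is not a cryptographically secure source of keys.</p>

```python
import math
import random
import string

ALPHABET = string.ascii_lowercase


def random_key(rng=random):
    """Return a random permutation of the alphabet (demo only, not a secure RNG)."""
    letters = list(ALPHABET)
    rng.shuffle(letters)
    return "".join(letters)


def substitute(text, source, target):
    """Map each letter of `source` to the letter at the same position in `target`."""
    return text.translate(str.maketrans(source, target))


key = "avsboircylxmpgkhjwqdtzefun"                 # the example key from the text
ciphertext = substitute("the", ALPHABET, key)      # encrypt
plaintext = substitute(ciphertext, key, ALPHABET)  # decrypt with the inverse mapping
key_space = math.factorial(len(ALPHABET))          # number of possible keys

print(ciphertext, plaintext)  # dco the
print(key_space)              # 403291461126605635584000000
```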

<h2 id="leaking-information-from-ciphertexts">Leaking information from ciphertexts</h2>

<p>In the mono-alphabetic cipher it may be difficult for the attacker to get the exact key, but he can still get some valuable information by just looking at the ciphertexts. Imagine that the attacker knows the language of communication between the two parties (Alice and Bob), so that by just observing the ciphertexts he can infer information about the messages. Imagine this language is plain English. The attacker then knows that some words are more frequent than others, for instance “the”, “be”, “to”, “of” or “and” are listed among the <a href="https://en.wikipedia.org/wiki/Most_common_words_in_English">most frequent words in English</a>. This means that the attacker will observe the words “dco”, “vo”, “dk”, “ki” and “agb” many times in the ciphertexts (if we use the substitution cipher introduced in the example). Now he can exploit that to find some letter substitutions in the key.</p>

<p>In the example of the <a href="https://github.com/SebastiaAgramunt/Cryptography/blob/master/notebooks/06_Classical_cipher_mono_alphabetic.ipynb">notebook</a> I used a much simpler but similar attack, based on the text of George Orwell’s famous book 1984. First I calculated the frequencies of letters (not words as before) over all the words in the book and got ‘a’ appearing 36548 times, ‘b’ 7668 times, ‘c’ 11642 times, ‘d’ 19033 times… in alphabetical order. This is my source of truth for the frequency of letters in the English language. Then, in order to simulate a regular message in plain English, I sampled a chunk of 5% of the length of the book. With this chunk we calculate the ciphertext (the simple substitution above) and compute the frequencies of the letters in it. Now, just by comparing the frequencies in the ciphertext with those of the English language we have been able to estimate 8 of the letter substitutions correctly. How? Just by having a closer look at the ciphertexts with the prior knowledge that the language of communication is English. The conclusion: one can infer information about the original message just by looking at the ciphertext. This is an unwanted result.</p>
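
<p>The idea of the attack can be sketched in a few lines of Python. This is not the notebook’s code: it is a simplified version that assumes the attacker pairs letters purely by frequency rank (the most frequent ciphertext letter with the most frequent reference letter, and so on), and the function names and the tiny artificial reference text are made up for illustration.</p>

```python
import string
from collections import Counter


def letter_ranking(text):
    """Letters of `text` sorted from most to least frequent (ties broken alphabetically)."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [letter for letter, _ in ordered]


def guess_substitutions(reference_text, ciphertext):
    """Pair the i-th most frequent ciphertext letter with the i-th most frequent reference letter."""
    return dict(zip(letter_ranking(ciphertext), letter_ranking(reference_text)))


# Artificial reference text with clearly different letter counts: e > t > a > o > n.
reference = "eeeee tttt aaa oo n"
key = "avsboircylxmpgkhjwqdtzefun"
ciphertext = reference.translate(str.maketrans(string.ascii_lowercase, key))

# Maps ciphertext letters back to the guessed plaintext letters.
print(guess_substitutions(reference, ciphertext))
```

<p>With a real text the frequencies of the sample never match the reference exactly, so only some of the guessed pairs are correct. Still, every correct pair is information leaked by the ciphertext.</p>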

<h1 id="perfect-secrecy-and-the-one-time-pad">Perfect secrecy and the one time pad</h1>

<p>Can we find a way such that the ciphertext does not contain any information? First let me write a definition (a more formal definition can be found in the book of Katz and Lindell). An encryption scheme is considered to be perfectly secret if</p>

\[P(m | c) = P(m)\]

<p>for every possible message <code class="language-plaintext highlighter-rouge">m</code> and every possible ciphertext <code class="language-plaintext highlighter-rouge">c</code>. This means that the probability of a specific message does not change after observing any ciphertext, i.e. the ciphertext does not contain any information about the message.</p>

<p>There’s a way to achieve perfect secrecy, and this is through the one time pad. Let me explain it with a simple example. Imagine the scenario where Bob is a submarine captain of a secret army and Alice is his contact on the mainland. In the next mission Bob is told to go to the enemy base and wait for a communication from Alice of “attack” or “retreat” at exactly 4 p.m. They therefore want to communicate with messages of 1 bit (1 means attack and 0 means retreat). The enemy has been informed by one of his spies that Bob is going to his base at 4 p.m. and will be waiting for orders from Alice. He is also aware of the code they use. If the enemy knows that Bob is not attacking he will let him go (let’s assume the enemy is much weaker than Bob), otherwise he will try to attack first with all his force.</p>

<p>The first thing that Alice and Bob do is to meet in person at the base and agree on a common key for communication. This key has to be the same length as the message they want to send (in our case one bit). Then they agree to calculate the ciphertext by <a href="https://en.wikipedia.org/wiki/Exclusive_or">XORing</a> the message with the key, and to XOR again with the key to decrypt the message (recall that XOR is the same as addition modulo 2). The following table shows the encryption of one bit using XOR:</p>

<table>
  <thead>
    <tr>
      <th>secret key</th>
      <th>message</th>
      <th>ciphertext (key xor message)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>0 (retreat)</td>
      <td>0</td>
    </tr>
    <tr>
      <td>0</td>
      <td>1 (attack)</td>
      <td>1</td>
    </tr>
    <tr>
      <td>1</td>
      <td>0 (retreat)</td>
      <td>1</td>
    </tr>
    <tr>
      <td>1</td>
      <td>1 (attack)</td>
      <td>0</td>
    </tr>
  </tbody>
</table>

<p>Once they have agreed on a key (and are 100% sure nobody else knows it) they can communicate once through an insecure channel. Let’s say the key is 1. If Alice wants Bob to attack, she will send the ciphertext 0, otherwise 1. Now the attacker can observe this ciphertext but has no information whatsoever about the key. Let’s say he observes ciphertext=1. From the table above he can say that if the secret key is 0 then the order is attack, but if it is 1 the order is retreat, so P(attack)=P(retreat)=0.5. He gets exactly the same probability for either attack or retreat, which means the ciphertext does not contain any information and he hasn’t learned anything new about the intentions of the secret army. Further explanation and an implementation in Python can be found <a href="https://github.com/SebastiaAgramunt/Cryptography/blob/master/notebooks/10_One_Time_Pad_Encryption.ipynb">here</a>.</p>
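
<p>We can check the perfect secrecy condition $P(m | c) = P(m)$ for this one-bit example by enumerating the table. The Python sketch below is my own illustration (it is not the notebook’s code) and assumes that the key is chosen uniformly at random and independently of the message. The function name <code class="language-plaintext highlighter-rouge">posterior_attack</code> is hypothetical.</p>

```python
from fractions import Fraction


def posterior_attack(ciphertext_bit, prior_attack):
    """P(message = attack | ciphertext) for a one-bit pad with a uniformly random key."""
    prior = {1: prior_attack, 0: 1 - prior_attack}  # 1 = attack, 0 = retreat
    joint = {0: Fraction(0), 1: Fraction(0)}        # P(message, ciphertext)
    for key in (0, 1):
        for message in (0, 1):
            if key ^ message == ciphertext_bit:     # rows of the table matching c
                joint[message] += Fraction(1, 2) * prior[message]
    return joint[1] / (joint[0] + joint[1])


# Whatever the attacker believed before, observing the ciphertext changes nothing.
for prior in (Fraction(1, 2), Fraction(3, 10)):
    for c in (0, 1):
        print(f"prior={prior}, c={c}, posterior={posterior_attack(c, prior)}")
```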

<p>The one time pad can be extended to many bits. For instance, if the message has 256 bits, the binary key has to have 256 bits too, because we are masking bit by bit (this is a requirement from Shannon’s theorem for perfect secrecy, see the book of Katz and Lindell). It is important to notice, however, that the key can only be used to transmit one message. If more messages were transmitted with the same key one could start computing the frequencies of the bits and eventually do statistics similar to what we did in the previous section.</p>
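
<p>A short Python sketch makes the “one time” part tangible. It is a minimal illustration, not the notebook’s implementation: <code class="language-plaintext highlighter-rouge">xor_bytes</code> is a hypothetical helper and the two messages are made up. If the same key encrypts two messages, XORing the two ciphertexts cancels the key and leaves the XOR of the two plaintexts, which already leaks information.</p>

```python
import secrets


def xor_bytes(a, b):
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("the key must be exactly as long as the message")
    return bytes(x ^ y for x, y in zip(a, b))


m1 = b"attack"
m2 = b"defend"
key = secrets.token_bytes(len(m1))  # fresh random key, as long as the message

c1 = xor_bytes(m1, key)
assert xor_bytes(c1, key) == m1     # XORing again with the key decrypts

c2 = xor_bytes(m2, key)             # the mistake: the same key is used twice
leak = xor_bytes(c1, c2)            # the key cancels: c1 XOR c2 == m1 XOR m2
assert leak == xor_bytes(m1, m2)
```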

<p>One time pads are not practical for implementation because of the following reasons:</p>

<ul>
  <li>
    <p>The key has to be at least as long as the message one wants to transmit. This means we have to store a lot of information. For instance if we transmit a text message in ASCII encoding (8 bits per letter) and send a word of 10 letters we would need a key of 80 bits at least.</p>
  </li>
  <li>
    <p>For perfect secrecy one has to use a new key every time. When Alice and Bob meet in person they will have to exchange a lot of keys and use one by one sequentially for their communications. This again is a lot of information and is impractical.</p>
  </li>
  <li>
    <p>Alice and Bob have to make sure that they are the only ones that know the key. In the examples I always stated that they have to meet in person, which is a way to make sure that nobody is spying on them. When using computers one wants to establish secure communication through an insecure channel like the internet between two computers that are physically very far apart. This makes the one time pad totally impractical.</p>
  </li>
</ul>

<h2 id="improving-the-one-time-pad-stream-ciphers">Improving the one time pad: stream ciphers</h2>

<p>The general idea of the one time pad is used in stream ciphers. Here we need the notion of a pseudorandom generator (see <a href="https://www.cs.umd.edu/~jkatz/imc.html">Introduction to modern cryptography from Katz and Lindell</a>). This is an algorithm that takes a number as input (a.k.a. the seed) and outputs what looks like a random string of bits. In essence a good pseudorandom generator (PRG) must output bit strings that are difficult to distinguish from truly random ones.</p>

<p>The PRG algorithm and the seed (the secret key) are shared between Alice and Bob so they can generate the same stream of “close-to-random” bits. This stream is used to pad the message and encrypt/decrypt the same way we did (XORing) with the one time pad in the previous section. Yes, this is very similar to the one time pad! But not exactly the same…</p>

<p>As expected there’s no free lunch. PRGs do not produce pure randomness. Pure randomness would mean that if one observes the output of n bits produced by the PRG (without knowing the seed) the probability of observing either 1 or 0 on the next bit generated by the PRG is exactly 0.5. This can’t be the case because the PRGs are deterministic algorithms. So there’s still information leakage from the ciphertext.</p>

<p>Alice and Bob can generate exactly the same stream of bits if they both know the PRG algorithm and the seed to initiate it. For an adversary (who knows the PRG algorithm but not the seed) it is computationally very difficult to guess the sequence, so we would say this is secure. We can even measure the security of the stream cipher by comparing different PRGs, i.e. one stream cipher is more secure than another if the pseudo-random numbers generated by the first look more random (informally speaking) to an observer looking at the generated bits of both.</p>

<p>And finally, good news: stream ciphers are used in modern-day communications! For instance <a href="https://en.wikipedia.org/wiki/A5/1">A5</a> is used in cell phone communications, but as we said it is not perfectly secure, since the PRG generates deterministic pseudo-random bits that are not truly random.</p>
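
<p>To see the mechanics, here is a toy stream cipher in Python. It is only a sketch under clear assumptions: I use SHAKE-256 from the standard <code class="language-plaintext highlighter-rouge">hashlib</code> module as a stand-in for the PRG, which is my choice for illustration and not how A5 or any real stream cipher is built, and the seed value and function names are made up. Real designs also mix in a nonce so that the same seed never produces the same keystream twice.</p>

```python
import hashlib


def keystream(seed, n_bytes):
    """Expand a short secret seed into n_bytes of pseudo-random bytes (toy PRG)."""
    return hashlib.shake_256(seed).digest(n_bytes)


def stream_xor(data, seed):
    """Encrypt or decrypt by XORing the data with the keystream."""
    pad = keystream(seed, len(data))
    return bytes(d ^ p for d, p in zip(data, pad))


seed = b"shared secret seed"            # agreed in advance by Alice and Bob
message = b"let the people vote"

ciphertext = stream_xor(message, seed)  # Alice encrypts
recovered = stream_xor(ciphertext, seed)  # Bob regenerates the same pad and decrypts
assert recovered == message
```

<p>The seed is short and fixed, yet the pad can be as long as needed. That is exactly what makes this practical compared to the one time pad, and also why it cannot be perfectly secret.</p>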

<p>I won’t extend on stream ciphers here but I hope you got the general idea. For further details have a look at the <a href="https://www.youtube.com/watch?v=AELVJL0axRs&amp;ab_channel=IntroductiontoCryptographybyChristofPaar">lecture of Prof. Christof Paar</a> on the topic. He approaches the topic differently, first explaining stream ciphers and later on perfect secrecy. Another excellent reference is the book of <a href="https://www.cs.umd.edu/~jkatz/imc.html">Katz and Lindell</a>.</p>

<h2 id="improving-the-stream-cipher-using-quantum-physics">Improving the stream cipher using quantum physics</h2>

<p>Now imagine for a moment that Alice and Bob could generate the same purely random stream of bits. For an attacker eavesdropping on all the communications between Alice and Bob the ciphertexts would look totally random. That’s the case of perfect secrecy! Don’t overreact, but we are close to finding the perfect cipher! Now the question: can we generate <strong>pure correlated randomness</strong> between Alice and Bob?</p>

<p>Let’s first think about how we can generate pure randomness. Just use a physical process like the thermal fluctuations of the CPU of your computer. Say for instance that the CPU is normally at temperature X. Then, at the moment of generating one random bit, we measure the temperature: if it is below X we output 0, otherwise 1. A better way to generate random noise is to prepare a quantum state for an electron in which the probability when measuring its spin is exactly 0.5 for up or down.</p>

<p>Ok, we have ways of using physics to generate pure randomness on Alice’s and Bob’s ends. However, you have to remember that Alice and Bob have to generate exactly the same stream of bits, i.e. the same randomness. We cannot achieve this using classical thermal fluctuations, so Alice and Bob need to have some sort of correlation between them. Using the properties of <a href="https://en.wikipedia.org/wiki/Quantum_entanglement">quantum entanglement</a> (see <a href="https://en.wikipedia.org/wiki/Quantum_key_distribution">quantum key distribution</a>) Alice and Bob can prepare two physical quantum states on their ends that are “correlated”, so they can both generate the same pure randomness (again this comes from the random nature of quantum physics).</p>

<p>Yes… this is far beyond what I wanted to explain in this post, but I come from a physics background and it was very tempting to at least mention.</p>

<h2 id="takeaways-from-this-post">Takeaways from this post</h2>

<p>We presented the shift cipher and the substitution cipher as simple examples to illustrate the problem of the ciphertext carrying information about the message. We’ve seen with a simple attack on those ciphers how one can get information about the messages by just observing the ciphertexts. Then we stated the problem of getting ciphertexts that do not contain any information about the message, i.e. perfect secrecy, and we’ve seen that even though we can achieve it with the one time pad this is not a good practical solution for several reasons. A more practical approach is to use stream ciphers, where we take the same philosophy of padding the message with a random stream of bits. Here however we use pseudo-random generators to generate the noise, and the problem is that this noise is not purely random, so stream ciphers are not perfectly secure but are practical in many applications such as mobile communications. One way to make stream ciphers perfectly secure is to generate the randomness using quantum physics, an active field of research nowadays.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Cryptography" /><category term="cryptography" /><category term="mathematics" /><summary type="html"><![CDATA[In this post we will see the most basic example for encrypting your messages, then we will show that this by no means is secure and finally we will introduce what it means to have a perfect secrecy scheme and why is not practical. Then, we’ll get to see what are stream ciphers for symmetric encryption. 
Python code for this post can be found here in my cryptography repository.]]></summary></entry><entry><title type="html">Basic C++ python extension</title><link href="https://agramunt.me/posts/cpp-python-extension/" rel="alternate" type="text/html" title="Basic C++ python extension" /><published>2025-03-08T06:28:00-08:00</published><updated>2025-03-08T06:28:00-08:00</updated><id>https://agramunt.me/posts/cpp-python-extension</id><content type="html" xml:base="https://agramunt.me/posts/cpp-python-extension/"><![CDATA[<p>C++ and Python are very different programming languages, the first one is compiled and low level whereas the second one is interpreted. C++ is a lot faster than Python but, can we leverage the performance of C++ and the versatility in Python?. Yes, we can do such thing writing C++ extensions and create bindings for Python. In this post we will create a python package with compiled code using <a href="https://github.com/pybind/pybind11">pybind11</a> library to create the python bindings. As usual you have the <a href="https://github.com/SebastiaAgramunt/blogging-code">blog</a> with the <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/cpp-basic-cpp-python-extension">code</a>.</p>

<h2 id="the-c-project">The C++ project</h2>

<p>In a repository we will need coexisting python code and C++ code. In this example we will code a C++ matrix multiplication that we want to expose to Python. We define the following file structure</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="nb">.</span>
├── README.md
├── include
│   └── matmul.h
├── scripts
│   └── compile.sh
├── src
│   ├── bindings.cpp
│   ├── main.cpp
│   └── matmul.cpp
└── tests
    └── test_matmul.py
</pre></td></tr></tbody></table></code></pre></div></div>

<p>With the following contents for <code class="language-plaintext highlighter-rouge">matmul.h</code>:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="cp">#ifndef MATMUL_H
#define MATMUL_H
</span>
<span class="kt">void</span> <span class="nf">matmul</span><span class="p">(</span><span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">A</span><span class="p">,</span> <span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">B</span><span class="p">,</span> <span class="kt">float</span><span class="o">*</span> <span class="n">C</span><span class="p">,</span> <span class="kt">int</span> <span class="n">M</span><span class="p">,</span> <span class="kt">int</span> <span class="n">N</span><span class="p">,</span> <span class="kt">int</span> <span class="n">K</span><span class="p">);</span>
<span class="kt">void</span> <span class="nf">printmatrix</span><span class="p">(</span><span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">A</span><span class="p">,</span> <span class="kt">int</span> <span class="n">M</span><span class="p">,</span> <span class="kt">int</span> <span class="n">N</span><span class="p">);</span>

<span class="cp">#endif
</span></pre></td></tr></tbody></table></code></pre></div></div>

<p>and <code class="language-plaintext highlighter-rouge">matmul.cpp</code></p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
#include</span> <span class="cpf">"matmul.h"</span><span class="cp">
</span>
<span class="c1">// Matrices are indexed row-major in this example. E.g. if A is [M x N]</span>
<span class="c1">// If i,j are the row and column indices, the element A[i, j] is</span>
<span class="c1">// A[i, j] = A[i * N + j] // if row-major</span>
<span class="c1">// A[i, j] = A[j * M + i] // if column-major</span>

<span class="kt">void</span> <span class="nf">matmul</span><span class="p">(</span><span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">A</span><span class="p">,</span> <span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">B</span><span class="p">,</span> <span class="kt">float</span><span class="o">*</span> <span class="n">C</span><span class="p">,</span> <span class="kt">int</span> <span class="n">M</span><span class="p">,</span> <span class="kt">int</span> <span class="n">N</span><span class="p">,</span> <span class="kt">int</span> <span class="n">K</span><span class="p">){</span>
<span class="c1">// Matrix multiplication, C[M x K] = A[M x N] * B[N x K]</span>
<span class="c1">// Multiplication is $\sum_n A[m, n] * B[n, k]$</span>
    <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">m</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">m</span><span class="o">&lt;</span><span class="n">M</span><span class="p">;</span> <span class="n">m</span><span class="o">++</span><span class="p">){</span>
        <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">k</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">k</span><span class="o">&lt;</span><span class="n">K</span><span class="p">;</span> <span class="n">k</span><span class="o">++</span><span class="p">){</span>
            <span class="n">C</span><span class="p">[</span><span class="n">m</span> <span class="o">*</span> <span class="n">K</span> <span class="o">+</span> <span class="n">k</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
            <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">n</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">n</span><span class="o">&lt;</span><span class="n">N</span><span class="p">;</span> <span class="n">n</span><span class="o">++</span><span class="p">){</span>
                <span class="n">C</span><span class="p">[</span><span class="n">m</span> <span class="o">*</span> <span class="n">K</span> <span class="o">+</span> <span class="n">k</span><span class="p">]</span> <span class="o">+=</span> <span class="n">A</span><span class="p">[</span><span class="n">m</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">n</span><span class="p">]</span> <span class="o">*</span> <span class="n">B</span><span class="p">[</span><span class="n">n</span> <span class="o">*</span> <span class="n">K</span> <span class="o">+</span> <span class="n">k</span><span class="p">];</span>
            <span class="p">}</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">printmatrix</span><span class="p">(</span><span class="k">const</span> <span class="kt">float</span><span class="o">*</span> <span class="n">A</span><span class="p">,</span> <span class="kt">int</span> <span class="n">M</span><span class="p">,</span> <span class="kt">int</span> <span class="n">N</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">M</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">j</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">j</span> <span class="o">&lt;</span> <span class="n">N</span><span class="p">;</span> <span class="o">++</span><span class="n">j</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">A</span><span class="p">[</span><span class="n">i</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">j</span><span class="p">]</span> <span class="o">&lt;&lt;</span> <span class="s">" "</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>This file contains just two functions. <code class="language-plaintext highlighter-rouge">matmul</code> takes three matrices <code class="language-plaintext highlighter-rouge">A[M x N]</code>, <code class="language-plaintext highlighter-rouge">B[N x K]</code> and <code class="language-plaintext highlighter-rouge">C[M x K]</code> and writes the product $C = A \times B$ into <code class="language-plaintext highlighter-rouge">C</code>, while <code class="language-plaintext highlighter-rouge">printmatrix</code> prints a matrix row by row. The matrices are single-precision float arrays stored in <a href="https://en.wikipedia.org/wiki/Row-_and_column-major_order">row-major order</a>.</p>
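
<p>To make the row-major indexing concrete, here is a minimal pure-Python sketch of the same triple loop. It is only an illustration: the name <code class="language-plaintext highlighter-rouge">matmul_row_major</code> and the small example matrices are made up for this snippet and are not part of the project.</p>

```python
def matmul_row_major(a, b, m_rows, n_inner, k_cols):
    """Multiply A[M x N] by B[N x K], both stored as flat row-major lists."""
    c = [0.0] * (m_rows * k_cols)
    for m in range(m_rows):
        for k in range(k_cols):
            total = 0.0
            for n in range(n_inner):
                # Element (m, n) of A sits at m * N + n, element (n, k) of B at n * K + k.
                total += a[m * n_inner + n] * b[n * k_cols + k]
            c[m * k_cols + k] = total
    return c


# A is 2 x 3 and B is 3 x 2, so C is 2 x 2.
A = [1.0, 2.0, 3.0,
     4.0, 5.0, 6.0]
B = [7.0, 8.0,
     9.0, 10.0,
     11.0, 12.0]
C = matmul_row_major(A, B, 2, 3, 2)
print(C)  # [58.0, 64.0, 139.0, 154.0]
```

<p>The index arithmetic is the same as in the C++ version: the row index is multiplied by the number of columns of the matrix being addressed.</p>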

<p>We can use the <code class="language-plaintext highlighter-rouge">matmul</code> library in a main function to build a binary and check that our function is correct. For that we define a <code class="language-plaintext highlighter-rouge">main.cpp</code>:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
#include</span> <span class="cpf">"matmul.h"</span><span class="cp">
</span>
<span class="cp">#define M 32
#define N 64
#define K 32
</span>
<span class="kt">void</span> <span class="nf">initializeMatrices</span><span class="p">(</span><span class="kt">float</span><span class="o">*</span> <span class="n">A</span><span class="p">,</span> <span class="kt">float</span><span class="o">*</span> <span class="n">B</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">srand</span><span class="p">(</span><span class="mi">7</span><span class="p">);</span>

    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">M</span> <span class="o">*</span> <span class="n">N</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span>
        <span class="n">A</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">rand</span><span class="p">()</span> <span class="o">%</span> <span class="mi">100</span><span class="p">)</span> <span class="o">/</span> <span class="mf">10.0f</span><span class="p">;</span>  <span class="c1">// Random float in range [0, 9.9], in steps of 0.1</span>

    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">N</span> <span class="o">*</span> <span class="n">K</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span>
        <span class="n">B</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">rand</span><span class="p">()</span> <span class="o">%</span> <span class="mi">100</span><span class="p">)</span> <span class="o">/</span> <span class="mf">10.0f</span><span class="p">;</span>  <span class="c1">// Random float in range [0, 9.9], in steps of 0.1</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>

    <span class="kt">float</span><span class="o">*</span> <span class="n">A</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">float</span><span class="p">[</span><span class="n">M</span> <span class="o">*</span> <span class="n">N</span><span class="p">];</span>
    <span class="kt">float</span><span class="o">*</span> <span class="n">B</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">float</span><span class="p">[</span><span class="n">N</span> <span class="o">*</span> <span class="n">K</span><span class="p">];</span>
    <span class="kt">float</span><span class="o">*</span> <span class="n">C</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">float</span><span class="p">[</span><span class="n">M</span> <span class="o">*</span> <span class="n">K</span><span class="p">];</span>

    <span class="n">initializeMatrices</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">);</span>
    <span class="n">matmul</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">,</span> <span class="n">C</span><span class="p">,</span> <span class="n">M</span><span class="p">,</span> <span class="n">N</span><span class="p">,</span> <span class="n">K</span><span class="p">);</span>

    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"C = A x B:"</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
    <span class="n">printmatrix</span><span class="p">(</span><span class="n">C</span><span class="p">,</span> <span class="n">M</span><span class="p">,</span> <span class="n">K</span><span class="p">);</span>

    <span class="k">delete</span><span class="p">[]</span> <span class="n">A</span><span class="p">;</span>
    <span class="k">delete</span><span class="p">[]</span> <span class="n">B</span><span class="p">;</span>
    <span class="k">delete</span><span class="p">[]</span> <span class="n">C</span><span class="p">;</span>

    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>
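<p>This is pretty simple code: we allocate the matrices, randomly initialize <code class="language-plaintext highlighter-rouge">A</code> and <code class="language-plaintext highlighter-rouge">B</code> (<code class="language-plaintext highlighter-rouge">C</code> needs no initialization because <code class="language-plaintext highlighter-rouge">matmul</code> overwrites every element) and perform the multiplication. Then we print the result on screen. Let’s compile this <code class="language-plaintext highlighter-rouge">main.cpp</code> entrypoint. First we manually create our usual build directory, then we compile the objects, and lastly we link the objects into the final executable.</p>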

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
</pre></td><td class="rouge-code"><pre><span class="c"># create build directories</span>
<span class="nb">rm</span> <span class="nt">-rf</span> build
<span class="nb">mkdir</span> <span class="nt">-p</span> build/obj
<span class="nb">mkdir </span>build/bin
<span class="nb">mkdir </span>build/lib

<span class="c"># compile to objects</span>
g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/matmul.cpp <span class="nt">-o</span> build/obj/matmul.o
g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/main.cpp <span class="nt">-o</span> build/obj/main.o

<span class="c"># link all the objects</span>
g++ build/obj/matmul.o <span class="se">\</span>
    build/obj/main.o <span class="se">\</span>
    <span class="nt">-o</span> build/bin/main
</pre></td></tr></tbody></table></code></pre></div></div>

<p>With this we can execute <code class="language-plaintext highlighter-rouge">main</code> and see the result of multiplying the two matrices:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>./build/bin/main
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="understanding-what-are-python-bindings">Understanding what Python bindings are</h2>

<p>Now the question is: how do we make this code callable from Python? I would like to use the <code class="language-plaintext highlighter-rouge">matmul</code> function from Python. We first need to understand that the <code class="language-plaintext highlighter-rouge">python</code> executable itself relies on shared libraries that are loaded dynamically. Just create a new environment and let’s inspect it:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>python <span class="nt">-m</span> venv .venv
<span class="nb">source</span> .venv/bin/activate
</pre></td></tr></tbody></table></code></pre></div></div>

<p>First, with the tool <code class="language-plaintext highlighter-rouge">otool</code> on macOS (<code class="language-plaintext highlighter-rouge">ldd</code> on Linux), let’s see which libraries the <code class="language-plaintext highlighter-rouge">python</code> executable depends on. Type</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>otool <span class="nt">-L</span> .venv/bin/python
</pre></td></tr></tbody></table></code></pre></div></div>

<p>to see that</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre>.venv/bin/python:
	/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 2420.0.0)
	/Users/sebas/.pyenv/versions/3.12.4/lib/libpython3.12.dylib (compatibility version 3.12.0, current version 3.12.0)
	/usr/local/opt/gettext/lib/libintl.8.dylib (compatibility version 13.0.0, current version 13.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.100.2)
</pre></td></tr></tbody></table></code></pre></div></div>

<p>These are the libraries that the <code class="language-plaintext highlighter-rouge">python</code> binary expects to load at runtime. The most important is <code class="language-plaintext highlighter-rouge">libpython3.12.dylib</code> (that’s for macOS; you would see a <code class="language-plaintext highlighter-rouge">.so</code> file on Linux or a <code class="language-plaintext highlighter-rouge">.dll</code> file on Windows). It contains the compiled core of the Python interpreter, including the bytecode evaluator, built-in types, and other core runtime components. This library is also what you use to embed Python into other applications or to link C/C++ extensions dynamically. Let’s continue inspecting the paths that Python uses. At runtime the interpreter searches a list of directories for modules to import. Find them by running</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>python <span class="nt">-c</span> <span class="s2">"import sys; print(sys.path)"</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>The output is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre>['',
'/Users/sebas/.pyenv/versions/3.12.4/lib/python3.12',
'/Users/sebas/.pyenv/versions/3.12.4/lib/python3.12/lib-dynload',
'/Users/sebas/tmp/blogging-code/cpp-compile-link-external-lib/.venv/lib/python3.12/site-packages']
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The library <code class="language-plaintext highlighter-rouge">libpython3.12.dylib</code> lives in <code class="language-plaintext highlighter-rouge">/Users/sebas/.pyenv/versions/3.12.4/lib</code>. One level deeper, in <code class="language-plaintext highlighter-rouge">/Users/sebas/.pyenv/versions/3.12.4/lib/python3.12</code>, you will find the Python files of well-known modules (the Python standard library), such as <code class="language-plaintext highlighter-rouge">hashlib.py</code>, <code class="language-plaintext highlighter-rouge">datetime.py</code>, <code class="language-plaintext highlighter-rouge">dataclasses.py</code>, <code class="language-plaintext highlighter-rouge">abc.py</code>… those are “packages” that come by default with the Python installation.</p>

<p>Let’s take a look at the <code class="language-plaintext highlighter-rouge">hashlib.py</code> file. Some of its imports are <code class="language-plaintext highlighter-rouge">import _sha1</code> and <code class="language-plaintext highlighter-rouge">import _md5</code>, which are cryptographic hash algorithms. Where do those imports come from? If you check the next path, <code class="language-plaintext highlighter-rouge">/Users/sebas/.pyenv/versions/3.12.4/lib/python3.12/lib-dynload</code>, there are files like <code class="language-plaintext highlighter-rouge">_sha1.cpython-312-darwin.so</code> and <code class="language-plaintext highlighter-rouge">_md5.cpython-312-darwin.so</code>. These are extension modules: compiled shared libraries that the Python interpreter can load and import like any other module. That’s what we want to do: compile the C++ <code class="language-plaintext highlighter-rouge">matmul</code> function into a shared library of this kind so that we can import it in our Python script.</p>
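
<p>If you prefer to explore this from Python rather than from the file browser, the following sketch lists the compiled extension modules found in each <code class="language-plaintext highlighter-rouge">sys.path</code> entry. The helper name <code class="language-plaintext highlighter-rouge">list_extension_modules</code> is made up for this example, and the counts will depend on your installation.</p>

```python
import importlib.machinery
import os
import sys


def list_extension_modules(directory):
    """Return the file names in `directory` that look like compiled extension modules."""
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    if not os.path.isdir(directory):
        return []
    return sorted(name for name in os.listdir(directory) if name.endswith(suffixes))


# Suffixes this interpreter accepts for extension modules,
# for example ".cpython-312-darwin.so" on the setup used in this post.
print(importlib.machinery.EXTENSION_SUFFIXES)

# Scan every sys.path entry and report how many extension modules each one holds.
for entry in sys.path:
    found = list_extension_modules(entry or os.getcwd())
    print(entry or "(current directory)", len(found))
```

<p>On a setup like mine, the <code class="language-plaintext highlighter-rouge">lib-dynload</code> entry should be the one reporting most of the files.</p>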

<h2 id="compiling-a-shared-library-for-python">Compiling a shared library for Python</h2>

<p>In a previous C++ post I have shown how to compile a shared object, so this should be easy. However, we cannot expect to compile C++ code directly and get a Python extension module: we need to define how the C++ types and functions translate into Python objects. For this we need the Python headers, which expose the Python C API. You can see their path by executing</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>python <span class="nt">-c</span> <span class="s2">"import sysconfig; print(sysconfig.get_path('include'))"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>which in my case is <code class="language-plaintext highlighter-rouge">/Users/sebas/.pyenv/versions/3.12.4/include/python3.12</code>. There you can find many headers, but the most important is <code class="language-plaintext highlighter-rouge">Python.h</code> (which pulls in most of the other headers).</p>
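
<p>As a quick sanity check, this small sketch asks <code class="language-plaintext highlighter-rouge">sysconfig</code> for the include directory and reports whether <code class="language-plaintext highlighter-rouge">Python.h</code> is actually there. On some Linux distributions the header only appears after installing a separate development package, so a <code class="language-plaintext highlighter-rouge">False</code> here is an installation issue rather than a bug.</p>

```python
import os
import sysconfig

# Directory that holds the Python C API headers for this interpreter.
include_dir = sysconfig.get_path("include")
header = os.path.join(include_dir, "Python.h")

print("include directory:", include_dir)
# Python.h is only present when the development headers are installed.
print("Python.h found:", os.path.isfile(header))
```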

<p>Apart from the headers you need the library <code class="language-plaintext highlighter-rouge">python3.12</code>. You can locate it by asking your Python installation for its linker flags:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>python3-config <span class="nt">--ldflags</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>which returns <code class="language-plaintext highlighter-rouge">-lintl -ldl -L/Users/sebas/.pyenv/versions/3.12.4/lib -Wl,-rpath,/Users/sebas/.pyenv/versions/3.12.4/lib -framework CoreFoundation</code>. These flags tell the linker to link against the libraries <code class="language-plaintext highlighter-rouge">intl</code> and <code class="language-plaintext highlighter-rouge">dl</code> and to look for libraries in the directory given by <code class="language-plaintext highlighter-rouge">-L</code>. They also add a runtime path, <code class="language-plaintext highlighter-rouge">-rpath</code>, where the executable will look for the libraries at runtime. The last flag, <code class="language-plaintext highlighter-rouge">-framework CoreFoundation</code>, is specific to macOS and provides utilities for the operating system. This output would be different on a Windows or Linux machine.</p>

<p>At this point we have the includes and libraries that we need to compile the C++ code into a Python module. We could write our bindings using the definitions in <code class="language-plaintext highlighter-rouge">Python.h</code>. This header is the official Python C API. It gives you full control of the program, but it is generally more difficult to write code with it compared to other options (see next section).</p>

<h2 id="compiling-a-shared-library-for-python-using-pybind11">Compiling a shared library for Python using Pybind11</h2>

<p>A more convenient library to compile your shared python packages is <a href="https://github.com/pybind/pybind11">pybind11</a>, which is a header only library that exposes C++ types in Python and vice versa. For this you will need the python headers and libraries (shown in previous section) and <code class="language-plaintext highlighter-rouge">pybind11</code> that can be installed with <code class="language-plaintext highlighter-rouge">pip install pybind11</code>.</p>

<p>For now we will write a <code class="language-plaintext highlighter-rouge">bindings.cpp</code> file with all the “translated code”:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;pybind11/pybind11.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;pybind11/stl.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;pybind11/numpy.h&gt;</span><span class="cp">
#include</span> <span class="cpf">"matmul.h"</span><span class="cp">
</span>
<span class="k">namespace</span> <span class="n">py</span> <span class="o">=</span> <span class="n">pybind11</span><span class="p">;</span>

<span class="kt">void</span> <span class="nf">matmul_py</span><span class="p">(</span><span class="n">py</span><span class="o">::</span><span class="n">array_t</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;</span> <span class="n">A</span><span class="p">,</span> <span class="n">py</span><span class="o">::</span><span class="n">array_t</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;</span> <span class="n">B</span><span class="p">,</span> <span class="n">py</span><span class="o">::</span><span class="n">array_t</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">&gt;</span> <span class="n">C</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">auto</span> <span class="n">bufA</span> <span class="o">=</span> <span class="n">A</span><span class="p">.</span><span class="n">request</span><span class="p">(),</span> <span class="n">bufB</span> <span class="o">=</span> <span class="n">B</span><span class="p">.</span><span class="n">request</span><span class="p">(),</span> <span class="n">bufC</span> <span class="o">=</span> <span class="n">C</span><span class="p">.</span><span class="n">request</span><span class="p">();</span>

    <span class="k">if</span> <span class="p">(</span><span class="n">bufA</span><span class="p">.</span><span class="n">ndim</span> <span class="o">!=</span> <span class="mi">2</span> <span class="o">||</span> <span class="n">bufB</span><span class="p">.</span><span class="n">ndim</span> <span class="o">!=</span> <span class="mi">2</span> <span class="o">||</span> <span class="n">bufC</span><span class="p">.</span><span class="n">ndim</span> <span class="o">!=</span> <span class="mi">2</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">throw</span> <span class="n">std</span><span class="o">::</span><span class="n">runtime_error</span><span class="p">(</span><span class="s">"All matrices must be 2D"</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="kt">size_t</span> <span class="n">M</span> <span class="o">=</span> <span class="n">bufA</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
    <span class="kt">size_t</span> <span class="n">N</span> <span class="o">=</span> <span class="n">bufA</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">];</span>
    <span class="kt">size_t</span> <span class="n">K</span> <span class="o">=</span> <span class="n">bufB</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">];</span>

    <span class="k">if</span> <span class="p">(</span><span class="n">bufB</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">!=</span> <span class="n">N</span> <span class="o">||</span> <span class="n">bufC</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">!=</span> <span class="n">M</span> <span class="o">||</span> <span class="n">bufC</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">!=</span> <span class="n">K</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">throw</span> <span class="n">std</span><span class="o">::</span><span class="n">runtime_error</span><span class="p">(</span><span class="s">"Matrix dimensions do not match for multiplication"</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="kt">float</span><span class="o">*</span> <span class="n">ptrA</span> <span class="o">=</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">*&gt;</span><span class="p">(</span><span class="n">bufA</span><span class="p">.</span><span class="n">ptr</span><span class="p">);</span>
    <span class="kt">float</span><span class="o">*</span> <span class="n">ptrB</span> <span class="o">=</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">*&gt;</span><span class="p">(</span><span class="n">bufB</span><span class="p">.</span><span class="n">ptr</span><span class="p">);</span>
    <span class="kt">float</span><span class="o">*</span> <span class="n">ptrC</span> <span class="o">=</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="kt">float</span><span class="o">*&gt;</span><span class="p">(</span><span class="n">bufC</span><span class="p">.</span><span class="n">ptr</span><span class="p">);</span>

    <span class="n">matmul</span><span class="p">(</span><span class="n">ptrA</span><span class="p">,</span> <span class="n">ptrB</span><span class="p">,</span> <span class="n">ptrC</span><span class="p">,</span> <span class="n">M</span><span class="p">,</span> <span class="n">N</span><span class="p">,</span> <span class="n">K</span><span class="p">);</span>  <span class="c1">// same call as before</span>
<span class="p">}</span>


<span class="n">PYBIND11_MODULE</span><span class="p">(</span><span class="n">matrix_mul</span><span class="p">,</span> <span class="n">m</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">m</span><span class="p">.</span><span class="n">def</span><span class="p">(</span><span class="s">"matmul"</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">matmul_py</span><span class="p">,</span> <span class="s">"Matrix multiplication function"</span><span class="p">);</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>The function <code class="language-plaintext highlighter-rouge">matmul_py</code> takes three NumPy arrays, A, B and C, and first checks that they are two-dimensional. After that, we read the shapes of the arrays, check that they are compatible, and get the pointers to the memory of each array. Finally we call the C++ function <code class="language-plaintext highlighter-rouge">matmul</code>. Lastly, in <code class="language-plaintext highlighter-rouge">PYBIND11_MODULE</code>, we expose the function <code class="language-plaintext highlighter-rouge">matmul_py</code> so that it can be called as <code class="language-plaintext highlighter-rouge">matmul</code> in Python. And that should be it, now it is time to compile. Bear with me, I’m going to throw a bunch of bash commands while explaining them in inline comments. In the root directory of the project run:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre><span class="c"># create a building python environment</span>
<span class="nb">rm</span> <span class="nt">-rf</span> .venv_build
python <span class="nt">-m</span> venv .venv_build
.venv_build/bin/pip <span class="nb">install</span> <span class="nt">--upgrade</span> pip

<span class="c"># install pybind11 using pip</span>
.venv_build/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>pybind11

</pre></td></tr></tbody></table></code></pre></div></div>
<p>Create the directories to hold the objects, binaries and the library:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-rf</span> build

<span class="c"># creating directories for the build</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> build/obj
<span class="nb">mkdir </span>build/bin
<span class="nb">mkdir </span>build/lib
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now we can compile the objects including the python and pybind11 headers (the output of <code class="language-plaintext highlighter-rouge">python -m pybind11 --includes</code> is <code class="language-plaintext highlighter-rouge">-I/Users/sebas/.pyenv/versions/3.12.4/include/python3.12 -I/Users/sebas/tmp/blogging-code/cpp-basic-cpp-python-extension/.venv_build/lib/python3.12/site-packages/pybind11/include</code> in my setup).</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre>g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/matmul.cpp <span class="nt">-o</span> build/obj/matmul.o
g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="se">\</span>
                <span class="si">$(</span>.venv_build/bin/python <span class="nt">-m</span> pybind11 <span class="nt">--includes</span><span class="si">)</span> <span class="se">\</span>
                <span class="nt">-c</span> src/bindings.cpp <span class="se">\</span>
                <span class="nt">-o</span> build/obj/bindings.o
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And finally we create the shared object</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
</pre></td><td class="rouge-code"><pre><span class="c"># grep the name of the major and minor versions of python, i.e. if we use 3.12.8 this will return python3.12</span>
<span class="c"># this is the name of the python library</span>
<span class="nv">python_library</span><span class="o">=</span>python<span class="si">$(</span>.venv_build/bin/python <span class="nt">--version</span> | <span class="nb">awk</span> <span class="s1">'{print $2}'</span> | <span class="nb">awk</span> <span class="nt">-F</span><span class="nb">.</span> <span class="s1">'{print $1"."$2}'</span><span class="si">)</span>

g++ <span class="nt">-O3</span> <span class="nt">-Wall</span> <span class="nt">-shared</span> <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-fPIC</span> <span class="se">\</span>
    <span class="si">$(</span>python3-config <span class="nt">--ldflags</span><span class="si">)</span> <span class="se">\</span>
    <span class="nt">-l</span><span class="k">${</span><span class="nv">python_library</span><span class="k">}</span> <span class="se">\</span>
    build/obj/matmul.o <span class="se">\</span>
    build/obj/bindings.o <span class="se">\</span>
    <span class="nt">-o</span> build/lib/matrix_mul<span class="si">$(</span>python3-config <span class="nt">--extension-suffix</span><span class="si">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>You will see a file <code class="language-plaintext highlighter-rouge">matrix_mul.cpython-312-darwin.so</code> in your <code class="language-plaintext highlighter-rouge">build/lib</code> directory (the exact suffix depends on your Python version and platform). This is your compiled library! Let me explain some key concepts. The command <code class="language-plaintext highlighter-rouge">python3-config --ldflags</code> gives you the linker flags needed to link against Python (explained before), the <code class="language-plaintext highlighter-rouge">-l</code> flag links a specific library, in my case <code class="language-plaintext highlighter-rouge">python3.12</code>, and <code class="language-plaintext highlighter-rouge">python3-config --extension-suffix</code> returns the file suffix for extension modules, which encodes the Python implementation, its version and the target platform. It is the suffix commonly used to name the extension.</p>
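<p>As a quick cross-check, the same two pieces of information can be read from inside the interpreter. This is a minimal sketch using only the standard library; it assumes that the <code class="language-plaintext highlighter-rouge">python3-config</code> on your path belongs to the same interpreter that runs the snippet, otherwise the values may differ.</p>

```python
import sys
import sysconfig

# Name of the Python library to link against, e.g. "python3.12".
python_library = f"python{sys.version_info.major}.{sys.version_info.minor}"

# Suffix the interpreter expects for compiled extension modules,
# e.g. ".cpython-312-darwin.so" on macOS with CPython 3.12.
extension_suffix = sysconfig.get_config_var("EXT_SUFFIX")

print("library name:    ", python_library)
print("extension suffix:", extension_suffix)
print("output file:     ", "matrix_mul" + extension_suffix)
```

<p>If the suffix printed here does not match the name of the file in <code class="language-plaintext highlighter-rouge">build/lib</code>, the interpreter will most likely not pick the library up on import.</p>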

<p>How can you import it? Change directory to where the shared library is and try to import it from there:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">cd </span>build/lib
python <span class="nt">-c</span> <span class="s2">"import matrix_mul"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

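<p>This works because, when you run <code class="language-plaintext highlighter-rouge">python -c</code>, the current directory is placed at the front of <code class="language-plaintext highlighter-rouge">sys.path</code>. To import the library from anywhere, we should place it in a directory that is always on <code class="language-plaintext highlighter-rouge">sys.path</code>, such as the <code class="language-plaintext highlighter-rouge">site-packages</code> directory of our environment; then it can be imported every time we open a python prompt.</p>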
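<p>The role of <code class="language-plaintext highlighter-rouge">sys.path</code> can be illustrated without any compiled code. The sketch below uses a throwaway pure Python module with a made-up name (<code class="language-plaintext highlighter-rouge">toy_module_for_sys_path_demo</code>, not part of the project); it is only importable once its directory has been added to <code class="language-plaintext highlighter-rouge">sys.path</code>, and the same lookup applies to a shared object.</p>

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "toy_module_for_sys_path_demo"  # hypothetical module name

# Create a temporary directory holding a tiny module.
module_dir = Path(tempfile.mkdtemp())
(module_dir / f"{MODULE_NAME}.py").write_text("ANSWER = 42\n")

# The directory is not on sys.path yet, so the module cannot be found.
found_before = importlib.util.find_spec(MODULE_NAME) is not None

# Add the directory to the search path and import the module.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
toy = importlib.import_module(MODULE_NAME)

print("found before:", found_before)
print("value after import:", toy.ANSWER)
```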

<h2 id="testing">Testing</h2>

<p>Let’s write a python script to test our library, this file will be called <code class="language-plaintext highlighter-rouge">test_matmul.py</code> and will be placed under <code class="language-plaintext highlighter-rouge">tests</code> directory.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
</pre></td><td class="rouge-code"><pre><span class="kn">import</span> <span class="n">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="n">sys</span>
<span class="kn">from</span> <span class="n">pathlib</span> <span class="kn">import</span> <span class="n">Path</span>

<span class="c1"># we can't really import the library (shared object) from a script
# unless it's in the sys.path
</span><span class="n">SHARED_LIBRARY_DIR</span> <span class="o">=</span> <span class="nc">Path</span><span class="p">(</span><span class="n">__file__</span><span class="p">).</span><span class="n">parents</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">/</span> <span class="sh">"</span><span class="s">build</span><span class="sh">"</span> <span class="o">/</span> <span class="sh">"</span><span class="s">lib</span><span class="sh">"</span>
<span class="n">sys</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="nf">insert</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nf">str</span><span class="p">(</span><span class="n">SHARED_LIBRARY_DIR</span><span class="p">))</span>

<span class="c1"># now we can import our compiled library
</span><span class="kn">import</span> <span class="n">matrix_mul</span>

<span class="c1"># Define matrix dimensions
</span><span class="n">M</span><span class="p">,</span> <span class="n">N</span><span class="p">,</span> <span class="n">K</span> <span class="o">=</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">64</span><span class="p">,</span> <span class="mi">32</span>

<span class="c1"># Create random matrices A[MxN] and B[NxK]
</span><span class="n">A</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">rand</span><span class="p">(</span><span class="n">M</span><span class="p">,</span> <span class="n">N</span><span class="p">).</span><span class="nf">astype</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">float32</span><span class="p">)</span>
<span class="n">B</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="nf">rand</span><span class="p">(</span><span class="n">N</span><span class="p">,</span> <span class="n">K</span><span class="p">).</span><span class="nf">astype</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">float32</span><span class="p">)</span>
<span class="n">C</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">zeros</span><span class="p">((</span><span class="n">M</span><span class="p">,</span> <span class="n">K</span><span class="p">),</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">float32</span><span class="p">)</span>  <span class="c1"># Initialize C with zeros
</span>
<span class="c1"># Call the compiled function
</span><span class="n">matrix_mul</span><span class="p">.</span><span class="nf">matmul</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">,</span> <span class="n">C</span><span class="p">)</span>

<span class="c1"># Verify with NumPy
</span><span class="n">C_np</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">dot</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">)</span>

<span class="c1"># Check if the results match
</span><span class="k">assert</span> <span class="n">np</span><span class="p">.</span><span class="nf">allclose</span><span class="p">(</span><span class="n">C</span><span class="p">,</span> <span class="n">C_np</span><span class="p">),</span> <span class="sa">f</span><span class="sh">"</span><span class="s">something went wrong, C and C_np are not equal</span><span class="sh">"</span>

<span class="nf">print</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="s">Tests passed!</span><span class="sh">"</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The first part adds the path of the library we just compiled to <code class="language-plaintext highlighter-rouge">sys.path</code> so that the python interpreter can find it. The rest of the script is self-explanatory: we use <code class="language-plaintext highlighter-rouge">numpy</code> to check our matrix multiplication against NumPy’s result. To start “fresh” we create a new python environment and call the script:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-rf</span> .venv_test
python <span class="nt">-m</span> venv .venv_test
.venv_test/bin/pip <span class="nb">install</span> <span class="nt">--upgrade</span> pip

<span class="c"># install numpy, required for the script tests/test_matmul.py</span>
.venv_test/bin/pip <span class="nb">install </span>numpy

<span class="c"># run the test</span>
.venv_test/bin/python tests/test_matmul.py
</pre></td></tr></tbody></table></code></pre></div></div>

<p>When executing this you should see <code class="language-plaintext highlighter-rouge">Tests passed!</code> as the last output. Congrats! You have learned the basics of Python bindings for C++ extensions.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="C++" /><category term="Python" /><category term="computer science" /><summary type="html"><![CDATA[C++ and Python are very different programming languages, the first one is compiled and low level whereas the second one is interpreted. C++ is a lot faster than Python, but can we leverage the performance of C++ and the versatility of Python? Yes, we can, by writing C++ extensions and creating bindings for Python. In this post we will create a python package with compiled code using the pybind11 library to create the python bindings. As usual you have the blog with the code.]]></summary></entry><entry><title type="html">Make and makefiles</title><link href="https://agramunt.me/posts/cpp-make/" rel="alternate" type="text/html" title="Make and makefiles" /><published>2025-01-28T15:05:00-08:00</published><updated>2025-01-28T15:05:00-08:00</updated><id>https://agramunt.me/posts/cpp-make</id><content type="html" xml:base="https://agramunt.me/posts/cpp-make/"><![CDATA[<p>In previous posts we have been compiling and linking projects using bash commands, writing every command explicitly to build the projects. This is ok for small projects with few source files, but large projects often need to compile hundreds of source files and link them to different libraries, which makes the build complex. This is what <code class="language-plaintext highlighter-rouge">make</code> was invented for.</p>

<p>The original <code class="language-plaintext highlighter-rouge">make</code> utility was written in 1976 by <a href="https://en.wikipedia.org/wiki/Stuart_Feldman">Stuart Feldman</a> at Bell Labs to automate the build process, replacing manual shell scripts for compiling and linking large projects; <a href="https://www.gnu.org/software/make/manual/make.html">GNU make</a> is one of its most widely used implementations. Other (more modern) build systems for C++ projects are <a href="https://scons.org/">SCons</a>, <a href="https://cmake.org/">CMake</a>, <a href="https://bazel.build/">Bazel</a>, and <a href="https://ninja-build.org/">Ninja</a>. Even though <code class="language-plaintext highlighter-rouge">make</code> is old, it is still widely used in the industry, especially for small projects.</p>

<p>In this post we will learn how make works, illustrating it with an example as always. The entire example can be found in my github repository <a href="https://github.com/SebastiaAgramunt/blogging-code">blogging-code</a>, in the subdirectory <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/cpp-makefile">cpp-makefile</a>.</p>

<h1 id="tldr">TLDR</h1>

<p>With the file structure defined in the following section, the makefile can be</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
</pre></td><td class="rouge-code"><pre><span class="c"># Compiler and flags</span>
CXX <span class="o">=</span> g++
CXXFLAGS <span class="o">=</span> <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span>

<span class="c"># Directories</span>
SRC_DIR <span class="o">=</span> src
INCLUDE_DIR <span class="o">=</span> include
BUILD_DIR <span class="o">=</span> build
OBJ_DIR <span class="o">=</span> <span class="si">$(</span>BUILD_DIR<span class="si">)</span>/obj
BIN_DIR <span class="o">=</span> <span class="si">$(</span>BUILD_DIR<span class="si">)</span>/bin

<span class="c"># Target executable name</span>
TARGET <span class="o">=</span> <span class="si">$(</span>BIN_DIR<span class="si">)</span>/main

<span class="c"># Find all source files and corresponding object files</span>
SRCS <span class="o">=</span> <span class="si">$(</span>wildcard <span class="si">$(</span>SRC_DIR<span class="si">)</span>/<span class="k">*</span>.cpp<span class="si">)</span>
OBJS <span class="o">=</span> <span class="si">$(</span>patsubst <span class="si">$(</span>SRC_DIR<span class="si">)</span>/%.cpp, <span class="si">$(</span>OBJ_DIR<span class="si">)</span>/%.o, <span class="si">$(</span>SRCS<span class="si">))</span>

<span class="c"># Default target</span>
all:
	@echo <span class="s2">"Available options:"</span>
	@echo <span class="s2">"  build  - Build the project"</span>
	@echo <span class="s2">"  clean  - Remove all build files"</span>
	@echo <span class="s2">"  help   - Show this message"</span>

<span class="c"># Build target</span>
build: <span class="si">$(</span>TARGET<span class="si">)</span>

<span class="c"># Rule to build the executable</span>
<span class="si">$(</span>TARGET<span class="si">)</span>: <span class="si">$(</span>OBJS<span class="si">)</span>
	@mkdir <span class="nt">-p</span> <span class="si">$(</span>BIN_DIR<span class="si">)</span>
	<span class="si">$(</span>CXX<span class="si">)</span> <span class="si">$(</span>CXXFLAGS<span class="si">)</span> <span class="nt">-o</span> <span class="nv">$@</span> <span class="nv">$^</span>

<span class="c"># Rule to build object files</span>
<span class="si">$(</span>OBJ_DIR<span class="si">)</span>/%.o: <span class="si">$(</span>SRC_DIR<span class="si">)</span>/%.cpp
	@mkdir <span class="nt">-p</span> <span class="si">$(</span>OBJ_DIR<span class="si">)</span>
	<span class="si">$(</span>CXX<span class="si">)</span> <span class="si">$(</span>CXXFLAGS<span class="si">)</span> <span class="nt">-c</span> <span class="nv">$&lt;</span> <span class="nt">-o</span> <span class="nv">$@</span>

<span class="c"># Clean up build files</span>
clean:
	<span class="nb">rm</span> <span class="nt">-rf</span> <span class="si">$(</span>BUILD_DIR<span class="si">)</span>

<span class="c"># Help target</span>
<span class="nb">help</span>: all

<span class="c"># Phony targets</span>
.PHONY: all clean build <span class="nb">help</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>That can be executed with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre>make
make clean
make build
./build/bin/main
</pre></td></tr></tbody></table></code></pre></div></div>

<p>These commands display the help message, remove any previous build, build the program and finally run the executable.</p>
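<p>The two lines of the makefile that use <code class="language-plaintext highlighter-rouge">wildcard</code> and <code class="language-plaintext highlighter-rouge">patsubst</code> map every source file to the object file that must be built from it. As a rough illustration (a Python sketch of the same mapping, not how make implements it, and with the list of sources written by hand), the substitution works like this:</p>

```python
from pathlib import PurePosixPath

SRC_DIR = "src"
OBJ_DIR = "build/obj"


def object_for(source: str) -> str:
    """Map SRC_DIR/NAME.cpp to OBJ_DIR/NAME.o, like the patsubst call."""
    name = PurePosixPath(source).stem
    return f"{OBJ_DIR}/{name}.o"


# What $(wildcard src/*.cpp) would find for the file structure of this post.
srcs = [f"{SRC_DIR}/main.cpp", f"{SRC_DIR}/matmul.cpp"]
objs = [object_for(s) for s in srcs]

print(objs)  # ['build/obj/main.o', 'build/obj/matmul.o']
```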
<h2 id="file-structure">File structure</h2>

<p>As always for our example we define here the file structure and contents so that you can copy-paste the example and run it yourself. The file structure is</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre><span class="nb">.</span>
├── Makefile
├── include
│   └── matmul.h
└── src
    ├── main.cpp
    └── matmul.cpp
</pre></td></tr></tbody></table></code></pre></div></div>

<p>with <code class="language-plaintext highlighter-rouge">matmul.h</code>:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre><span class="cp">#ifndef MATMUL_H
#define MATMUL_H
</span>
<span class="kt">void</span> <span class="nf">matmul</span><span class="p">(</span><span class="k">const</span> <span class="kt">int</span><span class="o">*</span> <span class="n">A</span><span class="p">,</span> <span class="k">const</span> <span class="kt">int</span><span class="o">*</span> <span class="n">B</span><span class="p">,</span> <span class="kt">int</span><span class="o">*</span> <span class="n">C</span><span class="p">,</span> <span class="kt">int</span> <span class="n">M</span><span class="p">,</span> <span class="kt">int</span> <span class="n">N</span><span class="p">,</span> <span class="kt">int</span> <span class="n">K</span><span class="p">);</span>
<span class="kt">void</span> <span class="nf">printmatrix</span><span class="p">(</span><span class="k">const</span> <span class="kt">int</span><span class="o">*</span> <span class="n">A</span><span class="p">,</span> <span class="kt">int</span> <span class="n">M</span><span class="p">,</span> <span class="kt">int</span> <span class="n">N</span><span class="p">);</span>

<span class="cp">#endif
</span></pre></td></tr></tbody></table></code></pre></div></div>

<p>and <code class="language-plaintext highlighter-rouge">matmul.cpp</code>:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
#include</span> <span class="cpf">"matmul.h"</span><span class="cp">
</span>
<span class="c1">// Matrices are indexed row-major in this example. E.g. if A is [M x N]</span>
<span class="c1">// If i,j are the row and column indices, the element A[i, j] is</span>
<span class="c1">// A[i, j] = A[i * N + j] // if row-major</span>
<span class="c1">// A[i, j] = A[j * M + i] // if column-major</span>

<span class="kt">void</span> <span class="nf">matmul</span><span class="p">(</span><span class="k">const</span> <span class="kt">int</span><span class="o">*</span> <span class="n">A</span><span class="p">,</span> <span class="k">const</span> <span class="kt">int</span><span class="o">*</span> <span class="n">B</span><span class="p">,</span> <span class="kt">int</span><span class="o">*</span> <span class="n">C</span><span class="p">,</span> <span class="kt">int</span> <span class="n">M</span><span class="p">,</span> <span class="kt">int</span> <span class="n">N</span><span class="p">,</span> <span class="kt">int</span> <span class="n">K</span><span class="p">){</span>
<span class="c1">// Matrix multiplication, C[M x K] = A[M x N] * B[N x K]</span>
<span class="c1">// Multiplication is $\sum_n A[m, n] * B[n, k]$</span>
    <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">m</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">m</span><span class="o">&lt;</span><span class="n">M</span><span class="p">;</span> <span class="n">m</span><span class="o">++</span><span class="p">){</span>
        <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">k</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">k</span><span class="o">&lt;</span><span class="n">K</span><span class="p">;</span> <span class="n">k</span><span class="o">++</span><span class="p">){</span>
            <span class="n">C</span><span class="p">[</span><span class="n">m</span> <span class="o">*</span> <span class="n">K</span> <span class="o">+</span> <span class="n">k</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
            <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">n</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">n</span><span class="o">&lt;</span><span class="n">N</span><span class="p">;</span> <span class="n">n</span><span class="o">++</span><span class="p">){</span>
                <span class="n">C</span><span class="p">[</span><span class="n">m</span> <span class="o">*</span> <span class="n">K</span> <span class="o">+</span> <span class="n">k</span><span class="p">]</span> <span class="o">+=</span> <span class="n">A</span><span class="p">[</span><span class="n">m</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">n</span><span class="p">]</span> <span class="o">*</span> <span class="n">B</span><span class="p">[</span><span class="n">n</span> <span class="o">*</span> <span class="n">K</span> <span class="o">+</span> <span class="n">k</span><span class="p">];</span>
            <span class="p">}</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">printmatrix</span><span class="p">(</span><span class="k">const</span> <span class="kt">int</span><span class="o">*</span> <span class="n">A</span><span class="p">,</span> <span class="kt">int</span> <span class="n">M</span><span class="p">,</span> <span class="kt">int</span> <span class="n">N</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">M</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">j</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">j</span> <span class="o">&lt;</span> <span class="n">N</span><span class="p">;</span> <span class="o">++</span><span class="n">j</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">A</span><span class="p">[</span><span class="n">i</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">j</span><span class="p">]</span> <span class="o">&lt;&lt;</span> <span class="s">" "</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and finally <code class="language-plaintext highlighter-rouge">main.cpp</code></p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
#include</span> <span class="cpf">"matmul.h"</span><span class="cp">
</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">(</span><span class="kt">void</span><span class="p">){</span>

    <span class="c1">// A[M x N]</span>
    <span class="kt">int</span> <span class="n">M</span> <span class="o">=</span> <span class="mi">2</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">N</span> <span class="o">=</span> <span class="mi">3</span><span class="p">;</span> 
    <span class="kt">int</span><span class="o">*</span> <span class="n">A</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="n">M</span> <span class="o">*</span> <span class="n">N</span><span class="p">];</span>

    <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">M</span> <span class="o">*</span> <span class="n">N</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">){</span>
        <span class="n">A</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">i</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span> <span class="o">&lt;&lt;</span> <span class="s">"A:"</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
    <span class="n">printmatrix</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">M</span><span class="p">,</span> <span class="n">N</span><span class="p">);</span>

    <span class="c1">// B[N x K]</span>
    <span class="kt">int</span> <span class="n">K</span> <span class="o">=</span> <span class="mi">4</span><span class="p">;</span>
    <span class="kt">int</span><span class="o">*</span> <span class="n">B</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="n">N</span> <span class="o">*</span> <span class="n">K</span><span class="p">];</span>

    <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">N</span> <span class="o">*</span> <span class="n">K</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">){</span>
        <span class="n">B</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">i</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span> <span class="o">&lt;&lt;</span> <span class="s">"B:"</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
    <span class="n">printmatrix</span><span class="p">(</span><span class="n">B</span><span class="p">,</span> <span class="n">N</span><span class="p">,</span> <span class="n">K</span><span class="p">);</span>

    <span class="c1">// C[M x K]</span>
    <span class="kt">int</span><span class="o">*</span> <span class="n">C</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="n">M</span> <span class="o">*</span> <span class="n">K</span><span class="p">];</span>
    <span class="n">matmul</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">,</span> <span class="n">C</span><span class="p">,</span> <span class="n">M</span><span class="p">,</span> <span class="n">N</span><span class="p">,</span> <span class="n">K</span><span class="p">);</span>

    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span> <span class="o">&lt;&lt;</span> <span class="s">"C = A x B: "</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
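    <span class="n">printmatrix</span><span class="p">(</span><span class="n">C</span><span class="p">,</span> <span class="n">M</span><span class="p">,</span> <span class="n">K</span><span class="p">);</span>
    <span class="k">delete</span><span class="p">[]</span> <span class="n">A</span><span class="p">;</span> <span class="k">delete</span><span class="p">[]</span> <span class="n">B</span><span class="p">;</span> <span class="k">delete</span><span class="p">[]</span> <span class="n">C</span><span class="p">;</span> <span class="c1">// free the heap-allocated matrices</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>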
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>We have used this matrix multiplication example in previous posts. The idea here is to compile <code class="language-plaintext highlighter-rouge">main.cpp</code> into an executable using <code class="language-plaintext highlighter-rouge">make</code>.</p>
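<p>If you want to know what output to expect from the executable, the same row-major triple loop can be sketched in a few lines of Python. This is only a reference to compare against, written for this post with flat lists standing in for the C++ arrays, and it uses the same inputs as <code class="language-plaintext highlighter-rouge">main.cpp</code>.</p>

```python
def matmul(A, B, M, N, K):
    """C[M x K] = A[M x N] * B[N x K], all stored as flat row-major lists."""
    C = [0] * (M * K)
    for m in range(M):
        for k in range(K):
            for n in range(N):
                # Element (m, n) of A lives at A[m * N + n] in row-major order.
                C[m * K + k] += A[m * N + n] * B[n * K + k]
    return C


M, N, K = 2, 3, 4
A = list(range(M * N))  # same fill as main.cpp: A[i] = i
B = list(range(N * K))  # same fill as main.cpp: B[i] = i
C = matmul(A, B, M, N, K)

# Print C row by row.
for m in range(M):
    print(C[m * K:(m + 1) * K])
# [20, 23, 26, 29]
# [56, 68, 80, 92]
```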
<h2 id="make-basics">Make basics</h2>

<p>The process of compilation can be thought of as a graph: to generate an executable or library we first need to compile all the source files to objects and then link all the objects into the executable (or library). In more complex projects there could be even more compilation steps. Make also has the advantage of recompiling only the files that have changed; this is crucial for large projects, where compiling everything from scratch can take several minutes even though most source files haven’t changed. Check the compilation time of OpenCV in a previous post.</p>

<p>To run <code class="language-plaintext highlighter-rouge">make</code> we create a file named <code class="language-plaintext highlighter-rouge">Makefile</code> with the compilation instructions and then run <code class="language-plaintext highlighter-rouge">make</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">touch </span>Makefile
make
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Since the <code class="language-plaintext highlighter-rouge">Makefile</code> is empty (there are no rules) <code class="language-plaintext highlighter-rouge">make</code> will complain with <code class="language-plaintext highlighter-rouge">make: *** No targets.  Stop.</code>. Let’s add a rule in the file</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>hello:
	<span class="nb">echo</span> <span class="s2">"Hi there"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now run <code class="language-plaintext highlighter-rouge">make hello</code> and it will print “Hi there” on your screen (make also echoes the command itself before running it). In essence a makefile contains rules, and a rule has the following syntax</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre>targets: prerequisites
	<span class="nb">command
	command
	command</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The targets are normally files to be compiled (object files) and the prerequisites the source files, but before jumping to that, let’s understand the essence of the graph calculation. Modify the makefile to contain</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
</pre></td><td class="rouge-code"><pre>calculate_1:
	<span class="nb">echo</span> <span class="s2">"calculate_1"</span>

calculate_2:
	<span class="nb">echo</span> <span class="s2">"calculate_2"</span>

calculate_3: calculate_1
	<span class="nb">echo</span> <span class="s2">"calculate_3"</span>

calculate_4: calculate_1 calculate_2
	<span class="nb">echo</span> <span class="s2">"calculate_4"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Running <code class="language-plaintext highlighter-rouge">make calculate_1</code> will just print “calculate_1”, and similarly for <code class="language-plaintext highlighter-rouge">make calculate_2</code>. These two rules don’t depend on any other rule. However, if we run <code class="language-plaintext highlighter-rouge">make calculate_3</code> it will first print “calculate_1” and then “calculate_3”, as the third calculation depends on the first (through the prerequisites in the rule). Something similar happens with <code class="language-plaintext highlighter-rouge">make calculate_4</code>: this time it will first print “calculate_1” and then “calculate_2” before printing “calculate_4”. This describes the nature of a makefile: you can nest rules as much as you want to generate a directed acyclic graph of your bash commands.</p>
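<p>The order in which the rules run can be reproduced with a small depth-first traversal. The following is a toy model written for this post, not make’s actual algorithm: the <code class="language-plaintext highlighter-rouge">rules</code> dictionary mirrors the makefile above and the <code class="language-plaintext highlighter-rouge">run</code> function is a hypothetical helper that visits the prerequisites of a target before the target itself.</p>

```python
# Each target maps to its list of prerequisites, as in the makefile above.
rules = {
    "calculate_1": [],
    "calculate_2": [],
    "calculate_3": ["calculate_1"],
    "calculate_4": ["calculate_1", "calculate_2"],
}


def run(target, done=None, log=None):
    """Return the order in which rules run when asking for `target`."""
    done = set() if done is None else done
    log = [] if log is None else log
    for prerequisite in rules[target]:
        if prerequisite not in done:  # run every prerequisite only once
            run(prerequisite, done, log)
    log.append(target)
    done.add(target)
    return log


print(run("calculate_3"))  # ['calculate_1', 'calculate_3']
print(run("calculate_4"))  # ['calculate_1', 'calculate_2', 'calculate_4']
```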

<p>But make is more than just instructions, it is intrinsically linked to files. Let me explain this with an example. Create a file “calculate_1” and try to run the calculate_1 rule from the previous makefile.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">touch </span>calculate_1
make calculate_1
</pre></td></tr></tbody></table></code></pre></div></div>

<p>You will see the message <code class="language-plaintext highlighter-rouge">make: 'calculate_1' is up to date.</code> Indeed, since there is a file named <code class="language-plaintext highlighter-rouge">calculate_1</code> in this directory and the rule has no prerequisites, make considers the target already “compiled” and finds nothing to do for this rule. This is very useful when some files fail to compile: the files that compiled successfully won’t be compiled again when you re-run make.</p>
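
<p>The decision make takes here can be approximated with a few lines of shell. This is only a sketch of the idea, assuming plain timestamp comparison: the helper name <code class="language-plaintext highlighter-rouge">needs_rebuild</code> and the file names are hypothetical, and real make applies more rules than this.</p>

```shell
#!/bin/sh
# Sketch of the "is the target up to date?" check (illustration only).

# Return 0 (true) when the target must be rebuilt, 1 when it is up to date.
needs_rebuild() {
  target=$1
  shift
  # A missing target always has to be built.
  [ -e "$target" ] || return 0
  # Any prerequisite newer than the target makes the target stale.
  for prereq in "$@"; do
    if [ "$prereq" -nt "$target" ]; then
      return 0
    fi
  done
  # Target exists and nothing is newer: nothing to do.
  return 1
}

# Usage with throwaway files in a temporary directory.
workdir=$(mktemp -d)
touch -t 202401010000 "$workdir/main.o"    # older object file
touch -t 202401020000 "$workdir/main.cpp"  # source edited a day later
if needs_rebuild "$workdir/main.o" "$workdir/main.cpp"; then
  echo "main.o is stale"
else
  echo "main.o is up to date"
fi
rm -rf "$workdir"
```

<p>Calling the helper with an existing target and no prerequisites returns “up to date”, which is the situation of the <code class="language-plaintext highlighter-rouge">calculate_1</code> file above.</p>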

<h2 id="simple-make-build-and-run-the-example">Simple make: build and run the example</h2>

<p>Let’s write a simple makefile to compile and link the program:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre>matmul.o: src/matmul.cpp
	g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/matmul.cpp <span class="nt">-o</span> matmul.o

main.o: src/main.cpp
	g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/main.cpp <span class="nt">-o</span> main.o

compile: matmul.o main.o
	g++ matmul.o main.o <span class="nt">-o</span> main
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>make compile
./main
</pre></td></tr></tbody></table></code></pre></div></div>

<p>to compile and run the program. Notice that we have specified a graph here: to run <code class="language-plaintext highlighter-rouge">compile</code> we need the files <code class="language-plaintext highlighter-rouge">matmul.o</code> and <code class="language-plaintext highlighter-rouge">main.o</code>, so make builds them first. Notice also that the object files are generated in the current directory; we can change that by writing a path in the rule, <code class="language-plaintext highlighter-rouge">build/obj/matmul.o</code> for instance.</p>

<h2 id="automatic-variables">Automatic variables</h2>

<p>There are some special variables defined in the <a href="https://www.gnu.org/software/make/manual/html_node/Automatic-Variables.html">make documentation</a> called automatic variables. These are very useful but rarely explained through examples. Before jumping to a complete makefile we’ll explain some of them:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre>output.txt: input1.txt input2.txt input1.txt
	<span class="nb">echo</span> <span class="s2">"Target: </span><span class="nv">$@</span><span class="s2">"</span> <span class="o">&gt;</span> <span class="nv">$@</span>
	<span class="nb">echo</span> <span class="s2">"First prerequisite: </span><span class="nv">$&lt;</span><span class="s2">"</span> <span class="o">&gt;&gt;</span> <span class="nv">$@</span>
	<span class="nb">echo</span> <span class="s2">"Updated prerequisites: </span><span class="nv">$?</span><span class="s2">"</span> <span class="o">&gt;&gt;</span> <span class="nv">$@</span>
	<span class="nb">echo</span> <span class="s2">"All prerequisites (unique): </span><span class="nv">$^</span><span class="s2">"</span> <span class="o">&gt;&gt;</span> <span class="nv">$@</span>
	<span class="nb">echo</span> <span class="s2">"All prerequisites (with duplicates): </span><span class="nv">$+</span><span class="s2">"</span> <span class="o">&gt;&gt;</span> <span class="nv">$@</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">input1.txt</code>, <code class="language-plaintext highlighter-rouge">input2.txt</code>, and <code class="language-plaintext highlighter-rouge">input1.txt</code> are listed as prerequisites. Notice that <code class="language-plaintext highlighter-rouge">input1.txt</code> is repeated to demonstrate <code class="language-plaintext highlighter-rouge">$^</code> (unique) vs. <code class="language-plaintext highlighter-rouge">$+</code> (duplicates included).</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">$@</code>: Refers to the target, <code class="language-plaintext highlighter-rouge">output.txt</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">$&lt;</code>: Refers to the first prerequisite, <code class="language-plaintext highlighter-rouge">input1.txt</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">$?</code>: Lists all prerequisites that are newer than the target. This is dynamic and depends on file timestamps.</li>
  <li><code class="language-plaintext highlighter-rouge">$^</code>: Lists all unique prerequisites (<code class="language-plaintext highlighter-rouge">input1.txt input2.txt</code>).</li>
  <li><code class="language-plaintext highlighter-rouge">$+</code>: Lists all prerequisites, including duplicates (<code class="language-plaintext highlighter-rouge">input1.txt input2.txt input1.txt</code>).</li>
</ul>

<p>As an example create the prerequisites with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">echo </span>Hello from input1! <span class="o">&gt;</span> input1.txt
<span class="nb">echo </span>Hello from input2! <span class="o">&gt;</span> input2.txt
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and run <code class="language-plaintext highlighter-rouge">make output.txt</code>. Since the target does not exist yet, every prerequisite counts as updated, and the content of <code class="language-plaintext highlighter-rouge">output.txt</code> will be:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre>Target: output.txt
First prerequisite: input1.txt
Updated prerequisites: input1.txt input2.txt
All prerequisites <span class="o">(</span>unique<span class="o">)</span>: input1.txt input2.txt
All prerequisites <span class="o">(</span>with duplicates<span class="o">)</span>: input1.txt input2.txt input1.txt
</pre></td></tr></tbody></table></code></pre></div></div>

<p>We will use some of these automatic variables to write a more robust makefile.</p>
<h2 id="a-more-complete-makefile">A more complete makefile</h2>

<p>Let’s write a proper makefile this time, now that we understand the basics.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
</pre></td><td class="rouge-code"><pre><span class="c"># Compiler and flags</span>
CXX <span class="o">=</span> g++
CXXFLAGS <span class="o">=</span> <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span>

<span class="c"># Directories</span>
SRC_DIR <span class="o">=</span> src
INCLUDE_DIR <span class="o">=</span> include
BUILD_DIR <span class="o">=</span> build
OBJ_DIR <span class="o">=</span> <span class="si">$(</span>BUILD_DIR<span class="si">)</span>/obj
BIN_DIR <span class="o">=</span> <span class="si">$(</span>BUILD_DIR<span class="si">)</span>/bin

<span class="c"># Target executable name</span>
TARGET <span class="o">=</span> <span class="si">$(</span>BIN_DIR<span class="si">)</span>/main

<span class="c"># # Find all source files and corresponding object files</span>
SRCS <span class="o">=</span> <span class="si">$(</span>wildcard <span class="si">$(</span>SRC_DIR<span class="si">)</span>/<span class="k">*</span>.cpp<span class="si">)</span>
OBJS <span class="o">=</span> <span class="si">$(</span>patsubst <span class="si">$(</span>SRC_DIR<span class="si">)</span>/%.cpp, <span class="si">$(</span>OBJ_DIR<span class="si">)</span>/%.o, <span class="si">$(</span>SRCS<span class="si">))</span>

<span class="c"># Default target</span>
all: <span class="si">$(</span>TARGET<span class="si">)</span>

<span class="c"># Rule to build the executable</span>
<span class="si">$(</span>TARGET<span class="si">)</span>: <span class="si">$(</span>OBJS<span class="si">)</span>
	@mkdir <span class="nt">-p</span> <span class="si">$(</span>BIN_DIR<span class="si">)</span>
	<span class="si">$(</span>CXX<span class="si">)</span> <span class="si">$(</span>CXXFLAGS<span class="si">)</span> <span class="nt">-o</span> <span class="nv">$@</span> <span class="nv">$^</span>

<span class="c"># Rule to build object files</span>
<span class="si">$(</span>OBJ_DIR<span class="si">)</span>/%.o: <span class="si">$(</span>SRC_DIR<span class="si">)</span>/%.cpp
	@mkdir <span class="nt">-p</span> <span class="si">$(</span>OBJ_DIR<span class="si">)</span>
	<span class="si">$(</span>CXX<span class="si">)</span> <span class="si">$(</span>CXXFLAGS<span class="si">)</span> <span class="nt">-c</span> <span class="nv">$&lt;</span> <span class="nt">-o</span> <span class="nv">$@</span>

<span class="c"># Clean up build files</span>
clean:
	<span class="nb">rm</span> <span class="nt">-rf</span> <span class="si">$(</span>BUILD_DIR<span class="si">)</span>

<span class="c"># Phony targets</span>
.PHONY: all clean

</pre></td></tr></tbody></table></code></pre></div></div>

<p>I know, this is a lot… let’s explain line by line.</p>

<p>In the first lines we define the compiler <code class="language-plaintext highlighter-rouge">g++</code> and the compilation flags <code class="language-plaintext highlighter-rouge">-std=c++17 -Iinclude</code>, which select the C++17 standard and add the <code class="language-plaintext highlighter-rouge">include</code> directory to the header search path.</p>

<p>Next we define the paths: <code class="language-plaintext highlighter-rouge">src</code>, <code class="language-plaintext highlighter-rouge">include</code> and, as usual, the build directories <code class="language-plaintext highlighter-rouge">build/obj</code> and <code class="language-plaintext highlighter-rouge">build/bin</code> (this time we won’t compile any library).</p>

<p>The <code class="language-plaintext highlighter-rouge">TARGET</code> variable holds the path of the executable, <code class="language-plaintext highlighter-rouge">build/bin/main</code>. This is the file that the main rule is going to produce.</p>

<p>The sources and object files could be specified explicitly with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>SRCS <span class="o">=</span> src/matmul.cpp src/main.cpp 
OBJS <span class="o">=</span> build/obj/matmul.o build/obj/main.o
</pre></td></tr></tbody></table></code></pre></div></div>

<p>but it’s better to use the <code class="language-plaintext highlighter-rouge">wildcard</code> and <code class="language-plaintext highlighter-rouge">patsubst</code> functions. The first one finds all the files in <code class="language-plaintext highlighter-rouge">SRC_DIR</code> that end with <code class="language-plaintext highlighter-rouge">.cpp</code>; the second one rewrites each of those names, replacing the source directory with the object directory and the <code class="language-plaintext highlighter-rouge">.cpp</code> ending with <code class="language-plaintext highlighter-rouge">.o</code>, so we construct the object paths and names. This is very convenient if our sources are in the same directory. If we have nested directories we can repeat this operation, once per path.</p>
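
<p>If the make functions feel opaque, the same mapping can be sketched in plain shell with parameter expansion. This is only an analogy to what <code class="language-plaintext highlighter-rouge">patsubst</code> does for this makefile; the helper name <code class="language-plaintext highlighter-rouge">src_to_obj</code> is made up for the illustration and the output directory <code class="language-plaintext highlighter-rouge">build/obj</code> is hard-coded.</p>

```shell
#!/bin/sh
# Shell analogy of $(patsubst $(SRC_DIR)/%.cpp, $(OBJ_DIR)/%.o, $(SRCS)).

src_to_obj() {
  for src in "$@"; do
    name=${src##*/}                  # strip the directory: src/main.cpp -> main.cpp
    echo "build/obj/${name%.cpp}.o"  # swap the extension:  main.cpp -> build/obj/main.o
  done
}

src_to_obj src/matmul.cpp src/main.cpp
# prints build/obj/matmul.o and build/obj/main.o, one per line
```

<p>Inside a project directory, <code class="language-plaintext highlighter-rouge">src_to_obj src/*.cpp</code> would play the role of <code class="language-plaintext highlighter-rouge">wildcard</code> and <code class="language-plaintext highlighter-rouge">patsubst</code> together, with the shell glob doing the file discovery.</p>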

<p>Next comes the <code class="language-plaintext highlighter-rouge">all</code> rule. Since it is the first rule in the makefile, it is the one executed when calling <code class="language-plaintext highlighter-rouge">make</code> without specifying any rule: the default target.</p>

<p>The <code class="language-plaintext highlighter-rouge">$(TARGET)</code> rule is (after substituting the variable) <code class="language-plaintext highlighter-rouge">build/bin/main</code>, and it has the object files <code class="language-plaintext highlighter-rouge">$(OBJS)</code> as prerequisites (that’s <code class="language-plaintext highlighter-rouge">build/obj/matmul.o build/obj/main.o</code>). To build each object, make uses the pattern rule <code class="language-plaintext highlighter-rouge">$(OBJ_DIR)/%.o</code>, which matches every object path and pairs it with the source file of the same name. This rule creates the objects directory first and then compiles the object (recall the <code class="language-plaintext highlighter-rouge">-c</code> flag for compilation to object). The automatic variable <code class="language-plaintext highlighter-rouge">$&lt;</code>, as explained previously, is the first prerequisite (here the single source file) and <code class="language-plaintext highlighter-rouge">$@</code> is the name of the target.</p>

<p>Going back to the <code class="language-plaintext highlighter-rouge">$(TARGET)</code> rule: after the objects have been compiled, we first create the binary directory and then link the objects. For that we use the <code class="language-plaintext highlighter-rouge">-o</code> flag with <code class="language-plaintext highlighter-rouge">$@</code> representing the output (<code class="language-plaintext highlighter-rouge">build/bin/main</code>) and <code class="language-plaintext highlighter-rouge">$^</code> all the prerequisites (all the object files).</p>

<p>Finally we write a <code class="language-plaintext highlighter-rouge">clean</code> rule that removes the build directory. Notice that we declare <code class="language-plaintext highlighter-rouge">all</code> and <code class="language-plaintext highlighter-rouge">clean</code> as <code class="language-plaintext highlighter-rouge">.PHONY</code> targets. A phony target is not associated with any actual file; instead, it represents an action or a command. By declaring a target as phony, you ensure that <code class="language-plaintext highlighter-rouge">make</code> will always execute the associated recipe, regardless of whether a file with the same name as the target exists in the filesystem. Why does this matter? If a file named <code class="language-plaintext highlighter-rouge">clean</code> exists in the directory, running <code class="language-plaintext highlighter-rouge">make clean</code> without <code class="language-plaintext highlighter-rouge">.PHONY</code> would find that file, see that the rule has no prerequisites, and conclude that the target is “up to date.” This prevents the <code class="language-plaintext highlighter-rouge">clean</code> recipe from running. Declaring <code class="language-plaintext highlighter-rouge">clean</code> as a phony target ensures the recipe is executed regardless of such a file’s presence.</p>

<p>Another nice addition is a help menu. Let’s make the default <code class="language-plaintext highlighter-rouge">all</code> rule print the available targets on screen. To do that we modify part of the makefile; for simplicity I include the entire new makefile:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
</pre></td><td class="rouge-code"><pre><span class="c"># Compiler and flags</span>
CXX <span class="o">=</span> g++
CXXFLAGS <span class="o">=</span> <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span>

<span class="c"># Directories</span>
SRC_DIR <span class="o">=</span> src
INCLUDE_DIR <span class="o">=</span> include
BUILD_DIR <span class="o">=</span> build
OBJ_DIR <span class="o">=</span> <span class="si">$(</span>BUILD_DIR<span class="si">)</span>/obj
BIN_DIR <span class="o">=</span> <span class="si">$(</span>BUILD_DIR<span class="si">)</span>/bin

<span class="c"># Target executable name</span>
TARGET <span class="o">=</span> <span class="si">$(</span>BIN_DIR<span class="si">)</span>/main

<span class="c"># # Find all source files and corresponding object files</span>
SRCS <span class="o">=</span> <span class="si">$(</span>wildcard <span class="si">$(</span>SRC_DIR<span class="si">)</span>/<span class="k">*</span>.cpp<span class="si">)</span>
OBJS <span class="o">=</span> <span class="si">$(</span>patsubst <span class="si">$(</span>SRC_DIR<span class="si">)</span>/%.cpp, <span class="si">$(</span>OBJ_DIR<span class="si">)</span>/%.o, <span class="si">$(</span>SRCS<span class="si">))</span>

<span class="c"># Default target</span>
all:
	@echo <span class="s2">"Available options:"</span>
	@echo <span class="s2">"  build  - Build the project"</span>
	@echo <span class="s2">"  clean  - Remove all build files"</span>
	@echo <span class="s2">"  help   - Show this message"</span>

<span class="c"># Build target</span>
build: <span class="si">$(</span>TARGET<span class="si">)</span>

<span class="c"># Rule to build the executable</span>
<span class="si">$(</span>TARGET<span class="si">)</span>: <span class="si">$(</span>OBJS<span class="si">)</span>
	@mkdir <span class="nt">-p</span> <span class="si">$(</span>BIN_DIR<span class="si">)</span>
	<span class="si">$(</span>CXX<span class="si">)</span> <span class="si">$(</span>CXXFLAGS<span class="si">)</span> <span class="nt">-o</span> <span class="nv">$@</span> <span class="nv">$^</span>

<span class="c"># Rule to build object files</span>
<span class="si">$(</span>OBJ_DIR<span class="si">)</span>/%.o: <span class="si">$(</span>SRC_DIR<span class="si">)</span>/%.cpp
	@mkdir <span class="nt">-p</span> <span class="si">$(</span>OBJ_DIR<span class="si">)</span>
	<span class="si">$(</span>CXX<span class="si">)</span> <span class="si">$(</span>CXXFLAGS<span class="si">)</span> <span class="nt">-c</span> <span class="nv">$&lt;</span> <span class="nt">-o</span> <span class="nv">$@</span>

<span class="c"># Clean up build files</span>
clean:
	<span class="nb">rm</span> <span class="nt">-rf</span> <span class="si">$(</span>BUILD_DIR<span class="si">)</span>

<span class="c"># Help target</span>
<span class="nb">help</span>: all

<span class="c"># Phony targets</span>
.PHONY: all clean build <span class="nb">help</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Again run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre>make
make clean
make build
./build/bin/main
</pre></td></tr></tbody></table></code></pre></div></div>

<p>to display the help, clean the build directory, build the project again and run the executable.</p>

<h2 id="wrap-up">Wrap up</h2>

<p>We showed how to compile an executable using make and a makefile. This tutorial could be expanded with building libraries (static, dynamic) and linking them, but that is just an extension of the logic presented here. Developers still use makefiles: they are easy to comprehend, widely adopted and efficient. For bigger projects, however, <code class="language-plaintext highlighter-rouge">cmake</code> (and other tools like <code class="language-plaintext highlighter-rouge">bazel</code>) is more common, and we’ll show how to use it in a follow-up post. Until then, congratulations! Now you can build C++ programs with <code class="language-plaintext highlighter-rouge">make</code>!</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="C++" /><category term="computer science" /><summary type="html"><![CDATA[In previous posts we have been compiling and linking projects using bash commands, we had to write all commands explicitly to build the projects. This is ok for small projects with few source files but oftentimes in large projects we need to compile hundreds of source files and link them to different libraries which makes the build complex. This is what make was invented for.]]></summary></entry><entry><title type="html">C++ compile and link external library</title><link href="https://agramunt.me/posts/cpp-compile-link-external-lib/" rel="alternate" type="text/html" title="C++ compile and link external library" /><published>2025-01-18T17:05:00-08:00</published><updated>2025-02-18T22:59:29-08:00</updated><id>https://agramunt.me/posts/cpp-compile-link-external-lib</id><content type="html" xml:base="https://agramunt.me/posts/cpp-compile-link-external-lib/"><![CDATA[<p>As mentioned in other posts, C++ is a rich programming language that has been out there for a while: its predecessor C was created in the early 1970s and work on C++ began in 1979. 
Naturally, many projects flourished using this language and therefore we have lots of resources out there to use, among them <a href="https://opencv.org/">OpenCV</a>, <a href="https://www.boost.org/">Boost</a>, <a href="https://eigen.tuxfamily.org/index.php?title=Main_Page">Eigen</a> or <a href="https://www.gnu.org/software/gsl/doc/html/cblas.html">cBLAS</a>. In this post we will see an example of how to compile, link and use OpenCV in a custom C++ program.</p>

<p>The code to generate the following can be found in the repository <a href="https://github.com/SebastiaAgramunt/blogging-code">blogging-code</a> in the subdirectory <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/cpp-compile-link-external-lib">cpp-compile-link-external-lib</a>.</p>

<h2 id="file-structure">File structure</h2>

<p>If I have to use a library that is specific to a project, I prefer to keep the library inside the project itself so that the project is self-contained. I normally create a bash script that downloads and compiles the library inside a directory called <code class="language-plaintext highlighter-rouge">external</code>. The structure would be:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre>.
├── external
│   └── install-opencv.sh
├── main.cpp
└── scripts
    ├── compile-run.sh
    └── download-img.sh
</pre></td></tr></tbody></table></code></pre></div></div>

<p>where the script <code class="language-plaintext highlighter-rouge">install-opencv.sh</code> contains the routines to download and build the library, the source <code class="language-plaintext highlighter-rouge">main.cpp</code> is the example using OpenCV routines, <code class="language-plaintext highlighter-rouge">compile-run.sh</code> has the instructions to compile <code class="language-plaintext highlighter-rouge">main.cpp</code> and <code class="language-plaintext highlighter-rouge">download-img.sh</code> has the code to download the example image.</p>

<h2 id="installing-opencv">Installing OpenCV</h2>

<p>The script <code class="language-plaintext highlighter-rouge">install-opencv.sh</code> will install the OpenCV library version <code class="language-plaintext highlighter-rouge">4.10.0</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
</pre></td><td class="rouge-code"><pre><span class="c">#!/bin/bash</span>

<span class="nv">THIS_DIR</span><span class="o">=</span><span class="si">$(</span><span class="nb">dirname</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">realpath</span> <span class="s2">"</span><span class="nv">$0</span><span class="s2">"</span><span class="si">)</span><span class="s2">"</span><span class="si">)</span>
<span class="nv">ROOT_DIR</span><span class="o">=</span><span class="si">$(</span><span class="nb">dirname</span> <span class="k">${</span><span class="nv">THIS_DIR</span><span class="k">}</span><span class="si">)</span>
<span class="nv">OPENCV_VERSION</span><span class="o">=</span>4.10.0


<span class="c"># library installed in this directory/lib</span>
<span class="nv">LIBDIR</span><span class="o">=</span><span class="k">${</span><span class="nv">THIS_DIR</span><span class="k">}</span>/lib

<span class="c"># download and untar</span>
wget https://github.com/opencv/opencv/archive/refs/tags/<span class="k">${</span><span class="nv">OPENCV_VERSION</span><span class="k">}</span>.tar.gz <span class="nt">-O</span> <span class="k">${</span><span class="nv">THIS_DIR</span><span class="k">}</span>/opencv.tar.gz
<span class="nb">cd</span> <span class="k">${</span><span class="nv">THIS_DIR</span><span class="k">}</span> <span class="o">&amp;&amp;</span> <span class="nb">tar</span> <span class="nt">-xzf</span> <span class="k">${</span><span class="nv">THIS_DIR</span><span class="k">}</span>/opencv.tar.gz

<span class="c"># build the library</span>
<span class="nb">cd</span> <span class="k">${</span><span class="nv">THIS_DIR</span><span class="k">}</span>/opencv-<span class="k">${</span><span class="nv">OPENCV_VERSION</span><span class="k">}</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> build <span class="o">&amp;&amp;</span> <span class="nb">cd </span>build

cmake <span class="nt">-D</span> <span class="nv">CMAKE_BUILD_TYPE</span><span class="o">=</span>Release <span class="se">\</span>
	<span class="nt">-D</span> <span class="nv">CMAKE_INSTALL_PREFIX</span><span class="o">=</span><span class="nv">$LIBDIR</span>/opencv <span class="se">\</span>
	<span class="nt">-D</span> <span class="nv">BUILD_EXAMPLES</span><span class="o">=</span>ON ..

make <span class="nt">-j</span><span class="si">$(</span><span class="nb">nproc</span><span class="si">)</span>
make <span class="nb">install</span>

<span class="c"># remove temporary files</span>
<span class="nb">cd</span> <span class="k">${</span><span class="nv">THIS_DIR</span><span class="k">}</span> <span class="o">&amp;&amp;</span> <span class="nb">rm</span> <span class="nt">-rf</span> opencv-<span class="k">${</span><span class="nv">OPENCV_VERSION</span><span class="k">}</span>
<span class="nb">cd</span> <span class="k">${</span><span class="nv">THIS_DIR</span><span class="k">}</span> <span class="o">&amp;&amp;</span> <span class="nb">rm</span> <span class="nt">-rf</span> opencv.tar.gz

</pre></td></tr></tbody></table></code></pre></div></div>

<p>This will install OpenCV in <code class="language-plaintext highlighter-rouge">external/lib/opencv</code> for us to use. The includes are in <code class="language-plaintext highlighter-rouge">opencv/include</code>, the binaries in <code class="language-plaintext highlighter-rouge">opencv/bin</code> and the compiled libraries in <code class="language-plaintext highlighter-rouge">opencv/lib</code>.</p>

<p>Simply run with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">chmod</span> +x external/install-opencv.sh
./external/install-opencv.sh
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And the library will be compiled and installed in the <code class="language-plaintext highlighter-rouge">external/lib/opencv</code> directory.</p>
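
<p>Since the build takes a while, it can be handy to verify the result before moving on. The following is a minimal sketch, not part of the original scripts: the helper name <code class="language-plaintext highlighter-rouge">check_opencv_install</code> is hypothetical, and it assumes the install prefix contains an <code class="language-plaintext highlighter-rouge">include</code> tree with <code class="language-plaintext highlighter-rouge">opencv.hpp</code> somewhere inside it and a <code class="language-plaintext highlighter-rouge">lib</code> (or <code class="language-plaintext highlighter-rouge">lib64</code>) directory.</p>

```shell
#!/bin/sh
# Sanity check for an OpenCV install prefix (illustrative helper, an assumption
# of this sketch rather than something shipped with OpenCV).

check_opencv_install() {
  prefix=$1
  # The compiled libraries may land in lib or lib64 depending on the system.
  if [ ! -d "$prefix/lib" ] && [ ! -d "$prefix/lib64" ]; then
    echo "no library directory under $prefix"
    return 1
  fi
  # Look for the umbrella header anywhere below include/.
  header=$(find "$prefix/include" -name opencv.hpp 2>/dev/null | head -n 1)
  if [ -z "$header" ]; then
    echo "opencv.hpp not found under $prefix/include"
    return 1
  fi
  echo "found header: $header"
}

# Run from the project root after install-opencv.sh has finished.
check_opencv_install external/lib/opencv || echo "OpenCV does not look installed yet"
```

<p>The directory that contains the <code class="language-plaintext highlighter-rouge">opencv2</code> folder reported here is the one to pass to the compiler with <code class="language-plaintext highlighter-rouge">-I</code>.</p>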
<h2 id="download-image">Download image</h2>

<p>To run the example we need to download an image. I’ve chosen a free one from Wikimedia Commons. Write the script <code class="language-plaintext highlighter-rouge">download-img.sh</code> with the following:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre><span class="c">#!/bin/bash</span>

<span class="nv">THIS_DIR</span><span class="o">=</span><span class="si">$(</span><span class="nb">dirname</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">realpath</span> <span class="s2">"</span><span class="nv">$0</span><span class="s2">"</span><span class="si">)</span><span class="s2">"</span><span class="si">)</span>
<span class="nv">ROOT_DIR</span><span class="o">=</span><span class="si">$(</span><span class="nb">dirname</span> <span class="k">${</span><span class="nv">THIS_DIR</span><span class="k">}</span><span class="si">)</span>

<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/img
wget <span class="s2">"https://upload.wikimedia.org/wikipedia/commons/2/28/20100723_Miyajima_4904.jpg"</span> <span class="nt">-O</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/img/raw_img.jpeg
</pre></td></tr></tbody></table></code></pre></div></div>

<p>execute it with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">chmod</span> +x scripts/download-img.sh
./scripts/download-img.sh
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And this will download the image to <code class="language-plaintext highlighter-rouge">img/raw_img.jpeg</code>.</p>
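
<p>A failed download can leave an empty file or an HTML error page behind, which would only show up later as an image loading error in the C++ program. As a hedged sketch (the helper name <code class="language-plaintext highlighter-rouge">is_jpeg</code> is made up for this example), we can check that the file starts with the two bytes every JPEG begins with, <code class="language-plaintext highlighter-rouge">0xFF 0xD8</code>:</p>

```shell
#!/bin/sh
# Check the JPEG magic number of a downloaded file (illustrative helper).

is_jpeg() {
  # Read the first two bytes and print them as lowercase hex without spaces.
  magic=$(head -c 2 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "ffd8" ]
}

# Run from the project root after download-img.sh.
if is_jpeg img/raw_img.jpeg; then
  echo "img/raw_img.jpeg looks like a JPEG"
else
  echo "img/raw_img.jpeg is missing or not a JPEG"
fi
```

<p>This only inspects the header, so it does not guarantee the whole file is intact, but it catches the most common download failures.</p>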

<h2 id="the-main">The main</h2>

<p>The example consists of a small main that loads an image file, converts it to grayscale and finally blurs it with a Gaussian blur. Here is the code:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;opencv2/opencv.hpp&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">(</span><span class="kt">int</span> <span class="n">argc</span><span class="p">,</span> <span class="kt">char</span><span class="o">*</span> <span class="n">argv</span><span class="p">[])</span> <span class="p">{</span>
    <span class="c1">// Check if the user provided an argument</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">argc</span> <span class="o">!=</span> <span class="mi">2</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cerr</span> <span class="o">&lt;&lt;</span> <span class="s">"Usage: "</span> <span class="o">&lt;&lt;</span> <span class="n">argv</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">&lt;&lt;</span> <span class="s">" &lt;image_path&gt;"</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
        <span class="k">return</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="c1">// Get the image path from the command-line argument</span>
    <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">imagePath</span> <span class="o">=</span> <span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">];</span>

    <span class="c1">// Read the image</span>
    <span class="n">cv</span><span class="o">::</span><span class="n">Mat</span> <span class="n">image</span> <span class="o">=</span> <span class="n">cv</span><span class="o">::</span><span class="n">imread</span><span class="p">(</span><span class="n">imagePath</span><span class="p">);</span>

    <span class="c1">// Check if the image was successfully loaded</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">image</span><span class="p">.</span><span class="n">empty</span><span class="p">())</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cerr</span> <span class="o">&lt;&lt;</span> <span class="s">"Error: Unable to load image at "</span> <span class="o">&lt;&lt;</span> <span class="n">imagePath</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
        <span class="k">return</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="c1">// Convert the image to grayscale</span>
    <span class="n">cv</span><span class="o">::</span><span class="n">Mat</span> <span class="n">grayImage</span><span class="p">;</span>
    <span class="n">cv</span><span class="o">::</span><span class="n">cvtColor</span><span class="p">(</span><span class="n">image</span><span class="p">,</span> <span class="n">grayImage</span><span class="p">,</span> <span class="n">cv</span><span class="o">::</span><span class="n">COLOR_BGR2GRAY</span><span class="p">);</span>

    <span class="c1">// Apply Gaussian blur</span>
    <span class="n">cv</span><span class="o">::</span><span class="n">Mat</span> <span class="n">blurredImage</span><span class="p">;</span>
    <span class="n">cv</span><span class="o">::</span><span class="n">GaussianBlur</span><span class="p">(</span><span class="n">grayImage</span><span class="p">,</span> <span class="n">blurredImage</span><span class="p">,</span> <span class="n">cv</span><span class="o">::</span><span class="n">Size</span><span class="p">(</span><span class="mi">15</span><span class="p">,</span> <span class="mi">15</span><span class="p">),</span> <span class="mf">5.0</span><span class="p">);</span>

    <span class="c1">// Display the original and processed images</span>
    <span class="n">cv</span><span class="o">::</span><span class="n">imshow</span><span class="p">(</span><span class="s">"Original Image"</span><span class="p">,</span> <span class="n">image</span><span class="p">);</span>
    <span class="n">cv</span><span class="o">::</span><span class="n">imshow</span><span class="p">(</span><span class="s">"Blurred Image"</span><span class="p">,</span> <span class="n">blurredImage</span><span class="p">);</span>

    <span class="c1">// Wait for a key press</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Press any key to exit..."</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
    <span class="n">cv</span><span class="o">::</span><span class="n">waitKey</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>

    <span class="c1">// Save the processed image</span>
    <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">outputPath</span> <span class="o">=</span> <span class="s">"blurred_image.jpg"</span><span class="p">;</span>
    <span class="n">cv</span><span class="o">::</span><span class="n">imwrite</span><span class="p">(</span><span class="n">outputPath</span><span class="p">,</span> <span class="n">blurredImage</span><span class="p">);</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Processed image saved to "</span> <span class="o">&lt;&lt;</span> <span class="n">outputPath</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>

    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>

</pre></td></tr></tbody></table></code></pre></div></div>

<p>Copy these contents into the <code class="language-plaintext highlighter-rouge">main.cpp</code> file.</p>
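<p>To get a feeling for what the <code class="language-plaintext highlighter-rouge">cv::Size(15, 15)</code> and <code class="language-plaintext highlighter-rouge">5.0</code> arguments of <code class="language-plaintext highlighter-rouge">cv::GaussianBlur</code> mean, here is a minimal sketch that builds the weights of a one-dimensional Gaussian kernel with 15 taps and sigma 5. A two-dimensional Gaussian is separable, so applying these weights along the rows and then along the columns gives the 15x15 blur. The helper <code class="language-plaintext highlighter-rouge">gaussian_kernel_1d</code> is a hypothetical name of mine, not an OpenCV function, and OpenCV's internal implementation may differ in its details.</p>

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Build a normalized 1D Gaussian kernel with an odd number of taps.
// Illustrative helper (assumption): this is not part of the OpenCV API.
std::vector<double> gaussian_kernel_1d(int size, double sigma) {
    assert(size > 0 && size % 2 == 1 && sigma > 0.0);
    std::vector<double> kernel(size);
    const int half = size / 2;
    double sum = 0.0;
    for (int i = 0; i < size; ++i) {
        // Distance of this tap from the center of the kernel
        const double x = static_cast<double>(i - half);
        kernel[i] = std::exp(-(x * x) / (2.0 * sigma * sigma));
        sum += kernel[i];
    }
    for (double& w : kernel) {
        w /= sum;  // normalize so the weights add up to 1
    }
    return kernel;
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">gaussian_kernel_1d(15, 5.0)</code> returns 15 weights that are symmetric around the central one, which is the largest. A larger sigma flattens the weights and produces a stronger blur.</p>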

<h2 id="compile-and-run-the-example">Compile and run the example</h2>

<p>Now we have all the important files written, but we still need to compile and run the example. Write the following contents to the <code class="language-plaintext highlighter-rouge">scripts/compile-run.sh</code> file:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
</pre></td><td class="rouge-code"><pre><span class="c">#!/bin/bash</span>

<span class="nv">THIS_DIR</span><span class="o">=</span><span class="si">$(</span><span class="nb">dirname</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">realpath</span> <span class="s2">"</span><span class="nv">$0</span><span class="s2">"</span><span class="si">)</span><span class="s2">"</span><span class="si">)</span>
<span class="nv">ROOT_DIR</span><span class="o">=</span><span class="si">$(</span><span class="nb">dirname</span> <span class="k">${</span><span class="nv">THIS_DIR</span><span class="k">}</span><span class="si">)</span>

recreate_dirs<span class="o">(){</span>
    <span class="c"># removing build directory</span>
    <span class="nb">echo</span> <span class="s2">"Removing </span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span><span class="s2">/build and recreating..."</span>
    <span class="nb">rm</span> <span class="nt">-rf</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build
    <span class="nb">mkdir</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build

    <span class="c"># creating directories for the build</span>
    <span class="nb">mkdir</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj
    <span class="nb">mkdir</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/bin
    <span class="nb">mkdir</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/lib
<span class="o">}</span>

compile_exec<span class="o">(){</span>
    recreate_dirs

    <span class="c"># compile to objects</span>
    <span class="nb">echo</span> <span class="s2">"Compiling objects for executable..."</span>
    g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-I</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/external/lib/opencv/include/opencv4 <span class="nt">-c</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/main.cpp <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/main.o

    <span class="c"># link all the objects</span>
    g++ <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/main.o <span class="se">\</span>
        <span class="nt">-I</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/external/lib/opencv/include/opencv4 <span class="se">\</span>
        <span class="nt">-L</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/external/lib/opencv/lib <span class="se">\</span>
        <span class="nt">-lopencv_core</span> <span class="se">\</span>
        <span class="nt">-lopencv_highgui</span> <span class="se">\</span>
        <span class="nt">-lopencv_imgproc</span> <span class="se">\</span>
        <span class="nt">-lopencv_imgcodecs</span> <span class="se">\</span>
        <span class="nt">-Wl</span>,-rpath,external/lib/opencv/lib <span class="se">\</span>
        <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/bin/main
<span class="o">}</span>

croak<span class="o">(){</span>
    <span class="nb">echo</span> <span class="s2">"[ERROR] </span><span class="nv">$*</span><span class="s2">"</span> <span class="o">&gt;</span> /dev/stderr
    <span class="nb">exit </span>1
<span class="o">}</span>

main<span class="o">(){</span>
    <span class="k">if</span> <span class="o">[[</span> <span class="nt">-z</span> <span class="s2">"</span><span class="nv">$TASK</span><span class="s2">"</span> <span class="o">]]</span><span class="p">;</span> <span class="k">then
        </span>croak <span class="s2">"No TASK specified."</span>
    <span class="k">fi
    </span><span class="nb">echo</span> <span class="s2">"[INFO] running </span><span class="nv">$TASK</span><span class="s2"> </span><span class="nv">$*</span><span class="s2">"</span>
    <span class="nv">$TASK</span> <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span>
<span class="o">}</span>

main <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span>

</pre></td></tr></tbody></table></code></pre></div></div>

<p>Look at the compilation phase: to generate the main object we only need the headers of the library, which are in <code class="language-plaintext highlighter-rouge">external/lib/opencv/include/opencv4</code>. In the linking step the include path is passed again (the linker does not actually need it), together with the library directory <code class="language-plaintext highlighter-rouge">external/lib/opencv/lib</code> through the <code class="language-plaintext highlighter-rouge">-L</code> flag. Then with the flag <code class="language-plaintext highlighter-rouge">-l</code> we specify the libraries to link. To name one, <code class="language-plaintext highlighter-rouge">opencv_imgcodecs</code> can be found in <code class="language-plaintext highlighter-rouge">external/lib/opencv/lib/libopencv_imgcodecs.dylib</code>. Since I work on macOS the libraries have the <code class="language-plaintext highlighter-rouge">dylib</code> extension, which is the dynamic library format on macOS. The <code class="language-plaintext highlighter-rouge">-Wl,-rpath</code> option embeds a run-time search path in the executable so that the dynamic loader can find those libraries. Compile the code with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre><span class="nb">chmod</span> +x ./scripts/compile-run.sh

<span class="nb">export </span><span class="nv">TASK</span><span class="o">=</span>compile_exec
./scripts/compile-run.sh 
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This will create the usual <code class="language-plaintext highlighter-rouge">build</code> directory with its subdirectories and place the executable in <code class="language-plaintext highlighter-rouge">build/bin</code>.</p>

<p>Run the executable, passing the path of the image you downloaded in the previous section as its argument</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>./build/bin/main img/raw_img.jpeg
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="conclusions">Conclusions</h2>

<p>We downloaded the <code class="language-plaintext highlighter-rouge">opencv</code> library, compiled it from scratch and placed it inside our project directory. Next we used the library in a custom program to convert an image to grayscale and blur it with a Gaussian filter. The steps to use another library like Boost are similar: download the source code, compile it, and then use the flag <code class="language-plaintext highlighter-rouge">-I</code> to add the includes of your library, <code class="language-plaintext highlighter-rouge">-L</code> to indicate where the compiled libraries are, and <code class="language-plaintext highlighter-rouge">-l</code> to tell which libraries should be considered in the linking step.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="C++" /><category term="computer science" /><summary type="html"><![CDATA[As mentioned in other posts, C++ is a rich programming language that has been out there for a while: its predecessor C was created in the 70s and C++ in 1979. Naturally many projects flourished using this language and therefore we have lots of resources out there to use, among them OpenCV, Boost, Eigen or cBLAS. 
In this post we will see an example of how to compile, link and use OpenCV in a custom C++ program.]]></summary></entry><entry><title type="html">C++ Compile Libraries</title><link href="https://agramunt.me/posts/cpp-compile-library/" rel="alternate" type="text/html" title="C++ Compile Libraries" /><published>2025-01-18T16:23:00-08:00</published><updated>2025-02-18T22:59:29-08:00</updated><id>https://agramunt.me/posts/cpp-compile-library</id><content type="html" xml:base="https://agramunt.me/posts/cpp-compile-library/"><![CDATA[<p>In C++ we constantly deal with libraries that are compiled on our system, like the standard library, or third-party libraries such as <a href="https://opencv.org/">OpenCV</a> (for computer vision) or <a href="https://www.boost.org/">Boost</a> (for linear algebra, pseudorandom number generation…). In this post we will learn about shared and static libraries, how to compile them and how to link them to your programs.</p>

<p>All the code in this post can be found in the <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/cpp-compile-library">cpp-compile-library</a> directory of my <a href="https://github.com/SebastiaAgramunt/blogging-code">blogging-code</a> GitHub repository.</p>

<h2 id="file-structure">File structure</h2>

<p>For this demonstration we will generate a file structure like the following</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre><span class="nb">.</span>
├── include
│   └── matmul.h
├── scripts
│   └── compile.sh
└── src
    ├── main.cpp
    └── matmul.cpp
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">matmul</code> files will contain a routine to multiply two matrices. In my example I define <code class="language-plaintext highlighter-rouge">matmul.h</code> with the contents</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre><span class="cp">#ifndef MATMUL_H
#define MATMUL_H
</span>
<span class="kt">void</span> <span class="nf">matmul</span><span class="p">(</span><span class="k">const</span> <span class="kt">int</span><span class="o">*</span> <span class="n">A</span><span class="p">,</span> <span class="k">const</span> <span class="kt">int</span><span class="o">*</span> <span class="n">B</span><span class="p">,</span> <span class="kt">int</span><span class="o">*</span> <span class="n">C</span><span class="p">,</span> <span class="kt">int</span> <span class="n">M</span><span class="p">,</span> <span class="kt">int</span> <span class="n">N</span><span class="p">,</span> <span class="kt">int</span> <span class="n">K</span><span class="p">);</span>
<span class="kt">void</span> <span class="nf">printmatrix</span><span class="p">(</span><span class="k">const</span> <span class="kt">int</span><span class="o">*</span> <span class="n">A</span><span class="p">,</span> <span class="kt">int</span> <span class="n">M</span><span class="p">,</span> <span class="kt">int</span> <span class="n">N</span><span class="p">);</span>

<span class="cp">#endif
</span></pre></td></tr></tbody></table></code></pre></div></div>

<p>and <code class="language-plaintext highlighter-rouge">matmul.cpp</code> with</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
#include</span> <span class="cpf">"matmul.h"</span><span class="cp">
</span>
<span class="c1">// Matrices are indexed row-major in this example. E.g. if A is [M x N]</span>
<span class="c1">// If i,j are the row and column indices, the element A[i, j] is</span>
<span class="c1">// A[i, j] = A[i * N + j] // if row-major</span>
<span class="c1">// A[i, j] = A[j * M + i] // if column-major</span>

<span class="kt">void</span> <span class="nf">matmul</span><span class="p">(</span><span class="k">const</span> <span class="kt">int</span><span class="o">*</span> <span class="n">A</span><span class="p">,</span> <span class="k">const</span> <span class="kt">int</span><span class="o">*</span> <span class="n">B</span><span class="p">,</span> <span class="kt">int</span><span class="o">*</span> <span class="n">C</span><span class="p">,</span> <span class="kt">int</span> <span class="n">M</span><span class="p">,</span> <span class="kt">int</span> <span class="n">N</span><span class="p">,</span> <span class="kt">int</span> <span class="n">K</span><span class="p">){</span>
<span class="c1">// Matrix multiplication, C[M x K] = A[M x N] * B[N x K]</span>
<span class="c1">// Multiplication is $\sum_n A[m, n] * B[n, k]$</span>
    <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">m</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">m</span><span class="o">&lt;</span><span class="n">M</span><span class="p">;</span> <span class="n">m</span><span class="o">++</span><span class="p">){</span>
        <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">k</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">k</span><span class="o">&lt;</span><span class="n">K</span><span class="p">;</span> <span class="n">k</span><span class="o">++</span><span class="p">){</span>
            <span class="n">C</span><span class="p">[</span><span class="n">m</span> <span class="o">*</span> <span class="n">K</span> <span class="o">+</span> <span class="n">k</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
            <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">n</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">n</span><span class="o">&lt;</span><span class="n">N</span><span class="p">;</span> <span class="n">n</span><span class="o">++</span><span class="p">){</span>
                <span class="n">C</span><span class="p">[</span><span class="n">m</span> <span class="o">*</span> <span class="n">K</span> <span class="o">+</span> <span class="n">k</span><span class="p">]</span> <span class="o">+=</span> <span class="n">A</span><span class="p">[</span><span class="n">m</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">n</span><span class="p">]</span> <span class="o">*</span> <span class="n">B</span><span class="p">[</span><span class="n">n</span> <span class="o">*</span> <span class="n">K</span> <span class="o">+</span> <span class="n">k</span><span class="p">];</span>
            <span class="p">}</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">printmatrix</span><span class="p">(</span><span class="k">const</span> <span class="kt">int</span><span class="o">*</span> <span class="n">A</span><span class="p">,</span> <span class="kt">int</span> <span class="n">M</span><span class="p">,</span> <span class="kt">int</span> <span class="n">N</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">M</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">j</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">j</span> <span class="o">&lt;</span> <span class="n">N</span><span class="p">;</span> <span class="o">++</span><span class="n">j</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">A</span><span class="p">[</span><span class="n">i</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">j</span><span class="p">]</span> <span class="o">&lt;&lt;</span> <span class="s">" "</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>
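<p>The comment at the top of <code class="language-plaintext highlighter-rouge">matmul.cpp</code> mentions two ways of flattening a matrix into a one-dimensional array. As a small sketch, the two conventions can be written as index helpers. The names <code class="language-plaintext highlighter-rouge">row_major_index</code> and <code class="language-plaintext highlighter-rouge">col_major_index</code> are hypothetical and are not part of the code in the repository.</p>

```cpp
#include <cstddef>

// Flat offset of element (i, j) in an M x N matrix stored row by row.
// Only the number of columns N is needed.
inline std::size_t row_major_index(std::size_t i, std::size_t j, std::size_t N) {
    return i * N + j;
}

// Flat offset of the same element when the matrix is stored column by column.
// Only the number of rows M is needed.
inline std::size_t col_major_index(std::size_t i, std::size_t j, std::size_t M) {
    return j * M + i;
}
```

<p>For a 2x3 matrix, the element in row 0 and column 1 sits at offset 1 in row-major order and at offset 2 in column-major order. The routines in this post use the row-major convention throughout.</p>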

<p>The plan is to compile <code class="language-plaintext highlighter-rouge">matmul.cpp</code> as a library and then use it in <code class="language-plaintext highlighter-rouge">main.cpp</code>. The latter file should contain something like</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
#include</span> <span class="cpf">"matmul.h"</span><span class="cp">
</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">(</span><span class="kt">void</span><span class="p">){</span>

    <span class="c1">// A[M x N]</span>
    <span class="kt">int</span> <span class="n">M</span> <span class="o">=</span> <span class="mi">2</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">N</span> <span class="o">=</span> <span class="mi">3</span><span class="p">;</span> 
    <span class="kt">int</span><span class="o">*</span> <span class="n">A</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="n">M</span> <span class="o">*</span> <span class="n">N</span><span class="p">];</span>

    <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">M</span> <span class="o">*</span> <span class="n">N</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">){</span>
        <span class="n">A</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">i</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span> <span class="o">&lt;&lt;</span> <span class="s">"A:"</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
    <span class="n">printmatrix</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">M</span><span class="p">,</span> <span class="n">N</span><span class="p">);</span>

    <span class="c1">// B[N x K]</span>
    <span class="kt">int</span> <span class="n">K</span> <span class="o">=</span> <span class="mi">4</span><span class="p">;</span>
    <span class="kt">int</span><span class="o">*</span> <span class="n">B</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="n">N</span> <span class="o">*</span> <span class="n">K</span><span class="p">];</span>

    <span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">N</span> <span class="o">*</span> <span class="n">K</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">){</span>
        <span class="n">B</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">i</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span> <span class="o">&lt;&lt;</span> <span class="s">"B:"</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
    <span class="n">printmatrix</span><span class="p">(</span><span class="n">B</span><span class="p">,</span> <span class="n">N</span><span class="p">,</span> <span class="n">K</span><span class="p">);</span>

    <span class="c1">// C[M x K]</span>
    <span class="kt">int</span><span class="o">*</span> <span class="n">C</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="n">M</span> <span class="o">*</span> <span class="n">K</span><span class="p">];</span>
    <span class="n">matmul</span><span class="p">(</span><span class="n">A</span><span class="p">,</span> <span class="n">B</span><span class="p">,</span> <span class="n">C</span><span class="p">,</span> <span class="n">M</span><span class="p">,</span> <span class="n">N</span><span class="p">,</span> <span class="n">K</span><span class="p">);</span>

    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span> <span class="o">&lt;&lt;</span> <span class="s">"C = A x B: "</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
    <span class="n">printmatrix</span><span class="p">(</span><span class="n">C</span><span class="p">,</span> <span class="n">M</span><span class="p">,</span> <span class="n">K</span><span class="p">);</span>

    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Here we define two matrices, A (size 2x3) and B (size 3x4), filled with the integers from 0 up to the largest flat index of each matrix. We use the routine <code class="language-plaintext highlighter-rouge">matmul</code> to calculate the matrix C that results from multiplying A by B.</p>
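<p>The <code class="language-plaintext highlighter-rouge">main.cpp</code> above allocates the matrices with <code class="language-plaintext highlighter-rouge">new</code> and never releases them, which is harmless in a program this short but not a habit to keep. As an alternative sketch (not the code used in the rest of the post), the same computation can be written with <code class="language-plaintext highlighter-rouge">std::vector</code>, which frees its memory automatically. The names <code class="language-plaintext highlighter-rouge">matmul_vec</code> and <code class="language-plaintext highlighter-rouge">iota_matrix</code> are made up for this illustration.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Same triple loop as matmul, but on std::vector (illustrative variant).
// C[M x K] = A[M x N] * B[N x K], all stored row-major.
std::vector<int> matmul_vec(const std::vector<int>& A, const std::vector<int>& B,
                            std::size_t M, std::size_t N, std::size_t K) {
    assert(A.size() == M * N && B.size() == N * K);
    std::vector<int> C(M * K, 0);
    for (std::size_t m = 0; m < M; ++m) {
        for (std::size_t k = 0; k < K; ++k) {
            for (std::size_t n = 0; n < N; ++n) {
                C[m * K + k] += A[m * N + n] * B[n * K + k];
            }
        }
    }
    return C;
}

// Fill a rows x cols matrix with 0, 1, 2, ... as main.cpp does for A and B.
std::vector<int> iota_matrix(std::size_t rows, std::size_t cols) {
    std::vector<int> v(rows * cols);
    std::iota(v.begin(), v.end(), 0);
    return v;
}
```

<p>With these helpers, <code class="language-plaintext highlighter-rouge">matmul_vec(iota_matrix(2, 3), iota_matrix(3, 4), 2, 3, 4)</code> computes the same product as the pointer-based version without any manual memory management.</p>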

<p>In the next sections we will compile everything together, then compile with a shared library and with a static library, and explain the difference between static and shared libraries.</p>

<h2 id="compile-all-the-code">Compile all the code</h2>

<p>First let’s compile all the code and link it as we did in the previous post. Run the following to create the file structure for the compiled objects:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-rf</span> build
<span class="nb">mkdir </span>build

<span class="c"># creating directories for the build</span>
<span class="nb">mkdir </span>build/obj
<span class="nb">mkdir </span>build/bin
<span class="nb">mkdir </span>build/lib
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then start compiling the files</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/matmul.cpp <span class="nt">-o</span> build/obj/matmul.o
g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/main.cpp <span class="nt">-o</span> build/obj/main.o
</pre></td></tr></tbody></table></code></pre></div></div>

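<p>The <code class="language-plaintext highlighter-rouge">-I</code> flag precedes the path where the compiler should look for the includes. The <code class="language-plaintext highlighter-rouge">-std</code> flag selects the version of the C++ language standard to compile against, here C++17. The <code class="language-plaintext highlighter-rouge">-c</code> flag tells the compiler to stop after producing the object file, without linking.</p>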
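<p>If you want to check which standard a given <code class="language-plaintext highlighter-rouge">-std</code> value actually selects, the predefined macro <code class="language-plaintext highlighter-rouge">__cplusplus</code> can be inspected from the code itself. The following is a minimal sketch; the function name <code class="language-plaintext highlighter-rouge">language_standard</code> is an arbitrary choice for this illustration.</p>

```cpp
// Returns the value of the predefined __cplusplus macro, which encodes
// the C++ standard the translation unit was compiled with
// (201103L for C++11, 201402L for C++14, 201703L for C++17).
long language_standard() {
    return __cplusplus;
}
```

<p>Compiled with <code class="language-plaintext highlighter-rouge">g++ -std=c++17</code>, the function should return <code class="language-plaintext highlighter-rouge">201703</code>. Changing the flag to <code class="language-plaintext highlighter-rouge">-std=c++14</code> changes the returned value accordingly.</p>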

<p>Link the objects into the final executable</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>g++ build/obj/matmul.o <span class="se">\</span>
	build/obj/main.o <span class="se">\</span>
    <span class="nt">-o</span> build/bin/main
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And execute the compiled main to see the result</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>./build/bin/main
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This will print out the expected result</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
</pre></td><td class="rouge-code"><pre>A:
0 1 2 
3 4 5 

B:
0 1 2 3 
4 5 6 7 
8 9 10 11 

C = A x B: 
20 23 26 29 
56 68 80 92
</pre></td></tr></tbody></table></code></pre></div></div>
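<p>To double check the numbers above by hand, here is a minimal sketch of a row-major matrix product. The function name <code class="language-plaintext highlighter-rouge">matmul_sketch</code> and its signature are assumptions made for this illustration and are not necessarily the interface of the <code class="language-plaintext highlighter-rouge">matmul.cpp</code> used in this post.</p>

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not the matmul.cpp of the post): multiplies a
// row-major (rows_a x inner) matrix by a row-major (inner x cols_b) matrix.
std::vector<int> matmul_sketch(const std::vector<int>& a,
                               const std::vector<int>& b,
                               std::size_t rows_a,
                               std::size_t inner,
                               std::size_t cols_b) {
    std::vector<int> c(rows_a * cols_b, 0);
    for (std::size_t i = 0; i < rows_a; ++i) {
        for (std::size_t k = 0; k < inner; ++k) {
            for (std::size_t j = 0; j < cols_b; ++j) {
                // c[i][j] accumulates a[i][k] * b[k][j]
                c[i * cols_b + j] += a[i * inner + k] * b[k * cols_b + j];
            }
        }
    }
    return c;
}
```

<p>Calling it with A = {0, 1, 2, 3, 4, 5} (2 x 3) and B = {0, 1, ..., 11} (3 x 4) gives {20, 23, 26, 29, 56, 68, 80, 92}, the same values printed by the executable.</p>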

<h2 id="shared-library">Shared library</h2>

<p>A shared library is compiled once and loaded at runtime by every program that needs it. The operating system can map its read-only code into memory a single time and share it between processes, while each process keeps its own copy of the library’s writable data. This saves memory and makes the executable smaller, as the library code is not included in it. The price is a more complex deployment: the library has to be saved somewhere, and the dynamic loader must be able to find it when the program starts.</p>

<p>Let’s compile the shared library. First, recreate the build directory</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-rf</span> build
<span class="nb">mkdir </span>build

<span class="c"># creating directories for the build</span>
<span class="nb">mkdir </span>build/obj
<span class="nb">mkdir </span>build/bin
<span class="nb">mkdir </span>build/lib
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now compile the library <code class="language-plaintext highlighter-rouge">matmul</code> with the command</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre>g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-fPIC</span> <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/matmul.cpp <span class="nt">-o</span> build/obj/matmul.o
g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-shared</span> <span class="nt">-fPIC</span> <span class="nt">-Iinclude</span> build/obj/matmul.o <span class="nt">-o</span> build/lib/libmatmul.so

<span class="c"># or in one step</span>
<span class="c"># g++ -std=c++17 -shared -fPIC -Iinclude src/matmul.cpp -o build/lib/libmatmul.so</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Code that goes into a shared library built with the <code class="language-plaintext highlighter-rouge">-shared</code> flag should be compiled with the <code class="language-plaintext highlighter-rouge">-fPIC</code> flag. It makes the compiler generate position-independent code, that is, code that works regardless of the memory address at which the OS loads the library. The flag matters when the sources are compiled into objects: on some platforms, x86-64 Linux for example, linking objects compiled without it into a shared library can fail.</p>

<p>Now compile <code class="language-plaintext highlighter-rouge">main.cpp</code> to an object <code class="language-plaintext highlighter-rouge">main.o</code>. This is the code that will use the shared library.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/main.cpp <span class="nt">-o</span> build/obj/main.o
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Finally link the compiled <code class="language-plaintext highlighter-rouge">main.o</code> with the library to generate the executable.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre>g++ build/obj/main.o <span class="se">\</span>
    <span class="nt">-Iinclude</span> <span class="se">\</span>
    <span class="nt">-L</span>./build/lib <span class="se">\</span>
    <span class="nt">-lmatmul</span> <span class="se">\</span>
    <span class="nt">-Wl</span>,-rpath,./build/lib <span class="se">\</span>
    <span class="nt">-o</span> build/bin/main_dynamic
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">-L</code> flag indicates the directory where the library <code class="language-plaintext highlighter-rouge">libmatmul.so</code> is located. The <code class="language-plaintext highlighter-rouge">-l</code> flag tells the linker which library to link, in our case <code class="language-plaintext highlighter-rouge">matmul</code>. Note that the file is called <code class="language-plaintext highlighter-rouge">libmatmul.so</code> but the <code class="language-plaintext highlighter-rouge">-l</code> flag takes the name without the “lib” prefix and without the extension; the linker would fail to find the library otherwise. The <code class="language-plaintext highlighter-rouge">-Wl</code> option is used to pass options directly to the linker. Here <code class="language-plaintext highlighter-rouge">-Wl,-rpath,./build/lib</code> tells the linker to record a runtime search path in the executable, so the dynamic loader can find the shared library when the program starts. Since this path is relative, it is resolved against the directory from which you launch the program, so run the executable from the project root. The <code class="language-plaintext highlighter-rouge">-o</code> flag specifies the output path and filename.</p>

<p>Once the executable is linked test it by running</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>./build/bin/main_dynamic
</pre></td></tr></tbody></table></code></pre></div></div>

<p>You should get the same output as in the previous section. One of the consequences of using a shared library is that once linked we can’t move the compiled library, unless we also change the runtime search path stored in the executable (for example with <code class="language-plaintext highlighter-rouge">chrpath</code>). Let me explain this with an example: move the file <code class="language-plaintext highlighter-rouge">build/lib/libmatmul.so</code> somewhere else and try to run the executable again. It will fail at startup with an error because the loader can’t find the shared library. The <code class="language-plaintext highlighter-rouge">-rpath</code> option tells the executable where to find the library and is encoded in the executable at link time. As we will see, this does not happen with a static library.</p>
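<p>The same lookup failure can be observed from code. The sketch below uses the POSIX <code class="language-plaintext highlighter-rouge">dlopen</code> API to load a library explicitly at runtime, which is a different mechanism from the load-time linking used in this section, but it reports the same kind of error when the file is missing. The helper name <code class="language-plaintext highlighter-rouge">try_load</code> is an assumption made for this illustration, and on older Linux systems you may need to add <code class="language-plaintext highlighter-rouge">-ldl</code> when linking.</p>

```cpp
#include <dlfcn.h>
#include <string>

// Hypothetical helper: returns an empty string when the shared library at
// `path` can be loaded, otherwise the error message reported by the loader.
std::string try_load(const char* path) {
    void* handle = dlopen(path, RTLD_NOW);
    if (handle == nullptr) {
        const char* err = dlerror();
        return err != nullptr ? std::string(err) : std::string("unknown error");
    }
    dlclose(handle);  // loaded fine, release it again
    return std::string();
}
```

<p>For example, <code class="language-plaintext highlighter-rouge">try_load("./build/lib/libmatmul.so")</code> should return an empty string while the library is in place and a non-empty error message once the file has been moved away.</p>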

<h2 id="static-library">Static Library</h2>

<p>As opposed to a dynamic library, a static library is copied into the final executable at link time, so we don’t need to specify a library path at runtime. It uses more memory because each program carries its own copy of the library code, and the executable is larger for the same reason. On the bright side, the executable is self-contained. Let’s compile the example.</p>

<p>Compile the library</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/matmul.cpp <span class="nt">-o</span> build/obj/matmul.o
ar rcs build/lib/libmatmul.a build/obj/matmul.o
</pre></td></tr></tbody></table></code></pre></div></div>

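<p>The second command, <code class="language-plaintext highlighter-rouge">ar</code>, is simply the archiver, a command that combines several files into one (like zip but without compressing). The <code class="language-plaintext highlighter-rouge">rcs</code> options tell the archiver to insert the files, replacing members that already exist (<code class="language-plaintext highlighter-rouge">r</code>), to create the archive if it does not exist (<code class="language-plaintext highlighter-rouge">c</code>) and to write a symbol index (<code class="language-plaintext highlighter-rouge">s</code>).</p>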

<p>Next, create the object for the main as usual</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/main.cpp <span class="nt">-o</span> build/obj/main.o
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And the last step is to create the executable</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>g++ build/obj/main.o <span class="nt">-o</span> build/bin/main_static <span class="nt">-L</span>./build/lib <span class="nt">-lmatmul</span>
</pre></td></tr></tbody></table></code></pre></div></div>

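<p>If you check, it is a very similar command compared to the one we used to link against the shared library, only without the <code class="language-plaintext highlighter-rouge">-rpath</code> option. If <code class="language-plaintext highlighter-rouge">libmatmul.so</code> from the previous section is still in <code class="language-plaintext highlighter-rouge">build/lib</code>, remove it first: when both versions are present the linker normally prefers the shared one.</p>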
<h2 id="a-bash-script-to-compile-everything">A bash script to compile everything</h2>

<p>As usual, I have written a bash script that runs all the tasks: compiling the executable directly from the objects, compiling it against the static library and compiling it against the dynamic library.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
</pre></td><td class="rouge-code"><pre><span class="c">#!/bin/bash</span>

<span class="nv">THIS_DIR</span><span class="o">=</span><span class="si">$(</span><span class="nb">dirname</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">realpath</span> <span class="s2">"</span><span class="nv">$0</span><span class="s2">"</span><span class="si">)</span><span class="s2">"</span><span class="si">)</span>
<span class="nv">ROOT_DIR</span><span class="o">=</span><span class="si">$(</span><span class="nb">dirname</span> <span class="s2">"</span><span class="k">${</span><span class="nv">THIS_DIR</span><span class="k">}</span><span class="s2">"</span><span class="si">)</span>

recreate_dirs<span class="o">(){</span>
    <span class="c"># removing build directory</span>
    <span class="nb">echo</span> <span class="s2">"Removing </span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span><span class="s2">/build and recreating..."</span>
    <span class="nb">rm</span> <span class="nt">-rf</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build
    <span class="nb">mkdir</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build

    <span class="c"># creating directories for the build</span>
    <span class="nb">mkdir</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj
    <span class="nb">mkdir</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/bin
    <span class="nb">mkdir</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/lib
<span class="o">}</span>

compile_exec<span class="o">(){</span>
    recreate_dirs

    <span class="c"># compile to objects</span>
    <span class="nb">echo</span> <span class="s2">"Compiling objects for executable..."</span>
    g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-I</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/include <span class="nt">-c</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/src/matmul.cpp <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/matmul.o
    g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-I</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/include <span class="nt">-c</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/src/main.cpp <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/main.o

    <span class="c"># link all the objects</span>
    g++ <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/matmul.o <span class="se">\</span>
        <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/main.o <span class="se">\</span>
        <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/bin/main
<span class="o">}</span>

compile_static<span class="o">(){</span>
    recreate_dirs
    <span class="nb">echo</span> <span class="s2">"Compiling objects for executable using static library..."</span>

    <span class="c"># compile static library</span>
    g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-I</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/include <span class="nt">-c</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/src/matmul.cpp <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/matmul.o
    ar rcs <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/lib/libmatmul.a <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/matmul.o

    <span class="c"># compile main object</span>
    g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-I</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/include <span class="nt">-c</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/src/main.cpp <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/main.o

    <span class="c"># link</span>
    g++ <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/main.o <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/bin/main_static <span class="nt">-L</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/lib <span class="nt">-lmatmul</span>
<span class="o">}</span>


compile_dynamic<span class="o">(){</span>
    recreate_dirs
    g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-fPIC</span> <span class="nt">-I</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/include <span class="nt">-c</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/src/matmul.cpp <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/matmul.o
    g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-shared</span> <span class="nt">-fPIC</span> <span class="nt">-I</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/include <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/matmul.o <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/lib/libmatmul.so

    g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-I</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/include <span class="nt">-c</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/src/main.cpp <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/main.o

    g++ <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/main.o <span class="se">\</span>
    <span class="nt">-I</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/include <span class="se">\</span>
    <span class="nt">-L</span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/lib <span class="se">\</span>
    <span class="nt">-lmatmul</span> <span class="se">\</span>
    <span class="nt">-Wl</span>,-rpath,<span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/lib <span class="se">\</span>
    <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/bin/main_dynamic
<span class="o">}</span>


croak<span class="o">(){</span>
    <span class="nb">echo</span> <span class="s2">"[ERROR] </span><span class="nv">$*</span><span class="s2">"</span> <span class="o">&gt;</span> /dev/stderr
    <span class="nb">exit </span>1
<span class="o">}</span>

main<span class="o">(){</span>
    <span class="k">if</span> <span class="o">[[</span> <span class="nt">-z</span> <span class="s2">"</span><span class="nv">$TASK</span><span class="s2">"</span> <span class="o">]]</span><span class="p">;</span> <span class="k">then
        </span>croak <span class="s2">"No TASK specified."</span>
    <span class="k">fi
    </span><span class="nb">echo</span> <span class="s2">"[INFO] running </span><span class="nv">$TASK</span><span class="s2"> </span><span class="nv">$*</span><span class="s2">"</span>
    <span class="nv">$TASK</span> <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span>
<span class="o">}</span>

main <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span>

</pre></td></tr></tbody></table></code></pre></div></div>

<p>Save it as <code class="language-plaintext highlighter-rouge">scripts/compile.sh</code>, make it executable with <code class="language-plaintext highlighter-rouge">chmod +x scripts/compile.sh</code> and run it with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre><span class="nb">export </span><span class="nv">TASK</span><span class="o">=</span>compile_exec
./scripts/compile.sh

<span class="nb">export </span><span class="nv">TASK</span><span class="o">=</span>compile_dynamic
./scripts/compile.sh

<span class="nb">export </span><span class="nv">TASK</span><span class="o">=</span>compile_static
./scripts/compile.sh
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Bear in mind that we recreate the build directory in every execution. As mentioned previously, this is not the best way to compile a project, we normally use cmake or make. The bash script helps to understand the real bash commands used before we make things more complex with cmake.</p>
<h2 id="comparison-of-dynamic-libraries-vs-static-libraries">Comparison of dynamic libraries vs static libraries</h2>

<p>Here is a summary of the comparison that we have already explained in the previous sections. Just a cheatsheet for the future</p>

<table>
  <thead>
    <tr>
      <th>Feature</th>
      <th>Shared Library</th>
      <th>Static Library</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Memory Usage</strong></td>
      <td>Lower (shared across applications)</td>
      <td>Higher (duplicated in each app)</td>
    </tr>
    <tr>
      <td><strong>Executable Size</strong></td>
      <td>Smaller</td>
      <td>Larger</td>
    </tr>
    <tr>
      <td><strong>Deployment Simplicity</strong></td>
      <td>Requires library installation</td>
      <td>Self-contained executable</td>
    </tr>
    <tr>
      <td><strong>Update Flexibility</strong></td>
      <td>Can update library independently</td>
      <td>Requires app recompilation</td>
    </tr>
    <tr>
      <td><strong>Startup Performance</strong></td>
      <td>Potentially slower (dynamic linking)</td>
      <td>Faster (prelinked)</td>
    </tr>
    <tr>
      <td><strong>Compatibility Concerns</strong></td>
      <td>Dependency on library versions</td>
      <td>Fewer (library version is fixed at link time)</td>
    </tr>
  </tbody>
</table>

<p>One more thing to check is the size of the executables in this example. The executable <code class="language-plaintext highlighter-rouge">main_dynamic</code> is 39952 bytes whereas <code class="language-plaintext highlighter-rouge">main_static</code> is 40248 bytes: the static executable is 296 bytes larger, as it contains the code of the <code class="language-plaintext highlighter-rouge">matmul</code> library in the executable itself.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="C++" /><category term="computer science" /><summary type="html"><![CDATA[In C++ we constantly deal with libraries that are compiled in our system like the standard libraries or other libraries such as OpenCV (for computer vision) or Boost (for linear algebra, pseudorandom number generation…). In this post we will learn about shared and static libraries, how to compile them and how to link them to your programs.]]></summary></entry><entry><title type="html">C++ Multiple file project</title><link href="https://agramunt.me/posts/cpp-multifile-project/" rel="alternate" type="text/html" title="C++ Multiple file project" /><published>2025-01-18T15:05:00-08:00</published><updated>2025-02-18T22:59:29-08:00</updated><id>https://agramunt.me/posts/cpp-multifile-project</id><content type="html" xml:base="https://agramunt.me/posts/cpp-multifile-project/"><![CDATA[<p>Normally when dealing with C++ projects, developers structure their code in source files and headers. Here I want to show how to create a simple project without external dependencies. The entire example can be found in my GitHub repository <a href="https://github.com/SebastiaAgramunt/blogging-code">blogging-code</a>.</p>

<h2 id="file-structure">File structure</h2>

<p>Let’s design a project with two modules. Think of module1 as a physics module and of module2 as some sort of interface to that module. It is a good design pattern to isolate sub-projects inside the same project. The file structure I propose for the example is the following (the output of running <code class="language-plaintext highlighter-rouge">tree</code> in my project root directory):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
</pre></td><td class="rouge-code"><pre><span class="nb">.</span>
├── include
│   ├── module1
│   │   ├── module1c1.hpp
│   │   └── module1c2.hpp
│   └── module2
│       ├── module2c1.hpp
│       └── module2c2.hpp
├── scripts
│   └── compile.sh
└── src
    ├── main.cpp
    ├── module1
    │   ├── module1c1.cpp
    │   └── module1c2.cpp
    └── module2
        ├── module2c1.cpp
        └── module2c2.cpp
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Each cpp file should contain something like this (shown here for <code class="language-plaintext highlighter-rouge">module1c1.cpp</code>):</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
</span>
<span class="cp">#include</span> <span class="cpf">&lt;module1/module1c1.hpp&gt;</span><span class="cp">
</span>
<span class="kt">void</span> <span class="n">mod1c1</span><span class="o">::</span><span class="n">foo</span><span class="p">(){</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"mod1c1</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>where we substitute <code class="language-plaintext highlighter-rouge">module1/module1c1.hpp</code> and <code class="language-plaintext highlighter-rouge">mod1c1</code> depending on the filename. For example, in the file <code class="language-plaintext highlighter-rouge">module2c1.cpp</code> we would use <code class="language-plaintext highlighter-rouge">module2/module2c1.hpp</code> and <code class="language-plaintext highlighter-rouge">mod2c1</code>. The files are identical except for the names.</p>

<p>Also the header files should be something like</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
</pre></td><td class="rouge-code"><pre><span class="cp">#ifndef INCLUDE_MOD1C1_HPP
#define INCLUDE_MOD1C1_HPP
</span>
<span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
</span>
<span class="k">class</span> <span class="nc">mod1c1</span><span class="p">{</span>
<span class="nl">public:</span>
   <span class="kt">void</span> <span class="n">foo</span><span class="p">();</span>
<span class="p">};</span>

<span class="cp">#endif
</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>As before, replace <code class="language-plaintext highlighter-rouge">mod1c1</code> (in the class name and in the include guard) with the name that corresponds to each file.</p>

<p>Finally we add a <code class="language-plaintext highlighter-rouge">main.cpp</code> file that contains the <code class="language-plaintext highlighter-rouge">main</code> function (the entry point) to generate an executable that uses all the modules and functions in <code class="language-plaintext highlighter-rouge">src</code>. The contents of the file <code class="language-plaintext highlighter-rouge">main.cpp</code> are</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">"module1/module1c1.hpp"</span><span class="cp">
#include</span> <span class="cpf">"module1/module1c2.hpp"</span><span class="cp">
#include</span> <span class="cpf">"module2/module2c1.hpp"</span><span class="cp">
#include</span> <span class="cpf">"module2/module2c2.hpp"</span><span class="cp">
</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">(){</span>
    <span class="n">mod1c1</span> <span class="n">m1c1</span><span class="p">;</span> <span class="n">m1c1</span><span class="p">.</span><span class="n">foo</span><span class="p">();</span>
    <span class="n">mod1c2</span> <span class="n">m1c2</span><span class="p">;</span> <span class="n">m1c2</span><span class="p">.</span><span class="n">foo</span><span class="p">();</span>
    <span class="n">mod2c1</span> <span class="n">m2c1</span><span class="p">;</span> <span class="n">m2c1</span><span class="p">.</span><span class="n">foo</span><span class="p">();</span>
    <span class="n">mod2c2</span> <span class="n">m2c2</span><span class="p">;</span> <span class="n">m2c2</span><span class="p">.</span><span class="n">foo</span><span class="p">();</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>
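<p>As a quick sanity check of what the program is meant to print, here is a single-file sketch of the four classes. It is a variation on the code above and not the layout of the repository: <code class="language-plaintext highlighter-rouge">foo</code> takes an output stream and the helper <code class="language-plaintext highlighter-rouge">run_all</code> is hypothetical, both assumptions made so that the output can be captured in a string.</p>

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Variation on the classes of the post: foo() writes to a caller-supplied
// stream (an assumption made here) instead of writing to std::cout.
class mod1c1 { public: void foo(std::ostream& out) const { out << "mod1c1\n"; } };
class mod1c2 { public: void foo(std::ostream& out) const { out << "mod1c2\n"; } };
class mod2c1 { public: void foo(std::ostream& out) const { out << "mod2c1\n"; } };
class mod2c2 { public: void foo(std::ostream& out) const { out << "mod2c2\n"; } };

// Hypothetical stand-in for main(): calls foo() on one object of each class,
// in the same order as main.cpp, and returns everything that was written.
std::string run_all() {
    std::ostringstream out;
    mod1c1 m1c1; m1c1.foo(out);
    mod1c2 m1c2; m1c2.foo(out);
    mod2c1 m2c1; m2c1.foo(out);
    mod2c2 m2c2; m2c2.foo(out);
    return out.str();
}
```

<p>The string returned by <code class="language-plaintext highlighter-rouge">run_all()</code> contains one line per class, from <code class="language-plaintext highlighter-rouge">mod1c1</code> to <code class="language-plaintext highlighter-rouge">mod2c2</code>, which is what the real executable should print once every class has its own header and source file.</p>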

<h2 id="compilation-and-linking-for-executable">Compilation and linking for executable</h2>

<p>It is time to compile the source files to objects. First create a directory <code class="language-plaintext highlighter-rouge">build</code> where we are going to store all the build outputs. Commonly we also create the subdirectories <code class="language-plaintext highlighter-rouge">obj</code> to store the objects, <code class="language-plaintext highlighter-rouge">bin</code> to store the binaries and <code class="language-plaintext highlighter-rouge">lib</code> for the libraries</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre><span class="c"># recreate build</span>
<span class="nb">rm</span> <span class="nt">-rf</span> build
<span class="nb">mkdir </span>build

<span class="c"># create subdirectories</span>
<span class="nb">mkdir </span>build/obj build/bin build/lib
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now compile the source files to objects and place them in the subfolder <code class="language-plaintext highlighter-rouge">obj</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre><span class="c"># compile all sources</span>
g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/module1/module1c1.cpp <span class="nt">-o</span> build/obj/moudle1c1.o
g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/module1/module1c2.cpp <span class="nt">-o</span> build/obj/moudle1c2.o
g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/module2/module2c1.cpp <span class="nt">-o</span> build/obj/moudle2c1.o
g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/module2/module2c2.cpp <span class="nt">-o</span> build/obj/moudle2c2.o

<span class="c"># compile main</span>
g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/main.cpp <span class="nt">-o</span> build/obj/main.o
</pre></td></tr></tbody></table></code></pre></div></div>

<p>We are choosing the C++ standard with the flag <code class="language-plaintext highlighter-rouge">-std</code> (check the versions in <a href="https://en.cppreference.com/w/cpp">cppreference.com</a>). The include directory is “included” with the flag <code class="language-plaintext highlighter-rouge">-I</code> and according to the project structure is <code class="language-plaintext highlighter-rouge">include</code>. Finally the flag <code class="language-plaintext highlighter-rouge">-c</code> tells the compiler to compile the source into an object file without linking. For a faster executable (not a faster compilation) use additional flags like <code class="language-plaintext highlighter-rouge">-O3</code> (aggressive optimization, you can use <code class="language-plaintext highlighter-rouge">-O2</code> or <code class="language-plaintext highlighter-rouge">-O1</code> for less optimization), <code class="language-plaintext highlighter-rouge">-march=native</code> (optimizing for your specific CPU), <code class="language-plaintext highlighter-rouge">-flto</code> (link time optimization for cross file optimizations) and <code class="language-plaintext highlighter-rouge">-ffast-math</code> (aggressive floating point optimizations that relax strict IEEE compliance).</p>

<p>After the sources have been compiled and we have checked that there are no errors, the next step is to link the objects. The linker resolves the symbols defined in all the objects and generates a <code class="language-plaintext highlighter-rouge">main</code> executable that, when run, starts with the <code class="language-plaintext highlighter-rouge">int main()</code> function. The function <code class="language-plaintext highlighter-rouge">main()</code> could be in any of the object files, but the important thing is that when linking several objects into an executable there must be exactly one <code class="language-plaintext highlighter-rouge">main</code> function. Let’s link all the objects with the command</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre><span class="c"># link all the objects</span>
g++ build/obj/moudle1c1.o <span class="se">\</span>
    build/obj/moudle1c2.o <span class="se">\</span>
    build/obj/moudle2c1.o <span class="se">\</span>
    build/obj/moudle2c2.o <span class="se">\</span>
    build/obj/main.o <span class="se">\</span>
    <span class="nt">-o</span> build/bin/main
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Note that we placed the executable in <code class="language-plaintext highlighter-rouge">build/bin/main</code>. Run the binary with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>./build/bin/main
</pre></td></tr></tbody></table></code></pre></div></div>
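
<p>To see what the linker actually resolves, here is a minimal sketch squeezed into a single file so that it compiles on its own. The names <code class="language-plaintext highlighter-rouge">add</code> and <code class="language-plaintext highlighter-rouge">use_add</code>, and the file split suggested in the comments, are hypothetical and not part of the project above. The idea is that a caller only needs a declaration to compile, and the call is connected to the single definition later.</p>

```cpp
// Hypothetical example: names and file layout are illustrative only.

// What a header (e.g. include/adder.h) would contain: a declaration.
int add(int a, int b);

// What a caller (e.g. src/main.cpp) would contain. With only the
// declaration visible this already compiles to an object file; when the
// files are separate, the call to add() stays an undefined symbol until
// the link step.
int use_add() {
    return add(2, 3);
}

// What another source file (e.g. src/adder.cpp) would contain: the one
// definition that the call above is matched to.
int add(int a, int b) {
    return a + b;
}
```

<p>If the definition lived in a file that was left out of the link command, compiling the caller would still succeed and the link step would fail with an undefined reference error.</p>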

<p>For convenience, and to make things faster, I like to create scripts. Save the following bash script as <code class="language-plaintext highlighter-rouge">scripts/compile.sh</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
</pre></td><td class="rouge-code"><pre><span class="nv">THIS_DIR</span><span class="o">=</span><span class="si">$(</span><span class="nb">dirname</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">realpath</span> <span class="s2">"</span><span class="nv">$0</span><span class="s2">"</span><span class="si">)</span><span class="s2">"</span><span class="si">)</span>
<span class="nv">ROOT_DIR</span><span class="o">=</span><span class="si">$(</span><span class="nb">dirname</span> <span class="k">${</span><span class="nv">THIS_DIR</span><span class="k">}</span><span class="si">)</span>

recreate_dirs<span class="o">(){</span>
    <span class="c"># removing build directory</span>
    <span class="nb">echo</span> <span class="s2">"Removing </span><span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span><span class="s2">/build and recreating..."</span>
    <span class="nb">rm</span> <span class="nt">-rf</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build
    <span class="nb">mkdir</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build

    <span class="c"># creating directories for the build</span>
    <span class="nb">mkdir</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj
    <span class="nb">mkdir</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/bin
    <span class="nb">mkdir</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/lib
<span class="o">}</span>

compile_exec<span class="o">(){</span>
    recreate_dirs

    <span class="c"># compile to objects</span>
    <span class="nb">echo</span> <span class="s2">"Compiling objects for executable..."</span>
    g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/module1/module1c1.cpp <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/moudle1c1.o
    g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/module1/module1c2.cpp <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/moudle1c2.o
    g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/module2/module2c1.cpp <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/moudle2c1.o
    g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/module2/module2c2.cpp <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/moudle2c2.o

    <span class="c"># compile the main to object</span>
    g++ <span class="nt">-std</span><span class="o">=</span>c++17 <span class="nt">-Iinclude</span> <span class="nt">-c</span> src/main.cpp <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/main.o

    <span class="c"># link all the objects</span>
    g++ <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/moudle1c1.o <span class="se">\</span>
        <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/moudle1c2.o <span class="se">\</span>
        <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/moudle2c1.o <span class="se">\</span>
        <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/moudle2c2.o <span class="se">\</span>
        <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/obj/main.o <span class="se">\</span>
        <span class="nt">-o</span> <span class="k">${</span><span class="nv">ROOT_DIR</span><span class="k">}</span>/build/bin/main
<span class="o">}</span>


croak<span class="o">(){</span>
  <span class="nb">echo</span> <span class="s2">"[ERROR] </span><span class="nv">$*</span><span class="s2">"</span> <span class="o">&gt;</span> /dev/stderr
  <span class="nb">exit </span>1
<span class="o">}</span>

main<span class="o">(){</span>

  <span class="k">if</span> <span class="o">[[</span> <span class="nt">-z</span> <span class="s2">"</span><span class="nv">$TASK</span><span class="s2">"</span> <span class="o">]]</span><span class="p">;</span> <span class="k">then
    </span>croak <span class="s2">"No TASK specified."</span>
  <span class="k">fi
  </span><span class="nb">echo</span> <span class="s2">"[INFO] running </span><span class="nv">$TASK</span><span class="s2"> </span><span class="nv">$*</span><span class="s2">"</span>
  <span class="nv">$TASK</span> <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span>
<span class="o">}</span>

main <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span>

</pre></td></tr></tbody></table></code></pre></div></div>

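<p>To execute it, run the following from the project root (the compile commands in the script use paths relative to the current directory):</p>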

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="c"># make the script executable</span>
<span class="nb">chmod</span> +x scripts/compile.sh

<span class="nb">export </span><span class="nv">TASK</span><span class="o">=</span>compile_exec
./scripts/compile.sh
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The newly generated executable will be in <code class="language-plaintext highlighter-rouge">build/bin/main</code>. Execute it as:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>./build/bin/main
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This prints the output of each function to your screen.</p>
<h2 id="remarks">Remarks</h2>

<p>What we have shown here is the bare bones of a C++ project. We created an executable that runs several functions defined in different files.</p>

<p>These are the basics. In follow-up posts we will learn how to compile code using <a href="https://www.gnu.org/software/make/manual/make.html">Makefiles</a> and <a href="https://cmake.org/">CMake</a>, which are better and more convenient tools to compile code. We will also learn to create static and dynamic libraries of our code to be used in other projects, and how to compile and link external libraries like <a href="https://opencv.org/">OpenCV</a>, <a href="https://www.boost.org/">Boost</a>, <a href="https://eigen.tuxfamily.org/index.php?title=Main_Page">Eigen</a> or <a href="https://www.gnu.org/software/gsl/doc/html/cblas.html">cBLAS</a>.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="C++" /><category term="computer science" /><summary type="html"><![CDATA[Normally when dealing with C++ projects, developers structure their code in source files and headers. Here I want to show how to create a simple project without external dependencies. The entire example can be found in my github repository blogging-code.]]></summary></entry><entry><title type="html">C++ Basics</title><link href="https://agramunt.me/posts/cpp-basics/" rel="alternate" type="text/html" title="C++ Basics" /><published>2025-01-18T14:02:00-08:00</published><updated>2025-01-18T14:02:00-08:00</updated><id>https://agramunt.me/posts/cpp-basics</id><content type="html" xml:base="https://agramunt.me/posts/cpp-basics/"><![CDATA[<p>This post is about C++ and the way we can compile a C++ program. I remember in the early days of my PhD in physics that I used to write all my code into one <code class="language-plaintext highlighter-rouge">main.cpp</code> and compile it into an executable. Over the months and years I understood that it is more efficient to write in different files serving different purposes and compile and link your libraries. I write this post as a reminder for me but also for these people that are starting in C++ as a guide.</p>

<h1 id="tldr">TLDR</h1>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
</pre></td><td class="rouge-code"><pre>
<span class="c"># step by step compilation</span>
g++ <span class="nt">-E</span> main.cpp <span class="nt">-o</span> main.i <span class="c"># preprocessing</span>
g++ <span class="nt">-S</span> main.cpp <span class="nt">-o</span> main.s <span class="c"># compilation: assembly</span>
g++ <span class="nt">-c</span> main.s <span class="nt">-o</span> main.o   <span class="c"># compilation: object file</span>
g++ main.o <span class="nt">-o</span> hello_world <span class="c"># linking</span>

<span class="c"># compilation all at once (one file)</span>
g++ main.cpp <span class="nt">-o</span> hello_world
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="compilation">Compilation</h2>

<p>Compilation is the process of translating human-readable source code into machine code, that is, instructions that the machine can execute. It happens in several steps. Let’s walk through them using an example file saved as <code class="language-plaintext highlighter-rouge">main.cpp</code>:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="c1"> </span><span class="cp">
</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span> 
	<span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Hello, World!"</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span> <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="installing-the-compiler">Installing the compiler</h2>

<p>We will install <code class="language-plaintext highlighter-rouge">gcc</code> (GNU Compiler Collection), the compiler for C, and <code class="language-plaintext highlighter-rouge">g++</code>, the compiler for C++. On Debian- or Ubuntu-based Linux distributions install them with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">sudo </span>apt update
<span class="nb">sudo </span>apt <span class="nb">install </span>gcc g++
</pre></td></tr></tbody></table></code></pre></div></div>

<p>On macOS, install using Homebrew:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>brew <span class="nb">install </span>gcc
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The Homebrew formula already includes both <code class="language-plaintext highlighter-rouge">gcc</code> and <code class="language-plaintext highlighter-rouge">g++</code>. Check that both compilers have been successfully installed with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>gcc <span class="nt">--version</span>
g++ <span class="nt">--version</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<h3 id="preprocessing">Preprocessing</h3>

<p>The preprocessing step has several functions:</p>
<ul>
  <li>File inclusion: when encountering <code class="language-plaintext highlighter-rouge">#include</code>, the preprocessor replaces it with the actual content of the included file.</li>
  <li>Macro expansion: macros defined with <code class="language-plaintext highlighter-rouge">#define</code> are replaced wherever they appear in the code. E.g. after <code class="language-plaintext highlighter-rouge">#define PI 3.14159</code>, the line <code class="language-plaintext highlighter-rouge">double area = PI * r * r;</code> becomes <code class="language-plaintext highlighter-rouge">double area = 3.14159 * r * r;</code> after preprocessing.</li>
  <li>Conditional compilation: code between directives such as <code class="language-plaintext highlighter-rouge">#ifdef</code> and <code class="language-plaintext highlighter-rouge">#endif</code> is kept or dropped in this step. For instance, in <code class="language-plaintext highlighter-rouge">#ifdef DEBUG std::cout &lt;&lt; "Debugging enabled" &lt;&lt; std::endl; #endif</code> (with each directive on its own line) the print statement is kept only if <code class="language-plaintext highlighter-rouge">DEBUG</code> is defined, for example by passing the flag <code class="language-plaintext highlighter-rouge">-DDEBUG</code> to the compiler.</li>
  <li>Comment removal: comments such as <code class="language-plaintext highlighter-rouge">//this is a comment</code> are not included in the preprocessed output.</li>
</ul>

<p>The command to generate a <code class="language-plaintext highlighter-rouge">main.i</code> preprocessed file is:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>g++ <span class="nt">-E</span> main.cpp <span class="nt">-o</span> main.i
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Take a look at the generated file: a lot has changed compared with the source. Preprocessing pulls all the necessary code into a single file, so the compiler receives one self-contained unit to compile into an object. It is also where configuration is switched on or off using <code class="language-plaintext highlighter-rouge">#ifdef</code> and related directives; this way we can compile, for example, <code class="language-plaintext highlighter-rouge">DEBUG</code>-only or <code class="language-plaintext highlighter-rouge">CUDA</code>-only code.</p>

<h3 id="compilation-generate-assembly-code">Compilation: Generate assembly code</h3>

<p>This step instructs the compiler to translate the preprocessed C++ source file (<code class="language-plaintext highlighter-rouge">main.i</code>) into <strong>assembly code</strong>, which is a low-level, human-readable representation of the instructions for the target machine’s CPU.</p>

<p>The following command preprocesses the source and then generates the assembly code for your specific CPU:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>g++ <span class="nt">-S</span> main.cpp <span class="nt">-o</span> main.s
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The compiler reads the preprocessed file and parses it into an abstract syntax tree, a tree representation of the program’s structure. It checks for errors such as undeclared variables, type mismatches and violations of the C++ standard. It then optimizes the code by removing or simplifying unnecessary operations. Finally, it writes the hardware-specific instructions as human-readable assembly to the output file.</p>
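
<p>A small sketch of two things the compiler does in this phase, type checking and simplifying operations. The function names are hypothetical and chosen only for this example. Here <code class="language-plaintext highlighter-rouge">constexpr</code> together with <code class="language-plaintext highlighter-rouge">static_assert</code> forces the arithmetic to be evaluated at compile time; for an ordinary function the compiler will typically perform the same simplification when optimizations are enabled.</p>

```cpp
// Hypothetical helper, not from the post: the arithmetic uses only constants.
constexpr int seconds_per_day() {
    return 60 * 60 * 24;
}

// static_assert is checked by the compiler, so passing it shows the value
// was computed during compilation rather than at run time.
static_assert(seconds_per_day() == 86400, "evaluated at compile time");

// Type checking also happens in this phase. Uncommenting the next line
// makes compilation fail, because a string literal cannot initialize an int:
// int broken = "not a number";

// Hypothetical caller: the multiplications need not happen at run time.
int day_in_seconds() {
    return seconds_per_day();
}
```

<p>If you generate the assembly with <code class="language-plaintext highlighter-rouge">g++ -S</code>, you would typically find the constant 86400 in the output for <code class="language-plaintext highlighter-rouge">day_in_seconds</code> instead of two multiplications.</p>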

<h3 id="compilation-generate-object-file">Compilation: Generate object file</h3>

<p>This step is performed by the assembler, which translates the assembly code produced in the previous step into object code, that is, machine code that the CPU can execute. However, it is not yet a complete executable.</p>

<p>The command for this step is</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>g++ <span class="nt">-c</span> main.s <span class="nt">-o</span> main.o
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This creates a binary object file that has not yet been linked into an executable. This is useful for modular projects with multiple source files.</p>

<h3 id="linking-executable-generation">Linking: Executable generation</h3>

<p>Now that we have machine code, we need to link all the objects to generate an executable that can be run on the machine to produce an output or a calculation. In our simple case:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>g++ main.o <span class="nt">-o</span> hello_world
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The linking phase is simple here because we are dealing with a single object file, but in general we have many. We will see how to deal with that in a follow-up post. What we end up with is a <code class="language-plaintext highlighter-rouge">hello_world</code> file that can be executed as <code class="language-plaintext highlighter-rouge">./hello_world</code> to print <code class="language-plaintext highlighter-rouge">Hello, World!</code> on your screen.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="C++" /><category term="computer science" /><summary type="html"><![CDATA[This post is about C++ and the way we can compile a C++ program. I remember in the early days of my PhD in physics that I used to write all my code into one main.cpp and compile it into an executable. Over the months and years I understood that it is more efficient to write in different files serving different purposes and compile and link your libraries. I write this post as a reminder for me but also for these people that are starting in C++ as a guide.]]></summary></entry><entry><title type="html">Graphs - Depth First Search</title><link href="https://agramunt.me/posts/depth-first-search/" rel="alternate" type="text/html" title="Graphs - Depth First Search" /><published>2024-11-30T15:05:00-08:00</published><updated>2025-03-03T21:48:42-08:00</updated><id>https://agramunt.me/posts/depth-first-search</id><content type="html" xml:base="https://agramunt.me/posts/depth-first-search/"><![CDATA[<p>In this post we will show how to traverse a graph using the depth first search algorithm. As usual we will use an example to explain it. Find the code of this post in my GitHub repository <a href="https://github.com/SebastiaAgramunt/blogging-code">blogging-code</a>, in the subdirectory <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/graphs-depth-first-search">graphs-depth-first-search</a>.</p>

<p>Consider the following graph</p>

<pre><code class="language-mermaid">%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#d7d7d7', "lineColor": "#5972ff"}}}%%
graph TD
    A((A)) --- B((B))
    A --- C((C))
    B --- D((D))
    B --- E((E))
    C --- F((F))
    D --- A
    D --- F
    E --- F
    F --- G((G))
    H((H)) --- B

    %% Define styles
    classDef largeNodes fill:#d7d7d7,stroke:#000000 ,stroke-width:2px,font-size:24px;
    classDef visited fill:#8fff85,stroke:#000000 ,stroke-width:2px,font-size:24px;

    %% Apply base class to all nodes
    class A,B,C,D,E,F,G,H largeNodes;

    %% Step 1: Visit A
    class A visited;
</code></pre>
<p>Consider that we start from node $A$ (marked in green) and we want to visit all the nodes.</p>

<h2 id="transversing-the-graph-using-depth-first-search">Traversing the graph using depth first search</h2>

<p>Depth first search keeps track of the nodes still to visit with a <strong>LIFO</strong> (Last In First Out) structure, or equivalently a <strong>stack</strong>. Let’s see it through an example, using the graph above and starting from node <strong>A</strong>.</p>

<p>A simple way to represent this graph in Python using an adjacency list is</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
</pre></td><td class="rouge-code"><pre><span class="kn">from</span> <span class="n">typing</span> <span class="kn">import</span> <span class="n">Dict</span><span class="p">,</span> <span class="n">List</span>
<span class="n">GraphType</span> <span class="o">=</span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">]]</span>

<span class="n">graph</span> <span class="o">=</span> <span class="p">{</span>
    <span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">C</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">H</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">E</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">C</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">E</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">C</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">E</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">G</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">G</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">H</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">],</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>The summary of the steps for the depth first search of this graph is</p>

<table>
  <thead>
    <tr>
      <th><strong>Step</strong></th>
      <th><strong>Current Node</strong></th>
      <th><strong>Stack (before update)</strong></th>
      <th><strong>Visited Nodes</strong></th>
      <th><strong>Stack (after update)</strong></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>A</td>
      <td>[]</td>
      <td>[A]</td>
      <td>[D, C, B]</td>
    </tr>
    <tr>
      <td>1</td>
      <td>B</td>
      <td>[D, C]</td>
      <td>[A, B]</td>
      <td>[D, C, D, E, H, A]</td>
    </tr>
    <tr>
      <td>2</td>
      <td>H</td>
      <td>[D, C, D, E]</td>
      <td>[A, B, H]</td>
      <td>[D, C, D, E, B]</td>
    </tr>
    <tr>
      <td>3</td>
      <td>E</td>
      <td>[D, C, D]</td>
      <td>[A, B, H, E]</td>
      <td>[D, C, D, F, B]</td>
    </tr>
    <tr>
      <td>4</td>
      <td>F</td>
      <td>[D, C, D]</td>
      <td>[A, B, H, E, F]</td>
      <td>[D, C, D, G, E, D, C]</td>
    </tr>
    <tr>
      <td>5</td>
      <td>C</td>
      <td>[D, C, D, G, E, D]</td>
      <td>[A, B, H, E, F, C]</td>
      <td>[D, C, D, G, E, D, F, A]</td>
    </tr>
    <tr>
      <td>6</td>
      <td>D</td>
      <td>[D, C, D, G, E]</td>
      <td>[A, B, H, E, F, C, D]</td>
      <td>[D, C, D, G, E, F, A, B]</td>
    </tr>
    <tr>
      <td>7</td>
      <td>G</td>
      <td>[D, C, D]</td>
      <td>[A, B, H, E, F, C, D, G]</td>
      <td>[D, C, D, F]</td>
    </tr>
  </tbody>
</table>

<p>Let’s look at some of the first steps:</p>

<ul>
  <li>Step 0: We visit node $A$ and push its neighbours $B$, $C$ and $D$ onto the stack in reverse order, so that $B$ ends up on top: [$D$, $C$, $B$] (the top of the stack is the rightmost element).</li>
  <li>Step 1: Pop the next node from the stack, $B$, mark it as visited and push the nodes connected to $B$ ($A$, $H$, $E$ and $D$) in reverse order. The stack is now [$D$, $C$, $D$, $E$, $H$, $A$].</li>
  <li>Step 2: We pop $A$, but it is already visited, so we discard it and pop $H$. After pushing the neighbours of $H$ the stack is [$D$, $C$, $D$, $E$, $B$].</li>
  <li>Step 3: We pop $B$, but it is already visited, so we move on to the next node in the stack, $E$, mark it as visited and push its neighbours, giving [$D$, $C$, $D$, $F$, $B$]…</li>
</ul>

<p>The order of the visited nodes is [$A$, $B$, $H$, $E$, $F$, $C$, $D$, $G$]. Notice that we stop at step 7 even though there are still nodes in the stack: every remaining node in the stack has already been visited, so each one is popped and discarded, and we have finished traversing the graph.</p>

<p>Depth first search relies on a FILO (First In Last Out) structure, which is the same thing as the LIFO stack mentioned above. In Python this is simply a list: <code class="language-plaintext highlighter-rouge">append</code> pushes new elements and <code class="language-plaintext highlighter-rouge">pop</code> removes and returns the last element, which is the next node to visit.</p>

<p>This algorithm can be coded in Python as follows:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
</pre></td><td class="rouge-code"><pre><span class="kn">from</span> <span class="n">typing</span> <span class="kn">import</span> <span class="n">Dict</span><span class="p">,</span> <span class="n">List</span><span class="p">,</span> <span class="n">Optional</span>

<span class="n">GraphType</span> <span class="o">=</span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">]]</span>

<span class="k">def</span> <span class="nf">dfs_filo</span><span class="p">(</span><span class="n">graph</span><span class="p">:</span> <span class="n">GraphType</span><span class="p">,</span> <span class="n">start</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">end</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span> <span class="o">=</span> <span class="bp">None</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">bool</span><span class="p">:</span>
    <span class="n">stack</span> <span class="o">=</span> <span class="p">[</span><span class="n">start</span><span class="p">]</span>
    <span class="n">visited</span> <span class="o">=</span> <span class="nf">set</span><span class="p">()</span>

    <span class="k">while</span> <span class="n">stack</span><span class="p">:</span>
        <span class="n">node</span> <span class="o">=</span> <span class="n">stack</span><span class="p">.</span><span class="nf">pop</span><span class="p">()</span>
        <span class="k">if</span> <span class="n">node</span> <span class="ow">in</span> <span class="n">visited</span><span class="p">:</span>
            <span class="k">continue</span>

        <span class="n">visited</span><span class="p">.</span><span class="nf">add</span><span class="p">(</span><span class="n">node</span><span class="p">)</span>
        <span class="nf">print</span><span class="p">(</span><span class="n">node</span><span class="p">,</span> <span class="n">stack</span><span class="p">)</span>

        <span class="k">if</span> <span class="n">node</span> <span class="o">==</span> <span class="n">end</span><span class="p">:</span>
            <span class="k">return</span> <span class="bp">True</span>
        
        <span class="k">for</span> <span class="n">neighbor</span> <span class="ow">in</span> <span class="nf">reversed</span><span class="p">(</span><span class="n">graph</span><span class="p">.</span><span class="nf">get</span><span class="p">(</span><span class="n">node</span><span class="p">,</span> <span class="p">[])):</span>  
            <span class="k">if</span> <span class="n">neighbor</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">visited</span><span class="p">:</span>
                <span class="n">stack</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">neighbor</span><span class="p">)</span>

    <span class="k">return</span> <span class="bp">False</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="sh">"</span><span class="s">__main__</span><span class="sh">"</span><span class="p">:</span>
    <span class="n">graph</span> <span class="o">=</span> <span class="p">{</span>
        <span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">C</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">],</span>
        <span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">H</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">E</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">],</span>
        <span class="sh">"</span><span class="s">C</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
        <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
        <span class="sh">"</span><span class="s">E</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
        <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">C</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">E</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">G</span><span class="sh">"</span><span class="p">],</span>
        <span class="sh">"</span><span class="s">G</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
        <span class="sh">"</span><span class="s">H</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">],</span>
    <span class="p">}</span>

    <span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="se">\n</span><span class="s">Iterative DFS (FILO) starting from A:</span><span class="sh">"</span><span class="p">)</span>
    <span class="nf">dfs_filo</span><span class="p">(</span><span class="n">graph</span><span class="p">,</span> <span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">)</span>

    <span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="se">\n</span><span class="s">Iterative DFS (FILO) starting from A and ending at F:</span><span class="sh">"</span><span class="p">)</span>
    <span class="nf">dfs_filo</span><span class="p">(</span><span class="n">graph</span><span class="p">,</span> <span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="checking-if-a-graph-is-cyclic">Checking if a graph is cyclic</h2>

<p>With depth first search we can detect whether an undirected graph such as ours is cyclic. During the traversal, if the node we pop from the stack has already been visited, it was reached through two different edges, so we have found a loop; instead of continuing we can exit the function and report that the graph is cyclic.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
</pre></td><td class="rouge-code"><pre><span class="k">def</span> <span class="nf">is_cyclic</span><span class="p">(</span><span class="n">graph</span><span class="p">:</span> <span class="n">GraphType</span><span class="p">,</span> <span class="n">start</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">end</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span> <span class="o">=</span> <span class="bp">None</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">bool</span><span class="p">:</span>
    <span class="n">stack</span> <span class="o">=</span> <span class="p">[</span><span class="n">start</span><span class="p">]</span>
    <span class="n">visited</span> <span class="o">=</span> <span class="nf">set</span><span class="p">()</span>

    <span class="k">while</span> <span class="n">stack</span><span class="p">:</span>
        <span class="n">node</span> <span class="o">=</span> <span class="n">stack</span><span class="p">.</span><span class="nf">pop</span><span class="p">()</span>
        <span class="k">if</span> <span class="n">node</span> <span class="ow">in</span> <span class="n">visited</span><span class="p">:</span>
            <span class="k">return</span> <span class="bp">True</span>

        <span class="n">visited</span><span class="p">.</span><span class="nf">add</span><span class="p">(</span><span class="n">node</span><span class="p">)</span>
        
        <span class="k">for</span> <span class="n">neighbor</span> <span class="ow">in</span> <span class="nf">reversed</span><span class="p">(</span><span class="n">graph</span><span class="p">.</span><span class="nf">get</span><span class="p">(</span><span class="n">node</span><span class="p">,</span> <span class="p">[])):</span>  
            <span class="k">if</span> <span class="n">neighbor</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">visited</span><span class="p">:</span>
                <span class="n">stack</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">neighbor</span><span class="p">)</span>

    <span class="k">return</span> <span class="bp">False</span>
</pre></td></tr></tbody></table></code></pre></div></div>
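
<p>For comparison, here is a minimal sketch of the textbook recursive formulation of the same idea, assuming a simple undirected graph stored as an adjacency list. The name <code class="language-plaintext highlighter-rouge">has_cycle</code> and the small <code class="language-plaintext highlighter-rouge">tree</code> and <code class="language-plaintext highlighter-rouge">triangle</code> graphs are hypothetical and only serve as illustration. The function ignores the edge we just arrived through (<code class="language-plaintext highlighter-rouge">parent</code>) and reports a cycle as soon as it sees any other neighbor that is already visited.</p>

```python
from typing import Dict, List, Optional, Set

GraphType = Dict[str, List[str]]


def has_cycle(
    graph: GraphType,
    node: str,
    parent: Optional[str] = None,
    visited: Optional[Set[str]] = None,
) -> bool:
    """Recursive DFS cycle check for a simple undirected graph (illustrative sketch)."""
    if visited is None:
        visited = set()
    visited.add(node)

    for neighbor in graph.get(node, []):
        if neighbor == parent:
            # This is the edge we arrived through, not a loop.
            continue
        if neighbor in visited:
            # A visited neighbor other than the parent closes a loop.
            return True
        if has_cycle(graph, neighbor, node, visited):
            return True
    return False


# Hypothetical test graphs: a tree has no loops, a triangle has one.
tree = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}
triangle = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}

print(has_cycle(tree, "A"))      # False
print(has_cycle(triangle, "A"))  # True
```

<p>Both this sketch and the stack version above assume that every edge is listed in both directions and that no neighbor appears twice in a list; a directed graph needs a different check.</p>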

<p>Let’s check if the graph is cyclic:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nf">print</span><span class="p">(</span><span class="sa">f</span><span class="sh">"</span><span class="se">\n</span><span class="s">Is the graph cyclic? True/False</span><span class="sh">"</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="nf">is_cyclic</span><span class="p">(</span><span class="n">graph</span><span class="p">,</span> <span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">))</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>which it clearly is: for instance, we have the loop (A, B, D, A) or (A, C, F, D, B, A).</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Data Structures" /><category term="Graphs" /><category term="computer science" /><summary type="html"><![CDATA[In this post we will show how to transverse a graph using the depth first search algorithm. As usual we will use an example to explain it. Find the code of this post in my github repository blogging-code, in the subdirectory graphs-depth-first-search.]]></summary></entry><entry><title type="html">Graphs - Breadth First Search</title><link href="https://agramunt.me/posts/breadth-first-search/" rel="alternate" type="text/html" title="Graphs - Breadth First Search" /><published>2024-11-30T15:05:00-08:00</published><updated>2024-11-30T15:05:00-08:00</updated><id>https://agramunt.me/posts/breadth-first-search</id><content type="html" xml:base="https://agramunt.me/posts/breadth-first-search/"><![CDATA[<p>In this post we will review what breadth first traversal is and compare it to depth first traversal. As usual the code can be found in my GitHub repository <a href="https://github.com/SebastiaAgramunt/blogging-code">blogging-code</a> and for this post in the subdirectory <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/graphs-breadth-first-search">graphs-breadth-first-search</a>.</p>

<p>Consider the following graph:</p>

<pre><code class="language-mermaid">%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#d7d7d7', "lineColor": "#5972ff"}}}%%
graph TD
    A((A)) --- B((B))
    A --- C((C))
    B --- D((D))
    B --- E((E))
    C --- F((F))
    D --- A
    D --- F
    E --- F
    F --- G((G))
    H((H)) --- B

    %% Define styles
    classDef largeNodes fill:#d7d7d7,stroke:#000000 ,stroke-width:2px,font-size:24px;
    classDef visited fill:#8fff85,stroke:#000000 ,stroke-width:2px,font-size:24px;

    %% Apply base class to all nodes
    class A,B,C,D,E,F,G,H largeNodes;

    %% Step 1: Visit A
    class A visited;
</code></pre>
<p>Suppose we start from node $A$ (marked in green) and want to visit all the nodes.</p>

<h2 id="transversing-the-graph-using-breadth-first-search">Traversing the graph using breadth first search</h2>

<p>In depth first search we explore the node that was added to the list last (we do this with a stack, i.e. a first in last out, FILO, structure). In breadth first search (BFS) the nodes are explored differently: the first node that comes into the list is the first to be explored. We do this with a queue, i.e. a first in first out (FIFO) data structure.</p>
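
<p>The difference between the two disciplines can be seen in a few lines. This is only a small sketch: the variable names are hypothetical, and it uses a plain list as the stack and <code class="language-plaintext highlighter-rouge">collections.deque</code> as the queue.</p>

```python
from collections import deque

items = ["B", "C", "D"]  # nodes in the order they were added

stack = list(items)      # FILO: the last node added is explored first
queue = deque(items)     # FIFO: the first node added is explored first

stack_order = [stack.pop() for _ in range(len(items))]
queue_order = [queue.popleft() for _ in range(len(items))]

print(stack_order)  # ['D', 'C', 'B']
print(queue_order)  # ['B', 'C', 'D']
```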

<p>The graph can be represented in Python with</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
</pre></td><td class="rouge-code"><pre><span class="kn">from</span> <span class="n">typing</span> <span class="kn">import</span> <span class="n">Dict</span><span class="p">,</span> <span class="n">List</span><span class="p">,</span> <span class="n">Optional</span>
<span class="n">GraphType</span> <span class="o">=</span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">]]</span>

<span class="n">graph</span> <span class="o">=</span> <span class="p">{</span>
    <span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">C</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">H</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">E</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">C</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">E</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">C</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">E</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">G</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">G</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
    <span class="sh">"</span><span class="s">H</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">],</span>
<span class="p">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>
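
<p>The same adjacency list can also be derived from the edges drawn in the diagram. The following is a minimal sketch under that assumption; <code class="language-plaintext highlighter-rouge">edges</code> and <code class="language-plaintext highlighter-rouge">build_adjacency</code> are hypothetical names. Since the graph is undirected, every edge is stored in both directions.</p>

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# The ten edges of the diagram, written as (node, node) pairs.
edges: List[Tuple[str, str]] = [
    ("A", "B"), ("A", "C"), ("B", "D"), ("B", "E"), ("C", "F"),
    ("D", "A"), ("D", "F"), ("E", "F"), ("F", "G"), ("H", "B"),
]


def build_adjacency(edge_list: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Build an undirected adjacency list from a list of edges."""
    adjacency = defaultdict(list)
    for u, v in edge_list:
        adjacency[u].append(v)  # u can reach v
        adjacency[v].append(u)  # and v can reach u
    return dict(adjacency)


adjacency = build_adjacency(edges)
print(sorted(adjacency["B"]))  # ['A', 'D', 'E', 'H']
```

<p>The neighbors come out in the order the edges are listed, which may differ from the hand-written dictionary above. That changes the order in which nodes are explored, not which nodes are reachable.</p>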

<p>The BFS traversal of this graph is</p>

<table>
  <thead>
    <tr>
      <th><strong>Step</strong></th>
      <th><strong>Current Node</strong></th>
      <th><strong>Queue (before update)</strong></th>
      <th><strong>Visited Nodes</strong></th>
      <th><strong>Queue (after update)</strong></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>A</td>
      <td>[]</td>
      <td>[A]</td>
      <td>[B, C, D]</td>
    </tr>
    <tr>
      <td>1</td>
      <td>B</td>
      <td>[C, D]</td>
      <td>[A, B]</td>
      <td>[C, D, A, H, E, D]</td>
    </tr>
    <tr>
      <td>2</td>
      <td>C</td>
      <td>[D, A, H, E, D]</td>
      <td>[A, B, C]</td>
      <td>[D, A, H, E, D, A, F]</td>
    </tr>
    <tr>
      <td>3</td>
      <td>D</td>
      <td>[A, H, E, D, A, F]</td>
      <td>[A, B, C, D]</td>
      <td>[A, H, E, D, A, F, B, A, F]</td>
    </tr>
    <tr>
      <td>4</td>
      <td>H</td>
      <td>[E, D, A, F, B, A, F]</td>
      <td>[A, B, C, D, H]</td>
      <td>[E, D, A, F, B, A, F, B]</td>
    </tr>
    <tr>
      <td>5</td>
      <td>E</td>
      <td>[D, A, F, B, A, F, B]</td>
      <td>[A, B, C, D, H, E]</td>
      <td>[D, A, F, B, A, F, B, B, F]</td>
    </tr>
    <tr>
      <td>6</td>
      <td>F</td>
      <td>[B, A, F, B, B, F]</td>
      <td>[A, B, C, D, H, E, F]</td>
      <td>[B, A, F, B, B, F, C, D, E, G]</td>
    </tr>
    <tr>
      <td>7</td>
      <td>G</td>
      <td>[]</td>
      <td>[A, B, C, D, H, E, F, G]</td>
      <td>[F]</td>
    </tr>
  </tbody>
</table>

<p>Let’s analyze the steps:</p>

<ul>
  <li>Step 0: We start from $A$, mark it as visited and populate the queue with $B$, $C$ and $D$, $B$ being the first node added.</li>
  <li>Step 1: Pop the first element that was added to the queue, $B$, mark it as visited and add its neighbors $A$, $H$, $E$ and $D$.</li>
  <li>Step 2: Pop the next element in the queue, $C$, mark it as visited and add its neighbors $A$ and $F$.</li>
  <li>Step 3: Pop the next element in the queue, $D$, mark it as visited and add its neighbors $B$, $A$ and $F$.</li>
  <li>Step 4: The next element in the queue is $A$, but it is already visited, so we skip it and explore the following one, $H$. We add its only neighbor $B$ to the queue.</li>
</ul>
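
<p>The rows of the table can be reproduced with a short sketch. It assumes, as the table does, that every neighbor is enqueued, visited or not, and that visited nodes are skipped only when they are popped. The function name <code class="language-plaintext highlighter-rouge">bfs_trace</code> is hypothetical.</p>

```python
from collections import deque
from typing import Dict, List, Tuple

GraphType = Dict[str, List[str]]
Row = Tuple[str, List[str], List[str], List[str]]

graph: GraphType = {
    "A": ["B", "C", "D"],
    "B": ["A", "H", "E", "D"],
    "C": ["A", "F"],
    "D": ["B", "A", "F"],
    "E": ["B", "F"],
    "F": ["C", "D", "E", "G"],
    "G": ["F"],
    "H": ["B"],
}


def bfs_trace(graph: GraphType, start: str) -> List[Row]:
    """Return one row per step: (node, queue before, visited, queue after)."""
    queue = deque([start])
    visited: List[str] = []  # a list, to keep the visiting order
    rows: List[Row] = []

    while queue:
        node = queue.popleft()
        if node in visited:
            continue  # already explored, skip without counting a step
        before = list(queue)
        visited.append(node)
        queue.extend(graph.get(node, []))  # enqueue every neighbor
        rows.append((node, before, list(visited), list(queue)))
    return rows


for step, row in enumerate(bfs_trace(graph, "A")):
    print(step, *row)
```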

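<p>The algorithm can be coded as follows. Note that, unlike the table above, the code does not enqueue neighbors that are already visited, so the queue it prints is shorter than the one in the table; the order in which the nodes are visited is the same.</p>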

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
</pre></td><td class="rouge-code"><pre><span class="kn">from</span> <span class="n">typing</span> <span class="kn">import</span> <span class="n">Dict</span><span class="p">,</span> <span class="n">List</span><span class="p">,</span> <span class="n">Optional</span>
<span class="kn">from</span> <span class="n">collections</span> <span class="kn">import</span> <span class="n">deque</span>

<span class="n">GraphType</span> <span class="o">=</span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">]]</span>

<span class="k">def</span> <span class="nf">bfs_fifo</span><span class="p">(</span><span class="n">graph</span><span class="p">:</span> <span class="n">GraphType</span><span class="p">,</span> <span class="n">start</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">end</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span> <span class="o">=</span> <span class="bp">None</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">bool</span><span class="p">:</span>
    <span class="n">queue</span> <span class="o">=</span> <span class="nf">deque</span><span class="p">([</span><span class="n">start</span><span class="p">])</span>
    <span class="n">visited</span> <span class="o">=</span> <span class="nf">set</span><span class="p">()</span>

    <span class="k">while</span> <span class="n">queue</span><span class="p">:</span>
        <span class="n">node</span> <span class="o">=</span> <span class="n">queue</span><span class="p">.</span><span class="nf">popleft</span><span class="p">()</span>
        <span class="k">if</span> <span class="n">node</span> <span class="ow">in</span> <span class="n">visited</span><span class="p">:</span>
            <span class="k">continue</span>

        <span class="n">visited</span><span class="p">.</span><span class="nf">add</span><span class="p">(</span><span class="n">node</span><span class="p">)</span>
        <span class="nf">print</span><span class="p">(</span><span class="n">node</span><span class="p">,</span> <span class="n">queue</span><span class="p">)</span>

        <span class="k">if</span> <span class="n">node</span> <span class="o">==</span> <span class="n">end</span><span class="p">:</span>
            <span class="k">return</span> <span class="bp">True</span>
        
        <span class="k">for</span> <span class="n">neighbor</span> <span class="ow">in</span> <span class="n">graph</span><span class="p">.</span><span class="nf">get</span><span class="p">(</span><span class="n">node</span><span class="p">,</span> <span class="p">[]):</span>  
            <span class="k">if</span> <span class="n">neighbor</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">visited</span><span class="p">:</span>
                <span class="n">queue</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">neighbor</span><span class="p">)</span>

    <span class="k">return</span> <span class="bp">False</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="sh">"</span><span class="s">__main__</span><span class="sh">"</span><span class="p">:</span>
    <span class="n">graph</span> <span class="o">=</span> <span class="p">{</span>
        <span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">C</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">],</span>
        <span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">H</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">E</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">],</span>
        <span class="sh">"</span><span class="s">C</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
        <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
        <span class="sh">"</span><span class="s">E</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
        <span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">C</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">D</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">E</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">G</span><span class="sh">"</span><span class="p">],</span>
        <span class="sh">"</span><span class="s">G</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">F</span><span class="sh">"</span><span class="p">],</span>
        <span class="sh">"</span><span class="s">H</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span><span class="sh">"</span><span class="s">B</span><span class="sh">"</span><span class="p">],</span>
    <span class="p">}</span>

    <span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="se">\n</span><span class="s">Iterative BFS (FIFO) starting from A:</span><span class="sh">"</span><span class="p">)</span>
    <span class="nf">bfs_fifo</span><span class="p">(</span><span class="n">graph</span><span class="p">,</span> <span class="sh">"</span><span class="s">A</span><span class="sh">"</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">deque</code> is a convenient data structure in Python, see <a href="https://www.geeksforgeeks.org/deque-in-python/">this post</a>. It has the method <code class="language-plaintext highlighter-rouge">popleft</code> that pops the leftmost element in <code class="language-plaintext highlighter-rouge">O(1)</code>. We could use a plain list as well, but then removing the first element with <code class="language-plaintext highlighter-rouge">pop(0)</code> is <code class="language-plaintext highlighter-rouge">O(N)</code>. In another post we will show how to implement <code class="language-plaintext highlighter-rouge">stacks</code> and <code class="language-plaintext highlighter-rouge">queues</code> in C++; it’s a beautiful exercise.</p>

<h2 id="dfs-vs-bfs">DFS vs BFS</h2>

<p>In BFS the nodes closest to the initial node are explored first and the furthest ones last. That can help us find the shortest path between two nodes: starting from a node $X$ and traversing in BFS, the first time we find the node $Y$ we know that the <strong>path is the shortest</strong> (in number of edges). BFS is suitable in cases like social networks, web crawling or network packet routing, where what matters is exploring the nodes that are close to the initial node. For instance, in social networks we are interested in knowing the relationships between nearby nodes, hence we would use BFS.</p>
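
<p>To make the shortest path claim concrete, here is a minimal sketch that remembers, for each node, the node it was first reached from, and then walks those links backwards. The function <code class="language-plaintext highlighter-rouge">bfs_shortest_path</code> and the <code class="language-plaintext highlighter-rouge">parents</code> dictionary are hypothetical names, and the graph is assumed to be unweighted.</p>

```python
from collections import deque
from typing import Dict, List, Optional

GraphType = Dict[str, List[str]]

graph: GraphType = {
    "A": ["B", "C", "D"],
    "B": ["A", "H", "E", "D"],
    "C": ["A", "F"],
    "D": ["B", "A", "F"],
    "E": ["B", "F"],
    "F": ["C", "D", "E", "G"],
    "G": ["F"],
    "H": ["B"],
}


def bfs_shortest_path(graph: GraphType, start: str, end: str) -> Optional[List[str]]:
    """Return a path with the fewest edges from start to end, or None."""
    # parents[n] is the node from which n was first discovered.
    parents: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])

    while queue:
        node = queue.popleft()
        if node == end:
            # Walk the parent links back to the start.
            path = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = parents[current]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in parents:  # first discovery only
                parents[neighbor] = node
                queue.append(neighbor)
    return None


print(bfs_shortest_path(graph, "A", "G"))  # ['A', 'C', 'F', 'G']
```

<p>There are two paths of three edges from $A$ to $G$ (through $C$ or through $D$); the sketch returns the one through $C$ because $C$ is listed before $D$ among the neighbors of $A$.</p>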

<p>DFS is suitable when we want to move far away from the initial node first. For instance, when solving a maze one is not interested in exploring all the possible paths in the neighbourhood of the entrance, but rather in moving far from the original node and backtracking when a dead end is found. DFS will also find a path between two nodes if one exists, but it does not guarantee that the path is the shortest.</p>
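
<p>The following sketch illustrates that last point on the graph of this post. It is a variation of the iterative DFS where each stack entry carries the whole path walked so far; <code class="language-plaintext highlighter-rouge">dfs_path</code> is a hypothetical name, and the neighbors are pushed in reversed order, as in the depth first search post.</p>

```python
from typing import Dict, List, Optional

GraphType = Dict[str, List[str]]

graph: GraphType = {
    "A": ["B", "C", "D"],
    "B": ["A", "H", "E", "D"],
    "C": ["A", "F"],
    "D": ["B", "A", "F"],
    "E": ["B", "F"],
    "F": ["C", "D", "E", "G"],
    "G": ["F"],
    "H": ["B"],
}


def dfs_path(graph: GraphType, start: str, end: str) -> Optional[List[str]]:
    """Return some path from start to end found by DFS, or None."""
    stack = [[start]]  # each entry is the path that leads to its last node
    visited = set()

    while stack:
        path = stack.pop()
        node = path[-1]
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            return path
        for neighbor in reversed(graph.get(node, [])):
            if neighbor not in visited:
                stack.append(path + [neighbor])
    return None


print(dfs_path(graph, "A", "G"))  # ['A', 'B', 'E', 'F', 'G']
```

<p>The path found has four edges, while the shortest path between $A$ and $G$ has only three (for example $A$, $C$, $F$, $G$).</p>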

<p>The following table compares DFS and BFS.</p>

<table>
  <thead>
    <tr>
      <th><strong>Feature</strong></th>
      <th><strong>BFS (Breadth-First Search)</strong></th>
      <th><strong>DFS (Depth-First Search)</strong></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Exploration Order</strong></td>
      <td>Explores <strong>level by level</strong> (all neighbors first)</td>
      <td>Explores <strong>deep paths first</strong>, then backtracks</td>
    </tr>
    <tr>
      <td><strong>Data Structure Used</strong></td>
      <td><strong>Queue (FIFO)</strong></td>
      <td><strong>Stack (LIFO)</strong> (or recursion)</td>
    </tr>
    <tr>
      <td><strong>Best for</strong></td>
      <td><strong>Finding the shortest path</strong> in an unweighted graph</td>
      <td><strong>Deep exploration</strong> (finding connected components, topological sorting)</td>
    </tr>
    <tr>
      <td><strong>Shortest Path Guarantee</strong></td>
      <td>✅ <strong>Yes</strong> (first time a node is reached, it’s via the shortest path)</td>
      <td>❌ <strong>No</strong> (may take a longer path before finding the shortest one)</td>
    </tr>
    <tr>
      <td><strong>Memory Usage</strong></td>
      <td><strong>O(V)</strong> (stores all nodes at a level before proceeding)</td>
      <td><strong>O(V) in worst case</strong>, but can be <strong>O(depth)</strong> with recursion</td>
    </tr>
    <tr>
      <td><strong>Use Case Examples</strong></td>
      <td>- Shortest path (e.g., mazes, road networks)  <br /> - Social networks <br /> - Web crawlers <br /> - Network packet routing</td>
      <td>- Pathfinding in <strong>mazes</strong> <br /> - <strong>Topological sorting</strong> <br /> - <strong>Cycle detection</strong> <br /> - <strong>Backtracking problems</strong> (e.g., Sudoku, N-Queens)</td>
    </tr>
    <tr>
      <td><strong>Handling Large Graphs</strong></td>
      <td><strong>Requires more memory</strong> (holds many nodes in the queue)</td>
      <td><strong>Uses less memory</strong> (only needs to track one deep path)</td>
    </tr>
    <tr>
      <td><strong>Handles Infinite Graphs?</strong></td>
      <td>✅ <strong>Yes</strong>, because it explores in levels</td>
      <td>❌ <strong>No</strong>, because it may follow an infinitely deep path and never backtrack</td>
    </tr>
    <tr>
      <td><strong>Can detect cycles?</strong></td>
      <td>✅ Yes</td>
      <td>✅ Yes</td>
    </tr>
  </tbody>
</table>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Data Structures" /><category term="Graphs" /><category term="computer science" /><summary type="html"><![CDATA[In this post we will review what is breadth first transversal and compare it to depth first transversal. As usual the code can be found in my github repository blogging-code and for this post in the subdirectory graphs-breadth-first-search.]]></summary></entry><entry><title type="html">Introduction to Graphs</title><link href="https://agramunt.me/posts/introduction-to-graphs/" rel="alternate" type="text/html" title="Introduction to Graphs" /><published>2024-11-30T14:05:00-08:00</published><updated>2024-11-30T14:05:00-08:00</updated><id>https://agramunt.me/posts/introduction-to-graphs</id><content type="html" xml:base="https://agramunt.me/posts/introduction-to-graphs/"><![CDATA[<p>A graph is a structure used in graph theory, consisting of <strong>vertices</strong> (or <strong>nodes</strong>) and <strong>edges</strong> (or <strong>links</strong>) that connect pairs of vertices. Graphs are used to model relationships, networks, and structures in various fields like computer science, social networks, logistics, and physics.</p>

<p>A graph can be represented easily. Let’s see an example:</p>

<pre><code class="language-mermaid">%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#d7d7d7', "lineColor": "#5972ff"}}}%%
graph TD
    A((A)) --- B((B))
    A --- C((C))
    B --- D((D))
    B --- E((E))
    C --- F((F))
    D --- A
    D --- F
    E --- F
    F --- G((G))
    H((H)) --- B

    %% Define a global class for larger circular nodes
    classDef largeNodes fill:#d7d7d7,stroke:#000000 ,stroke-width:2px,font-size:24px;

    %% Apply the class to all nodes
    class A,B,C,D,E,F,G,H largeNodes;
</code></pre>

<p>The circular objects A, B, C, D, E, F, G and H are the <strong>nodes</strong> of the graph and the lines that connect them are the <strong>edges</strong>. In this example the node B is connected to A, H, E and D. What is the meaning of these edges and nodes? Why are they connected? That depends on the application, and the applications of graphs are many:</p>

<ul>
  <li><strong>Computer networks</strong>: Nodes are routers and links are connections. In this application we may be interested in routing a signal through the minimum number of nodes so that it arrives faster.</li>
  <li><strong>Social networks</strong>: Can be modelled with graphs, nodes are people and edges are relationships.</li>
  <li><strong>Shortest path problems in space</strong>: going from point A to point B in a map with obstacles. For instance, nodes can be cities and edges roads. A typical problem is the traveling salesman problem: given $V$ cities (nodes) connected by $E$ roads (edges), the salesman has to visit all of them in the minimum possible time.</li>
  <li><strong>Biology</strong>: Neural networks, gene networks and gene interactions.</li>
  <li><strong>AI and Machine Learning</strong>: Knowledge graphs (links of concepts) and the more modern graph neural networks.</li>
</ul>

<h2 id="types-of-graphs">Types of graphs</h2>

<p>In the following subsections we go through the different kinds of graphs.</p>

<h3 id="directed-vs-undirected-graphs">Directed vs undirected graphs</h3>

<p>An undirected graph is a graph whose edges have no direction. The graph above is undirected: we can go from A to B and from B to A equally. The following graph is a directed one:</p>

<pre><code class="language-mermaid">%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#d7d7d7', "lineColor": "#5972ff"}}}%%
graph TD
    A((A)) --&gt; B((B))
    A --&gt; C((C))
    B --&gt; D((D))
    B --&gt; E((E))
    C --&gt; F((F))
    F --&gt; G((G))

    %% Define a global class for larger circular nodes
    classDef largeNodes fill:#d7d7d7,stroke:#000000 ,stroke-width:2px,font-size:24px;

    %% Apply the class to all nodes
    class A,B,C,D,E,F,G largeNodes;
</code></pre>

<p>In this directed graph we can go from A to C and from A to B, but not from either C or B to A.</p>
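<p>The one-way nature of directed edges can be checked with a few lines of Python. This is only a minimal sketch: the dictionary <code class="language-plaintext highlighter-rouge">directed</code> and the helper <code class="language-plaintext highlighter-rouge">reachable</code> are names I made up for illustration, and the graph is the one drawn above.</p>

```python
def reachable(graph, source, target):
    """Return True if target can be reached from source following edge directions."""
    stack = [source]
    seen = {source}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for successor in graph[node]:
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return False


# The directed graph drawn above: each node maps to the nodes its arrows point to
directed = {
    "A": ["B", "C"], "B": ["D", "E"], "C": ["F"],
    "D": [], "E": [], "F": ["G"], "G": [],
}

print(reachable(directed, "A", "C"))  # follows the arrow A to C
print(reachable(directed, "C", "A"))  # there is no arrow leading back to A
```

<p>Storing only the outgoing neighbors of each node is what makes the graph directed: the arrow from A to C appears under A, but nothing appears under C that leads back.</p>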

<h3 id="weighted-vs-unweighted-graphs">Weighted vs unweighted graphs</h3>

<p>In an unweighted graph all edges are treated equally. In the first graph of this post, going from A to C is the same as going from A to B: each counts as one step. In a weighted graph, by contrast, every edge has an associated cost. The following is a weighted graph:</p>

<pre><code class="language-mermaid">%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#d7d7d7', "lineColor": "#5972ff"}}}%%
graph TD
    A((A)) ---|5| B((B))
    A ---|3| C((C))
    B ---|4| D((D))
    B ---|2| E((E))
    C ---|6| F((F))
    F ---|7| G((G))

    %% Define a global class for larger circular nodes
    classDef largeNodes fill:#d7d7d7,stroke:#000000 ,stroke-width:2px,font-size:24px;

    %% Apply the class to all nodes
    class A,B,C,D,E,F,G largeNodes;

    %% Style for larger edge labels
    linkStyle 0,1,2,3,4,5 stroke-width:2px,font-size:18px;
</code></pre>

<p>In this example the cost of going from A to B (or from B to A) is 5, from A to C it is 3, and so on. A weighted graph is a natural model for cities and roads, where the weight could be, for example, the length of each road: going from A to C is no longer the same as going from A to B.</p>

<h3 id="connected-vs-disconnected">Connected vs disconnected</h3>

<p>A connected graph has a path between every pair of nodes. In the analogy of cities and roads, this means we can reach any city from any other city. All the graphs shown so far are connected (for the directed one, once we ignore the direction of the edges). An example of a disconnected graph is the following:</p>

<pre><code class="language-mermaid">%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#d7d7d7', "lineColor": "#5972ff"}}}%%
graph TD
    A((A)) --- B((B))

    C((C)) --- D((D))
    D --- E((E))
    E --- F((F))
    F --- G((G))

    %% Define a global class for larger circular nodes
    classDef largeNodes fill:#d7d7d7,stroke:#000000 ,stroke-width:2px,font-size:24px;

    %% Apply the class to all nodes
    class A,B,C,D,E,F,G largeNodes;

    %% Style for larger edge labels
    linkStyle 0,1,2,3,4 stroke-width:2px,font-size:18px;
</code></pre>

<p>For example, we can’t reach A from C. A disconnected graph splits into two or more separate pieces, known as connected components.</p>
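<p>Connectivity can be tested by starting at any node and checking whether every other node gets visited. The sketch below does that with a breadth first exploration; the function name <code class="language-plaintext highlighter-rouge">is_connected</code> and the dictionary layout are assumptions of mine, not code from the repository.</p>

```python
from collections import deque


def is_connected(adjacency):
    """Return True if every node can be reached from an arbitrary start node."""
    if not adjacency:
        return True
    start = next(iter(adjacency))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(adjacency)


# The disconnected graph drawn above, as an undirected adjacency list
disconnected = {
    "A": ["B"], "B": ["A"],
    "C": ["D"], "D": ["C", "E"], "E": ["D", "F"], "F": ["E", "G"], "G": ["F"],
}

print(is_connected(disconnected))
```

<p>Starting from A the exploration only ever sees A and B, so the number of visited nodes is smaller than the number of nodes and the graph is reported as disconnected.</p>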

<h3 id="cyclic-vs-acyclic">Cyclic vs acyclic</h3>

<p>A cyclic graph contains at least one cycle, that is, a path that starts and ends at the same node without using any edge twice. An example is the first graph:</p>

<pre><code class="language-mermaid">%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#d7d7d7', "lineColor": "#5972ff"}}}%%
graph TD
    A((A)) --- B((B))
    A --- C((C))
    B --- D((D))
    B --- E((E))
    C --- F((F))
    D --- A
    D --- F
    E --- F
    F --- G((G))
    H((H)) --- B

    %% Define a global class for larger circular nodes
    classDef largeNodes fill:#d7d7d7,stroke:#000000 ,stroke-width:2px,font-size:24px;

    %% Apply the class to all nodes
    class A,B,C,D,E,F,G,H largeNodes;
</code></pre>

<p>This graph contains several cycles, for example:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre>A → B → D → A
A → C → F → D → A
B → D → F → E → B
A → B → E → F → D → A
A → B → D → F → C → A
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The following example is a special graph: a connected acyclic graph, in which any two vertices are connected by exactly one path (which implies there are $V-1$ edges). This is known as a <strong>tree</strong>, the data structure behind decision trees in data science (and hence the building block of a random forest). Here is one example:</p>

<pre><code class="language-mermaid">%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#d7d7d7', "lineColor": "#5972ff"}}}%%
graph TD
    A((A)) --- B((B))
    A --- C((C))
    B --- D((D))
    B --- E((E))
    C --- F((F))
    F --- G((G))

    %% Define a global class for larger circular nodes
    classDef largeNodes fill:#d7d7d7,stroke:#000000 ,stroke-width:2px,font-size:24px;

    %% Apply the class to all nodes
    class A,B,C,D,E,F,G largeNodes;
</code></pre>

<p>Take any node of this graph and try to walk in a closed loop back to the same node without reusing an edge. You won’t be able to close any loop, which means the graph is acyclic. To formally check whether a graph is acyclic we will use an algorithm called depth first search.</p>
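<p>As a preview, here is a minimal sketch of how a depth first search can detect a cycle in an undirected graph. It assumes a simple graph (no repeated edges between the same pair of nodes) stored as a dictionary of neighbor lists; the function name <code class="language-plaintext highlighter-rouge">has_cycle</code> is mine and this is not necessarily the implementation used in later posts.</p>

```python
def has_cycle(adjacency):
    """Detect a cycle in an undirected simple graph with an iterative depth first search."""
    visited = set()
    for root in adjacency:
        if root in visited:
            continue
        visited.add(root)
        stack = [(root, None)]  # (node, node we arrived from)
        while stack:
            node, parent = stack.pop()
            for neighbor in adjacency[node]:
                if neighbor == parent:
                    continue  # do not walk straight back along the same edge
                if neighbor in visited:
                    return True  # reached an already visited node by another route
                visited.add(neighbor)
                stack.append((neighbor, node))
    return False


tree = {
    "A": ["B", "C"], "B": ["A", "D", "E"], "C": ["A", "F"],
    "D": ["B"], "E": ["B"], "F": ["C", "G"], "G": ["F"],
}
first_graph = {
    "A": ["B", "C", "D"], "B": ["A", "D", "E", "H"], "C": ["A", "F"],
    "D": ["B", "A", "F"], "E": ["B", "F"], "F": ["C", "D", "E", "G"],
    "G": ["F"], "H": ["B"],
}

print(has_cycle(tree))
print(has_cycle(first_graph))
```

<p>The search remembers where it came from so that stepping back along the edge it just used does not count as a loop. Any other visited neighbor means there are two different routes to the same node, which is exactly a cycle.</p>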

<h2 id="graph-representations">Graph representations</h2>

<p>There are several ways to represent a graph in computer science and mathematics. The choice of representation depends on the type of graph (directed, undirected, weighted, unweighted) and the operations to be performed (e.g., searching, pathfinding). In this section we present the most common graph representations:</p>

<ul>
  <li><strong>Adjacency matrix</strong>: A $V\times V$ matrix in which the element $(i, j)$ holds the cost of going from node $i$ to node $j$ (or simply 1 if the graph is unweighted).</li>
  <li><strong>Adjacency list</strong>: For each node, a list of the nodes it is connected to. If the graph is weighted, each entry also carries a numerical value indicating the cost.</li>
  <li><strong>Edge list</strong>: A list of the edges in the graph, usually stored as pairs (V1, V2), or as triples (V1, V2, weight) for a weighted graph.</li>
</ul>

<p>In the following subsections we show each representation for different kinds of graphs.</p>

<h3 id="undirected-unweighted-graph">Undirected unweighted graph</h3>

<p>An example of an undirected unweighted graph:</p>

<pre><code class="language-mermaid">%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#d7d7d7', "lineColor": "#5972ff"}}}%%
graph TD
    A((A)) --- B((B))
    A --- C((C))
    B --- D((D))
    B --- E((E))
    C --- F((F))
    D --- A
    D --- F
    E --- F
    F --- G((G))
    H((H)) --- B

    %% Define a global class for larger circular nodes
    classDef largeNodes fill:#d7d7d7,stroke:#000000 ,stroke-width:2px,font-size:24px;

    %% Apply the class to all nodes
    class A,B,C,D,E,F,G,H largeNodes;
</code></pre>

<h4 id="adjacency-matrix">Adjacency matrix</h4>

<p>The corresponding adjacency matrix is:</p>

<table>
  <thead>
    <tr>
      <th> </th>
      <th>A</th>
      <th>B</th>
      <th>C</th>
      <th>D</th>
      <th>E</th>
      <th>F</th>
      <th>G</th>
      <th>H</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>A</td>
      <td>0</td>
      <td>1</td>
      <td>1</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>B</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
      <td>1</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
      <td>1</td>
    </tr>
    <tr>
      <td>C</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>D</td>
      <td>1</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>E</td>
      <td>0</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>F</td>
      <td>0</td>
      <td>0</td>
      <td>1</td>
      <td>1</td>
      <td>1</td>
      <td>0</td>
      <td>1</td>
      <td>0</td>
    </tr>
    <tr>
      <td>G</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>H</td>
      <td>0</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
    </tr>
  </tbody>
</table>

<p>Since there are no weights, connected pairs of nodes are represented with 1 and disconnected pairs with 0. For undirected graphs the adjacency matrix is symmetric: for every pair of connected nodes one can traverse the edge in either direction (A to B and B to A are equivalent).</p>
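<p>To make the table concrete, the following sketch builds the same matrix from the edges of the graph and verifies the symmetry. Plain nested lists are used here as an assumption for simplicity; a numerical library would work equally well.</p>

```python
nodes = ["A", "B", "C", "D", "E", "F", "G", "H"]
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("B", "E"), ("C", "F"),
         ("D", "A"), ("D", "F"), ("E", "F"), ("F", "G"), ("H", "B")]

# Map each node name to its row/column position
index = {name: i for i, name in enumerate(nodes)}

matrix = [[0] * len(nodes) for _ in nodes]
for u, v in edges:
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1  # undirected: mirror the entry

is_symmetric = all(
    matrix[i][j] == matrix[j][i]
    for i in range(len(nodes))
    for j in range(len(nodes))
)

print(matrix[index["A"]])  # the row of node A in the table
print(is_symmetric)
```

<p>Each undirected edge writes two entries, one on each side of the diagonal, which is why the matrix ends up symmetric.</p>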

<h4 id="adjacency-list">Adjacency list</h4>

<p>The adjacency list representation is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre>A: [B, C, D]
B: [A, D, E, H]
C: [A, F]
D: [B, A, F]
E: [B, F]
F: [C, D, E, G]
G: [F]
H: [B]
</pre></td></tr></tbody></table></code></pre></div></div>

<h4 id="edge-list">Edge list</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
</pre></td><td class="rouge-code"><pre>(A, B)
(A, C)
(B, D)
(B, E)
(C, F)
(D, A)
(D, F)
(E, F)
(F, G)
(H, B)
</pre></td></tr></tbody></table></code></pre></div></div>
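<p>The representations carry the same information, so one can be derived from another. As a small sketch, here is how the edge list above could be turned into the adjacency list of the previous subsection (the variable names are illustrative):</p>

```python
from collections import defaultdict

edges = [("A", "B"), ("A", "C"), ("B", "D"), ("B", "E"), ("C", "F"),
         ("D", "A"), ("D", "F"), ("E", "F"), ("F", "G"), ("H", "B")]

adjacency = defaultdict(list)
for u, v in edges:
    adjacency[u].append(v)  # v is a neighbor of u
    adjacency[v].append(u)  # and, since the graph is undirected, u is a neighbor of v

for node in sorted(adjacency):
    print(node, sorted(adjacency[node]))
```

<p>The neighbors are sorted only for printing; the order inside each list has no meaning for the graph itself.</p>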

<h3 id="undirected-weighted-graph">Undirected weighted graph</h3>

<p>Example graph:</p>

<pre><code class="language-mermaid">%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#d7d7d7', "lineColor": "#5972ff"}}}%%
graph TD
    A((A)) ---|4| B((B))
    A ---|2| C((C))
    B ---|3| D((D))
    C ---|6| D((D))
    D ---|5| E((E))

    %% Define a global class for larger circular nodes
    classDef largeNodes fill:#d7d7d7,stroke:#000000 ,stroke-width:2px,font-size:24px;

    %% Apply the class to all nodes
    class A,B,C,D,E largeNodes;

    %% Style for larger edge labels
    linkStyle 0,1,2,3,4 stroke-width:2px,font-size:18px;
</code></pre>

<h4 id="adjacency-matrix-1">Adjacency matrix</h4>

<p>The adjacency matrix is:</p>

<table>
  <thead>
    <tr>
      <th> </th>
      <th>A</th>
      <th>B</th>
      <th>C</th>
      <th>D</th>
      <th>E</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>A</td>
      <td>0</td>
      <td>4</td>
      <td>2</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>B</td>
      <td>4</td>
      <td>0</td>
      <td>0</td>
      <td>3</td>
      <td>0</td>
    </tr>
    <tr>
      <td>C</td>
      <td>2</td>
      <td>0</td>
      <td>0</td>
      <td>6</td>
      <td>0</td>
    </tr>
    <tr>
      <td>D</td>
      <td>0</td>
      <td>3</td>
      <td>6</td>
      <td>0</td>
      <td>5</td>
    </tr>
    <tr>
      <td>E</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>5</td>
      <td>0</td>
    </tr>
  </tbody>
</table>

<p>The matrix is still symmetric because the graph is undirected, but now the values are the cost of going from one node to the other. Another observation is that for undirected graphs we only need to store about $V^2/2$ values of the matrix: one triangle, which has $V(V-1)/2$ entries when there are no self-loops.</p>
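<p>The saving for undirected graphs can be sketched by packing only the upper triangle of the matrix into a flat list. This assumes there are no self-loops, so the diagonal is skipped and $V(V-1)/2$ slots are enough; the helpers <code class="language-plaintext highlighter-rouge">slot</code>, <code class="language-plaintext highlighter-rouge">set_weight</code> and <code class="language-plaintext highlighter-rouge">get_weight</code> are hypothetical names for this example.</p>

```python
nodes = ["A", "B", "C", "D", "E"]
index = {name: i for i, name in enumerate(nodes)}
n = len(nodes)

# One slot per unordered pair (i, j) with i != j: n * (n - 1) / 2 slots in total
packed = [0] * (n * (n - 1) // 2)


def slot(i, j):
    """Position of the pair (i, j), i != j, in the upper triangle stored row by row."""
    if i > j:
        i, j = j, i
    return i * n - i * (i + 1) // 2 + (j - i - 1)


def set_weight(u, v, weight):
    packed[slot(index[u], index[v])] = weight


def get_weight(u, v):
    if u == v:
        return 0  # no self-loops in this example
    return packed[slot(index[u], index[v])]


for u, v, w in [("A", "B", 4), ("A", "C", 2), ("B", "D", 3), ("C", "D", 6), ("D", "E", 5)]:
    set_weight(u, v, w)

print(len(packed))  # 10 slots instead of 25 matrix cells
print(get_weight("C", "D"), get_weight("D", "C"))  # same slot in both directions
```

<p>As in the table, a stored 0 means there is no edge, so this layout assumes that real weights are never 0.</p>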

<h4 id="adjacency-list-1">Adjacency list</h4>

<p>The adjacency list for this undirected weighted graph is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre>A: [(B, 4), (C, 2)]
B: [(A, 4), (D, 3)]
C: [(A, 2), (D, 6)]
D: [(B, 3), (C, 6), (E, 5)]
E: [(D, 5)]
</pre></td></tr></tbody></table></code></pre></div></div>
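<p>In Python a weighted adjacency list maps naturally to a dictionary of dictionaries. The sketch below stores the list above that way and adds the weights along a path; <code class="language-plaintext highlighter-rouge">path_cost</code> is a helper name chosen for this example.</p>

```python
graph = {
    "A": {"B": 4, "C": 2},
    "B": {"A": 4, "D": 3},
    "C": {"A": 2, "D": 6},
    "D": {"B": 3, "C": 6, "E": 5},
    "E": {"D": 5},
}


def path_cost(path):
    """Sum the edge weights along a path; raises KeyError if an edge is missing."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))


print(path_cost(["A", "B", "D", "E"]))  # 4 + 3 + 5
print(path_cost(["A", "C", "D", "E"]))  # 2 + 6 + 5
```

<p>Both paths go from A to E, but the one through B is cheaper. This is the kind of question shortest path algorithms answer on weighted graphs.</p>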

<h4 id="edge-list-1">Edge list</h4>

<p>The edge list for the graph is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre>(A, B, 4)
(A, C, 2)
(B, D, 3)
(C, D, 6)
(D, E, 5)
</pre></td></tr></tbody></table></code></pre></div></div>

<h3 id="directed-weighed-graph">Directed weighted graph</h3>

<p>Consider this directed weighted graph:</p>
<pre><code class="language-mermaid">%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#d7d7d7', "lineColor": "#5972ff"}}}%%
graph TD
    A((A)) --&gt;|4| B((B))
    A --&gt;|2| C((C))
    B --&gt;|3| D((D))
    C --&gt;|6| D((D))
    D --&gt;|5| E((E))

    %% Define a global class for larger circular nodes
    classDef largeNodes fill:#d7d7d7,stroke:#000000 ,stroke-width:2px,font-size:24px;

    %% Apply the class to all nodes
    class A,B,C,D,E largeNodes;

    %% Style for larger edge labels
    linkStyle 0,1,2,3,4 stroke-width:2px,font-size:18px;
</code></pre>

<h4 id="adjacency-matrix-2">Adjacency matrix</h4>

<p>It has the following adjacency matrix:</p>

<table>
  <thead>
    <tr>
      <th> </th>
      <th>A</th>
      <th>B</th>
      <th>C</th>
      <th>D</th>
      <th>E</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>A</td>
      <td>0</td>
      <td>4</td>
      <td>2</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>B</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>3</td>
      <td>0</td>
    </tr>
    <tr>
      <td>C</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>6</td>
      <td>0</td>
    </tr>
    <tr>
      <td>D</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>5</td>
    </tr>
    <tr>
      <td>E</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
    </tr>
  </tbody>
</table>

<p>Note that the matrix is not symmetric. In general, for a directed graph we need to store all $V^2$ values, one per ordered pair of nodes.</p>

<h4 id="adjacency-list-2">Adjacency list</h4>

<p>The adjacency list stores, for each node, only its outgoing edges:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre>A: [(B, 4), (C, 2)]
B: [(D, 3)]
C: [(D, 6)]
D: [(E, 5)]
E: []
</pre></td></tr></tbody></table></code></pre></div></div>
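<p>Because this list only records outgoing edges, counting incoming edges requires a pass over the whole structure. A minimal sketch, with the list above stored as a dictionary of (target, weight) tuples (an assumed layout):</p>

```python
graph = {
    "A": [("B", 4), ("C", 2)],
    "B": [("D", 3)],
    "C": [("D", 6)],
    "D": [("E", 5)],
    "E": [],
}

# Out-degree: just the length of each node's own list
out_degree = {node: len(targets) for node, targets in graph.items()}

# In-degree: walk every list and count how often each node is a target
in_degree = {node: 0 for node in graph}
for targets in graph.values():
    for target, _weight in targets:
        in_degree[target] += 1

print(out_degree)
print(in_degree)
```

<p>Node A has no incoming edges and node E has no outgoing ones, which matches the empty column for A and the empty row for E in the adjacency matrix.</p>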

<h4 id="edge-list-2">Edge list</h4>

<p>And finally the edge list, where each entry reads (source, target, weight):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre>(A, B, 4)
(A, C, 2)
(B, D, 3)
(C, D, 6)
(D, E, 5)
</pre></td></tr></tbody></table></code></pre></div></div>

<h3 id="tree">Tree</h3>

<p>As mentioned before, a tree is a connected acyclic graph with $V-1$ edges. Here is the example shown before:</p>

<pre><code class="language-mermaid">%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#d7d7d7', "lineColor": "#5972ff"}}}%%
graph TD
    A((A)) --- B((B))
    A --- C((C))
    B --- D((D))
    B --- E((E))
    C --- F((F))
    F --- G((G))

    %% Define a global class for larger circular nodes
    classDef largeNodes fill:#d7d7d7,stroke:#000000 ,stroke-width:2px,font-size:24px;

    %% Apply the class to all nodes
    class A,B,C,D,E,F,G largeNodes;
</code></pre>

<h4 id="adjacency-matrix-3">Adjacency matrix</h4>

<table>
  <thead>
    <tr>
      <th> </th>
      <th>A</th>
      <th>B</th>
      <th>C</th>
      <th>D</th>
      <th>E</th>
      <th>F</th>
      <th>G</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>A</td>
      <td>0</td>
      <td>1</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>B</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
      <td>1</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>C</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>1</td>
      <td>0</td>
    </tr>
    <tr>
      <td>D</td>
      <td>0</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>E</td>
      <td>0</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>F</td>
      <td>0</td>
      <td>0</td>
      <td>1</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>1</td>
    </tr>
    <tr>
      <td>G</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
      <td>1</td>
      <td>0</td>
    </tr>
  </tbody>
</table>

<h4 id="adjacency-list-3">Adjacency list</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre>A: [B, C]
B: [A, D, E]
C: [A, F]
D: [B]
E: [B]
F: [C, G]
G: [F]
</pre></td></tr></tbody></table></code></pre></div></div>

<h4 id="edge-list-3">Edge list</h4>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre>(A, B)
(A, C)
(B, D)
(B, E)
(C, F)
(F, G)
</pre></td></tr></tbody></table></code></pre></div></div>
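<p>The definition of a tree translates into a short check: an undirected graph is a tree when it has exactly $V-1$ edges and is connected. The sketch below applies it to the edge list above; <code class="language-plaintext highlighter-rouge">is_tree</code> is a name I chose for illustration.</p>

```python
def is_tree(nodes, edges):
    """An undirected graph is a tree iff it is connected and has exactly V - 1 edges."""
    if len(edges) != len(nodes) - 1:
        return False
    adjacency = {node: [] for node in nodes}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    # Depth first search from an arbitrary node to count the reachable nodes
    start = nodes[0]
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(nodes)


tree_nodes = ["A", "B", "C", "D", "E", "F", "G"]
tree_edges = [("A", "B"), ("A", "C"), ("B", "D"), ("B", "E"), ("C", "F"), ("F", "G")]

print(is_tree(tree_nodes, tree_edges))
```

<p>Both conditions are needed: a triangle plus an isolated node also has $V-1$ edges, but it is neither connected nor acyclic.</p>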

<h3 id="complexities-of-different-graph-representations">Complexities of different graph representations</h3>

<p>The following table compares the graph representations shown in this post:</p>

<table>
  <thead>
    <tr>
      <th>Representation</th>
      <th>Space Complexity</th>
      <th>Edge Existence Check</th>
      <th>Neighbor Traversal</th>
      <th>Best for</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Adjacency Matrix</strong></td>
      <td>$O(V^2)$</td>
      <td>$O(1)$</td>
      <td>$O(V)$</td>
      <td>Dense graphs, fast edge lookup</td>
    </tr>
    <tr>
      <td><strong>Adjacency List</strong></td>
      <td>$O(V + E)$</td>
      <td>$O(\text{degree}(v))$</td>
      <td>$O(\text{degree}(v))$</td>
      <td>Sparse graphs, efficient traversal</td>
    </tr>
    <tr>
      <td><strong>Edge List</strong></td>
      <td>$O(E)$</td>
      <td>$O(E)$</td>
      <td>$O(E)$</td>
      <td>Small graphs, simple storage</td>
    </tr>
  </tbody>
</table>

<p>Let’s define each column:</p>

<ul>
  <li><strong>Space Complexity</strong>: The number of elements needed to describe the graph. This determines how much memory the representation uses.</li>
  <li><strong>Edge Existence Check</strong>: The time complexity of checking whether two nodes in the graph are connected by an edge.</li>
  <li><strong>Neighbor Traversal</strong>: The time complexity of finding all the nodes connected to a given node.</li>
</ul>

<p>Let’s go through the representations one by one and reason why they have these complexities. For the adjacency matrix the space complexity is the number of vertices squared, since it is a $V \times V$ matrix. The edge existence check is $O(1)$ because we just need to read one element of the matrix, e.g. check whether $a_{i,j} \neq 0$ (where $a_{i,j}$ is the matrix element for nodes $i$ and $j$). This is a single lookup if the matrix is stored as an array indexed by $i$ and $j$. The neighbor traversal is $O(V)$: for a given node we need to scan its whole row (and, for a directed graph, its column as well if we also want the incoming edges), which is linear in the number of nodes.</p>

<p>In the adjacency list the space complexity is $O(V+E)$. It is linear in the number of nodes because each vertex is stored as a key, and linear in the number of edges because each edge is stored as an entry in a list (twice for an undirected graph, once per endpoint). The edge existence check is $O(\text{degree}(v))$, that is, of the order of the number of neighbors of the node $v$: as opposed to the adjacency matrix, we need to scan the list of $v$. The same holds for neighbor traversal.</p>

<p>The edge list has space complexity linear in the number of edges, because we store a list of all the edges. The edge existence check is also linear in the number of edges: the list is unsorted, so to find an edge we may need to scan all of it. Finally, neighbor traversal costs the same, as we need to scan the whole list to find all the neighbors of a given node.</p>
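<p>The reasoning above can be seen side by side in code. The sketch below stores one small undirected graph in the three representations and implements both operations for each, with the expected complexity noted in the comments. All function names are made up for this comparison.</p>

```python
nodes = ["A", "B", "C", "D", "E"]
edge_list = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
index = {name: i for i, name in enumerate(nodes)}

matrix = [[0] * len(nodes) for _ in nodes]
adjacency = {name: [] for name in nodes}
for u, v in edge_list:
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1
    adjacency[u].append(v)
    adjacency[v].append(u)


def has_edge_matrix(u, v):  # O(1): one array lookup
    return matrix[index[u]][index[v]] != 0


def has_edge_adjacency(u, v):  # O(degree(u)): scan the neighbors of u
    return v in adjacency[u]


def has_edge_edges(u, v):  # O(E): scan every edge
    return any({u, v} == {a, b} for a, b in edge_list)


def neighbors_matrix(u):  # O(V): scan one full row
    return [nodes[j] for j, flag in enumerate(matrix[index[u]]) if flag]


def neighbors_adjacency(u):  # O(degree(u)): the list is already there
    return list(adjacency[u])


def neighbors_edges(u):  # O(E): scan every edge and keep the other endpoint
    return [b if a == u else a for a, b in edge_list if u in (a, b)]


print(has_edge_matrix("D", "B"), has_edge_adjacency("D", "B"), has_edge_edges("D", "B"))
print(sorted(neighbors_matrix("D")), sorted(neighbors_adjacency("D")), sorted(neighbors_edges("D")))
```

<p>All three versions return the same answers; they only differ in how much of the structure they have to touch to get there.</p>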

<p>Which method is best? It depends on the kind of graph (see the last column). For dense graphs (many edges) you may use an adjacency matrix, as the edge lookup is very fast; however, the space grows quadratically with the number of nodes. The adjacency list is good for sparse graphs (few edges): it saves space and traversal is faster, as it depends on the number of neighbors (low in sparse graphs). We will mostly use these two representations, but the edge list is more predictable (everything is linear in the number of edges) and works well for small graphs.</p>
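<p>To put numbers on "dense" and "sparse", one can compare the number of edges with the maximum possible, and count how many cells each representation needs. The following is a rough sketch for undirected graphs; the two helper functions are illustrative and the cell counts ignore any per-object overhead.</p>

```python
def density(num_nodes, num_edges):
    """Fraction of the possible undirected edges that are actually present."""
    possible = num_nodes * (num_nodes - 1) // 2
    return num_edges / possible if possible else 0.0


def storage_cells(num_nodes, num_edges):
    """Rough cell counts: full matrix vs. adjacency list (each edge stored twice)."""
    return {
        "matrix": num_nodes ** 2,
        "adjacency_list": num_nodes + 2 * num_edges,
    }


# The first graph of this post: 8 nodes and 10 edges
print(density(8, 10))
print(storage_cells(8, 10))

# A larger sparse graph: 1000 nodes and 3000 edges
print(density(1000, 3000))
print(storage_cells(1000, 3000))
```

<p>For the large sparse graph the matrix needs a million cells while the adjacency list needs a few thousand, which is the space argument made above.</p>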

<h2 id="wrap-up">Wrap up</h2>

<p>This post covers just the basics of graphs: what they are used for, their properties and their classification. I wanted to include a visual representation of each kind of graph so that the reader can learn the topic more easily; as the saying goes, a picture is worth a thousand words.</p>

<p>In the next posts we will learn how to traverse (explore) graphs and the different algorithms to do it. This is an exciting and practical topic!</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Data Structures" /><category term="Graphs" /><category term="computer science" /><summary type="html"><![CDATA[A graph is a structure used in graph theory, consisting of vertices (or nodes) and edges (or links) that connect pairs of vertices. Graphs are used to model relationships, networks, and structures in various fields like computer science, social networks, logistics, and physics.]]></summary></entry><entry><title type="html">Initial Raspberry Pi configuration</title><link href="https://agramunt.me/posts/raspberry-intro/" rel="alternate" type="text/html" title="Initial Raspberry Pi configuration" /><published>2024-11-24T14:15:00-08:00</published><updated>2024-12-03T06:58:26-08:00</updated><id>https://agramunt.me/posts/raspberry-intro</id><content type="html" xml:base="https://agramunt.me/posts/raspberry-intro/"><![CDATA[<p>I’m a big fan of the Raspberry Pi (RP). So far I have had three: the second, the third and the fourth generation. It all started as a side project to have fun: host services, set up an SSH server, install PostgreSQL and <code class="language-plaintext highlighter-rouge">SELECT DELETE database</code> without having an eerie feeling while doing so. Basically, to mess around with a computer I didn’t care about.</p>

<p>I found out that you can actually do many things with an RP: synchronize files and notes (encrypted), download files via torrents, set up a small service to record cryptocurrency market fluctuations… All that on a low power consumption device! I decided to document my Raspberry Pi setup right before I moved to California. Why then? Well… I wanted a clean computer to SSH into and also to host a VPN so that I could watch Spanish shows. Yes, I know… you expected a better answer… Two years ago I wrote down the whole process in my personal notes. Now it’s time to publish them.</p>

<p>This first post covers the basic setup: installing the Raspberry Pi operating system and the basic software.</p>

<h2 id="format-the-sd-card-and-install-the-os">Format the SD card and install the OS</h2>

<p>A Raspberry Pi boots from an SD card onto which the operating system is copied. Which operating system should we install? RPs have their own operating system based on Debian. Currently there are versions based on Bookworm (version 12) and Bullseye (version 11), the <a href="https://en.wikipedia.org/wiki/Debian_version_history">latest versions</a> of Debian at the time of writing. Check the <a href="https://www.raspberrypi.com/software/operating-systems/">RP OS downloads page</a> for the latest versions. How do we download and install any of these operating systems? Here is the easy way:</p>

<ul>
  <li>Plug the SD card into your laptop. If you don’t have a slot for it, just search for “SD card adapter” and buy one.</li>
  <li>Download the <a href="https://www.raspberrypi.com/software/">Raspberry Pi Imager</a>.</li>
  <li>Open the imager
    <ul>
      <li>Select your device (in my case Raspberry Pi 4)</li>
      <li>Choose the operating system. Normally 64-bit and Lite. No need to install any desktop or extra apps.</li>
      <li>Choose the storage: your SD card</li>
      <li>At some point the app will ask you for several settings; if it doesn’t, make sure you find them before you flash the operating system. Enable SSH and set a password. This is important, as we don’t want to need a screen for the first login to the RP. If you don’t set a password and leave the default, the user is <code class="language-plaintext highlighter-rouge">pi</code> and the password <code class="language-plaintext highlighter-rouge">raspberry</code> (or at least that was the case in old versions of the OS). Don’t change the hostname unless you have more than one RP on the same network. If you will use WiFi to connect your device, set it up now in this section; however, it is always better to connect the RP to the router with an Ethernet cable for speed reasons.</li>
    </ul>
  </li>
</ul>

<p>After this you should be all set. Eject your SD card safely and plug it into your RP.</p>

<h2 id="first-raspberry-pi-boot-and-ssh">First Raspberry Pi boot and SSH</h2>

<p>The next step is to connect your Raspberry Pi, with the SD card inserted, to the power source and to your router using the Ethernet cable. Start the Raspberry Pi and go to your personal computer:</p>

<ul>
  <li>Go to your router UI (in my case http://192.168.1.1) and identify the IP of your Raspberry Pi. Let’s say it is <code class="language-plaintext highlighter-rouge">192.168.1.128</code>.</li>
  <li>SSH into the Raspberry Pi with <code class="language-plaintext highlighter-rouge">ssh pi@192.168.1.128</code> and type the password you set. Congrats, you are now logged in to your Raspberry Pi.</li>
</ul>

<h2 id="generating-ssh-key-pair-for-login">Generating SSH key pair for login</h2>

<p>Logging in with a password is not recommended. It is fine for the first boot but is not as secure as using a pair of SSH keys. The pair consists of a private and a public key: the first is the one you keep on your laptop, whereas the second is the one you share with the machine you want to connect to, in this case the RP.</p>

<p>Let’s generate a new SSH key pair with a more modern algorithm than <a href="https://en.wikipedia.org/wiki/RSA_(cryptosystem)">RSA</a>. I chose <a href="https://ed25519.cr.yp.to/">ed25519</a>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre>ssh-keygen <span class="nt">-t</span> ed25519 <span class="nt">-f</span> ~/.ssh/id_raspberry <span class="nt">-C</span> <span class="s2">"raspberry key"</span>

<span class="c"># If you still want to use RSA, use a key of at least 4096 bits</span>
<span class="c"># ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_raspberry -C "raspberry key"</span>
</pre></td></tr></tbody></table></code></pre></div></div>
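<p>I leave the passphrase empty when prompted, since it adds friction to every login. The key alone is already much stronger than a password, although a passphrase would additionally protect the private key if your laptop were ever compromised. Now list the contents of your <code class="language-plaintext highlighter-rouge">~/.ssh</code> directory:</p>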

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">ls</span> ~/.ssh
</pre></td></tr></tbody></table></code></pre></div></div>
<p>Two files should appear: <code class="language-plaintext highlighter-rouge">id_raspberry</code> and <code class="language-plaintext highlighter-rouge">id_raspberry.pub</code>. We need to copy the latter to the Raspberry Pi, and we can use the command <code class="language-plaintext highlighter-rouge">ssh-copy-id</code> for it (assuming <code class="language-plaintext highlighter-rouge">192.168.1.128</code> is the Raspberry Pi IP):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>ssh-copy-id <span class="nt">-i</span> ~/.ssh/id_raspberry.pub pi@192.168.1.128
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now SSH into your Raspberry Pi with the user and the password, <code class="language-plaintext highlighter-rouge">ssh pi@192.168.1.128</code>, and once logged in check that the key has been added to <code class="language-plaintext highlighter-rouge">~/.ssh/authorized_keys</code> by printing it with <code class="language-plaintext highlighter-rouge">cat ~/.ssh/authorized_keys</code>. At this point you should be able to log out and SSH in with the key instead of the password:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>ssh <span class="nt">-i</span> ~/.ssh/id_raspberry pi@192.168.1.128
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="easier-ssh">Easier SSH</h2>

<p>The last command is not nice: you have to remember the identity file, the IP, etc. I normally add a config file to simplify the login command. Create the file:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">touch</span> ~/.ssh/config
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and copy the following into it:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre>Host raspberry_local
HostName 192.168.1.128
User pi
Port 22
IdentityFile ~/.ssh/id_raspberry
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now you can log in by simply typing <code class="language-plaintext highlighter-rouge">ssh raspberry_local</code> in the command line. The same config file is very useful for adding other machines you normally SSH into. It is also convenient to create an alias; add the following to <code class="language-plaintext highlighter-rouge">~/.zshrc</code> (I use ZSH) or <code class="language-plaintext highlighter-rouge">~/.bashrc</code> (if you use bash):</p>

<div class="language-ssh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="k">alias</span> rpi_local='ssh raspberry_local'
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then <code class="language-plaintext highlighter-rouge">source ~/.zshrc</code> and ssh as <code class="language-plaintext highlighter-rouge">rpi_local</code> in the command line.</p>
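<p>To make the mapping between the alias and the connection details explicit, here is a rough Python sketch that parses the config shown above and rebuilds the long command it replaces. The function name <code class="language-plaintext highlighter-rouge">parse_ssh_config</code> is made up for illustration, and the parser only handles simple <code class="language-plaintext highlighter-rouge">Keyword value</code> lines; the real SSH client supports much more.</p>

```python
# Minimal sketch: parse a simple ~/.ssh/config into {alias: {option: value}}.
# Assumption: one "Keyword value" pair per line. Match blocks, wildcards,
# "Keyword=value" syntax and Include directives are not handled.

def parse_ssh_config(text):
    hosts = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # ignore lines without a value
        keyword, value = parts[0], parts[1].strip()
        if keyword.lower() == "host":
            # A Host line starts a new block of options
            current = hosts.setdefault(value, {})
        elif current is not None:
            current[keyword] = value
    return hosts


SAMPLE = """\
Host raspberry_local
HostName 192.168.1.128
User pi
Port 22
IdentityFile ~/.ssh/id_raspberry
"""

config = parse_ssh_config(SAMPLE)
entry = config["raspberry_local"]
# Rebuild the long command that `ssh raspberry_local` stands for
command = "ssh -i {IdentityFile} -p {Port} {User}@{HostName}".format(**entry)
print(command)
```

<p>Running it prints the explicit form of the command, which is what <code class="language-plaintext highlighter-rouge">ssh raspberry_local</code> saves you from typing.</p>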

<h2 id="disable-login-with-password">Disable login with password</h2>

<p>Once passwordless SSH works it is time to disable password login; after all, that’s why we set up the SSH key pair. Make sure the key-based login really works first, otherwise you may lock yourself out. SSH into the Raspberry:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>ssh raspberry_local
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And install <code class="language-plaintext highlighter-rouge">vim</code></p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">sudo </span>apt-get <span class="nb">install </span>vim
</pre></td></tr></tbody></table></code></pre></div></div>

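<p>Open the config file with <code class="language-plaintext highlighter-rouge">sudo vim /etc/ssh/sshd_config</code> (the file is owned by root, so saving changes requires <code class="language-plaintext highlighter-rouge">sudo</code>) and set PasswordAuthentication to</p>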

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>PasswordAuthentication no
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now reboot the raspberry</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">sudo </span>reboot
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And wait for the machine to restart. It may take a minute or two this time.</p>

<h2 id="disable-desktop">Disable desktop</h2>

<p>If you happened to install the desktop OS (sometimes the Lite version didn’t work for me) it is time to disable it. It uses resources and after all we want to SSH only to the machine.</p>

<p>Once logged into the raspberry type <code class="language-plaintext highlighter-rouge">sudo raspi-config</code> to see the menu, then go to <code class="language-plaintext highlighter-rouge">System Options -&gt; Boot/Auto Login</code> and select <code class="language-plaintext highlighter-rouge">Console</code>, then reboot with <code class="language-plaintext highlighter-rouge">sudo reboot</code>.</p>

<p>Finally ssh back to the raspberry and upgrade all the software in the system to have the latest versions on all packages</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">sudo </span>apt-get update <span class="o">&amp;&amp;</span> <span class="nb">sudo </span>apt-get upgrade
<span class="nb">sudo </span>apt full-upgrade
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="final-notes">Final notes</h2>

<p>Now you have the basic Raspberry setup: SSH with a public-private key pair, the desktop disabled and the most up-to-date software that comes with the operating system. To me this is the basic start, a blank page. In the following posts we will configure internet access and create a VPN.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Raspberry" /><category term="Linux" /><category term="computer science" /><summary type="html"><![CDATA[I’m a big fan of Raspberry Pis (RP), so far I have had three: the second, the third and the fourth generation. It all started as a side project to have fun: host services, set up an SSH server, install postgresql and SELECT DELETE database without having an eerie feeling while doing so. Basically mess around with a computer I didn’t care about.]]></summary></entry><entry><title type="html">UV python package and project manager</title><link href="https://agramunt.me/posts/python-virtual-environments-with-uv/" rel="alternate" type="text/html" title="UV python package and project manager" /><published>2024-11-23T14:15:00-08:00</published><updated>2025-09-20T10:24:15-07:00</updated><id>https://agramunt.me/posts/python-virtual-environments-with-uv</id><content type="html" xml:base="https://agramunt.me/posts/python-virtual-environments-with-uv/"><![CDATA[<p><a href="https://github.com/astral-sh/uv">uv</a> is a more modern tool to build environments and manage python projects. It’s built in Rust and claims to be extremely fast in resolving dependencies (10x-100x speedup compared to pip).</p>

<h2 id="tldr">TLDR</h2>

<p>General steps for creating a virtual environment</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
</pre></td><td class="rouge-code"><pre><span class="c"># install exec as</span>
curl <span class="nt">-LsSf</span> https://astral.sh/uv/install.sh | sh <span class="nt">-s</span> <span class="nt">--</span> <span class="nt">--verbose</span>

<span class="c"># install a python version and create virtual environment named general in ~/.venvs</span>
uv python <span class="nb">install </span>3.12
uv venv ~/.venvs/general <span class="nt">--python</span> 3.12

<span class="c"># activate environment</span>
<span class="nb">source</span> ~/.venvs/general/bin/activate

<span class="c"># install some packages</span>
uv pip <span class="nb">install </span>numpy pandas

<span class="c"># see the list of packages and versions</span>
uv pip list

<span class="c"># install from pyproject.toml</span>
<span class="c"># create env in current directory</span>
uv venv .venv

<span class="c"># sync packages</span>
uv <span class="nb">sync</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>General steps to create a package (assuming we have python 3.13)</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
</pre></td><td class="rouge-code"><pre><span class="c"># create package structure</span>
uv init <span class="nt">--name</span><span class="o">=</span>newpackage <span class="nt">--build-backend</span><span class="o">=</span>setuptools <span class="nt">--package</span>

<span class="c"># create first module (dumb file for example)</span>
<span class="nb">touch </span>src/newpackage/modulea.py
<span class="nb">echo</span> <span class="s2">"import numpy as np"</span> <span class="o">&gt;&gt;</span> src/newpackage/modulea.py

<span class="c"># add dependencies</span>
uv add numpy pandas matplotlib

<span class="c"># see the dependency tree</span>
uv tree

<span class="c"># lock/pin dependencies</span>
uv lock

<span class="c"># build &amp; publish</span>
uv build <span class="o">&amp;&amp;</span> uv publish
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="install-uv">Install uv</h2>

<p>It is possible to install <code class="language-plaintext highlighter-rouge">uv</code> system-wide using a shell installer for MacOS and Linux. I would recommend this if you decide <code class="language-plaintext highlighter-rouge">uv</code> is your definitive tool to build projects. If that’s the case, run the following on Linux or MacOS:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>curl <span class="nt">-LsSf</span> https://astral.sh/uv/install.sh | sh <span class="nt">-s</span> <span class="nt">--</span> <span class="nt">--verbose</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>That will install <code class="language-plaintext highlighter-rouge">uv</code> in <code class="language-plaintext highlighter-rouge">~/.cargo/bin/uv</code> (more recent versions of the installer default to <code class="language-plaintext highlighter-rouge">~/.local/bin</code> instead, so check the installer output). Looking at the file <code class="language-plaintext highlighter-rouge">install.sh</code>, only the two binaries <code class="language-plaintext highlighter-rouge">uv</code> and <code class="language-plaintext highlighter-rouge">uvx</code> are installed and no other files. This is good, we just care about the binaries at this point.</p>

<p>Make sure the install directory (<code class="language-plaintext highlighter-rouge">~/.cargo/bin/</code> in my case) is in your <code class="language-plaintext highlighter-rouge">PATH</code> variable and execute <code class="language-plaintext highlighter-rouge">uv --help</code> to display the help.</p>

<h2 id="managing-python-versions">Managing Python versions</h2>

<p><code class="language-plaintext highlighter-rouge">uv</code> is actually a great tool to substitute <code class="language-plaintext highlighter-rouge">pyenv</code> if you don’t like the latter. With a simple command you can install and manage several python versions for your system:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>uv python <span class="nt">--help</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Let’s install <code class="language-plaintext highlighter-rouge">3.13</code> version</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>uv python <span class="nb">install </span>3.13
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And see that it has been installed in a directory that you can get from <code class="language-plaintext highlighter-rouge">uv python dir</code>, in my case (MacOS intel) <code class="language-plaintext highlighter-rouge">~/.local/share/uv/python</code>.</p>

<p>Running <code class="language-plaintext highlighter-rouge">uv python list</code> will show all available versions (including the ones installed in <code class="language-plaintext highlighter-rouge">pyenv</code> tool) and the ones you can install.</p>

<h2 id="managing-python-environments">Managing Python environments</h2>

<p>As mentioned before, a python environment first requires a python version. We can fix the python version by running</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>uv python pin 3.13
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And then we create the environment in <code class="language-plaintext highlighter-rouge">.venv</code> as</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>uv venv .venv
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Check that the python version for the new environment is the correct one with <code class="language-plaintext highlighter-rouge">.venv/bin/python --version</code>.</p>
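<p>The same check can be done from inside the interpreter. The following sketch relies only on the standard library: inside a virtual environment <code class="language-plaintext highlighter-rouge">sys.prefix</code> points at the environment while <code class="language-plaintext highlighter-rouge">sys.base_prefix</code> still points at the base installation. The helper name <code class="language-plaintext highlighter-rouge">in_virtualenv</code> is my own.</p>

```python
import sys


def in_virtualenv():
    # In a venv, sys.prefix is the environment directory while
    # sys.base_prefix is the interpreter the environment was created from.
    return sys.prefix != sys.base_prefix


print("python version:", sys.version.split()[0])
print("prefix:", sys.prefix)
print("inside a virtual environment:", in_virtualenv())
```

<p>Saved to a file (say, a hypothetical <code class="language-plaintext highlighter-rouge">check_env.py</code>) and run with <code class="language-plaintext highlighter-rouge">.venv/bin/python check_env.py</code>, it should report the pinned version and a prefix inside <code class="language-plaintext highlighter-rouge">.venv</code>.</p>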

<p>Begin installing packages (no need to activate the environment if you are already in the directory where <code class="language-plaintext highlighter-rouge">.venv</code> lives)</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>uv pip <span class="nb">install </span>numpy pandas
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and check which packages are in the environment</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>uv pip list
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Alternatively you can use <code class="language-plaintext highlighter-rouge">uv pip freeze</code>.</p>

<h2 id="create-a-python-project">Create a python project</h2>

<p><code class="language-plaintext highlighter-rouge">uv</code> can manage the creation of a basic project structure. First select your python version</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>uv python pin 3.13
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>uv init <span class="nt">--help</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>among many options you will see options for the kind of project to be built:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre>--package                        Set up the project to be built as a Python package
--no-package                     Do not set up the project to be built as a Python package
--app                            Create a project for an application
--lib                            Create a project for a library
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Let’s create a package by running</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>uv init <span class="nt">--name</span><span class="o">=</span>newpackage <span class="nt">--build-backend</span><span class="o">=</span>setuptools <span class="nt">--package</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>this will create a project with the structure (run <code class="language-plaintext highlighter-rouge">tree</code>):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre><span class="nb">.</span>
├── README.md
├── pyproject.toml
└── src
    └── newpackage
        └── __init__.py
</pre></td></tr></tbody></table></code></pre></div></div>

<p>if you <code class="language-plaintext highlighter-rouge">cat pyproject.toml</code> you will get the definition of your project:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
</pre></td><td class="rouge-code"><pre>[project]
name = "newpackage"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = "&gt;=3.12"
dependencies = []

[project.scripts]
newpackage = "newpackage:main"

[build-system]
requires = ["setuptools&gt;=61"]
build-backend = "setuptools.build_meta"
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now you can start adding dependencies</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>uv add numpy pandas matplotlib ruff
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This will automatically create a virtual environment (in <code class="language-plaintext highlighter-rouge">.venv</code>) and install the packages, and it will also add them to the dependencies in <code class="language-plaintext highlighter-rouge">pyproject.toml</code>. Note that you can remove dependencies by running <code class="language-plaintext highlighter-rouge">uv remove pkg</code>, for instance <code class="language-plaintext highlighter-rouge">uv remove pandas</code> to remove pandas.</p>

<p>A very convenient command is <code class="language-plaintext highlighter-rouge">tree</code>. We used the <code class="language-plaintext highlighter-rouge">pipdeptree</code> package previously, but the nice thing about <code class="language-plaintext highlighter-rouge">uv</code> is that this tooling comes by default. Just run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>uv tree
</pre></td></tr></tbody></table></code></pre></div></div>

<p>that shows</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
</pre></td><td class="rouge-code"><pre>newpackage v0.1.0
├── matplotlib v3.9.2
│   ├── contourpy v1.3.0
│   │   └── numpy v2.1.3
│   ├── cycler v0.12.1
│   ├── fonttools v4.54.1
│   ├── kiwisolver v1.4.7
│   ├── numpy v2.1.3
│   ├── packaging v24.1
│   ├── pillow v11.0.0
│   ├── pyparsing v3.2.0
│   └── python-dateutil v2.9.0.post0
│       └── six v1.16.0
├── numpy v2.1.3
└── pandas v2.2.3
    ├── numpy v2.1.3
    ├── python-dateutil v2.9.0.post0 (*)
    ├── pytz v2024.2
    └── tzdata v2024.2
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Finally you can even pin the dependencies using</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>uv lock
</pre></td></tr></tbody></table></code></pre></div></div>
<p>that will generate a file <code class="language-plaintext highlighter-rouge">uv.lock</code> that contains each package version along with the hash and the specific wheel to download from pypi on every platform, similar to what we have seen in the pipenv post with the file <code class="language-plaintext highlighter-rouge">Pipfile.lock</code>. As mentioned several times in this series of posts, one may want to use a locked environment when running a service and not when defining a package.</p>

<p>On some occasions you may want to build and publish the python package; for that uv has the commands <code class="language-plaintext highlighter-rouge">build</code> and <code class="language-plaintext highlighter-rouge">publish</code>. We won’t get into the details in this post (we’ll have a later post dedicated to it), just remember that you can handle this with <code class="language-plaintext highlighter-rouge">uv</code>.</p>

<h2 id="a-more-complete-pyprojecttoml-and-how-to-install-in-uv-and-pip">A more complete <code class="language-plaintext highlighter-rouge">pyproject.toml</code> and how to install in <code class="language-plaintext highlighter-rouge">uv</code> and <code class="language-plaintext highlighter-rouge">pip</code></h2>

<p>In this section we’ll show a better <code class="language-plaintext highlighter-rouge">pyproject.toml</code> and how to install it using <code class="language-plaintext highlighter-rouge">pip</code> and <code class="language-plaintext highlighter-rouge">uv</code>. One of the good things about <code class="language-plaintext highlighter-rouge">uv</code> is that it seems to be completely compatible with <code class="language-plaintext highlighter-rouge">pip</code>, the default dependency manager. That makes it ideal for any project as the default user won’t use <code class="language-plaintext highlighter-rouge">uv</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
</pre></td><td class="rouge-code"><pre>[project]
name = "newpackage"
version = "0.1.0"
description = "My new package"
authors = [
    { name = "the author" }
]
license = { text = "MIT license" }
readme = "README.md"

requires-python = "&gt;=3.9,&lt;3.13"
dependencies = [
    "matplotlib&gt;=3.9",
    "numpy&gt;=1.26,&lt;2.0.0",
    "pandas&gt;=2.2",
    "scipy&gt;=1.13",
    "tifffile&gt;=2024.8",
    "pip&gt;=24.3.1",
]

[project.optional-dependencies]
dev = [
    "coverage&gt;=7.6",
    "pytest&gt;=8.3",
    "ruff&gt;=0.7",
]

# Ruff is a great tool for linting
[tool.ruff]
line-length = 99

[tool.ruff.lint]
select = [
    # Pyflakes
    "F",
    # Pycodestyle &amp; Warnings
    "E",
    "W",
    # isort for unsorted imports
    "I001"
]

[tool.ruff.format]
quote-style = "single"
indent-style = "space"
docstring-code-format = true
docstring-code-line-length = 20

# build system, use setuptools, the default for Python
[build-system]
requires = ["setuptools&gt;=61"]
build-backend = "setuptools.build_meta"

# python CLI: module scripts/my_script.py, function main
# once the environment is installed, activate it and run `my_cli` in the terminal
# to run the CLI
[project.scripts]
my_cli = "scripts.my_script:main"

## In case you run a private pypi repository, uncomment and change URL
# [[tool.uv.index]]
# name="pypi-internal-server"
# url = "http://pypi-server.mydomain.com:8081/repository/"
# priority = "supplemental"
</pre></td></tr></tbody></table></code></pre></div></div>

<p>To install, create the environment and sync</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre>uv venv .venv
uv <span class="nb">sync</span>

<span class="c"># to install with the optional dependencies (dev in our case)</span>
uv <span class="nb">sync</span> <span class="nt">--all-extras</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Equivalently in pip you can do</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre>python <span class="nt">-m</span> venv .venv
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">--upgrade</span> pip
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nb">.</span> <span class="nt">--extra-index-url</span> http://pypi-server.mydomain.com:8081/repository/

<span class="c"># to install with optional dependencies (quoted so that zsh does not treat [dev] as a glob)</span>
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="s1">'.[dev]'</span> <span class="nt">--extra-index-url</span> http://pypi-server.mydomain.com:8081/repository/
</pre></td></tr></tbody></table></code></pre></div></div>

<p>There’s a lot here, so let me explain. The first part just specifies the python versions and the dependencies. Then we have the configuration for the tool <code class="language-plaintext highlighter-rouge">ruff</code> (more on that below), the build system (<code class="language-plaintext highlighter-rouge">setuptools</code>, the default tool in python) and finally a CLI and an internal pypi repository.</p>
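<p>For the CLI entry <code class="language-plaintext highlighter-rouge">my_cli = "scripts.my_script:main"</code> to work, the module must expose a callable named <code class="language-plaintext highlighter-rouge">main</code>. A minimal sketch of what <code class="language-plaintext highlighter-rouge">scripts/my_script.py</code> could look like follows. Everything except the name <code class="language-plaintext highlighter-rouge">main</code> is invented for illustration, and it assumes <code class="language-plaintext highlighter-rouge">scripts</code> is an importable package that is included in the build.</p>

```python
# Hypothetical contents of scripts/my_script.py. Only the function name
# `main` is dictated by the entry point "scripts.my_script:main".
import argparse


def main(argv=None):
    parser = argparse.ArgumentParser(prog="my_cli", description="Example CLI")
    parser.add_argument("--name", default="world", help="who to greet")
    # argv=None makes argparse read the real command line (sys.argv[1:])
    args = parser.parse_args(argv)
    print(f"Hello, {args.name}!")
    return 0  # the generated wrapper passes this value to sys.exit


# Demonstration call only; drop this line in the real module
exit_code = main(["--name", "uv"])
```

<p>Once the package is installed in the environment, running <code class="language-plaintext highlighter-rouge">my_cli --name uv</code> in the terminal would call the same function.</p>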

<p>Let me begin with <code class="language-plaintext highlighter-rouge">ruff</code>: it tells you which lines of your code violate the configured rules (linting) and can also reformat the code for you (a formatter). Run the linter with <code class="language-plaintext highlighter-rouge">uv run ruff check .</code>. Imagine you have a file with the following content in your package:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
</pre></td><td class="rouge-code"><pre><span class="kn">import</span> <span class="n">os</span>
<span class="kn">import</span> <span class="n">sys</span>  <span class="c1"># Unused import
</span>
<span class="k">def</span> <span class="nf">example_function</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>

<span class="k">def</span> <span class="nf">unused_function</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">):</span>  <span class="c1"># Unused function
</span>    <span class="k">return</span> <span class="n">a</span> <span class="o">*</span> <span class="n">b</span>

<span class="nf">print</span><span class="p">(</span><span class="nf">example_function</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">))</span>

<span class="k">def</span> <span class="nf">_make_ssl_transport</span><span class="p">(</span>
    <span class="n">rawsock</span><span class="p">,</span> <span class="n">protocol</span><span class="p">,</span> <span class="n">sslcontext</span><span class="p">,</span> <span class="n">waiter</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span>
    <span class="o">*</span><span class="p">,</span> <span class="n">server_side</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">server_hostname</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span>
    <span class="n">extra</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">server</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span>
    <span class="n">ssl_handshake_timeout</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span>
    <span class="n">call_connection_made</span><span class="o">=</span><span class="bp">True</span><span class="p">):</span>
    <span class="sh">'''</span><span class="s">Make an SSL transport.</span><span class="sh">'''</span>
    <span class="k">if</span> <span class="n">waiter</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
      <span class="k">pass</span>

    <span class="k">if</span> <span class="n">extra</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
      <span class="n">extra</span> <span class="o">=</span> <span class="p">{}</span>

</pre></td></tr></tbody></table></code></pre></div></div>

<p>Ruff (with the above configuration) will complain in several places. It will flag the import block with the isort rule <code class="language-plaintext highlighter-rouge">I001</code> (import block un-sorted or un-formatted). On the first and second lines it will also raise <code class="language-plaintext highlighter-rouge">F401</code>, because the modules <code class="language-plaintext highlighter-rouge">os</code> and <code class="language-plaintext highlighter-rouge">sys</code> are imported but never used. Note that the unused function itself is not reported: <code class="language-plaintext highlighter-rouge">F841</code> only covers local variables that are assigned but never used, not module-level functions. Finally (not the case in this example), any line exceeding 99 characters would raise <code class="language-plaintext highlighter-rouge">E501</code>. Ruff is good for keeping your code clean; list all the rules with <code class="language-plaintext highlighter-rouge">uv run ruff rule --all</code> and all the linters with <code class="language-plaintext highlighter-rouge">uv run ruff linter</code>. Once the errors are identified you can fix them as suggested by <code class="language-plaintext highlighter-rouge">ruff</code> with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>uv run ruff check <span class="nb">.</span> <span class="nt">--fix</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Finally format the code with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>uv run ruff format
</pre></td></tr></tbody></table></code></pre></div></div>

<p>In the documentation the ruff developers say that “the formatter is designed as a drop-in replacement for Black”. It indeed formats the function for us into a much nicer one! Check the <a href="https://docs.astral.sh/ruff/rules/">rules</a> and the <a href="https://docs.astral.sh/ruff/formatter/">formatter</a> in the official documentation.</p>

<p>Finally we have added a private repository in the <code class="language-plaintext highlighter-rouge">pyproject.toml</code>. The private repository is a URL where we can host wheels, the software artifacts containing a package (most of the time a wheel is basically a zipped repository). Some organizations publish packages in an internal repository, but by default python tools look for all packages in <a href="https://pypi.org/">pypi</a>. Uncommenting the final lines and setting the appropriate URL tells <code class="language-plaintext highlighter-rouge">uv</code> that it may have to look at that repository too. We have defined this for <code class="language-plaintext highlighter-rouge">uv</code> only in the <code class="language-plaintext highlighter-rouge">pyproject.toml</code>; <code class="language-plaintext highlighter-rouge">pip</code> does not read that setting. For <code class="language-plaintext highlighter-rouge">pip</code> we simply add the extra index at installation time through <code class="language-plaintext highlighter-rouge">--extra-index-url http://pypi-server.mydomain.com:8081/repository/</code>. An alternative for <code class="language-plaintext highlighter-rouge">pip</code> is to add a <code class="language-plaintext highlighter-rouge">pip.conf</code> file (see more details <a href="https://pip.pypa.io/en/stable/topics/configuration/">here</a>).</p>

<p>This section ended up being a bit long, but I wanted to show a good <code class="language-plaintext highlighter-rouge">pyproject.toml</code> file with most of the things you need in a python package project. Hope it is helpful! I will likely write two other posts, in a much more detailed way, about <code class="language-plaintext highlighter-rouge">ruff</code> and how to set up your own <code class="language-plaintext highlighter-rouge">pypi</code> server securely.</p>

<h2 id="speed-benchmark">Speed benchmark</h2>

<p>uv promises large speedups in resolving dependencies on their <a href="https://docs.astral.sh/uv/">webpage</a>, about 10x to 100x compared to pip. In this section we will test how fast each tool resolves and installs a basic set of dependencies. We will compare <code class="language-plaintext highlighter-rouge">uv</code> with other popular tools like <code class="language-plaintext highlighter-rouge">pip</code>, <code class="language-plaintext highlighter-rouge">conda</code>, <code class="language-plaintext highlighter-rouge">mamba</code> and <code class="language-plaintext highlighter-rouge">poetry</code>. I will use my 2019 macbook pro AMD with 16 GB of memory. Also, to compare apples to apples, I will build each environment from scratch without caching packages, i.e. downloading all the packages each time. We will ask each dependency manager to install the following:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>pandas scikit-learn flask fastapi matplotlib requests pytest boto3 pyyaml cryptography jupyterlab seaborn pillow sqlalchemy
</pre></td></tr></tbody></table></code></pre></div></div>

<p>on Python 3.11.</p>
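
<p>All the timings in this section come from the shell’s <code class="language-plaintext highlighter-rouge">time</code> command. If you prefer to script the comparison, a minimal sketch in Python could look like the following. The helper name <code class="language-plaintext highlighter-rouge">time_command</code> is made up for this example, and the command it runs is only a harmless placeholder, not one of the installs measured here.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return the wall-clock seconds it took."""
    start = time.perf_counter()
    # check=True raises CalledProcessError if the command exits with an error
    subprocess.run(cmd, check=True, capture_output=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder command; swap in the install command you want to measure
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"{elapsed:.3f} s")
</code></pre></div></div>

<p>This only measures wall-clock time, which corresponds to the <code class="language-plaintext highlighter-rouge">total</code> field reported by <code class="language-plaintext highlighter-rouge">time</code>.</p>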

<p>Let’s begin with the tool presented in this post, <code class="language-plaintext highlighter-rouge">uv</code>.</p>

<h3 id="uv">uv</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre>uv python <span class="nb">install </span>3.11
uv python pin 3.11
<span class="nb">rm</span> <span class="nt">-rf</span> .venv
uv venv .venv
<span class="nb">time </span>uv pip <span class="nb">install </span>pandas scikit-learn flask fastapi matplotlib requests pytest boto3 pyyaml cryptography jupyterlab seaborn pillow sqlalchemy <span class="nt">--no-cache-dir</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>with result:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre> 3.08s user 8.33s system 148% cpu 7.668 total
</pre></td></tr></tbody></table></code></pre></div></div>

<h3 id="pip">pip</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre>pyenv <span class="nb">install </span>3.11.2
pyenv shell 3.11.2

<span class="nb">rm</span> <span class="nt">-rf</span> .venv
python <span class="nt">-m</span> venv .venv
<span class="nb">source</span> .venv/bin/activate
pip <span class="nb">install</span> <span class="nt">--upgrade</span> pip
<span class="nb">time </span>pip <span class="nb">install </span>pandas scikit-learn flask fastapi matplotlib requests pytest boto3 pyyaml cryptography jupyterlab seaborn pillow sqlalchemy <span class="nt">--no-cache-dir</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>with result:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>32.32s user 7.25s system 66% cpu 59.662 total
</pre></td></tr></tbody></table></code></pre></div></div>

<h3 id="conda">Conda</h3>

<p>Run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>pyenv shell miniconda3-latest
conda remove <span class="nt">--name</span> myenv <span class="nt">--all</span> <span class="nt">-y</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Create the file <code class="language-plaintext highlighter-rouge">environment.yml</code>, which makes the process easier to time:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
</pre></td><td class="rouge-code"><pre>name: myenv
dependencies:
  - python=3.11
  - pandas
  - scikit-learn
  - flask
  - fastapi
  - matplotlib
  - requests
  - pytest
  - boto3
  - pyyaml
  - cryptography
  - jupyterlab
  - seaborn
  - pillow
  - sqlalchemy
</pre></td></tr></tbody></table></code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>conda clean <span class="nt">--all</span> <span class="nt">-y</span>
<span class="nb">time </span>conda <span class="nb">env </span>create <span class="nt">-f</span> environment.yml
</pre></td></tr></tbody></table></code></pre></div></div>

<p>getting a time of 43 seconds.</p>

<h3 id="mamba">Mamba</h3>

<p>Run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>pyenv shell mambaforge
conda remove <span class="nt">--name</span> myenv <span class="nt">--all</span> <span class="nt">-y</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Create the same <code class="language-plaintext highlighter-rouge">environment.yml</code> file as in the <code class="language-plaintext highlighter-rouge">conda</code> section.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>conda clean <span class="nt">--all</span> <span class="nt">-y</span>
<span class="nb">time </span>conda <span class="nb">env </span>create <span class="nt">-f</span> environment.yml
</pre></td></tr></tbody></table></code></pre></div></div>

<p>getting a time of 52 seconds.</p>

<h3 id="poetry">Poetry</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>poetry config virtualenvs.in-project <span class="nb">true
</span>pyenv shell 3.11.9
poetry <span class="nb">env </span>use 3.11.9
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">pyproject.toml</code> that has to be placed in the project directory is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
</pre></td><td class="rouge-code"><pre>[tool.poetry]
name = "speed-test"
version = "0.1.0"
description = ""
authors = ["Your Name &lt;you@example.com&gt;"]
readme = "README.md"

[tool.poetry.dependencies]
python = "3.11.9"
pandas = "*"
scikit-learn = "*"
flask = "*"
fastapi = "*"
matplotlib = "*"
requests = "*"
pytest = "*"
boto3 = "*"
pyyaml = "*"
cryptography = "*"
jupyterlab = "*"
seaborn = "*"
pillow = "*"
sqlalchemy = "*"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">time </span>poetry <span class="nb">install</span> <span class="nt">--no-cache</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>which takes <code class="language-plaintext highlighter-rouge">25.05s user 14.10s system 100% cpu 38.877 total</code>.</p>
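
<p>To make these result lines easier to compare, here is a small sketch that splits one of them into numbers. It assumes the layout printed above (which looks like the default zsh format), and the function name <code class="language-plaintext highlighter-rouge">parse_time_line</code> is just an illustrative choice.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def parse_time_line(line):
    """Split a zsh-style `time` summary line into its numeric fields."""
    parts = line.split()
    return {
        "user": float(parts[0].rstrip("s")),    # CPU seconds in user mode
        "system": float(parts[2].rstrip("s")),  # CPU seconds in kernel mode
        "cpu_percent": float(parts[4].rstrip("%")),
        "total": float(parts[6]),               # wall-clock seconds
    }


poetry_run = parse_time_line("25.05s user 14.10s system 100% cpu 38.877 total")

# The cpu percentage is roughly (user + system) / total, so it can go
# above 100 when several cores are busy at the same time.
busy = (poetry_run["user"] + poetry_run["system"]) / poetry_run["total"]
print(poetry_run["total"], int(100 * busy))
</code></pre></div></div>

<p>The number to compare across tools is <code class="language-plaintext highlighter-rouge">total</code>, the wall-clock time, since that is what you actually wait for.</p>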

<h3 id="benchmark-conclusions">Benchmark conclusions</h3>

<p>Summing up, <code class="language-plaintext highlighter-rouge">uv</code> is the fastest by far in this test, around 8x faster than <code class="language-plaintext highlighter-rouge">pip</code>. Very promising!</p>

<table>
  <thead>
    <tr>
      <th>tool</th>
      <th>time to resolve and install dependencies (seconds)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>uv</td>
      <td>7.67</td>
    </tr>
    <tr>
      <td>pip</td>
      <td>59.66</td>
    </tr>
    <tr>
      <td>conda</td>
      <td>43</td>
    </tr>
    <tr>
      <td>mamba</td>
      <td>52</td>
    </tr>
    <tr>
      <td>poetry</td>
      <td>38.88</td>
    </tr>
  </tbody>
</table>
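
<p>The speedup factors can be worked out from the wall-clock totals reported in each section. The sketch below does the arithmetic; keep in mind that these are single runs on one machine, so the factors are indicative only.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Wall-clock totals in seconds, copied from the runs above
totals = {
    "uv": 7.668,
    "pip": 59.662,
    "conda": 43.0,
    "mamba": 52.0,
    "poetry": 38.877,
}

# How many times longer each tool took than uv
speedups = {tool: seconds / totals["uv"] for tool, seconds in totals.items()}

for tool, factor in sorted(speedups.items(), key=lambda item: item[1]):
    print(f"{tool:7s} {factor:4.1f}x")
</code></pre></div></div>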

<h2 id="conclusions">Conclusions</h2>

<p>uv is not only super fast in resolving dependencies, it also manages your python versions and by default creates the environment in the local directory. With <code class="language-plaintext highlighter-rouge">uv</code> you don’t need anything else, no <code class="language-plaintext highlighter-rouge">pyenv</code> for managing your python versions, and no <code class="language-plaintext highlighter-rouge">poetry</code> to build and publish your wheels. Even creating a new project boilerplate is super easy! I have been reading out there and it seems that the only drawback is that the dependency management is a bit less strict compared to <code class="language-plaintext highlighter-rouge">poetry</code>. To me this is fine, this tool simply works. Another big plus of <code class="language-plaintext highlighter-rouge">uv</code> is that it is perfectly compatible with <code class="language-plaintext highlighter-rouge">pip</code>, that is, the <code class="language-plaintext highlighter-rouge">pyproject.toml</code> defined for <code class="language-plaintext highlighter-rouge">uv</code> can be installed using <code class="language-plaintext highlighter-rouge">pip</code> seamlessly as it follows the standard of <a href="https://peps.python.org/pep-0621/">PEP-621</a>, so the users of your package are not forced to use <code class="language-plaintext highlighter-rouge">uv</code> and can stay with the more general tool <code class="language-plaintext highlighter-rouge">pip</code> if they want to.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Python" /><category term="computer science" /><summary type="html"><![CDATA[uv is a more modern tool to build environments and manage python projects. 
It’s built in Rust and claims to be extremely fast in resolving dependencies (10x-100x speedup compared to pip).]]></summary></entry><entry><title type="html">Python virtual environments with Poetry</title><link href="https://agramunt.me/posts/python-virtual-environments-with-poetry/" rel="alternate" type="text/html" title="Python virtual environments with Poetry" /><published>2024-11-01T05:36:00-07:00</published><updated>2024-11-01T05:36:00-07:00</updated><id>https://agramunt.me/posts/python-virtual-environments-with-poetry</id><content type="html" xml:base="https://agramunt.me/posts/python-virtual-environments-with-poetry/"><![CDATA[<p><a href="https://python-poetry.org">Python Poetry</a> is a tool for package dependency management and packaging. It has become very popular among developers.</p>

<h2 id="tldr">TLDR</h2>

<p>To create a virtual environment, first install poetry in your project:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
</pre></td><td class="rouge-code"><pre>pyenv <span class="nb">install </span>3.12.2 <span class="nt">-f</span>
pyenv shell 3.12.2

<span class="c"># create a venv with poetry exec</span>
python <span class="nt">-m</span> venv .venv_poetry
.venv_poetry/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">-U</span> pip setuptools
.venv_poetry/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>poetry

<span class="c"># config poetry to create the venv in the current directory</span>
.venv_poetry/bin/poetry config virtualenvs.prefer-active-python <span class="nb">true</span>
.venv_poetry/bin/poetry config virtualenvs.in-project <span class="nb">true</span>

</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then place the <code class="language-plaintext highlighter-rouge">pyproject.toml</code> file in the directory:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
</pre></td><td class="rouge-code"><pre>[tool.poetry]
package-mode = false

[tool.poetry.dependencies]
python = "&gt;=3.10,&lt;3.13"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
</pre></td></tr></tbody></table></code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
</pre></td><td class="rouge-code"><pre><span class="c"># add dependencies</span>
.venv_poetry/bin/poetry add numpy
.venv_poetry/bin/poetry add pandas
.venv_poetry/bin/poetry add matplotlib

<span class="c"># add development dependencies</span>
.venv_poetry/bin/poetry add pytest <span class="nt">--group</span> dev

<span class="c"># install (create environment) with all dependencies</span>
.venv_poetry/bin/poetry <span class="nb">install</span>

<span class="c"># install (create environment) without dev dependencies</span>
.venv_poetry/bin/poetry <span class="nb">install</span> <span class="nt">--without</span> dev

<span class="c"># execute python in the new environment</span>
.venv/bin/python <span class="nt">--version</span>

<span class="c"># lock the dependencies</span>
.venv_poetry/bin/poetry lock
</pre></td></tr></tbody></table></code></pre></div></div>
<p>To create a new package, read the dedicated section later in this post.</p>

<h2 id="install-poetry">Install Poetry</h2>

<p>Similar to <code class="language-plaintext highlighter-rouge">virtualenv</code>, <code class="language-plaintext highlighter-rouge">poetry</code> is a python package, so you can install it via <code class="language-plaintext highlighter-rouge">pip install</code>. The most common approach is to install it in the default interpreter: the global one in <code class="language-plaintext highlighter-rouge">pyenv</code> or the system default <code class="language-plaintext highlighter-rouge">/usr/local/bin/python3</code> (on macOS).</p>

<p>Define a global python using pyenv, as an example we will use <code class="language-plaintext highlighter-rouge">3.12</code>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nv">GLOBAL_PYTHON</span><span class="o">=</span>3.12
pyenv global <span class="k">${</span><span class="nv">GLOBAL_PYTHON</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Install poetry on it (see <a href="https://python-poetry.org/docs/#installation">official documentation</a>) and check the help (just to see that it works)</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">-U</span> pip setuptools
python <span class="nt">-m</span> pip <span class="nb">install </span>poetry
poetry <span class="nt">--help</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then you will have <code class="language-plaintext highlighter-rouge">poetry</code> available anytime you are in the pyenv global interpreter on your shell.</p>

<p>Instead of the described method I prefer to always create a new virtual environment in the project to install <code class="language-plaintext highlighter-rouge">poetry</code>. The reason is that I may use poetry for certain projects, not for everything so to me it is better to install it per project. To do this, navigate to your python project, install the python version:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>pyenv <span class="nb">install </span>3.12.2 <span class="nt">-f</span>
pyenv shell 3.12.2
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and create the environment in e.g. <code class="language-plaintext highlighter-rouge">.venv_poetry</code> then install <code class="language-plaintext highlighter-rouge">poetry</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>python <span class="nt">-m</span> venv .venv_poetry
.venv_poetry/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">-U</span> pip setuptools
.venv_poetry/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>poetry
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now all you need to do to use <code class="language-plaintext highlighter-rouge">poetry</code> is to execute the binary in this environment: simply call <code class="language-plaintext highlighter-rouge">.venv_poetry/bin/poetry</code>. To see that this works you can show the poetry config:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv_poetry/bin/poetry config <span class="nt">--list</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>To remove <code class="language-plaintext highlighter-rouge">poetry</code> simply remove the virtual environment <code class="language-plaintext highlighter-rouge">.venv_poetry</code>.</p>

<h2 id="create-a-virtual-environment">Create a virtual environment</h2>

<p>Before creating a new environment we need to select the python version. To me, the best way is still to use <code class="language-plaintext highlighter-rouge">pyenv</code> to install and manage python versions and to select the shell version with the <code class="language-plaintext highlighter-rouge">pyenv shell</code> command. Then you can tell <code class="language-plaintext highlighter-rouge">poetry</code> to use the currently active python binary to create the environment. Let’s do this. As before, we create a virtual environment encapsulating <code class="language-plaintext highlighter-rouge">poetry</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre>pyenv <span class="nb">install </span>3.12.2 <span class="nt">-f</span>
pyenv shell 3.12.2

python <span class="nt">-m</span> venv .venv_poetry
.venv_poetry/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">-U</span> pip setuptools
.venv_poetry/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>poetry
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now in the poetry config we need to modify a couple of parameters:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>.venv_poetry/bin/poetry config virtualenvs.prefer-active-python <span class="nb">true</span>
.venv_poetry/bin/poetry config virtualenvs.in-project <span class="nb">true</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This makes <code class="language-plaintext highlighter-rouge">poetry</code> use the currently active python and create the virtual environment inside the project directory. Create a <code class="language-plaintext highlighter-rouge">pyproject.toml</code> with the following content:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
</pre></td><td class="rouge-code"><pre>[tool.poetry]
package-mode = false

[tool.poetry.dependencies]
python = "^3.11"
numpy = "^2.1.3"
pandas = "^2.2.3"

[tool.poetry.group.dev.dependencies]
pytest = "^8.3.3"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Note that I set <code class="language-plaintext highlighter-rouge">package-mode = false</code>, indicating that I don’t want this configuration to be a package; I just want to specify the dependencies for a virtual environment. You can tell that this <code class="language-plaintext highlighter-rouge">pyproject.toml</code> is built for poetry: the build-system is poetry, and every other section is under <code class="language-plaintext highlighter-rouge">tool.poetry</code>. Don’t worry if you are working with python packages (remember this example is just a toml to install dependencies in a virtual environment): you can always pip install the project, you just need to build the wheel with poetry and pip install it.</p>
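
<p>The versions in the <code class="language-plaintext highlighter-rouge">pyproject.toml</code> above use poetry’s caret notation, which allows updates as long as the left-most non-zero number of the version does not change. As a rough illustration of how I understand that rule (this is my own sketch with a made-up function name, not poetry’s implementation, and it only handles plain numeric versions), the allowed range can be computed like this:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def caret_bounds(constraint):
    """Return (lowest allowed, first excluded) versions for a caret constraint."""
    given = [int(part) for part in constraint.lstrip("^").split(".")]
    lower = given + [0] * (3 - len(given))  # pad to major.minor.patch

    # Bump the left-most non-zero component; if every written component
    # is zero, bump the last one that was written out.
    bump = next((i for i, part in enumerate(given) if part != 0), len(given) - 1)
    upper = lower[:bump] + [lower[bump] + 1] + [0] * (2 - bump)

    def fmt(parts):
        return ".".join(str(part) for part in parts)

    return fmt(lower), fmt(upper)


for name, constraint in [("python", "^3.11"), ("numpy", "^2.1.3"), ("pytest", "^8.3.3")]:
    low, high = caret_bounds(constraint)
    print(f"{name}: from {low} up to, but not including, {high}")
</code></pre></div></div>

<p>So <code class="language-plaintext highlighter-rouge">python = "^3.11"</code> accepts any 3.x interpreter from 3.11 onwards, which is why the interpreter selected with <code class="language-plaintext highlighter-rouge">pyenv shell</code> decides which one ends up in the environment.</p>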

<p>Let’s continue with installing the virtual environment specified. Say you want to install the new environment using python 3.11, install this version first using <code class="language-plaintext highlighter-rouge">pyenv</code></p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>pyenv <span class="nb">install </span>3.11 <span class="nt">-f</span>
pyenv shell 3.11
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then install the environment as</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv_poetry/bin/poetry <span class="nb">install</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This will create the usual <code class="language-plaintext highlighter-rouge">.venv</code> directory where the whole environment is saved. Check that the python version used is <code class="language-plaintext highlighter-rouge">3.11</code> and not <code class="language-plaintext highlighter-rouge">3.12</code> by running <code class="language-plaintext highlighter-rouge">.venv/bin/python --version</code>. Now you can activate the environment with <code class="language-plaintext highlighter-rouge">source .venv/bin/activate</code> and use it.</p>

<h2 id="creating-packages-using-poetry">Creating packages using poetry</h2>

<p>It is super simple to create a new package, simply create a new directory, install poetry and run <code class="language-plaintext highlighter-rouge">poetry init</code>, see the code:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre><span class="nb">mkdir </span>my_package <span class="o">&amp;&amp;</span> <span class="nb">cd </span>my_package
python <span class="nt">-m</span> venv .venv_poetry
.venv_poetry/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">-U</span> pip setuptools
.venv_poetry/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>poetry
.venv_poetry/bin/poetry config virtualenvs.prefer-active-python <span class="nb">true</span>
.venv_poetry/bin/poetry config virtualenvs.in-project <span class="nb">true</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>then initialize the package with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv_poetry/bin/poetry init
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and follow the steps in the command line. Poetry will ask you for the dependencies, the name of the project, the license, etc. Finally it will create the <code class="language-plaintext highlighter-rouge">pyproject.toml</code>. Create the <code class="language-plaintext highlighter-rouge">README.md</code> yourself with <code class="language-plaintext highlighter-rouge">touch README.md</code>, otherwise poetry will raise an error before installing. Now add the source code:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="nb">mkdir </span>my_package
<span class="nb">touch </span>my_package/__init__.py
<span class="nb">touch </span>my_package/modulea.py
</pre></td></tr></tbody></table></code></pre></div></div>
<p>In <code class="language-plaintext highlighter-rouge">modulea.py</code> place something like (as an example):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre><span class="kn">import</span> <span class="n">numpy</span> <span class="k">as</span> <span class="n">np</span>

<span class="k">def</span> <span class="nf">numpy_max</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">np</span><span class="p">.</span><span class="nf">maximum</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>now install the project</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv_poetry/bin/poetry <span class="nb">install</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and check that you can import the package from your new environment</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-c</span> <span class="s2">"import my_package"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And the only thing you are left to do is to code your new flashy package.</p>

<h2 id="creating-and-publishing-wheels">Creating and publishing wheels</h2>

<p>Python wheels are the artifacts that you download from the <a href="https://pypi.org/">pypi</a> repository to install packages. A wheel is basically a zipped file that contains the source code (python files) and platform-specific compiled code (if any). Here I will show you how to create and publish a package using poetry.</p>

<p>To build the wheel of your project simply run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv_poetry/bin/poetry build 
</pre></td></tr></tbody></table></code></pre></div></div>

<p>You will see a new directory in your project called <code class="language-plaintext highlighter-rouge">dist</code> containing two new files with extensions <code class="language-plaintext highlighter-rouge">.whl</code> and <code class="language-plaintext highlighter-rouge">.tar.gz</code>. The latter is simply the zipped project (the source distribution) and is platform independent; it is useful to unzip and compile for different platforms. We will go into the details of this in another post where we will build modules using compiled C++ and shared libraries. Here we deal with pure python code, so our project will be platform independent as long as the dependencies are available for every platform (e.g. numpy contains compiled C/C++ code and is available on many platforms and architectures).</p>
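
<p>Whether a wheel is platform independent can be read from its file name, which ends with a python tag, an ABI tag and a platform tag. The sketch below splits such a name into its parts. The function name is made up for this example, and the file name is only what I would expect for a pure python project called <code class="language-plaintext highlighter-rouge">my_package</code> at version 0.1.0, so check your own <code class="language-plaintext highlighter-rouge">dist</code> directory for the real one.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def parse_wheel_filename(filename):
    """Split a wheel file name into the fields of the wheel naming scheme."""
    stem = filename.removesuffix(".whl")
    parts = stem.split("-")
    # An optional build tag may sit between the version and the python tag,
    # so the three compatibility tags are always read from the end.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


# Hypothetical file name, used only for illustration
info = parse_wheel_filename("my_package-0.1.0-py3-none-any.whl")
print(info["python"], info["abi"], info["platform"])
</code></pre></div></div>

<p>A platform tag of <code class="language-plaintext highlighter-rouge">any</code> together with an ABI tag of <code class="language-plaintext highlighter-rouge">none</code> indicates a pure python wheel, whereas packages with compiled code publish one wheel per platform.</p>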

<p>The next step is to publish the wheel. For that you need to tell <code class="language-plaintext highlighter-rouge">poetry</code> where it should be published. Repositories to publish to are registered in the <code class="language-plaintext highlighter-rouge">poetry</code> configuration rather than in the <code class="language-plaintext highlighter-rouge">pyproject.toml</code>, for instance:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv_poetry/bin/poetry config repositories.custom https://your.custom.repo.url
</pre></td></tr></tbody></table></code></pre></div></div>

<p>where obviously you need to change the URLs and names to your own. Then you can run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>poetry publish <span class="nt">--repository</span> custom
</pre></td></tr></tbody></table></code></pre></div></div>

<p>to publish to your custom repo. To also add a repository from which you want to download packages, add the following to your <code class="language-plaintext highlighter-rouge">pyproject.toml</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre>[[tool.poetry.source]]
name = "custom_repo"
url = "https://your.custom.repo.url"
priority = "supplemental"
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Marking the source as supplemental tells <code class="language-plaintext highlighter-rouge">poetry</code> that pypi remains the primary place to look for packages. It will look for wheels there first and, only if it can’t find them, fall back to your custom repository.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Python" /><category term="computer science" /><summary type="html"><![CDATA[Python Poetry is a tool for package dependency management and packaging. It has become very popular among developers.]]></summary></entry><entry><title type="html">Python virtual environments with pipenv</title><link href="https://agramunt.me/posts/python-virtual-environments-with-pipenv/" rel="alternate" type="text/html" title="Python virtual environments with pipenv" /><published>2024-09-01T08:05:00-07:00</published><updated>2024-09-01T08:05:00-07:00</updated><id>https://agramunt.me/posts/python-virtual-environments-with-pipenv</id><content type="html" xml:base="https://agramunt.me/posts/python-virtual-environments-with-pipenv/"><![CDATA[<p>As described in the <a href="https://pipenv.pypa.io/en/latest/">pipenv documentation</a>: Pipenv is a Python virtualenv management tool that supports a multitude of systems and nicely bridges the gaps between pip, python (using system python, pyenv or asdf) and virtualenv. So, virtualenv and pyenv together? Seems like a desirable integration!</p>

<h2 id="tldr-pipenv">TLDR pipenv</h2>

<p>Using <code class="language-plaintext highlighter-rouge">pyenv</code>, install <code class="language-plaintext highlighter-rouge">pipenv</code> in the global python, then create a virtual environment in the current directory:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
</pre></td><td class="rouge-code"><pre><span class="c"># install pipenv</span>
<span class="nv">GLOBAL_PYTHON</span><span class="o">=</span>3.12.4
pyenv global <span class="k">${</span><span class="nv">GLOBAL_PYTHON</span><span class="k">}</span>

python <span class="nt">-m</span> pip <span class="nb">install </span>pipenv

<span class="c"># create environment</span>
pyenv shell <span class="nt">--unset</span>
<span class="nb">rm</span> <span class="nt">-rf</span> .python-version

<span class="nv">PYTHON_VERSION</span><span class="o">=</span>3.10.14
<span class="nv">PYTHON_PATH</span><span class="o">=</span><span class="s2">"</span><span class="si">$(</span>pyenv root<span class="si">)</span><span class="s2">/versions/</span><span class="nv">$PYTHON_VERSION</span><span class="s2">/bin/python"</span>

<span class="nb">export </span><span class="nv">PIPENV_VENV_IN_PROJECT</span><span class="o">=</span>1
pipenv <span class="nt">--python</span> <span class="k">${</span><span class="nv">PYTHON_PATH</span><span class="k">}</span>

<span class="c"># install some packages</span>
pipenv <span class="nb">install </span>numpy
pipenv <span class="nb">install </span>pandas

<span class="c"># show dependency tree</span>
pipenv graph

<span class="c"># launch python</span>
.venv/bin/python <span class="nt">-c</span> <span class="s2">"import numpy as np"</span>

<span class="c"># activate environment</span>
<span class="nb">source</span> .venv/bin/activate

<span class="c"># deactivate</span>
deactivate
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="install">Install</h2>

<p>As with virtualenv, we need to pip install <code class="language-plaintext highlighter-rouge">pipenv</code> into our global Python (which I set to <code class="language-plaintext highlighter-rouge">3.11.2</code> here):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="nv">GLOBAL_PYTHON</span><span class="o">=</span>3.11.2
pyenv global <span class="k">${</span><span class="nv">GLOBAL_PYTHON</span><span class="k">}</span>

python <span class="nt">-m</span> pip <span class="nb">install </span>pipenv
python <span class="nt">-m</span> pipenv <span class="nt">--help</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="create-a-new-environment">Create a new environment</h2>

<p>Before creating the environment make sure you are in the <code class="language-plaintext highlighter-rouge">global</code> pyenv version where you installed <code class="language-plaintext highlighter-rouge">pipenv</code> as a command line tool:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>pyenv shell <span class="nt">--unset</span>
<span class="nb">rm</span> <span class="nt">-rf</span> .python-version
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now create a virtual environment for a specific Python version by pointing pipenv to that version’s binary inside pyenv.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="nv">PYTHON_VERSION</span><span class="o">=</span>3.10.14
<span class="nv">PYTHON_PATH</span><span class="o">=</span><span class="s2">"</span><span class="si">$(</span>pyenv root<span class="si">)</span><span class="s2">/versions/</span><span class="nv">$PYTHON_VERSION</span><span class="s2">/bin/python"</span>

<span class="nb">export </span><span class="nv">PIPENV_VENV_IN_PROJECT</span><span class="o">=</span>1
pipenv <span class="nt">--python</span> <span class="k">${</span><span class="nv">PYTHON_PATH</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This will create a virtual environment in your current directory under <code class="language-plaintext highlighter-rouge">.venv</code>, along with a <code class="language-plaintext highlighter-rouge">Pipfile</code>. If <code class="language-plaintext highlighter-rouge">PIPENV_VENV_IN_PROJECT</code> is not set to 1, the environment will be created somewhere else in the system (on macOS, in <code class="language-plaintext highlighter-rouge">~/.local/share/virtualenvs/</code>).</p>

<p>Start installing packages:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>pipenv <span class="nb">install </span>numpy
pipenv <span class="nb">install </span>pandas
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And the dependencies will be added in <code class="language-plaintext highlighter-rouge">Pipfile</code>:</p>

<div class="language-toml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
</pre></td><td class="rouge-code"><pre><span class="k">[[</span><span class="n">source</span><span class="k">]]</span>
<span class="n">url</span> <span class="o">=</span><span class="w"> </span><span class="s">"https://pypi.org/simple"</span>
<span class="n">verify_ssl</span> <span class="o">=</span><span class="w"> </span><span class="kc">true</span>
<span class="n">name</span> <span class="o">=</span><span class="w"> </span><span class="s">"pypi"</span>

<span class="k">[</span><span class="n">packages</span><span class="k">]</span>
<span class="n">numpy</span> <span class="o">=</span><span class="w"> </span><span class="s">"*"</span>
<span class="n">pandas</span> <span class="o">=</span><span class="w"> </span><span class="s">"*"</span>

<span class="k">[</span><span class="n">dev-packages</span><span class="k">]</span>

<span class="k">[</span><span class="n">requires</span><span class="k">]</span>
<span class="n">python_version</span> <span class="o">=</span><span class="w"> </span><span class="s">"3.10"</span>
<span class="n">python_full_version</span> <span class="o">=</span><span class="w"> </span><span class="s">"3.10.14"</span>

</pre></td></tr></tbody></table></code></pre></div></div>
<p>This does not say much about the numpy and pandas versions: it allows any version of those packages. If you want something more specific, for numpy for instance, you can run <code class="language-plaintext highlighter-rouge">pipenv install "numpy&gt;=1.20,&lt;2.0"</code>.</p>

<p>But obviously we have installed two packages, and in our environment they have a specific version. That is recorded in the newly created file <code class="language-plaintext highlighter-rouge">Pipfile.lock</code>; those are the pinned dependencies we were mentioning earlier in this post. Let’s inspect <code class="language-plaintext highlighter-rouge">Pipfile.lock</code> a bit:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
</pre></td><td class="rouge-code"><pre><span class="w">  </span><span class="nl">"numpy"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"hashes"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
          </span><span class="s2">"sha256:08801848a40aea24ce16c2ecde3b756f9ad756586fb2d13210939eb69b023f5b"</span><span class="p">,</span><span class="w">
          </span><span class="s2">"sha256:0937e54c09f7a9a68da6889362ddd2ff584c02d015ec92672c099b61555f8911"</span><span class="p">,</span><span class="w">
          </span><span class="s2">"..."</span><span class="w">
      </span><span class="p">],</span><span class="w">
      </span><span class="nl">"index"</span><span class="p">:</span><span class="w"> </span><span class="s2">"pypi"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"markers"</span><span class="p">:</span><span class="w"> </span><span class="s2">"python_version &gt;= '3.10'"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"==2.1.0"</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span></pre></td></tr></tbody></table></code></pre></div></div>

<p>First, notice the “version” field: it pins the exact version installed in the virtual environment. The “markers” field records the Python versions this package is compatible with, which helps you work out what you can change in your environment when resolving dependencies. A less obvious part is the hashes: these are the SHA-256 checksums of the files that are accepted for the <code class="language-plaintext highlighter-rouge">numpy</code> package. In other words, when we download the numpy wheel (a zipped Python package) and compute its checksum, it has to match one of the entries in the list, which protects against corrupted or tampered files. The list also covers the files published for the different platforms and operating systems. So, if you develop on macOS and send this lock file to a friend, the checksum for his platform should already be in the file and the lock should be compatible with his system (at least in theory).</p>

<p>Let’s continue by activating the environment we created:</p>
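<p>To make the hash check concrete, here is a minimal sketch of the idea, not pipenv’s actual implementation. The function name <code class="language-plaintext highlighter-rouge">matches_allowed_hash</code> and the stand-in bytes are assumptions made up for illustration; only the <code class="language-plaintext highlighter-rouge">sha256:</code> prefix format comes from the lock file above.</p>

```python
import hashlib


def matches_allowed_hash(data, allowed_hashes):
    """Return True if the SHA-256 of `data` is one of the 'sha256:...' entries."""
    digest = "sha256:" + hashlib.sha256(data).hexdigest()
    return digest in allowed_hashes


# Stand-in bytes for a downloaded wheel (hypothetical content, not a real package).
wheel_bytes = b"pretend these are the bytes of a numpy wheel"

# In a real lock file this list comes from the "hashes" field.
allowed = ["sha256:" + hashlib.sha256(wheel_bytes).hexdigest()]

print(matches_allowed_hash(wheel_bytes, allowed))         # True
print(matches_allowed_hash(wheel_bytes + b"x", allowed))  # False
```

<p>A single changed byte produces a different digest, so a corrupted or modified file is rejected.</p>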

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">source</span> .venv/bin/activate
</pre></td></tr></tbody></table></code></pre></div></div>

<p>A very nice feature of pipenv is <code class="language-plaintext highlighter-rouge">pipenv graph</code>, which prints the dependency tree of your packages:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre><span class="nv">pandas</span><span class="o">==</span>2.2.2
├── numpy <span class="o">[</span>required: <span class="o">&gt;=</span>1.22.4, installed: 2.1.0]
├── python-dateutil <span class="o">[</span>required: <span class="o">&gt;=</span>2.8.2, installed: 2.9.0.post0]
│   └── six <span class="o">[</span>required: <span class="o">&gt;=</span>1.5, installed: 1.16.0]
├── pytz <span class="o">[</span>required: <span class="o">&gt;=</span>2020.1, installed: 2024.1]
└── tzdata <span class="o">[</span>required: <span class="o">&gt;=</span>2022.7, installed: 2024.1]
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This feature is very useful when trying to resolve complex dependency conflicts in your project. In this example you can see that even though we manually installed <code class="language-plaintext highlighter-rouge">numpy</code> and <code class="language-plaintext highlighter-rouge">pandas</code>, we only needed to install <code class="language-plaintext highlighter-rouge">pandas</code>, as <code class="language-plaintext highlighter-rouge">numpy</code> is already a dependency of <code class="language-plaintext highlighter-rouge">pandas</code>.</p>

<h2 id="pin-the-environment">Pin the environment</h2>

<p>We have seen that being able to specify the python package versions is important for other developers to build the same environment. If your friend uses <code class="language-plaintext highlighter-rouge">pipenv</code> too, it will be enough to send him the <code class="language-plaintext highlighter-rouge">Pipfile</code> and the <code class="language-plaintext highlighter-rouge">Pipfile.lock</code>. Otherwise you can create the <code class="language-plaintext highlighter-rouge">requirements.txt</code> file too with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>pipenv requirements <span class="o">&gt;</span> requirements.txt
</pre></td></tr></tbody></table></code></pre></div></div>
<p>In our example the contents of the file will be</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre>-i https://pypi.org/simple
numpy==2.1.0; python_version &gt;= '3.10'
pandas==2.2.2; python_version &gt;= '3.9'
python-dateutil==2.9.0.post0; python_version &gt;= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'
pytz==2024.1
six==1.16.0; python_version &gt;= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'
tzdata==2024.1; python_version &gt;= '2'
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then, for instance, if you want to use <code class="language-plaintext highlighter-rouge">venv</code> module to recreate the environment you can do</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-rf</span> .venv
python <span class="nt">-m</span> venv .venv
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">-r</span> requirements.txt
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Running <code class="language-plaintext highlighter-rouge">pip freeze</code> in the new environment then gives the versions</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre>numpy==2.1.0
pandas==2.2.2
python-dateutil==2.9.0.post0
pytz==2024.1
six==1.16.0
tzdata==2024.1
</pre></td></tr></tbody></table></code></pre></div></div>
<p>that are the same.</p>

<h2 id="remove-a-pipenv-environment">Remove a pipenv environment</h2>

<p>Finally, remove the environment with:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-rf</span> .venv
<span class="nb">rm</span> <span class="nt">-rf</span> Pipfile
<span class="nb">rm</span> <span class="nt">-rf</span> Pipfile.lock
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="conclusions">Conclusions</h2>

<p>I’ve always preferred <code class="language-plaintext highlighter-rouge">venv</code> for building virtual environments since it comes by default with Python. I’m not a fan of <code class="language-plaintext highlighter-rouge">conda</code>, but I recently explored <code class="language-plaintext highlighter-rouge">pipenv</code> and was pleasantly surprised. I’ll still use <code class="language-plaintext highlighter-rouge">venv</code> for my repositories but will definitely use <code class="language-plaintext highlighter-rouge">pipenv</code> for:</p>

<ul>
  <li>Upgrading Python dependencies: <code class="language-plaintext highlighter-rouge">pipenv graph</code> makes it much easier to upgrade both Python and package versions.</li>
  <li>Generating a more detailed <code class="language-plaintext highlighter-rouge">requirements.txt</code>, including Python version info, which is useful for developers using <code class="language-plaintext highlighter-rouge">venv</code> or other tools.</li>
  <li>Cross-platform builds: Though I haven’t needed it yet, <code class="language-plaintext highlighter-rouge">Pipfile.lock</code> could simplify building environments on different platforms and reduce build times, thanks to its platform-specific hashes.</li>
</ul>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Python" /><category term="computer science" /><summary type="html"><![CDATA[As they describe in pipenv documentation: Pipenv is a Python virtualenv management tool that supports a multitude of systems and nicely bridges the gaps between pip, python (using system python, pyenv or asdf) and virtualenv. So, virtualenv and pyenv? seems like a desirable integration!.]]></summary></entry><entry><title type="html">Number theory for cryptography and privacy preserving machine learning</title><link href="https://agramunt.me/posts/number-theory/" rel="alternate" type="text/html" title="Number theory for cryptography and privacy preserving machine learning" /><published>2024-09-01T05:08:00-07:00</published><updated>2024-09-01T05:08:00-07:00</updated><id>https://agramunt.me/posts/number-theory</id><content type="html" xml:base="https://agramunt.me/posts/number-theory/"><![CDATA[<p>This is a first post in which I intend to explain the basic ingredients needed to understand the cryptography for privacy preserving machine learning topics. Here I will cover number theory, for python code check the <a href="https://github.com/SebastiaAgramunt/Cryptography">github repository</a>.</p>

<h2 id="introduction">Introduction</h2>

<p>In this post I will focus on the most basic ingredient for cryptography: basic number theory. This is needed to understand all sorts of cryptography: symmetric vs asymmetric cryptography, hash functions, digital signatures, random number generation, key exchange protocols, secret sharing schemes, homomorphisms and secure computation, among others.</p>

<h2 id="divisibility-and-greatest-common-divisor">Divisibility and greatest common divisor</h2>

<p>In this section I write about division of integers, which is used throughout this post in one way or another. Given two natural numbers $a$ and $b$ (the latter nonzero), we say that $b$ divides $a$ if there is an integer $c$ such that $a=b \cdot c$. If no such $c$ exists, we can still write $a=b \cdot q+r$ with $0 &lt; r &lt; b$, where $q$ is called the quotient and $r$ the remainder. This is what we learn at primary school, just a bit more formalised.</p>

<p>If there’s a number $d$ that divides both $a$ and $b$, we say that $d$ is a common divisor of $a$ and $b$. For instance, 2 is a common divisor of 60 and 80 since you can write $60=30 \cdot 2$ and $80=40 \cdot 2$. One way to calculate the greatest common divisor (gcd) of two integers is the extended Euclidean algorithm: given two positive integers $a$ and $b$, it finds integers $u$ and $v$ such that the following equation holds</p>

\[au+bv=\textit{gcd}(a,b)\]

<p>The python code to solve this equation can be found here. For instance, the solution for the pair $(a, b)=(30,27)$ is $(g, u, v) = (3, 1, -1)$, where $g$ is the gcd: indeed $30 \cdot 1 + 27 \cdot (-1) = 3$. One last definition: $a$ and $b$ are said to be <strong>coprime</strong> iff $gcd(a, b)=1$, that is, the largest number that divides both is 1.</p>
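<p>As an illustration, the extended Euclidean algorithm can be sketched in a few lines. This is a minimal iterative version written for this post’s example, not necessarily the code in the repository; the name <code class="language-plaintext highlighter-rouge">xgcd</code> is an assumption.</p>

```python
def xgcd(a, b):
    """Return (g, u, v) such that a*u + b*v == g, where g is gcd(a, b)."""
    # (u0, v0) are the coefficients for the current a, (u1, v1) for the current b.
    u0, u1 = 1, 0
    v0, v1 = 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        u0, u1 = u1, u0 - q * u1
        v0, v1 = v1, v0 - q * v1
    return a, u0, v0


g, u, v = xgcd(30, 27)
print(g, u, v)  # 3 1 -1
```

<p>Each loop iteration performs one division step of the usual Euclidean algorithm while keeping track of how the current remainder is written as a combination of the original $a$ and $b$.</p>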

<h2 id="modular-arithmetic">Modular arithmetic</h2>

<p>Modular arithmetic is a system of arithmetic for integers, where numbers “wrap around” when reaching a certain value. First we fix a value $m$, the modulus, and then reduce an integer $a$ modulo $m$: the result is the remainder of dividing $a$ by $m$. For instance, if $m=7$:</p>

<table>
  <thead>
    <tr>
      <th>i</th>
      <th>i(mod 7)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>1</td>
      <td>1</td>
    </tr>
    <tr>
      <td>2</td>
      <td>2</td>
    </tr>
    <tr>
      <td>3</td>
      <td>3</td>
    </tr>
    <tr>
      <td>4</td>
      <td>4</td>
    </tr>
    <tr>
      <td>5</td>
      <td>5</td>
    </tr>
    <tr>
      <td>6</td>
      <td>6</td>
    </tr>
    <tr>
      <td>7</td>
      <td>0</td>
    </tr>
    <tr>
      <td>8</td>
      <td>1</td>
    </tr>
    <tr>
      <td>9</td>
      <td>2</td>
    </tr>
  </tbody>
</table>

<p>You can see that the value stays the same for $i&lt;m$; when $i$ reaches $m$ it “wraps around” and begins with 0 again.</p>

<p>Now you may wonder: what does this have to do with cryptography? Well, there are two reasons. The first and most important is that modular arithmetic allows the construction of simple algebraic structures like groups or fields, which are the building blocks of cryptography. The second reason is that it defines a finite set of elements (not infinite like the natural or real numbers) and is therefore more tractable on a computer.</p>

<p>Addition modulo $m$ defines a <a href="https://en.wikipedia.org/wiki/Group_(mathematics)">group</a>. If we take the previous example of $m=7$, the elements of the group are $(0, 1, 2, 3, 4, 5, 6)$ and the operation is the sum modulo $m$. See the operation table for the sum modulo 7:</p>
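<p>The table above can be reproduced with Python’s <code class="language-plaintext highlighter-rouge">%</code> operator. This is just a small sketch; the variable names are arbitrary.</p>

```python
m = 7

# Residues of 0..9 modulo 7, matching the second column of the table.
residues = [i % m for i in range(10)]
print(residues)  # [0, 1, 2, 3, 4, 5, 6, 0, 1, 2]

# In Python the result takes the sign of the modulus, so negatives wrap too.
print(-1 % m)  # 6
```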

<table>
  <thead>
    <tr>
      <th>+</th>
      <th>0</th>
      <th>1</th>
      <th>2</th>
      <th>3</th>
      <th>4</th>
      <th>5</th>
      <th>6</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>0</td>
      <td>1</td>
      <td>2</td>
      <td>3</td>
      <td>4</td>
      <td>5</td>
      <td>6</td>
    </tr>
    <tr>
      <td>1</td>
      <td>1</td>
      <td>2</td>
      <td>3</td>
      <td>4</td>
      <td>5</td>
      <td>6</td>
      <td>0</td>
    </tr>
    <tr>
      <td>2</td>
      <td>2</td>
      <td>3</td>
      <td>4</td>
      <td>5</td>
      <td>6</td>
      <td>0</td>
      <td>1</td>
    </tr>
    <tr>
      <td>3</td>
      <td>3</td>
      <td>4</td>
      <td>5</td>
      <td>6</td>
      <td>0</td>
      <td>1</td>
      <td>2</td>
    </tr>
    <tr>
      <td>4</td>
      <td>4</td>
      <td>5</td>
      <td>6</td>
      <td>0</td>
      <td>1</td>
      <td>2</td>
      <td>3</td>
    </tr>
    <tr>
      <td>5</td>
      <td>5</td>
      <td>6</td>
      <td>0</td>
      <td>1</td>
      <td>2</td>
      <td>3</td>
      <td>4</td>
    </tr>
    <tr>
      <td>6</td>
      <td>6</td>
      <td>0</td>
      <td>1</td>
      <td>2</td>
      <td>3</td>
      <td>4</td>
      <td>5</td>
    </tr>
  </tbody>
</table>

<p>A group has the following properties</p>

<ul>
  <li><strong>Element closure</strong>: for any two elements a and b of the group, their operation returns c which is also in the group. E.g. (5+3)(mod 7)=1</li>
  <li><strong>Associativity</strong>: for any three elements a, b, c it holds (a+b)+c=a+(b+c). This obviously happens in the sum operation.</li>
  <li><strong>Existence of identity</strong>: There exists an element e in the set such that for any a in the set a+e=a. In this example the neutral element is 0. E.g. (5+0)(mod 7)=5.</li>
  <li><strong>Inverse element</strong>: For any element in the group a there must be another element b such that a+b=e. E.g. the inverse of 5 is 2 in our example because (5+2)(mod 7)=0</li>
</ul>

<p>If the operation is also commutative (i.e. $a+b=b+a$ for all elements), the group is called a <strong>commutative group</strong> or <strong>abelian</strong> group.</p>
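<p>For a small finite set, the four properties can be checked by brute force. The sketch below does exactly that; the helper name <code class="language-plaintext highlighter-rouge">is_group</code> is an assumption made for illustration, and the approach only makes sense for tiny sets since it tries every combination.</p>

```python
from itertools import product


def is_group(elements, op, identity):
    """Brute-force check of closure, associativity, identity and inverses."""
    elems = set(elements)
    closed = all(op(a, b) in elems for a, b in product(elems, repeat=2))
    associative = all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elems, repeat=3)
    )
    has_identity = identity in elems and all(
        op(a, identity) == a and op(identity, a) == a for a in elems
    )
    has_inverses = all(any(op(a, b) == identity for b in elems) for a in elems)
    return closed and associative and has_identity and has_inverses


m = 7
add_mod = lambda a, b: (a + b) % m  # addition modulo 7

print(is_group(range(m), add_mod, 0))  # True
```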

<h2 id="modulo-operations-on-product">Modulo operations on product</h2>

<p>The sum modulo operation works well to construct a group: just choose an $m$ (any natural number $m$ will work) and you will be in the realm of the numbers $(0, 1, 2, …, m-1)$ with addition modulo $m$. But what if instead of the sum we choose the product to define a group? In this case the neutral element is $1$, and we’ll find that we cannot form a group for arbitrary $m$. For instance, take $m=10$: are you able to find the multiplicative inverse of 2, i.e. find $x$ such that</p>

\[2 \cdot x =1 \pmod{10}\]

<p>Let’s inspect the entire multiplication table  for $2 \cdot x \pmod{10}$</p>

<table>
  <thead>
    <tr>
      <th>$x$</th>
      <th>$2 \cdot x \pmod{10}$</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>1</td>
      <td>2</td>
    </tr>
    <tr>
      <td>2</td>
      <td>4</td>
    </tr>
    <tr>
      <td>3</td>
      <td>6</td>
    </tr>
    <tr>
      <td>4</td>
      <td>8</td>
    </tr>
    <tr>
      <td>5</td>
      <td>0</td>
    </tr>
    <tr>
      <td>6</td>
      <td>2</td>
    </tr>
    <tr>
      <td>7</td>
      <td>4</td>
    </tr>
    <tr>
      <td>8</td>
      <td>6</td>
    </tr>
    <tr>
      <td>9</td>
      <td>8</td>
    </tr>
  </tbody>
</table>

<p>From the multiplication table you can see that there’s no number that multiplied by $2$ gives the multiplicative identity $1$ modulo $10$. These elements do not form a group because some of them lack an inverse (if you check, those without an inverse are 0, 2, 4, 5, 6, 8). The good news is that we can pick the elements that do have an inverse and form a group! Let me write the multiplication table for such a group with $m=10$:</p>

<table>
  <thead>
    <tr>
      <th>x</th>
      <th>1</th>
      <th>3</th>
      <th>7</th>
      <th>9</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>1</td>
      <td>1</td>
      <td>3</td>
      <td>7</td>
      <td>9</td>
    </tr>
    <tr>
      <td>3</td>
      <td>3</td>
      <td>9</td>
      <td>1</td>
      <td>7</td>
    </tr>
    <tr>
      <td>7</td>
      <td>7</td>
      <td>1</td>
      <td>9</td>
      <td>3</td>
    </tr>
    <tr>
      <td>9</td>
      <td>9</td>
      <td>7</td>
      <td>3</td>
      <td>1</td>
    </tr>
  </tbody>
</table>

<p>See that all elements in the set $\{1, 3, 7, 9\}$ have an inverse. We can say that $\{1, 3, 7, 9\}$ with multiplication $\pmod{10}$ forms an abelian group.</p>

<p>Is there a way to know if an element has an inverse modulo $m$? Yes! Let $a$ and $m$ be positive integers such that $a&lt;m$; then $a \cdot b = 1 \pmod{m}$ for some integer $b$ if and only if $gcd(a, m)=1$. In fact, we can use the <strong>extended Euclidean algorithm</strong> to calculate the inverse:</p>
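<p>The invertible elements can also be found by a brute-force search, which is a handy sanity check for small moduli. This is only a sketch and the function name <code class="language-plaintext highlighter-rouge">units</code> is an assumption.</p>

```python
def units(m):
    """Elements of 1..m-1 that have a multiplicative inverse modulo m."""
    candidates = range(1, m)
    return [a for a in candidates if any((a * b) % m == 1 for b in candidates)]


print(units(10))  # [1, 3, 7, 9]
print(units(7))   # [1, 2, 3, 4, 5, 6]
```

<p>For $m=10$ we recover exactly the four elements of the table, while for the prime $m=7$ every nonzero element is invertible.</p>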

<p>If $gcd(a, m)=1$, the extended Euclidean algorithm gives integers $u$ and $v$ such that:</p>

\[au+mv=1\]

<p>by applying $\pmod{m}$ to both sides of the equation, the term $mv$ vanishes and we have</p>

\[au=1 \pmod{m}\]

<p>and therefore $u$ is the inverse of $a \pmod{m}$. There is a special case when $m$ is a prime number; we will denote a general prime number by $p$ from now on. In this case $gcd(a, p)=1$ for every $a$ in $(1, 2, 3, …, p-1)$, since $p$ is only divisible by itself and $1$. Therefore all these elements have an inverse and form a group with the product $\pmod{p}$. An easy way to calculate the multiplicative inverse in a prime modulo group is to use <strong>Fermat’s little theorem</strong>: let $p$ be a prime number and let $a$ be any integer, then:</p>

\[a^{p-1}=1 \pmod{p}\]

<p>if $a$ is not divisible by $p$; otherwise the left-hand side equals $0$. To calculate the inverse of $a$, we just need to multiply both sides of the equation by $a^{-1}$:</p>

\[a^{-1}=a^{p-2}\pmod{p}\]

<p>so we can now calculate the inverse of $a$ modulo $p$. This may seem computationally expensive, but we can use the fast powering algorithm, a Python implementation of which can be found <a href="https://github.com/SebastiaAgramunt/Cryptography/blob/master/notebooks/crypt.py#L135">here</a>.</p>
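<p>For reference, here is a minimal sketch of fast powering (square-and-multiply) and of the Fermat-based inverse. It is written for this explanation and may differ from the linked implementation; the names <code class="language-plaintext highlighter-rouge">fast_pow</code> and <code class="language-plaintext highlighter-rouge">inverse_mod_prime</code> are assumptions.</p>

```python
def fast_pow(base, exponent, modulus):
    """Square-and-multiply: compute base**exponent modulo `modulus`."""
    result = 1
    base %= modulus
    while exponent != 0:
        if exponent % 2 == 1:
            # The current binary digit of the exponent is 1: multiply it in.
            result = (result * base) % modulus
        base = (base * base) % modulus
        exponent //= 2
    return result


def inverse_mod_prime(a, p):
    """Inverse of a modulo p, assuming p is prime and does not divide a."""
    return fast_pow(a, p - 2, p)


p = 7
inv = inverse_mod_prime(5, p)
print(inv, (5 * inv) % p)  # 3 1
```

<p>The loop runs once per binary digit of the exponent, so the cost grows with the number of bits of $p$ rather than with $p$ itself. Python’s built-in <code class="language-plaintext highlighter-rouge">pow(a, p - 2, p)</code> computes the same value.</p>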

<h2 id="rings-and-fields">Rings and Fields</h2>

<p>So far we have seen how to define mathematical groups with sum and multiplication modulo an integer $m$. We can construct other algebraic structures using both operations. A <strong>ring</strong> is an algebraic structure consisting of a set of elements $S$ and two operations $(+, \cdot)$ that fulfil the following properties:</p>

<ul>
  <li>The set with the first operation forms an abelian group, i.e. $(S, +)$ is abelian.</li>
  <li>The second operation is associative, i.e. if $a$, $b$, $c$ are elements of $S$ then $(a \cdot b) \cdot c=a \cdot (b \cdot c)$.</li>
  <li>The second operation has an identity, i.e. there exists an element $e$ such that for any $a$ in the set $a \cdot e=e \cdot a=a$.</li>
  <li>The second operation is distributive with respect to the first one. This is $a \cdot (b + c)=a \cdot b +a \cdot c$ and $(b + c) \cdot a = b \cdot a + c \cdot a$.</li>
</ul>

<p>The set $S=\{0, 1, 2, 3\}$ with addition modulo 4 ($+ \pmod{4}$) and multiplication modulo 4 ($\cdot \pmod{4}$) is a ring. Another example of a ring is the set of $3 \times 3$ matrices with real coefficients. You can check that all the properties above are satisfied (take-home exercise!).</p>

<p>A <strong>field</strong> is a ring such that the second operation also satisfies all the abelian group properties (after throwing out the identity element of the first operation). In other words, in a field every nonzero element has a multiplicative inverse, there is a multiplicative identity, and multiplication is commutative.</p>

<p>A typical example of a field is the set $S=\{0, 1, 2, 3, …, p-1\}$ where $p$ is a prime number and the operations are addition and multiplication modulo $p$. Since $p$ is prime, all the nonzero elements of the set have a multiplicative inverse and therefore constitute an abelian group under multiplication. This field is commonly denoted as $\mathbb{F}_p$. The set of $3 \times 3$ matrices with real coefficients is not a field since some matrices do not have a multiplicative inverse (and matrix multiplication is not commutative).</p>
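<p>The difference between a ring such as the integers modulo 4 and a field such as the integers modulo 7 can be seen by listing the nonzero elements that lack a multiplicative inverse. This is a brute-force sketch with a made-up helper name, meant only for small moduli.</p>

```python
def nonzero_without_inverse(m):
    """Nonzero residues modulo m that have no multiplicative inverse."""
    candidates = range(1, m)
    return [a for a in candidates if all((a * b) % m != 1 for b in candidates)]


print(nonzero_without_inverse(4))  # [2]
print(nonzero_without_inverse(7))  # []
```

<p>Modulo 4 the element 2 has no inverse, so we only get a ring; modulo the prime 7 the list is empty, which is what makes it a field.</p>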

<h2 id="conclusions">Conclusions</h2>

<p>We have seen what modular arithmetic is and how it lets us define groups, rings and fields over finite sets of elements. These structures appear constantly in cryptography, for instance when working with elliptic curves or in simple protocols like the Diffie-Hellman key exchange.</p>

<p>In the next posts we are going to work with the defined algebraic structures to understand key concepts in cryptography and multiparty computation.</p>

<p>Thank you for reading! If you like the article, please star my <a href="https://github.com/SebastiaAgramunt/Cryptography">github repository</a>.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Cryptography" /><category term="cryptography" /><category term="mathematics" /><summary type="html"><![CDATA[This is a first post in which I intend to explain the basic ingredients needed to understand the cryptography for privacy preserving machine learning topics. Here I will cover number theory, for python code check the github repository.]]></summary></entry><entry><title type="html">Setup a python project</title><link href="https://agramunt.me/posts/python-project/" rel="alternate" type="text/html" title="Setup a python project" /><published>2024-09-01T03:05:00-07:00</published><updated>2024-09-01T03:05:00-07:00</updated><id>https://agramunt.me/posts/python-project</id><content type="html" xml:base="https://agramunt.me/posts/python-project/"><![CDATA[<p>The easiest way to distribute your code is to package it, that is, to create a package that can be <code class="language-plaintext highlighter-rouge">pip</code> installed. In this post I am going to show how to do that.</p>

<h2 id="project-structure">Project structure</h2>

<p>First you need to define a project structure. Our project is going to be pure Python, that is, no compiled code of our own, just Python and compiled dependencies like <code class="language-plaintext highlighter-rouge">numpy</code>. Create the following files in your new project directory; we will call the new package <code class="language-plaintext highlighter-rouge">mypackage</code>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
</pre></td><td class="rouge-code"><pre><span class="nb">.</span>
├── README.md
├── pyproject.toml
├── src
│   └── mypackage
│       ├── __init__.py
│       ├── modulea.py
│       └── moduleb.py
└── tests
    ├── __init__.py
    ├── test_modulea.py
    └── test_moduleb.py
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">pyproject.toml</code> file is where the project metadata and all the dependencies are defined; we will go over it in a later section. For now, leave the <code class="language-plaintext highlighter-rouge">__init__.py</code> files empty and add a custom function to <code class="language-plaintext highlighter-rouge">modulea.py</code>:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="k">def</span> <span class="nf">add</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">a</span> <span class="o">+</span> <span class="n">b</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and another example function to <code class="language-plaintext highlighter-rouge">moduleb.py</code>:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="k">def</span> <span class="nf">subtract</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">a</span> <span class="o">-</span> <span class="n">b</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>We’ll get to the testing part in a while but basically that’s it, this is a pure python package.</p>
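<p>If you prefer not to create the files by hand, the layout can be generated with a few lines of Python. This is only a sketch: the <code class="language-plaintext highlighter-rouge">create_skeleton</code> helper and the <code class="language-plaintext highlighter-rouge">FILES</code> table are names made up for this example, and the files other than the two modules are left empty.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from pathlib import Path
import tempfile

# Files of the example layout, relative to the project root.
# The two modules get the functions shown above; other files start empty.
FILES = {
    "README.md": "",
    "pyproject.toml": "",
    "src/mypackage/__init__.py": "",
    "src/mypackage/modulea.py": "def add(a, b):\n    return a + b\n",
    "src/mypackage/moduleb.py": "def subtract(a, b):\n    return a - b\n",
    "tests/__init__.py": "",
    "tests/test_modulea.py": "",
    "tests/test_moduleb.py": "",
}


def create_skeleton(root):
    """Create the example project tree under root and return the created paths."""
    created = []
    for relative, content in FILES.items():
        path = Path(root) / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)
        created.append(path)
    return created


if __name__ == "__main__":
    # Use a throwaway directory so that running the sketch leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        for path in create_skeleton(tmp):
            print(path.relative_to(tmp))
</code></pre></div></div>

<p>To create the real project, call <code class="language-plaintext highlighter-rouge">create_skeleton</code> with your project directory instead of a temporary one.</p>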

<h2 id="importing-without-installing">Importing without installing</h2>

<p>Python packages and modules are imported from files and directories. As an example, create a new virtual environment in the root directory of the project:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="nv">PYTHON_VERSION</span><span class="o">=</span>3.12.4
pyenv shell <span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>
python <span class="nt">-m</span> venv .venv
</pre></td></tr></tbody></table></code></pre></div></div>

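<p>Activate the environment and import <code class="language-plaintext highlighter-rouge">add</code> from <code class="language-plaintext highlighter-rouge">modulea</code>:</p>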

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">source</span> .venv/bin/activate
python <span class="nt">-c</span> <span class="s2">"from src.mypackage.modulea import add; print(add(3,4))"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This works because <code class="language-plaintext highlighter-rouge">src.mypackage.modulea</code> mirrors the directory path to the file <code class="language-plaintext highlighter-rouge">modulea.py</code>, relative to the current directory. Now go one directory up and try the same:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">cd</span> ..
python <span class="nt">-c</span> <span class="s2">"from src.mypackage.modulea import add; print(add(3,4))"</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>The error you get is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>Traceback (most recent call last):
  File "&lt;string&gt;", line 1, in &lt;module&gt;
ModuleNotFoundError: No module named 'src'
</pre></td></tr></tbody></table></code></pre></div></div>
<p>This means the import fails because you are no longer in the project directory. How can we import the package from anywhere while our virtual environment is activated? We need to install the package into the virtual environment, which we will do in the following sections.</p>
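<p>The reason is that Python resolves imports by searching the directories listed in <code class="language-plaintext highlighter-rouge">sys.path</code>, and <code class="language-plaintext highlighter-rouge">python -c</code> puts the current directory first in that list. The sketch below illustrates the mechanism on a throwaway copy of the layout; the helper <code class="language-plaintext highlighter-rouge">import_add_from</code> is a name invented for this example, and editing <code class="language-plaintext highlighter-rouge">sys.path</code> by hand is shown only to explain the behaviour, not as a recommended practice.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import importlib
import sys
import tempfile
from pathlib import Path


def import_add_from(directory):
    """Import mypackage.modulea after putting directory on sys.path."""
    sys.path.insert(0, str(directory))
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("mypackage.modulea")
        return module.add
    finally:
        # Restore sys.path so the change does not leak to other code.
        sys.path.remove(str(directory))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Recreate a minimal src/mypackage layout in a throwaway directory.
        package = Path(tmp) / "src" / "mypackage"
        package.mkdir(parents=True)
        (package / "__init__.py").write_text("")
        (package / "modulea.py").write_text("def add(a, b):\n    return a + b\n")

        # The import works only because the src directory is on sys.path.
        add = import_add_from(Path(tmp) / "src")
        print(add(3, 4))
</code></pre></div></div>

<p>Installing the package achieves the same effect in a durable way: the code ends up in a directory that is always on the search path of the environment.</p>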

<h2 id="defining-a-project-with-pyprojecttoml">Defining a project with pyproject.toml</h2>

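<p>The “new” (since 2016) standard file to configure a package is <code class="language-plaintext highlighter-rouge">pyproject.toml</code>. It was introduced in <a href="https://peps.python.org/pep-0518/">PEP 518</a> to declare build requirements and was later extended by <a href="https://peps.python.org/pep-0621/">PEP 621</a> to hold the project metadata and dependencies. Here is an example:</p>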

<div class="language-toml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
</pre></td><td class="rouge-code"><pre><span class="k">[</span><span class="n">build-system</span><span class="k">]</span>
<span class="n">requires</span> <span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="s">"setuptools&gt;=42"</span><span class="p">,</span> <span class="s">"wheel"</span><span class="p">]</span>
<span class="n">build-backend</span> <span class="o">=</span><span class="w"> </span><span class="s">"setuptools.build_meta"</span>

<span class="k">[</span><span class="n">project</span><span class="k">]</span>
<span class="n">name</span> <span class="o">=</span><span class="w"> </span><span class="s">"mypackage"</span>
<span class="n">version</span> <span class="o">=</span><span class="w"> </span><span class="s">"0.1.0"</span>
<span class="n">description</span> <span class="o">=</span><span class="w"> </span><span class="s">"A simple example project"</span>
<span class="n">authors</span> <span class="o">=</span><span class="w"> </span><span class="p">[</span>
    <span class="p">{</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="s">"Your Name"</span><span class="p">,</span><span class="w"> </span><span class="n">email</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="s">"you@example.com"</span><span class="w"> </span><span class="p">}</span>
<span class="p">]</span>
<span class="n">license</span> <span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">text</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="s">"MIT"</span><span class="w"> </span><span class="p">}</span>
<span class="n">readme</span> <span class="o">=</span><span class="w"> </span><span class="s">"README.md"</span>
<span class="n">keywords</span> <span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="s">"example"</span><span class="p">,</span> <span class="s">"setuptools"</span><span class="p">,</span> <span class="s">"pyproject"</span><span class="p">]</span>
<span class="n">classifiers</span> <span class="o">=</span><span class="w"> </span><span class="p">[</span>
    <span class="s">"Programming Language :: Python :: 3"</span><span class="p">,</span>
    <span class="s">"License :: OSI Approved :: MIT License"</span><span class="p">,</span>
    <span class="s">"Operating System :: OS Independent"</span><span class="p">,</span>
<span class="p">]</span>
<span class="n">requires-python</span> <span class="o">=</span><span class="w"> </span><span class="s">"&gt;=3.9,&lt;3.13"</span>
<span class="n">dependencies</span> <span class="o">=</span><span class="w"> </span><span class="p">[</span>
    <span class="s">"requests&gt;=2.20"</span><span class="p">,</span>
    <span class="s">"numpy&gt;=1.18"</span><span class="p">,</span>
    <span class="s">"pandas&gt;=2.0"</span><span class="p">,</span>
    <span class="s">"pipdeptree&gt;2.23"</span>
<span class="p">]</span>

<span class="k">[</span><span class="n">project</span><span class="k">.</span><span class="n">urls</span><span class="k">]</span>
<span class="s">"Homepage"</span> <span class="o">=</span><span class="w"> </span><span class="s">"https://example.com"</span>
<span class="s">"Repository"</span> <span class="o">=</span><span class="w"> </span><span class="s">"https://github.com/example/mypackage"</span>

<span class="k">[</span><span class="n">tool</span><span class="k">.</span><span class="n">setuptools</span><span class="k">]</span>
<span class="n">packages</span> <span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="s">"mypackage"</span><span class="p">]</span>
<span class="n">package-dir</span> <span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="s">""</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="s">"src"</span><span class="p">}</span>

<span class="k">[</span><span class="n">project</span><span class="k">.</span><span class="n">optional-dependencies</span><span class="k">]</span>
<span class="n">dev</span> <span class="o">=</span><span class="w"> </span><span class="p">[</span>
    <span class="s">"tox&gt;=4.2"</span><span class="p">,</span>
    <span class="s">"pytest&gt;=8.3"</span><span class="p">,</span>
    <span class="s">"pytest-cov&gt;=5"</span><span class="p">,</span>
    <span class="s">"ruff&gt;=0.7"</span>
<span class="p">]</span>

</pre></td></tr></tbody></table></code></pre></div></div>
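<p>Since <code class="language-plaintext highlighter-rouge">pyproject.toml</code> is plain TOML, you can inspect it from Python. The following sketch assumes Python 3.11 or newer, where <code class="language-plaintext highlighter-rouge">tomllib</code> is part of the standard library; the <code class="language-plaintext highlighter-rouge">summarize</code> function is a hypothetical helper and the embedded text is a trimmed copy of the file above.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import tomllib  # standard library since Python 3.11

# Trimmed copy of the pyproject.toml above, embedded to keep the sketch self-contained.
PYPROJECT = """
[project]
name = "mypackage"
version = "0.1.0"
dependencies = ["requests&gt;=2.20", "numpy&gt;=1.18", "pandas&gt;=2.0", "pipdeptree&gt;2.23"]

[project.optional-dependencies]
dev = ["tox&gt;=4.2", "pytest&gt;=8.3", "pytest-cov&gt;=5", "ruff&gt;=0.7"]
"""


def summarize(text):
    """Return the package name, runtime dependencies and extras from pyproject text."""
    project = tomllib.loads(text)["project"]
    return {
        "name": project["name"],
        "dependencies": project.get("dependencies", []),
        "extras": project.get("optional-dependencies", {}),
    }


if __name__ == "__main__":
    summary = summarize(PYPROJECT)
    print(summary["name"])
    print(summary["dependencies"])
    print(sorted(summary["extras"]))
</code></pre></div></div>

<p>The runtime dependencies and the <code class="language-plaintext highlighter-rouge">dev</code> extra live in different tables, which is why a plain install does not pull in the development tools.</p>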

<p>Install the package in a new environment</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
</pre></td><td class="rouge-code"><pre><span class="c"># select the Python version, create the environment and activate it</span>
<span class="nv">PYTHON_VERSION</span><span class="o">=</span>3.12.4
pyenv shell <span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>
python <span class="nt">-m</span> venv .venv
<span class="nb">source</span> .venv/bin/activate

<span class="c"># upgrade pip and install package</span>
pip <span class="nb">install</span> <span class="nt">--upgrade</span> pip
python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nb">.</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>That installs all the dependencies except the optional ones. To also install the optional dependencies (in this example the <code class="language-plaintext highlighter-rouge">dev</code> group), run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre>pip <span class="nb">install</span> .[dev]

<span class="c"># if you use zsh like me, do</span>
pip <span class="nb">install</span> <span class="s1">'.[dev]'</span>

<span class="c"># or if you want to install the package in editable mode</span>
python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">-e</span> <span class="s1">'.[dev]'</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Test that the package is installed</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>python -c "from mypackage.modulea import add; print(add(3,4))"
</pre></td></tr></tbody></table></code></pre></div></div>
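<p>Another way to check the installation, sketched here with the standard library module <code class="language-plaintext highlighter-rouge">importlib.metadata</code>, is to ask for the installed version of the distribution. The helper name <code class="language-plaintext highlighter-rouge">installed_version</code> is made up for this example.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "mypackage" is the name declared in pyproject.toml.
    print(installed_version("mypackage"))
</code></pre></div></div>

<p>If the package was installed from the <code class="language-plaintext highlighter-rouge">pyproject.toml</code> above, this should print the version declared there; otherwise it prints <code class="language-plaintext highlighter-rouge">None</code>.</p>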

<h2 id="test-your-package">Test your package</h2>

<p>The <code class="language-plaintext highlighter-rouge">unittest</code> package is part of the Python standard library and is convenient and easy to use. Under the <code class="language-plaintext highlighter-rouge">tests</code> directory, add two files with unit tests for your code.</p>

<p>In <code class="language-plaintext highlighter-rouge">test_modulea.py</code>, copy</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
</pre></td><td class="rouge-code"><pre><span class="kn">import</span> <span class="n">unittest</span>
<span class="kn">from</span> <span class="n">mypackage.modulea</span> <span class="kn">import</span> <span class="n">add</span>

<span class="k">class</span> <span class="nc">TestModule1</span><span class="p">(</span><span class="n">unittest</span><span class="p">.</span><span class="n">TestCase</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">test_add</span><span class="p">(</span><span class="n">self</span><span class="p">):</span>
        <span class="n">self</span><span class="p">.</span><span class="nf">assertEqual</span><span class="p">(</span><span class="nf">add</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span> <span class="mi">3</span><span class="p">)</span>
        <span class="n">self</span><span class="p">.</span><span class="nf">assertEqual</span><span class="p">(</span><span class="nf">add</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="mi">0</span><span class="p">)</span>
        <span class="n">self</span><span class="p">.</span><span class="nf">assertEqual</span><span class="p">(</span><span class="nf">add</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">),</span> <span class="mi">0</span><span class="p">)</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="sh">'</span><span class="s">__main__</span><span class="sh">'</span><span class="p">:</span>
    <span class="n">unittest</span><span class="p">.</span><span class="nf">main</span><span class="p">()</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and in <code class="language-plaintext highlighter-rouge">test_moduleb.py</code></p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
</pre></td><td class="rouge-code"><pre><span class="kn">import</span> <span class="n">unittest</span>
<span class="kn">from</span> <span class="n">mypackage.moduleb</span> <span class="kn">import</span> <span class="n">subtract</span>

<span class="k">class</span> <span class="nc">TestModule2</span><span class="p">(</span><span class="n">unittest</span><span class="p">.</span><span class="n">TestCase</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">test_subtract</span><span class="p">(</span><span class="n">self</span><span class="p">):</span>
        <span class="n">self</span><span class="p">.</span><span class="nf">assertEqual</span><span class="p">(</span><span class="nf">subtract</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="mi">1</span><span class="p">)</span>
        <span class="n">self</span><span class="p">.</span><span class="nf">assertEqual</span><span class="p">(</span><span class="nf">subtract</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="mi">0</span><span class="p">)</span>
        <span class="n">self</span><span class="p">.</span><span class="nf">assertEqual</span><span class="p">(</span><span class="nf">subtract</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="sh">'</span><span class="s">__main__</span><span class="sh">'</span><span class="p">:</span>
    <span class="n">unittest</span><span class="p">.</span><span class="nf">main</span><span class="p">()</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Since each file calls <code class="language-plaintext highlighter-rouge">unittest.main()</code> when run as a script, you can execute the tests by running the files directly:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>.venv/bin/python tests/test_modulea.py
.venv/bin/python tests/test_moduleb.py
</pre></td></tr></tbody></table></code></pre></div></div>

<p>but you can also run them all with pytest, just do</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-m</span> pytest
</pre></td></tr></tbody></table></code></pre></div></div>

<p>from the root directory. You can also run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-m</span> unittest discover <span class="nt">-s</span> tests <span class="nt">-p</span> <span class="s2">"test_*.py"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>to run the same tests. This command invokes the <code class="language-plaintext highlighter-rouge">unittest</code> command line interface and tells it to discover tests under the <code class="language-plaintext highlighter-rouge">tests</code> directory in files matching the pattern <code class="language-plaintext highlighter-rouge">test_*.py</code>. <code class="language-plaintext highlighter-rouge">pytest</code> is simpler to use: by default it collects files named <code class="language-plaintext highlighter-rouge">test_*.py</code> or <code class="language-plaintext highlighter-rouge">*_test.py</code> without any extra arguments, and it also runs <code class="language-plaintext highlighter-rouge">unittest</code>-style test cases. On the other hand, <code class="language-plaintext highlighter-rouge">unittest</code> is part of the standard library, so it needs no installation. I would use <code class="language-plaintext highlighter-rouge">pytest</code> in every modern project, always placing it under the <code class="language-plaintext highlighter-rouge">[dev]</code> dependencies.</p>
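<p>With <code class="language-plaintext highlighter-rouge">pytest</code> you can also write tests as plain functions with bare <code class="language-plaintext highlighter-rouge">assert</code> statements, without a <code class="language-plaintext highlighter-rouge">TestCase</code> class. Here is a minimal sketch of the first test file in that style; <code class="language-plaintext highlighter-rouge">add</code> is redefined inline only to keep the example self-contained, in the project you would import it from <code class="language-plaintext highlighter-rouge">mypackage.modulea</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def add(a, b):
    # Stand-in for mypackage.modulea.add so that the sketch is self-contained.
    return a + b


def test_add():
    # pytest collects functions named test_* and reports failing plain asserts.
    assert add(1, 2) == 3
    assert add(-1, 1) == 0
    assert add(0, 0) == 0


if __name__ == "__main__":
    # Allows running the file directly, without pytest.
    test_add()
    print("ok")
</code></pre></div></div>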

<p>Measuring coverage is also important, as it helps you track whether all your functionality is tested. We can get a coverage report along with our <code class="language-plaintext highlighter-rouge">pytest</code> run using the package <code class="language-plaintext highlighter-rouge">pytest-cov</code>. Run the tests and the coverage with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-m</span> pytest <span class="nt">--cov</span><span class="o">=</span>src/mypackage tests/
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The output should be something like this:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
</pre></td><td class="rouge-code"><pre>platform darwin <span class="nt">--</span> Python 3.11.2, pytest-8.3.3, pluggy-1.5.0
rootdir: /Users/sebas/tmp/mypackage
configfile: pyproject.toml
plugins: cov-5.0.0
collected 2 items

tests/test_modulea.py <span class="nb">.</span>                                                   <span class="o">[</span> 50%]
tests/test_moduleb.py <span class="nb">.</span>                                                   <span class="o">[</span>100%]

<span class="nt">----------</span> coverage: platform darwin, python 3.11.2-final-0 <span class="nt">----------</span>
Name                        Stmts   Miss  Cover
<span class="nt">-----------------------------------------------</span>
src/mypackage/__init__.py       0      0   100%
src/mypackage/modulea.py        2      0   100%
src/mypackage/moduleb.py        2      0   100%
<span class="nt">-----------------------------------------------</span>
TOTAL                           4      0   100%
</pre></td></tr></tbody></table></code></pre></div></div>

<p>All tests pass and the coverage is 100%, meaning that every statement in the package is executed by at least one test.</p>
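<p>To get a feeling for what the <code class="language-plaintext highlighter-rouge">Miss</code> column means, consider a function with a branch. The function <code class="language-plaintext highlighter-rouge">safe_divide</code> below is hypothetical and not part of <code class="language-plaintext highlighter-rouge">mypackage</code>; it only illustrates how a statement stays uncovered until some test executes it.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def safe_divide(a, b):
    """Divide a by b, returning None when b is zero."""
    if b == 0:
        return None  # reported as a miss unless some test passes b == 0
    return a / b


def test_safe_divide_regular():
    # Runs the function, but never takes the b == 0 branch.
    assert safe_divide(6, 3) == 2


def test_safe_divide_by_zero():
    # Adding this test executes the remaining statement.
    assert safe_divide(1, 0) is None


if __name__ == "__main__":
    test_safe_divide_regular()
    test_safe_divide_by_zero()
    print("ok")
</code></pre></div></div>

<p>With only the first test the report would show one missed statement for this function; with both tests every statement is executed.</p>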

<h2 id="testing-with-tox">Testing with tox</h2>

<p>This section may be controversial. tox is a tool that lets you test against different Python versions locally. This means testing on your own operating system and architecture only, not on other platforms, which is fine as long as you are aware of it; for a pure Python package that needs no compilation this is usually enough. Let’s see how it works with an example: create a file <code class="language-plaintext highlighter-rouge">tox.ini</code> in the root of the project with the content</p>

<div class="language-ini highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
</pre></td><td class="rouge-code"><pre>[tox]
envlist = py39, py310, py311, py312
isolated_build = True
skip_missing_interpreters = False

[testenv]
deps =
    pytest&gt;=7.0
    pytest-cov&gt;=2.12
commands =
    pytest --cov=mypackage --cov-report=term-missing {posargs}
</pre></td></tr></tbody></table></code></pre></div></div>

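<p>Now run <code class="language-plaintext highlighter-rouge">tox</code> in your environment (it was installed through the <code class="language-plaintext highlighter-rouge">dev</code> dependencies in <code class="language-plaintext highlighter-rouge">pyproject.toml</code>) and, lo and behold, it creates a new environment for each Python version and runs the tests with coverage. Every interpreter listed in <code class="language-plaintext highlighter-rouge">envlist</code> must be available on your machine, since <code class="language-plaintext highlighter-rouge">skip_missing_interpreters</code> is set to <code class="language-plaintext highlighter-rouge">False</code>. In another post we will see how to set up automatic testing for different platforms on GitHub using GitHub Actions.</p>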
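<p>The names in <code class="language-plaintext highlighter-rouge">envlist</code> encode the Python version each environment uses. As a small illustration, this sketch reads the list with <code class="language-plaintext highlighter-rouge">configparser</code> and translates each name into a version string; the helper <code class="language-plaintext highlighter-rouge">python_versions</code> is invented for this example and only handles names of the form <code class="language-plaintext highlighter-rouge">pyXY</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import configparser

# Trimmed copy of the tox.ini above.
TOX_INI = """
[tox]
envlist = py39, py310, py311, py312
"""


def python_versions(text):
    """Translate tox names such as py312 into version strings such as 3.12."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    versions = []
    for env in parser["tox"]["envlist"].split(","):
        digits = env.strip().removeprefix("py")
        # The first digit is the major version, the remaining digits the minor one.
        versions.append(f"{digits[0]}.{digits[1:]}")
    return versions


if __name__ == "__main__":
    # These are the interpreters that need to be installed before running tox.
    print(python_versions(TOX_INI))
</code></pre></div></div>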

<h2 id="linting-with-ruff">Linting with Ruff</h2>

<p>Code should be organised and structured in an agreed way. That is why some companies publish their code style (e.g. <a href="https://google.github.io/styleguide/pyguide.html">Python’s style at Google</a>). Linters are useful to enforce such a style in an automated way. The Python Enhancement Proposal 8, or <a href="https://peps.python.org/pep-0008/">PEP8</a>, proposes a series of best practices for writing code. Linters check them through rules such as the ones listed in <a href="https://www.flake8rules.com/">flake8rules.com</a>, each identified by a letter and a number:</p>

<ul>
  <li>Error Codes: Starting with E (e.g., E123, E501) to represent style issues.</li>
  <li>Warning Codes: Starting with W (e.g., W503).</li>
  <li>Complexity Checks: Codes starting with C (e.g., C901), usually related to cyclomatic complexity.</li>
  <li>Other Code Categories: Codes like F, N, and D for various kinds of issues such as undefined names (F), naming conventions (N), or docstring style (D).</li>
</ul>

<p>A popular linter for Python is <a href="https://github.com/astral-sh/ruff">Ruff</a>, and it is the one we will use for the project; other tools are the linter <a href="https://github.com/PyCQA/flake8">flake8</a> or the formatter <a href="https://github.com/psf/black">black</a>. I already included <code class="language-plaintext highlighter-rouge">ruff</code> in the <code class="language-plaintext highlighter-rouge">dev</code> dependencies in <code class="language-plaintext highlighter-rouge">pyproject.toml</code> so you don’t need to install it separately; just append the following configuration to <code class="language-plaintext highlighter-rouge">pyproject.toml</code></p>

<div class="language-toml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
</pre></td><td class="rouge-code"><pre><span class="k">[</span><span class="n">tool</span><span class="k">.</span><span class="n">ruff</span><span class="k">]</span>
<span class="n">line-length</span> <span class="o">=</span><span class="w"> </span><span class="mi">88</span>

<span class="k">[</span><span class="n">tool</span><span class="k">.</span><span class="n">ruff</span><span class="k">.</span><span class="n">lint</span><span class="k">]</span>
<span class="n">select</span> <span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="s">"E"</span><span class="p">,</span> <span class="s">"W"</span><span class="p">,</span> <span class="s">"F"</span><span class="p">]</span>

<span class="k">[</span><span class="n">tool</span><span class="k">.</span><span class="n">ruff</span><span class="k">.</span><span class="n">format</span><span class="k">]</span>
<span class="n">docstring-code-format</span> <span class="o">=</span><span class="w"> </span><span class="kc">true</span>
<span class="n">docstring-code-line-length</span> <span class="o">=</span><span class="w"> </span><span class="mi">72</span>
</pre></td></tr></tbody></table></code></pre></div></div>

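<p>With this configuration, running <code class="language-plaintext highlighter-rouge">ruff check</code> on the command line reports pycodestyle errors (E) and warnings (W) and Pyflakes issues such as undefined names or unused imports (F), and it complains about code lines longer than 88 characters (not the PEP8 value of 79, but a pretty standard line length). The <code class="language-plaintext highlighter-rouge">[tool.ruff.format]</code> options apply to <code class="language-plaintext highlighter-rouge">ruff format</code> instead: they make the formatter also reformat code examples inside docstrings, limited to 72 characters per line. This is a basic configuration but you can make it much more elaborate from here. Check the <a href="https://docs.astral.sh/ruff/">Ruff documentation</a> for more information.</p>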
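<p>As an illustration, here is a small module that runs fine but that a linter selecting the E, W and F rules would be expected to flag. The function <code class="language-plaintext highlighter-rouge">describe</code> is made up for this example and the rule codes in the comments are the ones I would expect Ruff to report; check the actual output on your machine.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import os  # F401: module imported but never used


def describe(value):
    unused = 42  # F841: local variable assigned but never used
    return "value is {}".format(value) + " (this string literal makes the line longer than 88 characters)"  # E501


if __name__ == "__main__":
    print(describe(3))
</code></pre></div></div>

<p>Removing the unused import and variable and splitting the long line should make the check pass.</p>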

<h2 id="recap">Recap</h2>

<p>Here we just showed the basics of setting up a pure Python project. In following posts we will learn how to automate continuous integration / continuous delivery (CI/CD), package building and Docker containers for development and testing.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Python" /><category term="computer science" /><summary type="html"><![CDATA[The easiest way to distribute your code is to package it, that is, to create a package that can be pip installed. In this post I am going to show how to do that.]]></summary></entry><entry><title type="html">Python virtual environments with virtualenv</title><link href="https://agramunt.me/posts/python-virtual-environments-with-virtualenv/" rel="alternate" type="text/html" title="Python virtual environments with virtualenv" /><published>2024-08-31T19:15:00-07:00</published><updated>2024-08-31T19:15:00-07:00</updated><id>https://agramunt.me/posts/python-virtual-environments-with-virtualenv</id><content type="html" xml:base="https://agramunt.me/posts/python-virtual-environments-with-virtualenv/"><![CDATA[<p><a href="https://virtualenv.pypa.io/en/latest/index.html">virtualenv</a> is a convenient tool to create virtual environments. Part of its functionality has been available in the standard library as venv since Python 3.3.</p>

<h2 id="install-virtualenv">Install virtualenv</h2>

<p><code class="language-plaintext highlighter-rouge">virtualenv</code> is a Python package, so you can install it via <code class="language-plaintext highlighter-rouge">pip install</code>. We could install <code class="language-plaintext highlighter-rouge">virtualenv</code> in a virtual environment and call it from there. However, if you decide to use <code class="language-plaintext highlighter-rouge">virtualenv</code> for your projects, the most common choice is to install it in the default interpreter: the global one in <code class="language-plaintext highlighter-rouge">pyenv</code> or the system default, <code class="language-plaintext highlighter-rouge">/usr/local/bin/python3</code> (on macOS).</p>

<p>Define a global python using pyenv, as an example we will use <code class="language-plaintext highlighter-rouge">3.11.2</code>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nv">GLOBAL_PYTHON</span><span class="o">=</span>3.11.2
pyenv global <span class="k">${</span><span class="nv">GLOBAL_PYTHON</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Install virtualenv into it and check the help (just to see that it works):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>python <span class="nt">-m</span> pip <span class="nb">install </span>virtualenv
python <span class="nt">-m</span> virtualenv <span class="nt">--help</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="create-virtual-environment-with-virtualenv">Create virtual environment with virtualenv</h2>

<p>Now create a virtual environment with a custom Python version, assuming that you use <code class="language-plaintext highlighter-rouge">pyenv</code> to manage the versions and that you installed <code class="language-plaintext highlighter-rouge">virtualenv</code> in the global pyenv Python. You need to point to the specific Python binary for your new environment, so first install that Python version if you don’t have it:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="nv">PYTHON_VERSION</span><span class="o">=</span>3.12.4
pyenv <span class="nb">install</span> <span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

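<p>Make sure you are using the global Python version in pyenv, since that is where we installed virtualenv. The commands below clear any shell-specific version and remove a local <code class="language-plaintext highlighter-rouge">.python-version</code> file, which would otherwise override the global one.</p>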

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>pyenv shell <span class="nt">--unset</span>
<span class="nb">rm</span> <span class="nt">-rf</span> .python-version
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now you are ready to create the virtual environment</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="nv">PYTHON_PATH</span><span class="o">=</span><span class="s2">"</span><span class="si">$(</span>pyenv root<span class="si">)</span><span class="s2">/versions/</span><span class="nv">$PYTHON_VERSION</span><span class="s2">/bin/python"</span>
virtualenv <span class="nt">-p</span> <span class="k">${</span><span class="nv">PYTHON_PATH</span><span class="k">}</span> .venv
</pre></td></tr></tbody></table></code></pre></div></div>
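<p>Before calling virtualenv it can help to confirm that the interpreter path actually exists. The following is only a sketch and not part of the recipe above: the helper name <code class="language-plaintext highlighter-rouge">pyenv_python_path</code> is made up for illustration, and it assumes the same <code class="language-plaintext highlighter-rouge">versions/&lt;version&gt;/bin/python</code> layout used in the command above.</p>

```shell
#!/bin/sh
# Hypothetical helper: build the interpreter path from a pyenv root and a version,
# following the same layout as the PYTHON_PATH variable above.
pyenv_python_path() {
    printf '%s/versions/%s/bin/python\n' "$1" "$2"
}

# Only query pyenv when it is actually installed on this machine.
if command -v pyenv >/dev/null 2>&1; then
    PYTHON_PATH="$(pyenv_python_path "$(pyenv root)" "${PYTHON_VERSION:-3.12.4}")"
    if [ -x "$PYTHON_PATH" ]; then
        echo "interpreter found: $PYTHON_PATH"
    else
        echo "interpreter missing, run pyenv install first"
    fi
fi
```

<p>If the interpreter is reported as missing, run the <code class="language-plaintext highlighter-rouge">pyenv install</code> step again before creating the environment.</p>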

<p>The environment can now be found in the <code class="language-plaintext highlighter-rouge">.venv</code> directory.</p>

<h2 id="activate-and-add-dependencies">Activate and add dependencies</h2>

<p>Activate it as usual with <code class="language-plaintext highlighter-rouge">source .venv/bin/activate</code> and install packages with <code class="language-plaintext highlighter-rouge">pip</code>, for instance <code class="language-plaintext highlighter-rouge">pip install numpy</code>. Then deactivate it with <code class="language-plaintext highlighter-rouge">deactivate</code> and remove the virtual environment by simply deleting the <code class="language-plaintext highlighter-rouge">.venv</code> directory.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="nb">source</span> .venv/bin/activate
pip <span class="nb">install </span>numpy

deactivate
<span class="nb">rm</span> <span class="nt">-rf</span> .venv
</pre></td></tr></tbody></table></code></pre></div></div>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Python" /><category term="computer science" /><summary type="html"><![CDATA[virtualenv is a convenient tool to install virtual environments. Part of it is already implemented in venv for python&gt;=3.3.]]></summary></entry><entry><title type="html">Docker images for python</title><link href="https://agramunt.me/posts/python-docker/" rel="alternate" type="text/html" title="Docker images for python" /><published>2024-08-31T19:15:00-07:00</published><updated>2025-09-27T17:26:57-07:00</updated><id>https://agramunt.me/posts/python-docker</id><content type="html" xml:base="https://agramunt.me/posts/python-docker/"><![CDATA[<p>Docker is a containerization platform that allows developers to package applications and their dependencies into containers. Containers provide an isolated environment that runs the application the same way across different systems. In this post I’m going to provide some Docker images for running Python.</p>

<p>The code for this post is in my repository <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main">blogging-code</a>, subdirectory <a href="https://github.com/SebastiaAgramunt/blogging-code/tree/main/python-dockerfiles">python-dockerfiles</a>.</p>

<h2 id="install-docker-and-colima-on-macos">Install Docker and Colima on MacOS</h2>

<p>First we need the <code class="language-plaintext highlighter-rouge">docker</code> command-line client</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>brew <span class="nb">install </span>docker
brew <span class="nb">link </span>docker
docker <span class="nt">--version</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And then install <a href="https://github.com/abiosoft/colima">Colima</a> which is an open source alternative to Docker Desktop.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>brew <span class="nb">install </span>colima
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Start colima as the engine for docker</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>colima delete
colima start
</pre></td></tr></tbody></table></code></pre></div></div>
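<p>A quick way to confirm that the docker client can reach the engine is <code class="language-plaintext highlighter-rouge">docker info</code>. The snippet below is a small sketch of that check; the function name <code class="language-plaintext highlighter-rouge">docker_ready</code> is my own and not part of docker or colima.</p>

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the docker CLI is installed
# and can talk to a running engine (for example the one started by colima).
docker_ready() {
    command -v docker >/dev/null 2>&1 || return 1
    docker info >/dev/null 2>&1
}

if docker_ready; then
    echo "docker engine is reachable"
else
    echo "docker engine is not reachable, try: colima start"
fi
```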

<p>When you no longer need your containers running, make sure to run <code class="language-plaintext highlighter-rouge">colima stop</code>.</p>

<h2 id="python-on-debian-based-distributions">Python on Debian-based distributions</h2>

<p>The <a href="https://hub.docker.com/_/python">Python DockerHub page</a> offers several options to create a customized docker image. Let’s say we want to run an application in Python 3.12. Normally you would pick between the images <code class="language-plaintext highlighter-rouge">python:3.12</code>, <code class="language-plaintext highlighter-rouge">python:3.12-slim</code> or <code class="language-plaintext highlighter-rouge">python:3.12-alpine</code>. The first image is the largest, about 915MB, and contains more libraries and tools than needed for running basic Python (although they are not listed in the docs). The second image (slim) is a stripped-down version of the first one and weighs about 45MB; the last one (alpine) is an even more minimal version of about 24MB. We won’t use the alpine image: it is so basic that it does not ship the <code class="language-plaintext highlighter-rouge">useradd</code> and <code class="language-plaintext highlighter-rouge">groupadd</code> commands we use below, and installing numpy and other packages is more troublesome, since they are typically built against the <code class="language-plaintext highlighter-rouge">glibc</code> library while Alpine uses <code class="language-plaintext highlighter-rouge">musl</code> libc.</p>
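<p>If you are unsure which C library an image uses, the output of <code class="language-plaintext highlighter-rouge">ldd --version</code> usually gives it away. Here is a rough sketch that classifies that output. The helper name <code class="language-plaintext highlighter-rouge">classify_libc</code> is made up, and the matching assumes that musl’s <code class="language-plaintext highlighter-rouge">ldd</code> mentions “musl” while glibc’s mentions “GLIBC” or “GNU libc”.</p>

```shell
#!/bin/sh
# Hypothetical helper: classify the text printed by `ldd --version`.
classify_libc() {
    case "$1" in
        *musl*) echo "musl" ;;
        *GLIBC*|*"GNU libc"*) echo "glibc" ;;
        *) echo "unknown" ;;
    esac
}

# Capture both stdout and stderr, and tolerate systems without ldd.
ldd_output="$(ldd --version 2>&1 || true)"
echo "this system uses: $(classify_libc "$ldd_output")"
```

<p>Run inside a container, I would expect this to report musl for the alpine image and glibc for the Debian-based ones.</p>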

<h3 id="python-slim">Python-slim</h3>
<p>To create new custom images we write Dockerfiles. Those are plaintext files that tell docker how to build the image. For the first example create a file named <code class="language-plaintext highlighter-rouge">Dockerfile-python-3.12-slim</code> with the following content</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>FROM python:3.12-slim

ARG <span class="nv">USERNAME</span><span class="o">=</span>user
ARG <span class="nv">UID</span><span class="o">=</span>1000
ARG <span class="nv">GID</span><span class="o">=</span>1000

RUN <span class="k">if </span>getent group <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> <span class="o">&gt;</span>/dev/null<span class="p">;</span> <span class="k">then</span> <span class="se">\</span>
        <span class="nb">echo</span> <span class="s2">"Group with GID </span><span class="k">${</span><span class="nv">GID</span><span class="k">}</span><span class="s2"> already exists, using it."</span><span class="p">;</span> <span class="se">\</span>
        <span class="nv">GROUP_NAME</span><span class="o">=</span><span class="si">$(</span>getent group <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> | <span class="nb">cut</span> <span class="nt">-d</span>: <span class="nt">-f1</span><span class="si">)</span><span class="p">;</span> <span class="se">\</span>
    <span class="k">else</span> <span class="se">\</span>
        <span class="nv">GROUP_NAME</span><span class="o">=</span><span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span><span class="p">;</span> <span class="se">\</span>
        groupadd <span class="nt">--gid</span> <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> <span class="k">${</span><span class="nv">GROUP_NAME</span><span class="k">}</span><span class="p">;</span> <span class="se">\</span>
    <span class="k">fi</span> <span class="o">&amp;&amp;</span> <span class="se">\</span>
    useradd <span class="nt">--uid</span> <span class="k">${</span><span class="nv">UID</span><span class="k">}</span> <span class="nt">--gid</span> <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> <span class="nt">--create-home</span> <span class="nt">--shell</span> /bin/bash <span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>

WORKDIR /home/<span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>
RUN <span class="nb">chown</span> <span class="nt">-R</span> <span class="k">${</span><span class="nv">UID</span><span class="k">}</span>:<span class="k">${</span><span class="nv">GID</span><span class="k">}</span> /home/<span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>

USER <span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>
CMD <span class="o">[</span><span class="s2">"/bin/bash"</span><span class="o">]</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This starts from the image <code class="language-plaintext highlighter-rouge">python:3.12-slim</code>, which docker will pull from DockerHub. The instructions create a new group if it doesn’t exist and add a user. Why do we do this? Docker containers normally run as root, so if you create a new image and a container from it, you will be root. In that case, if you mount a sensitive directory from your system (imagine you mount <code class="language-plaintext highlighter-rouge">/usr/lib</code>) and accidentally delete it in your container, you would be in trouble. A workaround I found to prevent this is to create a user in the container that matches the one you have on the host, with the same user and group ids, so the container no longer runs as root by default.</p>

<p>The last part of the dockerfile sets the working directory (the user’s home) and changes its ownership. Finally, the <code class="language-plaintext highlighter-rouge">USER</code> instruction sets the user that will be used when you run a container.</p>
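<p>The group check matters in practice: on macOS <code class="language-plaintext highlighter-rouge">id -g</code> typically returns 20 (the <code class="language-plaintext highlighter-rouge">staff</code> group), and Debian-based images typically already have a group with that id (<code class="language-plaintext highlighter-rouge">dialout</code>), so a plain <code class="language-plaintext highlighter-rouge">groupadd</code> would fail. The sketch below mirrors the lookup of the <code class="language-plaintext highlighter-rouge">RUN</code> step as a standalone function. The name <code class="language-plaintext highlighter-rouge">group_name_for_gid</code> is hypothetical, and it is meant to be run on Linux (for example inside the container), where <code class="language-plaintext highlighter-rouge">getent</code> is available.</p>

```shell
#!/bin/sh
# Hypothetical helper mirroring the lookup in the RUN step:
# print the name of an existing group with the given GID, or the fallback name.
group_name_for_gid() {
    gid="$1"
    fallback="$2"
    existing="$(getent group "$gid" 2>/dev/null | cut -d: -f1)"
    if [ -n "$existing" ]; then
        echo "$existing"
    else
        echo "$fallback"
    fi
}

# Preview which group name would be used for the current user on this system.
group_name_for_gid "$(id -g)" "$(whoami)"
```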

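<p>Now we can build the image (the command assumes the Dockerfile is saved in a <code class="language-plaintext highlighter-rouge">Docker</code> subdirectory) with</p>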

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>docker build <span class="nt">-f</span> Docker/Dockerfile-python-3.12-slim <span class="se">\</span>
            <span class="nt">--build-arg</span> <span class="nv">USERNAME</span><span class="o">=</span><span class="si">$(</span><span class="nb">whoami</span><span class="si">)</span> <span class="se">\</span>
            <span class="nt">--build-arg</span> <span class="nv">UID</span><span class="o">=</span><span class="si">$(</span><span class="nb">id</span> <span class="nt">-u</span><span class="si">)</span> <span class="se">\</span>
            <span class="nt">--build-arg</span> <span class="nv">GID</span><span class="o">=</span><span class="si">$(</span><span class="nb">id</span> <span class="nt">-g</span><span class="si">)</span> <span class="se">\</span>
            <span class="nt">-t</span> python-3.12-image <span class="nb">.</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And then run it and open an interactive shell with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>docker run <span class="nt">-it</span> python-3.12-image /bin/bash
</pre></td></tr></tbody></table></code></pre></div></div>

<p>That’s it, now you are in the container. Type <code class="language-plaintext highlighter-rouge">python --version</code> to check that your version is <code class="language-plaintext highlighter-rouge">3.12</code>. In another terminal, check the images and containers you have in the system</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="c"># show all containers (running and stopped)</span>
docker ps <span class="nt">-a</span> 

<span class="c"># show images</span>
docker images
</pre></td></tr></tbody></table></code></pre></div></div>

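<p>Exit the container, then remove it (replace <code class="language-plaintext highlighter-rouge">container</code> with the name or ID shown by <code class="language-plaintext highlighter-rouge">docker ps -a</code>) and remove the image with</p>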

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>docker <span class="nb">rm </span>container
docker rmi python-3.12-image
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then prune the stopped containers and dangling images</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>docker container prune
docker image prune
</pre></td></tr></tbody></table></code></pre></div></div>

<h3 id="python-compiled">Python compiled</h3>

<p>In this case we will download Python and compile it from source in docker. Create a file named <code class="language-plaintext highlighter-rouge">Dockerfile-python-3.12-build</code> and pick your version from the <a href="https://www.python.org/ftp/python/">FTP Python page</a>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>FROM ubuntu:latest

ARG <span class="nv">USERNAME</span><span class="o">=</span>user
ARG <span class="nv">UID</span><span class="o">=</span>1000
ARG <span class="nv">GID</span><span class="o">=</span>1000

ENV <span class="nv">PYTHON_VERSION</span><span class="o">=</span>3.12.9

<span class="c"># install required packages to compile python</span>
RUN <span class="nb">set</span> <span class="nt">-x</span> <span class="se">\</span>
    <span class="o">&amp;&amp;</span> <span class="nb">echo</span> <span class="s2">"Updating..."</span> <span class="se">\</span>
    <span class="o">&amp;&amp;</span> apt-get upgrade <span class="se">\</span>
    <span class="o">&amp;&amp;</span> apt-get update <span class="se">\</span>
    <span class="o">&amp;&amp;</span> <span class="nb">echo</span> <span class="s2">"Installing Packages..."</span> <span class="se">\</span>
    <span class="o">&amp;&amp;</span> apt-get <span class="nb">install</span> <span class="nt">-y</span> <span class="se">\</span>
    build-essential <span class="se">\</span>
    zlib1g-dev <span class="se">\</span>
    libncurses5-dev <span class="se">\</span>
    libgdbm-dev <span class="se">\</span>
    libnss3-dev <span class="se">\</span>
    libssl-dev <span class="se">\</span>
    libsqlite3-dev <span class="se">\</span>
    libreadline-dev <span class="se">\</span>
    libffi-dev curl <span class="se">\</span>
    libbz2-dev <span class="se">\</span>
    liblzma-dev <span class="se">\</span>
    wget

<span class="c"># download and compile python</span>
RUN <span class="nb">cd</span> /usr/src <span class="o">&amp;&amp;</span> <span class="se">\</span>
    <span class="nv">PYTHON_VERSION_SHORT</span><span class="o">=</span><span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="p">%.*</span><span class="k">}</span> <span class="o">&amp;&amp;</span> <span class="se">\</span>
    wget https://www.python.org/ftp/python/<span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>/Python-<span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>.tgz <span class="o">&amp;&amp;</span> <span class="se">\</span>
    <span class="nb">tar </span>xzf Python-<span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>.tgz <span class="o">&amp;&amp;</span> <span class="se">\</span>
    <span class="nb">cd </span>Python-<span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span> <span class="o">&amp;&amp;</span> <span class="se">\</span>
    ./configure <span class="nt">--enable-optimizations</span> <span class="o">&amp;&amp;</span> <span class="se">\</span>
    make <span class="nt">-j</span> 16 <span class="o">&amp;&amp;</span> <span class="se">\</span>
    make altinstall <span class="o">&amp;&amp;</span> <span class="se">\</span>
    <span class="nb">ln</span> <span class="nt">-s</span> /usr/local/bin/python<span class="k">${</span><span class="nv">PYTHON_VERSION_SHORT</span><span class="k">}</span> /usr/bin/python <span class="o">&amp;&amp;</span> <span class="se">\</span>
    <span class="nb">cd</span> / <span class="o">&amp;&amp;</span> <span class="se">\</span>
    <span class="nb">rm</span> <span class="nt">-rf</span> /usr/src/Python-<span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>.tgz /usr/src/Python-<span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>

RUN <span class="k">if </span>getent group <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> <span class="o">&gt;</span>/dev/null<span class="p">;</span> <span class="k">then</span> <span class="se">\</span>
        <span class="nb">echo</span> <span class="s2">"Group with GID </span><span class="k">${</span><span class="nv">GID</span><span class="k">}</span><span class="s2"> already exists, using it."</span><span class="p">;</span> <span class="se">\</span>
        <span class="nv">GROUP_NAME</span><span class="o">=</span><span class="si">$(</span>getent group <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> | <span class="nb">cut</span> <span class="nt">-d</span>: <span class="nt">-f1</span><span class="si">)</span><span class="p">;</span> <span class="se">\</span>
    <span class="k">else</span> <span class="se">\</span>
        <span class="nv">GROUP_NAME</span><span class="o">=</span><span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span><span class="p">;</span> <span class="se">\</span>
        groupadd <span class="nt">--gid</span> <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> <span class="k">${</span><span class="nv">GROUP_NAME</span><span class="k">}</span><span class="p">;</span> <span class="se">\</span>
    <span class="k">fi</span> <span class="o">&amp;&amp;</span> <span class="se">\</span>
    useradd <span class="nt">--uid</span> <span class="k">${</span><span class="nv">UID</span><span class="k">}</span> <span class="nt">--gid</span> <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> <span class="nt">--create-home</span> <span class="nt">--shell</span> /bin/bash <span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>

WORKDIR /home/<span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>
RUN <span class="nb">chown</span> <span class="nt">-R</span> <span class="k">${</span><span class="nv">UID</span><span class="k">}</span>:<span class="k">${</span><span class="nv">GID</span><span class="k">}</span> /home/<span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>

USER <span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>
CMD <span class="o">[</span><span class="s2">"/bin/bash"</span><span class="o">]</span>
</pre></td></tr></tbody></table></code></pre></div></div>
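<p>The least obvious line in this Dockerfile is probably <code class="language-plaintext highlighter-rouge">${PYTHON_VERSION%.*}</code>. This shell parameter expansion removes the shortest suffix matching <code class="language-plaintext highlighter-rouge">.*</code>, which gives the major.minor version used in the name of the binary created by <code class="language-plaintext highlighter-rouge">make altinstall</code>. A minimal sketch you can run in any shell, reusing the variable names from the Dockerfile (<code class="language-plaintext highlighter-rouge">PYTHON_URL</code> is an extra name I introduce here for illustration):</p>

```shell
#!/bin/sh
PYTHON_VERSION=3.12.9

# Remove the shortest trailing ".something": 3.12.9 becomes 3.12
PYTHON_VERSION_SHORT=${PYTHON_VERSION%.*}

# Same URL pattern as the wget line in the Dockerfile
PYTHON_URL="https://www.python.org/ftp/python/${PYTHON_VERSION}/Python-${PYTHON_VERSION}.tgz"

echo "short version: ${PYTHON_VERSION_SHORT}"
echo "binary name:   python${PYTHON_VERSION_SHORT}"
echo "source URL:    ${PYTHON_URL}"
```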

<h3 id="mamba--conda">Mamba &amp; Conda</h3>

<p>A third docker image that can be useful is one containing mamba and conda, perhaps as a container for data science. I do not recommend this container for running apps or microservices. Let’s name the dockerfile <code class="language-plaintext highlighter-rouge">Dockerfile-mamba</code>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>FROM ubuntu:latest

ARG <span class="nv">USERNAME</span><span class="o">=</span>user
ARG <span class="nv">UID</span><span class="o">=</span>1000
ARG <span class="nv">GID</span><span class="o">=</span>1000

RUN <span class="nb">set</span> <span class="nt">-x</span> <span class="se">\</span>
    <span class="o">&amp;&amp;</span> <span class="nb">echo</span> <span class="s2">"Updating..."</span> <span class="se">\</span>
    <span class="o">&amp;&amp;</span> apt-get upgrade <span class="se">\</span>
    <span class="o">&amp;&amp;</span> apt-get update <span class="se">\</span>
    <span class="o">&amp;&amp;</span> <span class="nb">echo</span> <span class="s2">"Installing Packages..."</span> <span class="se">\</span>
    <span class="o">&amp;&amp;</span> apt-get <span class="nb">install</span> <span class="nt">-y</span> <span class="se">\</span>
    wget 

<span class="c"># Define Miniforge install path</span>
ENV <span class="nv">MINIFORGE_PATH</span><span class="o">=</span>/opt/miniforge

<span class="c"># Download and install Miniforge</span>
RUN wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh <span class="nt">-O</span> /tmp/Miniforge.sh <span class="se">\</span>
    <span class="o">&amp;&amp;</span> bash /tmp/Miniforge.sh <span class="nt">-b</span> <span class="nt">-p</span> <span class="nv">$MINIFORGE_PATH</span> <span class="se">\</span>
    <span class="o">&amp;&amp;</span> <span class="nb">rm</span> /tmp/Miniforge.sh

<span class="c"># Set environment variables for Conda and Mamba</span>
ENV <span class="nv">PATH</span><span class="o">=</span><span class="s2">"</span><span class="nv">$MINIFORGE_PATH</span><span class="s2">/bin:</span><span class="nv">$PATH</span><span class="s2">"</span>

RUN <span class="k">if </span>getent group <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> <span class="o">&gt;</span>/dev/null<span class="p">;</span> <span class="k">then</span> <span class="se">\</span>
        <span class="nb">echo</span> <span class="s2">"Group with GID </span><span class="k">${</span><span class="nv">GID</span><span class="k">}</span><span class="s2"> already exists, using it."</span><span class="p">;</span> <span class="se">\</span>
        <span class="nv">GROUP_NAME</span><span class="o">=</span><span class="si">$(</span>getent group <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> | <span class="nb">cut</span> <span class="nt">-d</span>: <span class="nt">-f1</span><span class="si">)</span><span class="p">;</span> <span class="se">\</span>
    <span class="k">else</span> <span class="se">\</span>
        <span class="nv">GROUP_NAME</span><span class="o">=</span><span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span><span class="p">;</span> <span class="se">\</span>
        groupadd <span class="nt">--gid</span> <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> <span class="k">${</span><span class="nv">GROUP_NAME</span><span class="k">}</span><span class="p">;</span> <span class="se">\</span>
    <span class="k">fi</span> <span class="o">&amp;&amp;</span> <span class="se">\</span>
    useradd <span class="nt">--uid</span> <span class="k">${</span><span class="nv">UID</span><span class="k">}</span> <span class="nt">--gid</span> <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> <span class="nt">--create-home</span> <span class="nt">--shell</span> /bin/bash <span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>

WORKDIR /home/<span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>
RUN <span class="nb">chown</span> <span class="nt">-R</span> <span class="k">${</span><span class="nv">UID</span><span class="k">}</span>:<span class="k">${</span><span class="nv">GID</span><span class="k">}</span> /home/<span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>

USER <span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>
CMD <span class="o">[</span><span class="s2">"/bin/bash"</span><span class="o">]</span>
</pre></td></tr></tbody></table></code></pre></div></div>
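<p>Note that the Dockerfile downloads the <code class="language-plaintext highlighter-rouge">x86_64</code> installer. On an ARM host (for instance an Apple Silicon Mac, where the Colima VM is usually <code class="language-plaintext highlighter-rouge">aarch64</code>) the matching installer would be a different file. The sketch below builds the URL for a given architecture. The helper <code class="language-plaintext highlighter-rouge">miniforge_url</code> is hypothetical, and it assumes the release assets follow the <code class="language-plaintext highlighter-rouge">Miniforge3-Linux-&lt;arch&gt;.sh</code> naming seen in the Dockerfile, so check the releases page before relying on it.</p>

```shell
#!/bin/sh
# Hypothetical helper: build the installer URL for a given Linux architecture.
# Assumes the asset naming Miniforge3-Linux-<arch>.sh used in the Dockerfile above.
miniforge_url() {
    arch="${1:-$(uname -m)}"
    echo "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-${arch}.sh"
}

miniforge_url x86_64
miniforge_url aarch64
```

<p>Inside a <code class="language-plaintext highlighter-rouge">RUN</code> instruction the same idea could be applied by using <code class="language-plaintext highlighter-rouge">$(uname -m)</code> in place of the fixed architecture.</p>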

<h3 id="docker-image-with-uv">Docker image with UV</h3>

<p>Following the previous pattern, this one should come as no surprise; save it as <code class="language-plaintext highlighter-rouge">Dockerfile-uv</code>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre>FROM ubuntu:latest

ARG <span class="nv">USERNAME</span><span class="o">=</span>user
ARG <span class="nv">UID</span><span class="o">=</span>1000
ARG <span class="nv">GID</span><span class="o">=</span>1000

RUN <span class="nb">set</span> <span class="nt">-x</span> <span class="se">\</span>
    <span class="o">&amp;&amp;</span> <span class="nb">echo</span> <span class="s2">"Updating..."</span> <span class="se">\</span>
    <span class="o">&amp;&amp;</span> apt-get upgrade <span class="se">\</span>
    <span class="o">&amp;&amp;</span> apt-get update <span class="se">\</span>
    <span class="o">&amp;&amp;</span> <span class="nb">echo</span> <span class="s2">"Installing Packages..."</span> <span class="se">\</span>
    <span class="o">&amp;&amp;</span> apt-get <span class="nb">install</span> <span class="nt">-y</span> <span class="se">\</span>
    wget <span class="se">\</span>
    curl

RUN <span class="k">if </span>getent group <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> <span class="o">&gt;</span>/dev/null<span class="p">;</span> <span class="k">then</span> <span class="se">\</span>
        <span class="nb">echo</span> <span class="s2">"Group with GID </span><span class="k">${</span><span class="nv">GID</span><span class="k">}</span><span class="s2"> already exists, using it."</span><span class="p">;</span> <span class="se">\</span>
        <span class="nv">GROUP_NAME</span><span class="o">=</span><span class="si">$(</span>getent group <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> | <span class="nb">cut</span> <span class="nt">-d</span>: <span class="nt">-f1</span><span class="si">)</span><span class="p">;</span> <span class="se">\</span>
    <span class="k">else</span> <span class="se">\</span>
        <span class="nv">GROUP_NAME</span><span class="o">=</span><span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span><span class="p">;</span> <span class="se">\</span>
        groupadd <span class="nt">--gid</span> <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> <span class="k">${</span><span class="nv">GROUP_NAME</span><span class="k">}</span><span class="p">;</span> <span class="se">\</span>
    <span class="k">fi</span> <span class="o">&amp;&amp;</span> <span class="se">\</span>
    useradd <span class="nt">--uid</span> <span class="k">${</span><span class="nv">UID</span><span class="k">}</span> <span class="nt">--gid</span> <span class="k">${</span><span class="nv">GID</span><span class="k">}</span> <span class="nt">--create-home</span> <span class="nt">--shell</span> /bin/bash <span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>

WORKDIR /home/<span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>
RUN <span class="nb">chown</span> <span class="nt">-R</span> <span class="k">${</span><span class="nv">UID</span><span class="k">}</span>:<span class="k">${</span><span class="nv">GID</span><span class="k">}</span> /home/<span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>

USER <span class="k">${</span><span class="nv">USERNAME</span><span class="k">}</span>

<span class="c"># install UV on user</span>
RUN curl <span class="nt">-LsSf</span> https://astral.sh/uv/install.sh | sh <span class="nt">-s</span> <span class="nt">--</span> <span class="nt">--verbose</span> 

CMD <span class="o">[</span><span class="s2">"/bin/bash"</span><span class="o">]</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Here, instead of installing <code class="language-plaintext highlighter-rouge">uv</code> as root, we install it for the user (which is why the install step goes after the <code class="language-plaintext highlighter-rouge">USER</code> instruction).</p>
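<p>Since <code class="language-plaintext highlighter-rouge">uv</code> is installed for the user, its binary ends up under the user’s home rather than in a system directory, so it is worth checking that this directory is on <code class="language-plaintext highlighter-rouge">PATH</code>. A small sketch of that check follows. The helper <code class="language-plaintext highlighter-rouge">path_contains</code> is my own, and the location <code class="language-plaintext highlighter-rouge">$HOME/.local/bin</code> is an assumption about where the installer puts the binary, so adjust it to whatever the installer prints.</p>

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 is one of the entries of PATH.
path_contains() {
    case ":${PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: the installer placed uv under $HOME/.local/bin.
UV_BIN_DIR="${HOME:-}/.local/bin"
if path_contains "$UV_BIN_DIR"; then
    echo "$UV_BIN_DIR is on PATH"
else
    echo "$UV_BIN_DIR is not on PATH, add it with: export PATH=\"$UV_BIN_DIR:\$PATH\""
fi
```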

<h2 id="building-and-running-the-images">Building and running the images</h2>

<p>As usual I have a quick recipe to build these images: a bash file (I will name it <code class="language-plaintext highlighter-rouge">build-run.sh</code>) that can be used for the different builds</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-code"><pre><span class="c">#!/bin/bash</span>

<span class="nv">THIS_DIR</span><span class="o">=</span><span class="si">$(</span><span class="nb">dirname</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">realpath</span> <span class="s2">"</span><span class="nv">$0</span><span class="s2">"</span><span class="si">)</span><span class="s2">"</span><span class="si">)</span>

<span class="c"># # Dockerfile name</span>
<span class="c"># DOCKERFILE=python-3.12</span>
<span class="c"># DOCKERFILE=python-3.12-slim</span>
<span class="c"># DOCKERFILE=python-3.12-build</span>
<span class="c"># DOCKERFILE=mamba</span>
<span class="nv">DOCKERFILE</span><span class="o">=</span>uv


build_image<span class="o">(){</span>
    docker build <span class="nt">-f</span> Docker/Dockerfile-<span class="k">${</span><span class="nv">DOCKERFILE</span><span class="k">}</span> <span class="se">\</span>
                <span class="nt">--build-arg</span> <span class="nv">USERNAME</span><span class="o">=</span><span class="si">$(</span><span class="nb">whoami</span><span class="si">)</span> <span class="se">\</span>
                <span class="nt">--build-arg</span> <span class="nv">UID</span><span class="o">=</span><span class="si">$(</span><span class="nb">id</span> <span class="nt">-u</span><span class="si">)</span> <span class="se">\</span>
                <span class="nt">--build-arg</span> <span class="nv">GID</span><span class="o">=</span><span class="si">$(</span><span class="nb">id</span> <span class="nt">-g</span><span class="si">)</span> <span class="se">\</span>
                 <span class="nt">-t</span> <span class="k">${</span><span class="nv">DOCKERFILE</span><span class="k">}</span><span class="nt">-image</span> <span class="nb">.</span>
<span class="o">}</span>

run_image<span class="o">(){</span>
    docker run <span class="se">\</span>
    <span class="nt">-v</span> <span class="k">${</span><span class="nv">THIS_DIR</span><span class="k">}</span>:/home/<span class="si">$(</span><span class="nb">whoami</span><span class="si">)</span> <span class="se">\</span>
    <span class="nt">-it</span> <span class="se">\</span>
    <span class="k">${</span><span class="nv">DOCKERFILE</span><span class="k">}</span><span class="nt">-image</span> <span class="se">\</span>
     /bin/bash
<span class="o">}</span>

croak<span class="o">(){</span>
    <span class="nb">echo</span> <span class="s2">"[ERROR] </span><span class="nv">$*</span><span class="s2">"</span> <span class="o">&gt;</span> /dev/stderr
    <span class="nb">exit </span>1
<span class="o">}</span>

main<span class="o">(){</span>
    <span class="k">if</span> <span class="o">[[</span> <span class="nt">-z</span> <span class="s2">"</span><span class="nv">$TASK</span><span class="s2">"</span> <span class="o">]]</span><span class="p">;</span> <span class="k">then
        </span>croak <span class="s2">"No TASK specified."</span>
    <span class="k">fi
    </span><span class="nb">echo</span> <span class="s2">"[INFO] running </span><span class="nv">$TASK</span><span class="s2"> </span><span class="nv">$*</span><span class="s2">"</span>
    <span class="nv">$TASK</span> <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span>
<span class="o">}</span>

main <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>Simply uncomment the Dockerfile you want to build (here we uncomment <code class="language-plaintext highlighter-rouge">uv</code>) and run the following commands to build the image and then open a shell inside the container:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="nb">export </span><span class="nv">TASK</span><span class="o">=</span>build_image
./build-run.sh

<span class="nb">export </span><span class="nv">TASK</span><span class="o">=</span>run_image
./build-run.sh
</pre></td></tr></tbody></table></code></pre></div></div>
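
<p>The script picks the function to run from the <code class="language-plaintext highlighter-rouge">TASK</code> environment variable. The following is a minimal sketch of that dispatch pattern with one extra safety check, namely that <code class="language-plaintext highlighter-rouge">TASK</code> names a function that is actually defined. The names <code class="language-plaintext highlighter-rouge">hello</code> and <code class="language-plaintext highlighter-rouge">run_task</code> and the <code class="language-plaintext highlighter-rouge">declare -F</code> check are illustrative assumptions, not part of the repository script.</p>

```bash
#!/bin/bash
# Sketch: dispatch to a function named by $TASK, rejecting unknown names.
# `hello` and `run_task` are hypothetical names used only for illustration.

hello() {
    echo "hello $*"
}

run_task() {
    # declare -F succeeds only when $TASK is the name of a defined function
    if [[ -z "${TASK:-}" ]] || ! declare -F "$TASK" > /dev/null; then
        echo "[ERROR] unknown or missing TASK: '${TASK:-}'" >&2
        return 1
    fi
    "$TASK" "$@"
}

TASK=hello run_task world
```

<p>With this check a typo in <code class="language-plaintext highlighter-rouge">TASK</code> produces a clear error instead of trying to execute an arbitrary command of that name.</p>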

<p>Place the Dockerfiles under the <code class="language-plaintext highlighter-rouge">Docker</code> directory, as shown in the repository.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Python" /><category term="computer science" /><summary type="html"><![CDATA[Docker is a containerization platform that allows developers to package applications and their dependencies into containers. Containers provide an isolated environment that runs the application the same way across different systems. In this post I’m going to provide some docker images to run on Python.]]></summary></entry><entry><title type="html">Python Virtual Environments with Venv</title><link href="https://agramunt.me/posts/python-virtual-environments-with-venv/" rel="alternate" type="text/html" title="Python Virtual Environments with Venv" /><published>2024-08-04T06:05:00-07:00</published><updated>2024-08-04T06:05:00-07:00</updated><id>https://agramunt.me/posts/python-virtual-environments-with-venv</id><content type="html" xml:base="https://agramunt.me/posts/python-virtual-environments-with-venv/"><![CDATA[<p>So far we have seen how to get different Python versions on your system and, briefly, how to install packages using <code class="language-plaintext highlighter-rouge">pip install</code>. In real projects you want to control both the Python version and the package versions, and to keep a fixed combination of the two per project.</p>

<p>Virtual environments allow you to create a completely separate environment that combines a Python version with a set of packages at specific versions. In this post we will show how to create a virtual environment using <code class="language-plaintext highlighter-rouge">venv</code>, the default virtual environment manager in Python. As this is the first post on virtual environments, we will also dive a bit deeper into the directories that are created and into good practices for maintaining virtual environments.</p>

<h2 id="tldr">TLDR</h2>

<p>To create a virtual environment with <code class="language-plaintext highlighter-rouge">venv</code> from the command line:</p>

<p>Save the following <code class="language-plaintext highlighter-rouge">create_environment.sh</code> in the directory where you want to create the virtual environment, changing the Python version as needed:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
</pre></td><td class="rouge-code"><pre><span class="c">#!/bin/bash</span>

<span class="nv">SCRIPT_DIR</span><span class="o">=</span><span class="s2">"</span><span class="si">$(</span><span class="nb">cd</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">dirname</span> <span class="s2">"</span><span class="k">${</span><span class="nv">BASH_SOURCE</span><span class="p">[0]</span><span class="k">}</span><span class="s2">"</span><span class="si">)</span><span class="s2">"</span> <span class="o">&amp;&amp;</span> <span class="nb">pwd</span><span class="si">)</span><span class="s2">"</span>
<span class="nv">PYTHON_VERSION</span><span class="o">=</span>3.12.2

<span class="c"># set python version from your pyenv</span>
<span class="nv">PYTHON_PATH</span><span class="o">=</span><span class="s2">"</span><span class="si">$(</span>pyenv root<span class="si">)</span><span class="s2">/versions/</span><span class="nv">$PYTHON_VERSION</span><span class="s2">/bin/python"</span>

<span class="c"># remove env and recreate</span>
<span class="nb">rm</span> <span class="nt">-rf</span> <span class="k">${</span><span class="nv">SCRIPT_DIR</span><span class="k">}</span>/.venv

<span class="k">${</span><span class="nv">PYTHON_PATH</span><span class="k">}</span> <span class="nt">-m</span> venv <span class="k">${</span><span class="nv">SCRIPT_DIR</span><span class="k">}</span>/.venv
<span class="k">${</span><span class="nv">SCRIPT_DIR</span><span class="k">}</span>/.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">--upgrade</span> pip
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then install your dependencies manually, for example:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>numpy
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>pandas
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>matplotlib
</pre></td></tr></tbody></table></code></pre></div></div>
<p>To pin all dependencies, freeze the environment into a file:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-m</span> pip freeze <span class="o">&gt;</span> requirements.txt
</pre></td></tr></tbody></table></code></pre></div></div>

<p>If you want to recreate this environment you can use <code class="language-plaintext highlighter-rouge">requirements.txt</code> to reinstall exactly the same packages. Just create a new environment in e.g. <code class="language-plaintext highlighter-rouge">.venv</code> and run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">-r</span> requirements.txt
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Sometimes it is convenient to share a single environment across different projects, for instance across several data science analyses. In this case I create a virtual environment in <code class="language-plaintext highlighter-rouge">~/.venvs</code></p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
</pre></td><td class="rouge-code"><pre><span class="c">#!/bin/bash</span>

<span class="nv">PYTHON_VERSION</span><span class="o">=</span>3.12.2
<span class="nv">ENVIRONMENT_DIR</span><span class="o">=</span>~/.venvs/data_science

<span class="c"># set python version from your pyenv</span>
<span class="nv">PYTHON_PATH</span><span class="o">=</span><span class="s2">"</span><span class="si">$(</span>pyenv root<span class="si">)</span><span class="s2">/versions/</span><span class="nv">$PYTHON_VERSION</span><span class="s2">/bin/python"</span>

<span class="k">${</span><span class="nv">PYTHON_PATH</span><span class="k">}</span> <span class="nt">-m</span> venv <span class="k">${</span><span class="nv">ENVIRONMENT_DIR</span><span class="k">}</span>
<span class="k">${</span><span class="nv">ENVIRONMENT_DIR</span><span class="k">}</span>/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">--upgrade</span> pip
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then activate it and start installing your packages</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="nb">source</span> ~/.venvs/data_science/bin/activate

pip <span class="nb">install </span>numpy
pip <span class="nb">install </span>pandas
pip <span class="nb">install </span>matplotlib
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="install-venv">Install venv</h2>

<p>It comes by default with <code class="language-plaintext highlighter-rouge">python</code>. Call it via <code class="language-plaintext highlighter-rouge">python -m venv --help</code>.</p>

<h2 id="create-a-virtual-environemnt">Create a virtual environment</h2>

<p>The <a href="https://docs.python.org/3/library/venv.html">venv</a> module can be invoked to create a virtual environment. The syntax is very simple: <code class="language-plaintext highlighter-rouge">python -m venv path/to/your/new/venv</code>. First select or install the Python version with which you want to create the virtual environment. I normally use <code class="language-plaintext highlighter-rouge">pyenv</code> (see the <a href="../pyenv">pyenv post</a> for further reference) to do this. Let’s install <code class="language-plaintext highlighter-rouge">3.12.4</code> as an example:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="nv">PYTHON_VERSION</span><span class="o">=</span>3.12.4
pyenv <span class="nb">install</span> <span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>
pyenv shell <span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now with <code class="language-plaintext highlighter-rouge">python --version</code> you should get <code class="language-plaintext highlighter-rouge">3.12.4</code>. Next you can create the virtual environment in the current directory as</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>python <span class="nt">-m</span> venv .venv
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">--upgrade</span> pip
</pre></td></tr></tbody></table></code></pre></div></div>

<p>You will now see this directory in the current one; just type <code class="language-plaintext highlighter-rouge">ls -lhat .</code>. Inside <code class="language-plaintext highlighter-rouge">.venv</code> there are three directories, <code class="language-plaintext highlighter-rouge">include</code>, <code class="language-plaintext highlighter-rouge">lib</code> and <code class="language-plaintext highlighter-rouge">bin</code>, and a file <code class="language-plaintext highlighter-rouge">pyvenv.cfg</code>. The file just holds metadata: the command used to create the virtual environment, the executable path and the Python version. The real bread and butter is in the directories.</p>
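
<p>Since <code class="language-plaintext highlighter-rouge">pyvenv.cfg</code> is a plain list of <code class="language-plaintext highlighter-rouge">key = value</code> lines, it is easy to query from a script. The sketch below reads one key with <code class="language-plaintext highlighter-rouge">awk</code>. The helper name <code class="language-plaintext highlighter-rouge">cfg_value</code> and the sample file content are assumptions made for illustration; your own file will contain different paths.</p>

```bash
#!/bin/bash
# Sketch: read one key from a pyvenv.cfg-style file ("key = value" lines).
# The sample content below is illustrative, not copied from a real environment.

cfg_value() {
    local key="$1" file="$2"
    # split each line on " = " and print the value of the first matching key
    awk -F' = ' -v k="$key" '$1 == k { print $2; exit }' "$file"
}

SAMPLE_CFG="$(mktemp)"
cat > "$SAMPLE_CFG" <<'EOF'
home = /home/user/.pyenv/versions/3.12.4/bin
include-system-site-packages = false
version = 3.12.4
EOF

cfg_value version "$SAMPLE_CFG"
```

<p>Against a real environment the call would look like <code class="language-plaintext highlighter-rouge">cfg_value version .venv/pyvenv.cfg</code>.</p>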

<p><code class="language-plaintext highlighter-rouge">lib</code> is where all installed packages live. Try <code class="language-plaintext highlighter-rouge">.venv/bin/python -m pip install numpy</code> and then <code class="language-plaintext highlighter-rouge">ls</code> the directory <code class="language-plaintext highlighter-rouge">.venv/lib/python3.12/site-packages</code>; there will be a directory with your <code class="language-plaintext highlighter-rouge">numpy</code>. If you want to investigate further, go for instance to <code class="language-plaintext highlighter-rouge">random</code> and check the library there with <code class="language-plaintext highlighter-rouge">ls -lhat .venv/lib/python3.12/site-packages/numpy/random</code>. In my case there is a file <code class="language-plaintext highlighter-rouge">_mt19937.cpython-312-darwin.so</code>, the shared library for the <a href="https://en.wikipedia.org/wiki/Mersenne_Twister">Mersenne Twister</a>. We’ll see more on compiled libraries for Python in a later post; it’s not the topic here.</p>

<p>The <code class="language-plaintext highlighter-rouge">bin</code> directory holds the executables for python and pip. Run <code class="language-plaintext highlighter-rouge">ls -lhat .venv/bin</code> to inspect it.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
</pre></td><td class="rouge-code"><pre>drwxr-xr-x  14 user  group   448B  5 Aug 18:59 <span class="nb">.</span>
<span class="nt">-rwxr-xr-x</span>   1 user  group   233B  5 Aug 18:59 f2py
<span class="nt">-rw-r--r--</span>   1 user  group   2.0K  5 Aug 18:37 activate
<span class="nt">-rw-r--r--</span>   1 user  group   8.8K  5 Aug 18:37 Activate.ps1
<span class="nt">-rw-r--r--</span>   1 user  group   918B  5 Aug 18:37 activate.csh
<span class="nt">-rw-r--r--</span>   1 user  group   2.1K  5 Aug 18:37 activate.fish
<span class="nt">-rwxr-xr-x</span>   1 user  group   238B  5 Aug 18:37 pip3.12
<span class="nt">-rwxr-xr-x</span>   1 user  group   238B  5 Aug 18:37 pip3
<span class="nt">-rwxr-xr-x</span>   1 user  group   238B  5 Aug 18:37 pip
lrwxr-xr-x   1 user  group     6B  5 Aug 18:37 python3.12 -&gt; python
lrwxr-xr-x   1 user  group     6B  5 Aug 18:37 python3 -&gt; python
lrwxr-xr-x   1 user  group    46B  5 Aug 18:37 python -&gt; ~/.pyenv/versions/3.12.4/bin/python
</pre></td></tr></tbody></table></code></pre></div></div>
<p>Note that the python executable is actually a symbolic link to the original Python binary from your freshly installed Python <code class="language-plaintext highlighter-rouge">3.12.4</code> in <code class="language-plaintext highlighter-rouge">~/.pyenv</code>. This means that Python is not really copied into your virtual environment; the environment just uses the interpreter it was created with. There are also <code class="language-plaintext highlighter-rouge">python3.12</code> and <code class="language-plaintext highlighter-rouge">python3</code>, both pointing to python in the same directory. You will also find <code class="language-plaintext highlighter-rouge">pip</code>, the default Python package manager; this one is not a symbolic link, and executing it installs packages inside your virtual environment as explained before. Finally there is <code class="language-plaintext highlighter-rouge">activate</code> (or <code class="language-plaintext highlighter-rouge">activate.csh</code>, <code class="language-plaintext highlighter-rouge">activate.fish</code>… for different shells), which modifies your <code class="language-plaintext highlighter-rouge">PATH</code> when it is sourced so that the shell finds the <code class="language-plaintext highlighter-rouge">pip</code> and <code class="language-plaintext highlighter-rouge">python</code> of your virtual environment first.</p>
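
<p>The core of what <code class="language-plaintext highlighter-rouge">activate</code> does can be approximated in a couple of lines. This is a simplified sketch, not the real script: the function name <code class="language-plaintext highlighter-rouge">fake_activate</code> and the path are hypothetical, and the real <code class="language-plaintext highlighter-rouge">activate</code> also adjusts the prompt, unsets <code class="language-plaintext highlighter-rouge">PYTHONHOME</code> and defines <code class="language-plaintext highlighter-rouge">deactivate</code>.</p>

```bash
#!/bin/bash
# Simplified sketch of what sourcing `activate` does; the real script also
# adjusts the prompt, unsets PYTHONHOME and defines `deactivate`.
# `fake_activate` and the path used below are hypothetical.

fake_activate() {
    export VIRTUAL_ENV="$1"
    # put the environment's bin directory first so its python and pip win the lookup
    export PATH="${VIRTUAL_ENV}/bin:${PATH}"
}

fake_activate "/tmp/project/.venv"

# print the first PATH entry, which is now the environment's bin directory
echo "${PATH%%:*}"
```

<p>Because the shell searches <code class="language-plaintext highlighter-rouge">PATH</code> from left to right, prepending the environment's <code class="language-plaintext highlighter-rouge">bin</code> is enough for a plain <code class="language-plaintext highlighter-rouge">python</code> to resolve to the one in the virtual environment.</p>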

<p>To activate the environment simply:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">source</span> .venv/bin/activate
</pre></td></tr></tbody></table></code></pre></div></div>

<p>(deactivate with <code class="language-plaintext highlighter-rouge">deactivate</code>). Then every time you type <code class="language-plaintext highlighter-rouge">python</code> it will use this environment. However, I’m a special guy and I’m not a big fan of activating environments when I’m working on a specific project; I just call <code class="language-plaintext highlighter-rouge">python</code> and <code class="language-plaintext highlighter-rouge">pip</code> directly from the directory of the virtual environment:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre><span class="c"># calling python</span>
.venv/bin/python

<span class="c"># calling pip to install numpy</span>
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>numpy

<span class="c"># or equivalently</span>
.venv/bin/pip <span class="nb">install </span>numpy
</pre></td></tr></tbody></table></code></pre></div></div>

<p>For convenience I always create a bash script and place it in the root of my repository, for instance</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
</pre></td><td class="rouge-code"><pre><span class="c">#!/bin/bash</span>

<span class="nv">SCRIPT_DIR</span><span class="o">=</span><span class="s2">"</span><span class="si">$(</span><span class="nb">cd</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">dirname</span> <span class="s2">"</span><span class="k">${</span><span class="nv">BASH_SOURCE</span><span class="p">[0]</span><span class="k">}</span><span class="s2">"</span><span class="si">)</span><span class="s2">"</span> <span class="o">&amp;&amp;</span> <span class="nb">pwd</span><span class="si">)</span><span class="s2">"</span>
<span class="nv">PYTHON_VERSION</span><span class="o">=</span>3.12.2

<span class="c"># set python version from your pyenv</span>
<span class="nv">PYTHON_PATH</span><span class="o">=</span><span class="s2">"</span><span class="si">$(</span>pyenv root<span class="si">)</span><span class="s2">/versions/</span><span class="nv">$PYTHON_VERSION</span><span class="s2">/bin/python"</span>

<span class="c"># remove env and recreate</span>
<span class="nb">rm</span> <span class="nt">-rf</span> <span class="k">${</span><span class="nv">SCRIPT_DIR</span><span class="k">}</span>/.venv

<span class="k">${</span><span class="nv">PYTHON_PATH</span><span class="k">}</span> <span class="nt">-m</span> venv <span class="k">${</span><span class="nv">SCRIPT_DIR</span><span class="k">}</span>/.venv
<span class="k">${</span><span class="nv">SCRIPT_DIR</span><span class="k">}</span>/.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">--upgrade</span> pip

<span class="c"># # install any kind of dependencies</span>
<span class="c"># ${SCRIPT_DIR}/.venv/bin/python -m pip install -r requirements.txt</span>

<span class="c"># #or install repository package</span>
<span class="c"># ${SCRIPT_DIR}/.venv/bin/python -m pip install ${SCRIPT_DIR}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>which will create a new virtual environment in the directory where the script is placed, using Python 3.12.2.</p>
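
<p>Because the script runs <code class="language-plaintext highlighter-rouge">rm -rf</code> on a path built from a variable, a misspelled or empty variable silently changes the target: an empty prefix turns the path into <code class="language-plaintext highlighter-rouge">/.venv</code>. Below is a minimal sketch of a guard using the <code class="language-plaintext highlighter-rouge">${var:?}</code> expansion, assuming bash. The names <code class="language-plaintext highlighter-rouge">remove_env</code> and <code class="language-plaintext highlighter-rouge">WORK_DIR</code> are hypothetical and the demonstration only touches a temporary directory.</p>

```bash
#!/bin/bash
# Sketch: make a destructive command fail fast when a variable is unset or empty.
# WORK_DIR is a throwaway directory created only for this demonstration.

WORK_DIR="$(mktemp -d)"
mkdir -p "${WORK_DIR}/.venv"

remove_env() {
    local dir="${1:-}"
    # ${dir:?message} aborts the subshell if dir is unset or empty,
    # so the command can never turn into `rm -rf /.venv`.
    ( rm -rf "${dir:?directory not set}/.venv" )
}

remove_env "${WORK_DIR}"      # removes ${WORK_DIR}/.venv
remove_env "" 2>/dev/null || echo "refused to run with an empty directory"
```

<p>Adding <code class="language-plaintext highlighter-rouge">set -u</code> at the top of a script is a complementary option, since it makes any reference to an undefined variable an error.</p>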

<blockquote class="prompt-tip">
  <p>TIP: Sometimes there’s no need to create a new virtual environment per project. For instance, if you are a data scientist crunching some numbers here and there, you may want to reuse the same virtual environment without paying too much attention to the Python or package versions. For those cases create virtual environments in your home directory like <code class="language-plaintext highlighter-rouge">mkdir -p ~/.venvs &amp;&amp; python -m venv ~/.venvs/data_science</code>, and activate with <code class="language-plaintext highlighter-rouge">source ~/.venvs/data_science/bin/activate</code>.</p>
</blockquote>

<h2 id="install-and-pin-dependencies-on-a-virtual-environment-with-pip">Install and pin dependencies on a virtual environment with pip</h2>

<p>Once you have created a virtual environment, the next step is usually to install your dependencies. No matter what you do in Python, it’s 99% probable that you need an external package. On the command line you can just <code class="language-plaintext highlighter-rouge">pip install package</code> for whatever dependency you need, specifying the version so that another developer can reproduce your code with exactly the same results. The same applies to services (e.g. REST APIs) that need to be shut down, deleted, rebuilt and restarted. For this reason we must pin dependencies for reproducible outcomes, and we do that with a file called <code class="language-plaintext highlighter-rouge">requirements.txt</code>.</p>

<p>Let’s install <code class="language-plaintext highlighter-rouge">numpy</code>, <code class="language-plaintext highlighter-rouge">pandas</code> and <code class="language-plaintext highlighter-rouge">matplotlib</code> to our virtual environment</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>numpy
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>pandas
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>matplotlib
</pre></td></tr></tbody></table></code></pre></div></div>

<p>At this point pip works out the most suitable version for each of these packages, usually the latest one that supports your Python version. To see exactly which versions were installed we can run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-m</span> pip freeze
</pre></td></tr></tbody></table></code></pre></div></div>

<p>In my case, I see <code class="language-plaintext highlighter-rouge">numpy==1.21.6</code>, <code class="language-plaintext highlighter-rouge">pandas==1.3.5</code> and <code class="language-plaintext highlighter-rouge">matplotlib==3.5.3</code> and a bunch of other packages. Two things to note here. First, the <code class="language-plaintext highlighter-rouge">==</code> means that exactly that version of the package is used and no other. Second, why are there other packages if we just installed three? It turns out these three packages depend on other packages, and those have to be pinned too! You can see the whole dependency tree with two packages I recently found called <code class="language-plaintext highlighter-rouge">pipdeptree</code> and <code class="language-plaintext highlighter-rouge">pip-tree</code>; both are great, check them out.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
</pre></td><td class="rouge-code"><pre><span class="c"># install pipdeptree</span>
.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install </span>pipdeptree

<span class="c"># install pip-tree</span>
<span class="c"># .venv/bin/python -m pip install pip-tree</span>

<span class="c"># run pipdeptree</span>
.venv/bin/pipdeptree

<span class="c"># or run pip-tree</span>
<span class="c"># .venv/bin/pip-tree</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>which gives the following:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
</pre></td><td class="rouge-code"><pre><span class="nv">matplotlib</span><span class="o">==</span>3.5.3
  - cycler <span class="o">[</span>required: <span class="o">&gt;=</span>0.10, installed: 0.11.0]
  - fonttools <span class="o">[</span>required: <span class="o">&gt;=</span>4.22.0, installed: 4.38.0]
  - kiwisolver <span class="o">[</span>required: <span class="o">&gt;=</span>1.0.1, installed: 1.4.5]
    - typing-extensions <span class="o">[</span>required: Any, installed: 4.7.1]
  - numpy <span class="o">[</span>required: <span class="o">&gt;=</span>1.17, installed: 1.21.6]
  - packaging <span class="o">[</span>required: <span class="o">&gt;=</span>20.0, installed: 24.0]
  - Pillow <span class="o">[</span>required: <span class="o">&gt;=</span>6.2.0, installed: 9.5.0]
  - pyparsing <span class="o">[</span>required: <span class="o">&gt;=</span>2.2.1, installed: 3.1.4]
  - python-dateutil <span class="o">[</span>required: <span class="o">&gt;=</span>2.7, installed: 2.9.0.post0]
    - six <span class="o">[</span>required: <span class="o">&gt;=</span>1.5, installed: 1.16.0]
<span class="nv">pandas</span><span class="o">==</span>1.3.5
  - numpy <span class="o">[</span>required: <span class="o">&gt;=</span>1.17.3, installed: 1.21.6]
  - python-dateutil <span class="o">[</span>required: <span class="o">&gt;=</span>2.7.3, installed: 2.9.0.post0]
    - six <span class="o">[</span>required: <span class="o">&gt;=</span>1.5, installed: 1.16.0]
  - pytz <span class="o">[</span>required: <span class="o">&gt;=</span>2017.3, installed: 2024.1]
<span class="nv">pip</span><span class="o">==</span>24.0
<span class="nv">pipdeptree</span><span class="o">==</span>2.9.6
<span class="nv">setuptools</span><span class="o">==</span>40.8.0
</pre></td></tr></tbody></table></code></pre></div></div>

<p>So <code class="language-plaintext highlighter-rouge">matplotlib</code> depends on <code class="language-plaintext highlighter-rouge">cycler</code>, <code class="language-plaintext highlighter-rouge">fonttools</code>, <code class="language-plaintext highlighter-rouge">kiwisolver</code>…, <code class="language-plaintext highlighter-rouge">pandas</code> on <code class="language-plaintext highlighter-rouge">numpy</code>, <code class="language-plaintext highlighter-rouge">python-dateutil</code>…, <code class="language-plaintext highlighter-rouge">pipdeptree</code> does not depend on any other package, and finally we have the default <code class="language-plaintext highlighter-rouge">pip</code> and <code class="language-plaintext highlighter-rouge">setuptools</code> that came with this Python installation. Note that the specific versions of <code class="language-plaintext highlighter-rouge">matplotlib</code> (and <code class="language-plaintext highlighter-rouge">pandas</code>) are pinned, and their dependencies are pinned too (e.g. <code class="language-plaintext highlighter-rouge">Pillow==9.5.0</code>). Every sub-dependency has a <code class="language-plaintext highlighter-rouge">required: &gt;=</code> entry, indicating that the package could work with another version of the sub-dependency as long as this requirement is met. The task of <code class="language-plaintext highlighter-rouge">pip</code> is to figure out versions of the dependencies and sub-dependencies such that everything is compatible.</p>
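
<p>The difference between an exact pin (<code class="language-plaintext highlighter-rouge">==</code>) and a range (<code class="language-plaintext highlighter-rouge">&gt;=</code>) is easy to check mechanically. The sketch below lists the lines of a requirements-style file that are not pinned to an exact version. The helper name <code class="language-plaintext highlighter-rouge">unpinned</code> and the sample file are assumptions for illustration, and the check is a plain text match rather than a full parser of the requirements format.</p>

```bash
#!/bin/bash
# Sketch: list requirement lines that are not pinned to an exact version.
# The sample file is illustrative; it is a text match, not a full requirements parser.

unpinned() {
    # drop blank lines and comments, then keep whatever lacks an exact "==" pin
    grep -Ev '^[[:space:]]*(#|$)' "$1" | grep -v '=='
}

SAMPLE_REQS="$(mktemp)"
cat > "$SAMPLE_REQS" <<'EOF'
# exact pins, as written by pip freeze
numpy==1.21.6
pandas==1.3.5

# a range, as a package would declare it
cycler>=0.10
EOF

unpinned "$SAMPLE_REQS"
```

<p>On the sample file only the <code class="language-plaintext highlighter-rouge">cycler</code> line is reported, since it is the only one declared as a range.</p>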

<p>Why would one not pin a dependency when building a package? Easy: Python packages need to be flexible. They should work for a range of Python versions and package dependencies; if you constrain your package too much nobody will use it, as it will be incompatible with other packages. We will talk more about that in another post.</p>

<p>Getting back to the topic: we have pinned versions of the packages, so how can we hand them to someone (another developer) so that they can install exactly the same environment as you? Introducing the file <code class="language-plaintext highlighter-rouge">requirements.txt</code>. It may seem legacy, but it is still widely used to pin dependencies. To generate it, run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-m</span> pip freeze <span class="o">&gt;</span> requirements.txt
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and there you go! Pass it to a friend, make sure they use the same Python version as you, and they’ll need to run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>.venv/bin/python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">-r</span> requirements.txt
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This gives them exactly the same environment.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Python" /><category term="computer science" /><summary type="html"><![CDATA[So far we have seen how to get different Python versions on your system and, briefly, how to install packages using pip install. In real projects you want to control both the Python version and the package versions, and to keep a fixed combination of the two per project.]]></summary></entry><entry><title type="html">Python version management with Pyenv</title><link href="https://agramunt.me/posts/pyenv/" rel="alternate" type="text/html" title="Python version management with Pyenv" /><published>2024-07-05T02:20:15-07:00</published><updated>2024-11-05T06:52:39-08:00</updated><id>https://agramunt.me/posts/pyenv</id><content type="html" xml:base="https://agramunt.me/posts/pyenv/"><![CDATA[<p>In the previous posts we saw how to install different Python versions on the system, but in a cumbersome way: using <code class="language-plaintext highlighter-rouge">brew</code> to install another version (and manually modifying the PATH variable to make it available), downloading and compiling specific versions, or using <code class="language-plaintext highlighter-rouge">conda</code> to create new environments with a different Python version. This is where <a href="https://github.com/pyenv/pyenv">pyenv</a> comes in very handy. Pyenv allows you to install different Python versions on your system and select which one to use at any moment with a couple of easy commands. In this post we explore how to install and use pyenv, in my opinion the ultimate Python version manager for any developer.</p>

<h2 id="tldr">TLDR</h2>

<p>Assuming you use <code class="language-plaintext highlighter-rouge">zsh</code> on a <code class="language-plaintext highlighter-rouge">linux</code>/<code class="language-plaintext highlighter-rouge">macOS</code> machine:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre>git clone https://github.com/pyenv/pyenv.git ~/.pyenv

<span class="nv">SRC_FILE</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span><span class="s2">/.zshrc"</span>
<span class="nb">echo</span> <span class="s1">'export PYENV_ROOT="$HOME/.pyenv"'</span> <span class="o">&gt;&gt;</span> <span class="k">${</span><span class="nv">SRC_FILE</span><span class="k">}</span>
<span class="nb">echo</span> <span class="s1">'command -v pyenv &gt;/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"'</span> <span class="o">&gt;&gt;</span> <span class="k">${</span><span class="nv">SRC_FILE</span><span class="k">}</span>
<span class="nb">echo</span> <span class="s1">'eval "$(pyenv init -)"'</span> <span class="o">&gt;&gt;</span> <span class="k">${</span><span class="nv">SRC_FILE</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>then <code class="language-plaintext highlighter-rouge">source ~/.zshrc</code> and check if <code class="language-plaintext highlighter-rouge">pyenv</code> is installed</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>pyenv <span class="nt">--version</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Install a Python version and use it in your current shell:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>pyenv <span class="nb">install </span>3.12.4
pyenv shell 3.12.4
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="prerequisites-to-install-pyenv">Prerequisites to install pyenv</h2>

<p>For Linux (Debian or Ubuntu)</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
</pre></td><td class="rouge-code"><pre><span class="nb">sudo </span>apt-get update
<span class="nb">sudo </span>apt-get <span class="nb">install</span> <span class="nt">-y</span> git <span class="se">\</span>
                        curl <span class="se">\</span>
                        build-essential <span class="se">\</span>
                        libssl-dev <span class="se">\</span>
                        zlib1g-dev <span class="se">\</span>
                        libbz2-dev <span class="se">\</span>
                        libreadline-dev <span class="se">\</span>
                        libsqlite3-dev <span class="se">\</span>
                        libffi-dev <span class="se">\</span>
                        libncurses5-dev <span class="se">\</span>
                        libncursesw5-dev <span class="se">\</span>
                        xz-utils <span class="se">\</span>
                        tk-dev

</pre></td></tr></tbody></table></code></pre></div></div>

<p>For macOS</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
</pre></td><td class="rouge-code"><pre>brew update

brew <span class="nb">install </span>git <span class="se">\</span>
             curl <span class="se">\</span>
             openssl <span class="se">\</span>
             readline <span class="se">\</span>
             sqlite3 <span class="se">\</span>
             xz <span class="se">\</span>
             zlib
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="install-pyenv">Install Pyenv</h2>

<p>I prefer to install it by cloning the repository directly into the recommended installation path <code class="language-plaintext highlighter-rouge">~/.pyenv</code>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>git clone https://github.com/pyenv/pyenv.git ~/.pyenv
</pre></td></tr></tbody></table></code></pre></div></div>

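<p>Now append the following lines to <code class="language-plaintext highlighter-rouge">~/.zshrc</code> (or to <code class="language-plaintext highlighter-rouge">~/.bashrc</code>, changing <code class="language-plaintext highlighter-rouge">SRC_FILE</code> accordingly) by running:</p>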

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre><span class="nv">SRC_FILE</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span><span class="s2">/.zshrc"</span>
<span class="nb">echo</span> <span class="s1">'export PYENV_ROOT="$HOME/.pyenv"'</span> <span class="o">&gt;&gt;</span> <span class="k">${</span><span class="nv">SRC_FILE</span><span class="k">}</span>
<span class="nb">echo</span> <span class="s1">'command -v pyenv &gt;/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"'</span> <span class="o">&gt;&gt;</span> <span class="k">${</span><span class="nv">SRC_FILE</span><span class="k">}</span>
<span class="nb">echo</span> <span class="s1">'eval "$(pyenv init -)"'</span> <span class="o">&gt;&gt;</span> <span class="k">${</span><span class="nv">SRC_FILE</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Finally, just <code class="language-plaintext highlighter-rouge">source ~/.zshrc</code> (or <code class="language-plaintext highlighter-rouge">source ~/.bashrc</code>). Now try</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>whereis pyenv
pyenv <span class="nt">--help</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="uninstall-pyenv">Uninstall Pyenv</h2>

<p>Remove the directory</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-fr</span> ~/.pyenv
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and then remove the lines you appended to <code class="language-plaintext highlighter-rouge">~/.zshrc</code> or <code class="language-plaintext highlighter-rouge">~/.bashrc</code>.</p>
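
<p>For reference, and assuming you installed exactly as described in the previous section, these are the three lines to look for in that file:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>export PYENV_ROOT="$HOME/.pyenv"
command -v pyenv &gt;/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
</pre></td></tr></tbody></table></code></pre></div></div>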

<h2 id="installing-a-specific-python-version">Installing a specific python version</h2>

<p>Let’s say I want to install Python <code class="language-plaintext highlighter-rouge">3.11</code>. I can take a look at the available versions of <code class="language-plaintext highlighter-rouge">3.11</code> by running</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>pyenv <span class="nb">install</span> <span class="nt">--list</span> | <span class="nb">grep </span>3.11
</pre></td></tr></tbody></table></code></pre></div></div>

<p>then select <code class="language-plaintext highlighter-rouge">3.11.2</code></p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>pyenv <span class="nb">install </span>3.11.2
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Let’s also install <code class="language-plaintext highlighter-rouge">3.12.4</code></p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>pyenv <span class="nb">install </span>3.12.4
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now check which versions you have installed running</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>pyenv versions
</pre></td></tr></tbody></table></code></pre></div></div>

<p>getting</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="k">*</span> system <span class="o">(</span><span class="nb">set </span>by ~/.pyenv/version<span class="o">)</span>
  3.11.2
  3.12.4
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The asterisk marks the currently active Python version in the terminal, in this case <code class="language-plaintext highlighter-rouge">system</code>. To uninstall any version (e.g. <code class="language-plaintext highlighter-rouge">3.11.2</code>) just do</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>pyenv uninstall 3.11.2
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="setting-python-version">Setting python version</h2>

<p>Pyenv allows you to define which installed Python version to use at any given time. There are three levels: <code class="language-plaintext highlighter-rouge">global</code>, <code class="language-plaintext highlighter-rouge">local</code> and <code class="language-plaintext highlighter-rouge">shell</code>.</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">global</code>: sets the default Python version for your user account</li>
  <li><code class="language-plaintext highlighter-rouge">local</code>: sets the Python version for the current directory</li>
  <li><code class="language-plaintext highlighter-rouge">shell</code>: sets the Python version for the current shell</li>
</ul>

<p>To set the global Python version to, for example, <code class="language-plaintext highlighter-rouge">3.11.2</code>, just type in your terminal</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>pyenv global 3.11.2
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now every time you run <code class="language-plaintext highlighter-rouge">python</code> in a new terminal, it is going to default to <code class="language-plaintext highlighter-rouge">3.11.2</code>.</p>

<p>To set a local python version you type</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>pyenv <span class="nb">local </span>3.11.2
</pre></td></tr></tbody></table></code></pre></div></div>

<p>in your desired directory. You will see that it creates a new file called <code class="language-plaintext highlighter-rouge">.python-version</code> containing <code class="language-plaintext highlighter-rouge">3.11.2</code>. When you are in your terminal and have <code class="language-plaintext highlighter-rouge">cd</code>ed to that directory, pyenv will look for that file and point the <code class="language-plaintext highlighter-rouge">python</code> command to that specific version. This is very useful when you want to pin the Python version used in a specific project. That being said, the pin only takes effect for developers who also use <code class="language-plaintext highlighter-rouge">pyenv</code> in that project.</p>

<p>Finally, to pin a python version for your current interactive shell you just run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>pyenv shell 3.11.2
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This stays in effect for as long as you work in that interactive shell; any new shell falls back to the <code class="language-plaintext highlighter-rouge">local</code> or <code class="language-plaintext highlighter-rouge">global</code> Python version. I use the <code class="language-plaintext highlighter-rouge">pyenv shell</code> command for creating new virtual environments with specific Python versions, but I will comment on that in a later post.</p>

<p>The precedence for the Python version is <code class="language-plaintext highlighter-rouge">shell</code> &gt; <code class="language-plaintext highlighter-rouge">local</code> &gt; <code class="language-plaintext highlighter-rouge">global</code>: if a <code class="language-plaintext highlighter-rouge">shell</code> version is set, the terminal uses it; otherwise it falls back to the <code class="language-plaintext highlighter-rouge">local</code> version and, if that doesn’t exist either, to the <code class="language-plaintext highlighter-rouge">global</code> version.</p>
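
<p>To make that order concrete, here is a rough shell sketch of the lookup. It is not pyenv’s actual implementation: the function name <code class="language-plaintext highlighter-rouge">resolve_version</code> is made up for illustration, and it assumes that the shell level lives in the <code class="language-plaintext highlighter-rouge">PYENV_VERSION</code> environment variable, the local level in a <code class="language-plaintext highlighter-rouge">.python-version</code> file (searched from the current directory upwards) and the global level in <code class="language-plaintext highlighter-rouge">~/.pyenv/version</code>, as described in the pyenv README. Save it to a file and <code class="language-plaintext highlighter-rouge">source</code> it if you want to play with it.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
</pre></td><td class="rouge-code"><pre>resolve_version() {
  # 1. shell level: the PYENV_VERSION environment variable wins
  if [ -n "${PYENV_VERSION:-}" ]; then
    echo "${PYENV_VERSION}"
    return
  fi
  # 2. local level: nearest .python-version, walking up from the current directory
  local dir="${PWD}"
  while true; do
    if [ -f "${dir}/.python-version" ]; then
      cat "${dir}/.python-version"
      return
    fi
    [ -n "${dir}" ] || break
    dir="${dir%/*}"
  done
  # 3. global level: ~/.pyenv/version, falling back to "system" if it is missing
  cat "${PYENV_ROOT:-$HOME/.pyenv}/version" 2&gt;/dev/null || echo system
}
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Calling <code class="language-plaintext highlighter-rouge">resolve_version</code> inside a project directory prints the version pinned there, while <code class="language-plaintext highlighter-rouge">PYENV_VERSION=3.12.4 resolve_version</code> prints <code class="language-plaintext highlighter-rouge">3.12.4</code> regardless of the directory. With a real installation, <code class="language-plaintext highlighter-rouge">pyenv version</code> reports the active version together with where it was set.</p>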

<h2 id="a-bit-of-pyenv-inner-workings">A bit of pyenv inner workings</h2>

<p>I am not going to rewrite all the information in the <code class="language-plaintext highlighter-rouge">pyenv</code> <a href="https://github.com/pyenv/pyenv?tab=readme-ov-file#how-it-works">README.md</a>; check there for a fully detailed explanation of how <code class="language-plaintext highlighter-rouge">pyenv</code> works. I just want to highlight a few things to give a bird’s-eye view. Essentially, what the <code class="language-plaintext highlighter-rouge">pyenv</code> installation does is prepend the directory <code class="language-plaintext highlighter-rouge">~/.pyenv/shims</code> to your <code class="language-plaintext highlighter-rouge">PATH</code>. If you look closely at what’s in there:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">ls</span> <span class="nt">-lhat</span> ~/.pyenv/shims
</pre></td></tr></tbody></table></code></pre></div></div>

<p>you will find <code class="language-plaintext highlighter-rouge">python</code> and <code class="language-plaintext highlighter-rouge">pip</code> executables, among others. Every time you run <code class="language-plaintext highlighter-rouge">python</code>, your shell tries to find that executable in the directories in its <code class="language-plaintext highlighter-rouge">PATH</code>, starting from the first. By prepending <code class="language-plaintext highlighter-rouge">~/.pyenv/shims</code> to the <code class="language-plaintext highlighter-rouge">PATH</code>, the <code class="language-plaintext highlighter-rouge">python</code> executable found will be <code class="language-plaintext highlighter-rouge">~/.pyenv/shims/python</code>. That <code class="language-plaintext highlighter-rouge">python</code> is actually an executable that redirects the call to <code class="language-plaintext highlighter-rouge">pyenv</code>, which in turn decides whether to point to the <code class="language-plaintext highlighter-rouge">global</code>, <code class="language-plaintext highlighter-rouge">local</code> or <code class="language-plaintext highlighter-rouge">shell</code> Python version. This is very clean!</p>
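
<p>The same first-match-wins lookup can be reproduced with a throwaway directory. This is only an illustration of how <code class="language-plaintext highlighter-rouge">PATH</code> resolution works, not pyenv’s real shim: the <code class="language-plaintext highlighter-rouge">DEMO_SHIMS</code> variable and the fake <code class="language-plaintext highlighter-rouge">python</code> script are invented for the demo. Since it changes <code class="language-plaintext highlighter-rouge">PATH</code>, it is best saved as a script and run with <code class="language-plaintext highlighter-rouge">bash</code> so that your current shell is left untouched.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
</pre></td><td class="rouge-code"><pre># Create a scratch directory holding a fake "python" executable
DEMO_SHIMS="$(mktemp -d)"
printf '#!/bin/sh\necho "fake shim called with: $*"\n' &gt; "${DEMO_SHIMS}/python"
chmod +x "${DEMO_SHIMS}/python"

# Prepend it to PATH, just like pyenv does with ~/.pyenv/shims
export PATH="${DEMO_SHIMS}:${PATH}"

# The lookup now stops at the scratch directory, so the fake script runs
command -v python
python --version
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The first command prints the path of the fake executable inside the scratch directory, and the second one runs it instead of any real interpreter installed further down the <code class="language-plaintext highlighter-rouge">PATH</code>.</p>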

<p>Different versions of python are installed under <code class="language-plaintext highlighter-rouge">~/.pyenv/versions</code>. Without using shims you can call directly any python version you have installed on your bash/zsh shell, for instance if we want to call python 3.11.2:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>~/.pyenv/versions/3.11.2/bin/python
</pre></td></tr></tbody></table></code></pre></div></div>

<p>but it is obviously not good practice to install packages by calling this Python directly. Instead, you should create a new virtual environment in the current directory, based on the desired Python version, using the <code class="language-plaintext highlighter-rouge">venv</code> module that ships with it (we’ll go into the details in the next post).</p>
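
<p>As a rough preview, and assuming Python <code class="language-plaintext highlighter-rouge">3.11.2</code> was installed with pyenv under the default <code class="language-plaintext highlighter-rouge">~/.pyenv</code> root, creating such an environment could look like the sketch below. The <code class="language-plaintext highlighter-rouge">.venv</code> directory name is just a common convention, not a requirement.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre># Create a virtual environment in ./.venv from the pyenv-installed interpreter
~/.pyenv/versions/3.11.2/bin/python -m venv .venv

# Use the environment's own interpreter and pip from now on
.venv/bin/python --version
.venv/bin/python -m pip --version
</pre></td></tr></tbody></table></code></pre></div></div>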

<h2 id="conda-in-pyenv">Conda in pyenv</h2>

<p>One of the main reasons why I wouldn’t install miniconda directly is that you can actually install it as a <code class="language-plaintext highlighter-rouge">pyenv</code> version, although it is a bit tricky to set up if you don’t want to mess with your <code class="language-plaintext highlighter-rouge">PATH</code> variable. Check the available miniconda versions</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>pyenv <span class="nb">install</span> <span class="nt">--list</span> | <span class="nb">grep </span>miniconda3
</pre></td></tr></tbody></table></code></pre></div></div>
<p>I’ll take the latest version, <code class="language-plaintext highlighter-rouge">miniconda3-latest</code>. Let’s proceed to install it</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>pyenv <span class="nb">install </span>miniconda3-latest
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Go ahead and make it available in your shell, then check that conda is on your command line</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>pyenv shell miniconda3-latest
conda <span class="nt">--help</span>
conda <span class="nt">--version</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>If you ever need to update this version just run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>conda update <span class="nt">-n</span> base <span class="nt">-c</span> defaults conda
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now you can create a new virtual environment called <code class="language-plaintext highlighter-rouge">myenv2</code></p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>conda create <span class="nt">-n</span> myenv2 <span class="nv">python</span><span class="o">=</span>3.12 <span class="nt">-y</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>After this it gets tricky if you want to activate the environment. Normally you would run <code class="language-plaintext highlighter-rouge">conda activate myenv2</code>, but then conda complains that you need to run <code class="language-plaintext highlighter-rouge">conda init zsh</code> first (in my case, since I use zsh), which writes some lines to <code class="language-plaintext highlighter-rouge">~/.zshrc</code> that allow you to activate an environment by modifying the PATH. I personally don’t like letting conda control my PATH (that’s why I use <code class="language-plaintext highlighter-rouge">pyenv</code> after all). What one can do instead is activate the base environment like</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">source</span> ~/.pyenv/versions/miniconda3-latest/bin/activate
</pre></td></tr></tbody></table></code></pre></div></div>
<p>and then you can activate the environment as</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>conda activate myenv2
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Often I don’t use the <code class="language-plaintext highlighter-rouge">activate</code> functionality at all; I just call the Python binary directly:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="nv">PYTHON_VERSION</span><span class="o">=</span>miniconda3-latest
<span class="nv">ENV</span><span class="o">=</span>myenv2
<span class="k">${</span><span class="nv">PYENV_ROOT</span><span class="k">}</span>/versions/<span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>/envs/<span class="k">${</span><span class="nv">ENV</span><span class="k">}</span>/bin/python <span class="nt">--version</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This is pretty handy to use in bash scripts. Note that every new environment we create using conda will be placed in the directory <code class="language-plaintext highlighter-rouge">~/.pyenv/versions/miniconda3-latest/envs/</code>.</p>

<h2 id="conclusions">Conclusions</h2>

<p>pyenv is a versatile Python version manager. It simply works and allows you to use other managers like <code class="language-plaintext highlighter-rouge">miniconda</code> or <code class="language-plaintext highlighter-rouge">mamba</code>. It is by far my preferred way of managing Python versions in a development environment.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Python" /><category term="computer science" /><summary type="html"><![CDATA[In the previous posts we saw how to install different Python versions on a system, but in a cumbersome way: using brew to install another version (and manually modifying the PATH variable to make it available), downloading and compiling specific versions, or using conda to create new environments with a different Python version. This is where pyenv comes in very handy. Pyenv allows you to install different Python versions on your system and select which one to use at any moment with a couple of easy commands. In this post we explore how to install and use pyenv, in my opinion the ultimate Python version manager for any developer.]]></summary></entry><entry><title type="html">Conda Package Manager</title><link href="https://agramunt.me/posts/conda/" rel="alternate" type="text/html" title="Conda Package Manager" /><published>2024-07-04T01:23:12-07:00</published><updated>2026-01-24T12:49:11-08:00</updated><id>https://agramunt.me/posts/conda</id><content type="html" xml:base="https://agramunt.me/posts/conda/"><![CDATA[<p><a href="https://www.anaconda.com/">Anaconda</a> is a distribution of the Python and R programming languages for scientific computing, data science, machine learning, and large-scale data processing. It simplifies package management and deployment, and it provides many useful tools and libraries out of the box. 
Having said that, I have to confess that I’m not really a fan of Anaconda. It bundles a lot of software you may not want to use, like <a href="https://www.spyder-ide.org/">Spyder IDE</a> (I use vscode instead), Anaconda Navigator (a UI that helps you manage your virtual environments), <a href="https://posit.co/">RStudio</a> for <a href="https://www.r-project.org/">R</a>, and lots of Python packages that you may never need. I downloaded the installer today and it told me that the total installation size is 4.82 GB. As you may guess, I’m not installing Anaconda today, not even to try it (I did in the past).</p>

<p>A better alternative (IMHO) to Anaconda is its mini version, <a href="https://docs.anaconda.com/miniconda/">miniconda</a>. It’s basically the Anaconda environment manager and package installer without all the fancy UI you may not need. In this post we’ll show how to install it.</p>

<h2 id="tldr">TLDR</h2>

<p>For macOS (Intel) and a zsh terminal:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre><span class="nb">mkdir</span> <span class="nt">-p</span> ~/miniconda3
curl https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh <span class="nt">-o</span> ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh <span class="nt">-b</span> <span class="nt">-u</span> <span class="nt">-p</span> ~/miniconda3
<span class="nb">rm</span> <span class="nt">-rf</span> ~/miniconda3/miniconda.sh
~/miniconda3/bin/conda init zsh
<span class="nb">source</span> ~/.zshrc
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="install-miniconda">Install miniconda</h2>

<p>As of today you have the following versions:</p>

<ul>
  <li>macOS Intel (older Macs): <a href="https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh">Miniconda3-latest-MacOSX-x86_64.sh</a></li>
  <li>macOS ARM, M1 and later (newer Macs): <a href="https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh">Miniconda3-latest-MacOSX-arm64.sh</a></li>
  <li>Linux: <a href="https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh">Miniconda3-latest-Linux-x86_64.sh</a></li>
  <li>Windows: <a href="https://repo.anaconda.com/miniconda/Miniconda3-latest-Windows-x86_64.exe">Miniconda3-latest-Windows-x86_64.exe</a></li>
</ul>

<p>I’m going to show how to install it on an Intel Mac, but you can find all the instructions <a href="https://docs.anaconda.com/miniconda/">here</a>. It is as easy as downloading a bash script and executing it, which completes the installation.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre><span class="nb">mkdir</span> <span class="nt">-p</span> ~/miniconda3
curl https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh <span class="nt">-o</span> ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh <span class="nt">-b</span> <span class="nt">-u</span> <span class="nt">-p</span> ~/miniconda3
<span class="nb">rm</span> <span class="nt">-rf</span> ~/miniconda3/miniconda.sh
</pre></td></tr></tbody></table></code></pre></div></div>

<p>That installs miniconda, but to make it available in your shell you should run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>~/miniconda3/bin/conda init bash
</pre></td></tr></tbody></table></code></pre></div></div>

<p>if you are using bash or</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>~/miniconda3/bin/conda init zsh
</pre></td></tr></tbody></table></code></pre></div></div>

<p>if you use zsh. In zsh, for instance, this appends some lines of code to your <code class="language-plaintext highlighter-rouge">~/.zshrc</code>; check it by running <code class="language-plaintext highlighter-rouge">cat ~/.zshrc</code> and see that conda has added the code between <code class="language-plaintext highlighter-rouge"># &gt;&gt;&gt; conda initialize &gt;&gt;&gt;</code> and <code class="language-plaintext highlighter-rouge"># &lt;&lt;&lt; conda initialize &lt;&lt;&lt;</code>. Now, to update your current shell just run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">source</span> ~/.zshrc
</pre></td></tr></tbody></table></code></pre></div></div>

<p>(or <code class="language-plaintext highlighter-rouge">~/.bashrc</code> if you use bash) and check that the <code class="language-plaintext highlighter-rouge">conda</code> command is there. If that works, congrats, you have conda installed! All your conda stuff is in <code class="language-plaintext highlighter-rouge">~/miniconda3</code>. conda is now available on your system and it has taken over the <code class="language-plaintext highlighter-rouge">python</code> command in your terminal: check that by running <code class="language-plaintext highlighter-rouge">which python</code> and you’ll find that it points to <code class="language-plaintext highlighter-rouge">~/miniconda3/bin/python</code>. Basically, your default Python is now managed by conda; that is the effect of the lines that the <code class="language-plaintext highlighter-rouge">init bash</code>/<code class="language-plaintext highlighter-rouge">init zsh</code> command writes to your <code class="language-plaintext highlighter-rouge">~/.bashrc</code>/<code class="language-plaintext highlighter-rouge">~/.zshrc</code>.</p>

<h2 id="uninstall">Uninstall</h2>

<p>To uninstall, just remove <code class="language-plaintext highlighter-rouge">~/miniconda3</code> and the lines appended to <code class="language-plaintext highlighter-rouge">~/.bashrc</code> or <code class="language-plaintext highlighter-rouge">~/.zshrc</code>. In the example shown:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">rm</span> <span class="nt">-rf</span> ~/miniconda3
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Open <code class="language-plaintext highlighter-rouge">~/.zshrc</code> with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>vim ~/.zshrc
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and remove everything between the lines <code class="language-plaintext highlighter-rouge"># &gt;&gt;&gt; conda initialize &gt;&gt;&gt;</code> and <code class="language-plaintext highlighter-rouge"># &lt;&lt;&lt; conda initialize &lt;&lt;&lt;</code>. Now refresh your terminal</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">source</span> ~/.zshrc
</pre></td></tr></tbody></table></code></pre></div></div>

<p>then type <code class="language-plaintext highlighter-rouge">conda</code> to confirm that the command is no longer found.</p>
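
<p>If you prefer not to edit the file by hand, the same block can be removed with <code class="language-plaintext highlighter-rouge">sed</code>. This is a minimal sketch, not something conda provides: <code class="language-plaintext highlighter-rouge">strip_conda_init</code> is a hypothetical helper name, and it assumes the two marker lines appear exactly as written above, at the start of a line. It prints the cleaned file to standard output instead of modifying it in place, so you can review the result first.</p>

```bash
# Hypothetical helper: print a shell rc file without the block managed by `conda init`.
# Assumes the marker lines start at column 0, exactly as conda writes them.
strip_conda_init() {
  # Delete every line from the opening marker through the closing marker.
  sed '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</d' "$1"
}
```

<p>Redirect its output to a new file (for example <code class="language-plaintext highlighter-rouge">~/.zshrc.clean</code>, a name chosen here only for illustration), compare it with the original, and replace the original once you are happy with it.</p>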

<h2 id="virtual-enviroments">Virtual environments</h2>

<p>In miniconda and anaconda the <code class="language-plaintext highlighter-rouge">conda</code> command is responsible for managing virtual environments. We will investigate that further in a future <a href="../python-virtual-environments-with-venv">post</a>, but basically an environment is an isolated python installation with a specific python version and its own set of packages. Virtual environments are perfect if you work on different projects in your system: for instance, if you need <code class="language-plaintext highlighter-rouge">numpy</code> and python <code class="language-plaintext highlighter-rouge">3.9</code> in one project and <code class="language-plaintext highlighter-rouge">matplotlib</code> and python <code class="language-plaintext highlighter-rouge">3.12</code> in another, you need two different virtual environments, one for each project.</p>

<p>It is good practice to update conda from time to time, for example before creating a new virtual environment. You can do it with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>conda update <span class="nt">-n</span> base <span class="nt">-c</span> defaults conda
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This updates conda in the “base” environment, the environment that the <code class="language-plaintext highlighter-rouge">conda</code> command line tool itself runs from. To create a virtual environment called <code class="language-plaintext highlighter-rouge">myenv</code> using python 3.12 just run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>conda create <span class="nt">-n</span> myenv <span class="nv">python</span><span class="o">=</span>3.12 <span class="nt">-y</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>now you can activate it with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>conda activate myenv
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Run <code class="language-plaintext highlighter-rouge">which python</code> and see that it is using a python binary from <code class="language-plaintext highlighter-rouge">~/miniconda3/envs/myenv/bin/python</code>. The path <code class="language-plaintext highlighter-rouge">~/miniconda3/envs/myenv</code> is where your new environment has been installed. Now you can install some packages like:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>conda <span class="nb">install </span>numpy <span class="nt">-y</span>
conda <span class="nb">install </span>pandas <span class="nt">-y</span>
conda <span class="nb">install </span>scikit-learn <span class="nt">-y</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Check the packages installed in the currently activated virtual environment with the command</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>conda list
</pre></td></tr></tbody></table></code></pre></div></div>

<p>and you will find your <code class="language-plaintext highlighter-rouge">numpy</code>, <code class="language-plaintext highlighter-rouge">pandas</code> and <code class="language-plaintext highlighter-rouge">scikit-learn</code> packages that you just installed listed there.</p>

<p>To deactivate the environment run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>conda deactivate
</pre></td></tr></tbody></table></code></pre></div></div>

<p>And to list all your virtual environments managed by conda run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>conda <span class="nb">env </span>list
</pre></td></tr></tbody></table></code></pre></div></div>
<p>Finally, to remove the environment make sure it’s deactivated and run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>conda remove <span class="nt">--name</span> myenv <span class="nt">--all</span> <span class="nt">-y</span>
</pre></td></tr></tbody></table></code></pre></div></div>
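
<p>In a script it can be handy to know whether an environment already exists before creating or removing it. A minimal sketch, assuming that each named environment appears in the output of <code class="language-plaintext highlighter-rouge">conda env list</code> as a line whose first column is the environment name; <code class="language-plaintext highlighter-rouge">env_listed</code> is a hypothetical helper that reads that listing from standard input.</p>

```bash
# Hypothetical helper: succeed if an environment named $1 appears in
# `conda env list`-style text read from standard input.
env_listed() {
  # Compare the first whitespace-separated column of each line with the name.
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}
```

<p>You would call it as <code class="language-plaintext highlighter-rouge">conda env list | env_listed myenv</code> and branch on the exit status.</p>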

<h2 id="virtual-environments-from-file">Virtual environments from file</h2>

<p>In some projects you will find a file with a name similar to <code class="language-plaintext highlighter-rouge">environment.yml</code> and contents like:</p>

<div class="language-yml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
</pre></td><td class="rouge-code"><pre><span class="na">name</span><span class="pi">:</span> <span class="s">myenv</span>
<span class="na">channels</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="s">defaults</span>
<span class="na">dependencies</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="s">python=3.12</span>
  <span class="pi">-</span> <span class="s">numpy</span>
  <span class="pi">-</span> <span class="s">pandas</span>
  <span class="pi">-</span> <span class="s">scikit-learn</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This specifies a conda environment (the file was generated by running <code class="language-plaintext highlighter-rouge">conda env export --from-history &gt; environment.yml</code> with the environment we created in the last section activated). Then, to create the environment from the file, run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>conda <span class="nb">env </span>create <span class="nt">-f</span> environment.yml
</pre></td></tr></tbody></table></code></pre></div></div>
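
<p>The environment created this way takes its name from the <code class="language-plaintext highlighter-rouge">name:</code> field of the file. As a small sketch, the hypothetical helper <code class="language-plaintext highlighter-rouge">env_name_from_file</code> below reads that field so a script can activate the environment without hard-coding the name. It assumes a top-level <code class="language-plaintext highlighter-rouge">name:</code> key on its own line, as in the example above.</p>

```bash
# Hypothetical helper: print the value of the top-level `name:` key
# of a conda environment file such as environment.yml.
env_name_from_file() {
  # Strip the key and any following whitespace, keep only the first match.
  sed -n 's/^name:[[:space:]]*//p' "$1" | head -n 1
}
```

<p>For example, <code class="language-plaintext highlighter-rouge">conda activate "$(env_name_from_file environment.yml)"</code> activates whatever environment the file describes.</p>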

<h2 id="much-more-than-virtual-environment-manager">Much more than virtual environment manager</h2>

<p>Conda is very convenient to manage dependencies, that is, to resolve conflicts between different package versions. <a href="https://conda-forge.org/">conda-forge</a> describes itself as a place to find community-led recipes, infrastructure and distributions for conda; in my opinion this is the secret sauce behind conda’s success. The community has been able to maintain and distribute compiled packages for different operating systems and architectures and as of today they have over 25.7K packages, check them in the <a href="https://conda-forge.org/packages/">package browser</a>. Because the packages are precompiled for multiple platforms, you will usually be able to install them on the most common operating systems and architectures.</p>

<p>Another advantage of conda (related to precompiled software) is that you can install libraries in an isolated environment. It often happens that the machine you are developing on does not have <a href="https://www.openblas.net/">openblas</a> or <a href="https://opencv.org/">opencv</a>. You could install them with your operating system package manager, but if you are not root you don’t have the permissions to do so. In that case you can either build them in your home directory or simply use conda to download the compiled libraries for your system, which saves you the time of compiling them yourself.</p>

<p>We are going to install a compiled library using conda, the chosen library is <a href="https://opencv.org/">opencv</a>, a library for computer vision in C++.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>conda <span class="nb">install </span>conda-forge::libopencv <span class="nt">-y</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>where <code class="language-plaintext highlighter-rouge">conda-forge</code> is the conda channel (the URL where the package lives). Go check that the library has been installed by finding the directory inside your environment</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>find <span class="nv">$CONDA_PREFIX</span> <span class="nt">-name</span> <span class="s2">"opencv2"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>here <code class="language-plaintext highlighter-rouge">$CONDA_PREFIX</code> should be the prefix of the activated environment. This should output <code class="language-plaintext highlighter-rouge">~/miniconda3/envs/myenv/include/opencv4/opencv2</code>.</p>

<p>The header files are in this directory, just look for the main ones</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="nb">ls</span> <span class="nt">-lhat</span> <span class="nv">$CONDA_PREFIX</span>/include/opencv4/opencv2 | <span class="nb">grep </span>opencv.hpp
<span class="nb">ls</span> <span class="nt">-lhat</span> <span class="nv">$CONDA_PREFIX</span>/include/opencv4/opencv2 | <span class="nb">grep </span>core.hpp
<span class="nb">ls</span> <span class="nt">-lhat</span> <span class="nv">$CONDA_PREFIX</span>/include/opencv4/opencv2 | <span class="nb">grep </span>imgproc.hpp
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now we can check where the compiled libraries are. They should be in the <code class="language-plaintext highlighter-rouge">lib</code> directory, let’s check for the basic ones</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">ls</span> <span class="nt">-lhat</span> <span class="nv">$CONDA_PREFIX</span>/lib/ | <span class="nb">grep </span>libopencv_core
<span class="nb">ls</span> <span class="nt">-lhat</span> <span class="nv">$CONDA_PREFIX</span>/lib/ | <span class="nb">grep </span>libopencv_imgproc
</pre></td></tr></tbody></table></code></pre></div></div>

<p>You should be all set: if you want to use <code class="language-plaintext highlighter-rouge">opencv</code> in a C++ program you can include the headers and dynamically link against the library.</p>
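
<p>As a rough sketch of what that compile step could look like, the hypothetical helper <code class="language-plaintext highlighter-rouge">opencv_flags</code> below prints the include, library and runtime search path flags for the directories we just inspected. It assumes a GCC or Clang style compiler and only links the two libraries checked above; a real program may need more <code class="language-plaintext highlighter-rouge">-lopencv_*</code> entries.</p>

```bash
# Hypothetical helper: print compiler and linker flags for the OpenCV copy
# installed under a conda prefix (defaults to the active environment).
opencv_flags() {
  local prefix="${1:-$CONDA_PREFIX}"
  # -I: header search path, -L: library search path,
  # -Wl,-rpath: lets the binary locate the shared libraries at run time.
  printf '%s\n' "-I$prefix/include/opencv4 -L$prefix/lib -Wl,-rpath,$prefix/lib -lopencv_core -lopencv_imgproc"
}
```

<p>With the environment activated you could then run something like <code class="language-plaintext highlighter-rouge">g++ main.cpp $(opencv_flags) -o main</code>, where <code class="language-plaintext highlighter-rouge">main.cpp</code> stands for your own source file.</p>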

<h2 id="conda-in-bash-scritps">Conda in bash scripts</h2>

<p>I like using bash scripts to automate builds, run scripts, etc. In this section I’ll show you how to use conda in bash scripts to create environments, run scripts, etc. The key is to locate where your conda is installed. For me, since I mostly use <code class="language-plaintext highlighter-rouge">pyenv</code>, it’s in</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nv">CONDA_ROOT</span><span class="o">=</span><span class="s2">"</span><span class="nv">$HOME</span><span class="s2">/.pyenv/versions/miniforge3-latest"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This is where your conda is installed. You should see directories like <code class="language-plaintext highlighter-rouge">bin</code>, <code class="language-plaintext highlighter-rouge">libexec</code>, <code class="language-plaintext highlighter-rouge">envs</code>, etc. We can define other variables for the binaries that we are interested in:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="nv">CONDA_PY</span><span class="o">=</span><span class="s2">"</span><span class="nv">$CONDA_ROOT</span><span class="s2">/bin/python"</span>
<span class="nv">CONDA_BIN</span><span class="o">=</span><span class="s2">"</span><span class="nv">$CONDA_ROOT</span><span class="s2">/bin/conda"</span>
<span class="nv">MAMBA_BIN</span><span class="o">=</span><span class="s2">"</span><span class="nv">$CONDA_ROOT</span><span class="s2">/bin/mamba"</span>
</pre></td></tr></tbody></table></code></pre></div></div>
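
<p>If the script has to run on machines where conda may live in different places, one option is to probe a few candidate directories instead of hard-coding <code class="language-plaintext highlighter-rouge">CONDA_ROOT</code>. A minimal sketch; <code class="language-plaintext highlighter-rouge">first_conda_root</code> is a hypothetical helper and the candidate paths in the usage note are only examples.</p>

```bash
# Hypothetical helper: print the first candidate directory that contains an
# executable bin/conda, or return a non-zero status if none does.
first_conda_root() {
  local dir
  for dir in "$@"; do
    if [ -x "$dir/bin/conda" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}
```

<p>For instance, <code class="language-plaintext highlighter-rouge">CONDA_ROOT="$(first_conda_root "$HOME/.pyenv/versions/miniforge3-latest" "$HOME/miniconda3")"</code> picks whichever of the two installations is present.</p>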

<p>Define these at the beginning of your script. Now let’s create a function to create a virtual environment</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
</pre></td><td class="rouge-code"><pre>install-env-conda<span class="o">()</span> <span class="o">{</span>
    <span class="c"># Ensure mamba is installed or install it</span>
    <span class="k">if</span> <span class="o">[</span> <span class="o">!</span> <span class="nt">-x</span> <span class="s2">"</span><span class="nv">$MAMBA_BIN</span><span class="s2">"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then</span>
      <span class="s2">"</span><span class="nv">$CONDA_PY</span><span class="s2">"</span> <span class="s2">"</span><span class="nv">$CONDA_BIN</span><span class="s2">"</span> <span class="nb">install</span> <span class="nt">-n</span> base <span class="nt">-y</span> <span class="nt">-c</span> conda-forge mamba
    <span class="k">fi</span>

    <span class="c"># create the environment, first removing it if it already exists</span>
    <span class="nv">ENV_NAME</span><span class="o">=</span>my_env
    <span class="s2">"</span><span class="nv">$CONDA_PY</span><span class="s2">"</span> <span class="s2">"</span><span class="nv">$CONDA_BIN</span><span class="s2">"</span> remove <span class="nt">--name</span> <span class="s2">"</span><span class="nv">$ENV_NAME</span><span class="s2">"</span> <span class="nt">--all</span> <span class="nt">-y</span> <span class="o">||</span> <span class="nb">true</span>
    <span class="s2">"</span><span class="nv">$CONDA_PY</span><span class="s2">"</span> <span class="s2">"</span><span class="nv">$CONDA_BIN</span><span class="s2">"</span> create <span class="nt">-n</span> <span class="s2">"</span><span class="nv">$ENV_NAME</span><span class="s2">"</span> <span class="nt">-y</span> <span class="nv">python</span><span class="o">=</span>3.12

    <span class="c"># install some packages with mamba (bit faster)</span>
    <span class="s2">"</span><span class="nv">$MAMBA_BIN</span><span class="s2">"</span> <span class="nb">install</span> <span class="nt">-n</span> <span class="s2">"</span><span class="nv">$ENV_NAME</span><span class="s2">"</span> <span class="nt">-y</span> <span class="nt">-c</span> conda-forge <span class="se">\</span>
        pandas <span class="se">\</span>
        numpy

    <span class="c"># also install with pip if desired (first upgrade pip)</span>
    <span class="s2">"</span><span class="nv">$CONDA_PY</span><span class="s2">"</span> <span class="s2">"</span><span class="nv">$CONDA_BIN</span><span class="s2">"</span> run <span class="nt">-n</span> <span class="s2">"</span><span class="nv">$ENV_NAME</span><span class="s2">"</span> python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">--upgrade</span> pip
    <span class="s2">"</span><span class="nv">$CONDA_PY</span><span class="s2">"</span> <span class="s2">"</span><span class="nv">$CONDA_BIN</span><span class="s2">"</span> run <span class="nt">-n</span> <span class="s2">"</span><span class="nv">$ENV_NAME</span><span class="s2">"</span> python <span class="nt">-m</span> pip <span class="nb">install </span>tqdm click matplotlib

    <span class="c"># or install specific wheels from file</span>
    <span class="nv">WHEEL_PATH</span><span class="o">=</span><span class="s2">"path/to/wheel.whl"</span>
    <span class="s2">"</span><span class="nv">$CONDA_PY</span><span class="s2">"</span> <span class="s2">"</span><span class="nv">$CONDA_BIN</span><span class="s2">"</span> run <span class="nt">-n</span> <span class="s2">"</span><span class="nv">$ENV_NAME</span><span class="s2">"</span> python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="k">${</span><span class="nv">WHEEL_PATH</span><span class="k">}</span>

    <span class="c"># or install a custom package from an extra package index</span>
    <span class="nv">EXTRA_INDEX_URL</span><span class="o">=</span><span class="s2">"url/to/private/pypi/repository"</span>
    <span class="s2">"</span><span class="nv">$CONDA_PY</span><span class="s2">"</span> <span class="s2">"</span><span class="nv">$CONDA_BIN</span><span class="s2">"</span> run <span class="nt">-n</span> <span class="s2">"</span><span class="nv">$ENV_NAME</span><span class="s2">"</span> python <span class="nt">-m</span> pip <span class="nb">install</span> <span class="nt">--extra-index-url</span> <span class="s2">"</span><span class="k">${</span><span class="nv">EXTRA_INDEX_URL</span><span class="k">}</span><span class="s2">"</span> custom_package
<span class="o">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Executing this bash function will create your new conda environment, which conda places in <code class="language-plaintext highlighter-rouge">${CONDA_ROOT}/envs/my_env</code> (depending on the name you give it). Now we can execute a python script using this environment.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
</pre></td><td class="rouge-code"><pre>
execute-python-script<span class="o">()</span> <span class="o">{</span>
  <span class="nb">local </span><span class="nv">SCRIPT</span><span class="o">=</span><span class="s2">"path/to/script.py"</span>

  <span class="c"># check if script exists just in case</span>
  <span class="k">if</span> <span class="o">[</span> <span class="o">!</span> <span class="nt">-f</span> <span class="s2">"</span><span class="nv">$SCRIPT</span><span class="s2">"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
    </span><span class="nb">echo</span> <span class="s2">"ERROR: script not found: </span><span class="nv">$SCRIPT</span><span class="s2">"</span> <span class="o">&gt;</span>&amp;2
    <span class="k">return </span>1
  <span class="k">fi</span>

  <span class="c"># run the script</span>
  <span class="s2">"</span><span class="nv">$CONDA_PY</span><span class="s2">"</span> <span class="s2">"</span><span class="nv">$CONDA_BIN</span><span class="s2">"</span> run <span class="nt">-n</span> <span class="s2">"</span><span class="nv">$ENV_NAME</span><span class="s2">"</span> python <span class="s2">"</span><span class="nv">$SCRIPT</span><span class="s2">"</span>
<span class="o">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>As easy as that. Now you can work comfortably with conda in your bash scripts.</p>

<h2 id="my-final-takeaway">My final takeaway</h2>

<p>I’ve mentioned from the beginning that I am not a particular fan of <code class="language-plaintext highlighter-rouge">conda</code> or <code class="language-plaintext highlighter-rouge">anaconda</code>. I just find it convenient for someone who is starting their journey with python and wants to just code and have things working. I would not recommend it for tools in production, but it is a great tool for data science and machine learning practitioners. As I mentioned, there is no free lunch in convenience. For instance, conda-forge doesn’t have all the packages, and the ones it has are not normally up to date, so you may end up using <code class="language-plaintext highlighter-rouge">pip</code> sometimes (more on that in another post). Having two dependency managers is in general not a great idea. Another thing I find inconvenient is that conda modifies your shell PATH: recall that you always have this “base” in your terminal, meaning that the “base” environment is active. That IMHO is annoying and also a reminder that conda controls all the python in your system. Of course there are hacks to modify <code class="language-plaintext highlighter-rouge">~/.bashrc</code> to make that disappear and get more control to use other python managers, but it’s just that, a hack. I would rather use other tools like <a href="https://github.com/pyenv/pyenv">pyenv</a>, which I’ll present in the next post of this series. 
The great use of <code class="language-plaintext highlighter-rouge">conda</code> is as a library manager: it makes it very easy to install libraries in your system that you can later use to compile your programs. Even though you can do it manually, <code class="language-plaintext highlighter-rouge">conda install</code> is so easy to use!</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Python" /><category term="computer science" /><summary type="html"><![CDATA[Anaconda is a distribution of the Python and R programming languages for scientific computing, data science, machine learning, and large-scale data processing. It simplifies package management and deployment, and it provides many useful tools and libraries out of the box. Having said that I have to confess that I’m not really a fan of anaconda, it basically contains many packages and software you may not want to use like Spyder IDE (I use vscode instead), Anaconda navigator, a UI that helps you manage your virtual environments, RStudio for R and lots of python packages that you may not want to use. I downloaded to install it today and the installation helper advises me that total installation is 4.82 GB. As you may guess, I’m not installing Anaconda today not even for a try (I did it in the past).]]></summary></entry><entry><title type="html">Install Python from source</title><link href="https://agramunt.me/posts/install-python-source/" rel="alternate" type="text/html" title="Install Python from source" /><published>2024-05-09T19:10:00-07:00</published><updated>2025-03-03T21:48:42-08:00</updated><id>https://agramunt.me/posts/install-python-source</id><content type="html" xml:base="https://agramunt.me/posts/install-python-source/"><![CDATA[<p>In the previous posts of this series we have used installers that are platform specific so we don’t need to go through the process of compiling Python. 
In this post we will explore how to build python from source and make it available in your system. It certainly is more cumbersome but worth learning.</p>

<h2 id="tldr">TLDR</h2>

<p>Install a specific python version in the directory <code class="language-plaintext highlighter-rouge">~/.python/</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
</pre></td><td class="rouge-code"><pre><span class="c"># select version and create directory structure</span>
<span class="nb">export </span><span class="nv">PYTHON_VERSION</span><span class="o">=</span>3.12.3
<span class="nb">export </span><span class="nv">TEMPDIR</span><span class="o">=</span>tmp_install

<span class="nb">mkdir</span> <span class="nt">-p</span> ~/.python/<span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> ~/.python/<span class="k">${</span><span class="nv">TEMPDIR</span><span class="k">}</span>

<span class="c"># download and uncompress on tempdir</span>
curl <span class="nt">-O</span> <span class="s2">"https://www.python.org/ftp/python/</span><span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span><span class="s2">/Python-</span><span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span><span class="s2">.tar.xz"</span> <span class="nt">--output-dir</span> ~/.python/<span class="k">${</span><span class="nv">TEMPDIR</span><span class="k">}</span>
<span class="nb">tar</span> <span class="nt">-xvJf</span> ~/.python/<span class="k">${</span><span class="nv">TEMPDIR</span><span class="k">}</span>/Python-<span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>.tar.xz <span class="nt">-C</span> ~/.python/<span class="k">${</span><span class="nv">TEMPDIR</span><span class="k">}</span> <span class="nt">--strip-components</span><span class="o">=</span>1

<span class="c"># configure &amp; compile (be patient, this takes a while...)</span>
<span class="nb">cd</span> ~/.python/<span class="k">${</span><span class="nv">TEMPDIR</span><span class="k">}</span>
./configure <span class="nt">--enable-optimizations</span> <span class="nt">--prefix</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span><span class="s2">/.python/</span><span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span><span class="s2">"</span>
make
make <span class="nb">install</span>

<span class="c"># Then remove the temporary directory</span>
<span class="nb">rm</span> <span class="nt">-rf</span> ~/.python/<span class="k">${</span><span class="nv">TEMPDIR</span><span class="k">}</span>

<span class="c"># now you can run python</span>
~/.python/<span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>/bin/python3

<span class="c"># make the binary available by exporting to PATH</span>
<span class="nb">echo</span> <span class="s2">"export PATH=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span><span class="s2">/.python/</span><span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span><span class="s2">/bin:</span><span class="se">\$</span><span class="s2">PATH"</span> <span class="o">&gt;&gt;</span> ~/.zshrc
<span class="nb">source</span> ~/.zshrc
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now run</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>python3
</pre></td></tr></tbody></table></code></pre></div></div>

<p>to open the python prompt</p>

<h2 id="prerequisites">Prerequisites</h2>

<p>We need several libraries in our OS before we are able to compile python. In the following sections we will see how to install the needed libraries in different operating systems</p>

<h3 id="linux-debian-eg-ubuntu">Linux Debian (e.g. Ubuntu)</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
</pre></td><td class="rouge-code"><pre><span class="nb">sudo </span>apt update
<span class="nb">sudo </span>apt <span class="nb">install </span>build-essential <span class="se">\</span>
                 zlib1g-dev <span class="se">\</span>
                 libncurses5-dev <span class="se">\</span>
                 libgdbm-dev <span class="se">\</span>
                 libnss3-dev <span class="se">\</span>
                 libssl-dev <span class="se">\</span>
                 libsqlite3-dev <span class="se">\</span>
                 libreadline-dev <span class="se">\</span>
                 libffi-dev curl <span class="se">\</span>
                 libbz2-dev <span class="se">\</span>
                 liblzma-dev
</pre></td></tr></tbody></table></code></pre></div></div>

<h3 id="linux-redhad-eg-fedora">Linux Red Hat (e.g. Fedora)</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
</pre></td><td class="rouge-code"><pre><span class="nb">sudo </span>dnf groupinstall <span class="s2">"Development Tools"</span>
<span class="nb">sudo </span>dnf <span class="nb">install </span>zlib-devel <span class="se">\</span>
                 bzip2 <span class="se">\</span>
                 bzip2-devel <span class="se">\</span>
                 readline-devel <span class="se">\</span>
                 sqlite <span class="se">\</span>
                 sqlite-devel <span class="se">\</span>
                 openssl-devel <span class="se">\</span>
                 xz <span class="se">\</span>
                 xz-devel <span class="se">\</span>
                 libffi-devel
</pre></td></tr></tbody></table></code></pre></div></div>

<h3 id="macos">MacOS</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>xcode-select <span class="nt">--install</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="compile-and-install">Compile and install</h2>

<p>We will install python in a custom directory in our home, <code class="language-plaintext highlighter-rouge">~/.python</code>, and download the source files to <code class="language-plaintext highlighter-rouge">~/.python/tmp_install</code>. Select a version from the <a href="https://www.python.org/downloads/">Python download page</a>, go to that specific release and get the link of the “XZ compressed source tarball”. In this case we choose Python version <code class="language-plaintext highlighter-rouge">3.12.3</code></p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
</pre></td><td class="rouge-code"><pre><span class="c"># select version and create directory structure</span>
<span class="nb">export </span><span class="nv">PYTHON_VERSION</span><span class="o">=</span>3.12.3
<span class="nb">export </span><span class="nv">TEMPDIR</span><span class="o">=</span>tmp_install

<span class="c"># creates the directories</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> ~/.python/<span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> ~/.python/<span class="k">${</span><span class="nv">TEMPDIR</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

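<p>Now download the source tarball to the temporary directory (the <code class="language-plaintext highlighter-rouge">--output-dir</code> option needs curl 7.73.0 or newer):</p>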

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>curl <span class="nt">-O</span> <span class="s2">"https://www.python.org/ftp/python/</span><span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span><span class="s2">/Python-</span><span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span><span class="s2">.tar.xz"</span> <span class="nt">--output-dir</span> ~/.python/<span class="k">${</span><span class="nv">TEMPDIR</span><span class="k">}</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then we need to uncompress the file into the temporary directory</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">tar</span> <span class="nt">-xvJf</span> ~/.python/<span class="k">${</span><span class="nv">TEMPDIR</span><span class="k">}</span>/Python-<span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>.tar.xz <span class="nt">-C</span> ~/.python/<span class="k">${</span><span class="nv">TEMPDIR</span><span class="k">}</span> <span class="nt">--strip-components</span><span class="o">=</span>1
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Change to the temporary directory where we have unpacked the sources and run <code class="language-plaintext highlighter-rouge">configure</code>, passing as prefix the place where you want to install the software, in our case <code class="language-plaintext highlighter-rouge">~/.python/${PYTHON_VERSION}</code>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre><span class="c"># configure &amp; compile (be patient, this takes a while...)</span>
<span class="nb">cd</span> ~/.python/<span class="k">${</span><span class="nv">TEMPDIR</span><span class="k">}</span>
./configure <span class="nt">--enable-optimizations</span> <span class="nt">--prefix</span><span class="o">=</span><span class="s2">"</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span><span class="s2">/.python/</span><span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span><span class="s2">"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Finally compile and install:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>make
make <span class="nb">install</span>
</pre></td></tr></tbody></table></code></pre></div></div>
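
<p>Compiling with <code class="language-plaintext highlighter-rouge">--enable-optimizations</code> can take a long time on a single core. As a minimal sketch, assuming <code class="language-plaintext highlighter-rouge">getconf _NPROCESSORS_ONLN</code> reports the number of online CPU cores (the case on Linux and MacOS), the two commands above could be wrapped so that <code class="language-plaintext highlighter-rouge">make</code> runs one job per core. The function names <code class="language-plaintext highlighter-rouge">cpu_count</code> and <code class="language-plaintext highlighter-rouge">build_python</code> are made up for this example:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical helper: number of online CPU cores, falling back to 1
cpu_count() {
  getconf _NPROCESSORS_ONLN || echo 1
}

# hypothetical helper: parallel build, install only if the build succeeded
build_python() {
  if make -j"$(cpu_count)"; then
    make install
  fi
}
</code></pre></div></div>

<p>Call <code class="language-plaintext highlighter-rouge">build_python</code> from <code class="language-plaintext highlighter-rouge">~/.python/${TEMPDIR}</code> once <code class="language-plaintext highlighter-rouge">configure</code> has finished.</p>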

<p>The Python executable is now in</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>~/.python/<span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>/bin/python3
</pre></td></tr></tbody></table></code></pre></div></div>

<p>To make it available in your shell, export its directory to <code class="language-plaintext highlighter-rouge">PATH</code> in your <code class="language-plaintext highlighter-rouge">~/.bashrc</code> or <code class="language-plaintext highlighter-rouge">~/.zshrc</code> and source the file. The following appends the export line to <code class="language-plaintext highlighter-rouge">~/.zshrc</code>, expanding <code class="language-plaintext highlighter-rouge">HOME</code> and <code class="language-plaintext highlighter-rouge">PYTHON_VERSION</code> but not <code class="language-plaintext highlighter-rouge">PATH</code> (which is escaped), since we don’t want to hard-code the whole current path there.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">echo</span> <span class="s2">"export PATH=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span><span class="s2">/.python/</span><span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span><span class="s2">/bin:</span><span class="se">\$</span><span class="s2">PATH"</span> <span class="o">&gt;&gt;</span> ~/.zshrc
<span class="nb">source</span> ~/.zshrc
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Whilst I prefer the first option, if it is easier for you, you can instead add an alias to your bash/zsh profile:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">echo alias </span><span class="nv">python</span><span class="o">=</span><span class="k">${</span><span class="nv">HOME</span><span class="k">}</span>/.python/<span class="k">${</span><span class="nv">PYTHON_VERSION</span><span class="k">}</span>/bin/python3 <span class="o">&gt;&gt;</span> ~/.zshrc
<span class="nb">source</span> ~/.zshrc
</pre></td></tr></tbody></table></code></pre></div></div>

<p>You are all set, just type <code class="language-plaintext highlighter-rouge">python</code> (if you have chosen aliasing) or <code class="language-plaintext highlighter-rouge">python3</code> (if you have chosen exporting path) and you will see the python prompt.</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Python" /><category term="computer science" /><summary type="html"><![CDATA[In the previous posts of this series we have used installers that are platform specific so we don’t need to go through the process of compiling Python. In this post we will explore how to build python from source and make it available in your system. It certainly is more cumbersome but worth learning.]]></summary></entry><entry><title type="html">Install Python using HomeBrew</title><link href="https://agramunt.me/posts/install-python-brew/" rel="alternate" type="text/html" title="Install Python using HomeBrew" /><published>2024-04-25T19:10:00-07:00</published><updated>2025-07-30T21:55:16-07:00</updated><id>https://agramunt.me/posts/install-python-brew</id><content type="html" xml:base="https://agramunt.me/posts/install-python-brew/"><![CDATA[<p><a href="https://brew.sh/">HomeBrew</a> is a popular Linux/MacOS package installer. We will use it to install python.</p>

<h2 id="macos">MacOS</h2>

<h3 id="tldr">TLDR</h3>

<p>Download and install homebrew and install python</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>/bin/bash <span class="nt">-c</span> <span class="s2">"</span><span class="si">$(</span>curl <span class="nt">-fsSL</span> https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh<span class="si">)</span><span class="s2">"</span>
brew <span class="nb">install </span>python
python3
</pre></td></tr></tbody></table></code></pre></div></div>

<h3 id="prerequisites">Prerequisites</h3>

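<p>On Intel Macs Brew installs the software in <code class="language-plaintext highlighter-rouge">/usr/local/Cellar/</code> and creates symlinks to the binaries in <code class="language-plaintext highlighter-rouge">/usr/local/opt/</code> and <code class="language-plaintext highlighter-rouge">/usr/local/bin/</code>. On Apple Silicon Macs the prefix is <code class="language-plaintext highlighter-rouge">/opt/homebrew</code> instead of <code class="language-plaintext highlighter-rouge">/usr/local</code>.</p>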

<p>If you don’t have HomeBrew yet, install it with</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>/bin/bash <span class="nt">-c</span> <span class="s2">"</span><span class="si">$(</span>curl <span class="nt">-fsSL</span> https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh<span class="si">)</span><span class="s2">"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The brew command will be located at <code class="language-plaintext highlighter-rouge">/usr/local/bin/brew</code>, which should be part of the <code class="language-plaintext highlighter-rouge">PATH</code>. Install Python by running</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>brew <span class="nb">install </span>python
</pre></td></tr></tbody></table></code></pre></div></div>

<p>HomeBrew will install the latest Python version, in my case <code class="language-plaintext highlighter-rouge">3.12</code>. It also creates a link in <code class="language-plaintext highlighter-rouge">/usr/local/bin/</code>. To open a Python prompt just type <code class="language-plaintext highlighter-rouge">python3</code>.</p>

<p>If we take a close look and run <code class="language-plaintext highlighter-rouge">ls -lhat /usr/local/bin | grep python3</code> we see that <code class="language-plaintext highlighter-rouge">python3</code> is a link to <code class="language-plaintext highlighter-rouge">/usr/local/Cellar/python@3.12/3.12.3/bin/python3</code> which in turn is a link to <code class="language-plaintext highlighter-rouge">/usr/local/Frameworks/Python.framework/Versions/3.12/bin/python3</code>. The real executable (and the path to the original installation) is the latter.</p>
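
<p>The chain of links described above can also be followed hop by hop. The following is only a sketch: <code class="language-plaintext highlighter-rouge">resolve_link</code> is a made-up helper name, and it relies on plain <code class="language-plaintext highlighter-rouge">readlink</code> (without flags) printing the target of a symbolic link, which works on both MacOS and Linux.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical helper: print every hop of a chain of symbolic links
resolve_link() {
  target="$1"
  while [ -L "$target" ]; do
    link="$(readlink "$target")"
    case "$link" in
      # absolute target: use it as is
      /*) target="$link" ;;
      # relative target: it is relative to the directory of the link
      *) target="$(dirname "$target")/$link" ;;
    esac
    echo "$target"
  done
}

# follow python3 from the first match in PATH to the real executable
resolve_link "$(command -v python3)"
</code></pre></div></div>

<p>The last line printed is the real file. If <code class="language-plaintext highlighter-rouge">python3</code> is not a link at all, nothing is printed.</p>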

<p>I always like to have an alias to <code class="language-plaintext highlighter-rouge">python</code> rather than <code class="language-plaintext highlighter-rouge">python3</code>. These days <code class="language-plaintext highlighter-rouge">python2</code> has been deprecated and is not used in any modern project.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="c"># append the command to the file ~/.zshrc</span>
<span class="nb">echo alias </span><span class="nv">python</span><span class="o">=</span>/Library/Frameworks/Python.framework/Versions/3.12/bin/python3 <span class="o">&gt;&gt;</span> ~/.zshrc
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="linux-ubuntu-dist">Linux (Ubuntu Dist)</h2>

<p>As prerequisites you need to have <code class="language-plaintext highlighter-rouge">curl</code> and <code class="language-plaintext highlighter-rouge">git</code>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">sudo </span>apt-get update <span class="o">&amp;&amp;</span> <span class="nb">sudo </span>apt-get <span class="nb">install</span> <span class="nt">-y</span> curl git
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then install HomeBrew as before</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>/bin/bash <span class="nt">-c</span> <span class="s2">"</span><span class="si">$(</span>curl <span class="nt">-fsSL</span> https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh<span class="si">)</span><span class="s2">"</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>The binaries will be in <code class="language-plaintext highlighter-rouge">/home/linuxbrew/.linuxbrew/bin</code>. Make sure you add this directory to your <code class="language-plaintext highlighter-rouge">PATH</code> in <code class="language-plaintext highlighter-rouge">.zshrc</code> or <code class="language-plaintext highlighter-rouge">.bashrc</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">echo</span> <span class="s1">'export PATH="$PATH:/home/linuxbrew/.linuxbrew/bin"'</span> <span class="o">&gt;&gt;</span> ~/.zshrc
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This bin is actually a symlink to the directory <code class="language-plaintext highlighter-rouge">/home/linuxbrew/.linuxbrew/Homebrew/bin/</code> that contains the <code class="language-plaintext highlighter-rouge">brew</code> binary. Finally source the file so that the changes take effect</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="c"># if you use zsh</span>
<span class="nb">source</span> ~/.zshrc

<span class="c"># if you use bash</span>
<span class="nb">source</span> ~/.bashrc
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Now install python executing</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>brew <span class="nb">install </span>python
</pre></td></tr></tbody></table></code></pre></div></div>

<p>This automatically installs the latest Python version, which, in my case, is <code class="language-plaintext highlighter-rouge">3.12</code>. The binary can be found in <code class="language-plaintext highlighter-rouge">/home/linuxbrew/.linuxbrew/bin/python3</code>. This directory has previously been added to the <code class="language-plaintext highlighter-rouge">PATH</code> so we can go ahead and type in our terminal</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="rouge-code"><pre>
<span class="c"># open a python prompt in the terminal.</span>
python3

<span class="c"># same as before but specifying the version we just installed.</span>
python3.12
</pre></td></tr></tbody></table></code></pre></div></div>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Python" /><category term="computer science" /><summary type="html"><![CDATA[HomeBrew is a popular Linux/MacOS package installer. We will use it to install python.]]></summary></entry><entry><title type="html">Install Python</title><link href="https://agramunt.me/posts/install-python/" rel="alternate" type="text/html" title="Install Python" /><published>2024-04-20T19:10:00-07:00</published><updated>2025-07-30T21:55:16-07:00</updated><id>https://agramunt.me/posts/install-python</id><content type="html" xml:base="https://agramunt.me/posts/install-python/"><![CDATA[<p>Python is a very popular scripting language, according to <a href="https://survey.stackoverflow.co/2023/">StackOverflow 2023 survey</a> it is the third most commonly used language and the fourth in the ranking for professional developers. All in all, it is a very simple language to learn and thus is one of the first languages new developers incorporate in their projects. In this post we will see how to install Python in your system.</p>

<h2 id="tldr">TLDR</h2>

<p>Tutorial to install Python from the official installer on MacOS.</p>

<ul>
  <li>Download <a href="https://www.python.org/ftp/python/3.12.3/python-3.12.3-macos11.pkg">python-3.12.3-macos11.pkg</a> (or choose your version, as of today latest is <code class="language-plaintext highlighter-rouge">3.12</code>).</li>
  <li>Install using the installer helper that will place all the python files in <code class="language-plaintext highlighter-rouge">/Library/Frameworks/Python.framework/Versions/3.12/</code>.</li>
  <li>Create an alias, execute
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">echo alias </span><span class="nv">python</span><span class="o">=</span>/Library/Frameworks/Python.framework/Versions/3.12/bin/python3 <span class="o">&gt;&gt;</span> ~/.zshrc
</pre></td></tr></tbody></table></code></pre></div>    </div>
  </li>
  <li>Run Python by executing <code class="language-plaintext highlighter-rouge">python</code> in a new terminal.</li>
</ul>

<h2 id="install-python-with-installable">Install Python with installable</h2>

<p>There are many ways to install Python on your computer; the simplest one is from <a href="https://www.python.org/">python.org</a>. Go to downloads and choose your operating system / architecture and a version from the stable releases.</p>

<p>As an example we will install Python <code class="language-plaintext highlighter-rouge">3.12.3</code> which is the most recent stable one at the time of writing this post. These version numbers are called major, minor and revision, in the example major is 3, minor is 12 and revision is 3. It is important to choose the right version of python as some software projects may have old functionality and may only work in an older python version.</p>
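
<p>To make the three parts of the version number concrete, here is a small sketch that splits a version string in the shell using only parameter expansion. The variable names are chosen for this example and are not used elsewhere in the post.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>version="3.12.3"

# everything before the first dot
major="${version%%.*}"
# drop the major part, then take everything before the next dot
rest="${version#*.}"
minor="${rest%%.*}"
# what is left after the second dot
revision="${rest#*.}"

echo "major=${major} minor=${minor} revision=${revision}"
# prints: major=3 minor=12 revision=3
</code></pre></div></div>

<p>The major and minor parts are the ones that show up in install paths and in executable names such as <code class="language-plaintext highlighter-rouge">python3.12</code>.</p>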

<p>For MacOS, first download the installer <a href="https://www.python.org/ftp/python/3.12.3/python-3.12.3-macos11.pkg">python-3.12.3-macos11.pkg</a> (choose the appropriate installer for your OS) and proceed with the installation. Then, open a terminal and type <code class="language-plaintext highlighter-rouge">python3</code> or <code class="language-plaintext highlighter-rouge">python3.12</code> to start the Python command prompt:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
</pre></td><td class="rouge-code"><pre>Python 3.12.3 <span class="o">(</span>v3.12.3:f6650f9ad7, Apr  9 2024, 08:18:48<span class="o">)</span> <span class="o">[</span>Clang 13.0.0 <span class="o">(</span>clang-1300.0.29.30<span class="o">)]</span> on darwin
Type <span class="s2">"help"</span>, <span class="s2">"copyright"</span>, <span class="s2">"credits"</span> or <span class="s2">"license"</span> <span class="k">for </span>more information.
<span class="o">&gt;&gt;&gt;</span>
</pre></td></tr></tbody></table></code></pre></div></div>

<p>That’s it, you are in Python! The banner shows the version <code class="language-plaintext highlighter-rouge">3.12.3</code>, its build date and the compiler (Clang in my case). As a quick test, type</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
</pre></td><td class="rouge-code"><pre><span class="n">Python</span> <span class="mf">3.12</span><span class="p">.</span><span class="mi">3</span> <span class="p">(</span><span class="n">v3</span><span class="p">.</span><span class="mf">12.3</span><span class="p">:</span><span class="n">f6650f9ad7</span><span class="p">,</span> <span class="n">Apr</span>  <span class="mi">9</span> <span class="mi">2024</span><span class="p">,</span> <span class="mi">08</span><span class="p">:</span><span class="mi">18</span><span class="p">:</span><span class="mi">48</span><span class="p">)</span> <span class="p">[</span><span class="n">Clang</span> <span class="mf">13.0</span><span class="p">.</span><span class="mi">0</span> <span class="p">(</span><span class="n">clang</span><span class="o">-</span><span class="mf">1300.0</span><span class="p">.</span><span class="mf">29.30</span><span class="p">)]</span> <span class="n">on</span> <span class="n">darwin</span>
<span class="n">Type</span> <span class="sh">"</span><span class="s">help</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">copyright</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">credits</span><span class="sh">"</span> <span class="ow">or</span> <span class="sh">"</span><span class="s">license</span><span class="sh">"</span> <span class="k">for</span> <span class="n">more</span> <span class="n">information</span><span class="p">.</span>
<span class="o">&gt;&gt;&gt;</span> <span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="s">Hi there</span><span class="sh">"</span><span class="p">)</span>
<span class="n">Hi</span> <span class="n">there</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>That’s it, you have successfully installed Python!</p>

<h2 id="install-location">Install location</h2>

<p>On MacOS Python is installed in <code class="language-plaintext highlighter-rouge">/Library/Frameworks/Python.framework/Versions/3.12/</code> (for Windows it would be <code class="language-plaintext highlighter-rouge">C:\Users\&lt;Username&gt;\AppData\Local\Programs\Python\Python312</code>). If you installed any other Python version, say <code class="language-plaintext highlighter-rouge">3.11.2</code>, just change the path to the corresponding major and minor versions. Let’s explore what is in there; the main directories are:</p>

<table>
  <thead>
    <tr>
      <th>Directory</th>
      <th>Usage</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>bin</td>
      <td>Where the binaries (executables) are stored, this is python and pip among others.</td>
    </tr>
    <tr>
      <td>include/python3.12</td>
      <td>All header files for the compiled libraries. For instance, the file <code class="language-plaintext highlighter-rouge">floatobject.h</code> contains the C declaration of the float object in Python. The headers in this directory are needed if we want to compile a C or C++ extension to be used in Python (for example a Python wrapper for a C++ library).</td>
    </tr>
    <tr>
      <td>lib</td>
      <td>Contains compiled libraries. Concretely it contains <code class="language-plaintext highlighter-rouge">libpython3.12.a</code> or <code class="language-plaintext highlighter-rouge">libpython3.12.so</code>, which is the (static or dynamic) library holding the compiled Python interpreter. Its subdirectory <code class="language-plaintext highlighter-rouge">python3.12</code>, where the standard library lives, is added automatically to <code class="language-plaintext highlighter-rouge">sys.path</code>. Custom compiled packages are also copied there, under <code class="language-plaintext highlighter-rouge">site-packages</code>.</td>
    </tr>
    <tr>
      <td>share</td>
      <td>Includes miscellaneous files such as documentation, configuration files, and examples. It might also contain man pages, sample scripts, and other shared assets that don’t fit into the binary, header, or library categories.</td>
    </tr>
  </tbody>
</table>

<p>The python executable you run when typing <code class="language-plaintext highlighter-rouge">python3</code> in your terminal is <code class="language-plaintext highlighter-rouge">/Library/Frameworks/Python.framework/Versions/3.12/bin/python3</code> (just type <code class="language-plaintext highlighter-rouge">which python3.12</code> or <code class="language-plaintext highlighter-rouge">which python3</code>).</p>

<p>The commands <code class="language-plaintext highlighter-rouge">python3.12</code> and <code class="language-plaintext highlighter-rouge">python3</code> are recognized because they are executables found in a directory listed in your <code class="language-plaintext highlighter-rouge">PATH</code> variable. To list these directories just run <code class="language-plaintext highlighter-rouge">tr ':' '\n' &lt;&lt;&lt; "$PATH"</code>. On MacOS you will see the path <code class="language-plaintext highlighter-rouge">/Library/Frameworks/Python.framework/Versions/3.12/bin</code>; this directory is where the <code class="language-plaintext highlighter-rouge">python3.12</code> executable lives, together with <code class="language-plaintext highlighter-rouge">python3</code>, a symbolic link to it.</p>

<h2 id="install-another-python-version">Install another python version</h2>

<p>Let’s say you need to use Python <code class="language-plaintext highlighter-rouge">3.11.7</code> for another project but you already have <code class="language-plaintext highlighter-rouge">3.12.3</code>. What do you do? Uninstall and reinstall? Luckily you don’t need to do that. As we did previously, just download the installer <a href="https://www.python.org/ftp/python/3.11.7/python-3.11.7-macos11.pkg">python-3.11.7-macos11.pkg</a> and execute it. Open a new terminal and type <code class="language-plaintext highlighter-rouge">python3 --version</code>: instead of <code class="language-plaintext highlighter-rouge">3.12.3</code> (as installed previously), it will now show <code class="language-plaintext highlighter-rouge">3.11.7</code>. What just happened? Run <code class="language-plaintext highlighter-rouge">tr ':' '\n' &lt;&lt;&lt; "$PATH" | grep Python</code> and you will see two paths, first <code class="language-plaintext highlighter-rouge">/Library/Frameworks/Python.framework/Versions/3.11/bin</code> and second <code class="language-plaintext highlighter-rouge">/Library/Frameworks/Python.framework/Versions/3.12/bin</code>. So by installing a second version we get the new path first. When the shell looks for an executable it goes sequentially from the first path to the last, so if there are two <code class="language-plaintext highlighter-rouge">python3</code> executables in the <code class="language-plaintext highlighter-rouge">PATH</code>, it will just pick the first, in this case <code class="language-plaintext highlighter-rouge">3.11</code>. But don’t panic, you can still run any installed Python version. Check it by running</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre>python3.11 <span class="nt">--version</span>
python3.12 <span class="nt">--version</span>
</pre></td></tr></tbody></table></code></pre></div></div>
<p>So to run a Python script <code class="language-plaintext highlighter-rouge">script.py</code> with Python 3.12 just do <code class="language-plaintext highlighter-rouge">python3.12 script.py</code>.</p>
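
<p>The lookup order can be made visible with a few lines of shell. This is a rough sketch rather than how the shell is actually implemented: <code class="language-plaintext highlighter-rouge">find_on_path</code> is a hypothetical helper that walks the <code class="language-plaintext highlighter-rouge">PATH</code> entries in order (or a colon-separated list given as second argument) and prints every match for a command name.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical helper: list the matches for command $1 in lookup order;
# an optional second argument replaces PATH (handy for experiments)
find_on_path() {
  printf '%s\n' "${2:-$PATH}" | tr ':' '\n' | while IFS= read -r dir; do
    if [ -x "${dir}/$1" ]; then
      printf '%s\n' "${dir}/$1"
    fi
  done
}

# the first line printed is the python3 that runs when you type python3
find_on_path python3
</code></pre></div></div>

<p>With both installers applied as described, the <code class="language-plaintext highlighter-rouge">3.11</code> entry should come out before the <code class="language-plaintext highlighter-rouge">3.12</code> one.</p>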

<p>We can still do another “hack” if you want <code class="language-plaintext highlighter-rouge">python3</code> to be either of the two installed versions: just add the corresponding bin directory as the first entry of your <code class="language-plaintext highlighter-rouge">PATH</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
6
7
8
9
10
11
12
</pre></td><td class="rouge-code"><pre><span class="c"># to run python3 and execute python3.12</span>
<span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span><span class="s2">"/Library/Frameworks/Python.framework/Versions/3.12/bin:</span><span class="k">${</span><span class="nv">PATH</span><span class="k">}</span><span class="s2">"</span>

<span class="c"># the following will run script.py using python3.12</span>
python3 script.py 


<span class="c"># to run python3 and execute python3.11</span>
<span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span><span class="s2">"/Library/Frameworks/Python.framework/Versions/3.11/bin:</span><span class="k">${</span><span class="nv">PATH</span><span class="k">}</span><span class="s2">"</span>

<span class="c"># the following will run script.py using python3.11</span>
python3 script.py 
</pre></td></tr></tbody></table></code></pre></div></div>

<h2 id="make-an-alias-to-python-executable">Make an alias to python executable</h2>

<p>To pin a specific Python version and run it whenever you type <code class="language-plaintext highlighter-rouge">python</code>, just create an alias. Open <code class="language-plaintext highlighter-rouge">.bashrc</code> or <code class="language-plaintext highlighter-rouge">.zshrc</code>, depending on whether you use bash or zsh (look for the equivalent file if you use another shell), and append <code class="language-plaintext highlighter-rouge">alias python=/Library/Frameworks/Python.framework/Versions/3.12/bin/python3</code>. To append it from a terminal, do the following (assuming you use zsh):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
3
4
5
</pre></td><td class="rouge-code"><pre><span class="c"># append the command to the file ~/.zshrc</span>
<span class="nb">echo alias </span><span class="nv">python</span><span class="o">=</span>/Library/Frameworks/Python.framework/Versions/3.12/bin/python3 <span class="o">&gt;&gt;</span> ~/.zshrc

<span class="c"># source ~/.zshrc to make the changes effective</span>
<span class="nb">source</span> ~/.zshrc
</pre></td></tr></tbody></table></code></pre></div></div>
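
<p>Running the <code class="language-plaintext highlighter-rouge">echo</code> command twice appends the same line twice. If that bothers you, a small guard helps. The sketch below uses a hypothetical helper, <code class="language-plaintext highlighter-rouge">append_once</code>, that only appends a line when the file does not already contain exactly that line:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># hypothetical helper: append line $1 to file $2 unless it is already there
append_once() {
  # make sure the file exists so that grep does not complain
  touch "$2"
  # -x matches whole lines only, -F treats the text literally
  if ! grep -qxF -- "$1" "$2"; then
    # tee -a appends to the file and echoes the appended line
    printf '%s\n' "$1" | tee -a "$2"
  fi
}

# example call (commented out so that nothing is modified by accident):
# append_once "alias python=/Library/Frameworks/Python.framework/Versions/3.12/bin/python3" ~/.zshrc
</code></pre></div></div>

<p>A second call with the same arguments prints nothing and leaves the file untouched.</p>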

<h2 id="uninstall">Uninstall</h2>

<p>List the Python versions you have installed</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">ls</span> <span class="nt">-lhat</span> /Library/Frameworks/Python.framework/Versions/  
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Remove specific versions, in this case we will remove <code class="language-plaintext highlighter-rouge">3.11</code> and <code class="language-plaintext highlighter-rouge">3.12</code></p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
2
</pre></td><td class="rouge-code"><pre><span class="nb">sudo rm</span> <span class="nt">-rf</span> /Library/Frameworks/Python.framework/Versions/3.11
<span class="nb">sudo rm</span> <span class="nt">-rf</span> /Library/Frameworks/Python.framework/Versions/3.12
</pre></td></tr></tbody></table></code></pre></div></div>

<p>or remove the entire Python framework</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">sudo rm</span> <span class="nt">-rf</span> /Library/Frameworks/Python.framework/
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then remove the linked binaries</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre><span class="nb">sudo rm</span> /usr/local/bin/python3 /usr/local/bin/pip3
</pre></td></tr></tbody></table></code></pre></div></div>

<p>Then since you have probably modified the <code class="language-plaintext highlighter-rouge">$PATH</code>, just go and remove the specific exports and aliases in <code class="language-plaintext highlighter-rouge">~/.zshrc</code> or <code class="language-plaintext highlighter-rouge">~/.bashrc</code>. Also probably the installation has modified <code class="language-plaintext highlighter-rouge">~/.zprofile</code> or <code class="language-plaintext highlighter-rouge">~/.bash_profile</code> adding the paths <code class="language-plaintext highlighter-rouge">/Library/Frameworks/Python.framework...</code>. Comment or remove those lines.</p>

<p>Finally, remove the alias we created before in <code class="language-plaintext highlighter-rouge">~/.zshrc</code> (or <code class="language-plaintext highlighter-rouge">~/.bashrc</code>), that is, the line <code class="language-plaintext highlighter-rouge">alias python=/Library/Frameworks/Python.framework/Versions/3.12/bin/python3</code>.</p>

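<p>Then open a new terminal window so that the edited startup files are read again. If you only want to reinitialise the current terminal, type</p>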

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><table class="rouge-table"><tbody><tr><td class="rouge-gutter gl"><pre class="lineno">1
</pre></td><td class="rouge-code"><pre>reset
</pre></td></tr></tbody></table></code></pre></div></div>

<p>You are all set. Python is deleted from your system</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Python" /><category term="computer science" /><summary type="html"><![CDATA[Python is a very popular scripting language, according to StackOverflow 2023 survey it is the third most commonly used language and the fourth in the ranking for professional developers. All in all, it is a very simple language to learn and thus is one of the first languages new developers incorporate in their projects. In this post we will see how to install Python in your system.]]></summary></entry><entry><title type="html">Prime Numbers</title><link href="https://agramunt.me/posts/prime-numbers/" rel="alternate" type="text/html" title="Prime Numbers" /><published>2024-03-02T18:10:00-08:00</published><updated>2024-03-02T18:10:00-08:00</updated><id>https://agramunt.me/posts/prime-numbers</id><content type="html" xml:base="https://agramunt.me/posts/prime-numbers/"><![CDATA[<p>Prime numbers are the building blocks of arithmetics. In this short post we will investigate some attributes of prime numbers and how to work with them in a computer.</p>

<p>One of the main applications of prime numbers is in cryptographic algorithms (e.g. RSA), which in turn underpin techniques developed for privacy-preserving machine learning. In a previous post we defined groups and fields using prime numbers; our main aim in this post is to show how to calculate prime numbers and how to check whether a given number is prime. Given the mathematical importance of primes, I considered it worthwhile to write a full post about them.
A prime number is a natural number greater than 1 that is only divisible by itself and 1. For instance, 7 is a prime number because no natural number other than 1 and 7 divides it exactly. The first few prime numbers are \(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, \cdots\).</p>

<h1 id="the-building-blocks-of-arithmetics">The building blocks of arithmetics</h1>

<p>Prime numbers are considered the building blocks of arithmetics because no matter what natural number greater than 1 you think of, you can express it as a product of prime numbers. For instance, \(30\) can be decomposed as \(2·3·5\), and \(123456\) as \(2^6·3·643\). The process of decomposing a number into its prime factors is called factorisation.
The <a href="https://en.wikipedia.org/wiki/Fundamental_theorem_of_arithmetic">fundamental theorem of arithmetics</a> (a.k.a. unique factorisation theorem) states that every integer greater than 1 either is a prime number itself or can be represented as a product of prime numbers, and furthermore that this representation is unique up to the order of the factors. So given an arbitrary integer $a$ greater than 1 we can write it as</p>

\[\begin{equation}
    a=p_1^{e_1} \cdot p_2^{e_2} \cdots p_r^{e_r}
\end{equation}\]

<p>where the $p_i$ are distinct prime numbers and the $e_i$ are their positive integer exponents.</p>
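
<p>As a small illustration of the decomposition above, here is a minimal sketch of factorisation by trial division. The function name <code class="language-plaintext highlighter-rouge">factorise</code> is made up for this example, it assumes the input is an integer of at least 2, and it is only practical for small numbers.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from math import isqrt


def factorise(n):
    """Return {prime: exponent} for an integer n of at least 2 (trial division)."""
    factors = {}
    for p in range(2, isqrt(n) + 1):
        # A composite p never divides here: its prime factors were removed earlier.
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
    if n != 1:
        # What is left has no divisor up to the square root, so it is prime.
        factors[n] = 1
    return factors


print(factorise(30))      # {2: 1, 3: 1, 5: 1}
print(factorise(123456))  # {2: 6, 3: 1, 643: 1}
</code></pre></div></div>

<p>The loop runs up to the square root of the input, which is exactly why this approach is hopeless for numbers with hundreds of digits.</p>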

<p>But let’s get back to the topic: we are interested in cryptography, so what does all this have to do with it? Well, in modern cryptography one looks for mathematical problems that are hard to solve but whose solutions are easy to check. Factorising a large integer into its prime factors is such a problem. There are classical algorithms like <a href="https://en.wikipedia.org/wiki/Pollard%27s_p_%E2%88%92_1_algorithm">Pollard’s factorisation</a> or <a href="https://en.wikipedia.org/wiki/Lenstra_elliptic-curve_factorization">Lenstra elliptic-curve factorisation</a> that are much faster than trial division, but none of them is known to run in polynomial time. If you have a large enough quantum computer at hand, the best by far is the <a href="https://en.wikipedia.org/wiki/Shor%27s_algorithm">Shor algorithm</a> (it is polynomial in $\log(N)$). In the <a href="https://en.wikipedia.org/wiki/RSA_(cryptosystem)">RSA</a> algorithm one chooses two large prime numbers $p$ and $q$ and computes $N=p \cdot q$; the task of an adversary trying to break the code is to find the factors of $N$. For sufficiently large prime numbers the probability of guessing a factor by chance is negligible and the classical factorisation methods take too long, so in practice it is infeasible to crack with today’s classical computing power. Conversely, given $p$ and $N$ it is very easy to check whether $p$ is a factor of $N$ with a single division. This asymmetry is also a requirement of modern crypto protocols.</p>

<h1 id="how-many-prime-numbers-are-there-can-we-find-a-magic-formula-to-get-them-all">How many prime numbers are there? Can we find a “magic formula” to get them all?</h1>

<p>Euclid proved around 300 BCE that there are infinitely many prime numbers. The largest one known as of August 2020 is $2^{82589933}-1$, which has 24,862,048 digits when written in base 10.</p>

<p>It would be nice to have a formula that, when asked “give me the 532nd prime”, would output 3833 (the 532nd prime) in O(1) compute time. Unfortunately no efficient formula of this kind is known, and finding new prime numbers takes a lot of computational effort. In that sense primes are not predictable, and this is one of their mysteries… once you see a prime you don’t know when you’ll find the next one. Is there some kind of randomness or chaos in the structure of the primes? Nobody knows yet.</p>

<p>Even though we can’t predict primes with a formula, we can estimate how many prime numbers there are in a given range. We define $\pi (x)$ as the number of prime numbers smaller than or equal to $x$. For instance, $\pi (20)=8$, because the prime numbers up to $20$ are $2, 3, 5, 7, 11, 13, 17, 19$. Therefore, to count the primes between $x_1$ and $x_2$ ($x_2$&gt;$x_1$) we just need to subtract, $\pi (x_2)-\pi (x_1)$. Here $\pi (x)$ is an exact count that has to be computed numerically. The <a href="https://en.wikipedia.org/wiki/Prime_number_theorem">prime number theorem</a> establishes the asymptotic distribution of prime numbers by approximating this count by $x/\ln(x)$:</p>

\[\lim_{x \rightarrow \infty} \frac{\pi (x)}{x/\ln(x)}=1\]

<p>A graphical representation of this result is shown below</p>

<p><img src="/assets/img/posts//2024-03-02-prime-numbers/img1.png" alt="ratio of the exact prime count to its approximations" width="700" class="center" />
<em>Ratio of $\pi (x)$ to its approximations as $x\rightarrow \infty$.</em></p>

<p>In the graph you can see how the exact count of prime numbers divided by each of the two approximations, the quotient $x/\ln(x)$ and the logarithmic integral $\int_2^x dt/\ln(t)$, tends to 1 for large $x$. The integral is actually the better approximation of the two. This result is very interesting and took many years to find; it gives some structure to this madness of prime numbers. But still, we can’t predict them, and this is good for crypto!</p>
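
<p>To put some numbers on the theorem, here are three well-known values of $\pi (x)$ next to the approximation $x/\ln(x)$ (denominators rounded to one decimal). The ratio creeps towards $1$, but slowly:</p>

\[\frac{\pi (10^3)}{10^3/\ln(10^3)} = \frac{168}{144.8} \approx 1.16\]

\[\frac{\pi (10^6)}{10^6/\ln(10^6)} = \frac{78498}{72382.4} \approx 1.08\]

\[\frac{\pi (10^9)}{10^9/\ln(10^9)} = \frac{50847534}{48254942.4} \approx 1.05\]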

<h1 id="primality-testing-and-prime-generation">Primality testing and prime generation</h1>
<p>In this section we investigate how to find prime numbers and test if an arbitrary number is prime.</p>

<h2 id="a-naive-way-to-generate-primes">A naive way to generate primes</h2>
<p>The natural way to generate prime numbers is to start from the smallest one, $2$, and test whether the next number, $3$, is divisible by $2$. Since it is not, we add it to the list, which is now $2$, $3$. Next we test $4$: since it is divisible by one of the primes we already have ($2$), we don’t add it to the list and we try the next one, $5$. $5$ is divisible by neither $2$ nor $3$, so we add it to the list, $2$, $3$, $5$, and so on… This is a very computationally intensive algorithm, since every candidate has to be checked against the full list, which, by the way, grows every time you find a new prime.</p>
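
<p>The procedure just described can be sketched in a few lines of Python. This is only an illustration with a made-up function name, <code class="language-plaintext highlighter-rouge">naive_primes</code>, and it assumes a non-negative <code class="language-plaintext highlighter-rouge">count</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def naive_primes(count):
    """Return the first `count` primes, testing each candidate against the primes found so far."""
    primes = []
    candidate = 2
    while len(primes) != count:
        # A candidate is prime when no smaller prime divides it.
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


print(naive_primes(10))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
</code></pre></div></div>

<p>Every candidate is divided by the whole list, so the cost per candidate keeps growing as the list does.</p>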

<h2 id="the-sieve-of-eratosthenes">The sieve of Eratosthenes</h2>
<p>A faster way to generate prime numbers is the sieve of Eratosthenes: you build a list of all the natural numbers from $2$ to $n$ ($n$ being the number up to which you want all the primes) and remove all the multiples of each newly found prime. Say we want the prime numbers up to $n=120$. The algorithm starts with $2$, which is prime, and discards its multiples ($4$, $6$, $8$, $\cdots$, $120$). You then go to $3$ and you know it is prime because it hasn’t been discarded, so you eliminate the multiples of $3$ that haven’t been discarded yet ($9$, $15$, $21$, $\cdots$). The next number, $4$, has been discarded, so you go to $5$, add it as prime, then eliminate its multiples… You can find a nice implementation in Python here.</p>
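
<p>For reference, a compact sketch of the sieve could look like the following. It is an illustrative version, not necessarily the implementation mentioned above; the function name <code class="language-plaintext highlighter-rouge">sieve</code> is an assumption, and it expects $n$ to be at least $2$.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from math import isqrt


def sieve(n):
    """Return every prime up to and including n (assumes n is at least 2)."""
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, isqrt(n) + 1):
        if is_prime[p]:
            # Cross out p*p, p*p + p, ...; smaller multiples of p were
            # already crossed out while handling smaller primes.
            crossed = range(p * p, n + 1, p)
            is_prime[p * p :: p] = bytes(len(crossed))
    return [i for i in range(n + 1) if is_prime[i]]


primes = sieve(120)
print(len(primes), primes[:10])  # 30 [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
</code></pre></div></div>

<p>Starting each crossing-out at $p^2$ and stopping the outer loop at $\sqrt{n}$ is a common refinement of the description above; it only skips work that has already been done.</p>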

<h2 id="primality-testing-and-miller-rabin-algorithm">Primality testing and Miller-Rabin algorithm</h2>
<p>All the approaches so far are very lengthy… In cryptography you often work with $256$-bit prime numbers (between $2^{255}$ and $2^{256}$). You can’t generate all the prime numbers up to $2^{256}$ and then choose one at random; instead you need a primality test. Basically, you draw a random natural number and then check whether it is prime or not.</p>

<p>So given a natural number $n$, how can you tell if $n$ is prime or not? First recall Fermat’s little theorem: let $p$ be a prime number and let $a$ be any integer not divisible by $p$; then</p>

\[a^{p-1}=1 \mod{p}\]

<p>We may use it to check if $n$ is prime. If we plug $n$ instead of $p$ into the above equation (taking, for instance, $a=2$) and find that $a^{n-1} \bmod n$ is $1$, can we say that $n$ is prime? The answer is no, because Fermat’s little theorem only goes in one direction (we need to know for sure that $p$ is prime). It does, however, give a good indication that $n$ may be prime. So what if we test many different $a$? If we find that the power is $1$ for all of them, can we say that $n$ is prime? Unfortunately the answer is again no. There are in fact composite numbers, known as Carmichael numbers, for which $a^{n-1}$ is $1$ modulo $n$ for every $a$ that shares no factor with $n$. A well known Carmichael number is $561=3·11·17$, whose powers satisfy</p>

\[a^{560}=1 \mod{561}\]

<p>for all $a$ that share no factor with $561$.</p>
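
<p>This is easy to check numerically with Python’s built-in three-argument <code class="language-plaintext highlighter-rouge">pow</code>, which performs modular exponentiation. The helper <code class="language-plaintext highlighter-rouge">fermat_passes</code> below is a made-up name for this sketch. Note that a base sharing a factor with $561$, such as $3$, can never give $1$, so the check is restricted to bases coprime to $561$.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from math import gcd


def fermat_passes(n, a):
    """True when a**(n-1) is congruent to 1 modulo n, i.e. n looks prime to base a."""
    return pow(a, n - 1, n) == 1


n = 561  # 3 * 11 * 17, a composite number
coprime_bases = [a for a in range(1, n) if gcd(a, n) == 1]

print(all(fermat_passes(n, a) for a in coprime_bases))  # True: every coprime base is fooled
print(fermat_passes(n, 3))                              # False: 3 shares a factor with 561
print(fermat_passes(7, 2))                              # True: 7 really is prime
</code></pre></div></div>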

<p>It seems we are in a cul-de-sac… Fortunately, we have the Miller-Rabin algorithm to help us. This test was developed by Miller in 1976 as a fully deterministic test and then modified by Rabin in 1980 to make it a probabilistic algorithm. In order to circumvent the Carmichael numbers, let me state the following proposition: let $p$ be a prime (different from $2$); then we can write $p-1$ in the form</p>

\[p-1=2^kq\]

<p>where $q$ is an odd number and $k \geq 1$ is an integer. Now let $a$ be any number not divisible by $p$. Then one of the following two conditions is true</p>

\[a^{q}-1=0 \mod{p}\]

<p>or</p>

\[a^{cq}+1=0 \mod{p}\]

<p>for some $c$ in $1$, $2$, $4$, $\cdots$, $2^{k-1}$. You can find the proof of this proposition in the book by Hoffstein, Pipher and Silverman. This always holds when $p$ is prime, but let’s replace $p$ with an arbitrary odd number $n$. We can write $n-1=2^kq$, and if we find an $a$ for which neither of the conditions above holds, we call $a$ a Miller-Rabin witness for the compositeness of $n$. That is, if we find such an $a$ we know for sure that $n$ is composite.</p>
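
<p>The witness check can be sketched as follows. This is an illustrative version rather than the implementation referred to in this post: the name <code class="language-plaintext highlighter-rouge">is_witness</code> is an assumption, $n$ is assumed odd and at least $3$, $a$ is assumed to lie between $1$ and $n-1$, and $c$ runs over $1$, $2$, $\cdots$, $2^{k-1}$, which is how the condition is usually stated.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def is_witness(a, n):
    """True when a proves that the odd number n (at least 3) is composite."""
    # Write n - 1 = 2**k * q with q odd.
    k, q = 0, n - 1
    while q % 2 == 0:
        k += 1
        q //= 2
    x = pow(a, q, n)
    if x == 1 or x == n - 1:
        return False  # first condition holds, or the second one with c = 1
    for _ in range(k - 1):
        x = pow(x, 2, n)  # x runs over a**(c*q) for c = 2, 4, ..., 2**(k-1)
        if x == n - 1:
            return False  # second condition holds
    return True  # neither condition holds, so n is certainly composite


print(is_witness(2, 561))                         # True: 561 is exposed by a = 2
print(any(is_witness(a, 7) for a in range(1, 7)))  # False: a prime has no witnesses
</code></pre></div></div>

<p>Unlike the plain Fermat check, this test is not fooled by the Carmichael number $561$: already $a=2$ is a witness.</p>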

<p>Ok, but how many random $a$’s should one test to be, say, 99% certain that the number $n$ is probably prime? Another proposition addresses that: let $n$ be an odd composite number; then at least 75% of the numbers between $1$ and $n-1$ are Miller-Rabin witnesses for $n$. This means that if $n$ is composite and we randomly sample $10$ values of $a$ in the range $1$ to $n-1$, each one fails to be a witness with probability at most $1/4$, so the probability of hitting at least one witness is at least $1-(1/4)^{10} \approx 0.999999$. We are therefore quite sure that if after $10$ trials we haven’t found a witness the number is prime, but we will never be 100% sure.</p>

<p>My implementation of the Miller-Rabin algorithm can be found here, along with the code to generate random prime numbers. Timing it on a regular laptop (2.6 GHz, 6 cores, 16 GB RAM), it takes between 25 ms and 60 ms to generate a $256$-bit random prime, between 80 ms and 120 ms for a $512$-bit prime and between 500 ms and 2.15 s for a $1024$-bit prime, all using at most $40$ candidate witnesses $a$.</p>
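
<p>To give an idea of what such code might look like, here is a self-contained sketch that draws random odd numbers of a given bit length and keeps the first one that passes the Miller-Rabin test. It is not the implementation timed above: the names <code class="language-plaintext highlighter-rouge">is_probable_prime</code> and <code class="language-plaintext highlighter-rouge">random_prime</code> are assumptions, and it uses Python’s standard <code class="language-plaintext highlighter-rouge">secrets</code> module for randomness.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import secrets


def is_probable_prime(n, rounds=40):
    """Miller-Rabin for a positive integer n: False means composite, True means no witness found."""
    if n in (2, 3):
        return True
    if n == 1 or n % 2 == 0:
        return False
    # Write n - 1 = 2**k * q with q odd.
    k, q = 0, n - 1
    while q % 2 == 0:
        k, q = k + 1, q // 2
    for _ in range(rounds):
        a = 2 + secrets.randbelow(n - 3)  # uniform in [2, n - 2]
        x = pow(a, q, n)
        if x in (1, n - 1):
            continue  # a is not a witness, try another base
        for _ in range(k - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break  # a is not a witness
        else:
            return False  # a is a witness, so n is composite
    return True


def random_prime(bits):
    """Return a probable prime with exactly `bits` bits (bits of at least 2)."""
    while True:
        # Force the top bit so the length is exact, and the low bit so it is odd.
        candidate = secrets.randbits(bits - 1) + 2 ** (bits - 1)
        candidate |= 1
        if is_probable_prime(candidate):
            return candidate


p = random_prime(256)
print(p.bit_length(), p)
</code></pre></div></div>

<p>With $40$ rounds a composite slips through with probability at most $(1/4)^{40}$, which is why that many rounds is a common choice.</p>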

<h1 id="conclusions">Conclusions</h1>
<p>Prime numbers are the building blocks of arithmetics and are very useful in cryptography. We’ve seen how densely prime numbers are distributed, how to find them efficiently and how to test whether a number is prime or composite using the Miller-Rabin primality test.</p>

<p>Thank you for reading!</p>]]></content><author><name>Sebastia Agramunt Puig</name></author><category term="Cryptography" /><category term="cryptography" /><category term="mathematics" /><summary type="html"><![CDATA[Prime numbers are the building blocks of arithmetics. In this short post we will investigate some attributes of prime numbers and how to work with them in a computer.]]></summary></entry></feed>