Install Conda
Anaconda is a distribution of the Python and R programming languages for scientific computing, data science, machine learning, and large-scale data processing. It simplifies package management and deployment, and it provides many useful tools and libraries out of the box. Having said that I have to confess that I’m not really a fan of anaconda, it basically contains many packages and software you may not want to use like Spyder IDE (I use vscode instead), Anaconda navigator, a UI that helps you manage your virtual environments, RStudio for R and lots of python packages that you may not want to use. I downloaded to install it today and the installation helper advises me that total installation is 4.82 GB. As you may guess, I’m not installing Anaconda today not even for a try (I did it in the past).
A better alternative (IMHO) to Anaconda is its mini-version, miniconda. It’s basically the anaconda environment manager and package installer without all the fancy UI you may not need. In this post we’ll show how to install it
TLDR
For MacOS and zsh terminal
1
2
3
4
5
6
mkdir -p ~/miniconda3
curl https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh -o ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm -rf ~/miniconda3/miniconda.sh
~/miniconda3/bin/conda init zsh
source ~/.zshrc
Install miniconda
As of today you have the following versions:
- MacOS Intel (old Macs) Miniconda3-latest-MacOSX-x86_64.sh,
- MacOS M1 ARM (new Macs) Miniconda3-latest-MacOSX-arm64.sh
- Linux Miniconda3-latest-Linux-x86_64.sh.
- Windows Miniconda3-latest-Windows-x86_64.exe
I’m going to show how to install for MacOS intel, but you should find all instructions here. It is as easy as to download a bash script and execute it, that will complete the installation.
1
2
3
4
mkdir -p ~/miniconda3
curl https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh -o ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm -rf ~/miniconda3/miniconda.sh
That installs miniconda but to make it availiable to your shell command you should
1
~/miniconda3/bin/conda init bash
if you are using bash or
1
~/miniconda3/bin/conda init zsh
if you use zsh. For instance, in zsh, this will append some lines of code at your ~/.zshrc
, check it by running cat ~/.zshrc
and see that conda has created the code in between # >>> conda initialize >>>
and # <<< conda initialize <<<
. Now, to update your command line just run
1
source ~/.zshrc
(or ~/.bashrc
if you use bash) and see that the command conda
is there. If that works, congrats you have conda installed!. All your conda stuff is in ~/miniconda3
. conda is now available to your system and it has overloaded the python
command in your terminal, check that by running which python
, then you’ll find that it points to ~/miniconda3/bin/python
. Basically now your default python is managed by conda, that’s what conda does when you execute the init zsh
/init bash
command to write to your ~/.bashrc
/~/.zshrc
.
Uninstall
To uninstall just remove ~/.miniconda3
and the comments appended in ~/.bashrc
or ~/.zshrc
. In the example shown:
1
rm -rf ~/miniconda3
Open ~.zshrc
with
1
vim ~/.zshrc
and renomve everything in between th lines # >>> conda initialize >>>
and # <<< conda initialize <<<
. Now refresh your terminal
1
source ~/.zshrc
type conda
to see that it’s not there anymore.
Virtual enviroments
In miniconda and anaconda the conda
command is responsible to managing virtual environments. We will investigate that further in a future post but basically an environment is an isolated python where we have installed different pacakges and a specific python version. Virtual environments are perfect if you work on different projects in your system, for instance, if you need numpy
and python 3.9
in one project and matplotlib
and python 3.12
in another you need to have two different virtual environments, one for each project. Virtual environments are perfect because they isolate a single python version and a list of packages for each project.
Before creating the virtual environment it is good practice to update conda from time to time, you can do it with
1
conda update -n base -c defaults conda
this updates the “base” environment. The environment that conda
(command line tool) uses. To create a virtual environment callaed myenv
using python 3.12 just run
1
conda create -n myenv python=3.12 -y
now you can activate it with
1
conda activate myenv
Run which python
and see that it is using a python binary from ~/miniconda3/envs/myenv/bin/python
. The path ~/miniconda3/envs/myenv
is where your new environment has been installed. Now you can install some packages like:
1
2
3
conda install numpy -y
conda install pandas -y
conda install scikit-learn -y
check the packages you have installed in the current activated virtual environment with the command
1
conda list
and you will find your numpy
, pandas
and scikit-learn
packages that you just installed listed there.
To deactive the environment run
1
conda deactivate
And to list all your virtual environments managed by conda run
1
conda env list
Finally, to remove the environment make sure it’s deactivated and run
1
conda remove --name myenv --all -y
Virtual environments from file
In some projects you will find a file with a name similar to environment.yml
and contents like:
1
2
3
4
5
6
7
8
name: myenv
channels:
- defaults
dependencies:
- python=3.12
- numpy
- pandas
- scikit-learn
This specifies a conda environment (the file has been generated using conda env export --from-history > environment.yml
with the activated environment we created in the last section). Then to install
1
conda env create -f environment.yml
Much more than virtual environment manager
Conda is very convenient to manage dependencies, that is, to resolve conflicts in between different package versions. conda-forge is defined as a place to find community-led recipes, infrastructure and distributions for conda, this is the secret sauce why conda is so successful in my opinion. The comunity has been able to maintain and distribute compiled packages for different operating systems and architectures and as of today they have over 25.7K packages, check them in the package browser. They make sure the packages are cross platform precompiled so will be able to install no matter what the operating system or platform you use.
Another advantage of conda (that has to do with precompiled software) is that you can install libraries in an isolated environment. It often happens that the machine you are using for developing does not contain openblas or opencv, then you can install them with your operating system package manager but if you are not root you don’t have those permissions… Then it is convenient to install them in your home directory or simply use conda to download the compiled libraries for your system, that way you don’t need to compile and spend time to use the libraries.
We are going to install a compiled library using conda, the chosen library is opencv, a library for computer vision in C++.
1
conda install conda-forge::libopencv -y
where conda-forge
is the conda channel (the URL where the package lives). Go check that the library has been installed by finding the directory inside your environment
1
find $CONDA_PREFIX -name "opencv2"
here $CONDA_PREFIX
should be the prefix of the activated environment. This should output ~/miniconda3/envs/myenv/include/opencv4/opencv2
.
The header files are in this directory, just look for the main ones
1
2
3
ls -lhat $CONDA_PREFIX/include/opencv4/opencv2 | grep opencv.hpp
ls -lhat $CONDA_PREFIX/include/opencv4/opencv2 | grep core.hpp
ls -lhat $CONDA_PREFIX/include/opencv4/opencv2 | grep imgproc.hpp
Now we can check where are the compiled libraries. They should be in the lib
directory, let’s check for the basic ones
1
2
ls -lhat $CONDA_PREFIX/lib/ | grep libopencv_core
ls -lhat $CONDA_PREFIX/lib/ | grep libopencv_imgproc
You should be set, then if you wanted to use opencv
in a C++ program you could use the headers and dynamically link to the library
My final takeaway
I’ve mentioned from the beninning that I am not a particular fan of conda
or anaconda
, I just find it convenient for someone who is starting its journey with python and want to just code and have things working. I would not recomend it for tools in production, but it is a great tool for data science and machine learning practitioners. As I mentioned, there is no free lunch in convenience, for instance, conda-forge doesn’t have all the packages and the ones that they have they are not normally up-to-date so you may end up using pip
sometimes (more on that in another post). Having two dependency managers is in general not a great idea. Also another thing I find inconvenient is that conda modifies your bash PATH, recall you always have this “base” in your terminal meaning that the “base” environment is active. That IMHO is annoying and also a reminder that conda controls all your python in the system. Of course there are hacks to modify ~/.bashrc
to make that dissapear and get more control to use other python managers, but it’s that, a hack. I would better use other tools like pyenv which I’ll present in my next post of this series. The great use of conda
is as a library manager, it makes it so easy to install libraries in your system that later you can use to compile your programs, even thought you can do it manually conda install
is so easy to use!.