What Happens When You Install Python

Charlene Chambliss

2023 Sep 7

Recently someone I know needed to install Python, and as is completely normal and expected for newcomers, was confused by the process. After consulting many different resources and getting progressively more confused over the course of a couple hours (again…exceedingly normal), this blog post from Zbigniew Marcinkowski helped the person finally successfully install Python.

I remember this kind of frustrating experience quite well from when I was learning, so now that I understand this a bit better, I thought I’d write down what’s actually going on, and explain why some of these steps are necessary.

Assuming you’ve read or at least skimmed the linked post, here’s the additional context you’d need to understand what’s going on at each step. I am purposefully not going to go into TOO much additional detail / am choosing to underspecify a little bit, to keep things approachable. Feel free to email me if you have additional questions.

Python is just an “executable”

The python command is actually just a single executable file that you launch and that eventually gets executed by your CPU/processor, like any other software program.

Strictly speaking, an executable could be one of several different file types. The important part is that it’s in a format your OS knows how to load and run (Mach-O on macOS, ELF on Linux), which is part of what’s called the OS’s “application binary interface.”

Your computer has a lot of executables on it already, for doing various OS tasks or basic command-line operations, like ls, which you might have used already for [l]i[s]ting the files in a directory.

To be able to use the command python or python3, your shell has to know where to find the executable that corresponds to that command. You can type which python3 into the terminal to see which Python executable you’re currently using.
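If you’d rather ask Python itself, here’s a minimal sketch using only the standard library. shutil.which does roughly the same lookup as the which command, and sys.executable reports the interpreter that’s running the script. The helper name locate is just something I made up for illustration.

```python
import shutil
import sys


def locate(command):
    """Return the full path the shell would run for `command`, or None if it isn't found."""
    return shutil.which(command)


if __name__ == "__main__":
    # Path of the interpreter that is running this script right now.
    print("running under:", sys.executable)
    # What each command name resolves to on the current PATH (None if missing).
    for name in ("python", "python3"):
        print(name, "->", locate(name))
```

If the two paths differ from what you expected, that’s usually a sign that something earlier on your PATH is winning.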

To see what type of executable something is, you can use file. Here’s what it says when I ask about my system Python (the default Python that comes with MacOS’s developer tools):

> file /usr/bin/python3
/usr/bin/python3: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit executable x86_64] [arm64e:Mach-O 64-bit executable arm64e]
/usr/bin/python3 (for architecture x86_64): Mach-O 64-bit executable x86_64
/usr/bin/python3 (for architecture arm64e): Mach-O 64-bit executable arm64e

This output can be a little jargony if you’re not familiar with CPU architectures, but all this really says is that python3 is a “universal” Mach-O file with two binaries in it, one that can be executed by pre-M1 (Intel) Macs, and one that can be executed by Macs using the M1+ chips. (x86_64 is Intel, arm64e is the M1+ architecture.)
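To get a feel for how a tool like file can tell these formats apart, here’s a rough sketch that looks only at the first few “magic” bytes of a file. This is a big simplification of what file really does, and the function names (guess_format, guess_file_format) are my own, not part of any library.

```python
import sys

# Leading "magic" bytes for a few common executable formats.
# Note: Java class files also start with CA FE BA BE, so this is only a guess.
MAGIC = [
    (b"\xca\xfe\xba\xbe", "Mach-O universal binary"),
    (b"\xcf\xfa\xed\xfe", "Mach-O 64-bit"),
    (b"\x7fELF", "ELF (Linux)"),
    (b"MZ", "PE (Windows)"),
    (b"#!", "script with a shebang line"),
]


def guess_format(first_bytes):
    """Guess an executable format from the first bytes of a file."""
    for magic, label in MAGIC:
        if first_bytes.startswith(magic):
            return label
    return "unknown"


def guess_file_format(path):
    """Read the first four bytes of `path` and guess its format."""
    with open(path, "rb") as f:
        return guess_format(f.read(4))


if __name__ == "__main__":
    # Inspect whichever interpreter is running this script.
    print(sys.executable, "->", guess_file_format(sys.executable))
```

On a Mac like mine, pointing guess_file_format at /usr/bin/python3 should report a universal binary, matching the file output above.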

Homebrew and executables

If you’re on MacOS, Homebrew is a nifty CLI tool that manages random executables that you install from elsewhere. Using Homebrew to manage random installs is a great idea and keeps things organized.

The way Homebrew works is by (1) putting executables in a folder that it has full control over, and then (2) adding that folder to your PATH. PATH is a sort of background variable (“environment variable”) that tells your shell where to look for executables whenever you type in the name of something. If you want to see what your PATH is, you can write echo $PATH (the $ is important because in shells like bash and zsh it indicates PATH is a variable and not just regular text).

Here’s a snippet of mine:

> echo $PATH
/Users/charlenechambliss/Library/pnpm:/Users/charlenechambliss/.bun/bin:/Users/charlenechambliss/mambaforge/condabin:/Users/charlenechambliss/.nvm/versions/node/v20.5.0/bin:/opt/homebrew/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/bin:/Users/charlenechambliss/nvim-macos/bin:/Users/charlenechambliss/.pyenv/shims:/Users/charlenechambliss/.orbstack/bin
... (so on and so forth)

It looks kind of crazy, but it’s literally just a bunch of paths smushed together and separated by :, and when you write python, your shell looks in each one of those directories from beginning to end until it finds an executable named python. If you’ve ever gotten command not found while using your shell, it was because it couldn’t find the executable on your PATH.
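The lookup described above is simple enough to sketch in a few lines of Python. This is a simplified imitation of what the shell does, not its actual implementation, and find_on_path is a name I made up.

```python
import os


def find_on_path(command, path_string):
    """Return the first executable named `command` found in `path_string`, else None."""
    for directory in path_string.split(os.pathsep):
        if not directory:
            # Real shells treat an empty entry as the current directory;
            # this sketch just skips it.
            continue
        candidate = os.path.join(directory, command)
        # It has to be a regular file AND be marked as executable.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Nothing matched: this is the "command not found" case.
    return None


if __name__ == "__main__":
    print(find_on_path("python3", os.environ.get("PATH", "")))
```

Because the search stops at the first match, the order of the directories in PATH decides which python you get when more than one is installed.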

The end result is once you’ve homebrew-installed something, like Python, it’s now available on your command line! And anytime you want to upgrade, you just use brew upgrade, or brew uninstall to uninstall.

If you want to see the other homebrew commands, type brew --help, or brew help {command}, such as brew help install to get help on particular commands.

Bonus: Another reason Homebrew is awesome is that it’ll automatically install and manage the dependencies of whatever you install. So for example, the dependencies to install Python with brew are mpdecimal, openssl@3, sqlite, and xz, which are mostly libraries (some come with their own command-line tools too) that Python needs to reference for various pieces of functionality. So it automatically installs those for you before installing Python. And then at some point if you were to uninstall Python and you didn’t need those anymore, you could run brew autoremove to get rid of dependencies that are no longer needed for any currently-installed package.

Development environments

venv is important because it lets you have a separate Python “environment” for every project you work on. This matters because once you start installing Python libraries into your environment, some of those libraries will have dependencies that aren’t compatible with other libraries, or aren’t compatible with certain versions of Python (e.g. a library that only works with 3.10 or lower), etc.

So you can’t really have an environment with tons of data science packages and then use the same environment for interacting with cloud services via their Python clients, because the cloud services’ Python clients often have pretty specific dependencies that will generate conflicts with your existing data science stuff. (Speaking from experience trying to install e.g. pandas and clustering libraries in the same environment as the Python clients for various GCP services, e.g. BigQuery.)

Using only the dependencies that your project strictly needs also means that the size of your project when you want to deploy it somewhere (e.g. as a web app, or as installable software, or whatever) will be smaller.
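For the curious, here’s a small sketch of what creating an environment looks like from Python’s side, using the standard library’s venv module (the same thing python3 -m venv runs). The make_env helper and the .venv folder name are my own choices. I’m passing with_pip=False only to keep the sketch quick; for a real project you’d want pip in there.

```python
import os
import tempfile
import venv


def make_env(project_dir):
    """Create a virtual environment in `project_dir`/.venv and return its path."""
    env_dir = os.path.join(project_dir, ".venv")
    # with_pip=False skips installing pip, which keeps this demo fast.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    # Use a throwaway folder so the demo doesn't leave anything behind.
    with tempfile.TemporaryDirectory() as project:
        env = make_env(project)
        # An environment is just a folder: a pyvenv.cfg file plus its own
        # bin (or Scripts) and lib directories.
        print(sorted(os.listdir(env)))
```

The takeaway is that an environment is nothing magical, just a directory with its own python and its own place to put installed libraries.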

Other random factoids

Don’t use system Python

macOS doesn’t ship with the developer tools included anymore, which is why you have to run xcode-select --install. /usr/bin/python3, which I mentioned at the top of the post, is the Python that gets installed when you run this. But you should never ever ever ever ever use the system (default) Python, and you should especially never ever install libraries with your system Python. Always use a virtual environment when you’re working on projects, even though it’s slightly more hassle.

I won’t go into toooo much detail about why, but the short version is that your OS sometimes uses python for its own internal operations, and if you change it in any way, that could have unexpected results when your OS is doing OS things. This is probably less of a problem for Mac users, but it’s generally good practice to avoid. See here for more from Perplexity on why not to use system Python.

Don’t forget to activate your virtual environment / check that it’s activated before pip installing things.
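One way to double-check is to ask the interpreter. This is a minimal sketch that assumes the environment was created with venv: inside one, sys.prefix points at the environment folder while sys.base_prefix still points at the Python it was created from. The function name in_virtualenv is my own.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True if the two prefixes differ, i.e. we're inside a venv.

    With no arguments, this checks the interpreter that is currently running.
    """
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print("virtual environment active:", sys.prefix)
    else:
        print("no virtual environment active, using:", sys.prefix)
```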

Pyenv

For managing different versions of Python, you should use pyenv (also installable by homebrew). pyenv lets you keep your Python installations separate from your Python virtual environments, allowing you to reuse the same Python version for multiple virtual environments.

The docs for pyenv aren’t the easiest to digest for a newcomer to the space. They start out talking about shims and whatnot, which I personally had no idea about back when I started using it. Instead I would recommend reading a comprehensive but newbie-friendly article about it, like this one from Real Python.

Realistically, managing different Python versions is only important once you work on multiple bigger applications that need to stay version-locked to a particular version for stability reasons. Most larger applications keep using a particular version of Python, and only choose to upgrade to a newer version (say, from 3.10 to 3.11) every year or two, because it’s kind of a pain to do so for a sufficiently big app. This is (again) mostly because of challenges with dependencies, and unexpected behavior changes that can arise when upgrading.
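To make “version-locked” a bit more concrete, here’s a hypothetical guard an application could run at startup to confirm it’s on the Python version it expects. This is my own illustration, not something pyenv provides, and the (3, 11) requirement is just an example value.

```python
import sys


def version_matches(required, actual=None):
    """True if the (major, minor) version equals `required`.

    `actual` defaults to the interpreter that is currently running.
    """
    actual = tuple(sys.version_info[:2]) if actual is None else tuple(actual)
    return actual == tuple(required)


if __name__ == "__main__":
    required = (3, 11)  # example value; a real app would pick its own
    if not version_matches(required):
        print("expected Python %d.%d" % required,
              "but running %d.%d" % tuple(sys.version_info[:2]))
```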

brew installing applications?

Brew can actually also install applications, as Zbigniew shows with VSCodium in the post. A lot of devs will set up new machines this way, like they’ll write a script to use brew to install VS Code, Chrome, Warp, and various other stuff that’s available from brew, so that they don’t have to manually go to every website and download each application one by one. Not every application is available from brew, but a lot are. There are a couple minor drawbacks of doing this, but it suffices for a surprising number of cases.

More on how your operating system executes files

If you’re curious to know more about what’s really going on behind the scenes when your computer executes a file, this wonderful series from Lexi Mattick at Hack Club goes into more detail, but in a way that’s still approachable. I read this just recently, and even after working on software for a few years, I still learned a lot!