
---
date: 2024-05-08
unlisted: true
---
# Using curl | bash safely
> I don't know what I'm doing.
>
> Take everything I say here with a wheelbarrow of salt. Do your own research. Don't trust *one* random person on the internet with your infrastructure.
In April of 2024 I wrote [a post](./old-curlpipebash.md) on Fedi explaining that using `curl | bash` was not a security risk.
I based my original argument on the fact that you ultimately have to trust the person that provides you the code, which is true, but *incomplete*.
A bit later, I discussed the same topic in a Matrix channel.
The people there showed me that using `curl | bash` actually carries some risk.
This is what caused me to do research on the topic and write this post.
But is it actually dangerous?
Is the cake a lie?
Well, as you could probably imagine, it turns out that the answer actually is, "it depends".
I'll talk about what the actual dangers of using `curl | bash` are, and how we can mitigate them.
> TL;DR: If you're here because you just want to download software, go for it. You're *probably* going to be just fine. If you're interested in learning or want to implement a `curl | bash` script however, please read the rest.
## Terminology
- Software artifact: Stuff that comes out of your repository: code, shell scripts, binaries, etc. In this blog post I will focus on the shell script that installs your binaries more than anything else.
- Signing authority: a server that hosts the artifact's cryptographic hash or signature.
- Artifact provider: a server that serves the artifact directly to us.
## Attack surface
We can establish a simplified supply chain for a software artifact:
```text
/----------\          /--------\          /--------\
| Artifact | -------> | Server | -------> | Client |
\----------/          \--------/          \--------/
     (1)         (2)      (3)       (4)       (5)
```
A malicious actor could compromise the supply chain by attacking:
- (1): The machine the artifact is built on;
- (2): The connection between the artifact builder and the server;
- (3): The server that hosts the artifact;
- (4): The connection between the server and the client;
- (5): The client that requests the artifact.
For the purpose of this post however, the attack vectors (1), (2) and (5) are out of scope, which leaves us with only (3) and (4).
> There's not a lot that can be leveraged then? So I'd imagine using `curl | bash` is safe *most of the time*.
Precisely. *Most of the time*.
## An example script
```bash
curl --proto '=https' --tlsv1.2 -sSf -L https://install.determinate.systems/nix | sh -s -- install
```
This command downloads and runs the Determinate Nix Installer, an installer for the Nix package manager.
We'll use this as an example for the rest of this blog post. Let's break it down a bit:
- `curl`: Invoke the curl command-line utility, which will make an HTTPS request;
- `--proto '=https'`: Only allow `curl` to use HTTPS, even across redirects;
- `--tlsv1.2`: Only connect to the server over a secure tunnel (TLS v1.2 or later);
- `-sSf -L`: Hide the progress bar (`-s`) but still show errors (`-S`), fail on HTTP errors instead of passing the error page along (`-f`), and follow redirections (`-L`);
- `https://install.determinate.systems/nix`: The URL that points to an installation script;
- `|`: If `curl` gets the script successfully, pass it on to the next command;
- `sh`: Execute whatever `curl` gets from the server;
- `-s`: Read the script from standard input, so the arguments that follow are passed to the script instead of being treated as a file to run;
- `-- install`: Stop option parsing (`--`) and pass `install` as an argument to the downloaded script.
We can see that the script explicitly requires `curl` to use a secure connection.
At first glance, this seems like a secure way to run the installer.
However, this command doesn't check that the script you're downloading is what it should be.
If the server is compromised in some way, we could be downloading malware instead.
We can mitigate this risk by borrowing a method used by most package managers: splitting the job across 2 different servers with different functions:
- One that hosts the artifact's cryptographic hash or signature (here called *signing authority*);
- And another one that serves the artifact directly to us (here called *artifact provider*).
This way, if either server is compromised, the software that's served to the client will not be verified and therefore not run.
We can drastically reduce the risk of getting both machines compromised at once by:
- Having them be controlled by 2 different entities (companies and/or persons);
- Having them be managed by 2 different systems administrators;
- Using different data centers, network routes, domains and SSL certificates;
- Using different operating systems;
- Using different HTTP servers;
- Using different configurations.
This way, the only thing we have to trust is that the artifacts uploaded to the servers are genuine, and that **both** servers are not compromised at once (which should be overwhelmingly unlikely if they are separate and different enough).
Now, our infrastructure looks like this:
```text
                  /-----------\
                  |  Signing  |
             /--> | authority | ---\
/----------\ |    \-----------/     \--> /--------\
| Artifact |-+                           | Client |
\----------/ |    /-----------\     /--> \--------/
             \--> | Artifact  | ---/
                  | provider  |
                  \-----------/
```
> There are still other parameters that I won't bother bringing into the picture right now, like the SSL certificate provider, and of course, the way the servers get the artifact in the first place (which depends on how your script is written and how and where your software is built).
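To make this concrete, here's roughly what each side of the verification looks like. This is a minimal sketch with illustrative file names, not how Determinate Systems actually does it:

```shell
# Stand-in for the real installation script (the artifact).
printf '%s\n' 'echo "installing..."' > install.sh

# Signing authority: compute the artifact's SHA-256 and publish
# the resulting file over HTTPS.
sha256sum install.sh > install.sh.sha256

# Client: verify a downloaded copy against the published hash.
# Prints "install.sh: OK" when the file matches.
sha256sum --check install.sh.sha256
```

The important part is that `install.sh` and `install.sh.sha256` are fetched from two different, independent servers.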
An example infrastructure would look like this:
- Signing authority
  - Managed by John Doe
  - Hosted by DigitalOcean (Germany)
  - OS: NixOS
  - HTTP server: Nginx
  - Domain: `determinate.systems`
- Signing authority (alternative)
  - Managed by gitea.com
  - Domain: `gitea.com`
- Artifact provider
  - Managed by Jane Poe
  - Hosted by a worldwide CDN (Hetzner)
  - OS: RHEL
  - HTTP server: Apache
  - Domain: `install-determinate.systems`
> Notice the artifact is now in a different domain (`install-determinate.systems`) and not in a subdomain like it was previously (`install.determinate.systems`).
Now, compromising this part of the supply chain has become extremely hard. The attacker will either:
- Need technical knowledge of NixOS, RHEL, Nginx and Apache, as well as a way to compromise an entire CDN;
- Compromise both of the sysadmin's machines through social engineering;
- ...
- Use several of the methods listed above.
Now, it would be a lot more feasible to attack another part of the supply chain, which is a subject for another blog post.
## Implementing curl | bash safely
> You've spent so much time explaining that `curl | bash` is insecure, why would we bother making a secure version of it?
Because the alternative is to package your software for every distro and package manager under the sun, a task that sends shivers down my spine just imagining it.
Making a shell script that leverages this infrastructure isn't actually hard at all. Most of the work is around creating two resilient and independent servers. What we have to do is simply to check the artifact provider's response against a hash or a signature provided by the signing authority.
```bash
CURL=$(curl --tlsv1.3 https://pastebin.com/raw/Tity9gDQ)
# CURL=$(curl --tlsv1.3 https://pastebin.com/raw/xYTmzaMQ)
# Note the two spaces: that's the format sha256sum uses when hashing stdin.
EXPECTED='caa42ef74ba42d3d097bfcd7c718cd22ca807c1116ce1f86b00ecce9337858d7  -'
# Quoting "$CURL" matters: unquoted, the shell would mangle the script's
# whitespace, changing both its hash and its behavior.
ACTUAL=$(echo "$CURL" | sha256sum)
if [ "$EXPECTED" = "$ACTUAL" ]; then
    echo "$CURL" | bash
else
    echo "Checksum mismatch" >&2
    exit 1
fi
```
This can be minified a bit, but it's more readable this way.
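For example, the same check can be wrapped in a small helper that reads the script on standard input. The function name and shape here are my own sketch, not a standard tool:

```shell
# verify_and_run: run the script received on stdin only if its SHA-256
# matches the expected hash passed as the first argument.
verify_and_run() {
    expected="$1"
    script=$(cat)
    # Hash the script exactly as it will be fed to bash below.
    actual=$(printf '%s\n' "$script" | sha256sum | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        printf '%s\n' "$script" | bash
    else
        echo "Checksum mismatch" >&2
        return 1
    fi
}

# Usage, with the hash obtained from the signing authority:
# curl --proto '=https' --tlsv1.3 -sSf https://example.com/install.sh \
#     | verify_and_run "<hash from the signing authority>"
```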
### Updating the script
When a new artifact is available, the artifact provider has to start hosting it.
Then, the signing authority needs to get the artifact's hash (directly from the source) and update the script it displays (on a git repo or website).
Preferably, the artifact provider should include the artifact's version in its URL and keep hosting non-vulnerable versions; that way the script keeps working both before the signing authority finishes its work and after another update is released.
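As an illustration, a versioned layout could look like this (these URLs are hypothetical, not real endpoints):

```text
https://install-determinate.systems/nix/v1.0.0/install.sh
https://install-determinate.systems/nix/v1.0.1/install.sh
https://install-determinate.systems/nix/v1.1.0/install.sh   <- latest release
```

Since the signing authority's hash always refers to one specific version, a client never ends up hashing a script that changed out from under it mid-update.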