If you build containers, you should be looking into Docker Slim
As the title says. I build containers for my platforms/clients/myself-selfhosted@home and you would not believe how much smaller you can get your images. Here's an example when slimming one of my images:
That's a Python app that I didn't have to do multi-staged build with docker because of the Slim command. And it's a working version of that app that I'm using today.
Same for one of my flutter apps that I thought it was as small as it could be:
AFAIK it works by analyzing your docker image, checking whats actually used and then throwing out anything else.
For example if you use the Ubuntu base image you have a full minimal OS install. If you're now running a python server for example it's highly unlikely that you will need the perl interpreter that's in the default install so it can be thrown out.
It can get problematic if you want to run something that loads libraries or runs programs dynamically at runtime, since the tool can't easily detect them then and you need to manually intervene. Tried it once on a custom machine learning container and it kept throwing out parts that I actually needed, so I gave up in the end.
It's usefulness is also somewhat limited, since docker containers also share their base images. So if you have three containers running that are all based on Ubuntu 22.04 you will still only have to download it once
Great write up! That's everything exactly right. It's mostly useful to try and reduce the time it takes to pull images to run them. And also reduce the footprint of storing those in your registries.
It ptraces the main container process and cuts off unused files. It also fires some customizable HTTP requests to trigger any dynamically loading libraries. Clever idea. If I understand correctly, the problems that arise to me are:
Undoubtedly some essential files will be omitted. Unless my image consists merely of scratch and an executable, I can't imagine myself successfully covering all edge cases.
What about files that aren't loaded by HTTP requests?
I'm not shitting on this program at all. These are two problems that I'm sure they could solve or just tell straight up "we can't guarantee it'll work in XYZ scenarios. Don't use it if that's your use case". Then I saw that this is backed by some kinda SaaS with a domain that ends with .ai, and that explains why THAT FUCKING README IS WRITTEN like a FUCJik/INg MIND NUMBING LINKEDIN POST that my CEO could write bro what the fuck do you mean by simplifying the value of my digital assets in a seamless secure cost efficient way????? Who fucking cares??? ?WHat does your program ACTUALLY DO??????????
10000000s of seemingly AI-generated paragraphs going on and on about how convenient their product is, 1 measly line in a diagram that describes what it actually does. Again not to shit on the programmers at all, this is a great idea and I'm glad that it's being explored I just hate this industry I can't read another pile of gibberish like that. That ruined my night. Thanks for listening
COuldn't agree more on this! Honestly. I understand that people want hefty descriptions with few inputs on their side, but this is sad.
Anyways! Some of my python cronjobs that I run on my cluster don't have an exposed service, and I can still make it work just fine by passing along the --exec flag and the stuff that takes to run the app. The complicated part is to define properly your environment variables that are necessary to run your use-cases and make sure that you execute all the necessary files. It's not a solution that fits all, for sure! And I honestly don't use it for everything. It's a tool to be used in some use-cases
Oh there's an --exec flag as well? That's great. This seems like a totally viable solution for cases where the crux of the container is a small script, with a handful of decision branches so the surface area to cover is manageable, but it also needs to come in a non-alpine distro because I assume that's the hefty part that we're like to remove. But that's just off the top of my head, I'm sure there's more. It's genuinely a good idea and it deserves a respectful README as well :(