I apologize if this is a debate that has already taken place. If so, please delete the post and kindly point me to where I should send my message.
We all know that in technology there are always things where one has to trust the developer(s), whether it's hardware or software. Some of those things cannot realistically be changed in the short term, so that's not where I'm focusing my point.
But something bothers me about "Open-Source" applications. I don't know how to compile, and I'm not willing to dedicate so many hours of my life to learning it. So, in addition to trusting reputable companies, I now choose to trust a reputable person or group, who likely receives code audits for their open-source code. However, these audits are based on the open-source code, not on the binary that is actually compiled and ends up running on my device. In the end, each project is a bucket of trust unless I know how to compile. And even then, something could still slip past us, but I understand that it would at least reduce the risk. I read that F-Droid did this: they didn't trust the app creator, but rather compiled their own version from the open-source code. It seemed fantastic to me, but the problem was always the delay.
The question is: Couldn't a program with AI be created to compile any GitHub repository directly? It would eliminate the need to trust even the developer themselves; we would only have to trust their code, as we already do today, and the audits would finally have real value. We would know that what we receive is that code.
I would also love for the concept of Flatpak to be like this: that the developer doesn't sign the binary, but only signs the code, and Flathub (or some backend) creates everything automatically. And if there are doubts about Flathub or any other backend, users could do it locally. It would be a bit more tedious, but its value in privacy would be enormous.
By the way, if any of this already works this way and I am confused, please enlighten me.
I develop software in C++, C# and Python. All three languages have package managers that handle compilation and delivery of binaries. I can force them to compile from source when I do not trust binaries created by someone else. Recompiling is expensive in terms of time.
Conan, a package manager for C++, uses hashes of source code and packaged binaries for verifying integrity. I am of the opinion that even the most clever systems for maintaining integrity can be broken. I have no idea how AI fits into the problem of package management and trust.
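To make the hash idea concrete, here is a minimal sketch of checking a downloaded file against a published hash. It is not Conan's actual mechanism or API; the function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in fixed-size chunks so large binaries do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """True if the file's digest equals the published one (constant-time compare)."""
    return hmac.compare_digest(sha256_of_file(path), published_hex.lower())
```

A matching hash only shows that the file is the one the publisher hashed. It says nothing about whether that publisher built it from the audited source, which is the trust gap this thread is about.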
An AI to compile any repository sounds nice. I am the go-to build engineer on my current team. We have four projects with slightly different build processes. I wrote the CMake and Python to meet the needs of the developers. Some want flattened include hierarchies, others want hidden headers, and so on. The continuous integration is the same, however, so maybe we can standardize the DevOps work. I assume continuous delivery is where the AI would live. I am wary of taking control of the build process away from software developers.
Your insights as a software developer are truly valuable. Thank you for explaining.
I agree with your points on the complexities of the build process and the potential pitfalls of taking control away from developers. However, the goal is not to replace the role of developers but to provide additional transparency for those lacking technical expertise. An AI could assist in clarifying this process, and while trust is a wider issue, AI could help in verifying package integrity. The idea is to automate and standardize some aspects of the build process, not to diminish developer control. As technology advances, it's an idea worth exploring.
Installing F-Droid is spooky. I like the alleged functionality, but I am not certain that the binary running on my device was built from the published source code. I also want better guarantees of integrity from F-Droid.
My software developer tendencies are itching. I will pitch some bad ideas on verifying integrity and creating trust.
The initially proposed AI could be a federation of build servers. Each build server compiles the source code and publishes a hash of the resulting binary. A hash that shows up more frequently implies that more of the federation produced the same binary. Bad binaries presenting a different hash could be filtered out by the consumer based on consensus.
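A rough sketch of the tallying step, assuming builds are reproducible (honest servers compiling the same source produce byte-identical binaries, otherwise their hashes would never agree). The server names and the short hash strings are made up for illustration.

```python
from collections import Counter


def tally_hashes(reports):
    """Count how many build servers reported each binary hash.

    `reports` maps a (hypothetical) server name to the hex hash it produced.
    Returns (hash, count) pairs, most frequent first.
    """
    return Counter(reports.values()).most_common()


reports = {
    "server-a": "aa11",
    "server-b": "aa11",
    "server-c": "aa11",
    "server-d": "ff00",  # the odd one out: a bad build, or a non-reproducible one
}
ranking = tally_hashes(reports)
print(ranking)  # [('aa11', 3), ('ff00', 1)]
```

The ranking is only evidence, not proof: it shows which binary most servers agree on, and the consumer still has to decide how much agreement is enough.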
I am hesitant to make an AI-level decision like dropping less frequent hashes from consumers entirely. The possibility of the more frequent hashes being incorrect is worrying. A drawback is the lack of automation: the consumer is forced to choose a hash. Maybe the consumer can choose settings to make an AI-like decision to always accept the most frequent hash. That decision would be opt-in.
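The opt-in setting could look something like the sketch below. The names `choose_hash`, `auto_accept` and `min_share` are hypothetical, and the `min_share` threshold is an extra safeguard I am assuming here, not part of the idea above. Setting `min_share=0` gives the plain "always accept the most frequent hash" behavior.

```python
def choose_hash(counts, auto_accept=False, min_share=0.5):
    """Pick a hash from `counts` (hash -> number of servers reporting it).

    Returns the chosen hash, or None when the consumer must decide manually.
    Automation is off unless the consumer opts in with auto_accept=True.
    """
    if not auto_accept or not counts:
        return None
    total = sum(counts.values())
    best_hash, best_count = max(counts.items(), key=lambda item: item[1])
    # Refuse to automate when the leading hash does not exceed the required share.
    if best_count / total <= min_share:
        return None
    return best_hash
```

With the default settings nothing is chosen automatically, so every hash stays visible to the consumer. Even with the setting on, a tie or a weak majority falls back to a manual choice.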