Linux distros ban 'tainted' AI-generated code — NetBSD and Gentoo lead the charge on forbidding AI-written code

The OpenAI logo is being displayed on a smartphone, with the Microsoft logo visible on the screen in the background, in this photo illustration taken in Brussels, Belgium
(Image credit: Getty Images)

Coming out of the Free and Open Source Software (FOSS) community this week, we have a Gentoo Wiki post and an update to NetBSD's commit guidelines that forbid or heavily restrict the use of AI-generated code in these open-source projects. Considering how controversial AI is and how often it produces solutions that don't quite work correctly (especially for programming tasks), there are plenty of practical reasons for these new policies.

In NetBSD's case, "code generated by a large language model or similar technology" is "presumed to be tainted code and must not be committed without prior written approval." So, while the policy may technically allow AI contributions in the future, they won't come without human oversight to ensure the code actually works.

Meanwhile, Gentoo Linux is more direct, banning the use of AI tools altogether when contributing to the Gentoo project. The Gentoo team cites copyright, quality, and ethical concerns as its reasoning for not allowing LLM-generated code into its operating system. In particular, its ethical concerns emphasize that commercial AI projects "are frequently indulging in blatant copyright violation to train their models," that their use of natural resources may be too severe, and that LLMs have been used to empower scammers.

In contrast to statements from the likes of Jensen Huang hailing AI and AI hardware as "the death of coding," these decisions from established FOSS projects show that the best results still require skilled work from individual humans, not machines.

Fortunately for critics of generative AI, it's slowly becoming easier to turn off these features where they are unwanted. For example, you can block the AI results from Google's search results to avoid getting potentially incorrect advice from a robot.

Elsewhere in the FOSS scene, the use of generative AI is still being debated; for example, there is an ongoing thread discussing the use of AI-generated content in the Debian Project. For the most part, though, it seems Debian is unwilling to put its foot down and outright ban generative AI from its project like Gentoo and NetBSD have, which could have interesting long-term implications if Debian winds up proving the dangers of letting an AI write an OS.

Freelance News Writer
  • thisisaname
Ban, or just covering themselves? When AI-generated code is submitted, it's the submitter who did the "bad thing", not them.
    Reply
  • bit_user
    The article said:
    For the most part, though, it seems like Debian is unwilling to put its foot down and outright ban generative AI from their project like Gentoo and NetBSD, which could have interesting long-term implications if Debian winds up proving the dangers of letting an AI write an OS.
Debian is a distro, not an OS. Anything they upstream to the Linux kernel would be subject to the Linux Foundation's policy. While distros sometimes maintain their own kernel patches, these tend to be fairly small and are probably things they pull from an upstream branch or that they subsequently try to upstream themselves. The main things Debian maintains are their packaging + related tools + packaging repos.

    If they had a blanket "no AI content" policy, then it could have implications on what sorts of software packages could be included in their package repos. Instead, the post linked by the article said:
"Apparently we are far from a consensus on an official Debian position regarding the use of generative AI as a whole in the project. We're therefore content to use the resources we have and let each team handle the content using their own criteria"

That's really not saying much, especially if you don't know what all the teams are and their individual policies.
    Reply
  • Findecanor
    NetBSD is not a Linux distribution. It is a different operating system: a Unix, with long lineage. It was forked from 386BSD in 1993.

Linux is a Unix clone... Actually "Linux" is only a kernel, but a distribution also contains a "userland" of libraries and programs to make it useful, often from GNU (hence the common "GNU/Linux" moniker).
    NetBSD includes its own userland.
    Reply
  • bit_user
    Findecanor said:
    NetBSD is not a Linux distribution. It is a different operating system: a Unix, with long lineage. It was forked from 386BSD in 1993.

    Linux is a Unix clone... Actually "Linux" is only a kernel,
If we're going to be pedantic, then Linux is not a UNIX clone! Linux inherits heavily from both UNIX and BSD, but has since developed many of its own innovations and APIs. At this point, the distinction between UNIX and BSD is pretty pointless, and Linux is pretty far beyond either. Linux is simply Linux.

    Findecanor said:
a distribution also contains a "userland" of libraries and programs to make it useful, often from GNU (hence the common "GNU/Linux" moniker).
    GNU contributes some core utilities, GCC, and glibc. By this point, there are non-GNU versions of pretty much all the GNU stuff, should anyone want a totally GNU-free Linux distro.

Also, GNU has its own kernel called Hurd, which has been limping along for something like 30 years and is, I think, still in a semi-working state.
    Reply
  • jackt
    good!
    Reply
  • Alvar "Miles" Udell
    Meanwhile a plethora of security vulnerabilities are patched every month by humans fixing bugs and critical issues in code created by humans, and those patches can, and often do, cause other issues from broken functionality to a bricked device.
    Reply
  • Sleepy_Hollowed
    Gentoo has got it right.

    Can't wait for the "AI" bubble to burst.
    Reply
  • CmdrShepard
    thisisaname said:
Ban, or just covering themselves? When AI-generated code is submitted, it's the submitter who did the "bad thing", not them.
    If they are the ones allowing the commit it is their fault if the tainted code ends up in their codebase.

The decision is sensible -- all code generated by AI is tainted because you can't prove it wasn't trained on copyrighted code. Better to be safe than sorry.
    Reply
  • CmdrShepard
    Alvar Miles Udell said:
    Meanwhile a plethora of security vulnerabilities are patched every month by humans fixing bugs and critical issues in code created by humans, and those patches can, and often do, cause other issues from broken functionality to a bricked device.
This is not about the quality but about the legality of "AI" contributions. If a model was trained on copyrighted code and/or code with a non-permissive license incompatible with whatever they need, then it's not legal to include it.
    Reply
  • thisisaname
    CmdrShepard said:
    If they are the ones allowing the commit it is their fault if the tainted code ends up in their codebase.

The decision is sensible -- all code generated by AI is tainted because you can't prove it wasn't trained on copyrighted code. Better to be safe than sorry.
The problem is, how do you spot this AI-generated code? All this ban does is tell people not to commit AI-generated code. It does not say how they will stop people from committing AI-generated code.
If AI-tainted code is submitted and later found out, they will blame the person who submitted it, with the line that they were told not to do it.
    Reply