• 8uurg@lemmy.world
    link
    fedilink
    English
    arrow-up
    4
    arrow-down
    1
    ·
    3 days ago

    Yes, true, but that is assuming:

    1. Any potential future improvement solely comes from ingesting more useful data.
    2. That the amount of data produced is not ever increasing (even excluding AI slop).
    3. No (new) techniques that makes it more efficient in terms of data required to train are published or engineered.
    4. No (new) techniques that improve reliability are used, e.g. by specializing it for code auditing specifically.

    What the author of the blogpost has shown is that it can find useful issues even now. If you apply this to a codebase, have a human categorize issues by real / fake, and train the thing to make it more likely to generate real issues and less likely to generate false positives, it could still be improved specifically for this application. That does not require nearly as much data as general improvements.

    While I agree that improvements are not a given, I wouldn’t assume that it could never happen anymore. Despite these companies effectively exhausting all of the text on the internet, currently improvements are still being made left-right-and-center. If the many billions they are spending improve these models such that we have a fancy new tool for ensuring our software is more safe and secure: great! If it ends up being an endless money pit, and nothing ever comes from it, oh well. I’ll just wait-and-see which of the two will be the case.