• utopiah@lemmy.ml · 3 days ago

    Looks like another of those “Asked AI to find X. AI does find X as requested. Claims that the AI autonomously found X.”

    I mean… the program literally does what it was asked to do, and its training data includes examples related to the request.

    Shocked Pikachu face? Really?

    • Revan343@lemmy.ca · 3 days ago

      The shock is that it was successful in finding a vulnerability not already known to the researcher, at a time when LLMs aren’t exactly known for reliability.

      • utopiah@lemmy.ml · 3 days ago

        Maybe I misunderstood, but while the vulnerability itself was unknown to them, the class of vulnerability, let’s say “bugs like that”, is well known and published by the security community, isn’t it?

        My point being that if it’s previously unknown and reproducible (not just “luck”), that’s major; if it’s well known in other projects, even if unknown to this specific user, then it’s unsurprising.
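
        To make “bugs like that” concrete, here’s a toy illustration of my own (not the bug from the article): an out-of-bounds write via an unchecked strcpy, a vulnerability class (CWE-787 / CWE-120) the security community has documented for decades.

            /* Toy example of a long-published vulnerability class
             * (CWE-787, out-of-bounds write). Purely illustrative,
             * not the bug the researchers found. */
            #include <stdio.h>
            #include <string.h>

            int main(void) {
                char buf[8];                            /* fixed-size destination */
                const char *input = "0123456789ABCDEF"; /* pretend this is attacker-controlled */

                /* Classic mistake: copying without checking the destination size.
                 * strcpy writes past the end of buf. */
                strcpy(buf, input);

                printf("%s\n", buf);
                return 0;
            }

        Plenty of off-the-shelf tools (compiler warnings, static analyzers, fuzzers) already flag this exact pattern, which is why the distinction between a new class and a new instance of a known class matters.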

        Edit: I’m not a security researcher, but I believe there are already a lot of tools doing static and dynamic analysis. IMHO it’d be helpful to know how those perform compared to the LLMs used here, namely across which dimensions (reliability, speed, coverage e.g. exotic programming languages, accuracy of reporting e.g. hallucinations, computational complexity and thus energy cost, openness, etc.) each solution is better or worse than the other. I’m always wary of “ex nihilo” demonstrations. Apologies if there is a benchmark against existing tools and I missed it.