• @VirtualOdour@sh.itjust.works
    English
    0
    9 months ago

    It’s a question based on a purposeful misunderstanding of the technology; it’s like expecting a beekeeper to know each bee’s name and bedtime. Really, it’s like asking a bricklayer where each brick in the pile came from. He can tell you the batch, but he’s not going to know that this brick came from the fourth row of the sixth pallet, two from the left. There is no reason to remember that; it’s not important to anyone.

    They don’t log it because it would take huge amounts of resources and gain nothing.

    • @zaphod@lemmy.ca
      English
      3
      edit-2
      9 months ago

      What?

      Compiling quality datasets is enormously challenging and labour-intensive. OpenAI absolutely knows the provenance of the data they train on, as it’s part of their secret sauce. And there’s no damn way their CTO won’t have a broad-strokes understanding of the origins of those datasets.