A lawsuit claims OpenAI stole 'massive amounts of personal data,' including medical records and information about children, to train ChatGPT

L4sBot · 1 year ago


tal · edit-2 1 year ago

When they train a neural net, all the data it has ever seen is, to some degree, an input in generating an output, because every input contributes to some degree to the edge weights. So the answer is “everything I’ve ever seen”.
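To make that concrete, here is a toy sketch. It is a single linear “neuron” trained by plain gradient descent, nothing like how ChatGPT is actually trained, and the data and the names `train` and `influence` are made up for illustration. It retrains with each example left out and measures how far the final weights move.

```python
# Toy illustration only: one linear "neuron", nothing like a real LLM.

def train(examples, epochs=200, lr=0.05):
    """Fit y ~ w*x + b by per-example gradient descent and return (w, b)."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in examples:
            err = (w * x + b) - y   # prediction error on this one example
            w -= lr * err * x       # every example nudges the weight...
            b -= lr * err           # ...and the bias, a little
    return w, b


def influence(examples):
    """For each example, how far the final (w, b) moves if it is left out."""
    w_full, b_full = train(examples)
    shifts = []
    for i in range(len(examples)):
        w, b = train(examples[:i] + examples[i + 1:])
        shifts.append(abs(w - w_full) + abs(b - b_full))
    return shifts


# Made-up data, roughly y = x with some noise.
data = [(0.0, 0.3), (1.0, 0.9), (2.0, 2.2), (3.0, 2.8)]
shifts = influence(data)
for (x, y), s in zip(data, shifts):
    print(f"leaving out ({x}, {y}) shifts the weights by {s:.4f}")
```

With this data, every entry in `shifts` should come out nonzero: no example can be dropped without changing the result, so there is no short list of “the inputs that mattered”.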

You are capable of learning higher-level structures and reasoning, and could form distinct memories and associate some memories with those higher-level structures, so in some cases you could remember and name an event that let you build up a piece of reasoning.

So, if you were asked “why did you ground yourself before touching that circuit board”, you might say “well, when I was an undergrad, I fried a RAM chip by touching it without grounding myself”.

The generative AIs out now are too primitive to do that, and they don’t reason like that. There’s no logic being learned in the way you’re thinking of. I guess the closest analog would be if your eyeballs were just wired directly to your cerebellum and had enormous numbers of pictures of flowers flashed at you, each with the word “flower” being said, and then someone said “flower” and recorded the kind of aggregate image that came to mind. All flowers contribute a bit to that aggregate image, but AI-you isn’t remembering distinct events and isn’t capable of forming logical structures and walking through a thought process.

In fact, if generative AIs could do that, we’d have solved a lot of the things that we want AIs to be able to do and can’t today.

And even a human couldn’t do that. If I said “think of a girl” to human-you, you might think of a specific girl, or you might remember a handful of the individual girls you have seen in life and think that your abstract girl looks more like one than another. But that’s still not listing all of the countless girls you have seen that have affected your mental image.

There will probably come a point where we build AIs that can remember specific events (though they won’t have a unique memory of every event they’ve seen, since a lot of what intelligence does is choose “important” data and throw out the rest, and AIs don’t record everything they experience in full any more than you do). And if they could learn to reason, then they might be able to point to specific events. They might misremember, just like a human could, but they could do something analogous to what a human does: remember some events, form logical thought processes, and try to come up with some kind of explanation for which events were associated with a given thought process. But all that is off in a future where we build something much closer to being as capable as a human.

@nH95sp@lemmy.world · 1 year ago

Right, and I suppose if you still tried to charge for use of references to source data, it would then be a weird slippery slope of weighting for which source data the AI was trained on first. How would you, say, bill for references to a circuit board if it was trained on things like dictionaries that include “circuit board” as well as, of course, more direct references to circuit boards in tech?

Guess it could be some weird percentage, but I don’t think I would welcome that reality.