has it been confirmed that deepseek was trained on chatgpt? openai was pushing this narrative pretty hard but unless there’s proof i’m not going to take them at their word
They’re claiming that DeepSeek used their API for distillation, and to be fair DeepSeek hasn’t denied this. Whether they did or not really doesn’t matter in my opinion. Watching OpenAI complain that a company did to them precisely what they themselves have been doing to everyone else is the hilarious part here. And amusingly, if they tried to go after DeepSeek legally in the US then they’d create precedent for everyone whose content they scraped to go after them.
it hasn’t been confirmed, but it makes sense. in one of my conversations with deepseek, it explicitly told me, unprompted, that it was “chatgpt” and it was trained by openai. people told me it was just “hallucinating”, but why would it hallucinate being trained specifically by openai? so yeah, they probably did the thing where they ask chatgpt questions and train their model on the answers
but as the saying goes, a thief who robs a thief gets 100 years of pardon
Even if DeepSeek didn’t explicitly train on ChatGPT output, there’s so much of it pasted across the internet that just by scraping the web, DeepSeek would’ve already picked it up.
This distilling capability is super powerful. If any cutting-edge model can simply be distilled, there is literally no moat around AI. If OpenAI comes out with GPT-infinity, in a few months everyone else will just distill out an equivalent model. That means AI costs (and profits!) should drop to zero over time due to competition.
This matches both Jeremy Rifkin’s concept of zero marginal cost, and Marx’s predictions of the falling rate of profit.
Watch this. https://youtu.be/yblat9IWPMo
The information wants to be free.