cross-posted from: https://lemmy.sdf.org/post/51189959

By comparing LLMs developed inside and outside China, a study finds significantly higher levels of censorship in China-originating models that cannot be explained by technological limitations or market preferences.

Original report: Political censorship in large language models originating from China (open access)

[…]

Jennifer Pan and Xu Xu compared the responses of foundation LLMs developed in China (BaiChuan, ChatGLM, Ernie Bot, and DeepSeek) to those developed outside of China (Llama2, Llama2-uncensored, GPT3.5, GPT4, and GPT4o) to 145 questions related to Chinese politics. The questions were sourced from events censored by the Chinese government on social media, events covered in Human Rights Watch China reports, and Chinese-language Wikipedia pages that were individually blocked by the Chinese government before the entire site was banned in 2015.

Chinese models were significantly and substantially more likely to refuse to respond to questions related to Chinese politics than non-Chinese models. When they did respond, Chinese models provided shorter responses, on average, than non-Chinese models. Chinese models also tended to have higher levels of inaccuracy in their responses than non-Chinese models, characterized by refutation of the premise of the question, omitting key information, or fabrication, such as claiming that frequently imprisoned human rights activist Liu Xiaobo was “a Japanese scientist.”
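The metrics above (refusal rate and mean response length) can be sketched in a few lines. This is a minimal illustration, not the authors' actual pipeline: the refusal markers, function names, and sample responses are all hypothetical assumptions, and a real study would use far more robust refusal classification than keyword matching.

```python
# Hypothetical sketch of comparing refusal rates and response lengths
# between two groups of model outputs. The markers and sample data are
# illustrative assumptions, not the study's actual method.

# Crude keyword list for detecting a refusal (assumption for this sketch).
REFUSAL_MARKERS = ("i cannot", "i can't", "unable to answer", "无法回答")


def is_refusal(response: str) -> bool:
    """Keyword check for whether a response declines to answer."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def summarize(responses: list[str]) -> tuple[float, float]:
    """Return (refusal_rate, mean_length_of_non_refusal_responses)."""
    answered = [r for r in responses if not is_refusal(r)]
    rate = (len(responses) - len(answered)) / len(responses) if responses else 0.0
    mean_len = sum(len(r) for r in answered) / len(answered) if answered else 0.0
    return rate, mean_len


if __name__ == "__main__":
    # Toy example: one group refuses half the time, the other answers fully.
    group_a = ["I cannot answer that question.", "The event took place in 1989."]
    group_b = ["The protests began in April and involved students and workers."]
    print(summarize(group_a))
    print(summarize(group_b))
```

Comparing the two tuples across model groups would surface the pattern the study reports: higher refusal rates and shorter answered responses in one group than the other.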

[…]

The differences between Chinese and non-Chinese chatbots could have been due to the training data that shapes them, which in China is subject to both official government censorship and self-censorship, or to intentional constraints that companies place on their models to comply with government requirements. The researchers found that the gap in censorious responses between prompts in simplified Chinese and prompts in English is much smaller than the gap between China-originating and non-China-originating models, suggesting that training data and broader model development choices alone cannot fully explain the difference.

[…]

According to the authors, as Chinese LLMs are increasingly integrated into applications used globally, their approach to sensitive topics could influence information access and discourse well beyond China’s borders.

[…]