cross-posted from: https://lemmy.sdf.org/post/51189959

By comparing LLMs developed in China with those developed elsewhere, a study finds significantly higher levels of censorship in the China-originating models, which is not explained by technological limitations or market preferences.

Original report: Political censorship in large language models originating from China (open access)

[…]

Jennifer Pan and Xu Xu compared the responses of foundation LLMs developed in China (BaiChuan, ChatGLM, Ernie Bot, and DeepSeek) to those developed outside of China (Llama2, Llama2-uncensored, GPT3.5, GPT4, and GPT4o) to 145 questions related to Chinese politics. The questions were sourced from events censored by the Chinese government on social media, events covered in Human Rights Watch China reports, and Chinese-language Wikipedia pages that were individually blocked by the Chinese government before the entire site was banned in 2015.

Chinese models were significantly and substantially more likely to refuse to respond to questions related to Chinese politics than non-Chinese models. When they did respond, Chinese models provided shorter responses, on average, than non-Chinese models. Chinese models also tended to have higher levels of inaccuracy in their responses than non-Chinese models, characterized by refutation of the premise of the question, omitting key information, or fabrication, such as claiming that frequently imprisoned human rights activist Liu Xiaobo was “a Japanese scientist.”
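
For a concrete sense of how refusal rates and response lengths like these could be measured, here is a minimal sketch. It is not the authors’ pipeline: the `ask_model` helper, the refusal markers, and the question list are all hypothetical stand-ins.

```python
# Hypothetical scoring sketch; none of this is from the paper's actual code.
# `ask_model` is a placeholder for whatever API each chatbot exposes, and the
# refusal markers are illustrative guesses at refusal phrasings.

REFUSAL_MARKERS = [
    "I cannot answer",   # assumed English refusal phrasing
    "无法回答",           # assumed simplified-Chinese refusal phrasing
]

def ask_model(model_name: str, question: str) -> str:
    """Placeholder: send `question` to chatbot `model_name`, return its reply."""
    raise NotImplementedError

def score_model(model_name: str, questions: list[str]) -> dict:
    """Compute the refusal rate and average length of non-refused replies."""
    refusals = 0
    lengths = []
    for q in questions:
        reply = ask_model(model_name, q)
        if any(marker in reply for marker in REFUSAL_MARKERS):
            refusals += 1
        else:
            lengths.append(len(reply))
    return {
        "refusal_rate": refusals / len(questions),
        "avg_reply_length": sum(lengths) / len(lengths) if lengths else 0.0,
    }
```

Running `score_model` over the same 145 questions for every chatbot would yield per-model numbers comparable in spirit to the refusal-rate and response-length measures the study reports.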

[…]

The differences between Chinese and non-Chinese chatbots could have been due to the training data that shapes them, which in China is subject to both official government censorship and self-censorship, or to intentional constraints that companies place on their models to comply with government requirements. The researchers found that, within models, the gap in censorious responses between prompts in simplified Chinese and prompts in English is much smaller than the gap between China-originating and non-China-originating models, suggesting that the issue cannot be fully explained by training data or broader model development choices alone.

[…]

According to the authors, as Chinese LLMs are increasingly integrated into applications used globally, their approach to sensitive topics could influence information access and discourse well beyond China’s borders.

[…]

  • BrikoX@lemmy.zipM · 9 hours ago

    I don’t think these types of comparisons achieve anything of value. It really feels like the study is trying to prove the desired result instead of testing a blind hypothesis.

    A more realistic comparison would be US-developed models’ levels of censorship on United States politics vs. China-developed models’ levels of censorship on Chinese politics.

    • HotznplotznOP · 8 hours ago

      I disagree. It just depends on what you want to analyze.

      This is just another study that proves Chinese censorship in LLMs. There’s ample evidence.

      The US or anyone else may also censor (if the US hasn’t done so already, I wouldn’t be surprised if it does in the future), but this isn’t an excuse for China.

      • BrikoX@lemmy.zipM · 7 hours ago

        This is just another study that proves Chinese censorship in LLMs. There’s ample evidence.

        Right, it’s well known that authoritarian China censors what it considers “sensitive topics”. So instead of another study restating what is already established by laws anyone can read, a more useful study would be a comparison of how different authoritarian countries approach the same issue.

        There is also ample evidence of the US government, through private or public pressure, getting US companies to self-censor their models around political or “national security” topics.

        • HotznplotznOP · 7 hours ago

          … a more useful study …

          What is a ‘more useful study’? The researchers tested a hypothesis, and the result is clear.

          There are many other studies. A comparison of how different authoritarian countries approach this issue would also be very interesting, but this is absolutely valuable research imo.

          • BrikoX@lemmy.zipM · 7 hours ago

            <…> but this is absolutely valuable research imo.

            How is stating the obvious for the millionth time, something already backed by multiple other studies, of any value? Studies are not free, and this is just a waste of money.