China training AI to censor ‘sensitive information’

China has reportedly introduced a censorship system powered by artificial intelligence (AI) designed to scan and detect sensitive content. Some experts suggest that this system aims to monitor social discontent and negative political criticism of the government, highlighting an instance of an authoritarian regime using cutting-edge AI technology to suppress public dissent.

On 26 March, TechCrunch reported that researcher NetAskari had discovered an Elasticsearch database on a Baidu server with no security settings. The dataset consists of over 133,000 text-based items used to train a large language model (LLM) to automatically issue a warning if it detects any content deemed sensitive by the Chinese government. The operation seems to have continued until very recently, with the latest entry being made in December 2024.

The topics covered in the dataset vary widely in terms of their sensitivity, with items ranging from labour controversies to food safety disputes to issues concerning Taiwan, all deemed as “top priority”. The targeted content types also vary, including, but not limited to, idioms and satirical jokes about politics.

For example, the keyword “Taiwan (台湾)” was found more than 15,000 times in the database, covering commentaries and information about the country’s military strength. The detection process was also context-based, flagging subtle criticisms, such as the phrase “When the tree falls, the monkeys scatter,” which was used to criticise the transient nature of the regime. The system’s design bears similarities to how people input prompts into ChatGPT, incorporating prompt tokens and LLMs.

What Is This For?

While the database doesn’t contain any information about its creator, it states that it is intended for “public opinion work”, a phrase suggesting that the Chinese government is behind the operation. Michael Caster, Asia programme manager at human rights organisation Article 19, was quoted as saying that such “public opinion work” is overseen by the Cyberspace Administration of China (CAC) and generally refers to censorship and propaganda efforts. In 2016, President Xi Jinping noted that the internet is the “frontline” of the Chinese Communist Party’s “public opinion work”.

This is not the first time the government has been accused of involvement in the AI industry to censor certain content. For example, when the Chinese chatbot DeepSeek was launched earlier this year, numerous reports indicated that the software does not respond to questions or prompts relating to politically sensitive subjects, such as the Tiananmen Square Massacre of 1989, the Hong Kong protests of 2019, and criticisms of Xi. Instead, it would reply: “Not sure how to approach this type of question.”

The ruling party in mainland China controls the information that is released to the public in order to maintain the strength and sustainability of its regime by suppressing dissent. Technology firms based in the country, like DeepSeek, have no choice but to comply with such rules. However, many experts warn that this could create an unfavourable environment in which the public’s free thoughts and opinions are restricted and people are exposed only to censored information.

As for the database found on Baidu, the government has not confirmed its affiliation or purpose. The country’s embassy in Washington, D.C., said in a statement to TechCrunch that it opposes “groundless attacks and slanders against China” and emphasised its ongoing commitment to ensuring AI tools are ethical.

Sunny Um is a Seoul-based journalist working with 4i Magazine. She writes and talks about policies, business updates, and social issues around the Korean tech industry. She is best known for in-depth explanations of local issues for readers who need a better understanding of the Korean context. Sunny’s work has appeared in prominent Korean news outlets, such as the Korea Times and Wired Korea. She currently makes regular writing contributions to newsrooms worldwide, such as Maritime Fairtrade, a non-profit media organization based in Singapore. She also works as a content strategist at 1021 Creative. She holds a Master’s degree in Political Economy from King’s College London and loves to follow news of Korean politics and the economy when she’s not writing.