The Words TikTok Parent ByteDance May Be Watching You Say

Forbes obtained a trove of internal documents showing how ByteDance tracks “sensitive words” discussed on its social media apps. Hundreds of vocabulary lists housed in the company’s “detection system” illustrate the range of political, social and cultural issues that the Chinese giant is monitoring or suppressing.

By Alexandra S. Levine, Forbes Staff


Most major social media companies have rules and tools that keep an eye on what people on their apps can and cannot see. Generally, they’re in place to track various types of content and keep harmful or illegal material off their platforms, like posts dealing with terrorism, abuse and suicide.

TikTok parent ByteDance is no different. But the Chinese company’s content moderation and monitoring appears to go far beyond what’s standard among American peers like Alphabet-owned YouTube and Meta-owned Instagram.

A Forbes investigation into TikTok and ByteDance revealed that a ByteDance system, run by staff in China, is monitoring mentions of what it considers “sensitive words” across the company’s products. In some cases, where words are marked “must kill,” “forbidden” or “prohibited,” ByteDance may be blocking related posts altogether. The system may also be aiming to track each time one of those words comes up, recording who said it and where they’re located, including for people in the United States.

The Chinese government has targeted people in the U.S. who’ve spoken out against it online, but experts have warned that the average American may also be naive about how far their words can travel on the internet, who may be watching and the potential consequences.

ByteDance’s library of “sensitive words,” which are organized into hundreds of vocabulary lists, “is proof-positive that there are specific things that they are interested in and they want to monitor who was saying them, when and how often,” William Evanina, the former head of counterintelligence for the U.S. government, told Forbes.

“They’re not just collecting it for collection’s sake,” he added.

Forbes obtained internal documents showing hundreds of the word lists housed in this “detection system” and is publishing them in full below. While this collection of word lists is not exhaustive and there are undoubtedly more in ByteDance’s system, they illustrate the range of political, social and cultural issues that ByteDance is keeping an eye on or suppressing.

ByteDance declined to comment on what these hundreds of lists mean, how or where they are applied, and who created them. Spokesperson Jennifer Banks said only that “there are separate and specific keyword tools used for different products, each with strict permissions and access controls that allow a select few people within each product or platform to manage or add to them.”

TikTok spokesperson Jamie Favazza said that “we believe many of these list titles have translation errors and are not applicable to TikTok.” (More than 50 that she confirmed are used on the platform, for safety purposes, are noted below.) “Regardless of wordlist names, TikTok’s keyword platform operates separately from Douyin’s and other China market products, with separate code, separate databases, and is maintained by separate personnel.”

“As a responsible platform, TikTok uses wordlists to help protect our community from hate speech, misinformation, and other harmful content,” Favazza added after publication. “The majority of the list names Forbes provided to us are not used on TikTok, and anyone can easily see content about those topics is available on TikTok through a simple search on the app.”

[Editorial note: A handful of list names were partially cut off in documentation reviewed by Forbes. In those cases, numbers were omitted or hyphens have been added.]


CHINA & POLITICS

Lists about Chinese power or culture

At TikTok CEO Shou Zi Chew’s first-ever testimony before Congress, he told lawmakers under oath that “we do not promote or remove content at the request of the Chinese government.” Word lists in ByteDance’s “sensitive words” system appear to deal with content that Beijing would likely disapprove of, including language critical of China’s government, military, leaders and history. TikTok denied any of the below have ever been used on TikTok.

2346-Tiktok IM Party and Government Negative Word-

Party, government and military negative core words

504 – negative core words of the party, government

505-Name of Government Agency

777-Party and Government Negative Words List

978-Special prohibited words for Xi and Peng

773-Xi Peng variant vocabulary

774-Mao Deng Jianghu and its variants

975-Leader Decision Targeted Prohibited Words

981 – Leader Relatives Orientation Prohibited Words

2391-FS Project-Xi Peng-Impartial & Detrimental Wrong

2392-FS Project – Party, Government and Military Relat-

1673-FS Project – Universal (Core Leader/Kind Variant)

1255-FS Project – Fraud Person Notice checklist

2389-FS Project- June 4th Linked-Detrimental Wrong

956-June 4 must-kill glossary

1312-Vertical June 4th Particular Overview Speech

1482-Falun Gong

2535-Thematic Arrangement for China’s Strategic Coverage


Lists that mention “Taiwan” or “Hong Kong”

Taiwan and Hong Kong have long represented two of the greatest threats to the Chinese Communist Party and authoritarian rule because they enjoy considerable autonomy from mainland China and are democratic in their ideals. Under Xi Jinping, Hong Kong has seen a fierce crackdown on pro-democracy activists and other challengers of his regime, prompting anti-government protests in recent years. Taiwan’s own push for independence has repeatedly triggered retaliation from the Chinese military (most recently, in April, after Taiwanese and American leaders met on U.S. soil). The island has long been at the center of a geopolitical tug-of-war between the U.S. and China, and some fear it could be at the center of a war between the superpowers in the future. TikTok denied any of the below have ever been used on TikTok.

3034-Local Life-Taiwan Independence & Hong Kong Independence & Tibet Independence & Xinjiang Independence

1140-Hong Kong-related special vocabulary for live bro-

1861-Live China and Taiwan Title Process Vocabulary

2284-High-risk phrases in Taiwan emergency queue

2180-Prohibited artist words in music in Taiwan

2120-Taiwan’s emergency queue in the live broadcast

2220-Taiwan Blind Box Kaiping Comment Blocking Voc-

2642-High-risk Adjuvant Vocabulary for Taiwan Emerg-

2753-Taiwan-related theming (title)

2758-Taiwan-related theming (comment)

2758-Taiwan-related theming (review)

2776-im Taiwan sensitive word test



Lists about geopolitics

As tensions escalate between the U.S. and China and other authoritarian adversaries, some lists in the system could shape global discourse around American politics, U.S.-China relations and war in Ukraine and Russia or other parts of the world. TikTok denied any of the below have ever been used on TikTok.

977-Trump Directed Prohibited Words

982-Sino-US trade directional prohibited words

508-North Korea-related core words

976-Putin Directed Prohibited Words

2350-Interactive Russian-Ukrainian Impress Non everlasting Vo-

623 – Recent Leader Words

1632-G Leaders’ Particular Retracement Speech

2007-G-Competitor block glossary

503-separatist forces core words

2749-LS Theming – Coup, War (title)

2754-LS Theming – Coup, War (Review)

3253-im Deepest Chat Politics-connected Vigorous Adjust


MARGINALIZED GROUPS

Lists that mention “Tibet”

Tibet, governed as an autonomous region of China, is similarly seen as a threat by the Chinese government. Beijing’s persecution of Tibetans, and efforts to quash their political, religious and cultural freedoms, is well documented by the State Department and human rights groups. That has included the jailing of supposed dissidents (everyone from teachers to musicians to religious leaders) and restrictions on freedom of expression in the media and online. TikTok denied any of the below have ever been used on TikTok.

2553-Tiktok audio sensitive words in Tibet region

2553-Audio Sensitive Words in Douyin Tibet Region (P-

2471-Tibetan Blocked Words for Douyin Comments

1486-Toutiao Tibetan Poetry Possibility Vocabulary

3034-Local Life-Taiwan Independence & Hong Kong Independence & Tibet Independence & Xinjiang Independence


Lists that mention “Uyghur” (also spelled “Uighur”) or “Xinjiang”

Uyghurs, an ethnic minority living primarily in China’s Xinjiang region, have been the victim of Chinese genocide. China has in recent years built a sprawling operation of internment camps and fortified detention centers across Xinjiang, where Uyghurs and other Muslim minorities have been subject to torture and numerous human rights abuses. The U.S. government has labeled the yearslong escalation a “genocide,” while the U.N. Human Rights Office has in the last year described violations as likely “crimes against humanity.” Reports have also shown a rise in forced marriages between Uyghur women and Han men (China’s ethnic majority), calling these partnerships “forms of gender-based crimes that violate international human rights standards and further the ongoing genocide.”

TikTok said none of the below word lists referencing ‘Uyghur’ are used on TikTok. “TikTok’s policies prohibit claims that Uyghur camps in China do not exist or are fake,” Favazza said in a statement. “However, content that is educational or raises awareness about Uyghur camps is allowed on TikTok. One of the ways we enforce this policy is through keywords.”

2470-Tiktok comment Uighur blocked word

2528 – Theming Suggestions of Uyghur-Han Couples (Tit-

2540-Uyghur and Han Couples Theming Suggestions

2798-IM Uyghur personal letter overview glossary

3245-Uighur Audio Review Vocabulary-Uyghur

2103-Special vocabulary related to Xinjiang

3244-Xinjiang Region Audio Review Vocabulary

718-Sensitive words in Douyin videos in Xinjiang (prelim-

3034-Local Life-Taiwan Independence & Hong Kong Independence & Tibet Independence & Xinjiang Independence


SCIENCE & CULTURE

Lists about science and medicine

Some lists in the system appear to monitor conversations around China and the Covid-19 pandemic. They seem to reference one epicenter of the outbreak, the east China city of Putian, as well as a “leaked experiment” and pangolins, a species of mammal that early on was rumored to be responsible for spreading the coronavirus from animals to humans. TikTok denied any of the below have ever been used on TikTok.

2894-2196ab Leaked Experiment Vocabulary

2895-1880ab Vocabulary for Lacking Experiments

2896-2561ab Omission Experiment Vocabulary

2897-2560ab Leaked Experiment Vocabulary

2898-1168ab Leaked Experiment Vocabulary

2715-Pangolin Title Sensitive Vocabulary

1532-Putian Hospital Vocabulary

2654-Medical Special Review Vocabulary

3031-Medical Related

3035-ugc Medical Audit Highlights

3041-Medical

3283-Medical ASR black word

3285-Medical Title Black Thesaurus



Lists about global culture

Some lists in the system appear to be concerned with free expression in autonomous parts of China and beyond, including the West. They cover everything from music, poetry and books to sports leagues and pro athletes (the NBA, World Cup and soccer player Mesut Özil). They also touch on the stock market, real estate and foreign languages and cultures. TikTok denied any of the below have ever been used on TikTok.

1539-Words banned from the media (diversified)

2180-Prohibited artist words in music in Taiwan

1486-Toutiao Tibetan Poetry Possibility Vocabulary

997-Spirit Canine Added Poetry Vocabulary

1983-FS Project-Particular “Night Watchman” Vocabulary

2288-Fizzo Erotica List

1808-NBA Copyright Word list

3481-Qatar World Cup Genuine Cooperation Exemption

3487-Qatar World Cup Attribute Words

2778 – Ozil Theming Arrangement (Title)

2779-Ozil Theming Suggestions (Overview)

1582-hebrew_sensitive_text

2538-Islamic Theming Vocabulary (Overview)

2526-Backpack-Islamic Theming Notice checklist (Title)

517-regional crew dusky

2765 – Sicilian Trade Vocabulary

983-Stock Market Orientation Prohibited Words

1802-Specially sensitive words in real estate content in-

1803-Specially sensitive words in real estate content in-


COMPANIES

Lists about ByteDance rivals

Several lists seen by Forbes concern TikTok’s biggest competitor in the U.S., YouTube, as well as products from ByteDance’s fiercest challengers in China, including Alibaba’s cloud system Aliyun and Tencent’s messaging app WeChat. TikTok denied any of the below have ever been used on TikTok.

1618-YouTube Home Surveillance

1710-TikTok-Twitter public conception keywords

476-aliyun_sensitive_test

Wechat commercial control words

1058-Cruise chat crew search block

1060-Cruise chat user search blockading

1061-Search blockading for users in Feichat crew

1123-Flychat file establish forbidden words


Lists that mention “Douyin” and other ByteDance products or services

While some of the ByteDance word lists directly cite TikTok, others explicitly mention the Chinese version of the app, Douyin, which is heavily censored by the Chinese government. The lists also reference ByteDance’s news service Toutiao, music-streaming platform Resso and office software Lark (known as Feishu in China), among other products past and present. TikTok denied any of the below have ever been used on TikTok.

718-Sensitive words in Douyin videos in Xinjiang (prelim-

2471-Tibetan Blocked Words for Douyin Comments

2553-Audio Sensitive Words in Douyin Tibet Region (P-

1486-Toutiao Tibetan Poetry Possibility Vocabulary

2817-Toutiao’s private letter abuse risk warning words

1593-Qingbei Online School User Feedback Keywords

1599-Dali Desk Lamp User Feedback Keywords

1830-Xingfuli Agent Questions and Answers Sensitive

1592-Guagualong Enlightenment User Feedback Keywords

1928-Resso-GP Feedback Matching Vocabulary

Testing Lark

858-Company Product Negative Sensitive Vocabulary



Lists that mention “TikTok” (also “TT”) or “U.S.”

Forbes found nearly 100 lists in the system with “TikTok” or “U.S.” in their name, some focused on language critical of the Chinese government or blocked speech about persecuted Uyghurs. TikTok denied that about half of them had ever been applied to its platform, suggesting many could be the result of translation errors from Chinese to English. (See below for those the company confirmed.) Favazza, the TikTok spokesperson, said access to TikTok’s word lists is controlled by TikTok’s trust and safety team and that any change to those lists goes through a U.S. team member. Internal materials show that employees in China are also among those managing some TikTok lists.

LISTS THAT TIKTOK SAID WERE NOT USED ON TIKTOK:

2346-Tiktok IM Party and Government Negative Word-

2553-Tiktok audio sensitive words in Tibet region

2470-Tiktok comment Uighur blocked word

TikTok Japanese Comments Suppress Words

1710-TikTok-Twitter public conception keywords

1565-Tiktok Govt Affairs Media Subject Vocabulary

1688-Tik Tok Burmese Vocabulary

1283-Tiktok pedophile special words

1420-Tiktok group chat pornographic blocking vocabulary

1426-Tik Tok Porn Drainage Recognition Vocabulary

2186-Tiktok Single Male & Superstar Themed Vocabulary

2271-Tiktok Sensitive Person Theme Vocabulary (Title)

2272-Tiktok Sensitive Person Themed Vocabulary (Co-

2780-Tiktok push-sensitive people/prohibited audio

1507-Tiktok pretending to be a celeb establish checklist

2231-Tik Tok-Info impersonation-Particular List Coverage P-

2232-Tik Tok-Info impersonation-Political Media Fable

2234-Tiktok-Info impersonation-Strategic safety

120-TikTok Private Message Sensitive Words

2746-Tiktok Xiaoan Private Message Sensitive Word Fil-

838-TikTok particular disclose—Comment overview first (p-

1585-TikTok Special Events-Comment Full Review Voc-

2109-Tiktok commentary bottom glossary

2548-TikTok Historical Nothingness Video Recall Vocab-

839-TIkTok Particular Match—Video First Overview (Prelim-

1576-TikTok Video Title No. 1 Itinerary Particular Vocabulary

2196-TikTok video title & nip overview + glossary unencumber

2416-Tiktok video title & nip unencumber + must-execute glossary

2561- TikTok prolonged video title & nip overview + glossary re-

2597-TikTok prolonged video title & nip unencumber + execute glossary

1807-TikTok Audio Overview Keyword Rapid



679-Tik Tok Hot Search Filter Words

84-Prohibited words for TikTok user nicknames

457-Tiktok company private letter sensitive words

635-User Feedback Filter Did/Uid-Tik Tok

946-Tiktok user nickname is just not counseled glossary

1075-User Feedback Filter Words-Tik Tok Red Packet

1159-Tiktok user feedback keywords

1403-TikTok POI Blocking Vocabulary (Process)

1455-Tiktok excessive-risk prohibited glossary (queue excessive-)

1490-Tiktok listing user strategies keywords

1542-TikTok-Tns user feedback keywords

1653-Tiktok Subject Graded Vocabulary (Private Review)

1798-TT Feedback Algorithm Matching Keywords

1820-Tiktok poi No. 1 itinerary special vocabulary

1843-Tiktok Live No. 1 itinerary special vocabulary

1981-Tik Tok Series Title Prohibited Words List (Co-

2139-Tiktok commercialization snapshot overview queue

2201-TikTok Particular Time Node Themed Vocabulary (T-

2202-TikTok Particular Time Node Themed Vocabulary-

2725-Vocabulary for TikTok particular dwelling titles to be rev-

2874-Tik Tok Emergency Response-Subject Computerized P-

(US 1707)DM – Grayscale test wordlist

LISTS THAT TIKTOK CONFIRMED ARE USED ON TIKTOK FOR SAFETY REASONS:

1327-(US 570) MT_User Profile Tier1 Wordlist

577-MT_Search filter Words (User)

1333-(US 577)MT_Search Tier1 Words (User)

579-MT_Search ban Words (Hashtag)

1335-(US 579) MT_Search ban Words (Hashtag)

908-Hatespeeech PSA Ban Search (All)

1344-(US 908) Abominate. PSA Ban Search (All)

909-Hatespeech PSA Ban Search (Song

1345-(US 909) Hates PSA Ban Search (Song)

932 – tiktok-m customized push sensitive vocabulary

990-Traditional PSA Ban Search List (All)

1348-(US 990) Traditional PSA Ban Search (ALL)

991-Traditional PSA Ban Search (Song)

1349-(US 991) Traditional PSA Ban Search (Song)

1041-Ban Search Suicide & Self Harm

1352-(US 1041) Ban Search Suicide Wordlist

1080-TT-Search Sug Auto-Remove Words

1353-(US 1080) Search Sug Sensitive Wordlist

1150-User Feedback-Overseas Bulk Answer

1243-MT-DM Ban Words List

1275-MT-Search Sug Whitelist

(US 1293) NLP Pressing Beef up Words List

1322-(US) musical.ly_emergency words

1417-ED Ban Search List

1481-(US 1417) ED Ban Search List

1548-(US)DM Adult wordlist-T1

1554-TT Anti-semitic Search Ban wordlist

2010-(US 1554) TT Anti-semitic Search Ban glossary

1700-TT Search Suicide T2 wordlist

1734-TT Distressing Search Reminder Wordlist

1737-(US 1734) TT Distressing Search Reminder Wordlist

1735-TT Terrible Search Reminder Wordlist

1738-(US 1735)TT Terrible Search Reminder Wordlist

1736-TT Anti wildlife trafficking Search wordlist

1739-(US 1736)TT Antiwildlife trafficking wordlist

1791-TT Project Survivor SearchBan wordlist

1792-(US 1791)TT Project Survivor Search Ban wordlist

1793-TT Sug Variants moderation note

1886-TT Search Help-Terrible challenges wordlist

1889-(US1886)-TT Search Help-Terrible challenges wordlist

1887-TikTok Search Help-SSH Hoaxes wordlist

1888-(US1887)TikTok Search Help-SSH Hoaxes wordlist

1975-TT Search Help-CSAM wordlist

1976-(US1975)TT Search Help-CSAM wordlist

2111-TT-Search Sug-Variant Seed Wordlist

(US 2193) TikTok counseled search inter-

2297-(US 2193)TikTok counseled search intervention wordlist

2291-SEA Personalized Candidate Words Black Words List

2292-TikTok Person Profile Enqueue Wordlist

TikTok Search Consequence Layered Intervention

TikTok Q&A question of automated takedown

MT_Comment Auto-remove words



Trust+Safety lists common to major social media companies

To protect social media users, most major social media platforms, including TikTok, have policies and tools aimed at filtering out content and language that is violent, obscene or illegal.

908-Hatespeeech PSA Ban Search (All)

1041-Ban Search Suicide & Self Harm

1432-Bullying/harassment

1433-Misinformation/Media manipulation

1434-Terrorists & criminal organizations

1439-Political

1623 – Children’s Soft Porn OCR Vocabulary

child porn_one_level

child porn_two_level

2282-tf-idft note v2 9014 harassment

2284-tf-idf note v2 0914 misinformation

2286-tf-idft note v2 0914 suicide

2289-tf idf note v2 09114 misinformation recent technique

771 – Full text review of false content

772 – False content title for review


Miscellaneous lists in the system

The meaning of many lists in the system is unclear, and some appear to be more technical or related to the basic functioning of ByteDance apps. Still, many of these lists contain words that are “forbidden,” “banned,” “high-risk” or “sensitive.”

1173-Shiny Monitoring Alarm Sensitive Vocabulary

1318-Forbidden glossary

1523-Forbidden Word List-BR

2589-Private letter abduction vocabulary list

2284-High-risk phrases for emergency queues in live b-

high risk words

1838-xxxx sensitive words

1147-SMS activates aloof words

1260-G-Global Distribution Nickname Sensitive

849-Search Operation Push Notice Vocabulary

Suggestions Develop – Access Celebration Diversion

816-Person Suggestions Filter Place-X Project

812-X Project Person ID Banned Words

815-X Project Vocabulary (Row-inducing words)

1179-DMT Digital Vocabulary

1512-mt_web

1513-mt_transitive

658-M_ban

M-Overview

3032 – Incorrect Marketing

3033 – Marketing Sense

3036 – test grayscale

3037-dry

2298-Live comment AB test

3276 – Test openAPI

peadar test 15test test test test test test

2285-tf-idf note v2 0914 ansa

This story has been updated to include additional comment from TikTok.

Emily Baker-White contributed reporting.

