
KDD2019 Tutorial T11: Fake News Research: Theories, Detection Strategies, and Open Problems

Posted at 2019-08-07


T11: Fake News Research: Theories, Detection Strategies, and Open Problems
Reza Zafarani, Xinyi Zhou, Kai Shu, Huan Liu
Tutorial page: https://www.fake-news-tutorial.com/
Slide: https://docs.wixstatic.com/ugd/f31d05_9a90deb47a04427d98d35444f5b6fe45.pdf
(Reza Zafarani)


  • Research background
  • What is fake news
  • Related concepts
  • Fundamental theories

Research background

Fake news is now viewed as one of the greatest threats to democracy, justice, public trust, freedom of expression, journalism, and the economy.
Political aspect: fake news may have had an impact on the Brexit referendum
and the 2016 US presidential election.
Economic aspect: the fake story that "Barack Obama was injured in an explosion" wiped out $130 billion in stock value.

Social/psychological aspects
It is relatively easy for fake news to gain public trust:
Validity effect
Confirmation bias
Peer pressure
Why is fake news attracting more public attention recently?
It can be created faster and more cheaply.
The rise of social media
Social media accelerates the dissemination of fake news.

What is fake news, related concepts

Fake news is intentionally and verifiably false news.
Fake news
Authenticity: false, Intention: bad, News: yes
False news
Authenticity: false, Intention: unknown, News: yes

Fundamental theories.

Fundamental theories of human cognition and behavior, developed across disciplines such as philosophy, social science, and economics, provide invaluable insight for fake news studies.
Style-based fundamental theories
Propagation-based fundamental theories: studying fake news based on how it spreads.

"Fake news is incorrect but hard to correct"

User-based fundamental theories: studying fake news from the perspective of users: how users engage with fake news and the role they play (or can play)

Fake news detection (outline)

  • Knowledge-based fake news detection
  • Style-based fake news detection
  • Propagation-based fake news detection
  • Credibility-based fake news detection
  • Fake-news datasets & tools

Knowledge-based fake news detection
It is also known as fact-checking.
Expert-based manual fact-checking
Fact-checkers: one or several domain experts
Crowd-sourced manual fact-checking
Fact-checkers: a large population of individuals

PolitiFact, The Washington Post Fact Checker, FactCheck, Snopes, TruthOrFiction, FullFact, HoaxSlayer

Expert-based manual fact-checking
Crowd-sourced manual fact-checking
Automatic fact-checking
How to represent “knowledge”

Stage1: fact extraction

T1: Entity resolution (deduplication/record linkage)
T2: Time recording to remove outdated knowledge
T3: Knowledge fusion to handle conflicts (often in open-source knowledge extraction)
T4: Credibility
T5: Knowledge inference/link prediction to infer new facts based on known ones
Relational machine learning: latent feature models, ..., Markov random fields

Stage2: fact checking: comparing knowledge between articles and knowledge graphs.
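
The comparison step can be illustrated with a toy example. This is a minimal sketch, assuming knowledge is stored as (subject, predicate, object) triples; the `Triple` type, the `check_triple` helper, and the sample facts are illustrative and not taken from the tutorial.

```python
from typing import NamedTuple, Set


class Triple(NamedTuple):
    """One fact as a (subject, predicate, object) triple."""
    subject: str
    predicate: str
    obj: str


# Toy knowledge graph; the entries are illustrative only.
KNOWLEDGE_GRAPH: Set[Triple] = {
    Triple("Barack Obama", "born_in", "Honolulu"),
    Triple("Honolulu", "located_in", "Hawaii"),
}


def check_triple(claim: Triple, kg: Set[Triple]) -> str:
    """Compare one claim extracted from an article against the knowledge graph."""
    if claim in kg:
        return "supported"
    # Same subject and predicate but a different object is treated as a
    # conflict, assuming the predicate is single-valued (a simplification).
    for fact in kg:
        if fact.subject == claim.subject and fact.predicate == claim.predicate:
            return "conflicting"
    # A triple missing from the graph is unknown, not necessarily false.
    return "unknown"


print(check_triple(Triple("Barack Obama", "born_in", "Chicago"), KNOWLEDGE_GRAPH))
```

The "unknown" outcome is the case that knowledge inference has to handle, since real knowledge graphs are incomplete.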

Knowledge inference for unknown SPO triples: illustrative studies.

Shortest-path-based method
Discriminative path-based method
Knowledge inference
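
A rough idea of the shortest-path approach: if the subject and object of an unknown triple are close in the knowledge graph, the claim is considered more plausible. The sketch below is a simplification under stated assumptions: it ignores predicate labels, treats the graph as undirected, and uses a hypothetical score of 1 / hops. Published methods use more refined path weights.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

# Toy facts; illustrative only.
TRIPLES: List[Tuple[str, str, str]] = [
    ("Barack Obama", "born_in", "Honolulu"),
    ("Honolulu", "located_in", "Hawaii"),
]


def build_graph(triples: List[Tuple[str, str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map, ignoring predicate labels."""
    graph: Dict[str, Set[str]] = {}
    for subject, _predicate, obj in triples:
        graph.setdefault(subject, set()).add(obj)
        graph.setdefault(obj, set()).add(subject)
    return graph


def shortest_path_length(graph: Dict[str, Set[str]], source: str, target: str) -> Optional[int]:
    """Breadth-first search; returns the hop count, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    visited = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def plausibility(graph: Dict[str, Set[str]], subject: str, obj: str) -> float:
    """Hypothetical score: 1 / hops, and 0.0 when no path exists."""
    hops = shortest_path_length(graph, subject, obj)
    if hops is None or hops == 0:
        return 0.0
    return 1.0 / hops


GRAPH = build_graph(TRIPLES)
print(plausibility(GRAPH, "Barack Obama", "Hawaii"))
```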

(Xinyi Zhou)
Fake news detection
Fake news:

  • A survey of research
  • Detection methods
  • Opportunities
Style-based fake news detection

The good
It can detect fake news before propagation
It can detect “real” fake news...

The way to detect
Style representation
Style classification
Traditional ML: SVM, RF, XGBoost
DL Framework

  • Multi-modal
  • Explainable representative
  • performance

Fake news early detection : A theory-driven model

  • Interpretability
  • Empirical relations
    Writing style
    Level: lexicon, feature: BOWs
    Level: syntax, feature: POS tags, CFGs
    Level: discourse, feature: POS tags, RRs
    Frequency: absolute? Standardized? Relative, using TF-IDF?

  • Multi-modal

  • Event-invariant
    Input: image, text
    Decoder: fake-news detector, event discriminator
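
The frequency options listed under writing style (absolute, standardized, TF-IDF) can be compared on a bag-of-words representation. This is a minimal sketch with assumptions: "standardized" is read here as the count divided by document length, and TF-IDF uses tf = count / length and idf = log(N / df). The notes do not say which variants the tutorial used.

```python
import math
from collections import Counter
from typing import Dict, List


def absolute_frequency(tokens: List[str]) -> Dict[str, int]:
    """Raw bag-of-words counts."""
    return dict(Counter(tokens))


def standardized_frequency(tokens: List[str]) -> Dict[str, float]:
    """Counts divided by document length (one possible standardization)."""
    total = len(tokens)
    return {word: count / total for word, count in Counter(tokens).items()}


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """TF-IDF per document, with tf = count / length and idf = log(N / df)."""
    n_docs = len(docs)
    doc_freq: Counter = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each word once per document
    scores = []
    for tokens in docs:
        tf = standardized_frequency(tokens)
        scores.append({w: tf[w] * math.log(n_docs / doc_freq[w]) for w in tf})
    return scores


# Toy tokenized documents; illustrative only.
DOCS = [["shocking", "news", "news"], ["news", "report"]]
print(absolute_frequency(DOCS[0]))
print(tf_idf(DOCS)[0])
```

A word that appears in every document gets a TF-IDF weight of zero under this formula, which is why relative weighting can highlight style markers that raw counts hide.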

Propagation-based Fake News Detection

The challenges

  • Algorithm transparency: writing style can be manipulated
  • Golden datasets with reliable labels: multi-label, domain, language
  • Different types of fake news
  • Model explainability

The good: massive auxiliary information can be utilized for comprehensive detection
News cascade
Homogeneous network
Stance Network
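
A news cascade can be treated as a tree rooted at the first post, with reshares as children. The sketch below assumes that representation and computes a few structural features; the `CASCADE` data and the feature names (size, depth, max_breadth) are illustrative and not taken from the tutorial.

```python
from typing import Dict, List

# Hypothetical cascade: each key is a post, each value lists the posts that
# reshared it directly. "root" is the first post of the news item.
CASCADE: Dict[str, List[str]] = {
    "root": ["a", "b"],
    "a": ["c"],
    "c": ["d", "e", "f"],
}


def cascade_features(children: Dict[str, List[str]], root: str) -> Dict[str, int]:
    """Return size, depth, and maximum breadth of a cascade tree."""
    size = 0
    depth = 0
    max_breadth = 0
    level = [root]
    current_depth = 0
    while level:
        size += len(level)                      # every post on this level
        max_breadth = max(max_breadth, len(level))
        depth = current_depth                   # deepest non-empty level so far
        level = [child for node in level for child in children.get(node, [])]
        current_depth += 1
    return {"size": size, "depth": depth, "max_breadth": max_breadth}


print(cascade_features(CASCADE, "root"))
```

Features like these could then be fed to a classifier alongside content-based features.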

Credibility-based fake news detection

Headline Credibility & Clickbait detection
User credibility & Bot detection

  • Low user credibility score: malicious users
  • Medium user credibility score: susceptible users, who unintentionally engage in fake news activity
  • High user credibility score: insusceptible users, who are immune to fake news

The challenges

  • Fake news early detection
  • Empirical relationship between fake news and clickbait
  • Assessing user intention in fake news activities
    (Kai Shu)

    Beyond News Contents: The Role of Social Context for Fake News Detection

    Fake News Detection – Multi-Source : A typical news dissemination system on social media

  • Entity: publishers, news, social engagement.
    Tri-Relationship Embedding (TriFN)

  • News contents embedding

  • Social contexts embedding

    We jointly combine news content embedding and social context embedding for fake news detection.

Datasets: FakeNewsNet, with information on news contents, social context, and ground truth labels from fact-checking websites


  • Social context information brings additional signals to fake news detection
  • It is important to capture the relations among publishers, news pieces, and users to detect fake news
  • The proposed TriFN framework effectively models tri-relationships through heterogeneous network embedding

Unsupervised Fake News Detection: A Generative Approach
An unsupervised fake news detection method that models user opinions and user credibility
The hierarchical user engagement structure: We build a hierarchical user engagement structure for each news

Deep Headline Generation for Clickbait Detection
Existing approaches: extracting hand-crafted linguistic features or building sophisticated predictive models such as deep neural networks
Scale: datasets with labels are often limited
Distribution: imbalanced distribution of clickbaits and non-clickbaits
Headline Generation from Documents
Goal: Generate stylized headlines that also preserve document contents
Model: Generator Learning: a document autoencoder, a headline generator
Discriminator Learning: a transfer discriminator, a style discriminator, a pair discriminator
We study the problem of generating clickbaits/nonclickbaits from original documents for clickbait detection
We propose a novel deep generative model with adversarial learning

Fake News Datasets & Tools

Data repository: FakeNewsNet, [Github], [Kaggle], [Paper]


KDD2019 Poster session
dEFEND: Explainable Fake News Detection

A new framework for the novel problem of explainable fake news detection
Achieves higher accuracy than the state-of-the-art fake news detection methods
Discover explainable news sentences and user comments to understand
