
KDD2019 Research Track Session RT10: Embeddings II

Posted at 2019-08-07


Research Track Session RT10: Embeddings II – Summit 2, Ground Level, Egan Center Chair: Jundong Li

Individualized Indicator for All: Stock-wise Technical Indicator Optimization with Stock Embedding

Zhige Li (Shanghai Jiao Tong University); Derek Yang (Tsinghua University); Li Zhao (Microsoft); Jiang Bian (Microsoft); Tao Qin (Microsoft); Tie-Yan Liu (Microsoft)


1, token embedding
2, analysis of embedding

Preliminaries about technical analysis
Introduction and properties
Technical indicators are developed to recognize trading patterns

Indicator-Based Portfolio construction

  • Single indicator
  • Multiple indicators
    Evaluation metrics
    Limitation: Unified Transformation
    The stock-wise indicator optimization model
    Distinguish stock properties
    Stocks within the same fund are likely to share some common characteristics

Stock embedding
Fund-stock graph construction
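
As a rough illustration of the fund-stock graph idea, the sketch below builds a bipartite stock-to-fund mapping and counts how many funds hold each pair of stocks. The `holdings` data, the function names, and the co-holding count are all hypothetical; the notes do not say how the paper turns the graph into stock embeddings.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fund holdings (fund -> stocks it holds); not data from the talk.
holdings = {
    "fund_A": ["S1", "S2", "S3"],
    "fund_B": ["S2", "S3"],
    "fund_C": ["S4"],
}


def build_fund_stock_graph(holdings):
    """Return the bipartite graph as a stock -> set-of-funds mapping."""
    stock_to_funds = defaultdict(set)
    for fund, stocks in holdings.items():
        for stock in stocks:
            stock_to_funds[stock].add(fund)
    return dict(stock_to_funds)


def co_holding_counts(holdings):
    """Count, for each stock pair, how many funds hold both stocks."""
    counts = defaultdict(int)
    for stocks in holdings.values():
        for a, b in combinations(sorted(set(stocks)), 2):
            counts[(a, b)] += 1
    return dict(counts)


graph = build_fund_stock_graph(holdings)
pairs = co_holding_counts(holdings)
print(graph["S2"])          # funds that hold S2
print(pairs[("S2", "S3")])  # number of funds holding both S2 and S3
```

Stocks that often appear in the same funds get high co-holding counts, which is one simple way to express the assumption that such stocks share characteristics.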

Indicator rescaling method

  • Constancy and continuity
  • keep the original properties
  • Stocks adjustment

Experimental insight

Effectiveness of optimized indicator

Performance in real-world investing
Case study:
Different indicators show different sensitivities towards different stocks
Similar rescaling weights distributed scales


  • Leverage the difference in terms of indicator’s stock-wise affinity
  • A data mining view: learn the stock representation by mining a knowledge repository
  • Proposed a carefully designed rescaling network that retains the original properties of the indicator

Efficient Global String Kernel with Random Features: Beyond Counting Substructures

Lingfei Wu (IBM Research); Ian En-Hsu Yen (CMU); Siyu Huo (IBM Research); Liang Zhao (George Mason University); Kun Xu (IBM Research); Liang Ma (IBM Research); Shouling Ji (Zhejiang University); Charu Aggarwal (IBM Research)

String kernel analysis: Applications
Analysis of large-scale sequential data has been one of the most crucial tasks

  • Bioinformatics
  • Text
  • Audio mining
    String kernels are an effective method for learning features from sequences
  • Hardly capture long discriminative patterns
  • Diagonal dominance of the kernel matrix
  • Quadratic complexity in the number of samples
  • A family of positive-definite global string kernels
  • Random string embedding (RSE)
  • Theoretically show uniform convergence of RSE
Existing approaches for string kernels

String kernels by counting substructures > Ignore global properties and need expensive computation of the kernel matrix
Edit distance substitution > Not a valid positive-definite kernel

Key idea

  • Develop a general framework for building string kernels and generating string embeddings from edit distance

Global string kernel using edit distance and random features

The core task: build a positive-definite string kernel that uses a global alignment measure (edit distance, also known as Levenshtein distance)
Connection to distance substitution kernel

Random string embedding: random features of global string kernel
Efficient computation of RSE: use Random Features
Distance measure
The convergence of random string embedding
How many random features are required in (10) to have an accurate approximation?

Increasing R helps improve the accuracy, at a cost that grows linearly in R
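
A minimal sketch of the random string embedding idea, under stated assumptions: the notes do not give the feature map, so the form exp(-gamma * d(x, w)), the uniform sampling of random strings, and the values of `R`, `max_len`, `alphabet`, and `gamma` are illustrative choices, not the authors' settings. Each string costs R edit-distance computations, which is where the linear dependence on R comes from.

```python
import math
import random


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def random_string_embedding(strings, R=64, max_len=5, alphabet="ACGT",
                            gamma=0.5, seed=0):
    """Embed each string as R features, one per random string (assumed form)."""
    rng = random.Random(seed)
    # Draw R short random strings; the sampling distribution is an assumption.
    anchors = ["".join(rng.choice(alphabet)
                       for _ in range(rng.randint(1, max_len)))
               for _ in range(R)]
    scale = 1.0 / math.sqrt(R)
    return [[scale * math.exp(-gamma * edit_distance(s, w)) for w in anchors]
            for s in strings]


def dot(u, v):
    """Inner product of two embeddings, i.e. the approximate kernel value."""
    return sum(x * y for x, y in zip(u, v))


emb = random_string_embedding(["ACGTAC", "ACGTTC", "GGGGGG"])
print(dot(emb[0], emb[1]), dot(emb[0], emb[2]))
```

Because the kernel value is an inner product of explicit feature vectors, the resulting Gram matrix is positive semi-definite by construction, and a linear model can be trained on the embeddings without forming the full kernel matrix.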

  • presented scalable global string kernels
  • A simple yet effective way to handle real-world large-scale string data

HATS: A Hierarchical Sequence-Attention Framework for Inductive Set-of-Sets Embeddings

Changping Meng (Purdue University); Jiasen Yang (Purdue University); Bruno Ribeiro (Purdue University); Jennifer Neville (Purdue University)

typical deep learning tasks need a canonical orientation

traditional neural networks are permutation-sensitive (their output depends on the order of the inputs)

example: Adamic-Adar index: measure the similarity between a pair of nodes in the graph based on the degree of their common neighbors
(since there's no canonical ordering, the data may have..)
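
For reference, the Adamic-Adar index mentioned above can be computed as the sum of 1 / log(degree) over the common neighbors of the two nodes. The sketch below uses a small made-up graph; the adjacency dictionary and node names are illustrative only. Note that the inputs are neighbor sets, which have no canonical ordering.

```python
import math


def adamic_adar(adj, u, v):
    """Sum of 1 / log(degree(w)) over the common neighbors w of u and v."""
    common = adj[u] & adj[v]
    # A common neighbor of two distinct nodes has degree >= 2, so log > 0;
    # the guard only protects against malformed input.
    return sum(1.0 / math.log(len(adj[w])) for w in common if len(adj[w]) > 1)


# Hypothetical undirected graph as node -> set of neighbors.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b", "e"},
    "d": {"a", "b"},
    "e": {"c"},
}

score = adamic_adar(adj, "a", "b")  # common neighbors: c (degree 3), d (degree 2)
print(score)
```

Low-degree common neighbors contribute more to the score than hubs, and the result does not depend on the order in which neighbors are listed.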

learning representations of sets-of-sets (SOS)

SOS is a set whose elements are themselves sets.
desired representation function properties
Output: SOS representation

architecture for SOS

intra-set representation
how to make HATS permutation invariant
which can be estimated using Monte Carlo at test time
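
One way to picture the Monte Carlo step: average an order-sensitive function over randomly sampled orderings, both of the sets and of the elements inside each set. In the sketch below, `order_sensitive_score` is a toy stand-in for a sequence model, and `num_samples` and the seeds are arbitrary; none of this is the HATS architecture itself.

```python
import random


def order_sensitive_score(sets_in_order):
    """Toy stand-in for a sequence model: position-weighted sums, so order matters."""
    total = 0.0
    for i, members in enumerate(sets_in_order, start=1):
        inner = sum(j * x for j, x in enumerate(members, start=1))
        total += i * inner
    return total


def monte_carlo_invariant(sos, num_samples=2000, seed=0):
    """Estimate the average score over random orderings of sets and elements."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(num_samples):
        # Shuffle the elements inside each set, then the order of the sets.
        shuffled = [rng.sample(list(s), len(s)) for s in sos]
        rng.shuffle(shuffled)
        acc += order_sensitive_score(shuffled)
    return acc / num_samples


print(order_sensitive_score([[1, 2], [3]]))  # 11.0
print(order_sensitive_score([[3], [1, 2]]))  # 13.0, same SOS in another order
print(monte_carlo_invariant([[1, 2], [3]]))  # close to the exact average 11.25
```

The raw score changes when the same set-of-sets is presented in a different order, while the sampled average converges to the same value for every ordering, which is the permutation-invariance property the notes ask for.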

Experiment models:

DeepSets, MI-CNN, J-LSTM, HATS, H-LSTM
input: each SOS contains 4 sets. Each set has 10 integers from 0 to 9.
output: regression float output rounded to the nearest integer

task: predict Adamic-Adar index between two nodes in a graph

experiment: hyperlink prediction
In: each set contains m=4
Out: binary label

Task SOS:

  • anomaly detection
  • unique count
  • set-of-sets representations need to be permutation invariant and have no trivial representation collisions
  • HATS learns SOS representations that satisfy these 2 properties; variational learning methods for HATS give state-of-the-art results on multiple tasks

TUBE: Embedding Behavior Outcomes for Predicting Success

Daheng Wang (University of Notre Dame); Tianwen Jiang (University of Notre Dame & Harbin Institute of Technology); Nitesh V. Chawla (University of Notre Dame); Meng Jiang (University of Notre Dame)

  • predicting the success of publishing papers
  • predicting effective medical resource for alleviating symptoms

Given a goal and a plan, how can we predict the effectiveness of the plan?
task: goal prediction: given a plan p, predict the goals g that it is likely to achieve
context recommendation: given a goal and a plan, recommend an item to add to the plan that maximizes its effectiveness.

plan effectiveness as point-to-point distance?
point-to-point distance is not effectiveness
Approach: Embeds Behavior oUTcomes TUBE
Approach: optimization
distance between observed and predicted distribution
KL-divergence distance
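
The KL divergence between an observed distribution p and a predicted distribution q is the sum of p_i * log(p_i / q_i). A small sketch follows; the two example distributions and the `eps` clamp are made up for illustration and are not from the talk.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions given as equal-length sequences."""
    # Terms with p_i == 0 contribute nothing; q_i is clamped to avoid log(0).
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)


observed = [0.5, 0.5]   # hypothetical observed outcome distribution
predicted = [0.9, 0.1]  # hypothetical model prediction

print(kl_divergence(observed, predicted))
print(kl_divergence(predicted, observed))  # differs: KL is not symmetric
```

KL divergence is zero only when the two distributions match, and it is not symmetric, so it is a divergence rather than a true distance metric.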
Approach: negative samples
experiments: building synthetic datasets

  • simulate "game" scenarios of forming a team of players to pass a game stage
    experiments: synthetic datasets
    experiments: synthetic datasets results
    task: goal prediction and context recommendation
  • new problem: representation learning for behavior context items and goals for success prediction

Learning Network-to-Network Model for Content-rich Network Embedding

Zhicheng He (Nankai University); Jie Liu (Nankai University); Na Li (Nankai University); Yalou Huang (Nankai University)

Multi-Relational Classification via Bayesian Ranked Non-Linear Embeddings

Ahmed Rashed (University of Hildesheim); Josif Grabocka (University of Hildesheim); Lars Schmidt-Thieme (University of Hildesheim)


multi-relational settings: social networks, citation networks, biological interaction networks

goals: predicting the missing edges

main challenges
  • extremely sparse relation
  • attributed entities
  • proposed method: Bayesian ranked non-linear embeddings

multi-relational classification

related work
  • attribute-aware models
  • non-attribute-aware models
  • Bayesian ranked non-linear model

step1: non-linear embeddings
step2: scoring function
step3: Bayesian personalized ranking
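
The three steps can be sketched roughly as follows. The single tanh layer, the dot-product scoring function, and the toy weights and features are assumptions made for illustration; the notes do not specify the actual network or scoring function. The ranking loss is the standard Bayesian personalized ranking form, -log(sigmoid(s_pos - s_neg)).

```python
import math


def embed(features, weights):
    """Step 1 (assumed form): one non-linear layer, tanh(W x)."""
    return [math.tanh(sum(w * x for w, x in zip(row, features)))
            for row in weights]


def score(z_u, z_v):
    """Step 2 (assumed form): dot product of two entity embeddings."""
    return sum(a * b for a, b in zip(z_u, z_v))


def bpr_loss(pos_score, neg_score):
    """Step 3: -log(sigmoid(pos - neg)), written in a numerically stable way."""
    diff = pos_score - neg_score
    if diff > 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


# Hypothetical weights and entity attributes.
W = [[1.0, 0.0], [0.0, 1.0]]
z_a = embed([1.0, 0.0], W)
z_b = embed([1.0, 0.2], W)   # entity linked to a (observed edge)
z_c = embed([-1.0, 0.0], W)  # entity not linked to a (sampled negative)

loss = bpr_loss(score(z_a, z_b), score(z_a, z_c))
print(loss)
```

The loss shrinks as the observed edge is scored further above the sampled non-edge, so training pushes linked entities to rank higher than unlinked ones rather than fitting absolute scores.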

data: Cora, Citeseer, PPI, email-Eu-core

  • propose Bayesian ranked non-linear embeddings (BRNLE)
  • easily extendable to an arbitrary number of relations among entities

