KDD2019 Research Track Session RT10: Embeddings II

Posted at 2019-08-07

Link to the summary page

Research Track Session RT10: Embeddings II – Summit 2, Ground Level, Egan Center. Chair: Jundong Li

Individualized Indicator for All: Stock-wise Technical Indicator Optimization with Stock Embedding

Zhige Li (Shanghai Jiao Tong University); Derek Yang (Tsinghua University); Li Zhao (Microsoft); Jiang Bian (Microsoft); Tao Qin (Microsoft); Tie-Yan Liu (Microsoft)

Purpose

  1. stock embedding
  2. analysis of the embedding

Preliminaries about technical analysis
Introduction and properties
Technical indicators are developed to recognize trading patterns
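As a concrete illustration (not from the talk), a moving-average crossover is one of the simplest such indicators:

```python
# Minimal sketch (not from the talk): a moving-average crossover signal,
# one of the simplest technical indicators for recognizing trading patterns.
import numpy as np

def sma(prices, window):
    """Simple moving average over a trailing window."""
    return np.convolve(prices, np.ones(window) / window, mode="valid")

def crossover_signal(prices, fast=5, slow=20):
    """+1 when the fast average is above the slow one, else -1."""
    n = len(prices) - slow + 1          # length of the slower series
    fast_ma = sma(prices, fast)[-n:]    # align both series at the last price
    slow_ma = sma(prices, slow)
    return np.where(fast_ma > slow_ma, 1, -1)

prices = 100 + np.cumsum(np.random.randn(250))  # synthetic price series
print(crossover_signal(prices)[-5:])            # most recent 5 signals
```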

Indicator-Based Portfolio construction

  • Single indicator
  • Multiple indicators

Evaluation metrics
Limitation: a unified transformation is applied to all stocks

The stock-wise indicator optimization model

  • Distinguish stock properties
  • Stocks within the same fund are likely to share some common characteristics

Stock embedding
Fund-stock graph construction
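A minimal sketch of what such a graph could look like (fund names, tickers, and the construction are hypothetical illustrations, not the paper's data):

```python
# Hypothetical fund-stock bipartite graph: stocks held by the same fund
# become 2-hop neighbors, which is the structure the stock embedding exploits.
import networkx as nx

holdings = {"fund_A": ["AAPL", "MSFT", "GOOG"],   # made-up holdings
            "fund_B": ["MSFT", "AMZN"]}

G = nx.Graph()
for fund, stocks in holdings.items():
    for stock in stocks:
        G.add_edge(fund, stock)  # edge = "fund holds stock"

# Stocks co-held with MSFT (its 2-hop stock neighbors):
near = nx.single_source_shortest_path_length(G, "MSFT", cutoff=2)
print([n for n, d in near.items() if d == 2])  # ['AAPL', 'GOOG', 'AMZN']
```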

Indicator rescaling method
Assumption:

  • Constancy and continuity

Principle:

  • Keep the original properties of the indicator
  • Stock-wise adjustment
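The paper's rescaling network is more elaborate; the core idea can be sketched as a stock-wise affine rescaling whose parameters are read off the stock embedding (all names and dimensions below are assumptions):

```python
# Hedged sketch of stock-wise indicator rescaling: each stock gets its own
# scale/shift derived from its embedding, so the indicator's shape (original
# properties) is preserved while its level is adapted per stock.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 8))            # assumed projection: embedding -> (scale, shift)

def rescale(indicator, stock_embedding):
    scale, shift = W @ stock_embedding
    return np.exp(scale) * indicator + shift  # exp keeps the scale positive

indicator = rng.normal(size=50)        # raw indicator series for one stock
embedding = rng.normal(size=8)         # learned stock embedding (assumed dim 8)
print(rescale(indicator, embedding)[:3])
```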

Experimental insight

Effectiveness of optimized indicator

Performance in real-world investing
Case study:
Different indicators show different sensitivities towards different stocks
Similar stocks show similarly distributed rescaling weights

Conclusion

  • Leverages the differences in indicators' stock-wise affinity
  • Takes a data mining view: learns stock representations by mining a knowledge repository
  • Proposed a carefully designed rescaling network to retain the original properties of the indicator

Efficient Global String Kernel with Random Features: Beyond Counting Substructures

Lingfei Wu (IBM Research); Ian En-Hsu Yen (CMU); Siyu Huo (IBM Research); Liang Zhao (George Mason University); Kun Xu (IBM Research); Liang Ma (IBM Research); Shouling Ji (Zhejiang University); Charu Aggarwal (IBM Research)

String kernel analysis: Applications
Analysis of large-scale sequential data has been one of the most crucial tasks

  • Bioinformatics
  • Text
  • Audio mining
String kernels are an effective method for learning features from sequences.

Difficulties

  • Hardly capture long discriminative patterns
  • Diagonal dominance of the kernel matrix
  • Quadratic complexity in the number of samples
Contribution
  • A family of positive-definite global string kernels
  • Random string embedding
  • Theoretically show uniform convergence of RSE
Existing approaches for string kernels

String kernels by counting substructures → ignore global properties and need expensive kernel matrix computation
Edit distance substitution → not a valid positive-definite kernel

Key idea

  • Develop a general framework for building string kernels and generating string embeddings from edit distance

Global string kernel using edit distance and random features

The core task: build a positive-definite string kernel utilizing a global alignment measure (via edit distance / Levenshtein distance)
Connection to distance substitution kernel

Random string embedding: random features of global string kernel
Efficient computation of RSE: use Random Features
Data-independent
Distance measure
The convergence of random string embedding
How many random features are required in Eq. (10) to obtain an accurate approximation?

Increasing R helps improve the accuracy, with complexity linear in R
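A hedged sketch of the RSE idea: represent each string by its edit distances to R randomly generated reference strings (the paper's exact sampling and normalization differ):

```python
# Hedged sketch of a random string embedding (RSE): each string is mapped to
# its edit distances against R random reference strings; larger R tightens
# the kernel approximation at cost linear in R.
import random

def edit_distance(a, b):
    """Standard dynamic-programming Levenshtein distance."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
    return dp[-1]

def rse(s, references):
    return [edit_distance(s, r) for r in references]

random.seed(0)
refs = ["".join(random.choices("ACGT", k=random.randint(3, 8))) for _ in range(16)]  # R = 16
print(rse("GATTACA", refs)[:5])
```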
Conclusion

  • presented scalable global string kernels
  • A simple yet effective way to handle real-world large-scale string data

HATS: A Hierarchical Sequence-Attention Framework for Inductive Set-of-Sets Embeddings

Changping Meng (Purdue University); Jiasen Yang (Purdue University); Bruno Ribeiro (Purdue University); Jennifer Neville (Purdue University)

typical deep learning tasks need a canonical ordering

traditional neural networks are permutation sensitive

set-of-sets
example: Adamic-Adar index: measures the similarity between a pair of nodes in a graph based on the degrees of their common neighbors
(since there is no canonical ordering, the same data may appear in many equivalent orderings)
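For reference, this index is easy to compute directly (networkx ships it as adamic_adar_index):

```python
# Adamic-Adar index: sum of 1 / log(deg(w)) over the common neighbors w of
# a node pair; networkx provides it out of the box.
import networkx as nx

G = nx.karate_club_graph()
(u, v, score), = nx.adamic_adar_index(G, [(0, 33)])
print(f"Adamic-Adar({u}, {v}) = {score:.3f}")
```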

learning representations of set-of-sets

An SOS is a set whose elements are themselves sets.
desired representation function properties
Output: SOS representation

architecture for SOS

intra-set representation
how to make HATS permutation invariant: take an expectation over random orderings, which can be estimated using Monte Carlo sampling at test time
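HATS itself uses hierarchical attention over sampled orderings; as a minimal illustration of the invariance requirement, DeepSets-style sum pooling is permutation invariant by construction (weights below are random placeholders):

```python
# Illustration of permutation invariance (not HATS itself): DeepSets-style
# pooling rho(sum_i phi(x_i)) yields the same output for any element order.
import numpy as np

rng = np.random.default_rng(0)
W_phi = rng.normal(size=(16,))   # assumed element encoder phi
W_rho = rng.normal(size=(16,))   # assumed set decoder rho

def embed_set(xs):
    pooled = sum(np.tanh(x * W_phi) for x in xs)  # order-free sum pooling
    return float(W_rho @ pooled)

s = [3.0, 1.0, 4.0, 1.0, 5.0]
assert abs(embed_set(s) - embed_set(sorted(s, reverse=True))) < 1e-9
print(embed_set(s))
```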

Experiment models:

DeepSets, MI-CNN, J-LSTM, HATS, H-LSTM
input: each SOS contains 4 sets; each set has 10 integers from 0 to 9
output: a float regression output, rounded to the nearest integer

task: predict Adamic-Adar index between two nodes in a graph

experiment: hyperlink prediction
In: each set contains m = 4 elements
Out: binary label

SOS tasks:

  • anomaly detection
  • unique count
Conclusion

  • Set-of-sets representations need to be permutation invariant and have no trivial representation collisions
  • HATS learns SOS representations satisfying these two properties; variational learning methods for HATS give state-of-the-art results in multiple tasks

TUBE: Embedding Behavior Outcomes for Predicting Success

Daheng Wang (University of Notre Dame); Tianwen Jiang (University of Notre Dame & Harbin Institute of Technology); Nitesh V. Chawla (University of Notre Dame); Meng Jiang (University of Notre Dame)

purpose
  • predicting the success of publishing papers
  • predicting effective medical resources for alleviating symptoms

Given a goal and a plan, how do we predict the effectiveness of the plan?
task: goal prediction: given a plan p, predict the goals g it is likely to achieve
context recommendation: given a goal and a plan, recommend items to be added that maximize effectiveness

plan effectiveness as point-to-point distance?
point-to-point distance is not effectiveness
Approach: TUBE (embeds behavior outcomes)
Approach: optimization
distance between the observed and predicted distributions, measured by KL divergence
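A minimal worked example of that objective idea (the distributions below are made up):

```python
# D_KL(p || q) = sum_i p_i * log(p_i / q_i): how far the predicted outcome
# distribution q is from the observed distribution p (0 iff they match).
import numpy as np

p = np.array([0.7, 0.2, 0.1])   # observed outcome distribution (made up)
q = np.array([0.5, 0.3, 0.2])   # model-predicted distribution (made up)

print(f"D_KL(p || q) = {np.sum(p * np.log(p / q)):.4f}")
```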
Approach: negative samples
experiments: building synthetic datasets

  • simulate "game" scenarios of forming a team of players to pass a game stage

experiments: synthetic dataset results
task: goal prediction and context recommendation
Conclusion

  • A new problem: representation learning over behavior context items and goals for success prediction

Multi-Relational Classification via Bayesian Ranked Non-Linear Embeddings

Ahmed Rashed (University of Hildesheim); Josif Grabocka (University of Hildesheim); Lars Schmidt-Thieme (University of Hildesheim)

Learning Network-to-Network Model for Content-rich Network Embedding

Zhicheng He (Nankai University); Jie Liu (Nankai University); Na Li (Nankai University); Yalou Huang (Nankai University)

intro

multi-relational settings: social networks, citation networks, biological interaction networks

goals: predicting the missing edges

main challenges
  • extremely sparse relations
  • attributed entities

Proposed method: Bayesian ranked non-linear embeddings

multi-relational classification

related work
  • attribute-aware models
  • non-attribute-aware models
  • bayesian ranked non-linear model

Step 1: non-linear embeddings
Step 2: scoring function
Step 3: Bayesian personalized ranking
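Step 3 is the standard BPR objective; a hedged sketch with a placeholder dot-product scorer (the paper scores pairs through the non-linear embeddings of steps 1-2):

```python
# Hedged sketch of the BPR loss from step 3: push the score of an observed
# (positive) relation above a sampled negative one via -log sigmoid(margin).
import numpy as np

rng = np.random.default_rng(0)
E = rng.normal(size=(10, 8))    # placeholder entity embeddings

def score(u, v):
    return E[u] @ E[v]          # stand-in for the paper's non-linear scorer

def bpr_loss(u, pos, neg):
    margin = score(u, pos) - score(u, neg)
    return -np.log(1.0 / (1.0 + np.exp(-margin)))

print(bpr_loss(0, 1, 2))        # lower when entity 1 outranks entity 2 for 0
```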

data: Cora, Citeseer, PPI, email-Eu-core

Conclusion

  • Proposed Bayesian ranked non-linear embeddings (BRNLE)
  • Easily extendable to an arbitrary number of relations and entities