
KDD2019 Applied Data Science Track Session ADS9: E-commerce and Advertising

Last updated at 2019-08-12, posted at 2019-08-08

Chair: Anne Kao (Boeing)

SMOILE: A Shopper Marketing Optimization and Inverse Learning Engine

Abhilash Reddy Chenreddy (University of Illinois at Chicago); Parshan Pakiman (University of Illinois at Chicago); Selvaprabu Nadarajah (Information and Decision Sciences); Ranganathan Chandrasekaran (Information and Decision Sciences); Rick Abens (Foresight ROI, Inc.)

shopper marketing

pre-store-tactics
in-store-tactics

sales volume after marketing
Why is shopper marketing important?

it is one of the fastest-growing areas of marketing for consumer packaged goods
it explains 3% to 5% of the total lift
it accounts for 3% to 13% of the total marketing budget
practically, mining historical lift and SM tactics as well as designing SM campaigns are challenging problems

How do brands design marketing campaigns?

Historical data
planning for SM Tactics: sequential decision-making problem over a finite planning horizon
business constraints
Future SM campaign

planning for SM tactics
How do traditional models work?

marketing/media-mix models are widely used for lift attribution
they can become hard to estimate when various business constraints are active.

SMOILE Model

planning marketing campaigns
data generation process

modeling lift

related contribution

empirical optimization
data-driven optimization

modeling lift

we model lift as the sum of two parametric functions: one encodes the effect of SM factors on the total lift and the other the effect of non-SM factors, each as a linear combination of the corresponding features.
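A minimal sketch of the additive lift model described above, assuming plain linear terms for both parts. The function names, weights, and feature values are hypothetical and only illustrate the structure; they are not taken from the paper.

```python
from typing import Sequence


def linear_term(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear combination of features (one parametric function of the model)."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * x for w, x in zip(weights, features))


def predicted_lift(
    sm_weights: Sequence[float],
    sm_features: Sequence[float],
    non_sm_weights: Sequence[float],
    non_sm_features: Sequence[float],
) -> float:
    """Total lift = SM contribution + non-SM contribution."""
    sm_part = linear_term(sm_weights, sm_features)
    non_sm_part = linear_term(non_sm_weights, non_sm_features)
    return sm_part + non_sm_part


# Hypothetical numbers: two SM tactics and one non-SM factor.
example_lift = predicted_lift([2.0, 0.5], [1.0, 4.0], [1.5], [2.0])
```

Keeping the two parts separate is what makes attribution possible: the SM term can be read off on its own once the weights are fitted.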

lift attribution inverse learning and tactic planning optimization

data

frozen breakfast and wings
two major retailers in the US
11 and 9 SM tactics

SMOILE: performance on test set

SMOILE

is an integrated framework that merges multiple sources of data to attribute shopper marketing lift and design marketing campaigns
leverages the structure of the data generation process to compute lift and design SM campaigns that are consistent with the data generation process
leverages consumer behavior to fit better models of lift, avoiding spurious results
streamlines the implementation of mining lift and designing SM campaigns
can be efficiently solved via commercial optimization solvers

Two-Sided Fairness for Repeated Matchings in Two-Sided Markets: A Case Study of a Ride-Hailing Platform

Tom Sühr (Max Planck Institute for Software Systems); Asia J. Biega (Microsoft Research); Meike Zehlike (Max Planck Institute for Software Systems); Krishna P. Gummadi (Max Planck Institute for Software Systems); Abhijnan Chakraborty (Max Planck Institute for Software Systems)

we use two-sided platforms in our everyday life

e-commerce
multimedia streaming
ride-hailing

our focus: ride-hailing platform
what about the driver side of the platform?
concerns about drivers in the ride-hailing platform industry
potential for issues on the driver side
potential for inequality
inequality in our dataset
modeling a two-sided ride-hailing platform
modeling utility for both sides

brief history of fair matching: a long line of work on fairness in matching markets

school admissions
...

fairness of repeated matching

amortized parity
amortized proportionality

methods of matching drivers & customers

nearest driver first (NDF)
worst-off driver first (WDF)
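The two one-sided rules can be sketched as follows. This is a simplified illustration, assuming drivers sit on a 1-D line and that "worst-off" means lowest accumulated income; the `Driver` class and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Driver:
    name: str
    position: float  # 1-D location, a simplification for illustration
    income: float    # income accumulated so far


def nearest_driver_first(drivers: List[Driver], pickup: float) -> Driver:
    """NDF: serve the customer with the closest available driver."""
    return min(drivers, key=lambda d: abs(d.position - pickup))


def worst_off_driver_first(drivers: List[Driver]) -> Driver:
    """WDF: give the ride to the driver with the lowest accumulated income."""
    return min(drivers, key=lambda d: d.income)


drivers = [
    Driver("a", position=1.0, income=50.0),
    Driver("b", position=4.0, income=10.0),
    Driver("c", position=9.0, income=30.0),
]
ndf_choice = nearest_driver_first(drivers, pickup=2.0)
wdf_choice = worst_off_driver_first(drivers)
```

NDF looks only at the customer side (short pickup distance), while WDF looks only at the driver side (income), which is why each is called one-sided.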

How Fair are the One-Sided methods WDF and NDF?

in WDF, the inequality in driver income decreases
but it results in a lower average income for the drivers

our proposal: take two sides together

optimizing for common inequality measures directly is practically infeasible
instead minimize the difference ...

how does our two-sided method perform?
does the waiting time for the passengers increase?

Reserve Price Failure Rate Prediction with Header Bidding in Display Advertising

Achir Kalra (Forbes Media LLC); Chong Wang (S&P Global); Cristian Borcea (New Jersey Institute of Technology); Yi Chen (New Jersey Institute of Technology)

display advertising is a big business
real-time bidding

publisher <> ad exchange <> real-time bidding advertisers

impression revenue in second-price auction
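As a reminder of how a reserve price interacts with a second-price auction, here is a small sketch of the textbook rule. It is an assumption that the talk used exactly this rule; the function name is hypothetical.

```python
from typing import Sequence


def impression_revenue(bids: Sequence[float], reserve_price: float) -> float:
    """Publisher revenue for one impression under a second-price auction
    with a reserve price (textbook rule, assumed here).

    The reserve price fails when the highest bid is below it; then the
    impression is not sold in this auction and revenue is 0.
    """
    if not bids:
        return 0.0
    ranked = sorted(bids, reverse=True)
    highest = ranked[0]
    if highest < reserve_price:
        return 0.0  # reserve price failure
    second = ranked[1] if len(ranked) > 1 else 0.0
    # Winner pays the larger of the second-highest bid and the reserve price.
    return max(second, reserve_price)
```

A higher reserve price can raise the payment but also raises the chance of failure, which is why predicting the failure rate matters.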

data censorship problem
problem definition & proposed solution

solution: survival analysis model
parametric survival model
pairwise interaction tensor factorization
features:

user: ids store, os, browser, network bandwidth, and devices
ad placement: ad unit size and ad position
page: URL, channel sections, whether the page is trending
contexts: hour of a day, and referrer URL

header bidding regularization
negative log likelihood
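A minimal sketch of a Weibull-based failure probability and its negative log likelihood. This is a heavy simplification: it assumes fixed `shape` and `scale` parameters and a binary failed/succeeded outcome per impression, and it leaves out the feature-dependent parameters, the tensor factorization, and the header bidding regularization. The paper's actual likelihood may differ.

```python
import math
from typing import Iterable, Tuple


def weibull_failure_prob(reserve_price: float, shape: float, scale: float) -> float:
    """P(highest bid < reserve price) if the highest bid is Weibull distributed.

    This is the Weibull CDF: 1 - exp(-(x / scale) ** shape).
    """
    return 1.0 - math.exp(-((reserve_price / scale) ** shape))


def negative_log_likelihood(
    observations: Iterable[Tuple[float, bool]], shape: float, scale: float
) -> float:
    """NLL over (reserve_price, failed) pairs; `failed` is True when the
    reserve price was not met. Probabilities are clipped for stability."""
    eps = 1e-12
    total = 0.0
    for reserve_price, failed in observations:
        p = weibull_failure_prob(reserve_price, shape, scale)
        p = min(max(p, eps), 1.0 - eps)
        total -= math.log(p) if failed else math.log(1.0 - p)
    return total
```

Fitting then amounts to choosing the parameters that minimize this quantity over the logged impressions.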

data and implementation
dataset: ~16M impressions collected by Forbes

eval:

Weibull distribution works best for the proposed model
The proposed model significantly outperforms the baselines

eval on header bids only

conclusion

proposed a parametric survival model to predict the failure rate of a reserve price for an online display ad impression in an ad exchange auction
the model is augmented by pairwise interaction tensor factorization and header bidding regularization
the experimental results show that the proposed models with the Weibull distribution significantly outperform the comparison systems
our model can be adopted by a majority of online publishers because they can collect similar data.

The Identification and Estimation of Direct and Indirect Effects in A/B Tests through Causal Mediation Analysis

Xuan Yin (Etsy, Inc.); Liangjie Hong (Etsy, Inc.)

background:

user engagements of different products can be causally dependent

examples of online products:

organic search and promoted listing

intro:

we see causal dependency from A/B test results

induced change:

a change in one product would induce users to change their behaviors in other products

the most popular KPI is ATE from A/B tests
suppose the underlying causal mechanism is like
rec module, search, conversion

Questions

does ATE on conversion truly measure the contribution of rec module change to the marketplace?
Is ATE on conversion still a good KPI for rec module?
shall we just ignore the induced reduction in user engagement of search?

Problem of Funnel analysis

too heuristic: no foundation
ambiguous
which place shall get the point
too narrow
shall rec module get any point?

it may destroy the causal interpretation of experimental results
it subsets the experimental results based on post-treatment criteria
direct and indirect effects
how about we split ATE into two parts: direct effect and indirect effect?
use direct effect on conversion as KPI
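To make the split concrete, here is a sketch of the classic linear mediation decomposition with a single observed mediator (for example search engagement). This is the textbook starting point, not the generalized measures the authors propose; the function names and the toy data are hypothetical.

```python
from typing import Dict, Sequence


def _cov_sum(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Sum of centered cross-products (unnormalized covariance)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


def mediation_effects(
    treatment: Sequence[float], mediator: Sequence[float], outcome: Sequence[float]
) -> Dict[str, float]:
    """Least-squares fit of
         mediator = a0 + a * treatment
         outcome  = b0 + c * treatment + b * mediator
    and return total (ATE), direct (c) and indirect (a * b) effects."""
    s_tt = _cov_sum(treatment, treatment)
    s_mm = _cov_sum(mediator, mediator)
    s_tm = _cov_sum(treatment, mediator)
    s_ty = _cov_sum(treatment, outcome)
    s_my = _cov_sum(mediator, outcome)
    a = s_tm / s_tt
    det = s_tt * s_mm - s_tm ** 2  # nonzero unless mediator is collinear with treatment
    c = (s_ty * s_mm - s_tm * s_my) / det
    b = (s_tt * s_my - s_tm * s_ty) / det
    return {"total": s_ty / s_tt, "direct": c, "indirect": a * b}


# Toy data: treatment lowers the mediator, which itself helps the outcome.
treatment = [0, 0, 0, 0, 1, 1, 1, 1]
mediator = [5, 6, 4, 5, 3, 4, 2, 3]
outcome = [2, 3, 1, 2, 3, 3, 2, 4]
effects = mediation_effects(treatment, mediator, outcome)
```

On this toy data the total effect is 1.0, made up of a direct effect of 2.5 and an indirect effect of -1.5, so the ATE alone understates what the change itself contributes.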

Introduction to the potential outcome framework

causal identification
assumptions > causal effects

identification in the Rubin causal model, the model behind A/B tests

identification of ATE
strong ignorability and SUTVA > ATE

causal mediation analysis (CMA)
sequential ignorability and SUTVA
we cannot apply CMA directly in A/B tests

multiple unmeasured causally-dependent mediators in A/B tests break SI and invariant ...

what we do

the literature of CMA is only a starting point
we propose new measures for direct and indirect effects
we work out the assumptions that lead to the new measures
we do the estimation and hypothesis testing using real data
we prove that

generalize CMA

generalized SI and LSEM > GADE and GACME

Take-aways

user engagement of different products can be causally dependent
the currently popular KPI in A/B tests, ATE (on conversion), is undesirable for evaluating product changes
the tight attribution metric from funnel analysis is not causally interpretable
GADE and GACME are better KPIs for evaluation purposes
they can be identified and easily estimated and tested in practice

Personalized Purchase Prediction of Market Baskets with Wasserstein-Based Sequence Matching

Mathias Kraus (ETH Zurich); Stefan Feuerriegel (ETH Zurich)

definition of market basket and purchase history
this work aims to identify similar customers to predict future purchases

our approach uses a simple k-nearest-neighbor search to find similar customers
what is the distance between two purchase histories that we use for KNN?

calculated in 3 steps

product embedding + cosine similarity
wasserstein distance
Dynamic time warping

1, cosine similarity

obtain a multi-dimensional vector for each product
similar products are close to each other such as substitutes like red and white wine.
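A small sketch of cosine similarity between product embeddings. The three-dimensional vectors below are made up for illustration; real embeddings would be learned from purchase data.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


# Hypothetical embeddings: the two wines point in a similar direction.
embeddings = {
    "red wine": [0.9, 0.1, 0.0],
    "white wine": [0.8, 0.2, 0.1],
    "toothpaste": [0.0, 0.1, 0.9],
}
wine_similarity = cosine_similarity(embeddings["red wine"], embeddings["white wine"])
other_similarity = cosine_similarity(embeddings["red wine"], embeddings["toothpaste"])
```

With these toy vectors the two wines score much higher than wine and toothpaste, which is the behavior expected of substitutes.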

2, Wasserstein distance

the Wasserstein distance measures the distance between two sets of products
it measures the minimum amount of distance the embedded products of one basket need to travel to match the embedded products of the other
previously utilized in NLP as a distance measure between documents
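A brute-force sketch of the Wasserstein distance between two baskets of embedded products. It assumes both baskets have the same number of products, each with equal weight, and uses Euclidean distance as the ground cost; under those assumptions the optimal transport plan is a one-to-one matching. The function names are hypothetical and the approach only scales to very small baskets.

```python
import math
from itertools import permutations
from typing import Sequence

Vector = Sequence[float]


def euclidean(u: Vector, v: Vector) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def basket_wasserstein(basket_a: Sequence[Vector], basket_b: Sequence[Vector]) -> float:
    """1-Wasserstein distance for equal-size baskets with uniform weights.

    Tries every one-to-one matching and returns the smallest average
    travel distance (factorial cost, fine for a handful of products).
    """
    if not basket_a or len(basket_a) != len(basket_b):
        raise ValueError("this sketch needs two non-empty baskets of equal size")
    n = len(basket_a)
    best_total = min(
        sum(euclidean(basket_a[i], basket_b[j]) for i, j in enumerate(perm))
        for perm in permutations(range(n))
    )
    return best_total / n


same_products = basket_wasserstein([[0.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, 0.0]])
shifted_products = basket_wasserstein([[0.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [1.0, 1.0]])
```

The first pair contains the same products in a different order, so the distance is 0; in the second pair every product moves by 1, so the distance is 1. For unequal basket sizes or larger baskets a proper optimal transport solver would be needed.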

3, Dynamic time warping

we utilized a customized form of dynamic time warping to find similarity between (sub-)sequences of market baskets
DTW has been shown to be a powerful distance measure between time series
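For reference, the classic dynamic time warping recursion looks like this. It is the standard full-sequence version, not the customized subsequence variant from the talk; the function name and the `dist` callback are hypothetical.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def dtw_distance(
    seq_a: Sequence[T], seq_b: Sequence[T], dist: Callable[[T, T], float]
) -> float:
    """Classic DTW: cheapest monotone alignment of two sequences, where
    one element may be matched to several consecutive elements."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of the first i and first j elements.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # seq_b element repeated
                cost[i][j - 1],      # seq_a element repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def abs_diff(x: float, y: float) -> float:
    return abs(x - y)


stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3], abs_diff)
shifted = dtw_distance([1, 2, 3], [2, 3, 4], abs_diff)
```

Here the sequences are plain numbers to keep the example short. In the setting of the talk the elements would be market baskets and `dist` a basket-level distance such as the Wasserstein distance.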

based on the nearest neighbors, we make the prediction of the next market basket

data

simplified Instacart
product-level Instacart
Ta-feng grocery dataset

conclusion

combination of dynamic time warping for subsequence matching and the Wasserstein
