# Applied Data Science Track Session ADS9: E-commerce and Advertising
Chair: Anne Kao (Boeing)
SMOILE: A Shopper Marketing Optimization and Inverse Learning Engine
Abhilash Reddy Chenreddy (University of Illinois at Chicago); Parshan Pakiman (University of Illinois at Chicago); Selvaprabu Nadarajah (Information and Decision Sciences); Ranganathan Chandrasekaran (Information and Decision Sciences); Rick Abens (Foresight ROI, Inc.)
shopper marketing
pre-store-tactics
in-store-tactics
sales volume after marketing
Why is shopper marketing important?
- it is one of the fastest-growing areas of marketing for consumer packaged goods
- it explains 3% to 5% of the total lift
- it accounts for 3% to 13% of the total marketing budget
- practically, mining historical lift and SM tactics as well as designing SM campaigns are challenging problems
How do brands design marketing campaigns?
Historical data
planning for SM Tactics: sequential decision-making problem over a finite planning horizon
business constraints
Future SM campaign
planning for SM tactics
How do traditional models work?
marketing/media-mix models are widely used for lift attribution
they can become hard to estimate when various business constraints are active.
SMOILE Model
- planning marketing campaigns
- data generation process
modeling lift
related contribution
- empirical optimization
- data-driven optimization
modeling lift
- we model lift as the sum of two parametric functions, one encoding the effect of SM factors and one the effect of non-SM factors on the total lift, each a linear combination of features of the respective factors.
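As a rough illustration of this additive form, the sketch below evaluates lift as the sum of two linear terms, one over SM features and one over non-SM features. The function names, feature values, and weights are hypothetical and not taken from the talk; SMOILE estimates its parameters through inverse learning, which is not shown here.

```python
def linear_term(weights, features):
    """Linear combination of features with the given weights."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * x for w, x in zip(weights, features))


def predict_lift(sm_weights, sm_features, non_sm_weights, non_sm_features):
    """Return (total lift, SM part, non-SM part) under the additive model."""
    sm_part = linear_term(sm_weights, sm_features)
    non_sm_part = linear_term(non_sm_weights, non_sm_features)
    return sm_part + non_sm_part, sm_part, non_sm_part


# Hypothetical example: two SM tactics and two non-SM factors.
total, sm_part, non_sm_part = predict_lift(
    sm_weights=[0.5, 0.25], sm_features=[4.0, 8.0],
    non_sm_weights=[2.0, -1.0], non_sm_features=[1.0, 3.0],
)
```

Keeping the SM and non-SM parts separate is what makes attribution possible: the SM part is the share of lift credited to shopper marketing.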
lift attribution inverse learning and tactic planning optimization
data
- frozen breakfast and wings
- two major retailers in the US
- 11 and 9 SM tactics
SMOILE: performance on test set
SMOILE
- is an integrated framework that merges multiple sources of data to attribute shopper marketing lift and design marketing campaigns
- leverages the structure of the data generation process to compute lift and design SM campaigns that are consistent with the data generation process
- leverages consumer behavior to fit better models of lift, avoiding spurious results
- streamlines the implementation of mining lift and designing SM campaigns
- can be efficiently solved via commercial optimization solvers
Two-Sided Fairness for Repeated Matchings in Two-Sided Markets: A Case Study of a Ride-Hailing Platform
Tom Sühr (Max Planck Institute for Software Systems); Asia J. Biega (Microsoft Research); Meike Zehlike (Max Planck Institute for Software Systems); Krishna P. Gummadi (Max Planck Institute for Software Systems); Abhijnan Chakraborty (Max Planck Institute for Software Systems)
we use two-sided platforms in our everyday life
- e-commerce
- multimedia streaming
- ride-hailing
our focus: ride-hailing platform
what about the drivers on the platform?
concerns about drivers in the ride-hailing platform industry
potential for issues on the driver side
potential for inequality
inequality in our dataset
modeling a two-sided ride-hailing platform
modeling utility for both sides
brief history of fair matching: a long line of work on fairness in matching markets
- school admissions
- ...
fairness of repeated matching
- amortized parity
- amortized proportionality
methods of matching drivers & customers
- nearest driver first (NDF)
- worst-off driver first (WDF)
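The two one-sided rules can be sketched as follows. This is a minimal illustration under assumptions that are not from the talk: drivers are dictionaries with hypothetical `pos` and `income` fields, distance is Euclidean, and "worst-off" means lowest accumulated income.

```python
import math


def nearest_driver_first(request_pos, drivers):
    """NDF: pick the available driver closest to the pickup location."""
    return min(drivers, key=lambda d: math.dist(d["pos"], request_pos))["id"]


def worst_off_driver_first(drivers):
    """WDF: pick the available driver with the lowest accumulated income."""
    return min(drivers, key=lambda d: d["income"])["id"]


# Hypothetical snapshot of three available drivers.
drivers = [
    {"id": "a", "pos": (0.0, 0.0), "income": 120.0},
    {"id": "b", "pos": (5.0, 5.0), "income": 40.0},
    {"id": "c", "pos": (1.0, 1.0), "income": 90.0},
]
request = (1.0, 2.0)

ndf_choice = nearest_driver_first(request, drivers)
wdf_choice = worst_off_driver_first(drivers)
```

In this snapshot NDF favors the passenger (shortest pickup distance) while WDF favors the lowest-earning driver, which is why each rule on its own is one-sided.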
How Fair are the One-Sided methods WDF and NDF?
- in WDF, the inequality in driver income decreases
- but it results in a lower average income for the drivers
our proposal: take two sides together
- optimizing for common inequality measures directly is practically infeasible
- instead minimize the difference ...
how does our two-sided method perform?
does the waiting time for the passengers increase?
Reserve Price Failure Rate Prediction with Header Bidding in Display Advertising
Achir Kalra (Forbes Media LLC); Chong Wang (S&P Global); Cristian Borcea (New Jersey Institute of Technology); Yi Chen (New Jersey Institute of Technology)
display advertising is a big business
real-time bidding
- publisher <> ad exchange <> real-time bidding advertisers
impression revenue in second-price auction
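A small sketch of how revenue is commonly computed for one impression in a second-price auction with a reserve price: the winner pays the larger of the second-highest bid and the reserve. The function name is hypothetical, and treating an unsold impression as zero revenue is a simplifying assumption, not a statement from the talk.

```python
def impression_revenue(bids, reserve_price):
    """Revenue for one impression in a second-price auction with a reserve.

    Returns 0.0 when no bid reaches the reserve, i.e. the reserve price fails.
    """
    qualifying = sorted((b for b in bids if b >= reserve_price), reverse=True)
    if not qualifying:
        return 0.0              # reserve not met: impression unsold here
    if len(qualifying) == 1:
        return reserve_price    # only one bid clears the reserve
    return qualifying[1]        # second-highest bid, already >= reserve
```

For example, with bids of 3.0 and 1.0, a reserve of 1.5 raises revenue from 1.0 to 1.5, while a reserve of 3.5 makes the auction fail; this trade-off is what makes predicting the failure rate useful.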
data censorship problem
problem definition & proposed solution
solution: survival analysis model
parametric survival model
pairwise interaction tensor factorization
features:
- user: ids store, os, browser, network bandwidth, and devices
- ad placement: ad unit size and ad position
- page: URL, channel sections, and whether the page is trending
- contexts: hour of a day, and referrer URL
header bidding regularization
negative log likelihood
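To make the loss concrete, here is a generic negative log likelihood for a Weibull survival model with right-censored observations. It is only a sketch of the standard textbook form: how the talk defines events and censoring, and how the tensor factorization and header bidding regularization enter the loss, are not reproduced. The function name and arguments are assumptions.

```python
import math


def weibull_nll(values, observed, shape, scale):
    """Negative log likelihood of a Weibull model with right-censoring.

    values   : positive observations (e.g. prices)
    observed : 1 if the value is the exact event value,
               0 if only a lower bound is known (right-censored)
    """
    nll = 0.0
    for t, d in zip(values, observed):
        z = (t / scale) ** shape
        if d:
            # log density: log(k / lam) + (k - 1) * log(t / lam) - (t / lam)^k
            log_pdf = (math.log(shape / scale)
                       + (shape - 1.0) * math.log(t / scale) - z)
            nll -= log_pdf
        else:
            # log survival: -(t / lam)^k
            nll += z
    return nll
```

With `shape=1.0` the Weibull reduces to the exponential distribution, which gives a quick sanity check: for `scale=1.0` every observation, censored or not, contributes its own value to the loss.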
data and implementation
dataset: ~16M impressions collected by Forbes
eval:
- Weibull distribution works best for the proposed model
- The proposed model significantly outperforms the baselines
eval on header bids only
conclusion
- proposed a parametric survival model to predict the failure rate of the reserve price of an online display ad impression in an ad exchange auction
- the model is augmented by pairwise interaction tensor factorization and header bidding regularization
- the experimental results show that the proposed models with the Weibull distribution significantly outperform the comparison systems
- our model can be adopted by a majority of online publishers because they can collect similar data.
The Identification and Estimation of Direct and Indirect Effects in A/B Tests through Causal Mediation Analysis
Xuan Yin (Etsy, Inc.); Liangjie Hong (Etsy, Inc.)
background:
user engagements of different products can be causally dependent
examples of online products:
- organic search and promoted listing
intro:
- we see causal dependency from A/B test results
induced change:
a change in one product would induce users to change their behaviors in other products
the most popular KPI is ATE from A/B tests
suppose the underlying causal mechanism is like
rec module, search, conversion
Questions
- does ATE on conversion truly measure the contribution of rec module change to the marketplace?
- Is ATE on conversion still a good KPI for rec module?
- shall we just ignore the induced reduction in user engagement of search?
Problem of Funnel analysis
- too heuristic: no foundation
- ambiguous
- which place shall get the point
- too narrow
- shall rec module get any point?
it may destroy the causal interpretation of experimental results
it subsets the experimental results based on post-treatment criteria
direct and indirect effects
how about we split ATE into two parts: direct effect and indirect effect?
use direct effect on conversion as KPI
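The idea of splitting a total effect can be illustrated with the textbook product-of-coefficients decomposition from linear mediation models. This is not the GADE/GACME estimator from the talk, and it relies on the classical assumptions that the talk argues do not hold directly in A/B tests. The toy data and all names are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def decompose_effects(t, m, y):
    """Split the effect of t on y into direct and indirect (via m) parts.

    Fits two linear equations by ordinary least squares:
        m = a0 + a * t
        y = b0 + c * t + b * m
    direct = c, indirect = a * b, total = direct + indirect.
    """
    s_tt, s_tm, s_mm = cov(t, t), cov(t, m), cov(m, m)
    s_ty, s_my = cov(t, y), cov(m, y)
    a = s_tm / s_tt
    det = s_tt * s_mm - s_tm ** 2
    c = (s_ty * s_mm - s_tm * s_my) / det
    b = (s_tt * s_my - s_tm * s_ty) / det
    return {"direct": c, "indirect": a * b, "total": c + a * b}


# Hypothetical toy data: t = new rec module on/off, m = search engagement,
# y = conversion metric generated as y = 1 + 0.5 * t + 0.2 * m.
t = [0, 0, 0, 0, 1, 1, 1, 1]
m = [5, 6, 4, 5, 3, 4, 2, 3]
y = [1 + 0.5 * ti + 0.2 * mi for ti, mi in zip(t, m)]

effects = decompose_effects(t, m, y)
```

In this toy data the direct effect is positive while the indirect effect through reduced search engagement is negative, so the total effect understates what the rec module contributes on its own.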
Introduction to the potential outcomes framework
causal identification
assumptions > causal effects
identification in the Rubin causal model, the model behind A/B tests
identification of ATE
strong ignorability and SUTVA > ATE
causal mediation analysis (CMA)
sequential ignorability (SI) and SUTVA
we cannot apply CMA directly in A/B tests
- multiple unmeasured causally-dependent mediators in A/B tests break SI and invarialant ...
what we do
- the literature of CMA is only a starting point
- we propose new measures for direct and indirect effects
- we work out the assumptions that lead to the new measures
- we do the estimation and hypothesis testing using real data
- we prove that
generalize CMA
- generalized SI and LSEM > GADE and GACME
Take-aways
- user engagement of different products can be causally dependent
- the current popular KPI in A/B tests, ATE (on conversion), is undesirable for evaluating product changes
- Tight attribution metric from funnel analysis is not causally interpretable
- GADE and GACME are better KPIs for evaluation purposes
- They can be identified and easily estimated and tested in practice
Personalized Purchase Prediction of Market Baskets with Wasserstein-Based Sequence Matching
Mathias Kraus (ETH Zurich); Stefan Feuerriegel (ETH Zurich)
definition of market basket and purchase history
this work aims to identify similar customers to predict future purchases
our approach uses a simple k-nearest-neighbor (KNN) search to find similar customers
what is the distance between two purchase histories that we use for KNN?
calculated in 3 steps
- product embedding + cosine similarity
- Wasserstein distance
- Dynamic time warping
1, cosine similarity
- obtain a multi-dimensional vector for each product
- similar products are close to each other such as substitutes like red and white wine.
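A minimal sketch of step 1, assuming product embeddings are already available as plain lists of floats. The three-dimensional vectors below are invented for illustration and are not learned embeddings from the talk.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical embeddings: substitutes should point in similar directions.
embeddings = {
    "red wine": [0.9, 0.1, 0.0],
    "white wine": [0.8, 0.2, 0.0],
    "toothpaste": [0.0, 0.1, 0.9],
}

wine_sim = cosine_similarity(embeddings["red wine"], embeddings["white wine"])
other_sim = cosine_similarity(embeddings["red wine"], embeddings["toothpaste"])
```

Here `wine_sim` is close to 1 and `other_sim` is close to 0, matching the intuition that substitutes sit near each other in the embedding space.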
2, Wasserstein distance
- the Wasserstein distance measures the distance between two sets of products
- it measures the minimum distance the embedded products of one basket must travel to match the embedded products of the other
- previously utilized in NLP as a distance measure between documents
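Step 2 can be sketched for a simplified special case: two baskets with the same number of products and uniform weights, where optimal transport reduces to the cheapest one-to-one matching. The brute-force search, the equal-size restriction, and the use of cosine distance as ground cost are assumptions made for this illustration; baskets of different sizes need a general optimal transport solver.

```python
import math
from itertools import permutations


def cosine_distance(u, v):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def basket_wasserstein(basket_a, basket_b):
    """Wasserstein distance between two equally sized baskets of embeddings.

    Each product carries mass 1/n, so the optimal plan is a permutation;
    all permutations are tried, which is only feasible for small baskets.
    """
    n = len(basket_a)
    if n == 0 or n != len(basket_b):
        raise ValueError("this sketch needs non-empty baskets of equal size")
    best = min(
        sum(cosine_distance(basket_a[i], basket_b[p[i]]) for i in range(n))
        for p in permutations(range(n))
    )
    return best / n
```

The order of products within a basket does not matter: two baskets with the same embedded products in a different order have distance 0.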
3, Dynamic time warping
- we utilized a customized form of dynamic time warping to find similarity between (sub-)sequences of market baskets
- DTW has been shown to be a powerful distance measure between time series
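For reference, the classic dynamic time warping recurrence looks like the sketch below. The talk uses a customized variant for subsequence matching that is not reproduced here, and the Jaccard distance is only a stand-in for the Wasserstein distance between baskets; both helper names are hypothetical.

```python
def dtw_distance(seq_a, seq_b, dist):
    """Classic dynamic time warping cost between two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of seq_a[:i] and seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # seq_b[j-1] also matched the previous a
                cost[i][j - 1],      # seq_a[i-1] also matched the previous b
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def jaccard_distance(a, b):
    """Stand-in distance between two non-empty baskets given as sets."""
    return 1.0 - len(a & b) / len(a | b)


# Hypothetical purchase histories; history_b repeats the first basket.
history_a = [{"milk", "bread"}, {"wine"}, {"eggs"}]
history_b = [{"milk", "bread"}, {"milk", "bread"}, {"wine"}, {"eggs"}]
aligned_cost = dtw_distance(history_a, history_b, jaccard_distance)
```

Because warping lets one basket align with several consecutive baskets, the repeated purchase in `history_b` adds no cost.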
based on the nearest neighbors, we make the prediction of the next market basket
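One simple way to turn nearest neighbors into a prediction is a vote over the baskets that the neighbors bought next. The voting rule, the `min_share` threshold, and the input layout below are assumptions for illustration; the notes do not record how the talk derives the predicted basket.

```python
from collections import Counter


def predict_next_basket(neighbors, k, min_share=0.5):
    """Predict the next basket from the k nearest customers.

    neighbors : list of (distance, next_basket) pairs, one per candidate
                customer, where next_basket is a set of products
    Keeps every product bought by at least min_share of the k neighbors.
    """
    nearest = sorted(neighbors, key=lambda pair: pair[0])[:k]
    counts = Counter(p for _, basket in nearest for p in basket)
    return {p for p, c in counts.items() if c / len(nearest) >= min_share}


# Hypothetical candidates: (distance to the target customer, next basket).
neighbors = [
    (0.1, {"milk", "bread"}),
    (0.2, {"milk", "eggs"}),
    (0.9, {"wine"}),
    (0.3, {"milk", "bread"}),
]
predicted = predict_next_basket(neighbors, k=3)
```

With `k=3` the distant wine buyer is ignored, and only products bought by at least half of the three nearest customers are kept.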
data
- simplified Instacart
- product-level Instacart
- Ta-feng grocery dataset
conclusion
- combination of dynamic time warping for subsequence matching and the Wasserstein