Deep direct likelihood knockoffs Conference Paper


Authors: Sudarshan, M.; Tansey, W.; Ranganath, R.
Title: Deep direct likelihood knockoffs
Conference Title: 2020 Conference on Neural Information Processing Systems (NeurIPS)
Abstract: Predictive modeling often uses black box machine learning methods, such as deep neural networks, to achieve state-of-the-art performance. In scientific domains, the scientist often wishes to discover which features are actually important for making the predictions. These discoveries may lead to costly follow-up experiments and as such it is important that the error rate on discoveries is not too high. Model-X knockoffs [2] enable important features to be discovered with control of the false discovery rate (FDR). However, knockoffs require rich generative models capable of accurately modeling the knockoff features while ensuring they obey the so-called “swap” property. We develop Deep Direct Likelihood Knockoffs (DDLK), which directly minimizes the KL divergence implied by the knockoff swap property. DDLK consists of two stages: it first maximizes the explicit likelihood of the features, then minimizes the KL divergence between the joint distribution of features and knockoffs and any swap between them. To ensure that the generated knockoffs are valid under any possible swap, DDLK uses the Gumbel-Softmax trick to optimize the knockoff generator under the worst-case swap. We find DDLK has higher power than baselines while controlling the false discovery rate on a variety of synthetic and real benchmarks including a task involving a large dataset from one of the epicenters of COVID-19. © 2020 Neural information processing systems foundation. All rights reserved.
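Illustration: the swap operation and the Gumbel-Softmax relaxation named in the abstract can be sketched as follows. This is a minimal standard-library sketch, not the authors' implementation. The function names (`swap`, `relaxed_swap_sample`, `soft_swap`), the per-feature swap logits, and the temperature are all hypothetical and chosen only for illustration. It assumes a swap is encoded as one binary indicator per feature, relaxed to a value between 0 and 1 so that a swap distribution could in principle be optimized by gradient methods.

```python
import math
import random


def swap(x, x_tilde, b):
    """Hard swap: exchange x[j] and x_tilde[j] wherever b[j] is 1."""
    u = [xt if bj else xj for xj, xt, bj in zip(x, x_tilde, b)]
    u_tilde = [xj if bj else xt for xj, xt, bj in zip(x, x_tilde, b)]
    return u, u_tilde


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def relaxed_swap_sample(logits, temperature, rng):
    """Binary Gumbel-Softmax sample: one soft swap indicator per feature.

    The difference of two Gumbel variables is logistic noise, so the
    two-class Gumbel-Softmax reduces to a tempered sigmoid.
    """
    sample = []
    for logit in logits:
        p = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noise = math.log(p) - math.log(1.0 - p)
        sample.append(_sigmoid((logit + noise) / temperature))
    return sample


def soft_swap(x, x_tilde, s):
    """Relaxed swap: convex mix of each feature and its knockoff."""
    u = [(1.0 - sj) * xj + sj * xt for xj, xt, sj in zip(x, x_tilde, s)]
    u_tilde = [sj * xj + (1.0 - sj) * xt for xj, xt, sj in zip(x, x_tilde, s)]
    return u, u_tilde


rng = random.Random(0)
x = [1.0, 2.0, 3.0]          # features (toy values)
x_tilde = [10.0, 20.0, 30.0]  # knockoffs (toy values)
s = relaxed_swap_sample([0.0, 2.0, -2.0], temperature=0.5, rng=rng)
u, u_tilde = soft_swap(x, x_tilde, s)
```

As the temperature approaches zero, each soft indicator moves toward 0 or 1 and `soft_swap` approaches the hard `swap`. A knockoff generator would be valid when the joint distribution of `(u, u_tilde)` matches that of `(x, x_tilde)` for every swap.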
Keywords: predictive modeling; false discovery rate; learning systems; machine learning methods; generative model; important features; deep neural networks; predictive analytics; state-of-the-art performance; large dataset; joint distributions; kl-divergence
Journal Title: Advances in Neural Information Processing Systems
Volume: 33
Conference Dates: 2020 Dec 6-12
Conference Location: Virtual
ISSN: 1049-5258
Publisher: Neural Information Processing Systems Foundation
Date Published: 2020-01-01
Language: English
PROVIDER: scopus
PMCID: PMC8096517
PUBMED: 33953523
DOI/URL:
Notes: Conference Paper -- Export Date: 1 July 2021 -- Source: Scopus
MSK Authors
  1. Wesley Tansey