**Are you an EPFL student looking for a semester project?**

Work with us on data science and visualisation projects, and deploy your project as an app on top of GraphSearch.

Publication# FANOK: Knockoffs in Linear Time

Abstract

We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large-scale feature selection problems. Identifying the knockoff distribution requires solving a large-scale semidefinite program for which we derive several efficient methods. One handles generic covariance matrices and has a complexity scaling as O(p(3)), where p is the ambient dimension, while another assumes a rank-k factor model on the covariance matrix to reduce this complexity bound to O(pk(2)). We review an efficient procedure to estimate factor models and show that under a factor model assumption, we can sample knockoff covariates with complexity linear in the dimension. We test our methods on problems with p as large as 500 000.

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related concepts (10)

Related publications (1)

Covariance matrix

In probability theory and statistics, a covariance matrix (also known as auto-covariance matrix, dispersion matrix, variance matrix, or variance–covariance matrix) is a square matrix giving the covariance between each pair of elements of a given random vector. Any covariance matrix is symmetric and positive semi-definite and its main diagonal contains variances (i.e., the covariance of each element with itself). Intuitively, the covariance matrix generalizes the notion of variance to multiple dimensions.

Time complexity

In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, supposing that each elementary operation takes a fixed amount of time to perform. Thus, the amount of time taken and the number of elementary operations performed by the algorithm are taken to be related by a constant factor.

Problem solving

Problem solving is the process of achieving a goal by overcoming obstacles, a frequent part of most activities. Problems in need of solutions range from simple personal tasks (e.g. how to turn on an appliance) to complex issues in business and technical fields. The former is an example of simple problem solving (SPS) addressing one issue, whereas the latter is complex problem solving (CPS) with multiple interrelated obstacles.

In 1971, the first microprocessor produced in mass production had 2300 transistor and was able to compute 60'000 operations per second at 740 kHz. Nowadays, even a common gaming console uses a central