Person: Zhiwen Jiang

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related publications (1)

In multiple testing problems where the components come from a mixture model of noise and true effect, we seek first to test for the existence of non-zero components, and then to identify the true alternatives under a fixed significance level $\alpha$. Two parameters, the fraction of non-null components $\varepsilon$ and the size of the effects $\mu$, characterise the two-point mixture model under the global alternative. As the number of hypotheses $m$ goes to infinity, we are interested in an asymptotic framework where the fraction of non-null components vanishes and the true effects need to be sizable to be detected. Donoho and Jin give an explicit form of the asymptotic detection boundary for the Gaussian mixture model under the classic calibration of the mixture parameters. We prove analogous results for the Cauchy mixture distribution as an example of a heavy-tailed case. This requires a different formulation of the parameters, which reflects the added difficulties.
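For orientation, the Gaussian result referred to above can be written out as follows. This is the standard statement of the Donoho and Jin calibration, adapted to the symbols $m$, $\varepsilon$ and $\mu$ used here; the exponents $\beta$ and $r$ and the boundary function $\rho^{*}$ are notation introduced for this sketch and do not appear in the abstract. The Cauchy calibration is not reproduced, since the abstract does not state it.

$$
X_i \overset{\text{iid}}{\sim} (1-\varepsilon_m)\,N(0,1) + \varepsilon_m\,N(\mu_m,1),
\qquad
\varepsilon_m = m^{-\beta},\quad \tfrac12 < \beta < 1,
\qquad
\mu_m = \sqrt{2r\log m},\quad 0 < r < 1,
$$

$$
\rho^{*}(\beta) =
\begin{cases}
\beta - \tfrac12, & \tfrac12 < \beta \le \tfrac34,\\[4pt]
\bigl(1-\sqrt{1-\beta}\bigr)^{2}, & \tfrac34 < \beta < 1.
\end{cases}
$$

Under this calibration the global alternative is asymptotically detectable when $r > \rho^{*}(\beta)$ and undetectable when $r < \rho^{*}(\beta)$.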
We also propose a multiple testing procedure based on a filtering approach that can discover the true alternatives.
Benjamini and Hochberg (BH) compare the ordered $p$-values to a linear threshold curve, reject the null hypotheses from the smallest $p$-value up to the last up-crossing, and prove that this controls the false discovery rate (FDR).
However, there is an intrinsic difference in heavy-tailed settings: were we to use the BH procedure, we would get a highly variable positive false discovery rate (pFDR). In our study we analyse the distribution of the $p$-values and devise a new multiple testing procedure that covers both the usual case and the heavy-tailed case, based on the empirical properties of the $p$-values. The filtering approach is designed to eliminate most $p$-values that are more likely to be uniform, while preserving most of the true alternatives. Based on the filtered $p$-values, we estimate the mode $\vartheta$ and define the rejection region $\mathscr{R}(\vartheta, \delta)=\left[ \vartheta -\delta/2, \vartheta +\delta/2 \right]$ such that the most informative $p$-values are included. The length $\delta$ is chosen by controlling a data-dependent estimate of the FDR at a desired level.
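To make the shape of the rejection region concrete, the sketch below only evaluates membership in $\mathscr{R}(\vartheta, \delta)$ for given values of $\vartheta$ and $\delta$. The abstract does not specify the filtering step, the mode estimator, or the rule for choosing $\delta$, so none of these is implemented here; `rejection_region` and `in_region` are hypothetical names used for illustration.

```python
def rejection_region(theta, delta):
    """Return the endpoints of R(theta, delta) = [theta - delta/2, theta + delta/2].

    theta and delta are taken as given inputs; how they are estimated
    is not covered by this sketch.
    """
    half = delta / 2.0
    return (theta - half, theta + half)


def in_region(p_values, theta, delta):
    """Flag each p-value that falls inside the closed interval R(theta, delta)."""
    lo, hi = rejection_region(theta, delta)
    return [lo <= p <= hi for p in p_values]
```

Unlike the BH rule, which always rejects the smallest $p$-values, an interval centred at the mode can exclude very small $p$-values when $\vartheta - \delta/2 > 0$.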