Université Paris-Est Université Paris-Est - Marne-la-Vallée Université Paris-Est - Créteil Val-de-Marne Centre National de la Recherche Scientifique

Bayesian nonparametric inference for discovery probabilities: credible intervals and large sample asymptotics

04/10/2016 - 13:45 - 14:45
P2 P43
ARBEL Julyan

Joint work with Stefano Favaro (University of Torino), Bernardo Nipoti (Trinity College Dublin), and Yee Whye Teh (University of Oxford)

Given a sample of size $n$ from a population of individuals belonging to different species with unknown proportions, a problem of practical interest is to make inference on the probability $D_{n}(l)$ that the $(n+1)$-th draw coincides with a species with frequency $l$ in the sample, for any $l=0,1,\ldots,n$. This paper contributes to the methodology of Bayesian nonparametric inference for $D_{n}(l)$. Specifically, under the general framework of Gibbs-type priors, we show how to derive credible intervals for a Bayesian nonparametric estimator of $D_{n}(l)$, and we investigate the large-$n$ asymptotic behaviour of this estimator. Of particular interest are the special cases of our results obtained under the two-parameter Poisson--Dirichlet prior and the normalized generalized Gamma prior, which are two of the most commonly used Gibbs-type priors. For these two prior specifications, the proposed results are illustrated through a simulation study and a benchmark Expressed Sequence Tags dataset. To the best of our knowledge, this illustration provides the first comparative study between the two-parameter Poisson--Dirichlet prior and the normalized generalized Gamma prior in the context of Bayesian nonparametric inference for $D_{n}(l)$.
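To make the quantity $D_{n}(l)$ concrete, the following is a minimal sketch of its posterior-mean point estimate under the two-parameter Poisson--Dirichlet prior, assuming the standard predictive rule of that prior: a new species appears with probability $(\theta + \sigma k)/(\theta + n)$, and a species already observed $l$ times is redrawn with probability $(l - \sigma)/(\theta + n)$, where $k$ is the number of distinct species in the sample. The function name, the toy sample, and the parameter values $\sigma = 0.5$, $\theta = 1$ are illustrative assumptions, not taken from the talk. The sketch covers only the point estimate; it does not reproduce the credible intervals or the normalized generalized Gamma case discussed in the abstract.

```python
from collections import Counter


def discovery_probabilities(sample, sigma, theta):
    """Posterior-mean estimates of D_n(l) under a two-parameter
    Poisson-Dirichlet prior (illustrative helper, not the authors' code).

    sample : sequence of hashable species labels
    sigma  : discount parameter, assumed to satisfy 0 <= sigma < 1
    theta  : strength parameter, assumed to satisfy theta > -sigma
    """
    n = len(sample)
    species_counts = Counter(sample)       # frequency of each species
    k = len(species_counts)                # number of distinct species
    m = Counter(species_counts.values())   # m[l] = number of species seen l times
    denom = theta + n

    # l = 0: probability that the next draw is a species not yet observed.
    probs = {0: (theta + sigma * k) / denom}
    # l >= 1: probability that the next draw is a species seen exactly l times.
    for l, m_l in sorted(m.items()):
        probs[l] = (l - sigma) * m_l / denom
    return probs


# Toy sample: n = 7 draws, k = 4 distinct species.
sample = ["a", "a", "b", "c", "c", "c", "d"]
probs = discovery_probabilities(sample, sigma=0.5, theta=1.0)
for l, p in probs.items():
    print(f"D_n({l}) estimate: {p:.4f}")
```

The estimates sum to one over $l = 0$ and the frequencies present in the sample, since $\theta + \sigma k + \sum_{l} (l - \sigma) m_l = \theta + n$, where $m_l$ is the number of species observed exactly $l$ times.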