Sankhya A

Concentration in the Generalized Chinese Restaurant Process – 2018

with A. Pereira (UFAL) and R. I. Oliveira (IMPA)

Abstract

The Generalized Chinese Restaurant Process (GCRP) describes a sequence of exchangeable random partitions of the numbers \{1,\dots,n\}. This process is related to the Ewens sampling model in genetics and to Bayesian nonparametric methods such as topic models. In this paper, we study the GCRP in a regime where the number of parts grows like n^{\alpha} with \alpha>0. We prove a non-asymptotic concentration result for the number of parts of size k=o(n^{\alpha / (2\alpha+4)}/(\log n)^{1/(2+\alpha)}). In particular, we show that these random variables concentrate around c_kV_{*}n^{\alpha}, where V_{*}n^{\alpha} is the asymptotic number of parts and c_k\approx k^{-(1+\alpha)} is a positive value depending on k. We also obtain finite-n bounds for the total number of parts. Our theorems complement asymptotic statements by Pitman and more recent results on large and moderate deviations by Favaro, Feng and Gao.
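To get a feel for the quantities in the abstract, here is a minimal simulation sketch. It assumes the standard two-parameter seating rule, which the abstract itself does not spell out: with m customers seated at K tables, the next customer opens a new table with probability (\theta + K\alpha)/(m + \theta) and joins an existing table of size s with probability (s - \alpha)/(m + \theta), where 0 < \alpha < 1 and \theta > -\alpha. The parameter name `theta` and the function names `simulate_gcrp` and `parts_by_size` are illustrative and are not taken from the paper.

```python
import random
from collections import Counter


def simulate_gcrp(n, alpha, theta, seed=0):
    """Seat n customers one at a time and return the list of table sizes.

    Assumed seating rule (standard two-parameter form, not quoted from the paper):
    new table with weight theta + K*alpha, existing table of size s with weight s - alpha.
    """
    if not (0 < alpha < 1) or theta <= -alpha:
        raise ValueError("this sketch assumes 0 < alpha < 1 and theta > -alpha")
    if n <= 0:
        return []
    rng = random.Random(seed)
    sizes = [1]  # the first customer always opens the first table
    for m in range(1, n):  # m customers are already seated
        new_table_weight = theta + len(sizes) * alpha
        u = rng.random() * (m + theta)  # the weights sum to m + theta
        if u < new_table_weight:
            sizes.append(1)
            continue
        u -= new_table_weight
        for j, s in enumerate(sizes):
            u -= s - alpha
            if u < 0:
                sizes[j] += 1
                break
        else:
            sizes[-1] += 1  # guard against floating-point rounding
    return sizes


def parts_by_size(sizes):
    """Map each part size k to the number of parts that have exactly that size."""
    return Counter(sizes)


if __name__ == "__main__":
    n, alpha, theta = 5000, 0.5, 1.0
    sizes = simulate_gcrp(n, alpha, theta)
    counts = parts_by_size(sizes)
    print("number of parts:", len(sizes), " n**alpha:", n ** alpha)
    for k in range(1, 6):
        print("parts of size", k, ":", counts[k])
```

Running this for increasing n lets you compare the total number of parts with n^{\alpha} and watch how the counts of small parts decay in k. The inner loop makes it quadratic in the worst case, so it is only meant for modest n.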

Links

You may find the PDF file at

ArXiv
Sankhya A
