1. Exploring the connection between proximities and predictions in Random Forests

 

ABSTRACT

Random Forests (RF) is an ensemble method for regression, classification and, more recently, survival analysis. This note explores the connection between the voting scheme used for prediction and the proximities produced as RF output. We state that a kernel-type estimator based on the proximities is equivalent to a weighted voting scheme for prediction.
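
One way to read the stated equivalence is sketched below, under assumptions that are not taken from the note itself: the proximity between a query point and a training point is the fraction of trees in which both fall in the same leaf, each tree votes with the mean response of the training points in the query's leaf, and each vote is weighted by the size of that leaf (no bootstrap resampling is modelled). The leaf assignments, the data and the function names are hypothetical.

```python
# Toy leaf assignments: train_leaves[t][i] is the leaf of training point i in tree t.
# All values here are made up for illustration.
train_leaves = [
    [0, 0, 1, 1, 1],
    [0, 1, 1, 2, 2],
    [0, 0, 0, 1, 1],
]
query_leaves = [1, 1, 0]          # leaf reached by the query point in each tree
y = [1.0, 2.0, 3.0, 4.0, 5.0]     # training responses


def proximity_kernel_estimate(train_leaves, query_leaves, y):
    """Kernel-type estimate: responses averaged with proximity weights."""
    n_trees = len(train_leaves)
    # Proximity of the query to point i = share of trees where they co-occur in a leaf.
    prox = [
        sum(train_leaves[t][i] == query_leaves[t] for t in range(n_trees)) / n_trees
        for i in range(len(y))
    ]
    return sum(p * yi for p, yi in zip(prox, y)) / sum(prox)


def weighted_vote_estimate(train_leaves, query_leaves, y):
    """Weighted vote: each tree votes its leaf mean, weighted by leaf size."""
    numerator, total_weight = 0.0, 0
    for leaves, q in zip(train_leaves, query_leaves):
        members = [yi for leaf, yi in zip(leaves, y) if leaf == q]
        if not members:
            continue  # tree has no training point in the query's leaf
        vote = sum(members) / len(members)   # leaf mean
        weight = len(members)                # assumed weight: leaf size
        numerator += weight * vote
        total_weight += weight
    return numerator / total_weight


kernel = proximity_kernel_estimate(train_leaves, query_leaves, y)
vote = weighted_vote_estimate(train_leaves, query_leaves, y)
print(kernel, vote)
```

Under these assumptions the two estimates coincide because summing proximity-weighted responses over training points is the same as summing leaf totals over trees; an unweighted average of the tree votes would generally give a different value.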

1. Empirical saddlepoint approximations for the quantile of the distribution of the sample mean

 

ABSTRACT

In this paper, an approximation for the quantile of the distribution of the sample mean of independent and identically distributed observations is derived by inverting the Lugannani-Rice formula. As the inversion involves unknown quantities that depend on the cumulant generating function (c.g.f.) of the population, the empirical c.g.f. is used to estimate them; this yields an explicit empirical formula for the quantile approximation. Its accuracy is examined both theoretically and numerically.
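
For orientation, the standard Lugannani-Rice tail approximation for the mean of n observations and the empirical c.g.f. can be written as below. The notation (K, \hat{s}, r, u, \hat{K}) is assumed here for illustration and is not taken from the paper; the paper's explicit quantile formula is not reproduced.

```latex
% Lugannani-Rice approximation for the sample mean, valid for x \neq E[X]
\[
  P(\bar{X} \ge x) \;\approx\; 1 - \Phi(r) + \phi(r)\left(\frac{1}{u} - \frac{1}{r}\right),
\]
% where the saddlepoint \hat{s} solves K'(\hat{s}) = x and
\[
  r = \operatorname{sgn}(\hat{s})\,\sqrt{2n\,\bigl(\hat{s}\,x - K(\hat{s})\bigr)},
  \qquad
  u = \hat{s}\,\sqrt{n\,K''(\hat{s})}.
\]
% Empirical c.g.f. computed from the observations X_1, \dots, X_n
\[
  \hat{K}(s) = \log\!\left(\frac{1}{n}\sum_{i=1}^{n} e^{s X_i}\right).
\]
```

Replacing K by \hat{K} in the first two displays gives an empirical tail approximation; a quantile approximation then corresponds to solving for the x at which this tail probability equals a prescribed level.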