A Kernelized Stein Discrepancy for Goodness-of-fit Tests

1. Summary

This paper derives a new discrepancy statistic for measuring differences between two probability distributions, obtained by combining Stein's identity with reproducing kernel Hilbert space (RKHS) theory.

2. Stein Discrepancy Measure

For two smooth densities $p(x)$ and $q(x)$ supported on $\mathcal{X} \subseteq \mathbb{R}^d$, $p$ and $q$ are identical if and only if

$$\mathbb{E}_{x \sim q}\big[\, s_p(x) f(x) + \nabla_x f(x) \,\big] = 0$$

for all smooth functions $f(x)$ with proper zero-boundary conditions, where

$$s_p(x) = \nabla_x \log p(x)$$

is the Stein score function of $p$. Therefore, one can define a Stein discrepancy measure between $p$ and $q$ via

$$\mathbb{S}(q, p) = \max_{f \in \mathcal{F}}\ \Big( \mathbb{E}_{x \sim q}\big[\, s_p(x) f(x) + \nabla_x f(x) \,\big] \Big)^2,$$

where $\mathcal{F}$ is a set of smooth functions that satisfies the zero-boundary condition and is also rich enough to ensure $\mathbb{S}(q, p) > 0$ whenever $p \neq q$. The problem is that this maximization over $\mathcal{F}$ is often computationally intractable.
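As a quick numerical illustration (not from the paper), the score function can be computed from an unnormalized density, since the normalizing constant drops out of $\nabla_x \log p(x)$; the Gaussian example and all names below are our own:

```python
import numpy as np

# Illustration: the score function s_p(x) = d/dx log p(x) of an unnormalized
# 1-D Gaussian p(x) ∝ exp(-(x - mu)^2 / (2 sigma^2)).
# The normalizing constant Z drops out: s_p(x) = -(x - mu) / sigma^2.
mu, sigma = 1.0, 2.0

def log_p_unnorm(x):
    # log of the unnormalized density; the constant -log Z is irrelevant for the score
    return -(x - mu) ** 2 / (2 * sigma ** 2)

def score_p(x):
    return -(x - mu) / sigma ** 2

# Sanity check: the analytic score matches a central finite difference of log p.
x0, eps = 0.3, 1e-5
fd = (log_p_unnorm(x0 + eps) - log_p_unnorm(x0 - eps)) / (2 * eps)
assert abs(fd - score_p(x0)) < 1e-6
```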

3. Kernelized Stein Discrepancy

This paper proposes a simple method for obtaining a computationally tractable Stein discrepancy by taking $\mathcal{F}$ to be the unit ball of the reproducing kernel Hilbert space (RKHS) associated with a smooth positive definite kernel $k(x, x')$. The resulting kernelized Stein discrepancy (KSD) is defined as

$$\mathbb{S}(q, p) = \mathbb{E}_{x, x' \sim q}\big[\, u_p(x, x') \,\big],$$

where $x, x'$ are i.i.d. random variables drawn from $q$, and $u_p$ is a function that depends on $p$ only through the score function $s_p(x) = \nabla_x \log p(x)$, which can be calculated efficiently even when $p$ has an intractable normalization constant. Specifically,

$$u_p(x, x') = s_p(x)^\top k(x, x')\, s_p(x') + s_p(x)^\top \nabla_{x'} k(x, x') + \nabla_x k(x, x')^\top s_p(x') + \operatorname{trace}\big(\nabla_x \nabla_{x'} k(x, x')\big),$$

and if $p(x) = \bar{p}(x)/Z$ with (possibly intractable) normalization constant $Z$, then $s_p(x) = \nabla_x \log \bar{p}(x)$ does not depend on $Z$.
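For concreteness, here is a minimal sketch (not from the paper) of $u_p(x, x')$ for the RBF kernel $k(x, x') = \exp(-\|x - x'\|^2 / (2h^2))$, whose derivatives have closed forms; the bandwidth `h` and the function names are our own choices:

```python
import numpy as np

# Stein kernel u_p(x, x') for an RBF kernel k(x, x') = exp(-||x - x'||^2 / (2 h^2)).
# `score` is the score function s_p = grad log p of the target density p.
def u_p(x, xp, score, h=1.0):
    d = x.shape[0]
    diff = x - xp                               # x - x'
    k = np.exp(-diff @ diff / (2 * h**2))       # k(x, x')
    grad_x_k = -diff / h**2 * k                 # grad_x k(x, x')
    grad_xp_k = diff / h**2 * k                 # grad_{x'} k(x, x')
    trace_term = k * (d / h**2 - diff @ diff / h**4)  # trace(grad_x grad_{x'} k)
    sx, sxp = score(x), score(xp)
    return sx @ sxp * k + sx @ grad_xp_k + grad_x_k @ sxp + trace_term

# Example: standard Gaussian target p, whose score is s_p(x) = -x.
score = lambda t: -t
val = u_p(np.array([0.5, -1.0]), np.array([0.2, 0.3]), score, h=1.0)
```

Note that $u_p$ is symmetric in its two arguments, as expected for an RKHS inner product.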

4. Estimating the Kernelized Stein Discrepancy

Given an i.i.d. sample $\{x_i\}_{i=1}^n$ drawn from the (unknown) $q$, the kernelized Stein discrepancy also enables efficient empirical estimation of $\mathbb{S}(q, p)$ via a $U$-statistic,

$$\hat{\mathbb{S}}(q, p) = \frac{1}{n(n-1)} \sum_{i \neq j} u_p(x_i, x_j).$$

The distribution of $\hat{\mathbb{S}}(q, p)$ can be well characterized using the theory of $U$-statistics.
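A minimal NumPy sketch of this $U$-statistic for 1-D samples, again assuming an RBF kernel with bandwidth `h` (the name `ksd_u` is ours, not the paper's):

```python
import numpy as np

# U-statistic estimate S_hat(q, p) = 1/(n(n-1)) * sum_{i != j} u_p(x_i, x_j)
# for 1-D samples, an RBF kernel with bandwidth h, and a user-supplied score function.
def ksd_u(x, score, h=1.0):
    n = x.shape[0]
    diff = x[:, None] - x[None, :]            # pairwise x_i - x_j
    K = np.exp(-diff**2 / (2 * h**2))         # k(x_i, x_j)
    gK_x = -diff / h**2 * K                   # d/dx_i k
    gK_y = diff / h**2 * K                    # d/dx_j k
    trace = K * (1 / h**2 - diff**2 / h**4)   # d^2/(dx_i dx_j) k
    s = score(x)
    U = s[:, None] * s[None, :] * K + s[:, None] * gK_y + gK_x * s[None, :] + trace
    np.fill_diagonal(U, 0.0)                  # drop i == j terms
    return U.sum() / (n * (n - 1))

rng = np.random.default_rng(0)
x = rng.standard_normal(500)                  # sample from q = N(0, 1)
s_hat = ksd_u(x, score=lambda t: -t)          # p = N(0, 1): S_hat should be near 0
```

With a mismatched target (e.g. the score of $\mathcal{N}(2, 1)$), the estimate is far from zero, which is what the test below exploits.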

Theorem. Let $k(x, x')$ be a positive definite kernel in the Stein class of $p$ and $q$. Under some mild conditions, we have:

  • If $p \neq q$, then $\hat{\mathbb{S}}(q, p)$ is asymptotically normal with

$$\sqrt{n}\,\big(\hat{\mathbb{S}}(q, p) - \mathbb{S}(q, p)\big) \;\xrightarrow{d}\; \mathcal{N}(0, \sigma_u^2),$$

where $\sigma_u^2 = \operatorname{var}_{x \sim q}\big(\mathbb{E}_{x' \sim q}[u_p(x, x')]\big)$ and $\sigma_u^2 \neq 0$.

  • If $p = q$, then $\sigma_u^2 = 0$ (the $U$-statistic is degenerate) and

$$n\,\hat{\mathbb{S}}(q, p) \;\xrightarrow{d}\; \sum_{j=1}^{\infty} c_j \big(Z_j^2 - 1\big),$$

where $\{Z_j\}$ are i.i.d. standard Gaussian random variables, and $\{c_j\}$ are the eigenvalues of the kernel $u_p(x, x')$ under $q(x)$; that is, they are the solutions of

$$c_j\, \phi_j(x) = \int_{\mathcal{X}} u_p(x, x')\, \phi_j(x')\, q(x')\, dx'$$

for non-zero eigenfunctions $\phi_j$.

The above theorem allows us to reduce testing whether $p = q$ to the hypothesis test $H_0 \colon \mathbb{S}(q, p) = 0$ versus $H_1 \colon \mathbb{S}(q, p) > 0$, based on the statistic $\hat{\mathbb{S}}(q, p)$; since the null distribution is that of a degenerate $U$-statistic, the test is calibrated with a bootstrap procedure.
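As an illustration of how such a test can be calibrated, here is a sketch using the multinomial-weighted bootstrap commonly used for degenerate $U$-statistics (assumptions: 1-D data, RBF kernel with bandwidth `h`, helper names are ours; the paper's exact bootstrap may differ in details):

```python
import numpy as np

# Weighted bootstrap for a degenerate U-statistic: resample
#   S*_b = sum_{i != j} (w_i - 1/n)(w_j - 1/n) u_p(x_i, x_j),   w ~ Multinomial(n, 1/n) / n,
# and compare the observed statistic against the bootstrap distribution.
def stein_kernel_matrix(x, score, h=1.0):
    # 1-D RBF Stein kernel u_p(x_i, x_j), with the diagonal zeroed out.
    diff = x[:, None] - x[None, :]
    K = np.exp(-diff**2 / (2 * h**2))
    s = score(x)
    U = (s[:, None] * s[None, :] * K + s[:, None] * (diff / h**2 * K)
         + (-diff / h**2 * K) * s[None, :] + K * (1 / h**2 - diff**2 / h**4))
    np.fill_diagonal(U, 0.0)
    return U

def ksd_bootstrap_test(x, score, n_boot=1000, alpha=0.05, h=1.0, seed=0):
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    U = stein_kernel_matrix(x, score, h)
    stat = U.sum() / (n * (n - 1))                        # observed S_hat(q, p)
    w = rng.multinomial(n, np.full(n, 1.0 / n), size=n_boot) / n
    v = w - 1.0 / n                                       # centered bootstrap weights
    boot = np.einsum('bi,ij,bj->b', v, U, v)              # bootstrap statistics S*_b
    p_value = np.mean(boot >= stat)
    return stat, p_value, p_value < alpha

rng = np.random.default_rng(1)
x = rng.standard_normal(300)                              # data from q = N(0, 1)
stat, p, reject = ksd_bootstrap_test(x, score=lambda t: 2 - t)  # target p = N(2, 1): expect rejection
```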
