Computer Science and Engineering, Department of

 

Date of this Version

6-16-2022

Citation

arXiv:2206.07209v1 [cs.DS] 14 Jun 2022

Comments

CC-BY

Abstract

Total variation distance (TV distance) is a fundamental notion of distance between probability distributions. In this work, we introduce and study the computational problem of determining the TV distance between two product distributions over the domain {0, 1}n. We establish the following results.

1. Exact computation of TV distance between two product distributions is #P-complete. This is in stark contrast with other distance measures such as KL, Chi-square, and Hellinger which tensorize over the marginals.

2. Given two product distributions P and Q with marginals of P being at least 1/2 and marginals of Q being at most the respective marginals of P, there exists a fully polynomial-time randomized approximation scheme (FPRAS) for computing the TV distance between P and Q. In particular, this leads to an efficient approximation scheme for the interesting case when P is an arbitrary product distribution and Q is the uniform distribution.

We pose the question of characterizing the complexity of approximating the TV distance between two arbitrary product distributions as a basic open problem in computational statistics.

Share

COinS