Computing, School of

 

School of Computing: Faculty Publications

Accessibility Remediation

If you are unable to use this item in its current form due to accessibility barriers, you may request remediation through our remediation request form.

Document Type

Article

Date of this Version

6-16-2022

Citation

arXiv:2206.07209v1 [cs.DS] 14 Jun 2022

Comments

CC-BY

Abstract

Total variation distance (TV distance) is a fundamental notion of distance between probability distributions. In this work, we introduce and study the computational problem of determining the TV distance between two product distributions over the domain {0, 1}n. We establish the following results.

1. Exact computation of TV distance between two product distributions is #P-complete. This is in stark contrast with other distance measures such as KL, Chi-square, and Hellinger which tensorize over the marginals.

2. Given two product distributions P and Q with marginals of P being at least 1/2 and marginals of Q being at most the respective marginals of P, there exists a fully polynomial-time randomized approximation scheme (FPRAS) for computing the TV distance between P and Q. In particular, this leads to an efficient approximation scheme for the interesting case when P is an arbitrary product distribution and Q is the uniform distribution.

We pose the question of characterizing the complexity of approximating the TV distance between two arbitrary product distributions as a basic open problem in computational statistics.

Share

COinS