Heisenberg uncertainty principle: a Bayesian perspective part I

While I was at QPL presenting

Benavoli, Alessio, Facchini, Alessandro, Zaffalon, Marco: Quantum mechanics: The Bayesian theory generalized to the space of Hermitian matrices. In: Physical Review A, 94, pp. 042106, 2016.

I had a question from the audience about whether, and how, we can derive the Heisenberg inequality as a consequence of our subjective (gambling) formulation of QM. This is not complicated, since the Heisenberg inequality is just the QM version of the covariance inequality, which states that for any two random variables $X$ and $Y$

$$Cov(X,Y)^2\leq Var(X)Var(Y)$$

Therefore, before deriving Heisenberg uncertainty, I will show how to derive the above inequality from a Bayesian perspective.
To explain the inequality from a subjective point of view, we introduce our favorite subject, Alice. Let us assume that there are two real variables $X,Y$ and that Alice only knows their first moments $\mu_x=E(X),\mu_y=E(Y)$ and second moments $E(X^2),E(Y^2)$ (in other words, she only knows their means and variances, since $Var(Z)=E(Z^2)-E(Z)^2=\sigma_z^2$).

Assume Alice wants to compute $Cov(X,Y)$.

Since Alice does not know the joint probability distribution $P(X,Y)$ of $X,Y$ (she only knows the first two moments), she cannot compute $Cov(X,Y)$. However, she can compute bounds for $Cov(X,Y)$, or in other words, she can aim to solve the following problem
$$
\begin{array}{l}
~\max_{P} \int (X-\mu_x)(Y-\mu_y) dP(X,Y)\\
\int X dP(X,Y)=\mu_x\\
\int Y dP(X,Y)=\mu_y\\
\int X^2 dP(X,Y)=\mu_x^2+\sigma_x^2\\
\int Y^2 dP(X,Y)=\mu_y^2+\sigma_y^2\\
\end{array}
$$
This means she aims to find the maximum value of the expectation of $(X-\mu_x)(Y-\mu_y) $ among all the probability distributions that are compatible with her beliefs on $X,Y$ (the knowledge of the means and variances). She can similarly compute the minimum. This is the essence of Imprecise Probability.
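To make the set of compatible distributions concrete, here is a minimal Python sketch (not taken from the paper). It builds a family of four-point distributions that all share the same means and variances but have different covariances. The function names, the parameter `rho`, and the numerical values are illustrative assumptions.

```python
import math


def four_point_distribution(mu_x, mu_y, sigma_x, sigma_y, rho):
    """Return atoms (prob, x, y) of a distribution with the given means,
    standard deviations, and correlation rho in [-1, 1].

    Construction (an assumption made for this sketch): Z1, Z2 are independent
    fair +/-1 coins, X = mu_x + sigma_x*Z1,
    Y = mu_y + sigma_y*(rho*Z1 + sqrt(1 - rho^2)*Z2).
    """
    s = math.sqrt(1.0 - rho * rho)
    atoms = []
    for z1 in (-1.0, 1.0):
        for z2 in (-1.0, 1.0):
            x = mu_x + sigma_x * z1
            y = mu_y + sigma_y * (rho * z1 + s * z2)
            atoms.append((0.25, x, y))
    return atoms


def expect(atoms, f):
    """Expectation of f(X, Y) under the discrete distribution."""
    return sum(p * f(x, y) for p, x, y in atoms)


def covariance(atoms):
    """Cov(X, Y) = E[(X - E X)(Y - E Y)]."""
    mx = expect(atoms, lambda x, y: x)
    my = expect(atoms, lambda x, y: y)
    return expect(atoms, lambda x, y: (x - mx) * (y - my))


if __name__ == "__main__":
    # Same means and variances every time; only the covariance moves.
    for rho in (-1.0, -0.5, 0.0, 0.5, 1.0):
        atoms = four_point_distribution(1.0, -2.0, 2.0, 3.0, rho)
        print(rho, expect(atoms, lambda x, y: x * x), covariance(atoms))
```

Every member of this family satisfies Alice's four constraints, so the constraints alone cannot pin down the covariance; she can only bound it.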

To compute these bounds she first rewrites the above problem as
\begin{equation}
\label{eq:1}
\begin{array}{l}
\text{opt}_{P} C=\int (X-\mu_x)^2 +a^2(Y-\mu_y)^2 -2a (X-\mu_x)(Y-\mu_y) dP(X,Y)\\
\int X dP(X,Y)=\mu_x\\
\int Y dP(X,Y)=\mu_y\\
\int X^2 dP(X,Y)=\mu_x^2+\sigma_x^2\\
\int Y^2 dP(X,Y)=\mu_y^2+\sigma_y^2\\
\end{array}
\end{equation}
where $a$ is some scalar. Since $\int (X-\mu_x)^2dP(X,Y)$ and $\int (Y-\mu_y)^2dP(X,Y)$ are known (they are equal to $\sigma_x^2$ and $\sigma_y^2$, respectively), adding the first two terms does not change the optimization problem: they are just additive constants, so the $P$ that achieves the optimum is the same. Multiplying the cross term $(X-\mu_x)(Y-\mu_y)$ by $-2a$ does not change the optimizer either, as long as we pick the right direction: if $a$ is positive, maximizing the covariance is equivalent to minimizing $C$ (so $\text{opt}=\min$); if $a$ is negative, it is equivalent to maximizing $C$ (so $\text{opt}=\max$).

Now observe that the integrand is a perfect square,
$$
(X-\mu_x)^2 +a^2(Y-\mu_y)^2 -2a (X-\mu_x)(Y-\mu_y) =[X-\mu_x-a(Y-\mu_y)]^2
$$
and, therefore, we can conclude that $C=\int [X-\mu_x-a(Y-\mu_y)]^2 dP(X,Y)$ is non-negative for every $a$ and every $P$.

She has to solve the above constrained optimization problem. For the moment let us forget the constraints.
Let us assume that we can take $a$ as a function of $P$. Then the unconstrained minimum over $a$ can be obtained by computing the derivative of the objective function w.r.t. $a$ and solving
$$
\frac{d}{da}C=\int 2a(Y-\mu_y)^2 -2 (X-\mu_x)(Y-\mu_y) dP(X,Y)=0
$$
whose solution is $a=\frac{E[(X-\mu_x)(Y-\mu_y)]}{E[(Y-\mu_y)^2]}=\frac{Cov(X,Y)}{\sigma_y^2}$. Since the second derivative of $C$ is non-negative, this is a minimum.
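The stationary point can be checked numerically. The sketch below treats a small made-up paired sample as the distribution $P$ (the data and the helper names `centered`, `objective`, and `optimal_a` are assumptions for illustration only) and evaluates $C$ at and around $a=Cov(X,Y)/\sigma_y^2$.

```python
def centered(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def objective(xs, ys, a):
    """C(a) = E[(X - mu_x - a*(Y - mu_y))^2] under the empirical distribution."""
    dx, dy = centered(xs), centered(ys)
    return sum((u - a * v) ** 2 for u, v in zip(dx, dy)) / len(xs)


def optimal_a(xs, ys):
    """Stationary point a = Cov(X, Y) / Var(Y); requires Var(Y) > 0."""
    dx, dy = centered(xs), centered(ys)
    cov = sum(u * v for u, v in zip(dx, dy)) / len(xs)
    var_y = sum(v * v for v in dy) / len(ys)
    return cov / var_y


# Hypothetical data, chosen only to have something to evaluate.
xs = [1.0, 2.0, 4.0, 7.0]
ys = [3.0, 1.0, 5.0, 6.0]

a_star = optimal_a(xs, ys)
for a in (a_star - 0.5, a_star, a_star + 0.5):
    print(f"a = {a:.4f}, C(a) = {objective(xs, ys, a):.4f}")
```

Because $C$ is a quadratic in $a$ with leading coefficient $\sigma_y^2$, moving away from `a_star` in either direction by the same amount raises $C$ by the same amount.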

If we choose $a$ in this way then we have that
\begin{equation}
\begin{array}{rcl}
0&\leq& \int (X-\mu_x)^2 + \Big(\frac{Cov(X,Y)}{\sigma_y^2}\Big)^2(Y-\mu_y)^2 -2\Big(\frac{Cov(X,Y)}{\sigma_y^2}\Big) (X-\mu_x)(Y-\mu_y) dP(X,Y)\\
&=&\sigma_x^2+\frac{Cov(X,Y)^2}{\sigma_y^2}-2\frac{Cov(X,Y)^2}{\sigma_y^2}\\
&=&\sigma_x^2-\frac{Cov(X,Y)^2}{\sigma_y^2}
\end{array}
\end{equation}
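Since we allowed $a$ to depend on $P$, no other choice of $a$ gives a smaller value, so this is the tightest inequality of this form.
Multiplying both sides by $\sigma_y^2>0$ and rearranging, we can derive that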
$$
Cov(X,Y)^2\leq \sigma_x^2\sigma_y^2
$$
Note that to obtain the above inequality we only used the fact that $P(X,Y)$ satisfies $\int (X-\mu_x)^2 dP(X,Y)=\sigma_x^2$
and $\int (Y-\mu_y)^2 dP(X,Y)=\sigma_y^2$, that is, that it satisfies the constraints. The inequality therefore holds for every $P$ compatible with Alice's beliefs. This ends the proof.
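
As a worked follow-up (a sketch, assuming $\sigma_x,\sigma_y>0$), taking square roots in $Cov(X,Y)^2\leq \sigma_x^2\sigma_y^2$ gives the two bounds Alice was looking for:
$$
-\sigma_x\sigma_y\leq Cov(X,Y)\leq \sigma_x\sigma_y
$$
Both bounds are attained. Take any $X$ with mean $\mu_x$ and variance $\sigma_x^2$ and let $P$ put all its mass on the line $Y-\mu_y=\pm\frac{\sigma_y}{\sigma_x}(X-\mu_x)$. Then $Y$ has mean $\mu_y$ and variance $\sigma_y^2$, so the constraints hold, and
$$
Cov(X,Y)=\pm\frac{\sigma_y}{\sigma_x}\int (X-\mu_x)^2 dP(X,Y)=\pm\sigma_x\sigma_y
$$
Hence the maximum of Alice's problem is $\sigma_x\sigma_y$ and the minimum is $-\sigma_x\sigma_y$.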
