# Heisenberg uncertainty principle: a Bayesian perspective part I cont.

In a previous post we derived the Covariance Inequality from a Bayesian (Imprecise Probability) perspective.

There is another, more elegant, way to derive this inequality:

$$Cov(X,Y)^2\leq Var(X)Var(Y)$$

To do that, we introduce again our favorite subject, Alice. Let us summarize the problem. Assume that there are two real random variables $X,Y$ and that Alice only knows their first moments $\mu_x=E(X)$, $\mu_y=E(Y)$ and their second moments $E(X^2)$, $E(Y^2)$ (in other words, she only knows their means and variances, since $Var(Z)=E(Z^2)-E(Z)^2=\sigma_z^2$).

Alice wants to compute $Cov(X,Y)$.

Since Alice does not know the joint probability distribution $P(X,Y)$ of $X,Y$ (she only knows the first two moments), she cannot compute $Cov(X,Y)$. However, she can compute bounds for $Cov(X,Y)$, or in other words, she can aim to solve the following problem
$$\begin{array}{l} \max\limits_{P}~ E[(X-\mu_x)(Y-\mu_y)]=\int (X-\mu_x)(Y-\mu_y)\, dP(X,Y)\\ \text{subject to:}\\ E[X]=\int X\, dP(X,Y)=\mu_x\\ E[Y]=\int Y\, dP(X,Y)=\mu_y\\ E[X^2]=\int X^2\, dP(X,Y)=\mu_x^2+\sigma_x^2\\ E[Y^2]=\int Y^2\, dP(X,Y)=\mu_y^2+\sigma_y^2 \end{array}$$
This means she aims to find the maximum value of the expectation of $(X-\mu_x)(Y-\mu_y)$ among all the probability distributions that are compatible with her beliefs about $X,Y$ (the knowledge of the means and variances). She can similarly compute the minimum. This is the essence of Imprecise Probability.
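Incidentally, Alice's problem can also be attacked numerically. Any $2\times 2$ positive semi-definite matrix with the prescribed diagonal is the covariance matrix of some distribution with the given means (a bivariate Gaussian, for instance), so optimizing over joint distributions reduces to a tiny semidefinite program over moment matrices. Here is a minimal sketch with cvxpy; the values of $\sigma_x$ and $\sigma_y$ are made up for illustration.

```python
import cvxpy as cp

sigma_x, sigma_y = 1.5, 0.8  # hypothetical values of Alice's known standard deviations

# Second-moment matrix of (X - mu_x, Y - mu_y): the diagonal holds the
# variances, the off-diagonal entry is Cov(X, Y), the quantity Alice wants to bound.
M = cp.Variable((2, 2), symmetric=True)
constraints = [
    M >> 0,                    # positive semi-definiteness (derived below)
    M[0, 0] == sigma_x ** 2,   # Var(X) is fixed by Alice's beliefs
    M[1, 1] == sigma_y ** 2,   # Var(Y) is fixed by Alice's beliefs
]

upper = cp.Problem(cp.Maximize(M[0, 1]), constraints).solve()
lower = cp.Problem(cp.Minimize(M[0, 1]), constraints).solve()
print(lower, upper)  # approximately -1.2 and 1.2, i.e. +/- sigma_x * sigma_y
```

The solver returns $\pm\sigma_x\sigma_y$, which anticipates the bound we now derive analytically.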

To compute that, we first observe that
$$\begin{bmatrix} X-\mu_x\\ Y-\mu_y \end{bmatrix}\begin{bmatrix} X-\mu_x & Y-\mu_y\\ \end{bmatrix}= \begin{bmatrix} (X-\mu_x)^2 & (X-\mu_x)(Y-\mu_y)\\ (Y-\mu_y)(X-\mu_x) & (Y-\mu_y)^2\\ \end{bmatrix}\geq 0$$
where $\geq$ means that the matrix is positive semi-definite: it is an outer product $vv^T$, and $z^T vv^T z=(v^T z)^2\geq 0$ for every vector $z$. The expectation operator preserves this sign, because $z^T E[M]z=E[z^T M z]\geq 0$ for every fixed vector $z$, and therefore

$$E\left( \begin{bmatrix} (X-\mu_x)^2 & (X-\mu_x)(Y-\mu_y)\\ (Y-\mu_y)(X-\mu_x) & (Y-\mu_y)^2\\ \end{bmatrix}\right)\geq 0$$

but, because of linearity of expectation, we have that

$$\begin{aligned} &E\left(\begin{bmatrix} (X-\mu_x)^2 & (X-\mu_x)(Y-\mu_y)\\ (Y-\mu_y)(X-\mu_x) & (Y-\mu_y)^2\\ \end{bmatrix}\right)\\ &= \begin{bmatrix} E[(X-\mu_x)^2] & E[(X-\mu_x)(Y-\mu_y)]\\ E[(Y-\mu_y)(X-\mu_x)] & E[(Y-\mu_y)^2]\\ \end{bmatrix}\geq 0. \end{aligned}$$
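As a quick sanity check of this step, here is a minimal sketch with numpy on synthetic data (the particular distribution, seed, and sample size are arbitrary choices): the sample average of the rank-one matrices above is again positive semi-definite, so its eigenvalues come out nonnegative.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
n = 100_000                     # arbitrary sample size

# Synthetic correlated samples of (X, Y); any joint distribution would do.
x = rng.normal(loc=2.0, scale=1.5, size=n)
y = 0.6 * x + rng.normal(size=n)

# Empirical average of the rank-one matrices v v^T with v = (X - mu_x, Y - mu_y).
v = np.stack([x - x.mean(), y - y.mean()])  # shape (2, n)
M = (v @ v.T) / n

# Both eigenvalues are nonnegative: an average of PSD matrices is PSD.
print(np.linalg.eigvalsh(M))
```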

The determinant of a positive semi-definite matrix is nonnegative; for our $2\times 2$ matrix this means that

$$E[(X-\mu_x)(Y-\mu_y)]^2\leq E[(X-\mu_x)^2] E[(Y-\mu_y)^2]$$

which is exactly

$$Cov(X,Y)^2\leq Var(X)Var(Y)$$.
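In particular, Alice's answer is $-\sigma_x\sigma_y\leq Cov(X,Y)\leq \sigma_x\sigma_y$, and both extremes are attained (take $Y-\mu_y$ proportional to $X-\mu_x$), in agreement with the semidefinite program above. As a final sanity check, here is a sketch verifying the inequality on synthetic data with a deliberately nonlinear dependence (again, the distribution and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
n = 100_000                     # arbitrary sample size

# A nonlinear, non-Gaussian dependence chosen on purpose: the inequality
# is distribution-free, so it must hold here too.
x = rng.exponential(size=n)
y = np.sin(x) + 0.5 * rng.normal(size=n)

cov = np.cov(x, y, ddof=0)[0, 1]
print(cov ** 2 <= x.var() * y.var())  # True: Cov(X,Y)^2 <= Var(X) Var(Y)
```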
