### Advantage of probability densities over states

Knowing the *probability density* of a system is more general than knowing its state. It can be used in situations where, although the system might be in some definite state, we only know the probabilities that it is in one particular state or another.

So we are considering two forms of probability.

One relates to the definition of a **state**. A state is a superposition of eigenstates of some given Hermitian operator, and the squared magnitudes of the coefficients are the probabilities that the system will be measured to be in one eigenstate or another.

The other form - the probability density - relates to the **system** (rather than a state) and corresponds to the degree of ignorance about the system.

### Systems in pure states

A system is said to be in a **pure state** if we have complete knowledge about that system, meaning we know exactly which state it's in.

We can interpret this in terms of a probability density, where we would set one of its *eigenvalues* equal to *one* and all of the rest equal to *zero*. So we can say that if we know *for certain* that a system is in a particular state $\ket{\psi}$, then $\ket{\psi}$ must be *the one and only* eigenstate of the probability density of the system, $\boldsymbol{\rho}$ with eigenvalue equal to *one*. That is, we must have $$\boldsymbol{\rho}\ket{\psi} = \ket{\psi}$$

Any state, $\ket{\phi}$, orthogonal to $\ket{\psi}$, has probability *zero*, which means it must be an eigenvector with eigenvalue *zero*, $$\boldsymbol{\rho}\ket{\phi} = 0$$

Interestingly, we can write the probability density as a projection operator - the dyad $$\boldsymbol{\rho} = \ket{\psi}\bra{\psi}$$ since then, $$\begin{align*}\boldsymbol{\rho}\ket{\psi} &= \left(\ket{\psi}\bra{\psi}\right)\ket{\psi}\\&= \ket{\psi}\left(\braket{\psi}{\psi}\right)\\&= \ket{\psi}\left(1\right)\\&= \ket{\psi}\end{align*}$$ and for $\ket{\phi}$ orthogonal to $\ket{\psi}$, $$\begin{align*}\boldsymbol{\rho}\ket{\phi} &= \left(\ket{\psi}\bra{\psi}\right)\ket{\phi}\\&= \ket{\psi}\left(\braket{\psi}{\phi}\right)\\&= \ket{\psi}\left(0\right)\\&= 0\end{align*}$$
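These two defining properties are easy to verify numerically. The sketch below (in NumPy) uses an arbitrary three-dimensional example; the particular vectors $\ket{\psi}$ and $\ket{\phi}$ are illustrative choices, not anything fixed by the discussion above.

```python
import numpy as np

# An arbitrary normalised pure state |psi> in a 3-dimensional state-space.
psi = np.array([1, 1j, 0]) / np.sqrt(2)

# The probability density as the dyad |psi><psi|.
rho = np.outer(psi, psi.conj())

# A state orthogonal to |psi|: here <psi|phi> = (1/2)(1 + (-1j)(-1j)) = 0.
phi = np.array([1, -1j, 0]) / np.sqrt(2)

# rho|psi> = |psi> (eigenvalue one) and rho|phi> = 0 (eigenvalue zero).
assert np.allclose(rho @ psi, psi)
assert np.allclose(rho @ phi, 0)
```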

We use the probability density to calculate the average of a Hermitian observable on a system in a pure state. Since the *trace* is basis-independent, we can calculate it in *any* basis of the state-space, so we can choose one that contains the state $\ket{\psi}$, $\left\{\ket{i}: i=1,\cdots,N\right\}$ say. Then, $$\begin{align*}\left\langle M \right\rangle &= \mathrm{Tr}\left(\boldsymbol{\rho}M\right) & \text{ definition of average}\\&= \sum_{i=1}^{N}\bra{i}\boldsymbol{\rho}M\ket{i} & \text{ definition of trace}\\&= \sum_{i=1}^{N}\bra{i}\left(\ket{\psi}\bra{\psi}\right)M\ket{i} & \text{ since } \boldsymbol{\rho} = \ket{\psi}\bra{\psi}\\&= \sum_{i=1}^{N}\braket{i}{\psi}\bra{\psi}M\ket{i} &\\&= \bra{\psi}M\ket{\psi} & \text{ since } \braket{i}{\psi} = \left\{\begin{matrix}1 & \ket{i} = \ket{\psi}\\0 & \ket{i} \neq \ket{\psi}\end{matrix}\right.\end{align*}$$
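A quick numerical sanity check of $\mathrm{Tr}\left(\boldsymbol{\rho}M\right) = \bra{\psi}M\ket{\psi}$; the state and the randomly generated Hermitian observable are hypothetical examples:

```python
import numpy as np

psi = np.array([1, 1j, 0]) / np.sqrt(2)   # an arbitrary pure state
rho = np.outer(psi, psi.conj())           # rho = |psi><psi|

# A random Hermitian observable, M = (A + A†)/2.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
M = (A + A.conj().T) / 2

avg_trace = np.trace(rho @ M).real        # Tr(rho M)
avg_state = (psi.conj() @ M @ psi).real   # <psi|M|psi>
assert np.isclose(avg_trace, avg_state)
```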

So, for a system in a pure state, we recover the same expression for the average as we had before introducing the probability density.

The entropy of a system in a pure state can be calculated using the result for the average, $$\begin{align*}S &= - \mathrm{Tr}\left(\boldsymbol{\rho} \log \boldsymbol{\rho}\right)\\&= - \bra{\psi}\log \boldsymbol{\rho}\ket{\psi}\\&= - \bra{\psi}\log \rho_\psi\ket{\psi}\\&= - \braket{\psi}{\psi} \log \rho_\psi\\&= - 1 \cdot \log 1\\&= 0\end{align*}$$ Here $\rho_\psi = 1$ is the eigenvalue of $\boldsymbol{\rho}$ on $\ket{\psi}$; the zero eigenvalues contribute nothing, by the usual convention $0 \log 0 = 0$.
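Numerically, the entropy of a pure state comes out as exactly zero once the convention $0 \log 0 = 0$ is applied to the zero eigenvalues (the state below is an arbitrary example):

```python
import numpy as np

psi = np.array([1, 1j, 0]) / np.sqrt(2)
rho = np.outer(psi, psi.conj())      # pure-state probability density

# Eigenvalues of rho: one eigenvalue 1, the rest 0.
evals = np.linalg.eigvalsh(rho)

# S = -sum_i p_i log p_i, with the convention 0 log 0 = 0
# (implemented by skipping the vanishing eigenvalues).
S = -sum(p * np.log(p) for p in evals if p > 1e-12)
assert np.isclose(S, 0.0)
```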

### Systems in mixed states

A system is in a **mixed state** if we only have partial (or no) knowledge of the system. In terms of a probability density, $\boldsymbol{\rho}$ say, this means that more than one of its eigenvalues must be non-zero.

We can use *any* basis to calculate the average of a Hermitian observable, $M$, on a mixed state, but we may as well choose the basis of eigenvectors of $\boldsymbol{\rho}$, $\left\{\ket{e_i}:i=1,2,...,N\right\}$ say, since then $\boldsymbol{\rho}$ is a diagonal matrix.

We can resolve the identity in this basis, $$\mathbf{I} = \sum_{i=1}^{N}\ket{e_i}\bra{e_i}$$

Then, $$\begin{align*}\left\langle M \right\rangle &= \mathrm{Tr}\left(\boldsymbol{\rho}M\right)\\&= \sum_{i=1}^{N}\bra{e_i}\boldsymbol{\rho}M\ket{e_i}\\&= \sum_{i=1}^{N}\bra{e_i}\boldsymbol{\rho}\mathbf{I}M\ket{e_i}\\&= \sum_{i=1}^{N}\bra{e_i}\boldsymbol{\rho}\left(\sum_{j=1}^{N}\ket{e_j}\bra{e_j}\right)M\ket{e_i}\\&= \sum_{i=1}^{N}\sum_{j=1}^{N}\bra{e_i}\boldsymbol{\rho}\ket{e_j}\bra{e_j}M\ket{e_i}\\&= \sum_{i=1}^{N}\bra{e_i}\boldsymbol{\rho}\ket{e_i}\bra{e_i}M\ket{e_i}\\&= \sum_{i=1}^{N}\rho_i\bra{e_i}M\ket{e_i}\end{align*}$$ which is the sum, over $\left\{i=1,2,...,N\right\}$, of the probability of the system to be in the $i^\text{th}$ state, multiplied by the average of $M$ in that state.
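In the eigenbasis of $\boldsymbol{\rho}$ the matrix of $\boldsymbol{\rho}$ is diagonal and $\bra{e_i}M\ket{e_i}$ is just the $i^\text{th}$ diagonal entry of $M$, so the final line can be checked directly. The probabilities and the observable below are hypothetical examples:

```python
import numpy as np

# Eigenvalues of rho: probabilities summing to one.
probs = np.array([0.5, 0.3, 0.2])
rho = np.diag(probs)                 # rho is diagonal in its eigenbasis

# A random Hermitian observable expressed in the same basis.
rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
M = (A + A.conj().T) / 2

avg_trace = np.trace(rho @ M).real                        # Tr(rho M)
avg_sum = sum(probs[i] * M[i, i].real for i in range(3))  # sum_i rho_i <e_i|M|e_i>
assert np.isclose(avg_trace, avg_sum)
```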

The entropy of a system in a mixed state is, $$\begin{align*}S &= - \mathrm{Tr}\left(\boldsymbol{\rho}\log \boldsymbol{\rho}\right)\\&= - \sum_{i=1}^{N}\bra{e_i}\boldsymbol{\rho}\log \boldsymbol{\rho}\ket{e_i}\\&= - \sum_{i=1}^{N}\rho_i\bra{e_i}\log \boldsymbol{\rho}\ket{e_i}\\&= - \sum_{i=1}^{N}\rho_i \log \rho_i\end{align*}$$
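For a concrete (hypothetical) mixed state, a convex mixture of two non-orthogonal pure states, the entropy lands strictly between the pure-state value $0$ and the maximum $\log N$:

```python
import numpy as np

# A mixed state: weights 0.7 / 0.3 on two non-orthogonal pure states.
v1 = np.array([1, 0, 0], dtype=complex)
v2 = np.array([1, 1, 0], dtype=complex) / np.sqrt(2)
rho = 0.7 * np.outer(v1, v1.conj()) + 0.3 * np.outer(v2, v2.conj())

# The eigenvalues rho_i of rho are the probabilities in the entropy formula.
rho_i = np.linalg.eigvalsh(rho)
S = -sum(p * np.log(p) for p in rho_i if p > 1e-12)

assert np.isclose(rho_i.sum(), 1.0)   # probabilities sum to one
assert 0 < S < np.log(3)              # strictly between pure and maximal
```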

In the extreme case, where we have no knowledge at all about the system, implying that all of the probabilities are equal to $\tfrac{1}{N}$, the entropy is $$S = - \sum_{i=1}^{N}\frac{1}{N} \log \frac{1}{N} = \log N$$
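The maximally mixed case is a one-liner to verify (here for an arbitrary $N = 4$):

```python
import numpy as np

N = 4
probs = np.full(N, 1 / N)             # total ignorance: all probabilities 1/N
S = -np.sum(probs * np.log(probs))
assert np.isclose(S, np.log(N))
```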