### Density matrix

We need to be able to describe systems that are in some definite state, but where the only information we have about that state is probabilistic. For example, an electron is prepared in a magnetic field, but we are only given a probability that it is in the *up* state and a corresponding probability that it is in the *down* state.

In this situation, we cannot describe the system simply as a superposition of the possible states, since we need to *weight* each state with its associated probability. A superposition describes the state that the system is actually in, but does not describe our knowledge of that state.

To describe this, we replace the notion of a state with a *density matrix*. Let $\rho$ be a probability distribution on a space of states $\left\{\ket{i}: i=1,\cdots,N\right\}$, so that $\rho_i$ is the probability that the system is in the state $\ket{i}$.

Then, the associated **probability density matrix** can be written as a diagonal matrix $$\boldsymbol{\rho} = \begin{bmatrix}\rho_1 & 0 & \cdots & 0 \\0 & \rho_2 & \cdots & 0 \\\vdots & \vdots & \ddots & \vdots \\0 & 0 & \cdots & \rho_N\end{bmatrix}$$

This is the quantum mechanical analogue of a classical probability density. We note that

- $\boldsymbol{\rho}$ is a Hermitian matrix
- The $\rho_i$'s are the eigenvalues of $\boldsymbol{\rho}$, hence $\mathrm{Tr}(\boldsymbol{\rho}) = \sum_{i=1}^{N}\rho_i = 1$
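These two properties are easy to check numerically. A minimal sketch with NumPy, using hypothetical probabilities for a three-state system:

```python
import numpy as np

# Illustrative probabilities for a three-state system (hypothetical values)
probs = np.array([0.5, 0.3, 0.2])

# The density matrix is diagonal in this basis
rho = np.diag(probs)

# Hermitian: rho equals its conjugate transpose
assert np.allclose(rho, rho.conj().T)

# The eigenvalues are the rho_i's, so the trace is 1
assert np.isclose(np.trace(rho), 1.0)
```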

### Average density

If we have *complete* knowledge about what state a system is in, $\ket{i}$ say, then we have previously defined the *average* of a given observable, $M$ say, to be the $ii^\text{th}$ matrix element of $M$, $$M_{ii} = \bra{i}M\ket{i}$$

Now suppose that, although a system has been prepared in some definite state, we only have probabilistic information about it, in the form of a *density matrix* $\boldsymbol{\rho}$. Then the **average density** of an observable $M$ is defined to be $$\left\langle M \right\rangle = \mathrm{Tr}(\boldsymbol{\rho}M)$$

If we choose a basis for the state-space, $\left\{\ket{i}: i=1,\cdots,N\right\}$, to be one in which $\boldsymbol{\rho}$ is a diagonal matrix, then the average can also be written as $$\left\langle M \right\rangle = \sum_{i=1}^{N}\rho_i\bra{i}M\ket{i}$$ where $\rho_i = \boldsymbol{\rho}_{ii}$ is the probability that the system is in the $i^\text{th}$ state, since $$\begin{align*}\left\langle M \right\rangle &= \mathrm{Tr}(\boldsymbol{\rho}M)\\&= \sum_{i=1}^{N}(\boldsymbol{\rho}M)_{ii}\\&= \sum_{i=1}^{N}\sum_{j=1}^{N}\boldsymbol{\rho}_{ij}M_{ji}\\&= \sum_{i=1}^{N}\boldsymbol{\rho}_{ii} M_{ii}& \text{ since } j \neq i \Rightarrow \boldsymbol{\rho}_{ij} = 0\\&= \sum_{i=1}^{N}\rho_i\bra{i}M\ket{i}\end{align*}$$
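The equality of the trace formula and the weighted sum can be checked numerically. A sketch assuming a hypothetical Hermitian observable and the same illustrative probabilities as before:

```python
import numpy as np

# Diagonal density matrix from illustrative probabilities
probs = np.array([0.5, 0.3, 0.2])
rho = np.diag(probs)

# A hypothetical observable: symmetrize a real matrix to make it Hermitian
A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 1.0],
              [1.0, 0.0, 0.5]])
M = (A + A.conj().T) / 2

# Average via the trace, and via the probability-weighted diagonal elements
avg_trace = np.trace(rho @ M)
avg_sum = sum(p * M[i, i] for i, p in enumerate(probs))

assert np.isclose(avg_trace, avg_sum)
```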

### Quantum entropy

In the probabilistic situation described above, entropy comes about simply because of our lack of definite knowledge of the state the system is in. The quantum mechanical version of the definition of **entropy** is $$S = - \left\langle \log \boldsymbol{\rho} \right\rangle = - \mathrm{Tr}(\boldsymbol{\rho} \log \boldsymbol{\rho})$$ where $\boldsymbol{\rho}$ is a given probability density matrix.

For our purposes, where we choose a basis in which the density matrix is diagonal, we can write the logarithm (stated without proof) as a diagonal matrix as well, $$\log \boldsymbol{\rho} = \begin{bmatrix}\log \rho_1 & 0 & \cdots & 0 \\0 & \log \rho_2 & \cdots & 0 \\\vdots & \vdots & \ddots & \vdots \\0 & 0 & \cdots & \log \rho_N\end{bmatrix}$$ and since the product of two diagonal matrices is itself diagonal, we have $$\boldsymbol{\rho} \log \boldsymbol{\rho} = \begin{bmatrix}\rho_1 \log \rho_1 & 0 & \cdots & 0 \\0 & \rho_2 \log \rho_2 & \cdots & 0 \\\vdots & \vdots & \ddots & \vdots \\0 & 0 & \cdots & \rho_N \log \rho_N\end{bmatrix}$$

Thus, in a basis $\left\{\ket{i}: i=1,\cdots,N\right\}$ in which $\boldsymbol{\rho}$ is diagonal, the entropy simply becomes $$S = - \sum_{i=1}^{N}\rho_i \log \rho_i$$
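The agreement between $-\mathrm{Tr}(\boldsymbol{\rho}\log\boldsymbol{\rho})$ and $-\sum_i \rho_i \log \rho_i$ can be illustrated numerically. A sketch using the same hypothetical probabilities, where the matrix logarithm of a diagonal matrix is formed directly from the logarithms of its diagonal entries:

```python
import numpy as np

# Diagonal density matrix from illustrative probabilities
probs = np.array([0.5, 0.3, 0.2])
rho = np.diag(probs)

# For diagonal rho, log(rho) is the diagonal matrix of the log rho_i's
log_rho = np.diag(np.log(probs))

# Entropy via the trace formula and via the simple sum
S_trace = -np.trace(rho @ log_rho)
S_sum = -np.sum(probs * np.log(probs))

assert np.isclose(S_trace, S_sum)
```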

### Classical versus quantum probability

We can compare the classical and quantum definitions of states, observables, probabilities, averages and entropy. *Real* functions are replaced by *Hermitian* matrices, and the *summation* over a function is replaced by the *trace* of a matrix.

$$\begin{matrix}\textbf{Classical states (sets)} & \textbf{Quantum states (vector spaces)}\\ P_i \text{ is the probability that } i \text{ occurs} & \boldsymbol{\rho} \text{ is a probability density}\\ \sum_{i=1}^{N}P_i = 1 & \textrm{Tr}(\boldsymbol{\rho}) = 1\\ \text{Observables } F \text{ are functions} & \text{Observables } M \text{ are Hermitian operators}\\ \text{The average of } F \text{ is } \bar{F} = \sum_{i=1}^{N}P_i F_i & \text{The average density of } M \text{ is } \langle M \rangle = \textrm{Tr}(\boldsymbol{\rho}M)\\ \text{Entropy is } S = - \sum_{i=1}^{N}P_i \log P_i & \text{Entropy is } S = - \textrm{Tr}(\boldsymbol{\rho} \log \boldsymbol{\rho})\end{matrix}$$
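The correspondence in the table can be made concrete: embedding a classical distribution $P_i$ and a real function $F_i$ as diagonal matrices, the trace formulas reproduce the classical sums exactly. A sketch with hypothetical values:

```python
import numpy as np

# Classical side: probabilities P_i and a real function F_i (hypothetical values)
P = np.array([0.25, 0.25, 0.5])
F = np.array([1.0, -2.0, 3.0])

F_bar = np.sum(P * F)           # classical average
S_cl = -np.sum(P * np.log(P))   # classical entropy

# Quantum side: diagonal matrices; the trace replaces the summation
rho = np.diag(P)
M = np.diag(F)

M_avg = np.trace(rho @ M)
S_q = -np.trace(rho @ np.diag(np.log(P)))

assert np.isclose(F_bar, M_avg)
assert np.isclose(S_cl, S_q)
```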