Quantum information processing uses quantum mechanics to encode, process, and retrieve information with quantum systems. The amount of quantum theory we need to describe these tasks from a computer science perspective is remarkably shallow. It amounts only to the axioms of closed-system quantum mechanics, and in many cases to 2-dimensional systems 1For digging into higher (discrete finite or infinite) dimensional systems, refer to the lectures by U. Chabaud and F. Arzani..

The aim of this lecture is to:

  • identify which axiom of quantum mechanics relates to what information processing task;
  • recall some basic facts about linear algebra that are necessary to develop fluency and intuition.

1. References

2. Axioms

From an operational point of view, a physical theory is a set of mathematical statements that can be combined to predict the results of experiments. The crucial ingredient in this definition is the term "physical", as it makes the difference between a mathematical theory and a physical one. The connection with experiments dictates how the theory can be disproven. It is unfortunately the only reality check — or debugging tool — available for confronting the theory with what Nature is and with what the theory should describe2It will be important to remember this as a guideline towards achieving complex tasks using quantum mechanics, and as a strong limitation on how quantum programs can be checked.. In contrast, most mathematical statements have the advantage of being provable from simpler ones. This is a much more desirable situation, as we can check the correctness of a statement instead of holding it true until Nature shows that it does not apply to some experiment.

The role of the axioms of quantum mechanics is precisely to provide a set of mathematical rules that can be combined ad infinitum3The important point is the claim that the axioms can be combined with one another and still give something meaningful physically. It is an extremely strong statement. to design new experiments whose predictions will correspond to physical reality — unless one of the axioms is wrong. In a sense, this reduces as much as possible the gap between mathematical and physical theories by allowing one to prove instead of merely disprove 4Note that this is only partially true, as doing so assumes that the axioms are correct. Yet, there is an advantage in doing so, as axioms are meant to be easier to check. Reversing this argument, you can see quantum information processing as a way to test these axioms and their composability with one another in an incomparably complex way..

Axioms span 3 physical concepts:

  1. States
  2. Measurements
  3. Evolutions

The order chosen here is non-standard5Usually, states are introduced first, followed by evolutions and measurements. but is intended to emphasize the power and limitations that the measurement axiom introduces for quantum information processing6Trying to reduce the number of axioms, or to simplify them, is an important task that interests researchers on the foundations of quantum mechanics. The reason is that fewer and simpler axioms should be easier to disprove, and provide a better intuition into what can be achieved with quantum mechanics. Examples include: the measurement-update rule can be derived from the measurement axiom alone, and the Masanes paper on replacing the measurement axiom with composability., 7Alternate theories need to contain classical probability theory., 8When we imagine disproving a theory, we need to pay attention to implicit assumptions. For instance, you could imagine trying to disprove the axioms by testing them and accumulating statistics. But in doing so you already use the fact that this is meaningful (i.e. the axioms do not vary in time, so that statistics can be accumulated and tell you something about the future). You also assume that the axioms can be combined. Mathematically they can, but do they also combine physically? In a sense there is a 4th axiom that says the other 3 can be combined. One approach to getting rid of these implicit assumptions is to use cryptography, considering that only mathematics and locality are trusted, and that Nature is malicious..

3. Linear algebra detour

3.1. Hilbert spaces and Operators

\(V\) is an inner-product space over \(\mathbb C\) if it is a vector space over \(\mathbb C\) equipped with an inner product \(\langle \psi, \varphi \rangle\).

\(\langle \ , \ \rangle\) is an inner-product if:

\begin{align} & \langle \psi, \psi \rangle > 0, \mbox{ for } \psi \neq 0 \\ & \langle \psi, \alpha_1 \varphi_1 + \alpha_2 \varphi_2 \rangle = \alpha_1\langle \psi, \varphi_1 \rangle + \alpha_2 \langle \psi, \varphi_2 \rangle \\ & \langle \psi, \varphi \rangle = \langle \varphi, \psi \rangle^* \end{align}

\(V\) is a complex Hilbert space, if:

  • \(V\) is an inner-product space over \(\mathbb C\);
  • It is complete for the norm \(\|\psi \| = \sqrt{\langle \psi, \psi \rangle}\) (i.e. Cauchy sequences for \(\| . \|\) converge in \(V\)). This is always the case for finite dimensional \(V\).
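As a quick sanity check of the convention used here (linearity in the second argument, conjugate-linearity in the first), here is a minimal numerical illustration. It assumes numpy; note that np.vdot conjugates its first argument and therefore matches this convention.

```python
import numpy as np

# Minimal numerical check of the inner-product axioms on C^2.
# np.vdot conjugates its first argument, matching the convention above:
# linear in the second slot, conjugate-linear in the first.
rng = np.random.default_rng(0)
psi, phi1, phi2 = (rng.normal(size=2) + 1j * rng.normal(size=2) for _ in range(3))
a1, a2 = 0.3 - 1.2j, 0.7 + 0.1j

assert np.vdot(psi, psi).real > 0                                      # positivity for psi != 0
assert np.isclose(np.vdot(psi, a1 * phi1 + a2 * phi2),
                  a1 * np.vdot(psi, phi1) + a2 * np.vdot(psi, phi2))   # linearity in the 2nd argument
assert np.isclose(np.vdot(psi, phi1), np.conj(np.vdot(phi1, psi)))     # conjugate symmetry
```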

With \(\mathcal H\) and \(\mathcal H'\) being Hilbert spaces,

  • Linear operators from \(\mathcal H\) to \(\mathcal H'\) are homomorphisms \(Hom(\mathcal H, \mathcal H')\);
  • Linear operators where \(\mathcal H = \mathcal H'\) are endomorphisms \(End(\mathcal H)\);
  • The adjoint of \(O \in Hom(\mathcal H, \mathcal H')\) is the unique operator \(O^\dagger\) in \(Hom(\mathcal H', \mathcal H)\) such that \(\langle\psi', O \psi \rangle = \langle O^\dagger \psi', \psi\rangle\) for \(\psi \in \mathcal H, \ \psi' \in \mathcal H'\). When \(O\) is represented as a matrix, that of \(O^\dagger\) is the conjugate transpose;
  • For \(O \in End(\mathcal H)\), \(O\) is normal if \(OO^\dagger = O^\dagger O\);
  • For \(O \in End(\mathcal H)\), \(O\) is unitary if \(OO^\dagger = O^\dagger O = \one\);
  • For \(O \in End(\mathcal H)\), \(O\) is Hermitian if \(O = O^\dagger\);
  • For \(O \in End(\mathcal H)\), \(O\) is positive (semi-definite) if \(\forall \psi, \ \langle \psi , O \psi\rangle \geq 0\),9Positive operators are Hermitian. It's usually denoted \(O \geq 0\), and for \(O\) and \(O'\), \(O \geq O' \Leftrightarrow O-O' \geq 0\). and positive definite when \(\langle \psi , O \psi\rangle > 0\) for \(\psi \neq 0\);
  • For \(O \in End(\mathcal H)\), \(O\) is a projector10Projectors are positive operators. if \(O = O^\dagger\) and \(O^2 = O\).
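The following sketch (again assuming numpy, with small hand-picked 2×2 matrices as illustrative examples) checks membership in these operator classes numerically.

```python
import numpy as np

def dagger(O):
    """Adjoint of O in a matrix representation: the conjugate transpose."""
    return O.conj().T

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)  # both unitary and Hermitian
P = np.array([[1, 0], [0, 0]], dtype=complex)                # an orthogonal projector
N = np.diag([1j, 2.0])                                       # normal but not Hermitian

assert np.allclose(H @ dagger(H), np.eye(2))                 # unitary: O O^dagger = 1
assert np.allclose(H, dagger(H))                             # Hermitian: O = O^dagger
assert np.allclose(P @ P, P) and np.allclose(P, dagger(P))   # projector
assert np.allclose(N @ dagger(N), dagger(N) @ N)             # normal
assert np.all(np.linalg.eigvalsh(P) >= -1e-12)               # projectors are positive
```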

A basis \(\{\psi_i\}_i\) of \(\mathcal H\) is orthonormal if \(\langle \psi_i, \psi_j \rangle = \delta_{i,j}\).

Given an orthonormal basis \(\{\psi_i\}_i\), the matrix representation of \(O \in End(\mathcal H)\) is \(O_{i,j} = \langle \psi_i, O \psi_j\rangle\). \(O\) is diagonal wrt \(\{\psi_i\}_i\) if the matrix \(O_{i,j}\) is diagonal11Always keep in mind that \(O_{i,j}\) is not \(O\), e.g. \(O_{i,j}\) depends on the chosen basis!.

The trace \(\tr\) on \(End(\mathcal H)\) is the unique linear functional \(f\) on endomorphisms such that \(f(AB) = f(BA)\) and \(f(\one) = n\) for \(\mathcal H\) an \(n\)-dimensional Hilbert space. Concretely, for any orthonormal basis \(\{\psi_i\}_i\), \(\tr A = \sum_i \langle \psi_i, A \psi_i \rangle\).

Note that \(\tr\) is invariant under conjugation by a unitary. Note also that \(\tr (A\otimes B) = \tr(A)\tr(B)\).

Complex \(n\times m\) matrices form a vector space \(V\). The Frobenius inner product of \(A\) and \(B\) in \(V\) is defined by:

\begin{equation} \langle A,B\rangle_F = \tr(A^\dagger B). \end{equation}

Note that it corresponds to the regular inner product if we write the matrices as vectors using the basis \(e_{i,j}\) corresponding to a matrix with a \(1\) at position \((i,j)\) and \(0\) elsewhere.
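As a numerical illustration of this correspondence (assuming numpy; the matrices below are randomly generated for the example), \(\tr(A^\dagger B)\) coincides with the ordinary inner product of the flattened matrices.

```python
import numpy as np

# The Frobenius inner product tr(A^dagger B) is the usual inner product of the
# matrices written as vectors in the e_{i,j} basis (i.e. flattened row by row).
rng = np.random.default_rng(1)
A = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))
B = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))

frobenius = np.trace(A.conj().T @ B)
flattened = np.vdot(A.reshape(-1), B.reshape(-1))   # np.vdot conjugates its first argument
assert np.isclose(frobenius, flattened)
```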

It naturally defines a norm over matrices, the Frobenius norm:

\begin{equation} \|A\|_F = \sqrt{\langle A, A \rangle_F}. \end{equation}

\(\| A \|_F\) is also denoted \(\| A \|_2\), as it is the 2-norm of the vector of eigenvalues in the spectral decomposition12See below of \(A\) when the operator is normal.

Note that these definitions can be extended to the infinite-dimensional case and constitute the Hilbert-Schmidt inner product and norm. One important property of the Frobenius / Hilbert-Schmidt inner product is that if \(A,B \geq 0\) then \(\langle A, B \rangle = \tr (A B) \geq 0\).

Hermitian matrices form a real subspace of the space of all matrices. With the Hilbert-Schmidt inner product, the Hermitian matrices form a real Hilbert space. A convenient basis of this space in the \(2^n\)-dimensional case is given by tensor products of the Pauli matrices13A generalization to arbitrary dimensions is given in section 4.1.7 of [Ren]:

\begin{align} I & = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \\ X & = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \\ Y & = i\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \\ Z & = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \end{align}

Note that these are not normalized. For the Hilbert-Schmidt norm, they need to be multiplied by \(\frac{1}{\sqrt 2}\). Additionally, all of them have vanishing trace, except the identity.
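A short numerical check (assuming numpy) of the orthogonality of the Pauli matrices for the Hilbert-Schmidt inner product, and of the fact that an arbitrary Hermitian 2×2 matrix (randomly generated here) expands on them with real coefficients.

```python
import numpy as np

I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [I, X, Y, Z]

# Orthogonality: <P, Q>_HS = tr(P^dagger Q) = 2 delta_{P,Q}, hence the 1/sqrt(2) normalization.
for i, P in enumerate(paulis):
    for j, Q in enumerate(paulis):
        assert np.isclose(np.trace(P.conj().T @ Q), 2.0 if i == j else 0.0)

# Any Hermitian matrix expands on the Paulis with real coefficients tr(P H) / 2.
rng = np.random.default_rng(2)
M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
H = (M + M.conj().T) / 2
coeffs = [np.trace(P @ H) / 2 for P in paulis]
assert np.allclose(np.imag(coeffs), 0)
assert np.allclose(sum(c * P for c, P in zip(coeffs, paulis)), H)
```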

The trace norm of an operator \(A\) is

\begin{equation} \| A \|_1 = \tr |A|, \end{equation}

where \(|A| = \sqrt{A^\dagger A}\).

For \(A\) normal, \(\|A\|_1\) is the 1-norm of the vector of eigenvalues in the spectral decomposition14See below of \(A\).

The trace norm satisfies:

\begin{equation} \| A \|_1 = \max_U |\tr(UA)| \end{equation}

where the maximization is over all unitaries \(U\).
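Numerically (assuming numpy, with a random \(A\) for illustration), \(\tr|A|\) can be computed from the spectral decomposition of \(A^\dagger A\) and coincides with the sum of the singular values of \(A\).

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

# |A| = sqrt(A^dagger A), computed through the eigendecomposition of A^dagger A.
evals, V = np.linalg.eigh(A.conj().T @ A)          # A^dagger A is positive semi-definite
abs_A = V @ np.diag(np.sqrt(np.clip(evals, 0, None))) @ V.conj().T
trace_norm = np.trace(abs_A).real

# The trace norm is the sum of the singular values of A.
assert np.isclose(trace_norm, np.linalg.svd(A, compute_uv=False).sum())
```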

3.2. Braket notation

A vector \(\vec \psi \in \mathcal H\) can be seen as a homomorphism \(\ket \psi \in Hom(\mathbb C, \mathcal H)\):

\begin{equation} \alpha \in \mathbb C \xrightarrow{\ket \psi} \alpha \vec \psi \in \mathcal H \end{equation}

The adjoint \(\ket \psi^\dagger\) is denoted \(\bra \psi\) and is defined as15\(\ket\psi: \mathbb C \rightarrow \mathcal H\), \(\ket\psi^\dagger: \mathcal H \rightarrow \mathbb C\), \(\langle \vec\psi \alpha, \vec \varphi\rangle = \alpha^* \langle \vec \psi, \vec \varphi \rangle = \alpha^* \braket{\psi}{\varphi}\). Simplifying by \(\alpha^*\) gives the result.: \[\bra \psi: \ \vec \varphi \rightarrow \langle \vec \psi, \vec \varphi\rangle.\]

4. States of quantum systems

4.1. Pure states

The state of a system is its complete description. Concretely, if you can write the state of a system on a piece of paper, you can predict, through mathematical calculations, the result of any physical experiment that you could perform on the system. Here, the notion of complete description means that there is no uncertainty or ignorance about the system that could be reduced. In this case, the state \(\ket \psi\) of a system is a ray in a Hilbert space \(\mathcal H\). That is:

\begin{align} & \|\ket \psi \| = \sqrt{\braket{\psi}{\psi}} = 1 \\ & \ket \varphi \in \overline{\ket \psi} \Leftrightarrow \, \exists \alpha \neq 0, \ \ket \varphi = \alpha \ket \psi. \end{align}

Note that usually we simply identify the equivalence class \(\overline{\ket \psi}\) with one of its representative normalized vector, here \(\ket \psi\). The reason why global phases can be ignored and why there is a strong motivation for taking the norm of the vector equal to one will become apparent when considering the prediction of experimental results.

4.2. Mixed states

Although the presentation above is standard in many textbooks, in practice it is often replaced by a subtly different one. The state of a system is then defined as the mathematical representation of the knowledge an observer has about it. In this definition, states become relative to each observer16One might question here what kind of status the observer has from within quantum mechanics. Very rapidly, this should lead to questioning how systems are defined. You might end up being forced to consider the whole universe as the only physical system that makes sense. But then, what do you do with special relativity? What do you mean by the state of the universe when it is not accessible to you? These foundational questions will not be addressed here. Quantum mechanics will be taken from a purely operational view. Yet, it does not mean that Quantum Information Processing cannot be used to address them. One of the most celebrated examples is the result \(MIP^* = RE\) from complexity theory, which has implications for the structure of Hilbert spaces for infinite-dimensional composite systems., 17An additional reason for preferring a definition where states are explicitly observer-dependent is that it emphasizes the nature of states: they are the consequence of the observer's relation to a system, rather than representing a pre-existing property of the system itself without reference to an observer. This epistemic vs ontic view of quantum mechanics has been a heated debate since the early days of quantum mechanics. See for instance https://doi.org/10.1016/j.shpsb.2006.10.007.. Yet, they retain their operational property of allowing the observer to predict the results of any experiment it could perform. The interest of such a definition is that it naturally incorporates the lack of knowledge an observer might have about the system. Its predictions will simply be worse than those of a better-informed observer. This interpretation of quantum states will be convenient to represent the views of two parties with different knowledge about the same system. In such a case, a state \(\rho\) is best represented by a density matrix acting on a Hilbert space \(\mathcal H\) — the same as the one used before. That is:

\begin{align} \rho & = \rho^\dagger \\ \rho & \geq 0 \\ \tr \rho & = 1. \end{align}

The obvious question is how this different representation of states connects to the state vector. One way of making this link is to explicitly work out the predictions on measurements that each gives and to recognize when they should be identical (see below).

An important class of mixed states is:

For a chosen orthonormal basis \(\{\ket i\}\), classical states are defined as linear combinations of the corresponding projectors \(\{\ketbra i\}\): \[\sum_i p_i \ketbra i, \] where \(p_i\) is a probability distribution.

The reason why these states are called classical will be justified by what can be predicted about them.18Spoiler: they are perfectly distinguishable as it would be expected for classical random variables. As a consequence, it is easy to embed a discrete classical probability distribution within a quantum state.
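As a small illustration (assuming numpy; the distribution p below is a made-up example), embedding a classical distribution as a classical state yields a diagonal matrix satisfying the three density-matrix conditions.

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])                     # a hypothetical classical distribution
basis = np.eye(len(p), dtype=complex)             # the chosen orthonormal basis {|i>}
rho = sum(p[i] * np.outer(basis[i], basis[i].conj()) for i in range(len(p)))

assert np.allclose(rho, np.diag(p))               # diagonal in the chosen basis
assert np.allclose(rho, rho.conj().T)             # Hermitian
assert np.all(np.linalg.eigvalsh(rho) >= -1e-12)  # positive semi-definite
assert np.isclose(np.trace(rho).real, 1.0)        # unit trace
```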

4.3. Composite systems

Lastly, we will often need to consider composite systems, i.e. systems with subparts. Here, quantum mechanics says that the state of a composite quantum system \(\mathcal A - \mathcal B\) is a ray (resp. density matrix) in the tensor product of the Hilbert spaces of each subsystem, i.e. \(\mathcal H_{\mathcal A} \otimes \mathcal H_{\mathcal B}\).

This axiom plays a crucial role in quantum information processing:

  • It is an essential ingredient for complexity theory, as it defines the elementary resources that are counted to assess the complexity / efficiency of a given algorithm or protocol for solving a task. In particular, it states that although \(n\) 2-dimensional subsystems span a \(2^n\)-dimensional Hilbert space, they correspond to only a linear number of resources19This is in complete analogy with classical complexity theory, where bits are a linear resource giving access to exponentially many bit-string values. (see the sketch after this list).
  • It creates a very rich structure if one associates to each subsystem a notion of locality. In such a case, one can wonder what is the set of bipartite states that can be reached through local operations and classical communication, and what kind of states require non-local operations to be created from independent subsystems.
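The sketch below (assuming numpy; the single-qubit state \(\ket +\) is just an illustrative choice) shows the exponential growth of the Hilbert-space dimension with a linear number of subsystems.

```python
import numpy as np
from functools import reduce

n = 5                                                  # a linear number of 2-dimensional subsystems
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)    # an arbitrary single-qubit state
state = reduce(np.kron, [plus] * n)                    # state of the composite system

assert state.shape == (2 ** n,)                        # lives in a 2^n-dimensional Hilbert space
assert np.isclose(np.vdot(state, state), 1.0)          # still normalized
```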

5. Predicting results from experiments

An experiment is a physical setup that interacts with a quantum system and has several outcomes. The goal of the measurement axiom is to specify (1) how these physical setups are represented mathematically, (2) what can be predicted about their outcomes, and (3) how to compute these predictions.

  1. A (projector-valued) measurement (PVM) \(\mathcal M\) is a set of orthogonal projectors summing to \(\one\): \[\mathcal M = \{M_i\}_i, \ M_i M_j = \delta_{i,j} M_i, \ \sum_i M_i = \one.\] The projector \(M_i\) is said to correspond to outcome \(i\).
  2. Given the state of a quantum system defined on the same Hilbert space as \(\mathcal M\), it is possible to compute the probability of getting outcome \(i\).
  3. if the system is in state \(\ket \psi\)20Remember the state needs to be normalized. (resp. \(\rho\)), the probability of getting \(i\) is:
\begin{equation} \Pr(i|\ket \psi) = \bra \psi M_i \ket \psi, \mbox{ resp. } \Pr(i|\rho) = \tr(\rho M_i). \end{equation}

The presentation adopted here focuses on PVMs instead of observables — i.e. Hermitian matrices. The reason is that several observables lead to the same measurement, differing only by the eigenvalues of the operator. The observables \(\sum_i \lambda_i M_i\) and \(\sum_i \lambda'_i M_i\) contain the same amount of information as long as the \(\lambda\) coefficients do not create spurious degeneracies. As for states, we can think of PVMs as normalized representatives of all the possible observables acting on \(\mathcal H\).

Additionally, it emphasizes that there is no such thing as the measurement of the expectation value of an observable. There are only detector clicks that give one of the discrete outcomes \(i\). The measurement of an expectation value is the act of repeating such a measurement many times on identically re-prepared states and accumulating the statistics of the outcomes. In other words, expectation values are averages, while quantum mechanics — through this postulate — gives us the ability to sample from a probability distribution. This is of course much more information than just the average value of the probability distribution. This distinction will become crucial when counting resources for performing discrimination or estimation tasks, and of course when evaluating the complexity of an algorithm.
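The sketch below (assuming numpy; the single-qubit state and computational-basis PVM are illustrative choices) makes this concrete: the postulate gives outcome probabilities from which one samples, and an expectation value such as that of Z = M_0 - M_1 is only an average over many clicks.

```python
import numpy as np

rng = np.random.default_rng(4)
psi = np.array([np.cos(0.3), np.sin(0.3)], dtype=complex)   # an example pure state
rho = np.outer(psi, psi.conj())

# PVM in the computational basis: {|0><0|, |1><1|}.
M = [np.diag([1.0 + 0j, 0.0]), np.diag([0.0 + 0j, 1.0])]
probs = np.array([np.trace(rho @ Mi).real for Mi in M])      # Born rule: Pr(i) = tr(rho M_i)
assert np.isclose(probs.sum(), 1.0)

clicks = rng.choice(len(M), size=10_000, p=probs)            # what the detector actually gives
z_estimate = np.mean(np.where(clicks == 0, 1.0, -1.0))       # average of Z = M_0 - M_1 outcomes
z_exact = probs[0] - probs[1]                                # estimate converges to this value
```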

It is customary to introduce the post-measurement state at this stage. It is in fact not necessary — see for instance The Theory of Quantum Information, J. Watrous (2018). Yet, it is useful to understand what is meant by the post-measurement state. In the lab, physicists have observed that when measurements are very carefully implemented, they can be non-destructive, in the sense that once a measurement has been performed, not only does the quantum system survive it, but an immediate subsequent measurement gives the same result. Applying this to PVMs shows that when they are implemented in such a non-destructive way, the state of the system just after measurement outcome \(i\) has been observed must be \(M_i \ket \psi / \| M_i \ket \psi\|\) — where \(M_i\) is the corresponding projector and \(\ket \psi\) is the state of the system before the measurement. More generally, for a density matrix \(\rho\), the post-measurement state for a non-destructive PVM \(\{M_i\}_i\) where \(i\) is observed is given by

\begin{equation} \frac{M_i \rho M_i }{\tr (M_i \rho)}. \end{equation}

Nonetheless, one needs to remember that this holds only if the measurement is implemented in a non-destructive way. Doing so requires a lot of effort, and more often than not, the quantum system either does not survive or undergoes additional transformations that result in further evolution.

6. Evolving states

The axiom governing the evolution of quantum systems can be summarized as:

  • allowed evolutions for a system with Hilbert space \(\mathcal H\) are linear: \(\mathcal U[\ket \psi + \ket \varphi] = \mathcal U[\ket\psi] + \mathcal U[\ket \varphi]\)
  • allowed evolutions should map states to states, i.e. a normalized representative of a ray should map to another normalized representative of a ray:
\begin{align} & \forall \ket \psi, \ket \varphi, \ \exists \mathcal U, \ \mathcal U[\ket \psi] = \ket \varphi \\ & \forall \mbox{ allowed } \mathcal U, \forall \ket \psi, \ \langle \mathcal U[\ket\psi], \mathcal U[\ket\psi]\rangle = 1 \end{align}

This means that the allowed evolutions are the whole unitary group acting on \(\mathcal H\).

In most aspects of quantum information processing from a computer science perspective, it will suffice to consider such discrete evolutions. Yet, implementing these evolutions using real physical systems requires describing continuous-time control and evolution. Schroedinger's equation can be recovered by considering a parametrization of the unitary group in terms of continuous paths generated by infinitesimal transformations. This leads to constructing the associated Lie algebra, and to the exponential function that appears in the solution of Schroedinger's equation.
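A brief numerical illustration of this connection (assuming numpy and scipy; the Hermitian generator H below is a made-up example): exponentiating \(-iHt\) produces a unitary, i.e. an allowed discrete evolution, for every t.

```python
import numpy as np
from scipy.linalg import expm

H = np.array([[1.0, 0.5], [0.5, -1.0]], dtype=complex)   # a Hermitian generator (hypothetical)
t = 0.7
U = expm(-1j * t * H)                                    # U(t) solving Schroedinger's equation

assert np.allclose(U @ U.conj().T, np.eye(2))            # U(t) is unitary for every t
psi = np.array([1.0, 0.0], dtype=complex)
psi_t = U @ psi
assert np.isclose(np.vdot(psi_t, psi_t).real, 1.0)       # states are mapped to states
```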

7. More on state vectors and density matrices

7.1. Decomposing density matrices

Because density matrices in the \(2^n\)-dimensional setting are positive semi-definite and have trace 1, they can be conveniently expressed using the Pauli matrices. For a 2-dimensional system (qubit) we have:

\begin{equation} \rho = \frac{1}{2}(I + x X + y Y + z Z) \end{equation}

Note that here, the normalization factor21Always pay attention to the definition of the Pauli matrices when reading a research paper, as it might not be consistent throughout the whole paper! 3 normalization conventions co-exist: \(\tr I = 1\), \(\tr I = 2\) and \(\tr I = 2/\sqrt 2\). of the Pauli matrices is implicitly taken to be 1, so that \(\tr \rho = 1\). The positivity criterion imposes \(x^2+y^2+z^2\leq 1\). This is the usual Bloch sphere representation of qubits.
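A quick check of this representation (assuming numpy; the density matrix below is an arbitrary valid example): the Bloch coordinates are \(x = \tr(\rho X)\), \(y = \tr(\rho Y)\), \(z = \tr(\rho Z)\), they lie inside the unit ball, and they reconstruct \(\rho\).

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

rho = np.array([[0.75, 0.25], [0.25, 0.25]], dtype=complex)      # an example qubit state
x, y, z = (np.trace(rho @ P).real for P in (X, Y, Z))            # Bloch coordinates

assert x ** 2 + y ** 2 + z ** 2 <= 1 + 1e-12                     # positivity: inside the Bloch ball
assert np.allclose(rho, 0.5 * (np.eye(2) + x * X + y * Y + z * Z))
```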

Note that using the generalization of the Pauli matrices to decompose Hermitian matrices, this representation can be generalized to arbitrary dimensions.

7.2. State vectors and density matrices

The correspondence between state vectors and density matrices can be established from an operational point of view. Given a state vector, we seek what density matrix it corresponds to by requiring that they give the same predictions for the same PVMs.

Let \(\ket \psi\) be a state vector. Consider the binary PVM \(\mathcal M = \{M_0, M_1\} \coloneqq \{\Pi_{\ket \psi}, \one - \Pi_{\ket \psi}\}\). Then, the measurement postulate gives that the distribution of outcomes will be:

\begin{align} \Pr_{\ket \psi}(0) & = 1 \\ \Pr_{\ket \psi}(1) & = 0. \end{align}

Now, taking the density matrix approach, we should have

\begin{align} \tr(\rho M_0) & = 1 \\ \tr(\rho M_1) & = 0. \end{align}

Using the fact that \(\rho = \rho^\dagger\), we recognize that \(\tr(\rho M_0) = \langle \rho, M_0 \rangle_F\), which, together with the positivity and trace-1 conditions, imposes that \(\rho = M_0 = \ketbra \psi\).

Hence, state vectors correspond to density matrices that are rank-one projectors. The converse is also true. Using the fact that \(\Pi^2 = \Pi\) for projectors and the fact that \(\mathrm{rk} (\Pi) = \tr (\Pi)\) we obtain the purity criterion: \[\tr \rho^2 = 1 \Leftrightarrow \ \exists \ket \psi, \rho = \ketbra \psi.\]

This identification being done, we can derive the evolution postulate for density matrices from the one on states:

\begin{equation} \rho \rightarrow U\rho U^\dagger. \end{equation}

7.3. Density matrices as ensemble preparations

To further understand what density matrices are, we can consider a scenario where a source prepares one of several pure states \(\ket{\psi_i}\) with probability \(p_i\). If one state prepared by the source is given to you (without you knowing the index \(i\)), what is your description of the state of the received quantum system?

One way to approach the question is to ask what can be predicted about the system. Take a PVM \(\mathcal M \coloneqq \{M_j\}_j = \{\ketbra {\varphi_j}\}_j\); then:

\begin{equation} \Pr(\mbox{outcome }j) = \sum_i p_i |\braket{\varphi_j}{\psi_i}|^2. \end{equation}

But as we have seen this is the same as \(\sum_i p_i \tr(M_j \ketbra{\psi_i}{\psi_i})\). Using the linearity of the trace we have:

\begin{align} \Pr(\mbox{outcome }j) & = \tr (M_j \rho) \\ \rho & = \sum_i p_i \ketbra{\psi_i}{\psi_i}. \end{align}

Because this holds for an arbitrary PVM \(\mathcal M\), it is possible to choose sufficiently many of them so that this \(\rho\) is uniquely determined.

Note however, that while \(\rho\) is unique, each \(\rho\) can correspond to many ensemble preparations.
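A minimal example of this non-uniqueness (assuming numpy): preparing \(\ket 0\) or \(\ket 1\) with equal probability and preparing \(\ket +\) or \(\ket -\) with equal probability give the same density matrix, hence the same predictions for every PVM.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

rho_a = 0.5 * np.outer(ket0, ket0.conj()) + 0.5 * np.outer(ket1, ket1.conj())
rho_b = 0.5 * np.outer(plus, plus.conj()) + 0.5 * np.outer(minus, minus.conj())

assert np.allclose(rho_a, rho_b)                       # indistinguishable ensemble preparations
assert np.allclose(rho_a, np.eye(2) / 2)               # both give the maximally mixed state
assert np.isclose(np.trace(rho_a @ rho_a).real, 0.5)   # tr(rho^2) < 1: not a pure state
```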

8. More on operators

8.1. Schur decomposition

Let \(A\in End(\mathcal H)\), then

\begin{equation} A = U T U^\dagger, \end{equation}

where \(U\) is unitary and \(T\) is upper triangular.

This decomposition is obtained by picking an eigenvalue of \(A\), writing \(\mathcal H\) as the direct sum of the span of an associated eigenvector and its orthogonal complement, and then repeating the construction on the complement.

8.2. Spectral decomposition

Let \(A\) be a normal operator in \(End(\mathcal H)\). Then

\begin{equation} A = U D U^\dagger, \end{equation}

for \(U\) a unitary and \(D\) a diagonal matrix.

The proof follows from Schur's decomposition, where normality implies that \(T\) is also normal. Using that \(T\) is upper triangular, a direct inspection shows that normality imposes that \(T\) is diagonal.
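For the special case of Hermitian (hence normal) operators, the decomposition can be computed numerically; a minimal sketch assuming numpy, with a random Hermitian matrix as the example.

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (M + M.conj().T) / 2                       # a Hermitian, hence normal, operator

evals, U = np.linalg.eigh(A)                   # spectral decomposition A = U D U^dagger
D = np.diag(evals)

assert np.allclose(A, U @ D @ U.conj().T)
assert np.allclose(U @ U.conj().T, np.eye(4))  # U is unitary
```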

8.3. Singular value decomposition

Let \(A \in Hom(\mathcal H, \mathcal H')\), then

\begin{equation} A = V D U, \end{equation}

where \(U\) and \(V\) are unitaries and \(D\) is a (possibly rectangular) diagonal matrix with non-negative entries, the singular values of \(A\).

The proof is obtained by applying the spectral theorem to \(A^\dagger A\).

8.4. Polar decomposition

Let \(A \in End(\mathcal H)\), then

\begin{equation} A = \sqrt{AA^\dagger} U = U \sqrt{A^\dagger A}, \end{equation}

for \(U\) a unitary. Note that \(\sqrt{AA^\dagger}\) and \(\sqrt{A^\dagger A}\) are positive semi-definite.

The proof is obtained by writing the SVD \(A = V D U\) and inserting the identity as either \(V^\dagger V\) (between \(D\) and \(U\)) or \(U U^\dagger\) (between \(V\) and \(D\)), which produces the factors \(V D V^\dagger = \sqrt{AA^\dagger}\) and \(U^\dagger D U = \sqrt{A^\dagger A}\), with \(VU\) as the unitary.
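A numerical version of this argument (assuming numpy; A is a random example): obtain the SVD \(A = V D U\), form \(\sqrt{AA^\dagger} = V D V^\dagger\), and check that the remaining factor \(VU\) is the unitary of the polar decomposition.

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

V, d, U = np.linalg.svd(A)                 # factors such that A = V @ np.diag(d) @ U
sqrt_AAd = V @ np.diag(d) @ V.conj().T     # sqrt(A A^dagger), positive semi-definite
W = V @ U                                  # the unitary of the polar decomposition

assert np.allclose(A, sqrt_AAd @ W)                       # A = sqrt(A A^dagger) U
assert np.allclose(W @ W.conj().T, np.eye(3))             # W is unitary
assert np.allclose(A, W @ (U.conj().T @ np.diag(d) @ U))  # and A = U sqrt(A^dagger A)
```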

9. More on composite systems

9.1. Schmidt decomposition

When we have a pure state of a composite system on \(\mathcal H \otimes \mathcal H'\), we might want to pick a separable basis22A basis of the form \(\ket {\psi_i} \otimes \ket{\psi'_j}\). to express the state. Are all these bases equivalent, or is one better than the others? In fact, there is a family of bases that stands out, and it is defined using the Schmidt decomposition.

For \(\ket\Psi \in \mathcal H \otimes \mathcal H'\), there exist orthonormal sets \(\{\ket {\psi_i}\}_i\) of \(\mathcal H\) and \(\{\ket {\psi'_i}\}_i\) of \(\mathcal H'\), and non-negative reals \(\lambda_i\), such that

\begin{equation} \ket \Psi = \sum_i \lambda_i \ket{\psi_i} \otimes \ket{\psi'_i}. \end{equation}

Note that there is a single summation index.

Pick two bases of \(\mathcal H\) and \(\mathcal H'\). Then

\begin{equation} \ket \Psi = \sum_{i,j} \gamma_{i,j} \ket{\varphi_i} \otimes \ket{\varphi'_j}. \end{equation}

By identifying \(\ket{\varphi_i} \otimes \ket{\varphi'_j}\) and \(\ketbra{\varphi_i}{\varphi'_j}\), \(\ket \Psi\) can be identified with a matrix \(P\)23This same trick will be used to derive the Choi-Jamiolkowski isomorphism representation of Completely Positive Trace Preserving maps.. This matrix can be decomposed using the SVD, that is:

\begin{equation} P = U \Pi V^\dagger, \end{equation}

where \(U\) and \(V\) are unitaries and \(\Pi\) is diagonal with non-negative entries. Taking the column vectors \(\ket{\psi_i}\) and \(\ket{\psi'_i}\) of \(U\) and \(V\) corresponding to the non-zero entries \(\lambda_i\) of \(\Pi\), we can now write

\begin{equation} P = \sum_i \lambda_i \ketbra{\psi_i}{\psi'_i}, \end{equation}

which translates back to

\begin{equation} \ket \Psi = \sum_i \lambda_i \ket{\psi_i} \otimes \ket{\psi'_i}. \end{equation}

As a result, the number of terms in the expansion of the state vector of a composite system in a separable basis can always be reduced to at most \(\min (\dim \mathcal H, \dim\mathcal H')\).
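The derivation above translates directly into a short computation (assuming numpy; the bipartite state is randomly generated): reshape the coefficient vector into the matrix \(P\), take its SVD, and the singular values are the Schmidt coefficients.

```python
import numpy as np

dA, dB = 2, 3
rng = np.random.default_rng(7)
Psi = rng.normal(size=dA * dB) + 1j * rng.normal(size=dA * dB)
Psi /= np.linalg.norm(Psi)                        # a random normalized bipartite pure state

P = Psi.reshape(dA, dB)                           # the gamma_{i,j} coefficients as a matrix
U, lam, Vh = np.linalg.svd(P)                     # singular values lam = Schmidt coefficients

rebuilt = sum(lam[k] * np.kron(U[:, k], Vh[k, :]) for k in range(min(dA, dB)))
assert np.allclose(rebuilt, Psi)                  # |Psi> = sum_k lam_k |psi_k> (x) |psi'_k>
assert np.isclose(np.sum(lam ** 2), 1.0)          # normalization; lam has min(dA, dB) entries
```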

9.2. Entanglement

For pure states, an entangled state is a state whose Schmidt number — the number of non-zero coefficients in its Schmidt decomposition — is strictly greater than 1.