Principles of Quantum
Mechanics
SECOND EDITION
R. Shankar
Yale University
New Haven, Connecticut
PLENUM PRESS • NEW YORK AND LONDON
Library of Congress Cataloging-in-Publication Data
Shankar, Ramamurti.
Principles of quantum mechanics / R. Shankar.  2nd ed.
p. cm.
Includes bibliographical references and index.
ISBN 0306447908
1. Quantum theory. I. Title.
QC174.12.S52 1994
530.1'2-dc20 94-26837
CIP
ISBN 0306447908
©1994, 1980 Plenum Press, New York
A Division of Plenum Publishing Corporation
233 Spring Street, New York, N.Y. 10013
All rights reserved
No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any
means, electronic, mechanical, photocopying, microfilming, recording, or otherwise, without written
permission from the Publisher
Printed in the United States of America
To
My Parents
and to
Uma, Umesh, Ajeet, Meera, and Maya
Preface to the Second Edition
Over the decade and a half since I wrote the first edition, nothing has altered my
belief in the soundness of the overall approach taken here. This is based on the
response of teachers, students, and my own occasional rereading of the book. I was
generally quite happy with the book, although there were portions where I felt I
could have done better and portions which bothered me by their absence. I welcome
this opportunity to rectify all that.
Apart from small improvements scattered over the text, there are three major
changes. First, I have rewritten a big chunk of the mathematical introduction in
Chapter 1. Next, I have added a discussion of time-reversal invariance. I don't know
how it got left out the first time; I wish I could go back and change it. The most
important change concerns the inclusion of Chapter 21, "Path Integrals: Part II."
The first edition already revealed my partiality for this subject by having a chapter
devoted to it, which was quite unusual in those days. In this one, I have cast off all
restraint and gone all out to discuss many kinds of path integrals and their uses.
Whereas in Chapter 8 the path integral recipe was simply given, here I start by
deriving it. I derive the configuration space integral (the usual Feynman integral),
phase space integral, and (oscillator) coherent state integral. I discuss two applications:
the derivation and application of the Berry phase and a study of the lowest
Landau level with an eye on the quantum Hall effect. The relevance of these topics
is unquestionable. This is followed by a section on the imaginary time path integral:
its description of tunneling, instantons, and symmetry breaking, and its relation to
classical and quantum statistical mechanics. An introduction is given to the transfer
matrix. Then I discuss spin coherent state path integrals and path integrals for
fermions. These were thought to be topics too advanced for a book like this, but I
believe this is no longer true. These concepts are extensively used and it seemed a
good idea to provide the students who had the wisdom to buy this book with a head
start.
How are instructors to deal with this extra chapter given the time constraints?
I suggest omitting some material from the earlier chapters. (No one I know, myself
included, covers the whole book while teaching any fixed group of students.) A
realistic option is for the instructor to teach part of Chapter 21 and assign the rest
as reading material, as topics for take-home exams, term papers, etc. To ignore it,
I think, would be to lose a wonderful opportunity to expose the student to ideas
that are central to many current research topics and to deny them the attendant
excitement. Since the aim of this chapter is to guide students toward more frontline
topics, it is more concise than the rest of the book. Students are also expected to
consult the references given at the end of the chapter.
Over the years, I have received some very useful feedback and I thank all those
students and teachers who took the time to do so. I thank Howard Haber for a
discussion of the Born approximation; Harsh Mathur and Ady Stern for discussions
of the Berry phase; Alan Chodos, Steve Girvin, Ilya Gruzberg, Martin Gutzwiller,
Ganpathy Murthy, Charlie Sommerfield, and Senthil Todari for many useful comments
on Chapter 21. I thank Amelia McNamara of Plenum for urging me to write
this edition and Plenum for its years of friendly and warm cooperation. Finally, I
thank my wife Uma for shielding me as usual from real life so I could work on this
edition, and my battery of kids (revised and expanded since the previous edition)
for continually charging me up.
R. Shankar
New Haven, Connecticut
Preface to the First Edition
Publish and perish—Giordano Bruno
Given the number of books that already exist on the subject of quantum mechanics,
one would think that the public needs one more as much as it does, say, the latest
version of the Table of Integers. But this does not deter me (as it didn't my predecessors)
from trying to circulate my own version of how it ought to be taught. The
approach to be presented here (to be described in a moment) was first tried on a
group of Harvard undergraduates in the summer of '76, once again in the summer
of '77, and more recently at Yale on undergraduates ('77-'78) and graduates ('78-'79)
taking a year-long course on the subject. In all cases the results were very
satisfactory in the sense that the students seemed to have learned the subject well
and to have enjoyed the presentation. It is, in fact, their enthusiastic response and
encouragement that convinced me of the soundness of my approach and impelled
me to write this book.
The basic idea is to develop the subject from its postulates, after addressing
some indispensable preliminaries. Now, most people would agree that the best way
to teach any subject that has reached the point of development where it can be
reduced to a few postulates is to start with the latter, for it is this approach that
gives students the fullest understanding of the foundations of the theory and how it
is to be used. But they would also argue that whereas this is all right in the case of
special relativity or mechanics, a typical student about to learn quantum mechanics
seldom has any familiarity with the mathematical language in which the postulates
are stated. I agree with these people that this problem is real, but I differ in my belief
that it should and can be overcome. This book is an attempt at doing just this.
It begins with a rather lengthy chapter in which the relevant mathematics of
vector spaces is developed from simple ideas on vectors and matrices the student is
assumed to know. The level of rigor is what I think is needed to make a practicing
quantum mechanic out of the student. This chapter, which typically takes six to
eight lecture hours, is filled with examples from physics to keep students from getting
too fidgety while they wait for the "real physics." Since the math introduced has to
be taught sooner or later, I prefer sooner to later, for this way the students, when
they get to it, can give quantum theory their fullest attention without having to
battle with the mathematical theorems at the same time. Also, by segregating the
mathematical theorems from the physical postulates, any possible confusion as to
which is which is nipped in the bud.
This chapter is followed by one on classical mechanics, where the Lagrangian
and Hamiltonian formalisms are developed in some depth. It is for the instructor to
decide how much of this to cover; the more students know of these matters, the
better they will understand the connection between classical and quantum mechanics.
Chapter 3 is devoted to a brief study of idealized experiments that betray the
inadequacy of classical mechanics and give a glimpse of quantum mechanics.
Having trained and motivated the students I now give them the postulates of
quantum mechanics of a single particle in one dimension. I use the word "postulate"
here to mean "that which cannot be deduced from pure mathematical or logical
reasoning, and given which one can formulate and solve quantum mechanical problems
and interpret the results." This is not the sense in which the true axiomatist
would use the word. For instance, where the true axiomatist would just postulate
that the dynamical variables are given by Hilbert space operators, I would add the
operator identifications, i.e., specify the operators that represent coordinate and
momentum (from which others can be built). Likewise, I would not stop with the
statement that there is a Hamiltonian operator that governs the time evolution
through the equation $i\hbar\,\partial|\psi\rangle/\partial t = H|\psi\rangle$; I would say the $H$ is obtained from the
classical Hamiltonian by substituting for x and p the corresponding operators. While
the more general axioms have the virtue of surviving as we progress to systems of
more degrees of freedom, with or without classical counterparts, students given just
these will not know how to calculate anything such as the spectrum of the oscillator.
Now one can, of course, try to "derive" these operator assignments, but to do so
one would have to appeal to ideas of a postulatory nature themselves. (The same
goes for "deriving" the Schrödinger equation.) As we go along, these postulates are
generalized to more degrees of freedom and it is for pedagogical reasons that these
generalizations are postponed. Perhaps when students are finished with this book,
they can free themselves from the specific operator assignments and think of quantum
mechanics as a general mathematical formalism obeying certain postulates (in the
strict sense of the term).
The postulates in Chapter 4 are followed by a lengthy discussion of the same,
with many examples from fictitious Hilbert spaces of three dimensions. Nonetheless,
students will find it hard. It is only as they go along and see these postulates used
over and over again in the rest of the book, in the setting up of problems and the
interpretation of the results, that they will catch on to how the game is played. It is
hoped they will be able to do it on their own when they graduate. I think that any
attempt to soften this initial blow will be counterproductive in the long run.
Chapter 5 deals with standard problems in one dimension. It is worth mentioning
that the scattering off a step potential is treated using a wave packet approach. If
the subject seems too hard at this stage, the instructor may decide to return to it
after Chapter 7 (oscillator), when students have gained more experience. But I think
that sooner or later students must get acquainted with this treatment of scattering.
The classical limit is the subject of the next chapter. The harmonic oscillator is
discussed in detail in the next. It is the first realistic problem and the instructor may
be eager to get to it as soon as possible. If the instructor wants, he or she can discuss
the classical limit after discussing the oscillator.
We next discuss the path integral formulation due to Feynman. Given the intuitive
understanding it provides, and its elegance (not to mention its ability to give
the full propagator in just a few minutes in a class of problems), its omission from
so many books is hard to understand. While it is admittedly hard to actually evaluate
a path integral (one example is provided here), the notion of expressing the propagator
as a sum over amplitudes from various paths is rather simple. The importance
of this point of view is becoming clearer day by day to workers in statistical mechanics
and field theory. I think every effort should be made to include at least the first three
(and possibly five) sections of this chapter in the course.
The content of the remaining chapters is standard, in the first approximation.
The style is of course peculiar to this author, as are the specific topics. For instance,
an entire chapter (11) is devoted to symmetries and their consequences. The chapter
on the hydrogen atom also contains a section on how to make numerical estimates
starting with a few mnemonics. Chapter 15, on addition of angular momenta, also
contains a section on how to understand the "accidental" degeneracies in the spectra
of hydrogen and the isotropic oscillator. The quantization of the radiation field is
discussed in Chapter 18, on time-dependent perturbation theory. Finally the treatment
of the Dirac equation in the last chapter (20) is intended to show that several
things such as electron spin, its magnetic moment, the spin-orbit interaction, etc.,
which were introduced in an ad hoc fashion in earlier chapters, emerge as a coherent
whole from the Dirac equation, and also to give students a glimpse of what lies
ahead. This chapter also explains how Feynman resolves the problem of negative
energy solutions (in a way that applies to bosons and fermions).
For Whom Is this Book Intended?
In writing it, I addressed students who are trying to learn the subject by themselves;
that is to say, I made it as self-contained as possible, included a lot of exercises
and answers to most of them, and discussed several tricky points that trouble students
when they learn the subject. But I am aware that in practice it is most likely to be
used as a class text. There is enough material here for a full year graduate course.
It is, however, quite easy to adapt it to a year-long undergraduate course. Several
sections that may be omitted without loss of continuity are indicated. The sequence
of topics may also be changed, as stated earlier in this preface. I thought it best to
let the instructor skim through the book and chart the course for his or her class,
given their level of preparation and objectives. Of course the book will not be particularly
useful if the instructor is not sympathetic to the broad philosophy espoused
here, namely, that first comes the mathematical training and then the development
of the subject from the postulates. To instructors who feel that this approach is all
right in principle but will not work in practice, I reiterate that it has been found to
work in practice, not just by me but also by teachers elsewhere.
The book may be used by nonphysicists as well. (I have found that it goes well
with chemistry majors in my classes.) Although I wrote it for students with no familiarity
with the subject, any previous exposure can only be advantageous.
Finally, I invite instructors and students alike to communicate to me any suggestions
for improvement, whether they be pedagogical or in reference to errors or
misprints.
Acknowledgments
As I look back to see who all made this book possible, my thoughts first turn
to my brother R. Rajaraman and friend Rajaram Nityananda, who, around the
same time, introduced me to physics in general and quantum mechanics in particular.
Next come my students, particularly Doug Stone, but for whose encouragement and
enthusiastic response I would not have undertaken this project. I am grateful to
Professor Julius Kovacs of Michigan State, whose kind words of encouragement
assured me that the book would be as well received by my peers as it was by
my students. More recently, I have profited from numerous conversations with my
colleagues at Yale, in particular Alan Chodos and Peter Mohr. My special thanks
go to Charles Sommerfield, who managed to make time to read the manuscript and
made many useful comments and recommendations. The detailed proofreading was
done by Tom Moore. I thank you, the reader, in advance, for drawing to my notice
any errors that may have slipped past us.
The bulk of the manuscript production costs were borne by the J. W. Gibbs
fellowship from Yale, which also supported me during the time the book was being
written. Ms. Laurie Liptak did a fantastic job of typing the first 18 chapters and
Ms. Linda Ford did the same with Chapters 19 and 20. The figures are by Mr. J.
Brosious. Mr. R. Badrinath kindly helped with the index.†
On the domestic front, encouragement came from my parents, my in-laws, and
most important of all from my wife, Uma, who cheerfully donated me to science for
a year or so and stood by me throughout. Little Umesh did his bit by tearing up all
my books on the subject, both as a show of support and to create a need for this
one.
R. Shankar
New Haven, Connecticut
† It is a pleasure to acknowledge the help of Mr. Richard Hatch, who drew my attention to a number
of errors in the first printing.
Prelude
Our description of the physical world is dynamic in nature and undergoes frequent
change. At any given time, we summarize our knowledge of natural phenomena by
means of certain laws. These laws adequately describe the phenomenon studied up
to that time, to an accuracy then attainable. As time passes, we enlarge the domain
of observation and improve the accuracy of measurement. As we do so, we constantly
check to see if the laws continue to be valid. Those laws that do remain valid gain
in stature, and those that do not must be abandoned in favor of new ones that do.
In this changing picture, the laws of classical mechanics formulated by Galileo,
Newton, and later by Euler, Lagrange, Hamilton, Jacobi, and others, remained
unaltered for almost three centuries. The expanding domain of classical physics met
its first obstacles around the beginning of this century. The obstruction came on two
fronts: at large velocities and small (atomic) scales. The problem of large velocities
was successfully solved by Einstein, who gave us his relativistic mechanics, while the
founders of quantum mechanics (Bohr, Heisenberg, Schrödinger, Dirac, Born, and
others) solved the problem of small-scale physics. The union of relativity and quantum
mechanics, needed for the description of phenomena involving simultaneously
large velocities and small scales, turns out to be very difficult. Although much progress
has been made in this subject, called quantum field theory, there remain many
open questions to this date. We shall concentrate here on just the small-scale problem,
that is to say, on nonrelativistic quantum mechanics.
The passage from classical to quantum mechanics has several features that are
common to all such transitions in which an old theory gives way to a new one:
(1) There is a domain $D_n$ of phenomena described by the new theory and a subdomain
$D_0$ wherein the old theory is reliable (to a given accuracy).
(2) Within the subdomain $D_0$ either theory may be used to make quantitative predictions.
It might often be more expedient to employ the old theory.
(3) In addition to numerical accuracy, the new theory often brings about radical
conceptual changes. Being of a qualitative nature, these will have a bearing on
all of $D_n$.
For example, in the case of relativity, $D_0$ and $D_n$ represent (macroscopic)
phenomena involving small and arbitrary velocities, respectively, the latter, of course,
being bounded by the velocity of light. In addition to giving better numerical predictions
for high-velocity phenomena, relativity theory also outlaws several cherished
notions of the Newtonian scheme, such as absolute time, absolute length, unlimited
velocities for particles, etc.
In a similar manner, quantum mechanics brings with it not only improved
numerical predictions for the microscopic world, but also conceptual changes that
rock the very foundations of classical thought.
This book introduces you to this subject, starting from its postulates. Between
you and the postulates there stand three chapters wherein you will find a summary
of the mathematical ideas appearing in the statement of the postulates, a review of
classical mechanics, and a brief description of the empirical basis for the quantum
theory. In the rest of the book, the postulates are invoked to formulate and solve a
variety of quantum mechanical problems. It is hoped that, by the time you get to
the end of the book, you will be able to do the same yourself.
Note to the Student
Do as many exercises as you can, especially the ones marked * or whose results
carry equation numbers. The answer to each exercise is given either with the exercise
or at the end of the book.
The first chapter is very important. Do not rush through it. Even if you know
the math, read it to get acquainted with the notation.
I am not saying it is an easy subject. But I hope this book makes it seem
reasonable.
Good luck.
Contents
1. Mathematical Introduction
1.1. Linear Vector Spaces: Basics
1.2. Inner Product Spaces
1.3. Dual Spaces and the Dirac Notation
1.4. Subspaces
1.5. Linear Operators
1.6. Matrix Elements of Linear Operators
1.7. Active and Passive Transformations
1.8. The Eigenvalue Problem
1.9. Functions of Operators and Related Concepts
1.10. Generalization to Infinite Dimensions
2. Review of Classical Mechanics
2.1. The Principle of Least Action and Lagrangian Mechanics
2.2. The Electromagnetic Lagrangian
2.3. The Two-Body Problem
2.4. How Smart Is a Particle?
2.5. The Hamiltonian Formalism
2.6. The Electromagnetic Force in the Hamiltonian Scheme
2.7. Cyclic Coordinates, Poisson Brackets, and Canonical Transformations
2.8. Symmetries and Their Consequences
3. All Is Not Well with Classical Mechanics
3.1. Particles and Waves in Classical Physics
3.2. An Experiment with Waves and Particles (Classical)
3.3. The Double-Slit Experiment with Light
3.4. Matter Waves (de Broglie Waves)
3.5. Conclusions
4. The Postulates: a General Discussion
4.1. The Postulates
4.2. Discussion of Postulates I-III
4.3. The Schrödinger Equation (Dotting Your i's and Crossing Your ħ's)
5. Simple Problems in One Dimension
5.1. The Free Particle
5.2. The Particle in a Box
5.3. The Continuity Equation for Probability
5.4. The Single-Step Potential: a Problem in Scattering
5.5. The Double-Slit Experiment
5.6. Some Theorems
6. The Classical Limit
7. The Harmonic Oscillator
7.1. Why Study the Harmonic Oscillator?
7.2. Review of the Classical Oscillator
7.3. Quantization of the Oscillator (Coordinate Basis)
7.4. The Oscillator in the Energy Basis
7.5. Passage from the Energy Basis to the X Basis
8. The Path Integral Formulation of Quantum Theory
8.1. The Path Integral Recipe
8.2. Analysis of the Recipe
8.3. An Approximation to U(t) for the Free Particle
8.4. Path Integral Evaluation of the Free-Particle Propagator
8.5. Equivalence to the Schrödinger Equation
8.6. Potentials of the Form $V = a + bx + cx^2 + d\dot{x} + ex\dot{x}$
9. The Heisenberg Uncertainty Relations
9.1. Introduction
9.2. Derivation of the Uncertainty Relations
9.3. The Minimum Uncertainty Packet
9.4. Applications of the Uncertainty Principle
9.5. The Energy-Time Uncertainty Relation
10. Systems with N Degrees of Freedom
10.1. N Particles in One Dimension
10.2. More Particles in More Dimensions
10.3. Identical Particles
11. Symmetries and Their Consequences
11.1. Overview
11.2. Translational Invariance in Quantum Theory
11.3. Time Translational Invariance
11.4. Parity Invariance
11.5. Time-Reversal Symmetry
12. Rotational Invariance and Angular Momentum
12.1. Translations in Two Dimensions
12.2. Rotations in Two Dimensions
12.3. The Eigenvalue Problem of $L_z$
12.4. Angular Momentum in Three Dimensions
12.5. The Eigenvalue Problem of $L^2$ and $L_z$
12.6. Solution of Rotationally Invariant Problems
13. The Hydrogen Atom
13.1. The Eigenvalue Problem
13.2. The Degeneracy of the Hydrogen Spectrum
13.3. Numerical Estimates and Comparison with Experiment
13.4. Multielectron Atoms and the Periodic Table
14. Spin
14.1. Introduction
14.2. What is the Nature of Spin?
14.3. Kinematics of Spin
14.4. Spin Dynamics
14.5. Return of Orbital Degrees of Freedom
15. Addition of Angular Momenta
15.1. A Simple Example
15.2. The General Problem
15.3. Irreducible Tensor Operators
15.4. Explanation of Some "Accidental" Degeneracies
16. Variational and WKB Methods
16.1. The Variational Method
16.2. The Wentzel-Kramers-Brillouin Method
17. Time-Independent Perturbation Theory
17.1. The Formalism
17.2. Some Examples
17.3. Degenerate Perturbation Theory
18. Time-Dependent Perturbation Theory
18.1. The Problem
18.2. First-Order Perturbation Theory
18.3. Higher Orders in Perturbation Theory
18.4. A General Discussion of Electromagnetic Interactions
18.5. Interaction of Atoms with Electromagnetic Radiation
19. Scattering Theory
19.1. Introduction
19.2. Recapitulation of One-Dimensional Scattering and Overview
19.3. The Born Approximation (Time-Dependent Description)
19.4. Born Again (The Time-Independent Approximation)
19.5. The Partial Wave Expansion
19.6. Two-Particle Scattering
20. The Dirac Equation
20.1. The Free-Particle Dirac Equation
20.2. Electromagnetic Interaction of the Dirac Particle
20.3. More on Relativistic Quantum Mechanics
21. Path Integrals: II
21.1. Derivation of the Path Integral
21.2. Imaginary Time Formalism
21.3. Spin and Fermion Path Integrals
21.4. Summary
Appendix
A.1. Matrix Inversion
A.2. Gaussian Integrals
A.3. Complex Numbers
A.4. The $i\varepsilon$ Prescription
ANSWERS TO SELECTED EXERCISES
TABLE OF CONSTANTS
INDEX
1
Mathematical Introduction
The aim of this book is to provide you with an introduction to quantum mechanics,
starting from its axioms. It is the aim of this chapter to equip you with the necessary
mathematical machinery. All the math you will need is developed here, starting from
some basic ideas on vectors and matrices that you are assumed to know. Numerous
examples and exercises related to classical mechanics are given, both to provide some
relief from the math and to demonstrate the wide applicability of the ideas developed
here. The effort you put into this chapter will be well worth your while: not only
will it prepare you for this course, but it will also unify many ideas you may have
learned piecemeal. To really learn this chapter, you must, as with any other chapter,
work out the problems.
1.1. Linear Vector Spaces: Basics
In this section you will be introduced to linear vector spaces. You are surely
familiar with the arrows from elementary physics encoding the magnitude and
direction of velocity, force, displacement, torque, etc. You know how to add them
and multiply them by scalars and the rules obeyed by these operations. For example,
you know that scalar multiplication is distributive: the multiple of a sum of two
vectors is the sum of the multiples. What we want to do is abstract from this simple
case a set of basic features or axioms, and say that any set of objects obeying the same
forms a linear vector space. The cleverness lies in deciding which of the properties to
keep in the generalization. If you keep too many, there will be no other examples;
if you keep too few, there will be no interesting results to develop from the axioms.
The following is the list of properties the mathematicians have wisely chosen as
requisite for a vector space. As you read them, please compare them to the world
of arrows and make sure that these are indeed properties possessed by these familiar
vectors. But note also that conspicuously missing are the requirements that every
vector have a magnitude and direction, which was the first and most salient feature
drilled into our heads when we first heard about them. So you might think that
in dropping this requirement, the baby has been thrown out with the bath water.
However, you will have ample time to appreciate the wisdom behind this choice as
you go along and see a great unification and synthesis of diverse ideas under the
heading of vector spaces. You will see examples of vector spaces that involve entities
that you cannot intuitively perceive as having either a magnitude or a direction.
While you should be duly impressed with all this, remember that it does not hurt at
all to think of these generalizations in terms of arrows and to use the intuition to
prove theorems or at the very least anticipate them.
Definition 1. A linear vector space $\mathbb{V}$ is a collection of objects
$|1\rangle, |2\rangle, \ldots, |V\rangle, \ldots, |W\rangle, \ldots$, called vectors, for which there exists
1. A definite rule for forming the vector sum, denoted $|V\rangle + |W\rangle$
2. A definite rule for multiplication by scalars $a, b, \ldots$, denoted $a|V\rangle$ with the
following features:
• The result of these operations is another element of the space, a feature called
closure: $|V\rangle + |W\rangle \in \mathbb{V}$.
• Scalar multiplication is distributive in the vectors: $a(|V\rangle + |W\rangle) = a|V\rangle + a|W\rangle$.
• Scalar multiplication is distributive in the scalars: $(a + b)|V\rangle = a|V\rangle + b|V\rangle$.
• Scalar multiplication is associative: $a(b|V\rangle) = ab|V\rangle$.
• Addition is commutative: $|V\rangle + |W\rangle = |W\rangle + |V\rangle$.
• Addition is associative: $|V\rangle + (|W\rangle + |Z\rangle) = (|V\rangle + |W\rangle) + |Z\rangle$.
• There exists a null vector $|0\rangle$ obeying $|V\rangle + |0\rangle = |V\rangle$.
• For every vector $|V\rangle$ there exists an inverse under addition, $|-V\rangle$, such that
$|V\rangle + |-V\rangle = |0\rangle$.
There is a good way to remember all of these; do what comes naturally.
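As a numerical sanity check on the list above, the following Python sketch spot-checks the axioms on ordered triples of real numbers with entry-by-entry addition. The helper names (`add`, `scale`, `check_axioms`) and the sample vectors and scalars are illustrative assumptions, not notation from the text. A finite check of this kind can expose a violation of an axiom, but it cannot prove that the axioms hold for every vector.

```python
from fractions import Fraction as F
from itertools import product

def add(v, w):
    """Vector sum: add corresponding entries."""
    return tuple(x + y for x, y in zip(v, w))

def scale(a, v):
    """Scalar multiplication: multiply every entry by a."""
    return tuple(a * x for x in v)

ZERO = (F(0), F(0), F(0))  # candidate null vector

def check_axioms(vectors, scalars):
    """Return True if every axiom holds on the given samples."""
    for v, w, z in product(vectors, repeat=3):
        if add(v, w) != add(w, v):                    # commutativity
            return False
        if add(v, add(w, z)) != add(add(v, w), z):    # associativity
            return False
    for v in vectors:
        # null vector, and additive inverse taken to be (-1) times v
        if add(v, ZERO) != v or add(v, scale(-1, v)) != ZERO:
            return False
    for a, b in product(scalars, repeat=2):
        for v, w in product(vectors, repeat=2):
            if scale(a, add(v, w)) != add(scale(a, v), scale(a, w)):
                return False                          # distributive in vectors
            if scale(a + b, v) != add(scale(a, v), scale(b, v)):
                return False                          # distributive in scalars
            if scale(a, scale(b, v)) != scale(a * b, v):
                return False                          # associative in scalars
    return True

# Exact rational arithmetic avoids floating-point round-off in the comparisons.
vectors = [(F(1), F(2), F(3)), (F(-1, 2), F(0), F(5)), ZERO]
scalars = [F(0), F(1), F(-3, 4)]
axioms_hold = check_axioms(vectors, scalars)
print(axioms_hold)
```

Closure is automatic in this sketch because `add` and `scale` always return another triple. Swapping in a different `add` or `scale` is a quick way to see which axiom a proposed rule breaks.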
Definition 2. The numbers a, b, . . . are called the field over which the vector
space is defined.
If the field consists of all real numbers, we have a real vector space; if they are
complex, we have a complex vector space. The vectors themselves are neither real nor
complex; the adjective applies only to the scalars.
Let us note that the above axioms imply
• $|0\rangle$ is unique, i.e., if $|0'\rangle$ has all the properties of $|0\rangle$, then $|0\rangle = |0'\rangle$.
• $0|V\rangle = |0\rangle$.
• $|-V\rangle = -|V\rangle$.
• $|-V\rangle$ is the unique additive inverse of $|V\rangle$.
The proofs are left to the following exercise. You don't have to know the proofs,
but you do have to know the statements.
Exercise 1.1.1. Verify these claims. For the first consider $|0\rangle + |0'\rangle$ and use the advertised
properties of the two null vectors in turn. For the second start with $|0\rangle = (0 + 1)|V\rangle + |-V\rangle$.
For the third, begin with $|V\rangle + (-|V\rangle) = 0|V\rangle = |0\rangle$. For the last, let $|W\rangle$ also satisfy
$|V\rangle + |W\rangle = |0\rangle$. Since $|0\rangle$ is unique, this means $|V\rangle + |W\rangle = |V\rangle + |-V\rangle$. Take it from here.
Figure 1.1. The rule for vector addition. Note that it obeys axioms (i)-(iii).
Exercise 1.1.2. Consider the set of all entities of the form (a, b, c) where the entries are
real numbers. Addition and scalar multiplication are defined as follows:
$(a, b, c) + (d, e, f) = (a + d, b + e, c + f)$
$\alpha(a, b, c) = (\alpha a, \alpha b, \alpha c)$.
Write down the null vector and inverse of (a, b, c). Show that vectors of the form (a, b, 1) do
not form a vector space.
Observe that we are using a new symbol $|V\rangle$ to denote a generic vector. This
object is called ket V and this nomenclature is due to Dirac whose notation will be
discussed at some length later. We purposely do not use the symbol $\vec{V}$ to denote the
vectors as the first step in weaning you away from the limited concept of the vector
as an arrow. You are however not discouraged from associating with $|V\rangle$ the arrow-like
object till you have seen enough vectors that are not arrows and are ready to
drop the crutch.
You were asked to verify that the set of arrows qualified as a vector space as
you read the axioms. Here are some of the key ideas you should have gone over.
The vector space consists of arrows, typical ones being $\vec{V}$ and $\vec{V}'$. The rule for
addition is familiar: take the tail of the second arrow, put it on the tip of the first,
and so on as in Fig. 1.1.
Scalar multiplication by a corresponds to stretching the vector by a factor a.
This is a real vector space since stretching by a complex number makes no sense. (If
$a$ is negative, we interpret it as changing the direction of the arrow as well as rescaling
it by $|a|$.) Since these operations acting on arrows give more arrows, we have closure.
Addition and scalar multiplication clearly have all the desired associative and distributive
features. The null vector is the arrow of zero length, while the inverse of a
vector is the vector reversed in direction.
So the set of all arrows qualifies as a vector space. But we cannot tamper with
it. For example, the set of all arrows with positive z-components does not form a
vector space: there is no inverse.
Note that so far, no reference has been made to magnitude or direction. The
point is that while the arrows have these qualities, members of a vector space need
not. This statement is pointless unless I can give you examples, so here are two.
Consider the set of all 2 × 2 matrices. We know how to add them and multiply
them by scalars (multiply all four matrix elements by that scalar). The corresponding
rules obey closure, associativity, and distributive requirements. The null matrix has
all zeros in it and the inverse under addition of a matrix is the matrix with all elements
negated. You must agree that here we have a genuine vector space consisting of
things which don't have an obvious length or direction associated with them. When
we want to highlight the fact that the matrix M is an element of a vector space, we
may want to refer to it as, say, ket number 4 or $|4\rangle$.
As a second example, consider all functions $f(x)$ defined in an interval $0 \le x \le L$.
We define scalar multiplication by $a$ simply as $af(x)$ and addition as pointwise
addition: the sum of two functions $f$ and $g$ has the value $f(x) + g(x)$ at the point $x$.
The null function is zero everywhere and the additive inverse of $f$ is $-f$.
Exercise 1.1.3. Do functions that vanish at the end points x=0 and x=L form a vector
space? How about periodic functions obeying f(0)=f(L)? How about functions that obey
f(0)= 4? If the functions do not qualify, list the things that go wrong.
The next concept is that of linear independence of a set of vectors $|1\rangle, |2\rangle, \ldots, |n\rangle$.
First consider a linear relation of the form
$$ \sum_{i=1}^{n} a_i |i\rangle = |0\rangle \tag{1.1.1} $$
We may assume without loss of generality that the left-hand side does not
contain any multiple of $|0\rangle$, for if it did, it could be shifted to the right and combined
with the $|0\rangle$ there to give $|0\rangle$ once more. (We are using the fact that any multiple
of $|0\rangle$ equals $|0\rangle$.)
Definition 3. The set of vectors is said to be linearly independent if the only such
linear relation as Eq. (1.1.1) is the trivial one with all $a_i = 0$. If the set of vectors
is not linearly independent, we say they are linearly dependent.
Equation (1.1.1) tells us that it is not possible to write any member of the
linearly independent set in terms of the others. On the other hand, if the set of
vectors is linearly dependent, such a relation will exist, and it must contain at least
two nonzero coefficients. Let us say $a_3 \neq 0$. Then we could write
$$ |3\rangle = \sum_{i=1,\, i \neq 3}^{n} \frac{-a_i}{a_3}\, |i\rangle \tag{1.1.2} $$
thereby expressing $|3\rangle$ in terms of the others.
As a concrete example, consider two nonparallel vectors $|1\rangle$ and $|2\rangle$ in a plane.
These form a linearly independent set. There is no way to write one as a multiple of
the other, or equivalently, no way to combine them to get the null vector. On the
other hand, if the vectors are parallel, we can clearly write one as a multiple of the
other or equivalently play them against each other to get 0.
Notice I said 0 and not $|0\rangle$. This is, strictly speaking, incorrect since a set of
vectors can only add up to a vector and not a number. It is, however, common to
represent the null vector by 0.
Suppose we bring in a third vector $|3\rangle$ also in the plane. If it is parallel to either
of the first two, we already have a linearly dependent set. So let us suppose it is not.
But even now the three of them are linearly dependent. This is because we can write
one of them, say $|3\rangle$, as a linear combination of the other two. To find the combination,
draw a line from the tail of $|3\rangle$ in the direction of $|1\rangle$. Next draw a line
antiparallel to $|2\rangle$ from the tip of $|3\rangle$. These lines will intersect since $|1\rangle$ and $|2\rangle$ are
not parallel by assumption. The intersection point P will determine how much of
$|1\rangle$ and $|2\rangle$ we want: we go from the tail of $|3\rangle$ to P using the appropriate multiple
of $|1\rangle$ and go from P to the tip of $|3\rangle$ using the appropriate multiple of $|2\rangle$.
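The geometric construction above amounts to solving two linear equations for the two multiples. A minimal sketch, assuming the arrows are given by Cartesian components (x, y); the function name `combine` and the sample numbers are illustrative and not from the text:

```python
def combine(v1, v2, v3):
    """Return (a, b) such that v3 = a*v1 + b*v2 for plane vectors (x, y)."""
    # The determinant vanishes exactly when v1 and v2 are parallel.
    det = v1[0] * v2[1] - v1[1] * v2[0]
    if det == 0:
        raise ValueError("v1 and v2 are parallel; they cannot span the plane")
    # Cramer's rule for the 2 x 2 system a*v1 + b*v2 = v3.
    a = (v3[0] * v2[1] - v3[1] * v2[0]) / det
    b = (v1[0] * v3[1] - v1[1] * v3[0]) / det
    return a, b


a, b = combine((1, 0), (1, 1), (3, 2))
print(a, b)  # multiples of the first and second vector
```

Since a nontrivial pair (a, b) always exists when the first two vectors are nonparallel, any three vectors in the plane are linearly dependent.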
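Exercise 1.1.4. Consider three elements from the vector space of real 2 × 2 matrices:
$$ |1\rangle = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \qquad |2\rangle = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \qquad |3\rangle = \begin{bmatrix} -2 & -1 \\ 0 & -2 \end{bmatrix} $$
Are they linearly independent? Support your answer with details. (Notice we are calling
these matrices vectors and using kets to represent them to emphasize their role as elements
of a vector space.)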
Exercise 1.1.5. Show that the following row vectors are linearly dependent: (1, 1, 0),
(1, 0, 1), and (3, 2, 1). Show the opposite for (1, 1, 0), (1, 0, 1), and (0, 1, 1).
Definition 4. A vector space has dimension n if it can accommodate a maximum
of n linearly independent vectors. It will be denoted by $\mathbb{V}^n(R)$ if the field is real
and by $\mathbb{V}^n(C)$ if the field is complex.
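In view of the earlier discussions, the plane is two-dimensional and the set of
all arrows not limited to the plane defines a three-dimensional vector space. How
about 2 × 2 matrices? They form a four-dimensional vector space. Here is a proof.
The following vectors are linearly independent:
$$ |1\rangle = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \qquad |2\rangle = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \qquad |3\rangle = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \qquad |4\rangle = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} $$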
since it is impossible to form linear combinations of any three of them to give the
fourth: any three of them will have a zero in the one place where the fourth does
not. So the space is at least four-dimensional. Could it be bigger? No, since any
arbitrary 2 × 2 matrix can be written in terms of them:
$$ \begin{bmatrix} a & b \\ c & d \end{bmatrix} = a|1\rangle + b|2\rangle + c|3\rangle + d|4\rangle $$
If the scalars a, b, c, d are real, we have a real four-dimensional space; if they
are complex we have a complex four-dimensional space.
Theorem 1. Any vector $|V\rangle$ in an n-dimensional space can be written as a
linear combination of n linearly independent vectors $|1\rangle, \ldots, |n\rangle$.
The proof is as follows: if there were a vector $|V\rangle$ for which this were not
possible, it would join the given set of vectors and form a set of n + 1 linearly
independent vectors, which is not possible in an n-dimensional space by definition.
Definition 5. A set of n linearly independent vectors in an n-dimensional space
is called a basis.
Thus we can write, on the strength of the above
$$ |V\rangle = \sum_{i=1}^{n} v_i |i\rangle \tag{1.1.3} $$
where the vectors $|i\rangle$ form a basis.
Definition 6. The coefficients of expansion $v_i$ of a vector in terms of a linearly
independent basis ($|i\rangle$) are called the components of the vector in that basis.
Theorem 2. The expansion in Eq. (1.1.3) is unique.
Suppose the expansion is not unique. We must then have a second expansion:
$$ |V\rangle = \sum_{i=1}^{n} v_i' |i\rangle \tag{1.1.4} $$
Subtracting Eq. (1.1.4) from Eq. (1.1.3) (i.e., multiplying the second by the
scalar $-1$ and adding the two equations) we get
$$ |0\rangle = \sum_{i} (v_i - v_i') |i\rangle \tag{1.1.5} $$
which implies that
$$ v_i = v_i' \tag{1.1.6} $$
since the basis vectors are linearly independent and only a trivial linear relation
between them can exist. Note that given a basis the components are unique, but if
we change the basis, the components will change. We refer to $|V\rangle$ as the vector in
the abstract, having an existence of its own and satisfying various relations involving
other vectors. When we choose a basis the vectors assume concrete forms in terms
of their components and the relation between vectors is satisfied by the components.
Imagine for example three arrows in the plane, $\vec{A}$, $\vec{B}$, $\vec{C}$, satisfying $\vec{A} + \vec{B} = \vec{C}$ according
to the laws for adding arrows. So far no basis has been chosen and we do not need
a basis to make the statement that the vectors form a closed triangle. Now we choose
a basis and write each vector in terms of the components. The components will
satisfy $C_i = A_i + B_i$, $i = 1, 2$. If we choose a different basis, the components will change
in numerical value, but the relation between them expressing the equality of $\vec{C}$ to
the sum of the other two will still hold between the new set of components.
In the case of nonarrow vectors, adding them in terms of components proceeds
as in the elementary case thanks to the axioms. If
$$ |V\rangle = \sum_i v_i |i\rangle \quad \text{and} \tag{1.1.7} $$
$$ |W\rangle = \sum_i w_i |i\rangle \quad \text{then} \tag{1.1.8} $$
$$ |V\rangle + |W\rangle = \sum_i (v_i + w_i) |i\rangle \tag{1.1.9} $$
where we have used the axioms to carry out the regrouping of terms. Here is the
conclusion:
To add two vectors, add their components.
There is no reference to taking the tail of one and putting it on the tip of the
other, etc., since in general the vectors have no head or tail. Of course, if we are
dealing with arrows, we can add them either using the tail and tip routine or by
simply adding their components in a basis.
In the same way, we have:
$$ a|V\rangle = a \sum_i v_i |i\rangle = \sum_i a v_i |i\rangle \tag{1.1.10} $$
In other words,
To multiply a vector by a scalar, multiply all its components by the scalar.
1.2. Inner Product Spaces
The matrix and function examples must have convinced you that we can have
a vector space with no preassigned definition of length or direction for the elements.
However, we can make up quantities that have the same properties that the lengths
and angles do in the case of arrows. The first step is to define a sensible analog of
the dot product, for in the case of arrows, from the dot product
$$ \vec{A} \cdot \vec{B} = |A||B| \cos\theta \tag{1.2.1} $$
we can read off the length of, say, $\vec{A}$ as $\sqrt{|A| \cdot |A|}$ and the cosine of the angle between
two vectors as $\vec{A} \cdot \vec{B} / |A||B|$. Now you might rightfully object: how can you use the dot
product to define the length and angles, if the dot product itself requires knowledge of
the lengths and angles? The answer is this. Recall that the dot product has a second
equivalent expression in terms of the components:
$$ \vec{A} \cdot \vec{B} = A_x B_x + A_y B_y + A_z B_z \tag{1.2.2} $$
Figure 1.2. Geometrical proof that the dot product obeys axiom (iii) for an inner product. The axiom requires that the projections obey $P_k + P_j = P_{jk}$.
Our goal is to define a similar formula for the general case where we do have the
notion of components in a basis. To this end we recall the main features of the above
dot product:
1. $\vec{A} \cdot \vec{B} = \vec{B} \cdot \vec{A}$ (symmetry)
2. $\vec{A} \cdot \vec{A} \ge 0$, with $\vec{A} \cdot \vec{A} = 0$ iff $\vec{A} = 0$ (positive semidefiniteness)
3. $\vec{A} \cdot (b\vec{B} + c\vec{C}) = b\vec{A} \cdot \vec{B} + c\vec{A} \cdot \vec{C}$ (linearity)
The linearity of the dot product is illustrated in Fig. 1.2.
We want to invent a generalization called the inner product or scalar product
between any two vectors $|V\rangle$ and $|W\rangle$. We denote it by the symbol $\langle V|W\rangle$. It is
once again a number (generally complex) dependent on the two vectors. We demand
that it obey the following axioms:
• $\langle V|W\rangle = \langle W|V\rangle^*$ (skew-symmetry)
• $\langle V|V\rangle \ge 0$, with $\langle V|V\rangle = 0$ iff $|V\rangle = |0\rangle$ (positive semidefiniteness)
• $\langle V|(a|W\rangle + b|Z\rangle) \equiv \langle V|aW + bZ\rangle = a\langle V|W\rangle + b\langle V|Z\rangle$ (linearity in ket)
Definition 7. A vector space with an inner product is called an inner product
space.
Notice that we have not yet given an explicit rule for actually evaluating the
scalar product, we are merely demanding that any rule we come up with must have
these properties. With a view to finding such a rule, let us familiarize ourselves with
the axioms. The first differs from the corresponding one for the dot product and
makes the inner product sensitive to the order of the two factors, with the two
choices leading to complex conjugates. In a real vector space this axiom states the
symmetry of the dot product under exchange of the two vectors. For the present,
let us note that this axiom ensures that $\langle V|V\rangle$ is real.
The second axiom says that $\langle V|V\rangle$ is not just real but also positive semidefinite,
vanishing only if the vector itself does. If we are going to define the length of the
vector as the square root of its inner product with itself (as in the dot product) this
quantity had better be real and positive for all nonzero vectors.
The last axiom expresses the linearity of the inner product when a linear superposition
$a|W\rangle + b|Z\rangle \equiv |aW + bZ\rangle$ appears as the second vector in the scalar product.
We have discussed its validity for the arrows case (Fig. 1.2).
What if the first factor in the product is a linear superposition, i.e., what is
$\langle aW + bZ|V\rangle$? This is determined by the first axiom:
$$ \begin{aligned} \langle aW + bZ|V\rangle &= \langle V|aW + bZ\rangle^* \quad \text{(by skew-symmetry)} \\ &= (a\langle V|W\rangle + b\langle V|Z\rangle)^* \\ &= a^*\langle V|W\rangle^* + b^*\langle V|Z\rangle^* \\ &= a^*\langle W|V\rangle + b^*\langle Z|V\rangle \end{aligned} \tag{1.2.3} $$
which expresses the antilinearity of the inner product with respect to the first factor
in the inner product. In other words, the inner product of a linear superposition
with another vector is the corresponding superposition of inner products if the
superposition occurs in the second factor, while it is the superposition with all coefficients
conjugated if the superposition occurs in the first factor. This asymmetry, unfamiliar
in real vector spaces, is here to stay and you will get used to it as you go along.
Let us continue with inner products. Even though we are trying to shed the
restricted notion of a vector as an arrow and seeking a corresponding generalization
of the dot product, we still use some of the same terminology.
Definition 8. We say that two vectors are orthogonal or perpendicular if their
inner product vanishes.
Definition 9. We will refer to $\sqrt{\langle V|V\rangle} \equiv |V|$ as the norm or length of the vector.
A normalized vector has unit norm.
Definition 10. A set of basis vectors all of unit norm, which are pairwise orthogonal,
will be called an orthonormal basis.
We will also frequently refer to the inner or scalar product as the dot product.
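We are now ready to obtain a concrete formula for the inner product in terms
of the components. Given $|V\rangle$ and $|W\rangle$
$$ |V\rangle = \sum_i v_i |i\rangle \qquad |W\rangle = \sum_j w_j |j\rangle $$
we follow the axioms obeyed by the inner product to obtain:
$$ \langle V|W\rangle = \sum_i \sum_j v_i^* w_j \langle i|j\rangle \tag{1.2.4} $$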
To go any further we have to know $\langle i|j\rangle$, the inner product between basis vectors.
That depends on the details of the basis vectors and all we know for sure is that
they are linearly independent. This situation exists for arrows as well. Consider a
two-dimensional problem where the basis vectors are two linearly independent but
nonperpendicular vectors. If we write all vectors in terms of this basis, the dot
product of any two of them will likewise be a double sum with four terms (determined
by the four possible dot products between the basis vectors) as well as the vector
components. However, if we use an orthonormal basis such as $\vec{i}, \vec{j}$, only diagonal
terms like $\langle i|i\rangle$ will survive and we will get the familiar result $\vec{A} \cdot \vec{B} = A_x B_x + A_y B_y$
depending only on the components.
For the more general nonarrow case, we invoke Theorem 3.
Theorem 3 (Gram-Schmidt). Given a linearly independent basis we can form
linear combinations of the basis vectors to obtain an orthonormal basis.
Postponing the proof for a moment, let us assume that the procedure has been
implemented and that the current basis is orthonormal:
$$ \langle i|j\rangle = \begin{cases} 1 & \text{for } i = j \\ 0 & \text{for } i \neq j \end{cases} \equiv \delta_{ij} $$
where $\delta_{ij}$ is called the Kronecker delta symbol. Feeding this into Eq. (1.2.4) we find
the double sum collapses to a single one due to the Kronecker delta, to give
$$ \langle V|W\rangle = \sum_i v_i^* w_i \tag{1.2.5} $$
This is the form of the inner product we will use from now on.
You can now appreciate the first axiom; but for the complex conjugation of
the components of the first vector, $\langle V|V\rangle$ would not even be real, not to mention
positive. But now it is given by
$$ \langle V|V\rangle = \sum_i |v_i|^2 \ge 0 \tag{1.2.6} $$
and vanishes only for the null vector. This makes it sensible to refer to $\langle V|V\rangle$ as
the length or norm squared of a vector.
Consider Eq. (1.2.5). Since the vector $|V\rangle$ is uniquely specified by its components
in a given basis, we may, in this basis, write it as a column vector:
$$ |V\rangle \to \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \text{ in this basis} \tag{1.2.7} $$
Likewise
$$ |W\rangle \to \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix} \text{ in this basis} \tag{1.2.8} $$
The inner product $\langle V|W\rangle$ is given by the matrix product of the transpose conjugate
of the column vector representing $|V\rangle$ with the column vector representing $|W\rangle$:
$$ \langle V|W\rangle = [v_1^*, v_2^*, \ldots, v_n^*] \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix} \tag{1.2.9} $$
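The component formula can be tried out directly. The following is a minimal sketch, assuming the components are given in an orthonormal basis and stored as Python lists; the helper name `inner` and the sample vectors are illustrative choices, not part of the text:

```python
def inner(v, w):
    """<V|W> = sum_i conj(v_i) * w_i, components in an orthonormal basis."""
    return sum(vi.conjugate() * wi for vi, wi in zip(v, w))


V = [1 + 2j, 3j]
W = [2 - 1j, 1 + 1j]
a = 2 - 3j

vw = inner(V, W)
wv = inner(W, V)                           # complex conjugate of vw
vv = inner(V, V)                           # real and non-negative
ket_side = inner(V, [a * x for x in W])    # a comes out unchanged
bra_side = inner([a * x for x in V], W)    # a comes out conjugated
print(vw, wv, vv, ket_side, bra_side)
```

The last two lines illustrate the asymmetry between the two factors: a scalar multiplying the second vector factors out as is, while one multiplying the first vector factors out complex conjugated.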
1.3. Dual Spaces and the Dirac Notation
There is a technical point here. The inner product is a number we are trying to
generate from two kets $|V\rangle$ and $|W\rangle$, which are both represented by column vectors
in some basis. Now there is no way to make a number out of two columns by direct
matrix multiplication, but there is a way to make a number by matrix multiplication
of a row times a column. Our trick for producing a number out of two columns has
been to associate a unique row vector with one column (its transpose conjugate)
and form its matrix product with the column representing the other. This has the
feature that the answer depends on which of the two vectors we are going to convert
to the row, the two choices ($\langle V|W\rangle$ and $\langle W|V\rangle$) leading to answers related by
complex conjugation as per the first axiom (skew-symmetry).
But one can also take the following alternate view. Column vectors are concrete
manifestations of an abstract vector $|V\rangle$ or ket in a basis. We can also work backward
and go from the column vectors to the abstract kets. But then it is similarly
possible to work backward and associate with each row vector an abstract object
$\langle W|$, called bra W. Now we can name the bras as we want but let us do the following.
Associated with every ket $|V\rangle$ is a column vector. Let us take its adjoint, or transpose
conjugate, and form a row vector. The abstract bra associated with this will bear
the same label, i.e., it will be called $\langle V|$. Thus there are two vector spaces, the space of
kets and a dual space of bras, with a ket for every bra and vice versa (the components
being related by the adjoint operation). Inner products are really defined only
between bras and kets and hence from elements of two distinct but related vector
spaces. There is a basis of vectors $|i\rangle$ for expanding kets and a similar basis $\langle i|$ for
expanding bras. The basis ket $|i\rangle$ is represented in the basis we are using by a column
vector with all zeros except for a 1 in the ith row, while the basis bra $\langle i|$ is a row
vector with all zeros except for a 1 in the ith column.
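All this may be summarized as follows:
$$ |V\rangle \leftrightarrow \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \leftrightarrow [v_1^*, v_2^*, \ldots, v_n^*] \leftrightarrow \langle V| \tag{1.3.1} $$
where $\leftrightarrow$ means "within a basis."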
There is, however, nothing wrong with the first viewpoint of associating a scalar
product with a pair of columns or kets (making no reference to another dual space)
and living with the asymmetry between the first and second vector in the inner
product (which one to transpose conjugate?). If you found the above discussion
heavy going, you can temporarily ignore it. The only thing you must remember is
that in the case of a general nonarrow vector space:
• Vectors can still be assigned components in some orthonormal basis, just as with
arrows, but these may be complex.
• The inner product of any two vectors is given in terms of these components by
Eq. (1.2.5). This product obeys all the axioms.
1.3.1. Expansion of Vectors in an Orthonormal Basis
Suppose we wish to expand a vector $|V\rangle$ in an orthonormal basis. To find the
components that go into the expansion we proceed as follows. We take the dot
product of both sides of the assumed expansion with $|j\rangle$ (or $\langle j|$ if you are a purist):
$$ |V\rangle = \sum_i v_i |i\rangle \tag{1.3.2} $$
$$ \langle j|V\rangle = \sum_i v_i \langle j|i\rangle \tag{1.3.3} $$
$$ = v_j \tag{1.3.4} $$
i.e., to find the jth component of a vector we take the dot product with the jth unit
vector, exactly as with arrows. Using this result we may write
$$ |V\rangle = \sum_i |i\rangle\langle i|V\rangle \tag{1.3.5} $$
Let us make sure the basis vectors look as they should. If we set $|V\rangle = |j\rangle$ in Eq.
(1.3.5), we find the correct answer: the ith component of the jth basis vector is $\delta_{ij}$.
Thus for example the column representing basis vector number 4 will have a 1 in
the 4th row and zero everywhere else. The abstract relation
$$ |V\rangle = \sum_i v_i |i\rangle \tag{1.3.6} $$
becomes in this basis
$$ \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = v_1 \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + v_2 \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} + \cdots + v_n \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \tag{1.3.7} $$
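The recipe $v_j = \langle j|V\rangle$ works for any orthonormal basis, not only the one in which the columns are written. Here is a small sketch, assuming a rotated orthonormal basis in the plane chosen purely for illustration (exact fractions are used so that the checks are free of rounding error):

```python
from fractions import Fraction as F


def inner(v, w):
    """<V|W> = sum_i conj(v_i) * w_i."""
    return sum(vi.conjugate() * wi for vi, wi in zip(v, w))


# An orthonormal basis of the plane (illustrative choice).
basis = [[F(3, 5), F(4, 5)], [F(-4, 5), F(3, 5)]]
V = [F(2), F(1)]

# Components in the new basis: v_j = <j|V>.
components = [inner(e, V) for e in basis]

# Rebuild |V> = sum_j |j><j|V> and compare with the original.
rebuilt = [sum(c * e[k] for c, e in zip(components, basis)) for k in range(len(V))]
print(components, rebuilt == V)
```

The components differ from the original pair (2, 1), yet the sum over the basis returns the same vector, as Eq. (1.3.5) requires.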
1.3.2. Adjoint Operation
We have seen that we may pass from the column representing a ket to the
row representing the corresponding bra by the adjoint operation, i.e., transpose
conjugation. Let us now ask: if $\langle V|$ is the bra corresponding to the ket $|V\rangle$, what
bra corresponds to $a|V\rangle$ where $a$ is some scalar? By going to any basis it is readily
found that
$$ a|V\rangle \to \begin{bmatrix} av_1 \\ av_2 \\ \vdots \\ av_n \end{bmatrix} \to [a^*v_1^*, a^*v_2^*, \ldots, a^*v_n^*] \to \langle V|a^* \tag{1.3.8} $$
It is customary to write $a|V\rangle$ as $|aV\rangle$ and the corresponding bra as $\langle aV|$. What
we have found is that
$$ \langle aV| = \langle V|a^* \tag{1.3.9} $$
Since the relation between bras and kets is linear we can say that if we have an
equation among kets such as
$$ a|V\rangle = b|W\rangle + c|Z\rangle + \cdots \tag{1.3.10} $$
this implies another one among the corresponding bras:
$$ \langle V|a^* = \langle W|b^* + \langle Z|c^* + \cdots \tag{1.3.11} $$
The two equations above are said to be adjoints of each other. Just as any equation
involving complex numbers implies another obtained by taking the complex conjugates
of both sides, an equation between (bras) kets implies another one between
(kets) bras. If you think in a basis, you will see that this follows simply from the
fact that if two columns are equal, so are their transpose conjugates.
Here is the rule for taking the adjoint:
To take the adjoint of a linear equation relating kets (bras), replace every ket
(bra) by its bra (ket) and complex conjugate all coefficients.
We can extend this rule as follows. Suppose we have an expansion for a vector:
$$ |V\rangle = \sum_{i=1}^{n} v_i |i\rangle \tag{1.3.12} $$
in terms of basis vectors. The adjoint is
$$ \langle V| = \sum_{i=1}^{n} \langle i| v_i^* $$
Recalling that $v_i = \langle i|V\rangle$ and $v_i^* = \langle V|i\rangle$, it follows that the adjoint of
$$ |V\rangle = \sum_{i=1}^{n} |i\rangle\langle i|V\rangle \tag{1.3.13} $$
is
$$ \langle V| = \sum_{i=1}^{n} \langle V|i\rangle\langle i| \tag{1.3.14} $$
from which comes the rule:
To take the adjoint of an equation involving bras and kets and coefficients,
reverse the order of all factors, exchanging bras and kets and complex conjugating
all coefficients.
Gram—Schmidt Theorem
Let us now take up the Gram—Schmidt procedure for converting a linearly
independent basis into an orthonormal one. The basic idea can be seen by a simple
example. Imagine the twodimensional space of arrows in a plane. Let us take two
nonparallel vectors, which qualify as a basis. To get an orthonormal basis out of
these, we do the following:
• Rescale the first by its own length, so it becomes a unit vector. This will be the
first basis vector.
• Subtract from the second vector its projection along the first, leaving behind only
the part perpendicular to the first. (Such a part will remain since by assumption
the vectors are nonparallel.)
• Rescale the left over piece by its own length. We now have the second basis vector:
it is orthogonal to the first and of unit length.
This simple example tells the whole story behind this procedure, which will now
be discussed in general terms in the Dirac notation.
Let $|I\rangle, |II\rangle, \ldots$ be a linearly independent basis. The first vector of the
orthonormal basis will be
$$ |1\rangle = \frac{|I\rangle}{|I|} \quad \text{where } |I| = \sqrt{\langle I|I\rangle} $$
Clearly
$$ \langle 1|1\rangle = \frac{\langle I|I\rangle}{|I|^2} = 1 $$
As for the second vector in the basis, consider
$$ |2'\rangle = |II\rangle - |1\rangle\langle 1|II\rangle $$
which is $|II\rangle$ minus the part pointing along the first unit vector. (Think of the arrow
example as you read on.) Not surprisingly it is orthogonal to the latter:
$$ \langle 1|2'\rangle = \langle 1|II\rangle - \langle 1|1\rangle\langle 1|II\rangle = 0 $$
We now divide $|2'\rangle$ by its norm to get $|2\rangle$, which will be orthogonal to the first and
normalized to unity. Finally, consider
$$ |3'\rangle = |III\rangle - |1\rangle\langle 1|III\rangle - |2\rangle\langle 2|III\rangle $$
which is orthogonal to both $|1\rangle$ and $|2\rangle$. Dividing by its norm we get $|3\rangle$, the third
member of the orthonormal basis. There is nothing new with the generation of the
rest of the basis.
Where did we use the linear independence of the original basis? What if we had
started with a linearly dependent basis? Then at some point a vector like $|2'\rangle$ or $|3'\rangle$
would have vanished, putting a stop to the whole procedure. On the other hand,
linear independence will assure us that such a thing will never happen since it amounts
to having a nontrivial linear combination of linearly independent vectors that adds
up to the null vector. (Go back to the equations for $|2'\rangle$ or $|3'\rangle$ and satisfy yourself
that these are linear combinations of the old basis vectors.)
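The procedure just described can be written out in a few lines. This is a sketch under stated assumptions: vectors are lists of components in some orthonormal basis, the names `inner` and `gram_schmidt` are illustrative, and the tolerance used to detect a vanishing vector is an arbitrary choice.

```python
import math


def inner(v, w):
    """<V|W> = sum_i conj(v_i) * w_i."""
    return sum(vi.conjugate() * wi for vi, wi in zip(v, w))


def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize a linearly independent list of vectors, in order."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = inner(e, v)                      # <e|v>, the part of v along e
            w = [wk - c * ek for wk, ek in zip(w, e)]
        norm = math.sqrt(inner(w, w).real)
        if norm < tol:                           # the primed vector vanished
            raise ValueError("input vectors are linearly dependent")
        basis.append([wk / norm for wk in w])
    return basis


ortho = gram_schmidt([[3, 0, 0], [0, 1, 2], [0, 2, 5]])
for e in ortho:
    print(e)
```

If the input set is linearly dependent, some primed vector has (numerically) zero norm and the function raises an error, which is the code counterpart of the procedure coming to a stop.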
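Exercise 1.3.1. Form an orthonormal basis in two dimensions starting with $\vec{A} = 3\vec{i} + 4\vec{j}$ and
$\vec{B} = 2\vec{i} - 6\vec{j}$. Can you generate another orthonormal basis starting with these two vectors? If
so, produce another.
Exercise 1.3.2. Show how to go from the basis
$$ |I\rangle = \begin{bmatrix} 3 \\ 0 \\ 0 \end{bmatrix} \qquad |II\rangle = \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} \qquad |III\rangle = \begin{bmatrix} 0 \\ 2 \\ 5 \end{bmatrix} $$
to the orthonormal basis
$$ |1\rangle = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \qquad |2\rangle = \begin{bmatrix} 0 \\ 1/\sqrt{5} \\ 2/\sqrt{5} \end{bmatrix} \qquad |3\rangle = \begin{bmatrix} 0 \\ -2/\sqrt{5} \\ 1/\sqrt{5} \end{bmatrix} $$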
When we first learn about dimensionality, we associate it with the number of
perpendicular directions. In this chapter we defined it in terms of the maximum number
of linearly independent vectors. The following theorem connects the two definitions.
Theorem 4. The dimensionality of a space equals $n_\perp$, the maximum number of
mutually orthogonal vectors in it.
To show this, first note that any mutually orthogonal set is also linearly independent.
Suppose we had a linear combination of orthogonal vectors adding up to
zero. By taking the dot product of both sides with any one member and using the
orthogonality we can show that the coefficient multiplying that vector had to vanish.
This can clearly be done for all the coefficients, showing the linear combination is
trivial.
Now $n_\perp$ can only be equal to, greater than, or less than $n$, the dimensionality
of the space. The Gram-Schmidt procedure eliminates the last case by explicit construction,
while the linear independence of the perpendicular vectors rules out the
penultimate option.
Schwarz and Triangle Inequalities
Two powerful theorems apply to any inner product space obeying our axioms:
Theorem 5. The Schwarz Inequality
$$ |\langle V|W\rangle| \le |V||W| \tag{1.3.15} $$
Theorem 6. The Triangle Inequality
$$ |V + W| \le |V| + |W| \tag{1.3.16} $$
The proof of the first will be provided so you can get used to working with bras
and kets. The second will be left as an exercise.
Before proving anything, note that the results are obviously true for arrows:
the Schwarz inequality says that the dot product of two vectors cannot exceed the
product of their lengths and the triangle inequality says that the length of a sum
cannot exceed the sum of the lengths. This is an example which illustrates the merits
of thinking of abstract vectors as arrows and guessing what properties they might
share with arrows. The proof will of course have to rely on just the axioms.
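To prove the Schwarz inequality, consider the positive semidefiniteness axiom, $\langle Z|Z\rangle \ge 0$, applied to
$$ |Z\rangle = |V\rangle - \frac{\langle W|V\rangle}{|W|^2} |W\rangle \tag{1.3.17} $$
We get
$$ \begin{aligned} \langle Z|Z\rangle &= \left\langle V - \frac{\langle W|V\rangle}{|W|^2} W \,\middle|\, V - \frac{\langle W|V\rangle}{|W|^2} W \right\rangle \\ &= \langle V|V\rangle - \frac{\langle W|V\rangle\langle V|W\rangle}{|W|^2} - \frac{\langle W|V\rangle^*\langle W|V\rangle}{|W|^2} + \frac{\langle W|V\rangle^*\langle W|V\rangle\langle W|W\rangle}{|W|^4} \\ &\ge 0 \end{aligned} \tag{1.3.18} $$
where we have used the antilinearity of the inner product with respect to the bra.
Using
$$ \langle W|V\rangle^* = \langle V|W\rangle $$
we find
$$ \langle V|V\rangle \ge \frac{\langle W|V\rangle\langle V|W\rangle}{|W|^2} \tag{1.3.19} $$
Cross-multiplying by $|W|^2$ and taking square roots, the result follows.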
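A numerical spot check is no substitute for the proof, but it makes the roles of the pieces concrete. The sketch below assumes two arbitrarily chosen complex vectors (illustrative values only) and builds the vector $|Z\rangle$ of Eq. (1.3.17), which turns out to be the part of $|V\rangle$ orthogonal to $|W\rangle$:

```python
import math


def inner(v, w):
    """<V|W> = sum_i conj(v_i) * w_i in an orthonormal basis."""
    return sum(vi.conjugate() * wi for vi, wi in zip(v, w))


def norm(v):
    return math.sqrt(inner(v, v).real)


V = [1 + 2j, 3j]
W = [2 - 1j, 1 + 1j]

# |Z> = |V> - (<W|V> / |W|^2) |W>, the vector used in the proof.
c = inner(W, V) / norm(W) ** 2
Z = [vi - c * wi for vi, wi in zip(V, W)]

schwarz_ok = abs(inner(V, W)) <= norm(V) * norm(W)
triangle_ok = norm([vi + wi for vi, wi in zip(V, W)]) <= norm(V) + norm(W)
z_orthogonal_to_w = abs(inner(W, Z)) < 1e-12
print(schwarz_ok, triangle_ok, z_orthogonal_to_w)
```

Because $|Z\rangle$ is what remains of $|V\rangle$ after its projection along $|W\rangle$ is removed, demanding $\langle Z|Z\rangle \ge 0$ says that a projection cannot be longer than the vector itself.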
Exercise 1.3.3. When will this inequality be satisfied? Does this agree with your experience
with arrows?
Exercise 1.3.4. Prove the triangle inequality starting with $|V + W|^2$. You must use
$\mathrm{Re}\,\langle V|W\rangle \le |\langle V|W\rangle|$ and the Schwarz inequality. Show that the final inequality becomes an
equality only if $|V\rangle = a|W\rangle$ where $a$ is a real positive scalar.
1.4. Subspaces
Definition 11. Given a vector space $\mathbb{V}$, a subset of its elements that form a
vector space among themselves† is called a subspace. We will denote a particular
subspace $i$ of dimensionality $n_i$ by $\mathbb{V}_i^{n_i}$.
† Vector addition and scalar multiplication are defined the same way in the subspace as in $\mathbb{V}$.
Example 1.4.1. In the space $\mathbb{V}^3(R)$, the following are some examples of subspaces:
(a) all vectors along the x axis, the space $\mathbb{V}_x^1$; (b) all vectors along the y
axis, the space $\mathbb{V}_y^1$; (c) all vectors in the x-y plane, the space $\mathbb{V}_{xy}^2$. Notice that all
subspaces contain the null vector and that each vector is accompanied by its inverse
to fulfill axioms for a vector space. Thus the set of all vectors along the positive x
axis alone does not form a vector space. □
Definition 12. Given two subspaces $\mathbb{V}_i^{n_i}$ and $\mathbb{V}_j^{m_j}$, we define their sum
$\mathbb{V}_i^{n_i} \oplus \mathbb{V}_j^{m_j} = \mathbb{V}_k^{m_k}$ as the set containing (1) all elements of $\mathbb{V}_i^{n_i}$, (2) all elements of
$\mathbb{V}_j^{m_j}$, (3) all possible linear combinations of the above. But for the elements (3),
closure would be lost.
Example 1.4.2. If, for example, $\mathbb{V}_x^1 \oplus \mathbb{V}_y^1$ contained only vectors along the x and
y axes, we could, by adding two elements, one from each direction, generate one
along neither. On the other hand, if we also included all linear combinations, we
would get the correct answer, $\mathbb{V}_x^1 \oplus \mathbb{V}_y^1 = \mathbb{V}_{xy}^2$. □
Exercise 1.4.1.* In a space $\mathbb{V}^n$, prove that the set of all vectors $\{|V_\perp^1\rangle, |V_\perp^2\rangle, \ldots\}$,
orthogonal to any $|V\rangle \neq |0\rangle$, form a subspace $\mathbb{V}^{n-1}$.
Exercise 1.4.2. Suppose $\mathbb{V}_1^{n_1}$ and $\mathbb{V}_2^{n_2}$ are two subspaces such that any element of $\mathbb{V}_1$ is
orthogonal to any element of $\mathbb{V}_2$. Show that the dimensionality of $\mathbb{V}_1 \oplus \mathbb{V}_2$ is $n_1 + n_2$. (Hint:
Theorem 6.)
1.5. Linear Operators
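An operator $\Omega$ is an instruction for transforming any given vector $|V\rangle$ into
another, $|V'\rangle$. The action of the operator is represented as follows:
$$ \Omega|V\rangle = |V'\rangle \tag{1.5.1} $$
One says that the operator $\Omega$ has transformed the ket $|V\rangle$ into the ket $|V'\rangle$. We
will restrict our attention throughout to operators $\Omega$ that do not take us out of the
vector space, i.e., if $|V\rangle$ is an element of a space $\mathbb{V}$, so is $|V'\rangle = \Omega|V\rangle$.
Operators can also act on bras:
$$ \langle V'|\Omega = \langle V''| \tag{1.5.2} $$
We will only be concerned with linear operators, i.e., ones that obey the following
rules:
$$ \Omega\alpha|V_i\rangle = \alpha\Omega|V_i\rangle \tag{1.5.3a} $$
$$ \Omega\{\alpha|V_i\rangle + \beta|V_j\rangle\} = \alpha\Omega|V_i\rangle + \beta\Omega|V_j\rangle \tag{1.5.3b} $$
$$ \langle V_i|\alpha\Omega = \langle V_i|\Omega\alpha \tag{1.5.4a} $$
$$ (\langle V_i|\alpha + \langle V_j|\beta)\Omega = \alpha\langle V_i|\Omega + \beta\langle V_j|\Omega \tag{1.5.4b} $$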
Figure 1.3. Action of the operator $R(\tfrac{1}{2}\pi\mathbf{i})$. Note that
$R[|2\rangle + |3\rangle] = R|2\rangle + R|3\rangle$ as expected of a linear operator. (We
will often refer to $R(\tfrac{1}{2}\pi\mathbf{i})$ as $R$ if no confusion is likely.)
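Example 1.5.1. The simplest operator is the identity operator, $I$, which carries
the instruction:
$I \to$ Leave the vector alone!
Thus,
$$ I|V\rangle = |V\rangle \text{ for all kets } |V\rangle \tag{1.5.5} $$
and
$$ \langle V|I = \langle V| \text{ for all bras } \langle V| \tag{1.5.6} $$
We next pass on to a more interesting operator on $\mathbb{V}^3(R)$:
$R(\tfrac{1}{2}\pi\mathbf{i}) \to$ Rotate vector by $\tfrac{1}{2}\pi$ about the unit vector $\mathbf{i}$
[More generally, $R(\boldsymbol{\theta})$ stands for a rotation by an angle $\theta = |\boldsymbol{\theta}|$ about the axis parallel
to the unit vector $\hat{\theta} = \boldsymbol{\theta}/\theta$.] Let us consider the action of this operator on the three
unit vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$, which in our notation will be denoted by $|1\rangle$, $|2\rangle$, and $|3\rangle$
(see Fig. 1.3). From the figure it is clear that
$$ R(\tfrac{1}{2}\pi\mathbf{i})|1\rangle = |1\rangle \tag{1.5.7a} $$
$$ R(\tfrac{1}{2}\pi\mathbf{i})|2\rangle = |3\rangle \tag{1.5.7b} $$
$$ R(\tfrac{1}{2}\pi\mathbf{i})|3\rangle = -|2\rangle \tag{1.5.7c} $$
Clearly $R(\tfrac{1}{2}\pi\mathbf{i})$ is linear. For instance, it is clear from the same figure that
$R[|2\rangle + |3\rangle] = R|2\rangle + R|3\rangle$. □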
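The nice feature of linear operators is that once their action on the basis vectors
is known, their action on any vector in the space is determined. If
$$ \Omega|i\rangle = |i'\rangle $$
for a basis $|1\rangle, |2\rangle, \ldots, |n\rangle$ in $\mathbb{V}^n$, then for any $|V\rangle = \sum_i v_i|i\rangle$
$$ \Omega|V\rangle = \sum_i \Omega v_i|i\rangle = \sum_i v_i\Omega|i\rangle = \sum_i v_i|i'\rangle \tag{1.5.8} $$
This is the case in the example $\Omega = R(\tfrac{1}{2}\pi\mathbf{i})$. If
$$ |V\rangle = v_1|1\rangle + v_2|2\rangle + v_3|3\rangle $$
is any vector, then
$$ R|V\rangle = v_1R|1\rangle + v_2R|2\rangle + v_3R|3\rangle = v_1|1\rangle + v_2|3\rangle - v_3|2\rangle $$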
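Equation (1.5.8) translates directly into code: store the images of the basis vectors and extend by linearity. A minimal sketch, assuming vectors are tuples of components in the basis $|1\rangle, |2\rangle, |3\rangle$; the helper name `make_operator` is hypothetical:

```python
def make_operator(images):
    """Build a linear operator from the images |i'> of the basis vectors.

    images[i] holds the components of |i'> = Omega|i> in the original basis.
    """
    def omega(v):
        n = len(images[0])
        # Omega|V> = sum_i v_i |i'>
        return tuple(sum(v[i] * images[i][k] for i in range(len(v)))
                     for k in range(n))
    return omega


# The rotation by pi/2 about i: |1> -> |1>, |2> -> |3>, |3> -> -|2>.
R = make_operator([(1, 0, 0), (0, 0, 1), (0, -1, 0)])
print(R((2, 3, 5)))  # components of v1|1> + v2|3> - v3|2>
```

Nothing beyond the three images was supplied, yet the operator acts on every vector in the space, which is the content of the statement above.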
The product of two operators stands for the instruction that the instructions
corresponding to the two operators be carried out in sequence
$$ \Lambda\Omega|V\rangle = \Lambda(\Omega|V\rangle) = \Lambda|\Omega V\rangle \tag{1.5.9} $$
where $|\Omega V\rangle$ is the ket obtained by the action of $\Omega$ on $|V\rangle$. The order of the operators
in a product is very important: in general,
$$ \Omega\Lambda - \Lambda\Omega \equiv [\Omega, \Lambda] $$
called the commutator of $\Omega$ and $\Lambda$, isn't zero. For example $R(\tfrac{1}{2}\pi\mathbf{i})$ and $R(\tfrac{1}{2}\pi\mathbf{j})$ do
not commute, i.e., their commutator is nonzero.
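Two useful identities involving commutators are
$$ [\Omega, \Lambda\theta] = \Lambda[\Omega, \theta] + [\Omega, \Lambda]\theta \tag{1.5.10} $$
$$ [\Lambda\Omega, \theta] = \Lambda[\Omega, \theta] + [\Lambda, \theta]\Omega \tag{1.5.11} $$
Notice that apart from the emphasis on ordering, these rules resemble the chain rule
in calculus for the derivative of a product.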
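Both identities can be checked on concrete operators. The sketch below assumes the operators are represented by small integer matrices (an illustrative choice; the identities hold for any linear operators) and uses hand-written helpers so that it needs no external library:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def comm(A, B):
    """[A, B] = AB - BA."""
    return sub(matmul(A, B), matmul(B, A))


Om = [[1, 2], [3, 4]]
La = [[0, 1], [1, 0]]
Th = [[2, 0], [1, 3]]

# [Om, La Th] = La [Om, Th] + [Om, La] Th
first = comm(Om, matmul(La, Th)) == add(matmul(La, comm(Om, Th)),
                                        matmul(comm(Om, La), Th))
# [La Om, Th] = La [Om, Th] + [La, Th] Om
second = comm(matmul(La, Om), Th) == add(matmul(La, comm(Om, Th)),
                                         matmul(comm(La, Th), Om))
print(first, second, comm(Om, La))
```

The printed commutator of the first two matrices is nonzero, a reminder that the order of factors on each side of the identities must be kept exactly as written.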
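The inverse of $\Omega$, denoted by $\Omega^{-1}$, satisfies†
$$ \Omega\Omega^{-1} = \Omega^{-1}\Omega = I \tag{1.5.12} $$
Not every operator has an inverse. The condition for the existence of the inverse is
given in Appendix A.1. The operator $R(\tfrac{1}{2}\pi\mathbf{i})$ has an inverse: it is $R(-\tfrac{1}{2}\pi\mathbf{i})$. The
inverse of a product of operators is the product of the inverses in reverse:
$$ (\Omega\Lambda)^{-1} = \Lambda^{-1}\Omega^{-1} \tag{1.5.13} $$
for only then do we have
$$ (\Omega\Lambda)(\Omega\Lambda)^{-1} = (\Omega\Lambda)(\Lambda^{-1}\Omega^{-1}) = \Omega\Lambda\Lambda^{-1}\Omega^{-1} = \Omega\Omega^{-1} = I $$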
1.6. Matrix Elements of Linear Operators
We are now accustomed to the idea of an abstract vector being represented in
a basis by an n-tuple of numbers, called its components, in terms of which all vector
operations can be carried out. We shall now see that in the same manner a linear
operator can be represented in a basis by a set of $n^2$ numbers, written as an $n \times n$
matrix, and called its matrix elements in that basis. Although the matrix elements,
just like the vector components, are basis dependent, they facilitate the computation
of all basis-independent quantities, by rendering the abstract operator more tangible.
† In $\mathbb{V}^n(C)$ with $n$ finite, $\Omega^{-1}\Omega = I \Leftrightarrow \Omega\Omega^{-1} = I$. Prove this using the ideas introduced toward the end of
Theorem A.1.1, Appendix A.1.
Our starting point is the observation made earlier, that the action of a linear
operator is fully specified by its action on the basis vectors. If the basis vectors suffer
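a change
$$\Omega|i\rangle = |i'\rangle$$
(where $|i'\rangle$ is known), then any vector in this space undergoes a change that is readily
calculable:
$$\Omega|V\rangle = \Omega\sum_i v_i|i\rangle = \sum_i v_i\Omega|i\rangle = \sum_i v_i|i'\rangle$$
When we say $|i'\rangle$ is known, we mean that its components in the original basis
$$\langle j|i'\rangle = \langle j|\Omega|i\rangle \equiv \Omega_{ji} \tag{1.6.1}$$
are known. The $n^2$ numbers, $\Omega_{ij}$, are the matrix elements of $\Omega$ in this basis. If
$$\Omega|V\rangle = |V'\rangle$$
then the components of the transformed ket $|V'\rangle$ are expressible in terms of the $\Omega_{ij}$
and the components of $|V\rangle$:
$$v_i' = \langle i|V'\rangle = \langle i|\Omega|V\rangle = \langle i|\Omega\Big(\sum_j v_j|j\rangle\Big) = \sum_j v_j\langle i|\Omega|j\rangle = \sum_j \Omega_{ij}v_j \tag{1.6.2}$$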
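Equation (1.6.2) can be cast in matrix form:
$$\begin{bmatrix} v_1' \\ v_2' \\ \vdots \\ v_n' \end{bmatrix} = \begin{bmatrix} \langle 1|\Omega|1\rangle & \langle 1|\Omega|2\rangle & \cdots & \langle 1|\Omega|n\rangle \\ \langle 2|\Omega|1\rangle & & & \vdots \\ \vdots & & & \\ \langle n|\Omega|1\rangle & \cdots & & \langle n|\Omega|n\rangle \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \tag{1.6.3}$$
A mnemonic: the elements of the first column are simply the components of the first
transformed basis vector $|1'\rangle = \Omega|1\rangle$ in the given basis. Likewise, the elements of the
jth column represent the image of the jth basis vector after $\Omega$ acts on it.
Convince yourself that the same matrix $\Omega_{ij}$, acting to the left on the row vector
corresponding to any $\langle V'|$, gives the row vector corresponding to $\langle V''| = \langle V'|\Omega$.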
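Example 1.6.1. Combining our mnemonic with the fact that the operator $R(\tfrac{1}{2}\pi\mathbf{i})$
has the following effect on the basis vectors:
$$R(\tfrac{1}{2}\pi\mathbf{i})|1\rangle = |1\rangle$$
$$R(\tfrac{1}{2}\pi\mathbf{i})|2\rangle = |3\rangle$$
$$R(\tfrac{1}{2}\pi\mathbf{i})|3\rangle = -|2\rangle$$
we can write down the matrix that represents it in the $|1\rangle, |2\rangle, |3\rangle$ basis:
$$R(\tfrac{1}{2}\pi\mathbf{i}) \leftrightarrow \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix} \tag{1.6.4}$$
For instance, the $-1$ in the third column tells us that $R$ rotates $|3\rangle$ into $-|2\rangle$. One
may also ignore the mnemonic altogether and simply use the definition $R_{ij} = \langle i|R|j\rangle$
to compute the matrix. □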
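The mnemonic translates directly into a few lines of code. The sketch below is illustrative only: the helper names `matrix_from_images` and `apply` and the sample vector are assumptions, not part of the text. It stores the image of each basis vector as a column and then checks the result against the action of the rotation on a general vector.

```python
def matrix_from_images(images):
    """Build the matrix whose jth column holds the components of the
    image of the jth basis vector (the mnemonic of this section)."""
    n = len(images)
    return [[images[j][i] for j in range(n)] for i in range(n)]


def apply(matrix, vector):
    """Components of the transformed ket: v'_i = sum_j M_ij v_j."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


# Images of |1>, |2>, |3> under the rotation by pi/2 about the x axis,
# written as component lists in the same basis.
images = [
    [1, 0, 0],   # R|1> =  |1>
    [0, 0, 1],   # R|2> =  |3>
    [0, -1, 0],  # R|3> = -|2>
]

R = matrix_from_images(images)

# A sample vector (hypothetical values) with components v1, v2, v3.
v = [2, 3, 5]
Rv = apply(R, v)  # expected components: (v1, -v3, v2)
print(R)
print(Rv)
```

The components of `Rv` agree with $v_1|1\rangle + v_2|3\rangle - v_3|2\rangle$: the coefficient of $|2\rangle$ is $-v_3$ and that of $|3\rangle$ is $v_2$.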
Exercise 1.6.1. An operator $\Omega$ is given by the matrix
$$\begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$
What is its action?
Let us now consider certain specific operators and see how they appear in matrix
form.
(1) The Identity Operator I.
$$I_{ij} = \langle i|I|j\rangle = \langle i|j\rangle = \delta_{ij} \tag{1.6.5}$$
Thus $I$ is represented by a diagonal matrix with 1's along the diagonal. You should
verify that our mnemonic gives the same result.
(2) The Projection Operators. Let us first get acquainted with projection operators.
Consider the expansion of an arbitrary ket $|V\rangle$ in a basis:
$$|V\rangle = \sum_{i=1}^{n} |i\rangle\langle i|V\rangle$$
In terms of the objects $|i\rangle\langle i|$, which are linear operators, and which, by definition,
act on $|V\rangle$ to give $|i\rangle\langle i|V\rangle$, we may write the above as
$$|V\rangle = \Big(\sum_{i=1}^{n} |i\rangle\langle i|\Big)|V\rangle \tag{1.6.6}$$
Since Eq. (1.6.6) is true for all $|V\rangle$, the object in the brackets must be identified
with the identity (operator)
$$I = \sum_{i=1}^{n} |i\rangle\langle i| = \sum_{i=1}^{n} P_i \tag{1.6.7}$$
The object $P_i = |i\rangle\langle i|$ is called the projection operator for the ket $|i\rangle$. Equation (1.6.7),
which is called the completeness relation, expresses the identity as a sum over projection
operators and will be invaluable to us. (If you think that any time spent on the
identity, which seems to do nothing, is a waste of time, just wait and see.)
Consider
$$P_i|V\rangle = |i\rangle\langle i|V\rangle = |i\rangle v_i \tag{1.6.8}$$
Clearly $P_i$ is linear. Notice that whatever $|V\rangle$ is, $P_i|V\rangle$ is a multiple of $|i\rangle$ with
a coefficient ($v_i$) which is the component of $|V\rangle$ along $|i\rangle$. Since $P_i$ projects out the
component of any ket $|V\rangle$ along the direction $|i\rangle$, it is called a projection operator.
The completeness relation, Eq. (1.6.7), says that the sum of the projections of a
vector along all the $n$ directions equals the vector itself. Projection operators can
also act on bras in the same way:
$$\langle V|P_i = \langle V|i\rangle\langle i| = v_i^*\langle i| \tag{1.6.9}$$
Projection operators corresponding to the basis vectors obey
$$P_iP_j = |i\rangle\langle i|j\rangle\langle j| = \delta_{ij}P_j \tag{1.6.10}$$
This equation tells us that (1) once $P_i$ projects out the part of $|V\rangle$ along $|i\rangle$, further
applications of $P_i$ make no difference; and (2) the subsequent application of $P_j$ ($j \neq i$)
will result in zero, since a vector entirely along $|i\rangle$ cannot have a projection along a
perpendicular direction $|j\rangle$.
Figure 1.4. $P_x$ and $P_y$ are polarizers placed in the way of a beam traveling along the z axis. The action
of the polarizers on the electric field $\mathbf{E}$ obeys the law of combination of projection operators:
$P_iP_j = \delta_{ij}P_j$.
The following example from optics may throw some light on the discussion.
Consider a beam of light traveling along the z axis and polarized in the x-y plane
at an angle $\theta$ with respect to the y axis (see Fig. 1.4). If a polarizer $P_y$, that only
admits light polarized along the y axis, is placed in the way, the projection $E\cos\theta$
along the y axis is transmitted. An additional polarizer $P_y$ placed in the way has no
further effect on the beam. We may equate the action of the polarizer to that of a
projection operator $P_y$ that acts on the electric field vector $\mathbf{E}$. If $P_y$ is followed by a
polarizer $P_x$ the beam is completely blocked. Thus the polarizers obey the equation
$P_iP_j = \delta_{ij}P_j$ expected of projection operators.
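Let us next turn to the matrix elements of $P_i$. There are two approaches. The
first one, somewhat indirect, gives us a feeling for what kind of an object $|i\rangle\langle i|$ is.
We know
$$|i\rangle \leftrightarrow \begin{bmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{bmatrix}$$
and
$$\langle i| \leftrightarrow (0, 0, \ldots, 1, 0, 0, \ldots, 0)$$
so that
$$|i\rangle\langle i| \leftrightarrow \begin{bmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{bmatrix}(0, 0, \ldots, 1, 0, \ldots, 0) = \begin{bmatrix} 0 & & & & \\ & \ddots & & & \\ & & 1 & & \\ & & & \ddots & \\ & & & & 0 \end{bmatrix} \tag{1.6.11}$$
by the rules of matrix multiplication. Whereas $\langle V|V'\rangle$ = ($1 \times n$ matrix) $\times$
($n \times 1$ matrix) = ($1 \times 1$ matrix) is a scalar, $|V\rangle\langle V'|$ = ($n \times 1$ matrix) $\times$ ($1 \times n$ matrix) =
($n \times n$ matrix) is an operator. The inner product $\langle V|V'\rangle$ represents a bra and ket
which have found each other, while $|V\rangle\langle V'|$, sometimes called the outer product,
has the two factors looking the other way for a bra or a ket to dot with.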
The more direct approach to the matrix elements gives
$$(P_i)_{kl} = \langle k|i\rangle\langle i|l\rangle = \delta_{ki}\delta_{il} = \delta_{kl}\delta_{li} \tag{1.6.12}$$
which is of course identical to Eq. (1.6.11). The same result also follows from the
mnemonic. Each projection operator has only one nonvanishing matrix element, a 1 at
the ith element on the diagonal. The completeness relation, Eq. (1.6.7), says that
when all the $P_i$ are added, the diagonal fills out to give the identity. If we form the
sum over just some of the projection operators, we get the operator which projects
a given vector into the subspace spanned by just the corresponding basis vectors.
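The properties of the projection operators collected above can be confirmed numerically for a small case. The following is a minimal sketch for $n = 3$ in the standard basis; the helper names `outer` and `matmul` and the sample vector are illustrative assumptions.

```python
def outer(ket, other):
    """Matrix of |ket><other|: entry (r, c) is ket_r times the conjugate of other_c."""
    return [[a * b.conjugate() for b in other] for a in ket]


def matmul(a, b):
    """Matrix product (ab)_ij = sum_k a_ik b_kj."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


n = 3
basis = [[1 if k == i else 0 for k in range(n)] for i in range(n)]
P = [outer(e, e) for e in basis]  # P_i = |i><i|

identity = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
zero = [[0] * n for _ in range(n)]

# Completeness: the sum of all P_i fills out the diagonal.
total = [[sum(P[i][r][c] for i in range(n)) for c in range(n)] for r in range(n)]

# P_i P_j = delta_ij P_j for every pair (i, j).
products_ok = all(
    matmul(P[i], P[j]) == (P[j] if i == j else zero)
    for i in range(n) for j in range(n)
)

# Summing only P_1 and P_2 projects onto the subspace spanned by |1> and |2>.
P12 = [[P[0][r][c] + P[1][r][c] for c in range(n)] for r in range(n)]
V = [4, 5, 6]  # hypothetical components
projected = [sum(P12[r][c] * V[c] for c in range(n)) for r in range(n)]

print(total == identity, products_ok, projected)
```

The partial sum keeps the first two components of the vector and discards the third, which is what projection into a two-dimensional subspace means in components.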
Matrices Corresponding to Products of Operators
Consider next the matrices representing a product of operators. These are related
to the matrices representing the individual operators by the application of Eq. (1.6.7):
$$(\Omega\Lambda)_{ij} = \langle i|\Omega\Lambda|j\rangle = \langle i|\Omega I\Lambda|j\rangle = \sum_k \langle i|\Omega|k\rangle\langle k|\Lambda|j\rangle = \sum_k \Omega_{ik}\Lambda_{kj} \tag{1.6.13}$$
Thus the matrix representing the product of operators is the product of the matrices
representing the factors.
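As a small check of this statement, the sketch below compares applying two rotations in sequence with applying the product matrix once, and then forms the commutator of the two rotations. The matrix `R_x` is the one for the rotation by $\pi/2$ about the x axis given in Example 1.6.1; the matrix `R_y` for the rotation by $\pi/2$ about the y axis is an assumption (the standard right-handed form), since the text does not write it out.

```python
def matmul(a, b):
    """Matrix product (ab)_ij = sum_k a_ik b_kj."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def apply(m, v):
    """Components of the transformed vector: v'_i = sum_j m_ij v_j."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


# Rotation by pi/2 about x: |1> -> |1>, |2> -> |3>, |3> -> -|2>.
R_x = [[1, 0, 0],
       [0, 0, -1],
       [0, 1, 0]]

# Assumed rotation by pi/2 about y: |1> -> -|3>, |2> -> |2>, |3> -> |1>.
R_y = [[0, 0, 1],
       [0, 1, 0],
       [-1, 0, 0]]

v = [1, 2, 3]  # hypothetical components

# Acting with R_y first and then R_x equals acting once with the product matrix.
sequential = apply(R_x, apply(R_y, v))
product = apply(matmul(R_x, R_y), v)

# The commutator R_x R_y - R_y R_x is not the zero matrix.
xy = matmul(R_x, R_y)
yx = matmul(R_y, R_x)
commutator = [[xy[i][j] - yx[i][j] for j in range(3)] for i in range(3)]

print(sequential, product)
print(commutator)
```

The nonzero commutator is the matrix version of the earlier remark that rotations about different axes do not commute.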
The Adjoint of an Operator
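Recall that given a ket $\alpha|V\rangle \equiv |\alpha V\rangle$ the corresponding bra is
$$\langle \alpha V| = \langle V|\alpha^* \quad (\text{and not } \langle V|\alpha)$$
In the same way, given a ket
$$\Omega|V\rangle = |\Omega V\rangle$$
the corresponding bra is
$$\langle \Omega V| = \langle V|\Omega^\dagger \tag{1.6.14}$$
which defines the operator $\Omega^\dagger$. One may state this equation in words: if $\Omega$ turns a
ket $|V\rangle$ to $|V'\rangle$, then $\Omega^\dagger$ turns the bra $\langle V|$ into $\langle V'|$. Just as $\alpha$ and $\alpha^*$, $|V\rangle$ and
$\langle V|$ are related but distinct objects, so are $\Omega$ and $\Omega^\dagger$. The relation between $\Omega$ and
$\Omega^\dagger$, called the adjoint of $\Omega$ or "omega dagger," is best seen in a basis:
$$(\Omega^\dagger)_{ij} = \langle i|\Omega^\dagger|j\rangle = \langle \Omega i|j\rangle = \langle j|\Omega i\rangle^* = \langle j|\Omega|i\rangle^*$$
so
$$\Omega^\dagger_{ij} = \Omega^*_{ji} \tag{1.6.15}$$
In other words, the matrix representing $\Omega^\dagger$ is the transpose conjugate of the matrix
representing $\Omega$. (Recall that the row vector representing $\langle V|$ is the transpose conjugate
of the column vector representing $|V\rangle$. In a given basis, the adjoint operation is
the same as taking the transpose conjugate.)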
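The adjoint of a product is the product of the adjoints in reverse:
$$(\Omega\Lambda)^\dagger = \Lambda^\dagger\Omega^\dagger \tag{1.6.16}$$
To prove this we consider $\langle \Omega\Lambda V|$. First we treat $\Omega\Lambda$ as one operator and get
$$\langle \Omega\Lambda V| = \langle (\Omega\Lambda)V| = \langle V|(\Omega\Lambda)^\dagger$$
Next we treat $(\Lambda V)$ as just another vector, and write
$$\langle \Omega\Lambda V| = \langle \Omega(\Lambda V)| = \langle \Lambda V|\Omega^\dagger$$
We next pull out $\Lambda$, pushing $\Omega^\dagger$ further out:
$$\langle \Lambda V|\Omega^\dagger = \langle V|\Lambda^\dagger\Omega^\dagger$$
Comparing this result with the one obtained a few lines above, we get the desired
result.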
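In a basis, the rule for the adjoint of a product can be verified directly with transpose-conjugate matrices. The sketch below uses two arbitrary $2 \times 2$ complex matrices (hypothetical values chosen only for illustration) and also shows that keeping the original order does not work in general.

```python
def dagger(m):
    """Transpose conjugate: (m^dagger)_ij = conjugate of m_ji."""
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]


def matmul(a, b):
    """Matrix product (ab)_ij = sum_k a_ik b_kj."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Two arbitrary operators in some orthonormal basis (hypothetical entries).
Omega = [[1 + 2j, 3], [1j, 4 - 1j]]
Lam = [[2, 1 - 1j], [5j, 1]]

lhs = dagger(matmul(Omega, Lam))                # adjoint of the product
reversed_order = matmul(dagger(Lam), dagger(Omega))
same_order = matmul(dagger(Omega), dagger(Lam))

print(lhs == reversed_order)  # adjoints multiplied in reverse order
print(lhs == same_order)      # original order fails for these matrices
```

The entries are small integers, so the complex arithmetic is exact and the matrices can be compared with `==`.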
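Consider now an equation consisting of kets, scalars, and operators, such as
$$\alpha_1|V_1\rangle = \alpha_2|V_2\rangle + \alpha_3|V_3\rangle\langle V_4|V_5\rangle + \alpha_4\Omega\Lambda|V_6\rangle \tag{1.6.17a}$$
What is its adjoint? Our old rule tells us that it is
$$\langle V_1|\alpha_1^* = \langle V_2|\alpha_2^* + \langle V_5|V_4\rangle\langle V_3|\alpha_3^* + \langle \Omega\Lambda V_6|\alpha_4^*$$
In the last term we can replace $\langle \Omega\Lambda V_6|$ by
$$\langle V_6|(\Omega\Lambda)^\dagger = \langle V_6|\Lambda^\dagger\Omega^\dagger$$
so that finally we have the adjoint of Eq. (1.6.17a):
$$\langle V_1|\alpha_1^* = \langle V_2|\alpha_2^* + \langle V_5|V_4\rangle\langle V_3|\alpha_3^* + \langle V_6|\Lambda^\dagger\Omega^\dagger\alpha_4^* \tag{1.6.17b}$$
The final rule for taking the adjoint of the most general equation we will ever
encounter is this:
When a product of operators, bras, kets, and explicit numerical coefficients is
encountered, reverse the order of all factors and make the substitutions $\Omega \leftrightarrow \Omega^\dagger$,
$|\ \rangle \leftrightarrow \langle\ |$, $a \leftrightarrow a^*$.
(Of course, there is no real need to reverse the location of the scalars $a$ except in
the interest of uniformity.)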
Hermitian, Anti-Hermitian, and Unitary Operators

We now turn our attention to certain special classes of operators that will play
a major role in quantum mechanics.
Definition 13. An operator $\Omega$ is Hermitian if $\Omega^\dagger = \Omega$.
Definition 14. An operator $\Omega$ is anti-Hermitian if $\Omega^\dagger = -\Omega$.
The adjoint is to an operator what the complex conjugate is to numbers. Hermitian
and anti-Hermitian operators are like pure real and pure imaginary numbers. Just
as every number may be decomposed into a sum of pure real and pure imaginary
parts,
$$\alpha = \frac{\alpha + \alpha^*}{2} + \frac{\alpha - \alpha^*}{2}$$
we can decompose every operator into its Hermitian and anti-Hermitian parts:
$$\Omega = \frac{\Omega + \Omega^\dagger}{2} + \frac{\Omega - \Omega^\dagger}{2} \tag{1.6.18}$$
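The decomposition of an operator into Hermitian and anti-Hermitian parts is easy to try on a concrete matrix. The sketch below uses an arbitrary $2 \times 2$ complex matrix (hypothetical entries) and checks that the two pieces have the stated properties and add back up to the original.

```python
def dagger(m):
    """Transpose conjugate: (m^dagger)_ij = conjugate of m_ji."""
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]


def same(a, b, tol=1e-12):
    """Entry-by-entry comparison of two matrices within a small tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))


# An arbitrary operator that is neither Hermitian nor anti-Hermitian.
Omega = [[1 + 2j, 3 - 1j], [5j, 4]]
Od = dagger(Omega)
n = len(Omega)

# Hermitian part (Omega + Omega^dagger)/2 and anti-Hermitian part (Omega - Omega^dagger)/2.
H = [[(Omega[i][j] + Od[i][j]) / 2 for j in range(n)] for i in range(n)]
A = [[(Omega[i][j] - Od[i][j]) / 2 for j in range(n)] for i in range(n)]

minus_A = [[-x for x in row] for row in A]
recombined = [[H[i][j] + A[i][j] for j in range(n)] for i in range(n)]

print(same(dagger(H), H))        # H is Hermitian
print(same(dagger(A), minus_A))  # A is anti-Hermitian
print(same(recombined, Omega))   # the two parts add up to Omega
```

The diagonal of `H` comes out real and the diagonal of `A` pure imaginary, in line with the analogy to real and imaginary parts of a number.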
Exercise 1.6.2.* Given $\Omega$ and $\Lambda$ are Hermitian what can you say about (1) $\Omega\Lambda$; (2)
$\Omega\Lambda + \Lambda\Omega$; (3) $[\Omega, \Lambda]$; and (4) $i[\Omega, \Lambda]$?
Definition 15. An operator $U$ is unitary if
$$UU^\dagger = I \tag{1.6.19}$$
This equation tells us that $U$ and $U^\dagger$ are inverses of each other. Consequently,
from Eq. (1.5.12),
$$U^\dagger U = I \tag{1.6.20}$$
Following the analogy between operators and numbers, unitary operators are
like complex numbers of unit modulus, $u = e^{i\theta}$. Just as $u^*u = 1$, so is $U^\dagger U = I$.
Exercise 1.6.3.* Show that a product of unitary operators is unitary.
Theorem 7. Unitary operators preserve the inner product between the vectors
they act on.
Proof. Let
$$|V_1'\rangle = U|V_1\rangle$$
and
$$|V_2'\rangle = U|V_2\rangle$$
Then
$$\langle V_2'|V_1'\rangle = \langle UV_2|UV_1\rangle = \langle V_2|U^\dagger U|V_1\rangle = \langle V_2|V_1\rangle \tag{1.6.21}$$
(Q.E.D.)
Unitary operators are the generalizations of rotation operators from $\mathbb{V}^3(R)$ to
$\mathbb{V}^n(C)$, for just like rotation operators in three dimensions, they preserve the lengths
of vectors and their dot products. In fact, on a real vector space, the unitarity
condition becomes $U^{-1} = U^T$ ($T$ means transpose), which defines an orthogonal or
rotation matrix. [$R(\tfrac{1}{2}\pi\mathbf{i})$ is an example.]
Theorem 8. If one treats the columns of an n X n unitary matrix as components
of n vectors, these vectors are orthonormal. In the same way, the rows may be
interpreted as components of n orthonormal vectors.
Proof 1. According to our mnemonic, the jth column of the matrix representing
$U$ is the image of the jth basis vector after $U$ acts on it. Since $U$ preserves inner
products, the rotated set of vectors is also orthonormal. Consider next the rows. We
now use the fact that $U^\dagger$ is also a rotation. (How else can it neutralize $U$ to give
$U^\dagger U = I$?) Since the rows of $U$ are the columns of $U^\dagger$ (but for an overall complex
conjugation which does not affect the question of orthonormality), the result we
already have for the columns of a unitary matrix tells us the rows of $U$ are
orthonormal.
Proof 2. Since $U^\dagger U = I$,
$$\delta_{ij} = \langle i|I|j\rangle = \langle i|U^\dagger U|j\rangle = \sum_k \langle i|U^\dagger|k\rangle\langle k|U|j\rangle = \sum_k U^\dagger_{ik}U_{kj} = \sum_k U^*_{ki}U_{kj} \tag{1.6.22}$$
which proves the theorem for the columns. A similar result for the rows follows if
we start with the equation $UU^\dagger = I$. Q.E.D.
Note that $U^\dagger U = I$ and $UU^\dagger = I$ are not independent conditions.
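Theorems 7 and 8 can both be checked on a small unitary matrix. The sketch below uses a $2 \times 2$ matrix with entries $0.6$ and $0.8i$, chosen here only because $0.6^2 + 0.8^2 = 1$; the matrix, the test vectors, and the helper names are illustrative assumptions. Comparisons use a small tolerance because the entries are floating-point numbers.

```python
def inner(a, b):
    """Inner product <a|b> = sum_i conj(a_i) b_i."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def apply(m, v):
    """Components of the transformed vector: v'_i = sum_j m_ij v_j."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def close(x, y, tol=1e-12):
    return abs(x - y) < tol


# A sample unitary matrix (assumed for illustration).
U = [[0.6, 0.8j],
     [0.8j, 0.6]]
n = len(U)

# Theorem 7: inner products are unchanged when both vectors are acted on by U.
V1 = [1 + 2j, 3]
V2 = [2j, 1 - 1j]
before = inner(V2, V1)
after = inner(apply(U, V2), apply(U, V1))

# Theorem 8: columns and rows, read as vectors, are orthonormal.
columns = [[U[i][j] for i in range(n)] for j in range(n)]
rows = [list(r) for r in U]
columns_orthonormal = all(
    close(inner(columns[i], columns[j]), 1 if i == j else 0)
    for i in range(n) for j in range(n)
)
rows_orthonormal = all(
    close(inner(rows[i], rows[j]), 1 if i == j else 0)
    for i in range(n) for j in range(n)
)

print(before, after)
print(columns_orthonormal, rows_orthonormal)
```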
Exercise 1.6.4.* It is assumed that you know (1) what a determinant is, (2) that $\det \Omega^T = \det \Omega$
($T$ denotes transpose), (3) that the determinant of a product of matrices is the product
of the determinants. [If you do not, verify these properties for a two-dimensional case
$$\Omega = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix}$$
with $\det \Omega = (\alpha\delta - \beta\gamma)$.] Prove that the determinant of a unitary matrix is a complex number
of unit modulus.
Exercise 1.6.5.* Verify that $R(\tfrac{1}{2}\pi\mathbf{i})$ is unitary (orthogonal) by examining its matrix.
Exercise 1.6.6. Verify that the following matrices are unitary:
$$\frac{1}{2^{1/2}}\begin{bmatrix} 1 & i \\ i & 1 \end{bmatrix}, \qquad \frac{1}{2}\begin{bmatrix} 1+i & 1-i \\ 1-i & 1+i \end{bmatrix}$$
Verify that the determinant is of the form $e^{i\theta}$ in each case. Are any of the above matrices
Hermitian?
1.7. Active and Passive Transformations
Suppose we subject all the vectors $|V\rangle$ in a space to a unitary transformation
$$|V\rangle \rightarrow U|V\rangle \tag{1.7.1}$$
Under this transformation, the matrix elements of any operator $\Omega$ are modified as
follows:
$$\langle V'|\Omega|V\rangle \rightarrow \langle UV'|\Omega|UV\rangle = \langle V'|U^\dagger\Omega U|V\rangle \tag{1.7.2}$$
It is clear that the same change would be effected if we left the vectors alone and
subjected all operators to the change
$$\Omega \rightarrow U^\dagger\Omega U \tag{1.7.3}$$
The first case is called an active transformation and the second a passive transformation.
The present nomenclature is in reference to the vectors: they are affected in an
active transformation and left alone in the passive case. The situation is exactly the
opposite from the point of view of the operators.
Later we will see that the physics in quantum theory lies in the matrix elements
of operators, and that active and passive transformations provide us with two equivalent
ways of describing the same physical transformation.
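The equivalence of the two descriptions is a statement about numbers, so it can be tested directly. In the sketch below the unitary matrix, the operator, and the two vectors are all hypothetical values chosen for illustration; the point is only that transforming the vectors and transforming the operator give the same matrix element.

```python
def dagger(m):
    """Transpose conjugate: (m^dagger)_ij = conjugate of m_ji."""
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]


def matmul(a, b):
    """Matrix product (ab)_ij = sum_k a_ik b_kj."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def apply(m, v):
    """Components of the transformed vector: v'_i = sum_j m_ij v_j."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def inner(a, b):
    """Inner product <a|b> = sum_i conj(a_i) b_i."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


U = [[0, 1j], [1j, 0]]               # a simple unitary matrix
Omega = [[1, 2 - 1j], [2 + 1j, 3]]   # some operator
V = [1, 2j]
V_prime = [3 - 1j, 1]

# Active view: transform both vectors, leave the operator alone.
active = inner(apply(U, V_prime), apply(Omega, apply(U, V)))

# Passive view: leave the vectors alone, replace the operator by U^dagger Omega U.
Omega_passive = matmul(dagger(U), matmul(Omega, U))
passive = inner(V_prime, apply(Omega_passive, V))

print(active, passive)
```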
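Exercise 1.7.1.* The trace of a matrix is defined to be the sum of its diagonal matrix
elements
$$\text{Tr}\,\Omega = \sum_i \Omega_{ii}$$
Show that
(1) $\text{Tr}(\Omega\Lambda) = \text{Tr}(\Lambda\Omega)$
(2) $\text{Tr}(\Omega\Lambda\Theta) = \text{Tr}(\Lambda\Theta\Omega) = \text{Tr}(\Theta\Omega\Lambda)$ (The permutations are cyclic).
(3) The trace of an operator is unaffected by a unitary change of basis $|i\rangle \rightarrow U|i\rangle$. [Equivalently,
show $\text{Tr}\,\Omega = \text{Tr}(U^\dagger\Omega U)$.]
Exercise 1.7.2. Show that the determinant of a matrix is unaffected by a unitary change
of basis. [Equivalently show $\det \Omega = \det(U^\dagger\Omega U)$.]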
1.8. The Eigenvalue Problem
Consider some linear operator $\Omega$ acting on an arbitrary nonzero ket $|V\rangle$:
$$\Omega|V\rangle = |V'\rangle \tag{1.8.1}$$
Unless the operator happens to be a trivial one, such as the identity or its multiple,
the ket will suffer a nontrivial change, i.e., $|V'\rangle$ will not be simply related to $|V\rangle$.
So much for an arbitrary ket. Each operator, however, has certain kets of its own,
called its eigenkets, on which its action is simply that of rescaling:
$$\Omega|V\rangle = \omega|V\rangle \tag{1.8.2}$$
Equation (1.8.2) is an eigenvalue equation: $|V\rangle$ is an eigenket of $\Omega$ with eigenvalue
$\omega$. In this chapter we will see how, given an operator $\Omega$, one can systematically
determine all its eigenvalues and eigenvectors. How such an equation enters physics
will be illustrated by a few examples from mechanics at the end of this section, and
once we get to quantum mechanics proper, it will be eigen, eigen, eigen all the way.
Example 1.8.1. To illustrate how easy the eigenvalue problem really is, we will
begin with a case that will be completely solved: the case $\Omega = I$. Since
$$I|V\rangle = |V\rangle$$
for all $|V\rangle$, we conclude that
(1) the only eigenvalue of $I$ is 1;
(2) all vectors are its eigenvectors with this eigenvalue. □
Example 1.8.2. After this unqualified success, we are encouraged to take on a
slightly more difficult case: $\Omega = P_V$, the projection operator associated with a normalized
ket $|V\rangle$. Clearly
(1) any ket $\alpha|V\rangle$, parallel to $|V\rangle$, is an eigenket with eigenvalue 1:
$$P_V|\alpha V\rangle = |V\rangle\langle V|\alpha V\rangle = \alpha|V\rangle|V|^2 = 1\cdot|\alpha V\rangle$$
(2) any ket $|V_\perp\rangle$, perpendicular to $|V\rangle$, is an eigenket with eigenvalue 0:
$$P_V|V_\perp\rangle = |V\rangle\langle V|V_\perp\rangle = 0 = 0|V_\perp\rangle$$
(3) kets that are neither, i.e., kets of the form $\alpha|V\rangle + \beta|V_\perp\rangle$, are simply not
eigenkets:
$$P_V(\alpha|V\rangle + \beta|V_\perp\rangle) = |\alpha V\rangle \neq \gamma(\alpha|V\rangle + \beta|V_\perp\rangle)$$
Since every ket in the space falls into one of the above classes, we have found
all the eigenvalues and eigenvectors. □
Example 1.8.3. Consider now the operator $R(\tfrac{1}{2}\pi\mathbf{i})$. We already know that it
has one eigenket, the basis vector $|1\rangle$ along the x axis:
$$R(\tfrac{1}{2}\pi\mathbf{i})|1\rangle = |1\rangle$$
Are there others? Of course, any vector $\alpha|1\rangle$ along the x axis is also unaffected by
the x rotation. This is a general feature of the eigenvalue equation and reflects the
linearity of the operator:
if
$$\Omega|V\rangle = \omega|V\rangle$$
then
$$\Omega\alpha|V\rangle = \alpha\Omega|V\rangle = \alpha\omega|V\rangle = \omega\alpha|V\rangle$$
for any multiple $\alpha$. Since the eigenvalue equation fixes the eigenvector only up to
an overall scale factor, we will not treat the multiples of an eigenvector as distinct
eigenvectors. With this understanding in mind, let us ask if $R(\tfrac{1}{2}\pi\mathbf{i})$ has any eigenvectors
besides $|1\rangle$. Our intuition says no, for any vector not along the x axis necessarily
gets rotated by $R(\tfrac{1}{2}\pi\mathbf{i})$ and cannot possibly transform into a multiple of itself. Since
every vector is either parallel to $|1\rangle$ or isn't, we have fully solved the eigenvalue
problem.
The trouble with this conclusion is that it is wrong! $R(\tfrac{1}{2}\pi\mathbf{i})$ has two other
eigenvectors besides $|1\rangle$. But our intuition is not to be blamed, for these vectors are
in $\mathbb{V}^3(C)$ and not $\mathbb{V}^3(R)$. It is clear from this example that we need a reliable and
systematic method for solving the eigenvalue problem in $\mathbb{V}^n(C)$. We now turn our
attention to this very question. □
The Characteristic Equation and the Solution to the Eigenvalue Problem
We begin by rewriting Eq. (1.8.2) as
$$(\Omega - \omega I)|V\rangle = |0\rangle \tag{1.8.3}$$
Operating both sides with $(\Omega - \omega I)^{-1}$, assuming it exists, we get
$$|V\rangle = (\Omega - \omega I)^{-1}|0\rangle \tag{1.8.4}$$
Now, any finite operator (an operator with finite matrix elements) acting on the null
vector can only give us a null vector. It therefore seems that in asking for a nonzero
eigenvector $|V\rangle$, we are trying to get something for nothing out of Eq. (1.8.4). This
is impossible. It follows that our assumption that the operator $(\Omega - \omega I)^{-1}$ exists (as
a finite operator) is false. So we ask when this situation will obtain. Basic matrix
theory tells us (see Appendix A.1) that the inverse of any matrix $M$ is given by
$$M^{-1} = \frac{\text{cofactor}\, M^T}{\det M} \tag{1.8.5}$$
Now the cofactor of $M$ is finite if $M$ is. Thus what we need is the vanishing of the
determinant. The condition for nonzero eigenvectors is therefore
$$\det(\Omega - \omega I) = 0 \tag{1.8.6}$$
This equation will determine the eigenvalues $\omega$. To find them, we project Eq. (1.8.3)
onto a basis. Dotting both sides with a basis bra $\langle i|$, we get
$$\langle i|\Omega - \omega I|V\rangle = 0$$
and upon introducing the representation of the identity [Eq. (1.6.7)], to the left of
$|V\rangle$, we get the following image of Eq. (1.8.3):
$$\sum_j (\Omega_{ij} - \omega\delta_{ij})v_j = 0 \tag{1.8.7}$$
Setting the determinant to zero will give us an expression of the form
$$\sum_{m=0}^{n} c_m\omega^m = 0 \tag{1.8.8}$$
Equation (1.8.8) is called the characteristic equation and
$$P^n(\omega) = \sum_{m=0}^{n} c_m\omega^m \tag{1.8.9}$$
is called the characteristic polynomial. Although the polynomial is being determined
in a particular basis, the eigenvalues, which are its roots, are basis independent, for
they are defined by the abstract Eq. (1.8.3), which makes no reference to any basis.
Now, a fundamental result in analysis is that every nth-order polynomial has $n$
roots, not necessarily distinct and not necessarily real. Thus every operator in $\mathbb{V}^n(C)$
has $n$ eigenvalues. Once the eigenvalues are known, the eigenvectors may be found,
at least for Hermitian and unitary operators, using a procedure illustrated by the
following example. [Operators on $\mathbb{V}^n(C)$ that are not of the above variety may not
have $n$ eigenvectors; see Exercise 1.8.4. Theorems 10 and 12 establish that Hermitian
and unitary operators on $\mathbb{V}^n(C)$ will have $n$ eigenvectors.]
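Example 1.8.4. Let us use the general techniques developed above to find all
the eigenvectors and eigenvalues of $R(\tfrac{1}{2}\pi\mathbf{i})$. Recall that the matrix representing it is
$$R(\tfrac{1}{2}\pi\mathbf{i}) \leftrightarrow \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}$$
Therefore the characteristic equation is
$$\det(R - \omega I) = \begin{vmatrix} 1-\omega & 0 & 0 \\ 0 & -\omega & -1 \\ 0 & 1 & -\omega \end{vmatrix} = 0$$
i.e.,
$$(1 - \omega)(\omega^2 + 1) = 0 \tag{1.8.10}$$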
with roots $\omega = 1, \pm i$. We know that $\omega = 1$ corresponds to $|1\rangle$. Let us see this come
out of the formalism. Feeding $\omega = 1$ into Eq. (1.8.7) we find that the components
$x_1$, $x_2$, and $x_3$ of the corresponding eigenvector must obey the equations
$$\begin{bmatrix} 0 & 0 & 0 \\ 0 & -1 & -1 \\ 0 & 1 & -1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \;\rightarrow\; \begin{aligned} 0 &= 0 \\ -x_2 - x_3 &= 0 \\ x_2 - x_3 &= 0 \end{aligned} \;\rightarrow\; x_2 = x_3 = 0$$
Thus any vector of the form
$$x_1|1\rangle \leftrightarrow \begin{bmatrix} x_1 \\ 0 \\ 0 \end{bmatrix}$$
is acceptable, as expected. It is conventional to use the freedom in scale to normalize
the eigenvectors. Thus in this case a choice is
$$|\omega = 1\rangle = |1\rangle \leftrightarrow \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$
I say a choice, and not the choice, since the vector may be multiplied by a number
of modulus unity without changing the norm. There is no universally accepted convention
for eliminating this freedom, except perhaps to choose the vector with real
components when possible.
Note that of the three simultaneous equations above, the first is not a real
equation. In general, there will be only $(n - 1)$ LI equations. This is the reason the
norm of the vector is not fixed and, as shown in Appendix A.1, the reason the
determinant vanishes.
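Consider next the equations corresponding to $\omega = i$. The components of the
eigenvector obey the equations
$$(1 - i)x_1 = 0 \quad (\text{i.e., } x_1 = 0)$$
$$-ix_2 - x_3 = 0 \quad (\text{i.e., } x_2 = ix_3)$$
$$x_2 - ix_3 = 0 \quad (\text{i.e., } x_2 = ix_3)$$
Notice once again that we have only $n - 1$ useful equations. A properly normalized
solution to the above is
$$|\omega = i\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} 0 \\ i \\ 1 \end{bmatrix}$$
A similar procedure yields the third eigenvector:
$$|\omega = -i\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix}$$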
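The results of this example can be checked numerically with plain complex arithmetic. The sketch below is an illustration only (the helper names are assumptions): it evaluates $\det(R - \omega I)$ at the three roots, applies $R$ to each normalized eigenvector, and tests that the three eigenvectors are orthonormal. Since $R$ is unitary, it also serves as an instance of eigenvalues of unit modulus.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def apply(m, v):
    """Components of the transformed vector: v'_i = sum_j m_ij v_j."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def inner(a, b):
    """Inner product <a|b> = sum_i conj(a_i) b_i."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


R = [[1, 0, 0],
     [0, 0, -1],
     [0, 1, 0]]


def char_poly(w):
    """det(R - w I), the characteristic polynomial evaluated at w."""
    return det3([[R[i][j] - (w if i == j else 0) for j in range(3)] for i in range(3)])


s = 2 ** -0.5  # normalization factor 1/sqrt(2)
eigenpairs = [
    (1, [1, 0, 0]),
    (1j, [0, 1j * s, s]),
    (-1j, [0, -1j * s, s]),
]

tol = 1e-12
roots_ok = all(abs(char_poly(w)) < tol for w, _ in eigenpairs)
eigen_ok = all(
    all(abs(lhs - w * x) < tol for lhs, x in zip(apply(R, v), v))
    for w, v in eigenpairs
)
orthonormal = all(
    abs(inner(eigenpairs[a][1], eigenpairs[b][1]) - (1 if a == b else 0)) < tol
    for a in range(3) for b in range(3)
)
unit_modulus = all(abs(abs(w) - 1) < tol for w, _ in eigenpairs)

print(roots_ok, eigen_ok, orthonormal, unit_modulus)
```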
In the above example we have introduced a popular convention: labeling the
eigenvectors by the eigenvalue. For instance, the ket corresponding to $\omega = \omega_i$ is
labeled $|\omega = \omega_i\rangle$ or simply $|\omega_i\rangle$. This notation presumes that to each $\omega_i$ there is just
one vector labeled by it. Though this is not always the case, only a slight change in
this notation will be needed to cover the general case.
The phenomenon of a single eigenvalue representing more than one eigenvector
is called degeneracy and corresponds to repeated roots for the characteristic polynomial.
In the face of degeneracy, we need to modify not just the labeling, but also
the procedure used in the example above for finding the eigenvectors. Imagine that
instead of $R(\tfrac{1}{2}\pi\mathbf{i})$ we were dealing with another operator $\Omega$ on $\mathbb{V}^3(R)$ with roots $\omega_1$
and $\omega_2 = \omega_3$. It appears as if we can get two eigenvectors, by the method described
above, one for each distinct $\omega$. How do we get a third? Or is there no third? These
questions will be answered in all generality shortly when we examine the question
of degeneracy in detail. We now turn our attention to two central theorems on
Hermitian operators. These play a vital role in quantum mechanics.
Theorem 9. The eigenvalues of a Hermitian operator are real.
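Proof. Let
$$\Omega|\omega\rangle = \omega|\omega\rangle$$
Dot both sides with $\langle\omega|$:
$$\langle\omega|\Omega|\omega\rangle = \omega\langle\omega|\omega\rangle \tag{1.8.11}$$
Take the adjoint to get
$$\langle\omega|\Omega^\dagger|\omega\rangle = \omega^*\langle\omega|\omega\rangle$$
Since $\Omega = \Omega^\dagger$, this becomes
$$\langle\omega|\Omega|\omega\rangle = \omega^*\langle\omega|\omega\rangle$$
Subtracting from Eq. (1.8.11)
$$0 = (\omega - \omega^*)\langle\omega|\omega\rangle$$
$$\omega = \omega^* \quad \text{Q.E.D.}$$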
Theorem 10. To every Hermitian operator $\Omega$, there exists (at least) a basis
consisting of its orthonormal eigenvectors. It is diagonal in this eigenbasis and
has its eigenvalues as its diagonal entries.
Proof. Let us start with the characteristic equation. It must have at least one
root, call it $\omega_1$. Corresponding to $\omega_1$ there must exist at least one nonzero eigenvector
$|\omega_1\rangle$. [If not, Theorem (A.1.1) would imply that $(\Omega - \omega_1 I)$ is invertible.] Consider
the subspace $\mathbb{V}^{n-1}_{\perp 1}$ of all vectors orthogonal to $|\omega_1\rangle$. Let us choose as our basis the
vector $|\omega_1\rangle$ (normalized to unity) and any $n - 1$ orthonormal vectors
$\{V^1_{\perp 1}, V^2_{\perp 1}, \ldots, V^{n-1}_{\perp 1}\}$ in $\mathbb{V}^{n-1}_{\perp 1}$. In this basis $\Omega$ has the following form:
$$\Omega \leftrightarrow \begin{bmatrix} \omega_1 & 0 & 0 & \cdots & 0 \\ 0 & & & & \\ 0 & & \boxed{\;(n-1)\times(n-1)\text{ submatrix}\;} & & \\ \vdots & & & & \\ 0 & & & & \end{bmatrix} \tag{1.8.12}$$
The first column is just the image of $|\omega_1\rangle$ after $\Omega$ has acted on it. Given the
first column, the first row follows from the Hermiticity of $\Omega$.
The characteristic equation now takes the form
$$(\omega_1 - \omega)\cdot(\text{determinant of boxed submatrix}) = 0$$
$$(\omega_1 - \omega)\sum_{m=0}^{n-1} c_m\omega^m = (\omega_1 - \omega)P^{n-1}(\omega) = 0$$
Now the polynomial $P^{n-1}$ must also generate one root, $\omega_2$, and a normalized
eigenvector $|\omega_2\rangle$. Define the subspace $\mathbb{V}^{n-2}_{\perp 1,2}$ of vectors in $\mathbb{V}^{n-1}_{\perp 1}$ orthogonal to $|\omega_2\rangle$
(and automatically to $|\omega_1\rangle$) and repeat the same procedure as before. Finally, the
matrix $\Omega$ becomes, in the basis $|\omega_1\rangle, |\omega_2\rangle, \ldots, |\omega_n\rangle$,
$$\Omega \leftrightarrow \begin{bmatrix} \omega_1 & 0 & 0 & \cdots & 0 \\ 0 & \omega_2 & 0 & \cdots & 0 \\ 0 & 0 & \omega_3 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \omega_n \end{bmatrix}$$
Since every $|\omega_i\rangle$ was chosen from a space that was orthogonal to the previous
ones, $|\omega_1\rangle, |\omega_2\rangle, \ldots, |\omega_{i-1}\rangle$, the basis of eigenvectors is orthonormal. (Notice that
nowhere did we have to assume that the eigenvalues were all distinct.) Q.E.D.
[The analogy between real numbers and Hermitian operators is further strengthened
by the fact that in a certain basis (of eigenvectors) the Hermitian operator can
be represented by a matrix with all real elements.]
In stating Theorem 10, it was indicated that there might exist more than one
basis of eigenvectors that diagonalized $\Omega$. This happens if there is any degeneracy.
Suppose $\omega_1 = \omega_2 = \omega$. Then we have two orthonormal vectors obeying
$$\Omega|\omega_1\rangle = \omega|\omega_1\rangle$$
$$\Omega|\omega_2\rangle = \omega|\omega_2\rangle$$
It follows that
$$\Omega[\alpha|\omega_1\rangle + \beta|\omega_2\rangle] = \alpha\omega|\omega_1\rangle + \beta\omega|\omega_2\rangle = \omega[\alpha|\omega_1\rangle + \beta|\omega_2\rangle]$$
for any $\alpha$ and $\beta$. Since the vectors $|\omega_1\rangle$ and $|\omega_2\rangle$ are orthogonal (and hence LI),
we find that there is a whole two-dimensional subspace spanned by $|\omega_1\rangle$ and $|\omega_2\rangle$,
the elements of which are eigenvectors of $\Omega$ with eigenvalue $\omega$. One refers to this
space as an eigenspace of $\Omega$ with eigenvalue $\omega$. Besides the vectors $|\omega_1\rangle$ and $|\omega_2\rangle$,
there exists an infinity of orthonormal pairs $|\omega_1'\rangle, |\omega_2'\rangle$, obtained by a rigid rotation
of $|\omega_1\rangle, |\omega_2\rangle$, from which we may select any pair in forming the eigenbasis of $\Omega$.
In general, if an eigenvalue occurs $m_i$ times, that is, if the characteristic equation has
$m_i$ of its roots equal to some $\omega_i$, there will be an eigenspace $\mathbb{V}^{m_i}_{\omega_i}$ from which we may
choose any $m_i$ orthonormal vectors to form the basis referred to in Theorem 10.
In the absence of degeneracy, we can prove Theorems 9 and 10 very easily. Let
us begin with two eigenvectors:
$$\Omega|\omega_i\rangle = \omega_i|\omega_i\rangle \tag{1.8.13a}$$
$$\Omega|\omega_j\rangle = \omega_j|\omega_j\rangle \tag{1.8.13b}$$
Dotting the first with $\langle\omega_j|$ and the second with $\langle\omega_i|$, we get
$$\langle\omega_j|\Omega|\omega_i\rangle = \omega_i\langle\omega_j|\omega_i\rangle \tag{1.8.14a}$$
$$\langle\omega_i|\Omega|\omega_j\rangle = \omega_j\langle\omega_i|\omega_j\rangle \tag{1.8.14b}$$
Taking the adjoint of the last equation and using the Hermitian nature of $\Omega$, we get
$$\langle\omega_j|\Omega|\omega_i\rangle = \omega_j^*\langle\omega_j|\omega_i\rangle$$
Subtracting this equation from Eq. (1.8.14a), we get
$$0 = (\omega_i - \omega_j^*)\langle\omega_j|\omega_i\rangle \tag{1.8.15}$$
If $i = j$, we get, since $\langle\omega_i|\omega_i\rangle \neq 0$,
$$\omega_i = \omega_i^* \tag{1.8.16}$$
If $i \neq j$, we get
$$\langle\omega_i|\omega_j\rangle = 0 \tag{1.8.17}$$
since $\omega_i - \omega_j^* = \omega_i - \omega_j \neq 0$ by assumption. That the proof of orthogonality breaks
down for $\omega_i = \omega_j$ is not surprising, for two vectors labeled by a degenerate eigenvalue
could be any two members of the degenerate space which need not necessarily be
orthogonal. The modification of this proof in this case of degeneracy calls for arguments
that are essentially the ones used in proving Theorem 10. The advantage in
the way Theorem 10 was proved first is that it suffers no modification in the degenerate
case.
Degeneracy
We now address the question of degeneracy as promised earlier. Now, our
general analysis of Theorem 10 showed us that in the face of degeneracy, we have
not one, but an infinity of orthonormal eigenbases. Let us see through an example
how this variety manifests itself when we look for eigenvectors and how it is to be
handled.
Example 1.8.5. Consider an operator $\Omega$ with matrix elements
$$\Omega \leftrightarrow \begin{bmatrix} 1 & 0 & 1 \\ 0 & 2 & 0 \\ 1 & 0 & 1 \end{bmatrix}$$
in some basis. The characteristic equation is
$$(\omega - 2)^2\omega = 0$$
$$\omega = 0, 2, 2$$
The vector corresponding to $\omega = 0$ is found by the usual means to be
$$|\omega = 0\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}$$
The case $\omega = 2$ leads to the following equations for the components of the
eigenvector:
$$-x_1 + x_3 = 0$$
$$0 = 0$$
$$x_1 - x_3 = 0$$
Now we have just one equation, instead of the two ($n - 1$) we have grown accustomed
to! This is a reflection of the degeneracy. For every extra appearance (besides the
first) a root makes, it takes away one equation. Thus degeneracy permits us extra
degrees of freedom besides the usual one (of normalization). The conditions
$$x_1 = x_3$$
$$x_2 \text{ arbitrary}$$
define an ensemble of vectors that are perpendicular to the first, $|\omega = 0\rangle$, i.e., lie in
a plane perpendicular to $|\omega = 0\rangle$. This is in agreement with our expectation that a
twofold degeneracy should lead to a two-dimensional eigenspace. The freedom in $x_2$
(or more precisely, the ratio $x_2/x_3$) corresponds to the freedom of orientation in this
plane. Let us arbitrarily choose $x_2/x_3 = 1$, to get a normalized eigenvector corresponding
to $\omega = 2$:
$$|\omega = 2\rangle \leftrightarrow \frac{1}{3^{1/2}}\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$
The third vector is now chosen to lie in this plane and to be orthogonal to the second
(being in this plane automatically makes it perpendicular to the first, $|\omega = 0\rangle$):
$$|\omega = 2, \text{second one}\rangle \leftrightarrow \frac{1}{6^{1/2}}\begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}$$
Clearly each distinct choice of the ratio $x_2/x_3$ gives us a distinct doublet of orthonormal
eigenvectors with eigenvalue 2. □
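The freedom described in this example can be seen concretely by trying several values of the arbitrary component. The sketch below works with unnormalized integer vectors of the form $(1, t, 1)$, a parametrization chosen here for convenience, so that every comparison is exact.

```python
def apply(m, v):
    """Components of the transformed vector: v'_i = sum_j m_ij v_j."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def dot(a, b):
    """Real dot product, sufficient here because all components are real."""
    return sum(x * y for x, y in zip(a, b))


Omega = [[1, 0, 1],
         [0, 2, 0],
         [1, 0, 1]]

w0 = [1, 0, -1]  # unnormalized eigenvector with eigenvalue 0

# Vectors with x1 = x3 and x2 arbitrary: the eigenvalue-2 plane.
plane = [[1, t, 1] for t in (-2, 0, 1, 5)]

all_eigen = all(apply(Omega, v) == [2 * x for x in v] for v in plane)
all_perp = all(dot(w0, v) == 0 for v in plane)

# The particular pair chosen in the example (before normalization).
second = [1, 1, 1]
third = [1, -2, 1]
pair_orthogonal = dot(second, third) == 0

print(apply(Omega, w0), all_eigen, all_perp, pair_orthogonal)
```

Every vector in the plane passes the eigenvalue test, while only special pairs within it, such as the one chosen in the example, are mutually orthogonal.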
Notice that in the face of degeneracy, $|\omega_i\rangle$ no longer refers to a single ket but
to a generic element of the eigenspace $\mathbb{V}^{m_i}_{\omega_i}$. To refer to a particular element, we must
use the symbol $|\omega_i, \alpha\rangle$, where $\alpha$ labels the ket within the eigenspace. A natural
choice of the label $\alpha$ will be discussed shortly.
We now consider the analogs of Theorems 9 and 10 for unitary operators.
Theorem 11. The eigenvalues of a unitary operator are complex numbers of
unit modulus.
Theorem 12. The eigenvectors of a unitary operator are mutually orthogonal.
(We assume there is no degeneracy.)
Proof of Both Theorems (assuming no degeneracy). Let
$$U|u_i\rangle = u_i|u_i\rangle \tag{1.8.18a}$$
and
$$U|u_j\rangle = u_j|u_j\rangle \tag{1.8.18b}$$
If we take the adjoint of the second equation and dot each side with the corresponding
side of the first equation, we get
$$\langle u_j|U^\dagger U|u_i\rangle = u_iu_j^*\langle u_j|u_i\rangle$$
so that
$$(1 - u_iu_j^*)\langle u_j|u_i\rangle = 0 \tag{1.8.19}$$
If $i = j$, we get, since $\langle u_i|u_i\rangle \neq 0$,
$$u_iu_i^* = 1 \tag{1.8.20a}$$
while if $i \neq j$,
$$\langle u_j|u_i\rangle = 0 \tag{1.8.20b}$$
since $|u_i\rangle \neq |u_j\rangle \Rightarrow u_i \neq u_j \Rightarrow u_iu_j^* \neq u_ju_j^* \Rightarrow u_iu_j^* \neq 1$. (Q.E.D.)
If $U$ is degenerate, we can carry out an analysis parallel to that for the Hermitian
operator $\Omega$, with just one difference. Whereas in Eq. (1.8.12), the zeros of the first
row followed from the zeros of the first column and $\Omega^\dagger = \Omega$, here they follow from
the requirement that the sum of the modulus squared of the elements in each row
adds up to 1. Since $|u_1| = 1$, all the other elements in the first row must vanish.
Diagonalization of Hermitian Matrices
Consider a Hermitian operator $\Omega$ on $\mathbb{V}^n(C)$ represented as a matrix in some
orthonormal basis $|1\rangle, \ldots, |i\rangle, \ldots, |n\rangle$. If we trade this basis for the eigenbasis
$|\omega_1\rangle, \ldots, |\omega_i\rangle, \ldots, |\omega_n\rangle$, the matrix representing $\Omega$ will become diagonal. Now the
operator $U$ inducing the change of basis
$$|\omega_i\rangle = U|i\rangle \tag{1.8.21}$$
is clearly unitary, for it "rotates" one orthonormal basis into another. (If you wish
you may apply our mnemonic to $U$ and verify its unitary nature: its columns contain
the components of the eigenvectors $|\omega_i\rangle$ that are orthonormal.) This result is often
summarized by the statement:
Every Hermitian matrix on $\mathbb{V}^n(C)$ may be diagonalized by a unitary change of
basis.
We may restate this result in terms of passive transformations as follows:
If $\Omega$ is a Hermitian matrix, there exists a unitary matrix $U$ (built out of the
eigenvectors of $\Omega$) such that $U^\dagger\Omega U$ is diagonal.
Thus the problem of finding a basis that diagonalizes $\Omega$ is equivalent to solving
its eigenvalue problem.
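As an illustration of this statement, the sketch below takes the Hermitian matrix of Example 1.8.5, places its three normalized eigenvectors in the columns of a matrix, and forms the transformed operator. The helper names are assumptions, and the diagonal entries are compared within a small tolerance because the normalization factors are irrational.

```python
import math


def dagger(m):
    """Transpose conjugate: (m^dagger)_ij = conjugate of m_ji."""
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]


def matmul(a, b):
    """Matrix product (ab)_ij = sum_k a_ik b_kj."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def normalize(v):
    """Rescale a vector to unit norm."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


Omega = [[1, 0, 1],
         [0, 2, 0],
         [1, 0, 1]]

# Eigenvectors from Example 1.8.5 with eigenvalues 0, 2, 2 (before normalization).
eigvecs = [[1, 0, -1], [1, 1, 1], [1, -2, 1]]
cols = [normalize(v) for v in eigvecs]

# Mnemonic: the jth column of U holds the components of the jth eigenvector.
U = [[cols[j][i] for j in range(3)] for i in range(3)]

D = matmul(dagger(U), matmul(Omega, U))
expected = [[0, 0, 0], [0, 2, 0], [0, 0, 2]]

tol = 1e-12
is_diagonal = all(abs(D[i][j] - expected[i][j]) < tol
                  for i in range(3) for j in range(3))
UdU = matmul(dagger(U), U)
is_unitary = all(abs(UdU[i][j] - (1 if i == j else 0)) < tol
                 for i in range(3) for j in range(3))

print(is_unitary, is_diagonal)
```

The eigenvalues appear on the diagonal in the same order as the eigenvectors appear in the columns of `U`.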
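Exercise 1.8.1. (1) Find the eigenvalues and normalized eigenvectors of the matrix
$$\Omega = \begin{bmatrix} 1 & 3 & 1 \\ 0 & 2 & 0 \\ 0 & 1 & 4 \end{bmatrix}$$
(2) Is the matrix Hermitian? Are the eigenvectors orthogonal?
Exercise 1.8.2.* Consider the matrix
$$\Omega = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}$$
(1) Is it Hermitian?
(2) Find its eigenvalues and eigenvectors.
(3) Verify that $U^\dagger\Omega U$ is diagonal, $U$ being the matrix of eigenvectors of $\Omega$.
Exercise 1.8.3.* Consider the Hermitian matrix
$$\Omega = \frac{1}{2}\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & -1 \\ 0 & -1 & 3 \end{bmatrix}$$
(1) Show that $\omega_1 = \omega_2 = 1$; $\omega_3 = 2$.
(2) Show that $|\omega = 2\rangle$ is any vector of the form
$$\frac{1}{(2a^2)^{1/2}}\begin{bmatrix} 0 \\ a \\ -a \end{bmatrix}$$
(3) Show that the $\omega = 1$ eigenspace contains all vectors of the form
$$\frac{1}{(b^2 + 2c^2)^{1/2}}\begin{bmatrix} b \\ c \\ c \end{bmatrix}$$
either by feeding $\omega = 1$ into the equations or by requiring that the $\omega = 1$ eigenspace be
orthogonal to $|\omega = 2\rangle$.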
Exercise 1.8.4. An arbitrary $n \times n$ matrix need not have $n$ eigenvectors. Consider as an example

$$\Omega = \begin{bmatrix} 4 & 1 \\ -1 & 2 \end{bmatrix}$$

(1) Show that $\omega_1 = \omega_2 = 3$.
(2) By feeding in this value show we get only one eigenvector of the form

$$\frac{1}{(2a^2)^{1/2}}\begin{bmatrix} +a \\ -a \end{bmatrix}$$

We cannot find another one that is LI.
Exercise 1.8.5.* Consider the matrix

$$\Omega = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$$

(1) Show that it is unitary.
(2) Show that its eigenvalues are $e^{i\theta}$ and $e^{-i\theta}$.
(3) Find the corresponding eigenvectors; show that they are orthogonal.
(4) Verify that $U^\dagger \Omega U$ = (diagonal matrix), where $U$ is the matrix of eigenvectors of $\Omega$.
Exercise 1.8.6.* (1) We have seen that the determinant of a matrix is unchanged under a unitary change of basis. Argue now that

$$\det \Omega = \text{product of eigenvalues of } \Omega = \prod_{i=1}^{n} \omega_i$$

for a Hermitian or unitary $\Omega$.
(2) Using the invariance of the trace under the same transformation, show that

$$\mathrm{Tr}\,\Omega = \sum_{i=1}^{n} \omega_i$$
Exercise 1.8.7. By using the results on the trace and determinant from the last problem, show that the eigenvalues of the matrix

$$\Omega = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$$

are 3 and $-1$. Verify this by explicit computation. Note that the Hermitian nature of the matrix is an essential ingredient.
Exercise 1.8.8.* Consider Hermitian matrices $M^1, M^2, M^3, M^4$ that obey

$$M^i M^j + M^j M^i = 2\delta^{ij} I, \qquad i, j = 1, \ldots, 4$$

(1) Show that the eigenvalues of $M^i$ are $\pm 1$. (Hint: go to the eigenbasis of $M^i$, and use the equation for $i = j$.)
(2) By considering the relation

$$M^i M^j = -M^j M^i \quad \text{for } i \neq j$$

show that $M^i$ are traceless. [Hint: $\mathrm{Tr}(ACB) = \mathrm{Tr}(CBA)$.]
(3) Show that they cannot be odd-dimensional matrices.
Exercise 1.8.9. A collection of masses $m_a$, located at $\mathbf{r}_a$ and rotating with angular velocity $\boldsymbol{\omega}$ around a common axis has an angular momentum

$$\mathbf{l} = \sum_a m_a (\mathbf{r}_a \times \mathbf{v}_a)$$

where $\mathbf{v}_a = \boldsymbol{\omega} \times \mathbf{r}_a$ is the velocity of $m_a$. By using the identity

$$\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{B}(\mathbf{A} \cdot \mathbf{C}) - \mathbf{C}(\mathbf{A} \cdot \mathbf{B})$$

show that each Cartesian component $l_i$ of $\mathbf{l}$ is given by

$$l_i = \sum_j M_{ij}\,\omega_j$$

where

$$M_{ij} = \sum_a m_a \left[ r_a^2\,\delta_{ij} - (\mathbf{r}_a)_i (\mathbf{r}_a)_j \right]$$

or in Dirac notation

$$|l\rangle = M|\omega\rangle$$

(1) Will the angular momentum and angular velocity always be parallel?
(2) Show that the moment of inertia matrix $M_{ij}$ is Hermitian.
(3) Argue now that there exist three directions for $\boldsymbol{\omega}$ such that $\mathbf{l}$ and $\boldsymbol{\omega}$ will be parallel. How are these directions to be found?
(4) Consider the moment of inertia matrix of a sphere. Due to the complete symmetry of the sphere, it is clear that every direction is its eigendirection for rotation. What does this say about the three eigenvalues of the matrix $M$?
Simultaneous Diagonalization of Two Hermitian Operators
Let us consider next the question of simultaneously diagonalizing two Hermitian
operators.
Theorem 13. If $\Omega$ and $\Lambda$ are two commuting Hermitian operators, there exists (at least) a basis of common eigenvectors that diagonalizes them both.

Proof. Consider first the case where at least one of the operators is nondegenerate, i.e., to a given eigenvalue, there is just one eigenvector, up to a scale. Let us assume $\Omega$ is nondegenerate. Consider any one of its eigenvectors:

$$\Omega|\omega_i\rangle = \omega_i|\omega_i\rangle$$

$$\Lambda\Omega|\omega_i\rangle = \omega_i\Lambda|\omega_i\rangle$$

Since $[\Omega, \Lambda] = 0$,

$$\Omega\Lambda|\omega_i\rangle = \omega_i\Lambda|\omega_i\rangle \tag{1.8.22}$$

i.e., $\Lambda|\omega_i\rangle$ is an eigenvector of $\Omega$ with eigenvalue $\omega_i$. Since this vector is unique up to a scale,

$$\Lambda|\omega_i\rangle = \lambda_i|\omega_i\rangle \tag{1.8.23}$$

Thus $|\omega_i\rangle$ is also an eigenvector of $\Lambda$ with eigenvalue $\lambda_i$. Since every eigenvector of $\Omega$ is an eigenvector of $\Lambda$, it is evident that the basis $|\omega_i\rangle$ will diagonalize both operators. Since $\Omega$ is nondegenerate, there is only one basis with this property.
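What if both operators are degenerate? By ordering the basis vectors such that the elements of each eigenspace are adjacent, we can get one of them, say $\Omega$, into the form (Theorem 10)

$$\Omega \leftrightarrow \mathrm{diag}(\underbrace{\omega_1, \ldots, \omega_1}_{m_1},\ \underbrace{\omega_2, \ldots, \omega_2}_{m_2},\ \ldots,\ \underbrace{\omega_m, \ldots, \omega_m}_{m_m})$$

Now this basis is not unique: in every eigenspace $V_{\omega_i}^{m_i} \equiv V_i^{m_i}$ corresponding to the eigenvalue $\omega_i$, there exists an infinity of bases. Let us arbitrarily pick in $V_i^{m_i}$ a set $|\omega_i, \alpha\rangle$ where the additional label $\alpha$ runs from 1 to $m_i$.

How does $\Lambda$ appear in the basis? Although we made no special efforts to get $\Lambda$ into a simple form, it already has a simple form by virtue of the fact that it commutes with $\Omega$. Let us start by mimicking the proof in the nondegenerate case:

$$\Omega\Lambda|\omega_i, \alpha\rangle = \Lambda\Omega|\omega_i, \alpha\rangle = \omega_i\Lambda|\omega_i, \alpha\rangle$$

However, due to the degeneracy of $\Omega$, we can only conclude that

$$\Lambda|\omega_i, \alpha\rangle \text{ lies in } V_i^{m_i}$$

Now, since vectors from different eigenspaces are orthogonal [Eq. (1.8.15)],

$$\langle\omega_j, \beta|\Lambda|\omega_i, \alpha\rangle = 0$$

if $|\omega_i, \alpha\rangle$ and $|\omega_j, \beta\rangle$ are basis vectors such that $\omega_i \neq \omega_j$. Consequently, in this basis,

$$\Lambda \leftrightarrow \begin{bmatrix} \Lambda_1 & & & \\ & \Lambda_2 & & \\ & & \ddots & \\ & & & \Lambda_m \end{bmatrix}$$

which is called a block diagonal matrix for obvious reasons. The block diagonal form of $\Lambda$ reflects the fact that when $\Lambda$ acts on some element $|\omega_i, \alpha\rangle$ of the eigenspace $V_i^{m_i}$, it turns it into another element of $V_i^{m_i}$. Within each subspace $i$, $\Lambda$ is given by a matrix $\Lambda_i$, which appears as a block in the equation above. Consider a matrix $\Lambda_i$ in $V_i^{m_i}$. It is Hermitian since $\Lambda$ is. It can obviously be diagonalized by trading the basis $|\omega_i, 1\rangle, |\omega_i, 2\rangle, \ldots, |\omega_i, m_i\rangle$ in $V_i^{m_i}$ that we started with, for the eigenbasis of $\Lambda_i$. Let us make such a change of basis in each eigenspace, thereby rendering $\Lambda$ diagonal. Meanwhile what of $\Omega$? It remains diagonal of course, since it is indifferent to the choice of orthonormal basis in each degenerate eigenspace. If the eigenvalues of $\Lambda_i$ are $\lambda_i^{(1)}, \lambda_i^{(2)}, \ldots, \lambda_i^{(m_i)}$ then we end up with

$$\Lambda \leftrightarrow \mathrm{diag}(\lambda_1^{(1)}, \ldots, \lambda_1^{(m_1)},\ \ldots,\ \lambda_m^{(1)}, \ldots, \lambda_m^{(m_m)}), \qquad \Omega \leftrightarrow \mathrm{diag}(\omega_1, \ldots, \omega_1,\ \ldots,\ \omega_m, \ldots, \omega_m)$$

Q.E.D.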
If $\Lambda$ is not degenerate within any given subspace, $\lambda_i^{(k)} \neq \lambda_i^{(l)}$ for any $k$, $l$, and $i$, the basis we end up with is unique: the freedom $\Omega$ gave us in each eigenspace is fully eliminated by $\Lambda$. The elements of this basis may be named uniquely by the pair of indices $\omega$ and $\lambda$ as $|\omega, \lambda\rangle$, with $\lambda$ playing the role of the extra label $\alpha$. If $\Lambda$ is degenerate within an eigenspace of $\Omega$, if say $\lambda_1^{(1)} = \lambda_1^{(2)}$, there is a two-dimensional eigenspace from which we can choose any two orthonormal vectors for the common basis. It is then necessary to bring in a third operator $\Gamma$, that commutes with both $\Omega$ and $\Lambda$, and which will be nondegenerate in this subspace. In general, one can always find, for finite $n$, a set of operators $\{\Omega, \Lambda, \Gamma, \ldots\}$ that commute with each other and that nail down a unique, common, eigenbasis, the elements of which may be labeled unambiguously as $|\omega, \lambda, \gamma, \ldots\rangle$. In our study of quantum mechanics it will be assumed that such a complete set of commuting operators exists if $n$ is infinite.
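The first case of the proof suggests a simple numerical recipe: diagonalize the nondegenerate operator, and its eigenbasis automatically diagonalizes the degenerate one. The sketch below illustrates this under stated assumptions: the two matrices are a hypothetical commuting pair chosen for this illustration (they do not appear in the text), with `omega` degenerate and `lam` nondegenerate.

```python
import numpy as np

# Hypothetical commuting Hermitian pair, assumed for illustration only.
omega = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])  # eigenvalues -1, 1, 1 (degenerate)
lam = np.array([[2.0, 1.0, 0.0],
                [1.0, 2.0, 0.0],
                [0.0, 0.0, 5.0]])    # eigenvalues 1, 3, 5 (nondegenerate)

commutator = omega @ lam - lam @ omega

# Let the nondegenerate operator dictate the basis.
lam_eigenvalues, U = np.linalg.eigh(lam)

# Both operators expressed in the eigenbasis of lam.
omega_in_common_basis = U.T @ omega @ U
lam_in_common_basis = U.T @ lam @ U


def is_diagonal(matrix, tol=1e-10):
    """Return True if all off-diagonal elements are below tol."""
    off_diagonal = matrix - np.diag(np.diag(matrix))
    return bool(np.all(np.abs(off_diagonal) < tol))
```

Starting instead from the eigenvectors of the degenerate `omega` would not be guaranteed to work, since a numerical routine is free to return any orthonormal basis within the degenerate eigenspace.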
Exercise 1.8.10.* By considering the commutator, show that the following Hermitian matrices may be simultaneously diagonalized. Find the eigenvectors common to both and verify that under a unitary transformation to this basis, both matrices are diagonalized.

$$\Omega = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 1 \end{bmatrix}, \qquad \Lambda = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 0 & -1 \\ 1 & -1 & 2 \end{bmatrix}$$

Since $\Omega$ is degenerate and $\Lambda$ is not, you must be prudent in deciding which matrix dictates the choice of basis.
Example 1.8.6. We will now discuss, in some detail, the complete solution to a
problem in mechanics. It is important that you understand this example thoroughly,
for it not only illustrates the use of the mathematical techniques developed in this
chapter but also contains the main features of the central problem in quantum
mechanics.
The mechanical system in question is depicted in Fig. 1.5. The two masses $m$ are coupled to each other and the walls by springs of force constant $k$. If $x_1$ and $x_2$ measure the displacements of the masses from their equilibrium points, these coordinates obey the following equations, derived through an elementary application of Newton's laws:

$$\ddot{x}_1 = -\frac{2k}{m}x_1 + \frac{k}{m}x_2 \tag{1.8.24a}$$

$$\ddot{x}_2 = \frac{k}{m}x_1 - \frac{2k}{m}x_2 \tag{1.8.24b}$$
Figure 1.5. The coupled mass problem. All masses are $m$, all spring constants are $k$, and the displacements of the masses from equilibrium are $x_1$ and $x_2$.
The problem is to find $x_1(t)$ and $x_2(t)$ given the initial-value data, which in this case consist of the initial positions and velocities. If we restrict ourselves to the case of zero initial velocities, our problem is to find $x_1(t)$ and $x_2(t)$, given $x_1(0)$ and $x_2(0)$.

In what follows, we will formulate the problem in the language of linear vector spaces and solve it using the machinery developed in this chapter. As a first step, we rewrite Eq. (1.8.24) in matrix form:

$$\begin{bmatrix} \ddot{x}_1 \\ \ddot{x}_2 \end{bmatrix} = \begin{bmatrix} \Omega_{11} & \Omega_{12} \\ \Omega_{21} & \Omega_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \tag{1.8.25a}$$

where the elements of the Hermitian matrix $\Omega_{ij}$ are

$$\Omega_{11} = \Omega_{22} = -2k/m, \qquad \Omega_{12} = \Omega_{21} = k/m \tag{1.8.25b}$$

We now view $x_1$ and $x_2$ as components of an abstract vector $|x\rangle$, and $\Omega_{ij}$ as the matrix elements of a Hermitian operator $\Omega$. Since the vector $|x\rangle$ has two real components, it is an element of $V^2(R)$, and $\Omega$ is a Hermitian operator on $V^2(R)$. The abstract form of Eq. (1.8.25a) is

$$|\ddot{x}(t)\rangle = \Omega|x(t)\rangle \tag{1.8.26}$$
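Equation (1.8.25a) is obtained by projecting Eq. (1.8.26) on the basis vectors $|1\rangle$, $|2\rangle$, which have the following physical significance:

$$|1\rangle \leftrightarrow \begin{bmatrix} 1 \\ 0 \end{bmatrix} \leftrightarrow \begin{bmatrix} \text{first mass displaced by unity} \\ \text{second mass undisplaced} \end{bmatrix} \tag{1.8.27a}$$

$$|2\rangle \leftrightarrow \begin{bmatrix} 0 \\ 1 \end{bmatrix} \leftrightarrow \begin{bmatrix} \text{first mass undisplaced} \\ \text{second mass displaced by unity} \end{bmatrix} \tag{1.8.27b}$$

An arbitrary state, in which the masses are displaced by $x_1$ and $x_2$, is given in this basis by

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} x_1 + \begin{bmatrix} 0 \\ 1 \end{bmatrix} x_2 \tag{1.8.28}$$

The abstract counterpart of the above equation is

$$|x\rangle = |1\rangle x_1 + |2\rangle x_2 \tag{1.8.29}$$

It is in this $|1\rangle$, $|2\rangle$ basis that $\Omega$ is represented by the matrix appearing in Eq. (1.8.25), with elements $-2k/m$, $k/m$, etc.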
The basis $|1\rangle$, $|2\rangle$ is very desirable physically, for the components of $|x\rangle$ in this basis ($x_1$ and $x_2$) have the simple interpretation as displacements of the masses. However, from the standpoint of finding a mathematical solution to the initial-value problem, it is not so desirable, for the components $x_1$ and $x_2$ obey the coupled differential equations (1.8.24a) and (1.8.24b). The coupling is mediated by the off-diagonal matrix elements $\Omega_{12} = \Omega_{21} = k/m$.

Having identified the problem with the $|1\rangle$, $|2\rangle$ basis, we can now see how to get around it: we must switch to a basis in which $\Omega$ is diagonal. The components of $|x\rangle$ in this basis will then obey uncoupled differential equations which may be readily solved. Having found the solution, we can return to the physically preferable $|1\rangle$, $|2\rangle$ basis. This, then, is our broad strategy and we now turn to the details.
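From our study of Hermitian operators we know that the basis that diagonalizes $\Omega$ is the basis of its normalized eigenvectors. Let $|I\rangle$ and $|II\rangle$ be its eigenvectors defined by

$$\Omega|I\rangle = -\omega_I^2|I\rangle \tag{1.8.30a}$$

$$\Omega|II\rangle = -\omega_{II}^2|II\rangle \tag{1.8.30b}$$

We are departing here from our usual notation: the eigenvalue of $\Omega$ is written as $-\omega^2$ rather than as $\omega$ in anticipation of the fact that $\Omega$ has eigenvalues of the form $-\omega^2$, with $\omega$ real. We are also using the symbols $|I\rangle$ and $|II\rangle$ to denote what should be called $|-\omega_I^2\rangle$ and $|-\omega_{II}^2\rangle$ in our convention.

It is a simple exercise (which you should perform) to solve the eigenvalue problem of $\Omega$ in the $|1\rangle$, $|2\rangle$ basis (in which the matrix elements of $\Omega$ are known) and to obtain

$$\omega_I = \left(\frac{k}{m}\right)^{1/2}, \qquad |I\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix} \tag{1.8.31a}$$

$$\omega_{II} = \left(\frac{3k}{m}\right)^{1/2}, \qquad |II\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} 1 \\ -1 \end{bmatrix} \tag{1.8.31b}$$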
If we now expand the vector $|x(t)\rangle$ in this new basis as

$$|x(t)\rangle = |I\rangle x_I(t) + |II\rangle x_{II}(t) \tag{1.8.32}$$

[in analogy with Eq. (1.8.29)], the components $x_I$ and $x_{II}$ will evolve as follows:

$$\begin{bmatrix} \ddot{x}_I \\ \ddot{x}_{II} \end{bmatrix} = \begin{bmatrix} -\omega_I^2 & 0 \\ 0 & -\omega_{II}^2 \end{bmatrix} \begin{bmatrix} x_I \\ x_{II} \end{bmatrix} \tag{1.8.33}$$

We obtain this equation by rewriting Eq. (1.8.24) in the $|I\rangle$, $|II\rangle$ basis in which $\Omega$ has its eigenvalues as the diagonal entries, and in which $|x\rangle$ has components $x_I$ and $x_{II}$. Alternately we can apply the operator

$$\frac{d^2}{dt^2} - \Omega$$

to both sides of the expansion of Eq. (1.8.32), and get

$$|0\rangle = |I\rangle(\ddot{x}_I + \omega_I^2 x_I) + |II\rangle(\ddot{x}_{II} + \omega_{II}^2 x_{II}) \tag{1.8.34}$$

Since $|I\rangle$ and $|II\rangle$ are orthogonal, each coefficient is zero.
The solution to the decoupled equations

$$\ddot{x}_i + \omega_i^2 x_i = 0, \qquad i = I, II \tag{1.8.35}$$

subject to the condition of vanishing initial velocities, is

$$x_i(t) = x_i(0)\cos\omega_i t, \qquad i = I, II \tag{1.8.36}$$

As anticipated, the components of $|x\rangle$ in the $|I\rangle$, $|II\rangle$ basis obey decoupled equations that can be readily solved. Feeding Eq. (1.8.36) into Eq. (1.8.32) we get

$$|x(t)\rangle = |I\rangle x_I(0)\cos\omega_I t + |II\rangle x_{II}(0)\cos\omega_{II} t \tag{1.8.37a}$$

$$= |I\rangle\langle I|x(0)\rangle\cos\omega_I t + |II\rangle\langle II|x(0)\rangle\cos\omega_{II} t \tag{1.8.37b}$$

Equation (1.8.37) provides the explicit solution to the initial-value problem. It corresponds to the following algorithm for finding $|x(t)\rangle$ given $|x(0)\rangle$.

Step (1). Solve the eigenvalue problem of $\Omega$.
Step (2). Find the coefficients $x_I(0) = \langle I|x(0)\rangle$ and $x_{II}(0) = \langle II|x(0)\rangle$ in the expansion

$$|x(0)\rangle = |I\rangle x_I(0) + |II\rangle x_{II}(0)$$

Step (3). Append to each coefficient $x_i(0)$ ($i = I, II$) a time dependence $\cos\omega_i t$ to get the coefficients in the expansion of $|x(t)\rangle$.
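The three steps can be sketched in a few lines of NumPy. This is an illustrative sketch rather than part of the original text: the function name `evolve` and the default values $k = m = 1$ are assumptions, and zero initial velocities are assumed as in the discussion above.

```python
import numpy as np


def evolve(x0, t, k=1.0, m=1.0):
    """Return [x1(t), x2(t)] for the coupled masses, zero initial velocity.

    The name and default parameters are illustrative assumptions.
    """
    omega_matrix = (k / m) * np.array([[-2.0, 1.0],
                                       [1.0, -2.0]])
    # Step (1): eigenvalue problem. The eigenvalues are -w^2.
    eigenvalues, eigenvectors = np.linalg.eigh(omega_matrix)
    frequencies = np.sqrt(-eigenvalues)
    # Step (2): coefficients <I|x(0)>, <II|x(0)> in the eigenbasis.
    coefficients = eigenvectors.T @ np.asarray(x0, dtype=float)
    # Step (3): append cos(w_i t) and return to the |1>, |2> basis.
    return eigenvectors @ (coefficients * np.cos(frequencies * t))
```

For example, `evolve([1.0, 1.0], t)` should oscillate as $\cos[(k/m)^{1/2}t]$ in both components, since that initial state is proportional to $|I\rangle$.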
Let me now illustrate this algorithm by solving the following (general) initial-value problem: Find the future state of the system given that at $t = 0$ the masses are displaced by $x_1(0)$ and $x_2(0)$.

Step (1). We can ignore this step since the eigenvalue problem has been solved [Eq. (1.8.31)].

Step (2).

$$x_I(0) = \langle I|x(0)\rangle = \frac{1}{2^{1/2}}(1, 1)\begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \frac{x_1(0) + x_2(0)}{2^{1/2}}$$

$$x_{II}(0) = \langle II|x(0)\rangle = \frac{1}{2^{1/2}}(1, -1)\begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \frac{x_1(0) - x_2(0)}{2^{1/2}}$$

Step (3).

$$|x(t)\rangle = |I\rangle\frac{x_1(0) + x_2(0)}{2^{1/2}}\cos\omega_I t + |II\rangle\frac{x_1(0) - x_2(0)}{2^{1/2}}\cos\omega_{II} t$$
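The explicit solution above can be made even more explicit by projecting $|x(t)\rangle$ onto the $|1\rangle$, $|2\rangle$ basis to find $x_1(t)$ and $x_2(t)$, the displacements of the masses. We get (feeding in the explicit formulas for $\omega_I$ and $\omega_{II}$)

$$x_1(t) = \langle 1|x(t)\rangle = \langle 1|I\rangle\frac{x_1(0) + x_2(0)}{2^{1/2}}\cos\left[\left(\frac{k}{m}\right)^{1/2}t\right] + \langle 1|II\rangle\frac{x_1(0) - x_2(0)}{2^{1/2}}\cos\left[\left(\frac{3k}{m}\right)^{1/2}t\right]$$

$$= \frac{1}{2}[x_1(0) + x_2(0)]\cos\left[\left(\frac{k}{m}\right)^{1/2}t\right] + \frac{1}{2}[x_1(0) - x_2(0)]\cos\left[\left(\frac{3k}{m}\right)^{1/2}t\right] \tag{1.8.38a}$$

using the fact that

$$\langle 1|I\rangle = \langle 1|II\rangle = 1/2^{1/2}$$

It can likewise be shown that

$$x_2(t) = \frac{1}{2}[x_1(0) + x_2(0)]\cos\left[\left(\frac{k}{m}\right)^{1/2}t\right] - \frac{1}{2}[x_1(0) - x_2(0)]\cos\left[\left(\frac{3k}{m}\right)^{1/2}t\right] \tag{1.8.38b}$$

We can rewrite Eq. (1.8.38) in matrix form as

$$\begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} = \begin{bmatrix} \dfrac{\cos[(k/m)^{1/2}t] + \cos[(3k/m)^{1/2}t]}{2} & \dfrac{\cos[(k/m)^{1/2}t] - \cos[(3k/m)^{1/2}t]}{2} \\[2ex] \dfrac{\cos[(k/m)^{1/2}t] - \cos[(3k/m)^{1/2}t]}{2} & \dfrac{\cos[(k/m)^{1/2}t] + \cos[(3k/m)^{1/2}t]}{2} \end{bmatrix} \begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} \tag{1.8.39}$$

This completes our determination of the future state of the system given the initial state.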
The Propagator
There are two remarkable features in Eq. (1.8.39):

(1) The final-state vector is obtained from the initial-state vector upon multiplication by a matrix.
(2) This matrix is independent of the initial state. We call this matrix the propagator. Finding the propagator is tantamount to finding the complete solution to the problem, for given any other initial state with displacements $\tilde{x}_1(0)$ and $\tilde{x}_2(0)$, we get $\tilde{x}_1(t)$ and $\tilde{x}_2(t)$ by applying the same matrix to the initial-state vector.

We may view Eq. (1.8.39) as the image in the $|1\rangle$, $|2\rangle$ basis of the abstract relation

$$|x(t)\rangle = U(t)|x(0)\rangle \tag{1.8.40}$$

By comparing this equation with Eq. (1.8.37b), we find the abstract representation of $U$:

$$U(t) = |I\rangle\langle I|\cos\omega_I t + |II\rangle\langle II|\cos\omega_{II} t \tag{1.8.41a}$$

$$= \sum_{i=I}^{II}|i\rangle\langle i|\cos\omega_i t \tag{1.8.41b}$$

You may easily convince yourself that if we take the matrix elements of this operator in the $|1\rangle$, $|2\rangle$ basis, we regain the matrix appearing in Eq. (1.8.39). For example

$$U_{11} = \langle 1|U|1\rangle$$

$$= \langle 1|\left\{|I\rangle\langle I|\cos\left[\left(\frac{k}{m}\right)^{1/2}t\right] + |II\rangle\langle II|\cos\left[\left(\frac{3k}{m}\right)^{1/2}t\right]\right\}|1\rangle$$

$$= \langle 1|I\rangle\langle I|1\rangle\cos\left[\left(\frac{k}{m}\right)^{1/2}t\right] + \langle 1|II\rangle\langle II|1\rangle\cos\left[\left(\frac{3k}{m}\right)^{1/2}t\right]$$

$$= \frac{1}{2}\left\{\cos\left[\left(\frac{k}{m}\right)^{1/2}t\right] + \cos\left[\left(\frac{3k}{m}\right)^{1/2}t\right]\right\}$$

Notice that $U(t)$ [Eq. (1.8.41)] is determined completely by the eigenvectors and eigenvalues of $\Omega$. We may then restate our earlier algorithm as follows. To solve the equation

$$|\ddot{x}\rangle = \Omega|x\rangle$$

(1) Solve the eigenvalue problem of $\Omega$.
(2) Construct the propagator $U$ in terms of the eigenvalues and eigenvectors.
(3) $|x(t)\rangle = U(t)|x(0)\rangle$.
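As a rough numerical illustration of Eq. (1.8.41b), the propagator can be assembled as a sum of projectors weighted by $\cos\omega_i t$. The function name `propagator` and the defaults $k = m = 1$ are assumptions made for this sketch; only the eigenvalues and eigenvectors of $\Omega$ enter.

```python
import numpy as np


def propagator(t, k=1.0, m=1.0):
    """Build U(t) = sum_i |i><i| cos(w_i t) in the |1>, |2> basis.

    Name and defaults are illustrative assumptions.
    """
    omega_matrix = (k / m) * np.array([[-2.0, 1.0],
                                       [1.0, -2.0]])
    # Eigenvalues of Omega are -w_i^2; eigenvectors are the columns.
    eigenvalues, eigenvectors = np.linalg.eigh(omega_matrix)
    U = np.zeros((2, 2))
    for minus_w_squared, ket in zip(eigenvalues, eigenvectors.T):
        w = np.sqrt(-minus_w_squared)
        # np.outer(ket, ket) is the projector |i><i| (real eigenvectors).
        U += np.outer(ket, ket) * np.cos(w * t)
    return U
```

Applying `propagator(t)` to any initial displacement vector then gives the displacements at time `t`, independent of which initial state was chosen.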
The Normal Modes
There are two initial states $|x(0)\rangle$ for which the time evolution is particularly simple. Not surprisingly, these are the eigenkets $|I\rangle$ and $|II\rangle$. Suppose we have $|x(0)\rangle = |I\rangle$. Then the state at time $t$ is

$$|I(t)\rangle \equiv U(t)|I\rangle = (|I\rangle\langle I|\cos\omega_I t + |II\rangle\langle II|\cos\omega_{II} t)|I\rangle = |I\rangle\cos\omega_I t \tag{1.8.42}$$

Thus the system starting off in $|I\rangle$ is only modified by an overall factor $\cos\omega_I t$. A similar remark holds with $I \to II$. These two modes of vibration, in which all (two) components of a vector oscillate in step, are called normal modes.

The physics of the normal modes is clear in the $|1\rangle$, $|2\rangle$ basis. In this basis

$$|I\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

and corresponds to a state in which both masses are displaced by equal amounts. The middle spring is then a mere spectator and each mass oscillates with a frequency $\omega_I = (k/m)^{1/2}$ in response to the end spring nearest to it. Consequently

$$|I(t)\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} \cos[(k/m)^{1/2}t] \\ \cos[(k/m)^{1/2}t] \end{bmatrix}$$

On the other hand, if we start with

$$|II\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} 1 \\ -1 \end{bmatrix}$$

the masses are displaced by equal and opposite amounts. In this case the middle spring is distorted by twice the displacement of each mass. If the masses are adjusted by $\Delta$ and $-\Delta$, respectively, each mass feels a restoring force of $3k\Delta$ ($2k\Delta$ from the middle spring and $k\Delta$ from the end spring nearest to it). Since the effective force constant is $k_{\text{eff}} = 3k\Delta/\Delta = 3k$, the vibrational frequency is $(3k/m)^{1/2}$ and

$$|II(t)\rangle \leftrightarrow \frac{1}{2^{1/2}}\begin{bmatrix} \cos[(3k/m)^{1/2}t] \\ -\cos[(3k/m)^{1/2}t] \end{bmatrix}$$

If the system starts off in a linear combination of $|I\rangle$ and $|II\rangle$ it evolves into the corresponding linear combination of the normal modes $|I(t)\rangle$ and $|II(t)\rangle$. This is the content of the propagator equation

$$|x(t)\rangle = U(t)|x(0)\rangle$$

$$= |I\rangle\langle I|x(0)\rangle\cos\omega_I t + |II\rangle\langle II|x(0)\rangle\cos\omega_{II} t$$

$$= |I(t)\rangle\langle I|x(0)\rangle + |II(t)\rangle\langle II|x(0)\rangle$$

Another way to see the simple evolution of the initial states $|I\rangle$ and $|II\rangle$ is to determine the matrix representing $U$ in the $|I\rangle$, $|II\rangle$ basis:

$$U \underset{|I\rangle, |II\rangle\ \text{basis}}{\longleftrightarrow} \begin{bmatrix} \cos\omega_I t & 0 \\ 0 & \cos\omega_{II} t \end{bmatrix} \tag{1.8.43}$$

You should verify this result by taking the appropriate matrix elements of $U(t)$ in Eq. (1.8.41b). Since each column above is the image of the corresponding basis vectors ($|I\rangle$ or $|II\rangle$) after the action of $U(t)$ (which is to say, after time evolution), we see that the initial states $|I\rangle$ and $|II\rangle$ evolve simply in time.
The central problem in quantum mechanics is very similar to the simple example that we have just discussed. The state of the system is described in quantum theory by a ket $|\psi\rangle$ which obeys the Schrödinger equation

$$i\hbar|\dot{\psi}\rangle = H|\psi\rangle$$

where $\hbar$ is a constant related to Planck's constant $h$ by $\hbar = h/2\pi$, and $H$ is a Hermitian operator called the Hamiltonian. The problem is to find $|\psi(t)\rangle$ given $|\psi(0)\rangle$. [Since the equation is first order in $t$, no assumptions need be made about $|\dot{\psi}(0)\rangle$, which is determined by the Schrödinger equation to be $(-i/\hbar)H|\psi(0)\rangle$.]

In most cases, $H$ is a time-independent operator and the algorithm one follows in solving this initial-value problem is completely analogous to the one we have just seen:

Step (1). Solve the eigenvalue problem of $H$.
Step (2). Find the propagator $U(t)$ in terms of the eigenvectors and eigenvalues of $H$.
Step (3). $|\psi(t)\rangle = U(t)|\psi(0)\rangle$.

You must of course wait till Chapter 4 to find out the physical interpretation of $|\psi\rangle$, the actual form of the operator $H$, and the precise relation between $U(t)$ and the eigenvalues and eigenvectors of $H$.
Exercise 1.8.11. Consider the coupled mass problem discussed above.
(1) Given that the initial state is $|1\rangle$, in which the first mass is displaced by unity and the second is left alone, calculate $|1(t)\rangle$ by following the algorithm.
(2) Compare your result with that following from Eq. (1.8.39).

Exercise 1.8.12. Consider once again the problem discussed in the previous example. (1) Assuming that

$$|\ddot{x}\rangle = \Omega|x\rangle$$

has a solution

$$|x(t)\rangle = U(t)|x(0)\rangle$$

find the differential equation satisfied by $U(t)$. Use the fact that $|x(0)\rangle$ is arbitrary.
(2) Assuming (as is the case) that $\Omega$ and $U$ can be simultaneously diagonalized, solve for the elements of the matrix $U$ in this common basis and regain Eq. (1.8.43). Assume $|\dot{x}(0)\rangle = 0$.
1.9. Functions of Operators and Related Concepts
We have encountered two types of objects that act on vectors: scalars, which commute with each other and with all operators; and operators, which do not generally commute with each other. It is customary to refer to the former as c numbers and the latter as q numbers. Now, we are accustomed to functions of c numbers such as $\sin(x)$, $\log(x)$, etc. We wish to examine the question whether functions of q numbers can be given a sensible meaning. We will restrict ourselves to those functions that can be written as a power series. Consider a series

$$f(x) = \sum_{n=0}^{\infty} a_n x^n \tag{1.9.1}$$

where $x$ is a c number. We define the same function of an operator or q number to be

$$f(\Omega) = \sum_{n=0}^{\infty} a_n \Omega^n \tag{1.9.2}$$

This definition makes sense only if the sum converges to a definite limit. To see what this means, consider a common example:

$$e^{\Omega} = \sum_{n=0}^{\infty} \frac{\Omega^n}{n!} \tag{1.9.3}$$

Let us restrict ourselves to Hermitian $\Omega$. By going to the eigenbasis of $\Omega$ we can readily perform the sum of Eq. (1.9.3). Since

$$\Omega = \begin{bmatrix} \omega_1 & & & \\ & \omega_2 & & \\ & & \ddots & \\ & & & \omega_n \end{bmatrix} \tag{1.9.4}$$

and

$$\Omega^m = \begin{bmatrix} \omega_1^m & & & \\ & \omega_2^m & & \\ & & \ddots & \\ & & & \omega_n^m \end{bmatrix} \tag{1.9.5}$$

$$e^{\Omega} = \begin{bmatrix} \sum_{m=0}^{\infty} \dfrac{\omega_1^m}{m!} & & \\ & \ddots & \\ & & \sum_{m=0}^{\infty} \dfrac{\omega_n^m}{m!} \end{bmatrix} \tag{1.9.6}$$

Since each sum converges to the familiar limit $e^{\omega_i}$, the operator $e^{\Omega}$ is indeed well defined by the power series in this basis (and therefore in any other).
Exercise 1.9.1.* We know that the series

$$f(x) = \sum_{n=0}^{\infty} x^n$$

may be equated to the function $f(x) = (1 - x)^{-1}$ if $|x| < 1$. By going to the eigenbasis, examine when the q number power series

$$f(\Omega) = \sum_{n=0}^{\infty} \Omega^n$$

of a Hermitian operator $\Omega$ may be identified with $(1 - \Omega)^{-1}$.

Exercise 1.9.2.* If $H$ is a Hermitian operator, show that $U = e^{iH}$ is unitary. (Notice the analogy with c numbers: if $\theta$ is real, $u = e^{i\theta}$ is a number of unit modulus.)

Exercise 1.9.3. For the case above, show that $\det U = e^{i\,\mathrm{Tr}\,H}$.
Derivatives of Operators with Respect to Parameters
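Consider next an operator $\theta(\lambda)$ that depends on a parameter $\lambda$. Its derivative with respect to $\lambda$ is defined to be

$$\frac{d\theta(\lambda)}{d\lambda} = \lim_{\Delta\lambda \to 0}\left[\frac{\theta(\lambda + \Delta\lambda) - \theta(\lambda)}{\Delta\lambda}\right]$$

If $\theta(\lambda)$ is written as a matrix in some basis, then the matrix representing $d\theta(\lambda)/d\lambda$ is obtained by differentiating the matrix elements of $\theta(\lambda)$. A special case of $\theta(\lambda)$ we are interested in is

$$\theta(\lambda) = e^{\lambda\Omega}$$

where $\Omega$ is Hermitian. We can show, by going to the eigenbasis of $\Omega$, that

$$\frac{d\theta(\lambda)}{d\lambda} = \Omega e^{\lambda\Omega} = e^{\lambda\Omega}\Omega = \theta(\lambda)\Omega \tag{1.9.7}$$

The same result may be obtained, even if $\Omega$ is not Hermitian, by working with the power series, provided it exists:

$$\frac{d}{d\lambda}\sum_{n=0}^{\infty}\frac{\lambda^n\Omega^n}{n!} = \sum_{n=1}^{\infty}\frac{n\lambda^{n-1}\Omega^n}{n!} = \Omega\sum_{n=1}^{\infty}\frac{\lambda^{n-1}\Omega^{n-1}}{(n-1)!} = \Omega\sum_{m=0}^{\infty}\frac{\lambda^m\Omega^m}{m!} = \Omega e^{\lambda\Omega}$$

Conversely, we can say that if we are confronted with the differential Eq. (1.9.7), its solution is given by

$$\theta(\lambda) = c\exp\left(\int_0^{\lambda}\Omega\,d\lambda'\right) = c\exp(\Omega\lambda)$$

(It is assumed here that the exponential exists.) In the above, $c$ is a constant (operator) of integration. The solution $\theta = e^{\Omega\lambda}$ corresponds to the choice $c = I$.

In all the above operations, we see that $\Omega$ behaves as if it were just a c number. Now, the real difference between c numbers and q numbers is that the latter do not generally commute. However, if only one q number (or powers of it) enters the picture, everything commutes and we can treat them as c numbers. If one remembers this mnemonic, one can save a lot of time.

If, on the other hand, more than one q number is involved, the order of the factors is all important. For example, it is true that

$$e^{\alpha\Omega}e^{\beta\Omega} = e^{(\alpha + \beta)\Omega}$$

as may be verified by a power-series expansion, while it is not true that

$$e^{\alpha\Omega}e^{\beta\theta} = e^{\alpha\Omega + \beta\theta}$$

or that

$$e^{\alpha\Omega}e^{\beta\theta}e^{-\alpha\Omega} = e^{\beta\theta}$$

unless $[\Omega, \theta] = 0$. Likewise, in differentiating a product, the chain rule is

$$\frac{d}{d\lambda}e^{\lambda\Omega}e^{\lambda\theta} = \Omega e^{\lambda\Omega}e^{\lambda\theta} + e^{\lambda\Omega}e^{\lambda\theta}\theta \tag{1.9.8}$$

We are free to move $\Omega$ through $e^{\lambda\Omega}$ and write the first term as

$$e^{\lambda\Omega}\Omega e^{\lambda\theta}$$

but not as

$$e^{\lambda\Omega}e^{\lambda\theta}\Omega$$

unless $[\Omega, \theta] = 0$.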
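The importance of operator ordering is easy to see numerically. The sketch below is an illustration under stated assumptions: the two noncommuting Hermitian matrices (the Pauli matrices $\sigma_x$ and $\sigma_z$) and the values of `alpha` and `beta` are chosen here for convenience and are not taken from the text.

```python
import numpy as np


def exp_hermitian(matrix):
    """Exponential of a Hermitian matrix via its eigenbasis."""
    eigenvalues, U = np.linalg.eigh(matrix)
    return U @ np.diag(np.exp(eigenvalues)) @ U.conj().T


# Two Hermitian matrices that do not commute (assumed for illustration).
sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])
sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])
alpha, beta = 0.3, 0.7

# Only one q number involved: the exponents simply add.
same_operator_ok = np.allclose(
    exp_hermitian(alpha * sigma_x) @ exp_hermitian(beta * sigma_x),
    exp_hermitian((alpha + beta) * sigma_x),
)

# Two noncommuting q numbers: the exponents do not simply add.
different_operators_ok = np.allclose(
    exp_hermitian(alpha * sigma_x) @ exp_hermitian(beta * sigma_z),
    exp_hermitian(alpha * sigma_x + beta * sigma_z),
)
```

Here `same_operator_ok` should be true and `different_operators_ok` false, since the product of the two noncommuting exponentials is not even Hermitian.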
1.10. Generalization to Infinite Dimensions
In all of the preceding discussions, the dimensionality ($n$) of the space was unspecified but assumed to be some finite number. We now consider the generalization of the preceding concepts to infinite dimensions.

Let us begin by getting acquainted with an infinite-dimensional vector. Consider a function defined in some interval, say, $a \le x \le b$. A concrete example is provided by the displacement $f(x, t)$ of a string clamped at $x = 0$ and $x = L$ (Fig. 1.6).

Suppose we want to communicate to a person on the moon the string's displacement $f(x)$, at some time $t$. One simple way is to divide the interval $0$ to $L$ into 20 equal parts, measure the displacement $f(x_i)$ at the 19 points $x = L/20, 2L/20, \ldots, 19L/20$, and transmit the 19 values on the wireless. Given these $f(x_i)$, our friend on the moon will be able to reconstruct the approximate picture of the string shown in Fig. 1.7.

If we wish to be more accurate, we can specify the values of $f(x)$ at a larger number of points. Let us denote by $f_n(x)$ the discrete approximation to $f(x)$ that coincides with it at $n$ points and vanishes in between. Let us now interpret the ordered $n$-tuple $\{f_n(x_1), f_n(x_2), \ldots, f_n(x_n)\}$ as components of a ket $|f_n\rangle$ in a vector space $V^n(R)$:

$$|f_n\rangle \leftrightarrow \begin{bmatrix} f_n(x_1) \\ f_n(x_2) \\ \vdots \\ f_n(x_n) \end{bmatrix} \tag{1.10.1}$$
Figure 1.6. The string is clamped at $x = 0$ and $x = L$. It is free to oscillate in the plane of the paper.

Figure 1.7. The string as reconstructed by the person on the moon.
The basis vectors in this space are

$$|x_i\rangle \leftrightarrow \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{bmatrix} \leftarrow i\text{th place} \tag{1.10.2}$$

corresponding to the discrete function which is unity at $x = x_i$ and zero elsewhere. The basis vectors satisfy

$$\langle x_i|x_j\rangle = \delta_{ij} \quad \text{(orthogonality)} \tag{1.10.3}$$

$$\sum_{i=1}^{n}|x_i\rangle\langle x_i| = I \quad \text{(completeness)} \tag{1.10.4}$$

Try to imagine a space containing $n$ mutually perpendicular axes, one for each point $x_i$. Along each axis is a unit vector $|x_i\rangle$. The function $f_n(x)$ is represented by a vector whose projection along the $i$th direction is $f_n(x_i)$:

$$|f_n\rangle = \sum_{i=1}^{n} f_n(x_i)|x_i\rangle \tag{1.10.5}$$

To every possible discrete approximation $g_n(x)$, $h_n(x)$, etc., there is a corresponding ket $|g_n\rangle$, $|h_n\rangle$, etc., and vice versa. You should convince yourself that if we define vector addition as the addition of the components, and scalar multiplication as the multiplication of each component by the scalar, then the set of all kets representing discrete functions that vanish at $x = 0, L$ and that are specified at $n$ points in between, forms a vector space.

We next define the inner product in this space:

$$\langle f_n|g_n\rangle = \sum_{i=1}^{n} f_n(x_i)g_n(x_i) \tag{1.10.6}$$

Two functions $f_n(x)$ and $g_n(x)$ will be said to be orthogonal if $\langle f_n|g_n\rangle = 0$.
Let us now forget the man on the moon and consider the maximal specification of the string's displacement, by giving its value at every point in the interval $0$ to $L$. In this case $f_\infty(x) \equiv f(x)$ is specified by an ordered infinity of numbers: an $f(x)$ for each point $x$. Each function is now represented by a ket $|f_\infty\rangle$ in an infinite-dimensional vector space and vice versa. Vector addition and scalar multiplication are defined just as before. Consider, however, the inner product. For finite $n$ it was defined as

$$\langle f_n|g_n\rangle = \sum_{i=1}^{n} f_n(x_i)g_n(x_i)$$

in particular

$$\langle f_n|f_n\rangle = \sum_{i=1}^{n}[f_n(x_i)]^2$$

If we now let $n$ go to infinity, so does the sum, for practically any function. What we need is the redefinition of the inner product for finite $n$ in such a way that as $n$ tends to infinity, a smooth limit obtains. The natural choice is of course

$$\langle f_n|g_n\rangle = \sum_{i=1}^{n} f_n(x_i)g_n(x_i)\Delta, \qquad \Delta = L/(n + 1) \tag{1.10.6'}$$

If we now let $n$ go to infinity, we get, by the usual definition of the integral,

$$\langle f|g\rangle = \int_0^L f(x)g(x)\,dx \tag{1.10.7}$$

$$\langle f|f\rangle = \int_0^L f^2(x)\,dx \tag{1.10.8}$$

If we wish to go beyond the instance of the string and consider complex functions of $x$ as well, in some interval $a \le x \le b$, the only modification we need is in the inner product:

$$\langle f|g\rangle = \int_a^b f^*(x)g(x)\,dx \tag{1.10.9}$$
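The passage from the weighted sum of Eq. (1.10.6') to the integral of Eq. (1.10.7) can be watched numerically. The following sketch is illustrative only: the helper name `discrete_inner_product` and the two sample functions (both vanishing at $x = 0$ and $x = L$, with $L = 1$) are assumptions made for this example.

```python
import numpy as np


def discrete_inner_product(f, g, n, length=1.0):
    """Sum of f(x_i) g(x_i) * Delta over n interior points, Delta = L/(n+1)."""
    delta = length / (n + 1)
    x = delta * np.arange(1, n + 1)  # the n interior points x_1 .. x_n
    return float(np.sum(f(x) * g(x)) * delta)


def f(x):
    """Sample string shape, vanishing at both ends (assumed)."""
    return np.sin(np.pi * x)


def g(x):
    """Second sample shape, vanishing at both ends (assumed)."""
    return x * (1.0 - x)


# Exact value of the integral of sin(pi x) * x(1 - x) over [0, 1].
exact = 4.0 / np.pi**3

coarse = discrete_inner_product(f, g, n=19)
fine = discrete_inner_product(f, g, n=1000)
```

As `n` grows, the weighted sum approaches the integral, whereas the unweighted sum of Eq. (1.10.6) would grow without bound.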
What are the basis vectors in this space and how are they normalized? We know that each point $x$ gets a basis vector $|x\rangle$. The orthogonality of two different axes requires that

$$\langle x|x'\rangle = 0, \qquad x \neq x' \tag{1.10.10}$$

What if $x = x'$? Should we require, as in the finite-dimensional case, $\langle x|x\rangle = 1$? The answer is no, and the best way to see it is to deduce the correct normalization. We start with the natural generalization of the completeness relation Eq. (1.10.4) to the case where the kets are labeled by a continuous index $x'$:

$$\int_a^b |x'\rangle\langle x'|\,dx' = I \tag{1.10.11}$$

where, as always, the identity is required to leave each ket unchanged. Dotting both sides of Eq. (1.10.11) with some arbitrary ket $|f\rangle$ from the right and the basis bra $\langle x|$ from the left,

$$\int_a^b \langle x|x'\rangle\langle x'|f\rangle\,dx' = \langle x|I|f\rangle = \langle x|f\rangle \tag{1.10.12}$$

Now, $\langle x|f\rangle$, the projection of $|f\rangle$ along the basis ket $|x\rangle$, is just $f(x)$. Likewise $\langle x'|f\rangle = f(x')$. Let the inner product $\langle x|x'\rangle$ be some unknown function $\delta(x, x')$. Since $\delta(x, x')$ vanishes if $x \neq x'$ we can restrict the integral to an infinitesimal region near $x' = x$ in Eq. (1.10.12):

$$\int_{x-\varepsilon}^{x+\varepsilon}\delta(x, x')f(x')\,dx' = f(x) \tag{1.10.13}$$

In this infinitesimal region, $f(x')$ (for any reasonably smooth $f$) can be approximated by its value at $x' = x$, and pulled out of the integral:

$$f(x)\int_{x-\varepsilon}^{x+\varepsilon}\delta(x, x')\,dx' = f(x) \tag{1.10.14}$$

so that

$$\int_{x-\varepsilon}^{x+\varepsilon}\delta(x, x')\,dx' = 1 \tag{1.10.15}$$

Clearly $\delta(x, x')$ cannot be finite at $x' = x$, for then its integral over an infinitesimal region would also be infinitesimal. In fact $\delta(x, x')$ should be infinite in such a way that its integral is unity. Since $\delta(x, x')$ depends only on the difference $x - x'$, let us write it as $\delta(x - x')$. The "function," $\delta(x - x')$, with the properties

$$\delta(x - x') = 0, \quad x \neq x'; \qquad \int_a^b \delta(x - x')\,dx' = 1, \quad a < x < b \tag{1.10.16}$$

is called the Dirac delta function and fixes the normalization of the basis vectors:

$$\langle x|x'\rangle = \delta(x - x') \tag{1.10.17}$$
It will be needed any time the basis kets are labeled by a continuous index such as $x$. Note that it is defined only in the context of an integration: the integral of the delta function $\delta(x - x')$ with any smooth function $f(x')$ is $f(x)$. One sometimes calls the delta function the sampling function, since it samples the value of the function $f(x')$ at one point‡

$$\int\delta(x - x')f(x')\,dx' = f(x) \tag{1.10.18}$$

Figure 1.8. (a) The Gaussian $g_\Delta$ approaches the delta function as $\Delta \to 0$. (b) Its derivative $(dg/dx)(x - x')$ approaches $\delta'(x - x')$ as $\Delta \to 0$.
The delta function does not look like any function we have seen before, its values being either infinite or zero. It is therefore useful to view it as the limit of a more conventional function. Consider a Gaussian

$$g_\Delta(x - x') = \frac{1}{(\pi\Delta^2)^{1/2}}\exp\left[-\frac{(x - x')^2}{\Delta^2}\right] \tag{1.10.19}$$

as shown in Fig. 1.8a. The Gaussian is centered at $x' = x$, has width $\Delta$, maximum height $(\pi\Delta^2)^{-1/2}$, and unit area, independent of $\Delta$. As $\Delta$ approaches zero, $g_\Delta$ becomes a better and better approximation to the delta function.§
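The sampling property of the Gaussian model can be tested numerically. The sketch below is an illustration only: the function names, the integration window `half_range`, and the grid size are assumptions, and the integral is approximated by a simple Riemann sum.

```python
import numpy as np


def gaussian(x, x_prime, width):
    """The Gaussian g_Delta(x - x') of Eq. (1.10.19)."""
    return np.exp(-((x - x_prime) ** 2) / width**2) / np.sqrt(np.pi * width**2)


def sample(f, x, width, half_range=10.0, points=200001):
    """Approximate the integral of g_Delta(x - x') f(x') over x'.

    half_range and points are illustrative numerical choices.
    """
    x_prime = np.linspace(x - half_range, x + half_range, points)
    step = x_prime[1] - x_prime[0]
    return float(np.sum(gaussian(x, x_prime, width) * f(x_prime)) * step)


x = 0.5
wide = sample(np.cos, x, width=1.0)     # poor approximation to cos(x)
narrow = sample(np.cos, x, width=0.01)  # close to cos(x)
area = sample(np.ones_like, x, width=0.05)  # unit area, independent of width
```

As the width shrinks, the integral approaches $f(x)$, here $\cos(0.5)$, while the area under the Gaussian stays at unity.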
It is obvious from the Gaussian model that the delta function is even. This may be verified as follows:

$$\delta(x - x') = \langle x|x'\rangle = \langle x'|x\rangle^* = \delta(x' - x)^* = \delta(x' - x)$$

since the delta function is real.
Consider next an object that is even more peculiar than the delta function: its derivative with respect to the first argument $x$:

$$\delta'(x - x') = \frac{d}{dx}\delta(x - x') = -\frac{d}{dx'}\delta(x - x') \tag{1.10.20}$$

What is the action of this function under the integral? The clue comes from the Gaussian model. Consider $dg_\Delta(x - x')/dx = -dg_\Delta(x - x')/dx'$ as a function of $x'$. As $g_\Delta$ shrinks, each bump at $\pm\varepsilon$ will become, up to a scale factor, the $\delta$ function. The first one will sample $-f(x - \varepsilon)$ and the second one $+f(x + \varepsilon)$, again up to a scale, so that

$$\int\delta'(x - x')f(x')\,dx' \propto f(x + \varepsilon) - f(x - \varepsilon) = 2\varepsilon\left.\frac{df(x')}{dx'}\right|_{x'=x}$$

The constant of proportionality happens to be $1/2\varepsilon$ so that

$$\int\delta'(x - x')f(x')\,dx' = \left.\frac{df(x')}{dx'}\right|_{x'=x} = \frac{df(x)}{dx} \tag{1.10.21}$$

This result may be verified as follows:

$$\int\delta'(x - x')f(x')\,dx' = \int\frac{d\delta(x - x')}{dx}f(x')\,dx' = \frac{d}{dx}\int\delta(x - x')f(x')\,dx' = \frac{d}{dx}f(x)$$

Note that $\delta'(x - x')$ is an odd function. This should be clear from Fig. 1.8b or Eq. (1.10.20). An equivalent way to describe the action of the $\delta'$ function is by the equation

$$\delta'(x - x') = \delta(x - x')\frac{d}{dx'} \tag{1.10.22}$$

where it is understood that both sides appear in an integral over $x'$ and that the differential operator acts on any function that accompanies the $\delta'$ function in the integrand. In this notation we can describe the action of higher derivatives of the delta function:

$$\frac{d^n\delta(x - x')}{dx^n} = \delta(x - x')\frac{d^n}{dx'^n} \tag{1.10.23}$$

‡ We will often omit the limits of integration if they are unimportant.

§ A fine point that will not concern you till Chapter 8: This formula for the delta function is valid even if $\Delta^2$ is pure imaginary, say, equal to $i\beta^2$. First we see from Eq. (A.2.5) that $g$ has unit area. Consider next the integral of $g$ times $f(x')$ over a region in $x'$ that includes $x$. For the most part, we get zero because $f$ is smooth and $g$ is wildly oscillating as $\beta \to 0$. However, at $x = x'$, the derivative of the phase of $g$ vanishes and the oscillations are suspended. Pulling $f(x' = x)$ out of the integral, we get the desired result.
We will now develop an alternate representation of the delta function. We know
from basic Fourier analysis that, given a function f(x), we may define its transform
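$$
f(k) = \frac{1}{(2\pi)^{1/2}} \int_{-\infty}^{\infty} e^{-ikx} f(x)\, dx \tag{1.10.24}
$$
and its inverse
$$
f(x') = \frac{1}{(2\pi)^{1/2}} \int_{-\infty}^{\infty} e^{ikx'} f(k)\, dk \tag{1.10.25}
$$
Feeding Eq. (1.10.24) into Eq. (1.10.25), we get
$$
f(x') = \int_{-\infty}^{\infty} \left( \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, e^{ik(x' - x)} \right) f(x)\, dx
$$
Comparing this result with Eq. (1.10.18), we see that
$$
\frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, e^{ik(x' - x)} = \delta(x' - x) \tag{1.10.26}
$$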
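Equation (1.10.26) can be made plausible numerically by cutting the $k$ integral off at a large but finite value. The sketch below assumes a cutoff `k_max`, for which the integral evaluates in closed form to $\sin(k_{max} u)/(\pi u)$ with $u = x' - x$; the test function, the cutoff, and the integration grid are illustrative assumptions.

```python
import math

def truncated_delta(u, k_max):
    """(1/2pi) * integral of exp(i k u) dk over [-k_max, k_max], i.e. sin(k_max u) / (pi u)."""
    if u == 0.0:
        return k_max / math.pi
    return math.sin(k_max * u) / (math.pi * u)

def sample_with_kernel(f, x, k_max, lo=-8.0, hi=8.0, n=16000):
    """Midpoint-rule estimate of the integral of truncated_delta(x' - x) f(x') dx'."""
    h = (hi - lo) / n
    total = 0.0
    for j in range(n):
        xp = lo + (j + 0.5) * h
        total += truncated_delta(xp - x, k_max) * f(xp)
    return total * h

def bump(x):
    return math.exp(-x * x)

value = sample_with_kernel(bump, 0.5, k_max=20.0)
print(value, bump(0.5))  # the kernel should pull out f at x' = x
```

For a smooth, rapidly decaying test function the truncated kernel already behaves like $\delta(x' - x)$ at a modest cutoff; the kernel itself does not converge pointwise, only under the integral.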
Exercise 1.10.1.* Show that $\delta(ax) = \delta(x)/|a|$. [Consider $\int \delta(ax)\, d(ax)$. Remember that
$\delta(x) = \delta(-x)$.]
Exercise 1.10.2.* Show that
$$
\delta(f(x)) = \sum_i \frac{\delta(x - x_i)}{|df/dx_i|}
$$
where $x_i$ are the zeros of $f(x)$. Hint: Where does $\delta(f(x))$ blow up? Expand $f(x)$ near such
points in a Taylor series, keeping the first nonzero term.
Exercise 1.10.3.* Consider the theta function $\theta(x - x')$ which vanishes if $x - x'$ is negative
and equals 1 if $x - x'$ is positive. Show that $\delta(x - x') = \frac{d}{dx}\,\theta(x - x')$.
Operators in Infinite Dimensions
Having acquainted ourselves with the elements of this function space, namely,
the kets $|f\rangle$ and the basis vectors $|x\rangle$, let us turn to the (linear) operators that act
on them. Consider the equation
$$
\Omega |f\rangle = |\tilde{f}\rangle
$$
Since the kets are in correspondence with the functions, $\Omega$ takes the function $f(x)$
into another, $\tilde{f}(x)$. Now, one operator that does such a thing is the familiar differential
operator, which, acting on $f(x)$, gives $\tilde{f}(x) = df(x)/dx$. In the function space we
can describe the action of this operator as
$$
D|f\rangle = |df/dx\rangle
$$
where $|df/dx\rangle$ is the ket corresponding to the function $df/dx$. What are the matrix
elements of $D$ in the $|x\rangle$ basis? To find out, we dot both sides of the above equation
with $\langle x|$,
$$
\langle x|D|f\rangle = \left\langle x \,\Big|\, \frac{df}{dx} \right\rangle = \frac{df(x)}{dx}
$$
and insert the resolution of identity at the right place
$$
\int \langle x|D|x'\rangle \langle x'|f\rangle\, dx' = \frac{df}{dx} \tag{1.10.27}
$$
Comparing this to Eq. (1.10.21), we deduce that
$$
\langle x|D|x'\rangle = D_{xx'} = \delta'(x - x') = \delta(x - x')\,\frac{d}{dx'} \tag{1.10.28}
$$
It is worth remembering that $D_{xx'} = \delta'(x - x')$ is to be integrated over the second index
($x'$) and pulls out the derivative of $f$ at the first index ($x$). Some people prefer to
integrate $\delta'(x - x')$ over the first index, in which case it pulls out $-df/dx'$. Our
convention is more natural if one views $D_{xx'}$ as a matrix acting to the right on the
components $f_{x'} \equiv f(x')$ of a vector $|f\rangle$. Thus the familiar differential operator is an
infinite-dimensional matrix with the elements given above. Normally one doesn't
think of $D$ as a matrix for the following reason. Usually when a matrix acts on a
vector, there is a sum over a common index. In fact, Eq. (1.10.27) contains such a
sum over the index $x'$. If, however, we feed into this equation the value of $D_{xx'}$, the
delta function renders the integration trivial:
$$
\int \delta(x - x')\,\frac{d}{dx'} f(x')\, dx' = \left.\frac{df}{dx'}\right|_{x'=x} = \frac{df}{dx}
$$
Thus the action of D is simply to apply d/dx to f(x) with no sum over a common
index in sight. Although we too will drop the integral over the common index
ultimately, we will continue to use it for a while to remind us that D, like all linear
operators, is a matrix.
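The matrix character of $D$ becomes concrete on a finite grid. The sketch below is an illustration under stated assumptions: the continuum $\delta'(x - x')$ is replaced by a central-difference matrix on equally spaced points (a discretization chosen here, not taken from the text), and the end rows are simply left empty.

```python
import math

def derivative_matrix(n, h):
    """Central-difference stand-in for D_{xx'} = delta'(x - x') on an n-point grid of spacing h."""
    D = [[0.0] * n for _ in range(n)]
    for i in range(1, n - 1):          # interior rows only; end rows are left as zeros
        D[i][i + 1] = 1.0 / (2.0 * h)
        D[i][i - 1] = -1.0 / (2.0 * h)
    return D

n, h = 201, 0.01
xs = [i * h for i in range(n)]
f = [math.sin(x) for x in xs]
D = derivative_matrix(n, h)

# The matrix acts on the components f(x'): an explicit sum over the common index x'.
Df = [sum(D[i][j] * f[j] for j in range(n)) for i in range(n)]

max_err = max(abs(Df[i] - math.cos(xs[i])) for i in range(1, n - 1))
print(max_err)  # small compared with 1: the matrix product reproduces df/dx
```

Away from the end rows this matrix is real and antisymmetric, the discrete counterpart of the statement that $\delta'(x - x')$ is an odd function.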
Let us now ask if D is Hermitian and examine its eigenvalue problem. If D were
Hermitian, we would have
$$
D_{xx'} = D^*_{x'x}
$$
But this is not the case:
$$
D_{xx'} = \delta'(x - x')
$$
while
$$
D^*_{x'x} = \delta'(x' - x)^* = \delta'(x' - x) = -\delta'(x - x')
$$
But we can easily convert $D$ to a Hermitian matrix by multiplying it with a pure
imaginary number. Consider
$$
K = -iD
$$
which satisfies
$$
K^*_{x'x} = [-i\delta'(x' - x)]^* = +i\delta'(x' - x) = -i\delta'(x - x') = K_{xx'}
$$
It turns out that despite the above, the operator K is not guaranteed to be Hermitian,
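as the following analysis will indicate. Let $|f\rangle$ and $|g\rangle$ be two kets in the function
space, whose images in the $X$ basis are two functions $f(x)$ and $g(x)$ in the interval
$a \le x \le b$. If $K$ is Hermitian, it must also satisfy
$$
\langle g|K|f\rangle = \langle g|Kf\rangle = \langle Kf|g\rangle^* = \langle f|K^\dagger|g\rangle^* = \langle f|K|g\rangle^*
$$
So we ask
$$
\int_a^b\!\!\int_a^b \langle g|x\rangle \langle x|K|x'\rangle \langle x'|f\rangle\, dx\, dx' \overset{?}{=} \left( \int_a^b\!\!\int_a^b \langle f|x\rangle \langle x|K|x'\rangle \langle x'|g\rangle\, dx\, dx' \right)^*
$$
$$
\int_a^b g^*(x) \left[ -i\frac{df(x)}{dx} \right] dx \overset{?}{=} \left\{ \int_a^b f^*(x) \left[ -i\frac{dg(x)}{dx} \right] dx \right\}^* = i \int_a^b \frac{dg^*}{dx}\, f(x)\, dx
$$
Integrating the left-hand side by parts gives
$$
-ig^*(x) f(x) \Big|_a^b + i \int_a^b \frac{dg^*(x)}{dx}\, f(x)\, dx
$$
So $K$ is Hermitian only if the surface term vanishes:
$$
-ig^*(x) f(x) \Big|_a^b = 0 \tag{1.10.29}
$$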
In contrast to the finite-dimensional case, $K_{xx'} = K^*_{x'x}$ is not a sufficient condition for
$K$ to be Hermitian. One also needs to look at the behavior of the functions at the
end points $a$ and $b$. Thus $K$ is Hermitian if the space consists of functions that
obey Eq. (1.10.29). One set of functions that obey this condition are the possible
configurations $f(x)$ of the string clamped at $x = 0, L$, since $f(x)$ vanishes at the end
points. But condition (1.10.29) can also be fulfilled in another way. Consider
functions in our own three-dimensional space, parametrized by $r$, $\theta$, and $\phi$ ($\phi$ is the
angle measured around the $z$ axis). Let us require that these functions be single-valued.
In particular, if we start at a certain point and go once around the $z$ axis,
returning to the original point, the function must take on its original value, i.e.,
$$
f(\phi) = f(\phi + 2\pi)
$$
In the space of such periodic functions, $K = -i\, d/d\phi$ is a Hermitian operator. The
surface term vanishes because the contribution from one extremity cancels that from
the other:
$$
-ig^*(\phi) f(\phi) \Big|_0^{2\pi} = -i[g^*(2\pi) f(2\pi) - g^*(0) f(0)] = 0
$$
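The role of the surface term can be seen in a small numerical experiment. The sketch below, with illustrative test functions and a midpoint-rule integrator that are assumptions of this example, compares $\langle g|K|f\rangle - \langle f|K|g\rangle^*$ with the surface term of Eq. (1.10.29) for a generic pair of functions and for a periodic pair.

```python
import cmath

def integrate(func, a, b, n=2000):
    """Midpoint-rule integral of a (possibly complex) function over [a, b]."""
    h = (b - a) / n
    return sum(func(a + (j + 0.5) * h) for j in range(n)) * h

def hermiticity_defect(f, df, g, dg, a, b):
    """<g|K|f> - <f|K|g>* for K = -i d/dx, with the derivatives df, dg supplied analytically."""
    lhs = integrate(lambda x: complex(g(x)).conjugate() * (-1j) * df(x), a, b)
    rhs = integrate(lambda x: complex(f(x)).conjugate() * (-1j) * dg(x), a, b)
    return lhs - rhs.conjugate()

def surface_term(f, g, a, b):
    """-i g*(x) f(x) evaluated between a and b, as in Eq. (1.10.29)."""
    return -1j * (complex(g(b)).conjugate() * f(b) - complex(g(a)).conjugate() * f(a))

# Generic functions on [0, 1]: the surface term survives, so K is not Hermitian here.
poly = hermiticity_defect(lambda x: x + 1, lambda x: 1.0,
                          lambda x: x * x, lambda x: 2 * x, 0.0, 1.0)
poly_surface = surface_term(lambda x: x + 1, lambda x: x * x, 0.0, 1.0)

# Periodic functions on [0, 2 pi]: the two end-point contributions cancel.
periodic = hermiticity_defect(
    lambda p: cmath.exp(1j * p), lambda p: 1j * cmath.exp(1j * p),
    lambda p: cmath.exp(2j * p), lambda p: 2j * cmath.exp(2j * p),
    0.0, 2 * cmath.pi)

print(poly, poly_surface, periodic)
```

In the first case the defect should equal the surface term; in the second it should vanish to within rounding error.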
In the study of quantum mechanics, we will be interested in functions defined
over the full interval $-\infty < x < +\infty$. They fall into two classes, those that vanish as
$|x| \to \infty$, and those that do not, the latter behaving as $e^{ikx}$, $k$ being a real parameter
that labels these functions. It is clear that $K = -i\, d/dx$ is Hermitian when sandwiched
between two functions of the first class or a function from each, since in either case
the surface term vanishes. When sandwiched between two functions of the second
class, the Hermiticity hinges on whether
$$
e^{ikx} e^{-ik'x} \Big|_{-\infty}^{\infty} \overset{?}{=} 0
$$
If $k = k'$, the contribution from one end cancels that from the other. If $k \ne k'$, the
answer is unclear since $e^{i(k - k')x}$ oscillates, rather than approaching a limit as $|x| \to \infty$.
Now, there exists a way of defining a limit for such functions that cannot make up
their minds: the limit as $|x| \to \infty$ is defined to be the average over a large interval.
According to this prescription, we have, say as $x \to \infty$,
$$
\lim_{x \to \infty} e^{ikx} e^{-ik'x} = \lim_{\substack{L \to \infty \\ \Delta \to \infty}} \frac{1}{\Delta} \int_L^{L + \Delta} e^{i(k - k')x}\, dx = 0 \quad \text{if } k \ne k'
$$
and so $K$ is Hermitian in this space.
We now turn to the eigenvalue problem of $K$. The task seems very formidable
indeed, for we have now to find the roots of an infinite-order characteristic polynomial
and get the corresponding eigenvectors. It turns out to be quite simple and
you might have done it a few times in the past without giving yourself due credit.
Let us begin with
$$
K|k\rangle = k|k\rangle \tag{1.10.30}
$$
Following the standard procedure,
$$
\langle x|K|k\rangle = k\langle x|k\rangle
$$
$$
\int \langle x|K|x'\rangle \langle x'|k\rangle\, dx' = k\,\psi_k(x)
$$
$$
-i\frac{d}{dx}\,\psi_k(x) = k\,\psi_k(x) \tag{1.10.31}
$$
where by definition $\psi_k(x) = \langle x|k\rangle$. This equation could have been written directly
had we made the immediate substitution $K = -i\, d/dx$ in the $X$ basis. From now on
we shall resort to this shortcut unless there are good reasons for not doing so.
The solution to the above equation is simply
$$
\psi_k(x) = A e^{ikx} \tag{1.10.32}
$$
where $A$, the overall scale, is a free parameter, unspecified by the eigenvalue problem.
So the eigenvalue problem of $K$ is fully solved: any real number $k$ is an eigenvalue,
and the corresponding eigenfunction is given by $A e^{ikx}$. As usual, the freedom in
scale will be used to normalize the solution. We choose $A$ to be $(1/2\pi)^{1/2}$ so that
$$
|k\rangle \leftrightarrow \frac{1}{(2\pi)^{1/2}}\, e^{ikx}
$$
and
$$
\langle k|k'\rangle = \int_{-\infty}^{\infty} \langle k|x\rangle \langle x|k'\rangle\, dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i(k - k')x}\, dx = \delta(k - k') \tag{1.10.33}
$$
(Since $\langle k|k\rangle$ is infinite, no choice of $A$ can normalize $|k\rangle$ to unity. The delta function
normalization is the natural one when the eigenvalue spectrum is continuous.)
The attentive reader may have a question at this point.
"Why was it assumed that the eigenvalue $k$ was real? It is clear that the function
$A e^{ikx}$ with $k = k_1 + ik_2$ also satisfies Eq. (1.10.31)."
The answer is, yes, there are eigenfunctions of $K$ with complex eigenvalues. If,
however, our space includes such functions, $K$ must be classified a non-Hermitian
operator. (The surface term no longer vanishes since $e^{ikx}$ blows up exponentially as
$x$ tends to either $+\infty$ or $-\infty$, depending on the sign of the imaginary part $k_2$.) In
restricting ourselves to real $k$ we have restricted ourselves to what we will call the
physical Hilbert space, which is of interest in quantum mechanics. This space is
defined as the space of functions that can be either normalized to unity or to the
Dirac delta function and plays a central role in quantum mechanics. (We use the
qualifier "physical" to distinguish it from the Hilbert space as defined by mathematicians,
which contains only proper vectors, i.e., vectors normalizable to unity. The
role of the improper vectors in quantum theory will be clear later.)
We will assume that the theorem proved for finite dimensions, namely, that the
eigenfunctions of a Hermitian operator form a complete basis, holds in the Hilbert†
space. (The trouble with infinite-dimensional spaces is that even if you have an
infinite number of orthonormal eigenvectors, you can never be sure you have them
all, since adding or subtracting a few still leaves you with an infinite number of
them.)
Since K is a Hermitian operator, functions that were expanded in the X basis
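with components $f(x) = \langle x|f\rangle$ must also have an expansion in the $K$ basis. To find
the components, we start with a ket $|f\rangle$, and do the following:
$$
f(k) = \langle k|f\rangle = \int_{-\infty}^{\infty} \langle k|x\rangle \langle x|f\rangle\, dx = \frac{1}{(2\pi)^{1/2}} \int_{-\infty}^{\infty} e^{-ikx} f(x)\, dx \tag{1.10.34}
$$
The passage back to the $X$ basis is done as follows:
$$
f(x) = \langle x|f\rangle = \int_{-\infty}^{\infty} \langle x|k\rangle \langle k|f\rangle\, dk = \frac{1}{(2\pi)^{1/2}} \int_{-\infty}^{\infty} e^{ikx} f(k)\, dk \tag{1.10.35}
$$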
Thus the familiar Fourier transform is just the passage from one complete basis $|x\rangle$
to another, $|k\rangle$. Either basis may be used to expand functions that belong to the
Hilbert space.
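The two passages, Eqs. (1.10.34) and (1.10.35), can be tried out on a function whose transform is known. The sketch below assumes the Gaussian $f(x) = e^{-x^2/2}$, whose components in the $K$ basis are $f(k) = e^{-k^2/2}$; the helper `change_basis`, the integration window, and the grid are illustrative choices.

```python
import cmath
import math

def change_basis(func, target, sign, lo=-8.0, hi=8.0, n=400):
    """(2 pi)^(-1/2) * integral of exp(sign * i * target * s) func(s) ds, by the midpoint rule."""
    h = (hi - lo) / n
    total = 0j
    for j in range(n):
        s = lo + (j + 0.5) * h
        total += cmath.exp(sign * 1j * target * s) * func(s)
    return total * h / math.sqrt(2.0 * math.pi)

def f_x(x):
    """Components of the ket in the X basis (an assumed Gaussian)."""
    return math.exp(-0.5 * x * x)

def f_k(k):
    """Components in the K basis, Eq. (1.10.34)."""
    return change_basis(f_x, k, sign=-1)

def f_x_again(x):
    """Back to the X basis, Eq. (1.10.35), using the numerically computed f(k)."""
    return change_basis(f_k, x, sign=+1)

print(f_k(1.3), math.exp(-0.5 * 1.3**2))
print(f_x_again(0.5), f_x(0.5))
```

The round trip returns the original components, which is what completeness of the $|k\rangle$ basis asserts.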
The matrix elements of $K$ are trivial in the $K$ basis:
$$
\langle k|K|k'\rangle = k'\langle k|k'\rangle = k'\,\delta(k - k') \tag{1.10.36}
$$
Now, we know where the $K$ basis came from: it was generated by the Hermitian
operator $K$. Which operator is responsible for the orthonormal $X$ basis? Let us call
it the operator $X$. The kets $|x\rangle$ are its eigenvectors with eigenvalue $x$:
$$
X|x\rangle = x|x\rangle \tag{1.10.37}
$$
Its matrix elements in the $X$ basis are
$$
\langle x'|X|x\rangle = x\,\delta(x' - x) \tag{1.10.38}
$$
To find its action on functions, let us begin with
$$
X|f\rangle = |\tilde{f}\rangle
$$
and follow the routine:
$$
\langle x|X|f\rangle = \int \langle x|X|x'\rangle \langle x'|f\rangle\, dx' = x f(x) = \langle x|\tilde{f}\rangle = \tilde{f}(x)
$$
$$
\therefore\quad \tilde{f}(x) = x f(x)
$$
Thus the effect of $X$ is to multiply $f(x)$ by $x$. As in the case of the $K$ operator, one
generally suppresses the integral over the common index since it is rendered trivial
by the delta function. We can summarize the action of $X$ in Hilbert space as
$$
X|f(x)\rangle = |x f(x)\rangle \tag{1.10.39}
$$
where as usual $|x f(x)\rangle$ is the ket corresponding to the function $x f(x)$.
† Hereafter we will omit the qualifier "physical."
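There is a nice reciprocity between the $X$ and $K$ operators which manifests itself
if we compute the matrix elements of $X$ in the $K$ basis:
$$
\langle k|X|k'\rangle = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ikx}\, x\, e^{ik'x}\, dx
= +i\frac{d}{dk} \left( \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i(k' - k)x}\, dx \right) = i\delta'(k - k')
$$
Thus if $|g(k)\rangle$ is a ket whose image in the $k$ basis is $g(k)$, then
$$
X|g(k)\rangle = \left| \, i\frac{dg(k)}{dk} \right\rangle \tag{1.10.40}
$$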
In summary then, in the X basis, X acts as x and K as —id/dx [on the functions
f(x)], while in the K basis, K acts like k and X like i d/dk [on f(k)]. Operators with
such an interrelationship are said to be conjugate to each other.
The conjugate operators X and K do not commute. Their commutator may be
calculated as follows. Let us operate X and K in both possible orders on some ket
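$|f\rangle$ and follow the action in the $X$ basis:
$$
X|f\rangle \to x f(x)
$$
$$
K|f\rangle \to -i\frac{df(x)}{dx}
$$
So
$$
XK|f\rangle \to -ix\frac{df(x)}{dx}
$$
$$
KX|f\rangle \to -i\frac{d}{dx}\,[x f(x)]
$$
Therefore
$$
[X, K]|f\rangle \to -ix\frac{df}{dx} + ix\frac{df}{dx} + if = if \to iI|f\rangle
$$
(Footnote to the matrix elements of $X$ in the $K$ basis: in the last step we have used the fact that $\delta(k' - k) = \delta(k - k')$.)
Since $|f\rangle$ is an arbitrary ket, we now have the desired result:
$$
[X, K] = iI \tag{1.10.41}
$$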
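The commutator can also be checked by letting both orderings act on a sample function. In the sketch below, $K$ is approximated by a central difference with step `H` and the test function is a Gaussian; both are assumptions made for illustration only.

```python
import math

H = 1e-5  # finite-difference step (an illustrative choice)

def K(f):
    """K -> -i d/dx in the X basis, with the derivative taken by central difference."""
    return lambda x: -1j * (f(x + H) - f(x - H)) / (2.0 * H)

def X(f):
    """X -> multiplication by x in the X basis."""
    return lambda x: x * f(x)

def commutator(f):
    """[X, K] acting on f, evaluated pointwise."""
    return lambda x: X(K(f))(x) - K(X(f))(x)

def sample(x):
    return math.exp(-x * x)

x0 = 0.7
ratio = commutator(sample)(x0) / sample(x0)
print(ratio)  # should be close to 1j, i.e. [X, K] f = i f
```

The ratio does not depend on the point or on the function chosen, which is the content of the operator statement $[X, K] = iI$.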
This brings us to the end of our discussion on Hilbert space, except for a final
example. Although there are many other operators one can study in this space, we
restricted ourselves to X and K since almost all the operators we will need for
quantum mechanics are functions of $X$ and $P = \hbar K$, where $\hbar$ is a constant to be
defined later.
Example 1.10.1: A Normal Mode Problem in Hilbert Space. Consider a string
of length $L$ clamped at its two ends $x = 0$ and $L$. The displacement $\psi(x, t)$ obeys the
differential equation
$$
\frac{\partial^2 \psi}{\partial t^2} = \frac{\partial^2 \psi}{\partial x^2} \tag{1.10.42}
$$
Given that at $t = 0$ the displacement is $\psi(x, 0)$ and the velocity $\dot{\psi}(x, 0) = 0$, we wish
to determine the time evolution of the string.
But for the change in dimensionality, the problem is identical to that of the
two coupled masses encountered at the end of Section 1.8 [see Eq. (1.8.26)]. It is
recommended that you go over that example once to refresh your memory before
proceeding further.
We first identify $\psi(x, t)$ as components of a vector $|\psi(t)\rangle$ in a Hilbert space,
the elements of which are in correspondence with possible displacements $\psi$, i.e.,
functions that are continuous in the interval $0 \le x \le L$ and vanish at the end points.
You may verify that these functions do form a vector space.
The analog of the operator $\Omega$ in Eq. (1.8.26) is the operator $\partial^2/\partial x^2$. We recognize
this to be minus the square of the operator $K \leftrightarrow -i\,\partial/\partial x$. Since $K$ acts on a space in
which $\psi(0) = \psi(L) = 0$, it is Hermitian, and so is $K^2$. Equation (1.10.42) has the
abstract counterpart
$$
|\ddot{\psi}(t)\rangle = -K^2 |\psi(t)\rangle \tag{1.10.43}
$$
We solve the initial-value problem by following the algorithm developed in Example
1.8.6:
Step (1). Solve the eigenvalue problem of $-K^2$.
Step (2). Construct the propagator $U(t)$ in terms of the eigenvectors and
eigenvalues.
Step (3).
$$
|\psi(t)\rangle = U(t)|\psi(0)\rangle \tag{1.10.44}
$$
The equation to solve is
$$
K^2 |\psi\rangle = k^2 |\psi\rangle \tag{1.10.45}
$$
In the $X$ basis, this becomes
$$
-\frac{d^2}{dx^2}\,\psi_k(x) = k^2\, \psi_k(x) \tag{1.10.46}
$$
the general solution to which is
$$
\psi_k(x) = A \cos kx + B \sin kx \tag{1.10.47}
$$
where $A$ and $B$ are arbitrary. However, not all these solutions lie in the Hilbert space
we are considering. We want only those that vanish at $x = 0$ and $x = L$. At $x = 0$ we
find
$$
\psi_k(0) = 0 = A \tag{1.10.48a}
$$
while at $x = L$ we find
$$
0 = B \sin kL \tag{1.10.48b}
$$
If we do not want a trivial solution ($A = B = 0$) we must demand
$$
\sin kL = 0, \qquad kL = m\pi, \qquad m = 1, 2, 3, \ldots \tag{1.10.49}
$$
We do not consider negative $m$ since it doesn't lead to any further linearly independent (LI) solutions
[$\sin(-x) = -\sin x$]. The allowed eigenvectors thus form a discrete set labeled by an
integer $m$:
$$
\psi_m(x) = \left( \frac{2}{L} \right)^{1/2} \sin\left( \frac{m\pi x}{L} \right) \tag{1.10.50}
$$
where we have chosen $B = (2/L)^{1/2}$ so that
$$
\int_0^L \psi_m(x)\, \psi_{m'}(x)\, dx = \delta_{mm'} \tag{1.10.51}
$$
Let us associate with each solution labeled by the integer $m$ an abstract ket $|m\rangle$:
$$
|m\rangle \xrightarrow[\ X \text{ basis}\ ]{} \left( \frac{2}{L} \right)^{1/2} \sin\left( \frac{m\pi x}{L} \right) \tag{1.10.52}
$$
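If we project $|\psi(t)\rangle$ on the $|m\rangle$ basis, in which $K^2$ is diagonal with eigenvalues
$(m\pi/L)^2$, the components $\langle m|\psi(t)\rangle$ will obey the decoupled equations
$$
\frac{d^2}{dt^2} \langle m|\psi(t)\rangle = -\left( \frac{m^2\pi^2}{L^2} \right) \langle m|\psi(t)\rangle, \qquad m = 1, 2, \ldots \tag{1.10.53}
$$
in analogy with Eq. (1.8.33). These equations may be readily solved (subject to the
condition of vanishing initial velocities) as
$$
\langle m|\psi(t)\rangle = \langle m|\psi(0)\rangle \cos\left( \frac{m\pi t}{L} \right) \tag{1.10.54}
$$
Consequently
$$
|\psi(t)\rangle = \sum_{m=1}^{\infty} |m\rangle \langle m|\psi(t)\rangle
= \sum_{m=1}^{\infty} |m\rangle \langle m|\psi(0)\rangle \cos\omega_m t, \qquad \omega_m = \frac{m\pi}{L} \tag{1.10.55}
$$
or
$$
U(t) = \sum_{m=1}^{\infty} |m\rangle \langle m| \cos\omega_m t, \qquad \omega_m = \frac{m\pi}{L} \tag{1.10.56}
$$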
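The propagator equation
$$
|\psi(t)\rangle = U(t)|\psi(0)\rangle
$$
becomes in the $|x\rangle$ basis
$$
\langle x|\psi(t)\rangle = \psi(x, t) = \langle x|U(t)|\psi(0)\rangle = \int_0^L \langle x|U(t)|x'\rangle \langle x'|\psi(0)\rangle\, dx' \tag{1.10.57}
$$
It follows from Eq. (1.10.56) that
$$
\langle x|U(t)|x'\rangle = \sum_m \langle x|m\rangle \langle m|x'\rangle \cos\omega_m t
= \sum_m \left( \frac{2}{L} \right) \sin\left( \frac{m\pi x}{L} \right) \sin\left( \frac{m\pi x'}{L} \right) \cos\omega_m t \tag{1.10.58}
$$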
Thus, given any $\psi(x', 0)$, we can get $\psi(x, t)$ by performing the integral in Eq.
(1.10.57), using $\langle x|U(t)|x'\rangle$ from Eq. (1.10.58). If the propagator language seems
too abstract, we can begin with Eq. (1.10.55). Dotting both sides with $\langle x|$, we get
$$
\psi(x, t) = \sum_{m=1}^{\infty} \langle x|m\rangle \langle m|\psi(0)\rangle \cos\omega_m t
= \sum_{m=1}^{\infty} \left( \frac{2}{L} \right)^{1/2} \sin\left( \frac{m\pi x}{L} \right) \cos\omega_m t\, \langle m|\psi(0)\rangle \tag{1.10.59}
$$
Given $|\psi(0)\rangle$, one must then compute
$$
\langle m|\psi(0)\rangle = \left( \frac{2}{L} \right)^{1/2} \int_0^L \sin\left( \frac{m\pi x}{L} \right) \psi(x, 0)\, dx
$$
Usually we will find that the coefficients $\langle m|\psi(0)\rangle$ fall rapidly with $m$ so that a few
leading terms may suffice to get a good approximation. □
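The three steps of the example translate directly into a short computation. The sketch below assumes $L = 1$, an initial shape $\psi(x, 0) = x(L - x)$ chosen only for illustration, a midpoint-rule estimate of the coefficients $\langle m|\psi(0)\rangle$, and a truncation of the sum in Eq. (1.10.59) at 50 modes.

```python
import math

def mode(m, x, L):
    """<x|m> = (2/L)^(1/2) sin(m pi x / L), Eq. (1.10.52)."""
    return math.sqrt(2.0 / L) * math.sin(m * math.pi * x / L)

def coefficient(m, psi0, L, n=2000):
    """<m|psi(0)> as a midpoint-rule integral over 0 <= x <= L."""
    h = L / n
    return sum(mode(m, (j + 0.5) * h, L) * psi0((j + 0.5) * h) for j in range(n)) * h

def evolve(psi0, x, t, L=1.0, modes=50):
    """psi(x, t) from Eq. (1.10.59), truncated at a finite number of modes."""
    total = 0.0
    for m in range(1, modes + 1):
        omega = m * math.pi / L
        total += mode(m, x, L) * coefficient(m, psi0, L) * math.cos(omega * t)
    return total

def initial_shape(x):
    return x * (1.0 - x)

print(evolve(initial_shape, 0.3, 0.0))  # reproduces psi(x, 0) = 0.21 at x = 0.3
print(evolve(initial_shape, 0.3, 1.0))  # half a period later the symmetric shape is inverted
print(evolve(initial_shape, 0.3, 2.0))  # after t = 2L every mode has returned to its start
```

Because every $\omega_m$ is a multiple of $\pi/L$, the motion repeats with period $2L$, which the last line checks.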
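Exercise 1.10.4. A string is displaced as follows at $t = 0$:
$$
\psi(x, 0) =
\begin{cases}
\dfrac{2xh}{L}, & 0 \le x \le \dfrac{L}{2} \\[2ex]
\dfrac{2h}{L}(L - x), & \dfrac{L}{2} \le x \le L
\end{cases}
$$
Show that
$$
\psi(x, t) = \sum_{m=1}^{\infty} \sin\left( \frac{m\pi x}{L} \right) \cos\omega_m t \cdot \left( \frac{8h}{\pi^2 m^2} \right) \sin\left( \frac{\pi m}{2} \right)
$$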
Review of Classical Mechanics
In this chapter we will develop the Lagrangian and Hamiltonian formulations of
mechanics starting from Newton's laws. These subsequent reformulations of mechanics
bring with them a great deal of elegance and computational ease. But our principal
interest in them stems from the fact that they are the ideal springboards from which
to make the leap to quantum mechanics. The passage from the Lagrangian formulation
to quantum mechanics was carried out by Feynman in his path integral formalism.
A more common route to quantum mechanics, which we will follow for the
most part, has as its starting point the Hamiltonian formulation, and it was discovered
mainly by Schrödinger, Heisenberg, Dirac, and Born.
It should be emphasized, and it will soon become apparent, that all three formulations
of mechanics are essentially the same theory, in that their domains of validity
and predictions are identical. Nonetheless, in a given context, one or the other may
be more inviting for conceptual, computational, or simply aesthetic reasons.
2.1. The Principle of Least Action and Lagrangian Mechanics
Let us take as our prototype of the Newtonian scheme a point particle of mass
$m$ moving along the $x$ axis under a potential $V(x)$. According to Newton's Second
Law,
$$
m\frac{d^2x}{dt^2} = -\frac{dV}{dx} \tag{2.1.1}
$$
If we are given the initial state variables, the position $x(t_i)$ and velocity $\dot{x}(t_i)$, we
can calculate the classical trajectory $x_{cl}(t)$ as follows. Using the initial velocity and
acceleration [obtained from Eq. (2.1.1)] we compute the position and velocity at a
time $t_i + \Delta t$. For example,
$$
x_{cl}(t_i + \Delta t) = x(t_i) + \dot{x}(t_i)\,\Delta t
$$
Having updated the state variables to the time $t_i + \Delta t$, we can repeat the process
again to inch forward to $t_i + 2\Delta t$ and so on.
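The step-by-step procedure just described can be written out in a few lines. The sketch below assumes a harmonic potential $V(x) = \frac{1}{2}kx^2$ with $m = k = 1$, for which the exact trajectory starting from rest at $x = 1$ is $\cos t$; the step size and the function name `step_trajectory` are illustrative.

```python
import math

def step_trajectory(x, v, force, mass, dt, steps):
    """Advance (x, v) by repeating x -> x + v dt, v -> v + a dt, with a = F(x) / m from Eq. (2.1.1)."""
    for _ in range(steps):
        a = force(x) / mass
        x, v = x + v * dt, v + a * dt
    return x, v

def spring_force(x):
    """-dV/dx for the assumed potential V = x**2 / 2."""
    return -x

x_final, v_final = step_trajectory(1.0, 0.0, spring_force, mass=1.0, dt=1e-4, steps=10000)
print(x_final, math.cos(1.0))    # position at t = 1
print(v_final, -math.sin(1.0))   # velocity at t = 1
```

The agreement improves as $\Delta t$ is reduced; this simple update is only first-order accurate, so it is meant to illustrate the logic of the Newtonian scheme rather than to serve as a practical integrator.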
Figure 2.1. The Lagrangian formalism asks what distinguishes the actual path $x_{cl}(t)$ taken by the particle from all possible paths connecting the end points $(x_i, t_i)$ and $(x_f, t_f)$.
The equation of motion being second order in time, two pieces of data, $x(t_i)$
and $\dot{x}(t_i)$, are needed to specify a unique $x_{cl}(t)$. An equivalent way to do the same,
and one that we will have occasion to employ, is to specify two space-time points
$(x_i, t_i)$ and $(x_f, t_f)$ on the trajectory.
The above scheme readily generalizes to more than one particle and more than
one dimension. If we use $n$ Cartesian coordinates $(x_1, x_2, \ldots, x_n)$ to specify the
positions of the particles, the spatial configuration of the system may be visualized
as a point in an $n$-dimensional configuration space. (The term "configuration space"
is used even if the $n$ coordinates are not Cartesian.) The motion of the representative
point is given by
$$
m_j \frac{d^2 x_j}{dt^2} = -\frac{\partial V}{\partial x_j} \tag{2.1.2}
$$
where $m_j$ stands for the mass of the particle whose coordinate is $x_j$. These equations
can be integrated step by step, just as before, to determine the trajectory.
In the Lagrangian formalism, the problem of a single particle in a potential
$V(x)$ is posed in a different way: given that the particle is at $x_i$ and $x_f$ at times $t_i$ and
$t_f$, respectively, what is it that distinguishes the actual trajectory $x_{cl}(t)$ from all other
trajectories or paths that connect these points? (See Fig. 2.1.)
The Lagrangian approach is thus global, in that it tries to determine at one
stroke the entire trajectory $x_{cl}(t)$, in contrast to the local approach of the Newtonian
scheme, which concerns itself with what the particle is going to do in the next
infinitesimal time interval.
The answer to the question posed above comes in three parts:
(1) Define a function $\mathcal{L}$, called the Lagrangian, given by $\mathcal{L} = T - V$, $T$ and $V$
being the kinetic and potential energies of the particle. Thus $\mathcal{L} = \mathcal{L}(x, \dot{x}, t)$. The
explicit $t$ dependence may arise if the particle is in an external time-dependent field.
We will, however, assume the absence of this $t$ dependence.
(2) For each path $x(t)$ connecting $(x_i, t_i)$ and $(x_f, t_f)$, calculate the action
$S[x(t)]$ defined by
$$
S[x(t)] = \int_{t_i}^{t_f} \mathcal{L}(x, \dot{x})\, dt \tag{2.1.3}
$$
Figure 2.2. If $x_{cl}(t)$ minimizes $S$, then $\delta S^{(1)} = 0$ if we go to any nearby path $x_{cl}(t) + \eta(t)$.
We use square brackets to enclose the argument of $S$ to remind us that the function
$S$ depends on an entire path or function $x(t)$, and not just the value of $x$ at some
time $t$. One calls $S$ a functional to signify that it is a function of a function.
(3) The classical path is one on which $S$ is a minimum. (Actually we will only
require that it be an extremum. It is, however, customary to refer to this condition
as the principle of least action.)
We will now verify that this principle reproduces Newton's Second Law.
The first step is to realize that a functional $S[x(t)]$ is just a function of $n$ variables
as $n \to \infty$. In other words, the function $x(t)$ simply specifies an infinite number of
values $x(t_i), \ldots, x(t), \ldots, x(t_f)$, one for each instant in time $t$ in the interval
$t_i \le t \le t_f$, and $S$ is a function of these variables. To find its minimum we simply
generalize the procedure for the finite $n$ case. Let us recall that if $f = f(x_1, \ldots, x_n) = f(\mathbf{x})$,
the minimum $\mathbf{x}^0$ is characterized by the fact that if we move away from it by
a small amount $\boldsymbol{\eta}$ in any direction, the first-order change $\delta f^{(1)}$ in $f$ vanishes. That
is, if we make a Taylor expansion:
$$
f(\mathbf{x}^0 + \boldsymbol{\eta}) = f(\mathbf{x}^0) + \sum_{i=1}^{n} \left.\frac{\partial f}{\partial x_i}\right|_{\mathbf{x}^0} \eta_i + \text{higher-order terms in } \eta \tag{2.1.4}
$$
then
$$
\delta f^{(1)} \equiv \sum_{i=1}^{n} \left.\frac{\partial f}{\partial x_i}\right|_{\mathbf{x}^0} \eta_i = 0 \tag{2.1.5}
$$
From this condition we can deduce an equivalent and perhaps more familiar
expression of the minimum condition: every first-order partial derivative vanishes at
$\mathbf{x}^0$. To prove this, for say, $\partial f/\partial x_i$, we simply choose $\boldsymbol{\eta}$ to be along the $i$th direction.
Thus
$$
\left.\frac{\partial f}{\partial x_i}\right|_{\mathbf{x}^0} = 0, \qquad i = 1, \ldots, n \tag{2.1.6}
$$
Let us now mimic this procedure for the action $S$. Let $x_{cl}(t)$ be the path of least
action and $x_{cl}(t) + \eta(t)$ a "nearby" path (see Fig. 2.2). The requirement that all
paths coincide at $t_i$ and $t_f$ means
$$
\eta(t_i) = \eta(t_f) = 0 \tag{2.1.7}
$$
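Now
$$
\begin{aligned}
S[x_{cl}(t) + \eta(t)] &= \int_{t_i}^{t_f} \mathcal{L}\big(x_{cl}(t) + \eta(t);\ \dot{x}_{cl}(t) + \dot{\eta}(t)\big)\, dt \\
&= \int_{t_i}^{t_f} \left[ \mathcal{L}\big(x_{cl}(t), \dot{x}_{cl}(t)\big) + \left.\frac{\partial \mathcal{L}}{\partial x(t)}\right|_{x_{cl}} \eta(t) + \left.\frac{\partial \mathcal{L}}{\partial \dot{x}(t)}\right|_{x_{cl}} \dot{\eta}(t) + \cdots \right] dt \\
&= S[x_{cl}(t)] + \delta S^{(1)} + \text{higher-order terms}
\end{aligned}
$$
We set $\delta S^{(1)} = 0$ in analogy with the finite variable case:
$$
0 = \delta S^{(1)} = \int_{t_i}^{t_f} \left[ \left.\frac{\partial \mathcal{L}}{\partial x(t)}\right|_{x_{cl}} \eta(t) + \left.\frac{\partial \mathcal{L}}{\partial \dot{x}(t)}\right|_{x_{cl}} \dot{\eta}(t) \right] dt
$$
If we integrate the second term by parts, it turns into
$$
\left.\frac{\partial \mathcal{L}}{\partial \dot{x}(t)}\right|_{x_{cl}} \eta(t)\, \Big|_{t_i}^{t_f} - \int_{t_i}^{t_f} \frac{d}{dt} \left[ \frac{\partial \mathcal{L}}{\partial \dot{x}(t)} \right]_{x_{cl}} \eta(t)\, dt
$$
The first of these terms vanishes due to Eq. (2.1.7). So that
$$
0 = \delta S^{(1)} = \int_{t_i}^{t_f} \left[ \frac{\partial \mathcal{L}}{\partial x(t)} - \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot{x}(t)} \right]_{x_{cl}} \eta(t)\, dt \tag{2.1.8}
$$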
Note that the condition $\delta S^{(1)} = 0$ implies that $S$ is extremized and not necessarily
minimized. We shall, however, continue the tradition of referring to this extremum
as the minimum. This equation is the analog of Eq. (2.1.5): the discrete variable $\eta_i$
is replaced by $\eta(t)$; the sum over $i$ is replaced by an integral over $t$, and $\partial f/\partial x_i$ is
replaced by
$$
\frac{\partial \mathcal{L}}{\partial x(t)} - \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot{x}(t)}
$$
There are two terms here playing the role of $\partial f/\partial x_i$ since $\mathcal{L}$ (or equivalently $S$) has
both explicit and implicit (through the $\dot{x}$ terms) dependence on $x(t)$. Since $\eta(t)$ is
arbitrary, we may extract the analog of Eq. (2.1.6):
$$
\left\{ \frac{\partial \mathcal{L}}{\partial x(t)} - \frac{d}{dt} \left[ \frac{\partial \mathcal{L}}{\partial \dot{x}(t)} \right] \right\}_{x_{cl}(t)} = 0 \quad \text{for } t_i \le t \le t_f \tag{2.1.9}
$$
To deduce this result for some specific time $t_0$, we simply choose an $\eta(t)$ that vanishes
everywhere except in an infinitesimal region around $t_0$.
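Equation (2.1.9) is the celebrated Euler-Lagrange equation. If we feed into it
$\mathcal{L} = T - V$, $T = \frac{1}{2} m \dot{x}^2$, $V = V(x)$, we get
$$
\frac{\partial \mathcal{L}}{\partial \dot{x}} = \frac{\partial T}{\partial \dot{x}} = m\dot{x}
$$
and
$$
\frac{\partial \mathcal{L}}{\partial x} = -\frac{\partial V}{\partial x}
$$
so that the Euler-Lagrange equation becomes just
$$
\frac{d}{dt}(m\dot{x}) = -\frac{\partial V}{\partial x}
$$
which is just Newton's Second Law, Eq. (2.1.1).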
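The statement that the classical path extremizes $S$ can be tested on a discretized action. The sketch below assumes a particle with $m = 1$ in a uniform field, $V(x) = mgx$, travelling from $x = 0$ at $t = 0$ back to $x = 0$ at $t = 1$, so that $x_{cl}(t) = \frac{1}{2} g\, t(1 - t)$; the deviation $\eta(t) = \epsilon \sin \pi t$ and the discretization of the integral are illustrative choices.

```python
import math

G = 9.8  # strength of the assumed uniform field

def action(path, dt, mass, potential):
    """Discretized S: sum over segments of (T - V) dt, with V taken at the segment midpoint."""
    total = 0.0
    for x0, x1 in zip(path, path[1:]):
        velocity = (x1 - x0) / dt
        total += (0.5 * mass * velocity**2 - potential(0.5 * (x0 + x1))) * dt
    return total

def trial_path(eps, n):
    """x_cl(t) + eta(t) sampled at n + 1 times, with eta(t) = eps * sin(pi t) vanishing at both ends."""
    times = [j / n for j in range(n + 1)]
    return [0.5 * G * t * (1.0 - t) + eps * math.sin(math.pi * t) for t in times]

def S(eps, n=1000):
    return action(trial_path(eps, n), 1.0 / n, 1.0, lambda x: G * x)

rise_plus = S(0.1) - S(0.0)
rise_minus = S(-0.1) - S(0.0)
print(rise_plus, rise_minus)  # equal and positive: no term linear in eps survives
```

Deviations of either sign raise the action by the same amount, so the first-order change vanishes on $x_{cl}(t)$; for this potential the extremum happens to be a minimum.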
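If we consider a system described by $n$ Cartesian coordinates, the same procedure
yields
$$
\frac{d}{dt} \left( \frac{\partial \mathcal{L}}{\partial \dot{x}_i} \right) = \frac{\partial \mathcal{L}}{\partial x_i} \qquad (i = 1, \ldots, n) \tag{2.1.10}
$$
Now
$$
T = \frac{1}{2} \sum_{i=1}^{n} m_i \dot{x}_i^2
$$
and
$$
V = V(x_1, \ldots, x_n)
$$
so that Eq. (2.1.10) becomes
$$
\frac{d}{dt}(m_i \dot{x}_i) = -\frac{\partial V}{\partial x_i}
$$
which is identical to Eq. (2.1.2). Thus the minimum (action) principle indeed reproduces
Newtonian mechanics if we choose $\mathcal{L} = T - V$.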
Notice that we have assumed that $V$ is velocity-independent in the above proof.
An important force, that of a magnetic field $\mathbf{B}$ on a moving charge, is excluded by
this restriction, since $\mathbf{F}_B = q\mathbf{v} \times \mathbf{B}$, $q$ being the charge of the particle and $\mathbf{v} = \dot{\mathbf{r}}$ its
velocity. We will show shortly that this force too may be accommodated in the
Lagrangian formalism, in the sense that we can find an $\mathcal{L}$ that yields the correct
force law when Eq. (2.1.10) is employed. But this $\mathcal{L}$ no longer has the form $T - V$.
One therefore frees oneself from the notion that $\mathcal{L} = T - V$, and views $\mathcal{L}$ as some
function $\mathcal{L}(x_i, \dot{x}_i)$ which yields the correct Newtonian dynamics when fed into the
Euler-Lagrange equations. To the reader who wonders why one bothers to even
deal with a Lagrangian when all it does is yield Newtonian force laws in the end, I
present a few of its main attractions besides its closeness to quantum mechanics.
These will then be illustrated by means of an example.
(1) In the Lagrangian scheme one has merely to construct a single scalar $\mathcal{L}$
and all the equations of motion follow by simple differentiation. This must be contrasted
with the Newtonian scheme, which deals with vectors and is thus more
complicated.
(2) The Euler-Lagrange equations (2.1.10) have the same form if we use, instead
of the $n$ Cartesian coordinates $x_1, \ldots, x_n$, any general set of $n$ independent coordinates
$q_1, q_2, \ldots, q_n$. To remind us of this fact we will rewrite Eq. (2.1.10) as
$$
\frac{d}{dt} \left( \frac{\partial \mathcal{L}}{\partial \dot{q}_i} \right) = \frac{\partial \mathcal{L}}{\partial q_i} \tag{2.1.11}
$$
One can either verify this by brute force, making a change of variables in Eq. (2.1.10)
and seeing that an identical equation with $x_i$ replaced by $q_i$ follows, or one can simply
go through our derivation of the minimum action condition and see that nowhere
were the coordinates assumed to be Cartesian. Of course, at the next stage, in showing
that the Euler-Lagrange equations were equivalent to Newton's, Cartesian coordinates
were used, for in these coordinates the kinetic energy $T$ and the Newtonian
equations have simple forms. But once the principle of least action is seen to generate
the correct dynamics, we can forget all about Newton's laws and use Eq. (2.1.11)
as the equations of motion. What is being emphasized is that these equations, which
express the condition for least action, are form invariant under an arbitrary change
of coordinates. This form invariance must be contrasted with the Newtonian equation
(2.1.2), which presumes that the $x_i$ are Cartesian. If one trades the $x_i$ for another
non-Cartesian set of $q_i$, Eq. (2.1.2) will have a different form (see Example 2.1.1 at
the end of this section).
Equation (2.1.11) can be made to resemble Newton's Second Law if one defines
a quantity
$$
p_i = \frac{\partial \mathcal{L}}{\partial \dot{q}_i} \tag{2.1.12}
$$
called the canonical momentum conjugate to $q_i$ and the quantity
$$
F_i = \frac{\partial \mathcal{L}}{\partial q_i} \tag{2.1.13}
$$
called the generalized force conjugate to $q_i$. Although the rate of change of the
canonical momentum equals the generalized force, one must remember that neither
is $p_i$ always a linear momentum (mass times velocity or "mv" momentum), nor is $F_i$
always a force (with dimensions of mass times acceleration). For example, if $q_i$ is an
angle $\theta$, $p_i$ will be an angular momentum and $F_i$ a torque.
(3) Conservation laws are easily obtained in this formalism. Suppose the Lagrangian
depends on a certain velocity $\dot{q}_i$ but not on the corresponding coordinate $q_i$.
The latter is then called a cyclic coordinate. It follows that the corresponding $p_i$ is
conserved:
$$
\frac{d}{dt} \left( \frac{\partial \mathcal{L}}{\partial \dot{q}_i} \right) = \frac{dp_i}{dt} = \frac{\partial \mathcal{L}}{\partial q_i} = 0 \tag{2.1.14}
$$
Although Newton's Second Law, Eq. (2.1.2), also tells us that if a Cartesian coordinate
$x_i$ is cyclic, the corresponding momentum $m_i \dot{x}_i$ is conserved, Eq. (2.1.14) is more
general. Consider, for example, a potential $V(x, y)$ in two dimensions that depends
only upon $\rho = (x^2 + y^2)^{1/2}$, and not on the polar angle $\phi$, so that $V(\rho, \phi) = V(\rho)$. It
follows that $\phi$ is a cyclic coordinate, as $T$ depends only on $\dot{\phi}$ (see Example 2.1.1
below). Consequently $\partial \mathcal{L}/\partial \dot{\phi} = p_\phi$ is conserved. In contrast, no obvious conservation
law arises from the Cartesian Eqs. (2.1.2) since neither $x$ nor $y$ is cyclic. If one
rewrites Newton's laws in polar coordinates to exploit $\partial V/\partial \phi = 0$, the corresponding
equations get complicated due to centrifugal and Coriolis terms. It is the Lagrangian
formalism that allows us to choose coordinates that best reflect the symmetry of the
potential, without altering the simple form of the equations.
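Example 2.1.1. We now illustrate the above points through an example. Consider
a particle moving in a plane. The Lagrangian, in Cartesian coordinates, is
$$
\mathcal{L} = \tfrac{1}{2} m(\dot{x}^2 + \dot{y}^2) - V(x, y)
= \tfrac{1}{2} m \mathbf{v} \cdot \mathbf{v} - V(x, y) \tag{2.1.15}
$$
where $\mathbf{v}$ is the velocity of the particle, with $\mathbf{v} = \dot{\mathbf{r}}$, $\mathbf{r}$ being its position vector. The
corresponding equations of motion are
$$
m\ddot{x} = -\frac{\partial V}{\partial x} \tag{2.1.16}
$$
$$
m\ddot{y} = -\frac{\partial V}{\partial y} \tag{2.1.17}
$$
which are identical to Newton's laws. If one wants to get the same Newton's laws
in terms of polar coordinates $\rho$ and $\phi$, some careful vector analysis is needed to
unearth the centrifugal and Coriolis terms:
$$
m\ddot{\rho} = -\frac{\partial V}{\partial \rho} + m\rho(\dot{\phi})^2 \tag{2.1.18}
$$
$$
m\ddot{\phi} = -\frac{1}{\rho^2} \frac{\partial V}{\partial \phi} - \frac{2m\dot{\rho}\dot{\phi}}{\rho} \tag{2.1.19}
$$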
Figure 2.3. Points (1) and (2) are positions of the particle at times differing by $\Delta t$.
Notice the difference in form between Eqs. (2.1.16) and (2.1.17) on the one hand
and Eqs. (2.1.18) and (2.1.19) on the other.
In the Lagrangian scheme one has only to recompute $\mathcal{L}$ in polar coordinates.
From Fig. 2.3 it is clear that the distance traveled by the particle in time $\Delta t$ is
$$
dS = [(d\rho)^2 + (\rho\, d\phi)^2]^{1/2}
$$
so that the magnitude of velocity is
$$
v = \frac{dS}{dt} = [\dot{\rho}^2 + \rho^2 \dot{\phi}^2]^{1/2}
$$
and
$$
\mathcal{L} = \tfrac{1}{2} m(\dot{\rho}^2 + \rho^2 \dot{\phi}^2) - V(\rho, \phi) \tag{2.1.20}
$$
(Notice that in these coordinates $T$ involves not just the velocities $\dot{\rho}$ and $\dot{\phi}$, but also
the coordinate $\rho$. This does not happen in Cartesian coordinates.) The equations of
motion generated by this $\mathcal{L}$ are
$$
\frac{d}{dt}(m\dot{\rho}) = -\frac{\partial V}{\partial \rho} + m\rho\dot{\phi}^2 \tag{2.1.21}
$$
$$
\frac{d}{dt}(m\rho^2 \dot{\phi}) = -\frac{\partial V}{\partial \phi} \tag{2.1.22}
$$
which are the same as Eqs. (2.1.18) and (2.1.19). In Eq. (2.1.22) the canonical
momentum $p_\phi = m\rho^2\dot{\phi}$ is the angular momentum and the generalized force $-\partial V/\partial \phi$
is the torque, both along the $z$ axis. Notice how easily the centrifugal and Coriolis
forces came out.
Finally, if $V(\rho, \phi) = V(\rho)$, the conservation of $p_\phi$ is obvious in Eq. (2.1.22).
The conservation of $p_\phi$ follows from Eq. (2.1.19) only after some manipulations and
is practically invisible in Eqs. (2.1.16) and (2.1.17). Both the conserved quantity and
its conservation law arise naturally in the Lagrangian scheme. □
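The conservation of $p_\phi$ can be watched numerically even when the motion is integrated in Cartesian coordinates, where it is not manifest. The sketch below assumes $m = 1$ and the potential $V(\rho) = \rho^4/4$, chosen only because it is independent of $\phi$; the velocity-Verlet update and the initial conditions are likewise illustrative assumptions.

```python
def central_force(x, y):
    """Cartesian components of -grad V for the assumed potential V(rho) = rho**4 / 4."""
    rho_sq = x * x + y * y
    return -rho_sq * x, -rho_sq * y

def angular_momentum(m, x, y, vx, vy):
    """p_phi = m rho^2 phi_dot, written in Cartesian variables as m (x vy - y vx)."""
    return m * (x * vy - y * vx)

def evolve(m, x, y, vx, vy, dt, steps):
    """Integrate Eqs. (2.1.16) and (2.1.17) with a velocity-Verlet update."""
    fx, fy = central_force(x, y)
    for _ in range(steps):
        vx += 0.5 * dt * fx / m
        vy += 0.5 * dt * fy / m
        x += dt * vx
        y += dt * vy
        fx, fy = central_force(x, y)
        vx += 0.5 * dt * fx / m
        vy += 0.5 * dt * fy / m
    return x, y, vx, vy

start = (1.0, 0.0, 0.3, 0.8)  # x, y, vx, vy
end = evolve(1.0, *start, dt=1e-3, steps=5000)
p_start = angular_momentum(1.0, *start)
p_end = angular_momentum(1.0, *end)
print(p_start, p_end)  # p_phi is unchanged although x, y, vx, vy all vary
```

Neither Cartesian momentum is conserved along this trajectory, while the combination $m(x\dot{y} - y\dot{x})$ stays fixed, as Eq. (2.1.22) predicts for a $\phi$-independent potential.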
Exercise 2.1.1.* Consider the following system, called a harmonic oscillator. The block
has a mass $m$ and lies on a frictionless surface. The spring has a force constant $k$.
Write the Lagrangian and get the equations of motion.
Exercise 2.1.2.* Do the same for the coupled-mass problem discussed at the end of
Section 1.8. Compare the equations of motion with Eqs. (1.8.24) and (1.8.25).
Exercise 2.1.3.* A particle of mass $m$ moves in three dimensions under a potential
$V(r, \theta, \phi) = V(r)$. Write its $\mathcal{L}$ and find the equations of motion.
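2.2. The Electromagnetic Lagrangian‡
Recall that the force on a charge $q$ due to an electric field $\mathbf{E}$ and magnetic field
$\mathbf{B}$ is given by
$$
\mathbf{F} = q\left( \mathbf{E} + \frac{\mathbf{v}}{c} \times \mathbf{B} \right) \tag{2.2.1}
$$
where $\mathbf{v} = \dot{\mathbf{r}}$ is the velocity of the particle. Since the force is velocity-dependent, we
must analyze the problem afresh, not relying on the preceding discussion, which was
restricted to velocity-independent forces.
Now it turns out that if we use
$$
\mathcal{L}_{em} = \tfrac{1}{2} m \mathbf{v} \cdot \mathbf{v} - q\phi + \frac{q}{c}\, \mathbf{v} \cdot \mathbf{A} \tag{2.2.2}
$$
we get the correct electromagnetic force laws. In Eq. (2.2.2) $c$ is the velocity of light,
while $\phi$ and $\mathbf{A}$ are the scalar and vector potentials related to $\mathbf{E}$ and $\mathbf{B}$ via
$$
\mathbf{E} = -\nabla\phi - \frac{1}{c} \frac{\partial \mathbf{A}}{\partial t} \tag{2.2.3}
$$
and
$$
\mathbf{B} = \nabla \times \mathbf{A} \tag{2.2.4}
$$
‡ See Section 18.4 for a review of classical electromagnetism.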
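The Euler-Lagrange equations corresponding to $\mathcal{L}_{\rm em}$ are
$$\frac{d}{dt}\left(m v_i + \frac{q}{c}A_i\right) = -q\frac{\partial\phi}{\partial x_i} + \frac{q}{c}\frac{\partial(\mathbf{v}\cdot\mathbf{A})}{\partial x_i}, \qquad i = 1, 2, 3 \tag{2.2.5}$$
Combining the three equations above into a single vector equation we get
$$\frac{d}{dt}\left(m\mathbf{v} + \frac{q\mathbf{A}}{c}\right) = -q\nabla\phi + \frac{q}{c}\nabla(\mathbf{v}\cdot\mathbf{A}) \tag{2.2.6}$$
The canonical momentum is
$$\mathbf{p} = m\mathbf{v} + \frac{q\mathbf{A}}{c} \tag{2.2.7}$$
Rewriting Eq. (2.2.6), we get
$$\frac{d}{dt}(m\mathbf{v}) = -q\nabla\phi + \frac{q}{c}\left[-\frac{d\mathbf{A}}{dt} + \nabla(\mathbf{v}\cdot\mathbf{A})\right] \tag{2.2.8}$$
Now, the total derivative $d\mathbf{A}/dt$ has two parts: an explicit time dependence $\partial\mathbf{A}/\partial t$,
plus an implicit one $(\mathbf{v}\cdot\nabla)\mathbf{A}$ which represents the fact that a spatial variation in $\mathbf{A}$
will appear as a temporal variation to the moving particle. Now Eq. (2.2.8) becomes
$$\frac{d}{dt}(m\mathbf{v}) = -q\nabla\phi - \frac{q}{c}\frac{\partial\mathbf{A}}{\partial t} + \frac{q}{c}\left[\nabla(\mathbf{v}\cdot\mathbf{A}) - (\mathbf{v}\cdot\nabla)\mathbf{A}\right] \tag{2.2.9}$$
which is identical to Eq. (2.2.1) by virtue of the identity
$$\mathbf{v}\times(\nabla\times\mathbf{A}) = \nabla(\mathbf{v}\cdot\mathbf{A}) - (\mathbf{v}\cdot\nabla)\mathbf{A}$$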
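The vector identity used in the last step can be spot-checked numerically. The sketch below is only an illustration: the field `A` is an arbitrary polynomial chosen for the test (it does not come from the text), the derivatives are taken by central differences, and in the gradient term the velocity is held fixed, as in the derivation above.

```python
# Finite-difference check of  v x (curl A) = grad(v.A) - (v.grad)A  at one point.
# The field A below is an arbitrary illustrative choice, not one from the text.

def A(r):
    x, y, z = r
    return (y * z, x * x, x * y + z * z)

def jacobian(r, h=1e-4):
    """Return J with J[k][i] = dA_i/dx_k, using central differences."""
    J = []
    for k in range(3):
        rp, rm = list(r), list(r)
        rp[k] += h
        rm[k] -= h
        J.append([(a - b) / (2 * h) for a, b in zip(A(rp), A(rm))])
    return J

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lhs(v, r):
    """v x (curl A)."""
    J = jacobian(r)
    curl = (J[1][2] - J[2][1], J[2][0] - J[0][2], J[0][1] - J[1][0])
    return cross(v, curl)

def rhs(v, r):
    """grad(v.A) - (v.grad)A, with v treated as a constant vector."""
    J = jacobian(r)
    grad_vA = [sum(v[i] * J[k][i] for i in range(3)) for k in range(3)]
    v_grad_A = [sum(v[k] * J[k][i] for k in range(3)) for i in range(3)]
    return tuple(g - d for g, d in zip(grad_vA, v_grad_A))
```

Calling `lhs(v, r)` and `rhs(v, r)` with the same arguments should give matching triples up to rounding error.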
Notice that $\mathcal{L}_{\rm em}$ is not of the form $T - V$, for the quantity $U = q\phi - (q/c)\mathbf{v}\cdot\mathbf{A}$
(sometimes called the generalized potential) cannot be interpreted as the potential
energy of the charged particle. First of all, the force due to a time-dependent
electromagnetic field is not generally conservative and does not admit a path-independent
work function to play the role of a potential. Even in the special cases when the
force is conservative, only $q\phi$ can be interpreted as the electrical potential energy.
The $[q(\mathbf{v}\cdot\mathbf{A})/c]$ term is not a magnetic potential energy, since the magnetic force
$\mathbf{F}_B = q(\mathbf{v}\times\mathbf{B})/c$ never does any work, being always perpendicular to the velocity. To
accommodate forces such as the electromagnetic, we must, therefore, redefine $\mathcal{L}$ to
be that function $\mathcal{L}(q, \dot{q}, t)$ which, when fed into the Euler-Lagrange equations,
reproduces the correct dynamics. The rule $\mathcal{L} = T - V$ becomes just a useful mnemonic
for the case of conservative forces.
Figure 2.4. The relation between $\mathbf{r}_1$, $\mathbf{r}_2$ and $\mathbf{r}_{\rm CM}$, $\mathbf{r}$.
2.3. The Two-Body Problem
We discuss here a class of problems that plays a central role in classical physics:
that of two masses $m_1$ and $m_2$ exerting equal and opposite forces on each other.
Since the particles are responding to each other and nothing external, it follows that
the potential between them depends only on the relative coordinate $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ and
not the individual positions $\mathbf{r}_1$ and $\mathbf{r}_2$. But $V(\mathbf{r}_1, \mathbf{r}_2) = V(\mathbf{r}_1 - \mathbf{r}_2)$ means in turn that
there are three cyclic coordinates, for V depends on only three variables rather than
the possible six. (In Cartesian coordinates, since T is a function only of velocities, a
coordinate missing in V is also cyclic.) The corresponding conserved momenta will
of course be the three components of the total momentum, which are conserved in
the absence of external forces. To bring out these features, it is better to trade $\mathbf{r}_1$
and $\mathbf{r}_2$ in favor of
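$$\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2 \tag{2.3.1}$$
and
$$\mathbf{r}_{\rm CM} = \frac{m_1\mathbf{r}_1 + m_2\mathbf{r}_2}{m_1 + m_2} \tag{2.3.2}$$
where $\mathbf{r}_{\rm CM}$ is called the center-of-mass (CM) coordinate. One can invert Eqs. (2.3.1)
and (2.3.2) to get (see Fig. 2.4)
$$\mathbf{r}_1 = \mathbf{r}_{\rm CM} + \frac{m_2\mathbf{r}}{m_1 + m_2} \tag{2.3.3}$$
$$\mathbf{r}_2 = \mathbf{r}_{\rm CM} - \frac{m_1\mathbf{r}}{m_1 + m_2} \tag{2.3.4}$$
If one rewrites the Lagrangian
$$\mathcal{L} = \tfrac{1}{2}m_1|\dot{\mathbf{r}}_1|^2 + \tfrac{1}{2}m_2|\dot{\mathbf{r}}_2|^2 - V(\mathbf{r}_1 - \mathbf{r}_2) \tag{2.3.5}$$
in terms of $\mathbf{r}_{\rm CM}$ and $\mathbf{r}$, one gets
$$\mathcal{L} = \tfrac{1}{2}(m_1 + m_2)|\dot{\mathbf{r}}_{\rm CM}|^2 + \tfrac{1}{2}\frac{m_1 m_2}{m_1 + m_2}|\dot{\mathbf{r}}|^2 - V(\mathbf{r}) \tag{2.3.6}$$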
The main features of Eq. (2.3.6) are the following.
(1) The problem of two mutually interacting particles has been transformed to
that of two fictitious particles that do not interact with each other. In other words,
the equations of motion for $\mathbf{r}$ do not involve $\mathbf{r}_{\rm CM}$ and vice versa, because
$\mathcal{L}(\mathbf{r}, \dot{\mathbf{r}}; \mathbf{r}_{\rm CM}, \dot{\mathbf{r}}_{\rm CM}) = \mathcal{L}(\mathbf{r}, \dot{\mathbf{r}}) + \mathcal{L}(\mathbf{r}_{\rm CM}, \dot{\mathbf{r}}_{\rm CM})$.
(2) The first fictitious particle is the CM, of mass $M = m_1 + m_2$. Since $\mathbf{r}_{\rm CM}$ is a
cyclic variable, the momentum $\mathbf{p}_{\rm CM} = M\dot{\mathbf{r}}_{\rm CM}$ (which is just the total momentum) is
conserved as expected. Since the motion of the CM is uninteresting one usually
ignores it. One clear way to do this is to go to the CM frame in which $\dot{\mathbf{r}}_{\rm CM} = 0$, so
that the CM is completely eliminated in the Lagrangian.
(3) The second fictitious particle has mass $\mu = m_1 m_2/(m_1 + m_2)$ (called the
reduced mass), momentum $\mathbf{p} = \mu\dot{\mathbf{r}}$, and moves under a potential $V(\mathbf{r})$. One has just to
solve this one-body problem. If one chooses, one may easily return to the coordinates
$\mathbf{r}_1$ and $\mathbf{r}_2$ at the end, using Eqs. (2.3.1) and (2.3.2).
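The separation of the kinetic energy into a CM part and a relative part can be checked numerically. The following is a minimal sketch with hypothetical helper names (`to_cm`, `from_cm`, and so on); because the change of variables is linear and time-independent, the velocities transform in the same way as the coordinates, which is the only assumption used.

```python
# Numerical check that the kinetic terms of Eq. (2.3.5) and Eq. (2.3.6) agree.
# All function names here are illustrative, not from the text.

def to_cm(m1, m2, r1, r2):
    """Eqs. (2.3.1)-(2.3.2): return (r_cm, r) from (r1, r2)."""
    M = m1 + m2
    r = tuple(a - b for a, b in zip(r1, r2))
    r_cm = tuple((m1 * a + m2 * b) / M for a, b in zip(r1, r2))
    return r_cm, r

def from_cm(m1, m2, r_cm, r):
    """Eqs. (2.3.3)-(2.3.4): return (r1, r2) from (r_cm, r)."""
    M = m1 + m2
    r1 = tuple(c + m2 * d / M for c, d in zip(r_cm, r))
    r2 = tuple(c - m1 * d / M for c, d in zip(r_cm, r))
    return r1, r2

def norm2(v):
    return sum(c * c for c in v)

def kinetic_particles(m1, m2, v1, v2):
    return 0.5 * m1 * norm2(v1) + 0.5 * m2 * norm2(v2)

def kinetic_cm(m1, m2, v1, v2):
    # The map is linear, so velocities transform like coordinates.
    v_cm, v = to_cm(m1, m2, v1, v2)
    mu = m1 * m2 / (m1 + m2)          # reduced mass
    return 0.5 * (m1 + m2) * norm2(v_cm) + 0.5 * mu * norm2(v)
```

For any masses and velocities, `kinetic_particles` and `kinetic_cm` should return the same number up to rounding.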
Exercise 2.3.1.* Derive Eq. (2.3.6) from (2.3.5) by changing variables.
2.4. How Smart Is a Particle?
The Lagrangian formalism seems to ascribe to a particle a tremendous amount
of foresight: a particle at $(x_i, t_i)$ destined for $(x_f, t_f)$ manages to calculate ahead of
time the action for every possible path linking these points, and takes the one with
the least action. But this, of course, is an illusion. The particle need not know its
entire trajectory ahead of time; it needs only to obey the Euler-Lagrange equations
at each instant in time to minimize the action. This in turn means just following
Newton's law, which is to say, the particle has to sample the potential in its immediate
vicinity and accelerate in the direction of greatest change.
Our esteem for the particle will sink further when we learn quantum mechanics.
We will discover that far from following any kind of strategy, the particle, in a sense,
goes from $(x_i, t_i)$ to $(x_f, t_f)$ along all possible paths, giving equal weight to each!
How it is that despite this, classical particles do seem to follow $x_{\rm cl}(t)$ is an interesting
question that will be answered when we come to the path integral formalism of
quantum mechanics.
2.5. The Hamiltonian Formalism
In the Lagrangian formalism, the independent variables are the coordinates $q_i$
and velocities $\dot{q}_i$. The momenta are derived quantities defined by
$$p_i = \frac{\partial\mathcal{L}}{\partial\dot{q}_i} \tag{2.5.1}$$
In the Hamiltonian formalism one exchanges the roles of $\dot{q}$ and $p$: one replaces the
Lagrangian $\mathcal{L}(q, \dot{q})$‡ by a Hamiltonian $\mathcal{H}(q, p)$ which generates the equations of
motion, and $\dot{q}$ becomes a derived quantity,
$$\dot{q}_i = \frac{\partial\mathcal{H}}{\partial p_i} \tag{2.5.2}$$
thereby completing the role reversal of the $\dot{q}$'s and the $p$'s.
There exists a standard procedure for effecting such a change, called a Legendre
transformation, which is illustrated by the following simple example. Suppose we
have a function $f(x)$ with
$$u(x) = \frac{df}{dx} \tag{2.5.3}$$
Let it be possible to invert $u(x)$ to get $x(u)$. [For example if $u(x) = x^3$, $x(u) = u^{1/3}$,
etc.] If we define a function
$$g(u) = x(u)\,u - f(x(u)) \tag{2.5.4}$$
then
$$\frac{dg}{du} = \frac{dx}{du}\cdot u + x(u) - \frac{df}{dx}\cdot\frac{dx}{du} = x(u) \tag{2.5.5}$$
That is to say, in going from f to g (or vice versa) we exchange the roles of x and
u. One calls Eq. (2.5.4) a Legendre transformation and f and g Legendre transforms
of each other.
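A short numerical illustration of Eqs. (2.5.3) to (2.5.5) may help. It uses the bracketed example $u(x) = x^3$, which corresponds to the assumed choice $f(x) = x^4/4$, and restricts to $u > 0$ so that the cube root is unambiguous. The derivative of g is estimated by a central difference.

```python
# Legendre transform of f(x) = x**4 / 4 (an assumed example with u = df/dx = x**3).

def f(x):
    return x ** 4 / 4.0

def x_of_u(u):
    """Inverse of u(x) = x**3, valid for u > 0."""
    return u ** (1.0 / 3.0)

def g(u):
    """Eq. (2.5.4): g(u) = x(u) u - f(x(u))."""
    x = x_of_u(u)
    return x * u - f(x)

def dg_du(u, h=1e-6):
    """Central-difference estimate of dg/du."""
    return (g(u + h) - g(u - h)) / (2 * h)
```

According to Eq. (2.5.5), `dg_du(u)` should agree with `x_of_u(u)` for any positive `u`, up to the finite-difference error.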
More generally, if $f = f(x_1, x_2, \ldots, x_n)$, one can eliminate a subset $\{x_i,\ i = 1$ to
$j\}$ in favor of the partial derivatives $u_i = \partial f/\partial x_i$ by the transformation
$$g(u_1, \ldots, u_j, x_{j+1}, \ldots, x_n) = \sum_{i=1}^{j} u_i x_i - f(x_1, \ldots, x_n) \tag{2.5.6}$$
It is understood in the right-hand side of Eq. (2.5.6) that all the $x_i$'s to be eliminated
have been rewritten as functions of the allowed variables in g. It can be easily verified
that
$$\frac{\partial g}{\partial u_i} = x_i \tag{2.5.7}$$
where in taking the above partial derivative, one keeps all the other variables in g
constant.
‡ We will often refer to $q_1, \ldots, q_n$ as $q$ and $p_1, \ldots, p_n$ as $p$.
Table 2.1. Comparison of the Lagrangian and Hamiltonian Formalisms
(1) Lagrangian formalism: The state of a system with n degrees of freedom is described by n
    coordinates $(q_1, \ldots, q_n)$ and n velocities $(\dot{q}_1, \ldots, \dot{q}_n)$, or in a more compact
    notation by $(q, \dot{q})$.
    Hamiltonian formalism: The state of a system with n degrees of freedom is described by n
    coordinates and n momenta $(q_1, \ldots, q_n; p_1, \ldots, p_n)$ or, more succinctly, by $(q, p)$.
(2) Lagrangian formalism: The state of the system may be represented by a point moving with
    a definite velocity in an n-dimensional configuration space.
    Hamiltonian formalism: The state of the system may be represented by a point in a
    2n-dimensional phase space, with coordinates $(q_1, \ldots, q_n; p_1, \ldots, p_n)$.
(3) Lagrangian formalism: The n coordinates evolve according to n second-order equations.
    Hamiltonian formalism: The 2n coordinates and momenta obey 2n first-order equations.
(4) Lagrangian formalism: For a given $\mathcal{L}$, several trajectories may pass through a given
    point in configuration space depending on $\dot{q}$.
    Hamiltonian formalism: For a given $\mathcal{H}$ only one trajectory passes through a given point
    in phase space.
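Applying these methods to the problem in question, we define
$$\mathcal{H}(q, p) = \sum_{i=1}^{n} p_i\dot{q}_i - \mathcal{L}(q, \dot{q}) \tag{2.5.8}$$
where the $\dot{q}$'s are to be written as functions of $q$'s and $p$'s. This inversion is generally
easy since $\mathcal{L}$ is a polynomial of rank 2 in $\dot{q}$, and $p_i = \partial\mathcal{L}/\partial\dot{q}_i$ is a polynomial of rank
1 in the $\dot{q}$'s, e.g., Eq. (2.2.7). Consider now
$$\frac{\partial\mathcal{H}}{\partial p_i} = \frac{\partial}{\partial p_i}\left(\sum_j p_j\dot{q}_j - \mathcal{L}\right) \tag{2.5.9}$$
$$= \dot{q}_i + \sum_j p_j\frac{\partial\dot{q}_j}{\partial p_i} - \sum_j\frac{\partial\mathcal{L}}{\partial\dot{q}_j}\frac{\partial\dot{q}_j}{\partial p_i} = \dot{q}_i \qquad \left(\text{since } p_j = \frac{\partial\mathcal{L}}{\partial\dot{q}_j}\right) \tag{2.5.10}$$
[There are no $(\partial\mathcal{L}/\partial q_j)(\partial q_j/\partial p_i)$ terms since $q$ is held constant in $\partial\mathcal{H}/\partial p_i$; that is,
$q$ and $p$ are independent variables.] Similarly,
$$\frac{\partial\mathcal{H}}{\partial q_i} = \sum_j p_j\frac{\partial\dot{q}_j}{\partial q_i} - \frac{\partial\mathcal{L}}{\partial q_i} - \sum_j\frac{\partial\mathcal{L}}{\partial\dot{q}_j}\frac{\partial\dot{q}_j}{\partial q_i} = -\frac{\partial\mathcal{L}}{\partial q_i} \tag{2.5.11}$$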
We now feed in the dynamics by replacing $(\partial\mathcal{L}/\partial q_i)$ by $\dot{p}_i$, and obtain Hamilton's
canonical equations:
$$\frac{\partial\mathcal{H}}{\partial p_i} = \dot{q}_i, \qquad -\frac{\partial\mathcal{H}}{\partial q_i} = \dot{p}_i \tag{2.5.12}$$
Note that we have altogether 2n first-order equations (in time) for a system with n
degrees of freedom. Given the initial-value data, $(q_i(0), p_i(0))$, $i = 1, \ldots, n$, we can
integrate the equations to get $(q_i(t), p_i(t))$.
Table 2.1 provides a comparison of the Lagrangian and Hamiltonian
formalisms.
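Now, just as $\mathcal{L}$ may be interpreted as $T - V$ if the force is conservative, so there
exists a simple interpretation for $\mathcal{H}$ in this case. Consider the sum $\sum_i p_i\dot{q}_i$. Let us
use Cartesian coordinates, in terms of which
$$T = \sum_i \tfrac{1}{2}m_i\dot{x}_i^2$$
$$p_i = \frac{\partial\mathcal{L}}{\partial\dot{x}_i} = \frac{\partial T}{\partial\dot{x}_i} = m_i\dot{x}_i$$
and
$$\sum_i p_i\dot{x}_i = \sum_i m_i\dot{x}_i^2 = 2T \tag{2.5.13}$$
so that
$$\mathcal{H} = \sum_i p_i\dot{x}_i - \mathcal{L} = 2T - (T - V) = T + V \tag{2.5.14}$$
the total energy. Notice that although we used Cartesian coordinates along the
way, the resulting equation (2.5.14) is a relation among scalars and thus
coordinate-independent.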
Exercise 2.5.1. Show that if $T = \sum_i\sum_j T_{ij}(q)\dot{q}_i\dot{q}_j$, where $\dot{q}$'s are generalized velocities,
$\sum_i p_i\dot{q}_i = 2T$.
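The Hamiltonian method is illustrated by the simple example of a harmonic
oscillator, for which
$$\mathcal{L} = \tfrac{1}{2}m\dot{x}^2 - \tfrac{1}{2}kx^2$$
The canonical momentum is
$$p = \frac{\partial\mathcal{L}}{\partial\dot{x}} = m\dot{x}$$
It is easy to invert this relation to obtain $\dot{x}$ as a function of $p$:
$$\dot{x} = p/m$$
and obtain
$$\mathcal{H}(x, p) = T + V = \tfrac{1}{2}m[\dot{x}(p)]^2 + \tfrac{1}{2}kx^2 = \frac{p^2}{2m} + \tfrac{1}{2}kx^2 \tag{2.5.15}$$
The equations of motion are
$$\frac{\partial\mathcal{H}}{\partial p} = \dot{q} \;\rightarrow\; \frac{p}{m} = \dot{x} \tag{2.5.16}$$
$$-\frac{\partial\mathcal{H}}{\partial q} = \dot{p} \;\rightarrow\; -kx = \dot{p} \tag{2.5.17}$$
These equations can be integrated in time, given the initial q and p. If, however, we
want the familiar second-order equation, we differentiate Eq. (2.5.16) with respect
to time, and feed it into Eq. (2.5.17) to get
$$m\ddot{x} + kx = 0$$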
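As a rough illustration of integrating the two first-order equations (2.5.16) and (2.5.17) from initial data, the sketch below steps them forward numerically. The leapfrog (kick-drift-kick) scheme is our choice for the illustration and is not something the text prescribes; the function names and parameter values are likewise assumptions.

```python
import math

def integrate_oscillator(x0, p0, m, k, dt, steps):
    """Leapfrog integration of xdot = p/m, pdot = -k x (Eqs. 2.5.16-2.5.17)."""
    x, p = x0, p0
    for _ in range(steps):
        p -= 0.5 * dt * k * x      # half kick from pdot = -k x
        x += dt * p / m            # drift from xdot = p/m
        p -= 0.5 * dt * k * x      # second half kick
    return x, p

def energy(x, p, m, k):
    """Eq. (2.5.15): H = p^2/2m + k x^2/2."""
    return p * p / (2 * m) + 0.5 * k * x * x

def exact(x0, p0, m, k, t):
    """Closed-form solution of m xddot + k x = 0, for comparison."""
    w = math.sqrt(k / m)
    x = x0 * math.cos(w * t) + (p0 / (m * w)) * math.sin(w * t)
    p = p0 * math.cos(w * t) - m * w * x0 * math.sin(w * t)
    return x, p
```

With a small step `dt`, the output of `integrate_oscillator` should track `exact` closely and `energy` should stay nearly constant along the numerical trajectory.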
Exercise 2.5.2. Using the conservation of energy, show that the trajectories in phase
space for the oscillator are ellipses of the form $(x/a)^2 + (p/b)^2 = 1$, where $a^2 = 2E/k$ and
$b^2 = 2mE$.
Exercise 2.5.3. Solve Exercise 2.1.2 using the Hamiltonian formalism.
Exercise 2.5.4.* Show that $\mathcal{H}$ corresponding to $\mathcal{L}$ in Eq. (2.3.6) is
$\mathcal{H} = |\mathbf{p}_{\rm CM}|^2/2M + |\mathbf{p}|^2/2\mu + V(\mathbf{r})$, where $M$ is the total mass, $\mu$ is the reduced mass,
and $\mathbf{p}_{\rm CM}$ and $\mathbf{p}$ are the momenta conjugate to $\mathbf{r}_{\rm CM}$ and $\mathbf{r}$, respectively.
2.6. The Electromagnetic Force in the Hamiltonian Scheme
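The passage from $\mathcal{L}_{\rm em}$ to its Legendre transform $\mathcal{H}_{\rm em}$ is not sensitive in any
way to the velocity-dependent nature of the force. If $\mathcal{L}_{\rm em}$ generated the correct force
laws, so will $\mathcal{H}_{\rm em}$, the dynamical content of the schemes being identical. In contrast,
the velocity independence of the force was assumed in showing that the numerical
value of $\mathcal{H}$ is $T + V$, the total energy. Let us therefore repeat the analysis for the
electromagnetic case. As
$$\mathcal{L}_{\rm em} = \tfrac{1}{2}m\mathbf{v}\cdot\mathbf{v} - q\phi + \frac{q}{c}\,\mathbf{v}\cdot\mathbf{A}$$
and‡
$$\mathbf{p} = m\mathbf{v} + \frac{q\mathbf{A}}{c}$$
we have
$$\mathcal{H}_{\rm em} = \mathbf{p}\cdot\mathbf{v} - \mathcal{L}_{\rm em} = m\mathbf{v}\cdot\mathbf{v} + \frac{q\,\mathbf{v}\cdot\mathbf{A}}{c} - \tfrac{1}{2}m\mathbf{v}\cdot\mathbf{v} + q\phi - \frac{q\,\mathbf{v}\cdot\mathbf{A}}{c} = \tfrac{1}{2}m\mathbf{v}\cdot\mathbf{v} + q\phi = T + q\phi \tag{2.6.1}$$
Now, there is something very disturbing about Eq. (2.6.1): the vector potential $\mathbf{A}$
seems to have dropped out along the way. How is $\mathcal{H}_{\rm em}$ to generate the correct
dynamics without knowing what $\mathbf{A}$ is? The answer is, of course, that $\mathcal{H}_{\rm em}$ is more than
just $T + q\phi$; it is $T + q\phi$ written in terms of the correct variables, in particular, in
terms of $\mathbf{p}$ and not $\mathbf{v}$. Making the change of variables, we get
$$\mathcal{H}_{\rm em} = \frac{|\mathbf{p} - q\mathbf{A}/c|^2}{2m} + q\phi \tag{2.6.2}$$
with the vector potential very much in the picture.
‡ Note that in this discussion, q is the charge and not the coordinate. The (Cartesian) coordinate $\mathbf{r}$ is
hidden in the functions $\mathbf{A}(\mathbf{r}, t)$ and $\phi(\mathbf{r}, t)$.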
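The role of $\mathbf{A}$ in Eq. (2.6.2) can be made concrete with a small numerical sketch. The potentials below (a uniform magnetic field in the symmetric gauge and a uniform electric field) and all parameter values are assumptions chosen for illustration. The code checks that $\partial\mathcal{H}_{\rm em}/\partial p_i$ returns the velocity, and that the numerical value of $\mathcal{H}_{\rm em}$ equals $T + q\phi$.

```python
# Sketch: H_em = |p - qA/c|^2 / 2m + q*phi for assumed, illustrative potentials.

def A_uniform_B(r, B=2.0):
    """Vector potential of a uniform field B along z (symmetric gauge)."""
    x, y, z = r
    return (-0.5 * B * y, 0.5 * B * x, 0.0)

def phi_linear(r, E=1.5):
    """Scalar potential of a uniform electric field E along x."""
    return -E * r[0]

def H_em(r, p, m, q, c):
    """Eq. (2.6.2)."""
    Ar = A_uniform_B(r)
    kin = sum((pi - q * ai / c) ** 2 for pi, ai in zip(p, Ar)) / (2 * m)
    return kin + q * phi_linear(r)

def canonical_momentum(r, v, m, q, c):
    """Eq. (2.2.7): p = m v + q A / c."""
    return tuple(m * vi + q * ai / c for vi, ai in zip(v, A_uniform_B(r)))

def velocity_from_H(r, p, m, q, c, h=1e-6):
    """v_i = dH/dp_i, estimated by central differences."""
    v = []
    for i in range(3):
        pp, pm = list(p), list(p)
        pp[i] += h
        pm[i] -= h
        v.append((H_em(r, pp, m, q, c) - H_em(r, pm, m, q, c)) / (2 * h))
    return tuple(v)
```

Starting from a chosen velocity, forming `canonical_momentum` and then calling `velocity_from_H` should recover that velocity, which would not happen if $\mathbf{A}$ were dropped from the Hamiltonian.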
2.7. Cyclic Coordinates, Poisson Brackets, and Canonical Transformations
Cyclic coordinates are defined here just as in the Lagrangian case and have the
same significance: if a coordinate $q_i$ is missing in $\mathcal{H}$, then
$$\dot{p}_i = -\frac{\partial\mathcal{H}}{\partial q_i} = 0 \tag{2.7.1}$$
Now, there will be other quantities, such as the energy, that may be conserved in
addition to the canonical momenta.§ There exists a nice method of characterizing
these in the Hamiltonian formalism. Let $\omega(p, q)$ be some function of the state
variables, with no explicit dependence on t. Its time variation is given by
$$\frac{d\omega}{dt} = \sum_i\left(\frac{\partial\omega}{\partial q_i}\dot{q}_i + \frac{\partial\omega}{\partial p_i}\dot{p}_i\right) = \sum_i\left(\frac{\partial\omega}{\partial q_i}\frac{\partial\mathcal{H}}{\partial p_i} - \frac{\partial\omega}{\partial p_i}\frac{\partial\mathcal{H}}{\partial q_i}\right) \equiv \{\omega, \mathcal{H}\} \tag{2.7.2}$$
where we have defined the Poisson bracket (PB) between two variables $\omega(p, q)$ and
$\lambda(p, q)$ to be
$$\{\omega, \lambda\} \equiv \sum_i\left(\frac{\partial\omega}{\partial q_i}\frac{\partial\lambda}{\partial p_i} - \frac{\partial\omega}{\partial p_i}\frac{\partial\lambda}{\partial q_i}\right) \tag{2.7.3}$$
It follows from Eq. (2.7.2) that any variable whose PB with $\mathcal{H}$ vanishes is constant in
time, i.e., conserved. In particular $\mathcal{H}$ itself is a constant of motion (identified as the
total energy) if it has no explicit t dependence.
§ Another example is the conservation of $l_z = xp_y - yp_x$ when $V(x, y) = V(x^2 + y^2)$. There are no cyclic
coordinates here. Of course, if we work in polar coordinates, $V(\rho, \phi) = V(\rho)$, and $p_\phi = m\rho^2\dot{\phi} = l_z$ is
conserved because it is the momentum conjugate to the cyclic coordinate $\phi$.
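The definition (2.7.3) translates directly into a numerical routine. The sketch below estimates the partial derivatives by central differences and uses an assumed two-dimensional Hamiltonian with potential $ax^2 + by^2$ to illustrate the conservation criterion: the bracket of $l_z$ with $\mathcal{H}$ vanishes when $a = b$ and not otherwise. The function names are hypothetical.

```python
# Numerical Poisson bracket, Eq. (2.7.3), via central differences.

def poisson_bracket(f, g, q, p, h=1e-5):
    """Return {f, g} at the phase-space point (q, p); f and g take (q, p) lists."""
    def d(fun, which, i):
        qp, qm, pp, pm = list(q), list(q), list(p), list(p)
        if which == "q":
            qp[i] += h
            qm[i] -= h
        else:
            pp[i] += h
            pm[i] -= h
        return (fun(qp, pp) - fun(qm, pm)) / (2 * h)
    return sum(d(f, "q", i) * d(g, "p", i) - d(f, "p", i) * d(g, "q", i)
               for i in range(len(q)))

def l_z(q, p):
    """Angular momentum about z: x p_y - y p_x."""
    return q[0] * p[1] - q[1] * p[0]

def make_H(a, b, m=1.0):
    """Assumed illustrative Hamiltonian: |p|^2/2m + a x^2 + b y^2."""
    def H(q, p):
        return (p[0] ** 2 + p[1] ** 2) / (2 * m) + a * q[0] ** 2 + b * q[1] ** 2
    return H
```

For this Hamiltonian the bracket works out analytically to $\{l_z, \mathcal{H}\} = 2(a - b)xy$, so the numerical value can be compared with that expression at any point.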
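Exercise 2.7.1.* Show that
$$\{\omega, \lambda\} = -\{\lambda, \omega\}$$
$$\{\omega, \lambda + \sigma\} = \{\omega, \lambda\} + \{\omega, \sigma\}$$
$$\{\omega, \lambda\sigma\} = \{\omega, \lambda\}\sigma + \lambda\{\omega, \sigma\}$$
Note the similarity between the above and Eqs. (1.5.10) and (1.5.11) for commutators.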
Of fundamental importance are the PB between the q's and the p's. Observe
that
$$\{q_i, q_j\} = \{p_i, p_j\} = 0 \tag{2.7.4a}$$
$$\{q_i, p_j\} = \delta_{ij} \tag{2.7.4b}$$
since $(q_1, \ldots, p_n)$ are independent variables ($\partial q_i/\partial q_j = \delta_{ij}$, $\partial q_i/\partial p_j = 0$, etc.).
Hamilton's equations may be written in terms of PB as
$$\dot{q}_i = \{q_i, \mathcal{H}\} \tag{2.7.5a}$$
$$\dot{p}_i = \{p_i, \mathcal{H}\} \tag{2.7.5b}$$
by setting $\omega = q_i$ or $p_i$ in Eq. (2.7.2).
Exercise 2.7.2.* (i) Verify Eqs. (2.7.4) and (2.7.5). (ii) Consider a problem in two
dimensions given by $\mathcal{H} = p_x^2 + p_y^2 + ax^2 + by^2$. Argue that if $a = b$, $\{l_z, \mathcal{H}\}$ must vanish. Verify by
explicit computation.
Canonical Transformations
We have seen that the Euler-Lagrange equations are form invariant under an
arbitrary‡ change of coordinates in configuration space
$$q_i \to \bar{q}_i(q_1, \ldots, q_n), \qquad i = 1, \ldots, n \tag{2.7.6a}$$
or more succinctly
$$q \to \bar{q}(q) \tag{2.7.6b}$$
‡ We assume the transformation is invertible, so we may write $q$ in terms of $\bar{q}$: $q = q(\bar{q})$. The transformation
may also depend on time explicitly [$\bar{q} = \bar{q}(q, t)$], but we do not consider such cases.
The response of the velocities to this transformation follows from Eq. (2.7.6a):
$$\dot{q}_i \to \dot{\bar{q}}_i = \frac{d\bar{q}_i}{dt} = \sum_j\frac{\partial\bar{q}_i}{\partial q_j}\dot{q}_j \tag{2.7.7}$$
The response of the canonical momenta may be found by rewriting $\mathcal{L}$ in terms of
$(\bar{q}, \dot{\bar{q}})$ and taking the derivative with respect to $\dot{\bar{q}}$:
$$\bar{p}_i = \frac{\partial\mathcal{L}(\bar{q}, \dot{\bar{q}})}{\partial\dot{\bar{q}}_i} \tag{2.7.8}$$
The result is (Exercise 2.7.8):
$$\bar{p}_i = \sum_j\frac{\partial q_j}{\partial\bar{q}_i}p_j \tag{2.7.9}$$
Notice that although $\mathcal{L}$ enters Eq. (2.7.8), it drops out in Eq. (2.7.9), which connects
$\bar{p}$ to the old variables. This is as it should be, for we expect that the response of the
momenta to a coordinate transformation (say, a rotation) is a purely kinematical
question.
A word of explanation about $\mathcal{L}(\bar{q}, \dot{\bar{q}})$. By $\mathcal{L}(\bar{q}, \dot{\bar{q}})$ we mean the Lagrangian (say
$T - V$, for definiteness) written in terms of $\bar{q}$ and $\dot{\bar{q}}$. Thus the numerical value of the
Lagrangian is unchanged under $(q, \dot{q}) \to (\bar{q}, \dot{\bar{q}})$; for $(q, \dot{q})$ and $(\bar{q}, \dot{\bar{q}})$ refer to the same
physical state. The functional form of the Lagrangian, however, does change and so
we should really be using two different symbols $\mathcal{L}(q, \dot{q})$ and $\bar{\mathcal{L}}(\bar{q}, \dot{\bar{q}})$. Nonetheless
we follow the convention of denoting a given dynamical variable, such as the
Lagrangian, by a fixed symbol in all coordinate systems.
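The invariance of the Euler-Lagrange equations under $(q, \dot{q}) \to (\bar{q}, \dot{\bar{q}})$ implies
the invariance of Hamilton's equations under $(q, p) \to (\bar{q}, \bar{p})$, i.e., $(\bar{q}, \bar{p})$ obey
$$\dot{\bar{q}}_i = \frac{\partial\mathcal{H}}{\partial\bar{p}_i}, \qquad \dot{\bar{p}}_i = -\frac{\partial\mathcal{H}}{\partial\bar{q}_i} \tag{2.7.10}$$
where $\mathcal{H} = \mathcal{H}(\bar{q}, \bar{p})$ is the Hamiltonian written in terms of $\bar{q}$ and $\bar{p}$. The proof is
simple: we start with $\mathcal{L}(\bar{q}, \dot{\bar{q}})$, perform a Legendre transform, and use the fact that
$\bar{q}$ obeys Euler-Lagrange equations.
The transformation
$$q_i \to \bar{q}_i(q_1, \ldots, q_n), \qquad p_i \to \bar{p}_i = \sum_j\frac{\partial q_j}{\partial\bar{q}_i}p_j \tag{2.7.11}$$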
is called a point transformation. If we view the Hamiltonian formalism as something
derived from the Lagrangian scheme, which is formulated in n-dimensional
configuration space, this is the most general (time-independent) transformation which
preserves the form of Hamilton's equations (that we can think of). On the other
hand, if we view the Hamiltonian formalism in its own right, the backdrop is the
2n-dimensional phase space. In this space, the point transformation is unnecessarily
restrictive. One can contemplate a more general transformation of phase space
coordinates:
$$q \to \bar{q}(q, p), \qquad p \to \bar{p}(q, p) \tag{2.7.12}$$
Although all sets of 2n independent coordinates $(\bar{q}, \bar{p})$ are formally adequate for
describing the state of the system, not all of them will preserve the canonical form
of Hamilton's equations. (This is like saying that although Newton's laws may be
written in terms of any complete set of coordinates, the simple form $m\ddot{q}_i = -\partial V/\partial q_i$
is valid only if the $q_i$ are Cartesian.) If, however, $(\bar{q}, \bar{p})$ obey the canonical equations
(2.7.10), we say that they are canonical coordinates and that Eq. (2.7.12) defines a
canonical transformation. Any set of coordinates $(q_1, \ldots, q_n)$, and the corresponding
momenta generated in the Lagrangian formalism ($p_i = \partial\mathcal{L}/\partial\dot{q}_i$), are canonical
coordinates. Given one set, $(q, p)$, we can get another, $(\bar{q}, \bar{p})$, by the point transformation,
which is a special case of the canonical transformation. This does not, however,
exhaust the possibilities. Let us now ask the following question. Given a new set of
coordinates $(\bar{q}(q, p), \bar{p}(q, p))$, how can we tell if they are canonical [assuming $(q, p)$
are]? Now it is true for any $\omega(q, p)$ that
$$\dot{\omega} = \{\omega, \mathcal{H}\} = \sum_i\left(\frac{\partial\omega}{\partial q_i}\frac{\partial\mathcal{H}}{\partial p_i} - \frac{\partial\omega}{\partial p_i}\frac{\partial\mathcal{H}}{\partial q_i}\right) \tag{2.7.13}$$
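Applying this to $\bar{q}_j(q, p)$ we find
$$\dot{\bar{q}}_j = \sum_i\left(\frac{\partial\bar{q}_j}{\partial q_i}\frac{\partial\mathcal{H}}{\partial p_i} - \frac{\partial\bar{q}_j}{\partial p_i}\frac{\partial\mathcal{H}}{\partial q_i}\right) \tag{2.7.14}$$
If we view $\mathcal{H}$ as a function of $(\bar{q}, \bar{p})$ and use the chain rule, we get
$$\frac{\partial\mathcal{H}(q, p)}{\partial p_i} = \frac{\partial\mathcal{H}(\bar{q}, \bar{p})}{\partial p_i} = \sum_k\left(\frac{\partial\mathcal{H}}{\partial\bar{q}_k}\frac{\partial\bar{q}_k}{\partial p_i} + \frac{\partial\mathcal{H}}{\partial\bar{p}_k}\frac{\partial\bar{p}_k}{\partial p_i}\right) \tag{2.7.15a}$$
and
$$\frac{\partial\mathcal{H}(q, p)}{\partial q_i} = \frac{\partial\mathcal{H}(\bar{q}, \bar{p})}{\partial q_i} = \sum_k\left(\frac{\partial\mathcal{H}}{\partial\bar{q}_k}\frac{\partial\bar{q}_k}{\partial q_i} + \frac{\partial\mathcal{H}}{\partial\bar{p}_k}\frac{\partial\bar{p}_k}{\partial q_i}\right) \tag{2.7.15b}$$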
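Feeding all this into Eq. (2.7.14) we find, upon regrouping terms,
$$\dot{\bar{q}}_j = \sum_k\left(\frac{\partial\mathcal{H}}{\partial\bar{q}_k}\{\bar{q}_j, \bar{q}_k\} + \frac{\partial\mathcal{H}}{\partial\bar{p}_k}\{\bar{q}_j, \bar{p}_k\}\right) \tag{2.7.16}$$
It can similarly be established that
$$\dot{\bar{p}}_j = \sum_k\left(\frac{\partial\mathcal{H}}{\partial\bar{q}_k}\{\bar{p}_j, \bar{q}_k\} + \frac{\partial\mathcal{H}}{\partial\bar{p}_k}\{\bar{p}_j, \bar{p}_k\}\right) \tag{2.7.17}$$
If Eqs. (2.7.16) and (2.7.17) are to reduce to the canonical equations (2.7.10) for
any $\mathcal{H}(\bar{q}, \bar{p})$, we must have
$$\{\bar{q}_j, \bar{q}_k\} = 0 = \{\bar{p}_j, \bar{p}_k\}, \qquad \{\bar{q}_j, \bar{p}_k\} = \delta_{jk} \tag{2.7.18}$$
These then are the conditions to be satisfied by the new variables if they are to be
canonical. Notice that these constraints make no reference to the specific functional
form of $\mathcal{H}$: the equations defining canonical variables are purely kinematical and
true for any $\mathcal{H}(\bar{q}, \bar{p})$.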
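For one degree of freedom the conditions (2.7.18) reduce to the single requirement $\{\bar{q}, \bar{p}\} = 1$, since the other two brackets vanish identically. The sketch below tests this requirement by finite differences at a few sample points. It is a spot check under that assumption, not a proof, and the example transformations are hypothetical choices made for illustration.

```python
# Spot check of {qbar, pbar} = 1 for one degree of freedom (Eq. 2.7.18).

def pb(f, g, q, p, h=1e-5):
    """One-dimensional Poisson bracket {f, g} by central differences."""
    dfdq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    dfdp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dgdq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dgdp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return dfdq * dgdp - dfdp * dgdq

def is_canonical(qbar, pbar, points, tol=1e-6):
    """True if {qbar, pbar} is within tol of 1 at every sample point."""
    return all(abs(pb(qbar, pbar, q, p) - 1.0) < tol for q, p in points)

# Illustrative transformations (our own examples):
def exchange_q(q, p): return p           # qbar = p
def exchange_p(q, p): return -q          # pbar = -q      -> canonical
def shear_q(q, p): return q              # qbar = q
def shear_p(q, p): return p + q * q      # pbar = p + q^2 -> canonical
def stretch_q(q, p): return 2.0 * q      # qbar = 2q
def stretch_p(q, p): return p            # pbar = p       -> bracket is 2
```

A call such as `is_canonical(exchange_q, exchange_p, points)` with a handful of `(q, p)` pairs distinguishes the first two transformations from the third.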
Exercise 2.7.3. Fill in the missing steps leading to Eq. (2.7.18) starting from Eq. (2.7.14).
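Exercise 2.7.4. Verify that the change to a rotated frame
$$\bar{x} = x\cos\theta - y\sin\theta$$
$$\bar{y} = x\sin\theta + y\cos\theta$$
$$\bar{p}_x = p_x\cos\theta - p_y\sin\theta$$
$$\bar{p}_y = p_x\sin\theta + p_y\cos\theta$$
is a canonical transformation.
Exercise 2.7.5. Show that the polar variables $\rho = (x^2 + y^2)^{1/2}$, $\phi = \tan^{-1}(y/x)$,
$$p_\rho = \hat{\mathbf{e}}_\rho\cdot\mathbf{p} = \frac{xp_x + yp_y}{(x^2 + y^2)^{1/2}}, \qquad p_\phi = xp_y - yp_x\ (= l_z)$$
are canonical. ($\hat{\mathbf{e}}_\rho$ is the unit vector in the radial direction.)
Exercise 2.7.6.* Verify that the change from the variables $\mathbf{r}_1$, $\mathbf{r}_2$, $\mathbf{p}_1$, $\mathbf{p}_2$ to $\mathbf{r}_{\rm CM}$, $\mathbf{p}_{\rm CM}$, $\mathbf{r}$,
and $\mathbf{p}$ is a canonical transformation. (See Exercise 2.5.4.)
Exercise 2.7.7. Verify that
$$\bar{q} = \ln(q^{-1}\sin p)$$
$$\bar{p} = q\cot p$$
is a canonical transformation.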
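Exercise 2.7.8. We would like to derive here Eq. (2.7.9), which gives the transformation
of the momenta under a coordinate transformation in configuration space:
$$q_i \to \bar{q}_i(q_1, \ldots, q_n)$$
(1) Argue that if we invert the above equation to get $q = q(\bar{q})$, we can derive the following
counterpart of Eq. (2.7.7):
$$\dot{q}_i = \sum_j\frac{\partial q_i}{\partial\bar{q}_j}\dot{\bar{q}}_j$$
(2) Show from the above that
$$\left(\frac{\partial\dot{q}_i}{\partial\dot{\bar{q}}_j}\right)_{\bar{q}} = \frac{\partial q_i}{\partial\bar{q}_j}$$
(3) Now calculate
$$\bar{p}_i = \left[\frac{\partial\mathcal{L}(\bar{q}, \dot{\bar{q}})}{\partial\dot{\bar{q}}_i}\right]_{\bar{q}} = \left[\frac{\partial\mathcal{L}(q, \dot{q})}{\partial\dot{\bar{q}}_i}\right]_{\bar{q}}$$
Use the chain rule and the fact that $q = q(\bar{q})$ and not $q(\bar{q}, \dot{\bar{q}})$ to derive Eq. (2.7.9).
(4) Verify, by calculating the PB in Eq. (2.7.18), that the point transformation is
canonical.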
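If $(q, p)$ and $(\bar{q}, \bar{p})$ are both canonical, we must give them both the same status,
for Hamilton's equations have the same appearance when expressed in terms of
either set. Now, we have defined the PB of two variables $\omega$ and $\sigma$ in terms of $(q, p)$
as
$$\{\omega, \sigma\}_{q,p} = \sum_i\left(\frac{\partial\omega}{\partial q_i}\frac{\partial\sigma}{\partial p_i} - \frac{\partial\omega}{\partial p_i}\frac{\partial\sigma}{\partial q_i}\right)$$
Should we not also define a PB, $\{\omega, \sigma\}_{\bar{q},\bar{p}}$, for every canonical pair $(\bar{q}, \bar{p})$? Fortunately
it turns out that the PB are invariant under canonical transformations:
$$\{\omega, \sigma\}_{q,p} = \{\omega, \sigma\}_{\bar{q},\bar{p}} \tag{2.7.19}$$
(It is understood that $\omega$ and $\sigma$ are written as functions of $\bar{q}$ and $\bar{p}$ on the right-hand
side.)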
Exercise 2.7.9. Verify Eq. (2.7.19) by direct computation. Use the chain rule to go from
$q, p$ derivatives to $\bar{q}, \bar{p}$ derivatives. Collect terms that represent PB of the latter.
Besides the proof by direct computation (as per Exercise 2.7.9 above) there is
an alternate way to establish Eq. (2.7.19).
Consider first $\sigma = \mathcal{H}$. We know that since $(q, p)$ obey canonical equations,
$$\dot{\omega} = \{\omega, \mathcal{H}\}_{q,p}$$
But then $(\bar{q}, \bar{p})$ also obey canonical equations, so
$$\dot{\omega} = \{\omega, \mathcal{H}\}_{\bar{q},\bar{p}}$$
Now $\omega$ is some physical quantity such as the kinetic energy or the component
of angular momentum in some fixed direction, so its rate of change is independent
of the phase space coordinates used, i.e., $\dot{\omega}$ is $\dot{\omega}$, whether $\omega = \omega(q, p)$ or $\omega(\bar{q}, \bar{p})$. So
$$\{\omega, \mathcal{H}\}_{q,p} = \{\omega, \mathcal{H}\}_{\bar{q},\bar{p}} \tag{2.7.20}$$
Having proved the result for what seems to be the special case $\sigma = \mathcal{H}$, we now pull
the following trick. Note that nowhere in the derivation did we have to assume that
$\mathcal{H}$ was any particular function of $q$ and $p$. In fact, Hamiltonian dynamics, as a
consistent mathematical scheme, places no restriction on $\mathcal{H}$. It is the physical
requirement that the time evolution generated by $\mathcal{H}$ coincide with what is actually observed,
that restricts $\mathcal{H}$ to be $T + V$. Thus $\mathcal{H}$ could have been any function at all in the
preceding argument and in the result Eq. (2.7.20) (which is just a relation among
partial derivatives). If we understand that $\mathcal{H}$ is not $T + V$ in this argument but an
arbitrary function, call it $\sigma$, we get the desired result.
Active Transformations
So far, we have viewed the transformation
$$\bar{q} = \bar{q}(q, p), \qquad \bar{p} = \bar{p}(q, p)$$
as passive: both $(q, p)$ and $(\bar{q}, \bar{p})$ refer to the same point in phase space described
in two different coordinate systems. Under the transformation $(q, p) \to (\bar{q}, \bar{p})$, the
numerical values of all dynamical variables are unchanged (for we are talking about
the same physical state), but their functional form is changed. For instance,
under a change from Cartesian to spherical coordinates,
$\omega(x, y, z) = x^2 + y^2 + z^2 \to \omega(r, \theta, \phi) = r^2$. As mentioned earlier, we use the same symbol for a
given variable even if its functional dependence on the coordinates changes when we
change coordinates.
Consider now a restricted class of transformations, called regular
transformations, which preserve the range of the variables: $(q, p)$ and $(\bar{q}, \bar{p})$ have the same
range. A change from one Cartesian coordinate to a translated or rotated one is
regular (each variable goes from $-\infty$ to $+\infty$ before and after), whereas a change to
spherical coordinates (where some coordinates are nonnegative, some are bounded
by $2\pi$, etc.) is not.
A regular transformation $(q, p) \to (\bar{q}, \bar{p})$ permits an alternate interpretation:
instead of viewing $(\bar{q}, \bar{p})$ as the same phase space point in a new coordinate system,
we may view it as a new point in the same coordinate system. This corresponds to
an active transformation which changes the state of the system. Under this change,
the numerical value of any dynamical variable $\omega(q, p)$ will generally change:
$\omega(q, p) \neq \omega(\bar{q}, \bar{p})$, though its functional dependence will not: $\omega(\bar{q}, \bar{p})$ is the same
function $\omega(q, p)$ evaluated at the new point $(q = \bar{q},\ p = \bar{p})$.
We say that $\omega$ is invariant under the regular transformation $(q, p) \to (\bar{q}, \bar{p})$ if
$$\omega(q, p) = \omega(\bar{q}, \bar{p}) \tag{2.7.21}$$
(This equation has content only if we are talking about the active transformations,
for it is true for any $\omega$ under a passive transformation.)
Whether we view the transformation $(q, p) \to (\bar{q}, \bar{p})$ as active or passive, it is
called canonical if $(\bar{q}, \bar{p})$ obey Eq. (2.7.18). As we shall see, only regular canonical
transformations are physically interesting.
2.8. Symmetries and Their Consequences
Let us begin our discussion by examining what the word "symmetry" means in
daily usage. We say that a sphere is a very symmetric object because it looks the
same when seen from many directions. Or, equivalently, a sphere looks the same
before and after it is subjected to a rotation around any axis passing through its
center. A cylinder has symmetry too, but not as much: the rotation must be
performed around its axis. Generally then, the symmetry of an object implies its
invariance under some transformations, which in our example are rotations.
A symmetry can be discrete or continuous, as illustrated by the example of a
hexagon and a circle. While the rotation angles that leave a hexagon unchanged
form a discrete set, namely, multiples of 60°, the corresponding set for a circle is a
continuum. We may characterize the continuous symmetry of the circle in another
way. Consider the identity transformation, which does nothing, i.e., rotates by 0° in
our example. This leaves both the circle and the hexagon invariant. Consider next
an infinitesimal transformation, which is infinitesimally "close" to the identity; in our
example this is a rotation by an infinitesimal angle $\varepsilon$. The infinitesimal rotation
leaves the circle invariant but not the hexagon. The circle is thus characterized by
its invariance under infinitesimal rotations. Given this property, its invariance under
finite rotations follows, for any finite rotation may be viewed as a sequence of
infinitesimal rotations (each of which leaves it invariant).
It is also possible to think of functions of some variables as being symmetric in
the sense that if one changes the values of the variables in a certain way, the value
of the function is invariant. Consider for example
$$f(x, y) = x^2 + y^2$$
If we make the following change
$$x \to \bar{x} = x\cos\theta - y\sin\theta, \qquad y \to \bar{y} = x\sin\theta + y\cos\theta \tag{2.8.1}$$
in the arguments, we find that f is invariant. We say that f is symmetric under the
above transformation. In the terminology introduced earlier, the transformation in
question is continuous: its infinitesimal version is
$$x \to \bar{x} = x\cos\varepsilon - y\sin\varepsilon = x - y\varepsilon, \qquad y \to \bar{y} = x\sin\varepsilon + y\cos\varepsilon = x\varepsilon + y \quad (\text{to order } \varepsilon) \tag{2.8.2}$$
Consider now the function $\mathcal{H}(q, p)$. There are two important dynamical
consequences that follow from its invariance under regular canonical transformations.
I. If $\mathcal{H}$ is invariant under the following infinitesimal transformation (which you
may verify is canonical, Exercise 2.8.2),
$$q_i \to \bar{q}_i = q_i + \varepsilon\frac{\partial g}{\partial p_i} \equiv q_i + \delta q_i, \qquad p_i \to \bar{p}_i = p_i - \varepsilon\frac{\partial g}{\partial q_i} \equiv p_i + \delta p_i \tag{2.8.3}$$
where $g(q, p)$ is any dynamical variable, then g is conserved, i.e., a constant of motion.
One calls g the generator of the transformation.
II. If $\mathcal{H}$ is invariant under the regular, canonical, but not necessarily
infinitesimal, transformation $(q, p) \to (\bar{q}, \bar{p})$, and if $(q(t), p(t))$ is a solution to the equations
of motion, so is the transformed (translated, rotated, etc.) trajectory, $(\bar{q}(t), \bar{p}(t))$.
Let us now analyze these two consequences.
Consequence I. Let us first verify that g is indeed conserved if $\mathcal{H}$ is invariant
under the transformation it generates. Working to first order in $\varepsilon$, if we equate the
change in $\mathcal{H}$ under the change of its arguments to zero, we get
$$\delta\mathcal{H} = \sum_i\left[\frac{\partial\mathcal{H}}{\partial q_i}\left(\varepsilon\frac{\partial g}{\partial p_i}\right) + \frac{\partial\mathcal{H}}{\partial p_i}\left(-\varepsilon\frac{\partial g}{\partial q_i}\right)\right] = \varepsilon\{\mathcal{H}, g\} = 0 \tag{2.8.4}$$
But according to Eq. (2.7.2),
$$\{g, \mathcal{H}\} = 0 \;\rightarrow\; g \text{ is conserved} \tag{2.8.5}$$
(More generally, the response of any variable $\omega$ to the transformation is
$$\delta\omega = \varepsilon\{\omega, g\} \tag{2.8.6}$$
Note that $\delta p$ and $\delta q$ in Eq. (2.8.3) may also be written as PBs.) Consider as an
example, a particle in one dimension and the case $g = p$. From Eq. (2.8.3),
$$\delta x = \varepsilon\frac{\partial p}{\partial p} = \varepsilon, \qquad \delta p = -\varepsilon\frac{\partial p}{\partial x} = 0 \tag{2.8.7}$$
which we recognize to be an infinitesimal translation. Thus the linear momentum p
is the generator of spatial translations and is conserved in a translationally invariant
problem. The physics behind this result is clear. Since p is unchanged in a translation,
so is $T = p^2/2m$. Consequently $V(x + \varepsilon) = V(x)$. But if the potential doesn't vary from
point to point, there is no force and p is conserved.
Next consider an example from two dimensions with $g = l_z = xp_y - yp_x$. Here,
$$\delta x = -y\varepsilon \left(= \varepsilon\frac{\partial l_z}{\partial p_x}\right), \qquad \delta y = x\varepsilon \left(= \varepsilon\frac{\partial l_z}{\partial p_y}\right)$$
$$\delta p_x = -p_y\varepsilon \left(= -\varepsilon\frac{\partial l_z}{\partial x}\right), \qquad \delta p_y = p_x\varepsilon \left(= -\varepsilon\frac{\partial l_z}{\partial y}\right) \tag{2.8.8}$$
which we recognize to be an infinitesimal rotation around the z axis [Eq. (2.8.2)].
Thus the angular momentum around the z axis is the generator of rotations around
that axis, and is conserved if $\mathcal{H}$ is invariant under rotations of the state around that
axis. The relation between the symmetry and the conservation law may be understood
in the following familiar terms. Under the rotation of the coordinates and the
momenta, $|\mathbf{p}|$ doesn't change and so neither does $T = |\mathbf{p}|^2/2m$. Consequently, V is a
constant as we go along any circle centered at the origin. This in turn means that
there is no force in the tangential direction and so no torque around the z axis. The
conservation of $l_z$ then follows.
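The action of a generator can be illustrated numerically. The sketch below applies Eq. (2.8.3) with $g = l_z$, estimating the partial derivatives by central differences, and compares the resulting change in a variable $\omega$ with $\varepsilon\{\omega, g\}$ from Eq. (2.8.6). The test variable and the rotationally invariant Hamiltonian are our own illustrative choices, and agreement is expected only to first order in $\varepsilon$.

```python
# Infinitesimal canonical transformation generated by g (Eq. 2.8.3),
# compared with the first-order prediction  delta(omega) = eps * {omega, g}.

def grad(f, q, p, h=1e-5):
    """Central-difference partials (df/dq_i, df/dp_i) of f(q, p)."""
    dq, dp = [], []
    for i in range(len(q)):
        qp, qm = list(q), list(q)
        pp, pm = list(p), list(p)
        qp[i] += h
        qm[i] -= h
        pp[i] += h
        pm[i] -= h
        dq.append((f(qp, p) - f(qm, p)) / (2 * h))
        dp.append((f(q, pp) - f(q, pm)) / (2 * h))
    return dq, dp

def poisson_bracket(f, g, q, p):
    fq, fp = grad(f, q, p)
    gq, gp = grad(g, q, p)
    return sum(a * d - b * c for a, b, c, d in zip(fq, fp, gq, gp))

def generate(g, q, p, eps):
    """q_i -> q_i + eps dg/dp_i,  p_i -> p_i - eps dg/dq_i."""
    gq, gp = grad(g, q, p)
    return ([qi + eps * d for qi, d in zip(q, gp)],
            [pi - eps * d for pi, d in zip(p, gq)])

def l_z(q, p):
    return q[0] * p[1] - q[1] * p[0]

def H_rot(q, p):
    """Illustrative rotationally invariant Hamiltonian (m = k = 1)."""
    return 0.5 * (p[0] ** 2 + p[1] ** 2) + 0.5 * (q[0] ** 2 + q[1] ** 2)
```

Applying `generate(l_z, q, p, eps)` should reproduce the shifts in Eq. (2.8.8), and `H_rot` should change only at order $\varepsilon^2$.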
Exercise 2.8.1. Show that $p = p_1 + p_2$, the total momentum, is the generator of infinitesimal
translations for a two-particle system.
Exercise 2.8.2.* Verify that the infinitesimal transformation generated by any dynamical
variable g is a canonical transformation. (Hint: Work, as usual, to first order in $\varepsilon$.)
Exercise 2.8.3. Consider
$$\mathcal{H} = \frac{p_x^2 + p_y^2}{2m} + \tfrac{1}{2}m\omega^2(x^2 + y^2)$$
whose invariance under the rotation of the coordinates and momenta leads to the conservation
of $l_z$. But $\mathcal{H}$ is also invariant under the rotation of just the coordinates. Verify that this is a
noncanonical transformation. Convince yourself that in this case it is not possible to write
$\delta\mathcal{H}$ as $\varepsilon\{\mathcal{H}, g\}$ for any g, i.e., that no conservation law follows.
Exercise 2.8.4.* Consider $\mathcal{H} = \tfrac{1}{2}p^2 + \tfrac{1}{2}x^2$, which is invariant under infinitesimal rotations
in phase space (the x-p plane). Find the generator of this transformation (after verifying that
it is canonical). (You could have guessed the answer based on Exercise 2.5.2.)
The preceding analysis yields, as a by-product, a way to generate infinitesimal
canonical transformations. We take any function $g(q, p)$ and obtain the
transformation given by Eq. (2.8.6). (Recall that although we defined a canonical transformation
earlier, until now we had no means of generating one.) Given an infinitesimal
canonical transformation, we can get a finite one by "integrating" it. The following
examples should convince you that this is possible. Consider the transformation
generated by $g = \mathcal{H}$. We have
$$\delta q_i = \varepsilon\{q_i, \mathcal{H}\}, \qquad \delta p_i = \varepsilon\{p_i, \mathcal{H}\} \tag{2.8.9}$$
But we know from the equations of motion that $\dot{q}_i = \{q_i, \mathcal{H}\}$ etc. So
$$\delta q_i = \varepsilon\dot{q}_i, \qquad \delta p_i = \varepsilon\dot{p}_i \tag{2.8.10}$$
Thus the new point in phase space $(\bar{q}, \bar{p}) = (q + \delta q, p + \delta p)$ obtained by this canonical
transformation of $(q, p)$ is just the point to which $(q, p)$ would move in an
infinitesimal time interval $\varepsilon$. In other words, the motion of points in phase space under the
time evolution generated by $\mathcal{H}$ is an active canonical transformation. Now, you
know that by integrating the equations of motion, we can find $(\bar{q}, \bar{p})$ at any future
time, i.e., get the finite canonical transformation. Consider now a general case of
$g \neq \mathcal{H}$. We still have
$$\delta q_i = \varepsilon\{q_i, g\}, \qquad \delta p_i = \varepsilon\{p_i, g\} \tag{2.8.11}$$
Mathematically, these equations are identical to Eq. (2.8.9), with g playing the role
of the Hamiltonian. Clearly there should be no problem integrating these equations
for the evolution of the phase space points under the "fake" Hamiltonian g, and
fake "time" $\varepsilon$. Let us consider for instance the case $g = l_z$, which has units erg sec,
and the corresponding fake time $\varepsilon = \delta\theta$, an angle. The transformation of the
coordinates is
$$\delta x = \varepsilon\{x, l_z\} = -\varepsilon y = (-\delta\theta)y, \qquad \delta y = \varepsilon\{y, l_z\} = \varepsilon x = (\delta\theta)x \tag{2.8.12}$$
The fake equations of motion are
$$\frac{dx}{d\theta} = -y, \qquad \frac{dy}{d\theta} = x \tag{2.8.13}$$
Differentiating the first with respect to $\theta$, and using the second, we get
$$\frac{d^2x}{d\theta^2} + x = 0$$
and likewise,
$$\frac{d^2y}{d\theta^2} + y = 0$$
So
$$x = A\cos\theta + B\sin\theta$$
$$y = C\sin\theta + D\cos\theta$$
We find the constants from the "initial" ($\theta = 0$) coordinates and "velocities":
$A = x_0$, $D = y_0$, $B = (\partial x/\partial\theta)_0 = -y_0$, $C = (\partial y/\partial\theta)_0 = x_0$. Reverting to the standard
notation in which $(x, y)$, rather than $(x_0, y_0)$, labels the initial point and $(\bar{x}, \bar{y})$, rather
than $(x, y)$, denotes the transformed one, we may write the finite canonical
transformation (a finite rotation) as
$$\bar{x} = x\cos\theta - y\sin\theta, \qquad \bar{y} = x\sin\theta + y\cos\theta \tag{2.8.14}$$
Similar equations may be derived for $\bar{p}_x$ and $\bar{p}_y$ in terms of $p_x$ and $p_y$.
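The claim that integrating the infinitesimal transformation yields the finite rotation can also be checked by brute force. The sketch below integrates the fake equations of motion (2.8.13) in the fake time $\theta$ with a classical fourth-order Runge-Kutta step (our choice of integrator, not the text's) and compares the result with Eq. (2.8.14).

```python
import math

def rotate_by_flow(x, y, theta, steps=1000):
    """Integrate dx/dtheta = -y, dy/dtheta = x from 0 to theta with RK4."""
    h = theta / steps

    def f(x, y):
        return (-y, x)

    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = f(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = f(x + h * k3[0], y + h * k3[1])
        x, y = (x + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
                y + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
    return x, y

def rotate_exact(x, y, theta):
    """Eq. (2.8.14): the finite rotation."""
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))
```

For a given starting point and angle, `rotate_by_flow` and `rotate_exact` should agree to within the integration error, which shrinks rapidly as `steps` grows.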
Although a wide class of canonical transformations is now open to us, there
are many that aren't. For instance, $(q, p) \to (-q, -p)$ is a discrete canonical
transformation that has no infinitesimal version. There are also the transformations that
are not regular, such as the change from Cartesian to spherical coordinates, which
have neither infinitesimal forms, nor an active interpretation. We do not consider
ways of generating these.‡
Consequence II. Let us understand the content of this result through an example
before turning to the proof. Consider a two-particle system whose Hamiltonian is
invariant under the translation of the entire system, i.e., both particles. Let an
observer $S_A$ prepare, at $t = 0$, a state $(x_1^0, x_2^0; p_1^0, p_2^0)$ which evolves as $(x_1(t), x_2(t);
p_1(t), p_2(t))$ for some time and ends up in the state $(x_1^T, x_2^T; p_1^T, p_2^T)$ at time T. Let
us call the final state the outcome of the experiment conducted by $S_A$. We are told
that as a result of the translational invariance of $\mathcal{H}$, any other trajectory that is
related to this by an arbitrary translation a is also a solution to the equations of
motion. In this case, the initial state, for example, is $(x_1^0 + a, x_2^0 + a; p_1^0, p_2^0)$. The final
state and all intermediate states are likewise displaced by the same amount. To an
observer $S_B$, displaced relative to $S_A$ by an amount a, the evolution of the second
system will appear to be identical to what $S_A$ saw in the first. Assuming for the sake
of this argument that $S_B$ had in fact prepared the second system, we may say that
a given experiment and its translated version will give the same result (as seen by
the observers who conducted them) if $\mathcal{H}$ is translationally invariant.
‡ For an excellent and lucid treatment of this question and many other topics in advanced classical
mechanics, see H. Goldstein, Classical Mechanics, Addison-Wesley, Reading, Massachusetts (1950); E.
C. G. Sudarshan and N. Mukunda, Classical Dynamics: A Modern Perspective, Wiley, New York
(1974).
The physical idea is the following. For the usual reasons, translational invariance
of $\mathcal{H}$ implies the invariance of $V(x_1, x_2)$. This in turn means that
$V(x_1, x_2) = V(x_1 - x_2)$. Thus each particle cares only about where the other is relative to it, and
not about where the system as a whole is in space. Consequently the outcome of the
experiment is not affected by an overall translation.
Consequence II is just a generalization of this result to other canonical
transformations that leave $\mathcal{H}$ invariant. For instance, if $\mathcal{H}$ is rotationally invariant, a
given experiment and its rotated version will give the same result (according to the
observers who conducted them).
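The translation example can be tried out numerically. The sketch below assumes a specific potential, $V = \frac{1}{2}k(x_1 - x_2)^2$, equal masses, and a velocity-Verlet integrator; none of these choices comes from the text, and the names `evolve`, `final_A`, and `final_B` are illustrative. It evolves one initial state and its translated copy and compares the outcomes.

```python
def evolve(x1, x2, p1, p2, t_final, dt=1e-3, m=1.0, k=2.0):
    """Velocity-Verlet integration for V(x1 - x2) = 0.5 * k * (x1 - x2)**2."""
    def forces(x1, x2):
        f1 = -k * (x1 - x2)      # force on particle 1: -dV/dx1
        return f1, -f1           # force on particle 2: -dV/dx2

    n_steps = int(round(t_final / dt))
    f1, f2 = forces(x1, x2)
    for _ in range(n_steps):
        p1 += 0.5 * dt * f1      # half kick
        p2 += 0.5 * dt * f2
        x1 += dt * p1 / m        # drift
        x2 += dt * p2 / m
        f1, f2 = forces(x1, x2)
        p1 += 0.5 * dt * f1      # second half kick
        p2 += 0.5 * dt * f2
    return x1, x2, p1, p2

a = 3.7                          # arbitrary translation
final_A = evolve(0.0, 1.0, 0.3, -0.1, t_final=5.0)           # experiment of S_A
final_B = evolve(0.0 + a, 1.0 + a, 0.3, -0.1, t_final=5.0)   # translated experiment of S_B
shift = (final_B[0] - final_A[0], final_B[1] - final_A[1])
momentum_diff = (final_B[2] - final_A[2], final_B[3] - final_A[3])
print(shift, momentum_diff)
```

Up to rounding, both final positions differ by the same amount $a$ and the final momenta agree, so the observer $S_B$ sees the same outcome that $S_A$ does.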
Let us now turn to the proof of the general result.
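Proof. Imagine a trajectory $(q(t), p(t))$ in phase space that satisfies the equations
of motion. Let us associate with it an image trajectory, $(\bar{q}(t), \bar{p}(t))$, which is obtained
by transforming each point $(q, p)$ to the image point $(\bar{q}, \bar{p})$ by means of a regular
canonical transformation. We ask if the image point moves according to Hamilton's
equation of motion, i.e., if
$\dot{\bar{q}}_j = \frac{\partial \mathcal{H}(\bar{q}, \bar{p})}{\partial \bar{p}_j}, \qquad \dot{\bar{p}}_j = -\frac{\partial \mathcal{H}(\bar{q}, \bar{p})}{\partial \bar{q}_j}$    (2.8.15)
if $\mathcal{H}$ is invariant under the transformation $(q, p) \to (\bar{q}, \bar{p})$. Now $\bar{q}_j(q, p)$, like any
dynamical variable $\omega(q, p)$, obeys
$\dot{\bar{q}}_j = \{\bar{q}_j, \mathcal{H}(q, p)\}_{q,p}$    (2.8.16)
If $(q, p) \to (\bar{q}, \bar{p})$ were a passive canonical transformation, we could write, since the
PB are invariant under such a transformation,
$\dot{\bar{q}}_j = \{\bar{q}_j, \mathcal{H}(q, p)\}_{q,p} = \{\bar{q}_j, \mathcal{H}(\bar{q}, \bar{p})\}_{\bar{q},\bar{p}} = \frac{\partial \mathcal{H}(\bar{q}, \bar{p})}{\partial \bar{p}_j}$
But it is an active transformation. However, because of the symmetry of $\mathcal{H}$, i.e.,
$\mathcal{H}(q, p) = \mathcal{H}(\bar{q}, \bar{p})$, we can go through the very same steps that led to Eq. (2.7.16)
from Eq. (2.7.14) and prove the result. If you do not believe this, you may verify it
by explicit computation using $\mathcal{H}(q, p) = \mathcal{H}(\bar{q}, \bar{p})$. A similar argument shows that
$\dot{\bar{p}}_j = -\frac{\partial \mathcal{H}(\bar{q}, \bar{p})}{\partial \bar{q}_j}$    (2.8.17)
So the image point moves according to Hamilton's equations. Q.E.D.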
Exercise 2.8.5. Why is it that a noncanonical transformation that leaves $\mathcal{H}$ invariant
does not map a solution into another? Or, in view of the discussions on consequence II, why
is it that an experiment and its transformed version do not give the same result when the
transformation that leaves $\mathcal{H}$ invariant is not canonical? It is best to consider an example.
Consider the potential given in Exercise 2.8.3. Suppose I release a particle at $(x = a, y = 0)$
with $(p_x = b, p_y = 0)$ and you release one in the transformed state in which $(x = 0, y = a)$ and
$(p_x = b, p_y = 0)$, i.e., you rotate the coordinates but not the momenta. This is a noncanonical
transformation that leaves $\mathcal{H}$ invariant. Convince yourself that at later times the states of the
two particles are not related by the same transformation. Try to understand what goes wrong
in the general case.
As you go on and learn quantum mechanics, you will see that the symmetries
of the Hamiltonian have similar consequences for the dynamics of the system.
A Useful Relation Between S and E
We now prove a result that will be invoked in Chapter 16:
$\frac{\partial S_{cl}(x_f, t_f; x_i, t_i)}{\partial t_f} = -\mathcal{H}(t_f)$
where $S_{cl}(x_f, t_f; x_i, t_i)$ is the action of the classical path from $x_i, t_i$ to $x_f, t_f$ and $\mathcal{H}$
is the Hamiltonian at the upper end point. Since we shall be working with problems
where energy is conserved we may write
$\frac{\partial S_{cl}(x_f, t_f; x_i, t_i)}{\partial t_f} = -E$    (2.8.18)
where $E$ is the conserved energy, constant on the whole trajectory.
At first sight you may think that since
$S_{cl} = \int_{t_i}^{t_f} \mathcal{L}\, dt$
the right side must equal $\mathcal{L}$ and not $-E$. The explanation requires Fig. 2.5 wherein
we have set $x_i = t_i = 0$ for convenience.
Figure 2.5. The upper trajectory takes time $t$ while the lower takes $t + \Delta t$.
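The derivative we are computing is governed by the change in action of the
classical path due to a change in travel time by $\Delta t$, holding the end points $x_i$ and $x_f$ fixed.
From the figure it is clear that now the particle takes a different classical trajectory
$x(t) = x_{cl}(t) + \eta(t)$ with $\eta(0) = 0$
so that the total change in action comes from the difference in paths between $t = 0$
and $t = t_f$ as well as the entire action due to the extra travel between $t_f$ and $t_f + \Delta t$.
Only the latter is given by $\mathcal{L}\,\Delta t$. The correct answer is then
$\delta S_{cl} = \int_0^{t_f} \left[ \frac{\partial \mathcal{L}}{\partial x}\,\eta(t) + \frac{\partial \mathcal{L}}{\partial \dot{x}}\,\dot{\eta}(t) \right] dt + \mathcal{L}(t_f)\,\Delta t$
$= \int_0^{t_f} \left[ \frac{\partial \mathcal{L}}{\partial x} - \frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot{x}} \right]_{x_{cl}} \eta(t)\, dt + \int_0^{t_f} \frac{d}{dt}\left[ \frac{\partial \mathcal{L}}{\partial \dot{x}}\,\eta(t) \right] dt + \mathcal{L}(t_f)\,\Delta t$
$= 0 + \left. \frac{\partial \mathcal{L}}{\partial \dot{x}}\,\eta(t) \right|_{t_f} + \mathcal{L}(t_f)\,\Delta t.$
It is clear from the figure that $\eta(t_f) = -\dot{x}(t_f)\,\Delta t$ so that
$\delta S_{cl} = \left[ -\frac{\partial \mathcal{L}}{\partial \dot{x}}\,\dot{x} + \mathcal{L} \right]_{t_f} \Delta t = -\mathcal{H}(t_f)\,\Delta t$
from which the result follows.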
Exercise 2.8.6. Show that $\partial S_{cl}/\partial x_f = p(t_f)$.
Exercise 2.8.7. Consider the harmonic oscillator, for which the general solution is
$x(t) = A \cos\omega t + B \sin\omega t$.
Express the energy in terms of $A$ and $B$ and note that it does not depend on time. Now choose
$A$ and $B$ such that $x(0) = x_1$ and $x(T) = x_2$. Write down the energy in terms of $x_1$, $x_2$, and $T$.
Show that the action for the trajectory connecting $x_1$ and $x_2$ is
$S_{cl}(x_1, x_2, T) = \frac{m\omega}{2 \sin\omega T} \left[ (x_1^2 + x_2^2) \cos\omega T - 2 x_1 x_2 \right]$.
Verify that $\partial S_{cl}/\partial T = -E$.
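A numerical spot check of Eq. (2.8.18) for the oscillator may be helpful before attempting the algebra. The sketch below is not a solution of the exercise: it takes the action quoted in Exercise 2.8.7 and the standard oscillator energy $E = \frac{1}{2}m\omega^2(A^2 + B^2)$ as given, and the parameter values and function names are arbitrary illustrative choices. It compares a central-difference estimate of $\partial S_{cl}/\partial T$ with $-E$.

```python
import math

def action(x1, x2, T, m=1.0, omega=2.0):
    """Classical action of the oscillator path from x1 to x2 in time T."""
    s, c = math.sin(omega * T), math.cos(omega * T)
    return m * omega / (2.0 * s) * ((x1**2 + x2**2) * c - 2.0 * x1 * x2)

def energy(x1, x2, T, m=1.0, omega=2.0):
    """Energy of x(t) = A cos(wt) + B sin(wt) with x(0) = x1, x(T) = x2."""
    A = x1
    B = (x2 - x1 * math.cos(omega * T)) / math.sin(omega * T)
    return 0.5 * m * omega**2 * (A**2 + B**2)

# Arbitrary end points and travel time; sin(omega * T) must not vanish.
x1, x2, T, h = 0.3, -0.7, 0.4, 1e-6
dS_dT = (action(x1, x2, T + h) - action(x1, x2, T - h)) / (2.0 * h)
E = energy(x1, x2, T)
print(dS_dT, -E)
```

The two printed numbers agree to within the accuracy of the finite difference.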
All Is Not Well with
Classical Mechanics
It was mentioned in the Prelude that as we keep expanding our domain of observations
we must constantly check to see if the existing laws of physics continue to
explain the new phenomena, and that, if they do not, we must try to find new laws
that do. In this chapter you will get acquainted with experiments that betray the
inadequacy of the classical scheme. The experiments to be described were never
performed exactly as described here, but they contain the essential features of the
actual experiments that were performed (in the first quarter of this century) with
none of their inessential complications.
3.1. Particles and Waves in Classical Physics
There exist in classical physics two distinct entities: particles and waves. We
have studied the particles in some detail in the last chapter and may summarize their
essential features as follows. Particles are localized bundles of energy and momentum.
They are described at any instant by the state parameters $q$ and $\dot{q}$ (or $q$ and $p$). These
parameters evolve in time according to some equations of motion. Given the initial
values $q(t_i)$ and $\dot{q}(t_i)$ at time $t_i$, the trajectory $q(t)$ may be deduced for all future
times from the equations of motion. A wave, in contrast, is a disturbance spread over
space. It is described by a wave function $\psi(\mathbf{r}, t)$ which characterizes the disturbance at
the point $\mathbf{r}$ at time $t$.
In the case of sound waves, $\psi$ is the excess air pressure above the normal, while
in the case of electromagnetic waves, $\psi$ can be any component of the electric field
vector $\mathbf{E}$. The analogs of $q$ and $\dot{q}$ for a wave are $\psi$ and $\dot{\psi}$ at each point $\mathbf{r}$, assuming
$\psi$ obeys a second-order wave equation in time, such as
$\nabla^2 \psi = \frac{1}{c^2} \frac{\partial^2 \psi}{\partial t^2}$
which describes waves propagating at the speed of light, $c$. Given $\psi(\mathbf{r}, 0)$ and $\dot{\psi}(\mathbf{r}, 0)$
one can get the wave function $\psi(\mathbf{r}, t)$ for all future times by solving the wave
equation.
Figure 3.1. (a) When a wave $\psi = e^{i(ky - \omega t)}$ is incident on the screen with either slit $S_1$
or $S_2$ open, the intensity patterns $I_1$ and $I_2$, respectively, are measured by the row of
detectors on AB. (b) With both slits open, the pattern $I_{1+2}$ is observed. Note that
$I_{1+2} \neq I_1 + I_2$. This is called interference.
Of special interest to us are waves that are periodic in space and time, called
plane waves. In one dimension, the plane wave may be written as
$\psi(x, t) = A \exp\left[ i\left( \frac{2\pi}{\lambda} x - \frac{2\pi}{T} t \right) \right] = A \exp[i\phi]$    (3.1.1)
At some given time $t$, the wave is periodic in space with a period $\lambda$, called its
wavelength, and likewise at a given point $x$, it is periodic in time, repeating itself
every $T$ seconds, $T$ being called the time period. We will often use, instead of $\lambda$ and
$T$, the related quantities $k = 2\pi/\lambda$ called the wave number and $\omega = 2\pi/T$ called the
(angular) frequency. In terms of the phase $\phi$ in Eq. (3.1.1), $k$ measures the phase
change per unit length at any fixed time $t$, while $\omega$ measures the phase change per
unit time at any fixed point $x$. This wave travels at a speed $v = \omega/k$. To check this
claim, note that if we start out at a point where $\phi = 0$ and move along $x$ at a rate
$x = (\omega/k)t$, $\phi$ remains zero. The overall scale $A$ up front is called the amplitude. For
any wave, the intensity is defined to be $I = |\psi|^2$. For a plane wave this is a constant
equal to $|A|^2$. If $\psi$ describes an electromagnetic wave, the intensity is a measure of
the energy and momentum carried by the wave. [Since the electromagnetic field is
real, only the real part of $\psi$ describes it. However, time averages of the energy and
momentum flow are still proportional to the intensity (as defined above) in the case
of plane waves.]
Plane waves in three dimensions are written as
$\psi(\mathbf{r}, t) = A e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \qquad \omega = |\mathbf{k}|\, v$    (3.1.2)
where each component $k_i$ gives the phase change per unit length along the $i$th axis.
One calls $\mathbf{k}$ the wave vector.‡
‡ Unfortunately we also use k to denote the unit vector along the z axis. It should be clear from the
context what it stands for.
3.2. An Experiment with Waves and Particles (Classical)
Waves exhibit a phenomenon called interference, which is peculiar to them and
is not exhibited by particles described by classical mechanics. This phenomenon is
illustrated by the following experiment (Fig. 3.1a). Let a wave $\psi = A e^{i(ky - \omega t)}$ be
incident normally on a screen with slits $S_1$ and $S_2$, which are a distance $a$ apart. At
a distance $d$ parallel to it is a row of detectors that measures the intensity as a
function of the position $x$ measured along AB.
Figure 3.2. (a) Intensity pattern when $S_1$ or $S_2$ is open, due to a beam of incident particles.
(b) The pattern with both slits open according to classical mechanics ($I_{1+2} = I_1 + I_2$).
If we first keep only $S_1$ open, the incident wave will come out of $S_1$ and propagate
radially outward. One may think of $S_1$ as the virtual source of this wave $\psi_1$, which
has the same frequency and wavelength as the incident wave. The intensity pattern
$I_1 = |\psi_1|^2$ is registered by the detectors. Similarly if $S_2$ is open instead of $S_1$, the wave
$\psi_2$ produces the pattern $I_2 = |\psi_2|^2$. In both cases the arrival of energy at the detectors
is a smooth function of $x$ and $t$.
Now if both $S_1$ and $S_2$ are opened, both waves $\psi_1$ and $\psi_2$ are present and
produce an intensity pattern $I_{1+2} = |\psi_1 + \psi_2|^2$.
The interesting thing is that $I_{1+2} \neq I_1 + I_2$, but rather the interference pattern
shown in Fig. 3.1b. The ups and downs are due to the fact that the waves $\psi_1$ and
$\psi_2$ have to travel different distances $d_1$ and $d_2$ to arrive at some given $x$ (see Fig.
3.1a) and thus are not always in step. In particular, the maxima correspond to the
case $d_2 - d_1 = n\lambda$ ($n$ is an integer), when the waves arrive exactly in step, and the
minima correspond to the case $d_2 - d_1 = (2n + 1)\lambda/2$, when the waves are exactly out
of step. In terms of the phases $\phi_1$ and $\phi_2$, $\phi_2(x) - \phi_1(x) = 2n\pi$ at a maximum and
$\phi_2(x) - \phi_1(x) = (2n + 1)\pi$ at a minimum. One can easily show that the spacing $\Delta x$
between two adjacent maxima is $\Delta x = \lambda d/a$.
The feature to take special note of is that if $x_{min}$ is an interference minimum,
there is more energy flowing into $x_{min}$ with just one slit open than with both. In
other words, the opening of an extra slit can actually reduce the energy flow into
$x_{min}$.
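The interference pattern just described can be reproduced with a few lines of arithmetic. The sketch below makes simplifying assumptions that are not in the text: each slit is treated as a point source of unit amplitude (the fall-off of amplitude with distance is ignored), the slits sit at $\pm a/2$, and the values of $a$, $d$, and $\lambda$ are arbitrary. It evaluates $I_1$, $I_2$, and $I_{1+2}$ at a maximum, at the adjacent minimum, and at the next maximum.

```python
import cmath
import math

def intensities(x, a, d, wavelength):
    """Return (I1, I2, I12) at detector position x on the line AB."""
    k = 2.0 * math.pi / wavelength
    d1 = math.hypot(d, x - a / 2.0)   # path length from S1, placed at +a/2
    d2 = math.hypot(d, x + a / 2.0)   # path length from S2, placed at -a/2
    psi1 = cmath.exp(1j * k * d1)     # unit amplitude: a simplifying assumption
    psi2 = cmath.exp(1j * k * d2)
    return abs(psi1) ** 2, abs(psi2) ** 2, abs(psi1 + psi2) ** 2

a, d, wavelength = 1.0, 1000.0, 0.01  # illustrative values with d >> a >> wavelength
spacing = wavelength * d / a          # predicted spacing between adjacent maxima

I1, I2, I_max = intensities(0.0, a, d, wavelength)            # central maximum
_, _, I_min = intensities(spacing / 2.0, a, d, wavelength)    # adjacent minimum
_, _, I_next = intensities(spacing, a, d, wavelength)         # next maximum
print(I1 + I2, I_max, I_min, I_next)
```

With both slits open the intensity at the minimum falls below the value obtained with a single slit open, while at the maxima it exceeds $I_1 + I_2$.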
Consider next the experiment with particles (Fig. 3.2a). The source of the incident
plane waves is replaced by a source of particles that shoots them toward the
screen with varying directions but fixed energy. Let the line AB be filled with an
array of particle detectors. Let us define the intensity $I(x)$ to be the number of
particles arriving per second at any given $x$. The patterns with $S_1$ or $S_2$ open are
shown in Fig. 3.2a. These look very much like the corresponding patterns for the
wave. The only difference will be that the particles arrive not continuously, but in a
staccato fashion, each particle triggering a counter at some single point x at the time
of arrival. Although this fact may be obscured if the beam is dense, it can be easily
detected as the incident flux is reduced.
What if both $S_1$ and $S_2$ are opened? Classical mechanics has an unambiguous
prediction: $I_{1+2} = I_1 + I_2$. The reasoning is as follows: each particle travels along a
definite trajectory that passes via $S_1$ or $S_2$ to the destination $x$. To a particle headed
for $S_1$, it is immaterial whether $S_2$ is open or closed. Being localized in space it has
no way of even knowing if $S_2$ is open or closed, and thus cannot respond to it in
any way. Thus the number coming via $S_1$ to $x$ is independent of whether $S_2$ is open
or not and vice versa. It follows that $I_{1+2} = I_1 + I_2$ (Fig. 3.2b).
The following objection may be raised: although particles heading for $S_1$ are
not aware that $S_2$ is open, they certainly can be deflected by those coming out of
$S_2$, if, for instance, the former are heading for $x_1$ and the latter for $x_2$ (see Fig. 3.1a).
This objection can be silenced by sending in one particle at a time. A given
particle will of course not produce a pattern like $I_1$ or $I_2$ by itself; it will go to some
point $x$. If, however, we make a histogram, the envelope of this histogram, after
many counts, will define the smooth functions $I_1$, $I_2$, and $I_{1+2}$. Now the conclusion
$I_{1+2} = I_1 + I_2$ is inevitable.
This is what classical physics predicts particles and waves will do in the
double-slit experiment.
3.3. The Double-Slit Experiment with Light
Consider now what happens when we perform the following experiment to check
the classical physics notion that light is an electromagnetic wave phenomenon.
We set up the double slit as in Fig. 3.1a, with a row of light-sensitive meters
along AB and send a beam $\psi = A e^{i(ky - \omega t)}$ in a direction perpendicular to the screen.
(Strictly speaking, the electromagnetic wave must be characterized by giving the
orientation of the $\mathbf{E}$ and $\mathbf{B}$ vectors in addition to $\omega$ and $k$. However, for a plane
wave, $\mathbf{B}$ is uniquely fixed by $\mathbf{E}$. If we further assume $\mathbf{E}$ is polarized perpendicular to
the page, this polarization is unaffected by the double slit. We can therefore suppress
the explicit reference to this constant vector and represent the field as a scalar function
$\psi$.) We find that with the slits open one at a time we get patterns $I_1$ and $I_2$, and
with both slits open we get the interference pattern $I_{1+2}$ as in Figs. 3.1a and 3.1b.
(The interference pattern is of course what convinced classical physicists that light
was a wave phenomenon.) The energy arrives at the detectors smoothly and continuously
as befitting a wave.
Say we repeat the experiment with a change that is expected (in classical physics)
to produce no qualitative effects. We start with $S_1$ open and cut down the intensity.
A very strange thing happens. We find that the energy is not arriving continuously,
but in sudden bursts, a burst here, a burst there, etc. We now cut down the intensity
further so that only one detector gets activated at a given time and there is enough
of a gap, say a millisecond, between counts. As each burst occurs at some x, we
record it and plot a histogram. With enough data, the envelope of the histogram
becomes, of course, the pattern $I_1$. We have made an important discovery: light
energy is not continuous—it comes in bundles. This discrete nature is obscured in
intense beams, for the bundles come in so fast and all over the line AB, that the
energy flow seems continuous in space and time.
We pursue our study of these bundles, called photons, in some detail and find
the following properties:
1. Each bundle carries the same energy E.
2. Each bundle carries the same momentum p.
3. $E = pc$. From the famous equation $E^2 = p^2c^2 + m^2c^4$, we deduce that these bundles
are particles of zero mass.
4. If we vary the frequency of the light source we discover that
$E = \hbar\omega$    (3.3.1)
$p = \hbar k$    (3.3.2)
where $\hbar = h/2\pi$ is a constant. The constant $h$ is called Planck's constant, and has the
dimensions of erg sec, which is the same as that of action and angular momentum.
Its value is
$\frac{h}{2\pi} = \hbar \simeq 10^{-27}$ erg sec    (3.3.3)
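To get a feel for the numbers in Eqs. (3.3.1) and (3.3.2), here is a small sketch in CGS units. The rounded constants and the choice of a visible wavelength of $5 \times 10^{-5}$ cm are assumptions made for illustration; the function name is hypothetical.

```python
import math

HBAR = 1.0546e-27   # erg sec (rounded value)
C = 2.998e10        # cm / sec (rounded value)

def photon_energy_momentum(wavelength_cm):
    """Return (E, p) for a photon of the given wavelength, in erg and g cm/sec."""
    k = 2.0 * math.pi / wavelength_cm   # wave number
    omega = C * k                       # angular frequency of light in vacuum
    return HBAR * omega, HBAR * k       # E = hbar * omega, p = hbar * k

E, p = photon_energy_momentum(5.0e-5)   # green light, about 500 nm
print(E, p, E / (p * C))
```

The energy comes out at a few times $10^{-12}$ erg, and the ratio $E/pc$ equals one, as property 3 requires.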
For those interested in history, the actual experiment that revealed the granular
nature of light is called the photoelectric effect. The correct explanation of this
experiment, in terms of photons, was given by Einstein in 1905.
That light is made of particles will, of course, surprise classical physicists but
will not imply the end of classical physics, for physicists are used to the idea that
phenomena that seem continuous at first sight may in reality be discrete. They will
cheerfully plunge into the study of the dynamics of the photons, trying to find the
equations of motion for its trajectory and so on. What really undermines classical
physics is the fact that if we now open both slits, still keeping the intensity so low
that only one photon is in the experimental region at a given time, and watch the
histogram take shape, we find that $I_{1+2}$ does not equal $I_1 + I_2$, as would be expected of
particles, but is instead an interference pattern characteristic of wave number $k$.
This result completely rules out the possibility that photons move in well-defined
trajectories like the particles of classical mechanics, for if this were true, a photon
going in via $S_1$ should be insensitive to whether $S_2$ is open or not (and vice versa),
and the result $I_{1+2} = I_1 + I_2$ is inescapable! To say this another way, consider a point
$x_{min}$ which is an interference minimum. More photons arrive here with either $S_1$ or
$S_2$ open than with both open. If photons followed definite trajectories, it is
incomprehensible how opening an extra pathway can reduce the number coming to $x_{min}$. Since
we are doing the experiment with one photon at a time, one cannot even raise the
improbable hypothesis that photons coming out of $S_1$ collide with those coming out
of $S_2$ to modify (miraculously) the smooth pattern $I_1 + I_2$ into the wiggly interference
pattern.
From these facts Born drew the following conclusion: with each photon is
associated a wave $\psi$, called the probability amplitude or simply amplitude, whose
modulus squared $|\psi(x)|^2$ gives the probability of finding the particle at $x$. [Strictly
speaking, we must not refer to $|\psi(x)|^2$ as the probability for a given $x$, but rather
as the probability density at $x$ since $x$ is a continuous variable. These subtleties can,
however, wait.] The entire experiment may be understood in terms of this hypothesis
as follows. Every incoming photon of energy $E$ and momentum $p$ has a wave function
$\psi$ associated with it, which is a plane wave with $\omega = E/\hbar$ and $k = p/\hbar$. This wave
interferes with itself and forms the oscillating pattern $|\psi(x)|^2$ along AB, which gives
the probability that the given photon will arrive at $x$. A given photon of course arrives
at some definite $x$ and does not reveal the probability distribution. If, however, we
wait till several photons, all described by the same $\psi$, have arrived, the number at
any $x$ will become proportional to the probability function $|\psi(x)|^2$. Likewise, if an
intense (macroscopic) monochromatic beam is incident, many photons, all described
by the same wave and hence the same probability distribution, arrive at the same
time and all along the line AB. The intensity distribution then assumes the shape of
the probability distribution right away and the energy flow seems continuous and in
agreement with the predictions of classical electromagnetic theory.
The main point to note, besides the probability interpretation, is that a wave
is associated not with a beam of photons, but with each photon. If the beam is
monochromatic, every photon is given by the same $\psi$ and the same probability
distribution. A large ensemble of such photons will reproduce the phenomena
expected of a classical electromagnetic wave $\psi$ and the probabilistic aspect will be
hidden.
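The way a histogram of single-photon arrivals builds up into the interference pattern can be mimicked with a short simulation. This is only a sketch under stated assumptions: the probability density is taken to be proportional to $1 + \cos(2\pi x/\Delta x)$ on a finite stretch of AB (an idealized two-slit pattern with equal amplitudes), arrivals are drawn by rejection sampling, and all names and parameter values are illustrative.

```python
import math
import random

def sample_arrivals(n_photons, spacing, half_width, seed=0):
    """Draw arrival positions one photon at a time from |psi(x)|^2.

    Rejection sampling from a density proportional to
    1 + cos(2 pi x / spacing) on [-half_width, half_width].
    """
    rng = random.Random(seed)
    arrivals = []
    while len(arrivals) < n_photons:
        x = rng.uniform(-half_width, half_width)
        # Accept x with probability (1 + cos(...)) / 2.
        if rng.uniform(0.0, 2.0) < 1.0 + math.cos(2.0 * math.pi * x / spacing):
            arrivals.append(x)
    return arrivals

def histogram(arrivals, half_width, n_bins):
    """Count arrivals in n_bins equal bins covering [-half_width, half_width]."""
    counts = [0] * n_bins
    bin_width = 2.0 * half_width / n_bins
    for x in arrivals:
        index = min(int((x + half_width) / bin_width), n_bins - 1)
        counts[index] += 1
    return counts

spacing, half_width, n_bins = 1.0, 2.0, 40
arrivals = sample_arrivals(20_000, spacing, half_width)
counts = histogram(arrivals, half_width, n_bins)
print(counts)
```

Each draw lands at one definite $x$, yet after many draws the bins near the maxima are heavily populated and those near the minima are nearly empty.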
3.4. Matter Waves (de Broglie Waves)
That light, which one thought was a pure wave phenomenon, should consist of
photons, prompted de Broglie to conjecture that entities like the electron, generally
believed to be particles, should exhibit wavelike behavior. More specifically, he
conjectured, in analogy with photons, that particles of momentum $p$ will produce an
interference pattern corresponding to a wave number $k = p/\hbar$ in the double-slit
experiment. This prediction was verified for electrons by Davisson and Germer, shortly
thereafter. It is now widely accepted that all particles are described by probability
amplitudes $\psi(x)$, and that the assumption that they move in definite trajectories is
ruled out by experiment.
But what about common sense, which says that billiard balls and baseballs
travel along definite trajectories? How did classical mechanics survive for three
centuries? The answer is that the wave nature of matter is not apparent for macroscopic
phenomena since $\hbar$ is so small. The precise meaning of this explanation will become
clear only after we fully master quantum mechanics. Nonetheless, the following
example should be instructive. Suppose we do the double-slit experiment with pellets
of mass 1 g, moving at 1 cm/sec. The wavelength associated with these particles is
$\lambda = \frac{2\pi}{k} = \frac{2\pi\hbar}{p} \simeq 10^{-26}$ cm
which is $10^{-13}$ times smaller than the radius of the proton! For any reasonable values
of the parameters $a$ and $d$ (see Fig. 3.1b), the interference pattern would be so dense
in $x$ that our instruments will only measure the smooth average, which will obey
$I_{1+2} = I_1 + I_2$ as predicted classically.
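The estimate is easy to reproduce. In the sketch below the value of $\hbar$ is rounded, the proton radius is taken as $10^{-13}$ cm (an order-of-magnitude assumption), and the slit separation and screen distance used for the fringe spacing are hypothetical laboratory-scale values.

```python
import math

HBAR = 1.0546e-27          # erg sec (rounded value)
PROTON_RADIUS = 1.0e-13    # cm, order of magnitude only (an assumption)

def de_broglie_wavelength(mass_g, speed_cm_per_sec):
    """Wavelength 2 pi hbar / p of a particle, in cm."""
    p = mass_g * speed_cm_per_sec
    return 2.0 * math.pi * HBAR / p

lam = de_broglie_wavelength(1.0, 1.0)   # 1 g pellet moving at 1 cm/sec
ratio = lam / PROTON_RADIUS

# Fringe spacing lambda * d / a for hypothetical values a = 1 cm, d = 100 cm.
a, d = 1.0, 100.0
fringe_spacing = lam * d / a
print(lam, ratio, fringe_spacing)
```

Even with the screen a meter away, the fringes are many orders of magnitude finer than any detector could resolve, which is why only the smooth average is seen.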
3.5. Conclusions
The main objective of this chapter was to expose the inadequacy of classical
physics in explaining certain phenomena and, incidentally, to get a glimpse of what
the new (quantum) physics ought to look like. We found that entities such as the
electron are particles in the classical sense in that when detected they seem to carry
all their energy, momentum, charge, etc. in localized form; and at the same time
they are not particlelike in that assuming they move along definite trajectories leads
to conflict with experiment. It appears that each particle has associated with it a
wave function $\psi(x, t)$, such that $|\psi(x, t)|^2$ gives the probability of finding it at a point
$x$ at time $t$. This is called wave-particle duality.
The dynamics of the particle is then the dynamics of this function $\psi(x, t)$ or, if
we think of functions as vectors in an infinite-dimensional space, of the ket $|\psi(t)\rangle$.
In the next chapter the postulates of quantum theory will define the dynamics in
terms of $|\psi(t)\rangle$. The postulates, which specify what sort of information is contained
in $|\psi(t)\rangle$ and how $|\psi(t)\rangle$ evolves with time, summarize the results of the
double-slit experiment and many others not mentioned here. The double-slit experiment was
described here to expose the inadequacy of classical physics and not to summarize
the entire body of experimental results from which all the postulates could be inferred.
Fortunately, the double-slit experiment contains most of the central features of the
theory, so that when the postulates are encountered in the next chapter, they will
appear highly plausible.
The Postulates - a General Discussion
Having acquired the necessary mathematical training and physical motivation, you
are now ready to get acquainted with the postulates of quantum mechanics. In this
chapter the postulates will be stated and discussed in broad terms to bring out
the essential features of quantum theory. The subsequent chapters will simply be
applications of these postulates to the solution of a variety of physically interesting
problems. Despite your preparation you may still find the postulates somewhat
abstract and mystifying on this first encounter. These feelings will, however,
disappear after you have worked with the subject for some time.
4.1. The Postulates‡
The following are the postulates of nonrelativistic quantum mechanics. We
consider first a system with one degree of freedom, namely, a single particle in one
space dimension. The straightforward generalization to more particles and higher
dimensions will be discussed towards the end of the chapter. In what follows, the
quantum postulates are accompanied by their classical counterparts (in the
Hamiltonian formalism) to provide some perspective.
‡ Recall the discussion in the Preface regarding the sense in which the word is used here.

I. Classical Mechanics: The state of a particle at any given time is specified by the two
variables $x(t)$ and $p(t)$, i.e., as a point in a two-dimensional phase space.
I. Quantum Mechanics: The state of the particle is represented by a vector $|\psi(t)\rangle$ in a
Hilbert space.

II. Classical Mechanics: Every dynamical variable $\omega$ is a function of $x$ and $p$:
$\omega = \omega(x, p)$.
II. Quantum Mechanics: The independent variables $x$ and $p$ of classical mechanics are
represented by Hermitian operators $X$ and $P$ with the following matrix elements in the
eigenbasis of $X$:†
$\langle x|X|x'\rangle = x\,\delta(x - x')$
$\langle x|P|x'\rangle = -i\hbar\,\delta'(x - x')$
The operators corresponding to dependent variables $\omega(x, p)$ are given Hermitian operators
$\Omega(X, P) = \omega(x \to X, p \to P)$.§

III. Classical Mechanics: If the particle is in a state given by $x$ and $p$, the measurement‖
of the variable $\omega$ will yield a value $\omega(x, p)$. The state will remain unaffected.
III. Quantum Mechanics: If the particle is in a state $|\psi\rangle$, measurement of the variable
(corresponding to) $\Omega$ will yield one of the eigenvalues $\omega$ with probability
$P(\omega) \propto |\langle\omega|\psi\rangle|^2$. The state of the system will change from $|\psi\rangle$ to $|\omega\rangle$
as a result of the measurement.

IV. Classical Mechanics: The state variables change with time according to Hamilton's
equations:
$\dot{x} = \frac{\partial \mathcal{H}}{\partial p}, \qquad \dot{p} = -\frac{\partial \mathcal{H}}{\partial x}$
IV. Quantum Mechanics: The state vector $|\psi(t)\rangle$ obeys the Schrödinger equation
$i\hbar \frac{d}{dt}|\psi(t)\rangle = H|\psi(t)\rangle$
where $H(X, P) = \mathcal{H}(x \to X, p \to P)$ is the quantum Hamiltonian operator and $\mathcal{H}$ is the
Hamiltonian for the corresponding classical problem.
4.2. Discussion of Postulates I-III
The postulates (of classical and quantum mechanics) fall naturally into two
sets: the first three, which tell us how the system is depicted at a given time, and the
last, which specifies how this picture changes with time. We will confine our attention
to the first three postulates in this section, leaving the fourth for the next.
† Note that the $X$ operator is the same one discussed at length in Section 1.10. Likewise $P = \hbar K$,
where $K$ was also discussed therein. You may wish to go over that section now to refresh your memory.
§ By this we mean that $\Omega$ is the same function of $X$ and $P$ as $\omega$ is of $x$ and $p$.
‖ That is, in an ideal experiment consistent with the theory. It is assumed you are familiar with the ideal
classical measurement which can determine the state of the system without disturbing it in any way. A
discussion of ideal quantum measurements follows.
The first postulate states that a particle is described by a ket $|\psi\rangle$ in a Hilbert
space which, you will recall, contains proper vectors normalizable to unity as well as
improper vectors, normalizable only to the Dirac delta functions.¶ Now, a ket in
such a space has in general an infinite number of components in a given basis. One
wonders why a particle, which had only two independent degrees of freedom, $x$ and
$p$, in classical mechanics, now needs to be specified by an infinite number of variables.
What do these variables tell us about the particle? To understand this we must go
on to the next two postulates, which answer exactly this question. For the present
let us note that the double-slit experiment has already hinted to us that a particle
such as the electron needs to be described by a wave function $\psi(x)$. We have seen
in Section 1.10 that a function $f(x)$ may be viewed as a ket $|f\rangle$ in a Hilbert space.
The ket $|\psi\rangle$ of quantum mechanics is none other than the vector representing the
probability amplitude $\psi(x)$ introduced in the double-slit experiment.
When we say that $|\psi\rangle$ is an element of a vector space we mean that if $|\psi\rangle$ and
$|\psi'\rangle$ represent possible states of a particle so does $\alpha|\psi\rangle + \beta|\psi'\rangle$. This is called the
principle of superposition. The principle by itself is not so new: we know in classical
physics, for example, that if $f(x)$ and $g(x)$ [with $f(0) = f(L) = g(0) = g(L) = 0$] are
two possible displacements of a string, so is the superposition $\alpha f(x) + \beta g(x)$. What
is new is the interpretation of the superposed state $\alpha|\psi\rangle + \beta|\psi'\rangle$. In the case of the
string, the state $\alpha f + \beta g$ has very different attributes from the states $f$ and $g$: it will
look different, have a different amount of stored elastic energy, and so on. In quantum
theory, on the other hand, the state $\alpha|\psi\rangle + \beta|\psi'\rangle$ will, loosely speaking, have
attributes that sometimes resemble those of $|\psi\rangle$ and at other times those of $|\psi'\rangle$. There
is, however, no need to speak loosely, since we have postulates II and III to tell us
exactly how the state vector $|\psi\rangle$ is to be interpreted in quantum theory. Let us find
out.
In classical mechanics when a state $(x, p)$ is given, one can say that any dynamical
variable $\omega$ has a value $\omega(x, p)$, in the sense that if the variable is measured the
result $\omega(x, p)$ will obtain. What is the analogous statement one can make in quantum
mechanics given that the particle is in a state $|\psi\rangle$? The answer is provided by
Postulates II and III, in terms of the following steps:
Step 1. Construct the corresponding quantum operator $\Omega = \omega(x \to X, p \to P)$,
where $X$ and $P$ are the operators defined in postulate II.
Step 2. Find the orthonormal eigenvectors $|\omega_i\rangle$ and eigenvalues $\omega_i$ of $\Omega$.
Step 3. Expand $|\psi\rangle$ in this basis:
$|\psi\rangle = \sum_i |\omega_i\rangle \langle\omega_i|\psi\rangle$
Step 4. The probability $P(\omega)$ that the result $\omega$ will obtain is proportional to
the modulus squared of the projection of $|\psi\rangle$ along the eigenvector $|\omega\rangle$, that is
$P(\omega) \propto |\langle\omega|\psi\rangle|^2$. In terms of the projection operator $\mathbb{P}_\omega = |\omega\rangle\langle\omega|$,
$P(\omega) \propto |\langle\omega|\psi\rangle|^2 = \langle\psi|\omega\rangle\langle\omega|\psi\rangle = \langle\psi|\mathbb{P}_\omega|\psi\rangle = \langle\psi|\mathbb{P}_\omega \mathbb{P}_\omega|\psi\rangle = \langle\mathbb{P}_\omega\psi|\mathbb{P}_\omega\psi\rangle$.
There is a tremendous amount of information contained in these steps. Let us
note, for the present, the following salient points.
¶ The status of the two classes will be clarified later in this chapter.
(1) The theory makes only probabilistic predictions for the result of a measurement
of $\Omega$. Further, it assigns (relative) probabilities only for obtaining some eigenvalue
$\omega$ of $\Omega$. Thus the only possible values of $\Omega$ are its eigenvalues. Since postulate
II demands that $\Omega$ be Hermitian, these eigenvalues are all real.
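(2) Since we are told that $P(\omega_i) \propto |\langle\omega_i|\psi\rangle|^2$, the quantity $|\langle\omega_i|\psi\rangle|^2$ is only
the relative probability. To get the absolute probability, we divide $|\langle\omega_i|\psi\rangle|^2$ by the
sum of all relative probabilities:
$P(\omega_i) = \frac{|\langle\omega_i|\psi\rangle|^2}{\sum_j |\langle\omega_j|\psi\rangle|^2} = \frac{|\langle\omega_i|\psi\rangle|^2}{\langle\psi|\psi\rangle}$    (4.2.1)
It is clear that if we had started with a normalized state
$|\psi'\rangle = \frac{|\psi\rangle}{\langle\psi|\psi\rangle^{1/2}}$
we would have had
$P(\omega) = |\langle\omega|\psi'\rangle|^2$    (4.2.2)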
If $|\psi\rangle$ is a proper vector, such a rescaling is possible and will be assumed
hereafter. The probability interpretation breaks down if $|\psi\rangle$ happens to be one of
the improper vectors in the space, for in this case $\langle\psi|\psi\rangle = \delta(0)$ is the only sensible
normalization. The status of such vectors will be explained in Example 4.2.2 below.
Note that the condition $\langle\psi|\psi\rangle = 1$ is a matter of convenience and not a physical
restriction on the proper vectors. (In fact the set of all normalized vectors does not
even form a vector space. If $|\psi\rangle$ and $|\psi'\rangle$ are normalized, then an arbitrary linear
combination, $\alpha|\psi\rangle + \beta|\psi'\rangle$, is not.)
Note that the relative probability distributions corresponding to the states $|\psi\rangle$
and $\alpha|\psi\rangle$, when they are renormalized to unity, reduce to the same absolute probability
distribution. Thus, corresponding to each physical state, there exists not one
vector, but a ray or "direction" in Hilbert space. When we speak of the state of the
particle, we usually mean the ket $|\psi\rangle$ with unit norm. Even with the condition
$\langle\psi|\psi\rangle = 1$, we have the freedom to multiply the ket by a number of the form $e^{i\theta}$
without changing the physical state. This freedom will be exploited at times to make
the components of $|\psi\rangle$ in some basis come out real.
(3) If $|\psi\rangle$ is an eigenstate $|\omega_i\rangle$, the measurement of $\Omega$ is guaranteed to yield
the result $\omega_i$. A particle in such a state may be said to have a value $\omega_i$ for $\Omega$ in the
classical sense.
(4) When two states $|\omega_1\rangle$ and $|\omega_2\rangle$ are superposed to form a (normalized)
state, such as
$$|\psi\rangle = \frac{\alpha|\omega_1\rangle + \beta|\omega_2\rangle}{(|\alpha|^2 + |\beta|^2)^{1/2}}$$
one gets a state which, upon measurement of $\Omega$, can yield either $\omega_1$ or $\omega_2$ with
probabilities $|\alpha|^2/(|\alpha|^2 + |\beta|^2)$ and $|\beta|^2/(|\alpha|^2 + |\beta|^2)$, respectively. This is the peculiar
consequence of the superposition principle in quantum theory, referred to earlier. It
has no analog in classical mechanics. For example, if a dynamical variable of the
string in the state $\alpha f + \beta g$ is measured, one does not expect to get the value corresponding
to f some of the time and that corresponding to g the rest of the time;
instead, one expects a unique value generally distinct from both. Likewise, the
functions f and $\alpha f$ ($\alpha$ real) describe two distinct configurations of the string and are
not physically equivalent.
Figure 4.1. (a) The normalized ket in $\mathbb{V}^3(R)$ representing the state of the particle. (b) The $\Omega$ basis, $|\omega_1\rangle$,
$|\omega_2\rangle$, and $|\omega_3\rangle$. (c) The $\Omega$ and the $\Lambda$ bases. To get the statistical information on a variable, we find the
eigenvectors of the corresponding operator and project $|\psi\rangle$ on that basis.
(5) When one wants information about another variable $\Lambda$, one repeats the
whole process, finding the eigenvectors $|\lambda_i\rangle$ and the eigenvalues $\lambda_i$. Then
$$P(\lambda) = |\langle\lambda|\psi\rangle|^2$$
The bases of $\Omega$ and $\Lambda$ will of course be different in general. In summary, we have a
single ket $|\psi\rangle$ representing the state of the particle in Hilbert space, and it contains
the statistical prediction for all observables. To extract this information for any
observable, we must determine the eigenbasis of the corresponding operator and find
the projection of $|\psi\rangle$ along all its eigenkets.
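(6) As our interest switches from one variable, $\Omega$, to another, $\Lambda$, so does our
interest go from the kets $|\omega\rangle$ to the kets $|\lambda\rangle$. There is, however, no need to change
the basis each time. Suppose for example we are working in the $\Omega$ basis in which
$$|\psi\rangle = \sum_i |\omega_i\rangle\langle\omega_i|\psi\rangle$$
and $P(\omega_i) = |\langle\omega_i|\psi\rangle|^2$. If we want $P(\lambda)$ we take the operator $\Lambda$ (which is some
given matrix with elements $\Lambda_{ij} = \langle\omega_i|\Lambda|\omega_j\rangle$), find its eigenvectors $|\lambda_i\rangle$ (which are
column vectors with components $\langle\omega_j|\lambda_i\rangle$), and take the inner product $\langle\lambda_i|\psi\rangle$ in
this basis:
$$\langle\lambda_i|\psi\rangle = \sum_j \langle\lambda_i|\omega_j\rangle\langle\omega_j|\psi\rangle$$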
Example 4.2.1. Consider the following example from a fictitious Hilbert space
$\mathbb{V}^3(R)$ (Fig. 4.1). In Fig. 4.1a we have the normalized state $|\psi\rangle$, with no reference
to any basis. To get predictions on $\Omega$, we find its eigenbasis and express the state
vector $|\psi\rangle$ in terms of the orthonormal eigenvectors $|\omega_1\rangle$, $|\omega_2\rangle$, and $|\omega_3\rangle$ (Fig.
4.1b). Let us suppose
$$|\psi\rangle = \frac{1}{2}|\omega_1\rangle + \frac{1}{2}|\omega_2\rangle + \frac{1}{2^{1/2}}|\omega_3\rangle$$
This means that the values $\omega_1$, $\omega_2$, and $\omega_3$ are expected with probabilities 1/4, 1/4,
and 1/2, respectively, and other values of $\omega$ are impossible. If instead $|\psi\rangle$ were some
eigenvector, say $|\omega_1\rangle$, then the result $\omega_1$ would obtain with unit probability. Only a
particle in a state $|\psi\rangle = |\omega_i\rangle$ has a well-defined value of $\Omega$ in the classical sense. If
we want $P(\lambda_i)$ we construct the basis $|\lambda_1\rangle$, $|\lambda_2\rangle$, and $|\lambda_3\rangle$, which can in general be
distinct from the $\Omega$ basis. In our example (Fig. 4.1c) there is just one common
eigenvector $|\omega_3\rangle = |\lambda_3\rangle$.
Returning to our main discussion, there are a few complications that could arise
as one tries to carry out the steps 1 to 4. We discuss below the major ones and how
they are to be surmounted.
Complication 1: The Recipe $\Omega = \omega(x \to X, p \to P)$ Is Ambiguous. If, for example,
$\omega = xp$, we don't know if $\Omega = XP$ or $PX$ since $xp = px$ classically. There is no universal
recipe for resolving such ambiguities. In the present case, the rule is to use the
symmetric sum: $\Omega = (XP + PX)/2$. Notice incidentally that symmetrization also
renders $\Omega$ Hermitian. Symmetrization is the answer as long as $\Omega$ does not involve
products of two or more powers of X with two or more powers of P. If it does, only
experiment can decide the correct prescription. We will not encounter such cases in
this book.
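The remark that symmetrization renders the operator Hermitian can be checked numerically.
The sketch below uses two random Hermitian matrices as stand-ins for X and P; this is
only an illustration of the algebra $(XP)^\dagger = PX$, since finite matrices cannot
satisfy the true canonical commutation rule. The helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(n):
    """Random n x n Hermitian matrix, used here as a stand-in operator."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

def is_hermitian(m):
    """True if m equals its own conjugate transpose (up to rounding)."""
    return np.allclose(m, m.conj().T)

X = random_hermitian(4)
P = random_hermitian(4)

product = X @ P                    # adjoint is P X, which differs from X P here
symmetrized = (X @ P + P @ X) / 2  # adjoint equals itself
```

For generic non-commuting Hermitian matrices, `product` fails the Hermiticity check while
`symmetrized` passes it.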
Complication 2: The Operator $\Omega$ Is Degenerate. Let us say $\omega_1 = \omega_2 = \omega$. What
is $P(\omega)$ in this case? We select some orthonormal basis $|\omega, 1\rangle$ and $|\omega, 2\rangle$ in the
eigenspace $\mathbb{V}_\omega$ with eigenvalue $\omega$. Then
$$P(\omega) = |\langle\omega, 1|\psi\rangle|^2 + |\langle\omega, 2|\psi\rangle|^2$$
which is the modulus squared of the projection of $|\psi\rangle$ in the degenerate eigenspace.
This is the result we will get if we assume that $\omega_1$ and $\omega_2$ are infinitesimally distinct
and ask for $P(\omega_1 \text{ or } \omega_2)$. In terms of the projection operator for the eigenspace,
$$\mathbb{P}_\omega = |\omega, 1\rangle\langle\omega, 1| + |\omega, 2\rangle\langle\omega, 2| \tag{4.2.3a}$$
we have
$$P(\omega) = \langle\psi|\mathbb{P}_\omega|\psi\rangle = \langle\mathbb{P}_\omega\psi|\mathbb{P}_\omega\psi\rangle \tag{4.2.3b}$$
In general, one can replace in Postulate III
$$P(\omega) \propto \langle\psi|\mathbb{P}_\omega|\psi\rangle$$
where $\mathbb{P}_\omega$ is the projection operator for the eigenspace with eigenvalue $\omega$. Then
postulate III as stated originally would become a special case in which there is no
degeneracy and each eigenspace is simply an eigenvector.
In our example from $\mathbb{V}^3(R)$, if $\omega_1 = \omega_2 = \omega$ (Fig. 4.1b) then $P(\omega)$ is the square
of the component of $|\psi\rangle$ in the "x-y" plane.
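A small numerical check of the degenerate case, using a state in $\mathbb{V}^3(R)$ with components
(1/2, 1/2, $1/2^{1/2}$) and taking the first two basis vectors to span the degenerate
eigenspace. The second basis `f1`, `f2` is an illustrative choice meant to show that the
answer does not depend on which orthonormal basis is picked inside the eigenspace.

```python
import numpy as np

psi = np.array([0.5, 0.5, 1 / np.sqrt(2)])  # normalized state

# One orthonormal basis of the degenerate eigenspace (the "x-y" plane).
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])

# Sum of squared projections along the two basis vectors.
p_sum = abs(e1 @ psi) ** 2 + abs(e2 @ psi) ** 2

# Same quantity through the projection operator for the eigenspace, Eq. (4.2.3a).
projector = np.outer(e1, e1) + np.outer(e2, e2)
p_projector = psi @ projector @ psi

# A different orthonormal basis of the same eigenspace gives the same probability.
f1 = (e1 + e2) / np.sqrt(2)
f2 = (e1 - e2) / np.sqrt(2)
p_rotated = abs(f1 @ psi) ** 2 + abs(f2 @ psi) ** 2
```

All three numbers agree, as expected if $P(\omega)$ depends only on the eigenspace and not on
the basis chosen within it.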
Complication 3: The Eigenvalue Spectrum of $\Omega$ Is Continuous. In this case one
expands $|\psi\rangle$ as
$$|\psi\rangle = \int |\omega\rangle\langle\omega|\psi\rangle\, d\omega$$
One expects that as $\omega$ varies continuously, so will $\langle\omega|\psi\rangle$, that is to say, one expects
$\langle\omega|\psi\rangle$ to be a smooth function $\psi(\omega)$. To visualize this function one introduces an
auxiliary one-dimensional space, called the $\omega$ space, the points in which are labeled
by the coordinate $\omega$. In this space $\psi(\omega)$ will be a smooth function of $\omega$ and is called
the wave function in the $\omega$ space. We are merely doing the converse of what we did
in Section 1.10 wherein we started with a function f(x) and tried to interpret it as
the components of an infinite-dimensional ket $|f\rangle$ in the $|x\rangle$ basis. As far as the
state vector $|\psi\rangle$ is concerned, there is just one space, the Hilbert space, in which it
resides. The $\omega$ space, the $\lambda$ space, etc. are auxiliary manifolds introduced for the
purpose of visualizing the components of the infinite-dimensional vector $|\psi\rangle$ in the
$\Omega$ basis, the $\Lambda$ basis, and so on. The wave function $\psi(\omega)$ is also called the probability
amplitude for finding the particle with $\Omega = \omega$.
Can we interpret $|\langle\omega|\psi\rangle|^2$ as the probability for finding the particle with a
value $\omega$ for $\Omega$? No. Since the number of possible values for $\omega$ is infinite and the
total probability is unity, each single value of $\omega$ can be assigned only an infinitesimal
probability. One interprets $P(\omega) = |\langle\omega|\psi\rangle|^2$ to be the probability density at $\omega$, by
which one means that $P(\omega)\, d\omega$ is the probability of obtaining a result between $\omega$
and $\omega + d\omega$. This definition meets the requirement that the total probability be unity,
since
$$\int P(\omega)\, d\omega = \int |\langle\omega|\psi\rangle|^2\, d\omega = \int \langle\psi|\omega\rangle\langle\omega|\psi\rangle\, d\omega = \langle\psi|I|\psi\rangle = \langle\psi|\psi\rangle = 1 \tag{4.2.4}$$
If $\langle\psi|\psi\rangle = \delta(0)$ is the only sensible normalization possible, the state cannot be
normalized to unity and $P(\omega)$ must be interpreted as the relative probability density.
We will discuss such improper states later.
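To make the notion of a probability density concrete, here is a minimal sketch that
discretizes a normalized Gaussian wave function on a grid. The Gaussian form, the width
`sigma`, and the grid are all assumptions made for illustration; the integral is
approximated by a simple Riemann sum.

```python
import numpy as np

# Grid for the continuous label (illustrative range and spacing).
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]

# A normalized Gaussian wave function (assumed example, not from the text).
sigma = 1.0
psi = (np.pi * sigma**2) ** -0.25 * np.exp(-x**2 / (2 * sigma**2))

# Probability density P(x) = |psi(x)|^2.
density = np.abs(psi) ** 2

# Total probability: the sum of P(x) dx over the grid approximates Eq. (4.2.4).
total = np.sum(density) * dx

# Probability of a result in the finite interval |x| <= sigma.
inside = np.abs(x) <= sigma
prob_within_sigma = np.sum(density[inside]) * dx
```

Only integrals of the density over an interval are probabilities; the value of `density`
at a single grid point is not itself a probability.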
An important example of a continuous spectrum is that of X, the operator
corresponding to the position x. The wave function in the X basis (or the x space),
$\psi(x)$, is usually referred to as just the wave function, since the X basis is almost
always what one uses. In our discussions in the last chapter, $|\psi(x)|^2$ was referred to
as the probability for finding the particle at a given x, rather than as the probability
density, in order to avoid getting into details. Now the time has come to become
precise!
Earlier on we were wondering why it was that a classical particle defined by
just two numbers x and p now needs to be described by a ket which has an infinite
number of components. The answer is now clear. A classical particle has, at any
given time, a definite position. One simply has to give this value of x in specifying
the state. A quantum particle, on the other hand, can take on any value of x upon
measurement and one must give the relative probabilities for all possible outcomes.
This is part of the information contained in $\psi(x) = \langle x|\psi\rangle$, the components of $|\psi\rangle$
in the X basis. Of course, in the case of the classical particle, one needs also to specify
the momentum p. In quantum theory one again gives the odds for getting
different values of momenta, but one doesn't need a new vector for specifying this;
the same ket $|\psi\rangle$ when expanded in terms of the eigenkets $|p\rangle$ of the momentum
operator P gives the odds through the wave function in p space, $\psi(p) = \langle p|\psi\rangle$.
Complication 4: The Quantum Variable $\Omega$ Has No Classical Counterpart. Even
"point" particles such as the electron are now known to carry "spin," which is an
internal angular momentum, that is to say, angular momentum unrelated to their
motion through space. Since such a degree of freedom is absent in classical mechanics,
our postulates do not tell us which operator is to describe this variable in quantum
theory. As we will see in Chapter 14, the solution is provided by a combination of
intuition and semiclassical reasoning. It is worth bearing in mind that no matter
how diligently the postulates are constructed, they must often be supplemented by
intuition and classical ideas.
Having discussed the four-step program for extracting statistical information
from the state vector, we continue with our study of what else the postulates of
quantum theory tell us.
Collapse of the State Vector
We now examine another aspect of postulate III, namely, that the measurement
of the variable $\Omega$ changes the state vector, which is in general some superposition
of the form
$$|\psi\rangle = \sum_\omega |\omega\rangle\langle\omega|\psi\rangle$$
into the eigenstate $|\omega\rangle$ corresponding to the eigenvalue $\omega$ obtained in the measurement.
This phenomenon is called the collapse or reduction of the state vector.
Let us first note that any definitive statement about the impact of the measure
ment process presupposes that the measurement process is of a definite kind. For
example, the classical mechanics maxim that any dynamical variable can be measured
without changing the state of the particle, assumes that the measurement is an ideal
measurement (consistent with the classical scheme). But one can think up nonideal
measurements which do change the state; imagine trying to locate a chandelier in a
dark room by waving a broom till one makes contact. What makes Postulate III
profound is that the measurement process referred to there is an ideal quantum
measurement, which in a sense is the best one can do. We now illustrate the notion
of an ideal quantum measurement and the content of this postulate by an example.
Consider a particle in a momentum eigenstate $|p\rangle$. The postulate tells us that
if the momentum in this state is measured we are assured a result p, and that the
state will be the same after the measurement (since $|\psi\rangle = |p\rangle$ is already an eigenstate
of the operator P in question). One way to measure the momentum of the particle
is by Compton scattering, in which a photon of definite momentum bounces off the
particle.
Let us assume the particle is forced to move along the x-axis and that we send
in a right-moving photon of energy $\hbar\omega$ that bounces off the particle and returns as
a left-moving photon of energy $\hbar\omega'$. (How do we know what the photon energies
are? We assume we have atoms that are known to emit and absorb photons of any
given energy.) Using momentum and energy conservation:
$$cp' = cp + \hbar(\omega + \omega')$$
$$E' = E + \hbar(\omega - \omega')$$
it is now possible from this data to reconstruct the initial and final momenta of the
particle (of mass m, with $E^2 = c^2p^2 + m^2c^4$):
$$cp = -\frac{\hbar\omega + \hbar\omega'}{2} + \frac{\hbar\omega - \hbar\omega'}{2}\left[1 + \frac{m^2c^4}{\hbar^2\omega\omega'}\right]^{1/2}$$
$$cp' = \frac{\hbar\omega + \hbar\omega'}{2} + \frac{\hbar\omega - \hbar\omega'}{2}\left[1 + \frac{m^2c^4}{\hbar^2\omega\omega'}\right]^{1/2}$$
Since the photon always loses energy to the particle (as is clear in the particle rest
frame), $\omega' < \omega$, and by sending $\omega \to 0$ we can make the change in momentum $p' - p$
arbitrarily small. Hereafter, when we speak of a momentum measurement, this is
what we will mean. We will also assume that to each dynamical variable there exists
a corresponding ideal measurement. We will discuss, for example, the ideal position
measurement, which, when conducted on a particle in state $|x\rangle$, will give the result
x with unit probability and leave the state vector unchanged.
Suppose now that we measure the position of a particle in a momentum eigenstate
$|p\rangle$. Since $|p\rangle$ is a sum of position eigenkets $|x\rangle$,
$$|p\rangle = \int |x\rangle\langle x|p\rangle\, dx$$
the measurement will force the system into some state $|x\rangle$. Thus even the ideal
position measurement will change the state which is not a position eigenstate. Why
does a position measurement alter the state I p>, while momentum measurement does
not? The answer is that an ideal position measurement uses photons of infinitely
high momentum (as we will see) while an ideal momentum measurement uses photons
of infinitesimally low momentum (as we have seen).
This then is the big difference between classical and quantum mechanics: an
ideal measurement of any variable $\omega$ in classical mechanics leaves any state invariant,
whereas the ideal measurement of $\Omega$ in quantum mechanics leaves only the eigenstates
of $\Omega$ invariant.
The effect of measurement may be represented schematically as follows:
$$|\psi\rangle \xrightarrow{\ \Omega \text{ measured, } \omega \text{ obtained}\ } \frac{\mathbb{P}_\omega|\psi\rangle}{\langle\mathbb{P}_\omega\psi|\mathbb{P}_\omega\psi\rangle^{1/2}}$$
where $\mathbb{P}_\omega$ is the projection operator associated with $|\omega\rangle$, and the state after measurement
has been normalized. If $\omega$ is degenerate,
$$|\psi\rangle \longrightarrow \frac{\mathbb{P}_\omega|\psi\rangle}{\langle\mathbb{P}_\omega\psi|\mathbb{P}_\omega\psi\rangle^{1/2}}$$
where $\mathbb{P}_\omega$ is the projection operator for the eigenspace $\mathbb{V}_\omega$. Special note should be
taken of the following point: if the initial state $|\psi\rangle$ were unknown, and the measurement
yielded a degenerate eigenvalue $\omega$, we could not say what the state was after
the measurement, except that it was some state in the eigenspace with eigenvalue $\omega$.
On the other hand, if the initial state $|\psi\rangle$ were known, and the measurement yielded
a degenerate value $\omega$, the state after measurement is known to be $\mathbb{P}_\omega|\psi\rangle$ (up to
normalization). Consider our example from $\mathbb{V}^3(R)$ (Fig. 4.1b). Say we had $\omega_1 =
\omega_2 = \omega$. Let us use an orthonormal basis $|\omega, 1\rangle$, $|\omega, 2\rangle$, $|\omega_3\rangle$, where, as usual, the
extra labels 1 and 2 are needed to distinguish the basis vectors in the degenerate
eigenspace. If in this basis we know, for example, that
$$|\psi\rangle = \frac{1}{2}|\omega, 1\rangle + \frac{1}{2}|\omega, 2\rangle + \frac{1}{2^{1/2}}|\omega_3\rangle$$
and the measurement gives a value $\omega$, the normalized state after measurement is
known to us to be
$$|\psi'\rangle = 2^{-1/2}(|\omega, 1\rangle + |\omega, 2\rangle)$$
If, on the other hand, the initial state were unknown and a measurement gave a
result $\omega$, we could only say
$$|\psi'\rangle = \frac{\alpha|\omega, 1\rangle + \beta|\omega, 2\rangle}{(\alpha^2 + \beta^2)^{1/2}}$$
where $\alpha$ and $\beta$ are arbitrary real numbers.
Note that although we do not know what $\alpha$ and $\beta$ are from the measurement,
they are not arbitrary. In other words, the system had a well-defined state vector
$|\psi\rangle$ before the measurement, though we did not know $|\psi\rangle$, and has a well-defined
state vector $\mathbb{P}_\omega|\psi\rangle$ after the measurement, although all we know is that it lies within
a subspace $\mathbb{V}_\omega$.
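The collapse rule can be sketched in a few lines for a finite-dimensional space. The
function `collapse` below is a hypothetical helper: it applies a given projection
operator to the state, returns the normalized result together with the probability of
that outcome, and is tried on a state in $\mathbb{V}^3(R)$ with components (1/2, 1/2, $1/2^{1/2}$).

```python
import numpy as np

def collapse(projector, psi):
    """Return (normalized post-measurement state, probability of the outcome)."""
    projected = projector @ psi
    probability = np.vdot(projected, projected).real / np.vdot(psi, psi).real
    if np.isclose(probability, 0.0):
        raise ValueError("this outcome has zero probability for the given state")
    return projected / np.linalg.norm(projected), probability

# State with components (1/2, 1/2, 1/sqrt(2)) in the basis |w,1>, |w,2>, |w3>.
psi = np.array([0.5, 0.5, 1 / np.sqrt(2)])

# Projection operator for the degenerate eigenspace spanned by |w,1> and |w,2>.
projector = np.diag([1.0, 1.0, 0.0])

after, probability = collapse(projector, psi)

# An immediate repetition of the measurement leaves the collapsed state unchanged.
after_again, probability_again = collapse(projector, after)
```

The second call illustrates that an ideal measurement leaves a state already lying in
the eigenspace invariant.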
How to Test Quantum Theory
One of the outstanding features of classical mechanics is that it makes fully
deterministic predictions. It may predict for example that a particle leaving $x = x_i$
with momentum $p = p_i$ in some potential V(x) will arrive 2 seconds later at $x = x_f$ with
momentum $p = p_f$. To test the prediction we release the particle at $x = x_i$ with $p = p_i$
at t = 0 and wait at $x = x_f$ and see if the particle arrives there with $p = p_f$ at t = 2
seconds.
Quantum theory, on the other hand, makes statistical predictions about a
particle in a state $|\psi\rangle$ and claims that this state evolves in time according to
Schrödinger's equation. To test these predictions we must be able to
(1) Create particles in a well-defined state $|\psi\rangle$.
(2) Check the probabilistic predictions at any time.
The collapse of the state vector provides us with a good way of preparing definite
states: we begin with a particle in an arbitrary state $|\psi\rangle$ and measure a variable $\Omega$.
If we get a nondegenerate eigenvalue $\omega$, we have in our hands the state $|\omega\rangle$. (If $\omega$
is degenerate, further measurement is needed. We are not ready to discuss this
problem.) Notice how in quantum theory, measurement, instead of telling us what
the system was doing before the measurement, tells us what it is doing just after the
measurement. (Of course it does tell us that the original state had some projection
on the state $|\omega\rangle$ obtained after measurement. But this information is nothing compared
to the complete specification of the state just after measurement.)
Anyway, assume we have prepared a state $|\omega\rangle$. If we measure some variable $\Lambda$
immediately thereafter, so that the state could not have changed from $|\omega\rangle$, and if,
say,
$$|\omega\rangle = \frac{1}{3^{1/2}}|\lambda_1\rangle + \frac{2^{1/2}}{3^{1/2}}|\lambda_2\rangle + 0 \cdot (\text{others})$$
the theory predicts that $\lambda_1$ and $\lambda_2$ will obtain with probabilities 1/3 and 2/3, respectively.
If our measurement gives a $\lambda_i$, $i \neq 1, 2$ (or worse still a $\lambda \neq$ any eigenvalue!)
that is the end of the theory. So let us assume we get one of the allowed values, say
$\lambda_1$. This is consistent with the theory but does not fully corroborate it, since the
odds for $\lambda_1$ could have been 1/30 instead of 1/3 and we could still get $\lambda_1$. Therefore,
we must repeat the experiment many times. But we cannot repeat the experiment
with this particle, since after the measurement the state of the particle is $|\lambda_1\rangle$. We
must start afresh with another particle in $|\omega\rangle$. For this purpose we require a quantum
ensemble, which consists of a large number N of particles all in the same state $|\omega\rangle$.
If a measurement of $\Lambda$ is made on every one of these particles, approximately N/3
will yield a value $\lambda_1$ and end up in the state $|\lambda_1\rangle$ while approximately 2N/3 will yield
a value $\lambda_2$ and end up in a state $|\lambda_2\rangle$. For sufficiently large N, the deviations from
the fractions 1/3 and 2/3 will be negligible. The chief difference between a classical
ensemble, of the type one encounters in, say, classical statistical mechanics, and the
quantum ensemble referred to above, is the following. If in a classical ensemble of
N particles N/3 gave a result $\lambda_1$ and 2N/3 a result $\lambda_2$, one can think of the ensemble
as having contained N/3 particles with $\Lambda = \lambda_1$ and the others with $\Lambda = \lambda_2$ before the
measurement. In a quantum ensemble, on the other hand, every particle is assumed
to be in the same state $|\omega\rangle$ prior to measurement (i.e., every particle is potentially
capable of yielding either result $\lambda_1$ or $\lambda_2$). Only after the measurement are a third
of them forced into the state $|\lambda_1\rangle$ and the rest into $|\lambda_2\rangle$.
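The statistical content of this argument can be mimicked with a simple simulation. The
sketch below draws N independent outcomes with the predicted probabilities 1/3 and 2/3
and compares the observed fractions with the prediction. The ensemble size and the random
seed are arbitrary choices, and a pseudo-random generator is of course only a stand-in
for actual measurements on identically prepared particles.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Predicted probabilities for the two allowed results (labels 1 and 2).
probabilities = np.array([1 / 3, 2 / 3])

# A large ensemble: one independent measurement per particle.
N = 100_000
outcomes = rng.choice([1, 2], size=N, p=probabilities)

# Observed fractions of each result.
fractions = np.array([(outcomes == 1).mean(), (outcomes == 2).mean()])

# Deviation of the observed fractions from the prediction; it shrinks like N**-0.5.
deviation = np.abs(fractions - probabilities)
```

A single draw can never distinguish odds of 1/3 from odds of 1/30; only the fractions
over many draws can.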
Once we have an ensemble, we can measure any other variable and test the
expectations of quantum theory. We can also prepare an ensemble, let it evolve in
time, and study it at a future time to see if the final state is what the Schrödinger
equation tells us it should be.
Example 4.2.2. An example of an ensemble being used to test quantum theory
was encountered in the double-slit experiment, say with photons. A given photon of
momentum p and energy E was expected to hit the detectors with a probability
density given by the oscillating function $|\psi(x)|^2$. One could repeat the experiment
N times, sending one such photon at a time to see if the final number distribution
indeed was given by $|\psi(x)|^2$. One could equally well send in a macroscopic, monochromatic
beam of light of frequency $\omega = E/\hbar$ and wave number $k = p/\hbar$, which
consists of a large number of photons of energy E and momentum p. If one makes the
assumption (correct to a high degree) that the photons are noninteracting, sending in
the beam is equivalent to experimenting with the ensemble. In this case the intensity
pattern will take the shape of the probability density $|\psi(x)|^2$ the instant the beam
is turned on.
Example 4.2.3. The following example is provided to illustrate the distinction
between the probabilistic descriptions of systems in classical mechanics and in quantum
mechanics.
We choose as our classical system a six-faced die for which the probabilities
P(n) of obtaining a number n have been empirically determined. As our quantum
system we take a particle in a state
$$|\psi\rangle = \sum_{i=1}^{6} |\omega_i\rangle\langle\omega_i|\psi\rangle$$
Suppose we close our eyes, toss the die, and cover it with a mug. Its statistical
description has many analogies with the quantum description of the state $|\psi\rangle$:
(1) The state of the die is described by a probability function P(n) before the mug
is lifted.
(2) The only possible values of n are 1, 2, 3, 4, 5, and 6.
(3) If the mug is lifted, and some value, say n = 3, is obtained, the function P(n)
collapses to $\delta_{n3}$.
(4) If an ensemble of N such dice are thrown, NP(n) of them will give the result n
(as $N \to \infty$).
The corresponding statements for the particle in the state $|\psi\rangle$ are no doubt
known to you. Let us now examine some of the key differences between the statistical
descriptions in the two cases.
(1) It is possible, at least in principle, to predict exactly which face of the die
will be on top, given the mass of the die, its position, orientation, velocity, and
angular velocity at the time of release, the viscosity of air, the elasticity of the table
top, and so on. The statistical description is, however, the only possibility in the
quantum case, even in principle.
(2) If the result n = 3 was obtained upon lifting the mug, it is consistent to
assume that the die was in such a state even prior to measurement. In the quantum
case, however, the state after measurement, say $|\omega_3\rangle$, is not the state before measurement,
namely
$$|\psi\rangle = \sum_{i=1}^{6} |\omega_i\rangle\langle\omega_i|\psi\rangle$$
(3) If N such dice are tossed and covered with N mugs, there will be NP(1)
dice with n = 1, NP(2) dice with n = 2, etc. in the ensemble before and after the
measurement. In contrast, the quantum ensemble corresponding to $|\psi\rangle$ will contain
N particles all of which are in the same state $|\psi\rangle$ (that is, each can yield any of the
values $\omega_1, \ldots, \omega_6$) before the measurement, and $NP(\omega_i)$ particles in $|\omega_i\rangle$ after the
measurement. Only the ensemble before the measurement represents the state $|\psi\rangle$.
The ensemble after measurement is a mixture of six ensembles representing the states
$|\omega_1\rangle, \ldots, |\omega_6\rangle$.
Having seen the utility of the ensemble concept in quantum theory, we now
define and discuss the two statistical variables that characterize an ensemble.
Expectation Value
Given a large ensemble of N particles in a state $|\psi\rangle$, quantum theory allows us
to predict what fraction will yield a value $\omega$ if the variable $\Omega$ is measured. This
prediction, however, involves solving the eigenvalue problem of the operator $\Omega$. If
one is not interested in such detailed information on the state (or the corresponding
ensemble) one can calculate instead an average over the ensemble, called the expectation
value, $\langle\Omega\rangle$. The expectation value is just the mean value defined in statistics:
$$\langle\Omega\rangle = \sum_i P(\omega_i)\,\omega_i = \sum_i |\langle\omega_i|\psi\rangle|^2\,\omega_i = \sum_i \langle\psi|\omega_i\rangle\langle\omega_i|\psi\rangle\,\omega_i \tag{4.2.5}$$
But for the factors $\omega_i$ multiplying each projection operator $|\omega_i\rangle\langle\omega_i|$, we could have
used $\sum_i |\omega_i\rangle\langle\omega_i| = I$. To get around this, note that $\omega_i|\omega_i\rangle = \Omega|\omega_i\rangle$. Feeding this in
and continuing, we get
$$\langle\Omega\rangle = \sum_i \langle\psi|\Omega|\omega_i\rangle\langle\omega_i|\psi\rangle$$
Now we can use $\sum_i |\omega_i\rangle\langle\omega_i| = I$ to get
$$\langle\Omega\rangle = \langle\psi|\Omega|\psi\rangle \tag{4.2.6}$$
This is an example of a mixed ensemble. These will be discussed in the digression on density matrices,
which follows in a while.
There are a few points to note in connection with this formula.
(1) To calculate $\langle\Omega\rangle$, one need only be given the state vector and the operator $\Omega$
(say as a column vector and a matrix, respectively, in some basis). There is no
need to find the eigenvectors or eigenvalues of $\Omega$.
(2) If the particle is in an eigenstate of $\Omega$, that is $\Omega|\psi\rangle = \omega|\psi\rangle$, then $\langle\Omega\rangle = \omega$.
(3) By the average value of $\Omega$ we mean the average over the ensemble. A given
particle will of course yield only one of the eigenvalues upon measurement. The
mean value will generally be an inaccessible value for a single measurement unless
it accidentally equals an eigenvalue. [A familiar example of this phenomenon is
that of the mean number of children per couple, which may be 2.12, although
the number in a given family is restricted to be an integer.]
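The equivalence of Eq. (4.2.5) and Eq. (4.2.6) is easy to check numerically. In the
sketch below the 3 x 3 Hermitian matrix and the state vector are arbitrary illustrative
choices; the mean is computed once directly from the state and the operator, and once
as a probability-weighted sum over eigenvalues.

```python
import numpy as np

# Illustrative Hermitian operator and normalized state (not from the text).
omega = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 0.0],
                  [0.0, 0.0, 5.0]])
psi = np.array([1.0, 1.0j, 1.0]) / np.sqrt(3)

# Eq. (4.2.6): <psi|Omega|psi>, no eigenvalue problem needed.
direct = np.vdot(psi, omega @ psi).real

# Eq. (4.2.5): sum over eigenvalues weighted by P(omega_i) = |<omega_i|psi>|^2.
values, vectors = np.linalg.eigh(omega)
probabilities = np.abs(vectors.conj().T @ psi) ** 2
weighted = np.sum(probabilities * values)
```

In this particular example the mean happens to coincide with one of the eigenvalues;
in general it need not be an attainable result of any single measurement.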
The Uncertainty
In any situation described probabilistically, another useful quantity to specify
besides the mean is the standard deviation, which measures the average fluctuation
around the mean. It is defined as
$$\Delta\Omega = \langle(\Omega - \langle\Omega\rangle)^2\rangle^{1/2} \tag{4.2.7}$$
and often called the root-mean-squared deviation. In quantum mechanics, it is
referred to as the uncertainty in $\Omega$. If $\Omega$ has a discrete spectrum
$$(\Delta\Omega)^2 = \sum_i P(\omega_i)(\omega_i - \langle\Omega\rangle)^2 \tag{4.2.8}$$
and if it has a continuous spectrum,
$$(\Delta\Omega)^2 = \int P(\omega)(\omega - \langle\Omega\rangle)^2\, d\omega \tag{4.2.9}$$
Notice that $\Delta\Omega$, just like $\langle\Omega\rangle$, is also calculable given just the state and the operator,
for Eq. (4.2.7) means just
$$\Delta\Omega = [\langle\psi|(\Omega - \langle\Omega\rangle)^2|\psi\rangle]^{1/2} \tag{4.2.10}$$
Usually the expectation value and the uncertainty provide us with a fairly good
description of the state. For example, if we are given that a particle has $\langle X\rangle = a$ and
$\Delta X = \Delta$, we know that the particle is likely to be spotted near x = a, with deviations
of order $\Delta$.
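A minimal sketch of Eq. (4.2.10), cross-checked against the spectral form (4.2.8). The
function name `uncertainty`, the matrix, and the state are illustrative assumptions; the
state is taken to be normalized.

```python
import numpy as np

def uncertainty(omega, psi):
    """Root-mean-squared deviation of `omega` in the normalized state `psi`, Eq. (4.2.10)."""
    mean = np.vdot(psi, omega @ psi).real
    shifted = omega - mean * np.eye(len(psi))
    return np.sqrt(np.vdot(psi, shifted @ shifted @ psi).real)

# Illustrative Hermitian operator and normalized state (not from the text).
omega = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 0.0],
                  [0.0, 0.0, 5.0]])
psi = np.array([1.0, 1.0j, 1.0]) / np.sqrt(3)

delta = uncertainty(omega, psi)

# Cross-check with Eq. (4.2.8): sum over the discrete spectrum.
values, vectors = np.linalg.eigh(omega)
probabilities = np.abs(vectors.conj().T @ psi) ** 2
mean = np.sum(probabilities * values)
delta_spectral = np.sqrt(np.sum(probabilities * (values - mean) ** 2))

# In an eigenstate the uncertainty vanishes.
delta_eigenstate = uncertainty(omega, vectors[:, 0])
```

The last line reflects the statement that a particle in an eigenstate has a definite
value for the variable.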
So far, we have concentrated on the measurement of a single variable at a time.
We now turn our attention to the measurement of more than one variable at a time.
(Since no two independent measurements can really be performed at the same time,
we really mean the measurement of two or more dynamical variables in rapid
succession.)
Exercise 4.2.1 (Very Important). Consider the following operators on a Hilbert space
$\mathbb{V}^3(C)$:
$$L_x = \frac{1}{2^{1/2}}\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \quad L_y = \frac{1}{2^{1/2}}\begin{bmatrix} 0 & -i & 0 \\ i & 0 & -i \\ 0 & i & 0 \end{bmatrix}, \quad L_z = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}$$
(1) What are the possible values one can obtain if $L_z$ is measured?
(2) Take the state in which $L_z = 1$. In this state what are $\langle L_x\rangle$, $\langle L_x^2\rangle$, and $\Delta L_x$?
(3) Find the normalized eigenstates and the eigenvalues of $L_x$ in the $L_z$ basis.
(4) If the particle is in the state with $L_z = -1$, and $L_x$ is measured, what are the possible
outcomes and their probabilities?
(5) Consider the state
$$|\psi\rangle = \begin{bmatrix} 1/2 \\ 1/2 \\ 1/2^{1/2} \end{bmatrix}$$
in the $L_z$ basis. If $L_z^2$ is measured in this state and a result +1 is obtained, what is the state
after the measurement? How probable was this result? If $L_z$ is measured, what are the outcomes
and respective probabilities?
(6) A particle is in a state for which the probabilities are $P(L_z = 1) = 1/4$, $P(L_z = 0) =
1/2$, and $P(L_z = -1) = 1/4$. Convince yourself that the most general, normalized state with
this property is
$$|\psi\rangle = \frac{e^{i\delta_1}}{2}|L_z = 1\rangle + \frac{e^{i\delta_2}}{2^{1/2}}|L_z = 0\rangle + \frac{e^{i\delta_3}}{2}|L_z = -1\rangle$$
It was stated earlier on that if $|\psi\rangle$ is a normalized state then the state $e^{i\theta}|\psi\rangle$ is a physically
equivalent normalized state. Does this mean that the factors $e^{i\delta_i}$ multiplying the $L_z$ eigenstates
are irrelevant? [Calculate for example $P(L_x = 0)$.]
Compatible and Incompatible Variables
A striking feature of quantum theory is that given a particle in a state $|\psi\rangle$, one
cannot say in general that the particle has a definite value for a given dynamical
variable $\Omega$: a measurement can yield any eigenvalue $\omega$ for which $\langle\omega|\psi\rangle$ is not zero.
The exceptions are the states $|\omega\rangle$. A particle in one of these states can be said, as
in classical mechanics, to have a value $\omega$ for $\Omega$, since a measurement is assured to
give this result. To produce such states we need only take an arbitrary state $|\psi\rangle$ and
measure $\Omega$. The measurement process acts as a filter that lets through just one
component of $|\psi\rangle$, along some $|\omega\rangle$. The probability that this will happen is $P(\omega) =
|\langle\omega|\psi\rangle|^2$.
We now wish to extend these ideas to more than one variable. We consider
first the question of two operators. The extension to more than two will be
straightforward. We ask:
Question 1. Is there some multiple filtering process by which we can take an
ensemble of particles in some state $|\psi\rangle$ and produce a state with well-defined values
$\omega$ and $\lambda$ for two variables $\Omega$ and $\Lambda$?
Question 2. What is the probability that the filtering will give such a state if we
start with the state $|\psi\rangle$?
To answer these questions, let us try to devise a multiple filtering scheme. Let
us first measure $\Omega$ on the ensemble described by $|\psi\rangle$ and take the particles that yield
a result $\omega$. These are in a state that has a well-defined value for $\Omega$. We immediately
measure $\Lambda$ and pick those particles that give a result $\lambda$. Do we have now an ensemble
that is in a state with $\Omega = \omega$ and $\Lambda = \lambda$? Not generally. The reason is clear. After the
first measurement, we had the system in the state $|\omega\rangle$, which assured a result $\omega$ for
$\Omega$, but nothing definite for $\Lambda$ (since $|\omega\rangle$ need not be an eigenstate of $\Lambda$). Upon
performing the second measurement, the state was converted to
$$|\psi'\rangle = |\lambda\rangle$$
and we are now assured a result $\lambda$ for $\Lambda$, but nothing definite for $\Omega$ (since $|\lambda\rangle$ need
not be an eigenstate of $\Omega$).
In other words, the second filtering generally alters the state produced by the
first. This change is just the collapse of the state vector $|\omega\rangle = \sum_\lambda |\lambda\rangle\langle\lambda|\omega\rangle$ into the
eigenstate $|\lambda\rangle$.
An exception occurs when the state produced after the first measurement is
unaffected by the second. This in turn requires that $|\omega\rangle$ also be an eigenstate of $\Lambda$.
The answer to the first question above is then in the affirmative only for the simultaneous
eigenstates $|\omega\lambda\rangle$. The means for producing them are just as described above.
These kets satisfy the equations
$$\Omega|\omega\lambda\rangle = \omega|\omega\lambda\rangle \tag{4.2.11}$$
$$\Lambda|\omega\lambda\rangle = \lambda|\omega\lambda\rangle \tag{4.2.12}$$
The question that arises naturally is: When will two operators admit simultaneous
eigenkets? A necessary (but not sufficient) condition is obtained by operating
Eq. (4.2.12) with $\Omega$, Eq. (4.2.11) with $\Lambda$, and taking the difference:
$$(\Omega\Lambda - \Lambda\Omega)|\omega\lambda\rangle = 0 \tag{4.2.13}$$
Thus $[\Omega, \Lambda]$ must have eigenkets with zero eigenvalue if simultaneous eigenkets are
to exist. A pair of operators $\Omega$ and $\Lambda$ will fall into one of the three classes:
A. Compatible: $[\Omega, \Lambda] = 0$
B. Incompatible: $[\Omega, \Lambda]$ = something that obviously has no zero eigenvalue
C. Others
Class A. If two operators commute, we know a complete basis of simultaneous
eigenkets can be found. Each element $|\omega\lambda\rangle$ of this basis has well-defined values for
$\Omega$ and $\Lambda$.
Class B. The most famous example of this class is provided by the position and
momentum operators X and P, which obey the canonical commutation rule
$$[X, P] = i\hbar \tag{4.2.14}$$
Evidently we cannot ever have $i\hbar|\psi\rangle = 0|\psi\rangle$ for any nontrivial $|\psi\rangle$. This means there
doesn't exist even a single ket for which both X and P are well defined. Any attempt
to filter X is ruined by a subsequent filtering for P and vice versa. This is the origin
of the famous Heisenberg uncertainty principle, which will be developed as we go
along.
Class C. In this case there are some states that are simultaneous eigenkets. There
is nothing very interesting we can say about this case except to emphasize that even
if two operators don't commute, one can still find a few common eigenkets, though
not a full basis. (Why?)
Let us now turn to the second question of the probability of obtaining a state
$|\omega\lambda\rangle$ upon measurement of $\Omega$ and $\Lambda$ in a state $|\psi\rangle$. We will consider just case A;
the question doesn't arise for case B, and case C is not very interesting. (You should
be able to tackle case C yourself after seeing the other two cases.)
Case A. Let us first assume there is no degeneracy. Thus, to a given eigenvalue
$\lambda$, there is just one ket and this must be a simultaneous eigenket $|\omega\lambda\rangle$. Suppose
we measured $\Omega$ first. We get $\omega$ with a probability $P(\omega) = |\langle\omega\lambda|\psi\rangle|^2$. After the
measurement, the particle is in a state $|\omega\lambda\rangle$. The measurement of $\Lambda$ is certain to
yield the result $\lambda$. The probability for obtaining $\omega$ for $\Omega$ and $\lambda$ for $\Lambda$ is just the
product of the two probabilities
$$P(\omega, \lambda) = |\langle\omega\lambda|\psi\rangle|^2 \cdot 1 = |\langle\omega\lambda|\psi\rangle|^2$$
Notice that if $\Lambda$ were measured first and $\Omega$ next, the probability is the same for
getting the results $\lambda$ and $\omega$. Thus if we expand $|\psi\rangle$ in the complete common eigenbasis
as
$$|\psi\rangle = \sum |\omega\lambda\rangle\langle\omega\lambda|\psi\rangle \tag{4.2.15a}$$
then
$$P(\omega, \lambda) = |\langle\omega\lambda|\psi\rangle|^2 = P(\lambda, \omega) \tag{4.2.15b}$$
The reason for calling $\Omega$ and $\Lambda$ compatible if $[\Omega, \Lambda] = 0$ is that the measurement
of one variable followed by the other doesn't alter the eigenvalue obtained in the
first measurement and we have in the end a state with a well-defined value for both
observables. Note the emphasis on the invariance of the eigenvalue under the second
measurement. In the nondegenerate case, this implies the invariance of the state
vector as well. In the degenerate case, the state vector can change due to the second
measurement, though the eigenvalue will not, as the following example will show.
Consider two operators $\Lambda$ and $\Omega$ on $\mathbb{V}^3(R)$. Let $|\omega_3\lambda_3\rangle$ be one common eigenvector.
Let $\lambda_1 = \lambda_2 = \lambda$. Let $\omega_1 \neq \omega_2$ be the eigenvalues of $\Omega$ in this degenerate space. Let us
use as a basis $|\omega_1\lambda\rangle$, $|\omega_2\lambda\rangle$, and $|\omega_3\lambda_3\rangle$. Consider a normalized state
$$|\psi\rangle = \alpha|\omega_3\lambda_3\rangle + \beta|\omega_1\lambda\rangle + \gamma|\omega_2\lambda\rangle \tag{4.2.16}$$
Let us say we measure $\Omega$ first and get $\omega_3$. The state becomes $|\omega_3\lambda_3\rangle$ and the
subsequent measurement of $\Lambda$ is assured to give a value $\lambda_3$ and to leave the state alone.
Thus $P(\omega_3, \lambda_3) = |\langle\omega_3\lambda_3|\psi\rangle|^2 = \alpha^2$. Evidently $P(\omega_3, \lambda_3) = P(\lambda_3, \omega_3)$.
Suppose that the measurement of $\Omega$ gave a value $\omega_1$. The resulting state is $|\omega_1\lambda\rangle$
and the probability for this outcome is $|\langle\omega_1\lambda|\psi\rangle|^2$. The subsequent measurement of
$\Lambda$ will leave the state alone and yield the result $\lambda$ with unit probability. Thus $P(\omega_1, \lambda)$
is the product of the probabilities:
$$P(\omega_1, \lambda) = |\langle\omega_1\lambda|\psi\rangle|^2 \cdot 1 = |\langle\omega_1\lambda|\psi\rangle|^2 = \beta^2 \tag{4.2.17}$$
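Let us now imagine the measurements carried out in reverse order. Let the result
of the $\Lambda$ measurement be $\lambda$. The state $|\psi'\rangle$ after measurement is the projection of $|\psi\rangle$
in the degenerate $\lambda$ eigenspace:
$$|\psi'\rangle = \frac{\mathbb{P}_\lambda|\psi\rangle}{|\langle\mathbb{P}_\lambda\psi|\mathbb{P}_\lambda\psi\rangle|^{1/2}} = \frac{\beta|\omega_1\lambda\rangle + \gamma|\omega_2\lambda\rangle}{(\beta^2 + \gamma^2)^{1/2}} \tag{4.2.18}$$
where, in the expression above, the projected state has been normalized. The
probability for this outcome is $P(\lambda) = \beta^2 + \gamma^2$, the square of the projection of $|\psi\rangle$ in the
eigenspace. If $\Omega$ is measured now, both results $\omega_1$ and $\omega_2$ are possible. The probability
for obtaining $\omega_1$ is $|\langle\omega_1\lambda|\psi'\rangle|^2 = \beta^2/(\beta^2 + \gamma^2)$. Thus, the probability for the result
$\Lambda = \lambda$, $\Omega = \omega_1$, is the product of the probabilities:
$$P(\lambda, \omega_1) = (\beta^2 + \gamma^2) \cdot \frac{\beta^2}{\beta^2 + \gamma^2} = \beta^2 = P(\omega_1, \lambda) \tag{4.2.19}$$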
Thus $P(\omega_1, \lambda) = P(\lambda, \omega_1)$ independent of the degeneracy. But this time the state
suffered a change due to the second measurement (unless by accident $|\psi'\rangle$ has no
component along $|\omega_2\lambda\rangle$). Thus compatibility generally implies the invariance under
the second measurement of the eigenvalue measured in the first. Therefore, the state
can only be said to remain in the same eigenspace after the second measurement. If
the first eigenvalue is nondegenerate, the eigenspace is one dimensional and the state
vector itself remains invariant.
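The degenerate example above can be checked numerically. The following is only a sketch under stated assumptions: the amplitudes 0.5 and 0.7 are arbitrary illustrative values (not from the text), the basis is ordered as $|\omega_1\lambda\rangle$, $|\omega_2\lambda\rangle$, $|\omega_3\lambda_3\rangle$, and the helper names `project`, `prob`, and `normalize` are hypothetical.

```python
import math

# Basis order: |w1,l>, |w2,l>, |w3,l3>. Real amplitudes as in Eq. (4.2.16).
# The numerical values are illustrative assumptions, not taken from the text.
alpha, beta = 0.5, 0.7
gamma = math.sqrt(1 - alpha**2 - beta**2)
psi = [beta, gamma, alpha]


def normalize(vec):
    norm = math.sqrt(sum(c * c for c in vec))
    return [c / norm for c in vec]


def project(vec, keep):
    """Project onto the span of the basis kets whose indices are in `keep`."""
    return [c if i in keep else 0.0 for i, c in enumerate(vec)]


def prob(vec, keep):
    """Probability of finding `vec` in the subspace spanned by `keep`."""
    return sum(c * c for i, c in enumerate(vec) if i in keep)


# Omega first (outcome w1 is the single ket 0), then Lambda (eigenspace {0, 1}).
after_omega = normalize(project(psi, {0}))
p_omega_then_lambda = prob(psi, {0}) * prob(after_omega, {0, 1})

# Lambda first (degenerate eigenspace {0, 1}), then Omega (outcome w1).
after_lambda = normalize(project(psi, {0, 1}))
p_lambda_then_omega = prob(psi, {0, 1}) * prob(after_lambda, {0})

print(p_omega_then_lambda, p_lambda_then_omega, beta**2)
```

Both orders give $\beta^2$ up to rounding, while `after_lambda` still carries a component along $|\omega_2\lambda\rangle$ that the subsequent $\Omega$ measurement removes, which is the change of state described above.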
In our earlier discussion on how to produce well-defined states $|\psi\rangle$ for testing
quantum theory, it was observed that the measurement process could itself be used
as a preparation mechanism: if the measurement of $\Omega$ on an arbitrary, unknown
initial state gives a result $\omega$, we are sure we have the state $|\psi\rangle = |\omega\rangle$. But this
presumes $\omega$ is not a degenerate eigenvalue. If it is degenerate, we cannot nail down
the state, except to within an eigenspace. It was therefore suggested that we stick to
variables with a nondegenerate spectrum. We can now lift that restriction. Let us
say a degenerate eigenvalue $\omega$ for the variable $\Omega$ was obtained. We have then some
vector in the $\omega$ eigenspace. We now measure another compatible variable $\Lambda$. If we
get a result $\lambda$, we have a definite state $|\omega\lambda\rangle$, unless the value $(\omega, \lambda)$ itself is degenerate.
We must then measure a third variable $\Gamma$ compatible with $\Omega$ and $\Lambda$ and so on.
Ultimately we will get a state that is unique, given all the simultaneous eigenvalues:
$|\omega, \lambda, \gamma, \ldots\rangle$. It is presumed that such a set of compatible observables, called a
complete set of commuting observables, exists. To prepare a state for studying quantum
theory then, we take an arbitrary initial state and filter it by a sequence of
compatible measurements till it is down to a unique, known vector. Any nondegenerate
operator, all by itself, is a "complete set."
Incidentally, even if the operators $\Omega$ and $\Lambda$ are incompatible, we can specify
the probability $P(\omega, \lambda)$ that the measurement of $\Omega$ followed by that of $\Lambda$ on a state
$|\psi\rangle$ will give the results $\omega$ and $\lambda$, respectively. However, the following should be
noted:
(1) $P(\omega, \lambda) \neq P(\lambda, \omega)$ in general.
(2) The probability $P(\omega, \lambda)$ is not the probability for producing a final state
that has well-defined values $\omega$ and $\lambda$ for $\Omega$ and $\Lambda$. (Such a state doesn't exist by the
definition of incompatibility.) The state produced by the two measurements is just
the eigenstate of the second operator with the measured eigenvalue.
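Point (1) can be illustrated with a small numerical sketch. The choice of operators is an assumption made purely for illustration: we take $\Omega$ and $\Lambda$ to be the Pauli matrices $\sigma_z$ and $\sigma_x$ on a two-dimensional space (they do not commute), and the function names are hypothetical.

```python
import math

s = 1 / math.sqrt(2)
z_up = (1.0, 0.0)   # eigenket of Omega (assumed here to be sigma_z), eigenvalue +1
x_up = (s, s)       # eigenket of Lambda (assumed here to be sigma_x), eigenvalue +1


def overlap_sq(a, b):
    """|<a|b>|^2 for real two-component kets."""
    return (a[0] * b[0] + a[1] * b[1]) ** 2


def sequential(psi, first, second):
    """Probability of finding `first`, then `second`, in successive measurements.

    Both spectra are nondegenerate, so the first measurement collapses
    the state onto the eigenket `first`.
    """
    return overlap_sq(first, psi) * overlap_sq(second, first)


psi = z_up
p_omega_lambda = sequential(psi, z_up, x_up)   # Omega first, then Lambda
p_lambda_omega = sequential(psi, x_up, z_up)   # Lambda first, then Omega
print(p_omega_lambda, p_lambda_omega)
```

For this initial state the two orders give approximately 1/2 and 1/4, so the joint probability depends on the order of the measurements.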
The Density Matrix: a Digression‡
‡ This digression may be omitted or postponed without loss of continuity.
So far we have considered ensembles of $N$ systems all in the same state $|\psi\rangle$.
They are hard to come by in practice. More common are ensembles of $N$ systems,
$n_i$ $(i = 1, 2, \ldots, k)$ of which are in the state $|i\rangle$. (We restrict ourselves to the case
where $|i\rangle$ is an element of an orthonormal basis.) Thus the ensemble is described by
$k$ kets $|1\rangle, |2\rangle, \ldots, |k\rangle$, and $k$ occupancy numbers $n_1, \ldots, n_k$. A convenient way to
assemble all this information is in the form of the density matrix (which is really an
operator that becomes a matrix in some basis):
$$\rho = \sum_i p_i |i\rangle\langle i| \tag{4.2.20}$$
where $p_i = n_i/N$ is the probability that a system picked randomly out of the ensemble
is in the state $|i\rangle$. The ensembles we have dealt with so far are said to be pure; they
correspond to all $p_i = 0$ except one. A general ensemble is mixed.
Consider now the ensemble average of $\Omega$. It is
$$\langle\bar{\Omega}\rangle = \sum_i p_i \langle i|\Omega|i\rangle \tag{4.2.21}$$
The bar on $\langle\bar{\Omega}\rangle$ reminds us that two kinds of averaging have been carried out: a
quantum average $\langle i|\Omega|i\rangle$ for each system in $|i\rangle$ and a classical average over the
systems in different states $|i\rangle$. Observe that
$$\mathrm{Tr}(\Omega\rho) = \sum_j \langle j|\Omega\rho|j\rangle = \sum_i\sum_j \langle j|\Omega|i\rangle\langle i|j\rangle p_i = \sum_i\sum_j \langle i|j\rangle\langle j|\Omega|i\rangle p_i = \sum_i \langle i|\Omega|i\rangle p_i = \langle\bar{\Omega}\rangle \tag{4.2.22}$$
The density matrix contains all the statistical information about the ensemble.
Suppose we want, not $\langle\bar{\Omega}\rangle$, but instead $\bar{P}(\omega)$, the probability of obtaining a particular
value $\omega$. We first note that, for a pure ensemble,
$$P(\omega) = |\langle\omega|\psi\rangle|^2 = \langle\psi|\omega\rangle\langle\omega|\psi\rangle = \langle\psi|\mathbb{P}_\omega|\psi\rangle = \langle\mathbb{P}_\omega\rangle$$
which combined with Eq. (4.2.22) tells us that
$$\bar{P}(\omega) = \mathrm{Tr}(\mathbb{P}_\omega\rho)$$
The following results may be easily established:
(1) $\rho^\dagger = \rho$
(2) $\mathrm{Tr}\,\rho = 1$
(3) $\rho^2 = \rho$ for a pure ensemble
(4) $\rho = (1/k)I$ for an ensemble uniformly distributed over $k$ states
(5) $\mathrm{Tr}\,\rho^2 \leq 1$ (equality holds for a pure ensemble) (4.2.23)
You are urged to convince yourself of these relations.
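A quick numerical check of Eq. (4.2.22) and of properties (1), (2), and (5) may help. This is a sketch under assumptions of our own choosing: a two-state ensemble with probabilities 0.75 and 0.25, an orthonormal basis rotated by an arbitrary angle, and an arbitrary real symmetric matrix standing in for $\Omega$. All names are hypothetical.

```python
import math


def outer(u, v):
    """Matrix |u><v| for real kets."""
    return [[a * b for b in v] for a in u]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def expectation(op, ket):
    """<ket|op|ket> for a real ket and a real matrix."""
    n = len(ket)
    return sum(ket[i] * op[i][j] * ket[j] for i in range(n) for j in range(n))


theta = 0.4                      # arbitrary rotation angle (assumption)
kets = [(math.cos(theta), math.sin(theta)),
        (-math.sin(theta), math.cos(theta))]   # orthonormal basis |1>, |2>
probs = [0.75, 0.25]             # p_i = n_i / N (assumed values)
omega = [[1.0, 2.0], [2.0, -1.0]]  # a Hermitian operator (assumed)

# rho = sum_i p_i |i><i|, Eq. (4.2.20)
rho = [[0.0, 0.0], [0.0, 0.0]]
for p, ket in zip(probs, kets):
    proj = outer(ket, ket)
    rho = [[rho[i][j] + p * proj[i][j] for j in range(2)] for i in range(2)]

ensemble_average = sum(p * expectation(omega, ket) for p, ket in zip(probs, kets))
trace_formula = trace(matmul(omega, rho))   # Tr(Omega rho), Eq. (4.2.22)
purity = trace(matmul(rho, rho))            # Tr(rho^2) = p_1^2 + p_2^2 here

print(ensemble_average, trace_formula, trace(rho), purity)
```

The two averages agree, the trace of `rho` is 1, and `purity` is less than 1 because the ensemble is mixed.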
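Example 4.2.4. To gain more familiarity with quantum theory let us consider
an infinite-dimensional ket $|\psi\rangle$ expanded in the basis $|x\rangle$ of the position
operator $X$:
$$|\psi\rangle = \int_{-\infty}^{\infty} |x\rangle\langle x|\psi\rangle\,dx = \int_{-\infty}^{\infty} |x\rangle\psi(x)\,dx$$
We call $\psi(x)$ the wave function (in the $X$ basis). Let us assume $\psi(x)$ is a Gaussian,
that is, $\psi(x) = A\exp[-(x-a)^2/2\Delta^2]$ (Fig. 4.2a). We now try to extract information
about this state by using the postulates. Let us begin by normalizing the state:
$$1 = \langle\psi|\psi\rangle = \int_{-\infty}^{\infty}\langle\psi|x\rangle\langle x|\psi\rangle\,dx = \int_{-\infty}^{\infty}|\psi(x)|^2\,dx = A^2\int_{-\infty}^{\infty} e^{-(x-a)^2/\Delta^2}\,dx = A^2(\pi\Delta^2)^{1/2} \quad \text{(see Appendix A.2)}$$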
Figure 4.2. (a) The modulus of the wave function, $|\langle x|\psi\rangle| = |\psi(x)|$. (b) The modulus of the wave
function, $|\langle p|\psi\rangle| = |\psi(p)|$.
So the normalized state is
$$\psi(x) = \frac{1}{(\pi\Delta^2)^{1/4}}\,e^{-(x-a)^2/2\Delta^2}$$
The probability for finding the particle between $x$ and $x + dx$ is
$$P(x)\,dx = |\psi(x)|^2\,dx = \frac{1}{(\pi\Delta^2)^{1/2}}\,e^{-(x-a)^2/\Delta^2}\,dx$$
which looks very much like Fig. 4.2a. Thus the particle is most likely to be found
around $x = a$, and chances of finding it away from this point drop rapidly beyond a
distance $\Delta$. We can quantify these statements by calculating the expectation value
and uncertainty for $X$. Let us do so.
Now, the operator $X$ defined in postulate II is the same one we discussed at
length in Section 1.10. Its action in the $X$ basis is simply to multiply by $x$, i.e., if
$$\langle x|\psi\rangle = \psi(x)$$
then,
$$\langle x|X|\psi\rangle = \int_{-\infty}^{\infty}\langle x|X|x'\rangle\langle x'|\psi\rangle\,dx' = \int_{-\infty}^{\infty} x\,\delta(x - x')\,\psi(x')\,dx' = x\psi(x)$$
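Using this result, the mean or expectation value of $X$ is
$$\langle X\rangle = \langle\psi|X|\psi\rangle = \int_{-\infty}^{\infty}\langle\psi|x\rangle\langle x|X|\psi\rangle\,dx = \int_{-\infty}^{\infty}\psi^*(x)\,x\,\psi(x)\,dx = \frac{1}{(\pi\Delta^2)^{1/2}}\int_{-\infty}^{\infty} e^{-(x-a)^2/\Delta^2}\,x\,dx$$
If we define $y = x - a$,
$$\langle X\rangle = \frac{1}{(\pi\Delta^2)^{1/2}}\int_{-\infty}^{\infty}(y + a)\,e^{-y^2/\Delta^2}\,dy = a$$
We should have anticipated this result of course, since the probability density is
symmetrically distributed around $x = a$.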
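Next, we calculate the fluctuations around $\langle X\rangle = a$, i.e., the uncertainty
$$\Delta X = [\langle\psi|(X - \langle X\rangle)^2|\psi\rangle]^{1/2}$$
$$= [\langle\psi|X^2 - 2X\langle X\rangle + \langle X\rangle^2|\psi\rangle]^{1/2}$$
$$= [\langle\psi|X^2|\psi\rangle - \langle X\rangle^2]^{1/2} \quad (\text{since } \langle\psi|X|\psi\rangle = \langle X\rangle)$$
$$= [\langle X^2\rangle - \langle X\rangle^2]^{1/2}$$
$$= [\langle X^2\rangle - a^2]^{1/2}$$
Now
$$\langle X^2\rangle = \frac{1}{(\pi\Delta^2)^{1/2}}\int_{-\infty}^{\infty} e^{-(x-a)^2/2\Delta^2}\cdot x^2\cdot e^{-(x-a)^2/2\Delta^2}\,dx = \frac{1}{(\pi\Delta^2)^{1/2}}\int_{-\infty}^{\infty} e^{-y^2/\Delta^2}(y^2 + 2ya + a^2)\,dy = \frac{\Delta^2}{2} + 0 + a^2$$
So
$$\Delta X = \frac{\Delta}{2^{1/2}}$$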
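The results $\langle X\rangle = a$ and $\Delta X = \Delta/2^{1/2}$ can be confirmed by direct numerical integration. The sketch below uses a midpoint rule on a finite grid; the values of `a` and `width` (our name for $\Delta$) and the grid parameters are arbitrary assumptions.

```python
import math

a, width = 1.5, 0.8              # centre a and width Delta (assumed test values)
n = 4000                         # number of grid cells
span = 10.0 * width              # integrate over [a - span, a + span]
dx = 2 * span / n
xs = [a - span + (i + 0.5) * dx for i in range(n)]   # cell midpoints


def psi(x):
    """Normalized Gaussian wave function of Example 4.2.4."""
    return (math.pi * width**2) ** -0.25 * math.exp(-(x - a) ** 2 / (2 * width**2))


density = [psi(x) ** 2 for x in xs]                  # P(x) = |psi(x)|^2
norm = sum(density) * dx
mean_x = sum(x * d for x, d in zip(xs, density)) * dx
mean_x2 = sum(x * x * d for x, d in zip(xs, density)) * dx
delta_x = math.sqrt(mean_x2 - mean_x**2)

print(norm, mean_x, delta_x, width / math.sqrt(2))
```

The norm comes out as 1, the mean as `a`, and `delta_x` agrees with `width / sqrt(2)` to within the accuracy of the grid.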
So much for the information on the variable $X$. Suppose we next want to know
the probability distribution for different values of another dynamical variable, say
the momentum $P$.
(1) First we must construct the operator $P$ in this basis.
(2) Then we must find its eigenvalues $p_i$ and eigenvectors $|p_i\rangle$.
(3) Finally, we must take the inner product $\langle p_i|\psi\rangle$.
(4) If $p$ is discrete, $|\langle p_i|\psi\rangle|^2 = P(p_i)$, and if $p$ is continuous, $|\langle p|\psi\rangle|^2 = P(p)$, the
probability density.
Now, the $P$ operator is just the $K$ operator discussed in Section 1.10 multiplied by
$\hbar$ and has the action of $-i\hbar\,d/dx$ in the $X$ basis, for if
$$\langle x|\psi\rangle = \psi(x)$$
then
$$\langle x|P|\psi\rangle = \int_{-\infty}^{\infty}\langle x|P|x'\rangle\langle x'|\psi\rangle\,dx' = \int_{-\infty}^{\infty}[-i\hbar\,\delta'(x - x')]\,\psi(x')\,dx' \quad \text{(Postulate II)}$$
$$= -i\hbar\frac{d\psi}{dx}$$
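Thus, if we project the eigenvalue equation
$$P|p\rangle = p|p\rangle$$
onto the $X$ basis, we get
$$\langle x|P|p\rangle = p\langle x|p\rangle$$
or
$$-i\hbar\frac{d\psi_p(x)}{dx} = p\,\psi_p(x)$$
where $\psi_p(x) = \langle x|p\rangle$. The solutions, normalized to the Dirac delta function,‡ are
(from Section 1.10)
$$\psi_p(x) = \frac{1}{(2\pi\hbar)^{1/2}}\,e^{ipx/\hbar}$$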
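Now we can compute
$$\langle p|\psi\rangle = \int_{-\infty}^{\infty}\langle p|x\rangle\langle x|\psi\rangle\,dx = \int_{-\infty}^{\infty}\psi_p^*(x)\,\psi(x)\,dx$$
$$= \int_{-\infty}^{\infty}\frac{e^{-ipx/\hbar}}{(2\pi\hbar)^{1/2}}\cdot\frac{e^{-(x-a)^2/2\Delta^2}}{(\pi\Delta^2)^{1/4}}\,dx = \left(\frac{\Delta^2}{\pi\hbar^2}\right)^{1/4} e^{-ipa/\hbar}\,e^{-p^2\Delta^2/2\hbar^2}$$
The modulus of $\psi(p)$ is a Gaussian (Fig. 4.2b) of width $\hbar/2^{1/2}\Delta$. It follows that
$\langle P\rangle = 0$, and $\Delta P = \hbar/2^{1/2}\Delta$. Since $\Delta X = \Delta/2^{1/2}$, we get the relation
$$\Delta X\cdot\Delta P = \hbar/2$$
‡ Here we want $\langle p|p'\rangle = \delta(p - p') = \delta(k - k')/\hbar$, where $p = \hbar k$. This explains the $(2\pi\hbar)^{-1/2}$ normalization
factor.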
The Gaussian happens to saturate the lower bound of the uncertainty relation (to
be formally derived in Chapter 9):
$$\Delta X\cdot\Delta P \geq \hbar/2$$
The uncertainty relation is a consequence of the general fact that anything
narrow in one space is wide in the transform space and vice versa. So if you are a
110-lb weakling and are taunted by a 600-lb bully, just ask him to step into momentum
space! □
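The momentum-space results of this example can also be checked numerically. The sketch below evaluates $\langle p|\psi\rangle$ as a discretized integral, compares it with the closed form above, and then computes $\Delta P$ from $|\psi(p)|^2$. It assumes units with $\hbar = 1$ and arbitrary values of `a` and `width` (our name for $\Delta$); the grid sizes are likewise assumptions.

```python
import cmath
import math

hbar, a, width = 1.0, 1.5, 0.8   # units with hbar = 1; a and Delta are assumed


def psi_x(x):
    return (math.pi * width**2) ** -0.25 * math.exp(-(x - a) ** 2 / (2 * width**2))


def psi_p_exact(p):
    """Closed form for <p|psi> obtained in the text."""
    return ((width**2 / (math.pi * hbar**2)) ** 0.25
            * cmath.exp(-1j * p * a / hbar)
            * math.exp(-(p * width) ** 2 / (2 * hbar**2)))


# Position grid (midpoint rule) over [a - 8*width, a + 8*width].
nx = 800
dx = 16 * width / nx
xs = [a - 8 * width + (i + 0.5) * dx for i in range(nx)]
fx = [psi_x(x) for x in xs]


def psi_p_numeric(p):
    """<p|psi> = integral of exp(-ipx/hbar) psi(x) / sqrt(2 pi hbar) dx."""
    total = sum(cmath.exp(-1j * p * x / hbar) * f for x, f in zip(xs, fx))
    return total * dx / math.sqrt(2 * math.pi * hbar)


# Momentum grid wide enough to contain the Gaussian |psi(p)|^2.
n_p = 400
p_max = 8 * hbar / width
dp = 2 * p_max / n_p
ps = [-p_max + (j + 0.5) * dp for j in range(n_p)]
dens = [abs(psi_p_numeric(p)) ** 2 for p in ps]

mean_p = sum(p * d for p, d in zip(ps, dens)) * dp
delta_p = math.sqrt(sum(p * p * d for p, d in zip(ps, dens)) * dp - mean_p**2)
delta_x = width / math.sqrt(2)

print(mean_p, delta_p, delta_x * delta_p)
```

Within the accuracy of the grids, `mean_p` vanishes and the product `delta_x * delta_p` equals `hbar / 2`, the lower bound quoted above.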
This is a good place to point out that the plane waves $e^{ipx/\hbar}$ (and all improper
vectors, i.e., vectors that can't be normalized to unity but only to the Dirac delta
function) are introduced into the formalism as purely mathematical entities. Our
inability to normalize them to unity translates into our inability to associate with
them a sensible absolute probability distribution, so essential to the physical
interpretation of the wave function. In the present case we have a particle whose relative
probability density is uniform in all of space. Thus the absolute probability of finding
it in any finite volume, even as big as our solar system, is zero. Since any particle
that we are likely to be interested in will definitely be known to exist in some finite
volume of such large dimensions, it is clear that no physically interesting state will
be given by a plane wave. But, since the plane waves are eigenfunctions of $P$, does
it mean that states of well-defined momentum do not exist? Yes, in the strict sense.
However, there do exist states that are both normalizable to unity (i.e., correspond
to proper vectors) and come arbitrarily close to having a precise momentum. For
example, a wave function that behaves as $e^{ip_0x/\hbar}$ over a large region of space and
tapers off to zero beyond, will be normalizable to unity and will have a Fourier
transform so sharply peaked at $p = p_0$ that momentum measurements will only give
results practically indistinguishable from $p_0$. Thus there is no conflict between the
fact that plane waves are unphysical, while states of well-defined momentum exist,
for "well defined" never means "mathematically exact," but only "exact to any
measurable accuracy." Thus a particle coming out of some accelerator with some
advertised momentum, say 500 GeV/c, is in a proper normalizable state (since it is
known to be located in our laboratory) and not in a plane wave state corresponding
to $|p = 500\ \mathrm{GeV}/c\rangle$.
But despite all this, we will continue to use the eigenkets $|p\rangle$ as basis vectors
and to speak of a particle being in the state $|p\rangle$, because these vectors are so much more
convenient to handle mathematically than the proper vectors. It should, however, be
borne in mind that when we say a particle is (coming out of the accelerator) in a
state $|p_0\rangle$, it is really in a proper state with a momentum space wave function so
sharply peaked at $p = p_0$ that it may be replaced by a delta function $\delta(p - p_0)$.
The other set of improper kets we will use in the same spirit are the position
eigenkets $|x\rangle$, which also form a convenient basis. Again, when we speak of a particle
being in a state $|x_0\rangle$ we shall mean that its wave function is so sharply peaked at
$x = x_0$ that it may be treated as a delta function to a good accuracy.‡
‡ Thus, by the physical Hilbert space, we mean the space of interest to physicists, not one whose elements
all correspond to physically realizable states.
Occasionally, the replacement of a proper wave function by its improper
counterpart turns out to be a poor approximation. Here is an example from Chapter 19:
Consider the probability that a particle coming out of an accelerator with a nearly
exact momentum scatters off a target and enters a detector placed far away, and not
in the initial direction. Intuition says that the answer must be zero if the target is
absent. This reasonable condition is violated if we approximate the initial state of
the particle by a plane wave (which is nonzero everywhere). So we proceed as follows.
In the vicinity of the target, we use the plane wave to approximate the initial wave
function, for the two are indistinguishable over the (finite and small) range of
influence of the target. At the detector, however, we go back to the proper wave (which
has tapered off) to represent the initial state.
Exercise 4.2.2.* Show that for a real wave function $\psi(x)$, the expectation value of
momentum $\langle P\rangle = 0$. (Hint: Show that the probabilities for the momenta $\pm p$ are equal.)
Generalize this result to the case $\psi = c\psi_r$, where $\psi_r$ is real and $c$ an arbitrary (real or complex)
constant. (Recall that $|\psi\rangle$ and $\alpha|\psi\rangle$ are physically equivalent.)
Exercise 4.2.3.* Show that if $\psi(x)$ has mean momentum $\langle P\rangle$, $e^{ip_0x/\hbar}\psi(x)$ has mean
momentum $\langle P\rangle + p_0$.
Example 4.2.5. The collapse of the state vector and the uncertainty principle
play a vital role in explaining the following extension of the double slit experiment.
Suppose I say, "I don't believe that a given particle (let us say an electron) doesn't
really go through one slit or the other. So I will set up a light source in between the
slits to the right of the screen. Each passing electron will be exposed by the beam
and I note which slit it comes out of. Then I note where it arrives on the screen. I
make a table of how many electrons arrive at each x and which slit they came from.
Now there is no escape from the conclusion that the number arriving at a given x
is the sum of the numbers arriving via $S_1$ and $S_2$. So much for quantum theory and
its interference pattern!"
But the point of course is that quantum theory no longer predicts an interference
pattern! The theory says that if an electron of definite momentum $p$ is involved, the
corresponding wave function is a wave with a well-defined wave number $k = p/\hbar$,
which interferes with itself and produces a nice interference pattern. This prediction
is valid only as long as the state of the electron is what we say it is. But this state is
necessarily altered by the light source, which upon measuring the position of the
electron (as being next to $S_1$, say) changes its wave function from something that
was extended in space to something localized near $S_1$. Once the state is changed, the
old prediction of interference is no longer valid.
Now, once in a while some electrons will get to the detectors without being
detected by the light source. We note where these arrive, but cannot classify them
as coming via $S_1$ or $S_2$. When the distribution of just these electrons is plotted, sure
enough we get the interference pattern. We had better, for quantum theory predicts
it, the state not having been tampered with in these cases.
The above experiment can also be used to demystify to some extent the collapse
of the wave function under measurement. Why is it that even the ideal measurement
produces unavoidable changes in the state? The answer, as we shall see, has to do
with the fact that $\hbar$ is not zero.
Figure 4.3. Light of wavelength $\lambda$ bounces off the electron, enters
the objective O of the microscope, and enters the eye E of the
observer.
Consider the schematic setup in Fig. 4.3. Light of wavelength $\lambda$ illuminates an
electron ($e^-$), enters the objective (O) of a microscope (M) and reaches our eye (E).
If $\delta\theta$ is the opening angle of the cone of light entering the objective after interacting
with the electron, classical optics limits the accuracy of the position measurement by an
uncertainty
$$\Delta X \cong \lambda/\sin\delta\theta$$
Both classically and quantum mechanically, we can reduce $\Delta X$ to 0 by reducing
$\lambda$ to zero.‡ In the latter description however, the improved accuracy in the position
measurement is at the expense of producing an increased uncertainty in the $x$
component ($P_x$) of the electron momentum. The reason is that light of wavelength $\lambda$ is not
a continuous wave whose impact on the electron momentum may be arbitrarily
reduced by a reduction of its amplitude, but rather a flux of photons of momentum
$p = 2\pi\hbar/\lambda$. As $\lambda$ decreases, the collisions between the electron and the photons
become increasingly violent. This in itself would not lead to an uncertainty in the
electron momentum, were it not for the fact that the $x$ component of the photons
entering the objective can range from 0 to $p\sin\delta\theta = 2\pi\hbar\sin\delta\theta/\lambda$. Since at least
one photon must reach our eyes after bouncing off the electron for us to see it, there
is a minimum uncertainty in the recoil momentum of the electron given by
$$\Delta P_x \cong \frac{2\pi\hbar}{\lambda}\sin\delta\theta$$
Consequently, we have at the end of our measurement an electron whose position
and momenta are uncertain by $\Delta X$ and $\Delta P_x$ such that
$$\Delta X\cdot\Delta P_x \cong 2\pi\hbar \cong \hbar$$
[The symbols $\Delta X$ and $\Delta P_x$ are not precisely the quantities defined in Eq. (4.2.7) but
are of the same order of magnitude.] This is the famous uncertainty principle. There
is no way around it. If we soften the blow of each photon by increasing $\lambda$ or narrow
the objective to better constrain the final photon momentum, we lose in resolution.
More elaborate schemes, which determine the recoil of the microscope, are equally
futile. Note that if $\hbar$ were 0, we could have $\Delta X$ and $\Delta P_x$ simultaneously 0. Physically,
it means that we can increase our position resolution without increasing the punch
carried by the photons. Of course $\hbar$ is not zero and we can't make it zero in any
experiment. But what we can do is to use bigger and bigger objects for our experiment
so that in the scale of these objects $\hbar$ appears to be negligible. We then regain
classical mechanics. The position of a billiard ball can be determined very well
by shining light on it, but this light hardly affects its momentum. This is why one
imagines in classical mechanics that momentum and position can be well defined
simultaneously. □
‡ This would be the ideal position measurement.
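To put rough numbers on the microscope argument, the sketch below evaluates the two order-of-magnitude estimates of this example for a few wavelengths and then compares the resulting velocity kick for an electron and for a billiard ball. The wavelengths, the opening angle of 0.5 rad, and the 0.17 kg ball mass are illustrative assumptions; the function name `microscope` is hypothetical.

```python
import math

HBAR = 1.054571817e-34        # reduced Planck constant in J s
M_ELECTRON = 9.1093837e-31    # electron mass in kg
M_BALL = 0.17                 # billiard ball mass in kg (assumed typical value)


def microscope(wavelength, opening_angle):
    """Order-of-magnitude estimates from the text: (Delta X, Delta P_x)."""
    delta_x = wavelength / math.sin(opening_angle)
    delta_px = 2 * math.pi * HBAR * math.sin(opening_angle) / wavelength
    return delta_x, delta_px


ratios = []
for wavelength in (5e-7, 5e-9, 5e-11):   # metres; visible light to hard x rays
    dx, dpx = microscope(wavelength, 0.5)
    ratios.append(dx * dpx / HBAR)
    print(f"lambda={wavelength:.0e} m  dX={dx:.2e} m  dPx={dpx:.2e} kg m/s")

# Velocity uncertainty dPx / m for visible light.
_, dpx_visible = microscope(5e-7, 0.5)
kick_electron = dpx_visible / M_ELECTRON
kick_ball = dpx_visible / M_BALL
print(kick_electron, kick_ball)
```

Shrinking the wavelength improves the resolution and raises the recoil in exact proportion, so every entry of `ratios` equals $2\pi$. The velocity kick is of order hundreds of meters per second for the electron and utterly negligible for the ball, which is the sense in which $\hbar$ appears negligible on the scale of large objects.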
Generalization to More Degrees of Freedom
Our discussion so far has been restricted to a system with one degree of freedom,
namely, a single particle in one dimension. We now extend our domain to a
system with $N$ degrees of freedom. The only modification is in postulate II, which
now reads as follows.
Postulate II. Corresponding to the $N$ Cartesian coordinates $x_1, \ldots, x_N$ describing
the classical system, there exist in quantum theory $N$ mutually commuting
operators $X_1, \ldots, X_N$. In the simultaneous eigenbasis $|x_1, x_2, \ldots, x_N\rangle$ of these
operators, called the coordinate basis and normalized as
$$\langle x_1, x_2, \ldots, x_N|x_1', \ldots, x_N'\rangle = \delta(x_1 - x_1')\cdots\delta(x_N - x_N')$$
(the product of delta functions vanishes unless all the arguments vanish) we
have the following correspondence:
$$|\psi\rangle \to \langle x_1, \ldots, x_N|\psi\rangle = \psi(x_1, \ldots, x_N)$$
$$X_i|\psi\rangle \to \langle x_1, \ldots, x_N|X_i|\psi\rangle = x_i\,\psi(x_1, \ldots, x_N)$$
$$P_i|\psi\rangle \to \langle x_1, \ldots, x_N|P_i|\psi\rangle = -i\hbar\frac{\partial}{\partial x_i}\psi(x_1, \ldots, x_N)$$
$P_i$ being the momentum operator corresponding to the classical momentum
$p_i$. Dependent dynamical variables $\omega(x_i, p_j)$ are represented by operators
$\Omega = \omega(x_i \to X_i, p_i \to P_i)$.
The other postulates remain the same. For example
$|\psi(x_1, \ldots, x_N)|^2\,dx_1\cdots dx_N$ is the probability that the particle coordinates lie
between $x_1, x_2, \ldots, x_N$ and $x_1 + dx_1, x_2 + dx_2, \ldots, x_N + dx_N$.
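This postulate is stated in terms of Cartesian coordinates since only in terms
of these can one express the operator assignments in the simple form $X_i \to x_i$,
$P_i \to -i\hbar\,\partial/\partial x_i$. Once the substitutions have been made and the desired equations
obtained in the coordinate basis, one can perform any desired change of variable
before solving them. Suppose, for example, that we want to find the eigenvalues and
eigenvectors of the operator $\Omega$, corresponding to the classical variable
$$\omega = \frac{p_1^2 + p_2^2 + p_3^2}{2m} + x_1^2 + x_2^2 + x_3^2 \tag{4.2.24}$$
where $x_1$, $x_2$, and $x_3$ are the three Cartesian coordinates and $p_i$ the corresponding
momenta of a particle of mass $m$ in three dimensions. Since the coordinates are
usually called $x$, $y$, and $z$, let us follow this popular notation and rewrite Eq. (4.2.24)
as
$$\omega = \frac{p_x^2 + p_y^2 + p_z^2}{2m} + x^2 + y^2 + z^2 \tag{4.2.25}$$
To solve the equation
$$\Omega|\omega\rangle = \omega|\omega\rangle$$
with
$$\Omega = \frac{P_x^2 + P_y^2 + P_z^2}{2m} + X^2 + Y^2 + Z^2$$
we make the substitution
$$|\omega\rangle \to \psi_\omega(x, y, z)$$
etc. and get
$$\left[-\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) + x^2 + y^2 + z^2\right]\psi_\omega(x, y, z) = \omega\,\psi_\omega(x, y, z) \tag{4.2.26}$$
Once we have obtained this differential equation, we can switch to any other set of
coordinates. In the present case the spherical coordinates $r$, $\theta$, and $\phi$ recommend
themselves. Since
$$\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} = \frac{1}{r^2}\left[\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right]$$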
Eq. (4.2.26) becomes
$$-\frac{\hbar^2}{2m}\left[\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi_\omega}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi_\omega}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi_\omega}{\partial\phi^2}\right] + r^2\psi_\omega = \omega\,\psi_\omega \tag{4.2.27}$$
What if we wanted to go directly from $\omega$ in spherical coordinates
$$\omega = \frac{1}{2m}\left(p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2\sin^2\theta}\right) + r^2$$
to Eq. (4.2.27)? It is clear upon inspection that there exists no simple rule [such as
$p_r \to -i\hbar\,\partial/\partial r$] for replacing the classical momenta by differential operators in $r$, $\theta$,
and $\phi$, which generates Eq. (4.2.27) starting from the $\omega$ above. There does exist a
complicated procedure for quantizing in non-Cartesian coordinates, but we will not
discuss it, since the recipe eventually reproduces what the Cartesian recipe (which
seems to work‡) yields so readily.
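To see concretely why the naive rule fails, it may help to compare the radial terms alone. The short worked comparison below is our own illustration (it is not spelled out in the text): the first line is what the naive substitution would produce, the second is the radial part of the Laplacian that actually appears in Eq. (4.2.27).

```latex
% Naive rule p_r -> -i\hbar\,\partial/\partial r applied to the radial term of \omega:
\frac{p_r^2}{2m} \;\longrightarrow\; -\frac{\hbar^2}{2m}\,\frac{\partial^2}{\partial r^2}

% Radial term actually present in Eq. (4.2.27):
-\frac{\hbar^2}{2m}\,\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right)
  = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r}\right)
```

The two differ by the first-derivative term $-(\hbar^2/m)(1/r)\,\partial/\partial r$, which the naive substitution cannot generate.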
There are further generalizations, namely, to relativistic quantum mechanics
and to quantum mechanics of systems in which particles are created and destroyed
(so that the number of degrees of freedom changes!). Except for a brief discussion
of these toward the end of the program, we will not address these matters.
4.3. The Schrödinger Equation (Dotting Your i's and Crossing Your $\hbar$'s)
Having discussed in some detail the state at a given time, we now turn our
attention to postulate IV, which specifies the change of this state with time. According
to this postulate, the state obeys the Schrödinger equation
$$i\hbar\frac{d}{dt}|\psi(t)\rangle = H|\psi(t)\rangle \tag{4.3.1}$$
Our discussion of this equation is divided into three sections:
(1) Setting up the equation
(2) General approach to its solution
(3) Choosing a basis for solving the equation
‡ In the sense that in cases where comparison with experiment is possible, as in say the hydrogen spectrum,
there is agreement.
Setting Up the Schrödinger Equation
To set up the Schrödinger equation one must simply make the substitution
$\mathcal{H}(x \to X, p \to P)$, where $\mathcal{H}$ is the classical Hamiltonian for the same problem. Thus,
if we are describing a harmonic oscillator, which is classically described by the
Hamiltonian
$$\mathcal{H} = \frac{p^2}{2m} + \frac{1}{2}m\omega^2x^2 \tag{4.3.2}$$
the Hamiltonian operator in quantum mechanics is
$$H = \frac{P^2}{2m} + \frac{1}{2}m\omega^2X^2 \tag{4.3.3}$$
In three dimensions, the Hamiltonian operator for the quantum oscillator is likewise
$$H = \frac{P_x^2 + P_y^2 + P_z^2}{2m} + \frac{1}{2}m\omega^2(X^2 + Y^2 + Z^2) \tag{4.3.4}$$
assuming the force constant is the same in all directions.
If the particle in one dimension is subject to a constant force $f$, then
$$\mathcal{H} = \frac{p^2}{2m} - fx$$
and
$$H = \frac{P^2}{2m} - fX \tag{4.3.5}$$
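For a particle of charge $q$ in an electromagnetic field in three dimensions,
$$\mathcal{H} = \frac{|\mathbf{p} - (q/c)\mathbf{A}(\mathbf{r}, t)|^2}{2m} + q\phi(\mathbf{r}, t) \tag{4.3.6}$$
In constructing the corresponding quantum Hamiltonian operator, we must use the
symmetrized form
$$H = \frac{1}{2m}\left(\mathbf{P}\cdot\mathbf{P} - \frac{q}{c}\mathbf{P}\cdot\mathbf{A} - \frac{q}{c}\mathbf{A}\cdot\mathbf{P} + \frac{q^2}{c^2}\mathbf{A}\cdot\mathbf{A}\right) + q\phi \tag{4.3.7}$$
since $\mathbf{P}$ does not commute with $\mathbf{A}$, which is a function of $X$, $Y$, and $Z$.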
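The failure of $\mathbf{P}$ and $\mathbf{A}$ to commute can be made explicit in the coordinate basis. The following short computation is our own illustration, using only the assignment $\mathbf{P} \to -i\hbar\nabla$ from postulate II and the product rule:

```latex
\langle \mathbf{r}|\,\mathbf{P}\cdot\mathbf{A}\,|\psi\rangle
  = -i\hbar\,\nabla\cdot\bigl(\mathbf{A}\psi\bigr)
  = \mathbf{A}\cdot\bigl(-i\hbar\nabla\psi\bigr) - i\hbar\,(\nabla\cdot\mathbf{A})\,\psi
\quad\Longrightarrow\quad
\mathbf{P}\cdot\mathbf{A} - \mathbf{A}\cdot\mathbf{P} = -i\hbar\,\nabla\cdot\mathbf{A}
```

The two orderings therefore coincide only when $\nabla\cdot\mathbf{A} = 0$; otherwise the symmetric combination in Eq. (4.3.7) is needed.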
In this manner one can construct the Hamiltonian H for any problem with a
classical counterpart. Problems involving spin have no classical counterparts and
some improvisation is called for. We will discuss this question when we study spin
in some detail in Chapter 14.
General Approach to the Solution
Let us first assume that $H$ has no explicit $t$ dependence. In this case the equation
$$i\hbar|\dot{\psi}\rangle = H|\psi\rangle$$
is analogous to equations discussed in Chapter 1
$$|\ddot{x}\rangle = \Omega|x\rangle$$
and
$$|\ddot{\psi}\rangle = -K^2|\psi\rangle$$
describing the coupled masses and the vibrating string, respectively. Our approach
will once again be to find the eigenvectors and eigenvalues of $H$ and to construct
the propagator $U(t)$ in terms of these. Once we have $U(t)$, we can write
$$|\psi(t)\rangle = U(t)|\psi(0)\rangle$$
There is no need to make assumptions about $|\dot{\psi}(0)\rangle$ here, since it is determined by
Eq. (4.3.1):
$$|\dot{\psi}(0)\rangle = -\frac{i}{\hbar}H|\psi(0)\rangle$$
In other words, Schrödinger's equation is first order in time, and the specification
of $|\psi\rangle$ at $t = 0$ is a sufficient initial-value datum.
Let us now construct an explicit expression for $U(t)$ in terms of $|E\rangle$, the normalized
eigenkets of $H$ with eigenvalues $E$ which obey
$$H|E\rangle = E|E\rangle \tag{4.3.8}$$
This is called the time-independent Schrödinger equation. Assume that we have solved
it and found the kets $|E\rangle$. If we expand $|\psi\rangle$ as
$$|\psi(t)\rangle = \sum_E |E\rangle\langle E|\psi(t)\rangle \equiv \sum_E a_E(t)|E\rangle \tag{4.3.9}$$
the equation for $a_E(t)$ follows if we act on both sides with $(i\hbar\,\partial/\partial t - H)$:
$$0 = (i\hbar\,\partial/\partial t - H)|\psi(t)\rangle = \sum_E (i\hbar\dot{a}_E - Ea_E)|E\rangle \;\Rightarrow\; i\hbar\dot{a}_E = Ea_E \tag{4.3.10}$$
where we have used the linear independence of the kets $|E\rangle$. The solution to Eq.
(4.3.10) is
$$a_E(t) = a_E(0)\,e^{-iEt/\hbar} \tag{4.3.11a}$$
or
$$\langle E|\psi(t)\rangle = \langle E|\psi(0)\rangle\,e^{-iEt/\hbar} \tag{4.3.11b}$$
so that
$$|\psi(t)\rangle = \sum_E |E\rangle\langle E|\psi(0)\rangle\,e^{-iEt/\hbar} \tag{4.3.12}$$
We can now extract $U(t)$:
$$U(t) = \sum_E |E\rangle\langle E|\,e^{-iEt/\hbar} \tag{4.3.13}$$
We have been assuming that the energy spectrum is discrete and nondegenerate. If
$E$ is degenerate, one must first introduce an extra label $\alpha$ (usually the eigenvalue of
a compatible observable) to specify the states. In this case
$$U(t) = \sum_\alpha\sum_E |E, \alpha\rangle\langle E, \alpha|\,e^{-iEt/\hbar}$$
If $E$ is continuous, the sum must be replaced by an integral. The normal modes
$$|E(t)\rangle = |E\rangle\,e^{-iEt/\hbar}$$
are also called stationary states for the following reason: the probability distribution
$P(\omega)$ for any variable $\Omega$ is time-independent in such a state:
$$P(\omega, t) = |\langle\omega|\psi(t)\rangle|^2 = |\langle\omega|E(t)\rangle|^2 = |\langle\omega|E\rangle\,e^{-iEt/\hbar}|^2 = |\langle\omega|E\rangle|^2 = P(\omega, 0)$$
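The construction of $U(t)$ from the eigenkets of $H$ can be tried out on the smallest nontrivial case. The sketch below is an illustration under assumptions of our own: a two-level system with $H = g\sigma_x$ (eigenvalues $\pm g$, eigenkets $(1, \pm 1)/2^{1/2}$), units with $\hbar = 1$, and arbitrary values of `g` and `t`.

```python
import cmath
import math

hbar, g = 1.0, 0.3               # units with hbar = 1; g is an assumed coupling
s = 1 / math.sqrt(2)
# (E, |E>) pairs for the assumed Hamiltonian H = [[0, g], [g, 0]].
eigen = [(+g, (s, s)), (-g, (s, -s))]


def propagator(t):
    """U(t) = sum_E |E><E| exp(-i E t / hbar), Eq. (4.3.13)."""
    u = [[0j, 0j], [0j, 0j]]
    for energy, ket in eigen:
        phase = cmath.exp(-1j * energy * t / hbar)
        for i in range(2):
            for j in range(2):
                u[i][j] += ket[i] * ket[j] * phase   # kets are real here
    return u


def apply(u, ket):
    return [sum(u[i][j] * ket[j] for j in range(2)) for i in range(2)]


t = 2.0
psi_t = apply(propagator(t), [1.0, 0.0])             # evolve |psi(0)> = (1, 0)
norm_t = sum(abs(c) ** 2 for c in psi_t)

# A stationary state: start in an eigenket of H and look at |<i|E(t)>|^2.
stationary = apply(propagator(t), list(eigen[0][1]))
stationary_probs = [abs(c) ** 2 for c in stationary]

print(psi_t, norm_t, stationary_probs)
```

For this Hamiltonian the evolved state is $(\cos gt, -i\sin gt)$, the norm stays equal to 1, and the probabilities in the stationary state remain at their initial values of 1/2.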
There exists another expression for $U(t)$ besides the sum, Eq. (4.3.13), and
that is
$$U(t) = e^{-iHt/\hbar} \tag{4.3.14}$$
If this exponential series converges (and it sometimes does not), this form of
$U(t)$ can be very useful. (Convince yourself that $|\psi(t)\rangle = e^{-iHt/\hbar}|\psi(0)\rangle$ satisfies
Schrödinger's equation.)
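For a finite-dimensional $H$ the exponential series always converges, and it can be summed directly. The following sketch does so for an assumed two-level Hamiltonian $H = g\sigma_x$ in units with $\hbar = 1$ and compares the partial sum with the closed form $\cos(gt)\,I - i\sin(gt)\,\sigma_x$, which follows from $\sigma_x^2 = I$. The truncation at 30 terms is an arbitrary choice.

```python
import math

hbar, g, t = 1.0, 0.3, 2.0       # assumed values, hbar = 1
H = [[0.0, g], [g, 0.0]]         # assumed Hamiltonian g * sigma_x


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_series(h, time, terms=30):
    """Partial sum of exp(-i h t / hbar) = sum_n (-i h t / hbar)^n / n!."""
    x = [[-1j * time / hbar * h[i][j] for j in range(2)] for i in range(2)]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]          # n = 0 term: the identity
    total = [row[:] for row in term]
    for n in range(1, terms):
        term = [[c / n for c in row] for row in matmul(term, x)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total


U = exp_series(H, t)
c, sn = math.cos(g * t), math.sin(g * t)
closed = [[c, -1j * sn], [-1j * sn, c]]          # cos(gt) I - i sin(gt) sigma_x
max_error = max(abs(U[i][j] - closed[i][j]) for i in range(2) for j in range(2))
print(max_error)
```

The series and the closed form agree to rounding error. For operators on an infinite-dimensional space no such guarantee exists, which is the caveat made above.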
Since $H$ (the energy operator) is Hermitian, it follows that $U(t)$ is unitary. We
may therefore think of the time evolution of a ket $|\psi(t)\rangle$ as a "rotation" in Hilbert
space. One immediate consequence is that the norm $\langle\psi(t)|\psi(t)\rangle$ is invariant:
$$\langle\psi(t)|\psi(t)\rangle = \langle\psi(0)|U^\dagger(t)U(t)|\psi(0)\rangle = \langle\psi(0)|\psi(0)\rangle \tag{4.3.15}$$
so that a state, once normalized, stays normalized. There are other consequences of
the fact that the time evolution may be viewed as a rotation. For example, one can
abandon the fixed basis we have been using, and adopt one that also rotates at the
same rate as the state vectors. In such a basis the vectors would appear frozen, but
the operators, which were constant matrices in the fixed basis, would now appear to
be time dependent. Any physical entity, such as a matrix element, would, however,
come out the same as before since $\langle\phi|\Omega|\psi\rangle$, which is the dot product of $\langle\phi|$ and
$|\Omega\psi\rangle$, is invariant under rotations. This view of quantum mechanics is called the
Heisenberg picture, while the one we have been using is called the Schrödinger picture.
Infinitely many pictures are possible, each labeled by how the basis is rotating. So
if you think you were born too late to make a contribution to quantum theory fear
not, for you can invent your own picture. We will take up the study of various
pictures in Chapter 18.
Let us now consider the case $H = H(t)$. We no longer look for normal modes,
since the operator in question is changing with time. There exists no fixed strategy
for solving such problems. In the course of our study we will encounter a
time-dependent problem involving spin which can be solved exactly. We will also study
a systematic approximation scheme for solving problems with
$$H(t) = H^0 + H^1(t)$$
where $H^0$ is a large time-independent piece and $H^1(t)$ is a small time-dependent
piece.
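What is the propagator $U(t)$ in the time-dependent case? In other words, how
is $U(t)$ in $|\psi(t)\rangle = U(t)|\psi(0)\rangle$ related to $H(t)$? To find out, we divide the interval
$(0 - t)$ into $N$ pieces of width $\Delta = t/N$, where $N$ is very large and $\Delta$ is very small. By
integrating the Schrödinger equation over the first interval, we can write to first order
in $\Delta$
$$|\psi(\Delta)\rangle = |\psi(0)\rangle + \Delta\left.\frac{d|\psi\rangle}{dt}\right|_0 = |\psi(0)\rangle - \frac{i\Delta}{\hbar}H(0)|\psi(0)\rangle = \left[1 - \frac{i\Delta}{\hbar}H(0)\right]|\psi(0)\rangle$$
which, to this order
$$= \exp\left[\frac{-i\Delta}{\hbar}H(0)\right]|\psi(0)\rangle$$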
[One may wonder whether in the interval $0 - \Delta$, one must use $H(0)$ or $H(\Delta)$ or
$H(\Delta/2)$ and so on. The difference between these possibilities is of order $\Delta$ and hence
irrelevant, since there is already one power of $\Delta$ in front of $H$.] Inching forth in steps
of $\Delta$, we get
$$|\psi(t)\rangle = \prod_{n=0}^{N-1} e^{-i\Delta H(n\Delta)/\hbar}\,|\psi(0)\rangle$$
We cannot simply add the exponents to get, in the $N \to \infty$ limit,
$$U(t) = \exp\left[-(i/\hbar)\int_0^t H(t')\,dt'\right]$$
since
$$[H(t_1), H(t_2)] \neq 0$$
in general. For example, if
$$H(t) = X^2\cos^2\omega t + P^2\sin^2\omega t$$
then
$$H(0) = X^2$$
and
$$H(\pi/2\omega) = P^2$$
and
$$[H(0), H(\pi/2\omega)] \neq 0$$
It is common to use the symbol, called the time ordered integral

N— I
T{exp[—(i/ h) f HO dt'll= l im 11 exp[—(i/h)H(nA)A]
N—■ co n=0
0
in such problems. We will not make much use of this form of U(t). But notice that
being a product of unitary operators, U(t) is unitary, and time evolution continues
to be a "rotation" whether or not H is time independent.
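
The role of operator ordering can be illustrated numerically. The following is a minimal sketch, not part of the text's development: it assumes a two-level Hamiltonian $H(t) = \cos(\omega t)\,\sigma_x + \sin(\omega t)\,\sigma_z$ (chosen here only because $[H(t_1), H(t_2)] \neq 0$), sets $\hbar = 1$, and uses hypothetical helper names. It builds the product of short-time exponentials, checks that the result is unitary, and compares it with the naive exponential of the integrated Hamiltonian.

```python
import math

# Assumed units: hbar = 1. Assumed Hamiltonian (illustrative only):
# H(t) = cos(w t) sigma_x + sin(w t) sigma_z, for which [H(t1), H(t2)] != 0.

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def exp_pauli(bx, bz):
    """exp[-i (bx sigma_x + bz sigma_z)] = cos|b| I - i sin|b| (b.sigma)/|b|."""
    b = math.hypot(bx, bz)
    if b == 0.0:
        return [row[:] for row in IDENTITY]
    c, s = math.cos(b), math.sin(b)
    nx, nz = bx / b, bz / b
    return [[c - 1j * s * nz, -1j * s * nx],
            [-1j * s * nx, c + 1j * s * nz]]

def time_ordered(t, w=1.0, n_steps=2000):
    """Product of exp[-i H(n Delta) Delta], with later times acting on the left."""
    delta = t / n_steps
    u = [row[:] for row in IDENTITY]
    for n in range(n_steps):
        tn = n * delta
        step = exp_pauli(delta * math.cos(w * tn), delta * math.sin(w * tn))
        u = matmul(step, u)
    return u

def naive(t, w=1.0):
    """exp[-i * integral of H], which ignores the ordering of the factors."""
    return exp_pauli(math.sin(w * t) / w, (1.0 - math.cos(w * t)) / w)

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

t = math.pi / 2
u_ordered = time_ordered(t)
unitarity_error = max_abs_diff(matmul(dagger(u_ordered), u_ordered), IDENTITY)
ordering_gap = max_abs_diff(u_ordered, naive(t))
print(unitarity_error, ordering_gap)
```

In this sketch the product stays unitary to rounding error, while its matrix elements differ visibly from those of the naive exponential, which is the point made above about not adding the exponents.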
Whether or not $H$ depends on time, the propagator satisfies the following
conditions:

$$U(t_3, t_2)U(t_2, t_1) = U(t_3, t_1)$$
$$U^\dagger(t_2, t_1) = U^{-1}(t_2, t_1) = U(t_1, t_2) \tag{4.3.16}$$

It is intuitively clear that these equations are correct. You can easily prove them by
applying the U's to some arbitrary state and using the fact that U is unitary and
U(t, t)= I.
Choosing a Basis for Solving Schrödinger's Equation
Barring a few exceptions, the Schrödinger equation is always solved in a particular
basis. Although all bases are equal mathematically, some are more equal than
others. First of all, since $H = H(X, P)$, the $X$ and $P$ bases recommend themselves, for
in going to one of them the corresponding operator is rendered diagonal. Thus one
can go to the $X$ basis, in which $X \to x$ and $P \to -i\hbar\, d/dx$, or to the $P$ basis, in which
$P \to p$ and $X \to i\hbar\, d/dp$. The choice between the two depends on the Hamiltonian.
Assuming it is of the form (in one dimension)

$$H = T + V = \frac{P^2}{2m} + V(X) \tag{4.3.17}$$

the choice is dictated by V(X). Since V(X) is usually a more complicated function
of X than T is of P, one prefers the X basis. Thus if

$$H = \frac{P^2}{2m} + \frac{1}{\cosh^2 X} \tag{4.3.18}$$

the equation

$$H|E\rangle = E|E\rangle$$

becomes in the $X$ basis the second-order equation

$$\left(-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{\cosh^2 x}\right)\psi_E(x) = E\psi_E(x) \tag{4.3.19}$$

which can be solved. Had one gone to the $P$ basis, one would have ended up with
the equation

$$\left[\frac{p^2}{2m} + \frac{1}{\cosh^2(i\hbar\, d/dp)}\right]\psi_E(p) = E\psi_E(p) \tag{4.3.20}$$

which is quite frightening.
A problem where the $P$ basis is preferred is that of a particle in a constant force
field $f$, for which

$$H = \frac{P^2}{2m} - fX \tag{4.3.21}$$

In the $P$ basis one gets a first-order differential equation

$$\left(\frac{p^2}{2m} - i\hbar f\frac{d}{dp}\right)\psi_E(p) = E\psi_E(p) \tag{4.3.22}$$

whereas in the $X$ basis one gets the second-order equation

$$\left(-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} - fx\right)\psi_E(x) = E\psi_E(x) \tag{4.3.23}$$

The harmonic oscillator can be solved with equal ease in either basis since H is
quadratic in X and P. It turns out to be preferable to solve it in a third basis in
which neither X nor P is diagonal! You must wait till Chapter 7 before you see how
this happens.
There exists a built-in bias in favor of the $X$ basis. This has to do with the fact
that the $x$ space is the space we live in. In other words, when we speak of the
probability of obtaining a value between $x$ and $x + dx$ if the variable $X$ is measured,
we mean simply the probability of finding the particle between $x$ and $x + dx$ in our
space. One may thus visualize $\psi(x)$ as a function in our space, whose modulus
squared gives the probability density for finding a particle near $x$. Such a picture is
useful in thinking about the double-slit experiment or the electronic states in a
hydrogen atom.
But like all pictures, it has its limits. First of all it must be borne in mind that
even though $\psi(x)$ can be visualized as a wave in our space, it is not a real wave,
like the electromagnetic wave, which carries energy, momentum, etc. To understand
this point, consider a particle in three dimensions. The function $\psi(x, y, z)$ can be
visualized as a wave in our space. But, if we consider next a two-particle system,
$\psi(x_1, y_1, z_1, x_2, y_2, z_2)$ is a function in a six-dimensional configuration space and
cannot be visualized in our space.
Thus the case of the single particle is really an exception: there is only one
position operator and the space of its eigenvalues happens to coincide with the space
in which we live and in which the drama of physics takes place.
This brings us to the end of our general discussion of the postulates. We now turn
to the application of quantum theory to various physical problems. For pedagogical
reasons, we will restrict ourselves to problems of a single particle in one dimension
in the next few chapters.
Simple Problems in
One Dimension
Now that the postulates have been stated and explained, it is all over but for the
applications. We begin with the simplest class of problems, those concerning a single
particle in one dimension. Although these one-dimensional problems are somewhat
artificial, they contain most of the features of three-dimensional quantum mechanics
but little of its complexity. One problem we will not discuss in this chapter is that
of the harmonic oscillator. This problem is so important that a separate chapter has
been devoted to its study.
5.1. The Free Particle
The simplest problem in this family is of course that of the free particle. The
Schrödinger equation is

$$i\hbar|\dot\psi\rangle = H|\psi\rangle = \frac{P^2}{2m}|\psi\rangle \tag{5.1.1}$$

The normal modes or stationary states are solutions of the form

$$|\psi\rangle = |E\rangle e^{-iEt/\hbar} \tag{5.1.2}$$

Feeding this into Eq. (5.1.1), we get the time-independent Schrödinger equation
for $|E\rangle$:

$$H|E\rangle = \frac{P^2}{2m}|E\rangle = E|E\rangle \tag{5.1.3}$$

This problem can be solved without going to any basis. First note that any eigenstate
of $P$ is also an eigenstate of $P^2$. So we feed the trial solution $|p\rangle$ into Eq. (5.1.3)
and find

$$\frac{P^2}{2m}|p\rangle = E|p\rangle$$

or

$$\left(\frac{p^2}{2m} - E\right)|p\rangle = 0 \tag{5.1.4}$$

Since $|p\rangle$ is not a null vector, we find that the allowed values of $p$ are

$$p = \pm(2mE)^{1/2} \tag{5.1.5}$$

In other words, there are two orthogonal eigenstates for each eigenvalue $E$:

$$|E, +\rangle = |p = (2mE)^{1/2}\rangle \tag{5.1.6}$$
$$|E, -\rangle = |p = -(2mE)^{1/2}\rangle \tag{5.1.7}$$

Thus, we find that to the eigenvalue $E$ there corresponds a degenerate two-dimensional
eigenspace, spanned by the above vectors. Physically this means that a particle
of energy $E$ can be moving to the right or to the left with momentum $|p| = (2mE)^{1/2}$.
Now, you might say, "This is exactly what happens in classical mechanics. So what's
new?" What is new is the fact that the state

$$|E\rangle = \beta|p = (2mE)^{1/2}\rangle + \gamma|p = -(2mE)^{1/2}\rangle \tag{5.1.8}$$

is also an eigenstate of energy $E$ and represents a single particle of energy $E$ that can
be caught moving either to the right or to the left with momentum $(2mE)^{1/2}$!
To construct the complete orthonormal eigenbasis of $H$, we must pick from
each degenerate eigenspace any two orthonormal vectors. The obvious choice is
given by the kets $|E, +\rangle$ and $|E, -\rangle$ themselves. In terms of the ideas discussed in
the past, we are using the eigenvalue of a compatible variable $P$ as an extra label
within the space degenerate with respect to energy. Since $P$ is a nondegenerate
operator, the label $p$ by itself is adequate. In other words, there is no need to call
the state $|p, E = p^2/2m\rangle$, since the value of $E = E(p)$ follows, given $p$. We shall
therefore drop this redundant label.
The propagator is then

$$U(t) = \int_{-\infty}^{\infty} |p\rangle\langle p|\, e^{-iE(p)t/\hbar}\, dp = \int_{-\infty}^{\infty} |p\rangle\langle p|\, e^{-ip^2t/2m\hbar}\, dp \tag{5.1.9}$$

Exercise 5.1.1. Show that Eq. (5.1.9) may be rewritten as an integral over $E$ and a sum
over the $\pm$ index as

$$U(t) = \sum_{\alpha=\pm}\int_0^\infty \left[\frac{m}{(2mE)^{1/2}}\right] |E, \alpha\rangle\langle E, \alpha|\, e^{-iEt/\hbar}\, dE$$

Exercise 5.1.2. * By solving the eigenvalue equation (5.1.3) in the $X$ basis, regain Eq.
(5.1.8), i.e., show that the general solution of energy $E$ is

$$\psi_E(x) = \beta\,\frac{\exp[i(2mE)^{1/2}x/\hbar]}{(2\pi\hbar)^{1/2}} + \gamma\,\frac{\exp[-i(2mE)^{1/2}x/\hbar]}{(2\pi\hbar)^{1/2}}$$

[The factor $(2\pi\hbar)^{-1/2}$ is arbitrary and may be absorbed into $\beta$ and $\gamma$.] Though $\psi_E(x)$
will satisfy the equation even if $E < 0$, are these functions in the Hilbert space?
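The propagator $U(t)$ can be evaluated explicitly in the $X$ basis. We start with
the matrix element

$$U(x, t; x') \equiv \langle x|U(t)|x'\rangle = \int_{-\infty}^{\infty}\langle x|p\rangle\langle p|x'\rangle\, e^{-ip^2t/2m\hbar}\, dp$$
$$= \frac{1}{2\pi\hbar}\int_{-\infty}^{\infty} e^{ip(x-x')/\hbar}\, e^{-ip^2t/2m\hbar}\, dp$$
$$= \left(\frac{m}{2\pi\hbar it}\right)^{1/2} e^{im(x-x')^2/2\hbar t} \tag{5.1.10}$$
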
using the result from Appendix A.2 on Gaussian integrals. In terms of this propagator,
any initial-value problem can be solved, since

$$\psi(x, t) = \int U(x, t; x')\,\psi(x', 0)\, dx' \tag{5.1.11}$$

Had we chosen the initial time to be $t'$ rather than zero, we would have gotten

$$\psi(x, t) = \int U(x, t; x', t')\,\psi(x', t')\, dx' \tag{5.1.12}$$

where $U(x, t; x', t') = \langle x|U(t - t')|x'\rangle$, since $U$ depends only on the time interval $t - t'$
and not the absolute values of $t$ and $t'$. [Had there been a time-dependent potential
such as $V(t) = V_0 e^{-\alpha t^2}$ in $H$, we could have told what absolute time it was by looking
at $V(t)$. In the absence of anything defining an absolute time in the problem, only
time differences have physical significance.] Whenever we set $t' = 0$, we will resort to
our old convention and write $U(x, t; x', 0)$ as simply $U(x, t; x')$.
A nice physical interpretation may be given to $U(x, t; x', t')$ by considering a
special case of Eq. (5.1.12). Suppose we started off with a particle localized at
$x' = x_0'$, that is, with $\psi(x', t') = \delta(x' - x_0')$. Then

$$\psi(x, t) = U(x, t; x_0', t') \tag{5.1.13}$$

In other words, the propagator (in the $X$ basis) is the amplitude that a particle
starting out at the space-time point $(x_0', t')$ ends up at the space-time point $(x, t)$.
[It can obviously be given such an interpretation in any basis: $\langle\omega|U(t, t')|\omega'\rangle$ is the
amplitude that a particle in the state $|\omega'\rangle$ at $t'$ ends up in the state $|\omega\rangle$ at $t$.]
Equation (5.1.12) then tells us that the total amplitude for the particle's arrival at
$(x, t)$ is the sum of the contributions from all points $x'$ with a weight proportional
to the initial amplitude $\psi(x', t')$ that the particle was at $x'$ at time $t'$. One also refers
to $U(x, t; x_0', t')$ as the "fate" of the delta function $\psi(x', t') = \delta(x' - x_0')$.
Time Evolution of the Gaussian Packet
There is an unwritten law which says that the derivation of the free-particle
propagator be followed by its application to the Gaussian packet. Let us follow this
tradition.
Consider as the initial wave function the wave packet

$$\psi(x', 0) = e^{ip_0x'/\hbar}\,\frac{e^{-x'^2/2\Delta^2}}{(\pi\Delta^2)^{1/4}} \tag{5.1.14}$$

This packet has mean position $\langle X\rangle = 0$, with an uncertainty $\Delta X = \Delta/2^{1/2}$, and mean
momentum $p_0$ with uncertainty $\hbar/2^{1/2}\Delta$. By combining Eqs. (5.1.10) and (5.1.12) we
get

$$\psi(x, t) = \left[\pi^{1/2}\left(\Delta + \frac{i\hbar t}{m\Delta}\right)\right]^{-1/2} \exp\left[\frac{-(x - p_0t/m)^2}{2\Delta^2(1 + i\hbar t/m\Delta^2)}\right] \times \exp\left[\frac{ip_0}{\hbar}\left(x - \frac{p_0t}{2m}\right)\right] \tag{5.1.15}$$

The corresponding probability density is

$$P(x, t) = \frac{1}{\pi^{1/2}(\Delta^2 + \hbar^2t^2/m^2\Delta^2)^{1/2}}\exp\left\{\frac{-[x - (p_0/m)t]^2}{\Delta^2 + \hbar^2t^2/m^2\Delta^2}\right\} \tag{5.1.16}$$

The main features of this result are as follows:
(1) The mean position of the particles is

$$\langle X\rangle = \frac{p_0t}{m} \equiv \frac{\langle P\rangle t}{m}$$

In other words, the classical relation $x = (p/m)t$ now holds between average quantities.
This is just one of the consequences of the Ehrenfest theorem, which states
that the classical equations obeyed by dynamical variables will have counterparts in
quantum mechanics as relations among expectation values. The theorem will be
proved in the next chapter.
(2) The width of the packet grows as follows:

$$\Delta X(t) = \frac{\Delta(t)}{2^{1/2}} = \frac{\Delta}{2^{1/2}}\left(1 + \frac{\hbar^2t^2}{m^2\Delta^4}\right)^{1/2} \tag{5.1.17}$$

The increasing uncertainty in position is a reflection of the fact that any uncertainty
in the initial velocity (that is to say, the momentum) will be reflected with passing
time as a growing uncertainty in position. In the present case, since $\Delta V(0) = \Delta P(0)/m = \hbar/2^{1/2}m\Delta$,
the uncertainty in $X$ grows approximately as $\Delta X \simeq \hbar t/2^{1/2}m\Delta$, which
agrees with Eq. (5.1.17) for large times. Although we are able to understand the
spreading of the wave packet in classical terms, the fact that the initial spread $\Delta V(0)$
is unavoidable (given that we wish to specify the position to an accuracy $\Delta$) is a
purely quantum mechanical feature.
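
As a numerical cross-check of the packet's evolution, the sketch below applies the free-particle propagator of Eq. (5.1.10) to the initial packet of Eq. (5.1.14) by direct quadrature, as in Eq. (5.1.11), and compares $|\psi(x,t)|^2$ with the closed form of Eq. (5.1.16). The units ($\hbar = m = \Delta = 1$), the value of $p_0$, the sample point, and all function names are assumptions made purely for illustration.

```python
import cmath
import math

# Assumed illustrative units and parameters (not from the text).
HBAR = 1.0
M = 1.0
DELTA = 1.0
P0 = 2.0

def psi0(x):
    """Initial Gaussian packet, Eq. (5.1.14)."""
    return (cmath.exp(1j * P0 * x / HBAR)
            * math.exp(-x * x / (2 * DELTA**2))
            / (math.pi * DELTA**2) ** 0.25)

def propagator(x, t, xp):
    """Free-particle propagator U(x, t; x'), Eq. (5.1.10)."""
    return (cmath.sqrt(M / (2j * math.pi * HBAR * t))
            * cmath.exp(1j * M * (x - xp) ** 2 / (2 * HBAR * t)))

def evolve(x, t, half_range=12.0, n=4800):
    """psi(x, t) from Eq. (5.1.11), using the trapezoidal rule over x'."""
    h = 2 * half_range / n
    total = 0j
    for i in range(n + 1):
        xp = -half_range + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * propagator(x, t, xp) * psi0(xp)
    return total * h

def density_exact(x, t):
    """Probability density P(x, t), Eq. (5.1.16)."""
    width2 = DELTA**2 + (HBAR * t / (M * DELTA)) ** 2
    return math.exp(-(x - P0 * t / M) ** 2 / width2) / math.sqrt(math.pi * width2)

t = 1.5
x = 2.0
numeric = abs(evolve(x, t)) ** 2
exact = density_exact(x, t)
print(numeric, exact)
```

The quadrature works here because the Gaussian envelope of the initial packet cuts off the oscillatory integrand long before the edges of the integration range.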
If the particle in question were macroscopic, say of mass 1 g, and we wished to
fix its initial position to within a proton width, which is approximately $10^{-13}$ cm, the
uncertainty in velocity would be

$$\Delta V(0) \simeq \frac{\hbar}{2^{1/2}m\Delta} \simeq 10^{-14}\ \text{cm/sec}$$

It would be over 300,000 years before the uncertainty $\Delta(t)$ grew to 1 millimeter! We
may therefore treat a macroscopic particle classically for any reasonable length of
time. This and similar questions will be taken up in greater detail in the next chapter.
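
The arithmetic behind this estimate is short enough to sketch. The block below uses CGS units with an assumed rounded value of $\hbar$ and an assumed year length; it evaluates $\Delta V(0) = \hbar/2^{1/2}m\Delta$ and solves $\Delta(t) = \Delta(1 + \hbar^2t^2/m^2\Delta^4)^{1/2}$ for the time at which $\Delta(t)$ reaches 1 mm.

```python
import math

HBAR = 1.0546e-27        # erg*s, rounded CGS value (assumed here)
SECONDS_PER_YEAR = 3.156e7

mass = 1.0               # g
delta = 1.0e-13          # cm, roughly a proton width
target = 0.1             # cm, i.e., 1 millimeter

# Initial velocity uncertainty, Delta V(0) = hbar / (sqrt(2) m Delta).
dv = HBAR / (math.sqrt(2) * mass * delta)

# Invert Delta(t) = Delta * sqrt(1 + hbar^2 t^2 / (m^2 Delta^4)) for t.
t_seconds = (mass * delta**2 / HBAR) * math.sqrt((target / delta) ** 2 - 1.0)
t_years = t_seconds / SECONDS_PER_YEAR

print(dv, t_years)
```

With these inputs the velocity uncertainty comes out near $10^{-14}$ cm/sec and the spreading time at a few hundred thousand years, in line with the figures quoted above.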
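Exercise 5.1.3 (Another Way to Do the Gaussian Problem). We have seen that there exists
another formula for $U(t)$, namely, $U(t) = e^{-iHt/\hbar}$. For a free particle this becomes

$$U(t) = \exp\left[\frac{i}{\hbar}\left(\frac{\hbar^2t}{2m}\frac{d^2}{dx^2}\right)\right] = \sum_{n=0}^{\infty}\frac{1}{n!}\left(\frac{i\hbar t}{2m}\right)^n\frac{d^{2n}}{dx^{2n}} \tag{5.1.18}$$

Consider the initial state in Eq. (5.1.14) with $p_0 = 0$, and set $\Delta = 1$, $t' = 0$:

$$\psi(x, 0) = \frac{e^{-x^2/2}}{(\pi)^{1/4}}$$

Find $\psi(x, t)$ using Eq. (5.1.18) above and compare with Eq. (5.1.15).
Hints: (1) Write $\psi(x, 0)$ as a power series:

$$\psi(x, 0) = (\pi)^{-1/4}\sum_{n=0}^{\infty}\frac{(-1)^n x^{2n}}{n!\,(2)^n}$$

(2) Find the action of a few terms

$$1,\quad \left(\frac{i\hbar t}{2m}\right)\frac{d^2}{dx^2},\quad \frac{1}{2!}\left(\frac{i\hbar t}{2m}\frac{d^2}{dx^2}\right)^2,$$

etc., on this power series.
(3) Collect terms with the same power of $x$.
(4) Look for the following series expansion in the coefficient of $x^{2n}$:

$$\left(1 + \frac{it\hbar}{m}\right)^{-n-1/2} = 1 - (n + 1/2)\left(\frac{i\hbar t}{m}\right) + \frac{(n + 1/2)(n + 3/2)}{2!}\left(\frac{i\hbar t}{m}\right)^2 + \cdots$$

(5) Juggle around till you get the answer.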
Exercise 5.1.4: A Famous Counterexample. Consider the wave function

$$\psi(x, 0) = \sin\left(\frac{\pi x}{L}\right), \quad |x| \le L/2$$
$$= 0, \quad |x| > L/2$$

It is clear that when this function is differentiated any number of times we get another function
confined to the interval $|x| \le L/2$. Consequently the action of

$$U(t) = \exp\left[\frac{i}{\hbar}\left(\frac{\hbar^2t}{2m}\right)\frac{d^2}{dx^2}\right]$$

on this function is to give a function confined to $|x| \le L/2$. What about the spreading of the
wave packet?
[Answer: Consider the derivatives at the boundary. We have here an example where the
(exponential) operator power series doesn't converge. Notice that the convergence of an
operator power series depends not just on the operator but also on the operand. So there is
no paradox: if the function dies abruptly as above, so that there seems to be a paradox, the
derivatives are singular at the boundary, while if it falls off continuously, the function will
definitely leak out given enough time, no matter how rapid the falloff.]
Some General Features of Energy Eigenfunctions
Consider now the energy eigenfunctions in some potential $V(x)$. These obey

$$\psi'' = -\frac{2m(E - V)}{\hbar^2}\psi$$

where each prime denotes a spatial derivative. Let us ask what the continuity of
$V(x)$ implies. Let us start at some point $x_0$ where $\psi$ and $\psi'$ have the values $\psi(x_0)$
and $\psi'(x_0)$. If we pretend that $x$ is a time variable and that $\psi$ is a particle coordinate,
the problem of finding $\psi$ everywhere else is like finding the trajectory of a particle
(for all times past and future) given its position and velocity at some time and its
acceleration as a function of its position and time. It is clear that if we integrate
these equations we will get continuous $\psi'(x)$ and $\psi(x)$. This is the typical situation.
There are, however, some problems where, for mathematical simplicity, we consider
potentials that change abruptly at some point. This means that $\psi''$ jumps abruptly
there. However, $\psi'$ will still be continuous, for the area under a function is continuous
even if the function jumps a bit. What if the change in $V$ is infinitely large? It means
that $\psi''$ is also infinitely large. This in turn means that $\psi'$ can change abruptly as
we cross this point, for the area under $\psi''$ can be finite over an infinitesimal region
that surrounds this point. But whether or not $\psi'$ is continuous, $\psi$, which is the area
under it, will be continuous.‡

Figure 5.1. (a) The box potential. (b) The first two levels and wave functions in the box.

Let us turn our attention to some specific cases.
5.2. The Particle in a Box
We now consider our first problem with a potential, albeit a rather artificial
one:

$$V(x) = 0, \quad |x| < L/2$$
$$= \infty, \quad |x| \ge L/2 \tag{5.2.1}$$

This potential (Fig. 5.1a) is called the box since there is an infinite potential barrier
in the way of a particle that tries to leave the region $|x| < L/2$. The eigenvalue
equation in the $X$ basis (which is the only viable choice) is

$$\frac{d^2\psi}{dx^2} + \frac{2m}{\hbar^2}(E - V)\psi = 0 \tag{5.2.2}$$

‡ We are assuming that the jump in $\psi'$ is finite. This will be true even in the artificial potentials we will
encounter. But can you think of a potential for which this is not true? (Think delta.)

We begin by partitioning space into three regions I, II, and III (Fig. 5.1a). The
solution $\psi$ is called $\psi_{\rm I}$, $\psi_{\rm II}$, and $\psi_{\rm III}$ in regions I, II, and III, respectively.
Consider first region III, in which $V = \infty$. It is convenient to first consider the
case where $V$ is not infinite but equal to some $V_0$ which is greater than $E$. Now
Eq. (5.2.2) becomes

$$\frac{d^2\psi_{\rm III}}{dx^2} - \frac{2m(V_0 - E)}{\hbar^2}\psi_{\rm III} = 0 \tag{5.2.3}$$

which is solved by

$$\psi_{\rm III} = A\, e^{-\kappa x} + B\, e^{\kappa x} \tag{5.2.4}$$

where $\kappa = [2m(V_0 - E)/\hbar^2]^{1/2}$.
Although $A$ and $B$ are arbitrary coefficients from a mathematical standpoint,
we must set $B = 0$ on physical grounds since $B\, e^{\kappa x}$ blows up exponentially as $x \to \infty$
and such functions are not members of our Hilbert space. If we now let $V_0 \to \infty$, we
see that

$$\psi_{\rm III} \equiv 0$$

It can similarly be shown that $\psi_{\rm I} \equiv 0$. In region II, since $V = 0$, the solutions are
exactly those of a free particle:

$$\psi_{\rm II} = A\exp[i(2mE/\hbar^2)^{1/2}x] + B\exp[-i(2mE/\hbar^2)^{1/2}x] \tag{5.2.5}$$
$$= A\, e^{ikx} + B\, e^{-ikx}, \quad k = (2mE/\hbar^2)^{1/2} \tag{5.2.6}$$

It therefore appears that the energy eigenvalues are once again continuous as in the
free-particle case. This is not so, for $\psi_{\rm II}(x) = \psi$ only in region II and not in all of
space. We must require that $\psi_{\rm II}$ goes continuously into its counterparts $\psi_{\rm I}$ and $\psi_{\rm III}$
as we cross over to regions I and III, respectively. In other words we require that

$$\psi_{\rm I}(-L/2) = \psi_{\rm II}(-L/2) = 0 \tag{5.2.7}$$
$$\psi_{\rm III}(+L/2) = \psi_{\rm II}(+L/2) = 0 \tag{5.2.8}$$

(We make no such continuity demands on $\psi'$ at the walls of the box since $V$
jumps to infinity there.) These constraints applied to Eq. (5.2.6) take the form

$$A\, e^{-ikL/2} + B\, e^{ikL/2} = 0 \tag{5.2.9a}$$
$$A\, e^{ikL/2} + B\, e^{-ikL/2} = 0 \tag{5.2.9b}$$

or in matrix form

$$\begin{bmatrix} e^{-ikL/2} & e^{ikL/2} \\ e^{ikL/2} & e^{-ikL/2} \end{bmatrix}\begin{bmatrix} A \\ B \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{5.2.10}$$

Such an equation has nontrivial solutions only if the determinant vanishes:

$$e^{-ikL} - e^{ikL} = -2i\sin(kL) = 0 \tag{5.2.11}$$

that is, only if

$$k = \frac{n\pi}{L}, \quad n = 0, \pm 1, \pm 2, \ldots \tag{5.2.12}$$

To find the corresponding eigenfunctions, we go to Eqs. (5.2.9a) and (5.2.9b). Since
only one of them is independent, we study just Eq. (5.2.9a), which says

$$A\, e^{-in\pi/2} + B\, e^{in\pi/2} = 0 \tag{5.2.13}$$

Multiplying by $e^{in\pi/2}$, we get

$$A = -e^{in\pi}B \tag{5.2.14}$$

Since $e^{in\pi} = (-1)^n$, Eq. (5.2.6) generates two families of solutions (normalized to
unity):

$$\psi_n(x) = \left(\frac{2}{L}\right)^{1/2}\sin\left(\frac{n\pi x}{L}\right), \quad n \text{ even} \tag{5.2.15}$$
$$= \left(\frac{2}{L}\right)^{1/2}\cos\left(\frac{n\pi x}{L}\right), \quad n \text{ odd} \tag{5.2.16}$$

Notice that the case $n = 0$ is uninteresting since $\psi_0 \equiv 0$. Further, since $\psi_n = \psi_{-n}$
for $n$ odd and $\psi_n = -\psi_{-n}$ for $n$ even, and since eigenfunctions differing by an overall
factor are not considered distinct, we may restrict ourselves to positive nonzero $n$.
In summary, we have

$$\psi_n = \left(\frac{2}{L}\right)^{1/2}\cos\left(\frac{n\pi x}{L}\right), \quad n = 1, 3, 5, 7, \ldots \tag{5.2.17a}$$
$$= \left(\frac{2}{L}\right)^{1/2}\sin\left(\frac{n\pi x}{L}\right), \quad n = 2, 4, 6, \ldots \tag{5.2.17b}$$

and from Eqs. (5.2.6) and (5.2.12),

$$E_n = \frac{\hbar^2k^2}{2m} = \frac{\hbar^2\pi^2n^2}{2mL^2} \tag{5.2.17c}$$

[It is tacitly understood in Eqs. (5.2.17a) and (5.2.17b) that $|x| \le L/2$.]
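
These results are easy to check numerically. The following sketch, with assumed units $\hbar = m = 1$, an arbitrary box length, and hypothetical function names, evaluates the eigenfunctions of Eqs. (5.2.17a) and (5.2.17b) at the wall, estimates their overlap integrals by the midpoint rule, and lists the first few energies from Eq. (5.2.17c).

```python
import math

# Assumed illustrative units and box length (not from the text).
HBAR = 1.0
M = 1.0
L = 2.0

def energy(n):
    """E_n from Eq. (5.2.17c)."""
    return (HBAR * math.pi * n) ** 2 / (2 * M * L**2)

def psi(n, x):
    """Eigenfunctions of Eqs. (5.2.17a) and (5.2.17b); zero outside the box."""
    if abs(x) > L / 2:
        return 0.0
    arg = n * math.pi * x / L
    amp = math.sqrt(2.0 / L)
    return amp * (math.cos(arg) if n % 2 == 1 else math.sin(arg))

def overlap(m, n, steps=2000):
    """Midpoint-rule estimate of the integral of psi_m * psi_n over the box."""
    h = L / steps
    total = 0.0
    for i in range(steps):
        x = -L / 2 + (i + 0.5) * h
        total += psi(m, x) * psi(n, x)
    return total * h

wall_values = [psi(n, L / 2) for n in (1, 2, 3, 4)]
gram = [[overlap(m, n) for n in (1, 2, 3)] for m in (1, 2, 3)]
energies = [energy(n) for n in (1, 2, 3)]
print(wall_values, gram, energies)
```

The wall values vanish to rounding error, the overlap matrix is close to the identity, and the energies scale as $n^2$.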
We have here our first encounter with the quantization of a dynamical variable.
Both the variables considered so far, $X$ and $P$, had a continuous spectrum of eigenvalues
from $-\infty$ to $+\infty$, which coincided with the allowed values in classical mechanics.
In fact, so did the spectrum of the Hamiltonian in the free-particle case. The particle
in the box is the simplest example of a situation that will be encountered again
and again, wherein Schrödinger's equation, combined with appropriate boundary
conditions, leads to the quantization of energy. These solutions are also examples
of bound states, namely, states in which a potential prevents a particle from escaping
to infinity. Bound states are thus characterized by

$$\psi(x) \to 0 \quad \text{as } |x| \to \infty$$

Bound states appear in quantum mechanics exactly where we expect them classically,
namely, in situations where $V(\pm\infty)$ is greater than $E$.
The energy levels of bound states are always quantized. Let us gain some insight
into how this happens. In the problem of the particle in a box, quantization resulted
from the requirement that $\psi_{\rm II}$ completed an integral number of half-cycles within
the box so that it smoothly joined its counterparts $\psi_{\rm I}$ and $\psi_{\rm III}$ which vanished
identically. Consider next a particle bound by a finite well, i.e., by a potential that
jumps from 0 to $V_0$ at $|x| = L/2$. We have already seen [Eq. (5.2.4)] that in the
classically forbidden region ($E < V_0$, $|x| \ge L/2$) $\psi$ is a sum of rising and falling
exponentials (as $|x| \to \infty$) and that we must choose the coefficient of the rising exponential
to be zero to get an admissible solution. In the classically allowed region ($|x| \le L/2$)
$\psi$ is a sum of a sine and cosine. Since $V$ is everywhere finite, we demand that
$\psi$ and $\psi'$ be continuous at $x = \pm L/2$. Thus we impose four conditions on $\psi$, which
has only three free parameters. (It may seem that there are four: the coefficients of
the two falling exponentials, the sine, and the cosine. However, the overall scale of
$\psi$ is irrelevant both in the eigenvalue equation and the continuity conditions, these
being linear in $\psi$ and $\psi'$. Thus if, say, $\psi'$ does not satisfy the continuity condition
at $x = L/2$, an overall rescaling of $\psi$ and $\psi'$ will not help.) Clearly, the continuity
conditions cannot be fulfilled except possibly at certain special energies. (See Exercise
5.2.6 for details.) This is the origin of energy quantization here.
Consider now a general potential $V(x)$ which tends to limits $V_\pm$ as $x \to \pm\infty$ and
which binds a particle of energy $E$ (less than both $V_\pm$). We argue once again that
we have one more constraint than we have parameters, as follows. Let us divide
space into tiny intervals such that in each interval $V(x)$ is essentially constant. As
$x \to \pm\infty$, these intervals can be made longer and longer since $V$ is stabilizing at its
asymptotic values $V_\pm$. The right- and leftmost intervals can be made infinitely wide,
since by assumption $V$ has a definite limit as $x \to \pm\infty$. Now in all the finite intervals,
$\psi$ has two parameters: these will be the coefficients of the sine/cosine if $E > V$ or
growing/falling exponential if $E < V$. (The rising exponential is not disallowed, since
it doesn't blow up within the finite intervals.) Only in the left- and rightmost intervals
does $\psi$ have just one parameter, for in these infinite intervals, the growing exponential
can blow up. All these parameters are constrained by the continuity of $\psi$ and $\psi'$ at
each interface between adjacent regions. To see that we have one more constraint
than we have parameters, observe that every extra interval brings with it two free
parameters and one new interface, i.e., two new constraints. Thus as we go from
three intervals in the finite well to the infinite number of intervals in the arbitrary
potential, the constraints are always one more than the free parameters. Thus only
at special energies can we expect an allowed solution.
[Later we will study the oscillator potential, $V = \tfrac{1}{2}m\omega^2x^2$, which grows without
limit as $|x| \to \infty$. How do we understand energy quantization here? Clearly, any
allowed $\psi$ will vanish even more rapidly than before as $|x| \to \infty$, since $V - E$, instead
of being a constant, grows quadratically, so that the particle is "even more forbidden
than before" from escaping to infinity. If $E$ is an allowed energy,‡ we expect $\psi$ to
fall off rapidly as we cross the classical turning points $x_0 = \pm(2E/m\omega^2)^{1/2}$. To a
particle in such a state, it shouldn't matter if we flatten out the potential to some
constant at distances much greater than $|x_0|$, i.e., the allowed levels and eigenfunctions
must be the same in the two potentials which differ only in a region that
the particle is so strongly inhibited from going to. Since the flattened-out potential
has the asymptotic behavior we discussed earlier, we can understand energy quantization
as we did before.]
Let us restate the origin of energy quantization in another way. Consider the
search for acceptable energy eigenfunctions, taking the finite well as an example. If
we start with some arbitrary values $\psi(x_0)$ and $\psi'(x_0)$, at some point $x_0$ to the right
of the well, we can integrate Schrödinger's equation numerically. (Recall the analogy
with the problem of finding the trajectory of a particle given its initial position and
velocity and the force on it.) As we integrate out to $x \to \infty$, $\psi$ will surely blow up
since $\psi_{\rm III}$ contains a growing exponential. Since $\psi(x_0)$ merely fixes the overall scale,
we vary $\psi'(x_0)$ until the growing exponential is killed. [Since we can solve the problem
analytically in region III, we can even say what the desired value of $\psi'(x_0)$ is: it is
given by $\psi'(x_0) = -\kappa\psi(x_0)$. Verify, starting with Eq. (5.2.4), that this implies $B = 0$.]
We are now out of the fix as $x \to \infty$, but we are committed to whatever comes
out as we integrate to the left of $x_0$. We will find that $\psi$ grows exponentially till we
reach the well, whereupon it will oscillate. When we cross the well, $\psi$ will again start
to grow exponentially, for $\psi_{\rm I}$ also contains a growing exponential in general. Thus
there will be no acceptable solution at some randomly chosen energy. It can, however,
happen that for certain values of energy, $\psi$ will be exponentially damped in both
regions I and III. [At any point $x_0'$ in region I, there is a ratio $\psi'(x_0')/\psi(x_0')$ for which
only the damped exponential survives. The $\psi$ we get integrating from region III will
not generally have this feature. At special energies, however, this can happen.] These
are the allowed energies and the corresponding functions are the allowed eigenfunctions.
Having found them, we can choose $\psi(x_0)$ such that they are normalized
to unity. For a nice numerical analysis of this problem see the book by Eisberg and
Resnick.§
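
A variant of this search is simple to sketch in code. The version below is an illustration under stated assumptions rather than the procedure described above: it uses units $\hbar = m = 1$, a well of half-width $a = 1$ and depth $V_0 = 10$ (arbitrary choices), looks only for the lowest even state, and exploits the symmetry of the well by integrating outward from $x = 0$ with $\psi(0) = 1$, $\psi'(0) = 0$ instead of inward from region III. The energy is adjusted by bisection until the growing exponential in $\psi$ at large $x$ changes sign, and the result is compared with the root of the matching condition $k\tan ka = \kappa$ for even states.

```python
import math

# Assumed illustrative units and well parameters (not from the text).
HBAR = 1.0
M = 1.0
A = 1.0       # half-width of the well: V = 0 for |x| < A
V0 = 10.0     # height of the potential outside the well

def potential(x):
    return 0.0 if abs(x) < A else V0

def shoot_even(energy, x_max=4.0, steps=4000):
    """Integrate psi'' = (2m/hbar^2)(V - E) psi from x = 0 with even initial
    data psi(0) = 1, psi'(0) = 0, and return psi(x_max). The potential is
    sampled at step midpoints, so the jump at x = A falls on a step boundary."""
    h = x_max / steps
    psi, dpsi = 1.0, 0.0
    for i in range(steps):
        x_mid = (i + 0.5) * h
        psi += 0.5 * h * dpsi                      # first half step in psi
        dpsi += h * (2 * M / HBAR**2) * (potential(x_mid) - energy) * psi
        psi += 0.5 * h * dpsi                      # second half step in psi
    return psi

def bisect(f, lo, hi, iterations=60):
    """Locate a sign change of f between lo and hi by repeated halving."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def matching_condition(energy):
    """k tan(k a) - kappa, which vanishes at the even bound-state energies."""
    k = math.sqrt(2 * M * energy) / HBAR
    kappa = math.sqrt(2 * M * (V0 - energy)) / HBAR
    return k * math.tan(k * A) - kappa

e_shooting = bisect(shoot_even, 0.1, 3.0)
e_matching = bisect(matching_condition, 0.1, 1.2)
print(e_shooting, e_matching)
```

The energy brackets were chosen by hand so that each contains exactly one sign change; at energies inside the first bracket but away from the root, $\psi(x_{\max})$ is large and of definite sign, which is the numerical face of the statement that a randomly chosen energy gives no acceptable solution.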
It is clear how these arguments generalize to a particle bound by some arbitrary
potential: if we try to keep $\psi$ exponentially damped as $x \to -\infty$, it blows up as $x \to \infty$
(and vice versa), except at some special energies. It is also clear why there is no
quantization of energy for unbound states: since the particle is classically allowed
at infinity, $\psi$ oscillates there and so we have two more parameters, one from each
end (why?), and so two solutions (normalizable to $\delta(0)$) at any energy.

‡ We are not assuming $E$ is quantized.
§ R. Eisberg and R. Resnick, Quantum Physics of Atoms, Molecules, Solids, Nuclei and Particles, Wiley,
New York (1974). See Section 5.7 and Appendix F.

Let us now return to the problem of the particle in a box and discuss the fact
that the lowest energy is not zero (as it would be classically, corresponding to the
particle at rest inside the well) but $\hbar^2\pi^2/2mL^2$. The reason behind it is the uncertainty
principle, which prevents the particle, whose position (and hence $\Delta X$) is bounded
by $|x| \le L/2$, from having a well-defined momentum of zero. This in turn leads to a
lower bound on the energy, which we derive as follows. We begin with‡

$$H = \frac{P^2}{2m} \tag{5.2.18}$$

so that

$$\langle H\rangle = \frac{\langle P^2\rangle}{2m} \tag{5.2.19}$$

Now $\langle P\rangle = 0$ in any bound state for the following reason. Since a bound state is a
stationary state, $\langle P\rangle$ is time independent. If this $\langle P\rangle \neq 0$, the particle must (in the
average sense) drift either to the right or to the left and eventually escape to infinity,
which cannot happen in a bound state.
Consequently we may rewrite Eq. (5.2.19) as

$$\langle H\rangle = \frac{\langle(P - \langle P\rangle)^2\rangle}{2m} = \frac{(\Delta P)^2}{2m}$$

If we now use the uncertainty relation

$$\Delta P\cdot\Delta X \ge \hbar/2$$

we find

$$\langle H\rangle \ge \frac{\hbar^2}{8m(\Delta X)^2}$$

Since the variable $x$ is constrained by $-L/2 \le x \le L/2$, its standard deviation $\Delta X$
cannot exceed $L/2$. Consequently

$$\langle H\rangle \ge \hbar^2/2mL^2$$

In an energy eigenstate, $\langle H\rangle = E$ so that

$$E \ge \hbar^2/2mL^2 \tag{5.2.20}$$

The actual ground-state energy $E_1$ happens to be $\pi^2$ times as large as the lower
bound. The uncertainty principle is often used in this fashion to provide a quick
order-of-magnitude estimate for the ground-state energy.

‡ We are suppressing the infinite potential due to the walls of the box. Instead we will restrict $x$ to the
range $|x| \le L/2$.
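
How loose the bound is can be seen by putting in the actual ground state. The sketch below (assumed units $\hbar = m = L = 1$, hypothetical variable names) computes $\Delta X$ for $\psi_1$ of Eq. (5.2.17a) by the midpoint rule, obtains $\Delta P$ from $E_1 = (\Delta P)^2/2m$, and compares the uncertainty product with $\hbar/2$ and the energy with the bound of Eq. (5.2.20).

```python
import math

# Assumed illustrative units (not from the text).
HBAR = 1.0
M = 1.0
L = 1.0

def psi1(x):
    """Ground state of the box, Eq. (5.2.17a) with n = 1."""
    return math.sqrt(2.0 / L) * math.cos(math.pi * x / L)

steps = 20000
h = L / steps
xs = [-L / 2 + (i + 0.5) * h for i in range(steps)]

# <X> = 0 by symmetry, so (Delta X)^2 = <X^2>.
mean_x2 = h * sum(x * x * psi1(x) ** 2 for x in xs)
delta_x = math.sqrt(mean_x2)

e1 = (HBAR * math.pi) ** 2 / (2 * M * L**2)   # Eq. (5.2.17c) with n = 1
delta_p = math.sqrt(2 * M * e1)               # from <H> = (Delta P)^2 / 2m
bound = HBAR**2 / (2 * M * L**2)              # right-hand side of Eq. (5.2.20)

uncertainty_product = delta_x * delta_p / HBAR
ratio = e1 / bound
print(delta_x, uncertainty_product, ratio)
```

In this sketch $\Delta X$ comes out well below $L/2$ and the uncertainty product somewhat above $1/2$; the derivation replaces both by their extreme values, which is why the bound undershoots $E_1$ by the factor $\pi^2$.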
If we denote by $|n\rangle$ the abstract ket corresponding to $\psi_n(x)$, we can write the
propagator as

$$U(t) = \sum_{n=1}^{\infty}|n\rangle\langle n|\exp\left[-\frac{i}{\hbar}\left(\frac{\hbar^2\pi^2n^2}{2mL^2}\right)t\right] \tag{5.2.21}$$

The matrix elements of $U(t)$ in the $X$ basis are then

$$\langle x|U(t)|x'\rangle = U(x, t; x') = \sum_{n=1}^{\infty}\psi_n(x)\psi_n^*(x')\exp\left[-\frac{i}{\hbar}\left(\frac{\hbar^2\pi^2n^2}{2mL^2}\right)t\right] \tag{5.2.22}$$

Unlike in the free-particle case, there exists no simple closed expression for this sum.
Exercise 5.2.1. * A particle is in the ground state of a box of length $L$. Suddenly the box
expands (symmetrically) to twice its size, leaving the wave function undisturbed. Show that
the probability of finding the particle in the ground state of the new box is $(8/3\pi)^2$.
Exercise 5.2.2. * (a) Show that for any normalized $|\psi\rangle$, $\langle\psi|H|\psi\rangle \ge E_0$, where $E_0$ is the
lowest-energy eigenvalue. (Hint: Expand $|\psi\rangle$ in the eigenbasis of $H$.)
(b) Prove the following theorem: Every attractive potential in one dimension has at least
one bound state. Hint: Since $V$ is attractive, if we define $V(\infty) = 0$, it follows that $V(x) = -|V(x)|$
for all $x$. To show that there exists a bound state with $E < 0$, consider

$$\psi_\alpha(x) = \left(\frac{\alpha}{\pi}\right)^{1/4} e^{-\alpha x^2/2}$$

and calculate

$$E(\alpha) = \langle\psi_\alpha|H|\psi_\alpha\rangle, \quad H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} - |V(x)|$$

Show that $E(\alpha)$ can be made negative by a suitable choice of $\alpha$. The desired result follows
from the application of the theorem proved above.
Exercise 5.2.3. * Consider $V(x) = -aV_0\delta(x)$. Show that it admits a bound state of energy
$E = -ma^2V_0^2/2\hbar^2$. Are there any other bound states? Hint: Solve Schrödinger's equation outside
the potential for $E < 0$, and keep only the solution that has the right behavior at infinity
and is continuous at $x = 0$. Draw the wave function and see how there is a cusp, or a discontinuous
change of slope at $x = 0$. Calculate the change in slope and equate it to

$$\int_{-\varepsilon}^{+\varepsilon}\left(\frac{d^2\psi}{dx^2}\right)dx$$

(where $\varepsilon$ is infinitesimal) determined from Schrödinger's equation.
Exercise 5.2.4. Consider a particle of mass $m$ in the state $|n\rangle$ of a box of length $L$. Find
the force $F = -\partial E/\partial L$ encountered when the walls are slowly pushed in, assuming the particle
remains in the $n$th state of the box as its size changes. Consider a classical particle of energy
$E_n$ in this box. Find its velocity, the frequency of collision on a given wall, the momentum
transfer per collision, and hence the average force. Compare it to $-\partial E/\partial L$ computed above.
Exercise 5.2.5. * If the box extends from $x = 0$ to $L$ (instead of $-L/2$ to $L/2$) show that
$\psi_n(x) = (2/L)^{1/2}\sin(n\pi x/L)$, $n = 1, 2, \ldots, \infty$ and $E_n = \hbar^2\pi^2n^2/2mL^2$.
Exercise 5.2.6. * Square Well Potential. Consider a particle in a square well potential:

$$V(x) = \begin{cases} 0, & |x| \le a \\ V_0, & |x| \ge a \end{cases}$$

Since when $V_0 \to \infty$, we have a box, let us guess what the lowering of the walls does to the
states. First of all, all the bound states (which alone we are interested in), will have $E \le V_0$.
Second, the wave functions of the low-lying levels will look like those of the particle in a box,
with the obvious difference that $\psi$ will not vanish at the walls but instead spill out with an
exponential tail. The eigenfunctions will still be even, odd, even, etc.
(1) Show that the even solutions have energies that satisfy the transcendental equation

$$k\tan ka = \kappa \tag{5.2.23}$$

while the odd ones will have energies that satisfy

$$k\cot ka = -\kappa \tag{5.2.24}$$

where $k$ and $i\kappa$ are the real and complex wave numbers inside and outside the well, respectively.
Note that $k$ and $\kappa$ are related by

$$k^2 + \kappa^2 = 2mV_0/\hbar^2 \tag{5.2.25}$$

Verify that as $V_0$ tends to $\infty$, we regain the levels in the box.
(2) Equations (5.2.23) and (5.2.24) must be solved graphically. In the $(\alpha = ka, \beta = \kappa a)$
plane, imagine a circle that obeys Eq. (5.2.25). The bound states are then given by the
intersection of the curve $\alpha\tan\alpha = \beta$ or $\alpha\cot\alpha = -\beta$ with the circle. (Remember $\alpha$ and $\beta$ are
positive.)
(3) Show that there is always one even solution and that there is no odd solution unless
$V_0 \ge \hbar^2\pi^2/8ma^2$. What is $E$ when $V_0$ just meets this requirement? Note that the general result
from Exercise 5.2.2b holds.
5.3. The Continuity Equation for Probability
We interrupt our discussion of onedimensional problems to get acquainted with
two concepts that will be used in the subsequent discussions, namely, those of the
probability current density and the continuity equation it satisfies. Since the probability
current concept will also be used in threedimensional problems, we discuss here a
particle in three dimensions.
As a prelude to our study of the continuity equation in quantum mechanics, let 165
us recall the analogous equation from electromagnetism. We know in this case that SIMPLE
the total charge in the universe is a constant, that is PROBLEMS IN
ONE DIMENSION
Q(t)=const, independent of time t (5.3.1)
This is an example of a global conservation law, for it refers to the total charge
in the universe. But charge is also conserved locally, a fact usually expressed in the
form of the continuity equation
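$$\frac{\partial \rho(\mathbf{r}, t)}{\partial t} = -\nabla \cdot \mathbf{j} \tag{5.3.2}$$
where $\rho$ and $\mathbf{j}$ are the charge and current densities, respectively. By integrating this
equation over a volume $V$ bounded by a surface $S_V$ we get, upon invoking Gauss's
law,
$$\frac{d}{dt}\int_V \rho(\mathbf{r}, t)\, d^3r = -\int_V \nabla \cdot \mathbf{j}\, d^3r = -\int_{S_V} \mathbf{j} \cdot d\mathbf{S} \tag{5.3.3}$$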
This equation states that any decrease in charge in the volume V is accounted for
by the flow of charge out of it, that is to say, charge is not created or destroyed in
any volume.
The continuity equation forbids certain processes that obey global conservation,
such as the sudden disappearance of charge from one region of space and its immediate
reappearance in another.
In quantum mechanics the quantity that is globally conserved is the total probability
for finding the particle anywhere in the universe. We get this result by
expressing the invariance of the norm in the coordinate basis: since
$$\langle \psi(t) | \psi(t) \rangle = \langle \psi(0) | U^\dagger(t) U(t) | \psi(0) \rangle = \langle \psi(0) | \psi(0) \rangle$$
then
$$\begin{aligned}
\text{const} = \langle \psi(t) | \psi(t) \rangle &= \iiint \langle \psi(t) | x, y, z \rangle \langle x, y, z | \psi(t) \rangle\, dx\, dy\, dz \;‡ \\
&= \iiint \langle \psi(t) | \mathbf{r} \rangle \langle \mathbf{r} | \psi(t) \rangle\, d^3r \\
&= \iiint \psi^*(\mathbf{r}, t)\, \psi(\mathbf{r}, t)\, d^3r \\
&= \iiint P(\mathbf{r}, t)\, d^3r
\end{aligned} \tag{5.3.4}$$
‡ The range of integration will frequently be suppressed when obvious.
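This global conservation law is the analog of Eq. (5.3.1). To get the analog of
Eq. (5.3.2), we turn to the Schrödinger equation
$$i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m} \nabla^2 \psi + V\psi \tag{5.3.5}$$
and its conjugate
$$-i\hbar \frac{\partial \psi^*}{\partial t} = -\frac{\hbar^2}{2m} \nabla^2 \psi^* + V\psi^* \tag{5.3.6}$$
Note that $V$ has to be real if $H$ is to be Hermitian. Multiplying the first of these
equations by $\psi^*$, the second by $\psi$, and taking the difference, we get
$$i\hbar \frac{\partial}{\partial t}(\psi^*\psi) = -\frac{\hbar^2}{2m}\left(\psi^* \nabla^2 \psi - \psi \nabla^2 \psi^*\right)$$
$$\frac{\partial P}{\partial t} = -\frac{\hbar}{2mi} \nabla \cdot \left(\psi^* \nabla \psi - \psi \nabla \psi^*\right)$$
$$\frac{\partial P}{\partial t} = -\nabla \cdot \mathbf{j} \tag{5.3.7}$$
where
$$\mathbf{j} = \frac{\hbar}{2mi}\left(\psi^* \nabla \psi - \psi \nabla \psi^*\right) \tag{5.3.8}$$
is the probability current density, that is to say, the probability flow per unit time
per unit area perpendicular to $\mathbf{j}$. To regain the global conservation law, we integrate
Eq. (5.3.7) over all space:
$$\frac{d}{dt}\int P(\mathbf{r}, t)\, d^3r = -\int_{S_\infty} \mathbf{j} \cdot d\mathbf{S} \tag{5.3.9}$$
where $S_\infty$ is the sphere at infinity. For (typical) wave functions which are normalizable
to unity, $r^{3/2}\psi \to 0$ as $r \to \infty$ in order that $\int \psi^*\psi\, r^2\, dr\, d\Omega$ is bounded, and the
surface integral of $\mathbf{j}$ on $S_\infty$ vanishes. The case of momentum eigenfunctions that do
not vanish on $S_\infty$ is considered in one of the following exercises.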
Exercise 5.3.1. Consider the case where $V = V_r - iV_i$, where the imaginary part $V_i$ is a
constant. Is the Hamiltonian Hermitian? Go through the derivation of the continuity equation
and show that the total probability for finding the particle decreases exponentially as
$e^{-2V_i t/\hbar}$. Such complex potentials are used to describe processes in which particles are absorbed
by a sink.
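As a hedged outline of how the derivation changes (a sketch of the main steps only, with $P_{\text{tot}}$
our own shorthand for the total probability $\int P\, d^3r$): the potential terms no longer cancel when
the two Schrödinger equations are subtracted, and the leftover acts as a uniform sink.

```latex
% Sketch for V = V_r - i V_i with V_i constant; P_tot is our notation, not the text's.
\begin{align*}
i\hbar \frac{\partial}{\partial t}(\psi^*\psi)
  &= -\frac{\hbar^2}{2m}\left(\psi^*\nabla^2\psi - \psi\nabla^2\psi^*\right)
     + \left[(V_r - iV_i) - (V_r + iV_i)\right]\psi^*\psi \\
\frac{\partial P}{\partial t}
  &= -\nabla\cdot\mathbf{j} - \frac{2V_i}{\hbar}\,P \\
\frac{dP_{\text{tot}}}{dt}
  &= -\frac{2V_i}{\hbar}\,P_{\text{tot}}
  \quad\Longrightarrow\quad
  P_{\text{tot}}(t) = P_{\text{tot}}(0)\, e^{-2V_i t/\hbar}
\end{align*}
```

The last line assumes, as in Eq. (5.3.9), that the surface integral of $\mathbf{j}$ at infinity vanishes.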
Figure 5.2. The single-step potential. The dotted line shows a more realistic potential idealized by
the step, which is mathematically convenient. The total energy $E$ and potential energy $V$ are
measured along the y axis.
Exercise 5.3.2. Convince yourself that if $\psi = c\tilde{\psi}$, where $c$ is constant (real or complex)
and $\tilde{\psi}$ is real, the corresponding $\mathbf{j}$ vanishes.
Exercise 5.3.3. Consider
$$\psi_{\mathbf{p}} = \left(\frac{1}{2\pi\hbar}\right)^{3/2} e^{i\mathbf{p}\cdot\mathbf{r}/\hbar}$$
Find $\mathbf{j}$ and $P$ and compare the relation between them to the electromagnetic equation $\mathbf{j} = \rho\mathbf{v}$,
$\mathbf{v}$ being the velocity. Since $\rho$ and $\mathbf{j}$ are constant, note that the continuity Eq. (5.3.7) is trivially
satisfied.
Exercise 5.3.4.* Consider $\psi = A e^{ipx/\hbar} + B e^{-ipx/\hbar}$ in one dimension. Show that $j =
(|A|^2 - |B|^2)p/m$. The absence of cross terms between the right- and left-moving pieces in $\psi$
allows us to associate the two parts of $j$ with corresponding parts of $\psi$.
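The claims of Exercises 5.3.2 and 5.3.4 are easy to spot-check numerically before proving them. The
sketch below is illustrative only: it works in one dimension with $\hbar = m = 1$, the amplitudes `A`,
`B` and the momentum `p` are arbitrary made-up values, and the derivative in Eq. (5.3.8) is replaced
by a central difference.

```python
import cmath
import math

def current(psi, x, hbar=1.0, m=1.0, h=1e-5):
    """j = (hbar/2mi)(psi* psi' - psi psi'*) = (hbar/m) Im(psi* psi'), by central difference."""
    dpsi = (psi(x + h) - psi(x - h)) / (2 * h)
    return (hbar / m) * (psi(x).conjugate() * dpsi).imag

# Arbitrary illustrative values (assumptions, not from the text).
A, B, p = 1.0 + 0.5j, 0.3 - 0.2j, 2.0

def psi(x):
    """Superposition of right- and left-moving plane waves (hbar = 1)."""
    return A * cmath.exp(1j * p * x) + B * cmath.exp(-1j * p * x)

def psi_real(x):
    """A complex constant times a real function, as in Exercise 5.3.2."""
    return (2.0 - 1.0j) * math.cos(3.0 * x)

expected = (abs(A) ** 2 - abs(B) ** 2) * p   # (|A|^2 - |B|^2) p/m with m = 1
```

Evaluating `current(psi, x)` at several values of `x` should reproduce `expected` with no dependence
on `x`, while `current(psi_real, x)` should vanish to within the finite-difference error.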
Ensemble Interpretation of j
Recall that $\mathbf{j} \cdot d\mathbf{S}$ is the rate at which probability flows past the area $d\mathbf{S}$. If we
consider an ensemble of $N$ particles all in some state $\psi(\mathbf{r}, t)$, then $N\mathbf{j} \cdot d\mathbf{S}$ particles
will trigger a particle detector of area $d\mathbf{S}$ per second, assuming that $N$ tends to
infinity and that $\mathbf{j}$ is the current associated with $\psi(\mathbf{r}, t)$.
5.4. The Single-Step Potential: A Problem in Scattering‡
Consider the step potential (Fig. 5.2)
$$V(x) = \begin{cases} 0, & x < 0 \quad \text{(region I)} \\ V_0, & x > 0 \quad \text{(region II)} \end{cases} \tag{5.4.1}$$
Such an abrupt change in potential is rather unrealistic but mathematically
convenient. A more realistic transition is shown by dotted lines in the figure.
Imagine now that a classical particle of energy $E$ is shot in from the left (region
I) toward the step. One expects that if $E > V_0$, the particle would climb the barrier
and travel on to region II, while if $E < V_0$, it would get reflected. We now compare
this classical situation with its quantum counterpart.
‡ This rather difficult section may be postponed till the reader has gone through Chapter 7 and gained
more experience with the subject. It is for the reader or the instructor to decide which way to go.
Figure 5.3. A schematic description of the wave function long before ($t = 0$) and long after
($t \gg a/(p_0/m)$) it hits the step. The area under $|\psi_I|^2$ is unity. The areas under $|\psi_R|^2$
and $|\psi_T|^2$, respectively, are the probabilities for reflection and transmission.
First of all, we must consider an initial state that is compatible with quantum
principles. We replace the incident particle possessing a well-defined trajectory with
a wave packet.‡ Though the detailed wave function will be seen to be irrelevant
in the limit we will consider, we start with a Gaussian, which is easy to handle
analytically§:
$$\psi_I(x, 0) = \psi_I(x) = (\pi\Delta^2)^{-1/4}\, e^{ik_0(x + a)}\, e^{-(x + a)^2/2\Delta^2} \tag{5.4.2}$$
This packet has a mean momentum $p_0 = \hbar k_0$, a mean position $\langle X \rangle = -a$ (which we
take to be far away from the step), with uncertainties
$$\Delta X = \frac{\Delta}{2^{1/2}}, \qquad \Delta P = \frac{\hbar}{2^{1/2}\Delta}$$
We shall be interested in the case of large $\Delta$, where the particle has essentially well-defined
momentum $\hbar k_0$ and energy $E_0 \simeq \hbar^2 k_0^2/2m$. We first consider the case $E_0 > V_0$.
After a time $t \simeq a[p_0/m]^{-1}$, the packet will hit the step and in general break into
two packets: $\psi_R$, the reflected packet, and $\psi_T$, the transmitted packet (Fig. 5.3).
The area under $|\psi_R|^2$ at large $t$ is the probability of finding the particle in region I
in the distant future, that is to say, the probability of reflection. Likewise the area
under $|\psi_T|^2$ at large $t$ is the probability of transmission. Our problem is to calculate
the reflection coefficient
$$R = \int |\psi_R|^2\, dx, \qquad t \to \infty \tag{5.4.3}$$
and transmission coefficient
$$T = \int |\psi_T|^2\, dx, \qquad t \to \infty \tag{5.4.4}$$
Generally $R$ and $T$ will depend on the detailed shape of the initial wave function.
If, however, we go to the limit in which the initial momentum is well defined (i.e.,
when the Gaussian in $x$ space has infinite width), we expect the answer to depend
only on the initial energy, it being the only characteristic of the state. In the following
analysis we will assume that $\Delta X = \Delta/2^{1/2}$ is large and that the wave function in $k$
space is very sharply peaked near $k_0$.
‡ A wave packet is any wave function with reasonably well-defined position and momentum.
§ This is just the wave packet in Eq. (5.1.14), displaced by an amount $-a$.
We follow the standard procedure for finding the fate of the incident wave
packet, $\psi_I$:
Step 1: Solve for the normalized eigenfunction of the step potential Hamiltonian,
$\psi_E(x)$.
Step 2: Find the projection $a(E) = \langle \psi_E | \psi_I \rangle$.
Step 3: Append to each coefficient $a(E)$ a time dependence $e^{-iEt/\hbar}$ and get $\psi(x, t)$
at any future time.
Step 4: Identify $\psi_R$ and $\psi_T$ in $\psi(x, t \to \infty)$ and determine $R$ and $T$ using Eqs. (5.4.3)
and (5.4.4).
Step 1. In region I, as $V = 0$, the (unnormalized) solution is the familiar one:
$$\psi_E(x) = A e^{ik_1 x} + B e^{-ik_1 x}, \qquad k_1 = \left(\frac{2mE}{\hbar^2}\right)^{1/2} \tag{5.4.5}$$
In region II, we simply replace $E$ by $E - V_0$ [see Eq. (5.2.2)],
$$\psi_E(x) = C e^{ik_2 x} + D e^{-ik_2 x}, \qquad k_2 = \left[\frac{2m(E - V_0)}{\hbar^2}\right]^{1/2} \tag{5.4.6}$$
(We consider only $E > V_0$; the eigenfunction with $E < V_0$ will be orthogonal to $\psi_I$ as
will be shown shortly.) Of interest to us are eigenfunctions with $D = 0$,
since we want only a transmitted (right-going) wave in region II, and incident
plus reflected waves in region I. If we now impose the continuity of $\psi$ and its
derivative at $x = 0$, we get
$$A + B = C \tag{5.4.7}$$
$$ik_1(A - B) = ik_2 C \tag{5.4.8}$$
In anticipation of future use, we solve these equations to express $B$ and $C$ in terms
of $A$:
$$B = \left(\frac{k_1 - k_2}{k_1 + k_2}\right) A = \left(\frac{E^{1/2} - (E - V_0)^{1/2}}{E^{1/2} + (E - V_0)^{1/2}}\right) A \tag{5.4.9}$$
$$C = \left(\frac{2k_1}{k_1 + k_2}\right) A = \left(\frac{2E^{1/2}}{E^{1/2} + (E - V_0)^{1/2}}\right) A \tag{5.4.10}$$
Note that if $V_0 = 0$, $B = 0$ and $C = A$ as expected. The solution with energy $E$ is then
$$\psi_E(x) = A\left[\left(e^{ik_1 x} + \frac{B}{A} e^{-ik_1 x}\right)\theta(-x) + \frac{C}{A} e^{ik_2 x}\, \theta(x)\right] \tag{5.4.11}$$
where
$$\theta(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x < 0 \end{cases}$$
Since to each $E$ there is a unique $k_1 = +(2mE/\hbar^2)^{1/2}$, we can label the eigenstates by
$k_1$ instead of $E$. Eliminating $k_2$ in favor of $k_1$, we get
$$\psi_{k_1}(x) = A\left[\left(\exp(ik_1 x) + \frac{B}{A}\exp(-ik_1 x)\right)\theta(-x) + \frac{C}{A}\exp\left[i(k_1^2 - 2mV_0/\hbar^2)^{1/2} x\right]\theta(x)\right] \tag{5.4.12}$$
Although the overall scale factor $A$ is generally arbitrary (and the physics depends
only on $B/A$ and $C/A$), here we must choose $A = (2\pi)^{-1/2}$ because $\psi_k$ has to be
properly normalized in the four-step procedure outlined above. We shall verify
shortly that $A = (2\pi)^{-1/2}$ is the correct normalization factor.
Step 2. Consider next
$$\begin{aligned}
a(k_1) = \langle \psi_{k_1} | \psi_I \rangle = \frac{1}{(2\pi)^{1/2}} \Bigg\{ &\int_{-\infty}^{\infty} \left[e^{-ik_1 x} + \left(\frac{B}{A}\right)^* e^{ik_1 x}\right] \theta(-x)\, \psi_I(x)\, dx \\
&+ \int_{-\infty}^{\infty} \left(\frac{C}{A}\right)^* e^{-ik_2 x}\, \theta(x)\, \psi_I(x)\, dx \Bigg\}
\end{aligned} \tag{5.4.13}$$
The second integral vanishes (to an excellent approximation) since $\psi_I(x)$ is nonvanishing
far to the left of $x = 0$, while $\theta(x)$ is nonvanishing only for $x > 0$. Similarly
the second piece of the first integral also vanishes since $\psi_I$ in $k$ space is peaked
around $k = +k_0$ and is orthogonal to (left-going) negative momentum states. [We
can ignore the $\theta(-x)$ factor in Eq. (5.4.13) since it equals 1 where $\psi_I(x) \neq 0$.] So
$$a(k_1) = \left(\frac{1}{2\pi}\right)^{1/2} \int_{-\infty}^{\infty} e^{-ik_1 x}\, \psi_I(x)\, dx = \left(\frac{\Delta^2}{\pi}\right)^{1/4} e^{-(k_1 - k_0)^2\Delta^2/2}\, e^{ik_1 a} \tag{5.4.14}$$
is just the Fourier transform of $\psi_I$. Notice that for large $\Delta$, $a(k_1)$ is very sharply
peaked at $k_1 = k_0$. This justifies our neglect of eigenfunctions with $E < V_0$, for these
correspond to $k_1$ not near $k_0$.
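Step 3. The wave function at any future time $t$ is
$$\psi(x, t) = \int_{-\infty}^{\infty} a(k_1)\, e^{-iE(k_1)t/\hbar}\, \psi_{k_1}(x)\, dk_1 \tag{5.4.15}$$
$$\begin{aligned}
= \left(\frac{\Delta^2}{4\pi^3}\right)^{1/4} \int_{-\infty}^{\infty} &\exp\left(\frac{-i\hbar k_1^2 t}{2m}\right) \exp\left[\frac{-(k_1 - k_0)^2\Delta^2}{2}\right] \exp(ik_1 a) \\
&\times \left\{ e^{ik_1 x}\, \theta(-x) + \left(\frac{B}{A}\right) e^{-ik_1 x}\, \theta(-x) + \left(\frac{C}{A}\right) \exp\left[i(k_1^2 - 2mV_0/\hbar^2)^{1/2} x\right] \theta(x) \right\} dk_1
\end{aligned} \tag{5.4.16}$$
You can convince yourself that if we set $t = 0$ above we regain $\psi_I(x)$, which corroborates
our choice $A = (2\pi)^{-1/2}$.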
Step 4. Consider the first of the three terms. If $\theta(-x)$ were absent, we would
be propagating the original Gaussian. After replacing $x$ by $x + a$ in Eq. (5.1.15), and
inserting the $\theta(-x)$ factor, the first term of $\psi(x, t)$ is
$$\theta(-x)\, \pi^{-1/4} \left(\Delta + \frac{i\hbar t}{m\Delta}\right)^{-1/2} \exp\left[\frac{-(x + a - \hbar k_0 t/m)^2}{2\Delta^2(1 + i\hbar t/m\Delta^2)}\right] \exp\left[ik_0\left(x + a - \frac{\hbar k_0 t}{2m}\right)\right] = \theta(-x)\, G(-a, k_0, t) \tag{5.4.17}$$
Since the Gaussian $G(-a, k_0, t)$ is centered at $x = -a + \hbar k_0 t/m \simeq \hbar k_0 t/m$ as $t \to \infty$,
and $\theta(-x)$ vanishes for $x > 0$, the product $\theta G$ vanishes. Thus the initial packet has
disappeared and in its place are the reflected and transmitted packets given by the
next two terms. In the middle term if we replace $B/A$, which is a function of $k_1$, by
its value $(B/A)_0$ at $k_1 = k_0$ (because $a(k_1)$ is very sharply peaked at $k_1 = k_0$) and pull
it out of the integral, changing the dummy variable from $k_1$ to $-k_1$, it is easy to see
that apart from the factor $(B/A)_0\, \theta(-x)$ up front, the middle term represents the
free propagation of a normalized Gaussian packet that was originally peaked at $x =
+a$ and began drifting to the left with mean momentum $-\hbar k_0$. Thus
$$\psi_R = \theta(-x)\, G(a, -k_0, t)\, (B/A)_0 \tag{5.4.18}$$
As $t \to \infty$, we can set $\theta(-x)$ equal to 1, since $G$ is centered at $x = a - \hbar k_0 t/m \simeq
-\hbar k_0 t/m$. Since the Gaussian $G$ has unit norm, we get from Eqs. (5.4.3) and (5.4.9),
$$R = \int |\psi_R|^2\, dx = \left|\frac{B}{A}\right|_0^2 = \left|\frac{E_0^{1/2} - (E_0 - V_0)^{1/2}}{E_0^{1/2} + (E_0 - V_0)^{1/2}}\right|^2 \tag{5.4.19}$$
where
$$E_0 = \frac{\hbar^2 k_0^2}{2m}$$
This formula is exact only when the incident packet has a well-defined energy $E_0$,
that is to say, when the width of the incident Gaussian tends to infinity. But it is an
excellent approximation for any wave packet that is narrowly peaked in momentum
space.
To find $T$, we can try to evaluate the third piece. But there is no need to do so,
since we know that
$$R + T = 1 \tag{5.4.20}$$
which follows from the global conservation of probability. It then follows that
$$T = 1 - R = \frac{4E_0^{1/2}(E_0 - V_0)^{1/2}}{\left[E_0^{1/2} + (E_0 - V_0)^{1/2}\right]^2} = \left|\frac{C}{A}\right|_0^2 \frac{(E_0 - V_0)^{1/2}}{E_0^{1/2}} \tag{5.4.21}$$
By inspecting Eqs. (5.4.19) and (5.4.21) we see that both $R$ and $T$ are readily
expressed in terms of the ratios $(B/A)_0$ and $(C/A)_0$ and a kinematical factor,
$(E_0 - V_0)^{1/2}/E_0^{1/2}$. Is there some way by which we can directly get to Eqs. (5.4.19)
and (5.4.21), which describe the dynamic phenomenon of scattering, from Eqs. (5.4.9)
and (5.4.10), which describe the static solution to Schrödinger's equation? Yes.
Consider the unnormalized eigenstate
$$\psi_{k_0}(x) = \left[A_0 \exp(ik_0 x) + B_0 \exp(-ik_0 x)\right]\theta(-x) + C_0 \exp\left[i\left(k_0^2 - \frac{2mV_0}{\hbar^2}\right)^{1/2} x\right]\theta(x) \tag{5.4.22}$$
The incoming plane wave $A_0 e^{ik_0 x}$ has a probability current associated with it equal
to
$$j_I = |A_0|^2\, \frac{\hbar k_0}{m} \tag{5.4.23}$$
while the currents associated with the reflected and transmitted pieces are
$$j_R = |B_0|^2\, \frac{\hbar k_0}{m} \tag{5.4.24}$$
and
$$j_T = |C_0|^2\, \frac{\hbar (k_0^2 - 2mV_0/\hbar^2)^{1/2}}{m} \tag{5.4.25}$$
(Recall Exercise 5.3.4, which provides the justification for viewing the two parts of
the $j$ in region I as being due to the incident and reflected wave functions.) In terms
of these currents
$$R = \frac{j_R}{j_I} = \left|\frac{B_0}{A_0}\right|^2 \tag{5.4.26}$$
and
$$T = \frac{j_T}{j_I} = \left|\frac{C_0}{A_0}\right|^2 \frac{(k_0^2 - 2mV_0/\hbar^2)^{1/2}}{k_0} = \left|\frac{C_0}{A_0}\right|^2 \frac{(E_0 - V_0)^{1/2}}{E_0^{1/2}} \tag{5.4.27}$$
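The current-ratio recipe of Eqs. (5.4.26) and (5.4.27) is simple enough to tabulate. The sketch
below is a minimal illustration rather than a general scattering code: the function name is our
own, it covers only $E_0 > V_0$, and it uses units in which $2m/\hbar^2 = 1$ so that $k = E^{1/2}$.

```python
import math

def step_R_T(E0, V0):
    """Reflection and transmission coefficients for the single step, E0 > V0.

    Units are chosen (an assumption of this sketch) so that 2m/hbar^2 = 1.
    """
    if E0 <= 0 or E0 <= V0:
        raise ValueError("this sketch covers only E0 > 0 and E0 > V0")
    k1 = math.sqrt(E0)                  # wave number in region I
    k2 = math.sqrt(E0 - V0)             # wave number in region II
    b_over_a = (k1 - k2) / (k1 + k2)    # Eq. (5.4.9)
    c_over_a = 2 * k1 / (k1 + k2)       # Eq. (5.4.10)
    R = b_over_a ** 2                   # j_R / j_I, Eq. (5.4.26)
    T = c_over_a ** 2 * k2 / k1         # j_T / j_I, Eq. (5.4.27), with the kinematical factor
    return R, T
```

For any allowed pair the two numbers returned should add up to 1, as Eq. (5.4.20) demands, and
`step_R_T(E0, 0.0)` should give no reflection at all. Dropping the factor `k2 / k1` would break
the sum rule, which is one way to see why $|C/A|^2$ alone is not the transmission coefficient.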
Let us now enquire as to why it is that $R$ and $T$ are calculable in these two
ways. Recall that $R$ and $T$ were exact only for the incident packet whose momentum
was well defined and equal to $\hbar k_0$. From Eq. (5.4.2) we see that this involves taking
the width of the Gaussian to infinity. As the incident Gaussian gets wider and wider
(we ignore now the $\Delta^{-1/2}$ factor up front and the normalization) the following things
happen:
(1) It becomes impossible to say when it hits the step, for it has spread out to be a
right-going plane wave in region I.
(2) The reflected packet also gets infinitely wide and coexists with the incident one,
as a left-going plane wave.
(3) The transmitted packet becomes a plane wave with wave number
$(k_0^2 - 2mV_0/\hbar^2)^{1/2}$ in region II.
In other words, the dynamic picture of an incident packet hitting the step and
disintegrating into two becomes the steady-state process described by the eigenfunction
Eq. (5.4.22). We cannot, however, find $R$ and $T$ by calculating areas under
$|\psi_T|^2$ and $|\psi_R|^2$ since all the areas are infinite, the wave packets having been transformed
into plane waves. We find instead that the ratios of the probability currents
associated with the incident, reflected, and transmitted waves give us $R$ and $T$. The
equivalence between the wave packet and static descriptions that we were able to
demonstrate in this simple case happens to be valid for any potential. When we come
to scattering in three dimensions, we will assume that the equivalence of the two
approaches holds.
Exercise 5.4.1 (Quite Hard). Evaluate the third piece in Eq. (5.4.16) and compare the
resulting $T$ with Eq. (5.4.21). [Hint: Expand the factor $(k_1^2 - 2mV_0/\hbar^2)^{1/2}$ near $k_1 = k_0$, keeping
just the first derivative in the Taylor series.]
Before we go on to examine some of the novel features of the reflection and
transmission coefficients, let us ask how they are used in practice. Consider a general
problem with some $V(x)$, which tends to constants $V_+$ and $V_-$ as $x \to \pm\infty$. For
simplicity we take $V_\pm = 0$. Imagine an accelerator located to the far left ($x \to -\infty$)
which shoots out a beam of nearly monoenergetic particles with $\langle P \rangle = \hbar k_0$ toward
the potential. The question one asks in practice is what fraction of the particles will
get transmitted and what fraction will get reflected to $x = \pm\infty$, respectively. In general,
the question cannot be answered because we know only the mean momenta of
the particles and not their individual wave functions. But the preceding analysis
shows that as long as the wave packets are localized sharply in momentum space, the
reflection and transmission probabilities ($R$ and $T$) depend only on the mean momentum
and not the detailed shape of the wave functions. So the answer to the question raised
above is that a fraction $R(k_0)$ will get reflected and a fraction $T(k_0) = 1 - R(k_0)$ will
get transmitted. To find $R$ and $T$ we solve for the time-independent eigenfunctions
of $H = T + V$ with energy eigenvalue $E_0 = \hbar^2 k_0^2/2m$, and asymptotic behavior
$$\psi_{k_0}(x) \to \begin{cases} A e^{ik_0 x} + B e^{-ik_0 x}, & x \to -\infty \\ C e^{ik_0 x}, & x \to \infty \end{cases}$$
and obtain from it $R = |B/A|^2$ and $T = |C/A|^2$. Solutions with this asymptotic
behavior (namely, free-particle behavior) will always exist provided $V$ vanishes rapidly
enough as $|x| \to \infty$. [Later we will see that this means $|xV(x)| \to 0$ as $|x| \to \infty$.]
The general solution will also contain a piece $D \exp(-ik_0 x)$ as $x \to \infty$, but we set
$D = 0$ here, for if $A \exp(ik_0 x)$ is to be identified with the incident wave, it must only
produce a right-moving transmitted wave $C e^{ik_0 x}$ as $x \to \infty$.
Let us turn to Eqs. (5.4.19) and (5.4.21) for $R$ and $T$. These contain many
nonclassical features. First of all we find that an incident particle with $E_0 > V_0$ gets
reflected some of the time. It can also be shown that a particle with $E_0 > V_0$ incident
from the right will also get reflected some of the time, contrary to classical
expectations.
Consider next the case $E_0 < V_0$. Classically one expects the particle to be reflected
at $x = 0$, and never to get to region II. This is not so quantum mechanically. In
region II, the solution to
$$\frac{d^2\psi_{II}}{dx^2} + \frac{2m}{\hbar^2}(E_0 - V_0)\, \psi_{II} = 0$$
with $E_0 < V_0$ is
$$\psi_{II}(x) = C e^{-\kappa x}, \qquad \kappa = \left(\frac{2m|E_0 - V_0|}{\hbar^2}\right)^{1/2} \tag{5.4.28}$$
(The growing exponential $e^{\kappa x}$ does not belong to the physical Hilbert space.) Thus
there is a finite probability for finding the particle in the region where its kinetic
energy $E_0 - V_0$ is negative. There is, however, no steady flow of probability current
into region II, since $\psi_{II}(x) = C\tilde{\psi}$, where $\tilde{\psi}$ is real. This is also corroborated by the
fact that the reflection coefficient in this case is
$$R = \left|\frac{(E_0)^{1/2} - (E_0 - V_0)^{1/2}}{(E_0)^{1/2} + (E_0 - V_0)^{1/2}}\right|^2 = \left|\frac{k_0 - i\kappa}{k_0 + i\kappa}\right|^2 = 1 \tag{5.4.29}$$
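Equation (5.4.29) can be checked with complex arithmetic, since for $E_0 < V_0$ the wave number in
region II is purely imaginary. This is a small sketch under our own conventions: the function name
is hypothetical, units are such that $2m/\hbar^2 = 1$, and we rely on `cmath.sqrt` returning the root
with positive imaginary part for a negative real argument, which corresponds to the decaying
exponential $e^{-\kappa x}$.

```python
import cmath

def reflection_below_step(E0, V0):
    """|B/A|^2 for 0 < E0 < V0, in units where 2m/hbar^2 = 1."""
    if not 0 < E0 < V0:
        raise ValueError("this sketch covers only 0 < E0 < V0")
    k0 = cmath.sqrt(E0)          # real wave number in region I
    k2 = cmath.sqrt(E0 - V0)     # equals i*kappa, with kappa > 0
    b_over_a = (k0 - k2) / (k0 + k2)
    return abs(b_over_a) ** 2
```

The numerator and denominator are complex conjugates of each other, so the modulus of the ratio is
unity for every energy below the step; only the phase of $B/A$ varies with $E_0$.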
The fact that the particle can penetrate into the classically forbidden region leads
to an interesting quantum phenomenon called tunneling. Consider a modification of
Fig. 5.2, in which $V = V_0$ only between $x = 0$ and $L$ (region II) and is once again
zero beyond $x = L$ (region III). If now a plane wave is incident on this barrier from
the left with $E < V_0$, there is an exponentially small probability for the particle to
get to region III. Once a particle gets to region III, it is free once more and described
by a plane wave. An example of tunneling is that of $\alpha$ particles trapped in the nuclei
by a barrier. Every once in a while an $\alpha$ particle manages to penetrate the barrier
and come out. The rate for this process can be calculated given $V_0$ and $L$.
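To get a feeling for how small "exponentially small" is, one can evaluate the standard closed-form
transmission coefficient for a rectangular barrier of height $V_0$ and width $L$. That formula is not
derived in this section (it is quoted here as a well-known textbook result, obtained by matching
$\psi$ and its derivative at both edges), and the function name and the units $2m/\hbar^2 = 1$ are
assumptions of this sketch.

```python
import math

def barrier_transmission(E, V0, L):
    """T for a rectangular barrier with 0 < E < V0, in units where 2m/hbar^2 = 1.

    Uses T = 1 / (1 + V0^2 sinh^2(kappa L) / (4 E (V0 - E))), kappa = sqrt(V0 - E).
    """
    if not 0 < E < V0:
        raise ValueError("this sketch covers only 0 < E < V0")
    kappa = math.sqrt(V0 - E)
    return 1.0 / (1.0 + V0 ** 2 * math.sinh(kappa * L) ** 2 / (4.0 * E * (V0 - E)))
```

For $\kappa L \gg 1$ the result approaches $16E(V_0 - E)V_0^{-2}\, e^{-2\kappa L}$, so each additional unit of
$\kappa L$ suppresses the transmission by a factor of about $e^{-2}$.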
Exercise 5.4.2. (a)* Calculate $R$ and $T$ for scattering off a potential $V(x) = V_0 a\delta(x)$. (b)
Do the same for the case $V = 0$ for $|x| > a$ and $V = V_0$ for $|x| < a$. Assume that the energy is
positive but less than $V_0$.
Exercise 5.4.3. Consider a particle subject to a constant force $f$ in one dimension. Solve
for the propagator in momentum space and get
$$U(p, t; p', 0) = \delta(p - p' - ft)\, e^{i(p'^3 - p^3)/6m\hbar f} \tag{5.4.30}$$
Transform back to coordinate space and obtain
$$U(x, t; x', 0) = \left(\frac{m}{2\pi\hbar i t}\right)^{1/2} \exp\left\{\frac{i}{\hbar}\left[\frac{m(x - x')^2}{2t} + \frac{1}{2} ft(x + x') - \frac{f^2 t^3}{24m}\right]\right\} \tag{5.4.31}$$
[Hint: Normalize $\psi_E(p)$ such that $\langle E | E' \rangle = \delta(E - E')$. Note that $E$ is not restricted to be
positive.]
5.5. The Double-Slit Experiment
Having learned so much quantum mechanics, it now behooves us to go back
and understand the double-slit experiment (Fig. 3.1). Let us label by I and II the
regions to the left and right of the screen. The incident particle, which must really
be represented by a wave packet, we approximate by a plane wave of wave number
$k = p/\hbar$. The impermeable screen we treat as a region with $V = \infty$, and hence the
region of vanishing $\psi$. Standard wave theory (which we can borrow from classical
electromagnetism) tells us what happens in region II: the two slits act as sources of
radially outgoing waves of the same wavelength. These two waves interfere on the
line AB and produce the interference pattern. We now return to quantum mechanics
and interpret the intensity $|\psi|^2$ as the probability density for finding the particle.
5.6. Some Theorems
Theorem 15. There is no degeneracy in one-dimensional bound states.
Proof. Let $\psi_1$ and $\psi_2$ be two solutions with the same eigenvalue $E$:
$$-\frac{\hbar^2}{2m} \frac{d^2\psi_1}{dx^2} + V\psi_1 = E\psi_1 \tag{5.6.1}$$
$$-\frac{\hbar^2}{2m} \frac{d^2\psi_2}{dx^2} + V\psi_2 = E\psi_2 \tag{5.6.2}$$
Multiply the first by $\psi_2$, the second by $\psi_1$ and subtract, to get
$$\psi_1 \frac{d^2\psi_2}{dx^2} - \psi_2 \frac{d^2\psi_1}{dx^2} = 0$$
or
$$\frac{d}{dx}\left(\psi_1 \frac{d\psi_2}{dx} - \psi_2 \frac{d\psi_1}{dx}\right) = 0$$
so that
$$\psi_1 \frac{d\psi_2}{dx} - \psi_2 \frac{d\psi_1}{dx} = c \tag{5.6.3}$$
To find the constant $c$, go to $|x| \to \infty$, where $\psi_1$ and $\psi_2$ vanish, since they describe
bound states by assumption.‡ It follows that $c = 0$. So
$$\frac{1}{\psi_1}\, d\psi_1 = \frac{1}{\psi_2}\, d\psi_2$$
$$\log \psi_1 = \log \psi_2 + d \qquad (d \text{ is a constant})$$
$$\psi_1 = e^d \psi_2 \tag{5.6.4}$$
Thus the two eigenfunctions differ only by a scale factor and represent the same
state. Q.E.D.
‡ The theorem holds even if $\psi$ vanishes at either $+\infty$ or $-\infty$. In a bound state it vanishes at both ends.
But one can think of situations where the potential confines the wave function at one end but not the
other.
What about the free-particle case, where to every energy there are two degenerate
solutions with $p = \pm(2mE)^{1/2}$? The theorem doesn't apply here since $\psi_p(x)$ does
not vanish at spatial infinity. [Calculate $c$ in Eq. (5.6.3).]
Theorem 16. The eigenfunctions of $H$ can always be chosen pure real in the
coordinate basis.
Proof. If
$$\left[-\frac{\hbar^2}{2m} \frac{d^2}{dx^2} + V(x)\right]\psi_n = E_n \psi_n$$
then by conjugation
$$\left[-\frac{\hbar^2}{2m} \frac{d^2}{dx^2} + V(x)\right]\psi_n^* = E_n \psi_n^*$$
Thus $\psi_n$ and $\psi_n^*$ are eigenfunctions with the same eigenvalue. It follows that the real
and imaginary parts of $\psi_n$,
$$\psi_r = \frac{\psi_n + \psi_n^*}{2}$$
and
$$\psi_i = \frac{\psi_n - \psi_n^*}{2i}$$
are also eigenfunctions with energy $E_n$. Q.E.D.
The theorem holds in higher dimensions as well for Hamiltonians of the above
form, which in addition to being Hermitian, are real. Note, however, that while
Hermiticity is preserved under a unitary change of basis, reality is not.
If the problem involves a magnetic field, the Hamiltonian is no longer real in
the coordinate basis, as is clear from Eq. (4.3.7). In this case the eigenfunctions
cannot be generally chosen real. This question will be explored further at the end of
Chapter 11.
Returning to one dimension, due to nondegeneracy of bound states, we must
have
$$\psi_i = c\psi_r, \qquad c \text{ a constant}$$
Consequently,
$$\psi_n = \psi_r + i\psi_i = (1 + ic)\psi_r = \tilde{c}\psi_r$$
Since the overall scale $\tilde{c}$ is irrelevant, we can ignore it, i.e., work with real eigenfunctions
with no loss of generality.
This brings us to the end of our study of one-dimensional problems, except for
the harmonic oscillator, which is the subject of Chapter 7.
The Classical Limit
It is intuitively clear that when quantum mechanics is applied to a macroscopic
system it should reproduce the results of classical mechanics, very much the way that
relativistic dynamics, when applied to slowly moving ($v/c \ll 1$) objects, reproduces
Newtonian dynamics. In this chapter we examine how classical mechanics is regained
from quantum mechanics in the appropriate domain. When we speak of regaining
classical mechanics, we refer to the numerical aspects. Qualitatively we know that
the deterministic world of classical mechanics does not exist. Once we have bitten
the quantum apple, our loss of innocence is permanent.
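We commence by examining the time evolution of the expectation values. We
find
$$\frac{d}{dt}\langle \Omega \rangle = \frac{d}{dt}\langle \psi | \Omega | \psi \rangle = \langle \dot{\psi} | \Omega | \psi \rangle + \langle \psi | \Omega | \dot{\psi} \rangle + \langle \psi | \dot{\Omega} | \psi \rangle \;‡ \tag{6.1}$$
In what follows we will assume that $\Omega$ has no explicit time dependence. We will
therefore drop the third term $\langle \psi | \dot{\Omega} | \psi \rangle$. From the Schrödinger equation, we get
$$|\dot{\psi}\rangle = \frac{-i}{\hbar} H |\psi\rangle$$
and from its adjoint,
$$\langle \dot{\psi} | = \frac{i}{\hbar} \langle \psi | H$$
‡ If you are uncomfortable differentiating bras and kets, work in a basis and convince yourself that this
step is correct.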
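Feeding these into Eq. (6.1) we get the relation
$$\frac{d}{dt}\langle \Omega \rangle = \left(\frac{-i}{\hbar}\right)\langle \psi | [\Omega, H] | \psi \rangle = \left(\frac{-i}{\hbar}\right)\langle [\Omega, H] \rangle \tag{6.2}$$
which is called Ehrenfest's theorem.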
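Ehrenfest's theorem holds for any Hilbert space, so it can be tested on the smallest nontrivial
one. The sketch below is an illustration under our own assumptions: a two-level system with
$\hbar = 1$, a diagonal Hamiltonian with made-up energies `E1` and `E2`, the observable chosen as the
matrix with unit off-diagonal elements, and the time derivative on the left of Eq. (6.2) taken by a
central difference.

```python
import cmath

def expect(op, psi):
    """<psi|op|psi> for a 2x2 matrix op and a two-component state psi."""
    return sum(psi[i].conjugate() * op[i][j] * psi[j]
               for i in range(2) for j in range(2))

def commutator(a, b):
    """[a, b] = ab - ba for 2x2 matrices stored as nested lists."""
    def mul(x, y):
        return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]
    ab, ba = mul(a, b), mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

# Illustrative choices (assumptions of this sketch), with hbar = 1.
E1, E2 = 0.7, 1.9
H = [[E1, 0.0], [0.0, E2]]
OMEGA = [[0.0, 1.0], [1.0, 0.0]]
C = (0.6, 0.8j)                      # normalized initial amplitudes

def state(t):
    """Exact solution of the Schrodinger equation in the energy basis."""
    return [C[0] * cmath.exp(-1j * E1 * t), C[1] * cmath.exp(-1j * E2 * t)]

def lhs(t, h=1e-5):
    """d<Omega>/dt by central difference."""
    return (expect(OMEGA, state(t + h)) - expect(OMEGA, state(t - h))).real / (2 * h)

def rhs(t):
    """(-i/hbar) <[Omega, H]> with hbar = 1."""
    return (-1j * expect(commutator(OMEGA, H), state(t))).real
```

Comparing `lhs(t)` with `rhs(t)` at a few times should show agreement to within the
finite-difference error, with both sides oscillating at the frequency $E_2 - E_1$.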
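Notice the structural similarity between this equation and the corresponding
one from classical mechanics:
$$\frac{d\omega}{dt} = \{\omega, \mathcal{H}\} \tag{6.3}$$
We continue our investigation to see how exactly the two mechanics are related. Let
us, for simplicity, discuss a particle in one dimension. If we consider $\Omega = X$ we get
$$\langle \dot{X} \rangle = \left(\frac{-i}{\hbar}\right)\langle [X, H] \rangle \tag{6.4}$$
If we assume
$$H = \frac{P^2}{2m} + V(X)$$
then, since $X$ commutes with $V(X)$,
$$\langle \dot{X} \rangle = \left(\frac{-i}{\hbar}\right)\left\langle \left[X, \frac{P^2}{2m}\right] \right\rangle$$
Now
$$[X, P^2] = P[X, P] + [X, P]P = 2i\hbar P \qquad \text{[from Eq. (1.5.10)]}$$
so that
$$\langle \dot{X} \rangle = \frac{\langle P \rangle}{m} \tag{6.5}$$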
The relation $\dot{x} = p/m$ of classical mechanics now appears as a relation among the
mean values. We can convert Eq. (6.5) to a more suggestive form by writing
$$\frac{P}{m} = \frac{\partial H}{\partial P}$$
where $\partial H/\partial P$ is a formal derivative of $H$ with respect to $P$, calculated by pretending
that $H$, $P$, and $X$ are just c numbers. The rule for finding such derivatives is just as
in calculus, as long as the function being differentiated has a power series, as in this
case. We now get, in the place of Eq. (6.5),
$$\langle \dot{X} \rangle = \left\langle \frac{\partial H}{\partial P} \right\rangle \tag{6.6}$$
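Consider next
$$\langle \dot{P} \rangle = \frac{1}{i\hbar}\langle [P, H] \rangle = \frac{1}{i\hbar}\langle [P, V(X)] \rangle$$
To find $[P, V(X)]$ we go to the $X$ basis, in which
$$P \to -i\hbar \frac{d}{dx} \quad \text{and} \quad V(X) \to V(x)$$
and for any $\psi(x)$,
$$\left[-i\hbar \frac{d}{dx}, V(x)\right]\psi(x) = -i\hbar \frac{dV}{dx}\, \psi(x)$$
We conclude that in the abstract,
$$[P, V(X)] = -i\hbar \frac{dV}{dX} \tag{6.7}$$
where $dV/dX$ is again a formal derivative. Since $dV/dX = \partial H/\partial X$, we get
$$\langle \dot{P} \rangle = \left\langle -\frac{\partial H}{\partial X} \right\rangle \tag{6.8}$$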
The similarity between Eqs. (6.6) and (6.8) and Hamilton's equations is rather striking.
We would like to see how the quantum equations reduce to Hamilton's equations
when applied to a macroscopic particle (of mass 1 g, say).
First of all, it is clear that we must consider an initial state that resembles the
states of classical mechanics, i.e., states with well-defined position and momentum.
Although simultaneous eigenstates of $X$ and $P$ do not exist, there do exist states
which we can think of as approximate eigenstates of both $X$ and $P$. In these states,
labeled $|x_0 p_0 \Delta\rangle$, $\langle X \rangle = x_0$ and $\langle P \rangle = p_0$, with uncertainties $\Delta X \simeq \Delta$ and $\Delta P \simeq \hbar/\Delta$,
both of which are small in the macroscopic scale. A concrete example of such a state
is
$$|x_0 p_0 \Delta\rangle \to \psi_{x_0 p_0 \Delta}(x) = \left(\frac{1}{\pi\Delta^2}\right)^{1/4} e^{ip_0 x/\hbar}\, e^{-(x - x_0)^2/2\Delta^2} \tag{6.9}$$
If we choose $\Delta \simeq 10^{-13}$ cm, say, which is the size of a proton, $\Delta P \simeq 10^{-14}$ g cm/sec.
For a particle of mass 1 g, this implies $\Delta V \simeq 10^{-14}$ cm/sec, an uncertainty far below
the experimentally detectable range. In the classical scale, such a state can be said
to have well-defined values for $X$ and $P$, namely, $x_0$ and $p_0$, since the uncertainties
(fluctuations) around these values are truly negligible. If we let such a state evolve
with time, the mean values $x_0(t)$ and $p_0(t)$ will follow Hamilton's equations, once
again with negligible deviations. We establish this result as follows.
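Consider Eqs. (6.6) and (6.8) which govern the evolution of $\langle X \rangle = x_0$ and $\langle P \rangle =
p_0$. These would reduce to Hamilton's equations if we could replace the mean values
of the functions on the right-hand side by the functions of the mean values:
$$\dot{x}_0 = \langle \dot{X} \rangle = \left\langle \frac{\partial H(X, P)}{\partial P} \right\rangle \simeq \left.\frac{\partial H}{\partial P}\right|_{(X = x_0,\, P = p_0)} = \frac{\partial \mathcal{H}(x_0, p_0)}{\partial p_0} \tag{6.10}$$
and
$$\dot{p}_0 = \langle \dot{P} \rangle = -\left\langle \frac{\partial H}{\partial X} \right\rangle \simeq -\left.\frac{\partial H}{\partial X}\right|_{(X = x_0,\, P = p_0)} = -\frac{\partial \mathcal{H}(x_0, p_0)}{\partial x_0} \tag{6.11}$$
If we consider some function of $X$ and $P$, we will find in the same approximation
$$\langle \Omega(X, P) \rangle \simeq \Omega(x_0, p_0) = \omega(x_0, p_0) \tag{6.12}$$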
Thus we regain classical physics as a good approximation whenever it is a good
approximation to replace the mean of the functions $\partial H/\partial P$, $-\partial H/\partial X$, and $\Omega(X, P)$
by the functions of the mean. This in turn requires that the fluctuations about the
mean have to be small. (The result is exact if there are no fluctuations.) Take as a
concrete example Eqs. (6.10) and (6.11). There is no approximation involved in the
first equation since $\langle \partial H/\partial P \rangle$ is just $\langle P/m \rangle = p_0/m$. In the second one, we need to
approximate $\langle \partial H/\partial X \rangle = \langle dV/dX \rangle = \langle V'(X) \rangle$ by $V'(X = x_0)$. To see when this is a
good approximation, let us expand $V'$ in a Taylor series around $x_0$. Here it is
convenient to work in the coordinate basis where $V(X) = V(x)$. The series is
$$V'(x) = V'(x_0) + (x - x_0)V''(x_0) + \tfrac{1}{2}(x - x_0)^2 V'''(x_0) + \cdots$$
Let us now take the mean of both sides. The first term on the right-hand side, which
alone we keep in our approximation, corresponds to the classical force at $x_0$, and
thus reproduces Newton's second law. The second vanishes in all cases, since the
mean of $x - x_0$ does. The succeeding terms, which are corrections to the classical
approximation, represent the fact that unlike the classical particle, which responds
only to the force $F = -V'$ at $x_0$, the quantum particle responds to the force at
neighboring points as well. (Note, incidentally, that these terms are zero if the potential
is at the most quadratic in the variable $x$.) Each of these terms is a product of
two factors, one of which measures the size or nonlocality of the wave packet and
the other, the variation of the force with $x$. (See the third term for example.) At an
intuitive level, we may say that these terms are negligible if the force varies very little
over the "size" of the wave packet. (There is no unique definition of "size." The
uncertainty is one measure. We see above that the uncertainty squared has to be
much smaller than the inverse of the second derivative of the force.) In the present
case, where the size of the packet is of the order of $10^{-13}$ cm, it is clear that the
classical approximation is good for any potential that varies appreciably only over
macroscopic scales.
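A worked instance may make the size of these corrections concrete. The quartic potential below,
its coupling $\lambda$, and the symbol $\sigma^2$ for the variance of the packet are our own illustrative
assumptions; the packet is taken to be symmetric about $x_0$, so that the mean of $(x - x_0)^3$
vanishes, and the Taylor series terminates after four terms.

```latex
% Illustrative assumption: V(x) = \lambda x^4, packet symmetric about x_0 with variance \sigma^2.
\begin{align*}
V'(x) &= 4\lambda x^3, \qquad V''(x_0) = 12\lambda x_0^2, \qquad
V'''(x_0) = 24\lambda x_0, \qquad V''''(x_0) = 24\lambda \\
\langle V'(X) \rangle
  &= V'(x_0) + \langle x - x_0 \rangle V''(x_0)
     + \tfrac{1}{2}\langle (x - x_0)^2 \rangle V'''(x_0)
     + \tfrac{1}{6}\langle (x - x_0)^3 \rangle V''''(x_0) \\
  &= 4\lambda x_0^3 + 0 + 12\lambda x_0 \sigma^2 + 0 \\
\frac{\langle V'(X) \rangle - V'(x_0)}{V'(x_0)} &= \frac{3\sigma^2}{x_0^2}
\end{align*}
```

For $\sigma \simeq 10^{-13}$ cm and $x_0$ of order 1 cm the relative correction to the classical force is of
order $10^{-26}$, which is what "truly negligible" means here.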
There is one apparent problem: although we may start the system out in a state
with $\Delta \simeq 10^{-13}$ cm, which is certainly a very small uncertainty, we know that with
passing time the wave packet will spread. The uncertainty in the particle's position
will inevitably become macroscopic. True. But recall the arguments of Section 5.1.
We saw that the spreading of the wave packet can be attributed to the fact that any
initial uncertainty in velocity, however small, will eventually manifest itself as a giant
uncertainty in position. But in the present case ($\Delta V \simeq 10^{-14}$ cm/sec) it would take
300,000 years before the packet is even a millimeter across! (It is here that we invoke
the fact that the particle is macroscopic: but for this, a small $\Delta P$ would not imply
a small $\Delta V$.) The problem is thus of academic interest only; and besides, it exists in
classical mechanics as well, since the perfect measurement of velocity is merely an
idealization.
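The order-of-magnitude estimates quoted above follow from a few lines of arithmetic in CGS units.
In the sketch below the variable names are ours, the value of $\hbar$ and the length of a year are
rounded, and the spread is estimated crudely as $\Delta V \cdot t$.

```python
HBAR = 1.055e-27            # reduced Planck constant in erg sec (g cm^2 / sec), rounded
SECONDS_PER_YEAR = 3.156e7  # rounded

delta = 1e-13               # initial position uncertainty in cm (size of a proton)
mass = 1.0                  # mass of the macroscopic particle in g

delta_p = HBAR / delta      # momentum uncertainty, about 1e-14 g cm/sec
delta_v = delta_p / mass    # velocity uncertainty, about 1e-14 cm/sec

target_width = 0.1          # one millimeter, expressed in cm
spread_time_years = target_width / delta_v / SECONDS_PER_YEAR

print(f"delta_p = {delta_p:.2e} g cm/sec")
print(f"delta_v = {delta_v:.2e} cm/sec")
print(f"time to spread to 1 mm = {spread_time_years:.2e} years")
```

The last number comes out at roughly three hundred thousand years, in line with the figure in the
text. Repeating the calculation with the mass of an electron in place of 1 g shows why the same
argument fails for microscopic particles.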
There remains yet another question. We saw that for a macroscopic particle prepared
in a state $|x_0 p_0 \Delta\rangle$, the time evolution of $x_0$ and $p_0$ will be in accordance with
Hamilton's equations. Question: While it is true that a particle in such a conveniently
prepared state obeys classical mechanics, are these the only states one encounters in
classical mechanics? What if the initial position of the macroscopic particle is fixed
to an accuracy of $10^{-27}$ cm? Doesn't its velocity now have uncertainties that are
classically detectable? Yes. But such states do not occur in practice. The classical
physicist talks about making exact position measurements, but never does so in
practice. This is clear from the fact that he uses light of a finite frequency to locate
the particle's positions, while only light of infinite frequency has perfect resolution.
For example light in the visible spectrum has a wavelength of $\lambda \simeq 10^{-5}$ cm and thus
the minimum $\Delta X$ is $\simeq 10^{-5}$ cm. If one really went towards the classical ideal and
used photons of decreasing wavelength, one would soon find that the momentum of
the macroscopic particle is affected by the act of measuring its position. For example,
by the time one gets to a wavelength of $10^{-27}$ cm, each photon would carry a momentum
of approximately 1 g cm/sec and one would see macroscopic objects recoiling
under their impact.
In summary then, a typical macroscopic particle, described classically as possess
ing a welldefined value of x and p, is in reality an approximate eigenstate I xopoz»,
184 where A is at least 105 cm if visible light is used to locate the particle. The quantum
CHAPTER 6 equations for the time evolution of these approximate eigenvalues xo and po reduce
to Hamilton's equations, up to truly negligible uncertainties. The same goes for any
other dynamical variable dependent on x and p.
We conclude this chapter by repeating an earlier observation to underscore its
importance. Ehrenfest's theorem does not tell us that, in general, the expectation
values of quantum operators evolve as do their classical counterparts. In particular,
$\langle X\rangle = x_0$ and $\langle P\rangle = p_0$ do not obey Hamilton's equations in all problems. For them
to obey Hamilton's equations, we must be able to replace the mean values (expectation
values) of the functions $\partial H/\partial P$ and $\partial H/\partial X$ of X and P by the corresponding
functions of the mean values $\langle X\rangle = x_0$ and $\langle P\rangle = p_0$. For Hamiltonians that are at
the most quadratic in X and P, this replacement can be done with no error for all
wave functions. In the general case, such a replacement is a poor approximation
unless the fluctuations about the means $x_0$ and $p_0$ are small. Even in those cases
where $x_0$ and $p_0$ obey classical equations, the expectation value of some dependent
variable $\Omega(X, P)$ need not, unless we can replace $\langle\Omega(X, P)\rangle$ by $\Omega(\langle X\rangle, \langle P\rangle) = \omega(x_0, p_0)$.
Example 6.1. Consider $\langle\Omega(X)\rangle$, where $\Omega = X^2$, in a state given by $\psi(x) = A \exp[-(x - a)^2/2\Delta^2]$.
Is $\langle\Omega(X)\rangle = \Omega(\langle X\rangle)$? No, for the difference between the two
is $\langle X^2\rangle - \langle X\rangle^2 = (\Delta X)^2 \neq 0$.
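Example 6.1 can be checked numerically. The following is a minimal sketch, assuming illustrative values for $a$ and $\Delta$ (they are not from the text) and a plain Riemann sum for the integrals. For this Gaussian $|\psi|^2 \propto \exp[-(x-a)^2/\Delta^2]$, so the difference $\langle X^2\rangle - \langle X\rangle^2$ should come out as $\Delta^2/2$.

```python
import math


def gaussian_moments(a: float, delta: float, steps: int = 20000):
    """Return (<X>, <X^2>) for psi(x) = A exp[-(x - a)^2 / (2 delta^2)]."""
    lo, hi = a - 12.0 * delta, a + 12.0 * delta
    dx = (hi - lo) / steps
    norm = mean_x = mean_x2 = 0.0
    for i in range(steps + 1):
        x = lo + i * dx
        # |psi|^2 dx, left unnormalized; the constant A cancels in the ratios
        weight = math.exp(-((x - a) ** 2) / delta**2) * dx
        norm += weight
        mean_x += x * weight
        mean_x2 += x * x * weight
    return mean_x / norm, mean_x2 / norm


A_CENTER, DELTA = 1.5, 0.4          # illustrative values only
mean_x, mean_x2 = gaussian_moments(A_CENTER, DELTA)
variance = mean_x2 - mean_x**2      # <X^2> - <X>^2 = (Delta X)^2

print(f"<X^2> = {mean_x2:.6f}, <X>^2 = {mean_x**2:.6f}, difference = {variance:.6f}")
```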
The Harmonic Oscillator
7.1. Why Study the Harmonic Oscillator?
In this section I will put the harmonic oscillator in its place—on a pedestal. Not
only is it a system that can be exactly solved (in classical and quantum theory) and
a superb pedagogical tool (which will be repeatedly exploited in this text), but it is
also a system of great physical relevance. As will be shown below, any system fluctuating
by small amounts near a configuration of stable equilibrium may be described
either by an oscillator or by a collection of decoupled harmonic oscillators. Since
the dynamics of a collection of noninteracting oscillators is no more complicated
than that of a single oscillator (apart from the obvious N-fold increase in degrees
of freedom), in addressing the problem of the oscillator we are actually confronting
the general problem of small oscillations near equilibrium of an arbitrary system.
A concrete example of a single harmonic oscillator is a mass m coupled to a
spring of force constant k. For small deformations x, the spring will exert the force
given by Hooke's law, $F = -kx$, and produce a potential
$V = \frac{1}{2}kx^2$. The Hamiltonian for this system is
$$\mathscr{H} = T + V = \frac{p^2}{2m} + \frac{1}{2} m\omega^2 x^2 \tag{7.1.1}$$
where $\omega = (k/m)^{1/2}$ is the classical frequency of oscillation. Any Hamiltonian of the
above form, quadratic in the coordinate and momentum, will be called the harmonic
oscillator Hamiltonian. Now, the mass-spring system is just one among the following
family of systems described by the oscillator Hamiltonian. Consider a particle moving
in a potential V(x). If the particle is placed at one of its minima $x_0$, it will remain
there in a state of stable, static equilibrium. (A maximum, which is a point of unstable
static equilibrium, will not interest us here.) Consider now the dynamics of this
particle as it fluctuates by small amounts near $x = x_0$. The potential it experiences
may be expanded in a Taylor series:
$$V(x) = V(x_0) + \left.\frac{dV}{dx}\right|_{x_0}(x - x_0) + \frac{1}{2!}\left.\frac{d^2V}{dx^2}\right|_{x_0}(x - x_0)^2 + \cdots \tag{7.1.2}$$
Now, the constant piece $V(x_0)$ is of no physical consequence and may be
dropped. [In other words, we may choose $V(x_0)$ as the arbitrary reference point for
measuring the potential.] The second term in the series also vanishes since $x_0$ is a
minimum of V(x), or equivalently, since at a point of static equilibrium, the force,
$-dV/dx$, vanishes. If we now shift our origin of coordinates to $x_0$, Eq. (7.1.2) reads
$$V(x) = \frac{1}{2!}\left.\frac{d^2V}{dx^2}\right|_{0} x^2 + \frac{1}{3!}\left.\frac{d^3V}{dx^3}\right|_{0} x^3 + \cdots \tag{7.1.3}$$
For small oscillations, we may neglect all but the leading term and arrive at the
potential (or Hamiltonian) in Eq. (7.1.1), $d^2V/dx^2$ (at the minimum) being identified with $k = m\omega^2$.
(By definition, x is small if the neglected terms in the Taylor series are small compared
to the leading term, which alone is retained. In the case of the mass-spring system,
x is small as long as Hooke's law is a good approximation.)
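To make the small-oscillation argument concrete, here is a minimal sketch that extracts the effective spring constant $k = d^2V/dx^2$ at a minimum by finite differences. The Morse-type potential and all parameter values are hypothetical, chosen only for illustration; any smooth potential with a minimum would do.

```python
import math

# Hypothetical Morse-type potential; these parameters are illustrative only.
DEPTH, WIDTH, X_MIN, MASS = 2.0, 1.5, 0.7, 1.0


def potential(x: float) -> float:
    return DEPTH * (1.0 - math.exp(-WIDTH * (x - X_MIN))) ** 2


def second_derivative(f, x: float, h: float = 1e-4) -> float:
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2


k = second_derivative(potential, X_MIN)   # effective force constant at the minimum
omega = math.sqrt(k / MASS)               # frequency of small oscillations


def harmonic(x: float) -> float:
    """Leading (quadratic) term of the Taylor series about the minimum."""
    return 0.5 * k * (x - X_MIN) ** 2


for displacement in (0.01, 0.1, 0.5):
    x = X_MIN + displacement
    print(f"x - x0 = {displacement:4.2f}: V = {potential(x):.5f}, "
          f"harmonic term = {harmonic(x):.5f}")
```

The printed comparison shows the quadratic term tracking the full potential for small displacements and drifting away as the displacement grows, which is the sense in which "x is small" is used above.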
As an example of a system described by a collection of independent oscillators,
consider the coupled-mass system from Example 1.8.6. (It might help to refresh your
memory by going back and reviewing this problem.) The Hamiltonian for this system
is
$$\mathscr{H} = \frac{p_1^2}{2m} + \frac{p_2^2}{2m} + \frac{1}{2} m\omega^2 [x_1^2 + x_2^2 + (x_1 - x_2)^2] = \mathscr{H}_1 + \mathscr{H}_2 + \frac{1}{2} m\omega^2 (x_1 - x_2)^2 \tag{7.1.4}$$
Now this $\mathscr{H}$ is not of the promised form, since the oscillators corresponding to $\mathscr{H}_1$
and $\mathscr{H}_2$ (associated with the coordinates $x_1$ and $x_2$) are coupled by the $(x_1 - x_2)^2$
term. But we already know of an alternate description of this system in which it can
be viewed as two decoupled oscillators. The trick is of course the introduction of
normal coordinates. We exchange $x_1$ and $x_2$ for
$$x_{\mathrm{I}} = \frac{x_1 + x_2}{2^{1/2}} \tag{7.1.5a}$$
and
$$x_{\mathrm{II}} = \frac{x_1 - x_2}{2^{1/2}} \tag{7.1.5b}$$
By differentiating these equations with respect to time, we get similar ones for the
velocities, and hence the momenta. In terms of the normal coordinates (and the
corresponding momenta),
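$$\mathscr{H} = \frac{p_{\mathrm{I}}^2}{2m} + \frac{1}{2} m\omega^2 x_{\mathrm{I}}^2 + \frac{p_{\mathrm{II}}^2}{2m} + \frac{3}{2} m\omega^2 x_{\mathrm{II}}^2 \tag{7.1.6}$$
Thus the problem of the two coupled masses reduces to that of two uncoupled
oscillators of frequencies $\omega_{\mathrm{I}} = \omega = (k/m)^{1/2}$ and $\omega_{\mathrm{II}} = 3^{1/2}\omega = (3k/m)^{1/2}$.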
Let us rewrite Eq. (7.1.4) as
$$\mathscr{H} = \frac{1}{2m}\sum_{i=1}^{2}\sum_{j=1}^{2} p_i \delta_{ij} p_j + \frac{1}{2}\sum_{i=1}^{2}\sum_{j=1}^{2} x_i V_{ij} x_j \tag{7.1.7}$$
where $V_{ij}$ are elements of a real symmetric (Hermitian) matrix V with the following
values:
$$V_{11} = V_{22} = 2m\omega^2, \qquad V_{12} = V_{21} = -m\omega^2 \tag{7.1.8}$$
In switching to the normal coordinates $x_{\mathrm{I}}$ and $x_{\mathrm{II}}$ (and $p_{\mathrm{I}}$ and $p_{\mathrm{II}}$), we are going
to a basis that diagonalizes V and reduces the potential energy to a sum of decoupled
terms, one for each normal mode. The kinetic energy piece remains decoupled in
both bases.
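The diagonalization described above can be carried out explicitly for the two-mass matrix of Eq. (7.1.8). The sketch below assumes units with $m = \omega = 1$ and uses a hand-written closed-form solver for a symmetric 2x2 matrix (the helper name is hypothetical). The eigenvalues $m\omega^2$ and $3m\omega^2$ give the normal frequencies, and the eigenvectors reproduce the in-step and out-of-step combinations of Eqs. (7.1.5a) and (7.1.5b) up to an overall sign.

```python
import math


def eigen_symmetric_2x2(a: float, b: float, d: float):
    """Eigenvalues and unit eigenvectors of [[a, b], [b, d]], assuming b != 0."""
    mean, half_diff = 0.5 * (a + d), 0.5 * (a - d)
    radius = math.hypot(half_diff, b)
    result = []
    for lam in (mean - radius, mean + radius):
        vx, vy = b, lam - a               # solves (a - lam) vx + b vy = 0
        norm = math.hypot(vx, vy)
        result.append((lam, (vx / norm, vy / norm)))
    return result


M, OMEGA = 1.0, 1.0                       # illustrative units
modes = eigen_symmetric_2x2(2 * M * OMEGA**2, -M * OMEGA**2, 2 * M * OMEGA**2)

for eigenvalue, vector in modes:
    frequency = math.sqrt(eigenvalue / M)
    print(f"normal frequency = {frequency:.4f}, mode vector = {vector}")
```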
Now, just as the mass-spring system was just a representative element of a
family of systems described by the oscillator Hamiltonian, the coupled-mass system
is also a special case of a family that can be described by a collection of coupled
harmonic oscillators. Consider a system with N Cartesian degrees of freedom
$x_1, \ldots, x_N$, with a potential energy function $V(x_1, \ldots, x_N)$. Near an equilibrium point
(chosen as the origin), the expansion of V, in analogy with Eq. (7.1.3), is
$$V(x_1, \ldots, x_N) = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \left.\frac{\partial^2 V}{\partial x_i \partial x_j}\right|_{0} x_i x_j + \cdots \tag{7.1.9}$$
For small oscillations, the Hamiltonian is
$$\mathscr{H} = \sum_{i=1}^{N}\sum_{j=1}^{N} \frac{p_i \delta_{ij} p_j}{2m} + \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} x_i V_{ij} x_j \tag{7.1.10}$$
where
$$V_{ij} = \left.\frac{\partial^2 V}{\partial x_i \partial x_j}\right|_{0} = \left.\frac{\partial^2 V}{\partial x_j \partial x_i}\right|_{0} \tag{7.1.11}$$
are the elements of a Hermitian matrix V. (We are assuming for simplicity that the
masses associated with all N degrees of freedom are equal.) From the mathematical
theory of Chapter 1, we know that there exists a new basis (i.e., a new set of
coordinates $x_{\mathrm{I}}, x_{\mathrm{II}}, \ldots$) which will diagonalize V and reduce $\mathscr{H}$ to a sum of N
decoupled oscillator Hamiltonians, one for each normal mode. Thus the general
problem of small fluctuations near equilibrium of an arbitrary system reduces to the
study of a single harmonic oscillator.
This section concludes with a brief description of two important systems which
are described by a collection of independent oscillators. The first is a crystal (in three
dimensions), the atoms in which jiggle about their mean positions on the lattice. The
second is the electromagnetic field in free space. A crystal with $N_0$ atoms (assumed
to be point particles) has $3N_0$ degrees of freedom, these being the displacements from
equilibrium points on the lattice. For small oscillations, the Hamiltonian will be
quadratic in the coordinates (and of course the momenta). Hence there will exist
$3N_0$ normal coordinates and their conjugate momenta, in terms of which $\mathscr{H}$ will be
a decoupled sum over oscillator Hamiltonians. What are the corresponding normal
modes? Recall that in the case of two coupled masses, the normal modes corresponded
to collective motions of the entire system, with the two masses in step in
one case, and exactly out of step in the other. Likewise, in the present case, the
motion is collective in the normal modes, and corresponds to plane waves traveling
across the lattice. For a given wavevector $\mathbf{k}$, the atoms can vibrate parallel to $\mathbf{k}$
(longitudinal polarization) or in either of the two independent directions perpendicular
to $\mathbf{k}$ (transverse polarization). Most books on solid state physics will tell you
why there are only $N_0$ possible values for $\mathbf{k}$. (This must of course be so, for with
three polarizations at each $\mathbf{k}$, we will have exactly $3N_0$ normal modes.) The modes,
labeled $(\mathbf{k}, \lambda)$, where $\lambda$ is the polarization index ($\lambda = 1, 2, 3$), form a complete basis
for expanding any state of the system. The coefficients of the expansion, $a(\mathbf{k}, \lambda)$, are
the normal coordinates. The normal frequencies are labeled $\omega(\mathbf{k}, \lambda)$.‡
In the case of the electromagnetic field, the coordinate is the potential A(r, t)
at each point in space. [$\dot{\mathbf{A}}(\mathbf{r}, t)$ is the "velocity" corresponding to the coordinate
A(r, t).] The normal modes are once again plane waves but with two differences:
there is no restriction on k, but the polarization has to be transverse. The quantum
theory of the field will be discussed at length in Chapter 18.
7.2. Review of the Classical Oscillator
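The equations of motion for the oscillator are, from Eq. (7.1.1),
$$\dot{x} = \frac{\partial \mathscr{H}}{\partial p} = \frac{p}{m} \tag{7.2.1}$$
$$\dot{p} = -\frac{\partial \mathscr{H}}{\partial x} = -m\omega^2 x \tag{7.2.2}$$
By eliminating $\dot{p}$ we arrive at the familiar equation
$$\ddot{x} + \omega^2 x = 0$$
with the solution
$$x(t) = A\cos\omega t + B\sin\omega t = x_0\cos(\omega t + \phi) \tag{7.2.3}$$
where $x_0$ is the amplitude and $\phi$ the phase of the oscillator. The conserved energy
associated with the oscillator is
$$E = T + V = \frac{1}{2} m\dot{x}^2 + \frac{1}{2} m\omega^2 x^2 = \frac{1}{2} m\omega^2 x_0^2 \tag{7.2.4}$$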
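‡ To draw a parallel with the two-mass system, $(\mathbf{k}, \lambda)$ is like I or II, $a(\mathbf{k}, \lambda)$ is like $x_{\mathrm{I}}$ or $x_{\mathrm{II}}$, and $\omega(\mathbf{k}, \lambda)$
is like $(k/m)^{1/2}$ or $(3k/m)^{1/2}$.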
Since $x_0$ is a continuous variable, so is the energy of the classical oscillator. The
lowest value for E is zero, and corresponds to the particle remaining at rest at the
origin.
By solving for $\dot{x}$ in terms of E and x from Eq. (7.2.4) we obtain
$$\dot{x} = (2E/m - \omega^2 x^2)^{1/2} = \omega(x_0^2 - x^2)^{1/2} \tag{7.2.5}$$
which says that the particle starts from rest at a turning point ($x = \pm x_0$), picks up
speed till it reaches the origin, and slows down to rest by the time it reaches the
other turning point.
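The classical results can be spot-checked numerically. The sketch below assumes illustrative values for the mass, frequency, amplitude, and phase (none are prescribed by the text), samples the solution of Eq. (7.2.3), and confirms that the energy of Eq. (7.2.4) stays constant and that the speed obeys Eq. (7.2.5).

```python
import math

MASS, OMEGA, X0, PHI = 2.0, 1.0, 1.0, 0.3   # illustrative values (CGS)


def position(t: float) -> float:
    return X0 * math.cos(OMEGA * t + PHI)


def velocity(t: float) -> float:
    return -OMEGA * X0 * math.sin(OMEGA * t + PHI)


def energy(t: float) -> float:
    return 0.5 * MASS * velocity(t) ** 2 + 0.5 * MASS * OMEGA**2 * position(t) ** 2


expected_energy = 0.5 * MASS * OMEGA**2 * X0**2
samples = [0.1 * k for k in range(100)]

max_energy_drift = max(abs(energy(t) - expected_energy) for t in samples)

# |x'| should equal omega * sqrt(x0^2 - x^2); max(..., 0) guards against round-off.
max_speed_mismatch = max(
    abs(abs(velocity(t)) - OMEGA * math.sqrt(max(X0**2 - position(t) ** 2, 0.0)))
    for t in samples
)

print(f"largest deviation from E = (1/2) m w^2 x0^2: {max_energy_drift:.2e}")
print(f"largest deviation from Eq. (7.2.5):          {max_speed_mismatch:.2e}")
```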
You are reminded of these classical results, so that you may readily compare
and contrast them with their quantum counterparts.
7.3. Quantization of the Oscillator (Coordinate Basis)
We now consider the quantum oscillator, that is to say, a particle whose state
vector $|\psi\rangle$ obeys the Schrödinger equation
$$i\hbar\frac{d}{dt}|\psi\rangle = H|\psi\rangle$$
with
$$H = \mathscr{H}(x \to X, p \to P) = \frac{P^2}{2m} + \frac{1}{2} m\omega^2 X^2$$
As observed repeatedly in the past, the complete dynamics is contained in the propagator
U(t), which in turn may be expressed in terms of the eigenvectors and eigenvalues
of H. In this section and the next, we will solve the eigenvalue problem in the
X basis and the H basis, respectively. In Section 7.5 the passage from the H basis
to the X basis will be discussed. The solution in the P basis, trivially related to the
solution in the X basis in this case, will be discussed in an exercise.
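With an eye on what is to follow, let us first establish that the eigenvalues of H
cannot be negative. For any $|\psi\rangle$,
$$\begin{aligned}
\langle H\rangle &= \frac{1}{2m}\langle\psi|P^2|\psi\rangle + \frac{1}{2} m\omega^2\langle\psi|X^2|\psi\rangle \\
&= \frac{1}{2m}\langle\psi|P^\dagger P|\psi\rangle + \frac{1}{2} m\omega^2\langle\psi|X^\dagger X|\psi\rangle \\
&= \frac{1}{2m}\langle P\psi|P\psi\rangle + \frac{1}{2} m\omega^2\langle X\psi|X\psi\rangle \ge 0
\end{aligned}$$
since the norms of the states $|P\psi\rangle$ and $|X\psi\rangle$ cannot be negative. If we now set $|\psi\rangle$
equal to any eigenstate of H, we get the desired result.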
Armed with the above result, we are now ready to attack the problem in the X
basis.
We begin by projecting the eigenvalue equation,
$$\left(\frac{P^2}{2m} + \frac{1}{2} m\omega^2 X^2\right)|E\rangle = E|E\rangle \tag{7.3.1}$$
onto the X basis, using the usual substitutions
$$X \to x, \qquad P \to -i\hbar\frac{d}{dx}, \qquad |E\rangle \to \psi_E(x)$$
and obtain
$$\left(-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2} m\omega^2 x^2\right)\psi = E\psi \tag{7.3.2}$$
(The argument of $\psi$ and the subscript E are implicit.)
We can rearrange this equation to the form
$$\frac{d^2\psi}{dx^2} + \frac{2m}{\hbar^2}\left(E - \frac{1}{2} m\omega^2 x^2\right)\psi = 0 \tag{7.3.3}$$
We wish to find all solutions to this equation that lie in the physical Hilbert space
(of functions normalizable to unity or the Dirac delta function). Follow the approach
closely—it will be invoked often in the future.
The first step is to write Eq. (7.3.3) in terms of dimensionless variables. We
look for a new variable y which is dimensionless and related to x by
x= by (7.3.4)
where b is a scale factor with units of length. Although any length b (say the radius
of the solar system) will generate a dimensionless variable y, the idea is to choose
the natural length scale generated by the equation itself. By feeding Eq. (7.3.4) into
Eq. (7.3.3), we arrive at
$$\frac{d^2\psi}{dy^2} + \frac{2mEb^2}{\hbar^2}\psi - \frac{m^2\omega^2 b^4}{\hbar^2} y^2\psi = 0 \tag{7.3.5}$$
The last term suggests that we choose
$$b = \left(\frac{\hbar}{m\omega}\right)^{1/2} \tag{7.3.6}$$
Let us also define a dimensionless variable $\varepsilon$ corresponding to E:
$$\varepsilon = \frac{mEb^2}{\hbar^2} = \frac{E}{\hbar\omega} \tag{7.3.7}$$
(We may equally well choose $\varepsilon = 2mEb^2/\hbar^2$. Constants of order unity are not uniquely
suggested by the equation. In the present case, our choice of $\varepsilon$ is in anticipation of
the results.) In terms of the dimensionless variables, Eq. (7.3.5) becomes
$$\psi'' + (2\varepsilon - y^2)\psi = 0 \tag{7.3.8}$$
where the prime denotes differentiation with respect to y.
Not only do dimensionless variables lead to a more compact equation, they also
provide the natural scales for the problem. By measuring x and E in units of
$(\hbar/m\omega)^{1/2}$ and $\hbar\omega$, which are scales generated intrinsically by the parameters entering
the problem, we develop a feeling for what the words "small" and "large" mean:
for example, the displacement of the oscillator is large if y is large. If we insist on
using the same units for all problems ranging from atomic physics to cosmology,
we will not only be dealing with extremely large or extremely small numbers, we will
also have no feeling for the size of quantities in the relevant scale. (A distance of
$10^{-20}$ parsecs, small on the cosmic scale, is enormous if one is dealing with an atomic
system.)
The next step is to examine Eq. (7.3.8) at limiting values of y to learn about
the solution in these limits. In the limit $y \to \infty$, we may neglect the $2\varepsilon\psi$ term and
obtain
$$\psi'' - y^2\psi = 0 \tag{7.3.9}$$
The solution to this equation in the same limit is
$$\psi = A y^m e^{\pm y^2/2}$$
for
$$\psi'' = A y^{m+2} e^{\pm y^2/2}\left[1 \pm \frac{2m+1}{y^2} + \frac{m(m-1)}{y^4}\right] \xrightarrow[y\to\infty]{} A y^{m+2} e^{\pm y^2/2} = y^2\psi$$
where we have dropped all but the leading power in y as $y \to \infty$. Of the two possibilities
$y^m e^{\pm y^2/2}$, we pick $y^m e^{-y^2/2}$, for the other possibility is not a part of the physical
Hilbert space since it grows exponentially as $y \to \infty$.
Consider next the $y \to 0$ limit. Equation (7.3.8) becomes, upon dropping the $y^2\psi$
term,
$$\psi'' + 2\varepsilon\psi = 0$$
which has the solution
$$\psi = A\cos[(2\varepsilon)^{1/2} y] + B\sin[(2\varepsilon)^{1/2} y]$$
Since we have dropped the $y^2$ term in the equation as being too small, consistency
demands that we expand the cosine and sine and drop terms of order $y^2$ and beyond.
We then get
$$\psi \xrightarrow[y\to 0]{} A + cy + O(y^2)$$
where c is a new constant [$= B(2\varepsilon)^{1/2}$].
We therefore infer that $\psi$ is of the form
$$\psi(y) = u(y)\, e^{-y^2/2} \tag{7.3.10}$$
where u approaches A + cy (plus higher powers) as $y \to 0$, and $y^m$ (plus lower powers)
as $y \to \infty$. To determine u(y) completely, we feed the above ansatz into Eq. (7.3.8)
and obtain
$$u'' - 2yu' + (2\varepsilon - 1)u = 0 \tag{7.3.11}$$
This equation has the desired features (to be discussed in Exercise 7.3.1) that indicate
that a power-series solution is possible, i.e., if we assume
$$u(y) = \sum_{n=0}^{\infty} C_n y^n \tag{7.3.12}$$
the equation will determine the coefficients. [The series begins with n = 0, and not
some negative n, since we know that as $y \to 0$, $u \to A + cy + O(y^2)$.] Feeding this series
into Eq. (7.3.11) we find
$$\sum_{n=0}^{\infty} C_n\left[n(n-1)y^{n-2} - 2ny^n + (2\varepsilon - 1)y^n\right] = 0 \tag{7.3.13}$$
Consider the first of three pieces in the above series:
$$\sum_{n=0}^{\infty} C_n n(n-1) y^{n-2}$$
Due to the n(n - 1) factor, this series also equals
$$\sum_{n=2}^{\infty} C_n n(n-1) y^{n-2}$$
In terms of a new variable m = n - 2 the series becomes
$$\sum_{m=0}^{\infty} C_{m+2}(m+2)(m+1) y^m \equiv \sum_{n=0}^{\infty} C_{n+2}(n+2)(n+1) y^n$$
since m is a dummy variable. Feeding this equivalent series back into Eq. (7.3.13)
we get
$$\sum_{n=0}^{\infty} y^n\left[C_{n+2}(n+2)(n+1) + C_n(2\varepsilon - 1 - 2n)\right] = 0 \tag{7.3.14}$$
Since the functions $y^n$ are linearly independent (you cannot express $y^n$ as a linear
combination of other powers of y) each coefficient in the linear relation above must
vanish. We thus find
$$C_{n+2} = C_n\frac{(2n + 1 - 2\varepsilon)}{(n+2)(n+1)} \tag{7.3.15}$$
Thus for any $C_0$ and $C_1$, the recursion relation above generates $C_2, C_4, C_6, \ldots$ and
$C_3, C_5, C_7, \ldots$. The function u(y) is given by
$$\begin{aligned}
u(y) = {}& C_0\left[1 + \frac{(1 - 2\varepsilon)y^2}{(0+2)(0+1)} + \frac{(1 - 2\varepsilon)}{(0+2)(0+1)}\frac{(4 + 1 - 2\varepsilon)}{(2+2)(2+1)} y^4 + \cdots\right] \\
&+ C_1\left[y + \frac{(2 + 1 - 2\varepsilon)y^3}{(1+2)(1+1)} + \frac{(2 + 1 - 2\varepsilon)}{(1+2)(1+1)}\frac{(6 + 1 - 2\varepsilon)}{(3+2)(3+1)} y^5 + \cdots\right]
\end{aligned} \tag{7.3.16}$$
where $C_0$ and $C_1$ are arbitrary.
It appears as if the energy of the quantum oscillator is arbitrary, since $\varepsilon$ has
not been constrained in any way. But we know something is wrong, since we saw at
the outset that the oscillator eigenvalues are nonnegative. The first sign of sickness
in our solution, Eq. (7.3.16), is that u(y) does not behave like $y^m$ as $y \to \infty$ (as
deduced at the outset) since it contains arbitrarily high powers of y. There is only
one explanation. We have seen that as $y \to \infty$, there are just two possibilities
$$\psi(y) \xrightarrow[y\to\infty]{} y^m e^{\pm y^2/2}$$
If we write $\psi(y) = u(y)\, e^{-y^2/2}$, then the two possibilities for u(y) are
$$u(y) \xrightarrow[y\to\infty]{} y^m \quad \text{or} \quad y^m e^{y^2}$$
Clearly u(y) in Eq. (7.3.16), which is not bounded by any finite power of y as $y \to \infty$,
corresponds to the latter case. We may explicitly verify this as follows.
Consider the power series for u(y) as $y \to \infty$. Just as the series is controlled by
$C_0$ (the coefficient of the lowest power of y) as $y \to 0$, it is governed by its coefficients
$C_{n\to\infty}$ as $y \to \infty$. The growth of the series is characterized by the ratio [see Eq. (7.3.15)]
$$\frac{C_{n+2}}{C_n} \xrightarrow[n\to\infty]{} \frac{2}{n} \tag{7.3.17}$$
Compare this to the growth of $y^m e^{y^2}$. Since
$$y^m e^{y^2} = \sum_{k=0}^{\infty} \frac{y^{2k+m}}{k!}$$
$C_n$ = coefficient of $y^n$ = 1/k!, with n = 2k + m or k = (n - m)/2. Likewise
$$C_{n+2} = \frac{1}{[(n + 2 - m)/2]!}$$
so
$$\frac{C_{n+2}}{C_n} = \frac{[(n - m)/2]!}{[(n + 2 - m)/2]!} = \frac{1}{(n - m + 2)/2} \xrightarrow[n\to\infty]{} \frac{2}{n}$$
In other words, u(y) in Eq. (7.3.16) grows as $y^m e^{y^2}$, so that $\psi(y) \simeq y^m e^{y^2} e^{-y^2/2} = y^m e^{+y^2/2}$,
which is the rejected solution raising its ugly head! Our predicament is now
reversed: from finding that every $\varepsilon$ is allowed, we are now led to conclude that no
$\varepsilon$ is allowed. Fortunately there is a way out. If $\varepsilon$ is one of the special values
$$\varepsilon_n = \frac{2n + 1}{2}, \qquad n = 0, 1, 2, \ldots \tag{7.3.18}$$
the coefficient $C_{n+2}$ (and others dependent on it) vanish. If we choose $C_1 = 0$ when
n is even (or $C_0 = 0$ when n is odd) we have a finite polynomial of order n which
satisfies the differential equation and behaves as $y^n$ as $y \to \infty$:
$$\psi(y) = u(y)\, e^{-y^2/2} = \begin{Bmatrix} C_0 + C_2 y^2 + C_4 y^4 + \cdots + C_n y^n \\ C_1 y + C_3 y^3 + C_5 y^5 + \cdots + C_n y^n \end{Bmatrix} e^{-y^2/2} \tag{7.3.19}$$
Equation (7.3.18) tells us that energy is quantized: the only allowed values for
$E = \varepsilon\hbar\omega$ (i.e., values that yield solutions in the physical Hilbert space) are
$$E_n = (n + \tfrac{1}{2})\hbar\omega, \qquad n = 0, 1, 2, \ldots \tag{7.3.20}$$
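The termination of the series at the special values of $\varepsilon$ can be seen by iterating the recursion relation, Eq. (7.3.15), with exact rational arithmetic. This is a minimal sketch; the function name and the choice to pass $2\varepsilon$ as an integer are illustrative conveniences, not notation from the text.

```python
from fractions import Fraction


def series_coefficients(two_epsilon: int, parity: int, terms: int = 12):
    """Coefficients C_n of u(y) generated by Eq. (7.3.15), as a dict {n: C_n}.

    parity = 0 starts the even series from C_0 = 1; parity = 1 starts the odd
    series from C_1 = 1. two_epsilon is 2*epsilon, kept as an integer so that
    the arithmetic stays exact.
    """
    coeffs = {parity: Fraction(1)}
    n = parity
    for _ in range(terms):
        coeffs[n + 2] = coeffs[n] * Fraction(2 * n + 1 - two_epsilon, (n + 2) * (n + 1))
        n += 2
    return coeffs


# epsilon = 9/2 (the n = 4 level): the even series stops after y^4.
quantized = series_coefficients(two_epsilon=9, parity=0)
# epsilon = 1 (not an allowed value): the series never terminates.
generic = series_coefficients(two_epsilon=2, parity=0)

print("epsilon = 9/2:", {n: str(c) for n, c in quantized.items() if c != 0})
print("epsilon = 1  :", sum(1 for c in generic.values() if c != 0), "nonzero terms kept")
```

For $\varepsilon = 9/2$ the surviving coefficients give $1 - 4y^2 + \frac{4}{3}y^4$, which is the even polynomial of order 4 up to the overall constant $C_0$.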
For each value of n, Eq. (7.3.15) determines the corresponding polynomials of nth
order, called Hermite polynomials, $H_n(y)$:
$$\begin{aligned}
H_0(y) &= 1 \\
H_1(y) &= 2y \\
H_2(y) &= -2(1 - 2y^2) \\
H_3(y) &= -12\left(y - \tfrac{2}{3}y^3\right) \\
H_4(y) &= 12\left(1 - 4y^2 + \tfrac{4}{3}y^4\right)
\end{aligned} \tag{7.3.21}$$
The arbitrary initial coefficients $C_0$ and $C_1$ in $H_n$ are chosen according to a standard
convention. The normalized solutions are then
$$\psi_E(x) \equiv \psi_{(n+1/2)\hbar\omega}(x) \equiv \psi_n(x) = \left(\frac{m\omega}{\pi\hbar\, 2^{2n}(n!)^2}\right)^{1/4} \exp\left(-\frac{m\omega x^2}{2\hbar}\right) H_n\left[\left(\frac{m\omega}{\hbar}\right)^{1/2} x\right] \tag{7.3.22}$$
The derivation of the normalization constant
$$A_n = \left[\frac{m\omega}{\pi\hbar\, 2^{2n}(n!)^2}\right]^{1/4} \tag{7.3.23}$$
is rather tedious and will not be discussed here in view of a shortcut to be discussed
in the next section.
The following recursion relations among Hermite polynomials are very useful:
$$H_n'(y) = 2nH_{n-1} \tag{7.3.24}$$
$$H_{n+1}(y) = 2yH_n - 2nH_{n-1} \tag{7.3.25}$$
as is the integral
$$\int_{-\infty}^{\infty} H_n(y) H_{n'}(y)\, e^{-y^2}\, dy = \delta_{nn'}\left(\pi^{1/2} 2^n n!\right) \tag{7.3.26}$$
which is just the orthonormality condition of the eigenfunctions $\psi_n(x)$ and $\psi_{n'}(x)$
written in terms of $y = (m\omega/\hbar)^{1/2} x$.
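These relations lend themselves to a quick numerical check. The sketch below builds the Hermite polynomials as coefficient lists (lowest power first) from Eq. (7.3.25), then tests Eq. (7.3.24) exactly and Eq. (7.3.26) with a midpoint-rule integral. The helper names, the integration window, and the step count are illustrative assumptions.

```python
import math


def hermite_polynomials(n_max: int):
    """Coefficient lists for H_0 .. H_n_max, built from Eq. (7.3.25)."""
    polys = [[1], [0, 2]]
    for n in range(1, n_max):
        two_y_hn = [0] + [2 * c for c in polys[n]]                   # 2y H_n
        two_n_hn_minus_1 = [2 * n * c for c in polys[n - 1]] + [0, 0]  # 2n H_{n-1}
        polys.append([a - b for a, b in zip(two_y_hn, two_n_hn_minus_1)])
    return polys[: n_max + 1]


def derivative(coeffs):
    """Coefficient list of the derivative of a polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, y: float) -> float:
    return sum(c * y**k for k, c in enumerate(coeffs))


def weighted_overlap(p, q, half_width: float = 10.0, steps: int = 4000) -> float:
    """Midpoint-rule estimate of the integral of p(y) q(y) exp(-y^2)."""
    dy = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        y = -half_width + (i + 0.5) * dy
        total += evaluate(p, y) * evaluate(q, y) * math.exp(-y * y) * dy
    return total


H = hermite_polynomials(4)
print("H_4 coefficients:", H[4])
print("H_3' == 6 H_2   :", derivative(H[3]) == [6 * c for c in H[2]])
print("<H_3, H_3>      :", weighted_overlap(H[3], H[3]))
print("<H_2, H_4>      :", weighted_overlap(H[2], H[4]))
```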
We can now express the propagator as
$$U(x, t; x', t') = \sum_{n=0}^{\infty} A_n \exp\left(-\frac{m\omega}{2\hbar}x^2\right) H_n(x)\, A_n \exp\left(-\frac{m\omega}{2\hbar}x'^2\right) H_n(x') \exp[-i(n + 1/2)\omega(t - t')] \tag{7.3.27}$$
Evaluation of this sum is a highly formidable task. We will not attempt it here since
we will find an extremely simple way for calculating U in Chapter 8, devoted to the
path integral formalism. The result happens to be
$$U(x, t; x', t') = \left(\frac{m\omega}{2\pi i\hbar \sin\omega T}\right)^{1/2} \exp\left[\frac{im\omega}{\hbar}\,\frac{(x^2 + x'^2)\cos\omega T - 2xx'}{2\sin\omega T}\right] \tag{7.3.28}$$
where $T = t - t'$.
This concludes the solution of the eigenvalue problem. Before analyzing our
results let us recapitulate our strategy.
Step 1. Introduce dimensionless variables natural to the problem.
Step 2. Extract the asymptotic ($y \to \infty$, $y \to 0$) behavior of $\psi$.
Step 3. Write $\psi$ as a product of the asymptotic form and an unknown function u.
The function u will usually be easier to find than $\psi$.
Step 4. Try a power series to see if it will yield a recursion relation of the form Eq.
(7.3.15).
Exercise 7.3.1.* Consider the question why we tried a power-series solution for Eq.
(7.3.11) but not Eq. (7.3.8). By feeding in a series into the latter, verify that a three-term
recursion relation between $C_{n+2}$, $C_n$, and $C_{n-2}$ obtains, from which the solution does not
follow so readily. The problem is that $\psi''$ has two powers of y less than $2\varepsilon\psi$, while the $-y^2\psi$
piece has two more powers of y. In Eq. (7.3.11) on the other hand, of the three pieces $u''$,
$-2yu'$, and $(2\varepsilon - 1)u$, the last two have the same powers of y.
Exercise 7.3.2. Verify that $H_3(y)$ and $H_4(y)$ obey the recursion relation, Eq. (7.3.15).
Exercise 7.3.3. If $\psi(x)$ is even and $\phi(x)$ is odd under $x \to -x$, show that
$$\int_{-\infty}^{\infty} \psi(x)\phi(x)\, dx = 0$$
Use this to show that $\psi_2(x)$ and $\psi_1(x)$ are orthogonal. Using the values of Gaussian integrals
in Appendix A.2 verify that $\psi_2(x)$ and $\psi_0(x)$ are orthogonal.
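Exercise 7.3.4. Using Eqs. (7.3.23)-(7.3.25), show that
$$\langle n'|X|n\rangle = \left(\frac{\hbar}{2m\omega}\right)^{1/2}\left[\delta_{n',n+1}(n+1)^{1/2} + \delta_{n',n-1}\, n^{1/2}\right]$$
$$\langle n'|P|n\rangle = \left(\frac{m\omega\hbar}{2}\right)^{1/2} i\left[\delta_{n',n+1}(n+1)^{1/2} - \delta_{n',n-1}\, n^{1/2}\right]$$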
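Exercise 7.3.5.* Using the symmetry arguments from Exercise 7.3.3 show that $\langle n|X|n\rangle = \langle n|P|n\rangle = 0$
and thus that $\langle X^2\rangle = (\Delta X)^2$ and $\langle P^2\rangle = (\Delta P)^2$ in these states. Show that
$\langle 1|X^2|1\rangle = 3\hbar/2m\omega$ and $\langle 1|P^2|1\rangle = \frac{3}{2}m\omega\hbar$. Show that $\psi_0(x)$ saturates the uncertainty bound
$\Delta X \cdot \Delta P \ge \hbar/2$.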
Exercise 7.3.6.* Consider a particle in a potential
$$V(x) = \begin{cases} \frac{1}{2} m\omega^2 x^2, & x > 0 \\ \infty, & x < 0 \end{cases}$$
What are the boundary conditions on the wave functions now? Find the eigenvalues and
eigenfunctions.
We now discuss the eigenvalues and eigenfunctions of the oscillator. The following
are the main features:
(1) The energy is quantized. In contrast to the classical oscillator whose energy
is continuous, the quantum oscillator has a discrete set of levels given by Eq. (7.3.20).
Note that the quantization emerges only after we supplement Schrödinger's equation
with the requirement that $\psi$ be an element of the physical Hilbert space. In this case
it meant the imposition of the boundary condition $\psi(|x| \to \infty) \to 0$ [as opposed to
$\psi(|x| \to \infty) \to \infty$, which is what obtained for all but the special values of E].
Why does the classical oscillator seem to have a continuum of energy values?
The answer has to do with the relative sizes of the energy gap and the total energy
of the classical oscillator. Consider, for example, a mass of 2 g, oscillating at a
frequency of 1 rad/sec, with an amplitude of 1 cm. Its energy is
$$E = \tfrac{1}{2} m\omega^2 x_0^2 = 1\ \text{erg}$$
Compare this to the gap between allowed energies:
$$\Delta E = \hbar\omega \simeq 10^{-27}\ \text{erg}$$
At the macroscopic level, it is practically impossible to distinguish between a system
whose energy is continuous and one whose allowed energy levels are spaced $10^{-27}$ erg
apart. Stated differently, the quantum number associated with this oscillator is
$$n = \frac{E}{\hbar\omega} - \frac{1}{2} \simeq 10^{27}$$
while the difference in n between adjacent levels is unity. We have here a special case
of the correspondence principle, which states that as the quantum number tends to
infinity, we regain the classical picture. (We know vaguely that when a system is big,
it may be described classically. The correspondence principle tells us that the quantum
number is a good measure of bigness.)
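The numbers in this estimate are easy to reproduce. The sketch below assumes the CGS value of $\hbar$ shown in the code and uses the 2 g, 1 rad/sec, 1 cm oscillator from the text; it also evaluates the natural length scale $b = (\hbar/m\omega)^{1/2}$ of Eq. (7.3.6) for the same system, to show how small that scale is for a macroscopic mass.

```python
import math

HBAR = 1.0546e-27  # erg sec (assumed CGS value)

mass, omega, amplitude = 2.0, 1.0, 1.0    # g, rad/sec, cm (values from the text)

energy = 0.5 * mass * omega**2 * amplitude**2     # classical energy, erg
level_spacing = HBAR * omega                      # gap between adjacent levels, erg
quantum_number = energy / level_spacing - 0.5     # n from E = (n + 1/2) hbar omega
length_scale = math.sqrt(HBAR / (mass * omega))   # b of Eq. (7.3.6), cm

print(f"E            = {energy:.1f} erg")
print(f"hbar * omega = {level_spacing:.2e} erg")
print(f"n            = {quantum_number:.2e}")
print(f"b            = {length_scale:.2e} cm")
```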
(2) The levels are spaced uniformly. The fact that the oscillator energy levels
go up in steps of $\hbar\omega$ allows one to construct the following picture. We pretend that
associated with an oscillator of classical frequency $\omega$ there exist fictitious particles
called quanta, each endowed with energy $\hbar\omega$. We view the $n\hbar\omega$ piece in the energy
formula Eq. (7.3.20) as the energy of n such quanta. In other words, we forget about
the mass and spring and think in terms of the quanta. When the quantum number
n goes up (or down) by $\Delta n$, we say that $\Delta n$ quanta have been created (or destroyed).
Although it seems like a matter of semantics, thinking of the oscillator in terms of
these quanta has proven very useful.
In the case of the crystal, there are $3N_0$ oscillators, labeled by the $3N_0$ values of
$(\mathbf{k}, \lambda)$, with frequencies $\omega(\mathbf{k}, \lambda)$. The quantum state of the crystal is specified by
giving the number of quanta, called phonons, at each $(\mathbf{k}, \lambda)$. For a crystal whose
Hamiltonian is exactly given by a sum of oscillator pieces, the introduction of the
phonon concept is indeed a matter of semantics. If, however, we consider deviations
from this, say to take into account nonleading terms in the Taylor expansion of the
potential, or the interaction between the crystal and some external probe such as an
electron shot at it, the phonon concept proves very useful. (The two effects mentioned
above may be seen as phonon-phonon interactions and phonon-electron interactions,
respectively.)
Similarly, the interaction of the electromagnetic field with matter may be
viewed as the interaction between light quanta or photons and matter, which is
discussed in Chapter 18.
(3) The lowest possible energy is $\hbar\omega/2$ and not 0. Unlike the classical oscillator,
which can be in a state of zero energy (with x = p = 0), the quantum oscillator has a
minimum energy of $\hbar\omega/2$. This energy, called the zero-point energy, is a reflection
of the fact that the simultaneous eigenstate $|x = 0, p = 0\rangle$ is precluded by the canonical
commutation relation $[X, P] = i\hbar$. This result is common to all oscillators, whether
they describe a mechanical system or a normal mode of the electromagnetic field,
since all these problems are mathematically identical and differ only in what the
coordinate and its conjugate momentum represent. Thus, a crystal has an energy
$\frac{1}{2}\hbar\omega(\mathbf{k}, \lambda)$ in each mode $(\mathbf{k}, \lambda)$ even when phonons are absent, and the electromagnetic
field has an energy $\frac{1}{2}\hbar\omega(\mathbf{k}, \lambda)$ in each mode of frequency $\omega$ even when photons
are absent. (The zero-point fluctuation of the field has measurable consequences,
which will be discussed in Chapter 18.)
In the following discussion let us restrict ourselves to the mechanical oscillator
and examine more closely the zero-point energy. We saw that it is the absence of
the state $|x = 0, p = 0\rangle$ that is responsible for this energy. Such a state, with
$\Delta X = \Delta P = 0$, is forbidden by the uncertainty principle. Let us therefore try to find a state
that is quantum mechanically allowed and comes as close as possible (in terms of
its energy) to the classical state x = p = 0. If we choose a wave function $\psi(x)$ that is
sharply peaked near x = 0 to minimize the mean potential energy $\langle\frac{1}{2} m\omega^2 X^2\rangle$, the
wave function in P space spreads out and the mean kinetic energy $\langle P^2/2m\rangle$ grows.
The converse happens if we pick a momentum space wave function sharply peaked
near p = 0. What we need then is a compromise $\psi_{\min}(x)$ that minimizes the total
mean energy without violating the uncertainty principle. Let us now begin our quest
for $\psi_{\min}(x)$. We start with a normalized trial state $|\psi\rangle$ and consider
$$\langle\psi|H|\psi\rangle = \frac{\langle P^2\rangle}{2m} + \frac{1}{2} m\omega^2\langle X^2\rangle \tag{7.3.29}$$
Now
$$(\Delta P)^2 = \langle P^2\rangle - \langle P\rangle^2 \tag{7.3.30}$$
and
$$(\Delta X)^2 = \langle X^2\rangle - \langle X\rangle^2 \tag{7.3.31}$$
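so that
$$\langle H\rangle = \frac{(\Delta P)^2 + \langle P\rangle^2}{2m} + \frac{1}{2} m\omega^2\left[(\Delta X)^2 + \langle X\rangle^2\right] \tag{7.3.32}$$
The first obvious step in minimizing $\langle H\rangle$ is to restrict ourselves to states with
$\langle X\rangle = \langle P\rangle = 0$. (Since $\langle X\rangle$ and $\langle P\rangle$ are independent of each other and of $(\Delta X)^2$ and $(\Delta P)^2$,
such a choice is always possible.) For these states (from which we must pick the
winner)
$$\langle H\rangle = \frac{(\Delta P)^2}{2m} + \frac{1}{2} m\omega^2(\Delta X)^2 \tag{7.3.33}$$
Now we use the uncertainty relation
$$\Delta X \cdot \Delta P \ge \hbar/2 \tag{7.3.34}$$
where the equality sign holds only for a Gaussian, as will be shown in Section 9.3.
We get
$$\langle H\rangle \ge \frac{\hbar^2}{8m(\Delta X)^2} + \frac{1}{2} m\omega^2(\Delta X)^2 \tag{7.3.35}$$
We minimize $\langle H\rangle$ by choosing a Gaussian wave function, for which
$$\langle H\rangle_{\text{Gaussian}} = \frac{\hbar^2}{8m(\Delta X)^2} + \frac{1}{2} m\omega^2(\Delta X)^2 \tag{7.3.36}$$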
What we have found is that the mean energy associated with the trial wave function
is sensitive only to the corresponding AX and that, of all functions with the same
AX, the Gaussian has the lowest energy. Finally we choose, from the family of
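Gaussians, the one with the $\Delta X$ that minimizes $\langle H\rangle_{\text{Gaussian}}$. By requiring
$$\frac{\partial\langle H\rangle_{\text{Gaussian}}}{\partial(\Delta X)^2} = 0 = -\frac{\hbar^2}{8m(\Delta X)^4} + \frac{1}{2} m\omega^2 \tag{7.3.37}$$
we obtain
$$(\Delta X)^2 = \hbar/2m\omega \tag{7.3.38}$$
and
$$\langle H\rangle_{\min} = \hbar\omega/2 \tag{7.3.39}$$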
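Thus, by systematically hunting in Hilbert space, we have found that the following
normalized function has the lowest mean energy:
$$\psi_{\min}(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \exp\left(-\frac{m\omega x^2}{2\hbar}\right), \qquad \langle H\rangle_{\min} = \frac{\hbar\omega}{2} \tag{7.3.40}$$
If we apply the above result
$$\langle\psi_{\min}|H|\psi_{\min}\rangle \le \langle\psi|H|\psi\rangle \quad (\text{for all } |\psi\rangle)$$
to $|\psi\rangle = |\psi_0\rangle$ = ground-state vector, we get
$$\langle\psi_{\min}|H|\psi_{\min}\rangle \le \langle\psi_0|H|\psi_0\rangle = E_0 \tag{7.3.41}$$
Now compare this with the result of Exercise 5.2.2:
$$E_0 = \langle\psi_0|H|\psi_0\rangle \le \langle\psi|H|\psi\rangle \quad \text{for all } |\psi\rangle$$
If we set $|\psi\rangle = |\psi_{\min}\rangle$ we get
$$E_0 = \langle\psi_0|H|\psi_0\rangle \le \langle\psi_{\min}|H|\psi_{\min}\rangle \tag{7.3.42}$$
It follows from Eq. (7.3.41) and (7.3.42) that
$$E_0 = \langle\psi_0|H|\psi_0\rangle = \langle\psi_{\min}|H|\psi_{\min}\rangle = \frac{\hbar\omega}{2} \tag{7.3.43}$$
Also, since there was only one state, $|\psi_{\min}\rangle$, with energy $\hbar\omega/2$, it follows that
$$|\psi_0\rangle = |\psi_{\min}\rangle \tag{7.3.44}$$
We have thus managed to find the oscillator ground-state energy and state vector
without solving the Schrödinger equation.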
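The minimization in Eqs. (7.3.36)-(7.3.39) can also be done numerically. The following sketch works in units with $\hbar = m = \omega = 1$ (an assumption made for convenience) and uses a simple ternary search, which is adequate here because $\langle H\rangle_{\text{Gaussian}}$ has a single minimum as a function of $(\Delta X)^2$. The search bracket is an arbitrary illustrative choice.

```python
def gaussian_energy(s: float, hbar: float = 1.0, mass: float = 1.0, omega: float = 1.0) -> float:
    """<H> of Eq. (7.3.36) as a function of s = (Delta X)^2."""
    return hbar**2 / (8.0 * mass * s) + 0.5 * mass * omega**2 * s


def minimize_unimodal(f, lo: float, hi: float, iterations: int = 200) -> float:
    """Ternary search for the minimum of a function with one minimum on [lo, hi]."""
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


best_s = minimize_unimodal(gaussian_energy, 1e-3, 10.0)
best_energy = gaussian_energy(best_s)

print(f"(Delta X)^2 at the minimum: {best_s:.6f}   (expected hbar / 2 m omega = 0.5)")
print(f"minimum <H>               : {best_energy:.6f}   (expected hbar omega / 2 = 0.5)")
```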
It would be a serious pedagogical omission if it were not emphasized at this
juncture that the uncertainty relation has been unusually successful in the above
context. Our ability here to obtain all the information about the ground state using
the uncertainty relation is a consequence of the special form of the oscillator Hamiltonian
[which allowed us to write $\langle H\rangle$ in terms of $(\Delta X)^2$ and $(\Delta P)^2$] and the fact
that its ground-state wave function is a Gaussian (which has a privileged role with
respect to the uncertainty relation). In more typical instances, the use of the uncertainty
relation will have to be accompanied by some hand-waving [before $\langle H\rangle$ can
be approximated by a function of $(\Delta X)^2$ and $(\Delta P)^2$] and then too will yield only an
estimate for the ground-state energy. As for the wave function, we can only get an
estimate for $\Delta X$, the spread associated with it.
Figure 7.1. Normalized eigenfunctions for n = 0, 1, 2, and 3. The small arrows at
$|y| = (2n + 1)^{1/2}$ stand for the classical turning points. Recall that $y = (m\omega/\hbar)^{1/2} x$.
(4) The solutions (Fig. 7.1) $\psi_n(x)$ contain only even or odd powers of x, depending
on whether n is even or odd. Consequently the eigenfunctions are even or odd:
$$\psi_n(-x) = \begin{cases} \psi_n(x), & n \text{ even} \\ -\psi_n(x), & n \text{ odd} \end{cases}$$
In Chapter 11 on symmetries it will be shown that the eigenfunctions had to have
this property.
(5) The wave function does not vanish beyond the classical turning points, but
dies out exponentially as $|x| \to \infty$. [Verify that the classical turning points are given
by $y_0 = \pm(2n + 1)^{1/2}$.] Notice, however, that when n is large (Fig. 7.2) the excursions
outside the turning points are small compared to the classical amplitude. This exponentially
damped amplitude in the classically forbidden region was previously
encountered in Chapter 5 when we studied tunneling.
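As a concrete illustration of feature (5), the probability of finding the ground-state oscillator outside its classical turning points $|y| = 1$ can be computed directly. The sketch assumes the normalized ground-state density $e^{-y^2}/\pi^{1/2}$ in the dimensionless variable $y$, for which the tail probability is the complementary error function evaluated at 1; a midpoint-rule integral is included as an independent cross-check, with an illustrative cutoff and step count.

```python
import math


def forbidden_probability_exact() -> float:
    """P(|y| > 1) for the density exp(-y^2)/sqrt(pi), via the complementary error function."""
    return math.erfc(1.0)


def forbidden_probability_numeric(cutoff: float = 8.0, steps: int = 20000) -> float:
    """Midpoint-rule estimate of twice the integral of exp(-y^2)/sqrt(pi) from 1 to cutoff."""
    dy = (cutoff - 1.0) / steps
    total = 0.0
    for i in range(steps):
        y = 1.0 + (i + 0.5) * dy
        total += math.exp(-y * y) * dy
    return 2.0 * total / math.sqrt(math.pi)


p_exact = forbidden_probability_exact()
p_numeric = forbidden_probability_numeric()

print(f"probability outside the turning points (erfc): {p_exact:.4f}")
print(f"probability outside the turning points (sum) : {p_numeric:.4f}")
```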
(6) The probability distribution P(x) is very different from the classical case.
The position of a given classical oscillator is of course exactly known. But we could
ask the following probabilistic question: if I suddenly walk into a room containing
the oscillator, where am I likely to catch it? If the velocity of the oscillator at a point
x is v(x), the time it spends near x, and hence the probability of our catching it
there during a random spot check, varies inversely with v(x):
$$P_{\text{cl}}(x) \propto \frac{1}{v(x)} = \frac{1}{\omega(x_0^2 - x^2)^{1/2}} \tag{7.3.45}$$
which is peaked near $\pm x_0$ and has a minimum at the origin. In the quantum case,
for the ground state in particular, $|\psi(x)|^2$ seems to go just the other way (Fig. 7.1).
There is no contradiction here, for quantum mechanics is expected to differ from
classical mechanics. The correspondence principle, however, tells us that for large n
202 14/11101
CHAPTER 7
Figure 7.2. Probability density in the state n= 11.
J i? II VI I The broken curve gives the classical probability
6 4 2 2 4 6 distribution in a state with the same energy.
the two must become indistinguishable. From Fig. 7.2, which shows the situations
at n=11, we can see how the classical limit is reached: the quantum distribution
P(x)=Ity(x)1 2 wiggles so rapidly (in a scale set by the classical amplitude) that only
its mean can be detected at these scales, and this agrees with Pc , (x). We are reminded
here of the doubleslit experiment performed with macroscopic particles: there is a
dense interference pattern, whose mean is measured in practice and agrees with the
classical probability curve.
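The approach to the classical limit can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the dimensionless variable $y = (m\omega/\hbar)^{1/2}x$, the standard normalized form $\psi_n(y) = (2^n n!\,\pi^{1/2})^{-1/2} H_n(y) e^{-y^2/2}$, and an arbitrarily chosen window $|y| < y_0/2$. It compares the quantum probability of finding the particle in that window with the classical one obtained by integrating the normalized form of Eq. (7.3.45).

```python
import numpy as np
from numpy.polynomial.hermite import hermval
from math import factorial, pi, sqrt

def psi(n, y):
    """Normalized oscillator eigenfunction in the dimensionless variable y."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0                       # selects the Hermite polynomial H_n
    norm = 1.0 / sqrt(2.0**n * factorial(n) * sqrt(pi))
    return norm * hermval(y, coeffs) * np.exp(-y**2 / 2.0)

n = 11
y0 = sqrt(2 * n + 1)                      # classical turning point
y = np.linspace(-8.0, 8.0, 16001)
dy = y[1] - y[0]
p_quantum = psi(n, y)**2

total = np.sum(p_quantum) * dy            # normalization check

# Probability of catching the particle in the (arbitrary) window |y| < y0/2.
window = np.abs(y) < y0 / 2.0
p_window_quantum = np.sum(p_quantum[window]) * dy

# Classical density 1 / (pi * sqrt(y0^2 - y^2)) integrated over the same
# window gives (2/pi) * arcsin(1/2) = 1/3.
p_window_classical = (2.0 / pi) * np.arcsin(0.5)

print(total, p_window_quantum, p_window_classical)
```

The two window probabilities need not agree exactly at n = 11; the point of the sketch is that the coarse-grained quantum probability is already close to the classical value.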
A remark that was made in more general terms in Chapter 6 bears repeating: the classical oscillator that we often refer to is a figment lodged in our imagination and doesn't exist. In other words, all oscillators, including the 2-g mass and spring system, are ultimately governed by the laws of quantum mechanics, and thus have discrete energies, can shoot past the "classical" turning points, and have a zero-point energy of $\tfrac{1}{2}\hbar\omega$ even while they play dead. Note however that what I am calling nonexistent is an oscillator that actually has the properties attributed to it in classical mechanics, and not one that seems to have them when examined at the macroscopic level.
Exercise 7.3.7.* The Oscillator in Momentum Space. By setting up an eigenvalue equation for the oscillator in the P basis and comparing it to Eq. (7.3.2), show that the momentum space eigenfunctions may be obtained from the ones in coordinate space through the substitution $x \to p$, $m\omega \to 1/m\omega$. Thus, for example,
$$\psi_0(p) = \left(\frac{1}{m\pi\hbar\omega}\right)^{1/4} e^{-p^2/2m\hbar\omega}$$
There are several other pairs, such as $\Delta X$ and $\Delta P$ in the state $|n\rangle$, which are related by the substitution $m\omega \to 1/m\omega$. You may wish to watch out for them. (Refer back to Exercise 7.3.5.)
7.4. The Oscillator in the Energy Basis
Let us orient ourselves by recalling how the eigenvalue equation
$$\left(\frac{P^2}{2m} + \frac{1}{2} m\omega^2 X^2\right)|E\rangle = E|E\rangle \tag{7.4.1}$$
was solved in the coordinate basis: (1) We made the assignments $X \to x$, $P \to -i\hbar\, d/dx$. (2) We solved for the components $\langle x|E\rangle = \psi_E(x)$ and the eigenvalues.
To solve the problem in the momentum basis, we first compute the X and P operators in this basis, given their form in the coordinate basis. For instance,
$$\langle p'|X|p\rangle = \iint \langle p'|x\rangle \langle x|X|x'\rangle \langle x'|p\rangle\, dx\, dx' = \iint \frac{e^{-ip'x/\hbar}}{(2\pi\hbar)^{1/2}}\; x\,\delta(x - x')\; \frac{e^{ipx'/\hbar}}{(2\pi\hbar)^{1/2}}\, dx\, dx' = i\hbar\, \frac{d}{dp'}\,\delta(p' - p)$$
where the middle factor $x\,\delta(x - x')$ is the given coordinate-basis form of X. We then find P and H(X, P) in this basis. The eigenvalue equation, (7.4.1), will then become a differential equation that we will proceed to solve.
Now suppose that we want to work in the energy basis. We must first find the eigenfunctions of H, i.e., $\langle x|E\rangle$, so that we can carry out the change of basis. But finding $\langle x|E\rangle = \psi_E(x)$ amounts to solving the full eigenvalue problem in the coordinate basis. Once we have done this, there is not much point in setting up the problem in the E basis.
But there is a clever way due to Dirac, which allows us to work in the energy basis without having to know ahead of time the operators X and P in this basis. All we will need is the commutation relation
$$[X, P] = i\hbar I = i\hbar \tag{7.4.2}$$
which follows from $X \to x$, $P \to -i\hbar\, d/dx$, but is basis independent. The next few steps will seem rather mysterious and will not fit into any of the familiar schemes discussed so far. You must be patient till they begin to pay off.
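Let us first introduce the operator
$$a = \left(\frac{m\omega}{2\hbar}\right)^{1/2} X + i\left(\frac{1}{2m\omega\hbar}\right)^{1/2} P \tag{7.4.3}$$
and its adjoint
$$a^\dagger = \left(\frac{m\omega}{2\hbar}\right)^{1/2} X - i\left(\frac{1}{2m\omega\hbar}\right)^{1/2} P \tag{7.4.4}$$
(Note that $m\omega \leftrightarrow 1/m\omega$ as $X \leftrightarrow P$.) They satisfy the commutation relation (which you should verify)
$$[a, a^\dagger] = 1 \tag{7.4.5}$$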
Note next that the Hermitian operator $a^\dagger a$ is simply related to H:
$$a^\dagger a = \frac{m\omega}{2\hbar} X^2 + \frac{1}{2m\omega\hbar} P^2 + \frac{i}{2\hbar}[X, P] = \frac{H}{\hbar\omega} - \frac{1}{2}$$
so that
$$H = (a^\dagger a + 1/2)\hbar\omega \tag{7.4.6}$$
[This method is often called the "method of factorization" since we are expressing $H = P^2 + X^2$ (ignoring constants) as a product of $(X + iP) = a$ and $(X - iP) = a^\dagger$. The extra $\hbar\omega/2$ in Eq. (7.4.6) comes from the noncommutative nature of X and P.]
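Let us next define an operator $\hat{H}$,
$$\hat{H} = \frac{H}{\hbar\omega} = (a^\dagger a + 1/2) \tag{7.4.7}$$
whose eigenvalues $\varepsilon$ measure energy in units of $\hbar\omega$. We wish to solve the eigenvalue equation for $\hat{H}$:
$$\hat{H}|\varepsilon\rangle = \varepsilon|\varepsilon\rangle \tag{7.4.8}$$
where $\varepsilon$ is the energy measured in units of $\hbar\omega$. Two relations we will use shortly are
$$[a, \hat{H}] = [a, a^\dagger a + 1/2] = [a, a^\dagger a] = a \tag{7.4.9}$$
and
$$[a^\dagger, \hat{H}] = -a^\dagger \tag{7.4.10}$$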
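The utility of a and $a^\dagger$ stems from the fact that given an eigenstate of $\hat{H}$, they generate others. Consider
$$\hat{H} a|\varepsilon\rangle = (a\hat{H} - [a, \hat{H}])|\varepsilon\rangle = (a\hat{H} - a)|\varepsilon\rangle = (\varepsilon - 1)\,a|\varepsilon\rangle \tag{7.4.11}$$
We infer from Eq. (7.4.11) that $a|\varepsilon\rangle$ is an eigenstate with eigenvalue $\varepsilon - 1$, i.e.,
$$a|\varepsilon\rangle = C_\varepsilon|\varepsilon - 1\rangle \tag{7.4.12}$$
where $C_\varepsilon$ is a constant, and $|\varepsilon - 1\rangle$ and $|\varepsilon\rangle$ are normalized eigenkets.‡
Similarly we see that
$$\hat{H} a^\dagger|\varepsilon\rangle = (a^\dagger\hat{H} - [a^\dagger, \hat{H}])|\varepsilon\rangle = (a^\dagger\hat{H} + a^\dagger)|\varepsilon\rangle = (\varepsilon + 1)\,a^\dagger|\varepsilon\rangle \tag{7.4.13}$$
so that
$$a^\dagger|\varepsilon\rangle = C_{\varepsilon+1}|\varepsilon + 1\rangle \tag{7.4.14}$$
One refers to a and $a^\dagger$ as lowering and raising operators for obvious reasons. They are also called destruction and creation operators since they destroy or create quanta of energy $\hbar\omega$.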
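We are thus led to conclude that if $\varepsilon$ is an eigenvalue of $\hat{H}$, so are $\varepsilon + 1, \varepsilon + 2, \varepsilon + 3, \ldots, \varepsilon + \infty$; and $\varepsilon - 1, \ldots, \varepsilon - \infty$. The latter conclusion is in conflict with the result that the eigenvalues of H are nonnegative. So, it must be that the downward chain breaks at some point: there must be a state $|\varepsilon_0\rangle$ that cannot be lowered further:
$$a|\varepsilon_0\rangle = 0 \tag{7.4.15}$$
Operating with $a^\dagger$, we get
$$a^\dagger a|\varepsilon_0\rangle = 0$$
or
$$(\hat{H} - 1/2)|\varepsilon_0\rangle = 0 \qquad \text{[from Eq. (7.4.7)]}$$
or
$$\hat{H}|\varepsilon_0\rangle = \tfrac{1}{2}|\varepsilon_0\rangle$$
or
$$\varepsilon_0 = \tfrac{1}{2} \tag{7.4.16}$$
‡ We are using the fact that there is no degeneracy in one dimension.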
We may, however, raise the state $|\varepsilon_0\rangle$ indefinitely by the repeated application of $a^\dagger$. We thus find that the oscillator has a sequence of levels given by
$$\varepsilon_n = (n + 1/2), \qquad n = 0, 1, 2, \ldots$$
or
$$E_n = (n + 1/2)\hbar\omega, \qquad n = 0, 1, 2, \ldots \tag{7.4.17}$$
Are these the only levels? If there were another family, it too would have to have a ground state $|\varepsilon_0'\rangle$ such that
$$a|\varepsilon_0'\rangle = 0$$
or
$$a^\dagger a|\varepsilon_0'\rangle = 0$$
or
$$\varepsilon_0' = \tfrac{1}{2} \tag{7.4.18}$$
But we know that there is no degeneracy in one dimension (Theorem 15). Consequently it follows from Eqs. (7.4.16) and (7.4.18) that $|\varepsilon_0\rangle$ and $|\varepsilon_0'\rangle$ represent the same state. The same goes for the families built from $|\varepsilon_0\rangle$ and $|\varepsilon_0'\rangle$ by the repeated action of $a^\dagger$.
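The level sequence of Eq. (7.4.17) can be cross-checked without ladder operators. The sketch below is an illustration and not the method of the text: it assumes the dimensionless form $H/\hbar\omega = -\tfrac{1}{2}\,d^2/dy^2 + \tfrac{1}{2}y^2$ and replaces the second derivative by a three-point finite difference on a grid whose size and extent are arbitrary choices.

```python
import numpy as np

# Dimensionless oscillator H/(hbar*omega) = -(1/2) d^2/dy^2 + (1/2) y^2,
# discretized on a uniform grid (grid parameters are illustrative choices).
num_points = 800
y = np.linspace(-10.0, 10.0, num_points)
dy = y[1] - y[0]

# Three-point finite-difference approximation to -(1/2) d^2/dy^2.
kinetic = (np.diag(np.full(num_points, 1.0 / dy**2))
           - np.diag(np.full(num_points - 1, 0.5 / dy**2), 1)
           - np.diag(np.full(num_points - 1, 0.5 / dy**2), -1))
potential = np.diag(0.5 * y**2)

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
levels = np.linalg.eigvalsh(kinetic + potential)[:5]
expected = np.arange(5) + 0.5              # n + 1/2 for n = 0, ..., 4

print(levels)
print(expected)
```

The lowest numerical eigenvalues should approach n + 1/2 as the grid is refined; the residual difference is discretization error, not a feature of the oscillator.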
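We now calculate the constants $C_\varepsilon$ and $C_{\varepsilon+1}$ appearing in Eqs. (7.4.12) and (7.4.14). Since $\varepsilon = n + 1/2$, let us label the kets by the integer n. We want to determine the constant $C_n$ appearing in the equation
$$a|n\rangle = C_n|n - 1\rangle \tag{7.4.19a}$$
Consider the adjoint of this equation
$$\langle n|a^\dagger = \langle n - 1|C_n^* \tag{7.4.19b}$$
By combining these equations we arrive at
$$\langle n|a^\dagger a|n\rangle = \langle n - 1|n - 1\rangle C_n^* C_n$$
$$\langle n|\hat{H} - \tfrac{1}{2}|n\rangle = C_n^* C_n \qquad (\text{since } |n - 1\rangle \text{ is normalized})$$
$$\langle n|n|n\rangle = |C_n|^2 \qquad (\text{since } \hat{H}|n\rangle = (n + 1/2)|n\rangle) \tag{7.4.20}$$
$$|C_n|^2 = n$$
$$C_n = n^{1/2} e^{i\phi} \qquad (\phi \text{ is arbitrary})$$
It is conventional to choose $\phi$ as zero. So we have
$$a|n\rangle = n^{1/2}|n - 1\rangle \tag{7.4.21}$$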
It can similarly be shown (by you) that
$$a^\dagger|n\rangle = (n + 1)^{1/2}|n + 1\rangle \tag{7.4.22}$$
[Note that in Eqs. (7.4.21) and (7.4.22) the larger of the n's labeling the two kets appears under the square root.] By combining these two equations we find
$$a^\dagger a|n\rangle = a^\dagger n^{1/2}|n - 1\rangle = n^{1/2} n^{1/2}|n\rangle = n|n\rangle \tag{7.4.23}$$
In terms of
$$N = a^\dagger a \tag{7.4.24}$$
called the number operator (since it counts the quanta)
$$\hat{H} = N + \tfrac{1}{2} \tag{7.4.25}$$
Equations (7.4.21) and (7.4.22) are very important. They allow us to compute the matrix elements of all operators in the $|n\rangle$ basis. First consider a and $a^\dagger$ themselves:
$$\langle n'|a|n\rangle = n^{1/2}\langle n'|n - 1\rangle = n^{1/2}\,\delta_{n',n-1} \tag{7.4.26}$$
$$\langle n'|a^\dagger|n\rangle = (n + 1)^{1/2}\langle n'|n + 1\rangle = (n + 1)^{1/2}\,\delta_{n',n+1} \tag{7.4.27}$$
To find the matrix elements of X and P, we invert Eqs. (7.4.3) and (7.4.4) to obtain
$$X = \left(\frac{\hbar}{2m\omega}\right)^{1/2}(a + a^\dagger) \tag{7.4.28}$$
$$P = i\left(\frac{m\omega\hbar}{2}\right)^{1/2}(a^\dagger - a) \tag{7.4.29}$$
and then use Eqs. (7.4.26) and (7.4.27). The details are left as an exercise. The two basic matrices in this energy basis are
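$$a^\dagger \leftrightarrow \begin{array}{c|cccc} & n=0 & n=1 & n=2 & \cdots \\ \hline n=0 & 0 & 0 & 0 & \cdots \\ n=1 & 1^{1/2} & 0 & 0 & \\ n=2 & 0 & 2^{1/2} & 0 & \\ & 0 & 0 & 3^{1/2} & \\ & \vdots & & & \ddots \end{array} \tag{7.4.30}$$
and its adjoint
$$a \leftrightarrow \begin{pmatrix} 0 & 1^{1/2} & 0 & 0 & \cdots \\ 0 & 0 & 2^{1/2} & 0 & \\ 0 & 0 & 0 & 3^{1/2} & \\ \vdots & & & & \ddots \end{pmatrix} \tag{7.4.31}$$
Both matrices can be constructed either from Eqs. (7.4.26) and (7.4.27) or Eqs. (7.4.21) and (7.4.22) combined with our mnemonic involving images of the transformed vectors $a^\dagger|n\rangle$ and $a|n\rangle$. We get the matrices representing X and P by turning to Eqs. (7.4.28) and (7.4.29):
$$X \leftrightarrow \left(\frac{\hbar}{2m\omega}\right)^{1/2} \begin{pmatrix} 0 & 1^{1/2} & 0 & 0 & \cdots \\ 1^{1/2} & 0 & 2^{1/2} & 0 & \\ 0 & 2^{1/2} & 0 & 3^{1/2} & \\ 0 & 0 & 3^{1/2} & 0 & \\ \vdots & & & & \ddots \end{pmatrix} \tag{7.4.32}$$
$$P \leftrightarrow i\left(\frac{m\omega\hbar}{2}\right)^{1/2} \begin{pmatrix} 0 & -1^{1/2} & 0 & 0 & \cdots \\ 1^{1/2} & 0 & -2^{1/2} & 0 & \\ 0 & 2^{1/2} & 0 & -3^{1/2} & \\ 0 & 0 & 3^{1/2} & 0 & \\ \vdots & & & & \ddots \end{pmatrix} \tag{7.4.33}$$
The Hamiltonian is of course diagonal in its own basis:
$$H \leftrightarrow \hbar\omega \begin{pmatrix} 1/2 & 0 & 0 & 0 & \cdots \\ 0 & 3/2 & 0 & 0 & \\ 0 & 0 & 5/2 & 0 & \\ \vdots & & & & \ddots \end{pmatrix} \tag{7.4.34}$$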
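The matrices above are easy to build and test numerically. The following sketch makes two assumptions that are not in the text: the infinite matrices are truncated to a finite dimension `dim`, and units with $\hbar = m = \omega = 1$ are used. Truncation spoils the last row and column (the highest state has nowhere to be raised to), so only the block below the cutoff should be compared with Eqs. (7.4.2) and (7.4.34).

```python
import numpy as np

dim = 8                                   # truncated basis |0>, ..., |7>
n = np.arange(1, dim)
a = np.diag(np.sqrt(n), k=1)              # <n'|a|n> = sqrt(n) delta_{n', n-1}
a_dag = a.T                               # real matrix: adjoint = transpose

hbar = m = omega = 1.0                    # illustrative choice of units
X = np.sqrt(hbar / (2 * m * omega)) * (a + a_dag)
P = 1j * np.sqrt(m * omega * hbar / 2) * (a_dag - a)
H = P @ P / (2 * m) + 0.5 * m * omega**2 * (X @ X)

commutator = X @ P - P @ X

# Away from the truncation edge, H is diagonal with entries (n + 1/2)
# and [X, P] equals i*hbar times the identity.
print(np.round(np.diag(H).real, 6))
print(np.round(np.diag(commutator), 6))
```

The last diagonal entry of both printed arrays is a truncation artifact; enlarging `dim` pushes it further out without changing the entries before it.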

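Equation (7.4.22) also allows us to express all normalized eigenvectors $|n\rangle$ in terms of the ground state $|0\rangle$:
$$|n\rangle = \frac{a^\dagger}{n^{1/2}}|n - 1\rangle = \frac{a^\dagger}{n^{1/2}}\,\frac{a^\dagger}{(n - 1)^{1/2}}|n - 2\rangle = \cdots = \frac{(a^\dagger)^n}{(n!)^{1/2}}|0\rangle \tag{7.4.35}$$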
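The a and $a^\dagger$ operators greatly facilitate the calculation of the matrix elements of other operators between oscillator eigenstates. Consider, for example, $\langle 3|X^3|2\rangle$. In the X basis one would have to carry out the following integral:
$$\langle 3|X^3|2\rangle = \left(\frac{m\omega}{\pi\hbar}\right)^{1/2} \left(\frac{1}{2^3 3!}\right)^{1/2} \left(\frac{1}{2^2 2!}\right)^{1/2} \int_{-\infty}^{\infty} \left\{ \exp\left(-\frac{m\omega x^2}{2\hbar}\right) H_3\left[\left(\frac{m\omega}{\hbar}\right)^{1/2} x\right] x^3 \exp\left(-\frac{m\omega x^2}{2\hbar}\right) H_2\left[\left(\frac{m\omega}{\hbar}\right)^{1/2} x\right] \right\} dx$$
whereas in the $|n\rangle$ basis
$$\langle 3|X^3|2\rangle = \left(\frac{\hbar}{2m\omega}\right)^{3/2} \langle 3|(a + a^\dagger)^3|2\rangle$$
$$= \left(\frac{\hbar}{2m\omega}\right)^{3/2} \langle 3|(a^3 + a^2 a^\dagger + a a^\dagger a + a a^\dagger a^\dagger + a^\dagger a a + a^\dagger a a^\dagger + a^\dagger a^\dagger a + a^\dagger a^\dagger a^\dagger)|2\rangle$$
Since a lowers n by one unit and $a^\dagger$ raises it by one unit and we want to go up by one unit from n = 2 to n = 3, the only nonzero contribution comes from $a^\dagger a^\dagger a$, $a a^\dagger a^\dagger$, and $a^\dagger a a^\dagger$. Now
$$a^\dagger a^\dagger a|2\rangle = 2^{1/2} a^\dagger a^\dagger|1\rangle = 2^{1/2} 2^{1/2} a^\dagger|2\rangle = 2^{1/2} 2^{1/2} 3^{1/2}|3\rangle$$
$$a a^\dagger a^\dagger|2\rangle = 3^{1/2} a a^\dagger|3\rangle = 3^{1/2} 4^{1/2} a|4\rangle = 3^{1/2} 4^{1/2} 4^{1/2}|3\rangle$$
$$a^\dagger a a^\dagger|2\rangle = 3^{1/2} a^\dagger a|3\rangle = 3^{1/2} N|3\rangle = 3^{1/2}\, 3|3\rangle$$
so that
$$\langle 3|X^3|2\rangle = \left(\frac{\hbar}{2m\omega}\right)^{3/2} [2(3^{1/2}) + 4(3^{1/2}) + 3(3^{1/2})]$$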
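The operator bookkeeping above can be checked by brute-force matrix multiplication. This is a hedged sketch under one assumption not in the text: the matrices are truncated at a dimension `dim` large enough that the truncation cannot affect the element in question (three steps starting from n = 2 reach at most n = 5). The result is expressed in units of $(\hbar/2m\omega)^{3/2}$.

```python
import numpy as np

dim = 12                                   # comfortably above the highest level reached
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
x_hat = a + a.T                            # X in units of (hbar / 2 m omega)^(1/2)

x_cubed = np.linalg.matrix_power(x_hat, 3)
element = x_cubed[3, 2]                    # <3|(a + a^dagger)^3|2>

# Sum of the three nonzero operator orderings worked out in the text.
by_hand = (2 + 4 + 3) * np.sqrt(3)

print(element, by_hand)
```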
What if we want not some matrix element of X, but the probability of finding the particle in $|n\rangle$ at position x? We can of course fall back on Postulate III, which tells us to find the eigenvectors $|x\rangle$ of the matrix X [Eq. (7.4.32)] and evaluate the inner product $\langle x|n\rangle$. A more practical way will be developed in the next section.
Consider a remarkable feature of the above solution to the eigenvalue problem of H. Usually we work in the X basis and set up the eigenvalue problem (as a differential equation) by invoking Postulate II, which gives the action of X and P in the X basis ($X \to x$, $P \to -i\hbar\, d/dx$). In some cases (the linear potential problem), the P basis recommends itself, and then we use the Fourier-transformed version of Postulate II, namely, $X \to i\hbar\, d/dp$, $P \to p$. In the present case we could not transform this operator assignment to the energy eigenbasis, for to do so we first had to solve for the energy eigenfunctions in the X basis, which was begging the question. Instead we used just the commutation relation $[X, P] = i\hbar$, which follows from Postulate II, but is true in all bases, in particular the energy basis. Since we obtained the complete solution given just this information, it would appear that the essence of Postulate II is just the commutator. This in fact is the case. In other words, we may trade our present Postulate II for a more general version:
Postulate II. The independent variables x and p of classical mechanics now become Hermitian operators X and P defined by the canonical commutator $[X, P] = i\hbar$. Dependent variables $\omega(x, p)$ are given by operators $\Omega = \omega(x \to X, p \to P)$.
To regain our old version, we go to the X basis. Clearly in its own basis $X \to x$. We must then pick P such that $[X, P] = i\hbar$. If we make the conventional choice $P = -i\hbar\, d/dx$, we meet this requirement and arrive at Postulate II as stated earlier. But the present version of Postulate II allows us some latitude in the choice of P, for we can add to $-i\hbar\, d/dx$ any function of x without altering the commutator: the assignment
$$X \xrightarrow[\text{X basis}]{} x \tag{7.4.36a}$$
$$P \xrightarrow[\text{X basis}]{} -i\hbar \frac{d}{dx} + f(x) \tag{7.4.36b}$$
is equally satisfactory. Now, it is not at all obvious that in every problem (and not just the harmonic oscillator) the same physics will obtain if we make this our starting point. For example if we project the eigenvalue equation
$$P|p\rangle = p|p\rangle \tag{7.4.37a}$$
onto the X basis, we now get
$$\left[-i\hbar \frac{d}{dx} + f(x)\right]\psi_p(x) = p\,\psi_p(x) \tag{7.4.37b}$$
from which it follows that $\psi_p(x)$ is no longer a plane wave $\propto e^{ipx/\hbar}$. How can the physics be the same as before? The answer is that the wave function is never measured directly. What we do measure are probabilities $|\langle\omega|\psi\rangle|^2$ for obtaining some result $\omega$ when $\Omega$ is measured, squares of matrix elements $|\langle\psi_1|\Omega|\psi_2\rangle|^2$, or the eigenvalue spectrum of operators such as the Hamiltonian. In one of the exercises that follows, you will be guided toward the proof that these measurable quantities are in fact left invariant under the change to the nontraditional operator assignment Eq. (7.4.36).
Dirac emphasized the close connection between the commutation rule
$$[X, P] = i\hbar$$
of the quantum operators and the Poisson brackets (PB) of their classical counterparts
$$\{x, p\} = 1$$
which allows us to write the defining relation of the quantum operators as
$$[X, P] = i\hbar\{x, p\} = i\hbar \tag{7.4.38}$$
The virtue of this viewpoint is that its generalization to the "quantization" of a system of N degrees of freedom is apparent:
Postulate II (For N Degrees of Freedom). The Cartesian coordinates $x_1, \ldots, x_N$ and momenta $p_1, \ldots, p_N$ of the classical description of a system with N degrees of freedom now become Hermitian operators $X_1, \ldots, X_N$; $P_1, \ldots, P_N$ obeying the commutation rules
$$[X_i, P_j] = i\hbar\{x_i, p_j\} = i\hbar\,\delta_{ij}$$
$$[X_i, X_j] = i\hbar\{x_i, x_j\} = 0 \tag{7.4.39}$$
$$[P_i, P_j] = i\hbar\{p_i, p_j\} = 0$$
Similarly $\omega(x, p) \to \omega(x \to X, p \to P) = \Omega$.
[We restrict ourselves to Cartesian coordinates to avoid certain subtleties associated with the quantization of non-Cartesian but canonical coordinates; see Exercise (7.4.10). Once the differential equations are obtained, we may abandon Cartesian coordinates in looking for the solutions.]
It is evident that the generalization provided towards the end of Section 4.2, namely,
$$X_i \xrightarrow[\text{X basis}]{} x_i$$
$$P_i \xrightarrow[\text{X basis}]{} -i\hbar \frac{\partial}{\partial x_i}$$
is a choice but not the choice satisfying the canonical commutation rules, Eq. (7.4.39), for the same reason as in the N = 1 case.
Given the commutation relations between X and P, the ones among dependent operators follow from the repeated use of the relations
$$[\Omega, \Lambda\Gamma] = \Lambda[\Omega, \Gamma] + [\Omega, \Lambda]\Gamma$$
and
$$[\Omega\Lambda, \Gamma] = \Omega[\Lambda, \Gamma] + [\Omega, \Gamma]\Lambda$$
Since PB obey similar rules (Exercise 2.7.1) except for the lack of emphasis on ordering of the classical variables, it turns out that if
$$\{\omega(x, p), \lambda(x, p)\} = \gamma(x, p)$$
then
$$[\Omega(X, P), \Lambda(X, P)] = i\hbar\,\Gamma(X, P) \tag{7.4.40}$$
except for differences arising from ordering ambiguities; hence the formal similarity between classical and quantum mechanics, first encountered in Chapter 6.
Although the new form of Postulate II provides a general, basis-independent specification of the quantum operators corresponding to classical variables, that is to say for "quantizing," in practice one typically works in the X basis and also ignores the latitude in the choice of $P_i$, and sticks to the traditional one, $P_i = -i\hbar\, \partial/\partial x_i$, which leads to the simplest differential equations. The solution to the oscillator problem, given just the commutation relations (and a little help from Dirac) is atypical.
Exercise 7.4.1.* Compute the matrix elements of X and P in the $|n\rangle$ basis and compare with the result from Exercise 7.3.4.
Exercise 7.4.2.* Find $\langle X\rangle$, $\langle P\rangle$, $\langle X^2\rangle$, $\langle P^2\rangle$, $\Delta X \cdot \Delta P$ in the state $|n\rangle$.
Exercise 7.4.3.* (Virial Theorem). The virial theorem in classical mechanics states that for a particle bound by a potential $V(r) = ar^k$, the average (over the orbit) kinetic and potential energies are related by
$$\langle T\rangle = c(k)\langle V\rangle$$
where c(k) depends only on k. Show that $c(k) = k/2$ by considering a circular orbit. Using the results from the previous exercise show that for the oscillator (k = 2)
$$\langle T\rangle = \langle V\rangle$$
in the quantum state $|n\rangle$.
Exercise 7.4.4. Show that $\langle n|X^4|n\rangle = (\hbar/2m\omega)^2[3 + 6n(n + 1)]$.
Exercise 7.4.5.* At t = 0 a particle starts out in $|\psi(0)\rangle = \frac{1}{2^{1/2}}(|0\rangle + |1\rangle)$. (1) Find $|\psi(t)\rangle$; (2) find $\langle X(0)\rangle = \langle\psi(0)|X|\psi(0)\rangle$, $\langle P(0)\rangle$, $\langle X(t)\rangle$, $\langle P(t)\rangle$; (3) find $\langle\dot{X}(t)\rangle$ and $\langle\dot{P}(t)\rangle$ using Ehrenfest's theorem and solve for $\langle X(t)\rangle$ and $\langle P(t)\rangle$ and compare with part (2).
Exercise 7.4.6.* Show that $\langle a(t)\rangle = e^{-i\omega t}\langle a(0)\rangle$ and that $\langle a^\dagger(t)\rangle = e^{i\omega t}\langle a^\dagger(0)\rangle$.
Exercise 7.4.7. Verify Eq. (7.4.40) for the case
(1) $\Omega = X$, $\Lambda = X^2 + P^2$
(2) $\Omega = X^2$, $\Lambda = P^2$
The second case illustrates the ordering ambiguity.
Exercise 7.4.8.* Consider the three angular momentum variables in classical mechanics:
$$l_x = yp_z - zp_y$$
$$l_y = zp_x - xp_z$$
$$l_z = xp_y - yp_x$$
(1) Construct $L_x$, $L_y$, and $L_z$, the quantum counterparts, and note that there are no ordering ambiguities.
(2) Verify that $\{l_x, l_y\} = l_z$ [see Eq. (2.7.3) for the definition of the PB].
(3) Verify that $[L_x, L_y] = i\hbar L_z$.
Exercise 7.4.9 (Important). Consider the unconventional (but fully acceptable) operator choice
$$X \to x$$
$$P \to -i\hbar \frac{d}{dx} + f(x)$$
in the X basis.
(1) Verify that the canonical commutation relation is satisfied.
(2) It is possible to interpret the change in the operator assignment as a result of a unitary change of the X basis:
$$|x\rangle \to |\tilde{x}\rangle = e^{ig(X)/\hbar}|x\rangle = e^{ig(x)/\hbar}|x\rangle$$
where
$$g(x) = \int^x f(x')\, dx'$$
First verify that
$$\langle\tilde{x}|X|\tilde{x}'\rangle = x\,\delta(x - x')$$
i.e.,
$$X \xrightarrow[\text{new X basis}]{} x$$
Next verify that
$$\langle\tilde{x}|P|\tilde{x}'\rangle = \left[-i\hbar \frac{d}{dx} + f(x)\right]\delta(x - x')$$
i.e.,
$$P \xrightarrow[\text{new X basis}]{} -i\hbar \frac{d}{dx} + f(x)$$
This exercise teaches us that the "X basis" is not unique; given a basis $|x\rangle$, we can get another $|\tilde{x}\rangle$ by multiplying by a phase factor which changes neither the norm nor the orthogonality. The matrix elements of P change with f, the standard choice corresponding to f = 0. Since the presence of f is related to a change of basis, the invariance of the physics under a change in f (from zero to nonzero) follows. What is novel here is that we are changing from one X basis to another X basis rather than to some other $\Omega$ basis. Another lesson to remember is that two different differential operators $\omega(x, -i\hbar\, d/dx)$ and $\omega(x, -i\hbar\, d/dx + f)$ can have the same eigenvalues and a one-to-one correspondence between their eigenfunctions, since they both represent the same abstract operator $\Omega(X, P)$.
Exercise 7.4.10.* Recall that we always quantize a system by promoting the Cartesian coordinates $x_1, \ldots, x_N$ and momenta $p_1, \ldots, p_N$ to operators obeying the canonical commutation rules. If non-Cartesian coordinates seem more natural in some cases, such as the eigenvalue problem of a Hamiltonian with spherical symmetry, we first set up the differential equation in Cartesian coordinates and then change to spherical coordinates (Section 4.2). In Section 4.2 it was pointed out that if $\mathcal{H}$ is written in terms of non-Cartesian but canonical coordinates $q_1, \ldots, q_N$; $p_1, \ldots, p_N$; $\mathcal{H}(q_i \to q_i, p_i \to -i\hbar\, \partial/\partial q_i)$ does not generate the correct Hamiltonian H, even though the operator assignment satisfies the canonical commutation rules. In this section we revisit this problem in order to explain some of the subtleties arising in the direct quantization of non-Cartesian coordinates without the use of Cartesian coordinates in intermediate stages.
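(1) Consider a particle in two dimensions with
$$\mathcal{H} = \frac{p_x^2 + p_y^2}{2m} + a(x^2 + y^2)^{1/2}$$
which leads to
$$H \to \frac{-\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right) + a(x^2 + y^2)^{1/2}$$
in the coordinate basis. Since the problem has rotational symmetry we use polar coordinates
$$\rho = (x^2 + y^2)^{1/2}, \qquad \phi = \tan^{-1}(y/x)$$
in terms of which
$$H \xrightarrow[\text{coordinate basis}]{} \frac{-\hbar^2}{2m}\left(\frac{\partial^2}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2}{\partial\phi^2}\right) + a\rho \tag{7.4.41}$$
Since $\rho$ and $\phi$ are not mixed up as x and y are [in the $(x^2 + y^2)^{1/2}$ term] the polar version can be more readily solved.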
The question we address is the following: why not start with $\mathcal{H}$ expressed in terms of polar coordinates and the conjugate momenta
$$p_\rho = \mathbf{e}_\rho \cdot \mathbf{p} = \frac{xp_x + yp_y}{(x^2 + y^2)^{1/2}}$$
(where $\mathbf{e}_\rho$ is the unit vector in the radial direction), and
$$p_\phi = xp_y - yp_x \qquad (\text{the angular momentum, also called } l_z)$$
i.e.,
$$\mathcal{H} = \frac{p_\rho^2}{2m} + \frac{p_\phi^2}{2m\rho^2} + a\rho \qquad (\text{verify this})$$
and directly promote all classical variables $\rho$, $p_\rho$, $\phi$, and $p_\phi$ to quantum operators obeying the canonical commutation rules? Let's do it and see what happens. If we choose operators
$$P_\rho \to -i\hbar \frac{\partial}{\partial\rho}$$
$$P_\phi \to -i\hbar \frac{\partial}{\partial\phi}$$
that obey the commutation rules, we end up with
$$H \xrightarrow[\text{coordinate basis}]{} \frac{-\hbar^2}{2m}\left(\frac{\partial^2}{\partial\rho^2} + \frac{1}{\rho^2}\frac{\partial^2}{\partial\phi^2}\right) + a\rho \tag{7.4.42}$$
which disagrees with Eq. (7.4.41). Now this in itself is not serious, for as seen in the last exercise the same physics may be hidden in two different equations. In the present case this isn't true: as we will see, the Hamiltonians in Eqs. (7.4.41) and (7.4.42) do not have the same eigenvalues.‡ We know Eq. (7.4.41) is the correct one, since the quantization procedure in terms of Cartesian coordinates has empirical support. What do we do now?
(2) A way out is suggested by the fact that although the choice $P_\rho \to -i\hbar\, \partial/\partial\rho$ leads to the correct commutation rule, it is not Hermitian! Verify that
$$\langle\psi_1|P_\rho|\psi_2\rangle = \int_0^\infty \int_0^{2\pi} \psi_1^* \left(-i\hbar \frac{\partial\psi_2}{\partial\rho}\right) \rho\, d\rho\, d\phi \ne \int_0^\infty \int_0^{2\pi} \left(-i\hbar \frac{\partial\psi_1}{\partial\rho}\right)^* \psi_2\, \rho\, d\rho\, d\phi = \langle P_\rho\psi_1|\psi_2\rangle$$
(You may assume $\rho\psi_1^*\psi_2 \to 0$ as $\rho \to 0$ or $\infty$. The problem comes from the fact that $\rho\, d\rho\, d\phi$ and not $d\rho\, d\phi$ is the measure for integration.)
‡ What we will see is that $P_\rho = -i\hbar\, d/d\rho$, and hence the H constructed with it, are non-Hermitian.
Show, however, that
$$P_\rho \to -i\hbar\left(\frac{\partial}{\partial\rho} + \frac{1}{2\rho}\right) \tag{7.4.43}$$
is indeed Hermitian and also satisfies the canonical commutation rule. The angular momentum $P_\phi \to -i\hbar\, \partial/\partial\phi$ is Hermitian, as it stands, on single-valued functions: $\psi(\rho, \phi) = \psi(\rho, \phi + 2\pi)$.
(3) In the Cartesian case we saw that adding an arbitrary f(x) to $-i\hbar\, \partial/\partial x$ didn't have any physical effect, whereas here the addition of a function of $\rho$ to $-i\hbar\, \partial/\partial\rho$ seems important. Why? [Is f(x) completely arbitrary? Mustn't it be real? Why? Is the same true for the $-i\hbar/2\rho$ piece?]
(4) Feed in the new momentum operator $P_\rho$ and show that
$$H \xrightarrow[\text{coordinate basis}]{} \frac{-\hbar^2}{2m}\left(\frac{\partial^2}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial}{\partial\rho} - \frac{1}{4\rho^2} + \frac{1}{\rho^2}\frac{\partial^2}{\partial\phi^2}\right) + a\rho$$
which still disagrees with Eq. (7.4.41). We have satisfied the commutation rules, chosen Hermitian operators, and yet do not get the right quantum Hamiltonian. The key to the mystery lies in the fact that $\mathcal{H}$ doesn't determine H uniquely since terms of order $\hbar$ (or higher) may be present in H but absent in $\mathcal{H}$. While this ambiguity is present even in the Cartesian case, it is resolved by symmetrization in all interesting cases. With non-Cartesian coordinates the ambiguity is more severe. There are ways of constructing H given $\mathcal{H}$ (the path integral formulation suggests one) such that the substitution $P_\rho \to -i\hbar(\partial/\partial\rho + 1/2\rho)$ leads to Eq. (7.4.41). In the present case the quantum Hamiltonian corresponding to
$$\mathcal{H} = \frac{p_\rho^2}{2m} + \frac{p_\phi^2}{2m\rho^2} + a\rho$$
is given by
$$H \xrightarrow[\text{coordinate basis}]{} \mathcal{H}\left(\rho \to \rho;\; p_\rho \to -i\hbar\left[\frac{\partial}{\partial\rho} + \frac{1}{2\rho}\right];\; \phi \to \phi;\; p_\phi \to -i\hbar\frac{\partial}{\partial\phi}\right) - \frac{\hbar^2}{8m\rho^2} \tag{7.4.44}$$
Notice that the additional term is indeed of nonzero order in $\hbar$.
We will not get into a discussion of these prescriptions for generating H since they finally reproduce results more readily available in the approach we are adopting.
7.5. Passage from the Energy Basis to the X Basis
It was remarked in the last section that although the $|n\rangle$ basis was ideally suited for evaluating the matrix elements of operators between oscillator eigenstates, the amplitude for finding the particle in a state $|n\rangle$ at the point x could not be readily computed: it seemed as if one had to find the eigenkets $|x\rangle$ of the operator X [Eq. (7.4.32)] and then take the inner product $\langle x|n\rangle$. But there is a more direct way to get $\psi_n(x) = \langle x|n\rangle$.
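We start by projecting the equation defining the ground state of the oscillator
$$a|0\rangle = 0 \tag{7.5.1}$$
on the X basis:
$$|0\rangle \to \langle x|0\rangle = \psi_0(x)$$
$$a = \left(\frac{m\omega}{2\hbar}\right)^{1/2} X + i\left(\frac{1}{2m\omega\hbar}\right)^{1/2} P \to \left(\frac{m\omega}{2\hbar}\right)^{1/2} x + \left(\frac{\hbar}{2m\omega}\right)^{1/2} \frac{d}{dx} \tag{7.5.2}$$
In terms of $y = (m\omega/\hbar)^{1/2} x$,
$$a = \frac{1}{2^{1/2}}\left(y + \frac{d}{dy}\right) \tag{7.5.3}$$
For later use we also note that (since d/dy is anti-Hermitian),
$$a^\dagger = \frac{1}{2^{1/2}}\left(y - \frac{d}{dy}\right) \tag{7.5.4}$$
In the X basis Eq. (7.5.1) then becomes
$$\left(y + \frac{d}{dy}\right)\psi_0(y) = 0 \tag{7.5.5}$$
or
$$\frac{d\psi_0(y)}{\psi_0(y)} = -y\, dy$$
or
$$\psi_0(y) = A_0\, e^{-y^2/2}$$
or
$$\psi_0(x) = A_0 \exp\left(-\frac{m\omega x^2}{2\hbar}\right)$$
or (upon normalizing)
$$\psi_0(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \exp\left(-\frac{m\omega x^2}{2\hbar}\right) \tag{7.5.6}$$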
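By projecting the equation
$$|n\rangle = \frac{(a^\dagger)^n}{(n!)^{1/2}}|0\rangle$$
onto the X basis, we get the normalized eigenfunctions
$$\langle x|n\rangle = \psi_n\left[x = \left(\frac{\hbar}{m\omega}\right)^{1/2} y\right] = \frac{1}{(n!)^{1/2}}\left[\frac{1}{2^{1/2}}\left(y - \frac{d}{dy}\right)\right]^n \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} e^{-y^2/2} \tag{7.5.7}$$
A comparison of the above result with Eq. (7.3.22) shows that
$$H_n(y) = e^{y^2/2}\left(y - \frac{d}{dy}\right)^n e^{-y^2/2} \tag{7.5.8}$$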
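Equation (7.5.8) can be turned into a short polynomial recursion. Acting with $(y - d/dy)$ on $p(y)\,e^{-y^2/2}$ gives $(2yp - p')\,e^{-y^2/2}$, so each application of the raising operator maps the polynomial p to $2yp - p'$. The sketch below (an illustration; the function name `hermite_from_ladder` is hypothetical) builds $H_n$ this way and compares it with NumPy's own Hermite coefficients.

```python
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.hermite import herm2poly

def hermite_from_ladder(n):
    """Return H_n as a power series by applying p -> 2*y*p - p' n times to p = 1."""
    y = Polynomial([0.0, 1.0])             # the polynomial "y"
    p = Polynomial([1.0])                  # H_0 = 1
    for _ in range(n):
        p = 2 * y * p - p.deriv()
    return p

for n in range(5):
    ladder = hermite_from_ladder(n).coef
    reference = herm2poly([0] * n + [1])   # power-series coefficients of H_n
    print(n, ladder, np.allclose(ladder, reference))
```

Coefficients are listed from the constant term upward, so for n = 2 the array represents $4y^2 - 2$.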
We now conclude our rather lengthy discussion of the oscillator. If you understand
this chapter thoroughly, you should have a good grasp of how quantum mechanics
works.
Exercise 7.5.1. Project Eq. (7.5.1) on the P basis and obtain $\psi_0(p)$.
Exercise 7.5.2. Project the relation
$$a|n\rangle = n^{1/2}|n - 1\rangle$$
on the X basis and derive the recursion relation
$$H_n'(y) = 2nH_{n-1}(y)$$
using Eq. (7.3.22).
Exercise 7.5.3. Starting with
$$a + a^\dagger = 2^{1/2} y$$
and
$$(a + a^\dagger)|n\rangle = n^{1/2}|n - 1\rangle + (n + 1)^{1/2}|n + 1\rangle$$
and Eq. (7.3.22), derive the relation
$$H_{n+1}(y) = 2yH_n(y) - 2nH_{n-1}(y)$$
Exercise 7.5.4.* Thermodynamics of Oscillators. The Boltzmann formula
$$P(i) = e^{-\beta E(i)}/Z$$
where
$$Z = \sum_i e^{-\beta E(i)}$$
gives the probability of finding a system in a state i with energy E(i), when it is in thermal equilibrium with a reservoir of absolute temperature $T = 1/\beta k$, $k \simeq 1.4 \times 10^{-16}$ ergs/K being Boltzmann's constant. (The "probability" referred to above is in relation to a classical ensemble of similar systems and has nothing to do with quantum mechanics.)
(1) Show that the thermal average of the system's energy is
$$\bar{E} = \sum_i E(i)P(i) = -\frac{\partial}{\partial\beta} \ln Z$$
(2) Let the system be a classical oscillator. The index i is now continuous and corresponds to the variables x and p describing the state of the oscillator, i.e.,
$$i \to x, p$$
and
$$\sum_i \to \iint dx\, dp$$
and
$$E(i) \to E(x, p) = \frac{p^2}{2m} + \frac{1}{2} m\omega^2 x^2$$
Show that
$$Z_{cl} = \left(\frac{2\pi}{\beta m\omega^2}\right)^{1/2} \left(\frac{2\pi m}{\beta}\right)^{1/2} = \frac{2\pi}{\omega\beta}$$
and that
$$\bar{E}_{cl} = \frac{1}{\beta} = kT$$
Note that $\bar{E}_{cl}$ is independent of m and $\omega$.
(3) For the quantum oscillator the quantum number n plays the role of the index i. Show that
$$Z_{qu} = e^{-\beta\hbar\omega/2}(1 - e^{-\beta\hbar\omega})^{-1}$$
and
$$\bar{E}_{qu} = \hbar\omega\left(\frac{1}{2} + \frac{1}{e^{\beta\hbar\omega} - 1}\right)$$
(4) It is intuitively clear that as the temperature T increases (and $\beta = 1/kT$ decreases) the oscillator will get more and more excited and eventually (from the correspondence principle)
$$\bar{E}_{qu} \xrightarrow[T \to \infty]{} \bar{E}_{cl}$$
Verify that this is indeed true and show that "large T" means $T \gg \hbar\omega/k$.
(5) Consider a crystal with $N_0$ atoms, which, for small oscillations, is equivalent to $3N_0$ decoupled oscillators. The mean thermal energy of the crystal $\bar{E}_{crystal}$ is $\bar{E}_{cl}$ or $\bar{E}_{qu}$ summed over all the normal modes. Show that if the oscillators are treated classically, the specific heat per atom is
$$C_{cl}(T) = \frac{1}{N_0}\frac{\partial\bar{E}_{crystal}}{\partial T} = 3k$$
which is independent of T and the parameters of the oscillators and hence the same for all crystals.‡ This agrees with experiment at high temperatures but not as $T \to 0$. Empirically,
$$C(T) \to 3k \qquad (T \text{ large})$$
$$C(T) \to 0 \qquad (T \to 0)$$
Following Einstein, treat the oscillators quantum mechanically, assuming for simplicity that they all have the same frequency $\omega$. Show that
$$C_{qu}(T) = 3k\left(\frac{\theta_E}{T}\right)^2 \frac{e^{\theta_E/T}}{(e^{\theta_E/T} - 1)^2}$$
where $\theta_E = \hbar\omega/k$ is called the Einstein temperature and varies from crystal to crystal. Show that
$$C_{qu}(T) \xrightarrow[T \gg \theta_E]{} 3k$$
$$C_{qu}(T) \xrightarrow[T \ll \theta_E]{} 3k\left(\frac{\theta_E}{T}\right)^2 e^{-\theta_E/T}$$
Although $C_{qu}(T) \to 0$ as $T \to 0$, the exponential falloff disagrees with the observed $C(T) \to T^3$ behavior. This discrepancy arises from assuming that the frequencies of all normal modes are equal, which is of course not generally true. [Recall that in the case of two coupled masses we get $\omega_I = (k/m)^{1/2}$ and $\omega_{II} = (3k/m)^{1/2}$.] This discrepancy was eliminated by Debye.
But Einstein's simple picture by itself is remarkably successful (see Fig. 7.3).
‡ More precisely, for crystals whose atoms behave as point particles with no internal degrees of freedom.
Figure 7.3. Comparison of experiment with Einstein's theory for the specific heat in the case of diamond. ($\theta_E$ is chosen to be 1320 K.) The vertical axis is C(T) in units of k (from 0 to 3k); the horizontal axis is $T/\theta_E$ (from 0 to 1.0).
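The limiting behavior of Einstein's formula is easy to see numerically. The sketch below simply evaluates the expression for $C_{qu}(T)$ quoted in the exercise, in units of 3k and with T measured in units of $\theta_E$; the function name and the sample temperatures are illustrative choices, not part of the text.

```python
import numpy as np

def einstein_heat_capacity(t_over_theta):
    """C_qu(T) / 3k in the Einstein model, with T in units of theta_E."""
    u = 1.0 / np.asarray(t_over_theta, dtype=float)    # u = theta_E / T
    return u**2 * np.exp(u) / np.expm1(u)**2            # expm1(u) = e^u - 1

high_t = einstein_heat_capacity(50.0)      # T >> theta_E: classical value
low_t = einstein_heat_capacity(0.05)       # T << theta_E: exponential falloff
low_t_limit = (1 / 0.05)**2 * np.exp(-1 / 0.05)

print(high_t)                              # close to 1, i.e. C close to 3k
print(low_t / low_t_limit)                 # close to 1 in the low-T regime
```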
The Path Integral Formulation
of Quantum Theory
We consider here an alternate formulation of quantum mechanics invented by Feynman in the forties.‡ In contrast to the Schrödinger formulation, which stems from Hamiltonian mechanics, the Feynman formulation is tied to the Lagrangian formulation of mechanics. Although we are committed to the former approach, we discuss in this chapter Feynman's alternative, not only because of its aesthetic value, but also because it can, in a class of problems, give the full propagator with tremendous ease and also give valuable insight into the relation between classical and quantum mechanics.
8.1. The Path Integral Recipe
We have already seen that the quantum problem is fully solved once the propagator is known. Thus far our practice has been to first find the eigenvalues and eigenfunctions of H, and then express the propagator U(t) in terms of these. In the path integral approach one computes U(t) directly. For a single particle in one dimension, the procedure is the following.
To find U(x, t; x', t'):
(1) Draw all paths in the x-t plane connecting (x', t') and (x, t) (see Fig. 8.1).
(2) Find the action S[x(t)] for each path x(t).
(3) $$U(x, t; x', t') = A \sum_{\text{all paths}} e^{iS[x(t)]/\hbar} \tag{8.1.1}$$
where A is an overall normalization factor.
‡ The nineteen forties that is, and in his twenties. An interesting account of how he was influenced by Dirac's work in the same direction may be found in his Nobel lectures. See Nobel Lectures: Physics, Vol. III, Elsevier Publication, New York (1972).
Figure 8.1. Some of the paths that contribute to the propagator. The contribution from the path x(t) is $Z = \exp\{iS[x(t)]/\hbar\}$.
8.2. Analysis of the Recipe
Let us analyze the above recipe, postponing for a while the proof that it reproduces conventional quantum mechanics. The most surprising thing about it is the fact that every path, including the classical path, $x_{cl}(t)$, gets the same weight, that is to say, a number of unit modulus. How are we going to regain classical mechanics in the appropriate limit if the classical path does not seem favored in any way?
To understand this we must perform the sum in Eq. (8.1.1). Now, the correct way to sum over all the paths, that is to say, path integration, is quite complicated and we will discuss it later. For the present let us take the heuristic approach. Let us first pretend that the continuum of paths linking the end points is actually a discrete set. A few paths in the set are shown in Fig. 8.1.
We have to add the contributions $Z_\alpha = e^{iS[x_\alpha(t)]/\hbar}$ from each path $x_\alpha(t)$. This summation is done schematically in Fig. 8.2. Since each path has a different action, it contributes with a different phase, and the contributions from the paths essentially cancel each other, until we come near the classical path. Since S is stationary here, the Z's add constructively and produce a large sum. As we move away from $x_{cl}(t)$, destructive interference sets in once again. It is clear from the figure that U(t) is dominated by the paths near $x_{cl}(t)$. Thus the classical path is important, not because it contributes a lot by itself, but because in its vicinity the paths contribute coherently.
How far must we deviate from xd before destructive interference sets in? One
may say crudely that coherence is lost once the phase differs from the stationary
value S[xd (t)]/h_E.Scilh by about tr. This in turn means that the action for the
coherence paths must be within h/r of S 1 . For a macroscopic particle this means a
very tight constraint on its path, since Sd is typically 1 erg sec 1027h, while for
in electron there is quite a bit of latitude. Consider the following example. A free
particle leaves the origin at t = 0 and arrives at x = 1 cm at t= 1 second. The classical
path is
x=t (8.2.1)
Figure 8.2. Schematic representation of the sum $\sum_a Z_a$. Paths near $x_{cl}(t)$ contribute coherently since $S$ is stationary there, while others cancel each other out and may be ignored in the first approximation when we calculate $U(t)$.
Figure 8.3. Two possible paths connecting (0, 0) and (1, 1). The action on the classical path $x = t$ is $m/2$, while on the other, it is $2m/3$.
Consider another path
$x = t^2$   (8.2.2)
which also links the two space-time points (Fig. 8.3).
For a classical particle, of mass, say 1 g, the action changes by roughly
$1.6 \times 10^{26}\hbar$, and the phase by roughly $1.6 \times 10^{26}$ rad as we move from the classical
path $x = t$ to the nonclassical path $x = t^2$. We may therefore completely ignore the
nonclassical path. On the other hand, for an electron whose mass is $\simeq 10^{-27}$ g, $\delta S \simeq \hbar/6$
and the phase change is just around a sixth of a radian, which is well within the
coherence range $\delta S/\hbar \lesssim \pi$. It is in such cases that assuming that the particle moves
along a well-defined trajectory, $x_{cl}(t)$, leads to conflict with experiment.
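The numbers quoted in this example can be reproduced with a few lines of code. The following is only an illustrative sketch: the function names are made up for this purpose, the CGS value of $\hbar$ is supplied by hand, and the action integral is evaluated with a simple midpoint rule rather than analytically.

```python
HBAR = 1.054571817e-27  # hbar in erg s (CGS), supplied here as an assumption


def free_action(xdot, mass, t0=0.0, t1=1.0, steps=20_000):
    """Midpoint-rule estimate of the free-particle action, the integral of (1/2) m xdot(t)^2."""
    dt = (t1 - t0) / steps
    total = sum(0.5 * mass * xdot(t0 + (k + 0.5) * dt) ** 2 for k in range(steps))
    return total * dt


def phase_difference(mass):
    """(S[x = t^2] - S[x = t]) / hbar for a free particle of the given mass in grams."""
    s_classical = free_action(lambda t: 1.0, mass)   # x = t has velocity 1 cm/s
    s_other = free_action(lambda t: 2.0 * t, mass)   # x = t^2 has velocity 2t
    return (s_other - s_classical) / HBAR


print(phase_difference(1.0))      # macroscopic particle of mass 1 g
print(phase_difference(1.0e-27))  # electron, taking m ~ 1e-27 g as in the text
```

The first call returns a phase difference of order $10^{26}$ rad and the second a phase difference of about a sixth of a radian, in line with the estimates above.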
8.3. An Approximation to U(t) for a Free Particle
Our previous discussions have indicated that, to an excellent approximation, we
may ignore all but the classical path and its neighbors in calculating U(t). Assuming
that each of these paths contributes the same amount $\exp(iS_{cl}/\hbar)$, since $S$ is stationary, we get
$U(t) = A' e^{iS_{cl}/\hbar}$   (8.3.1)
where A' is some normalizing factor which "measures" the number of paths in the
coherent range. Let us find U(t) for a free particle in this approximation and compare
the result with the exact result, Eq. (5.1.10).
The classical path for a free particle is just a straight line in the $x$-$t$ plane:
$x_{cl}(t'') = x' + \frac{x - x'}{t - t'}(t'' - t')$   (8.3.2)
corresponding to motion with uniform velocity $v = (x - x')/(t - t')$. Since $\mathcal{L} = mv^2/2$ is a constant,
$S_{cl} = \int_{t'}^{t} \mathcal{L}\, dt'' = \frac{1}{2} m \frac{(x - x')^2}{t - t'}$
so that
$U(x, t; x', t') = A' \exp\left[\frac{im(x - x')^2}{2\hbar(t - t')}\right]$   (8.3.3)
To find $A'$, we use the fact that as $t - t'$ tends to 0, $U$ must tend to $\delta(x - x')$.
Comparing Eq. (8.3.3) to the representation of the delta function encountered in
Section 1.10 (see footnote on page 61),
$\delta(x - x') \equiv \lim_{\Delta \to 0} \frac{1}{(\pi\Delta^2)^{1/2}} \exp\left[-\frac{(x - x')^2}{\Delta^2}\right]$
(valid even if $\Delta$ is imaginary) we get
$A' = \left[\frac{m}{2\pi\hbar i(t - t')}\right]^{1/2}$
so that
$U(x, t; x', 0) \equiv U(x, t; x') = \left(\frac{m}{2\pi\hbar it}\right)^{1/2} \exp\left[\frac{im(x - x')^2}{2\hbar t}\right]$   (8.3.4)
which is the exact answer! We have managed to get the exact answer by just computing
the classical action! However, we will see in Section 8.6 that only for potentials
of the form $V = a + bx + cx^2 + d\dot{x} + ex\dot{x}$ is it true that $U(t) = A(t) e^{iS_{cl}/\hbar}$. Furthermore,
we can't generally find $A(t)$ using $U(x, 0; x') = \delta(x - x')$ since $A$ can contain an
arbitrary dimensionless function $f$ such that $f \to 1$ as $t \to 0$. Here $f \equiv 1$ because we can't
construct a nontrivial dimensionless $f$ using just $m$, $\hbar$, and $t$ (check this).
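One independent sanity check on Eq. (8.3.4) is to confirm numerically that it satisfies the free-particle Schrödinger equation, $i\hbar\,\partial U/\partial t = -(\hbar^2/2m)\,\partial^2 U/\partial x^2$, for $t > 0$. The sketch below does this with central finite differences. The function names, the units $m = \hbar = 1$, and the step size are choices made for this illustration only.

```python
import cmath


def free_propagator(x, t, x_prime=0.0, m=1.0, hbar=1.0):
    """Eq. (8.3.4): U(x, t; x') for a free particle, valid for t > 0."""
    prefactor = cmath.sqrt(m / (2j * cmath.pi * hbar * t))
    return prefactor * cmath.exp(1j * m * (x - x_prime) ** 2 / (2 * hbar * t))


def schrodinger_residual(x, t, m=1.0, hbar=1.0, h=1e-3):
    """|i hbar dU/dt + (hbar^2 / 2m) d^2U/dx^2|, estimated by central differences."""
    def u(xx, tt):
        return free_propagator(xx, tt, m=m, hbar=hbar)

    du_dt = (u(x, t + h) - u(x, t - h)) / (2 * h)
    d2u_dx2 = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h ** 2
    return abs(1j * hbar * du_dt + hbar ** 2 / (2 * m) * d2u_dx2)


# The residual should be tiny compared with |U| itself.
print(abs(free_propagator(0.7, 1.3)), schrodinger_residual(0.7, 1.3))
```

The residual is limited only by the finite-difference step, which is what one expects if Eq. (8.3.4) is an exact solution.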
8.4. Path Integral Evaluation of the FreeParticle Propagator
Although our heuristic analysis yielded the exact freeparticle propagator, we
will now repeat the calculation without any approximation to illustrate path
integration.
Consider $U(x_N, t_N; x_0, t_0)$. The peculiar labeling of the end points will be justified
later. Our problem is to perform the path integral
$\int_{x_0}^{x_N} e^{iS[x(t)]/\hbar}\, \mathcal{D}[x(t)]$   (8.4.1)
where
$\int_{x_0}^{x_N} \mathcal{D}[x(t)]$
is a symbolic way of saying "integrate over all paths connecting $x_0$ and $x_N$ (in the
interval $t_0$ and $t_N$)." Now, a path $x(t)$ is fully specified by an infinity of numbers
$x(t_0), \ldots, x(t), \ldots, x(t_N)$, namely, the values of the function $x(t)$ at every point $t$
in the interval $t_0$ to $t_N$. To sum over all paths we must integrate over all possible
values of these infinite variables, except of course $x(t_0)$ and $x(t_N)$, which will be kept
fixed at $x_0$ and $x_N$, respectively. To tackle this problem, we follow the idea that was
used in Section 1.10: we trade the function $x(t)$ for a discrete approximation which
agrees with $x(t)$ at the $N + 1$ points $t_n = t_0 + n\varepsilon$, $n = 0, \ldots, N$, where $\varepsilon = (t_N - t_0)/N$.
In this approximation each path is specified by $N + 1$ numbers $x(t_0), x(t_1), \ldots, x(t_N)$.
The gaps in the discrete function are interpolated by straight lines. One such path
is shown in Fig. 8.4. We hope that if we take the limit $N \to \infty$ at the end we will get
a result that is insensitive to these approximations.‡ Now that the paths have been
discretized, we must also do the same to the action integral. We replace the continuous
path definition
$S = \int_{t_0}^{t_N} \mathcal{L}(t)\, dt = \int_{t_0}^{t_N} \frac{1}{2} m\dot{x}^2\, dt$
by
$S = \sum_{i=0}^{N-1} \frac{m}{2} \left(\frac{x_{i+1} - x_i}{\varepsilon}\right)^2 \varepsilon$   (8.4.2)
where $x_i = x(t_i)$. We wish to calculate
$U(x_N, t_N; x_0, t_0) = \int_{x_0}^{x_N} \exp\{iS[x(t)]/\hbar\}\, \mathcal{D}[x(t)]$
$= \lim_{N \to \infty,\ \varepsilon \to 0} A \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\left[\frac{i}{\hbar} \frac{m}{2} \sum_{i=0}^{N-1} \frac{(x_{i+1} - x_i)^2}{\varepsilon}\right] dx_1 \cdots dx_{N-1}$   (8.4.3)
‡ We expect that the abrupt changes in velocity at the points $t_0 + n\varepsilon$ that arise due to our approximation
will not matter because $\mathcal{L}$ does not depend on the acceleration or higher derivatives.
Figure 8.4. The discrete approximation to a path $x(t)$. Each path is specified by $N - 1$ numbers $x(t_1), \ldots, x(t_{N-1})$. To sum over paths we must integrate each $x_i$ from $-\infty$ to $+\infty$. Once all integrations are done, we can take the limit $N \to \infty$.
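It is implicit in the above that $x_0$ and $x_N$ have the values we have chosen at the
outset. The factor $A$ in the front is to be chosen at the end such that we get the
correct scale for $U$ when the limit $N \to \infty$ is taken.
Let us first switch to the variables
$y_i = \left(\frac{m}{2\hbar\varepsilon}\right)^{1/2} x_i$
We then want
$\lim_{N \to \infty} A' \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\left[-\sum_{i=0}^{N-1} \frac{(y_{i+1} - y_i)^2}{i}\right] dy_1 \cdots dy_{N-1}$   (8.4.4)
(the $i$ in the denominator is the imaginary unit, not the summation index), where
$A' = A \left(\frac{2\hbar\varepsilon}{m}\right)^{(N-1)/2}$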
Although the multiple integral looks formidable, it is not. Let us begin by doing the
$y_1$ integration. Considering just the part of the integrand that involves $y_1$, we get
$\int_{-\infty}^{\infty} \exp\left\{-\frac{1}{i}\left[(y_2 - y_1)^2 + (y_1 - y_0)^2\right]\right\} dy_1 = \left(\frac{i\pi}{2}\right)^{1/2} e^{-(y_2 - y_0)^2/2i}$   (8.4.5)
Consider next the integration over $y_2$. Bringing in the part of the integrand involving
$y_2$ and combining it with the result above we compute next
$\left(\frac{i\pi}{2}\right)^{1/2} \int_{-\infty}^{\infty} e^{-(y_3 - y_2)^2/i}\, e^{-(y_2 - y_0)^2/2i}\, dy_2$
$= \left(\frac{i\pi}{2}\right)^{1/2} e^{-(2y_3^2 + y_0^2)/2i} \left(\frac{2\pi i}{3}\right)^{1/2} e^{(y_0 + 2y_3)^2/6i}$
$= \left[\frac{(i\pi)^2}{3}\right]^{1/2} e^{-(y_3 - y_0)^2/3i}$   (8.4.6)
By comparing this result to the one from the $y_1$ integration, we deduce the pattern:
if we carry out this process $N - 1$ times so as to evaluate the integral in Eq. (8.4.4),
it will become
$\frac{(i\pi)^{(N-1)/2}}{N^{1/2}}\, e^{-(y_N - y_0)^2/Ni}$
or
$\frac{(i\pi)^{(N-1)/2}}{N^{1/2}}\, e^{-m(x_N - x_0)^2/2\hbar\varepsilon Ni}$
Bringing in the factor $A(2\hbar\varepsilon/m)^{(N-1)/2}$ from up front, we get
$U = A \left(\frac{2\pi\hbar\varepsilon i}{m}\right)^{N/2} \left(\frac{m}{2\pi\hbar iN\varepsilon}\right)^{1/2} \exp\left[\frac{im(x_N - x_0)^2}{2\hbar N\varepsilon}\right]$
If we now let $N \to \infty$, $\varepsilon \to 0$, $N\varepsilon \to t_N - t_0$, we get the right answer provided
$A = \left[\frac{2\pi\hbar\varepsilon i}{m}\right]^{-N/2} \equiv B^{-N}$   (8.4.7)
It is conventional to associate a factor $1/B$ with each of the $N - 1$ integrations and
the remaining factor $1/B$ with the overall process. In other words, we have just learnt
that the precise meaning of the statement "integrate over all paths" is
$\int \mathcal{D}[x(t)] = \lim_{\varepsilon \to 0,\ N \to \infty} \frac{1}{B} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \frac{dx_1}{B} \cdot \frac{dx_2}{B} \cdots \frac{dx_{N-1}}{B}$
where
$B = \left(\frac{2\pi\hbar\varepsilon i}{m}\right)^{1/2}$   (8.4.8)
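The pattern deduced from Eqs. (8.4.5) and (8.4.6) can be checked numerically. The oscillatory integrals themselves are awkward to evaluate on a grid, so the sketch below makes an explicit substitution that is not part of the text's derivation: it replaces the imaginary unit $i$ by 1, turning each factor into a decaying Gaussian. With that replacement the predicted result of the $N - 1$ integrations is $\pi^{(N-1)/2} N^{-1/2} e^{-(y_N - y_0)^2/N}$. The function names and grid parameters are illustrative assumptions.

```python
import math


def chain_integral(y0, yN, N, half_width=10.0, n=401):
    """Integrate exp(-sum_k (y_{k+1} - y_k)^2) over y_1 .. y_{N-1} on a uniform grid.

    This is Eq. (8.4.4) with the imaginary unit replaced by 1 so that the
    integrand decays. Requires N >= 2.
    """
    if N < 2:
        raise ValueError("need at least one intermediate integration (N >= 2)")
    grid = [-half_width + 2 * half_width * k / (n - 1) for k in range(n)]
    h = grid[1] - grid[0]
    # f[j] holds the integrand with y_1 .. y_{k-1} already integrated out,
    # viewed as a function of y_k = grid[j]. Start with k = 1.
    f = [math.exp(-(y - y0) ** 2) for y in grid]
    for _ in range(N - 2):
        f = [h * sum(math.exp(-(y_next - y) ** 2) * fy for y, fy in zip(grid, f))
             for y_next in grid]
    # Last integration, over y_{N-1}, with the end point y_N held fixed.
    return h * sum(math.exp(-(yN - y) ** 2) * fy for y, fy in zip(grid, f))


def predicted(y0, yN, N):
    """The pattern below Eq. (8.4.6), again with i replaced by 1."""
    return math.pi ** ((N - 1) / 2) / math.sqrt(N) * math.exp(-(yN - y0) ** 2 / N)


for N in (2, 3, 5):
    print(N, chain_integral(0.3, -0.5, N), predicted(0.3, -0.5, N))
```

The two columns agree for each $N$, which supports the inductive pattern. The continuation back to the oscillatory case is the step the text justifies analytically.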
8.5. Equivalence to the Schrödinger Equation
The relation between the Schrödinger and Feynman formalisms is quite similar
to that between the Newtonian and the least action formalisms of mechanics, in that
the former approach is local in time and deals with time evolution over infinitesimal
periods while the latter is global and deals directly with propagation over finite times.
In the Schrödinger formalism, the change in the state vector $|\psi\rangle$ over an infinitesimal
time $\varepsilon$ is
$|\psi(\varepsilon)\rangle - |\psi(0)\rangle = \frac{-i\varepsilon}{\hbar} H |\psi(0)\rangle$   (8.5.1)
which becomes in the $X$ basis
$\psi(x, \varepsilon) - \psi(x, 0) = \frac{-i\varepsilon}{\hbar} \left[-\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} + V(x, 0)\right] \psi(x, 0)$   (8.5.2)
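to first order in $\varepsilon$. To compare this result with the path integral prediction to the
same order in $\varepsilon$, we begin with
$\psi(x, \varepsilon) = \int_{-\infty}^{\infty} U(x, \varepsilon; x')\, \psi(x', 0)\, dx'$   (8.5.3)
The calculation of $U(\varepsilon)$ is simplified by the fact that there is no need to do any
integrations over intermediate $x$'s since there is just one slice of time $\varepsilon$ between the
start and finish. So
$U(x, \varepsilon; x') = \left(\frac{m}{2\pi\hbar i\varepsilon}\right)^{1/2} \exp\left\{i\left[\frac{m(x - x')^2}{2\varepsilon} - \varepsilon V\left(\frac{x + x'}{2}, 0\right)\right] \Big/ \hbar\right\}$   (8.5.4)
where the $(m/2\pi\hbar i\varepsilon)^{1/2}$ factor up front is just the $1/B$ factor from Eq. (8.4.8). We
take the time argument of $V$ to be zero since there is already a factor of $\varepsilon$ before it
and any variation of $V$ with time in the interval 0 to $\varepsilon$ will produce an effect of
second order in $\varepsilon$. So
$\psi(x, \varepsilon) = \left(\frac{m}{2\pi\hbar i\varepsilon}\right)^{1/2} \int_{-\infty}^{\infty} \exp\left[\frac{im(x - x')^2}{2\varepsilon\hbar}\right] \exp\left[-\frac{i\varepsilon}{\hbar} V\left(\frac{x + x'}{2}, 0\right)\right] \psi(x', 0)\, dx'$   (8.5.5)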
Consider the factor $\exp[im(x - x')^2/2\varepsilon\hbar]$. It oscillates very rapidly as $(x - x')$ varies
since $\varepsilon$ is infinitesimal and $\hbar$ is so small. When such a rapidly oscillating function
multiplies a smooth function like $\psi(x', 0)$, the integral vanishes for the most part
due to the random phase of the exponential. Just as in the case of the path integration,
the only substantial contribution comes from the region where the phase is stationary.
In this case the only stationary point is $x = x'$, where the phase has the minimum
value of zero. In terms of $\eta = x' - x$, the region of coherence is, as before,
$\frac{m\eta^2}{2\varepsilon\hbar} \lesssim \pi$
or
$|\eta| \lesssim \left(\frac{2\varepsilon\hbar\pi}{m}\right)^{1/2}$   (8.5.6)
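Consider now
$\psi(x, \varepsilon) = \left(\frac{m}{2\pi\hbar i\varepsilon}\right)^{1/2} \int_{-\infty}^{\infty} \exp\left(\frac{im\eta^2}{2\hbar\varepsilon}\right) \exp\left[-\frac{i}{\hbar}\varepsilon V\left(x + \frac{\eta}{2}, 0\right)\right] \psi(x + \eta, 0)\, d\eta$   (8.5.7)
We will work to first order in $\varepsilon$ and therefore to second order in $\eta$ [see Eq. (8.5.6)
above]. We expand
$\psi(x + \eta, 0) = \psi(x, 0) + \eta \frac{\partial\psi}{\partial x} + \frac{\eta^2}{2} \frac{\partial^2\psi}{\partial x^2} + \cdots$
$\exp\left[-\frac{i}{\hbar}\varepsilon V\left(x + \frac{\eta}{2}, 0\right)\right] = 1 - \frac{i\varepsilon}{\hbar} V\left(x + \frac{\eta}{2}, 0\right) + \cdots = 1 - \frac{i\varepsilon}{\hbar} V(x, 0) + \cdots$
since terms of order $\eta\varepsilon$ are to be neglected. Equation (8.5.7) now becomes
$\psi(x, \varepsilon) = \left(\frac{m}{2\pi\hbar i\varepsilon}\right)^{1/2} \int_{-\infty}^{\infty} \exp\left(\frac{im\eta^2}{2\hbar\varepsilon}\right) \left[\psi(x, 0) - \frac{i\varepsilon}{\hbar} V(x, 0)\psi(x, 0) + \eta \frac{\partial\psi}{\partial x} + \frac{\eta^2}{2} \frac{\partial^2\psi}{\partial x^2}\right] d\eta$
Consulting the list of Gaussian integrals in Appendix A.2, we get
$\psi(x, \varepsilon) = \left(\frac{m}{2\pi\hbar i\varepsilon}\right)^{1/2} \left[\psi(x, 0) \left(\frac{2\pi\hbar i\varepsilon}{m}\right)^{1/2} - \frac{\hbar\varepsilon}{2im} \left(\frac{2\pi\hbar i\varepsilon}{m}\right)^{1/2} \frac{\partial^2\psi}{\partial x^2} - \frac{i\varepsilon}{\hbar} \left(\frac{2\pi\hbar i\varepsilon}{m}\right)^{1/2} V(x, 0)\psi(x, 0)\right]$
or
$\psi(x, \varepsilon) - \psi(x, 0) = \frac{-i\varepsilon}{\hbar} \left[-\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} + V(x, 0)\right] \psi(x, 0)$   (8.5.8)
which agrees with the Schrödinger prediction, Eq. (8.5.1).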
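The one-slice argument can be tested numerically, with one caveat. The kernel in Eq. (8.5.7) is purely oscillatory and hard to integrate on a grid, so the sketch below works instead with its imaginary-time analog, obtained by the substitution $\varepsilon \to -i\varepsilon$ (an assumption made here for numerical convenience, not a step in the text). The kernel then becomes a decaying Gaussian and the predicted first-order change is $\psi(x, \varepsilon) - \psi(x, 0) = -(\varepsilon/\hbar)[-(\hbar^2/2m)\partial^2/\partial x^2 + V]\psi(x, 0)$. The function names, the test wave function, and the oscillator potential are all illustrative choices.

```python
import math


def one_slice(psi, V, x, eps, m=1.0, hbar=1.0, n=2001, cutoff=10.0):
    """Imaginary-time analog of Eq. (8.5.7): integrate over eta = x' - x."""
    sigma = math.sqrt(hbar * eps / m)           # width of the Gaussian kernel
    d_eta = 2 * cutoff * sigma / (n - 1)
    total = 0.0
    for k in range(n):
        eta = -cutoff * sigma + k * d_eta
        kernel = math.exp(-m * eta ** 2 / (2 * hbar * eps))
        # Midpoint prescription: V is evaluated at x + eta / 2.
        total += kernel * math.exp(-eps * V(x + eta / 2) / hbar) * psi(x + eta)
    return math.sqrt(m / (2 * math.pi * hbar * eps)) * total * d_eta


def first_order(psi, V, x, eps, m=1.0, hbar=1.0, h=1e-3):
    """psi - (eps / hbar) H psi, with the second derivative from central differences."""
    d2psi = (psi(x + h) - 2 * psi(x) + psi(x - h)) / h ** 2
    return psi(x) + (eps / hbar) * (hbar ** 2 / (2 * m) * d2psi - V(x) * psi(x))


def psi0(x):
    return math.exp(-x * x / 2)


def oscillator(x):
    return 0.5 * x * x


eps = 1e-3
print(one_slice(psi0, oscillator, 0.7, eps), first_order(psi0, oscillator, 0.7, eps))
```

The two numbers differ only at order $\varepsilon^2$, while each differs from $\psi(x, 0)$ at order $\varepsilon$, which is the content of the comparison between Eqs. (8.5.2) and (8.5.8).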
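8.6. Potentials of the Form $V = a + bx + cx^2 + d\dot{x} + ex\dot{x}$‡
We wish to compute
$U(x, t; x') = \int_{x'}^{x} e^{iS[x(t'')]/\hbar}\, \mathcal{D}[x(t'')]$   (8.6.1)
‡ This section may be omitted without loss of continuity.
Let us write every path as
$x(t'') = x_{cl}(t'') + y(t'')$   (8.6.2)
It follows that
$\dot{x}(t'') = \dot{x}_{cl}(t'') + \dot{y}(t'')$   (8.6.3)
Since all the paths agree at the end points, $y(0) = y(t) = 0$. When we slice up the time
into $N$ parts, we have for intermediate integration variables
$x_i \equiv x(t_i'') = x_{cl}(t_i'') + y(t_i'') \equiv x_{cl}(t_i'') + y_i$
Since $x_{cl}(t_i'')$ is just some constant at $t_i''$,
$dx_i = dy_i$
and
$\int_{x'}^{x} \mathcal{D}[x(t'')] = \int_{0}^{0} \mathcal{D}[y(t'')]$   (8.6.4)
so that Eq. (8.6.1) becomes
$U(x, t; x') = \int_{0}^{0} \exp\left\{\frac{i}{\hbar} S[x_{cl}(t'') + y(t'')]\right\} \mathcal{D}[y(t'')]$   (8.6.5)
The next step is to expand the functional $S$ in a Taylor series about $x_{cl}$:
$S[x_{cl} + y] = \int_0^t \mathcal{L}(x_{cl} + y, \dot{x}_{cl} + \dot{y})\, dt''$
$= \int_0^t \left[\mathcal{L}(x_{cl}, \dot{x}_{cl}) + \left(\frac{\partial\mathcal{L}}{\partial x}\bigg|_{x_{cl}} y + \frac{\partial\mathcal{L}}{\partial\dot{x}}\bigg|_{x_{cl}} \dot{y}\right) + \frac{1}{2}\left(\frac{\partial^2\mathcal{L}}{\partial x^2}\bigg|_{x_{cl}} y^2 + 2\frac{\partial^2\mathcal{L}}{\partial x\,\partial\dot{x}}\bigg|_{x_{cl}} y\dot{y} + \frac{\partial^2\mathcal{L}}{\partial\dot{x}^2}\bigg|_{x_{cl}} \dot{y}^2\right)\right] dt''$   (8.6.6)
The series terminates here since $\mathcal{L}$ is a quadratic polynomial.
The first piece $\mathcal{L}(x_{cl}, \dot{x}_{cl})$ integrates to give $S[x_{cl}] \equiv S_{cl}$. The second piece, linear
in $y$ and $\dot{y}$, vanishes due to the classical equation of motion. In the last piece, if we
recall
$\mathcal{L} = \frac{1}{2} m\dot{x}^2 - a - bx - cx^2 - d\dot{x} - ex\dot{x}$   (8.6.7)
we get
$\frac{1}{2} \frac{\partial^2\mathcal{L}}{\partial x^2} = -c$   (8.6.8)
$\frac{\partial^2\mathcal{L}}{\partial x\,\partial\dot{x}} = -e$   (8.6.9)
$\frac{1}{2} \frac{\partial^2\mathcal{L}}{\partial\dot{x}^2} = \frac{m}{2}$   (8.6.10)
Consequently Eq. (8.6.5) becomes
$U(x, t; x') = \exp\left(\frac{iS_{cl}}{\hbar}\right) \int_{0}^{0} \exp\left[\frac{i}{\hbar} \int_0^t \left(\frac{1}{2} m\dot{y}^2 - cy^2 - ey\dot{y}\right) dt''\right] \mathcal{D}[y(t'')]$   (8.6.11)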
Since the path integral has no memory of $x_{cl}$, it can only depend on $t$. So
$U(x, t; x') = e^{iS_{cl}/\hbar} A(t)$   (8.6.12)
where $A(t)$ is some unknown function of $t$. Now if we were doing the free-particle
problem, we would get Eq. (8.6.11) with $c = e = 0$. In this case we know that [see
Eq. (8.3.4)]
$A(t) = \left(\frac{m}{2\pi\hbar it}\right)^{1/2}$   (8.6.13)
Since the coefficient $b$ does not figure in Eq. (8.6.11), it follows that the same value
of $A(t)$ corresponds to the linear potential $V = a + bx$ as well. For the harmonic
oscillator, $c = \frac{1}{2} m\omega^2$, and we have to do the integral
$A(t) = \int_{0}^{0} \exp\left[\frac{i}{\hbar} \int_0^t \frac{1}{2} m(\dot{y}^2 - \omega^2 y^2)\, dt''\right] \mathcal{D}[y(t'')]$   (8.6.14)
The evaluation of this integral is discussed in the book by Feynman and Hibbs
referred to at the end of this section. Note that even if the factor $A(t)$ in $\psi(x, t)$ is
not known, we can extract all the probabilistic information at time $t$.
Notice the ease with which the Feynman formalism yields the full propagator
in these cases. Consider in particular the horrendous alternative of finding the
eigenfunctions of the Hamiltonian and constructing from them the harmonic oscillator
propagator.
The path integral method may be extended to three dimensions without any
major qualitative differences. In particular, the form of $U$ in Eq. (8.6.12) is valid
for potentials that are at most quadratic in the coordinates and the velocities. An
interesting problem in this class is that of a particle in a uniform magnetic field. For
further details on the subject of path integral quantum mechanics, see R. P. Feynman
and A. R. Hibbs, Quantum Mechanics and Path Integrals, McGraw-Hill (1965), and
Chapter 21.
Exercise 8.6.1.* Verify that
$U(x, t; x', 0) = A(t) \exp(iS_{cl}/\hbar), \quad A(t) = \left(\frac{m}{2\pi\hbar it}\right)^{1/2}$
agrees with the exact result, Eq. (5.4.31), for $V(x) = -fx$. Hint: Start with $x_{cl}(t'') = x_0 + v_0 t'' + \frac{1}{2}(f/m)t''^2$ and find the constants $x_0$ and $v_0$ from the requirement that $x_{cl}(0) = x'$
and $x_{cl}(t) = x$.
Exercise 8.6.2. Show that for the harmonic oscillator with
$\mathcal{L} = \frac{1}{2} m\dot{x}^2 - \frac{1}{2} m\omega^2 x^2$
$U(x, t; x') = A(t) \exp\left\{\frac{im\omega}{2\hbar \sin\omega t} \left[(x^2 + x'^2) \cos\omega t - 2xx'\right]\right\}$
where $A(t)$ is an unknown function. (Recall Exercise 2.8.7.)
Exercise 8.6.3. We know that given the eigenfunctions and the eigenvalues we can construct
the propagator:
$U(x, t; x', t') = \sum_n \psi_n(x)\, \psi_n^*(x')\, e^{-iE_n(t - t')/\hbar}$   (8.6.15)
Consider the reverse process (since the path integral approach gives $U$ directly), for the case
of the oscillator.
(1) Set $x = x' = t' = 0$. Assume that $A(t) = (m\omega/2\pi i\hbar \sin\omega t)^{1/2}$ for the oscillator. By
expanding both sides of Eq. (8.6.15), you should find that $E = \hbar\omega/2, 5\hbar\omega/2, 9\hbar\omega/2, \ldots$, etc.
What happened to the levels in between?
(2) (Optional). Now consider the extraction of the eigenfunctions. Let $x = x'$ and $t' = 0$.
Find $E_0$, $E_1$, $|\psi_0(x)|^2$, and $|\psi_1(x)|^2$ by expanding in powers of $\alpha = \exp(i\omega t)$.
Exercise 8.6.4.* Recall the derivation of the Schrödinger equation (8.5.8) starting from
Eq. (8.5.4). Note that although we chose the argument of $V$ to be the midpoint $(x + x')/2$, it
did not matter very much: any choice $x + \alpha\eta$ (where $\eta = x' - x$) for $0 \le \alpha \le 1$ would have
given the same result since the difference between the choices is of order $\eta\varepsilon \simeq \varepsilon^{3/2}$. All this
was thanks to the factor $\varepsilon$ multiplying $V$ in Eq. (8.5.4) and the fact that $|\eta| \simeq \varepsilon^{1/2}$ as per
Eq. (8.5.6).
Consider now the case of a vector potential which will bring in a factor
$\exp\left[\frac{iq\varepsilon}{\hbar c} \frac{x - x'}{\varepsilon} A(x + \alpha\eta)\right] \equiv \exp\left[-\frac{iq}{\hbar c} \eta A(x + \alpha\eta)\right]$
to the propagator for one time slice. (We should really be using vectors for position and the
vector potential, but the one-dimensional version will suffice for making the point here.) Note
that $\varepsilon$ now gets canceled, in contrast to the scalar potential case. Thus, going to order $\varepsilon$ to
derive the Schrödinger equation means going to order $\eta^2$ in expanding the exponential. This
will not only bring in an $A^2$ term, but will also make the answer sensitive to the argument of
$A$ in the linear term. Choose $\alpha = 1/2$ and verify that you get the one-dimensional version of
Eq. (4.3.7). Along the way you will see that changing $\alpha$ makes an order $\varepsilon$ difference to $\psi(x, \varepsilon)$
so that we have no choice but to use $\alpha = 1/2$, i.e., use the midpoint prescription. This point
will come up in Chapter 21.
The Heisenberg
Uncertainty Relations
9.1. Introduction
In classical mechanics a particle in a state $(x_0, p_0)$ has associated with it well-defined
values for any dynamical variable $\omega(x, p)$, namely, $\omega(x_0, p_0)$. In quantum
theory, given a state $|\psi\rangle$, one can only give the probabilities $P(\omega)$ for the possible
outcomes of a measurement of $\Omega$. The probability distribution will be characterized
by a mean or expectation value
$\langle\Omega\rangle = \langle\psi|\Omega|\psi\rangle$   (9.1.1)
and an uncertainty about this mean:
$(\Delta\Omega) = \left[\langle\psi|(\Omega - \langle\Omega\rangle)^2|\psi\rangle\right]^{1/2}$   (9.1.2)
There are, however, states for which $\Delta\Omega = 0$, and these are the eigenstates $|\omega\rangle$ of $\Omega$.
If we consider two Hermitian operators $\Omega$ and $\Lambda$, they will generally have some
uncertainties $\Delta\Omega$ and $\Delta\Lambda$ in an arbitrary state. In the next section we will derive the
Heisenberg uncertainty relations, which will provide a lower bound on the product
of uncertainties, $\Delta\Omega \cdot \Delta\Lambda$. Generally the lower bound will depend not only on the
operators but also on the state. Of interest to us are those cases in which the lower
bound is independent of the state. The derivation will make clear the conditions
under which such a relation will exist.
9.2. Derivation of the Uncertainty Relations
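Let $\Omega$ and $\Lambda$ be two Hermitian operators, with a commutator
$[\Omega, \Lambda] = i\Gamma$   (9.2.1)
You may readily verify that $\Gamma$ is also Hermitian. Let us start with the uncertainty
product in a normalized state $|\psi\rangle$:
$(\Delta\Omega)^2 (\Delta\Lambda)^2 = \langle\psi|(\Omega - \langle\Omega\rangle)^2|\psi\rangle \langle\psi|(\Lambda - \langle\Lambda\rangle)^2|\psi\rangle$   (9.2.2)
where $\langle\Omega\rangle = \langle\psi|\Omega|\psi\rangle$ and $\langle\Lambda\rangle = \langle\psi|\Lambda|\psi\rangle$. Let us next define the pair
$\hat{\Omega} = \Omega - \langle\Omega\rangle, \quad \hat{\Lambda} = \Lambda - \langle\Lambda\rangle$   (9.2.3)
which has the same commutator as $\Omega$ and $\Lambda$ (verify this). In terms of $\hat{\Omega}$ and $\hat{\Lambda}$
$(\Delta\Omega)^2 (\Delta\Lambda)^2 = \langle\psi|\hat{\Omega}^2|\psi\rangle \langle\psi|\hat{\Lambda}^2|\psi\rangle = \langle\hat{\Omega}\psi|\hat{\Omega}\psi\rangle \langle\hat{\Lambda}\psi|\hat{\Lambda}\psi\rangle$   (9.2.4)
since
$\hat{\Omega}^2 = \hat{\Omega}\hat{\Omega} = \hat{\Omega}^\dagger\hat{\Omega}$
and
$\hat{\Lambda}^2 = \hat{\Lambda}^\dagger\hat{\Lambda}$   (9.2.5)
If we apply the Schwarz inequality
$|V_1|^2 |V_2|^2 \ge |\langle V_1|V_2\rangle|^2$   (9.2.6)
(where the equality sign holds only if $|V_1\rangle = c|V_2\rangle$, where $c$ is a constant) to the
states $|\hat{\Omega}\psi\rangle$ and $|\hat{\Lambda}\psi\rangle$, we get from Eq. (9.2.4),
$(\Delta\Omega)^2 (\Delta\Lambda)^2 \ge |\langle\hat{\Omega}\psi|\hat{\Lambda}\psi\rangle|^2$   (9.2.7)
Let us now use the fact that
$\langle\hat{\Omega}\psi|\hat{\Lambda}\psi\rangle = \langle\psi|\hat{\Omega}^\dagger\hat{\Lambda}|\psi\rangle = \langle\psi|\hat{\Omega}\hat{\Lambda}|\psi\rangle$   (9.2.8)
to rewrite the above inequality as
$(\Delta\Omega)^2 (\Delta\Lambda)^2 \ge |\langle\psi|\hat{\Omega}\hat{\Lambda}|\psi\rangle|^2$   (9.2.9)
Now, we know that the commutator has to enter the picture somewhere. This we
arrange through the following identity:
$\hat{\Omega}\hat{\Lambda} = \frac{\hat{\Omega}\hat{\Lambda} + \hat{\Lambda}\hat{\Omega}}{2} + \frac{\hat{\Omega}\hat{\Lambda} - \hat{\Lambda}\hat{\Omega}}{2} = \frac{1}{2}[\hat{\Omega}, \hat{\Lambda}]_+ + \frac{1}{2}[\hat{\Omega}, \hat{\Lambda}]$   (9.2.10)
where $[\hat{\Omega}, \hat{\Lambda}]_+$ is called the anticommutator. Feeding Eq. (9.2.10) into the inequality
(9.2.7), we get
$(\Delta\Omega)^2 (\Delta\Lambda)^2 \ge \left|\langle\psi|\tfrac{1}{2}[\hat{\Omega}, \hat{\Lambda}]_+ + \tfrac{1}{2}[\hat{\Omega}, \hat{\Lambda}]|\psi\rangle\right|^2$   (9.2.11)
We next use the fact that
(1) since $[\hat{\Omega}, \hat{\Lambda}] = i\Gamma$, where $\Gamma$ is Hermitian, the expectation value of the commutator
is pure imaginary;
(2) since $[\hat{\Omega}, \hat{\Lambda}]_+$ is Hermitian, the expectation value of the anticommutator is
real.
Recalling that $|a + ib|^2 = a^2 + b^2$, we get
$(\Delta\Omega)^2 (\Delta\Lambda)^2 \ge \frac{1}{4}\left|\langle\psi|[\hat{\Omega}, \hat{\Lambda}]_+|\psi\rangle + i\langle\psi|\Gamma|\psi\rangle\right|^2 = \frac{1}{4}\langle\psi|[\hat{\Omega}, \hat{\Lambda}]_+|\psi\rangle^2 + \frac{1}{4}\langle\psi|\Gamma|\psi\rangle^2$   (9.2.12)
This is the general uncertainty relation between any two Hermitian operators and is
evidently state dependent. Consider now canonically conjugate operators, for which
$\Gamma = \hbar$. In this case
$(\Delta\Omega)^2 (\Delta\Lambda)^2 \ge \frac{1}{4}\langle\psi|[\hat{\Omega}, \hat{\Lambda}]_+|\psi\rangle^2 + \frac{\hbar^2}{4}$   (9.2.13)
Since the first term is positive definite, we may assert that for any $|\psi\rangle$
$(\Delta\Omega)^2 (\Delta\Lambda)^2 \ge \hbar^2/4$
or
$\Delta\Omega \cdot \Delta\Lambda \ge \hbar/2$   (9.2.14)
which is the celebrated uncertainty relation. Let us note that the above inequality
becomes an equality only if
(1) $\hat{\Omega}|\psi\rangle = c\hat{\Lambda}|\psi\rangle$
and   (9.2.15)
(2) $\langle\psi|[\hat{\Omega}, \hat{\Lambda}]_+|\psi\rangle = 0$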
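The general relation (9.2.12) holds in any finite-dimensional space as well, which makes it easy to test with small matrices. The sketch below uses the spin-1 matrices $S_x$ and $S_y$ (in units $\hbar = 1$) as a convenient example of two noncommuting Hermitian operators. These matrices, the helper names, and the sample states are choices made for this illustration and do not come from the derivation above.

```python
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def combine(a, b, ca, cb):
    """Entrywise linear combination ca * a + cb * b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]


def expect(op, psi):
    """<psi| op |psi> for a normalized state given as a list of amplitudes."""
    n = len(psi)
    return sum(complex(psi[i]).conjugate() * op[i][j] * psi[j]
               for i in range(n) for j in range(n))


def uncertainty_sides(omega, lam, psi):
    """Return (left, right) sides of Eq. (9.2.12) for Hermitian omega, lam."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in psi))
    psi = [c / norm for c in psi]
    n = len(psi)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    om = combine(omega, identity, 1, -expect(omega, psi).real)  # Omega - <Omega>
    la = combine(lam, identity, 1, -expect(lam, psi).real)      # Lambda - <Lambda>
    ol, lo = matmul(om, la), matmul(la, om)
    left = expect(matmul(om, om), psi).real * expect(matmul(la, la), psi).real
    anti = expect(combine(ol, lo, 1, 1), psi).real              # anticommutator
    gamma = expect(combine(ol, lo, -1j, 1j), psi).real          # Gamma = -i [Omega, Lambda]
    return left, 0.25 * anti ** 2 + 0.25 * gamma ** 2


R = 1 / math.sqrt(2)
SX = [[0, R, 0], [R, 0, R], [0, R, 0]]
SY = [[0, -1j * R, 0], [1j * R, 0, -1j * R], [0, 1j * R, 0]]

for state in ([1, 0, 0], [0, 1, 0], [1, 0.5j, 0.2]):
    print(state, uncertainty_sides(SX, SY, state))
```

For the state `[1, 0, 0]` the two sides coincide, so the bound is saturated. For the other two states the left side is strictly larger.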
9.3. The Minimum Uncertainty Packet
In this section we will find the wave function $\psi(x)$ which saturates the lower
bound of the uncertainty relation for $X$ and $P$. According to Eq. (9.2.15) such a
state is characterized by
$(P - \langle P\rangle)|\psi\rangle = c(X - \langle X\rangle)|\psi\rangle$   (9.3.1)
and
$\langle\psi|(P - \langle P\rangle)(X - \langle X\rangle) + (X - \langle X\rangle)(P - \langle P\rangle)|\psi\rangle = 0$   (9.3.2)
where $\langle P\rangle$ and $\langle X\rangle$ refer to the state $|\psi\rangle$, implicitly defined by these equations. In
the $X$ basis, Eq. (9.3.1) becomes
$\left(-i\hbar \frac{d}{dx} - \langle P\rangle\right) \psi(x) = c(x - \langle X\rangle)\psi(x)$
or
$\frac{d\psi(x)}{\psi(x)} = \frac{i}{\hbar} \left[\langle P\rangle + c(x - \langle X\rangle)\right] dx$   (9.3.3)
Now, whatever $\langle X\rangle$ may be, it is always possible to shift our origin (to $x = \langle X\rangle$)
so that in the new frame of reference $\langle X\rangle = 0$. In this frame, Eq. (9.3.3) has the
solution
$\psi(x) = \psi(0)\, e^{i\langle P\rangle x/\hbar}\, e^{icx^2/2\hbar}$   (9.3.4)
Let us next consider the constraint, Eq. (9.3.2), which in this frame reads
$\langle\psi|(P - \langle P\rangle)X + X(P - \langle P\rangle)|\psi\rangle = 0$
If we now exploit Eq. (9.3.1) and its adjoint, we find
$\langle\psi|c^* X^2 + cX^2|\psi\rangle = 0$
$(c + c^*)\langle\psi|X^2|\psi\rangle = 0$
from which it follows that $c$ is pure imaginary:
$c = i|c|$   (9.3.5)
Our solution, Eq. (9.3.4) now becomes
$\psi(x) = \psi(0)\, e^{i\langle P\rangle x/\hbar}\, e^{-|c|x^2/2\hbar}$
In terms of
$\Delta^2 = \hbar/|c|$
$\psi(x) = \psi(0)\, e^{i\langle P\rangle x/\hbar}\, e^{-x^2/2\Delta^2}$   (9.3.6)
where $\Delta^2$, like $|c|$, is arbitrary. If the origin were not chosen to make $\langle X\rangle$ zero, we
would have instead
$\psi(x) = \psi(\langle X\rangle)\, e^{i\langle P\rangle (x - \langle X\rangle)/\hbar}\, e^{-(x - \langle X\rangle)^2/2\Delta^2}$   (9.3.7)
Thus the minimum uncertainty wave function is a Gaussian of arbitrary width and
center. This result, for the special case $\langle X\rangle = \langle P\rangle = 0$, was used in the quest for the
state that minimized the expectation value of the oscillator Hamiltonian.
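That the Gaussian of Eq. (9.3.7) saturates the bound, while other wave functions do not, can be confirmed by direct numerical integration. The following is a rough sketch in units $\hbar = 1$. The helper names, the sample parameters, and the comparison function $1/\cosh x$ are all assumptions chosen for illustration. $\langle P^2\rangle$ is computed as $\hbar^2 \int |\psi'|^2 dx$, which is valid for wave functions that vanish at the ends of the integration range.

```python
import cmath
import math


def packet(x, x0=0.3, p0=2.0, width=0.7, hbar=1.0):
    """Unnormalized minimum-uncertainty packet of Eq. (9.3.7)."""
    return cmath.exp(1j * p0 * (x - x0) / hbar) * math.exp(-(x - x0) ** 2 / (2 * width ** 2))


def uncertainty_product(psi, lo, hi, hbar=1.0, n=4001, d=1e-4):
    """Delta X * Delta P for psi, by summing over a uniform grid on [lo, hi]."""
    dx = (hi - lo) / (n - 1)
    norm = mean_x = mean_x2 = mean_p = mean_p2 = 0.0
    for k in range(n):
        x = lo + k * dx
        f = complex(psi(x))
        df = (complex(psi(x + d)) - complex(psi(x - d))) / (2 * d)  # psi'(x)
        w = abs(f) ** 2
        norm += w
        mean_x += x * w
        mean_x2 += x * x * w
        mean_p += (f.conjugate() * (-1j * hbar) * df).real
        mean_p2 += hbar ** 2 * abs(df) ** 2
    mean_x, mean_x2, mean_p, mean_p2 = (s / norm for s in (mean_x, mean_x2, mean_p, mean_p2))
    return math.sqrt(mean_x2 - mean_x ** 2) * math.sqrt(mean_p2 - mean_p ** 2)


print(uncertainty_product(packet, -6.7, 7.3))                        # Gaussian packet
print(uncertainty_product(lambda x: 1 / math.cosh(x), -30.0, 30.0))  # non-Gaussian shape
```

The Gaussian gives $\hbar/2$ to numerical accuracy whatever its center, mean momentum, and width, while the $1/\cosh x$ profile gives a slightly larger product.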
9.4. Applications of the Uncertainty Principle
I now illustrate the use of the uncertainty principle by estimating the size of the
ground-state energy and the spread in the ground-state wave function. It should be
clear from this example that the success we had with the oscillator was rather atypical.
We choose as our system the hydrogen atom. The Hamiltonian for this system,
assuming the proton is a spectator whose only role is to provide a Coulomb potential
for the electron, may be written entirely in terms of the electron's variable as
$H = \frac{P_x^2 + P_y^2 + P_z^2}{2m} - \frac{e^2}{(X^2 + Y^2 + Z^2)^{1/2}}$   (9.4.1)‡
‡ The operator $(X^2 + Y^2 + Z^2)^{-1/2}$ is just $1/r$ in the coordinate basis. We will occasionally denote it by
$1/r$ even while referring to it in the abstract, to simplify the notation.
Let us begin by mimicking the analysis we employed for the oscillator. We evaluate
$\langle H\rangle$ in a normalized state $|\psi\rangle$:
$\langle H\rangle = \frac{\langle P_x^2 + P_y^2 + P_z^2\rangle}{2m} - e^2 \left\langle\frac{1}{(X^2 + Y^2 + Z^2)^{1/2}}\right\rangle$
$= \frac{\langle P_x^2\rangle + \langle P_y^2\rangle + \langle P_z^2\rangle}{2m} - e^2 \left\langle\frac{1}{(X^2 + Y^2 + Z^2)^{1/2}}\right\rangle$   (9.4.2)
Since
$\langle P_x^2\rangle = (\Delta P_x)^2 + \langle P_x\rangle^2$ etc.
the first step in minimizing $\langle H\rangle$ is to work only with states for which $\langle P_i\rangle = 0$. For
such states
$\langle H\rangle = \frac{(\Delta P_x)^2 + (\Delta P_y)^2 + (\Delta P_z)^2}{2m} - e^2 \left\langle\frac{1}{(X^2 + Y^2 + Z^2)^{1/2}}\right\rangle$   (9.4.3)
We cannot exploit the uncertainty relations
$\Delta P_x \Delta X \ge \hbar/2$, etc.
yet since $\langle H\rangle$ is not a function of $\Delta X$ and $\Delta P$. The problem is that
$\langle (X^2 + Y^2 + Z^2)^{-1/2}\rangle$ is not simply related to $\Delta X$, $\Delta Y$, and $\Delta Z$. Now the handwaving
begins. We argue that (see Exercise 9.4.2),
$\left\langle\frac{1}{(X^2 + Y^2 + Z^2)^{1/2}}\right\rangle \simeq \frac{1}{\langle (X^2 + Y^2 + Z^2)^{1/2}\rangle}$   (9.4.4)
where the symbol $\simeq$ means that the two sides of Eq. (9.4.4) are not strictly equal,
but of the same order of magnitude. So we write
$\langle H\rangle \simeq \frac{(\Delta P_x)^2 + (\Delta P_y)^2 + (\Delta P_z)^2}{2m} - \frac{e^2}{\langle (X^2 + Y^2 + Z^2)^{1/2}\rangle}$
Once again, we argue that
$\langle (X^2 + Y^2 + Z^2)^{1/2}\rangle \simeq (\langle X^2\rangle + \langle Y^2\rangle + \langle Z^2\rangle)^{1/2}$
and get‡
$\langle H\rangle \simeq \frac{(\Delta P_x)^2 + (\Delta P_y)^2 + (\Delta P_z)^2}{2m} - \frac{e^2}{(\langle X^2\rangle + \langle Y^2\rangle + \langle Z^2\rangle)^{1/2}}$
‡ We are basically arguing that the mean of the functions (of $X$, $Y$, and $Z$) and the functions of the
mean ($\langle X\rangle$, $\langle Y\rangle$, and $\langle Z\rangle$) are of the same order of magnitude. They are in fact equal if there are no
fluctuations around the mean and approximately equal if the fluctuations are small (recall the discussion
toward the end of Chapter 6).
From the relations
$\langle X^2\rangle = (\Delta X)^2 + \langle X\rangle^2$ etc.
it follows that we may confine ourselves to states for which $\langle X\rangle = \langle Y\rangle = \langle Z\rangle = 0$ in
looking for the state with the lowest mean energy. For such states
$\langle H\rangle \simeq \frac{(\Delta P_x)^2 + (\Delta P_y)^2 + (\Delta P_z)^2}{2m} - \frac{e^2}{[(\Delta X)^2 + (\Delta Y)^2 + (\Delta Z)^2]^{1/2}}$
For a problem such as this, with spherical symmetry, it is intuitively clear that the
configuration of least energy will have
$(\Delta X)^2 = (\Delta Y)^2 = (\Delta Z)^2$
and
$(\Delta P_x)^2 = (\Delta P_y)^2 = (\Delta P_z)^2$
so that
$\langle H\rangle \simeq \frac{3(\Delta P_x)^2}{2m} - \frac{e^2}{3^{1/2} \Delta X}$   (9.4.5)
Now we use
$\Delta P_x \Delta X \ge \hbar/2$
to get
$\langle H\rangle \gtrsim \frac{3\hbar^2}{8m(\Delta X)^2} - \frac{e^2}{3^{1/2} \Delta X}$
We now differentiate the right-hand side with respect to $\Delta X$ to find its minimum:
$0 = \frac{-6\hbar^2}{8m(\Delta X)^3} + \frac{e^2}{3^{1/2} (\Delta X)^2}$
or
$\Delta X = \frac{3(3^{1/2})\hbar^2}{4me^2} \cong 1.3 \frac{\hbar^2}{me^2}$   (9.4.6)
Finally,
$\langle H\rangle \gtrsim \frac{-2me^4}{9\hbar^2}$   (9.4.7)
What prevents us from concluding (as we did in the case of the oscillator), that the
ground-state energy is $-2me^4/9\hbar^2$ or that the ground-state wave function is a Gaussian
[of width $3(3^{1/2})\hbar^2/4me^2$] is the fact that Eq. (9.4.7) is an approximate inequality.
However, the exact ground-state energy
$E_g = -me^4/2\hbar^2$   (9.4.8)
differs from our estimate, Eq. (9.4.7), only by a factor $\simeq 2$. Likewise, the true ground-state
wave function is not a Gaussian but an exponential $\psi(x, y, z) = c \exp[-(x^2 + y^2 + z^2)^{1/2}/a_0]$, where
$a_0 = \hbar^2/me^2$
is called the Bohr radius. However, the $\Delta X$ associated with this wave function is
$\Delta X = \hbar^2/me^2$   (9.4.9)
which also is within a factor of 2 of the estimated $\Delta X$ in (9.4.6).
In conclusion, the uncertainty principle gives us a lot of information about the
ground state, but not always as much as in the case of the oscillator.
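The minimization leading to Eqs. (9.4.6) and (9.4.7) is simple enough to do by hand, but it can also be checked numerically. The sketch below works in units $\hbar = m = e = 1$, so that lengths are measured in units of $\hbar^2/me^2$ and energies in units of $me^4/\hbar^2$. The function names and the golden-section search are illustrative choices, not part of the text's argument.

```python
import math


def energy_estimate(dx, hbar=1.0, m=1.0, e=1.0):
    """The estimate 3 hbar^2 / (8 m dx^2) - e^2 / (sqrt(3) dx) obtained from Eq. (9.4.5)."""
    return 3 * hbar ** 2 / (8 * m * dx ** 2) - e ** 2 / (math.sqrt(3) * dx)


def minimize(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2


dx_min = minimize(energy_estimate, 0.1, 10.0)
print(dx_min, 3 * math.sqrt(3) / 4)          # compare with Eq. (9.4.6)
print(energy_estimate(dx_min), -2.0 / 9.0)   # compare with Eq. (9.4.7)
print(energy_estimate(dx_min) / (-0.5))      # ratio to the exact E_g of Eq. (9.4.8)
```

The last line shows that the estimate is roughly 0.44 of the exact ground-state energy, that is, it is off by the factor of about 2 mentioned above.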
Exercise 9.4.1.* Consider the oscillator in the state $|n = 1\rangle$ and verify that
$\left\langle\frac{1}{X^2}\right\rangle \simeq \frac{1}{\langle X^2\rangle} \simeq \frac{m\omega}{\hbar}$
Exercise 9.4.2. (1) By referring to the table of integrals in Appendix A.2, verify that
$\psi = \frac{1}{(\pi a_0^3)^{1/2}}\, e^{-r/a_0}, \quad r = (x^2 + y^2 + z^2)^{1/2}$
is a normalized wave function (of the ground state of hydrogen). Note that in three dimensions
the normalization condition is
$\langle\psi|\psi\rangle = \int \psi^*(r, \theta, \phi)\, \psi(r, \theta, \phi)\, r^2\, dr\, d(\cos\theta)\, d\phi$
$= 4\pi \int \psi^*(r)\psi(r)\, r^2\, dr = 1$
for a function of just $r$.
(2) Calculate $(\Delta X)^2$ in this state [argue that $(\Delta X)^2 = \frac{1}{3}\langle r^2\rangle$] and regain the result quoted
in Eq. (9.4.9).
(3) Show that $\langle 1/r\rangle \simeq 1/\langle r\rangle \simeq me^2/\hbar^2$ in this state.
Exercise 9.4.3. Ignore the fact that the hydrogen atom is a three-dimensional system and
pretend that
$H = \frac{P^2}{2m} - \frac{e^2}{(R^2)^{1/2}} \quad (P^2 = P_x^2 + P_y^2 + P_z^2,\ R^2 = X^2 + Y^2 + Z^2)$
corresponds to a one-dimensional problem. Assuming
$\Delta P \cdot \Delta R \ge \hbar/2$
estimate the ground-state energy.
Exercise 9.4.4.* Compute $\Delta T \cdot \Delta X$, where $T = P^2/2m$. Why is this relation not so famous?
Figure 9.1. At the point $x_1$, skater A throws the snowball towards skater B, who catches it at the point $x_2$.
9.5. The Energy-Time Uncertainty Relation
There exists an uncertainty relation
$\Delta E \cdot \Delta t \ge \hbar/2$   (9.5.1)
which does not follow from Eq. (9.2.12), since time $t$ is not a dynamical variable
but a parameter. The content of this equation is quite different from the others
involving just dynamical variables. The rough meaning of this inequality is that the
energy of a system that has been in existence only for a finite time $\Delta t$ has a spread
(or uncertainty) of at least $\Delta E$, where $\Delta E$ and $\Delta t$ are related by (9.5.1). To see how
this comes about, recall that eigenstates of energy have a time dependence $e^{-iEt/\hbar}$,
i.e., a definite energy is associated with a definite frequency, $\omega = E/\hbar$. Now, only a
wave train that is infinitely long in time (that is to say, a system that has been in
existence for infinite time) has a well-defined frequency. Thus a system that has been
in existence only for a finite time, even if its time dependence goes as $e^{-iEt/\hbar}$ during
this period, is not associated with a pure frequency $\omega = E/\hbar$ or definite energy $E$.
Consider the following example. At time $t = 0$, we turn on light of frequency $\omega$
on an ensemble of hydrogen atoms all in their ground state. Since the light is supposed
to consist of photons of energy $\hbar\omega$, we expect transitions to take place only to a
level (if it exists) $\hbar\omega$ above the ground state. It will however be seen that initially
the atoms make transitions to several levels not obeying this constraint. However,
as $t$ increases, the deviation $\Delta E$ from the expected final-state energy will decrease
according to $\Delta E \simeq \hbar/t$. Only as $t \to \infty$ do we have a rigid law of conservation of
energy in the classical sense. We interpret this result by saying that the light source
is not associated with a definite frequency (i.e., does not emit photons of definite
energy) if it has been in operation only for a finite time, even if the dial is set at a
definite frequency $\omega$ during this time. [The output of the source is not just $e^{-i\omega t}$ but
rather $\theta(t)\, e^{-i\omega t}$, whose transform is not a delta function peaked at $\omega$.] Similarly
when the excited atoms get deexcited and drop to the ground state, they do not emit
photons of a definite energy $E = E_e - E_g$ (the subscripts $e$ and $g$ stand for "excited"
and "ground") but rather with a spread $\Delta E \simeq \hbar/\Delta t$, $\Delta t$ being the duration for which
they were in the excited state. [The time dependence of the atomic wave function is
not $e^{-iE_e t/\hbar}$ but rather $\theta(t)\theta(T - t)\, e^{-iE_e t/\hbar}$ assuming it abruptly got excited to this
state at $t = 0$ and abruptly got deexcited at $t = T$.] We shall return to this point when
we discuss the interaction of atoms with radiation in a later chapter.
Another way to describe this uncertainty relation is to say that violations in the
classical energy conservation law by $\Delta E$ are possible over times $\Delta t \simeq \hbar/\Delta E$. The
following example should clarify the meaning of this statement.
Example 9.5.1. (Range of the Nuclear Force.) Imagine two ice skaters each equipped
with several snowballs, and skating toward each other on trajectories that are parallel but
separated by some perpendicular distance (Fig. 9.1). When skater A reaches some point $x_1$
let him throw a snowball toward B. He (A) will then recoil away from B and start moving
along a new straight line. Let B now catch the snowball. He too will recoil as a result, as
shown in the figure. If this whole process were seen by someone who could not see the
snowballs, he would conclude that there is a repulsive force between A and B. If A (or B) can
throw the ball at most 10 ft, the observer would conclude that the range of the force is 10 ft,
meaning A and B will not affect each other if the perpendicular distance between them exceeds
10 ft.
This is roughly how elementary particles interact with each other: if they throw photons
at each other the force is called the electromagnetic force and the ability to throw and catch
photons is called "electric charge." If the projectiles are pions the force is called the nuclear
force. We would like to estimate the range of the nuclear force using the uncertainty principle.
Now, unlike the two skaters endowed with snowballs, the protons and neutrons (i.e., nucleons)
in the nucleus do not have a ready supply of pions, which have a mass $\mu$ and energy $\mu c^2$. A
nucleon can, however, produce a pion from nowhere (violating the classical law of energy
conservation by $\simeq \mu c^2$) provided it is caught by the other nucleon within a time $\Delta t$ such that
$\Delta t \simeq \hbar/\Delta E = \hbar/\mu c^2$. Even if the pion travels toward the receiver at the speed of light, it can
only cover a distance $r = c\,\Delta t \simeq \hbar/\mu c$, which is called the Compton wavelength of the pion and
is a measure of the range of nuclear force. The value of $r$ is approximately 1 Fermi $= 10^{-13}$ cm.
The picture of nuclear force given here is rather simpleminded and should be taken with
a grain of salt. For example, neither is the pion the only particle that can be "exchanged"
between nucleons nor is the number of exchanges limited to one per encounter. (The pion is,
however, the lightest object that can be exchanged and hence responsible for the nuclear force
of the longest range.) Also our analogy with snowballs does not explain any attractive
interaction between particles.
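The estimate $r \simeq \hbar/\mu c$ in Example 9.5.1 is easy to evaluate once numbers are supplied. The sketch below uses $\hbar c \approx 197.327$ MeV fm and a charged-pion rest energy of about 139.57 MeV. Both values are inputs assumed here, not quoted in the text, and the function name is hypothetical.

```python
HBAR_C_MEV_FM = 197.327  # hbar * c in MeV fm (assumed input)
PION_MASS_MEV = 139.57   # charged-pion rest energy mu c^2 in MeV (assumed input)


def force_range_fm(rest_energy_mev):
    """Range r = hbar / (mu c) = (hbar c) / (mu c^2), in fermis, for an exchanged particle."""
    return HBAR_C_MEV_FM / rest_energy_mev


print(force_range_fm(PION_MASS_MEV))
```

The result is about 1.4 fermi, consistent with the order-of-magnitude value of 1 Fermi quoted above. A heavier exchanged particle gives a proportionally shorter range.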
10
Systems with N Degrees
of Freedom
10.1. N Particles in One Dimension
So far, we have restricted our attention (apart from minor digressions) to a
system with one degree of freedom, namely, a single particle in one dimension. We
now consider the quantum mechanics of systems with N degrees of freedom. The
increase in degrees of freedom may be due to an increase in the number of particles,
number of spatial dimensions, or both. In this section we consider N particles in one
dimension, and start with the case N= 2.
The Two Particle Hilbert Space

Consider two particles described classically by (x1, pi) and (x2, p2). The rule
for quantizing this system [Postulate II, Eq. (7.4.39)] is to promote these variables
to quantum operators (X1, PO and (X2 , P2) obeying the canonical commutation
relations:
[Xi , P.]= iti{xi , pi } = iti8 (i= 1,2) (10.1.1a)
[X i , Xi ]= ih{xi , x1 } =O (10.1.1b)
[Pi, 1;]= ail Pi 9 Pi = ° (10.1.1c)
It might be occasionally possible (as it was in the case of the oscillator) to extract
all the physics given just the canonical commutators. In practice one works in a
basis, usually the coordinate basis. This basis consists of the kets I x i x2 > which are 247
248 simultaneous eigenkets of the commuting operators X 1 and X2
CHAPTER 10
xiX2> =xixix2>
(10.1.2)
X2I X1X 2> = X2 1X1X2>
and are normalized ast
<x;.,(Z.Ixix 2 >= 8(x; — xl)(5(xZ— x2) (10.1.3)
In this basis
I tv > > < xi x2 1 tv> = tv (xi , x 2)
Xi —> Xi (10.1.4)
0
Ox i
We may interpret
$$ P(x_1, x_2) = |\langle x_1 x_2|\psi\rangle|^2 \tag{10.1.5} $$
as the absolute probability density for catching particle 1 near $x_1$ and particle 2 near
$x_2$, provided we normalize $|\psi\rangle$ to unity:
$$ 1 = \langle\psi|\psi\rangle = \int |\langle x_1 x_2|\psi\rangle|^2\, dx_1\, dx_2 = \int P(x_1, x_2)\, dx_1\, dx_2 \tag{10.1.6} $$
There are other bases possible besides $|x_1 x_2\rangle$. There is, for example, the momentum
basis, consisting of the simultaneous eigenkets $|p_1 p_2\rangle$ of $P_1$ and $P_2$. More generally,
we can use the simultaneous eigenkets $|\omega_1\omega_2\rangle$ of two commuting operators§
$\Omega_1(X_1, P_1)$ and $\Omega_2(X_2, P_2)$ to define the $\Omega$ basis. We denote by $V_{1\otimes 2}$ the two-particle
Hilbert space spanned by any of these bases.
$V_{1\otimes 2}$ as a Direct Product Space
There is another way to arrive at the space $V_{1\otimes 2}$, and that is to build it out of
two one-particle spaces. Consider a system of two particles described classically by
$(x_1, p_1)$ and $(x_2, p_2)$. If we want the quantum theory of just particle 1, we define
operators $X_1$ and $P_1$ obeying
$$ [X_1, P_1] = i\hbar I \tag{10.1.7} $$
The eigenvectors $|x_1\rangle$ of $X_1$ form a complete (coordinate) basis for the Hilbert space
$V_1$ of particle 1. Other bases, such as $|p_1\rangle$ of $P_1$ or, in general, $|\omega_1\rangle$ of $\Omega_1(X_1, P_1)$,
are also possible. Since the operators $X_1$, $P_1$, $\Omega_1$, etc., act on $V_1$, let us append a
superscript (1) to all of them. Thus Eq. (10.1.7) reads
$$ [X_1^{(1)}, P_1^{(1)}] = i\hbar I^{(1)} \tag{10.1.8a} $$
where $I^{(1)}$ is the identity operator on $V_1$. A similar picture holds for particle 2, and
in particular,
$$ [X_2^{(2)}, P_2^{(2)}] = i\hbar I^{(2)} \tag{10.1.8b} $$
‡ Note that we denote the bra corresponding to $|x_1' x_2'\rangle$ as $\langle x_1' x_2'|$.
§ Note that any function of $X_1$ and $P_1$ commutes with any function of $X_2$ and $P_2$.
Let us now turn our attention to the two-particle system. What will be the
coordinate basis for this system? Previously we assigned to every possible outcome
$x_1$ of a position measurement a vector $|x_1\rangle$ in $V_1$, and likewise for particle 2. Now a
position measurement will yield a pair of numbers $(x_1, x_2)$. Since after the measurement
particle 1 will be in state $|x_1\rangle$ and particle 2 in $|x_2\rangle$, let us denote the corresponding
ket by $|x_1\rangle\otimes|x_2\rangle$:
$$ |x_1\rangle\otimes|x_2\rangle \leftrightarrow \begin{cases} \text{particle 1 at } x_1 \\ \text{particle 2 at } x_2 \end{cases} \tag{10.1.9} $$
Note that $|x_1\rangle\otimes|x_2\rangle$ is a new object, quite unlike the inner product $\langle\psi_1|\psi_2\rangle$
or the outer product $|\psi_1\rangle\langle\psi_2|$, both of which involve two vectors from the same
space. The product $|x_1\rangle\otimes|x_2\rangle$, called the direct product, is the product of vectors
from two different spaces. The direct product is a linear operation:
$$ (\alpha|x_1\rangle + \alpha'|x_1'\rangle)\otimes(\beta|x_2\rangle) = \alpha\beta\,|x_1\rangle\otimes|x_2\rangle + \alpha'\beta\,|x_1'\rangle\otimes|x_2\rangle \tag{10.1.10} $$
The set of all vectors of the form $|x_1\rangle\otimes|x_2\rangle$ forms the basis for a space which we call
$V_1\otimes V_2$, and refer to as the direct product of the spaces $V_1$ and $V_2$. The dimensionality
(number of possible basis vectors) of $V_1\otimes V_2$ is the product of the dimensionality
of $V_1$ and the dimensionality of $V_2$. Although all the dimensionalities are infinite
here, the statement makes heuristic sense: to each basis vector $|x_1\rangle$ of $V_1$ and $|x_2\rangle$
of $V_2$, there is one and only one basis vector $|x_1\rangle\otimes|x_2\rangle$ of $V_1\otimes V_2$. This should be
compared to the direct sum (Section 1.4):
$$ V_{1\oplus 2} = V_1\oplus V_2 $$
in which case the dimensionalities of $V_1$ and $V_2$ add (assuming the vectors of $V_1$
are linearly independent of those of $V_2$).
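As a rough finite-dimensional illustration of the linearity in Eq. (10.1.10) and of the counting of dimensions, the sketch below uses NumPy's Kronecker product as a stand-in for the direct product. The dimensions (2 and 3), the vectors, and the coefficients are arbitrary choices made for this example and do not come from the text.

```python
import numpy as np

# Hypothetical finite-dimensional stand-ins: V1 has dimension 2, V2 has dimension 3.
x1 = np.array([1.0, 0.0])        # plays the role of |x1>
x1p = np.array([0.0, 1.0])       # plays the role of |x1'>
x2 = np.array([0.0, 1.0, 0.0])   # plays the role of |x2>
alpha, alpha_p, beta = 2.0, -1.5, 0.5

# Left side of Eq. (10.1.10): (alpha|x1> + alpha'|x1'>) (x) (beta|x2>)
lhs = np.kron(alpha * x1 + alpha_p * x1p, beta * x2)
# Right side: alpha*beta |x1>(x)|x2> + alpha'*beta |x1'>(x)|x2>
rhs = alpha * beta * np.kron(x1, x2) + alpha_p * beta * np.kron(x1p, x2)

linear = np.allclose(lhs, rhs)
product_dim = np.kron(x1, x2).size        # direct product: 2 * 3 components
sum_dim = np.concatenate([x1, x2]).size   # direct sum: 2 + 3 components
```

In this toy setting the direct product vector has 6 components while the direct sum has 5, which mirrors the "multiply versus add" statement above.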
The coordinate basis, $|x_1\rangle\otimes|x_2\rangle$, is just one possibility; we can use the momentum
basis $|p_1\rangle\otimes|p_2\rangle$, or, more generally, $|\omega_1\rangle\otimes|\omega_2\rangle$. Although these vectors span
$V_1\otimes V_2$, not every element of $V_1\otimes V_2$ is a direct product. For instance
$$ |\psi\rangle = |x_1'\rangle\otimes|x_2'\rangle + |x_1''\rangle\otimes|x_2''\rangle $$
cannot be written as
$$ |\psi\rangle = |\psi_1\rangle\otimes|\psi_2\rangle $$
where $|\psi_1\rangle$ and $|\psi_2\rangle$ are elements of $V_1$ and $V_2$, respectively.
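One way to see this concretely, in a finite-dimensional toy model, is to arrange the expansion coefficients of a vector in the product basis as a matrix: the vector is a single direct product exactly when that matrix has rank at most one. The helper name `is_direct_product`, the two-dimensional spaces, and the tolerance are assumptions made for this sketch; the rank test is a standard linear-algebra device and is not the argument used in the text.

```python
import numpy as np

def is_direct_product(psi, dim1, dim2, tol=1e-12):
    """Return True if psi (length dim1*dim2) equals kron(u, v) for some u, v."""
    # coeffs[i, j] is the coefficient of |i> (x) |j> in the product basis.
    coeffs = np.asarray(psi).reshape(dim1, dim2)
    singular_values = np.linalg.svd(coeffs, compute_uv=False)
    # A product vector gives a rank-one (or zero) coefficient matrix.
    return int(np.sum(singular_values > tol)) <= 1

e = np.eye(2)
product_state = np.kron(e[0] + e[1], e[0])             # (|0> + |1>) (x) |0>
sum_state = np.kron(e[0], e[0]) + np.kron(e[1], e[1])  # |0>|0> + |1>|1>

print(is_direct_product(product_state, 2, 2))
print(is_direct_product(sum_state, 2, 2))
```

The second vector has the same structure as the sum of two product kets quoted above, with the primed and double-primed labels replaced by the two basis indices.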
The inner product of $|x_1\rangle\otimes|x_2\rangle$ and $|x_1'\rangle\otimes|x_2'\rangle$ is
$$ (\langle x_1'|\otimes\langle x_2'|)(|x_1\rangle\otimes|x_2\rangle) = \langle x_1'|x_1\rangle\langle x_2'|x_2\rangle = \delta(x_1' - x_1)\,\delta(x_2' - x_2) \tag{10.1.11} $$
Since any vector in $V_1\otimes V_2$ can be expressed in terms of the $|x_1\rangle\otimes|x_2\rangle$ basis, this
defines the inner product between any two vectors in $V_1\otimes V_2$.
It is intuitively clear that when two particles are amalgamated to form a single
system, the position and momentum operators of each particle, $X_1^{(1)}$, $P_1^{(1)}$ and $X_2^{(2)}$,
$P_2^{(2)}$, which acted on $V_1$ and $V_2$, respectively, must have counterparts in $V_1\otimes V_2$
and have the same interpretation. Let us denote by $X_1^{(1)\otimes(2)}$ the counterpart of
$X_1^{(1)}$, and refer to it also as the "X operator of particle 1." Let us define its action
on $V_1\otimes V_2$. Since the vectors $|x_1\rangle\otimes|x_2\rangle$ span the space, it suffices to define its action
on these. Now the ket $|x_1\rangle\otimes|x_2\rangle$ denotes a state in which particle 1 is at $x_1$. Thus
it must be an eigenket of $X_1^{(1)\otimes(2)}$ with eigenvalue $x_1$:
$$ X_1^{(1)\otimes(2)}\,|x_1\rangle\otimes|x_2\rangle = x_1\,|x_1\rangle\otimes|x_2\rangle \tag{10.1.12} $$
Note that $X_1^{(1)\otimes(2)}$ does not really care about the second ket $|x_2\rangle$, i.e., it acts trivially
(as the identity) on $|x_2\rangle$ and acts on $|x_1\rangle$ just as $X_1^{(1)}$ did. In other words
$$ X_1^{(1)\otimes(2)}\,|x_1\rangle\otimes|x_2\rangle = |X_1^{(1)}x_1\rangle\otimes|I^{(2)}x_2\rangle \tag{10.1.13} $$
Let us define a direct product of two operators, $\Gamma_1^{(1)}$ and $\Lambda_2^{(2)}$ (denoted by $\Gamma_1^{(1)}\otimes\Lambda_2^{(2)}$),
whose action on a direct product ket $|\omega_1\rangle\otimes|\omega_2\rangle$ is
$$ (\Gamma_1^{(1)}\otimes\Lambda_2^{(2)})\,|\omega_1\rangle\otimes|\omega_2\rangle = |\Gamma_1^{(1)}\omega_1\rangle\otimes|\Lambda_2^{(2)}\omega_2\rangle \tag{10.1.14} $$
In this notation, we may write $X_1^{(1)\otimes(2)}$, in view of Eq. (10.1.13), as
$$ X_1^{(1)\otimes(2)} = X_1^{(1)}\otimes I^{(2)} \tag{10.1.15} $$
We can similarly promote $P_2^{(2)}$, say, from $V_2$ to $V_1\otimes V_2$ by defining the momentum
operator for particle 2, $P_2^{(1)\otimes(2)}$, as
$$ P_2^{(1)\otimes(2)} = I^{(1)}\otimes P_2^{(2)} \tag{10.1.16} $$
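The following properties of direct products of operators may be verified (say
by acting on the basis vectors $|x_1\rangle\otimes|x_2\rangle$):
Exercise 10.1.1.* Show the following:
(1) $[\Omega_1^{(1)}\otimes I^{(2)},\ I^{(1)}\otimes\Lambda_2^{(2)}] = 0$ for any $\Omega_1^{(1)}$ and $\Lambda_2^{(2)}$ (10.1.17a)
(operators of particle 1 commute with those of particle 2).
(2) $(\Omega_1^{(1)}\otimes\Gamma_2^{(2)})(\theta_1^{(1)}\otimes\Lambda_2^{(2)}) = (\Omega\theta)^{(1)}\otimes(\Gamma\Lambda)^{(2)}$ (10.1.17b)
(3) If
$$ [\Omega_1^{(1)}, \Lambda_1^{(1)}] = \Gamma_1^{(1)} $$
then
$$ [\Omega_1^{(1)\otimes(2)}, \Lambda_1^{(1)\otimes(2)}] = \Gamma_1^{(1)}\otimes I^{(2)} \tag{10.1.17c} $$
and similarly with 1 → 2.
(4) $(\Omega_1^{(1)\otimes(2)} + \Omega_2^{(1)\otimes(2)})^2 = (\Omega_1^2)^{(1)}\otimes I^{(2)} + I^{(1)}\otimes(\Omega_2^2)^{(2)} + 2\,\Omega_1^{(1)}\otimes\Omega_2^{(2)}$ (10.1.17d)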
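Before attempting the proofs, it may help to spot-check the four identities numerically. The sketch below does this with random complex matrices, using `np.kron` as the matrix form of the direct product. The dimensions (3 and 4), the random seed, and the helper names `rand` and `comm` are arbitrary choices for this example; a numerical check of this kind is of course no substitute for the derivation the exercise asks for.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand(n):
    """Random complex n x n matrix standing in for an arbitrary operator."""
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

def comm(a, b):
    """Commutator [a, b]."""
    return a @ b - b @ a

n1, n2 = 3, 4                      # hypothetical dimensions of V1 and V2
I1, I2 = np.eye(n1), np.eye(n2)
Om1, Th1, La1 = rand(n1), rand(n1), rand(n1)   # operators on V1
Om2, Ga2, La2 = rand(n2), rand(n2), rand(n2)   # operators on V2

# (10.1.17a): operators of particle 1 commute with those of particle 2.
check_a = np.allclose(comm(np.kron(Om1, I2), np.kron(I1, La2)), 0)
# (10.1.17b): products of direct products factorize.
check_b = np.allclose(np.kron(Om1, Ga2) @ np.kron(Th1, La2),
                      np.kron(Om1 @ Th1, Ga2 @ La2))
# (10.1.17c): commutators within V1 carry over to the product space.
check_c = np.allclose(comm(np.kron(Om1, I2), np.kron(La1, I2)),
                      np.kron(comm(Om1, La1), I2))
# (10.1.17d): square of a sum of one operator from each particle.
big1, big2 = np.kron(Om1, I2), np.kron(I1, Om2)
check_d = np.allclose((big1 + big2) @ (big1 + big2),
                      np.kron(Om1 @ Om1, I2) + np.kron(I1, Om2 @ Om2)
                      + 2 * np.kron(Om1, Om2))

print(check_a, check_b, check_c, check_d)
```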
The notion of direct products of vectors and operators is no doubt a difficult
one, with no simple analogs in elementary vector analysis. The following exercise
should give you some valuable experience. It is recommended that you reread the
preceding discussion after working on the exercise.
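Exercise 10.1.2.* Imagine a fictitious world in which the single-particle Hilbert space is
two-dimensional. Let us denote the basis vectors by $|+\rangle$ and $|-\rangle$. Let
$$ \sigma_1^{(1)} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad\text{and}\quad \sigma_2^{(2)} = \begin{bmatrix} e & f \\ g & h \end{bmatrix} $$
be operators in $V_1$ and $V_2$, respectively, with rows and columns ordered as $+$, $-$ (the ± signs label the basis vectors. Thus
$b = \langle +|\sigma_1^{(1)}|-\rangle$ etc.) The space $V_1\otimes V_2$ is spanned by four vectors $|+\rangle\otimes|+\rangle$, $|+\rangle\otimes|-\rangle$,
$|-\rangle\otimes|+\rangle$, $|-\rangle\otimes|-\rangle$. Show (using the method of images or otherwise) that, with rows and columns ordered as $++$, $+-$, $-+$, $--$,
(1) $$ \sigma_1^{(1)\otimes(2)} = \sigma_1^{(1)}\otimes I^{(2)} = \begin{bmatrix} a & 0 & b & 0 \\ 0 & a & 0 & b \\ c & 0 & d & 0 \\ 0 & c & 0 & d \end{bmatrix} $$
(Recall that $\langle\alpha|\otimes\langle\beta|$ is the bra corresponding to $|\alpha\rangle\otimes|\beta\rangle$.)
(2) $$ \sigma_2^{(1)\otimes(2)} = \begin{bmatrix} e & f & 0 & 0 \\ g & h & 0 & 0 \\ 0 & 0 & e & f \\ 0 & 0 & g & h \end{bmatrix} $$
(3) $$ (\sigma_1\sigma_2)^{(1)\otimes(2)} = \begin{bmatrix} ae & af & be & bf \\ ag & ah & bg & bh \\ ce & cf & de & df \\ cg & ch & dg & dh \end{bmatrix} $$
Do part (3) in two ways, by taking the matrix product of $\sigma_1^{(1)\otimes(2)}$ and $\sigma_2^{(1)\otimes(2)}$ and by directly
computing the matrix elements of $\sigma_1^{(1)}\otimes\sigma_2^{(2)}$.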
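Once the exercise has been worked by hand, the matrices can be compared against NumPy's Kronecker product, which uses the same ordering of the product basis ($++$, $+-$, $-+$, $--$) when the first factor refers to particle 1. The numerical values substituted for $a, \ldots, h$ below are arbitrary placeholders chosen for this sketch.

```python
import numpy as np

# Arbitrary placeholder values for the matrix elements a..h.
a, b, c, d = 1, 2, 3, 4
e, f, g, h = 5, 6, 7, 8

sigma1 = np.array([[a, b], [c, d]])   # acts on V1
sigma2 = np.array([[e, f], [g, h]])   # acts on V2
identity = np.eye(2, dtype=int)

s1_big = np.kron(sigma1, identity)    # sigma_1 promoted to V1 (x) V2
s2_big = np.kron(identity, sigma2)    # sigma_2 promoted to V1 (x) V2

# Part (3) in two ways: matrix product of the promoted operators,
# and the direct product of the original 2 x 2 matrices.
via_matrix_product = s1_big @ s2_big
via_direct_product = np.kron(sigma1, sigma2)

print(s1_big)
print(s2_big)
print(np.array_equal(via_matrix_product, via_direct_product))
```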
From Eqs. (10.1.17a) and (10.1.17c) it follows that the commutation relations
between the position and momentum operators on $V_1\otimes V_2$ are
$$ [X_i^{(1)\otimes(2)}, P_j^{(1)\otimes(2)}] = i\hbar\delta_{ij}\, I^{(1)}\otimes I^{(2)} = i\hbar\delta_{ij}\, I^{(1)\otimes(2)} $$
$$ [X_i^{(1)\otimes(2)}, X_j^{(1)\otimes(2)}] = [P_i^{(1)\otimes(2)}, P_j^{(1)\otimes(2)}] = 0, \qquad i, j = 1, 2 \tag{10.1.18} $$
Now we are ready to assert something that may have been apparent all along:
the space $V_1\otimes V_2$ is just $V_{1\otimes 2}$, $|x_1\rangle\otimes|x_2\rangle$ is just $|x_1x_2\rangle$, and $X_1^{(1)\otimes(2)}$ is just $X_1$, etc.
Notice first that both spaces have the same dimensionality: the vectors $|x_1x_2\rangle$ and
$|x_1\rangle\otimes|x_2\rangle$ are both in one-to-one correspondence with points in the $x_1$-$x_2$ plane.
Notice next that the two sets of operators $X_1, \ldots, P_2$ and $X_1^{(1)\otimes(2)}, \ldots, P_2^{(1)\otimes(2)}$ have
the same connotation and commutation rules [Eqs. (10.1.1) and (10.1.18)]. Since X
and P are defined by their commutators we can make the identification
$$ X_i^{(1)\otimes(2)} = X_i, \qquad P_i^{(1)\otimes(2)} = P_i \tag{10.1.19a} $$
We can also identify the simultaneous eigenkets of the position operators (since they
are nondegenerate):
$$ |x_1\rangle\otimes|x_2\rangle = |x_1x_2\rangle \tag{10.1.19b} $$
In future, we shall use the more compact symbols occurring on the right-hand side
of Eqs. (10.1.19). We will, however, return to the concept of direct products of
vectors and operators on and off and occasionally use the symbols on the left-hand
side. Although the succinct notation suppresses the label $(1\otimes 2)$ of the space on
which the operators act, it should be clear from the context. Consider, for example,
the CM kinetic energy operator of the two-particle system:
$$ T_{CM} = \frac{P_{CM}^2}{2(m_1 + m_2)} = \frac{P_{CM}^2}{2M} = \frac{(P_1 + P_2)^2}{2M} = \frac{P_1^2 + P_2^2 + 2P_1P_2}{2M} $$
which really means
$$ 2M\,T_{CM}^{(1)\otimes(2)} = (P_1^2)^{(1)\otimes(2)} + (P_2^2)^{(1)\otimes(2)} + 2P_1^{(1)\otimes(2)}P_2^{(1)\otimes(2)} = (P_1^{(1)}\otimes I^{(2)})^2 + (I^{(1)}\otimes P_2^{(2)})^2 + 2P_1^{(1)}\otimes P_2^{(2)} $$
The Direct Product Revisited
Since the notion of a direct product space is so important, we revisit the formation
of $V_{1\otimes 2}$ as a direct product of $V_1$ and $V_2$, but this time in the coordinate basis
instead of in the abstract. Let $\Omega_1^{(1)}$ be an operator on $V_1$ whose nondegenerate
eigenfunctions $\psi_{\omega_1}(x_1) \equiv \omega_1(x_1)$ form a complete basis. Similarly let $\omega_2(x_2)$ form a
basis for $V_2$. Consider now a function $\psi(x_1, x_2)$, which represents the abstract
ket $|\psi\rangle$ from $V_{1\otimes 2}$. If we keep $x_1$ fixed at some value, say $\bar{x}_1$, then $\psi$ becomes a
function of $x_2$ alone and may be expanded as
$$ \psi(\bar{x}_1, x_2) = \sum_{\omega_2} C_{\omega_2}(\bar{x}_1)\,\omega_2(x_2) \tag{10.1.20} $$
Notice that the coefficients of the expansion depend on the value of $\bar{x}_1$. We now
expand the function $C_{\omega_2}(\bar{x}_1)$ in the basis $\omega_1(\bar{x}_1)$:
$$ C_{\omega_2}(\bar{x}_1) = \sum_{\omega_1} C_{\omega_1,\omega_2}\,\omega_1(\bar{x}_1) \tag{10.1.21} $$
Feeding this back to the first expansion and dropping the bar on $\bar{x}_1$ we get
$$ \psi(x_1, x_2) = \sum_{\omega_1}\sum_{\omega_2} C_{\omega_1,\omega_2}\,\omega_1(x_1)\,\omega_2(x_2) \tag{10.1.22a} $$
What does this expansion of an arbitrary $\psi(x_1, x_2)$ in terms of $\omega_1(x_1)\times\omega_2(x_2)$ imply?
Equation (10.1.22a) is the coordinate space version of the abstract result
$$ |\psi\rangle = \sum_{\omega_1}\sum_{\omega_2} C_{\omega_1,\omega_2}\,|\omega_1\rangle\otimes|\omega_2\rangle \tag{10.1.22b} $$
which means $V_{1\otimes 2} = V_1\otimes V_2$, for $|\psi\rangle$ belongs to $V_{1\otimes 2}$ and $|\omega_1\rangle\otimes|\omega_2\rangle$ span $V_1\otimes V_2$.
If we choose $\Omega = X$, we get the familiar basis $|x_1\rangle\otimes|x_2\rangle$. By dotting both sides of
Eq. (10.1.22b) with these basis vectors we regain Eq. (10.1.22a). (In the coordinate
basis, the direct product of the kets $|\omega_1\rangle$ and $|\omega_2\rangle$ becomes just the ordinary product
of the corresponding wave functions.)
Consider next the operators. The momentum operator on $V_1$, which used to be
$-i\hbar\, d/dx_1$, now becomes $-i\hbar\,\partial/\partial x_1$, where the partial derivative symbol tells us it
operates on $x_1$ as before and leaves $x_2$ alone. This is the coordinate space version of
$P_1^{(1)\otimes(2)} = P_1^{(1)}\otimes I^{(2)}$. You are encouraged to pursue this analysis further.
Evolution of the Two-Particle State Vector
The state vector of the system is an element of $V_{1\otimes 2}$. It evolves in time according
to the equation
$$ i\hbar|\dot{\psi}\rangle = \left[\frac{P_1^2}{2m_1} + \frac{P_2^2}{2m_2} + V(X_1, X_2)\right]|\psi\rangle = H|\psi\rangle \tag{10.1.23} $$
There are two classes of problems.
Class A: H is separable, i.e.,
$$ H = \frac{P_1^2}{2m_1} + V_1(X_1) + \frac{P_2^2}{2m_2} + V_2(X_2) = H_1 + H_2 \tag{10.1.24} $$
Class B: H is not separable, i.e.,
$$ V(X_1, X_2) \ne V_1(X_1) + V_2(X_2) $$
and
$$ H \ne H_1 + H_2 \tag{10.1.25} $$
Class A corresponds to two particles interacting with external potentials $V_1$ and $V_2$
but not with each other, while in class B there is no such restriction. We now examine
these two classes.
Class A: Separable Hamiltonians. Classically, the decomposition
$$ \mathcal{H} = \mathcal{H}_1(x_1, p_1) + \mathcal{H}_2(x_2, p_2) $$
means that the two particles evolve independently of each other. In particular, their
energies are separately conserved and the total energy $E$ is $E_1 + E_2$. Let us see these
results reappear in quantum theory. For a stationary state,
$$ |\psi(t)\rangle = |E\rangle e^{-iEt/\hbar} \tag{10.1.26} $$
Eq. (10.1.23) becomes
$$ [H_1(X_1, P_1) + H_2(X_2, P_2)]|E\rangle = E|E\rangle \tag{10.1.27} $$
Since $[H_1, H_2] = 0$ [Eq. (10.1.17a)] we can find their simultaneous eigenstates, which
are none other than $|E_1\rangle\otimes|E_2\rangle = |E_1E_2\rangle$, where $|E_1\rangle$ and $|E_2\rangle$ are solutions to
$$ H_1^{(1)}|E_1\rangle = E_1|E_1\rangle \tag{10.1.28a} $$
and
$$ H_2^{(2)}|E_2\rangle = E_2|E_2\rangle \tag{10.1.28b} $$
It should be clear that the state $|E_1\rangle\otimes|E_2\rangle$ corresponds to particle 1 being in
the energy eigenstate $|E_1\rangle$ and particle 2 being in the energy eigenstate $|E_2\rangle$. Clearly
$$ H|E\rangle = (H_1 + H_2)\,|E_1\rangle\otimes|E_2\rangle = (E_1 + E_2)\,|E_1\rangle\otimes|E_2\rangle = (E_1 + E_2)|E\rangle $$
so that
$$ E = E_1 + E_2 \tag{10.1.28c} $$
(The basis $|E_1\rangle\otimes|E_2\rangle$ is what we would get if, in forming basis vectors of the direct
product $V_1\otimes V_2$, we took the energy eigenvectors from each space, instead of, say,
the position eigenvectors.) Finally, feeding $|E\rangle = |E_1\rangle\otimes|E_2\rangle$, $E = E_1 + E_2$ into Eq.
(10.1.26) we get
$$ |\psi(t)\rangle = |E_1\rangle e^{-iE_1t/\hbar}\otimes|E_2\rangle e^{-iE_2t/\hbar} \tag{10.1.29} $$
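The additivity of the energies can be illustrated in a small matrix model. In the sketch below, two arbitrary real symmetric matrices stand in for $H_1$ and $H_2$ (their sizes and entries are invented for the example), the total Hamiltonian is assembled with `np.kron` as $H_1\otimes I + I\otimes H_2$, and its spectrum is compared with all pairwise sums $E_1 + E_2$.

```python
import numpy as np

# Hypothetical one-particle Hamiltonians (any Hermitian matrices would do).
H1 = np.array([[1.0, 0.5],
               [0.5, 2.0]])
H2 = np.array([[0.0, 1.0, 0.0],
               [1.0, 0.0, 1.0],
               [0.0, 1.0, 3.0]])
I1, I2 = np.eye(2), np.eye(3)

# Separable two-particle Hamiltonian on the direct product space.
H = np.kron(H1, I2) + np.kron(I1, H2)

E1, U1 = np.linalg.eigh(H1)   # columns of U1 are the |E1>
E2, U2 = np.linalg.eigh(H2)   # columns of U2 are the |E2>

spectrum = np.sort(np.linalg.eigvalsh(H))
sums = np.sort(np.add.outer(E1, E2).ravel())   # every E1 + E2

# A product of one-particle eigenvectors is an eigenvector of H.
ket = np.kron(U1[:, 0], U2[:, 1])
residual = H @ ket - (E1[0] + E2[1]) * ket

print(np.allclose(spectrum, sums), np.allclose(residual, 0))
```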
It is worth rederiving Eqs. (10.1.28) and (10.1.29) in the coordinate basis to
illustrate a useful technique that you will find in other textbooks. By projecting the
eigenvalue Eq. (10.1.27) on this basis, and making the usual operator substitutions,
Eq. (10.1.4), we obtain
$$ \left[-\frac{\hbar^2}{2m_1}\frac{\partial^2}{\partial x_1^2} + V_1(x_1) - \frac{\hbar^2}{2m_2}\frac{\partial^2}{\partial x_2^2} + V_2(x_2)\right]\psi_E(x_1, x_2) = E\,\psi_E(x_1, x_2) $$
where
$$ \psi_E(x_1, x_2) = \langle x_1x_2|E\rangle \tag{10.1.30} $$
We solve the equation by the method of separation of variables. We assume
$$ \psi_E(x_1, x_2) = \psi_{E_1}(x_1)\,\psi_{E_2}(x_2) \tag{10.1.31} $$
The subscripts $E_1$ and $E_2$ have no specific interpretation yet and merely serve as
labels. Feeding this ansatz into Eq. (10.1.30) and then dividing both sides by
$\psi_{E_1}(x_1)\psi_{E_2}(x_2)$ we get
$$ \frac{1}{\psi_{E_1}(x_1)}\left[-\frac{\hbar^2}{2m_1}\frac{\partial^2}{\partial x_1^2} + V_1(x_1)\right]\psi_{E_1}(x_1) + \frac{1}{\psi_{E_2}(x_2)}\left[-\frac{\hbar^2}{2m_2}\frac{\partial^2}{\partial x_2^2} + V_2(x_2)\right]\psi_{E_2}(x_2) = E \tag{10.1.32} $$
This equation says that a function of $x_1$ alone, plus one of $x_2$ alone, equals a constant
$E$. Since $x_1$ and $x_2$, and hence the two functions, may be varied independently, it
follows that each function separately equals a constant. We will call these constants
$E_1$ and $E_2$. Thus Eq. (10.1.32) breaks down into three equations:
$$ \frac{1}{\psi_{E_1}(x_1)}\left[-\frac{\hbar^2}{2m_1}\frac{\partial^2}{\partial x_1^2} + V_1(x_1)\right]\psi_{E_1}(x_1) = E_1 $$
$$ \frac{1}{\psi_{E_2}(x_2)}\left[-\frac{\hbar^2}{2m_2}\frac{\partial^2}{\partial x_2^2} + V_2(x_2)\right]\psi_{E_2}(x_2) = E_2 \tag{10.1.33} $$
$$ E_1 + E_2 = E $$
Consequently
$$ \psi_E(x_1, x_2, t) = \psi_E(x_1, x_2)\,e^{-iEt/\hbar} = \psi_{E_1}(x_1)\,e^{-iE_1t/\hbar}\,\psi_{E_2}(x_2)\,e^{-iE_2t/\hbar} \tag{10.1.34} $$
where $\psi_{E_1}$ and $\psi_{E_2}$ are eigenfunctions of the one-particle Schrödinger equation with
eigenvalues $E_1$ and $E_2$, respectively. We recognize Eqs. (10.1.33) and (10.1.34) to be
the projections of Eqs. (10.1.28) and (10.1.29) on $|x_1x_2\rangle = |x_1\rangle\otimes|x_2\rangle$.
Class B: Two Interacting Particles. Consider next the more general problem of
two interacting particles with
$$ \mathcal{H} = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} + V(x_1, x_2) \tag{10.1.35} $$
where
$$ V(x_1, x_2) \ne V_1(x_1) + V_2(x_2) $$
Generally this cannot be reduced to two independent single-particle problems. If,
however,
$$ V(x_1, x_2) = V(x_1 - x_2) \tag{10.1.36} $$
which describes two particles responding to each other but nothing external, one can
always, by employing the CM coordinate
$$ x_{CM} = \frac{m_1x_1 + m_2x_2}{m_1 + m_2} \tag{10.1.37a} $$
and the relative coordinate
$$ x = x_1 - x_2 \tag{10.1.37b} $$
reduce the problem to that of two independent fictitious particles: one, the CM,
which is free, has mass $M = m_1 + m_2$ and momentum
$$ p_{CM} = M\dot{x}_{CM} = m_1\dot{x}_1 + m_2\dot{x}_2 $$
and another, with the reduced mass $\mu = m_1m_2/(m_1 + m_2)$ and momentum $p = \mu\dot{x}$, moving
under the influence of $V(x)$:
$$ \mathcal{H}(x_1, p_1; x_2, p_2) \to \mathcal{H}(x_{CM}, p_{CM}; x, p) = \mathcal{H}_{CM} + \mathcal{H}_{\text{relative}} = \frac{p_{CM}^2}{2M} + \frac{p^2}{2\mu} + V(x) \tag{10.1.38} $$
which is just the result from Exercise 2.5.4 modified to one dimension. Since the new
variables are also canonical (Exercise 2.7.6) and Cartesian, the quantization condition
is just
$$ [X_{CM}, P_{CM}] = i\hbar \tag{10.1.39a} $$
$$ [X, P] = i\hbar \tag{10.1.39b} $$
and all other commutators zero. In the quantum theory,
$$ H = \frac{P_{CM}^2}{2M} + \frac{P^2}{2\mu} + V(X) \tag{10.1.40} $$
and the eigenfunctions of H factorize:
$$ \psi_E(x_{CM}, x) = \frac{e^{ip_{CM}x_{CM}/\hbar}}{(2\pi\hbar)^{1/2}}\,\psi_{E_{\text{rel}}}(x), \qquad E = \frac{p_{CM}^2}{2M} + E_{\text{rel}} \tag{10.1.41} $$
The real dynamics is contained in $\psi_{E_{\text{rel}}}(x)$, which is the energy eigenfunction for a
particle of mass $\mu$ in a potential $V(x)$. Since the CM drifts along as a free particle,
one usually chooses to study the problem in the CM frame. In this case $E_{CM} = p_{CM}^2/2M$
drops out of the energy, and the plane wave factor in $\psi$ representing CM
motion becomes a constant. In short, one can forget all about the CM in the quantum
theory just as in the classical theory.
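The classical identity behind Eq. (10.1.38), namely that the kinetic energy of the two particles equals the CM kinetic energy plus that of a particle with the reduced mass, can be spot-checked with a few lines of arithmetic. The function name `split_kinetic_energy` and the sample masses and velocities are invented for this sketch.

```python
def split_kinetic_energy(m1, m2, v1, v2):
    """Return the kinetic energy computed in both sets of variables."""
    M = m1 + m2                   # total mass
    mu = m1 * m2 / M              # reduced mass
    p_cm = m1 * v1 + m2 * v2      # equals M * d(x_CM)/dt
    p_rel = mu * (v1 - v2)        # equals mu * dx/dt, with x = x1 - x2
    original = 0.5 * m1 * v1**2 + 0.5 * m2 * v2**2
    transformed = p_cm**2 / (2 * M) + p_rel**2 / (2 * mu)
    return original, transformed

original, transformed = split_kinetic_energy(m1=1.5, m2=4.0, v1=2.0, v2=-0.7)
print(original, transformed)
```

The two numbers agree up to rounding for any choice of masses and velocities, which is the content of the kinetic part of Eq. (10.1.38).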
N Particles in One Dimension
All the results but one generalize from N= 2 to arbitrary N. The only exception
is the result from the last subsection: for N> 2, one generally cannot, by using CM
and relative coordinates (or other sets of coordinates) reduce the problem to N
independent oneparticle problems. There are a few exceptions, the most familiar
ones being Hamiltonians quadratic in the coordinates and momenta which may be
reduced to a sum over oscillator Hamiltonians by the use of normal coordinates. In
such cases the oscillators become independent and their energies add both in the
classical and quantum cases. This result (with respect to the quantum oscillators)
was assumed in the discussion on specific heats in Chapter 7.
Exercise 10.1.3.* Consider the Hamiltonian of the coupled mass system:
$$ \mathcal{H} = \frac{p_1^2}{2m} + \frac{p_2^2}{2m} + \frac{1}{2}m\omega^2\left[x_1^2 + x_2^2 + (x_1 - x_2)^2\right] $$
We know from Example 1.8.6 that $\mathcal{H}$ can be decoupled if we use normal coordinates
$$ x_{I,II} = \frac{x_1 \pm x_2}{2^{1/2}} $$
and the corresponding momenta
$$ p_{I,II} = \frac{p_1 \pm p_2}{2^{1/2}} $$
(1) Rewrite $\mathcal{H}$ in terms of normal coordinates. Verify that the normal coordinates are
also canonical, i.e., that
$$ \{x_i, p_j\} = \delta_{ij} \ \text{etc.}; \qquad i, j = I, II $$
Now quantize the system, promoting these variables to operators obeying
$$ [X_i, P_j] = i\hbar\delta_{ij} \ \text{etc.}; \qquad i, j = I, II $$
Write the eigenvalue equation for H in the simultaneous eigenbasis of $X_I$ and $X_{II}$.
(2) Quantize the system directly, by promoting $x_1$, $x_2$, $p_1$, and $p_2$ to quantum operators.
Write the eigenvalue equation for H in the simultaneous eigenbasis of $X_1$ and $X_2$. Now change
from $x_1$, $x_2$ (and of course $\partial/\partial x_1$, $\partial/\partial x_2$) to $x_I$, $x_{II}$ (and $\partial/\partial x_I$, $\partial/\partial x_{II}$) in the differential
equation. You should end up with the result from part (1).
In general, one can change coordinates and then quantize or first quantize and then
change variables in the differential equation, if the change of coordinates is canonical. (We
are assuming that all the variables are Cartesian. As mentioned earlier in the book, if one wants
to employ non-Cartesian coordinates, it is best to first quantize the Cartesian coordinates and
then change variables in the differential equation.)
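As a numerical companion to the normal-coordinate idea, the sketch below diagonalizes the quadratic form in the potential of the coupled mass system. It works in units where $m = \omega = \hbar = 1$ (an assumption made for the example) and writes the potential as $\tfrac{1}{2}\mathbf{x}^T K \mathbf{x}$ with a 2 × 2 matrix `K`; the matrix name and the use of `np.linalg.eigh` are choices of this sketch, not the method of Example 1.8.6.

```python
import numpy as np

# V = (1/2) [x1^2 + x2^2 + (x1 - x2)^2] = (1/2) x^T K x in units m = omega = 1.
K = np.array([[2.0, -1.0],
              [-1.0, 2.0]])

# Eigenvalues are the squared normal-mode frequencies (in units of omega^2);
# the orthonormal eigenvectors give the normal coordinates.
squared_freqs, modes = np.linalg.eigh(K)

# In the rotated coordinates the quadratic form is diagonal (decoupled).
decoupled = modes.T @ K @ modes

# Since independent oscillator energies add, the ground-state energy is the
# sum of the two zero-point energies (in units of hbar * omega).
ground_energy = 0.5 * np.sum(np.sqrt(squared_freqs))

print(squared_freqs)
print(np.round(modes, 4))
print(ground_energy)
```

Because the transformation is an orthogonal rotation applied identically to coordinates and momenta, the kinetic term keeps its form, which is why only the potential needs to be diagonalized here.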
10.2. More Particles in More Dimensions
Mathematically, the problem of a single particle in two dimensions (in terms of
Cartesian coordinates) is equivalent to that of two particles in one dimension. It is,
however, convenient to use a different notation in the two cases. We will denote the
two Cartesian coordinates of the single particle by $x$ and $y$ rather than $x_1$ and $x_2$.
Likewise the momenta will be denoted by $p_x$ and $p_y$. The quantum operators will be
called $X$ and $Y$, and $P_x$ and $P_y$; their common eigenkets $|xy\rangle$ and $|p_xp_y\rangle$, respectively;
and so on. The generalization to three dimensions is obvious. We will also write a
position eigenket as $|\mathbf{r}\rangle$ and the orthonormality relation $\langle xyz|x'y'z'\rangle =
\delta(x - x')\delta(y - y')\delta(z - z')$ as $\langle\mathbf{r}|\mathbf{r}'\rangle = \delta^3(\mathbf{r} - \mathbf{r}')$. The same goes for the momentum
eigenkets $|\mathbf{p}\rangle$ also. When several particles labeled by numbers $1, \ldots, N$ are involved,
this extra label will also be used. Thus $|\mathbf{p}_1\mathbf{p}_2\rangle$ will represent a two-particle state in
which particle 1 has momentum $\mathbf{p}_1$ and particle 2 has momentum $\mathbf{p}_2$, and so on.
Exercise 10.2.1.* (Particle in a Three-Dimensional Box). Recall that a particle in a one-dimensional
box extending from $x = 0$ to $L$ is confined to the region $0 < x < L$; its wave function
vanishes at the edges $x = 0$ and $L$ and beyond (Exercise 5.2.5). Consider now a particle confined
in a three-dimensional cubic box of volume $L^3$. Choosing as the origin one of its corners, and
the x, y, and z axes along the three edges meeting there, show that the normalized energy
eigenfunctions are
$$ \psi_E(x, y, z) = \left(\frac{2}{L}\right)^{1/2}\sin\left(\frac{n_x\pi x}{L}\right)\left(\frac{2}{L}\right)^{1/2}\sin\left(\frac{n_y\pi y}{L}\right)\left(\frac{2}{L}\right)^{1/2}\sin\left(\frac{n_z\pi z}{L}\right) $$
where
$$ E = \frac{\hbar^2\pi^2}{2ML^2}\left(n_x^2 + n_y^2 + n_z^2\right) $$
and the $n_i$ are positive integers.
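To get a feel for the spectrum, one can tabulate how many triplets $(n_x, n_y, n_z)$ share the same value of $n_x^2 + n_y^2 + n_z^2$, i.e., the same energy in units of $\hbar^2\pi^2/2ML^2$. The function name `box_levels` and the cutoff `n_max` are assumptions of this sketch; counts are complete only for levels lying below the first state that involves a quantum number larger than the cutoff.

```python
from collections import Counter
from itertools import product

def box_levels(n_max):
    """Count states by n_x^2 + n_y^2 + n_z^2, the energy in units of
    hbar^2 pi^2 / (2 M L^2), for quantum numbers 1..n_max."""
    return Counter(nx * nx + ny * ny + nz * nz
                   for nx, ny, nz in product(range(1, n_max + 1), repeat=3))

levels = box_levels(4)   # complete for energies below 5^2 + 1 + 1 = 27

for energy in sorted(levels)[:6]:
    print(energy, levels[energy])
```

The ground state (1, 1, 1) is nondegenerate, while the next level is shared by the three permutations of (1, 1, 2), a degeneracy that reflects the cubic symmetry of the box.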
Exercise 10.2.2.* Quantize the two-dimensional oscillator for which
$$ \mathcal{H} = \frac{p_x^2 + p_y^2}{2m} + \frac{1}{2}m\omega_x^2x^2 + \frac{1}{2}m\omega_y^2y^2 $$
(1) Show that the allowed energies are
$$ E = (n_x + 1/2)\hbar\omega_x + (n_y + 1/2)\hbar\omega_y, \qquad n_x, n_y = 0, 1, 2, \ldots $$
(2) Write down the corresponding wave functions in terms of single oscillator wave
functions. Verify that they have definite parity (even/odd) under $x \to -x$, $y \to -y$ and that
the parity depends only on $n = n_x + n_y$.
(3) Consider next the isotropic oscillator ($\omega_x = \omega_y$). Write explicit, normalized eigenfunctions
of the first three states (that is, for the cases $n = 0$ and 1). Reexpress your results in
terms of polar coordinates $\rho$ and $\phi$ (for later use). Show that the degeneracy of a level with
$E = (n + 1)\hbar\omega$ is $(n + 1)$.
Figure 10.1. Two identical billiard balls start near holes 1 and 2 and
end up in holes 3 and 4, respectively, as predicted by $P_1$. The prediction
of $P_2$, that they would end up in holes 4 and 3, respectively,
is wrong, even though the two final configurations would be indistinguishable
to an observer who walks in at $t = T$.
Exercise 10.2.3.* Quantize the three-dimensional isotropic oscillator for which
$$ \mathcal{H} = \frac{p_x^2 + p_y^2 + p_z^2}{2m} + \frac{1}{2}m\omega^2(x^2 + y^2 + z^2) $$
(1) Show that $E = (n + 3/2)\hbar\omega$; $n = n_x + n_y + n_z$; $n_x, n_y, n_z = 0, 1, 2, \ldots$.
(2) Write the corresponding eigenfunctions in terms of single-oscillator wave functions
and verify that the parity of the level with a given $n$ is $(-1)^n$. Reexpress the first four states
in terms of spherical coordinates. Show that the degeneracy of a level with energy $E =
(n + 3/2)\hbar\omega$ is $(n + 1)(n + 2)/2$.
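The degeneracy formulas quoted in Exercises 10.2.2 and 10.2.3 can be checked by brute-force counting: the degeneracy of the level $n$ is the number of ways of writing $n$ as an ordered sum of non-negative integers, one per Cartesian direction. The function name `degeneracy` is an invention of this sketch, and the counting is only a check, not the derivation the exercises ask for.

```python
from itertools import product

def degeneracy(n, dims):
    """Number of ordered tuples of `dims` non-negative integers summing to n."""
    return sum(1 for ns in product(range(n + 1), repeat=dims) if sum(ns) == n)

for n in range(5):
    # Two dimensions: expect n + 1. Three dimensions: expect (n + 1)(n + 2)/2.
    print(n, degeneracy(n, 2), degeneracy(n, 3))
```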
10.3. Identical Particles
The formalism developed above, when properly applied to a system containing
identical particles, leads to some very surprising results. We shall say two particles
are identical if they are exact replicas of each other in every respect—there should
be no experiment that detects any intrinsic‡ difference between them. Although the
definition of identical particles is the same classically and quantum mechanically,
the implications are different in the two cases.
The Classical Case
Let us first orient ourselves by recapitulating the situation in classical physics.
Imagine a billiard table with four holes, numbered 1 through 4 (Fig. 10.1). Near
holes 1 and 2 rest two identical billiard balls. Let us call these balls 1 and 2. The
difference between the labels reflects not any intrinsic difference in the balls (for they
are identical) but rather a difference in their environments, namely, the holes near
which they find themselves.
‡ By intrinsic I mean properties inherent to the particle, such as its charge or mass and not its location
or momentum.
Now it follows from the definition of identity that if these two balls are
exchanged, the resulting configuration would appear exactly the same. Nonetheless
these two configurations are treated as distinct in classical physics. In order for this
distinction to be meaningful, there must exist some experiments in which these two
configurations are inequivalent. We will now discuss one such experiment.
Imagine that at time $t = 0$, two players propel the balls toward the center of the
table. At once two physicists $P_1$ and $P_2$ take the initial-value data and make the
following predictions:
$P_1$: at $t = T$, ball 1 goes to hole 3 and ball 2 goes to hole 4
$P_2$: at $t = T$, ball 1 goes to hole 4 and ball 2 goes to hole 3
Say at time $T$ we find that ball 1 ends up in hole 3 and ball 2 in hole 4. We
declare that $P_1$ is correct and $P_2$ is wrong. Now, the configurations predicted by
them for $t = T$ differ only by the exchange of two identical particles. If seen in
isolation they would appear identical: an observer who walks in just at $t = T$ and is
given the predictions of $P_1$ and $P_2$ will conclude that both are right. What do we
know about the balls (that allows us to make a distinction between them and hence
the two outcomes), that the newcomer does not? The answer of course is—their
histories. Although both balls appear identical to the newcomer, we are able to trace
the ball in hole 3 back to the vicinity of hole 1 and the one in hole 4 back to hole
2. Similarly at $t = 0$, the two balls which seemed identical to us would be distinguishable
to someone who had been following them from an earlier period. Now of
course it is not really necessary that either we or any other observer be actually
present in order for this distinction to exist. One imagines in classical physics the
fictitious observer who sees everything and disturbs nothing; if he can make the
distinction, the distinction exists.
To summarize, it is possible in classical mechanics to distinguish between identical
particles by following their nonidentical trajectories (without disturbing them in
any way). Consequently two configurations related by exchanging the identical particles
are physically nonequivalent.
An immediate consequence of the above reasoning, and one that will play a
dominant role in what follows, is that in quantum theory, which completely outlaws
the notion of continuous trajectories for the particles, there exists no physical basis
for distinguishing between identical particles. Consequently two configurations
related by the exchange of identical particles must be treated as one and the same
configuration and described by the same state vector. We now proceed to deduce
the consequences of this restriction.
Two-Particle Systems: Symmetric and Antisymmetric States
Suppose we have a system of two distinguishable particles 1 and 2 and a position
measurement on the system shows particle 1 to be at $x = a$ and particle 2 to be at
$x = b$. We write the state just after measurement as
$$ |\psi\rangle = |x_1 = a, x_2 = b\rangle = |ab\rangle \tag{10.3.1} $$
where we are adopting the convention that the state of particle 1 is described by the
first label ($a$) and that of particle 2 by the second label ($b$). Since the particles are
distinguishable, the state obtained by exchanging them is distinguishable from the
above. It is given by
$$ |\psi\rangle = |ba\rangle $$
and corresponds to having found particle 1 at $b$ and particle 2 at $a$.
Suppose we repeat the experiment with two identical particles and catch one at
$x = a$ and the other at $x = b$. Is the state vector just after measurement $|ab\rangle$ or $|ba\rangle$?
The answer is, neither. We have seen that in quantum theory two configurations
related by the exchange of identical particles must be viewed as one and the same
and be described by the same state. Since $|\psi\rangle$ and $\alpha|\psi\rangle$ are physically equivalent,
we require that $|\psi(a, b)\rangle$, the state vector just after the measurement, satisfy the
constraint
$$ |\psi(a, b)\rangle = \alpha|\psi(b, a)\rangle \tag{10.3.2} $$
where $\alpha$ is any complex number. Since under the exchange
$$ |ab\rangle \leftrightarrow |ba\rangle $$
and the two vectors are not multiples of each other‡ (i.e., are physically distinct),
neither is acceptable. The problem is that our position measurement yields not an
ordered pair of numbers (as in the distinguishable particle case) but just a pair of
numbers: to assign them to the particles in a definite way is to go beyond what is
physically meaningful in quantum theory. What our measurement does permit us to
conclude is that the state vector is an eigenstate of $X_1 + X_2$ with eigenvalue $a + b$, the
sum of the eigenvalues being insensitive to how the values $a$ and $b$ are assigned to
the particles. In other words, given an unordered pair of numbers $a$ and $b$ we can
still define a unique sum (but not difference). Now, there are just two product vectors,
$|ab\rangle$ and $|ba\rangle$, with this eigenvalue, and the state vector lies somewhere in the two-dimensional
degenerate (with respect to $X_1 + X_2$) eigenspace spanned by them. Let
$|\psi(a, b)\rangle = \beta|ab\rangle + \gamma|ba\rangle$ be the allowed vector. If we impose the constraint
Eq. (10.3.2):
$$ \beta|ab\rangle + \gamma|ba\rangle = \alpha\left[\beta|ba\rangle + \gamma|ab\rangle\right] $$
we find, upon equating the coefficients of $|ab\rangle$ and $|ba\rangle$, that
$$ \beta = \alpha\gamma, \qquad \gamma = \alpha\beta $$
so that
$$ \alpha = \pm 1 \tag{10.3.3} $$
‡ We are assuming $a \ne b$. If $a = b$, the state is acceptable, but the choice we are agonizing over does not
arise.
It is now easy to construct the allowed state vectors. They are
$$ |ab, S\rangle = |ab\rangle + |ba\rangle \tag{10.3.4} $$
called the symmetric state vector ($\alpha = 1$) and
$$ |ab, A\rangle = |ab\rangle - |ba\rangle \tag{10.3.5} $$
called the antisymmetric state vector ($\alpha = -1$). (These are unnormalized vectors. Their
normalization will be taken up shortly.)
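The constraint $\alpha = \pm 1$ can be visualized in a small discrete model. The sketch below introduces, purely for illustration, a matrix `P12` that swaps the two labels, $|ij\rangle \to |ji\rangle$, on a three-point "position" space; this exchange matrix is a device of the example and has not been introduced in the text. The unnormalized vectors of Eqs. (10.3.4) and (10.3.5) then come out as its eigenvectors with eigenvalues $+1$ and $-1$.

```python
import numpy as np

d = 3  # hypothetical number of allowed position values

def ket(i):
    """One-particle basis vector |i>."""
    v = np.zeros(d)
    v[i] = 1.0
    return v

def two(i, j):
    """Two-particle product vector |ij> = |i> (x) |j>."""
    return np.kron(ket(i), ket(j))

# Exchange matrix defined by P12 |ij> = |ji>.
P12 = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        P12 += np.outer(two(j, i), two(i, j))

a, b = 0, 2
sym = two(a, b) + two(b, a)    # |ab, S>, unnormalized
anti = two(a, b) - two(b, a)   # |ab, A>, unnormalized

exchange_eigenvalues = np.unique(np.round(np.linalg.eigvalsh(P12), 10))
print(exchange_eigenvalues)
print(np.allclose(P12 @ sym, sym), np.allclose(P12 @ anti, -anti))
```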
More generally, if some variable $\Omega$ is measured and the values $\omega_1$ and $\omega_2$ are
obtained, the state vector immediately following the measurement is either $|\omega_1\omega_2, S\rangle$
or $|\omega_1\omega_2, A\rangle$.‡ Although we have made a lot of progress in nailing down the state
vector corresponding to the measurement, we have still to find a way to choose
between these two alternatives.
Bosons and Fermions
Although both S and A states seem physically acceptable (in that they respect
the indistinguishability of the particles) we can go a step further and make the
following assertion:
A given species of particles must choose once and for all between S and A states.
Suppose the contrary were true, and the Hilbert space of two identical particles
contained both S and A vectors. Then the space also contains linear combinations
such as
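$$ |\psi\rangle = \alpha|\omega_1\omega_2, S\rangle + \beta|\omega_1'\omega_2', A\rangle $$
which are neither symmetric nor antisymmetric and hence violate the constraint of
Eq. (10.3.2). So we rule out this possibility.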
Nature seems to respect the constraints we have deduced. Particles such as the
pion, photon, and graviton are always found in symmetric states and are called
bosons, and particles such as the electron, proton, and neutron are always found in
antisymmetric states and are called fermions.
Thus if we catch two identical bosons, one at $x = a$ and the other at $x = b$, the
state vector immediately following the measurement is
$$ |\psi\rangle = |x_1 = a, x_2 = b\rangle + |x_1 = b, x_2 = a\rangle = |ab\rangle + |ba\rangle = |ab, S\rangle $$
Had the particles been fermions, the state vector after the measurement would have
been
$$ |\psi\rangle = |x_1 = a, x_2 = b\rangle - |x_1 = b, x_2 = a\rangle = |ab\rangle - |ba\rangle = |ab, A\rangle $$
Note that although we still use the labels $x_1$ and $x_2$, we do not attach them to the
particles in any particular way. Thus having caught the bosons at $x = a$ and $x = b$,
we need not agonize over whether $x_1 = a$ and $x_2 = b$ or vice versa. Either choice leads
to the same $|\psi\rangle$ for bosons, and to state vectors differing only by an overall sign
for fermions.
‡ We are assuming $\Omega$ is nondegenerate. If not, let $\omega$ represent the eigenvalues of a complete set of
commuting operators.
We are now in a position to deduce a fundamental property of fermions, which
results from the antisymmetry of their state vectors. Consider a twofermion state
$$ |\omega_1\omega_2, A\rangle = |\omega_1\omega_2\rangle - |\omega_2\omega_1\rangle $$
Let us now set $\omega_1 = \omega_2 = \omega$. We find
$$ |\omega\omega, A\rangle = |\omega\omega\rangle - |\omega\omega\rangle = 0 \tag{10.3.6} $$
This is the celebrated Pauli exclusion principle: Two identical fermions cannot be in
the same quantum state. This principle has profound consequences in statistical
mechanics, in understanding chemical properties of atoms, in nuclear theory, astrophysics,
etc. We will have occasion to return to it often.
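The vanishing in Eq. (10.3.6) is easy to reproduce in a toy model in which each one-particle state is a two-component vector. The function name `antisymmetrize` and the labels `up` and `down` are illustrative choices of this sketch (they are just two orthogonal basis vectors here, with no spin physics implied).

```python
import numpy as np

def antisymmetrize(u, v):
    """Unnormalized antisymmetric combination |uv> - |vu>."""
    return np.kron(u, v) - np.kron(v, u)

up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])

same = antisymmetrize(up, up)         # both particles in the same state
different = antisymmetrize(up, down)  # particles in different states

print(same)        # the null vector: no such state exists
print(different)   # a nonzero vector
```

Exchanging the two arguments of `antisymmetrize` flips the overall sign of the result, which is the matrix counterpart of the statement that the two labelings give state vectors differing only by a sign.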
With this important derivation out of our way, let us address a question that
may have plagued you: our analysis has only told us that a given type of particle,
say a pion, has to be either a boson or a fermion, but does not say which one. There
are two ways to the answer. The first is by further cerebration, to be specific, within
the framework of quantum field theory, which relates the spin of the particle to its
"statistics"—which is the term physicists use to refer to its bosonic or fermionic
nature. Since the relevant arguments are beyond the scope of this text I merely quote
the results here. Recall that the spin of the particle is its internal angular momentum.
The magnitude of spin happens to be an invariant for a particle (and thus serves as
a label, like its mass or charge) and can have only one of the following values: 0,
$\hbar/2$, $\hbar$, $3\hbar/2$, $2\hbar$, .... The spin-statistics theorem, provable in quantum field theory,
asserts that particles with (magnitude of spin) equal to an even multiple of $\hbar/2$ are
bosons, and those with spin equal to an odd multiple of $\hbar/2$ are fermions. However,
this connection, proven in three dimensions, does not apply to one dimension, where
it is not possible to define spin or any form of angular momentum. (This should be
clear classically.) Thus the only way to find if a particle in one dimension is a boson
or fermion is to determine the symmetry of the wave function experimentally. This
is the second method, to be discussed in a moment.
Before going on to this second method, let us note that the requirement that
the state vector of two identical particles be symmetric or antisymmetric (under the
exchange of the quantum numbers labeling them) applies in three dimensions as
well, as will be clear by going through the arguments in one dimension. The only
difference will be the increase in the number of labels. For example, the position
eigenket of a spin-zero boson will be labeled by three numbers $x$, $y$, and $z$. For
fermions, which have spin at least equal to $\hbar/2$, the states will be labeled by the
orientation of the spin as well as the orbital labels that describe spinless bosons.‡
We shall consider just spin-1/2 particles, for which this label can take only two values,
call them $+$ and $-$ or spin up and down (the meaning of these terms will be clear
later). If we denote by $\omega$ all the orbital labels and by $s$ the spin label, the state vector
of the fermion that is antisymmetric under the exchange of the particles, i.e., under
the exchange of all the labels, will be of the form
$$ |\omega_1s_1, \omega_2s_2, A\rangle = |\omega_1s_1, \omega_2s_2\rangle - |\omega_2s_2, \omega_1s_1\rangle \tag{10.3.7} $$
We see that the state vector vanishes if
$$ \omega_1 = \omega_2 \quad\text{and}\quad s_1 = s_2 \tag{10.3.8} $$
Thus we find once again that two fermions cannot be in the same quantum state,
but we mean by a quantum state a state of definite co and s. Thus two electrons can
be in the same orbital state if their spin orientations are different.
We now turn to the second way of finding the statistics of a given species of
particles, the method that works in one or three dimensions, because it appeals to a
simple experiment which determines whether the two-particle state vector is symmetric
or antisymmetric for the given species. As a prelude to the discussion of such an
experiment, let us study in some detail the Hilbert space of bosons and fermions.
Bosonic and Fermionic Hilbert Spaces
We have seen that two identical bosons will always have symmetric state vectors
and two identical fermions will always have antisymmetric state vectors. Let us call
the Hilbert space of symmetric bosonic vectors $V_S$ and the Hilbert space of the
antisymmetric fermionic vectors $V_A$. We first examine the relation between these
two spaces on the one hand and the direct product space $V_{1\otimes 2}$ on the other.
The space $V_{1\otimes 2}$ consists of all vectors of the form $|\omega_1\omega_2\rangle = |\omega_1\rangle \otimes |\omega_2\rangle$. To
each pair of vectors $|\omega_1 = a, \omega_2 = b\rangle$ and $|\omega_1 = b, \omega_2 = a\rangle$ there is one (unnormalized)
bosonic vector $|\omega_1 = a, \omega_2 = b\rangle + |\omega_1 = b, \omega_2 = a\rangle$ and one fermionic vector
$|\omega_1 = a, \omega_2 = b\rangle - |\omega_1 = b, \omega_2 = a\rangle$. If $a = b$, the vector $|\omega_1 = a, \omega_2 = a\rangle$ is already symmetric
and we may take it to be the bosonic vector. There is no corresponding fermionic
vector (the Pauli principle). Thus $V_{1\otimes 2}$ has just enough basis vectors to form one
bosonic Hilbert space and one fermionic Hilbert space. We express this relation as
$$V_{1\otimes 2} = V_S \oplus V_A \tag{10.3.9}$$
with $V_S$ getting slightly more than half the dimensionality of $V_{1\otimes 2}$.‡ Our analysis
has shown that at any given time, the state of two bosons is an element of $V_S$ and
that of two fermions an element of $V_A$. It can also be shown that a system that
starts out in $V_S$ ($V_A$) remains in $V_S$ ($V_A$) (see Exercise 10.3.5). Thus in studying two
identical particles we need only consider $V_S$ or $V_A$. It is however convenient, for
bookkeeping purposes, to view $V_S$ and $V_A$ as subspaces of $V_{1\otimes 2}$ and the elements
of $V_S$ or $V_A$ as elements also of $V_{1\otimes 2}$.
† Since spin has no classical counterpart, the operator representing it is not a function of the coordinate
and momentum operators and it commutes with any orbital operator $\Omega$. Thus spin may be specified
simultaneously with the orbital variables.
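The dimension count behind Eq. (10.3.9) can be checked by brute force. The sketch below is only an illustration: it assumes a single-particle variable with a finite number `n_levels` of eigenvalues (a hypothetical parameter, not something fixed by the text) and counts ordered pairs, unordered pairs with repetition (bosons), and unordered pairs without repetition (fermions).

```python
from itertools import product

def count_two_particle_states(n_levels):
    """Count basis vectors of the product, symmetric, and antisymmetric spaces."""
    labels = range(n_levels)
    pairs = list(product(labels, repeat=2))              # ordered pairs (a, b)
    dim_product = len(pairs)                             # n^2
    dim_sym = sum(1 for a, b in pairs if a <= b)         # a == b allowed: n(n+1)/2
    dim_anti = sum(1 for a, b in pairs if a < b)         # a != b only:    n(n-1)/2
    return dim_product, dim_sym, dim_anti

for n in (2, 3, 5):
    total, sym, anti = count_two_particle_states(n)
    # The symmetric and antisymmetric counts together exhaust the product space.
    assert total == sym + anti
    print(n, total, sym, anti)
```

The symmetric count exceeds the antisymmetric one by exactly `n_levels`, the number of kets with both labels equal, which is why the bosonic space gets slightly more than half the dimensionality.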
Let us now consider the normalization of the vectors in $V_S$. Consider first the
eigenkets $|\omega_1\omega_2, S\rangle$ corresponding to a variable $\Omega$ with discrete eigenvalues. The
unnormalized state vector is
$$|\omega_1\omega_2, S\rangle = |\omega_1\omega_2\rangle + |\omega_2\omega_1\rangle$$
Since $|\omega_1\omega_2\rangle$ and $|\omega_2\omega_1\rangle$ are orthonormal states in $V_{1\otimes 2}$, the normalization factor
is just $2^{-1/2}$, i.e.,
$$|\omega_1\omega_2, S\rangle = 2^{-1/2}\,\big[\,|\omega_1\omega_2\rangle + |\omega_2\omega_1\rangle\,\big] \tag{10.3.10a}$$
is a normalized eigenvector. You may readily check that $\langle\omega_1\omega_2, S|\omega_1\omega_2, S\rangle = 1$.
The preceding discussion assumes $\omega_1 \neq \omega_2$. If $\omega_1 = \omega_2 = \omega$ the product ket $|\omega\omega\rangle$ is
itself both symmetric and normalized and we choose
$$|\omega\omega, S\rangle = |\omega\omega\rangle \tag{10.3.10b}$$
Any vector $|\psi_S\rangle$ in $V_S$ may be expanded in terms of this $\Omega$ basis. As usual we
identify
$$P_S(\omega_1, \omega_2) = |\langle\omega_1\omega_2, S|\psi_S\rangle|^2 \tag{10.3.11}$$
as the absolute probability of finding the particles in state $|\omega_1\omega_2, S\rangle$ when an $\Omega$
measurement is made on a system in state $|\psi_S\rangle$. The normalization condition of
$|\psi_S\rangle$ and $P_S(\omega_1, \omega_2)$ may be written as
$$1 = \langle\psi_S|\psi_S\rangle = \sum_{\text{dist}} |\langle\omega_1\omega_2, S|\psi_S\rangle|^2 = \sum_{\text{dist}} P_S(\omega_1, \omega_2) \tag{10.3.12a}$$
where $\sum_{\text{dist}}$ denotes a sum over all physically distinct states. If $\omega_1$ and $\omega_2$ take values
between $\omega_{\min}$ and $\omega_{\max}$, then
$$\sum_{\text{dist}} = \sum_{\omega_2=\omega_{\min}}^{\omega_{\max}} \; \sum_{\omega_1=\omega_{\min}}^{\omega_2} \tag{10.3.12b}$$
In this manner we avoid counting both $|\omega_1\omega_2, S\rangle$ and $|\omega_2\omega_1, S\rangle$, which are physically
equivalent. Another way is to count them both and then divide by 2.
‡ Since every element of $V_S$ is perpendicular to every element of $V_A$ (you should check this) the
dimensionality of $V_{1\otimes 2}$ equals the sum of the dimensionalities of $V_S$ and $V_A$.
What if we want the absolute probability density for some continuous variable
such as X? In this case we must take the projection of $|\psi_S\rangle$ on the normalized
position eigenket:
$$|x_1x_2, S\rangle = 2^{-1/2}\,\big[\,|x_1x_2\rangle + |x_2x_1\rangle\,\big] \tag{10.3.13}$$
to obtain
$$P_S(x_1, x_2) = |\langle x_1x_2, S|\psi_S\rangle|^2 \tag{10.3.14}$$
The normalization condition for $P_S(x_1, x_2)$ and $|\psi_S\rangle$ is
$$1 = \iint P_S(x_1, x_2)\,\frac{dx_1\,dx_2}{2} = \iint |\langle x_1x_2, S|\psi_S\rangle|^2\,\frac{dx_1\,dx_2}{2} \tag{10.3.15}$$
where the factor 1/2 makes up for the double counting done by the $dx_1\,dx_2$ integration.‡
In this case it is convenient to define the wave function as
$$\psi_S(x_1, x_2) = 2^{-1/2}\,\langle x_1x_2, S|\psi_S\rangle \tag{10.3.16}$$
so that the normalization of $\psi_S$ is
$$1 = \iint |\psi_S(x_1, x_2)|^2\,dx_1\,dx_2 \tag{10.3.17}$$
However, in this case
$$P_S(x_1, x_2) = 2\,|\psi_S(x_1, x_2)|^2 \tag{10.3.18}$$
due to the rescaling. Now, note that
$$\psi_S(x_1, x_2) = 2^{-1/2}\,\langle x_1x_2, S|\psi_S\rangle = \tfrac{1}{2}\big[\langle x_1x_2|\psi_S\rangle + \langle x_2x_1|\psi_S\rangle\big] = \langle x_1x_2|\psi_S\rangle \tag{10.3.19}$$
where we have exploited the fact that $|\psi_S\rangle$ is symmetrized between the particles and
has the same inner product with $\langle x_1x_2|$ and $\langle x_2x_1|$. Consequently, the normalization
condition Eq. (10.3.17) becomes
$$1 = \langle\psi_S|\psi_S\rangle = \iint |\psi_S|^2\,dx_1\,dx_2 = \iint \langle\psi_S|x_1x_2\rangle\langle x_1x_2|\psi_S\rangle\,dx_1\,dx_2$$
which makes sense, as $|\psi_S\rangle$ is an element of $V_{1\otimes 2}$ as well. Note, however, that the
kets $|x_1x_2\rangle$ enter the definition of the wave function Eq. (10.3.19), and the normalization
integral above, only as bookkeeping devices. They are not elements of $V_S$ and
the inner product $\langle x_1x_2|\psi_S\rangle$ would be of no interest to us, were it not for the fact
that the quantity that is of physical interest, $\langle x_1x_2, S|\psi_S\rangle$, is related to it by just a
scale factor of $2^{1/2}$.
‡ The points $x_1 = x_2 = x$ pose some subtle questions both with respect to the factor 1/2 and the
normalization of the kets $|xx, S\rangle$. We do not get into these since the points on the line $x_1 = x_2 = x$ make
only an infinitesimal contribution to the integration in the $x_1$-$x_2$ plane (of any smooth function). In the
following discussion you may assume that quantities such as $P_S(x, x)$, $\psi_S(x, x)$ are all given by the limits
$x_1 \to x$, $x_2 \to x$ of $P_S(x_1, x_2)$, $\psi_S(x_1, x_2)$, etc.
Let us now consider a concrete example. We measure the energy
of two noninteracting bosons in a box extending from x = 0 to x = L and find them
to be in the quantum states n = 3 and n = 4. The normalized state vector just after
measurement is then
$$|\psi_S\rangle = \frac{|3, 4\rangle + |4, 3\rangle}{2^{1/2}} \tag{10.3.20}$$
in obvious notation. The wave function is
$$\begin{aligned}
\psi_S(x_1, x_2) &= 2^{-1/2}\,\langle x_1x_2, S|\psi_S\rangle \\
&= \frac{1}{2}\big(\langle x_1x_2| + \langle x_2x_1|\big)\left(\frac{|3, 4\rangle + |4, 3\rangle}{2^{1/2}}\right) \\
&= \frac{1}{2(2^{1/2})}\big[\langle x_1x_2|3, 4\rangle + \langle x_1x_2|4, 3\rangle + \langle x_2x_1|3, 4\rangle + \langle x_2x_1|4, 3\rangle\big] \\
&= \frac{1}{2(2^{1/2})}\big[\psi_3(x_1)\psi_4(x_2) + \psi_4(x_1)\psi_3(x_2) + \psi_3(x_2)\psi_4(x_1) + \psi_4(x_2)\psi_3(x_1)\big] \\
&= 2^{-1/2}\big[\psi_3(x_1)\psi_4(x_2) + \psi_4(x_1)\psi_3(x_2)\big] \\
&= \langle x_1x_2|\psi_S\rangle
\end{aligned} \tag{10.3.21a}$$
where in all of the above,
$$\psi_n(x) = \left(\frac{2}{L}\right)^{1/2} \sin\left(\frac{n\pi x}{L}\right) \tag{10.3.21b}$$
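As a numerical sanity check on Eqs. (10.3.17) and (10.3.21), the following sketch integrates $|\psi_S|^2$ over the whole $x_1$-$x_2$ square for the n = 3, n = 4 example. The box length `L = 1.0`, the midpoint rule, and the grid size are choices made here for illustration only.

```python
import math

L = 1.0  # box length in arbitrary units (an assumption for this illustration)

def psi(n, x):
    """Single-particle box eigenfunction, Eq. (10.3.21b)."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)

def psi_s(x1, x2, n=3, m=4):
    """Symmetric two-boson wave function, last lines of Eq. (10.3.21a)."""
    return (psi(n, x1) * psi(m, x2) + psi(m, x1) * psi(n, x2)) / math.sqrt(2.0)

def norm_integral(f, steps=200):
    """Midpoint-rule estimate of the double integral of f(x1, x2)^2 over the box."""
    h = L / steps
    total = 0.0
    for i in range(steps):
        x1 = (i + 0.5) * h
        for j in range(steps):
            x2 = (j + 0.5) * h
            total += f(x1, x2) ** 2
    return total * h * h

# Integrating over the full square, with no factor 1/2, should give unity.
print(norm_integral(psi_s))
```

Because the wave function already carries the extra factor $2^{-1/2}$, no division by 2 is needed here; the factor 1/2 reappears only when one integrates $P_S = 2|\psi_S|^2$, as in Eq. (10.3.15).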
These considerations apply with obvious modifications to the fermionic space
$V_A$. The basis vectors are of the form
$$|\omega_1\omega_2, A\rangle = 2^{-1/2}\,\big[\,|\omega_1\omega_2\rangle - |\omega_2\omega_1\rangle\,\big] \tag{10.3.22}$$
(The case $\omega_1 = \omega_2$ does not arise here.) The wave function is once again
$$\psi_A(x_1, x_2) = 2^{-1/2}\,\langle x_1x_2, A|\psi_A\rangle = \langle x_1x_2|\psi_A\rangle \tag{10.3.23}$$
and as in the bosonic case
$$P_A(x_1, x_2) = 2\,|\psi_A(x_1, x_2)|^2 \tag{10.3.24}$$
The normalization condition is
$$1 = \iint P_A(x_1, x_2)\,\frac{dx_1\,dx_2}{2} = \iint |\psi_A(x_1, x_2)|^2\,dx_1\,dx_2 \tag{10.3.25}$$
Returning to our example of two particles in a box, if we had obtained the
values n = 3 and n = 4, then the state just after measurement would have been
$$|\psi_A\rangle = \frac{|3, 4\rangle - |4, 3\rangle}{2^{1/2}} \tag{10.3.26}$$
(We may equally well choose
$$|\psi_A\rangle = \frac{|4, 3\rangle - |3, 4\rangle}{2^{1/2}}$$
which makes no physical difference). The corresponding wave function may be
written in the form of a determinant:
$$\psi_A(x_1, x_2) = \langle x_1x_2|\psi_A\rangle = 2^{-1/2}\big[\psi_3(x_1)\psi_4(x_2) - \psi_4(x_1)\psi_3(x_2)\big]
= 2^{-1/2}\begin{vmatrix} \psi_3(x_1) & \psi_4(x_1) \\ \psi_3(x_2) & \psi_4(x_2) \end{vmatrix} \tag{10.3.27}$$
Had we been considering the state $|\omega_1\omega_2, A\rangle$ [Eq. (10.3.22)],†
$$\psi_A(x_1, x_2) = 2^{-1/2}\begin{vmatrix} \psi_{\omega_1}(x_1) & \psi_{\omega_2}(x_1) \\ \psi_{\omega_1}(x_2) & \psi_{\omega_2}(x_2) \end{vmatrix} \tag{10.3.28}$$
† The determinantal form of $\psi_A$ makes it clear that $\psi_A$ vanishes if $x_1 = x_2$ or $\omega_1 = \omega_2$.
Determination of Particle Statistics
We are finally ready to answer the old question: how does one determine empirically
the statistics of a given species, i.e., whether it is a boson or fermion, without
turning to the spin-statistics theorem? For concreteness, let us say we have two
identical noninteracting pions and wish to find out if they are bosons or fermions.
We proceed as follows. We put them in a one-dimensional box‡ and make an energy
measurement. Say we find one in the state n = 3 and the other in the state n = 4. The
probability distribution in x space would be, depending on their statistics,
$$\begin{aligned}
P_{S/A}(x_1, x_2) &= 2\,|\psi_{S/A}(x_1, x_2)|^2 \\
&= 2\,\big|2^{-1/2}\big[\psi_3(x_1)\psi_4(x_2) \pm \psi_4(x_1)\psi_3(x_2)\big]\big|^2 \\
&= |\psi_3(x_1)|^2|\psi_4(x_2)|^2 + |\psi_4(x_1)|^2|\psi_3(x_2)|^2 \\
&\quad \pm \big[\psi_3^*(x_1)\psi_4(x_1)\psi_4^*(x_2)\psi_3(x_2) + \psi_4^*(x_1)\psi_3(x_1)\psi_3^*(x_2)\psi_4(x_2)\big]
\end{aligned} \tag{10.3.29}$$
Compare this situation with two particles carrying labels 1 and 2, but otherwise
identical,§ with particle 1 in state 3 and described by a probability distribution
$|\psi_3(x)|^2$, and particle 2 in state 4 and described by the probability distribution
$|\psi_4(x)|^2$. In this case, the first term of Eq. (10.3.29) represents the probability that particle 1 is at $x_1$
and particle 2 is at $x_2$, while the second gives the probability for the exchanged
event. The sum of these two terms then gives $P_D(x_1, x_2)$, the probability for finding
one at $x_1$ and the other at $x_2$, with no regard paid to their labels. (The subscript D
denotes distinguishable.) The next two terms, called interference terms, remind us
that there is more to identical particles in quantum theory than just their identical
characteristics: they have no separate identities. Had they separate identities (as in
the classical case) and we were just indifferent to which one arrives at $x_1$ and which
one at $x_2$, we would get just the first two terms. There is a parallel between this
situation and the double-slit experiment, where the probability for finding a particle
at a given point x on the screen with both slits open was not the sum of the probabilities
with either slit open. In both cases, the interference terms arise because, in quantum
theory, when an event can take place in two (or more) indistinguishable ways,
we add the corresponding amplitudes and not the corresponding probabilities.
Just as we were not allowed then to assign a definite trajectory to the particle
(through slits 1 or 2), we are not allowed now to assign definite labels to the two
particles.
The interference terms tell us if the pions are bosons or fermions. The difference
between the two cases is most dramatic as $x_1 \to x_2 \to x$:
$$P_A(x_1 \to x,\, x_2 \to x) \to 0 \qquad \text{(Pauli principle applied to state } |x\rangle\text{)} \tag{10.3.30}$$
whereas
$$P_S(x_1 \to x,\, x_2 \to x) = 2\,\big[\,|\psi_3(x)|^2|\psi_4(x)|^2 + |\psi_4(x)|^2|\psi_3(x)|^2\,\big] \tag{10.3.31}$$
which is twice as big as $P_D(x_1 \to x,\, x_2 \to x)$, the probability density for two distinct
label-carrying (but otherwise identical) particles, whose labels are disregarded in the
position measurement.
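The three distributions can be compared numerically at a coincident point. The sketch below evaluates the direct and interference terms of Eq. (10.3.29) for the box states n = 3 and n = 4; the box length `L = 1.0` and the sample point `x = 0.3` are arbitrary choices made for illustration.

```python
import math

def psi(n, x, L=1.0):
    """Single-particle box eigenfunction, Eq. (10.3.21b)."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)

def densities(x1, x2, n=3, m=4):
    """Return (P_S, P_A, P_D) at (x1, x2), following Eq. (10.3.29)."""
    # First two terms: what labeled particles would give with labels ignored.
    direct = (psi(n, x1) * psi(m, x2)) ** 2 + (psi(m, x1) * psi(n, x2)) ** 2
    # The box eigenfunctions are real, so the two interference terms are equal.
    interference = 2.0 * psi(n, x1) * psi(m, x1) * psi(m, x2) * psi(n, x2)
    return direct + interference, direct - interference, direct

p_s, p_a, p_d = densities(0.3, 0.3)
print(p_s, p_a, p_d)
```

At $x_1 = x_2$ the interference term equals the direct term, so the bosonic density doubles and the fermionic one cancels, which is the content of Eqs. (10.3.30) and (10.3.31).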
One refers to the tendency of fermions to avoid each other (i.e., avoid the
state $x_1 = x_2 = x$) as obeying "Fermi-Dirac statistics" and the tendency of bosons to
conglomerate as "obeying Bose-Einstein statistics," after the physicists who first
explored the consequences of the antisymmetrization and symmetrization requirements
on the statistical mechanics of an ensemble of fermions and bosons, respectively.
(This is the reason for referring to the bosonic/fermionic nature of a particle
as its statistics.)
‡ We do this to simplify the argument. The basic idea works just as well in three dimensions.
§ The label can, for example, be the electric charge.
Given the striking difference in the two distributions, we can readily imagine
deciding (once and for all) whether pions are bosons or fermions by preparing an
ensemble of systems (with particles in n = 3 and 4) and measuring $P(x_1, x_2)$.
Note that $P(x_1, x_2)$ helps us decide not only whether the particles are bosons
or fermions, but also whether they are identical in the first place. In other words, if
particles that we think are identical differ with respect to some label that we are not
aware of, the nature of the interference term will betray this fact. Imagine, for
example, two bosons, call them $K$ and $\bar{K}$, which are identical with respect to mass
and charge, but different with respect to a quantum number called "hypercharge."
Let us assume we are ignorant of hypercharge. In preparing an ensemble that we
think contains N identical pairs, we will actually be including some $(K, K)$ pairs,
some $(\bar{K}, \bar{K})$ pairs, and some mixed $(K, \bar{K})$ pairs. If we now make measurements on the ensemble and extract
the distribution $P(x_1, x_2)$ (once again ignoring the hypercharge), we will find the
interference term has the + sign but is not as big as it should be. If the ensemble
contained only identical bosons, $P(x, x)$ should be twice as big as $P_D(x, x)$, which
describes label-carrying particles; if we get a ratio less than 2, we know the ensemble
is contaminated by label-carrying particles which produce no interference terms.
From the above discussions, it is also clear that one cannot hastily conclude,
upon catching two electrons in the same orbital state in three dimensions, that they
are not fermions. In this case, the label we are ignoring is the spin orientation s. As
mentioned earlier on, s can have only two values, call them + and −. If we assume
that s never changes (during the course of the experiment) it can serve as a particle
label that survives with time. If s = + for one electron and − for the other, they are
like two distinct particles and can be in the same orbital state. The safe thing to do
here is once again to work with an ensemble rather than an isolated measurement.
Since we are ignorant of spin, our ensemble will contain (+, +) pairs, (−, −) pairs,
and (+, −) pairs. The (+, +) and (−, −) pairs are identical fermions and will produce
a negative interference term, while the (+, −) pairs will not. Thus we will find $P(\mathbf{r}, \mathbf{r})$
is smaller than $P_D(\mathbf{r}, \mathbf{r})$ describing labeled particles, but not zero. This will tell us
that our ensemble has identical fermion pairs contaminated by pairs of distinguishable
particles. It will then be up to us to find the nature of the hidden degree
of freedom which provides the distinction.
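The two contamination arguments above can be summarized in one small formula. The sketch below assumes, for illustration, that the measured ensemble distribution is simply a weighted average of the identical-pair distribution and the distinguishable-pair distribution $P_D$; the function name and the parameter `identical_fraction` are hypothetical.

```python
def coincidence_ratio(identical_fraction, statistics):
    """Return P(x, x) / P_D(x, x) for an ensemble in which `identical_fraction`
    of the pairs are truly identical and the rest carry a hidden label."""
    if not 0.0 <= identical_fraction <= 1.0:
        raise ValueError("identical_fraction must lie between 0 and 1")
    if statistics == "boson":
        identical_value = 2.0   # P_S(x, x) = 2 P_D(x, x), Eq. (10.3.31)
    elif statistics == "fermion":
        identical_value = 0.0   # P_A(x, x) = 0, Eq. (10.3.30)
    else:
        raise ValueError("statistics must be 'boson' or 'fermion'")
    # Distinguishable pairs produce no interference term, so they contribute 1.
    return identical_fraction * identical_value + (1.0 - identical_fraction) * 1.0

print(coincidence_ratio(0.5, "boson"))
print(coincidence_ratio(0.5, "fermion"))
```

Under this assumption the ratio is 1 + f for bosons and 1 − f for fermions, where f is the identical fraction, so a measured ratio strictly between the pure values signals a hidden label.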
Systems of N Identical Particles
The case N = 2 lacks one feature that is found at larger N. We illustrate it by
considering the case of three identical particles in a box. Let us say that an energy
measurement shows the quantum numbers of the particles to be $n_1$, $n_2$, and $n_3$. Since
the particles are identical, all we can conclude from this observation is that the total
energy is
$$E = E_{n_1} + E_{n_2} + E_{n_3}$$
Now there are 3! = six product states with this energy: $|n_1n_2n_3\rangle$, $|n_1n_3n_2\rangle$, $|n_2n_3n_1\rangle$,
$|n_2n_1n_3\rangle$, $|n_3n_2n_1\rangle$, and $|n_3n_1n_2\rangle$. The physical states are elements of the six-dimensional
eigenspace spanned by these vectors and distinguished by the property that
under the exchange of any two particle labels, the state vector changes only by a
factor $\alpha$. Since double exchange of the same two labels is equivalent to no exchange,
we conclude as before that $\alpha = \pm 1$. There are only two states with this property:
$$|n_1n_2n_3, S\rangle = \frac{1}{(3!)^{1/2}}\big[\,|n_1n_2n_3\rangle + |n_1n_3n_2\rangle + |n_2n_3n_1\rangle
+ |n_2n_1n_3\rangle + |n_3n_2n_1\rangle + |n_3n_1n_2\rangle\,\big] \tag{10.3.32}$$
called the totally symmetric state,‡ for which $\alpha = +1$ for all three possible exchanges
($1 \leftrightarrow 2$, $2 \leftrightarrow 3$, $1 \leftrightarrow 3$); and
$$|n_1n_2n_3, A\rangle = \frac{1}{(3!)^{1/2}}\big[\,|n_1n_2n_3\rangle - |n_1n_3n_2\rangle + |n_2n_3n_1\rangle
- |n_2n_1n_3\rangle + |n_3n_1n_2\rangle - |n_3n_2n_1\rangle\,\big] \tag{10.3.33}$$
called the totally antisymmetric state, for which $\alpha = -1$ for all three possible
exchanges.
Bosons will always pick the S states and fermions, the A states. It follows that
no two fermions can be in the same state.
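As in the N = 2 case, the wave function in the X basis is
$$\psi_{S/A}(x_1, x_2, x_3) = (3!)^{-1/2}\,\langle x_1x_2x_3, S/A|\psi_{S/A}\rangle = \langle x_1x_2x_3|\psi_{S/A}\rangle \tag{10.3.34}$$
and
$$\iiint |\psi_{S/A}|^2\,dx_1\,dx_2\,dx_3 = 1$$
For instance, the wave function associated with $|n_1n_2n_3, S/A\rangle$, Eqs. (10.3.32) and
(10.3.33), is
$$\begin{aligned}
\psi_{n_1n_2n_3}(x_1, x_2, x_3, S/A) = (3!)^{-1/2}\big[\,&\psi_{n_1}(x_1)\psi_{n_2}(x_2)\psi_{n_3}(x_3) \pm \psi_{n_1}(x_1)\psi_{n_3}(x_2)\psi_{n_2}(x_3) \\
+\,&\psi_{n_2}(x_1)\psi_{n_3}(x_2)\psi_{n_1}(x_3) \pm \psi_{n_2}(x_1)\psi_{n_1}(x_2)\psi_{n_3}(x_3) \\
+\,&\psi_{n_3}(x_1)\psi_{n_1}(x_2)\psi_{n_2}(x_3) \pm \psi_{n_3}(x_1)\psi_{n_2}(x_2)\psi_{n_1}(x_3)\,\big]
\end{aligned} \tag{10.3.35}$$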
‡ The normalization factor $(3!)^{-1/2}$ is correct only if all three n's are different. If, for example, $n_1 = n_2 =
n_3 = n$, then the product state $|nnn\rangle$ is normalized and symmetric and can be used as the S state. A
similar question does not arise for the fermion state due to the Pauli principle.
The fermion wave function may once again be written as a determinant:
$$\psi_{n_1n_2n_3}(x_1, x_2, x_3, A) = \frac{1}{(3!)^{1/2}}
\begin{vmatrix}
\psi_{n_1}(x_1) & \psi_{n_2}(x_1) & \psi_{n_3}(x_1) \\
\psi_{n_1}(x_2) & \psi_{n_2}(x_2) & \psi_{n_3}(x_2) \\
\psi_{n_1}(x_3) & \psi_{n_2}(x_3) & \psi_{n_3}(x_3)
\end{vmatrix} \tag{10.3.36}$$
Using the properties of the determinant, one easily sees that $\psi$ vanishes if two of
the x's or n's coincide. All these results generalize directly to any higher N.
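The determinant in Eq. (10.3.36) can be evaluated directly as a signed sum over permutations, which makes the antisymmetry easy to test. The sketch below uses box eigenfunctions as the single-particle states; the box length, the helper names, and the sample coordinates are assumptions made for illustration.

```python
import math
from itertools import permutations

def box_state(n, x, L=1.0):
    """Single-particle box eigenfunction, Eq. (10.3.21b)."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)

def permutation_sign(perm):
    """Sign of a permutation (tuple of indices), found by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def fermion_wave_function(ns, xs):
    """Antisymmetric N-particle wave function: the determinant of the matrix
    whose (i, j) entry is psi_{n_j}(x_i), divided by sqrt(N!)."""
    n_particles = len(ns)
    total = 0.0
    for perm in permutations(range(n_particles)):
        term = float(permutation_sign(perm))
        for particle, state in enumerate(perm):
            term *= box_state(ns[state], xs[particle])
        total += term
    return total / math.sqrt(math.factorial(n_particles))

print(fermion_wave_function((1, 2, 3), (0.2, 0.5, 0.7)))
print(fermion_wave_function((1, 2, 3), (0.5, 0.2, 0.7)))  # x1 and x2 exchanged
```

Exchanging two coordinates flips the sign, and setting two coordinates or two quantum numbers equal makes the sum cancel term by term. The permutation sum costs N! operations, so it is only practical for small N.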
Two questions may bother you at this point.
Question I. Consider the case N = 3. There are three possible exchanges here:
($1 \leftrightarrow 2$), ($1 \leftrightarrow 3$), and ($2 \leftrightarrow 3$). The S states pick up a factor $\alpha = +1$ for all three
exchanges, while the A states pick up $\alpha = -1$ for all three exchanges. What about
states for which some of the $\alpha$'s are +1 and the others −1? Such states do not exist.
You may verify this by exhaustion: take the 3! product vectors and try to form such
a linear combination. Since a general proof for this case and all N involves group
theory, we will not discuss it here. Note that since we get only two acceptable
vectors for every N! product vectors, the direct product space for $N \geq 3$ is bigger (in
dimensionality) than $V_S \oplus V_A$.
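For N = 3 the absence of mixed cases can also be seen with a short operator argument, offered here as a sketch rather than as the group-theoretic proof mentioned above. It assumes exchange operators $P_{ij}$ that swap the labels of particles i and j (the three-particle analog of the $P_{12}$ of Exercise 10.3.5; the notation is introduced here for illustration) and a state that is a simultaneous eigenstate of all three.

```latex
% Assumption: P_{ij} swaps the labels of particles i and j, so P_{ij}^2 = I.
% Suppose P_{12}|\psi> = \alpha|\psi>,  P_{23}|\psi> = \beta|\psi>,  P_{13}|\psi> = \gamma|\psi>.
\begin{align*}
P_{13} &= P_{12}\,P_{23}\,P_{12}
  && \text{(swap 1,2; then 2,3; then 1,2 again)} \\
\gamma\,|\psi\rangle &= P_{12}P_{23}P_{12}\,|\psi\rangle
  = \alpha\,\beta\,\alpha\,|\psi\rangle = \beta\,|\psi\rangle
  && (\alpha^2 = 1) \\
P_{12} &= P_{23}\,P_{13}\,P_{23}
  \;\Rightarrow\; \alpha = \beta^2\,\gamma = \gamma \\
&\therefore\ \alpha = \beta = \gamma = \pm 1 .
\end{align*}
```

So a state that merely acquires a factor under each of the three exchanges must acquire the same factor under all of them.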
Question II. We have tacitly assumed that if two identical particles of a given
species always pick the S (or A) state, so will three or more, i.e., we have extended
our definition of bosons and fermions from N = 2 to all N. What if two pions always
pick the S state while three always pick the A state? While intuition revolts at such
a possibility, it still needs to be formally ruled out. We do so at the end of the next
subsection.
When Can We Ignore Symmetrization and Antisymmetrization?
A basic assumption physicists make before they can make any headway is that
they can single out some part of the universe (the system) and study it in isolation
from the rest. While no system is truly isolated, one can often get close to this ideal.
For instance, when we study the oscillations of a mass coupled to a spring, we ignore
the gravitational pull of Pluto.
Classically, the isolation of the system is expressed by the separability of the
Hamiltonian of the universe:
$$\mathcal{H}_{\text{universe}} = \mathcal{H}_{\text{sys}} + \mathcal{H}_{\text{rest}} \tag{10.3.37}$$
where $\mathcal{H}_{\text{sys}}$ is a function of the system coordinates and momenta alone. It follows
that the time evolution of the system's p's and q's is independent of what is going
on in the rest of the universe. In our example, this separability is ruined (to give just
one example) by the gravitational interaction between the mass and Pluto, which
depends on their relative separation. If we neglect this absurdly small effect (and
other such effects) we obtain separability to an excellent approximation.
Quantum mechanically, separability of H leads to the factorization of the wave
function of the universe:
$$\psi_{\text{universe}} = \psi_{\text{sys}} \cdot \psi_{\text{rest}} \tag{10.3.38}$$
where $\psi_{\text{sys}}$ is a function only of system coordinates, collectively referred to as $x_s$.
Thus if we want the probability that the system has a certain coordinate $x_s$, and do
not care about the rest, we find (symbolically)
$$\begin{aligned}
P(x_s) &= \int |\psi_{\text{universe}}(x_s, x_{\text{rest}})|^2\,dx_{\text{rest}} \\
&= |\psi_{\text{sys}}(x_s)|^2 \int |\psi_{\text{rest}}(x_{\text{rest}})|^2\,dx_{\text{rest}} \\
&= |\psi_{\text{sys}}(x_s)|^2
\end{aligned} \tag{10.3.39}$$
We could have obtained this result by simply ignoring $\psi_{\text{rest}}$ from the outset.
Things get complicated when the system and the "rest" contain identical particles.
Even if there is no interaction between the system and the rest, i.e., the Hamiltonian
is separable, product states are not allowed and only S or A states must be
used. Once the state vector fails to factorize, we will no longer have
$$P(x_s, x_{\text{rest}}) = P(x_s)\,P(x_{\text{rest}}) \tag{10.3.40}$$
(i.e., the systems will not be statistically independent), and we cannot integrate out
$x_{\text{rest}}$ and regain $P(x_s)$.
Now it seems reasonable that at least in certain cases it should be possible to
get away with the product state and ignore the symmetrization or antisymmetrization
conditions.
Suppose, for example, that at t = 0, we find one pion in the ground state of an
oscillator potential centered around a point on earth and another pion in the same
state, but on the moon. It seems reasonable that we can give the particles the labels
"earth pion" and "moon pion," which will survive with time. Although we cannot
follow their trajectories, we can follow their wave functions: we know the first wave
function is a Gaussian $G_E(x_E)$ centered at a point in the lab on earth and that the
second is a Gaussian $G_M(x_M)$ centered at a point on the moon. If we catch a pion
somewhere on earth at time t, the theory tells us that it is almost certainly the "earth
pion" and that the chances of its being the "moon pion" are absurdly small. Thus
the uncertainty in the position of each pion is compensated by a separation that is
much larger. (Even in classical mechanics, it is not necessary to know the trajectories
exactly to follow the particles; the band of uncertainty about each trajectory has
merely to be much thinner than the minimum separation between the particles during
their encounter.) We therefore believe that if we assumed
$$\psi(x_E, x_M) = G_E(x_E)\,G_M(x_M) \tag{10.3.41}$$
we should be making an error that is as negligible as is the chance of finding the
earth pion on the moon and vice versa. Given this product form, the person on earth
can compute the probability for finding the earth pion at some $x_E$ by integrating out
the moon pion:
$$P(x_E) = |G_E(x_E)|^2 \int |G_M(x_M)|^2\,dx_M = |G_E(x_E)|^2 \tag{10.3.42}$$
Likewise the person on the moon, who does not care about (i.e., sums over) the
earth pion will obtain
$$P(x_M) = |G_M(x_M)|^2 \tag{10.3.43}$$
Let us now verify that if we took a properly symmetrized wave function it leads
to essentially the same predictions (with negligible differences).
Let us start with
$$\psi_S(x_1, x_2) = 2^{-1/2}\big[G_E(x_1)G_M(x_2) + G_M(x_1)G_E(x_2)\big] \tag{10.3.44}$$
We use the labels $x_1$ and $x_2$ rather than $x_E$ and $x_M$ to emphasize that the pions are
indeed being treated as indistinguishable. Now, the probability (density) of finding
one particle near $x_1$ and one near $x_2$ is
$$\begin{aligned}
P(x_1, x_2) = 2|\psi_S|^2 = {}& |G_E(x_1)|^2|G_M(x_2)|^2 + |G_M(x_1)|^2|G_E(x_2)|^2 \\
&+ G_E^*(x_1)G_M(x_1)\,G_M^*(x_2)G_E(x_2) \\
&+ G_M^*(x_1)G_E(x_1)\,G_E^*(x_2)G_M(x_2)
\end{aligned} \tag{10.3.45}$$
Let us ask for the probability of finding one particle near some point $x_E$ on the
earth, with no regard to the other. This is given by setting either one of the variables
(say $x_1$) equal to $x_E$ and integrating out the other [since $P(x_1, x_2) = P(x_2, x_1)$]. There
is no need to divide by 2 in doing this integration (why?). We get
$$\begin{aligned}
P(x_E) = {}& |G_E(x_E)|^2 \int |G_M(x_2)|^2\,dx_2 + |G_M(x_E)|^2 \int |G_E(x_2)|^2\,dx_2 \\
&+ G_E^*(x_E)G_M(x_E) \int G_M^*(x_2)G_E(x_2)\,dx_2 \\
&+ G_M^*(x_E)G_E(x_E) \int G_E^*(x_2)G_M(x_2)\,dx_2
\end{aligned} \tag{10.3.46}$$
The first term is what we would get if we begin with a product wave function Eq.
(10.3.41) and integrate out $x_M$. The other three terms are negligible since $G_M$ is
peaked on the moon and is utterly negligible at a point $x_E$ on the earth. Similarly if we
asked for $P(x_M)$, where $x_M$ is a point on the moon, we will again get $|G_M(x_M)|^2$.
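To get a feel for how small the discarded terms in Eq. (10.3.46) are, one can evaluate the overlap integral of two displaced Gaussians. The sketch below assumes real, normalized Gaussians of common width `sigma` (in units where `sigma = 1` by default); the function names and the sample separations are illustrative choices, and the closed form follows from completing the square.

```python
import math

def gaussian(x, center, sigma=1.0):
    """Normalized real Gaussian wave function centered at `center`."""
    norm = (math.pi * sigma**2) ** -0.25
    return norm * math.exp(-((x - center) ** 2) / (2.0 * sigma**2))

def overlap_closed_form(separation, sigma=1.0):
    """Integral of G_M(x) G_E(x) dx for centers a distance `separation` apart."""
    return math.exp(-(separation**2) / (4.0 * sigma**2))

def overlap_numeric(separation, sigma=1.0, steps=4000):
    """Midpoint-rule check of the same integral."""
    lo, hi = -10.0 * sigma, separation + 10.0 * sigma
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += gaussian(x, 0.0, sigma) * gaussian(x, separation, sigma)
    return total * h

print(overlap_numeric(3.0), overlap_closed_form(3.0))
print(overlap_closed_form(20.0))  # already astronomically small at 20 widths
```

The overlap falls off as a Gaussian in the ratio of separation to width, so for an earth-moon separation measured in units of a laboratory-scale width the interference terms are far below anything measurable. The second term of Eq. (10.3.46) is suppressed in the same way because it contains $|G_M(x_E)|^2$.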
The labels "earth pion" and "moon pion" were useful only because the two
Gaussians remained well separated for all times (being stationary states). If the two
Gaussians had not been bound by the oscillator wells, and were wave packets
drifting toward each other, the labeling (and the factorized wave function) would
have become invalid when the Gaussians began to have a significant overlap. The
point is that at the start of any experiment, one can always assign the particles some
labels. These labels acquire a physical significance only if they survive for some time.
Labels like "a particle of mass m and charge +1" survive forever, while the longevity
of a label like "earth pion" is controlled by whether or not some other pion is in
the vicinity.
A dramatic illustration of this point is provided by the following example. At
t = 0 we catch two pions, one at x = a and the other at x = b. We can give them the
labels a and b since the two delta functions do not overlap even if a and b are in the
same room. We may describe the initial state by a product wave function. But this
labeling is quite useless, since after the passage of an infinitesimal period of time,
the delta functions spread out completely: the probability distributions become constants.
You may verify this by examining $|U(x, t; a, 0)|^2$ (the "fate" of the delta
function)‡ or by noting that $\Delta P = \infty$ for a delta function (the particle has all possible
velocities from 0 to $\infty$), which therefore spreads out in no time.
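The suggested check takes one line once the propagator is written out. The block below quotes the standard free-particle propagator in one dimension for a particle of mass m (the particles are assumed free, as the footnote to this passage states).

```latex
% Free-particle propagator in one dimension (mass m), and its modulus squared:
\begin{align*}
U(x,t;a,0) &= \left(\frac{m}{2\pi i\hbar t}\right)^{1/2}
              \exp\!\left[\frac{i m (x-a)^2}{2\hbar t}\right], \\
|U(x,t;a,0)|^2 &= \frac{m}{2\pi\hbar t}
  \qquad \text{(independent of $x$ and of $a$, for any $t > 0$).}
\end{align*}
```

Since the result does not depend on x or on the starting point a, the distributions that started at a and at b are indistinguishable at any later time, however small.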
All these considerations apply with no modification to two fermions: the two
cases differ in the sign of the interference term, which is irrelevant to these
considerations.
What if there are three pions, two on earth and one on the moon? Since the
two on the earth (assuming that their wave functions appreciably overlap) can be
confused with each other, we must symmetrize between them, and the total wave
function will be, in obvious notation,
$$\psi(x_{E_1}, x_{E_2}, x_M) = \psi_S(x_{E_1}, x_{E_2}) \cdot \psi(x_M) \tag{10.3.47}$$
The extension of this result to more particles and to fermions is obvious.
At this point the answer to Question II raised at the end of the last subsection
becomes apparent. Suppose three-pion systems picked the A state while two-pion
systems picked the S state. Let two of the three pions be on earth and the third one on
the moon. Then, by assumption, the following function should provide an excellent
approximation:
$$\psi(x_{E_1}, x_{E_2}, x_M) = \psi_A(x_{E_1}, x_{E_2})\,\psi(x_M) \tag{10.3.48}$$
If we integrate over the moon pion we get
$$P(x_{E_1}, x_{E_2}) = 2\,|\psi_A(x_{E_1}, x_{E_2})|^2 \tag{10.3.49}$$
We are thus led to conclude that two pions on earth will have a probability distribution
corresponding to two fermions if there is a third pion on the moon and a
distribution expected of two bosons if there is not a third one on the moon. Such
absurd conclusions are averted only if the statistics depends on the species and not
the number of particles.
‡ It is being assumed that the particles are free.
A word of caution before we conclude this long discussion. If two particles have
nonoverlapping wave functions in x space, then it is only in x space that a product
wave function provides a good approximation to the exact symmetrized wave function,
which in our example was
$$\psi_S(x_1, x_2) = 2^{-1/2}\big[G_E(x_1)G_M(x_2) + G_M(x_1)G_E(x_2)\big] \tag{10.3.50}$$
The formal reason is that for any choice of the arguments $x_1$ and $x_2$, only one or
the other of the two terms in the right-hand side is important. (For example, if $x_1$ is
on the earth and $x_2$ is on the moon, only the first piece is important.) Physically
it is because the chance of finding one pion in the territory of the other is negligible
and interference effects can be ignored.
If, however, we wish to switch to another basis, say the P basis, we must consider
the Fourier transform of the symmetric function $\psi_S$ and not the product, so that we
end up with a symmetrized wave function in p space. The physical reason for this is
that the two pions have the same momentum distributions (with $\langle P\rangle = 0$ and identical
Gaussian fluctuations about this mean), since the momentum content of the
oscillator is independent of its location. Consequently, there are no grounds in P
space for distinguishing between them. Thus when a momentum measurement (which
says nothing about the positions) yields two numbers, we cannot assign them to the
pions in a unique way. Formally, symmetrization is important because the p-space
wave functions of the pions overlap strongly and there exist values for the two
momenta (both $\simeq 0$) for which both terms in the symmetric wave function are
significant.
By the same token, if there are two particles with nonoverlapping wave functions
in p space, we may describe the system by a product wave function in this space
(using labels like "fast" and "slow" instead of "earth" and "moon" to distinguish
between them), but not in another space where the distinction between them is
absent. It should be clear that these arguments apply not just to X or P but to any
arbitrary variable $\Omega$.
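The statement that the momentum content is independent of location can be made explicit with the shift property of the Fourier transform. In the sketch below, $G_0$ denotes the oscillator ground state centered at the origin and $G_a(x) = G_0(x - a)$ the same state displaced to x = a; both symbols are introduced here for illustration.

```latex
% Assumption: G_a(x) = G_0(x - a) is the ground-state Gaussian displaced to x = a.
\begin{align*}
\tilde G_a(p) &= \frac{1}{(2\pi\hbar)^{1/2}} \int e^{-ipx/\hbar}\, G_0(x-a)\, dx
              = e^{-ipa/\hbar}\, \tilde G_0(p), \\
|\tilde G_a(p)|^2 &= |\tilde G_0(p)|^2 .
\end{align*}
```

The earth and moon wave functions therefore differ in p space only by a phase, so their momentum distributions coincide and overlap completely.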
Exercise 10.3.1.* Two identical bosons are found to be in states $|\phi\rangle$ and $|\psi\rangle$. Write
down the normalized state vector describing the system when $\langle\phi|\psi\rangle \neq 0$.
Exercise 10.3.2.* When an energy measurement is made on a system of three bosons in
a box, the n values obtained were 3, 3, and 4. Write down a symmetrized, normalized state
vector.
Exercise 10.3.3. * Imagine a situation in which there are three particles and only three
states a, b, and c available to them. Show that the total number of allowed, distinct
configurations for this system is
(1) 27 if they are labeled
(2) 10 if they are bosons
(3) 1 if they are fermions
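The three counts quoted in Exercise 10.3.3 can be confirmed by enumeration, which may help in checking an analytic argument. The sketch below treats a labeled configuration as an ordered assignment of states to particles, a bosonic one as a multiset of occupied states, and a fermionic one as a set with no state used twice.

```python
from itertools import combinations, combinations_with_replacement, product

states = ("a", "b", "c")
n_particles = 3

labeled = list(product(states, repeat=n_particles))                # ordered assignments
bosons = list(combinations_with_replacement(states, n_particles))  # occupation multisets
fermions = list(combinations(states, n_particles))                 # no state used twice

print(len(labeled), len(bosons), len(fermions))  # prints: 27 10 1
```

Printing the `bosons` list shows the ten occupation patterns explicitly, for example `('a', 'a', 'b')` for two particles in a and one in b.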
Exercise 10.3.4.* Two identical particles of mass m are in a one-dimensional box of
length L. Energy measurement of the system yields the value $E_{\text{sys}} = \hbar^2\pi^2/mL^2$. Write down
the state vector of the system. Repeat for $E_{\text{sys}} = 5\hbar^2\pi^2/2mL^2$. (There are two possible vectors
in this case.) You are not told if they are bosons or fermions. You may assume that the only
degrees of freedom are orbital.
Exercise 10.3.5.* Consider the exchange operator $P_{12}$ whose action on the X basis is
$$P_{12}\,|x_1, x_2\rangle = |x_2, x_1\rangle$$
(1) Show that $P_{12}$ has eigenvalues $\pm 1$. (It is Hermitian and unitary.)
(2) Show that its action on the basis ket $|\omega_1, \omega_2\rangle$ is also to exchange the labels 1 and
2, and hence that $V_{S/A}$ are its eigenspaces with eigenvalues $\pm 1$.
(3) Show that $P_{12}X_1P_{12} = X_2$, $P_{12}X_2P_{12} = X_1$, and similarly for $P_1$ and $P_2$. Then show that
$P_{12}\,\Omega(X_1, P_1; X_2, P_2)\,P_{12} = \Omega(X_2, P_2; X_1, P_1)$. [Consider the action on $|x_1, x_2\rangle$ or $|p_1, p_2\rangle$. As
for the functions of X and P, assume they are given by power series and consider any term
in the series. If you need help, peek into the discussion leading to Eq. (11.2.22).]
(4) Show that the Hamiltonian and propagator for two identical particles are left
unaffected under $H \to P_{12}HP_{12}$ and $U \to P_{12}UP_{12}$. Given this, show that any eigenstate of $P_{12}$
continues to remain an eigenstate with the same eigenvalue as time passes, i.e., elements of
$V_{S/A}$ never leave the symmetric or antisymmetric subspaces they start in.
Exercise 10.3.6.* Consider a composite object such as the hydrogen atom. Will it behave
as a boson or fermion? Argue in general that objects containing an even/odd number of
fermions will behave as bosons/fermions.
11
Symmetries and
Their Consequences
11.1. Overview
In Chapter 2, we explored the consequences of the symmetries of the Hamiltonian.
We saw the following:
(1) If $\mathcal{H}$ is invariant under the infinitesimal canonical transformation generated
by a variable g(q, p), then g is conserved.
(2) Any canonical transformation that leaves $\mathcal{H}$ invariant maps solutions to
the equations of motion into other solutions. Equivalently, an experiment and its
transformed version will give the same result if the transformation is canonical and
leaves $\mathcal{H}$ invariant.
Here we address the corresponding results in quantum mechanics.‡
11.2. Translational Invariance in Quantum Theory
Consider a single particle in one dimension. How shall we define translational
invariance? Since a particle in an arbitrary state has neither a well-defined position
nor a well-defined energy, we cannot define translational invariance to be the invariance
of the energy under an infinitesimal shift in the particle position. Our previous
experience, however, suggests that in the quantum formulation the expectation values
should play the role of the classical variables. We therefore make the correspondence
shown in Table 11.1.
Having agreed to formulate the problem in terms of expectation values, we still
have two equivalent ways to interpret the transformations:
$$\langle X\rangle \to \langle X\rangle + \varepsilon \tag{11.2.1a}$$
$$\langle P\rangle \to \langle P\rangle \tag{11.2.1b}$$
‡ It may be worth refreshing your memory by going through Sections 2.7 and 2.8.
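Table 11.1. Correspondence between Classical and Quantum Mechanical Concepts Related to
Translational Invariance

Concept                     Classical mechanics               Quantum mechanics
Translation                 $x \to x + \varepsilon$           $\langle X\rangle \to \langle X\rangle + \varepsilon$
                            $p \to p$                         $\langle P\rangle \to \langle P\rangle$
Translational invariance    $\mathcal{H} \to \mathcal{H}$     $\langle H\rangle \to \langle H\rangle$
Conservation law            $\dot{p} = 0$                     $\langle \dot{P}\rangle = 0$ (anticipated)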
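The first is to say that under the infinitesimal translation, each state $|\psi\rangle$ gets
modified into a translated state, $|\psi_\varepsilon\rangle$, such that
$$\langle\psi_\varepsilon|X|\psi_\varepsilon\rangle = \langle\psi|X|\psi\rangle + \varepsilon \tag{11.2.2a}$$
$$\langle\psi_\varepsilon|P|\psi_\varepsilon\rangle = \langle\psi|P|\psi\rangle \tag{11.2.2b}$$
In terms of $T(\varepsilon)$, the translation operator, which translates the state (and which will
be constructed explicitly in a while)
$$T(\varepsilon)|\psi\rangle = |\psi_\varepsilon\rangle \tag{11.2.3}$$
Eq. (11.2.2) becomes
$$\langle\psi|T^\dagger(\varepsilon)XT(\varepsilon)|\psi\rangle = \langle\psi|X|\psi\rangle + \varepsilon \tag{11.2.4a}$$
$$\langle\psi|T^\dagger(\varepsilon)PT(\varepsilon)|\psi\rangle = \langle\psi|P|\psi\rangle \tag{11.2.4b}$$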
This point of view is called the active transformation picture (in the terminology of
Section 1.7) and corresponds to physically displacing the particle to the right by $\varepsilon$.
The second point of view is to say that nothing happens to the state vectors; it
is the operators X and P that get modified by $T(\varepsilon)$ as follows:
$$X \to T^\dagger(\varepsilon)XT(\varepsilon)$$
$$P \to T^\dagger(\varepsilon)PT(\varepsilon)$$
such that
$$T^\dagger(\varepsilon)XT(\varepsilon) = X + \varepsilon I \tag{11.2.5a}$$
$$T^\dagger(\varepsilon)PT(\varepsilon) = P \tag{11.2.5b}$$
This is called the passive transformation picture. Physically it corresponds to moving
the environment (the coordinate system, sources of external field if any, etc.) to the
left by $\varepsilon$.
Physically, the equivalence of the active and passive pictures is due to the fact
that moving the particle one way is equivalent to moving the environment the other
way by an equal amount.
Mathematically, we show the equivalence as follows. If we sandwich the operator
equation (11.2.5) between $\langle\psi|$ and $|\psi\rangle$, we get Eq. (11.2.4). To go the other way,
we first rewrite Eq. (11.2.4) as
$$\langle\psi|T^\dagger(\varepsilon)XT(\varepsilon) - X - \varepsilon I|\psi\rangle = 0$$
$$\langle\psi|T^\dagger(\varepsilon)PT(\varepsilon) - P|\psi\rangle = 0$$
We now reason as follows:
(1) The operators being sandwiched are Hermitian (verify).
(2) Since $|\psi\rangle$ is arbitrary, we can choose it to be any of the eigenvectors of
these operators. It follows that all the eigenvalues vanish.
(3) The operators themselves vanish, implying Eq. (11.2.5).
In what follows, we will examine both pictures. We will find that it is possible
to construct $T(\varepsilon)$ given either of Eqs. (11.2.4) or (11.2.5), and of course that the
two yield the same result. The active transformation picture is nice in that we work
with the quantum state $|\psi\rangle$, which now plays the role of the classical state (x, p).
The passive transformation picture is nice because the response of the quantum
operators X and P to a translation is formally similar to that of their classical
counterparts.‡
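We begin by discussing translations in terms of active transformations. Let us
examine how the ket $|\psi_\varepsilon\rangle$ is related to $|\psi\rangle$ or, equivalently, the action of the Hilbert
space operator $T(\varepsilon)$. The answer appears obvious if we work with kets of definite
position, $|x\rangle$. In this case it is clear that
$$T(\varepsilon)|x\rangle = |x + \varepsilon\rangle \tag{11.2.6}$$
In other words, if the particle is originally at x, it must end up at $x + \varepsilon$. Notice that
$T(\varepsilon)$ is unitary: it acts on an orthonormal basis $|x\rangle$, $-\infty \le x \le \infty$, and gives another,
$|x + \varepsilon\rangle$, $-\infty \le x + \varepsilon \le \infty$. Once the action of $T(\varepsilon)$ on a complete basis is known, its
action on any ket $|\psi\rangle$ follows:
$$|\psi_\varepsilon\rangle = T(\varepsilon)|\psi\rangle = T(\varepsilon)\int_{-\infty}^{\infty}|x\rangle\langle x|\psi\rangle\,dx = \int_{-\infty}^{\infty}|x+\varepsilon\rangle\langle x|\psi\rangle\,dx$$
$$= \int_{-\infty}^{\infty}|x'\rangle\langle x'-\varepsilon|\psi\rangle\,dx' \qquad (x' = x + \varepsilon) \tag{11.2.7}$$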
In other words, if
$$\langle x|\psi\rangle = \psi(x)$$
then
$$\langle x|T(\varepsilon)|\psi\rangle = \psi(x - \varepsilon) \tag{11.2.8}$$
‡ As we shall see, it is this point of view that best exposes many formal relations between classical and
quantum mechanics.
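For example, if $\psi(x) \sim e^{-x^2}$ is a Gaussian peaked at the origin,
$\psi(x - \varepsilon) \sim e^{-(x-\varepsilon)^2}$ is an identical Gaussian peaked at $x = \varepsilon$. Thus the wave function
$\psi_\varepsilon(x)$ is obtained by translating (without distortion) the wave function $\psi(x)$ by an
amount $\varepsilon$ to the right. You may verify that the action of $T(\varepsilon)$ defined by Eq. (11.2.8)
satisfies the condition Eq. (11.2.1a). How about the condition Eq. (11.2.1b)? It is
automatically satisfied:
$$\langle\psi_\varepsilon|P|\psi_\varepsilon\rangle = \int_{-\infty}^{\infty}\psi_\varepsilon^*(x)\left(-i\hbar\frac{d}{dx}\right)\psi_\varepsilon(x)\,dx$$
$$= \int_{-\infty}^{\infty}\psi^*(x-\varepsilon)\left(-i\hbar\frac{d}{dx}\right)\psi(x-\varepsilon)\,dx$$
$$= \int_{-\infty}^{\infty}\psi^*(x')\left(-i\hbar\frac{d}{dx'}\right)\psi(x')\,dx' \qquad (x' = x - \varepsilon)$$
$$= \langle\psi|P|\psi\rangle \tag{11.2.9}$$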
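The two conditions, Eq. (11.2.1a) and Eq. (11.2.1b), can also be checked numerically for a translated wave function. What follows is only a minimal sketch: the particular packet $\psi(x) = \exp(-x^2/2 + ik_0x)$, the choice of units with $\hbar = 1$, the grid, and all function names are assumptions made for illustration and do not come from the text.

```python
import cmath

HBAR = 1.0  # units with hbar = 1 (an illustrative assumption)

def psi(x, k0=1.5):
    """Gaussian packet centred at the origin with mean momentum HBAR * k0."""
    return cmath.exp(-x * x / 2.0 + 1j * k0 * x)

def translated(f, eps):
    """Wave function of T(eps)|psi>, as in Eq. (11.2.8): x -> psi(x - eps)."""
    return lambda x: f(x - eps)

def expectations(f, lo=-12.0, hi=12.0, n=4801):
    """Return (<X>, <P>) for the wave function f, using Riemann sums."""
    dx = (hi - lo) / (n - 1)
    norm = mean_x = mean_p = 0.0
    for i in range(n):
        x = lo + i * dx
        fx = f(x)
        dfdx = (f(x + dx) - f(x - dx)) / (2.0 * dx)  # central difference
        norm += abs(fx) ** 2 * dx
        mean_x += x * abs(fx) ** 2 * dx
        # integrand of <P>: psi*(x) (-i hbar d/dx) psi(x); its integral is real
        mean_p += (fx.conjugate() * (-1j * HBAR) * dfdx).real * dx
    return mean_x / norm, mean_p / norm

eps = 0.3
x0, p0 = expectations(psi)
x1, p1 = expectations(translated(psi, eps))
# The first difference should be close to eps, the second close to zero.
print(x1 - x0, p1 - p0)
```

The packet carries a nonzero mean momentum on purpose, so that the statement that $\langle P\rangle$ is unchanged is not trivially $0 = 0$.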
Now there is something odd here. Classically, translation is specified by two
independent relations
$$x \to x + \varepsilon$$
$$p \to p$$
while in the quantum version we seem to find that in enforcing the former (on position
eigenkets), the latter automatically follows. The reason is that in our derivation we
have assumed more than what was explicitly stated. We reasoned earlier, on physical
grounds, that since a particle initially located at x must end up at $x + \varepsilon$, it follows
that
$$T(\varepsilon)|x\rangle = |x + \varepsilon\rangle$$
While our intuition was correct, our implementation was not. As seen in Chapter 7,
the X basis is not unique, and the general result consistent with our intuition is not
Eq. (11.2.6) but rather
$$T(\varepsilon)|x\rangle = e^{i\varepsilon g(x)/\hbar}|x + \varepsilon\rangle \tag{11.2.10}$$
(Note that as $\varepsilon \to 0$, $T(\varepsilon)|x\rangle \to |x\rangle$ as it should.) In ignoring g(x), we had essentially
assumed the quantum analog of $p \to p$. Let us see how. If we start with Eq. (11.2.10)
instead of Eq. (11.2.6), we find that
$$\langle X\rangle \xrightarrow{T(\varepsilon)} \langle X\rangle + \varepsilon \tag{11.2.11a}$$
$$\langle P\rangle \xrightarrow{T(\varepsilon)} \langle P\rangle + \varepsilon\langle f(X)\rangle \tag{11.2.11b}$$
where $f = g'$. Demanding now that $\langle P\rangle \to \langle P\rangle$, we eliminate f and reduce g to a
harmless constant (which can be chosen to be 0).
Exercise 11.2.1. Verify Eq. (11.2.11b).
Note that there was nothing wrong with our initial choice $T(\varepsilon)|x\rangle = |x + \varepsilon\rangle$; it
was too restrictive given just the requirement $\langle X\rangle \to \langle X\rangle + \varepsilon$, but not so if we also
considered $\langle P\rangle \to \langle P\rangle$. This situation reappears when we go to two or three dimensions
and when we consider rotations. In all those cases we will make the analog of
the naive choice $T(\varepsilon)|x\rangle = |x + \varepsilon\rangle$ to shorten the derivations.
Having defined translations, let us now define translational invariance in the
same spirit. We define it by the requirement
$$\langle\psi|H|\psi\rangle = \langle\psi_\varepsilon|H|\psi_\varepsilon\rangle \tag{11.2.12}$$
To derive the conservation law that goes with the above equation, we must first
construct the operator $T(\varepsilon)$ explicitly. Since $\varepsilon = 0$ corresponds to no translation, we
may expand $T(\varepsilon)$ to order $\varepsilon$ as
$$T(\varepsilon) = I - \frac{i\varepsilon}{\hbar}G \tag{11.2.13}$$
The operator G, called the generator of translations, is Hermitian (see Exercise 11.2.2
for the proof) and is to be determined. The constant $(-i/\hbar)$ is introduced in anticipation
of what is to follow.
Exercise 11.2.2.* Using $T^\dagger(\varepsilon)T(\varepsilon) = I$ to order $\varepsilon$, deduce that $G^\dagger = G$.
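We find G by turning to Eq. (11.2.8):
$$\langle x|T(\varepsilon)|\psi\rangle = \psi(x - \varepsilon)$$
Expanding both sides to order $\varepsilon$, we find
$$\langle x|I|\psi\rangle - \frac{i\varepsilon}{\hbar}\langle x|G|\psi\rangle = \psi(x) - \varepsilon\frac{d\psi}{dx}$$
so that
$$\langle x|G|\psi\rangle = -i\hbar\frac{d\psi}{dx}$$
Clearly G is the momentum operator,
$$G = P$$
and
$$T(\varepsilon) = I - \frac{i\varepsilon}{\hbar}P \tag{11.2.14}$$
We see that exactly as in classical mechanics, the momentum is the generator of
(infinitesimal) translations.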
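The momentum conservation law now follows from translational invariance,
Eq. (11.2.12), if we combine it with Eq. (11.2.14):
$$\langle\psi|H|\psi\rangle = \langle\psi_\varepsilon|H|\psi_\varepsilon\rangle$$
$$= \langle T(\varepsilon)\psi|H|T(\varepsilon)\psi\rangle = \langle\psi|T^\dagger(\varepsilon)HT(\varepsilon)|\psi\rangle$$
$$= \langle\psi|\left(I + \frac{i\varepsilon}{\hbar}P\right)H\left(I - \frac{i\varepsilon}{\hbar}P\right)|\psi\rangle$$
$$= \langle\psi|H|\psi\rangle + \frac{i\varepsilon}{\hbar}\langle\psi|[P, H]|\psi\rangle + O(\varepsilon^2)$$
so that we get, upon equating the coefficient of $\varepsilon$ to zero,
$$\langle\psi|[P, H]|\psi\rangle = 0 \tag{11.2.15}$$
It now follows from Ehrenfest's theorem that
$$\langle[P, H]\rangle = 0 \;\to\; \langle\dot{P}\rangle = 0 \tag{11.2.16}$$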
Translation in Terms of Passive Transformations
Let us rederive $T(\varepsilon)$, given that it acts as follows on X and P:
$$T^\dagger(\varepsilon)XT(\varepsilon) = X + \varepsilon I \tag{11.2.17a}$$
$$T^\dagger(\varepsilon)PT(\varepsilon) = P \tag{11.2.17b}$$
The operator $T^\dagger(\varepsilon)XT(\varepsilon)$ is also a position operator, but it measures position
from a new origin, shifted to the left by $\varepsilon$. This is the meaning of Eq. (11.2.17a).
Equation (11.2.17b) states that under the shift in the origin, the momentum is
unaffected.
Writing once again
$$T(\varepsilon) = I - \frac{i\varepsilon}{\hbar}G$$
we find from Eq. (11.2.17a) (using the fact that $G^\dagger = G$)
$$\left(I + \frac{i\varepsilon}{\hbar}G\right)X\left(I - \frac{i\varepsilon}{\hbar}G\right) = X + \varepsilon I$$
or
$$-\frac{i\varepsilon}{\hbar}[X, G] = \varepsilon I \tag{11.2.18a}$$
$$[X, G] = i\hbar I \tag{11.2.18b}$$
This allows us to conclude that
$$G = P + f(X) \tag{11.2.19}$$
If we now turn to Eq. (11.2.17b) we find
$$-\frac{i\varepsilon}{\hbar}[P, G] = 0 \tag{11.2.20a}$$
or
$$[P, G] = 0 \tag{11.2.20b}$$
which eliminates f(X).‡ So once again
$$T(\varepsilon) = I - \frac{i\varepsilon P}{\hbar}$$
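The two steps just taken, from Eq. (11.2.18b) to Eq. (11.2.19) and from Eq. (11.2.20b) to the elimination of f, may be spelled out as follows. This is a sketch of the standard argument rather than a quotation from the text; it assumes a single particle in one dimension, $[X, P] = i\hbar I$, and an f(X) that admits a power series.

```latex
\begin{align*}
[X, G] = i\hbar I = [X, P]
  &\;\Longrightarrow\; [X, G - P] = 0
  \;\Longrightarrow\; G - P = f(X), \\
[P, G] = [P, P] + [P, f(X)] = -i\hbar\, f'(X) = 0
  &\;\Longrightarrow\; f(X) = \text{const}\cdot I .
\end{align*}
```

The first line uses the fact that, for one degree of freedom, an operator commuting with X is a function of X alone; the second uses $[P, X^n] = -i\hbar\, n X^{n-1}$ term by term in the series for f.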
Having derived the translation operator in the passive transformation picture, let us
reexamine the notion of translational invariance.
We define translational invariance by the requirement
$$T^\dagger(\varepsilon)HT(\varepsilon) = H \tag{11.2.21}$$
‡ For the purists, it reduces f to a c number which commutes with X and P, which we choose to be zero.
We can rewrite Eq. (11.2.21) in a form that is closer to the classical definition
of translational invariance. But first we need the following result: for any $\Omega(X, P)$
that can be expanded in a power series, and for any unitary operator U,
$$U^\dagger\Omega(X, P)U = \Omega(U^\dagger XU, U^\dagger PU)$$
For the proof, consider a typical term in the series such as $PX^2P$. We have, using
$UU^\dagger = I$,
$$U^\dagger PX^2PU = U^\dagger PU\,U^\dagger XU\,U^\dagger XU\,U^\dagger PU \qquad \text{Q.E.D.}$$
Applying this result to the case $U = T(\varepsilon)$ we get the response of any dynamical
variable to a translation:
$$\Omega(X, P) \to T^\dagger\Omega(X, P)T = \Omega(T^\dagger XT, T^\dagger PT) = \Omega(X + \varepsilon I, P) \tag{11.2.22}$$
Thus the transformed $\Omega$ is found by replacing X by $X + \varepsilon I$ and P by P. If we now
apply this to Eq. (11.2.21) we get the following definition of translation invariance:
$$H(X + \varepsilon I, P) = H(X, P) \tag{11.2.23}$$
Not only does this condition have the same form as its classical counterpart
$$\mathcal{H}(x + \varepsilon, p) = \mathcal{H}(x, p)$$
but it is also satisfied whenever the classical counterpart is. The reason is simply that
H is the same function of X and P as $\mathcal{H}$ is of x and p, and both sets of variables
undergo identical changes in a translation.
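The conservation of momentum follows if we write $T(\varepsilon)$ in Eq. (11.2.21) in
terms of P and expand things out to first order in $\varepsilon$:
$$0 = T^\dagger(\varepsilon)HT(\varepsilon) - H = (I + i\varepsilon P/\hbar)H(I - i\varepsilon P/\hbar) - H = -\frac{i\varepsilon}{\hbar}[H, P] \tag{11.2.24}$$
which implies that $\langle\dot{P}\rangle = 0$, because of Ehrenfest's theorem.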
A Digression on the Analogy with Classical Mechanics‡
‡ In a less advanced course, the reader may skip this digression.
The passive transformation picture has the virtue that it bears a close formal
resemblance to classical mechanics, with operators $\Omega$ in place of the classical variables
$\omega$ [Eqs. (11.2.17), (11.2.22), (11.2.23)]. In fact, the infinitesimal unitary transformation
$T(\varepsilon)$ generated by P is the quantum image of the infinitesimal canonical transformation
generated by p: if we define the changes $\delta X$ and $\delta P$ by
$$\delta X = T^\dagger(\varepsilon)XT(\varepsilon) - X$$
$$\delta P = T^\dagger(\varepsilon)PT(\varepsilon) - P$$
we get, on the one hand, from Eq. (11.2.17),
$$\delta X = X + \varepsilon I - X = \varepsilon I$$
$$\delta P = P - P = 0$$
and on the other, from $T = I - i\varepsilon P/\hbar$ (working to first order in $\varepsilon$),
$$\delta X = (I + i\varepsilon P/\hbar)X(I - i\varepsilon P/\hbar) - X = \frac{-i\varepsilon}{\hbar}[X, P]$$
$$\delta P = (I + i\varepsilon P/\hbar)P(I - i\varepsilon P/\hbar) - P = \frac{-i\varepsilon}{\hbar}[P, P]$$
combining which we obtain
$$\frac{-i\varepsilon}{\hbar}[X, P] = \varepsilon I$$
$$\frac{-i\varepsilon}{\hbar}[P, P] = 0$$
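More generally, upon combining Eq. (11.2.22) and $T = I - i\varepsilon P/\hbar$, we obtain
$$\delta\Omega = \frac{-i\varepsilon}{\hbar}[\Omega, P] = \Omega(X + \varepsilon I, P) - \Omega(X, P)$$
These are the analogs of the canonical transformation generated by p:
$$\delta x = \varepsilon\{x, p\} = \varepsilon$$
$$\delta p = \varepsilon\{p, p\} = 0$$
$$\delta\omega = \varepsilon\{\omega, p\} = \omega(x + \varepsilon, p) - \omega(x, p)$$
If the problem is translationally invariant, we have
$$\delta H = \frac{-i\varepsilon}{\hbar}[H, P] = 0 \;\to\; \langle\dot{P}\rangle = 0 \quad \text{by Ehrenfest's theorem}$$
while classically
$$\delta\mathcal{H} = \varepsilon\{\mathcal{H}, p\} = 0 \;\to\; \dot{p} = 0 \quad \text{by } \dot{p} = \{p, \mathcal{H}\}$$
The correspondence is achieved through the substitution rules already familiar to
us:
$$\Omega \leftrightarrow \omega, \qquad \frac{-i}{\hbar}[\Omega, \Lambda] \leftrightarrow \{\omega, \lambda\}$$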
In general, the infinitesimal canonical transformation generated by g(x, p),
$$\delta\omega = \varepsilon\{\omega, g\}$$
has as its image in quantum theory the infinitesimal unitary transformation
$U_G(\varepsilon) = I - i\varepsilon G/\hbar$ in response to which
$$\delta\Omega = \frac{-i\varepsilon}{\hbar}[\Omega, G]$$
Now, we have seen that the transformation generated by any g(x, p) is canonical,
i.e., it preserves the PB between the x's and the p's. In the quantum theory, the
quantities preserved are the commutation relations between the X's and the P's, for
if
$$[X_i, P_j] = i\hbar\delta_{ij}I$$
then upon premultiplying by the unitary operator $U_G^\dagger(\varepsilon)$ and postmultiplying by
$U_G(\varepsilon)$, we find that the transformed operators obey
$$[U^\dagger X_iU, U^\dagger P_jU] = i\hbar\delta_{ij}I$$
This completes the proof of the correspondence‡
{infinitesimal canonical transformation generated by g(x, p)}
$\leftrightarrow$ {infinitesimal unitary transformation generated by G(X, P)}
The correspondence holds for finite transformations as well, for these may be viewed
as a sequence of infinitesimal transformations.
‡ More generally if $[\Omega, \Theta] = \Gamma$, then a similar relation holds between the transformed operators $U^\dagger\Omega U$,
$U^\dagger\Theta U$, $U^\dagger\Gamma U$. This is the quantum version of the result that PB are invariant under canonical
transformation.
The correspondence with unitary transformations also holds for regular canonical
transformations which have no infinitesimal versions. For instance, in the
coupled oscillator problem, Exercise 10.1.3, we performed a canonical transformation
from $x_1, x_2, p_1, p_2$ to $x_I, x_{II}, p_I$, and $p_{II}$, where, for example, $x_I = (x_1 + x_2)/2$. In the
quantum theory there will exist a unitary operator such that, for example,
$U^\dagger X_1U = (X_1 + X_2)/2 = X_I$ and so on.‡
We can see why we can either perform the canonical transformation at the
classical level and then quantize, or first quantize and then perform the unitary
transformation: since the quantum operators respond to the unitary transformation
as do their classical counterparts to the canonical transformation, the end effect will
be the same.§
Let us now return to the problem of translational invariance. Notice that in a
problem with translational invariance, Eq. (11.2.24) tells us that we can find the
simultaneous eigenbasis of P and H. (This agrees with our result from Chapter 5,
that the energy eigenstates of a free particle could be chosen to be momentum
eigenstates as well.‖) If a system starts out in such an eigenstate, its momentum
eigenvalue remains constant. To prove this, first note that
$$[P, H] = 0 \;\to\; [P, U(t)] = 0 \tag{11.2.25}$$
since the propagator is a function of just H.*
Suppose at t = 0 we have a system in an eigenstate of P:
$$P|p\rangle = p|p\rangle \tag{11.2.26}$$
After time t, the state is $U(t)|p\rangle$ and we find
$$PU(t)|p\rangle = U(t)P|p\rangle = U(t)p|p\rangle = pU(t)|p\rangle \tag{11.2.27}$$
In other words, the state at time t is also an eigenstate of P with the same eigenvalue.
For such states with well-defined momentum, the conservation law $\langle\dot{P}\rangle = 0$ reduces
to the classical form, $\dot{p} = 0$.
‡ If the transformation is not regular, we cannot find a unitary transformation in the quantum theory,
since unitary transformations preserve the eigenvalue spectrum.
§ End of digression.
‖ Note that a single particle whose H is translationally invariant is necessarily free.
* When H is time independent, we know $U(t) = \exp(-iHt/\hbar)$. If H = H(t), the result is true if P commutes
with H(t) for all t. (Why?)
Finite Translations
What is the operator T(a) corresponding to a finite translation a? We find it by
the following trick. We divide the interval a into N parts of size a/N. As $N \to \infty$,
a/N becomes infinitesimal and we know
$$T(a/N) = I - \frac{iaP}{\hbar N} \tag{11.2.28}$$
Since a translation by a equals N translations by a/N,
$$T(a) = \lim_{N\to\infty}\left[T(a/N)\right]^N = e^{-iaP/\hbar} \tag{11.2.29}$$
by virtue of the formula
$$e^{-ax} = \lim_{N\to\infty}\left(1 - \frac{ax}{N}\right)^N$$
We may apply this formula, true for c numbers, to the present problem, since P is
the only operator in the picture and commutes with everything in sight, i.e., behaves
like a c number. Since
$$T(a) \xrightarrow[\text{X basis}]{} e^{-a\,d/dx} \tag{11.2.30}$$
we find
$$\langle x|T(a)|\psi\rangle = \psi(x) - \frac{d\psi}{dx}a + \frac{d^2\psi}{dx^2}\frac{a^2}{2!} + \cdots \tag{11.2.31}$$
which is the full Taylor series for $\psi(x - a)$ about the point x.
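Both ingredients of this argument can be tried out on a concrete case. The sketch below is illustrative only: it takes $\psi$ to be a polynomial (an assumption made so that the series (11.2.31) terminates and can be summed exactly), and the coefficient-list representation and function names are hypothetical choices, not anything from the text.

```python
import math

def derivative(coeffs):
    """Differentiate a polynomial stored as [c0, c1, c2, ...], c_k multiplying x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Value of the polynomial at the point x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

def translate_by_series(coeffs, a, x):
    """Sum the series of Eq. (11.2.31): sum over n of (-a)**n / n! * (d^n psi / dx^n)(x)."""
    total, term, n = 0.0, list(coeffs), 0
    while term:  # repeated derivatives of a polynomial eventually give the empty list
        total += (-a) ** n / math.factorial(n) * evaluate(term, x)
        term = derivative(term)
        n += 1
    return total

def limit_formula(a, x, big_n):
    """The c-number expression (1 - a x / N)**N, which tends to exp(-a x) as N grows."""
    return (1.0 - a * x / big_n) ** big_n

psi = [1.0, -2.0, 0.5, 3.0]   # psi(x) = 1 - 2x + 0.5 x^2 + 3 x^3
a, x = 0.7, 1.3
print(translate_by_series(psi, a, x), evaluate(psi, x - a))   # the two should agree
print(limit_formula(a, x, 10**6), math.exp(-a * x))           # close for large N
```

For a general (non-polynomial) $\psi$ the series does not terminate, and one would have to truncate it; the polynomial case is chosen here only because it makes the check exact.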
A Consistency Check. A translation by a followed by a translation by b equals
a translation by a + b. This result has nothing to do with quantum mechanics and
is true whether you are talking about a quantum system or a sack of potatoes. It is
merely a statement about how translations combine in space. Now, we have just
built operators T, which are supposed to translate quantum states. For this interpretation
to be consistent, it is necessary that the law of combination of the translation
operators coincide with the law of combination of the translations they represent.
Now, although we presumed this [see Eq. (11.2.29), and the line above it] in the very
act of deriving the formula for T(a), let us verify that our result $T(a) = \exp(-iaP/\hbar)$
satisfies
$$T(a)T(b) = T(a + b)\,? \tag{11.2.32}$$
We find that this is indeed so:
$$T(a)T(b) = e^{-iaP/\hbar}\cdot e^{-ibP/\hbar} = e^{-i(a+b)P/\hbar} = T(a + b) \tag{11.2.33}$$
A Digression on Finite Canonical and Unitary Transformations‡
‡ Optional.
Though it is clear that the correspondence between canonical and unitary transformations,
established for the infinitesimal case in earlier discussions, must carry
over to the finite case, let us nonetheless go through the details. Consider, for definiteness,
the case of translations. In the quantum theory we have
$$\Omega \to T^\dagger(a)\Omega T(a) = e^{iaP/\hbar}\,\Omega\,e^{-iaP/\hbar}$$
Using the identity
$$e^{-A}Be^{+A} = B + [B, A] + \frac{1}{2!}[[B, A], A] + \cdots$$
we find
$$\Omega \to \Omega + a\left(\frac{-i}{\hbar}\right)[\Omega, P] + \frac{a^2}{2!}\left(\frac{-i}{\hbar}\right)^2[[\Omega, P], P] + \cdots \tag{11.2.34}$$
For example, if we set $\Omega = X^2$ we get $X^2 \to (X + aI)^2$.
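The example $\Omega = X^2$ can be worked out explicitly from Eq. (11.2.34). The following is a sketch of that computation, using only $[X, P] = i\hbar I$; it is added here for illustration and is not part of the original text.

```latex
\begin{align*}
[X^2, P] &= X[X, P] + [X, P]X = 2i\hbar X, \\
[[X^2, P], P] &= 2i\hbar\,[X, P] = -2\hbar^2 I
   \qquad \text{(all higher nested commutators vanish)}, \\
X^2 \;\to\;& X^2 + a\left(\frac{-i}{\hbar}\right)(2i\hbar X)
   + \frac{a^2}{2!}\left(\frac{-i}{\hbar}\right)^2\left(-2\hbar^2 I\right)
   = X^2 + 2aX + a^2 I = (X + aI)^2 .
\end{align*}
```

The series terminates after the second-order term because the double commutator is already a multiple of the identity.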
In the classical case, under an infinitesimal displacement $\delta a$,
$$\delta\omega = \delta a\,\{\omega, p\}$$
or
$$\frac{d\omega}{da} = \{\omega, p\}$$
Applying the above result to the variable $d\omega/da$, we get
$$\frac{d}{da}(d\omega/da) = d^2\omega/da^2 = \{d\omega/da, p\} = \{\{\omega, p\}, p\}$$
and so on. The response of $\omega$ to the finite translation is given by the Taylor series
about the point a = 0:
$$\omega \to \omega + a\{\omega, p\} + \frac{a^2}{2!}\{\{\omega, p\}, p\} + \cdots \tag{11.2.35}$$
which we see is in correspondence with Eq. (11.2.34) if we make the usual
substitutions.
Exercise 11.2.3.* Recall that we found the finite rotation transformation from the infinitesimal
one, by solving differential equations (Section 2.8). Verify that if, instead, you relate
the transformed coordinates $\bar{x}$ and $\bar{y}$ to x and y by the infinite string of Poisson brackets, you
get the same result, $\bar{x} = x\cos\theta - y\sin\theta$, etc. (Recall the series for $\sin\theta$, etc.)

System of Particles
We will not belabor the extension of the preceding ideas to a system of N
particles. Starting with the analog of Eq. (11.2.8),
$$\langle x_1, \ldots, x_N|T(\varepsilon)|\psi\rangle = \psi(x_1 - \varepsilon, \ldots, x_N - \varepsilon) \tag{11.2.36}$$
we find, on expanding both sides to order $\varepsilon$, that
$$\langle x_1, \ldots, x_N|I - \frac{i\varepsilon}{\hbar}P|\psi\rangle = \psi(x_1, \ldots, x_N) - \sum_{i=1}^{N}\varepsilon\frac{\partial\psi}{\partial x_i} \tag{11.2.37}$$
from which it follows that
$$T(\varepsilon) = I - \frac{i\varepsilon}{\hbar}\sum_{i=1}^{N}P_i = I - \frac{i\varepsilon}{\hbar}P \tag{11.2.38}$$
where P is the total momentum operator. You may verify that
$$T^\dagger(\varepsilon)X_iT(\varepsilon) = X_i + \varepsilon I$$
$$T^\dagger(\varepsilon)P_iT(\varepsilon) = P_i, \qquad i = 1, \ldots, N \tag{11.2.39}$$
Translational invariance means in this case (suppressing indices),
$$H(X, P) = T^\dagger(\varepsilon)H(X, P)T(\varepsilon) = H(X + \varepsilon I, P) \tag{11.2.40}$$
Whereas in the single-particle case this implied the particle was free, here it merely
requires that H (or rather V) be a function of the coordinate differences. Any system
whose parts interact with each other, but nothing external, will have this property.
There are some profound consequences of translational invariance besides
momentum conservation. We take these up next.
Implications of Translational Invariance
Consider a system with translational invariance. Premultiplying both sides of
Eq. (11.2.21) with T and using its unitarity, we get
$$[T(a), H] = 0$$
It follows that
$$[T(a), U(t)] = 0 \quad \text{or} \quad T(a)U(t) = U(t)T(a) \tag{11.2.41}$$
Figure 11.1. A symbolic depiction of translational invariance. The states are represented
schematically by wave functions. [The figure shows $|\psi(0)\rangle$ and $T(a)|\psi(0)\rangle$, a distance a apart,
evolving into $U(t)|\psi(0)\rangle$ and $U(t)T(a)|\psi(0)\rangle = T(a)U(t)|\psi(0)\rangle$.]
The consequence of this relation is illustrated by the following example (Fig. 11.1).
At t = 0 two observers A and B prepare identical systems at x = 0 and x = a, respectively.
If $|\psi(0)\rangle$ is the state vector of the system prepared by A, then $T(a)|\psi(0)\rangle$ is
the state vector of the system prepared by B. The two systems look identical to the
observers who prepared them. After time t, the state vectors evolve into $U(t)|\psi(0)\rangle$
and $U(t)T(a)|\psi(0)\rangle$. Using Eq. (11.2.41) the latter may be rewritten as
$T(a)U(t)|\psi(0)\rangle$, which is just the translated version of A's system at time t. Therefore
the two systems, which differed only by a translation at t = 0, differ only by the same
translation at future times. In other words, the time evolution of each system appears
the same to the observer who prepared it. Translational invariance of H implies that
the same experiment repeated at two different places will give the same result (as
seen by the local observers). We have already seen this result in the classical framework.
We pursue it further now.
Now it turns out that every known interaction, whether gravitational, weak, electromagnetic,
or strong (e.g., nuclear), is translationally invariant, in that every experiment,
if repeated at a new site, will give the same result. Consider the following
illustrative example, which clarifies the meaning of this remark. A hydrogen atom
is placed between the plates of a charged condenser. The Hamiltonian is
$$H = \frac{|\mathbf{P}_1|^2}{2m_1} + \frac{|\mathbf{P}_2|^2}{2m_2} + \frac{e_1e_2}{|\mathbf{R}_1 - \mathbf{R}_2|} + e_1V(\mathbf{R}_1) + e_2V(\mathbf{R}_2) \tag{11.2.42}$$
where the subscripts 1 and 2 refer to the electron and the proton and $V(\mathbf{R})$‡ to the
potential due to the plates. Now this problem has no translation invariance, i.e.,
$$H(\mathbf{R}_1 + \boldsymbol{\varepsilon}, \mathbf{P}_1; \mathbf{R}_2 + \boldsymbol{\varepsilon}, \mathbf{P}_2) \neq H(\mathbf{R}_1, \mathbf{P}_1; \mathbf{R}_2, \mathbf{P}_2)$$
which in turn means that if the atom alone is translated (away from the condenser)
it will behave differently. But this does not correspond to repeating the same experiment
and getting a different result, since the condenser, which materially affects the
dynamics, is left behind. To incorporate it in what is translated, we redefine our
system to include the (N) charges on the condenser and write
$$H = \sum_{i=1}^{N+2}\frac{|\mathbf{P}_i|^2}{2m_i} + \frac{1}{2}\sum_{i=1}^{N+2}\sum_{j\neq i}^{N+2}\frac{e_ie_j}{|\mathbf{R}_i - \mathbf{R}_j|} \tag{11.2.43}$$
‡ Remember that $\mathbf{R}$ is the operator corresponding to the classical variable $\mathbf{r}$.
Now the charges on the condenser enter H, not via the external field which breaks
translational invariance, but through the Coulomb interaction, which does not. Now
it is true that (dropping indices),
$$H(\mathbf{R} + \boldsymbol{\varepsilon}, \mathbf{P}) = H(\mathbf{R}, \mathbf{P})$$
which implies that if the atom and the condenser are moved to a new site, the
behavior of the composite system will be unaffected. This result should be viewed
not as obvious or self-evident, but rather as a profound statement about the Coulomb
interaction.
The content of the assertion made above is that every known interaction has
translational invariance at the fundamental level: if we expand our system to include
all degrees of freedom that affect the outcome of an experiment (so that there are
no external fields, only interactions between parts of the system) the total H is
translationally invariant. This is why we apply momentum conservation to every
problem, whatever the underlying interaction. The translational invariance of
natural laws reflects the uniformity or homogeneity of space. The fact that the
dynamics of an isolated‡ system (the condenser plus atom in our example) depends
only on where the parts of the system are relative to each other and not on where
the system is as a whole, represents the fact that one part of free space is as good
as another.
It is translational invariance that allows experimentalists in different parts of
the earth to claim they all did the "same" experiment, and to corroborate, correct,
and complement each other. It is the invariance of the natural laws under translations
that allows us to describe a hydrogen atom in some distant star as we do one on
earth and to apply to its dynamics the quantum mechanical laws deduced on earth.
We will examine further consequences of translational invariance toward the end of
the next section.
‡ To be exact, no system is truly "isolated" except the whole universe (and only its momentum is exactly
conserved). But in practice one draws a line somewhere, between what constitutes the system and what
is irrelevant (for practical purposes) to its evolution. I use the term "isolated" in this practical sense.
The real utility of the concepts of translational invariance and momentum conservation lies in these
approximate situations. Who cares if the universe as a whole is translationally invariant and its momentum
is conserved? What matters to me is that I can take my equipment to another town and get the
same results and that the momentum of my system is conserved (to a good accuracy).
11.3. Time Translational Invariance
Just as the homogeneity of space ensures that the same experiment performed
at two different places gives the same result, homogeneity in time ensures that the
same experiment repeated at two different times gives the same result. Let us see
what feature of the Hamiltonian ensures this and what conservation law follows.
Let us prepare at time $t_1$ a system in state $|\psi_0\rangle$ and let it evolve for an infinitesimal
time $\varepsilon$. The state at time $t_1 + \varepsilon$, to first order in $\varepsilon$, will be
$$|\psi(t_1 + \varepsilon)\rangle = \left[I - \frac{i\varepsilon}{\hbar}H(t_1)\right]|\psi_0\rangle \tag{11.3.1}$$
If we repeat the experiment at time $t_2$, beginning with the same initial state, the state
at time $t_2 + \varepsilon$ will be
$$|\psi(t_2 + \varepsilon)\rangle = \left[I - \frac{i\varepsilon}{\hbar}H(t_2)\right]|\psi_0\rangle \tag{11.3.2}$$
The outcome will be the same in both cases if
$$0 = |\psi(t_2 + \varepsilon)\rangle - |\psi(t_1 + \varepsilon)\rangle = \left(\frac{-i\varepsilon}{\hbar}\right)[H(t_2) - H(t_1)]|\psi_0\rangle \tag{11.3.3}$$
Since $|\psi_0\rangle$ is arbitrary, it follows that
$$H(t_2) = H(t_1) \tag{11.3.4}$$
Since $t_2$ and $t_1$ are arbitrary, it follows that H is time-independent:
$$\frac{\partial H}{\partial t} = 0 \tag{11.3.5}$$
Thus time translational invariance requires that H have no t dependence. Now
Ehrenfest's theorem for an operator $\Omega$ that has no time dependence‡ is
$$i\hbar\langle\dot{\Omega}\rangle = \langle[\Omega, H]\rangle$$
Applying it to $\Omega = H$ in a problem with time translational invariance, we find
$$\langle\dot{H}\rangle = 0 \tag{11.3.6}$$
which is the law of conservation of energy.
‡ If $d\Omega/dt \neq 0$ there will be an extra piece $i\hbar\langle d\Omega/dt\rangle$ on the right-hand side.
An important simplification that arises if dH/dt = 0 is one we have repeatedly
exploited in the past: Schrödinger's equation
$$i\hbar\frac{\partial|\psi\rangle}{\partial t} = H|\psi\rangle \tag{11.3.7}$$
admits solutions of the form
$$|\psi(t)\rangle = |E\rangle e^{-iEt/\hbar} \tag{11.3.8}$$
where the time-independent ket $|E\rangle$ satisfies
$$H|E\rangle = E|E\rangle \tag{11.3.9}$$
The entire dynamics, i.e., the determination of the propagator U(t), boils down to
the solution of the time-independent Schrödinger equation (11.3.9).
The considerations that applied to space translation invariance apply here as
well. In particular, all known interactions, from gravitational to strong, are time-translational
invariant. Consequently, if we suitably define the system (to include the
sources of external fields that affect the experiment) the total H will be independent of
t. Consider, for example, a hydrogen atom between the plates of a discharging condenser.
If the system includes just the electron and the proton, H will depend on
time: it will have the form of Eq. (11.2.42), with V = V(R, t). This simply means that
repeating the experiment without recharging the condenser will lead to a different
result. If, however, we enlarge the system to include the N charges on the condenser,
we end up with the H in Eq. (11.2.43), which has no t dependence.
The space-time invariance of natural laws has a profound impact on our quest
for understanding nature. The very cycle of physics, that of deducing laws from some
phenomena studied at some time and place and then applying them to other phenomena
at a different time and place, rests on the assumption that natural laws are
space-time invariant. If nature were not to follow the same rules over space-time,
there would be no rules to find, just a sequence of haphazard events with no rhyme
or reason. By repeating the natural laws over and over through all of space-time,
nature gives tiny earthlings, who probe just a minuscule region of space for a fleeting
moment (in the cosmic scale), a chance of comprehending the universe at large.
Should we at times be despondent over the fact that we know so few of nature's
laws, let us find solace in these symmetry principles, which tell us that what little we
know is universal and eternal.‡
‡ The invariance of the laws of nature is not to be confused with our awareness of them, which does
change with time. For example, Einstein showed that Newtonian mechanics and gravitation are
approximations to relativistic mechanics and gravitation. But this is not to say that the Newtonian
scheme worked till Einstein came along. In other words, the relation of Newton's scheme to Einstein's
(as a good approximation in a certain limit) has always been the same, before and after we learned of
it.
11.4. Parity Invariance
Unlike space-time translations, and rotations (which we will study in the next
chapter), parity is a discrete transformation. Classically, the parity operation corresponds
to reflecting the state of the particle through the origin
$$x \xrightarrow{\text{parity}} -x, \qquad p \xrightarrow{\text{parity}} -p \tag{11.4.1}$$
In quantum theory, we define the action of the parity operator on the X basis
as follows
$$\Pi|x\rangle = |-x\rangle \tag{11.4.2}$$
in analogy with the classical case. Given this,
$$\Pi|p\rangle = |-p\rangle \tag{11.4.3}$$
follows, as you will see in a moment.
Given the action of $\Pi$ on a complete (X) basis, its action on an arbitrary ket
follows:
$$\Pi|\psi\rangle = \Pi\int_{-\infty}^{\infty}|x\rangle\langle x|\psi\rangle\,dx = \int_{-\infty}^{\infty}|-x\rangle\langle x|\psi\rangle\,dx$$
$$= \int_{-\infty}^{\infty}|x'\rangle\langle -x'|\psi\rangle\,dx' \qquad (\text{where } x' = -x) \tag{11.4.4}$$
It follows that if
$$\langle x|\psi\rangle = \psi(x)$$
then
$$\langle x|\Pi|\psi\rangle = \psi(-x) \tag{11.4.5}$$
The function $\psi(-x)$ is the mirror image of $\psi(x)$ about the origin. Applying Eq.
(11.4.5) to a momentum eigenstate, it will be readily found that $\Pi|p\rangle = |-p\rangle$.
The eigenvalues of $\Pi$ are just $\pm 1$. A moment's "reflection" will prove this.
Since
$$\Pi|x\rangle = |-x\rangle$$
$$\Pi^2|x\rangle = |-(-x)\rangle = |x\rangle$$
Since this is true for an entire basis,
$$\Pi^2 = I \tag{11.4.6}$$
It follows that
(1) $\Pi = \Pi^{-1}$.
(2) The eigenvalues of $\Pi$ are $\pm 1$.
(3) So $\Pi$ is Hermitian and unitary.
(4) So $\Pi^{-1} = \Pi^\dagger = \Pi$.
The eigenvectors with eigenvalue $\pm 1$ are said to have even/odd parity. In the X
basis, where
$$\psi(x) \xrightarrow{\Pi} \psi(-x)$$
even-parity vectors have even wave functions and odd-parity vectors have odd wave
functions. The same goes for the P basis since
$$\psi(p) \xrightarrow{\Pi} \psi(-p)$$
In an arbitrary $\Omega$ basis, $\psi(\omega)$ need not be even or odd even if $|\psi\rangle$ is a parity
eigenstate (check this).
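The statements $\Pi^2 = I$ and "eigenvalues $\pm 1$" have a simple numerical counterpart: on a grid symmetric about the origin, $\Pi$ just reverses the list of samples, and any wave function splits into an even and an odd piece. The sketch below is illustrative only; the grid, the off-center Gaussian, and the function names are assumptions, not taken from the text.

```python
import math

def parity(samples):
    """Apply Pi to a wave function sampled on a grid symmetric about x = 0.
    Since (Pi psi)(x) = psi(-x), this simply reverses the list of samples."""
    return samples[::-1]

def split_by_parity(samples):
    """Return the even-parity and odd-parity parts, (psi +/- Pi psi) / 2."""
    flipped = parity(samples)
    even = [(s + f) / 2 for s, f in zip(samples, flipped)]
    odd = [(s - f) / 2 for s, f in zip(samples, flipped)]
    return even, odd

xs = [i * 0.1 for i in range(-50, 51)]            # symmetric grid, includes x = 0
psi = [math.exp(-(x - 1.0) ** 2) for x in xs]     # Gaussian centred at x = 1: no definite parity
even, odd = split_by_parity(psi)

print(parity(parity(psi)) == psi)                 # Pi applied twice gives back psi
print(parity(even) == even)                       # eigenvalue +1
print(parity(odd) == [-o for o in odd])           # eigenvalue -1
```

The decomposition works because $(I \pm \Pi)/2$ are the projectors onto the two eigenspaces of $\Pi$, which is another way of saying that $\Pi^2 = I$.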
Rather than define $\Pi$ in terms of its action on the kets, we may also define it
through its action on the operators:
$$\Pi^\dagger X\Pi = -X$$
$$\Pi^\dagger P\Pi = -P \tag{11.4.7}$$
We say H(X, P) is parity invariant if
$$\Pi^\dagger H(X, P)\Pi = H(-X, -P) = H(X, P) \tag{11.4.8}$$
In this case
$$[\Pi, H] = 0$$
and a common eigenbasis of $\Pi$ and H can be found. In particular, if we consider
just bound states in one dimension (which we saw are nondegenerate), every eigenvector
of H is necessarily an eigenvector of $\Pi$. For example, the oscillator Hamiltonian
satisfies Eq. (11.4.8) and its eigenfunctions have definite parity equal to $(-1)^n$, n
being the quantum number of the state. The particle in a box has a parity-invariant
Hamiltonian if the box extends from $-L/2$ to $L/2$. In this case the eigenfunctions
have parity $(-1)^{n+1}$, n being the quantum number. If the box extends from 0 to L,
V(x) is not parity invariant and the eigenfunctions
$$\psi_n(x) = \left(\frac{2}{L}\right)^{1/2}\sin\left(\frac{n\pi x}{L}\right)$$
have no definite parity. (When $x \to -x$ they vanish, since $\psi_n$ is given by the sine
function only between 0 and L, and vanishes outside.)
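The claim that the symmetric box has eigenfunctions of parity $(-1)^{n+1}$ is easy to test numerically. The sketch below assumes the eigenfunctions of the box from $-L/2$ to $L/2$ in the shifted-sine form $(2/L)^{1/2}\sin[n\pi(x + L/2)/L]$, which is the 0-to-L solution moved to the left by $L/2$; this form, the value of L, and the function names are assumptions made for illustration.

```python
import math

L = 2.0  # box width (arbitrary illustrative value)

def psi_box(n, x):
    """n-th eigenfunction (n = 1, 2, ...) for a box extending from -L/2 to L/2."""
    if abs(x) > L / 2:
        return 0.0  # the wave function vanishes outside the box
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * (x + L / 2) / L)

def has_parity(n, sign, points=200, tol=1e-12):
    """True if psi_n(-x) = sign * psi_n(x) at sample points inside the box."""
    xs = [(-0.5 + (k + 0.5) / points) * L for k in range(points)]
    return all(abs(psi_box(n, -x) - sign * psi_box(n, x)) < tol for x in xs)

for n in range(1, 7):
    # each state should have parity (-1)**(n + 1): even for n = 1, odd for n = 2, ...
    print(n, has_parity(n, (-1) ** (n + 1)))
```

The ground state n = 1 comes out even, as it must for a nodeless wave function in a symmetric potential.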
If H is parity invariant, then
$$\Pi U(t) = U(t)\Pi \tag{11.4.9}$$
This means that if at t = 0 I start with a system in a state $|\psi(0)\rangle$, and someone
else starts with a system in the parity operated state $\Pi|\psi(0)\rangle$, then at a later time
the state of his system will be related to mine by the parity transformation.
Whereas all natural laws are invariant under space-time translations (and rotations)
some are not invariant under parity. These are the laws of weak interactions,
which are responsible for nuclear $\beta$ decay (among other things). This means formally
that the Hamiltonian cannot be made parity invariant by any redefinition of the
system if weak interactions are involved. Physically this means that if two observers
prepare initial states $|\psi(0)\rangle$ and $\Pi|\psi(0)\rangle$ which are mirror images of each other,
the final states $U(t)|\psi(0)\rangle$ and $U(t)\Pi|\psi(0)\rangle$ will not be mirror images of each other
(since $\Pi U \neq U\Pi$).‡ Consider the following concrete example of a $\beta$ decay:
$$^{60}\mathrm{Co} \to {}^{60}\mathrm{Ni} + e^- + \bar{\nu}$$
where $e^-$ is an electron and $\bar{\nu}$ is an antineutrino. Now it turns out that the electron
likes to come flying out in a direction opposite to the spin of $^{60}\mathrm{Co}$, and this implies
parity noninvariance. Let us see how. At t = 0 I prepare a system that consists of a
$^{60}\mathrm{Co}$ nucleus with its spin up along the z axis (Fig. 11.2) (experiment A). Although
you are not yet familiar with spin, you may pretend here that $^{60}\mathrm{Co}$ is spinning in
the sense shown. Let another observer set up another system which is just the mirror
image of mine (experiment B). Let M denote a fictitious experiment, which is what
I see in a mirror in front of me. Notice how the spin S gets reversed under a mirror
reflection. Let the $\beta$ decay take place. My electron comes out down the z axis. Of
course the mirror also shows an electron coming down the z axis. In the other real
experiment (B), the dynamics forces the electron to come up the z axis, since the
initial S was down. Thus B starts out as the mirror image of A but ends up different.
Consequently, what I see in the mirror (experiment M) does not correspond to what
can happen in real life, i.e., is not a solution to the equations of motion.
‡ See Exercise 11.4.4 for a discussion of why the parity transformation is essentially a mirror reflection
in three dimensions.
Figure 11.2. An example of parity noninvariance. In experiment A, which I perform, the spin of the
nucleus points up the z axis. In its actual mirror image, it points down (experiment M). In experiment
B, which is a real experiment, the spin is chosen to be down, i.e., B starts out as the mirror image of A.
After the decay, the momentum of my electron, $\mathbf{p}_e$, is down the z axis. The mirror image of course also
shows the electron coming down. But in the actual experiment B, the dynamics forces the electron to
come up the z axis, i.e., antiparallel to the initial nuclear spin S.
This then is the big difference between parity and other transformations such
as space-time translations and rotations. If a certain phenomenon can happen, its
translated or rotated version can also happen, but not its mirror-reflected version,
if the phenomenon involves weak interactions. In terms of conservation laws, if an
isolated system starts out in a state of definite parity, it need not end in a state of
the same parity if weak interactions are at work. The possibility that weak interactions
could be parity noninvariant was discussed in detail by Lee and Yang in 1956 and
confirmed shortly thereafter by the experiments of C. S. Wu and collaborators.‡
Exercise 11.4.1.* Prove that if $[\Pi, H]=0$, a system that starts out in a state of even/odd
parity maintains its parity. (Note that since parity is a discrete operation, it has no associated
conservation law in classical mechanics.)
Exercise 11.4.2.* A particle is in a potential
$$V(x) = V_0 \sin(2\pi x/a)$$
which is invariant under the translations $x \to x + ma$, where m is an integer. Is momentum
conserved? Why not?
Exercise 11.4.3. * You are told that in a certain reaction, the electron comes out with its
spin always parallel to its momentum. Argue that parity is violated.
Exercise 11.4.4.* We have treated parity as a mirror reflection. This is certainly true in
one dimension, where $x \to -x$ may be viewed as the effect of reflecting through a (point)
mirror at the origin. In higher dimensions when we use a plane mirror (say lying on the x-y
plane), only one (z) coordinate gets reversed, whereas the parity transformation reverses all
three coordinates.
Verify that reflection on a mirror in the x-y plane is the same as parity followed by a 180°
rotation about the z axis. Since rotational invariance holds for weak interactions, noninvariance
under mirror reflection implies noninvariance under parity.
‡ T. D. Lee and C. N. Yang, Phys. Rev., 104, 254 (1956); C. S. Wu, E. Ambler, R. W. Hayward, and
R. P. Hudson, Phys. Rev., 105, 1413 (1957).
11.5. Time-Reversal Symmetry
This is a discrete symmetry like parity. Let us first understand what it means in
classical physics. Consider a planet that is on a circular orbit around the sun. At $t=0$
it starts at $\theta=0$ and has a velocity in the direction of increasing $\theta$. In other words,
the orbit is running counterclockwise. Let us call the initial position and momentum
x(0), p(0). (We should really be using vectors, but ignore this fact for this discussion.)
We now define the time-reversed state as one in which the position is the same but
the momentum is reversed:
$$x_r(t) = x(t) \qquad p_r(t) = -p(t)$$
In general, any quantity like position or kinetic energy, which involves an even power
of t in its definition is left invariant and any quantity like momentum or angular
momentum is reversed in sign under the time-reversal operation.
Say that after time T the planet has come to a final state x(T), p(T) at $\theta = \pi/2$
after doing a quarter of a revolution. Now Superman (for reasons best known
to him) stops it dead in its tracks, reverses its velocity, and lets it go. What will it do?
We know it will retrace its path and at time 2T end up in the time-reversed state of
the initial state:
$$x(2T) = x(0) \qquad p(2T) = -p(0) \tag{11.5.1}$$
The above equation defines time-reversal invariance (TRI).
We can describe TRI more graphically as follows. Suppose we take a movie of
the planet from t = 0 to t= T. At t=T, we start playing the film backward. The
backward motion of the planet will bring it back to the time-reversed initial state at
t=2T. What we see in the movie can really happen, indeed, it was shown how
Superman could make it happen even as you are watching the movie. More generally,
if you see a movie of some planetary motion you will have no way of knowing if
the projector is running forwards or backward. In some movies they get a big laugh
out of the audience by showing cars and people zooming in reverse. As a serious
physics student you should not laugh when you see this since these motions obey
Newton's laws. In other words, it is perfectly possible for a set of people and cars
to execute this motion. On the other hand, when a cartoon character falling under
gravity suddenly starts clawing his way upwards in thin air using sheer will power,
you may laugh since this is a gross violation of Newton's laws.
While the correctness of Eq. (11.5.1) is intuitively clear, we will now prove it
with the help of Newton's Second Law using the fact that it is invariant under $t \to -t$:
the acceleration is even in time and the potential or force has no reference to t. Here
are the details. Just for this discussion let us use a new clock that has its zero at the
point of time-reversal, so that $t = 0$ defines the point when the motion is time-reversed.
When the movie is run backward we see the trajectory
$$x_r(t) = x(-t)$$
In other words, 5 seconds after the reversal, the object is where it was 5 seconds
before the reversal. The reversal of velocities follows from this:
$$\dot{x}_r(t) = \frac{dx(-t)}{dt} = -\frac{dx(-t)}{d(-t)} = -\dot{x}(-t)$$
and does not have to be additionally encoded. The question is this: Does this orbit
$x_r(t)$ obey Newton's Second Law
$$m\frac{d^2 x_r(t)}{dt^2} = F(x_r)$$
given that x(t) does? We find it does:
$$m\frac{d^2 x_r(t)}{dt^2} = m\frac{d^2 x(-t)}{dt^2} = m\frac{d^2 x(-t)}{d(-t)^2} = F(x(-t)) = F(x_r(t))$$
Not all problems are time-reversal invariant. Consider a positively charged particle
in the x-y plane moving under a magnetic field down the z axis. Let us say it
is released at $t=0$ just like the planet, with its velocity in the direction of increasing
$\theta$. Due to the $\mathbf{v} \times \mathbf{B}$ force it will go in a counterclockwise orbit. Let us wait till it has
gone around by $\pi/2$ and at this time, $t=T$, time-reverse its state. Will it return to
the time-reversed initial state at $t=2T$? No, it is readily seen that starting from $t=T$
it will once again go on a counterclockwise circular orbit tangential to the first at
the point of reversal. We blame the magnetic interaction for this failure of TRI: the
force now involves the velocity which is odd under time-reversal.
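Both classical thought experiments can be replayed numerically. The sketch below is illustrative only: the Runge-Kutta integrator, the units (unit mass, unit charge, $GM = 1$, $|B| = 1$, with the constant c absorbed), and the function names are assumptions made for this example.

```python
import numpy as np

def rk4_step(f, s, dt):
    """One classical fourth-order Runge-Kutta step for ds/dt = f(s)."""
    k1 = f(s)
    k2 = f(s + 0.5 * dt * k1)
    k3 = f(s + 0.5 * dt * k2)
    k4 = f(s + dt * k3)
    return s + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def evolve(f, s, T, steps=2000):
    dt = T / steps
    for _ in range(steps):
        s = rk4_step(f, s, dt)
    return s

def gravity(s):
    """Planet: state s = (x, y, vx, vy), inverse-square attraction to the origin."""
    x, y, vx, vy = s
    r3 = (x * x + y * y) ** 1.5
    return np.array([vx, vy, -x / r3, -y / r3])

def magnetic(s):
    """Positive charge in a uniform field pointing down the z axis: a = v x B."""
    x, y, vx, vy = s
    Bz = -1.0
    return np.array([vx, vy, vy * Bz, -vx * Bz])

def time_reverse(s):
    """Keep the position, flip the velocity."""
    return s * np.array([1.0, 1.0, -1.0, -1.0])

start = np.array([1.0, 0.0, 0.0, 1.0])   # theta = 0, moving counterclockwise
T = np.pi / 2                            # a quarter revolution in both problems

def reversal_experiment(f):
    """Evolve for T, time-reverse the state, evolve for another T."""
    return evolve(f, time_reverse(evolve(f, start, T)), T)

end_gravity = reversal_experiment(gravity)
end_magnetic = reversal_experiment(magnetic)
```

With these choices the planet ends at the time-reversed initial state, as Eq. (11.5.1) demands, whereas the charge ends on a second counterclockwise circle tangent to the first at the point of reversal, far from the time-reversed initial state.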
We now ask how all this appears in quantum mechanics. The ideas will be
illustrated in the simplest context. Let us consider a particle in one dimension with
a time-independent Hamiltonian H. In the x-representation the wave equation is
$$i\hbar\frac{\partial \psi(x, t)}{\partial t} = H(x)\psi(x, t)$$
Let us first note that
$$\psi(x, t) \rightarrow \psi^*(x, t)$$
performs time-reversal. This is clear from the fact that the detailed probability distribution
in x is unaffected by this change. On the other hand, it is clear from looking
at plane waves (or the momentum operator $-i\hbar(\partial/\partial x)$) that $p \to -p$ under complex
conjugation.
If the system has TRI, we must find the analog of Eq. (11.5.1). So let us prepare
a state $\psi(x, 0)$, let it evolve for time T, complex conjugate it, let that evolve for
another time T and see if we end up with the complex conjugate of the initial state.
We find the following happens at each stage:
$$\psi(x, 0) \rightarrow e^{-iH(x)T/\hbar}\psi(x, 0) \rightarrow e^{iH^*(x)T/\hbar}\psi^*(x, 0) \rightarrow e^{-iH(x)T/\hbar}e^{iH^*(x)T/\hbar}\psi^*(x, 0)$$
It is clear that in order for the end result, which is $\psi(x, 2T)$, to obey
$$\psi(x, 2T) = \psi^*(x, 0)$$
we require that
$$H(x) = H^*(x) \tag{11.5.2}$$
i.e., that the Hamiltonian be real. For $H = P^2/2m + V(x)$ this is the case, even in
higher dimensions. On the other hand, if we have a magnetic field, P enters linearly
and $H(x) \neq H^*(x)$.
If H has TRI, i.e., is real, we have seen at the end of Chapter 6 that every
eigenfunction implies a degenerate one which is its complex conjugate.
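The evolve, conjugate, evolve test described above is easy to try on a small matrix model. The sketch below is not from the text: it assumes a tight-binding ring as a stand-in for the Hamiltonian, with a complex hopping phase (a Peierls phase) playing the role of a term linear in P, and sets $\hbar = 1$.

```python
import numpy as np

def ring_hamiltonian(n, phase, hop=1.0):
    """Hermitian hopping matrix on a ring of n sites; phase = 0 gives a real H."""
    H = np.zeros((n, n), dtype=complex)
    for j in range(n):
        H[j, (j + 1) % n] = -hop * np.exp(1j * phase)
        H[(j + 1) % n, j] = -hop * np.exp(-1j * phase)
    return H

def propagator(H, T, hbar=1.0):
    """U(T) = exp(-i H T / hbar), built from the eigen-decomposition of H."""
    E, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * E * T / hbar)) @ V.conj().T

def tri_defect(H, psi0, T):
    """Norm of psi(2T) - psi0*, where psi(2T) = U (U psi0)*."""
    U = propagator(H, T)
    psi_2T = U @ np.conj(U @ psi0)
    return np.linalg.norm(psi_2T - np.conj(psi0))

n = 64
j = np.arange(n)
psi0 = np.exp(-(j - n / 2) ** 2 / (2 * 4.0**2)) * np.exp(0.5j * j)  # moving packet
psi0 = psi0 / np.linalg.norm(psi0)

defect_real = tri_defect(ring_hamiltonian(n, 0.0), psi0, T=2.0)
defect_complex = tri_defect(ring_hamiltonian(n, 0.3), psi0, T=2.0)
```

For the real Hamiltonian the defect vanishes to rounding error, which is Eq. (11.5.2) at work; for the complex one the final state differs from $\psi^*(x, 0)$ by an amount of order one.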
Notice that the failure of TRI in the presence of a magnetic field does not
represent any fundamental asymmetry under time-reversal in electrodynamics. The
laws of electrodynamics are invariant under $t \to -t$. The asymmetry in our example
arose due to our treating the magnetic field as external to the system and hence not
to be time-reversed. If we had included in our system the currents producing the
magnetic field, and reversed them also, the entire system would have followed the
time-reversed trajectory. Indeed, if you had taken a movie of the experiment and
played it back, and you could have seen the charges in the wire, you would have
found them running backward, the field would have been reversed at $t=T$, and the
charge we chose to focus on would have followed the time-reversed trajectory.
On the other hand, certain experiments together with general arguments from
quantum field theory suggest that there exist interactions in this universe which do
not have this symmetry at the fundamental level.
There are ways to formulate TRI in a basis-independent way but we will not
do so here. For most problems where the coordinate basis is the natural choice the
above discussion will do. There will be a minor twist when the problem involves
spin, which has no classical counterpart. This can be handled by treating spin as we
would treat orbital angular momentum.
12
Rotational Invariance
and Angular Momentum
In the last chapter on symmetries, rotational invariance was not discussed, not
because it is unimportant, but because it is all too important and deserves a chapter
on its own. The reason is that most of the problems we discuss involve a single
particle (which may be the reduced mass) in an external potential, and whereas
translational invariance of H implies that the particle is free, rotational invariance
of H leaves enough room for interesting dynamics. We first consider two dimensions
and then move on to three.
12.1. Translations in Two Dimensions
Although we are concerned mainly with rotations, let us quickly review translations
in two dimensions. By a straightforward extension of the arguments that led to
Eq. (11.2.14) from Eq. (11.2.13), we may deduce that the generators of infinitesimal
translations along the x and y directions are, respectively,
$$P_x \xrightarrow[\text{coordinate basis}]{} -i\hbar\frac{\partial}{\partial x} \tag{12.1.1}$$
$$P_y \xrightarrow[\text{coordinate basis}]{} -i\hbar\frac{\partial}{\partial y} \tag{12.1.2}$$
In terms of the vector operator $\mathbf{P}$, which represents momentum,
$$\mathbf{P} = P_x\mathbf{i} + P_y\mathbf{j} \tag{12.1.3}$$
$P_x$ and $P_y$ are the dot products of $\mathbf{P}$ with the unit vector ($\mathbf{i}$ or $\mathbf{j}$) in the direction of
the translation. Since there is nothing special about these two directions, we conclude
that in general,
$$\hat{n}\cdot\mathbf{P} \tag{12.1.4}$$
is the generator of translations in the direction of the unit vector $\hat{n}$. Finite translation
operators are found by exponentiation. Thus $T(\mathbf{a})$, which translates by $\mathbf{a}$, is given
by
$$T(\mathbf{a}) = e^{-ia\,\hat{a}\cdot\mathbf{P}/\hbar} = e^{-i\mathbf{a}\cdot\mathbf{P}/\hbar} \tag{12.1.5}$$
where $\hat{a} = \mathbf{a}/a$.
The Consistency Test. Let us now ask if the translation operators we have
constructed have the right laws of combination, i.e., if
$$T(\mathbf{b})T(\mathbf{a}) = T(\mathbf{a} + \mathbf{b}) \tag{12.1.6}$$
or equivalently if
$$e^{-i\mathbf{b}\cdot\mathbf{P}/\hbar}\,e^{-i\mathbf{a}\cdot\mathbf{P}/\hbar} = e^{-i(\mathbf{a}+\mathbf{b})\cdot\mathbf{P}/\hbar} \tag{12.1.7}$$
This amounts to asking if $P_x$ and $P_y$ may be treated as c numbers in manipulating
the exponentials. The answer is yes, since in view of Eqs. (12.1.1) and (12.1.2), the
operators commute
$$[P_x, P_y] = 0 \tag{12.1.8}$$
and their q number nature does not surface here. The commutativity of $P_x$ and $P_y$
reflects the commutativity of translations in the x and y directions.
Exercise 12.1.1.* Verify that $\hat{a}\cdot\mathbf{P}$ is the generator of infinitesimal translations along $\mathbf{a}$ by
considering the relation
$$\langle x, y|\,I - \frac{i}{\hbar}\,\delta\mathbf{a}\cdot\mathbf{P}\,|\psi\rangle = \psi(x - \delta a_x,\; y - \delta a_y)$$
12.2. Rotations in Two Dimensions
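Classically, a rotation $\phi_0\mathbf{k}$, i.e., by an angle $\phi_0$ about the z axis
(counterclockwise in the x-y plane), has the following effect on the state of a particle:
$$\begin{bmatrix} x \\ y \end{bmatrix} \rightarrow \begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix} = \begin{bmatrix} \cos\phi_0 & -\sin\phi_0 \\ \sin\phi_0 & \cos\phi_0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \tag{12.2.1}$$
$$\begin{bmatrix} p_x \\ p_y \end{bmatrix} \rightarrow \begin{bmatrix} \bar{p}_x \\ \bar{p}_y \end{bmatrix} = \begin{bmatrix} \cos\phi_0 & -\sin\phi_0 \\ \sin\phi_0 & \cos\phi_0 \end{bmatrix} \begin{bmatrix} p_x \\ p_y \end{bmatrix} \tag{12.2.2}$$
Let us denote the operator that rotates these two-dimensional vectors by $R(\phi_0\mathbf{k})$. It
is represented by the 2 × 2 matrix in Eqs. (12.2.1) and (12.2.2). Just as $T(\mathbf{a})$ is the
operator in Hilbert space associated with the translation $\mathbf{a}$, let $U[R(\phi_0\mathbf{k})]$ be the
operator associated with the rotation $R(\phi_0\mathbf{k})$. In the active transformation picture‡
$$|\psi\rangle \xrightarrow{U[R]} |\psi_R\rangle = U[R]|\psi\rangle \tag{12.2.3}$$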
The rotated state $|\psi_R\rangle$ must be such that
$$\langle X\rangle_R = \langle X\rangle\cos\phi_0 - \langle Y\rangle\sin\phi_0 \tag{12.2.4a}$$
$$\langle Y\rangle_R = \langle X\rangle\sin\phi_0 + \langle Y\rangle\cos\phi_0 \tag{12.2.4b}$$
$$\langle P_x\rangle_R = \langle P_x\rangle\cos\phi_0 - \langle P_y\rangle\sin\phi_0 \tag{12.2.5a}$$
$$\langle P_y\rangle_R = \langle P_x\rangle\sin\phi_0 + \langle P_y\rangle\cos\phi_0 \tag{12.2.5b}$$
where
$$\langle X\rangle_R = \langle\psi_R|X|\psi_R\rangle$$
and
$$\langle X\rangle = \langle\psi|X|\psi\rangle, \text{ etc.}$$
In analogy with the translation problem, we define the action of U[R] on position
eigenkets:
$$U[R]|x, y\rangle = |x\cos\phi_0 - y\sin\phi_0,\; x\sin\phi_0 + y\cos\phi_0\rangle \tag{12.2.6}$$
As in the case of translations, this equation is guided by more than just Eq. (12.2.4),
which specifies how $\langle X\rangle$ and $\langle Y\rangle$ transform: in omitting a possible phase factor
g(x, y), we are also ensuring that $\langle P_x\rangle$ and $\langle P_y\rangle$ transform as in Eq. (12.2.5).
One way to show this is to keep the phase factor and use Eqs. (12.2.5a) and
(12.2.5b) to eliminate it. We will take the simpler route of dropping it from the
outset and proving at the end that $\langle P_x\rangle$ and $\langle P_y\rangle$ transform according to Eq.
(12.2.5).
Explicit Construction of U[R]
Let us now construct U[R]. Consider first an infinitesimal rotation $\varepsilon_z\mathbf{k}$. In this
case we set
$$U[R(\varepsilon_z\mathbf{k})] = I - \frac{i\varepsilon_z L_z}{\hbar} \tag{12.2.7}$$
where $L_z$, the generator of infinitesimal rotations, is to be determined. Starting with
Eq. (12.2.6), which becomes to first order in $\varepsilon_z$
$$U[R]|x y\rangle = |x - y\varepsilon_z,\; x\varepsilon_z + y\rangle \tag{12.2.8}$$
it can be shown that
$$\langle x y|\,I - \frac{i\varepsilon_z L_z}{\hbar}\,|\psi\rangle = \psi(x + y\varepsilon_z,\; y - x\varepsilon_z) \tag{12.2.9}$$
‡ We will suppress the rotation angle when it is either irrelevant or obvious.
Exercise 12.2.1. * Provide the steps linking Eq. (12.2.8) to Eq. (12.2.9). [Hint: Recall the
derivation of Eq. (11.2.8) from Eq. (11.2.6).]
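Expanding both sides to order $\varepsilon_z$
$$\langle x y|I|\psi\rangle - \frac{i\varepsilon_z}{\hbar}\langle x y|L_z|\psi\rangle = \psi(x, y) + \frac{\partial\psi}{\partial x}(y\varepsilon_z) + \frac{\partial\psi}{\partial y}(-x\varepsilon_z)$$
So
$$L_z \xrightarrow[\text{coordinate basis}]{} x\left(-i\hbar\frac{\partial}{\partial y}\right) - y\left(-i\hbar\frac{\partial}{\partial x}\right) \tag{12.2.10}$$
or in the abstract
$$L_z = XP_y - YP_x \tag{12.2.11}$$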
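Let us verify that $\langle P_x\rangle$ and $\langle P_y\rangle$ transform according to Eq. (12.2.5). Since
$$L_z \xrightarrow[\text{momentum basis}]{} \left(i\hbar\frac{\partial}{\partial p_x}\right)p_y - \left(i\hbar\frac{\partial}{\partial p_y}\right)p_x \tag{12.2.12}$$
it is clear that
$$\langle p_x p_y|\,I - \frac{i\varepsilon_z L_z}{\hbar}\,|\psi\rangle = \psi(p_x, p_y) + \frac{\partial\psi}{\partial p_x}(p_y\varepsilon_z) + \frac{\partial\psi}{\partial p_y}(-p_x\varepsilon_z) \tag{12.2.13}$$
Thus $I - i\varepsilon_z L_z/\hbar$ rotates the momentum space wave function $\psi(p_x, p_y)$ by $\varepsilon_z$ in
momentum space, and as a result $\langle P_x\rangle$ and $\langle P_y\rangle$ transform just as $\langle X\rangle$ and $\langle Y\rangle$
do, i.e., in accordance with Eq. (12.2.5).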
We could have also derived Eq. (12.2.11) for $L_z$ by starting with the passive
transformation equations for an infinitesimal rotation:
$$U^\dagger[R]\,X\,U[R] = X - Y\varepsilon_z \tag{12.2.14a}$$
$$U^\dagger[R]\,Y\,U[R] = X\varepsilon_z + Y \tag{12.2.14b}$$
$$U^\dagger[R]\,P_x\,U[R] = P_x - P_y\varepsilon_z \tag{12.2.15a}$$
$$U^\dagger[R]\,P_y\,U[R] = P_x\varepsilon_z + P_y \tag{12.2.15b}$$
By feeding Eq. (12.2.7) into the above we can deduce that
$$[X, L_z] = -i\hbar Y \tag{12.2.16a}$$
$$[Y, L_z] = i\hbar X \tag{12.2.16b}$$
$$[P_x, L_z] = -i\hbar P_y \tag{12.2.17a}$$
$$[P_y, L_z] = i\hbar P_x \tag{12.2.17b}$$
These commutation relations suffice to fix $L_z$ as $XP_y - YP_x$.
Exercise 12.2.2. Using these commutation relations (and your keen hindsight) derive
$L_z = XP_y - YP_x$. At least show that Eqs. (12.2.16) and (12.2.17) are consistent with
$L_z = XP_y - YP_x$.
The finite operator $U[R(\phi_0\mathbf{k})]$ is
$$U[R(\phi_0\mathbf{k})] = \lim_{N\to\infty}\left(I - \frac{i}{\hbar}\frac{\phi_0}{N}L_z\right)^N = \exp(-i\phi_0 L_z/\hbar) \tag{12.2.18}$$
Given
$$L_z \xrightarrow[\text{coordinate basis}]{} x\left(-i\hbar\frac{\partial}{\partial y}\right) - y\left(-i\hbar\frac{\partial}{\partial x}\right)$$
it is hard to see that $e^{-i\phi_0 L_z/\hbar}$ rotates the state by the angle $\phi_0$. For one
thing, expanding the exponential is complicated by the fact that $x(-i\hbar\,\partial/\partial y)$ and
$y(-i\hbar\,\partial/\partial x)$ do not commute. So let us consider an alternative form for $L_z$. It can
be shown, by changing to polar coordinates, that
$$L_z \xrightarrow[\text{coordinate basis}]{} -i\hbar\frac{\partial}{\partial\phi} \tag{12.2.19}$$
This result can also be derived more directly by starting with the requirement
that under an infinitesimal rotation $\varepsilon_z\mathbf{k}$, $\psi(x, y) = \psi(\rho, \phi)$ becomes $\psi(\rho, \phi - \varepsilon_z)$.
Exercise 12.2.3.* Derive Eq. (12.2.19) by doing a coordinate transformation on Eq.
(12.2.10), and also by the direct method mentioned above.
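Now it is obvious that
$$\exp(-i\phi_0 L_z/\hbar) \xrightarrow[\text{coordinate basis}]{} \exp\left(-\phi_0\frac{\partial}{\partial\phi}\right) \tag{12.2.20}$$
rotates the state by an angle $\phi_0$ about the z axis, for
$$\exp(-\phi_0\,\partial/\partial\phi)\,\psi(\rho, \phi) = \psi(\rho, \phi - \phi_0)$$
by Taylor's theorem. It is also obvious that $U[R(\phi_0\mathbf{k})]\,U[R(\phi_0'\mathbf{k})] =
U[R((\phi_0 + \phi_0')\mathbf{k})]$. Thus the rotation operators have the right law of combination.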
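The action of $\exp(-i\phi_0 L_z/\hbar)$ can be illustrated numerically by working in the basis where $-i\hbar\,\partial/\partial\phi$ is diagonal. The sketch below is only an illustration: sampling the angular dependence on a uniform grid and using the discrete Fourier transform to reach the $e^{im\phi}$ basis are assumptions of the example, as are the function names and the test function.

```python
import numpy as np

def rotate(psi, phi0):
    """Apply exp(-i phi0 L_z / hbar) to samples of psi on a uniform grid in phi.

    In the e^{i m phi} basis L_z has eigenvalue m*hbar, so the operator
    multiplies the m-th Fourier component by exp(-i m phi0).
    """
    n = psi.size
    m = np.fft.fftfreq(n, d=1.0 / n)          # integer labels m = 0, 1, ..., -1
    return np.fft.ifft(np.exp(-1j * m * phi0) * np.fft.fft(psi))

def f(phi):
    """A smooth, periodic sample angular dependence (hypothetical)."""
    return np.exp(np.cos(phi)) * (1.0 + 0.5 * np.sin(2.0 * phi))

n = 128
phi = 2.0 * np.pi * np.arange(n) / n

rotated = rotate(f(phi), 0.7)                 # should equal f(phi - 0.7)
twice = rotate(rotate(f(phi), 0.3), 0.4)      # should equal a single 0.7 rotation
```

Here `rotated` reproduces $\psi(\phi - \phi_0)$ on the grid, and two successive rotations agree with a single rotation by the summed angle, which is the law of combination stated above.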
Physical Interpretation of $L_z$. We identify $L_z$ as the angular momentum operator,
since (i) it is obtained from $l_z = xp_y - yp_x$ by the usual substitution rule (Postulate
II), and (ii) it is the generator of infinitesimal rotations about the z axis. $L_z$ is
conserved in a problem with rotational invariance: if
$$U^\dagger[R]\,H(X, P_x; Y, P_y)\,U[R] = H(X, P_x; Y, P_y) \tag{12.2.21}$$
it follows (by choosing an infinitesimal rotation) that
$$[L_z, H] = 0 \tag{12.2.22}$$
Since X, $P_x$, Y, and $P_y$ respond to the rotation as do their classical counterparts
[Eqs. (12.2.14) and (12.2.15)] and H is the same function of these operators as $\mathscr{H}$
is of the corresponding classical variables, H is rotationally invariant whenever
$\mathscr{H}$ is.
Besides the conservation of $\langle L_z\rangle$, Eq. (12.2.22) also implies the following:
(1) An experiment and its rotated version will give the same result if H is rotationally
invariant.
(2) There exists a common basis for $L_z$ and H. (We will spend a lot of time discussing
this basis as we go along.)
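The Consistency Check. Let us now verify that our rotation and translation
operators combine as they should. In contrast to pure translations or rotations,
which have a simple law of composition, the combined effect of translations and
rotations is nothing very simple. We seem to be facing the prospect of considering
every possible combination of rotations and translations, finding their net effect, and
then verifying that the product of the corresponding quantum operators equals the
operator corresponding to the result of all the transformations. Let us take one small
step in this direction, which will prove to be a giant step toward our goal.
Consider the following product of four infinitesimal operations:
$$U[R(-\varepsilon_z\mathbf{k})]\,T(-\boldsymbol{\varepsilon})\,U[R(\varepsilon_z\mathbf{k})]\,T(\boldsymbol{\varepsilon})$$
where $\boldsymbol{\varepsilon} = \varepsilon_x\mathbf{i} + \varepsilon_y\mathbf{j}$. By subjecting a point in the x-y plane to these four operations
we find
$$\begin{bmatrix} x \\ y \end{bmatrix} \xrightarrow{T(\boldsymbol{\varepsilon})} \begin{bmatrix} x+\varepsilon_x \\ y+\varepsilon_y \end{bmatrix} \xrightarrow{R(\varepsilon_z\mathbf{k})} \begin{bmatrix} (x+\varepsilon_x)-(y+\varepsilon_y)\varepsilon_z \\ (x+\varepsilon_x)\varepsilon_z+(y+\varepsilon_y) \end{bmatrix}$$
$$\xrightarrow{T(-\boldsymbol{\varepsilon})} \begin{bmatrix} x-(y+\varepsilon_y)\varepsilon_z \\ (x+\varepsilon_x)\varepsilon_z+y \end{bmatrix} \xrightarrow{R(-\varepsilon_z\mathbf{k})} \begin{bmatrix} x-\varepsilon_y\varepsilon_z \\ y+\varepsilon_x\varepsilon_z \end{bmatrix} \tag{12.2.23}$$
i.e., that the net effect is a translation by $-\varepsilon_y\varepsilon_z\mathbf{i} + \varepsilon_x\varepsilon_z\mathbf{j}$.‡ In the above, we have
ignored terms involving $\varepsilon_x^2$, $\varepsilon_y^2$, $\varepsilon_z^2$, and beyond. We do, however, retain the $\varepsilon_x\varepsilon_z$
and $\varepsilon_y\varepsilon_z$ terms since they contain the first germ of noncommutativity. Note that
although these are second-order terms, they are fully determined in our approximation,
i.e., unaffected by the second-order terms that we have ignored. Equation
(12.2.23) imposes the following restriction on the quantum operators:
$$U[R(-\varepsilon_z\mathbf{k})]\,T(-\boldsymbol{\varepsilon})\,U[R(\varepsilon_z\mathbf{k})]\,T(\boldsymbol{\varepsilon}) = T(-\varepsilon_y\varepsilon_z\mathbf{i} + \varepsilon_x\varepsilon_z\mathbf{j}) \tag{12.2.24}$$
or
$$\left(I + \frac{i}{\hbar}\varepsilon_z L_z\right)\left[I + \frac{i}{\hbar}(\varepsilon_x P_x + \varepsilon_y P_y)\right]\left(I - \frac{i}{\hbar}\varepsilon_z L_z\right)\left[I - \frac{i}{\hbar}(\varepsilon_x P_x + \varepsilon_y P_y)\right]
= I + \frac{i}{\hbar}\varepsilon_y\varepsilon_z P_x - \frac{i}{\hbar}\varepsilon_x\varepsilon_z P_y \tag{12.2.25}$$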
By matching coefficients (you should do this) we can deduce the following
constraints:
$$[P_x, L_z] = -i\hbar P_y$$
$$[P_y, L_z] = i\hbar P_x$$
which are indeed satisfied by the generators [Eq. (12.2.17)].
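The classical side of this test, Eq. (12.2.23), can be checked directly by composing the four operations on points of the plane. The following sketch assumes a representation by 3 × 3 homogeneous-coordinate matrices and particular small parameter values; neither is taken from the text.

```python
import numpy as np

def translation(ax, ay):
    """Homogeneous-coordinate matrix translating a point (x, y, 1) by (ax, ay)."""
    return np.array([[1.0, 0.0, ax],
                     [0.0, 1.0, ay],
                     [0.0, 0.0, 1.0]])

def rotation(angle):
    """Counterclockwise rotation about the origin, as in Eq. (12.2.1)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

def fourfold(ex, ey, ez):
    """R(-ez) T(-e) R(ez) T(e); the rightmost factor acts first."""
    return rotation(-ez) @ translation(-ex, -ey) @ rotation(ez) @ translation(ex, ey)

ex, ey, ez = 1e-3, 2e-3, 3e-3          # illustrative small parameters
M = fourfold(ex, ey, ez)

linear_part = M[:2, :2]                 # should be the identity: no net rotation
net_shift = M[:2, 2]                    # net translation of every point
predicted = np.array([-ey * ez, ex * ez])
```

With exact rotation matrices the net shift differs from the prediction only by terms of order $\varepsilon_x\varepsilon_z^2$ and $\varepsilon_y\varepsilon_z^2$, which are the terms dropped in the text.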
So our operators have passed this test. But many other tests are possible. How
about the coefficients of terms such as $\varepsilon_x\varepsilon_z^2$, or more generally, how about finite
rotations? How about tests other than the fourfold product, such as one involving
14 translations and six rotations interlaced?
‡ Note that if rotations and translations commuted, the fourfold product would equal I, as can be seen
by rearranging the factors so that the two opposite rotations and the two opposite translations cancel
each other. The deviation from this result of I is a measure of noncommutativity. Given two symmetry
operations that do not commute, the fourfold product provides a nice characterization of their
noncommutativity. As we shall see, this characterization is complete.
There is a single answer to all these questions: there is no need to conduct any
further tests. Although it is beyond the scope of this book to explain why this is so,
it is not hard to explain when it is time to stop testing. We can stop the tests when
all possible commutators between the generators have been considered. In the present
case, given the generators $P_x$, $P_y$, and $L_z$, the possible commutators are $[P_x, L_z]$,
$[P_y, L_z]$, and $[P_x, P_y]$. We have just finished testing the first two. Although the third
was tested implicitly in the past, let us do it explicitly again. If we convert the law
of combination
$$\begin{bmatrix} x \\ y \end{bmatrix} \xrightarrow{T(\varepsilon_x\mathbf{i})} \begin{bmatrix} x+\varepsilon_x \\ y \end{bmatrix} \xrightarrow{T(\varepsilon_y\mathbf{j})} \begin{bmatrix} x+\varepsilon_x \\ y+\varepsilon_y \end{bmatrix} \xrightarrow{T(-\varepsilon_x\mathbf{i})} \begin{bmatrix} x \\ y+\varepsilon_y \end{bmatrix} \xrightarrow{T(-\varepsilon_y\mathbf{j})} \begin{bmatrix} x \\ y \end{bmatrix} \tag{12.2.26}$$
into the operator constraint
$$T(-\varepsilon_y\mathbf{j})\,T(-\varepsilon_x\mathbf{i})\,T(\varepsilon_y\mathbf{j})\,T(\varepsilon_x\mathbf{i}) = I \tag{12.2.27}$$
we deduce that
$$[P_x, P_y] = 0$$
which of course is satisfied by the generators $P_x$ and $P_y$. [Although earlier on, we
did not consider the fourfold product, Eq. (12.2.27), we did verify that the arguments
of the T operators combined according to the laws of vector analysis. Equation
(12.2.26) is just a special case which brings out the commutativity of $P_x$ and $P_y$.]
When I say that there are no further tests to be conducted, I mean the following:
(1) Every consistency test will reduce to just another relation between the com
mutators of the generators.
(2) This relation will be automatically satisfied if the generators pass the tests
we have finished conducting. The following exercise should illustrate this point.
Exercise 12.2.4.* Rederive the equivalent of Eq. (12.2.23) keeping terms of order $\varepsilon_x\varepsilon_z^2$.
(You may assume $\varepsilon_y = 0$.) Use this information to rewrite Eq. (12.2.24) to order $\varepsilon_x\varepsilon_z^2$.
By equating coefficients of this term deduce the constraint
$$-2L_z P_x L_z + P_x L_z^2 + L_z^2 P_x = \hbar^2 P_x$$
This seems to conflict with statement (1) made above, but not really, in view of the identity
$$-2\Lambda\Omega\Lambda + \Omega\Lambda^2 + \Lambda^2\Omega \equiv [\Lambda, [\Lambda, \Omega]]$$
Using the identity, verify that the new constraint coming from the $\varepsilon_x\varepsilon_z^2$ term is satisfied given
the commutation relations between $P_x$, $P_y$, and $L_z$.
Vector Operators
We call $\mathbf{V} = V_x\mathbf{i} + V_y\mathbf{j}$ a vector operator if $V_x$ and $V_y$ transform as components
of a vector under a passive transformation generated by U[R]:
$$U^\dagger[R]\,V_i\,U[R] = \sum_j R_{ij}V_j$$
where $R_{ij}$ is the 2 × 2 rotation matrix appearing in Eq. (12.2.1). Examples of $\mathbf{V}$ are
$\mathbf{P} = P_x\mathbf{i} + P_y\mathbf{j}$ and $\mathbf{R} = X\mathbf{i} + Y\mathbf{j}$ [see Eqs. (12.2.14) and (12.2.15)]. Note the twofold
character of a vector operator such as $\mathbf{P}$: on the one hand, its components are
operators in Hilbert space, and on the other, it transforms as a vector in $\mathbb{V}^2(R)$.
The same definition of a vector operator holds in three dimensions as well, with
the obvious difference that $R_{ij}$ is a 3 × 3 matrix.
12.3. The Eigenvalue Problem of $L_z$
We have seen that in a rotationally invariant problem, H and $L_z$ share a common
basis. In order to exploit this fact we must first find the eigenfunctions of $L_z$. We
begin by writing
$$L_z|l_z\rangle = l_z|l_z\rangle \tag{12.3.1}$$
in the coordinate basis:
$$-i\hbar\frac{\partial\psi_{l_z}(\rho, \phi)}{\partial\phi} = l_z\,\psi_{l_z}(\rho, \phi) \tag{12.3.2}$$
The solution to this equation is
$$\psi_{l_z}(\rho, \phi) = R(\rho)\,e^{il_z\phi/\hbar} \tag{12.3.3}$$
where $R(\rho)$ is an arbitrary function normalizable with respect to $\int_0^\infty \rho\,d\rho$.‡ We shall
have more to say about $R(\rho)$ in a moment. But first note that $l_z$ seems to be arbitrary:
it can even be complex since $\phi$ goes only from 0 to $2\pi$. (Compare this to the
eigenfunctions $e^{ipx/\hbar}$ of linear momentum, where we could argue that p had to be
real to keep $|\psi|$ bounded as $|x| \to \infty$.) The fact that complex eigenvalues enter the
answer signals that we are overlooking the Hermiticity constraint. Let us impose it.
The condition
$$\langle\psi_1|L_z|\psi_2\rangle = \langle\psi_2|L_z|\psi_1\rangle^* \tag{12.3.4}$$
becomes in the coordinate basis
$$\int_0^\infty\!\!\int_0^{2\pi} \psi_1^*\left(-i\hbar\frac{\partial}{\partial\phi}\right)\psi_2\,\rho\,d\rho\,d\phi = \left[\int_0^\infty\!\!\int_0^{2\pi} \psi_2^*\left(-i\hbar\frac{\partial}{\partial\phi}\right)\psi_1\,\rho\,d\rho\,d\phi\right]^* \tag{12.3.5}$$
If this requirement is to be satisfied for all $\psi_1$ and $\psi_2$, one can show (upon integrating
by parts) that it is enough if each $\psi$ obeys
$$\psi(\rho, 0) = \psi(\rho, 2\pi) \tag{12.3.6}$$
If we impose this constraint on the $L_z$ eigenfunctions, Eq. (12.3.3), we find
$$1 = e^{2\pi i l_z/\hbar} \tag{12.3.7}$$
This forces $l_z$ not merely to be real, but also to be an integral multiple of $\hbar$:
$$l_z = m\hbar, \qquad m = 0, \pm 1, \pm 2, \ldots \tag{12.3.8}$$
One calls m the magnetic quantum number. Notice that $l_z = m\hbar$ implies that $\psi$ is a
single-valued function of $\phi$. (However, see Exercise 12.3.2.)
‡ This will ensure that $\psi$ is normalizable with respect to
$\iint dx\,dy = \int_0^\infty\!\int_0^{2\pi} \rho\,d\rho\,d\phi$.
Exercise 12.3.1. Provide the steps linking Eq. (12.3.5) to Eq. (12.3.6).
Exercise 12.3.2. Let us try to deduce the restriction on $l_z$ from another angle. Consider
a superposition of two allowed $l_z$ eigenstates:
$$\psi(\rho, \phi) = A(\rho)\,e^{i\phi l_z/\hbar} + B(\rho)\,e^{i\phi l_z'/\hbar}$$
By demanding that upon a $2\pi$ rotation we get the same physical state (not necessarily the
same state vector), show that $l_z - l_z' = m\hbar$, where m is an integer. By arguing on the grounds
of symmetry that the allowed values of $l_z$ must be symmetric about zero, show that these
values are either $\ldots, 3\hbar/2, \hbar/2, -\hbar/2, -3\hbar/2, \ldots$ or $\ldots, 2\hbar, \hbar, 0, -\hbar, -2\hbar, \ldots$. It is not
possible to restrict $l_z$ any further this way.
Let us now return to the arbitrary function $R(\rho)$ that accompanies the eigenfunctions
of $L_z$. Its presence implies that the eigenvalue $l_z = m\hbar$ does not nail down
a unique state in Hilbert space but only a subspace $\mathbb{V}_m$. The dimensionality of this
space is clearly infinite, for the space of all normalizable functions R is infinite
dimensional. The natural thing to do at this point is to introduce some operator that
commutes with $L_z$ and whose simultaneous eigenfunctions with $L_z$ pick out a unique
basis in each $\mathbb{V}_m$. We shall see in a moment that the Hamiltonian in a rotationally
invariant problem does just this. Physically this means that a state is not uniquely
specified by just its angular momentum (which only fixes the angular part of the
wave function), but it can be specified by its energy and angular momentum in a
rotationally invariant problem.
It proves convenient to introduce the functions
$$\Phi_m(\phi) = (2\pi)^{-1/2}\,e^{im\phi} \tag{12.3.9}$$
which would have been nondegenerate eigenfunctions of $L_z$ if the $\rho$ coordinate had
not existed. These obey the orthonormality condition
$$\int_0^{2\pi} \Phi_m^*(\phi)\,\Phi_{m'}(\phi)\,d\phi = \delta_{mm'} \tag{12.3.10}$$
It will be seen that these functions play an important role in problems with rotational
invariance.
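As a concrete use of the functions $\Phi_m$, the probabilities for the outcomes $l_z = m\hbar$ can be read off by projecting the angular dependence of a wave function onto them. The sketch below does this numerically for the angular factor $\cos^2\phi$; the grid quadrature, the cutoff `m_max`, and the function names are assumptions of the example, and the radial factor is taken to be common to all terms so that it drops out.

```python
import numpy as np

def lz_probabilities(f, m_max=4, n=256):
    """Probabilities P(l_z = m*hbar) for a purely angular dependence f(phi).

    Coefficients c_m = integral of Phi_m^*(phi) f(phi) dphi are evaluated by a
    uniform-grid sum, then normalized over |m| <= m_max (an assumed cutoff).
    """
    phi = 2.0 * np.pi * np.arange(n) / n
    dphi = 2.0 * np.pi / n
    ms = np.arange(-m_max, m_max + 1)
    coeffs = np.array([
        np.sum(np.exp(-1j * m * phi) * f(phi)) * dphi / np.sqrt(2.0 * np.pi)
        for m in ms
    ])
    weights = np.abs(coeffs) ** 2
    return dict(zip(ms.tolist(), (weights / weights.sum()).tolist()))

probs = lz_probabilities(lambda phi: np.cos(phi) ** 2)
```

Since $\cos^2\phi = \tfrac{1}{2} + \tfrac{1}{4}(e^{2i\phi} + e^{-2i\phi})$, only $m = 0, \pm 2$ appear, with weights in the ratio 4 : 1 : 1.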
Exercise 12.3.3.* A particle is described by a wave function
$$\psi(\rho, \phi) = A\,e^{-\rho^2/2\Delta^2}\cos^2\phi$$
Show (by expressing $\cos^2\phi$ in terms of $\Phi_m$) that
$$P(l_z = 0) = 2/3$$
$$P(l_z = 2\hbar) = 1/6$$
$$P(l_z = -2\hbar) = 1/6$$
(Hint: Argue that the radial part $e^{-\rho^2/2\Delta^2}$ is irrelevant here.)
Exercise 12.3.4.* A particle is described by a wave function
$$\psi(\rho, \phi) = A\,e^{-\rho^2/2\Delta^2}\left(\frac{\rho}{\Delta}\cos\phi + \sin\phi\right)$$
Show that
$$P(l_z = \hbar) = P(l_z = -\hbar) = \tfrac{1}{2}$$
Solutions to Rotationally Invariant Problems
Consider a problem where $V(\rho, \phi) = V(\rho)$. The eigenvalue equation for H is
$$\left[-\frac{\hbar^2}{2\mu}\left(\frac{\partial^2}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2}{\partial\phi^2}\right) + V(\rho)\right]\psi_E(\rho, \phi) = E\,\psi_E(\rho, \phi) \tag{12.3.11}$$
(We shall use $\mu$ to denote the mass, since m will denote the angular momentum
quantum number.) Since $[H, L_z] = 0$ in this problem, we seek simultaneous eigenfunctions
of H and $L_z$. We have seen that the most general eigenfunction of $L_z$ with
eigenvalue $m\hbar$ is of the form
$$\psi_m(\rho, \phi) = R(\rho)(2\pi)^{-1/2}e^{im\phi} = R(\rho)\,\Phi_m(\phi)$$
where $R(\rho)$ is undetermined. In the present case R is determined by the requirement
that
$$\psi_{Em}(\rho, \phi) = R_{Em}(\rho)\,\Phi_m(\phi) \tag{12.3.12}$$
be an eigenfunction of H as well, with eigenvalue E, i.e., that $\psi_{Em}$ satisfy Eq.
(12.3.11). Feeding the above form into Eq. (12.3.11), we get the radial equation that
determines $R_{Em}(\rho)$ and the allowed values for E:
$$\left[-\frac{\hbar^2}{2\mu}\left(\frac{d^2}{d\rho^2} + \frac{1}{\rho}\frac{d}{d\rho} - \frac{m^2}{\rho^2}\right) + V(\rho)\right]R_{Em}(\rho) = E\,R_{Em}(\rho) \tag{12.3.13}$$
As we change the potential, only the radial part of the wave function, R, changes;
the angular part $\Phi_m$ is unchanged. Thus the functions $\Phi_m(\phi)$, which were obtained
by pretending $\rho$ does not exist, provide the angular part of the wave function in the
eigenvalue problem of any rotationally invariant Hamiltonian.
Exercise 12.3.5.* Note that the angular momentum seems to generate a repulsive potential
in Eq. (12.3.13). Calculate its gradient and identify it as the centrifugal force.
Exercise 12.3.6. Consider a particle of mass $\mu$ constrained to move on a circle of radius
a. Show that $H = L_z^2/2\mu a^2$. Solve the eigenvalue problem of H and interpret the degeneracy.
Exercise 12.3.7.* (The Isotropic Oscillator). Consider the Hamiltonian
$$H = \frac{P_x^2 + P_y^2}{2\mu} + \frac{1}{2}\mu\omega^2(X^2 + Y^2)$$
(1) Convince yourself $[H, L_z] = 0$ and reduce the eigenvalue problem of H to the radial
differential equation for $R_{Em}(\rho)$.
(2) Examine the equation as $\rho \to 0$ and show that
$$R_{Em}(\rho) \xrightarrow[\rho \to 0]{} \rho^{|m|}$$
(3) Show likewise that up to powers of $\rho$
$$R_{Em}(\rho) \xrightarrow[\rho \to \infty]{} e^{-\mu\omega\rho^2/2\hbar}$$
So assume that $R_{Em}(\rho) = \rho^{|m|}\,e^{-\mu\omega\rho^2/2\hbar}\,U_{Em}(\rho)$.
(4) Switch to dimensionless variables $\varepsilon = E/\hbar\omega$, $y = (\mu\omega/\hbar)^{1/2}\rho$.
(5) Convert the equation for R into an equation for U. (I suggest proceeding in two
stages: $R = y^{|m|}\omega$, $\omega = e^{-y^2/2}U$.) You should end up with
$$U'' + \left[\frac{2|m| + 1}{y} - 2y\right]U' + (2\varepsilon - 2|m| - 2)\,U = 0$$
(6) Argue that a power series for U of the form
$$U(y) = \sum_{r=0}^{\infty} C_r\,y^r$$
will lead to a two-term recursion relation.
(7) Find the relation between $C_{r+2}$ and $C_r$. Argue that the series must terminate at some
finite r if the $y \to \infty$ behavior of the solution is to be acceptable. Show $\varepsilon = r + |m| + 1$ leads to
termination after r terms. Now argue that r is necessarily even, i.e., $r = 2k$. (Show that if r is
odd, the behavior of R as $\rho \to 0$ is not $\rho^{|m|}$.) So finally you must end up with
$$E = (2k + |m| + 1)\hbar\omega, \qquad k = 0, 1, 2, \ldots$$
Define $n = 2k + |m|$, so that
$$E_n = (n + 1)\hbar\omega$$
(8) For a given n, what are the allowed values of $|m|$? Given this information show that
for a given n, the degeneracy is $n + 1$. Compare this to what you found in Cartesian coordinates
(Exercise 10.2.2).
(9) Write down all the normalized eigenfunctions corresponding to $n = 0, 1$.
(10) Argue that the $n = 0$ function must equal the corresponding one found in Cartesian
coordinates. Show that the two $n = 1$ solutions are linear combinations of their counterparts
in Cartesian coordinates. Verify that the parity of the states is $(-1)^n$ as you found in Cartesian
coordinates.
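The degeneracy count in part (8) can be cross-checked by brute-force enumeration. The sketch below simply lists the labels allowed by $n = 2k + |m|$ and, for comparison, the Cartesian labels with $n_x + n_y = n$; the function names are illustrative.

```python
def polar_states(n):
    """All (k, m) with 2k + |m| = n, k = 0, 1, 2, ... and m any integer."""
    states = []
    for k in range(n // 2 + 1):
        abs_m = n - 2 * k
        for m in sorted({abs_m, -abs_m}):   # the set avoids counting m = 0 twice
            states.append((k, m))
    return states

def cartesian_states(n):
    """All (nx, ny) with nx + ny = n for the two-dimensional oscillator."""
    return [(nx, n - nx) for nx in range(n + 1)]

degeneracies = {n: (len(polar_states(n)), len(cartesian_states(n))) for n in range(8)}
```

Both counts come out as $n + 1$ for every level, and for a given n the allowed $|m|$ are $n, n-2, \ldots$, down to 1 or 0.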
Exercise 12.3.8.* Consider a particle of charge q in a vector potential
$$\mathbf{A} = \frac{B}{2}(-y\mathbf{i} + x\mathbf{j})$$
(1) Show that the magnetic field is $\mathbf{B} = B\mathbf{k}$.
(2) Show that a classical particle in this potential will move in circles at an angular
frequency $\omega_0 = qB/\mu c$.
(3) Consider the Hamiltonian for the corresponding quantum problem:
$$H = \frac{[P_x + qYB/2c]^2}{2\mu} + \frac{[P_y - qXB/2c]^2}{2\mu}$$
Show that $Q = (cP_x + qYB/2)/qB$ and $P = (P_y - qXB/2c)$ are canonical. Write H in terms
of P and Q and show that allowed levels are $E = (n + 1/2)\hbar\omega_0$.
(4) Expand H out in terms of the original variables and show
$$H = H\left(\frac{\omega_0}{2}, \mu\right) - \frac{\omega_0}{2}L_z$$
where $H(\omega_0/2, \mu)$ is the Hamiltonian for an isotropic two-dimensional harmonic oscillator
of mass $\mu$ and frequency $\omega_0/2$. Argue that the same basis that diagonalized $H(\omega_0/2, \mu)$ will
diagonalize H. By thinking in terms of this basis, show that the allowed levels for H are
$E = (k + \tfrac{1}{2}|m| - \tfrac{1}{2}m + \tfrac{1}{2})\hbar\omega_0$, where k is any nonnegative integer and m is the angular momentum. Convince
yourself that you get the same levels from this formula as from the earlier one
[$E = (n + 1/2)\hbar\omega_0$]. We shall return to this problem in Chapter 21.
12.4. Angular Momentum in Three Dimensions
It is evident that as we pass from two to three dimensions, the operator $L_z$ picks
up two companions $L_x$ and $L_y$ which generate infinitesimal rotations about the x
and y axes, respectively. So we have
\[ L_x = YP_z - ZP_y \tag{12.4.1a} \]
\[ L_y = ZP_x - XP_z \tag{12.4.1b} \]
\[ L_z = XP_y - YP_x \tag{12.4.1c} \]
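As usual, we subject these to the consistency test. It may be verified (Exercise 12.4.2)
that if we take a point in three-dimensional space and subject it to the following
rotations: $R(\varepsilon_x \mathbf{i})$, $R(\varepsilon_y \mathbf{j})$, $R(-\varepsilon_x \mathbf{i})$, and lastly $R(-\varepsilon_y \mathbf{j})$, it ends up rotated by
$-\varepsilon_x \varepsilon_y \mathbf{k}$. In other words
\[ R(-\varepsilon_y \mathbf{j})R(-\varepsilon_x \mathbf{i})R(\varepsilon_y \mathbf{j})R(\varepsilon_x \mathbf{i}) = R(-\varepsilon_x \varepsilon_y \mathbf{k}) \tag{12.4.2} \]
It follows that the quantum operators U[R] must satisfy
\[ U[R(-\varepsilon_y \mathbf{j})]\,U[R(-\varepsilon_x \mathbf{i})]\,U[R(\varepsilon_y \mathbf{j})]\,U[R(\varepsilon_x \mathbf{i})] = U[R(-\varepsilon_x \varepsilon_y \mathbf{k})] \tag{12.4.3} \]
If we write each U to second order in $\varepsilon$ and match coefficients of $\varepsilon_x \varepsilon_y$, we will find
\[ [L_x, L_y] = i\hbar L_z \tag{12.4.4a} \]
By considering two similar tests involving $\varepsilon_y \varepsilon_z$ and $\varepsilon_z \varepsilon_x$, we can deduce the
constraints
\[ [L_y, L_z] = i\hbar L_x \tag{12.4.4b} \]
\[ [L_z, L_x] = i\hbar L_y \tag{12.4.4c} \]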
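The classical statement behind Eq. (12.4.2) can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the standard active 3 x 3 rotation matrices about the coordinate axes, and the helper names (rot_x, rot_y, rot_z, matmul, max_abs_diff) are purely illustrative. It compares the four-fold product with a rotation by $-\varepsilon_x \varepsilon_y$ about the z axis; the two should agree up to terms of third order in the small angles.

```python
import math

def rot_x(a):
    """Active rotation by angle a about the x axis."""
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(a):
    """Active rotation by angle a about the y axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(a):
    """Active rotation by angle a about the z axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def max_abs_diff(a, b):
    """Largest entrywise difference between two 3 x 3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

eps_x, eps_y = 1e-3, 2e-3  # illustrative small angles (assumed values)

# Rightmost factor acts first: R(eps_x i), then R(eps_y j), R(-eps_x i), R(-eps_y j).
lhs = matmul(rot_y(-eps_y),
             matmul(rot_x(-eps_x), matmul(rot_y(eps_y), rot_x(eps_x))))
rhs = rot_z(-eps_x * eps_y)

deviation_from_rhs = max_abs_diff(lhs, rhs)               # third order in the angles
deviation_from_identity = max_abs_diff(lhs, rot_z(0.0))   # second order in the angles
```

The product differs from the identity at order $\varepsilon_x \varepsilon_y$, while its difference from $R(-\varepsilon_x \varepsilon_y \mathbf{k})$ is smaller by a further power of the angles, which is the content of the consistency test.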
You may verify that the operators in Eq. (12.4.1) satisfy these constraints. So they
are guaranteed to generate finite rotation operators that obey the right laws of
combination.
The three relations above may be expressed compactly as one vector equation
\[ \mathbf{L} \times \mathbf{L} = i\hbar \mathbf{L} \tag{12.4.5} \]
Yet another way to write the commutation relations is
\[ [L_i, L_j] = i\hbar \sum_{k=1}^{3} \varepsilon_{ijk} L_k \tag{12.4.6} \]
In this equation, i and j run from 1 to 3, $L_1$, $L_2$, and $L_3$ stand for $L_x$, $L_y$, and $L_z$,
respectively,‡ and $\varepsilon_{ijk}$ are the components of an antisymmetric tensor of rank 3, with
the following properties:
(1) They change sign when any two indices are exchanged. Consequently no two
indices can be equal.
(2) $\varepsilon_{123} = 1$.
This fixes all other components. For example,
\[ \varepsilon_{132} = -1, \qquad \varepsilon_{312} = (-1)(-1) = +1 \tag{12.4.7} \]
and so on. In short, $\varepsilon_{ijk}$ is +1 for any cyclic permutation of the indices in $\varepsilon_{123}$ and
-1 for the others. (The relation
\[ \mathbf{c} = \mathbf{a} \times \mathbf{b} \tag{12.4.8} \]
between three vectors from $V^3(R)$ may be written in component form as
\[ c_i = \sum_{j=1}^{3} \sum_{k=1}^{3} \varepsilon_{ijk}\, a_j b_k \tag{12.4.9} \]
Of course $\mathbf{a} \times \mathbf{a}$ is zero if $\mathbf{a}$ is a vector whose components are numbers, but not zero
if it is an operator such as $\mathbf{L}$.)
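The component form (12.4.9) is easy to try out on ordinary number-valued vectors. The sketch below is illustrative only: the function names levi_civita and cross are not from the text, and the closed-form product used for $\varepsilon_{ijk}$ is one convenient way (among several) to encode the two defining properties listed above.

```python
def levi_civita(i, j, k):
    """epsilon_ijk for indices in {1, 2, 3}: +1 for cyclic permutations of
    (1, 2, 3), -1 for the other permutations, 0 if any two indices coincide."""
    return (i - j) * (j - k) * (k - i) // 2

def cross(a, b):
    """c_i = sum_j sum_k epsilon_ijk a_j b_k, as in Eq. (12.4.9).
    Vectors are 3-element sequences; physics indices 1..3 map to positions 0..2."""
    return [
        sum(levi_civita(i, j, k) * a[j - 1] * b[k - 1]
            for j in (1, 2, 3) for k in (1, 2, 3))
        for i in (1, 2, 3)
    ]

unit_i, unit_j = [1, 0, 0], [0, 1, 0]
unit_k = cross(unit_i, unit_j)  # i x j = k
```

For number-valued components the double sum is antisymmetric in a and b, so cross(a, a) vanishes; that cancellation relies on $a_j a_k = a_k a_j$, which is exactly what fails for operator components such as those of $\mathbf{L}$.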
Exercise 12.4.1.* (1) Verify that Eqs. (12.4.9) and (12.4.8) are equivalent, given the
definition of $\varepsilon_{ijk}$.
(2) Let $U_1$, $U_2$, and $U_3$ be three energy eigenfunctions of a single particle in some
potential. Construct the wave function $\psi_A(x_1, x_2, x_3)$ of three fermions in this potential, one
of which is in $U_1$, one in $U_2$, and one in $U_3$, using the $\varepsilon_{ijk}$ tensor.
Exercise 12.4.2.* (1) Verify Eq. (12.4.2) by first constructing the 3 x 3 matrices
corresponding to $R(\varepsilon_x \mathbf{i})$ and $R(\varepsilon_y \mathbf{j})$, to order $\varepsilon$.
(2) Provide the steps connecting Eqs. (12.4.3) and (12.4.4a).
(3) Verify that $L_x$ and $L_y$ defined in Eq. (12.4.1) satisfy Eq. (12.4.4a). The proof for
other commutators follows by cyclic permutation.
‡ We will frequently let the indices run over 1, 2, and 3 instead of x, y, and z.
We next define the total angular momentum operator squared
\[ L^2 = L_x^2 + L_y^2 + L_z^2 \tag{12.4.10} \]
It may be verified (by you) that
\[ [L^2, L_i] = 0, \qquad i = x, y, \text{ or } z \tag{12.4.11} \]
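Finite Rotation Operators. Rotations about a given axis commute. So a finite
rotation may be viewed as a sequence of infinitesimal rotations about the same axis.
What is the operator that rotates by angle $\boldsymbol{\theta}$, i.e., by an amount $\theta$ about an axis
parallel to $\hat{\theta}$? If $\boldsymbol{\theta} = \theta_x \mathbf{i}$, then clearly
\[ U[R(\theta_x \mathbf{i})] = e^{-i\theta_x L_x/\hbar} \]
The same goes for $\boldsymbol{\theta}$ along the unit vectors $\mathbf{j}$ and $\mathbf{k}$. What if $\boldsymbol{\theta}$ has some arbitrary
direction? We conjecture that $L_{\hat{\theta}} \equiv \hat{\theta} \cdot \mathbf{L}$ (where $\hat{\theta} = \boldsymbol{\theta}/\theta$) is the generator of
infinitesimal rotations about that axis and that
\[ U[R(\boldsymbol{\theta})] = \lim_{N \to \infty} \left( I - \frac{i}{\hbar} \frac{\theta}{N}\, \hat{\theta} \cdot \mathbf{L} \right)^{N} = e^{-i\theta\, \hat{\theta} \cdot \mathbf{L}/\hbar} = e^{-i\boldsymbol{\theta} \cdot \mathbf{L}/\hbar} \tag{12.4.12} \]
Our conjecture is verified in the following exercise.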
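The limit in Eq. (12.4.12) can be watched converging in a simple classical stand-in. The sketch below is an illustration under stated assumptions, not the operator calculation itself: in place of $-(i/\hbar)L_z$ it uses the 3 x 3 generator of rotations about the z axis acting on ordinary vectors, and the names approx_rotation_z and rotation_error are invented for this example. It builds the N-fold product of infinitesimal rotations and compares it with the exact finite rotation.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(m, n):
    """m raised to the non-negative integer power n, by repeated squaring."""
    size = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    base = m
    while n > 0:
        if n & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        n >>= 1
    return result

def approx_rotation_z(theta, n_steps):
    """(I + (theta/N) G_z)^N, where G_z generates rotations about the z axis."""
    step = theta / n_steps
    infinitesimal = [[1.0, -step, 0.0], [step, 1.0, 0.0], [0.0, 0.0, 1.0]]
    return mat_power(infinitesimal, n_steps)

def rotation_error(theta, n_steps):
    """Largest entrywise deviation from the exact rotation by theta about z."""
    c, s = math.cos(theta), math.sin(theta)
    exact = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    approx = approx_rotation_z(theta, n_steps)
    return max(abs(approx[i][j] - exact[i][j]) for i in range(3) for j in range(3))
```

For a fixed angle the deviation shrinks roughly like 1/N as the number of steps grows, which is the sense in which the finite rotation is the limit of a sequence of infinitesimal ones.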
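Exercise 12.4.3.* We would like to show that $\hat{\theta} \cdot \mathbf{L}$ generates rotations about the axis
parallel to $\hat{\theta}$. Let $\delta\boldsymbol{\theta}$ be an infinitesimal rotation parallel to $\boldsymbol{\theta}$.
(1) Show that when a vector $\mathbf{r}$ is rotated by an angle $\delta\boldsymbol{\theta}$, it changes to $\mathbf{r} + \delta\boldsymbol{\theta} \times \mathbf{r}$. (It
might help to start with $\mathbf{r} \perp \delta\boldsymbol{\theta}$ and then generalize.)
(2) We therefore demand that (to first order, as usual)
\[ \psi(\mathbf{r}) \xrightarrow{\;U[R(\delta\boldsymbol{\theta})]\;} \psi(\mathbf{r} - \delta\boldsymbol{\theta} \times \mathbf{r}) = \psi(\mathbf{r}) - (\delta\boldsymbol{\theta} \times \mathbf{r}) \cdot \nabla \psi \]
Comparing to $U[R(\delta\boldsymbol{\theta})] = I - (i\,\delta\theta/\hbar)L_{\hat{\theta}}$, show that $L_{\hat{\theta}} = \hat{\theta} \cdot \mathbf{L}$.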
Exercise 12.4.4.* Recall that $\mathbf{V}$ is a vector operator if its components $V_i$ transform as
\[ U^\dagger[R]\, V_i\, U[R] = \sum_j R_{ij} V_j \tag{12.4.13} \]
(1) For an infinitesimal rotation $\delta\boldsymbol{\theta}$, show, on the basis of the previous exercise, that
\[ \sum_j R_{ij} V_j = V_i + (\delta\boldsymbol{\theta} \times \mathbf{V})_i = V_i + \sum_j \sum_k \varepsilon_{ijk} (\delta\theta)_j V_k \]
(2) Feed $U[R] = I - (i/\hbar)\,\delta\boldsymbol{\theta} \cdot \mathbf{L}$ into the left-hand side of Eq. (12.4.13) and deduce
that
\[ [V_i, L_j] = i\hbar \sum_k \varepsilon_{ijk} V_k \tag{12.4.14} \]
This is as good a definition of a vector operator as Eq. (12.4.13). By setting $\mathbf{V} = \mathbf{L}$, we can
obtain the commutation rules among the L's.
If the Hamiltonian is invariant under arbitrary rotations,
\[ U^\dagger[R]\, H\, U[R] = H \tag{12.4.15} \]
it follows (upon considering infinitesimal rotations around the x, y, and z axes) that
\[ [H, L_i] = 0 \tag{12.4.16} \]
and from it
\[ [H, L^2] = 0 \tag{12.4.17} \]
Thus $L^2$ and all three components of $\mathbf{L}$ are conserved. It does not, however, follow
that there exists a basis common to H and all three L's. This is because the L's do
not commute with each other. So the best one can do is find a basis common to H,
$L^2$, and one of the L's, usually chosen to be $L_z$.
We now examine the eigenvalue problem of the commuting operators $L^2$ and
$L_z$. When this is solved, we will turn to the eigenvalue problem of H, $L^2$, and $L_z$.
12.5. The Eigenvalue Problem of $L^2$ and $L_z$
There is a close parallel between our approach to this problem and that of the
harmonic oscillator. Recall that in that case we (1) solved the eigenvalue problem
of H in the coordinate basis; (2) solved the problem in the energy basis directly,
using the $a$ and $a^\dagger$ operators, the commutation rules, and the positivity of H;
(3) obtained the coordinate wave function $\psi_n(y)$ given the results of part (2), by
the following trick. We wrote
\[ a|0\rangle = 0 \]
in the coordinate basis as
\[ \left( y + \frac{\partial}{\partial y} \right) \psi_0(y) = 0 \]
which immediately gave us $\psi_0(y) \sim e^{-y^2/2}$, up to a normalization that could be easily
determined.
Given the normalized eigenfunction $\psi_0(y)$, we got $\psi_n(y)$ by the application of
the (differential) operator $(a^\dagger)^n/(n!)^{1/2} \to (y - \partial/\partial y)^n/(2^n n!)^{1/2}$.
In the present case we omit part (1), which involves just one more bout with
differential equations and is not particularly enlightening.
Let us now consider part (2). It too has many similarities with part (2) of the
oscillator problem.‡ We begin by assuming that there exists a basis $|\alpha\beta\rangle$ common
to $L^2$ and $L_z$:
\[ L^2 |\alpha\beta\rangle = \alpha |\alpha\beta\rangle \tag{12.5.1} \]
\[ L_z |\alpha\beta\rangle = \beta |\alpha\beta\rangle \tag{12.5.2} \]
We now define raising and lowering operators
\[ L_\pm = L_x \pm iL_y \tag{12.5.3} \]
which satisfy
\[ [L_z, L_\pm] = \pm\hbar L_\pm \tag{12.5.4} \]
and of course (since $L^2$ commutes with $L_x$ and $L_y$)
\[ [L^2, L_\pm] = 0 \tag{12.5.5} \]
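For readers who want the intermediate step, the commutator of $L_z$ with $L_\pm$ follows directly from the definition $L_\pm = L_x \pm iL_y$ and the basic relations $[L_z, L_x] = i\hbar L_y$ and $[L_y, L_z] = i\hbar L_x$. A short worked sketch, using nothing beyond those relations:

```latex
\begin{aligned}
[L_z, L_\pm] &= [L_z, L_x] \pm i\,[L_z, L_y] \\
             &= i\hbar L_y \pm i\,(-i\hbar L_x) \\
             &= \pm\hbar L_x + i\hbar L_y \\
             &= \pm\hbar\,(L_x \pm iL_y) = \pm\hbar L_\pm .
\end{aligned}
```

The last line uses $(\pm)(\pm) = +$, so the coefficient of $iL_y$ comes out as $+\hbar$ for either sign.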
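Equations (12.5.4) and (12.5.5) imply that $L_\pm$ raise/lower the eigenvalue of $L_z$ by
$\hbar$, while leaving the eigenvalue of $L^2$ alone. For example,
\[ L_z (L_+ |\alpha\beta\rangle) = (L_+ L_z + \hbar L_+)|\alpha\beta\rangle = (L_+ \beta + \hbar L_+)|\alpha\beta\rangle = (\beta + \hbar)(L_+ |\alpha\beta\rangle) \tag{12.5.6} \]
and
\[ L^2 L_+ |\alpha\beta\rangle = L_+ L^2 |\alpha\beta\rangle = \alpha L_+ |\alpha\beta\rangle \tag{12.5.7} \]
From Eqs. (12.5.6) and (12.5.7) it is clear that $L_+|\alpha\beta\rangle$ is proportional to the normalized
eigenket $|\alpha, \beta + \hbar\rangle$:
\[ L_+ |\alpha\beta\rangle = C_+(\alpha, \beta)\,|\alpha, \beta + \hbar\rangle \tag{12.5.8a} \]
‡ If you have forgotten the latter, you are urged to refresh your memory at this point.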
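It can similarly be shown that
\[ L_- |\alpha\beta\rangle = C_-(\alpha, \beta)\,|\alpha, \beta - \hbar\rangle \tag{12.5.8b} \]
The existence of $L_\pm$ implies that given an eigenstate $|\alpha\beta\rangle$ there also exist eigenstates
$|\alpha, \beta + \hbar\rangle$, $|\alpha, \beta + 2\hbar\rangle$, ...; and $|\alpha, \beta - \hbar\rangle$, $|\alpha, \beta - 2\hbar\rangle$, .... This clearly signals
trouble, for classical intuition tells us that the z component of angular momentum
cannot take arbitrarily large positive or negative values for a given value of the
square of the total angular momentum; in fact classically $|l_z| \le (l^2)^{1/2}$.
Quantum mechanically we have
\[ \langle\alpha\beta| L^2 - L_z^2 |\alpha\beta\rangle = \langle\alpha\beta| L_x^2 + L_y^2 |\alpha\beta\rangle \tag{12.5.9} \]
which implies
\[ \alpha - \beta^2 \ge 0 \]
(since $L_x^2 + L_y^2$ is positive definite) or
\[ \alpha \ge \beta^2 \tag{12.5.10} \]
Since $\beta^2$ is bounded by $\alpha$, it follows that there must exist a state $|\alpha\beta_{\max}\rangle$ such that
it cannot be raised:
\[ L_+ |\alpha\beta_{\max}\rangle = 0 \tag{12.5.11} \]
Operating with $L_-$ and using $L_- L_+ = L^2 - L_z^2 - \hbar L_z$, we get
\[ (L^2 - L_z^2 - \hbar L_z)|\alpha\beta_{\max}\rangle = 0 \]
\[ (\alpha - \beta_{\max}^2 - \hbar\beta_{\max})|\alpha\beta_{\max}\rangle = 0 \]
\[ \alpha = \beta_{\max}(\beta_{\max} + \hbar) \tag{12.5.12} \]
Starting with $|\alpha\beta_{\max}\rangle$ let us operate k times with $L_-$, till we reach a state $|\alpha\beta_{\min}\rangle$
that cannot be lowered further without violating the inequality (12.5.10):
\[ L_- |\alpha\beta_{\min}\rangle = 0 \]
\[ L_+ L_- |\alpha\beta_{\min}\rangle = 0 \]
\[ (L^2 - L_z^2 + \hbar L_z)|\alpha\beta_{\min}\rangle = 0 \]
\[ \alpha = \beta_{\min}(\beta_{\min} - \hbar) \tag{12.5.13} \]
A comparison of Eqs. (12.5.12) and (12.5.13) shows (as is to be expected)
\[ \beta_{\min} = -\beta_{\max} \tag{12.5.14} \]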
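Table 12.1. Some Low-Angular-Momentum States

(Angular momentum)
k/2     $\beta_{\max}$     $\alpha$                   $|\alpha\beta\rangle$
0       0           0                   $|0, 0\rangle$
1/2     $\hbar/2$       $(1/2)(3/2)\hbar^2$      $|(3/4)\hbar^2, \hbar/2\rangle$
                                        $|(3/4)\hbar^2, -\hbar/2\rangle$
1       $\hbar$         $(1)(2)\hbar^2$          $|2\hbar^2, \hbar\rangle$
                                        $|2\hbar^2, 0\rangle$
                                        $|2\hbar^2, -\hbar\rangle$
3/2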
Since we got to $|\alpha\beta_{\min}\rangle$ from $|\alpha\beta_{\max}\rangle$ in k steps of $\hbar$ each, it follows that
\[ \beta_{\max} - \beta_{\min} = 2\beta_{\max} = \hbar k \]
\[ \beta_{\max} = \frac{\hbar k}{2}, \qquad k = 0, 1, 2, \ldots \tag{12.5.15a} \]
\[ \alpha = (\beta_{\max})(\beta_{\max} + \hbar) = \hbar^2 \left(\frac{k}{2}\right)\left(\frac{k}{2} + 1\right) \tag{12.5.15b} \]
We shall refer to $(k/2) = (\beta_{\max}/\hbar)$ as the angular momentum of the state. Notice
that unlike in classical physics, $\beta_{\max}^2$ is less than $\alpha$, the square of the magnitude of
angular momentum, except when $\alpha = \beta_{\max} = 0$, i.e., in a state of zero angular
momentum.
Let us now take a look at a few of the low-angular-momentum states listed in
Table 12.1.
At this point the astute reader raises the following objection.
A.R.: I am disturbed by your results for odd k. You seem to find that $L_z$ can
have half-integral eigenvalues (in units of $\hbar$). But you just convinced us in Section
12.3 that $L_z$ has only integral eigenvalues m (in units of $\hbar$). Where did you go wrong?
R.S.: Nowhere, but your point is well taken. The extra (half-integral) eigenvalues
arise because we have solved a more general problem than that of $L_x$, $L_y$, $L_z$,
and $L^2$ (although we didn't intend to). Notice that nowhere in the derivation did we
use the explicit expressions for the L's [Eq. (12.4.1)] and in particular $L_z \to -i\hbar\,\partial/\partial\phi$.
(Had we done so, we would have gotten only integral eigenvalues as you expect.)
We relied instead on just the commutation relations, $\mathbf{L} \times \mathbf{L} = i\hbar\mathbf{L}$. Now, these commutation
relations reflect the law of combination of infinitesimal rotations in three
dimensions and must be satisfied by the three generators of rotations whatever the
nature of the wave functions they rotate. We have so far considered just scalar wave
functions $\psi(x, y, z)$, which assign a complex number (scalar) to each point. Now,
there can be particles in nature for which the wave function is more complicated,
say a vector field $\boldsymbol{\Psi}(x, y, z) = \psi_x(x, y, z)\mathbf{i} + \psi_y(x, y, z)\mathbf{j} + \psi_z(x, y, z)\mathbf{k}$. The response
of such a wave function to rotations is more involved. Whereas in the scalar case
the effect of rotation by $\delta\boldsymbol{\theta}$ is to take the number assigned to each point (x, y, z)
and reassign it to the rotated point (x', y', z'), in the vector case the vector at (x, y, z)
(1) must itself be rotated by $\delta\boldsymbol{\theta}$ and (2) then reassigned to (x', y', z'). (A simple
example from two dimensions is given in Fig. 12.1.) The differential operators $L_x$,
$L_y$, and $L_z$ will only do part (2) but not part (1), which has to be done by 3 x 3
matrices $S_x$, $S_y$, and $S_z$ which shuffle the components $\psi_x$, $\psi_y$, $\psi_z$ of $\boldsymbol{\Psi}$. In such
cases, the generators of infinitesimal rotations will be of the form
\[ J_i = L_i + S_i \]
where $L_i$ does part (2) and $S_i$ does part (1) (see Exercise 12.5.1 for a concrete
example). One refers to $L_i$ as the orbital angular momentum, $S_i$ as the spin angular
momentum (or simply spin), and $J_i$ as the total angular momentum. We do not yet
know what $J_i$ or $S_i$ look like in these general cases, but we do know this: the $J_i$'s
must obey the same commutation rules as the $L_i$'s, for the commutation rules reflect
the law of combination of rotations and must be obeyed by any triplet of generators
(the consistency condition), whatever be the nature of wave function they rotate. So
in general we have
\[ \mathbf{J} \times \mathbf{J} = i\hbar\mathbf{J} \tag{12.5.16} \]
with $\mathbf{L}$ as a special case when the wave function is a scalar. So our result, which
followed from just the commutation relations, applies to the problem of arbitrary $\mathbf{J}$
and not just $\mathbf{L}$. Thus the answer to the question raised earlier is that unlike $L_z$, $J_z$
is not restricted to have integral eigenvalues. But our analysis tells us, who know
very little about spin, that $S_z$ can have only integral or half-integral eigenvalues if
the commutation relations are to be satisfied. Of course, our analysis doesn't imply
that there must exist particles with spin integral or half integral; it merely reveals
the possible variety in wave functions. But the old maxim (if something can happen,
it will) is true here, and nature does provide us with particles that possess spin, i.e.,
particles whose wave functions are more complicated than scalars. We will study
them in Chapter 14 on spin.

Figure 12.1. The effect of the infinitesimal rotation by $\varepsilon_z$
on a vector $\boldsymbol{\Psi}$ in two dimensions is to (1) first reassign it
to the rotated point (x', y') and (2) then rotate the vector
itself by the infinitesimal angle. The differential operator $L_z$
does the first part while a 2 x 2 spin matrix $S_z$ does the
second.
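Exercise 12.5.1.* Consider a vector field $\boldsymbol{\Psi}(x, y)$ in two dimensions. From Fig. 12.1 it
follows that under an infinitesimal rotation $\varepsilon_z \mathbf{k}$
\[ \psi_x \to \psi_x'(x, y) = \psi_x(x + y\varepsilon_z, y - x\varepsilon_z) - \psi_y(x + y\varepsilon_z, y - x\varepsilon_z)\,\varepsilon_z \]
\[ \psi_y \to \psi_y'(x, y) = \psi_x(x + y\varepsilon_z, y - x\varepsilon_z)\,\varepsilon_z + \psi_y(x + y\varepsilon_z, y - x\varepsilon_z) \]
Show that (to order $\varepsilon_z$)
\[ \begin{bmatrix} \psi_x' \\ \psi_y' \end{bmatrix} = \left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \frac{i\varepsilon_z}{\hbar} \begin{bmatrix} L_z & 0 \\ 0 & L_z \end{bmatrix} - \frac{i\varepsilon_z}{\hbar} \begin{bmatrix} 0 & -i\hbar \\ i\hbar & 0 \end{bmatrix} \right) \begin{bmatrix} \psi_x \\ \psi_y \end{bmatrix} \]
so that
\[ J_z = L_z^{(1)} \otimes I^{(2)} + I^{(1)} \otimes S_z^{(2)} = L_z + S_z \]
where $I^{(2)}$ is a 2 x 2 identity matrix with respect to the vector components, and $I^{(1)}$ is the identity
operator with respect to the argument (x, y) of $\boldsymbol{\Psi}(x, y)$. This example only illustrates the fact
that $J_z = L_z + S_z$ if the wave function is not a scalar. An example of half-integral eigenvalues
will be provided when we consider spin in a later chapter. (In the present example, $S_z$ has
eigenvalues $\pm\hbar$.)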
Let us return to our main discussion. To emphasize the generality of the results
we have found, we will express them in terms of J's rather than L's and also switch
to a more common notation. Here is a summary of what we have found. The
eigenvectors of the operators $J^2$ and $J_z$ are given by
\[ J^2 |jm\rangle = j(j + 1)\hbar^2 |jm\rangle, \qquad j = 0, 1/2, 1, 3/2, \ldots \tag{12.5.17a} \]
\[ J_z |jm\rangle = m\hbar |jm\rangle, \qquad m = j, j - 1, j - 2, \ldots, -j \tag{12.5.17b} \]
We shall call j the angular momentum of the state. Note that in the above m can be
an integer or half-integer depending on j.
The results for the restricted problem $\mathbf{J} = \mathbf{L}$ that we originally set out to solve
are contained in Eq. (12.5.17): we simply ignore the states with half-integral m and
j. To remind us in these cases that we are dealing with $\mathbf{J} = \mathbf{L}$, we will denote these
states by $|lm\rangle$. They obey
\[ L^2 |lm\rangle = l(l + 1)\hbar^2 |lm\rangle, \qquad l = 0, 1, 2, \ldots \tag{12.5.18a} \]
\[ L_z |lm\rangle = m\hbar |lm\rangle, \qquad m = l, l - 1, \ldots, -l \tag{12.5.18b} \]
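The spectrum summarized in Eqs. (12.5.17) and (12.5.18) is simple enough to enumerate mechanically. The following sketch is illustrative only: the function names allowed_m and j_squared_eigenvalue are invented here, eigenvalues are quoted in units of $\hbar$ and $\hbar^2$, and exact fractions are used so that half-integral values are not rounded.

```python
from fractions import Fraction

def allowed_m(j):
    """Return m = j, j-1, ..., -j for an integral or half-integral j >= 0."""
    j = Fraction(j)
    if j < 0 or (2 * j).denominator != 1:
        raise ValueError("j must be a non-negative integer or half-integer")
    return [j - step for step in range(int(2 * j) + 1)]

def j_squared_eigenvalue(j):
    """Eigenvalue of J^2 in units of hbar^2, i.e. j(j + 1)."""
    j = Fraction(j)
    return j * (j + 1)

# Example: the j = 3/2 multiplet has 2j + 1 = 4 states.
m_values_three_halves = allowed_m(Fraction(3, 2))
```

Restricting the input to integers reproduces the orbital case $\mathbf{J} = \mathbf{L}$ of Eq. (12.5.18); for example, allowed_m(2) lists the five values of m for l = 2.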
Our problem has not been fully solved: we have only found the eigenvalues;
the eigenvectors aren't fully determined yet. (As in the oscillator problem, finding
the eigenvectors means finding the matrices corresponding to the basic operators
whose commutation relations are given.) Let us continue our analysis in terms of
the J's. If we rewrite Eq. (12.5.8) in terms of $J_\pm$, j, and m (instead of $L_\pm$, $\alpha$, and
$\beta$), we get
\[ J_\pm |jm\rangle = C_\pm(j, m)\,|j, m \pm 1\rangle \tag{12.5.19} \]
where $C_\pm(j, m)$ are yet to be determined. We will determine them now.
If we take the adjoint of
\[ J_+ |jm\rangle = C_+(j, m)\,|j, m + 1\rangle \]
we get
\[ \langle jm| J_- = C_+^*(j, m)\,\langle j, m + 1| \]
Equating the inner product of the objects on the left-hand side to the product of the
objects on the right-hand side, we obtain
\[ \langle jm| J_- J_+ |jm\rangle = |C_+(j, m)|^2 \langle j, m + 1 | j, m + 1\rangle = |C_+(j, m)|^2 \]
\[ \langle jm| J^2 - J_z^2 - \hbar J_z |jm\rangle = |C_+(j, m)|^2 \]
or
\[ |C_+(j, m)|^2 = j(j + 1)\hbar^2 - m^2\hbar^2 - m\hbar^2 = \hbar^2 (j - m)(j + m + 1) \]
\[ C_+(j, m) = \hbar[(j - m)(j + m + 1)]^{1/2}\;‡ \]
It can likewise be shown that
\[ C_-(j, m) = \hbar[(j + m)(j - m + 1)]^{1/2} \]
so that finally
\[ J_\pm |jm\rangle = \hbar[(j \mp m)(j \pm m + 1)]^{1/2}\,|j, m \pm 1\rangle \tag{12.5.20} \]
Notice that when $J_\pm$ act on $|j, \pm j\rangle$ they kill the state, so that each family with a
given angular momentum j has only 2j + 1 states with eigenvalues $j\hbar$,
$(j - 1)\hbar$, ..., $-(j\hbar)$ for $J_z$.
Equation (12.5.20) brings us to the end of our calculation, for we can write
down the matrix elements of $J_x$ and $J_y$ in this basis:
\[ \langle j'm'| J_x |jm\rangle = \langle j'm'| \frac{J_+ + J_-}{2} |jm\rangle = \frac{\hbar}{2} \left\{ \delta_{jj'}\delta_{m',m+1}[(j - m)(j + m + 1)]^{1/2} + \delta_{jj'}\delta_{m',m-1}[(j + m)(j - m + 1)]^{1/2} \right\} \tag{12.5.21a} \]
\[ \langle j'm'| J_y |jm\rangle = \langle j'm'| \frac{J_+ - J_-}{2i} |jm\rangle = \frac{\hbar}{2i} \left\{ \delta_{jj'}\delta_{m',m+1}[(j - m)(j + m + 1)]^{1/2} - \delta_{jj'}\delta_{m',m-1}[(j + m)(j - m + 1)]^{1/2} \right\} \tag{12.5.21b} \]
‡ There can be an overall phase factor in front of $C_+$. We choose it to be unity according to standard
convention.
Using these (or our mnemonic based on images) we can write down the matrices
corresponding to $J^2$, $J_z$, $J_x$, and $J_y$ in the $|jm\rangle$ basis‡:
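With rows and columns labeled by (j, m) in the order (0, 0), (1/2, 1/2), (1/2, -1/2), (1, 1), (1, 0), (1, -1), ...:
\[ J^2 \leftrightarrow \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & \tfrac{3}{4}\hbar^2 & 0 & 0 & 0 & 0 \\
0 & 0 & \tfrac{3}{4}\hbar^2 & 0 & 0 & 0 \\
0 & 0 & 0 & 2\hbar^2 & 0 & 0 \\
0 & 0 & 0 & 0 & 2\hbar^2 & 0 \\
0 & 0 & 0 & 0 & 0 & 2\hbar^2
\end{bmatrix} \tag{12.5.22} \]
$J_z$ is also diagonal with elements $m\hbar$.
\[ J_x \leftrightarrow \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \hbar/2 & 0 & 0 & 0 \\
0 & \hbar/2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \hbar/2^{1/2} & 0 \\
0 & 0 & 0 & \hbar/2^{1/2} & 0 & \hbar/2^{1/2} \\
0 & 0 & 0 & 0 & \hbar/2^{1/2} & 0
\end{bmatrix} \tag{12.5.23} \]
\[ J_y \leftrightarrow \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -i\hbar/2 & 0 & 0 & 0 \\
0 & i\hbar/2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -i\hbar/2^{1/2} & 0 \\
0 & 0 & 0 & i\hbar/2^{1/2} & 0 & -i\hbar/2^{1/2} \\
0 & 0 & 0 & 0 & i\hbar/2^{1/2} & 0
\end{bmatrix} \tag{12.5.24} \]
Notice that although $J_x$ and $J_y$ are not diagonal in the $|jm\rangle$ basis, they are block
diagonal: they have no matrix elements between one value of j and another. This is
because $J_\pm$ (out of which they are built) do not change j when they act on $|jm\rangle$.
Since the J's are all block diagonal, the blocks do not mix when we multiply them.
In particular when we consider a commutation relation such as $[J_x, J_y] = i\hbar J_z$, it will
be satisfied within each block. If we denote the (2j + 1) x (2j + 1) block in $J_i$
corresponding to a certain j by $J_i^{(j)}$, then we have
\[ [J_x^{(j)}, J_y^{(j)}] = i\hbar J_z^{(j)}, \qquad j = 0, \tfrac{1}{2}, 1, \ldots \tag{12.5.25} \]
‡ The quantum numbers j and m do not fully label a state; a state is labeled by $|\alpha jm\rangle$, where $\alpha$ represents
the remaining labels. In what follows, we suppress $\alpha$ but assume it is the same throughout.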
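The block matrices can be generated for any j directly from Eq. (12.5.20), and the block-by-block commutation rule (12.5.25) can then be checked numerically. The sketch below is one possible way to do this, with stated assumptions: units are chosen so that $\hbar = 1$, the basis is ordered m = j, j - 1, ..., -j, the argument two_j (equal to 2j) is used so that half-integral j can be passed as an integer, and all function names are illustrative rather than taken from the text.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def angular_momentum_matrices(two_j):
    """Return (Jx, Jy, Jz) for j = two_j / 2, in units of hbar.
    Rows and columns are ordered m = j, j-1, ..., -j."""
    j = two_j / 2
    dim = two_j + 1
    m_values = [j - k for k in range(dim)]
    j_plus = [[0j] * dim for _ in range(dim)]
    j_z = [[0j] * dim for _ in range(dim)]
    for col, m in enumerate(m_values):
        j_z[col][col] = complex(m)
        if col > 0:
            # J+ |j m> = [(j - m)(j + m + 1)]^(1/2) |j, m+1>; m+1 sits one row up.
            j_plus[col - 1][col] = complex(math.sqrt((j - m) * (j + m + 1)))
    # J- is the adjoint (conjugate transpose) of J+.
    j_minus = [[j_plus[c][r].conjugate() for c in range(dim)] for r in range(dim)]
    j_x = [[(j_plus[r][c] + j_minus[r][c]) / 2 for c in range(dim)] for r in range(dim)]
    j_y = [[(j_plus[r][c] - j_minus[r][c]) / 2j for c in range(dim)] for r in range(dim)]
    return j_x, j_y, j_z

def commutator_error(two_j):
    """Largest entry of [Jx, Jy] - i Jz for the given block (should be ~0)."""
    j_x, j_y, j_z = angular_momentum_matrices(two_j)
    xy, yx = matmul(j_x, j_y), matmul(j_y, j_x)
    dim = two_j + 1
    return max(abs(xy[r][c] - yx[r][c] - 1j * j_z[r][c])
               for r in range(dim) for c in range(dim))
```

Calling angular_momentum_matrices(1) and angular_momentum_matrices(2) gives the 2 x 2 and 3 x 3 blocks that appear inside Eqs. (12.5.23) and (12.5.24), up to the factor of $\hbar$ set to one here.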
Exercise 12.5.2. (1) Verify that the 2 x 2 matrices $J_x^{(1/2)}$, $J_y^{(1/2)}$, and $J_z^{(1/2)}$ obey the
commutation rule $[J_x^{(1/2)}, J_y^{(1/2)}] = i\hbar J_z^{(1/2)}$.
(2) Do the same for the 3 x 3 matrices $J_i^{(1)}$.
(3) Construct the 4 x 4 matrices and verify that
\[ [J_x^{(3/2)}, J_y^{(3/2)}] = i\hbar J_z^{(3/2)} \]
Exercise 12.5.3.* (1) Show that $\langle J_x \rangle = \langle J_y \rangle = 0$ in a state $|jm\rangle$.
(2) Show that in these states
\[ \langle J_x^2 \rangle = \langle J_y^2 \rangle = \tfrac{1}{2}\hbar^2 [j(j + 1) - m^2] \]
(use symmetry arguments to relate $\langle J_x^2 \rangle$ to $\langle J_y^2 \rangle$).
(3) Check that $\Delta J_x \cdot \Delta J_y$ from part (2) satisfies the inequality imposed by the uncertainty
principle [Eq. (9.2.9)].
(4) Show that the uncertainty bound is saturated in the state $|j, \pm j\rangle$.
Finite Rotations‡
Now that we have explicit matrices for the generators of rotations, $J_x$, $J_y$, and
$J_z$, we can construct the matrices representing U[R] by exponentiating $(-i\boldsymbol{\theta} \cdot \mathbf{J}/\hbar)$.
But this is easier said than done. The matrices $J_i$ are infinite dimensional and
exponentiating them is not practically possible. But the situation is not as bleak as
it sounds for the following reason. First note that since $J_i$ are block diagonal, so is
the linear combination $\boldsymbol{\theta} \cdot \mathbf{J}$, and so is its exponential. Consequently, all rotation
operators U[R] will be represented by block diagonal matrices. The (2j + 1)-dimensional
block at a given j is denoted by $D^{(j)}[R]$. The block diagonal form of the
rotation matrices implies (recall the mnemonic of images) that any vector $|\psi_j\rangle$ in
the subspace $V_j$ spanned by the (2j + 1) vectors $|jj\rangle, \ldots, |j, -j\rangle$ goes into another
element $|\psi_j'\rangle$ of $V_j$. Thus to rotate $|\psi_j\rangle$, we just need the matrix $D^{(j)}$. More generally,
if $|\psi\rangle$ has components only in $V_0, V_1, V_2, \ldots, V_j$, we need just the first (j + 1)
matrices $D^{(j)}$. What makes the situation hopeful is that it is possible, in practice, to
evaluate these if j is small. Let us see why. Consider the series representing $D^{(j)}$:
\[ D^{(j)}[R(\boldsymbol{\theta})] = \exp\left[ -\frac{i}{\hbar}\, \boldsymbol{\theta} \cdot \mathbf{J}^{(j)} \right] = \sum_{n=0}^{\infty} \left( \frac{-i\theta}{\hbar} \right)^{n} (\hat{\theta} \cdot \mathbf{J}^{(j)})^n\, \frac{1}{n!} \]
It can be shown (Exercise 12.5.4) that $(\hat{\theta} \cdot \mathbf{J}^{(j)})^n$ for n > 2j can be written as a linear
combination of the first 2j powers of $\hat{\theta} \cdot \mathbf{J}^{(j)}$. Consequently the series representing
$D^{(j)}$ may be reduced to
\[ D^{(j)} = \sum_{n=0}^{2j} f_n(\theta)\,(\hat{\theta} \cdot \mathbf{J}^{(j)})^n \]
It is possible, in practice, to find closed expressions for $f_n(\theta)$ in terms of trigonometric
functions, for modest values of j (see Exercise 12.5.5). For example,
\[ D^{(1/2)}[R] = \cos\left(\frac{\theta}{2}\right) I^{(1/2)} - \frac{2i}{\hbar}\, \hat{\theta} \cdot \mathbf{J}^{(1/2)} \sin\left(\frac{\theta}{2}\right) \]
‡ The material from here to the end of Exercise 12.5.7 may be skimmed over in a less advanced course.
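The closed form quoted for j = 1/2 can be compared numerically with a direct summation of the exponential series. The following is a minimal sketch under stated assumptions: units with $\hbar = 1$, the standard 2 x 2 spin-1/2 matrices for $\hat{\theta} \cdot \mathbf{J}^{(1/2)}$ in the basis m = 1/2, -1/2, and a truncated Taylor series standing in for the full exponential. The function names are illustrative only.

```python
import math

def n_dot_j(n):
    """theta-hat . J^(1/2) for a unit vector n = (nx, ny, nz), with hbar = 1."""
    nx, ny, nz = n
    return [[nz / 2, (nx - 1j * ny) / 2],
            [(nx + 1j * ny) / 2, -nz / 2]]

def matmul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def series_rotation(theta, n, terms=40):
    """Truncated power series for exp(-i theta n.J), summed term by term."""
    generator = [[-1j * theta * x for x in row] for row in n_dot_j(n)]
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for order in range(1, terms):
        # term_k = term_(k-1) * generator / k builds generator^k / k!
        term = [[x / order for x in row] for row in matmul(term, generator)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def closed_form_rotation(theta, n):
    """cos(theta/2) I - 2i sin(theta/2) n.J, the j = 1/2 closed form."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    g = n_dot_j(n)
    return [[c * (i == j) - 2j * s * g[i][j] for j in range(2)] for i in range(2)]

def max_abs_diff(a, b):
    """Largest entrywise difference between two 2 x 2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

axis = (1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3))  # assumed test axis
difference = max_abs_diff(series_rotation(1.3, axis), closed_form_rotation(1.3, axis))
```

The two agree to rounding error because $(\hat{\theta} \cdot \mathbf{J}^{(1/2)})^2$ is proportional to the identity, so the even and odd powers of the series resum into the cosine and sine terms respectively.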
Let us return to the subspaces $V_j$. Since they go into themselves under arbitrary
rotations, they are called invariant subspaces. The physics behind the invariance is
simple: each subspace contains states of a definite magnitude of angular momentum
squared $j(j + 1)\hbar^2$, and a rotation cannot change this. Formally it is because
$[J^2, U[R]] = 0$ and so U[R] cannot change the eigenvalue of $J^2$.
The invariant subspaces have another feature: they are irreducible. This means
that $V_j$ itself does not contain invariant subspaces. We prove this by showing that
any invariant subspace $\tilde{V}_j$ of $V_j$ is as big as the latter. Let $|\psi\rangle$ be an element of $\tilde{V}_j$.
Since we haven't chosen a basis yet, let us choose one such that $|\psi\rangle$ is one of the
basis vectors, and furthermore, such that it is the basis vector $|jj\rangle$, up to a normalization
factor, which is irrelevant in what follows. (What if we had already chosen a
basis $|jj\rangle, \ldots, |j, -j\rangle$ generated by the operators $J_i$? Consider any unitary
transformation U which converts $|jj\rangle$ into $|\psi\rangle$ and a different triplet of operators $J_i'$
defined by $J_i' = U J_i U^\dagger$. The primed operators have the same commutation rules and
hence eigenvalues as the $J_i$. The eigenvectors are just $|jm\rangle' = U|jm\rangle$, with $|jj\rangle' =
|\psi\rangle$. In the following analysis we drop all primes.)
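Let us apply an infinitesimal rotation $\delta\boldsymbol{\theta}$ to $|\psi\rangle$. This gives
\[ |\psi'\rangle = U[R(\delta\boldsymbol{\theta})]\,|jj\rangle \]
\[ = [I - (i/\hbar)(\delta\boldsymbol{\theta} \cdot \mathbf{J})]\,|jj\rangle \]
\[ = [I - (i/2\hbar)(\delta\theta_+ J_- + \delta\theta_- J_+ + 2\,\delta\theta_z J_z)]\,|jj\rangle \]
where
\[ \delta\theta_\pm = (\delta\theta_x \pm i\,\delta\theta_y) \]
Since $J_+|jj\rangle = 0$, $J_z|jj\rangle = j\hbar|jj\rangle$, and $J_-|jj\rangle = \hbar(2j)^{1/2}|j, j - 1\rangle$, we get
\[ |\psi'\rangle = (1 - ij\,\delta\theta_z)|jj\rangle - \tfrac{1}{2} i (2j)^{1/2}\, \delta\theta_+ |j, j - 1\rangle \]
Since $\tilde{V}_j$ is assumed to be invariant under any rotation, $|\psi'\rangle$ also belongs to $\tilde{V}_j$.
Subtracting $(1 - ij\,\delta\theta_z)|jj\rangle$, which also belongs to $\tilde{V}_j$, from $|\psi'\rangle$, we find that $|j, j - 1\rangle$
also belongs to $\tilde{V}_j$. By considering more of such rotations, we can easily establish
that the (2j + 1) orthonormal vectors $|jj\rangle, |j, j - 1\rangle, \ldots, |j, -j\rangle$ all belong to $\tilde{V}_j$.
Thus $\tilde{V}_j$ has the same dimensionality as $V_j$. Thus $V_j$ has no invariant subspaces. (In
a technical sense, $V_j$ is its own subspace and is invariant. We are concerned here
with subspaces of smaller dimensionality.)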
The irreducibility of $V_j$ means that we cannot, by a change of basis within $V_j$,
further block diagonalize all the $D^{(j)}$. We show that if this were not true, then a
contradiction would arise. Let it be possible to block diagonalize all the $D^{(j)}$, say, as
follows:
\[ \underbrace{D^{(j)}[R]}_{(2j+1) \times (2j+1),\ |jm\rangle \text{ basis}} \;\xrightarrow{\ \text{new basis}\ }\; \begin{bmatrix} \boxed{\,d_1 \times d_1\,} & 0 \\ 0 & \boxed{\,d_2 \times d_2\,} \end{bmatrix} \]
(The boxed regions are generally nonzero.) It follows that $V_j$ contains two invariant
subspaces of dimensionalities $d_1$ and $d_2$, respectively. (For example, any vector with
just the first $d_1$ components nonzero will get rotated into another such vector. Such
vectors form a $d_1$-dimensional subspace.) We have seen this is impossible.
The block diagonal matrices representing the rotation operators U[R] are said
to provide an irreducible (matrix) representation of these operators. For the set of
all rotation operators, the elements of which do not generally commute with each
other, this irreducible form is the closest one can come to simultaneous diagonalization.
All this is summarized schematically in the sketch below, where the boxed
regions represent the blocks $D^{(0)}$, $D^{(1/2)}$, $D^{(1)}$, etc. The unboxed regions contain zeros.
\[ U[R] \;\xrightarrow{\ |j, m\rangle \text{ basis}\ }\; \begin{bmatrix} \boxed{D^{(0)}} & & & \\ & \boxed{D^{(1/2)}} & & 0 \\ & & \boxed{D^{(1)}} & \\ 0 & & & \ddots \end{bmatrix} \]
Consider next the matrix representing a rotationally invariant Hamiltonian in
this basis. Since $[H, \mathbf{J}] = 0$, H has the same form as $J^2$, which also commutes with
all the generators, namely,
(1) H is diagonal, since $[H, J^2] = 0$, $[H, J_z] = 0$.
(2) Within each block, H has the same eigenvalue $E_j$, since $[H, J_\pm] = 0$.
It follows from (2) that $V_j$ is an eigenspace of H with eigenvalue $E_j$, i.e., all states
of a given j are degenerate in a rotationally invariant problem. Although the same
result is true classically, the relation between degeneracy and rotational invariance
is different in the two cases. Classically, if we are given two states with the same
magnitude of angular momentum but different orientation, we argue that they are
degenerate because
(1) One may be rotated into the other.
(2) This rotation does not change the energy.
Quantum mechanically, given two elements of $V_j$ it is not always true that they
may be rotated into each other (Exercise 12.5.6). However, we argue as follows:
(1) One may be reached from the other (in general) by the combined action of
$J_\pm$ and U[R].
(2) These operators commute with H.
In short, rotational invariance is the cause of degeneracy in both cases, but the
degenerate states are not always rotated versions of each other in the quantum case
(Exercises 12.5.6 and 12.5.7).
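Exercise 12.5.4.* (1) Argue that the eigenvalues of $J_x^{(j)}$ and $J_y^{(j)}$ are the same as those of
$J_z^{(j)}$, namely, $j\hbar, (j - 1)\hbar, \ldots, (-j\hbar)$. Generalize the result to $\hat{\theta} \cdot \mathbf{J}^{(j)}$.
(2) Show that
\[ (J - j\hbar)[J - (j - 1)\hbar][J - (j - 2)\hbar] \cdots (J + j\hbar) = 0 \]
where $J \equiv \hat{\theta} \cdot \mathbf{J}^{(j)}$. (Hint: In the case $J = J_z$ what happens when both sides are applied to an
arbitrary eigenket $|jm\rangle$? What about an arbitrary superposition of such kets?)
(3) It follows from (2) that $J^{2j+1}$ is a linear combination of $J^0, J^1, \ldots, J^{2j}$. Argue that
the same goes for $J^{2j+k}$, k = 1, 2, ....
Exercise 12.5.5 (Hard). Using results from the previous exercise and Eq. (12.5.23), show
that
\[ (1)\quad D^{(1/2)}(R) = \exp(-i\boldsymbol{\theta} \cdot \mathbf{J}^{(1/2)}/\hbar) = \cos(\theta/2)\, I^{(1/2)} - (2i/\hbar) \sin(\theta/2)\, \hat{\theta} \cdot \mathbf{J}^{(1/2)} \]
\[ (2)\quad D^{(1)}[R(\theta_x \mathbf{i})] = \exp(-i\theta_x J_x^{(1)}/\hbar) = (\cos\theta_x - 1)\left( \frac{J_x^{(1)}}{\hbar} \right)^{2} - i \sin\theta_x \left( \frac{J_x^{(1)}}{\hbar} \right) + I^{(1)} \]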
Exercise 12.5.6. Consider the family of states $|jj\rangle, \ldots, |jm\rangle, \ldots, |j, -j\rangle$. One refers to
them as states of the same magnitude but different orientation of angular momentum. If one
takes this remark literally, i.e., in the classical sense, one is led to believe that one may rotate
these into each other, as is the case for classical states with these properties. Consider, for
instance, the family $|1, 1\rangle$, $|1, 0\rangle$, $|1, -1\rangle$. It may seem, for example, that the state with zero
angular momentum along the z axis, $|1, 0\rangle$, may be obtained by rotating $|1, 1\rangle$ by some
suitable ($\tfrac{1}{2}\pi$?) angle about the x axis. Using $D^{(1)}[R(\theta_x \mathbf{i})]$ from part (2) in the last exercise
show that
\[ |1, 0\rangle \ne D^{(1)}[R(\theta_x \mathbf{i})]\,|1, 1\rangle \quad \text{for any } \theta_x \]
The error stems from the fact that classical reasoning should be applied to $\langle \mathbf{J} \rangle$, which responds
to rotations like an ordinary vector, and not directly to $|jm\rangle$, which is a vector in Hilbert
space. Verify that $\langle \mathbf{J} \rangle$ responds to rotations like its classical counterpart, by showing that
$\langle \mathbf{J} \rangle$ in the state $D^{(1)}[R(\theta_x \mathbf{i})]\,|1, 1\rangle$ is $\hbar[-\sin\theta_x\, \mathbf{j} + \cos\theta_x\, \mathbf{k}]$.
It is not too hard to see why we can't always satisfy
\[ |jm'\rangle = D^{(j)}[R]\,|jm\rangle \]
or more generally, for two normalized kets $|\psi_j'\rangle$ and $|\psi_j\rangle$, satisfy
\[ |\psi_j'\rangle = D^{(j)}[R]\,|\psi_j\rangle \]
by any choice of R. These abstract equations imply (2j + 1) linear, complex relations between
the components of $|\psi_j'\rangle$ and $|\psi_j\rangle$ that can't be satisfied by varying R, which depends on only
three parameters, $\theta_x$, $\theta_y$, and $\theta_z$. (Of course one can find a unitary matrix in $V_j$ that takes
$|jm\rangle$ into $|jm'\rangle$ or $|\psi_j\rangle$ into $|\psi_j'\rangle$, but it will not be a rotation matrix corresponding to U[R].)
Exercise 12.5.7: Euler Angles. Rather than parametrize an arbitrary rotation by the angle
$\boldsymbol{\theta}$, which describes a single rotation by $\theta$ about an axis parallel to $\boldsymbol{\theta}$, we may parametrize it
by three angles, $\gamma$, $\beta$, and $\alpha$, called Euler angles, which define three successive rotations:
\[ U[R(\alpha, \beta, \gamma)] = e^{-i\alpha J_z/\hbar}\, e^{-i\beta J_y/\hbar}\, e^{-i\gamma J_z/\hbar} \]
(1) Construct $D^{(1)}[R(\alpha, \beta, \gamma)]$ explicitly as a product of three 3 x 3 matrices. (Use the
result from Exercise 12.5.5 with $J_x \to J_y$.)
(2) Let it act on $|1, 1\rangle$ and show that $\langle \mathbf{J} \rangle$ in the resulting state is
\[ \langle \mathbf{J} \rangle = \hbar(\sin\beta \cos\alpha\, \mathbf{i} + \sin\beta \sin\alpha\, \mathbf{j} + \cos\beta\, \mathbf{k}) \]
(3) Show that for no value of $\alpha$, $\beta$, and $\gamma$ can one rotate $|1, 1\rangle$ into just $|1, 0\rangle$.
(4) Show that one can always rotate any $|1, m\rangle$ into a linear combination that involves
$|1, m'\rangle$, i.e.,
\[ \langle 1, m'|\, D^{(1)}[R(\alpha, \beta, \gamma)]\, |1, m\rangle \ne 0 \]
for some $\alpha$, $\beta$, $\gamma$ and any m, m'.
(5) To see that one can occasionally rotate $|jm\rangle$ into $|jm'\rangle$, verify that a 180° rotation
about the y axis applied to $|1, 1\rangle$ turns it into $|1, -1\rangle$.
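Angular Momentum Eigenfunctions in the Coordinate Basis
We now turn to step (3) outlined at the beginning of this section, namely, the
construction of the eigenfunctions of $L^2$ and $L_z$ in the coordinate basis, given the
information on the kets $|lm\rangle$.
Consider the states corresponding to a given l. The "topmost" state $|ll\rangle$ satisfies
\[ L_+ |ll\rangle = 0 \tag{12.5.26} \]
If we write the operator $L_\pm = L_x \pm iL_y$ in spherical coordinates we find
\[ L_\pm \xrightarrow{\ \text{coordinate basis}\ } \pm\hbar\, e^{\pm i\phi} \left( \frac{\partial}{\partial\theta} \pm i \cot\theta\, \frac{\partial}{\partial\phi} \right) \tag{12.5.27} \]
Exercise 12.5.8 (Optional). Verify that
\[ L_x \xrightarrow{\ \text{coordinate basis}\ } i\hbar \left( \sin\phi\, \frac{\partial}{\partial\theta} + \cos\phi \cot\theta\, \frac{\partial}{\partial\phi} \right) \]
\[ L_y \xrightarrow{\ \text{coordinate basis}\ } i\hbar \left( -\cos\phi\, \frac{\partial}{\partial\theta} + \sin\phi \cot\theta\, \frac{\partial}{\partial\phi} \right) \]
If we denote by $\psi_l^l(r, \theta, \phi)$ the eigenfunction corresponding to $|ll\rangle$, we find that it
satisfies
\[ \left( \frac{\partial}{\partial\theta} + i \cot\theta\, \frac{\partial}{\partial\phi} \right) \psi_l^l = 0 \tag{12.5.28} \]
Since $\psi_l^l$ is an eigenfunction of $L_z$ with eigenvalue $l\hbar$, we let
\[ \psi_l^l(r, \theta, \phi) = U_l^l(r, \theta)\, e^{il\phi} \tag{12.5.29} \]
and find that
\[ \left( \frac{\partial}{\partial\theta} - l \cot\theta \right) U_l^l = 0 \tag{12.5.30} \]
\[ \frac{dU_l^l}{U_l^l} = l\, \frac{d(\sin\theta)}{\sin\theta} \]
or
\[ U_l^l(r, \theta) = R(r)(\sin\theta)^l \tag{12.5.31} \]
where R(r) is an arbitrary (normalizable) function of r. When we address the eigenvalue
problem of rotationally invariant Hamiltonians, we will see that H will nail
down R if we seek simultaneous eigenfunctions of H, $L^2$, and $L_z$. But first let us
introduce, as we did in the study of $L_z$ in two dimensions, the function that would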
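have been the unique, nondegenerate solution in the absence of the radial coordinate:
\[ Y_l^l(\theta, \phi) = (-1)^l \left[ \frac{(2l + 1)!}{4\pi} \right]^{1/2} \frac{1}{2^l\, l!}\, (\sin\theta)^l\, e^{il\phi} \tag{12.5.32} \]
Whereas the phase factor $(-1)^l$ reflects our convention, the others ensure that
\[ \int |Y_l^l|^2\, d\Omega \equiv \int_{-1}^{1} \int_{0}^{2\pi} |Y_l^l|^2\, d(\cos\theta)\, d\phi = 1 \tag{12.5.33} \]
We may obtain $Y_l^{l-1}$ by using the lowering operator. Since
\[ L_- |ll\rangle = \hbar[(l + l)(1)]^{1/2}\,|l, l - 1\rangle = \hbar(2l)^{1/2}\,|l, l - 1\rangle \]
\[ Y_l^{l-1}(\theta, \phi) = \frac{1}{(2l)^{1/2}}\, (-1) \left[ e^{-i\phi} \left( \frac{\partial}{\partial\theta} - i \cot\theta\, \frac{\partial}{\partial\phi} \right) \right] Y_l^l \tag{12.5.34} \]
We can keep going in this manner until we reach $Y_l^{-l}$. The result is, for $m \ge 0$,
\[ Y_l^m(\theta, \phi) = (-1)^l \left[ \frac{(2l + 1)!}{4\pi} \right]^{1/2} \frac{1}{2^l\, l!} \left[ \frac{(l + m)!}{(2l)!\,(l - m)!} \right]^{1/2} e^{im\phi} (\sin\theta)^{-m}\, \frac{d^{\,l-m}}{d(\cos\theta)^{l-m}} (\sin\theta)^{2l} \tag{12.5.35} \]
For m < 0, see Eq. (12.5.40). These functions are called spherical harmonics and
satisfy the orthonormality condition
\[ \int Y_{l'}^{m'*}(\theta, \phi)\, Y_l^m(\theta, \phi)\, d\Omega = \delta_{ll'}\, \delta_{mm'} \]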
Another route to the $Y_l^m$ is the direct solution of the $L^2$, $L_z$ eigenvalue problem
in the coordinate basis where
\[ L^2 \to (-\hbar^2) \left( \frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \sin\theta\, \frac{\partial}{\partial\theta} + \frac{1}{\sin^2\theta} \frac{\partial^2}{\partial\phi^2} \right) \tag{12.5.36} \]
and of course
\[ L_z \to -i\hbar\, \frac{\partial}{\partial\phi} \]
If we seek common eigenfunctions of the form‡ $f(\theta)\, e^{im\phi}$, which are regular between
$\theta = 0$ and $\pi$, we will find that $L^2$ has eigenvalues of the form $l(l + 1)\hbar^2$, l = 0, 1, 2, ...,
where $l \ge |m|$. The $Y_l^m$ functions are mutually orthogonal because they are nondegenerate
eigenfunctions of $L^2$ and $L_z$, which are Hermitian on single-valued functions
of $\theta$ and $\phi$.
‡ We neglect the function R(r) that can tag along as a spectator.
Exercise 12.5.9. Show that $L^2$ above is Hermitian in the sense
\[ \int \psi_1^* (L^2 \psi_2)\, d\Omega = \left[ \int \psi_2^* (L^2 \psi_1)\, d\Omega \right]^* \]
The same goes for $L_z$, which is insensitive to $\theta$ and is Hermitian with respect to the $\phi$
integration.
We may expand any v(r, 0, 0) in terms of Yr(e, 4)) using rdependent
coefficients [consult Eq. (10.1.20) for a similar expansion]:
lit(r, 0, 4))= E E crone, 0) (12.5.37a)
Io I
where
Cr(r)= r*(0, 4))tit(r, 0, 0) clf2 (12.5.37b)
If we compute $\langle\psi|L^2|\psi\rangle$ and interpret the result as a weighted average, we can
readily see (assuming $\psi$ is normalized to unity) that

$$P(L^2 = l(l+1)\hbar^2,\ L_z = m\hbar) = \int_0^\infty |C_l^m(r)|^2\, r^2\, dr \tag{12.5.38}$$
It is clear from the above that $C_l^m$ is the amplitude to find the particle at a radial
distance $r$ with angular momentum $(l, m)$.‡ The expansion Eq. (12.5.37a) tells us
how to rotate any $\psi(r, \theta, \phi)$ by an angle $\boldsymbol{\theta}$ (in principle):
(1) We construct the block diagonal matrices, $\exp(-i\boldsymbol{\theta}\cdot\mathbf{L}^{(l)}/\hbar)$.
(2) Each block will rotate the $C_l^m$ into linear combinations of each other, i.e.,
under the action of $U[R]$, the coefficients $C_l^m(r)$, $m = l, l-1, \ldots, -l$, will get mixed
with each other by $D^{(l)}_{m'm}$.
In practice, one can explicitly carry out these steps only if $\psi$ contains only
$Y_l^m$'s with small $l$. A concrete example will be provided in one of the exercises.
‡ Note that $r$ is just the eigenvalue of the operator $(X^2 + Y^2 + Z^2)^{1/2}$, which commutes with $L^2$ and $L_z$.
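Here are the first few $Y_l^m$ functions:

$$\begin{aligned}
Y_0^0 &= (4\pi)^{-1/2} \\
Y_1^{\pm 1} &= \mp(3/8\pi)^{1/2}\sin\theta\, e^{\pm i\phi} \\
Y_1^0 &= (3/4\pi)^{1/2}\cos\theta \\
Y_2^{\pm 2} &= (15/32\pi)^{1/2}\sin^2\theta\, e^{\pm 2i\phi} \\
Y_2^{\pm 1} &= \mp(15/8\pi)^{1/2}\sin\theta\cos\theta\, e^{\pm i\phi} \\
Y_2^0 &= (5/16\pi)^{1/2}(3\cos^2\theta - 1)
\end{aligned} \tag{12.5.39}$$

Note that

$$Y_l^{-m} = (-1)^m (Y_l^m)^* \tag{12.5.40}$$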
Closely related to the spherical harmonics are the associated Legendre polynomials
$P_l^m$ (with $0 \le m \le l$) defined by

$$Y_l^m(\theta, \phi) = \left[\frac{(2l+1)(l-m)!}{4\pi\,(l+m)!}\right]^{1/2} (-1)^m\, e^{im\phi}\, P_l^m(\cos\theta) \tag{12.5.41}$$

If $m = 0$, $P_l^0(\cos\theta) \equiv P_l(\cos\theta)$ is called a Legendre polynomial.
The Shape of the $Y_l^m$ Functions. For large $l$, the functions $|Y_l^m|$ exhibit many
classical features. For example, $|Y_l^l| \propto |\sin^l\theta|$ is almost entirely confined to the $xy$
plane, as one would expect of a classical particle with all its angular momentum
pointing along the $z$ axis. Likewise, $|Y_l^0|$ is, for large $l$, almost entirely confined to
the $z$ axis. Polar plots of these functions may be found in many textbooks.
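The confinement of $|Y_l^l|$ to the $xy$ plane can be made quantitative with a small numerical sketch. It assumes only that $|Y_l^l|^2 \propto \sin^{2l}\theta$, as in Eq. (12.5.32); the function name `equatorial_fraction`, the band half-width of 0.3 rad, and the midpoint-rule integration are illustrative choices, not part of the text.

```python
import math

def equatorial_fraction(l, half_width=0.3, steps=20000):
    """Fraction of the probability |Y_l^l|^2 dOmega lying within
    half_width (radians) of the equator theta = pi/2."""
    def integrate(a, b):
        # Midpoint rule for sin^(2l)(theta) times the solid-angle weight sin(theta).
        h = (b - a) / steps
        return h * sum(math.sin(a + (i + 0.5) * h) ** (2 * l + 1)
                       for i in range(steps))
    band = integrate(math.pi / 2 - half_width, math.pi / 2 + half_width)
    total = integrate(0.0, math.pi)
    return band / total

for l in (1, 10, 50):
    print(l, round(equatorial_fraction(l), 4))
```

The printed fraction should grow toward 1 as $l$ increases, in line with the classical picture of a particle orbiting in the $xy$ plane.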
Exercise 12.5.10. Write the differential equation corresponding to

$$L^2|\alpha\beta\rangle = \alpha|\alpha\beta\rangle$$

in the coordinate basis, using the $L^2$ operator given in Eq. (12.5.36). We already know $\beta = m\hbar$
from the analysis of $-i\hbar(\partial/\partial\phi)$. So assume that the simultaneous eigenfunctions have
the form

$$\psi_{\alpha m}(\theta, \phi) = P_\alpha^m(\theta)\, e^{im\phi}$$

and show that $P_\alpha^m$ satisfies the equation

$$\left(\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial}{\partial\theta} + \frac{\alpha}{\hbar^2} - \frac{m^2}{\sin^2\theta}\right) P_\alpha^m(\theta) = 0$$

We need to show that
(1) $\alpha/\hbar^2 = l(l+1)$, $l = 0, 1, 2, \ldots$
(2) $|m| \le l$
We will consider only part (1) and that too for the case $m = 0$. By rewriting the equation in
terms of $u = \cos\theta$, show that $P_\alpha^0$ satisfies

$$(1 - u^2)\frac{d^2P_\alpha^0}{du^2} - 2u\frac{dP_\alpha^0}{du} + \frac{\alpha}{\hbar^2}P_\alpha^0 = 0$$

Convince yourself that a power series solution

$$P_\alpha^0 = \sum_{n=0}^{\infty} C_n u^n$$

will lead to a two-term recursion relation. Show that $(C_{n+2}/C_n) \to 1$ as $n \to \infty$. Thus the series
diverges when $|u| \to 1$ ($\theta \to 0$ or $\pi$). Show that if $\alpha/\hbar^2 = l(l+1)$; $l = 0, 1, 2, \ldots$, the series will
terminate and be either an even or odd function of $u$. The functions
$P^0_\alpha(u) = P^0_{l(l+1)\hbar^2}(u) \equiv P_l(u)$ are just the Legendre polynomials up to a scale factor.
Determine $P_0$, $P_1$, and $P_2$ and compare (ignoring overall scales) with the $Y_l^0$ functions.
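As a check on the recursion asked for in this exercise, here is a minimal sketch. It assumes the two-term relation $C_{n+2} = C_n\,[n(n+1) - \alpha/\hbar^2]/[(n+1)(n+2)]$, which follows from inserting the power series into the $m = 0$ equation above, and sets $\alpha/\hbar^2 = l(l+1)$. The function name `series_coefficients`, the cutoff `n_max`, and the seed choice (unit coefficient on the lowest power with the parity of $l$) are illustrative.

```python
from fractions import Fraction

def series_coefficients(l, n_max=12):
    """Coefficients C_0..C_n_max of the power series in u for alpha/hbar^2 = l(l+1).

    Uses C_{n+2} = C_n * [n(n+1) - l(l+1)] / [(n+1)(n+2)], seeded with
    C_0 = 1 for even l or C_1 = 1 for odd l (overall scale is arbitrary).
    """
    c = [Fraction(0)] * (n_max + 1)
    c[l % 2] = Fraction(1)
    for n in range(l % 2, n_max - 1, 2):
        c[n + 2] = c[n] * (n * (n + 1) - l * (l + 1)) / ((n + 1) * (n + 2))
    return c

for l in range(4):
    terms = {n: str(cn) for n, cn in enumerate(series_coefficients(l)) if cn != 0}
    print(l, terms)
```

For each $l$ the nonzero coefficients stop at $u^l$, so the series is a polynomial of definite parity; up to scale these reproduce the Legendre polynomials.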
Exercise 12.5.11. Derive $Y_1^1$ starting from Eq. (12.5.28) and normalize it yourself.
[Remember the $(-1)^l$ factor from Eq. (12.5.32).] Lower it to get $Y_1^0$ and $Y_1^{-1}$ and compare it
with Eq. (12.5.39).
Exercise 12.5.12.* Since $L^2$ and $L_z$ commute with $\Pi$, they should share a basis with it.
Verify that $Y_l^m \to (-1)^l\, Y_l^m$ under parity. (First show that $\theta \to \pi - \theta$, $\phi \to \phi + \pi$ under parity. Prove the
result for $Y_l^l$. Verify that $L_-$ does not alter the parity, thereby proving the result for all
$Y_l^m$.)
Exercise 12.5.13.* Consider a particle in a state described by

$$\psi = N(x + y + 2z)\, e^{-\alpha r}$$

where $N$ is a normalization factor.
(1) Show, by rewriting the $Y_1^{\pm 1, 0}$ functions in terms of $x$, $y$, $z$, and $r$, that

$$Y_1^{\pm 1} = \mp\left(\frac{3}{4\pi}\right)^{1/2}\frac{x \pm iy}{2^{1/2}\, r}, \qquad Y_1^0 = \left(\frac{3}{4\pi}\right)^{1/2}\frac{z}{r} \tag{12.5.42}$$

(2) Using this result, show that for a particle described by $\psi$ above, $P(l_z = 0) = 2/3$;
$P(l_z = +\hbar) = 1/6 = P(l_z = -\hbar)$.
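The probabilities quoted in part (2) can be cross-checked with a short sketch. It assumes Eq. (12.5.42) inverted for $x$, $y$, $z$, namely $x \propto (Y_1^{-1} - Y_1^{1})/\sqrt{2}$, $y \propto i(Y_1^{-1} + Y_1^{1})/\sqrt{2}$, $z \propto Y_1^0$, with a common positive factor that drops out on normalizing. The function name `lz_probabilities` and its argument names are illustrative.

```python
import math

def lz_probabilities(a, b, c):
    """Return [P(l_z=+hbar), P(l_z=0), P(l_z=-hbar)] for a state
    psi proportional to (a*x + b*y + c*z) f(r), with real a, b, c."""
    # Amplitudes on Y_1^{1}, Y_1^{0}, Y_1^{-1}, read off from Eq. (12.5.42).
    amp_plus = (-a + 1j * b) / math.sqrt(2)
    amp_zero = complex(c)
    amp_minus = (a + 1j * b) / math.sqrt(2)
    weights = [abs(z) ** 2 for z in (amp_plus, amp_zero, amp_minus)]
    total = sum(weights)
    return [w / total for w in weights]

print(lz_probabilities(1, 1, 2))
```

Because the radial factor is common to all three terms, it plays no role in these ratios.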
Exercise 12.5.14. Consider a rotation $\theta_x\mathbf{i}$. Under this

$$x \to x$$
$$y \to y\cos\theta_x - z\sin\theta_x$$
$$z \to z\cos\theta_x + y\sin\theta_x$$

Therefore we must have

$$\psi(x, y, z) \xrightarrow{\ U[R(\theta_x\mathbf{i})]\ } \psi_R = \psi(x,\ y\cos\theta_x + z\sin\theta_x,\ z\cos\theta_x - y\sin\theta_x)$$

Let us verify this prediction for a special case

$$\psi = Az\, e^{-r^2/a^2}$$

which must go into

$$\psi_R = A(z\cos\theta_x - y\sin\theta_x)\, e^{-r^2/a^2}$$

(1) Expand $\psi$ in terms of $Y_1^1$, $Y_1^0$, $Y_1^{-1}$.
(2) Use the matrix $e^{-i\theta_x L_x/\hbar}$ to find the fate of $\psi$ under this rotation.‡ Check your result
against that anticipated above. [Hint: (1) $\psi \sim Y_1^0$, which corresponds to the column vector $(0, 1, 0)^T$.
(2) Use Eq. (12.5.42).]
12.6. Solution of Rotationally Invariant Problems
We now consider a class of problems of great practical interest: problems where
$V(r, \theta, \phi) = V(r)$. The Schrödinger equation in spherical coordinates becomes

$$\left[-\frac{\hbar^2}{2\mu}\left(\frac{1}{r^2}\frac{\partial}{\partial r}r^2\frac{\partial}{\partial r} + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial}{\partial\theta} + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right) + V(r)\right]\psi_E(r, \theta, \phi) = E\,\psi_E(r, \theta, \phi) \tag{12.6.1}$$

Since $[H, \mathbf{L}] = 0$ for a spherically symmetric potential, we seek simultaneous
eigenfunctions of $H$, $L^2$, and $L_z$:

$$\psi_{Elm}(r, \theta, \phi) = R_{Elm}(r)\, Y_l^m(\theta, \phi) \tag{12.6.2}$$

Feeding in this form, and bearing in mind that the angular part of $\nabla^2$ is just the $L^2$
operator in the coordinate basis [up to a factor $(-\hbar^2 r^2)^{-1}$, see Eq. (12.5.36)], we get
the radial equation

$$\left\{-\frac{\hbar^2}{2\mu}\left[\frac{1}{r^2}\frac{\partial}{\partial r}r^2\frac{\partial}{\partial r} - \frac{l(l+1)}{r^2}\right] + V(r)\right\} R_{El} = E\, R_{El} \tag{12.6.3}$$

Notice that the subscript $m$ has been dropped: neither the energy nor the radial
function depends on it. We find, as anticipated earlier, the $(2l+1)$-fold degeneracy
of $H$.
‡ See Exercise 12.5.8.
Exercise 12.6.1.* A particle is described by the wave function

$$\psi_E(r, \theta, \phi) = A\, e^{-r/a_0} \qquad (a_0 = \text{const})$$

(1) What is the angular momentum content of the state?
(2) Assuming $\psi_E$ is an eigenstate in a potential that vanishes as $r \to \infty$, find $E$. (Match
leading terms in Schrödinger's equation.)
(3) Having found $E$, consider finite $r$ and find $V(r)$.
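At this point it becomes fruitful to introduce an auxiliary function $U_{El}$ defined
as follows:

$$R_{El} = U_{El}/r \tag{12.6.4}$$

and which obeys the equation

$$\left\{\frac{d^2}{dr^2} + \frac{2\mu}{\hbar^2}\left[E - V(r) - \frac{l(l+1)\hbar^2}{2\mu r^2}\right]\right\} U_{El} = 0 \tag{12.6.5}$$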
Exercise 12.6.2.* Provide the steps connecting Eq. (12.6.3) and Eq. (12.6.5).
The equation is the same as the one-dimensional Schrödinger equation except
for the following differences:
(1) The independent variable ($r$) goes from $0$ to $\infty$ and not from $-\infty$ to $\infty$.
(2) In addition to the actual potential $V(r)$, there is the repulsive centrifugal
barrier, $l(l+1)\hbar^2/2\mu r^2$, in all but the $l = 0$ states.
(3) The boundary conditions on $U$ are different from the one-dimensional case.
We find these by rewriting Eq. (12.6.5) as an eigenvalue equation

$$\left[-\frac{\hbar^2}{2\mu}\frac{d^2}{dr^2} + V(r) + \frac{l(l+1)\hbar^2}{2\mu r^2}\right] U_{El} \equiv D_l(r)\, U_{El} = E\, U_{El} \tag{12.6.6}$$
and demanding that the functions $U_{El}$ be such that $D_l$ is Hermitian with respect to
them. In other words, if $U_1$ and $U_2$ are two such functions, then we demand that

$$\int_0^\infty U_1^*(D_l U_2)\, dr = \left[\int_0^\infty U_2^*(D_l U_1)\, dr\right]^* \equiv \int_0^\infty (D_l U_1)^*\, U_2\, dr \tag{12.6.7a}$$

This reduces to the requirement

$$\left(U_1^*\frac{dU_2}{dr} - U_2\frac{dU_1^*}{dr}\right)\Bigg|_0^\infty = 0 \tag{12.6.7b}$$
Exercise 12.6.3. Show that Eq. (12.6.7b) follows from Eq. (12.6.7a).
Now, a necessary condition for

$$\int_0^\infty |R_{El}|^2\, r^2\, dr = \int_0^\infty |U_{El}|^2\, dr$$

to be normalizable to unity or the Dirac delta function is that

$$U_{El} \xrightarrow[r \to \infty]{} 0 \tag{12.6.8a}$$

or

$$U_{El} \xrightarrow[r \to \infty]{} e^{ikr} \tag{12.6.8b}$$

the first corresponding to bound states and the second to unbound states. In either
case, the expression in the brackets in Eq. (12.6.7b) vanishes at the upper limit‡ and
the Hermiticity of $D_l$ hinges on whether or not

$$\left[U_1^*\frac{dU_2}{dr} - U_2\frac{dU_1^*}{dr}\right]_0 = 0 \tag{12.6.9}$$

Now this condition is satisfied if

$$U \xrightarrow[r \to 0]{} c, \qquad c = \text{const} \tag{12.6.10}$$

‡ For the oscillating case, we must use the limiting scheme described in Section 1.10.
If $c$ is nonzero, then

$$R \sim \frac{U}{r} \sim \frac{c}{r}$$

diverges at the origin. This in itself is not a disqualification, for $R$ is still square
integrable. The problem with $c \ne 0$ is that the corresponding total wave function,
$\psi \sim (c/r)\,Y_0^0$,‡ does not satisfy Schrödinger's equation at the origin. This is because of the relation

$$\nabla^2(1/r) = -4\pi\delta^3(\mathbf{r}) \tag{12.6.11}$$

the proof of which is taken up in Exercise 12.6.4. Thus unless $V(r)$ contains a delta
function at the origin (which we assume it does not) the choice $c \ne 0$ is untenable.
Thus we deduce that

$$U_{El} \xrightarrow[r \to 0]{} 0 \tag{12.6.12}$$
Exercise 12.6.4.* (1) Show that

$$\delta^3(\mathbf{r} - \mathbf{r}') \equiv \delta(x - x')\,\delta(y - y')\,\delta(z - z') = \frac{1}{r^2\sin\theta}\,\delta(r - r')\,\delta(\theta - \theta')\,\delta(\phi - \phi')$$

(consider a test function).
(2) Show that

$$\nabla^2(1/r) = -4\pi\delta^3(\mathbf{r})$$

(Hint: First show that $\nabla^2(1/r) = 0$ if $r \ne 0$. To see what happens at $r = 0$, consider a small
sphere centered at the origin and use Gauss's law and the identity $\nabla^2\phi = \nabla\cdot\nabla\phi$.)§
General Properties of $U_{El}$
We have already discussed some of the properties of $U_{El}$ as $r \to 0$ or $\infty$. We shall
try to extract further information on $U_{El}$ by analyzing the equation governing it in
these limits, without making detailed assumptions about $V(r)$. Consider first the limit
$r \to 0$. Assuming $V(r)$ is less singular than $r^{-2}$, the equation is dominated by the
centrifugal barrier:

$$U_l'' \simeq \frac{l(l+1)}{r^2}\, U_l \tag{12.6.13}$$

‡ As we will see in a moment, $l \ne 0$ is incompatible with the requirement that $\psi(r) \to r^{-1}$ as $r \to 0$. Thus
the angular part of $\psi$ has to be $Y_0^0 = (4\pi)^{-1/2}$.
§ Or compare this equation to Poisson's equation in electrostatics, $\nabla^2\phi = -4\pi\rho$. Here $\rho = \delta^3(\mathbf{r})$, which
represents a unit point charge at the origin. In this case we know from Coulomb's law that $\phi = 1/r$.
We have dropped the subscript $E$, since $E$ becomes inconsequential in this limit. If
we try a solution of the form

$$U_l \sim r^\alpha$$

we find

$$\alpha(\alpha - 1) = l(l+1)$$

or

$$\alpha = l + 1 \quad \text{or} \quad (-l)$$

and

$$U_l \sim \begin{cases} r^{l+1} & \text{(regular)} \\ r^{-l} & \text{(irregular)} \end{cases} \tag{12.6.14}$$

We reject the irregular solution since it does not meet the boundary condition $U(0) = 0$.
The behavior of the regular solutions near the origin is in accord with our expectation
that as the angular momentum increases the particle should avoid the origin
more and more.
The above arguments are clearly true only if $l \ne 0$. If $l = 0$, the centrifugal barrier
is absent, and the answer may be sensitive to the potential. In the problems we will
consider, $U_{l=0}$ will also behave as $r^{l+1}$ with $l = 0$. Although $U_0(r) \to 0$ as $r \to 0$, note
that a particle in the $l = 0$ state has a nonzero amplitude to be at the origin, since
$R_0(r) = U_0(r)/r \ne 0$ at $r = 0$.
Consider now the behavior of $U_{El}$ as $r \to \infty$. If $V(r)$ does not vanish as $r \to \infty$,
it will dominate the result (as in the case of the isotropic oscillator, for which
$V(r) \propto r^2$) and we cannot say anything in general. So let us consider the case where
$rV(r) \to 0$ as $r \to \infty$. At large $r$ the equation becomes

$$\frac{d^2U_E}{dr^2} = -\frac{2\mu E}{\hbar^2}\, U_E \tag{12.6.15}$$

(We have dropped the subscript $l$ since the answer doesn't depend on $l$.) There are
now two cases:
1. $E > 0$: the particle is allowed to escape to infinity classically. We expect $U_E$ to
oscillate as $r \to \infty$.
2. $E < 0$: the particle is bound. The region $r \to \infty$ is classically forbidden and we
expect $U_E$ to fall exponentially there.
Consider the first case. The solutions to Eq. (12.6.15) are of the form

$$U_E = A\, e^{ikr} + B\, e^{-ikr}, \qquad k = (2\mu E/\hbar^2)^{1/2}$$

that is to say, the particle behaves as a free particle far from the origin.‡ Now, you
might wonder why we demanded that $rV(r) \to 0$ and not simply $V(r) \to 0$ as $r \to \infty$.
To answer this question, let us write

$$U_E = f(r)\, e^{\pm ikr}$$

and see if $f(r)$ tends to a constant as $r \to \infty$. Feeding in this form of $U_E$ into Eq.
(12.6.5) we find (ignoring the centrifugal barrier)

$$f'' \pm (2ik)f' - \frac{2\mu V(r)}{\hbar^2}\, f = 0$$

Since we expect $f(r)$ to be slowly varying as $r \to \infty$, we ignore $f''$ and find

$$\frac{df}{f} = \mp\frac{i}{k}\,\frac{\mu}{\hbar^2}\, V(r)\, dr$$

$$f(r) = f(r_0)\cdot\exp\left[\mp\frac{i\mu}{k\hbar^2}\int_{r_0}^{r} V(r')\, dr'\right] \tag{12.6.16}$$

where $r_0$ is some constant. If $V(r)$ falls faster than $r^{-1}$, i.e., $rV(r) \to 0$ as $r \to \infty$, we
can take the limit as $r \to \infty$ in the integral and $f(r)$ approaches a constant as $r \to \infty$.
If instead

$$V(r) = -\frac{e^2}{r}$$

as in the Coulomb problem,§ then

$$f = f(r_0)\exp\left[\pm\frac{i\mu e^2}{k\hbar^2}\ln\left(\frac{r}{r_0}\right)\right]$$

and

$$U_E(r) \sim \exp\left\{\pm i\left[kr + \frac{\mu e^2}{k\hbar^2}\ln r\right]\right\} \tag{12.6.17}$$

This means that no matter how far away the particle is from the origin, it is never
completely free of the Coulomb potential. If $V(r)$ falls even slower than a Coulomb
potential, this problem only gets worse.
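The distinction between a short-range and a Coulomb potential can be seen numerically. The sketch below evaluates the integral $\int_{r_0}^{r} V(r')\,dr'$ that appears in the exponent of Eq. (12.6.16), in units where $e = 1$ and $r_0 = 1$; the function names, the comparison potential $-1/r^2$, and the logarithmic grid are illustrative assumptions.

```python
import math

def integrated_potential(potential, r0, r, steps=100000):
    """Midpoint-rule estimate of the integral of V(r') from r0 to r.

    The substitution r' = exp(t) keeps the grid uniform in ln r',
    which suits integrands that vary over many decades in r'.
    """
    t0, t1 = math.log(r0), math.log(r)
    h = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        rp = math.exp(t0 + (i + 0.5) * h)
        total += potential(rp) * rp * h
    return total

def coulomb(r):
    return -1.0 / r       # V = -e^2/r with e = 1

def short_range(r):
    return -1.0 / r**2    # falls faster than 1/r, so r V(r) -> 0

for r in (1e2, 1e4, 1e6):
    print(r, integrated_potential(coulomb, 1.0, r),
          integrated_potential(short_range, 1.0, r))
```

The short-range integral settles to a constant as $r$ grows, so $f(r)$ tends to a constant, while the Coulomb integral keeps growing like $-\ln r$, which is the logarithmic phase in Eq. (12.6.17).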
‡ Although $A$ and $B$ are arbitrary in this asymptotic form, their ratio is determined by the requirement
that if $U_E$ is continued inward to $r = 0$, it must vanish. That there is just one free parameter in the
solution (the overall scale), and not two, is because $D_l$ is nondegenerate even for $E > 0$, which in turn
is due to the constraint $U_{El}(r = 0) = 0$; see Exercise 12.6.5.
§ We are considering the case of equal and opposite charges with an eye on the next chapter.
Consider now the case $E < 0$. All the results from the $E > 0$ case carry over with
the change

$$k \to i\kappa, \qquad \kappa = (2\mu|E|/\hbar^2)^{1/2}$$

Thus

$$U_E \xrightarrow[r \to \infty]{} A\, e^{-\kappa r} + B\, e^{+\kappa r} \tag{12.6.18}$$

Again $B/A$ is not arbitrary if we demand that $U_E$ continued inward vanish at $r = 0$.
Now, the growing exponential is disallowed. For arbitrary $E < 0$, both $e^{\kappa r}$ and $e^{-\kappa r}$
will be present in $U_E$. Only for certain discrete values of $E$ will the $e^{\kappa r}$ piece be
absent; these will be the allowed bound state levels. (If $A/B$ were arbitrary, we could
choose $B = 0$ and get a normalizable bound state for every $E < 0$.)
As before, Eq. (12.6.18) is true only if $rV(r) \to 0$. In the Coulomb case we expect
[from Eq. (12.6.17) with $k = i\kappa$]

$$U_E \sim \exp\left(\pm\frac{\mu e^2}{\kappa\hbar^2}\ln r\right) e^{\mp\kappa r} = r^{\pm\mu e^2/\kappa\hbar^2}\, e^{\mp\kappa r} \tag{12.6.19}$$

When we solve the problem of the hydrogen atom, we will find that this is indeed
the case.
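When $E < 0$, the energy eigenfunctions are normalizable to unity. As the operator
$D_l(r)$ is nondegenerate (Exercise 12.6.5), we have

$$\int_0^\infty U_{E'l}(r)\, U_{El}(r)\, dr = \delta_{EE'}$$

and

$$\psi_{Elm}(r, \theta, \phi) = R_{El}(r)\, Y_l^m(\theta, \phi)$$

obeys

$$\iiint \psi^*_{Elm}(r, \theta, \phi)\, \psi_{E'l'm'}(r, \theta, \phi)\, r^2\, dr\, d\Omega = \delta_{EE'}\,\delta_{ll'}\,\delta_{mm'}$$

We will consider the case $E > 0$ in a moment.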
Exercise 12.6.5. Show that $D_l$ is nondegenerate in the space of functions $U$ that vanish
as $r \to 0$. (Recall the proof of Theorem 15, Section 5.6.) Note that $U_{El}$ is nondegenerate even
for $E > 0$. This means that $E$, $l$, and $m$ label a state fully in three dimensions.
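The Free Particle in Spherical Coordinates‡
If we begin as usual with

$$\psi_{Elm}(r, \theta, \phi) = R_{El}(r)\, Y_l^m(\theta, \phi)$$

and switch to $U_{El}$, we end up with

$$\left[\frac{d^2}{dr^2} + k^2 - \frac{l(l+1)}{r^2}\right] U_{El} = 0, \qquad k^2 = \frac{2\mu E}{\hbar^2}$$

Dividing both sides by $k^2$, and changing to $\rho = kr$, we obtain

$$\left[-\frac{d^2}{d\rho^2} + \frac{l(l+1)}{\rho^2}\right] U_l = U_l \tag{12.6.20}$$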
The variable $k$, which has disappeared, will reappear when we rewrite the answer in
terms of $r = \rho/k$. This problem looks a lot like the harmonic oscillator except for
the fact that we have a potential $1/\rho^2$ instead of $\rho^2$. So we define operators analogous
to the raising and lowering operators. These are

$$d_l = \frac{d}{d\rho} + \frac{l+1}{\rho} \tag{12.6.21a}$$

and its adjoint

$$d_l^\dagger = -\frac{d}{d\rho} + \frac{l+1}{\rho} \tag{12.6.21b}$$

(Note that $d/d\rho$ is anti-Hermitian.) In terms of these, Eq. (12.6.20) becomes

$$(d_l\, d_l^\dagger)\, U_l = U_l \tag{12.6.22}$$

Now we premultiply both sides by $d_l^\dagger$ to get

$$d_l^\dagger d_l\,(d_l^\dagger U_l) = d_l^\dagger U_l \tag{12.6.23}$$

You may verify that

$$d_l^\dagger d_l = d_{l+1}\, d_{l+1}^\dagger \tag{12.6.24}$$

so that

$$d_{l+1}\, d_{l+1}^\dagger\,(d_l^\dagger U_l) = d_l^\dagger U_l \tag{12.6.25}$$

‡ The present analysis is a simplified version of the work of L. Infeld, Phys. Rev., 59, 737 (1941).
It follows that

$$d_l^\dagger U_l = c_l\, U_{l+1} \tag{12.6.26}$$

where $c_l$ is a constant. We choose it to be unity, for it can always be absorbed in
the normalization. We see that $d_l^\dagger$ serves as a "raising operator" in the index $l$. Given
$U_0$, we can find the others.‡ From Eq. (12.6.20) it is clear that if $l = 0$ there are two
independent solutions:

$$U_0^A(\rho) = \sin\rho, \qquad U_0^B(\rho) = -\cos\rho \tag{12.6.27}$$

The constants in front are chosen according to a popular convention. Now $U_0^B$ is
unacceptable at $\rho = 0$ since it violates Eq. (12.6.12). If, however, one is considering
the equation in a region that excludes the origin, $U_0^B$ must be included. Consider
now the tower of solutions built out of $U_0^A$ and $U_0^B$. Let us begin with the equation

$$U_{l+1} = d_l^\dagger U_l \tag{12.6.28}$$
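Now, we are really interested in the functions $R_l = U_l/\rho$.§ These obey (from the
above)

$$\rho R_{l+1} = d_l^\dagger(\rho R_l) = \left(-\frac{d}{d\rho} + \frac{l+1}{\rho}\right)(\rho R_l)$$

$$R_{l+1} = \left(-\frac{d}{d\rho} + \frac{l}{\rho}\right) R_l = \rho^l\left(-\frac{d}{d\rho}\right)\frac{R_l}{\rho^l}$$

or

$$\frac{R_{l+1}}{\rho^{l+1}} = \left(-\frac{1}{\rho}\frac{d}{d\rho}\right)\frac{R_l}{\rho^l} = \left(-\frac{1}{\rho}\frac{d}{d\rho}\right)^2\frac{R_{l-1}}{\rho^{l-1}} = \cdots = \left(-\frac{1}{\rho}\frac{d}{d\rho}\right)^{l+1}\frac{R_0}{\rho^0}$$

so that finally we have

$$R_l = (-\rho)^l\left(\frac{1}{\rho}\frac{\partial}{\partial\rho}\right)^l R_0 \tag{12.6.29}$$

‡ In Chapter 15, we will gain some insight into the origin of such a ladder of solutions.
§ Actually we want $R_l = U_l/r = kU_l/\rho$. But the factor $k$ may be absorbed in the normalization factors of
$U$ and $R$.
Now there are two possibilities for $R_0$:

$$R_0^A = \frac{\sin\rho}{\rho}, \qquad R_0^B = \frac{-\cos\rho}{\rho}$$

These generate the functions

$$R_l^A \equiv j_l = (-\rho)^l\left(\frac{1}{\rho}\frac{\partial}{\partial\rho}\right)^l\left(\frac{\sin\rho}{\rho}\right) \tag{12.6.30a}$$

called the spherical Bessel functions of order $l$, and

$$R_l^B \equiv n_l = (-\rho)^l\left(\frac{1}{\rho}\frac{\partial}{\partial\rho}\right)^l\left(\frac{-\cos\rho}{\rho}\right) \tag{12.6.30b}$$

called spherical Neumann functions of order $l$.‡ Here are a few of these functions:

$$\begin{aligned}
j_0(\rho) &= \frac{\sin\rho}{\rho}, & n_0(\rho) &= \frac{-\cos\rho}{\rho} \\
j_1(\rho) &= \frac{\sin\rho}{\rho^2} - \frac{\cos\rho}{\rho}, & n_1(\rho) &= \frac{-\cos\rho}{\rho^2} - \frac{\sin\rho}{\rho} \\
j_2(\rho) &= \left(\frac{3}{\rho^3} - \frac{1}{\rho}\right)\sin\rho - \frac{3\cos\rho}{\rho^2}, & n_2(\rho) &= -\left(\frac{3}{\rho^3} - \frac{1}{\rho}\right)\cos\rho - \frac{3\sin\rho}{\rho^2}
\end{aligned} \tag{12.6.31}$$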
As $\rho \to \infty$, these functions behave as

$$j_l \xrightarrow[\rho \to \infty]{} \frac{1}{\rho}\sin\left(\rho - \frac{l\pi}{2}\right) \tag{12.6.32}$$

$$n_l \xrightarrow[\rho \to \infty]{} -\frac{1}{\rho}\cos\left(\rho - \frac{l\pi}{2}\right) \tag{12.6.32}$$

Despite the apparent singularities as $\rho \to 0$, the $j_l(\rho)$ functions are finite and in fact

$$j_l(\rho) \xrightarrow[\rho \to 0]{} \frac{\rho^l}{(2l+1)!!} \tag{12.6.33}$$

where $(2l+1)!! = (2l+1)(2l-1)(2l-3)\cdots(5)(3)(1)$. These are just the regular
solutions listed in Eq. (12.6.14). The Neumann functions, on the other hand, are singular

$$n_l(\rho) \xrightarrow[\rho \to 0]{} -\frac{(2l-1)!!}{\rho^{l+1}} \tag{12.6.34}$$

and correspond to the irregular solutions listed in Eq. (12.6.14).
‡ One also encounters spherical Hankel functions $h_l = j_l + i n_l$ in some problems.
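The limiting forms of the spherical Bessel functions can be checked against the closed forms listed in Eq. (12.6.31). The following sketch compares $j_0$, $j_1$, $j_2$ with Eq. (12.6.33) at a small argument and with Eq. (12.6.32) at a large one; the function names and the sample points 0.05 and 1000 are illustrative choices.

```python
import math

def j(l, rho):
    """Spherical Bessel functions j_0, j_1, j_2 as listed in Eq. (12.6.31)."""
    s, c = math.sin(rho), math.cos(rho)
    if l == 0:
        return s / rho
    if l == 1:
        return s / rho**2 - c / rho
    if l == 2:
        return (3 / rho**3 - 1 / rho) * s - 3 * c / rho**2
    raise ValueError("only l = 0, 1, 2 are tabulated here")

def double_factorial(n):
    """n!! = n (n-2) (n-4) ... down to 1 or 2."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

small, large = 0.05, 1000.0
for l in range(3):
    near_origin = small**l / double_factorial(2 * l + 1)   # Eq. (12.6.33)
    far_away = math.sin(large - l * math.pi / 2) / large   # Eq. (12.6.32)
    print(l, j(l, small) / near_origin, j(l, large) - far_away)
```

The first printed column should be close to 1 and the second close to 0, with corrections of relative order $\rho^2$ near the origin and of order $1/\rho^2$ far away.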
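Free-particle solutions that are regular in all space are then

$$\psi_{Elm}(r, \theta, \phi) = j_l(kr)\, Y_l^m(\theta, \phi), \qquad E = \frac{\hbar^2k^2}{2\mu} \tag{12.6.35}$$

These satisfy

$$\iiint \psi^*_{Elm}\, \psi_{E'l'm'}\, r^2\, dr\, d\Omega = \frac{\pi}{2k^2}\,\delta(k - k')\,\delta_{ll'}\,\delta_{mm'} \tag{12.6.36}$$

We are using here the fact that

$$\int_0^\infty j_l(kr)\, j_l(k'r)\, r^2\, dr = \frac{\pi}{2k^2}\,\delta(k - k') \tag{12.6.37}$$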
Exercise 12.6.6.* (1) Verify that Eqs. (12.6.21) and (12.6.22) are equivalent to Eq.
(12.6.20).
(2) Verify Eq. (12.6.24).
Exercise 12.6.7. Verify that $j_0$ and $j_1$ have the limits given by Eq. (12.6.33).
Exercise 12.6.8.* Find the energy levels of a particle in a spherical box of radius $r_0$ in
the $l = 0$ sector.
Exercise 12.6.9.* Show that the quantization condition for $l = 0$ bound states in a spherical
well of depth $-V_0$ and radius $r_0$ is

$$k'/\kappa = -\tan k'r_0$$

where $k'$ is the wave number inside the well and $i\kappa$ is the complex wave number for the
exponential tail outside. Show that there are no bound states for $V_0 < \pi^2\hbar^2/8\mu r_0^2$. (Recall
Exercise 5.2.6.)
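The quantization condition of Exercise 12.6.9 can be explored numerically. The sketch below works in units $\hbar = \mu = r_0 = 1$ (an assumption made here for convenience), writes $E = -W$ with $k' = [2(V_0 - W)]^{1/2}$ and $\kappa = (2W)^{1/2}$, and solves $k'\cot k' = -\kappa$ by bisection. The function name `binding_energy` is illustrative, and the routine is deliberately restricted to wells with $(2V_0)^{1/2} < \pi$, which hold at most one $l = 0$ state.

```python
import math

def binding_energy(v0, tol=1e-12):
    """Binding energy W = -E of the l = 0 bound state in a spherical well
    of depth v0 > 0, in units hbar = mu = r0 = 1. Returns None if no
    bound state exists."""
    k0 = math.sqrt(2.0 * v0)
    if k0 >= math.pi:
        raise ValueError("this sketch only covers sqrt(2 v0) < pi")
    if k0 <= math.pi / 2:
        return None  # well too shallow: v0 <= pi^2 / 8

    def f(w):
        # f(w) = k' cot k' + kappa, which vanishes on the quantization condition.
        kp = math.sqrt(2.0 * (v0 - w))
        return kp * math.cos(kp) / math.sin(kp) + math.sqrt(2.0 * w)

    lo, hi = 0.0, v0          # f < 0 near w = 0 and f > 0 near w = v0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for v0 in (1.0, 1.3, 2.0):
    print(v0, binding_energy(v0))
```

In these units the threshold $\pi^2\hbar^2/8\mu r_0^2$ becomes $\pi^2/8 \approx 1.23$, so the first depth gives no bound state and the other two give one each.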
Connection with the Solution in Cartesian Coordinates
If we had attacked the free-particle problem in Cartesian coordinates, we would
have readily obtained

$$\psi_E(x, y, z) = \frac{1}{(2\pi\hbar)^{3/2}}\, e^{i\mathbf{p}\cdot\mathbf{r}/\hbar}, \qquad E = \frac{p^2}{2\mu} = \frac{\hbar^2k^2}{2\mu} \tag{12.6.38}$$

Consider now the case which corresponds to a particle moving along the $z$ axis with
momentum $p$. As

$$\mathbf{p}\cdot\mathbf{r}/\hbar = (pr\cos\theta)/\hbar = kr\cos\theta$$

we get

$$\psi_E(r, \theta, \phi) = \frac{e^{ikr\cos\theta}}{(2\pi\hbar)^{3/2}}, \qquad E = \frac{\hbar^2k^2}{2\mu} \tag{12.6.39}$$

It should be possible to express this solution, describing a particle moving in the $z$
direction with energy $E = \hbar^2k^2/2\mu$, as a linear combination of the functions $\psi_{Elm}$,
which have the same energy, or equivalently, the same $k$:

$$e^{ikr\cos\theta} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} C_l^m\, j_l(kr)\, Y_l^m(\theta, \phi) \tag{12.6.40}$$

Now, only terms with $m = 0$ are relevant since the left-hand side is independent of
$\phi$. Physically this means that a particle moving along the $z$ axis has no angular
momentum in that direction. Since we have

$$Y_l^0(\theta) = \left(\frac{2l+1}{4\pi}\right)^{1/2} P_l(\cos\theta)$$

$$e^{ikr\cos\theta} = \sum_{l=0}^{\infty} C_l\, j_l(kr)\, P_l(\cos\theta), \qquad C_l = C_l^0\left(\frac{2l+1}{4\pi}\right)^{1/2}$$

It can be shown that

$$C_l = i^l(2l+1)$$

so that

$$e^{ikr\cos\theta} = \sum_{l=0}^{\infty} i^l(2l+1)\, j_l(kr)\, P_l(\cos\theta) \tag{12.6.41}$$

This relation will come in handy when we study scattering. This concludes our study
of the free particle.
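Equation (12.6.41) lends itself to a quick numerical check. The sketch below sums the series up to a finite cutoff and compares it with the plane wave directly. The upward recurrences used to generate $j_l$ and $P_l$, namely $j_{l+1} = \frac{2l+1}{\rho}j_l - j_{l-1}$ and $(l+1)P_{l+1} = (2l+1)xP_l - lP_{l-1}$, are standard identities that are not derived in the text; the function names, the cutoff `l_max = 10`, and the sample point are illustrative. The upward recurrence for $j_l$ loses accuracy once $l$ greatly exceeds $\rho$, which is why the cutoff is kept modest.

```python
import cmath
import math

def spherical_bessel_list(l_max, rho):
    """j_0 .. j_l_max via the upward recurrence, seeded from Eq. (12.6.31)."""
    js = [math.sin(rho) / rho,
          math.sin(rho) / rho**2 - math.cos(rho) / rho]
    for l in range(1, l_max):
        js.append((2 * l + 1) / rho * js[l] - js[l - 1])
    return js[: l_max + 1]

def legendre_list(l_max, x):
    """P_0 .. P_l_max via (l+1) P_{l+1} = (2l+1) x P_l - l P_{l-1}."""
    ps = [1.0, x]
    for l in range(1, l_max):
        ps.append(((2 * l + 1) * x * ps[l] - l * ps[l - 1]) / (l + 1))
    return ps[: l_max + 1]

def partial_wave_sum(kr, cos_theta, l_max=10):
    """Right-hand side of Eq. (12.6.41), truncated at l = l_max."""
    js = spherical_bessel_list(l_max, kr)
    ps = legendre_list(l_max, cos_theta)
    return sum((1j) ** l * (2 * l + 1) * js[l] * ps[l]
               for l in range(l_max + 1))

kr, cos_theta = 2.0, 0.3
print(partial_wave_sum(kr, cos_theta))
print(cmath.exp(1j * kr * cos_theta))
```

The two printed complex numbers should agree closely; only the first few partial waves with $l \lesssim kr$ contribute appreciably, a fact that becomes important in scattering.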
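Exercise 12.6.10. (Optional). Verify Eq. (12.6.41) given that

(1) $\displaystyle\int_{-1}^{1} P_l(\cos\theta)\, P_{l'}(\cos\theta)\, d(\cos\theta) = [2/(2l+1)]\,\delta_{ll'}$

(2) $\displaystyle P_l(x) = \frac{1}{2^l\, l!}\frac{d^l(x^2 - 1)^l}{dx^l}$

(3) $\displaystyle\int_0^1 (1 - x^2)^m\, dx = \frac{(2m)!!}{(2m+1)!!}$

Hint: Consider the limit $kr \to 0$ after projecting out $C_l$.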
We close this section on rotationally invariant problems with a brief study of
the isotropic oscillator. The most celebrated member of this class, the hydrogen
atom, will be discussed in detail in the next chapter.
The Isotropic Oscillator
The isotropic oscillator is described by the Hamiltonian

$$H = \frac{P_x^2 + P_y^2 + P_z^2}{2\mu} + \frac{1}{2}\mu\omega^2(X^2 + Y^2 + Z^2) \tag{12.6.42}$$

If we write as usual

$$\psi_{Elm} = \frac{U_{El}(r)}{r}\, Y_l^m(\theta, \phi) \tag{12.6.43}$$

we obtain the radial equation

$$\left\{\frac{d^2}{dr^2} + \frac{2\mu}{\hbar^2}\left[E - \frac{1}{2}\mu\omega^2r^2 - \frac{l(l+1)\hbar^2}{2\mu r^2}\right]\right\} U_{El} = 0 \tag{12.6.44}$$

As $r \to \infty$, we find

$$U \sim e^{-y^2/2} \tag{12.6.45}$$

where

$$y = \left(\frac{\mu\omega}{\hbar}\right)^{1/2} r \tag{12.6.46}$$

is dimensionless. So we let

$$U(y) = e^{-y^2/2}\, v(y) \tag{12.6.47}$$

and obtain the following equation for $v(y)$:

$$v'' - 2yv' + \left[2\lambda - 1 - \frac{l(l+1)}{y^2}\right] v = 0, \qquad \lambda = \frac{E}{\hbar\omega} \tag{12.6.48}$$

It is clear upon inspection that a two-term recursion relation will obtain if a power
series solution is plugged in. We set

$$v(y) = y^{l+1}\sum_{n=0}^{\infty} C_n y^n \tag{12.6.49}$$

where we have incorporated the known behavior [Eq. (12.6.14)] near the origin.
By going through the usual steps (left as an exercise) we can arrive at the
following quantization condition:

$$E = (2k + l + 3/2)\hbar\omega, \qquad k = 0, 1, 2, \ldots \tag{12.6.50}$$

If we define the principal quantum number (which controls the energy)

$$n = 2k + l \tag{12.6.51}$$

we get

$$E = (n + 3/2)\hbar\omega \tag{12.6.52}$$

At each $n$, the allowed $l$ values are

$$l = n - 2k = n, n-2, \ldots, 1 \text{ or } 0 \tag{12.6.53}$$

Here are the first few eigenstates:

n = 0    l = 0       m = 0
n = 1    l = 1       m = ±1, 0
n = 2    l = 0, 2    m = 0; ±2, ±1, 0
n = 3    l = 1, 3    m = ±1, 0; ±3, ±2, ±1, 0

Of particular interest to us is the fact that states of different $l$ are degenerate. The
degeneracy in $m$ at each $l$ we understand in terms of rotational invariance. The
degeneracy of the different $l$ states (which are not related by rotation operators or the
generators) appears mysterious. For this reason it is occasionally termed accidental
degeneracy. This is, however, a misnomer, for the degeneracy in $l$ can be attributed
to additional invariance properties of $H$. Exactly what these extra invariances or
symmetries of $H$ are, and how they explain the degeneracy in $l$, we will see in Chapter
15.
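The counting behind this degeneracy is easy to tabulate. The sketch below lists the allowed $l$ at each $n$ from Eq. (12.6.53) and sums the $(2l+1)$ values of $m$. The comparison value $(n+1)(n+2)/2$ is the standard count of Cartesian states $(n_x, n_y, n_z)$ with $n_x + n_y + n_z = n$; it is quoted here as a known result rather than taken from this section, and the function names are illustrative.

```python
def allowed_l(n):
    """Allowed l at principal quantum number n: l = n, n-2, ..., 1 or 0."""
    return list(range(n, -1, -2))

def degeneracy(n):
    """Number of (l, m) states at level n, summing (2l + 1) over allowed l."""
    return sum(2 * l + 1 for l in allowed_l(n))

for n in range(4):
    print(n, sorted(allowed_l(n)), degeneracy(n), (n + 1) * (n + 2) // 2)
```

The last two columns agree at every $n$, as they must if the spherical and Cartesian solutions span the same eigenspaces.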
Exercise 12.6.11.* (1) By combining Eqs. (12.6.48) and (12.6.49) derive the two-term
recursion relation. Argue that $C_0 \ne 0$ if $U$ is to have the right properties near $y = 0$. Derive the
quantization condition, Eq. (12.6.50).
(2) Calculate the degeneracy and parity at each $n$ and compare with Exercise 10.2.3,
where the problem was solved in Cartesian coordinates.
(3) Construct the normalized eigenfunctions $\psi_{nlm}$ for $n = 0$ and 1. Write them as linear
combinations of the $n = 0$ and $n = 1$ eigenfunctions obtained in Cartesian coordinates.
13
The Hydrogen Atom
13.1. The Eigenvalue Problem
We have here a two-body problem, of an electron of charge $-e$ and mass $m$,
and a proton of charge $+e$ and mass $M$. By using CM and relative coordinates and
working in the CM frame, we can reduce the problem to the dynamics of a single
particle whose mass $\mu = mM/(m + M)$ is the reduced mass and whose coordinate $\mathbf{r}$
is the relative coordinate of the two particles. However, since $m/M \simeq 1/2000$, the
relative coordinate is essentially the electron's coordinate and the
reduced mass is essentially $m$, so let us first solve the problem in the limit $M \to \infty$. In
this case we have just the electron moving in the field of the immobile proton. At a
later stage, when we compare the theory with experiment, we will see how we can
easily take into account the finiteness of the proton mass.
Since the potential energy of the electron in the Coulomb potential

$$\phi = e/r \tag{13.1.1}$$

due to the proton is $V = -e^2/r$, the Schrödinger equation

$$\left\{\frac{d^2}{dr^2} + \frac{2m}{\hbar^2}\left[E + \frac{e^2}{r} - \frac{l(l+1)\hbar^2}{2mr^2}\right]\right\} U_{El} = 0 \tag{13.1.2}$$

determines the energy levels in the rest frame of the atom, as well as the wave
functions‡:

$$\psi_{Elm}(r, \theta, \phi) = R_{El}(r)\, Y_l^m(\theta, \phi) = \frac{U_{El}(r)}{r}\, Y_l^m(\theta, \phi) \tag{13.1.3}$$

It is clear upon inspection of Eq. (13.1.2) that a power series ansatz will lead
to a three-term recursion relation. So we try to factor out the asymptotic behavior.
‡ It should be clear from the context whether $m$ stands for the electron mass or the $z$ component of
angular momentum.
We already know from Section 12.6 that up to (possibly fractional) powers of $r$
[Eq. (12.6.19)],

$$U_{El} \underset{r \to \infty}{\sim} \exp[-(2mW/\hbar^2)^{1/2}\, r] \tag{13.1.4}$$

where

$$W = -E$$

is the binding energy (which is the energy it would take to liberate the electron) and
that

$$U_{El} \underset{r \to 0}{\sim} r^{l+1} \tag{13.1.5}$$

Equation (13.1.4) suggests the introduction of the dimensionless variable

$$\rho = (2mW/\hbar^2)^{1/2}\, r \tag{13.1.6}$$

and the auxiliary function $v_{El}$ defined by

$$U_{El} = e^{-\rho}\, v_{El} \tag{13.1.7}$$

The equation for $v$ is then

$$\frac{d^2v}{d\rho^2} - 2\frac{dv}{d\rho} + \left[\frac{e^2\lambda}{\rho} - \frac{l(l+1)}{\rho^2}\right] v = 0 \tag{13.1.8}$$

where

$$\lambda = (2m/\hbar^2W)^{1/2} \tag{13.1.9}$$

and the subscripts on $v$ are suppressed. You may verify that if we feed in a series
into Eq. (13.1.8), a two-term recursion relation will obtain. Taking into account the
behavior near $\rho = 0$ [Eq. (13.1.5)] we try

$$v_{El} = \rho^{l+1}\sum_{k=0}^{\infty} C_k\, \rho^k \tag{13.1.10}$$

and obtain the following recursion relation between successive coefficients:

$$\frac{C_{k+1}}{C_k} = \frac{-e^2\lambda + 2(k + l + 1)}{(k + l + 2)(k + l + 1) - l(l+1)} \tag{13.1.11}$$

The Energy Levels
Since

$$\frac{C_{k+1}}{C_k} \xrightarrow[k \to \infty]{} \frac{2}{k} \tag{13.1.12}$$

is the behavior of the series $\rho^m e^{2\rho}$, and would lead to $U \sim e^{-\rho}v \sim \rho^m e^{-\rho}e^{2\rho} \sim \rho^m e^{\rho}$
as $\rho \to \infty$, we demand that the series terminate at some $k$. This will happen if

$$e^2\lambda = 2(k + l + 1) \tag{13.1.13}$$

or [from Eq. (13.1.9)]

$$E = -W = -\frac{me^4}{2\hbar^2(k + l + 1)^2}, \qquad k = 0, 1, 2, \ldots;\ l = 0, 1, 2, \ldots \tag{13.1.14}$$

In terms of the principal quantum number

$$n = k + l + 1 \tag{13.1.15}$$

the allowed energies are

$$E_n = -\frac{me^4}{2\hbar^2n^2}, \qquad n = 1, 2, 3, \ldots \tag{13.1.16}$$

and at each $n$ the allowed values of $l$ are, according to Eq. (13.1.15),

$$l = n - k - 1 = n - 1, n - 2, \ldots, 1, 0 \tag{13.1.17}$$

That states of different $l$ should be degenerate indicates that $H$ contains more symmetries
besides rotational invariance. We discuss these later. For the present, let us note
that the degeneracy at each $n$ is

$$\sum_{l=0}^{n-1}(2l + 1) = n^2 \tag{13.1.18}$$

It is common to refer to the states with $l = 0, 1, 2, 3, 4, \ldots$ as $s, p, d, f, g, h, \ldots$ states.
In this spectroscopic notation, $1s$ denotes the state ($n = 1$, $l = 0$); $2s$ and $2p$ the $l = 0$
and $l = 1$ states at $n = 2$; $3s$, $3p$, and $3d$ the $l = 0$, 1, and 2 states at $n = 3$, and so on.
No attempt is made to keep track of $m$.
It is convenient to employ a natural unit of energy, called a Rydberg (Ry), for
measuring the energy levels of hydrogen:

$$\text{Ry} = \frac{me^4}{2\hbar^2} \tag{13.1.19}$$
in terms of which

$$E_n = \frac{-\text{Ry}}{n^2} \tag{13.1.20}$$

[Figure 13.1: energy-level diagram, with the energy axis in units of Ry and the levels labelled by $n$.]
Figure 13.1. The first few eigenstates of hydrogen. The energy
is measured in Rydbergs and the states are labelled in the
spectroscopic notation.
Figure 13.1 shows some of the lowest-energy states of hydrogen.
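The level scheme just described can be tabulated with a few lines of code. The sketch below uses $E_n = -\text{Ry}/n^2$ from Eq. (13.1.20), the range $l = 0, \ldots, n-1$ from Eq. (13.1.17), and the degeneracy sum of Eq. (13.1.18); the function name `hydrogen_level` and the restriction to the letters s through h are illustrative.

```python
SPECTROSCOPIC = "spdfgh"  # letters for l = 0, 1, 2, 3, 4, 5

def hydrogen_level(n):
    """Energy in Rydbergs, spectroscopic labels, and degeneracy at level n."""
    if not 1 <= n <= len(SPECTROSCOPIC):
        raise ValueError("n must be between 1 and 6 in this sketch")
    labels = [f"{n}{SPECTROSCOPIC[l]}" for l in range(n)]
    degeneracy = sum(2 * l + 1 for l in range(n))
    return -1.0 / n**2, labels, degeneracy

for n in range(1, 5):
    print(hydrogen_level(n))
```

Spin is not included here, so the degeneracy printed at each $n$ is $n^2$.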
The Wave Functions
Given the recursion relations, it is a straightforward matter to determine the
wave functions and to normalize them. Consider a given $n$ and $l$. Since the series in
Eq. (13.1.10) terminates at

$$k = n - l - 1 \tag{13.1.21}$$

the corresponding function $v_l$ is $\rho^{l+1}$ times a polynomial of degree $n - l - 1$. This
polynomial is called the associated Laguerre polynomial, $L^{2l+1}_{n-l-1}(2\rho)$.‡ The
corresponding radial function is

$$R_{nl}(\rho) \sim e^{-\rho}\, \rho^l\, L^{2l+1}_{n-l-1}(2\rho) \tag{13.1.22}$$
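The terminating polynomial can be generated directly from the recursion relation, Eq. (13.1.11), once the termination condition $e^2\lambda = 2n$ of Eq. (13.1.13) is imposed. The sketch below does this with exact fractions; the function name `radial_series` and the choice $C_0 = 1$ (so that the result equals the Laguerre polynomial only up to an overall scale) are illustrative assumptions.

```python
from fractions import Fraction

def radial_series(n, l):
    """Coefficients C_0 .. C_{n-l-1} of the series in Eq. (13.1.10),
    with C_0 = 1 and e^2 * lambda = 2n as in Eq. (13.1.13)."""
    if not 0 <= l < n:
        raise ValueError("need 0 <= l < n")
    coeffs = [Fraction(1)]
    k = 0
    while True:
        numerator = -2 * n + 2 * (k + l + 1)
        if numerator == 0:
            return coeffs  # series terminates at k = n - l - 1
        denominator = (k + l + 2) * (k + l + 1) - l * (l + 1)
        coeffs.append(coeffs[-1] * Fraction(numerator, denominator))
        k += 1

for n, l in [(1, 0), (2, 0), (2, 1), (3, 0)]:
    print(n, l, [str(c) for c in radial_series(n, l)])
```

For example, $n = 2$, $l = 0$ gives the polynomial $1 - \rho$, so that $R_{20} \sim e^{-\rho}(1 - \rho)$ up to normalization.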
Recall that
(2m w)1/2 [2m ( me4 12
)1