INTRODUCTION.
A problem with Maxwell's Theory of the Electromagnetic Fields led
to Einstein's Special Theory of Relativity. The following is a
discussion of that problem and its solution by
Einstein's Special Theory of Relativity.
THE PROBLEM.
In 1864 James Clerk Maxwell (1831-1879)
presented to the Royal Society his famous paper "A Dynamical
Theory of the Electromagnetic Field," in which he set forth
his four, now famous, equations describing all electromagnetic
radiation. These equations described exactly how electromagnetic
forces worked and along with Newton's Laws of Motion and of Universal
Gravitation seemed to provide a complete understanding of the
physical world. Maxwell's equations not only described electromagnetic
phenomena but also predicted that there should be electromagnetic
waves that travel at the speed of light. The prediction of the
existence of electromagnetic waves was confirmed on Friday, the
thirteenth of November, 1888, by Heinrich Hertz (1857-1894), who
detected electromagnetic waves emitted by an electric spark. These
waves were called Hertzian waves, which we now call radio waves.
He also showed that electromagnetic waves of short wavelength
are refracted more than those of long wavelength in passing through
matter, just as Newton had shown in his prism experiment for light.
He showed that these electromagnetic waves could be diffracted
and that they also showed interference. There was no difference, except
wavelength and frequency, between these invisible long electromagnetic
waves made by an electric spark and the light seen by our eyes.
In 1894, a young Italian, Guglielmo Marconi, who was only twenty
years old, read of Hertz's work and got the idea of using electromagnetic
waves for communications. In 1898 he accomplished the transmission
and reception of the Hertzian waves over a distance of a few miles
and in 1901 he sent and received them successfully across the
Atlantic.
Maxwell was the first to understand that light, electricity, and
magnetism are intimately related and that light itself consists of electromagnetic
waves of extremely high frequency and short wavelength. Noting
that the ratio of the two constants $k_e/k_m$
connecting electricity and magnetism, which had already been measured in the
laboratory, was the square of the speed of light, he concluded that light
is an electromagnetic wave. That is,
$$ \frac{k_e}{k_m} = \frac{9 \times 10^{9}\ \mathrm{N\,m^2/C^2}}{10^{-7}\ \mathrm{N\,s^2/C^2}}. $$
He calculated the value of that ratio and found it to be
$$ 9 \times 10^{16}\ \mathrm{m^2/s^2}, $$
the square of the speed of light. But in this fact lay the seed of a great
problem. Consider sound waves that travel through the air at about 1000
feet per second. Now if you, as an observer, are traveling in
the opposite direction to the direction of the sound at 100 feet
per second, the sound would be moving relative to you at
1100 feet per second. But Maxwell's equations indicated that
light as electromagnetic waves would be measured to have the same
speed for any observer, however he was moving. This inconsistency
was the result of considering that light waves are like sound
waves in that the waves require a medium for their propagation.
This medium for light waves was called "aether". According
to nineteenth century physics, aether had to have unusual properties.
It had to be immensely rigid to give light a speed of
$3 \times 10^{8}$ m/sec and at the same time to have zero density so
that bodies like the planets could travel through it. Also it had to be
perfectly transparent to account for its undetectability. Later this attempt
to attribute to the aether properties analogous to the properties
of matter was abandoned. The sole property of aether was to support
electromagnetic waves. To nineteenth century physicists it was
difficult to conceive of waves as traveling without a medium.
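As a quick numerical check of the figures quoted above, the following minimal Python sketch takes the square root of the ratio of the two constants and repeats the sound-wave arithmetic. The names `K_E`, `K_M`, and `galilean_relative_speed` are illustrative choices made here, not part of the original discussion, and the constants are the rounded values used in the text.

```python
import math

# Rounded constants as quoted in the text.
K_E = 9e9    # electric constant, N·m²/C²
K_M = 1e-7   # magnetic constant, N·s²/C²

# The ratio has units of m²/s², so its square root is a speed.
ratio = K_E / K_M
speed = math.sqrt(ratio)
print(f"k_e/k_m = {ratio:.1e} m^2/s^2, square root = {speed:.1e} m/s")


def galilean_relative_speed(wave_speed, observer_speed):
    """Speed of a wave relative to an observer moving directly against it,
    according to classical (Galilean) addition of velocities."""
    return wave_speed + observer_speed


# Sound at about 1000 ft/s, observer heading into it at 100 ft/s.
print(galilean_relative_speed(1000, 100), "ft/s")
```

The second result is what classical reasoning would also predict for light, which is exactly where the conflict with Maxwell's equations arises.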
Many physicists concluded that Maxwell's theory was valid only
if the observer was at rest relative to the aether. But then,
what observers are at rest relative to the aether? Maxwell's
theory did not provide this knowledge, and without this knowledge
the theory seemed incomplete. Was the aether at rest relative
to the earth? No one wanted to accept the view that the earth
was the center of the aether universe. The commonly accepted view was
that the aether was at rest with respect to the fixed stars.
This meant that the earth was moving relative to the aether, in
different directions at different times as it moved along its
orbit around the sun. That is, the measured speed of light should vary
at different points along the earth's orbit. Measuring
this effect became the crucial experiment during the late nineteenth
century.
The Michelson-Morley Experiment.
The most accurate attempt to measure the speed of light at different
points along the earth's orbit was made by two American physicists, Albert
A. Michelson (1852-1931) and Edward W. Morley (1838-1923) in 1887.
Michelson first performed the experiment in 1881, using an instrument
that he had invented, called an optical interferometer, and then,
in 1887, with the collaboration of his colleague Morley, carried
out a more precise version of what is known as the Michelson-Morley
experiment. An optical interferometer is an ingenious device
that splits a light beam into two parts so that they travel
over two different paths and then, after they are reflected back
along their paths, recombines them to form an interference
pattern. The Michelson interferometer, being fixed with respect
to the earth, moves with the earth through the aether, which
was considered to be at rest relative to the fixed stars and the sun. Hence
the earth and the interferometer move through the aether at a
speed of about 30 km/sec, in different directions during different
seasons of the year. The Michelson interferometer attempted to
measure the speed of light relative to the aether. It was assumed
that aether filled all of space and that it was the medium with
respect to which the speed of light was to be measured. It followed
then that an observer moving through the aether with velocity
v would measure for a light beam a velocity c′ given by the vector sum
c′ = c + v. It was this result that the
Michelson-Morley experiment was designed to detect.
The Michelson interferometer had a monochromatic light source S whose light beam was aimed at a partially silvered mirror M inclined at 45° to the direction of the beam. The glass of this mirror was lightly coated with silver so that approximately half the light would be reflected and half would pass through the glass of the mirror. The beam of light is thus split by the partially silvered mirror M into two coherent beams, one passing through the mirror in the direction of the original beam, and the other reflected off the mirror at right angles to the direction of the original beam. Beam 1 is transmitted through M and is reflected back to M by a plane mirror M1. Beam 2 is reflected at right angles off M and is reflected back to M by a second plane mirror M2. Thus both beams are reflected by the mirrors at the ends of their paths back toward the partially silvered mirror M, where each is again partially reflected and partially transmitted. There they interfere with each other, and the interference pattern is observed through a telescope at right angles to the original beam.
Suppose that one of the arms of the interferometer is aligned along the direction of the motion of the earth through space. The earth moving through the aether would be equivalent to the aether flowing past the earth in the opposite direction. This "aether wind" blowing in the direction opposite the earth's motion should cause the speed of light measured in the earth's frame of reference to be c - v as the light approaches the mirror M1 at the end of the arm lying along the direction of the earth's motion, and c + v after reflection, where c is the speed of light in the aether frame of reference and v is the speed of the earth through space and hence the speed of the aether wind. The two returning beams of light recombine, and an interference pattern consisting of alternate dark and bright bands is formed.
The interference is constructive or destructive depending on the
phase difference of the two beams. The phase difference can arise
from two causes: the different path lengths $L_1$ and $L_2$,
and the different speeds of travel along the two paths with respect
to the aether wind. To compensate for the first cause, the path
lengths $L_1$ and $L_2$ are made as nearly
equal as possible. The second cause is the crucial one. It can be explained
by means of a commonly used analogy. The different speeds along
the two arms of the interferometer are much like the different
cross-stream and up-and-down-stream speeds, with respect
to the shore, of a rowboat in a moving stream. The time $t_1$
for beam 1 to travel from M to M1 and back is
$$ t_1 = \frac{L_1}{c - v} + \frac{L_1}{c + v} = \frac{2L_1}{c}\,\frac{1}{1 - v^2/c^2}, $$
where the light, whose speed is c in the aether, has an
up-stream speed of c - v with respect to the device
and a down-stream speed of c + v.
The path of the reflected beam 2, traveling from M to
M2 and back, is the cross-stream path through the aether.
During the round-trip time $t_2$ the device moves at right angles to the arm of length $L_2$
through a distance $d = v t_2$. Therefore, the
reflected light beam 2 travels diagonally from M to
M2 while the device moves a distance $v t_2/2$
through the aether. By the Pythagorean theorem, beam 2 travels a
diagonal distance from M to M2 equal to
$$ \sqrt{L_2^2 + (v t_2/2)^2} = \tfrac{1}{2}\sqrt{4L_2^2 + v^2 t_2^2}. $$
Beam 2 is then reflected off the plane mirror M2 and
travels diagonally from M2 back to M, which has meanwhile
moved an additional distance $v t_2/2$ through the aether.
By the Pythagorean theorem, the length of this return leg from M2 to M is
also equal to
$$ \tfrac{1}{2}\sqrt{4L_2^2 + v^2 t_2^2}, $$
so that beam 2 is able to return to the (advancing) mirror
M. Thus beam 2 travels a total
distance in time $t_2$ equal to
$$ \sqrt{4L_2^2 + v^2 t_2^2}. $$
Since light travels through the aether at speed c, this total distance
from M to M2 and back again is also $c t_2$:
$$ c t_2 = \sqrt{4L_2^2 + v^2 t_2^2}. $$
The time $t_2$ that beam 2 takes to travel from
M to M2 and back again can be found by solving this
equation for $t_2$. Squaring it, we get
$$ c^2 t_2^2 = 4L_2^2 + v^2 t_2^2, $$
or, collecting the terms in $t_2^2$,
$$ c^2 t_2^2 - v^2 t_2^2 = 4L_2^2, $$
or
$$ t_2^2 = \frac{4L_2^2}{c^2 - v^2}, $$
so that
$$ t_2 = \frac{2L_2}{\sqrt{c^2 - v^2}} = \frac{2L_2}{c}\,\frac{1}{\sqrt{1 - v^2/c^2}}. $$
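The difference in time
$\Delta t = t_1 - t_2$
taken by the two beams of light is therefore
$$ \Delta t = t_1 - t_2 = \frac{2}{c}\left[\frac{L_1}{1 - v^2/c^2} - \frac{L_2}{\sqrt{1 - v^2/c^2}}\right], $$
or, if $L_1 = L_2 = L$,
$$ \Delta t = t_1 - t_2 = \frac{2L}{c}\left[\frac{1}{1 - v^2/c^2} - \frac{1}{\sqrt{1 - v^2/c^2}}\right]. $$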
Thus according to classical physics there is a difference
in time between the two beams of light. This formula can be simplified
by using the binomial expansion and dropping terms of higher
than second order in v/c. Using $(1 - x)^{-1} \approx 1 + x$ and
$(1 - x)^{-1/2} \approx 1 + x/2$ with $x = v^2/c^2$, we get the following formula for the
difference in time between the two beams:
$$ \Delta t = t_1 - t_2 \approx \frac{L}{c}\,\frac{v^2}{c^2}. $$
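To see how good the second-order approximation is, the sketch below evaluates the exact classical round-trip times for the two arms and compares their difference with $(L/c)(v^2/c^2)$. It is only an illustration under stated assumptions: the function names are invented here, the arm length of 11 m is an illustrative value, and the speeds are the rounded figures used in the text.

```python
import math

C = 3.0e8   # speed of light in m/s (rounded, as in the text)
V = 3.0e4   # orbital speed of the earth in m/s (rounded, as in the text)


def t_along(length, v=V, c=C):
    """Round-trip time for the arm parallel to the aether wind."""
    return length / (c - v) + length / (c + v)


def t_across(length, v=V, c=C):
    """Round-trip time for the arm perpendicular to the aether wind."""
    return 2.0 * length / math.sqrt(c**2 - v**2)


def delta_t_exact(length, v=V, c=C):
    """Exact classical time difference t1 - t2 for equal arms."""
    return t_along(length, v, c) - t_across(length, v, c)


def delta_t_approx(length, v=V, c=C):
    """Second-order binomial approximation (L/c)(v^2/c^2)."""
    return (length / c) * (v / c) ** 2


ARM_LENGTH = 11.0  # metres; illustrative value only
print("exact :", delta_t_exact(ARM_LENGTH))
print("approx:", delta_t_approx(ARM_LENGTH))
```

Both numbers come out at a few times 10^-16 seconds, which shows why an interference method, rather than direct timing, was needed.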
The whole interferometer device was mounted on a stone table which
was floated on a pool of mercury in a large tank, so that the
whole device could be rotated through an angle of 90°. The interference
pattern could be observed while the interferometer was rotated.
The idea was that this rotation would change the speed of the
aether wind along the arms of the interferometer, and consequently
the fringe pattern would shift slightly but measurably. After
rotating the device by 90°, so that $L_1$ is the
cross-stream length and $L_2$ is the up-and-down-stream length,
the transit time difference, now designated by primes, will be
$$ \Delta t' = t'_1 - t'_2 = \frac{2}{c}\left[\frac{L_1}{\sqrt{1 - v^2/c^2}} - \frac{L_2}{1 - v^2/c^2}\right]. $$
The rotation therefore changes the time difference by
$$ \Delta t - \Delta t' = \frac{2(L_1 + L_2)}{c}\left[\frac{1}{1 - v^2/c^2} - \frac{1}{\sqrt{1 - v^2/c^2}}\right]. $$
Using the binomial expansion and dropping terms higher than the
second order, we get
$$ \Delta t - \Delta t' \approx \frac{L_1 + L_2}{c}\,\frac{v^2}{c^2}. $$
Therefore, the rotation should cause a shift in the fringe pattern,
since it changes the phase relationship between beams 1 and 2.
If the optical path difference between the beams changes by one
wavelength, for example, there will be a shift of one fringe across
the cross-hairs of the viewing telescope. Let $\Delta N$ represent
the number of fringes moving past the cross-hairs of the telescope
as the pattern shifts. Then, for light of wavelength λ and
frequency f, so that the period of one vibration is
$T = 1/f = \lambda/c$,
$$ \Delta N = \frac{\Delta t - \Delta t'}{T} = \frac{L_1 + L_2}{cT}\,\frac{v^2}{c^2} = \frac{L_1 + L_2}{\lambda}\,\frac{v^2}{c^2}. $$
Michelson and Morley were able to obtain an optical path length,
$L_1 + L_2$, of about 22 meters by multiple
reflections. In their experiment the arms of the interferometer were of
nearly equal length, that is,
$L_1 = L_2 = L$, so that
$$ \Delta N = \frac{2L}{\lambda}\,\frac{v^2}{c^2} = \frac{22\ \mathrm{m}}{5.5 \times 10^{-7}\ \mathrm{m}} \times 10^{-8} = 0.4, $$
where $\lambda = 5.5 \times 10^{-7}$ meters and
$v/c = 10^{-4}$, since
the speed of the earth in its orbit is $v = 3 \times 10^{4}$ meters/sec
and the speed of light is $c = 3 \times 10^{8}$ meters/sec.
Thus $\Delta N$ is a shift of four-tenths of a fringe.
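The arithmetic behind the expected fringe shift can be reproduced in a few lines. This is a minimal sketch using the rounded values quoted in the text; the function name `fringe_shift` and its parameter names are illustrative choices, not anything used by Michelson and Morley.

```python
def fringe_shift(total_path_m, wavelength_m, v, c):
    """Expected fringe shift (L1 + L2)/lambda * (v/c)^2 for a 90-degree rotation."""
    return (total_path_m / wavelength_m) * (v / c) ** 2


# Values quoted in the text: 22 m total path, 5.5e-7 m wavelength,
# v = 3e4 m/s for the earth, c = 3e8 m/s for light.
shift = fringe_shift(22.0, 5.5e-7, 3.0e4, 3.0e8)
print(round(shift, 3))  # prints 0.4
```

Because the shift scales linearly with the total path length, lengthening the optical path by multiple reflections was the natural way to raise the sensitivity of the apparatus.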
The Failure of the Experiment.
But measurements failed to show any change in the interference pattern!
On the chance that the earth was at rest relative to the aether at one
point in its orbit (in which case the time difference would not occur),
Michelson and Morley repeated their experiment over a period of
six months, constantly rotating their device. Not once did they
observe the expected shift in the interference pattern. To Michelson
this was a grave disappointment. He considered the experiment
a failure. He repeated the experiment under various conditions,
at different times and locations, but the result was always the
same: no fringe shift of the magnitude required was ever observed.
The immediate conclusion to be drawn from this failure of the
Michelson-Morley experiment to detect any effect of the aether
on the motion of light waves is that the speed of light in free
space is a constant everywhere, completely independent of any
relative motion of the aether. And a corollary of this conclusion
is that the aether did not exist. Here was a huge puzzle. On
the one hand, how could one explain the failure of the Michelson-Morley
experiment and still believe in the existence of aether, and on
the other hand, if aether did not exist, how could one explain
how light was propagated without a medium, since the sole property
of aether was its ability to support electromagnetic waves? The
nineteenth century physicists were not willing to abandon the concept
of the aether.
The negative result of the Michelson-Morley experiment was variously interpreted. Sir Oliver Lodge suggested that a layer of aether was dragged along by the earth as it rotated, so that the aether around the earth was relatively at rest. He tried various experiments to verify his hypothesis, but no experiment showed the existence of the aether drag. The Dutch theoretical physicist, Hendrik Antoon Lorentz (1853-1928), and an Irish mathematician, G. F. FitzGerald (1851-1901), independently suggested an ad hoc explanation of the negative result of the Michelson-Morley experiment. FitzGerald in 1892 proposed the hypothesis, which was elaborated by Lorentz, that all bodies contract in the direction of their motion relative to the stationary aether by a factor of $\sqrt{1 - v^2/c^2}$. The motion of the earth through the aether would then shorten the arm of the Michelson interferometer lying in the direction of the earth's motion by exactly the amount required to eliminate the fringe shift. Since a meter stick placed alongside any object would suffer the same fractional contraction, there can be no operational procedure for verifying or disproving this assertion of FitzGerald. Lorentz, who independently had thought of this possibility, justified the hypothesis in terms of a possible change in the electromagnetic forces between the constituent atoms of a body due to its motion. This length contraction, now known as the Lorentz-FitzGerald contraction, is also a consequence of Einstein's special theory of relativity, but not, in his theory, an arbitrary axiom contrived to explain an otherwise incomprehensible observation.
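The claim that the contraction removes the fringe shift "exactly" can be checked numerically. The sketch below reuses the classical round-trip times derived earlier for the two arms and shortens only the arm lying along the motion by the factor $\sqrt{1 - v^2/c^2}$. The function name, the 11 m rest length, and the rounded speeds are assumptions made for illustration.

```python
import math


def round_trip_times(length_along, length_across, v, c):
    """Classical aether-theory round-trip times for the two arms."""
    t_along = length_along / (c - v) + length_along / (c + v)
    t_across = 2.0 * length_across / math.sqrt(c**2 - v**2)
    return t_along, t_across


c, v = 3.0e8, 3.0e4      # rounded values used in the text
rest_length = 11.0       # metres; illustrative value only

# Without contraction the two times differ slightly.
t1_plain, t2_plain = round_trip_times(rest_length, rest_length, v, c)

# With the arm along the motion contracted by sqrt(1 - v^2/c^2),
# the two times coincide (up to floating-point rounding).
contracted = rest_length * math.sqrt(1.0 - (v / c) ** 2)
t1, t2 = round_trip_times(contracted, rest_length, v, c)

print("uncontracted difference:", t1_plain - t2_plain)
print("contracted difference  :", t1 - t2)
```

The second difference vanishes for any orientation of the arms once the arm along the motion is the contracted one, which is why rotating the apparatus would show no shift under this hypothesis.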
The Principle of Relativity.
All these interpretations of the failure of the Michelson-Morley
experiment assumed that the aether existed, since it was necessary to explain
how electromagnetic waves are propagated. But the failure of so many attempts
to measure the velocity of the earth relative to the aether suggested
to the brilliant mind of the French physicist Jules Henri Poincare
(1854-1912) a new possibility. In his lectures at the Sorbonne
in 1899, after reviewing the experiments so far, he proposed that
the absolute motion of a body is not detectable in principle.
In the following year, at an International Congress of Physics
held in Paris, he asserted the same view. He said "Our aether,
does it really exist? I do not believe that more precise observations
could ever reveal anything more than relative displacements."
A new principle must be introduced into Physics, which would resemble
the Second Law of Thermodynamics in so far as it asserts the impossibility
of doing something; that is, the impossibility of determining
the velocity of the earth relative to the aether. He published
these views in April of 1904 and in a lecture to a Congress of
Arts and Science at St. Louis, U.S.A., on 24 Sept., 1904, Poincare
gave to a generalized form of this principle the name, The
Principle of Relativity. He said, "According to the
Principle of Relativity the laws of physical phenomena must be
the same for a 'fixed' observer as for an observer who has a uniform
motion of translation relative to him; so that we have not, and
cannot possibly have, any means of discerning whether we are,
or are not, carried along in such a motion." And after reviewing
the records of observations in the light of this principle, he
says, "From all these results there must arise an entirely
new kind of dynamics, which will be characterized above all
by the rule, that no velocity can exceed the velocity of light."
Coordinate Transformations.
In 1895 H. A. Lorentz developed a set of equations for a moving electric
system by applying a transformation to the fundamental equations of the aether.
But in the original form of this transformation, quantities of
order higher than the first in (v/c) were neglected.
In 1900 Sir J. Larmor extended the analysis so as to include
quantities of the second order. Lorentz in 1903 went even further
and obtained the transformation from one coordinate system to another
moving with constant velocity v with respect to the first
one so that the invariance of Maxwell's equations is retained:
$$ x' = \frac{x - vt}{\sqrt{1 - v^2/c^2}}, \qquad y' = y, \qquad z' = z, \qquad t' = \frac{t - vx/c^2}{\sqrt{1 - v^2/c^2}}. $$
In June, 1905, Poincare gave to this set of coordinate transformations
the name of Lorentz transformations. These equations assume
that the second frame of reference is moving along the
x direction of the first one at a constant velocity
v relative to it. Note that
in these Lorentz equations distance and time are both involved
together. The equations relating x and x′ and relating
t and t′ are not the simple ones of the Galilean
transformations:
$$ x' = x - vt, \qquad y' = y, \qquad z' = z, \qquad t' = t. $$
The Galilean equations are the equations of coordinate transformation
between two frames of reference moving relative to one another
with a uniform translational motion. Suppose that an object
is moving with respect to both frames. With respect to the first frame,
the object describes a particular path with some definite motion
along that path. With respect to the second frame, the path and
motion will be different. For mathematical purposes, each frame
uses a coordinate system to specify the desired space frame of
reference. Let us call the fixed frame K and the frame
moving to its right at constant velocity K′. The observers
in both frames are supposed to have identical clocks. Now let
P be a point in space. Its coordinates with respect to
K′ are x′, y′ and z′,
and with respect to K they are x, y, and z.
Since the frame K′ is moving with velocity v, parallel
to the positive x-axis of the coordinate system of the
frame K, the equation of transformation between x
and x′ is $x' = x - vt$.
The equations for the other coordinates are
$y' = y$ and
$z' = z$.
These two frames are called Galilean or inertial frames.
One moves relative to the other at a constant speed in a straight line.
No relative acceleration or rotation between these frames takes place.
In Newton's terms, Galilean frames are at rest or moving with uniform
translational speed through absolute space and without acceleration or
rotation. And it cannot be determined which one is at rest in absolute space,
but this does not matter because we know the laws of transformation.
Since we can speak only of the relative velocity of one frame
with respect to another, and not of an absolute velocity of a
frame, this principle is sometimes called Newtonian relativity.
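The two sets of transformation equations given in this section can be compared directly. The following is a minimal sketch, with function names and the sample event chosen here for illustration, that applies both transformations to the same event for a slow and for a fast relative velocity. The y and z coordinates are omitted because both transformations leave them unchanged.

```python
import math

C = 3.0e8  # speed of light in m/s (rounded)


def galilean(x, t, v):
    """Galilean transformation: x' = x - v t, t' = t."""
    return x - v * t, t


def lorentz(x, t, v, c=C):
    """Lorentz transformation for relative motion along the x-axis."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c**2)


event = (1000.0, 2.0)  # x in metres, t in seconds; arbitrary illustrative event

for v in (30.0, 0.6 * C):
    print(f"v = {v:.3g} m/s")
    print("  Galilean:", galilean(*event, v))
    print("  Lorentz :", lorentz(*event, v))
```

At everyday speeds the two results agree to many decimal places, which is why the Galilean equations served mechanics so well; at a sizeable fraction of c both the position and the time of the event differ markedly.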
Transformation equations, in general, will change many quantities
but will leave others unchanged. These unchanged quantities are
called invariants of the transformation. In the Galilean
transformation laws for the relation between observations made
in different inertial frames of reference, acceleration, for example,
is an invariant and - more important - so are Newton's laws of
motion. A statement of what the invariant quantities are is called
a relativity principle; it says that for such quantities
the reference frames are equivalent to each other, no one of them
having an absolute or privileged status relative to the others.
Newton expressed his relativity principle in the following way:
"The motion of bodies included in a given space are the same
amongst themselves, whether that space is at rest or moves uniformly
forward in a straight line." And in addition, the differential
equations that hold in one frame also hold in the other. That
is, the classical laws of mechanics are the same in both.
Frame of Reference.
A physical event is defined as something that happens
independently of the frame of reference that is used to describe it.
For example, an event occurs when two particles collide with each other
or when a tiny light source is turned on. The event happens at a point
in space and at an instant of time. An event may be specified by four
(space-time) measurements in a particular frame of reference,
giving three position numbers x, y, z, and
the time t when the event occurs. For example, the collision
of the two particles occurred at x = 1 meter, y = 2 meters,
z = 4 meters, and at time t = 6 sec in some frame
of reference (in a laboratory on earth) so that the four numbers
(1, 2, 4, 6) specify the event in that frame of reference. The
same event observed from a different frame of reference (for example,
an airplane flying overhead) would also be specified by four numbers,
although the numbers may be different from those in the laboratory
frame of reference. Thus in order to describe an event, the first
step is to establish a frame of reference. Consider a
physical event at point P, whose space and time coordinates
are measured in each inertial frame of reference. An observer
attached to the frame K specifies by means of a meter stick
and clock the location and time of the occurrence of the event,
ascribing space coordinates x, y, and z and
time t to it.
An observer attached to the frame K′, using his measuring
instruments, specifies the same event by space-time coordinates
x′, y′, z′, and t′.
The coordinates x, y, z specify the position in
space of P relative to the origin O as measured by the
observer in K, and t is the time of occurrence
of P that the observer in K records with his clock.
Similarly, the coordinates x′, y′, z′
refer the position of event P to the origin O′, and
t′ refers the time of P to the clock in inertial frame
K′. Now in order to establish the relationship between the
measurements in K and K′, the two inertial observers must
use meter sticks, which have been compared and calibrated against
one another, and the clocks, which have been synchronized and
calibrated against one another. The classical procedure assumes
that the length intervals and time intervals are absolute, that
is, that they are the same for all inertial observers of the same
events. For simplicity, let us set the clocks of each observer
to read zero at the instant that the origins O and O′
of the frames of reference K and K′, which are in
uniform relative motion, coincide. The relations between the measurements
made in K and K′ are given by the Galilean coordinate
transformations:
$$ x' = x - vt, \qquad y' = y, \qquad z' = z, \qquad t' = t. $$
These equations assume that time can be defined independently
of any particular frame of reference. This is the implicit assumption
of classical Newtonian physics. It is made explicit by the equation
$t' = t$.
It follows that the time interval between two given events, say P and
Q, is the same for each observer, that is,
$$ t'_P - t'_Q = t_P - t_Q, $$
and that the distance, or space interval, between two points,
say A and B, measured at a given instant, is the same
for each observer, that is,
$$ x'_B - x'_A = x_B - x_A. $$
Note carefully that the two measurements (of the end points of the
space interval or distance) are made by each observer and that
they are assumed to be made at the same time
($t_A = t_B$, or $t'_A = t'_B$).
The assumption that the measurements are made at the same time, that is, simultaneously, is a crucial part of our definition of the length of a moving rod. Of course, we should not measure the location of the end points of the rod at different times to get the length of the moving rod.
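The role of simultaneity in measuring a moving rod can be illustrated with the Galilean transformation itself. In this sketch a rod moves along the x-axis of K; all numbers and names (`rod_speed`, `ends_in_k`, and so on) are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def galilean_x(x, t, v):
    """x-coordinate in K' of an event at (x, t) in K."""
    return x - v * t


v = 10.0          # speed of K' relative to K, m/s (illustrative)
rod_speed = 4.0   # speed of the rod in K, m/s (illustrative)
rod_length = 2.0  # true length of the rod, m (illustrative)


def ends_in_k(t):
    """Positions in K of rod ends A and B at time t."""
    x_a = rod_speed * t
    return x_a, x_a + rod_length


# Simultaneous measurement of both ends at t = 3 s.
x_a, x_b = ends_in_k(3.0)
length_k = x_b - x_a
length_k_prime = galilean_x(x_b, 3.0, v) - galilean_x(x_a, 3.0, v)

# Non-simultaneous measurement: end A at t = 3 s, end B at t = 5 s.
x_a_early, _ = ends_in_k(3.0)
_, x_b_late = ends_in_k(5.0)
bad_k = x_b_late - x_a_early
bad_k_prime = galilean_x(x_b_late, 5.0, v) - galilean_x(x_a_early, 3.0, v)

print("simultaneous    :", length_k, length_k_prime)
print("non-simultaneous:", bad_k, bad_k_prime)
```

With simultaneous end-point measurements both observers obtain the same length; when the end points are located at different times the two observers disagree, and neither obtains the true length of the rod.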
Classical Mechanics.
The time interval and space interval measurements described above are
absolute according to the Galilean coordinate transformation; that is,
they are the same for all inertial observers, the relative velocity v
of the frames being arbitrary and not entering into the results.
And when we consider the assumption of classical physics that
the mass of a body is a constant, independent of its motion with
respect to an observer, then we can conclude that classical mechanics
and the Galilean transformations imply that length, mass, and
time -- the three basic quantities of mechanics -- are all independent
of the relative motion of the observer (or measurer).
How do the measurements of velocity and acceleration compare when
made by different inertial observers? Since the position of a particle
in motion is a function of time, the velocity and acceleration
may be expressed as time derivatives of position. By carrying out
successive time differentiations of the Galilean transformation
equations we can calculate the velocity and acceleration. If we
differentiate the equation
$x' = x - vt$ with respect to time t,
we get
$$ \frac{dx'}{dt} = \frac{dx}{dt} - v. $$
But since $t' = t$, the operation $d/dt$
is identical to the operation $d/dt'$, so that
$$ \frac{dx'}{dt} = \frac{dx'}{dt'}. $$
Therefore,
$$ \frac{dx'}{dt'} = \frac{dx}{dt} - v. $$
Similarly,
$$ \frac{dy'}{dt'} = \frac{dy}{dt} \quad\text{and}\quad \frac{dz'}{dt'} = \frac{dz}{dt}. $$
Now let $dx'/dt' = u'_x$,
the x-component of the velocity measured in K′, and
$dx/dt = u_x$, the x-component of
the velocity measured in K, and similarly for the other components. Substituting, we get the
classical velocity addition theorem:
$$ u'_x = u_x - v, \qquad u'_y = u_y, \qquad u'_z = u_z. $$
To obtain the acceleration transformation we just differentiate
these equations for velocity addition with respect to time. We get
$$ \frac{du'_x}{dt'} = \frac{du_x}{dt}, $$
since v is a constant, and
$$ \frac{du'_y}{dt'} = \frac{du_y}{dt}, \qquad \frac{du'_z}{dt'} = \frac{du_z}{dt}. $$
That is, $a'_x = a_x$,
$a'_y = a_y$, and
$a'_z = a_z$.
Hence, $\mathbf{a}' = \mathbf{a}$. That is, the
measured components of acceleration of a particle are unaffected by
the uniform relative velocity v of the reference frames.
The velocity measured in different inertial frames by different observers who are in relative motion will differ by the relative velocity of the two observers, which in the case of inertial observers is a constant velocity. Now if the particle velocity changes, the change will be the same for both observers. That is, each observer will measure the same acceleration for the particle. The acceleration of a particle is the same in all reference frames which are moving relative to one another with constant velocity; that is, a′ = a.
In classical physics the mass is also unaffected by the motion of the reference frame. Hence, the product ma will be the same for all inertial observers. If F = ma is taken as the definition of force, then each observer obtains the same measure for each force. If F = ma, then F′ = ma′ and F = F′. Newton's laws of motion and the equations of motion of a particle would be exactly the same in all inertial systems. And since, in mechanics, the conservation principles of energy, momentum and angular momentum can all be shown to be consequences of Newton's laws, it follows that the laws of mechanics are the same in all inertial frames. That is, not only is acceleration an invariant, but so are Newton's laws; they are the same for all inertial observers.
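The results of this section can be checked numerically by differentiating a sample trajectory in both frames. The sketch below is an illustration under stated assumptions: the trajectory, the relative velocity `V`, the mass, and the finite-difference helpers are all hypothetical choices made here, and the derivatives are estimated by central differences rather than computed symbolically.

```python
V = 7.0  # velocity of K' relative to K in m/s (illustrative)


def x_in_k(t):
    """Hypothetical trajectory in K: x0 = 5 m, u0 = 3 m/s, a = 2 m/s^2."""
    return 5.0 + 3.0 * t + 0.5 * 2.0 * t**2


def x_in_k_prime(t):
    """Same trajectory seen from K', using x' = x - V t and t' = t."""
    return x_in_k(t) - V * t


def velocity(f, t, h=1e-2):
    """Central-difference estimate of df/dt."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


def acceleration(f, t, h=1e-2):
    """Central-difference estimate of d2f/dt2."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2


t = 4.0
u, u_prime = velocity(x_in_k, t), velocity(x_in_k_prime, t)
a, a_prime = acceleration(x_in_k, t), acceleration(x_in_k_prime, t)

mass = 1.5  # kg; the same in both frames in classical physics
print("u - u' =", u - u_prime)              # close to V
print("a, a'  =", a, a_prime)               # close to each other
print("F, F'  =", mass * a, mass * a_prime)
```

The velocities differ by the relative velocity of the frames, while the accelerations, and therefore the forces defined through F = ma, agree to within the rounding error of the finite differences.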
Maxwell's Equations and Galilean Transformations.
Now let us consider Maxwell's equations. Let us inquire whether
the laws of electromagnetism (and any other laws of physics in
addition to those of mechanics) are invariant under Galilean transformations.
At the end of the nineteenth century, it was widely believed
that the same partial differential equations held in any Galilean
frame. That is, it was believed that the Newtonian relativity principle
would hold not only for mechanics but for all of physics. In
other words, no inertial frame would be preferred over any other
and no type of experiment in physics, not merely mechanical ones,
carried out in a single frame would enable us to determine the
velocity of our frame relative to another frame. There would
then be no preferred, or absolute, frame of reference. Thus,
it was believed that what held for Newtonian mechanics held also for
electromagnetism. But this was not true. When the Galilean laws of
transformation were applied to the electromagnetic equations in
K in order to obtain them in K′, it was found that
the equations were modified by adding terms that involved the
relative velocity of the two frames. Maxwell's equations are
not preserved in form by the Galilean transformations. The reason
for this is that velocity is not invariant under these transformations, whereas Maxwell's equations
contain the constant
$c = 1/\sqrt{\mu_0 \varepsilon_0}$,
the speed of light, the velocity of propagation of a plane wave
in vacuum. But such a velocity cannot be the same for observers
in different inertial frames, according to the Galilean transformations,
so that electromagnetic effects will probably not be the same
for different inertial observers. Consider a light signal or
pulse sent to the right with the velocity c and another
sent to the left with the velocity c. An observer moving
to the right with velocity v is "catching up"
to the first light signal, and so for that observer the signal has the velocity
c - v.
On the other hand, this observer is running
away from the second light signal, which has a velocity relative to
the observer of c + v. That is, in a frame K′
moving at a constant velocity v with respect to the aether
frame, an observer would measure a different velocity for the
light pulse, ranging from c + v to c - v
depending upon the direction of relative motion, using the Galilean
velocity transformation. For the moving observer the two light
signals do not propagate with the same velocity, and so Maxwell's
equations do not have the same form for the observer. As far
as Maxwell's equations are concerned, there is only one preferred
frame, the frame at rest with respect to the aether.
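The two light pulses described above can be followed through both sets of transformation equations. This sketch, whose function names and choice of v are assumptions made for illustration, transforms the event "pulse has reached x = ±ct at time t" into K′ and divides the transformed position by the transformed time. The Galilean equations give c - v and c + v, while the Lorentz transformations given earlier return the same speed c in both directions.

```python
import math

C = 3.0e8  # speed of light in m/s (rounded)


def galilean(x, t, v):
    """Galilean transformation: x' = x - v t, t' = t."""
    return x - v * t, t


def lorentz(x, t, v, c=C):
    """Lorentz transformation for relative motion along the x-axis."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c**2)


def pulse_velocity(transform, direction, v, t=1.0):
    """Velocity in K' of a pulse emitted at the common origin at t = 0.

    direction is +1 for the pulse sent to the right, -1 for the one sent
    to the left. The emission event (0, 0) maps to (0, 0) under both
    transformations, so the velocity is simply x'/t'.
    """
    x_prime, t_prime = transform(direction * C * t, t, v)
    return x_prime / t_prime


v = 3.0e4  # observer speed in m/s (the earth's orbital speed, rounded)
for name, transform in (("Galilean", galilean), ("Lorentz", lorentz)):
    right = pulse_velocity(transform, +1, v)
    left = pulse_velocity(transform, -1, v)
    print(f"{name:8s} right: {right:.6e}  left: {left:.6e}")
```

Under the Galilean equations the moving observer finds two different speeds for light, which is what singles out the aether frame; under the Lorentz transformations no such distinction appears.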
Thus the transformation of Maxwell's electromagnetic equations from one frame of reference to another moving with a constant velocity with respect to the first frame, by means of the Galilean transformation, showed that Maxwell's equations did not behave in the same way as Newton's laws of mechanics. The fact that the Galilean transformations do apply to the Newtonian laws of mechanics but not to Maxwell's laws of electromagnetism requires us to choose one of the following alternatives.
Einstein's Special Theory of Relativity.
In 1905, before many of the experiments were performed,
Einstein (1879-1955), apparently unaware of several important papers
on the subject, published a paper, titled
"On the Electrodynamics of Moving Bodies",
in the same volume of the Annalen der Physik as his
paper on the Brownian motion. He sets forth the relativity theory
of Poincaré and Lorentz with some modifications, and he asserts
as a fundamental principle the constancy of the velocity of light,
that is, that the velocity of light in vacuo is the same
in all systems of reference which are moving uniformly relative
to each other. He wrote,
"...for all coordinate systems for which the mechanical equations hold, the equivalent electrodynamical and optical equations hold also. ... In the following we make this assumption (which we shall subsequently call the Principle of Relativity) and introduce the further assumption -- an assumption which is at first sight quite irreconcilable with the former one -- that light is propagated in vacant space with a velocity c which is independent of the nature of motion of the emitting body. These two assumptions are quite sufficient to give us a simple and consistent theory of electrodynamics of moving bodies on the basis of the Maxwellian theory for bodies at rest."
These two assumptions of Einstein may be restated as follows.
Einstein determined three relationships, using the Lorentz transformations, between two inertial systems (that is, systems between which there is no relative acceleration or rotation) that are moving with a velocity v relative to one another.
Einstein showed that with this definition Newton's Third Law still holds for collisions between massive bodies. The total momentum of a system of colliding bodies is unchanged by the collision; and further, only with this relativistic definition of momentum is momentum conserved according to the observer, no matter with what speed he is moving relative to the bodies. For bodies that act on each other at a distance, the situation is a little more complicated. For example, the electric and magnetic forces are propagated by electromagnetic fields. Einstein showed that Maxwell's complete theory of electromagnetic fields, together with the waves that can propagate with the field, is consistent with the Special Theory of Relativity. To deal with gravitational forces Einstein had to develop his General Theory of Relativity.
Relativistic Energy.
Einstein was impelled by his new transformation equations relating time,
space and velocities as measured by two separated and moving systems of
reference, to apply the transformation equations to kinetic energy and
radiant energy. In a second paper in 1905, titled, "Does the Inertia
of a Body Depend Upon Its Energy Content?" Einstein showed
that there is a form of energy associated with the mass of a particle
and defined the total energy $E$ of a particle of
mass $m_0$ and speed $v$ to be
$$E = mc^2 = \frac{m_0 c^2}{\sqrt{1 - v^2/c^2}} = \gamma m_0 c^2.$$
This definition of the total energy of a particle results in the
conservation of energy for an isolated system, but it does not
include potential energy. Notice that the total energy of the
particle is not zero when the particle is at rest. Setting $v = 0$
results in
$$E = m_0 c^2.$$
This is the energy of the particle when it is at rest, and it is called
the rest-mass energy of the particle, $E_0$.
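As a numerical illustration of the definition above, the following sketch evaluates the total energy of a particle at rest and at a speed of 0.6 c, where γ = 1.25. The function names are hypothetical, and the electron rest mass is a rounded value used only as an example.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def lorentz_factor(v: float) -> float:
    """gamma = 1 / sqrt(1 - v^2/c^2), valid for |v| < c."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def total_energy(m0: float, v: float) -> float:
    """Total energy E = gamma * m0 * c^2 (joules, for m0 in kg, v in m/s)."""
    return lorentz_factor(v) * m0 * C ** 2


m_electron = 9.109e-31  # electron rest mass in kg (rounded, illustrative)

rest = total_energy(m_electron, 0.0)      # rest energy E0 = m0 c^2
fast = total_energy(m_electron, 0.6 * C)  # gamma = 1.25 at v = 0.6 c

print(f"E0 = {rest:.3e} J")
print(f"E(0.6 c) / E0 = {fast / rest:.4f}")
```

The ratio of the two energies is simply γ, so it does not depend on which particle is chosen.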
For velocities small compared with the velocity of light, the expression
$$\gamma = \frac{1}{\sqrt{1 - v^2/c^2}} = \left(1 - \frac{v^2}{c^2}\right)^{-1/2}$$
may be expanded by the binomial theorem,
$$(x + y)^n = x^n + n x^{n-1} y + \frac{n(n-1)}{1 \cdot 2} x^{n-2} y^2 + \cdots,$$
to the series
$$\left(1 - \frac{v^2}{c^2}\right)^{-1/2} = 1 + \frac{1}{2}\frac{v^2}{c^2} + \frac{3}{8}\frac{v^4}{c^4} + \cdots$$
with successively higher powers of $v/c$.
Since $v/c$ is a small fraction, the term in $v^4/c^4$ and the higher
terms may be neglected, and the expression above for the total
energy then becomes
$$E \approx m_0 c^2 \left[1 + \frac{1}{2}\frac{v^2}{c^2}\right] = m_0 c^2 + \frac{1}{2} m_0 v^2.$$
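How good the low-velocity approximation is can be judged numerically. The sketch below compares the classical term ½m0v² with the exact kinetic energy, taken here as the total energy minus the rest energy, (γ - 1)m0c². The function names and the sample speeds are assumptions made for the example.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def kinetic_exact(m0: float, v: float) -> float:
    """Relativistic kinetic energy: total energy minus rest energy."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return (gamma - 1.0) * m0 * C ** 2


def kinetic_classical(m0: float, v: float) -> float:
    """Classical kinetic energy (1/2) m0 v^2."""
    return 0.5 * m0 * v ** 2


def relative_error(beta: float, m0: float = 1.0) -> float:
    """Fraction of the exact kinetic energy missed by the classical term."""
    v = beta * C
    exact = kinetic_exact(m0, v)
    return (exact - kinetic_classical(m0, v)) / exact


for beta in (0.01, 0.1, 0.5):
    print(f"v/c = {beta:4.2f}: relative error = {relative_error(beta):.2e}")
```

To leading order the neglected term 3/8(v⁴/c⁴) makes the relative error grow as roughly ¾(v/c)², so the classical expression is excellent at everyday speeds and poor at a sizable fraction of the speed of light.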
This was interpreted by Einstein to mean that the energy associated
with any particle is composed of two types: its permanent "rest
energy" $m_0 c^2$ and its classical kinetic energy
$\frac{1}{2} m_0 v^2$.
The last term on the right side of this equation has the same form
as the classical kinetic energy $K$.
But the first term on the right side, $m_0 c^2$,
is probably the best known and the most dramatic consequence of the Special
Theory of Relativity. It is called the "rest energy"
$E_0$ of the particle and it identifies the mass of the
particle with energy. The rest energy is by definition the energy of the
particle at rest, when $v = 0$ and $K = 0$. The total
energy of the particle is the sum of its rest energy and its kinetic energy:
$$E = E_0 + K.$$
This identification of mass and energy as given in the form of the
famous equation
$$E = mc^2$$
governs the transformation of mass into energy and vice versa.
Through the application of this equation
many previously unexplained physical phenomena of the universe,
such as the apparently inexhaustible source of heat of the sun
and the stars, transmutation of radioactive elements, and other
nuclear processes, were understood. And its application led to
the development of atomic, or more accurately, nuclear energy.
It does not tell us how to convert mass into energy, but identifies
the amount of energy that is equivalent to the mass of the particle.
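The size of the energy equivalent can be illustrated with a short calculation. This is only a sketch: the function name is hypothetical and the mass of one gram is an arbitrary example, not a figure from the discussion above.

```python
C = 299_792_458.0  # speed of light, m/s


def rest_energy(m0_kg: float) -> float:
    """Energy equivalent of a rest mass, E0 = m0 * c^2, in joules."""
    return m0_kg * C ** 2


one_gram = 0.001  # kg (illustrative mass)
energy_joules = rest_energy(one_gram)

print(f"energy equivalent of one gram: {energy_joules:.3e} J")
```

Because c² is so large, even a very small mass corresponds to an enormous amount of energy, close to 9 × 10¹³ joules for a single gram.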
Experimental Verification of the Theory.
The theory of relativity was designed to agree with the experimental
fact that the velocity of light is observed to be the same in frames of
reference which are in uniform translation with respect to each
other. But in addition to having achieved this, the theory predicts
a number of new phenomena, such as time dilation, length contraction,
relativistic increase in mass, and a relation between mass and
energy. This is to be expected from a scientific theory. A scientific
theory is accepted only tentatively until its predictions
are verified by experiment. The first verification
of the special theory of relativity was performed in 1909 by Bucherer.
It consisted of a measurement of the masses of high-velocity
electrons, and it showed that the mass of the electron increased
according to the relativistic mass-increase relation. A number of
other experimental verifications of length contraction and time
dilation have been performed. One of the clearest of these involved
measuring the lifetime of unstable particles called mesons at
various velocities. The relation between mass and energy
has been verified by an overwhelming amount of evidence, including the
atomic bomb and nuclear energy. The predictions of the special theory
of relativity have been confirmed at every point, and there is
now universal acceptance of its validity.
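The lifetime measurements mentioned above rest on the standard time dilation relation, in which a particle with lifetime τ0 in its own rest frame is observed to live for γτ0 when it moves with speed v. The following sketch tabulates that relation. The function name is hypothetical, and the rest-frame lifetime of 2.2 microseconds is an illustrative value, roughly that of the muon (historically called the mu-meson); it is not taken from any particular experiment.

```python
import math


def dilated_lifetime(tau0: float, beta: float) -> float:
    """Laboratory lifetime gamma * tau0 of a particle moving at v = beta * c."""
    return tau0 / math.sqrt(1.0 - beta ** 2)


tau0 = 2.2e-6  # rest-frame lifetime in seconds (illustrative value)

for beta in (0.0, 0.6, 0.8, 0.99):
    lab = dilated_lifetime(tau0, beta)
    print(f"v/c = {beta:4.2f}: lifetime = {lab * 1e6:6.2f} microseconds")
```

The faster the particles move, the longer they are observed to survive, which is the effect the lifetime experiments set out to test.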