Chapter 6: Interferometry: The basic principles (dullemond/lectures/obsastro_2010/...) [PDF Document] (2024)


Chapter 6

Interferometry: The basic principles

We have seen that the size of the telescope sets a limit on the spatial resolution of our images. There is a practical limit to telescope sizes, which would mean that we would never be able to achieve resolutions beyond that limit. The solution to this technical problem is to use the technique of interferometry.

In this chapter we will discuss the basic concepts of interferometry and its various incarnations. We will also discuss some of the basics of coherence of light. The topic of interferometry is, however, too broad to be covered entirely in one chapter. We therefore refer to the literature in the list below for further details.

I (CPD) owe thanks to Tom Herbst (MPIA) for his extremely fruitful and eye-opening 15 minute coffee-break explanation of interferometry with the LBT on 19 May 2010, which resulted in Sections 6.12 and 6.13.

Literature:
- Lecture notes of the IRAM summer school on interferometry: http://iram.fr/IRAMFR/IS/school.htm
- Lecture notes about interferometry from NRAO: http://www.cv.nrao.edu/course/astr534/PDFnew.shtml
- Högbom, J.A., “Aperture Synthesis with a Non-Regular Distribution of Interferometer Baselines”, 1974, Astronomy & Astrophysics Supplement, Vol. 15, p. 417
- Thompson, A.R., Moran, J.M. & Swenson, G.W. Jr., “Interferometry and Synthesis in Radio Astronomy”, 1986, John Wiley and Sons, ISBN 0-471-80614-5

6.1 Fizeau interferometry

Perhaps the most basic technique of interferometry is Fizeau interferometry, named after Hippolyte Fizeau (1819-1896), a French physicist who first suggested to use interferometry to measure the sizes of stars. The idea is simple: just take the light of all your telescopes and project it, using a series of well-placed mirrors, all on the same image plane, as if the mirrors were all part of one huge mirror. If this is done such that the light from each of the telescopes arrives at the image plane at exactly the same time, the beams of light from all telescopes combined produce a point spread function (PSF) that is the Fourier transform of the combined apertures of the telescopes. This is as if we actually have a huge mirror, but put black paint on it everywhere except for two or more circular regions.

We know from chapter 3 (see Fig. 3.7) that one can compute this “combined PSF” easily using a Fast Fourier Transform (FFT) algorithm applied to the “image” of the aperture. Let us, for now, take the example of a pair of telescopes next to each other, like the Large


Figure 6.1: The PSF of a pair of two apertures next to each other, computed in exactly the same way as in Fig. 3.7. Left: The aperture pair. Right: The PSF of this pair, computed as the amplitude-squared of the Fourier transform of the pair of apertures (brightness is plotted logarithmically). This is the principle of a Fizeau interferometer, and this example is inspired by the LINC-NIRVANA interferometer to be installed on the Large Binocular Telescope.

Binocular Telescope (LBT) on Mount Graham in Arizona, which has participation from the Max-Planck-Institute for Astronomy in Heidelberg. Instead of having a single circular aperture, we have two circular apertures next to each other. If we compute the Fourier transform of this pair of apertures we get a PSF that is a product of the PSF of each single telescope and a wave-shaped modulation on top, called fringes. This is shown in Fig. 6.1. The result of such an interferometer is an image of the sky with a PSF which has the shape shown in Fig. 6.1. Each star on the image thus looks like an Airy disk multiplied by a fringe pattern (wavy modulation). Since the light from the two mirrors is combined on the image plane, Fizeau interferometers are called focal plane interferometers. At the Max-Planck-Institute for Astronomy in Heidelberg they are currently building the LINC-NIRVANA instrument for the LBT, which will do exactly this kind of interferometry.
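The computation behind Fig. 6.1 can be sketched with a standard FFT. This is a minimal sketch: the grid size, aperture radius, and separation below are arbitrary illustration values, not LBT parameters.

```python
import numpy as np

# PSF of two circular apertures: |FFT(aperture)|^2, as in Fig. 6.1.
N = 512
y, x = np.mgrid[0:N, 0:N] - N // 2
aperture = np.zeros((N, N))
R = 20          # aperture radius in pixels (assumed illustration value)
sep = 60        # center-to-center separation in pixels (assumed)
for cx in (-sep // 2, sep // 2):
    aperture[(x - cx) ** 2 + y ** 2 <= R ** 2] = 1.0

psf = np.abs(np.fft.fftshift(np.fft.fft2(aperture))) ** 2
psf /= psf.max()

# A cut through the PSF center shows an Airy-like single-aperture
# envelope modulated by fringes of period N/sep pixels.
cut = psf[N // 2, :]
```

Plotting `np.log10(cut + 1e-12)` reproduces the qualitative structure of the right panel of Fig. 6.1: a broad envelope set by the single aperture, multiplied by a fine fringe pattern set by the pair separation.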

If we would join more than just two mirrors, more modulations of the original single-mirror PSF would appear, and the PSF gets more and more centrally concentrated with fewer “sidelobes”. It starts to look more and more like the PSF of a single huge mirror. This technique has been used on the Multi-Mirror Telescope (MMT) in Arizona, which featured six 1.8-meter mirrors on a single mounting. Note however that today the MMT no longer has six mirrors: it now features a single 6.5 m mirror. However, the experience gained with the MMT is now put to use for the LBT and for the design and future operation of the LINC-NIRVANA interferometer.

Let us now, as a Gedankenexperiment, add so many mirrors that they all touch each other, and also fill the space in between with mirror. We have then created one single huge mirror. The modulations now all conspire to produce the Airy pattern for the huge mirror, i.e. much smaller in angular scale than the single-telescope PSF. This is what it should be, because we have now indeed created a true huge mirror.

Now let us go back to the case of the LBT with its future LINC-NIRVANA interferometer: the PSF has this unusual shape of a single-mirror PSF multiplied by a wave. If we would use such a camera to image a field on the sky with extended sources (like e.g. a


galaxy or a spatially resolved image of a solar system planet), then we would still get image degradation (smearing) over a typical scale of the original single-mirror PSF, though with a central peak. It seems that we have gained only partially in terms of angular resolution. The solution lies in the following:

1. Make multiple images over one night, to allow the field rotation caused by the Earth’s rotation to rotate the baseline of the LBT over your object, so that the wavy modulation (fringe pattern) occurs in different directions for each image.

2. Use computer software to “deconvolve” the combined set of images into a single high spatial resolution image. The trick here is to ask the computer to find the image which, when convolved with each of the PSFs of each of the observed images, yields the best match to the actual observed images [1].
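Step 2 can be sketched as a least-squares problem: find the image that, convolved with each fringe PSF, best matches all frames. The sketch below solves this in closed form in Fourier space; the function name and the Wiener-style regularization `eps` are illustrative assumptions, not the specific method of the paper cited in the footnote.

```python
import numpy as np

# Multi-frame least-squares deconvolution sketch: minimize
# sum_k || h_k * x - y_k ||^2 over images x, solved per Fourier mode.
def multi_frame_deconvolve(frames, psfs, eps=1e-3):
    num = np.zeros_like(np.fft.fft2(frames[0]))   # complex accumulator
    den = np.full(frames[0].shape, eps)           # regularized denominator
    for y, h in zip(frames, psfs):
        H = np.fft.fft2(np.fft.ifftshift(h))      # PSF centered at origin
        num += np.conj(H) * np.fft.fft2(y)
        den = den + np.abs(H) ** 2
    return np.real(np.fft.ifft2(num / den))
```

Because each PSF covers different spatial frequencies (fringes in different directions), the summed denominator is large over a wider region of the Fourier plane than for any single frame, which is exactly why combining rotated fringe patterns helps.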

Fizeau interferometers are particularly useful for cases where the telescope mirrors are close together. This is because in that way a relatively large part of the hypothetical huge mirror is accounted for by the real mirrors. The PSF fringes are, in this case, just a factor of two or three times as narrow as the single-mirror PSF, which helps the computer software very much to find the high-resolution image. If one would use Fizeau interferometry for mirrors that are very far away from each other (measured in units of the mirror size), then each single-mirror PSF would be modulated by a very fine fringe pattern. While this formally yields an angular resolution of a single fringe bump, it contains too little information for computer software to “deconvolve” the image to the sharpness of the single fringe bumps. This is not surprising since the two mirrors, even when accounting for field rotation, would cover only a fraction of the surface area of the hypothetical huge mirror. One can thus not reasonably expect to be able to create images anywhere near what one would obtain with the huge mirror.

The solution would lie in using many such mirrors, each subject to field rotation. In fact, this is what is done in radio interferometry and in many areas of infrared interferometry. However, the above computer technique to retrieve the high-resolution image would then quickly become impractical: one would have to work with thousands of images, and the method becomes prohibitively slow. Instead, for such cases one resorts to different techniques. But to understand those, we will have to have a closer look at the concept of coherence.

6.2 Coherence in time

In reality astrophysical radiation rarely behaves as a perfect monochromatic wave front. With the exception of laser/maser emission, most astrophysical emission is incoherent: radiation emitted at one location does not care about radiation that has been emitted at another location. Yet, the fact that light behaves as a wave shows that even if astrophysical radiation is largely incoherent, some degree of coherence must automatically arise.

Let us do a simple thought experiment which is not mathematically strict, but illustrative. Consider a long tube of hot gas. At the leftmost end an atom emits radiation through the tube. This radiation is a wave, the electric field of which we denote by E₁(x, t). In this

[1] A nice paper describing such computer methods to be used for LINC-NIRVANA is: Bertero & Boccacci (2000), A&A Suppl. 144, 181.


analysis we ignore the vector character of the electromagnetic field: we assume that it is perpendicular to the direction of propagation (E⃗·k⃗ = 0, where k⃗ is the wave vector), and we focus on just one of the two polarization components. Next to it another atom emits radiation, given by E₂(x, t), and next to that E₃(x, t), etc., until we reach atom N, the final atom, which emits E_N(x, t). We assume that all these radiation sources emit radiation of the same amplitude A, but mutually non-coherently, so all these waves have completely arbitrary phases compared to each other. In the complex plane this means we add N vectors of equal length but arbitrary direction. Assume now also that over a long time the phases of all these emitting particles gradually shift randomly, so that the sum vector assumes various incarnations of the sum of N randomly oriented vectors of length A. According to standard statistics we then know that the average length of the resulting vector is √⟨E*E⟩ = √N A. The intensity is I = ⟨E*E⟩ = N A². The intensity therefore scales linearly with N, precisely what one would expect for incoherent emitters, and precisely what one would expect from the particle description of radiation: each atom emits its own photons, not caring about any other photons around; the photon stream obeys Poisson statistics [2]. This is an example of how incoherent radiation still produces a wave which, in itself, has some degree of coherence.
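The √N statistics of summed random phasors can be checked numerically. This is a toy simulation; N, the amplitude, and the number of trials are arbitrary choices.

```python
import numpy as np

# N incoherent emitters of equal amplitude A with random phases:
# the mean intensity <|sum E|^2> should come out close to N * A^2.
rng = np.random.default_rng(0)
N, A, trials = 100, 1.0, 20000
phases = rng.uniform(0.0, 2.0 * np.pi, size=(trials, N))
E = (A * np.exp(1j * phases)).sum(axis=1)   # summed field, one per trial
I_mean = np.mean(np.abs(E) ** 2)            # close to N * A^2 = 100
```

The individual trial intensities scatter widely (they follow an exponential distribution), but their mean converges to N A², the incoherent sum.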

The above thought experiment only works, however, if the phases of the emitting particles gradually drift, so that on average we get the √N behavior of the resulting radiation. The question is: for how long will a wave keep its phase? In other words: what is the distance along a wave over which one can assume phase stability? To study this, let us introduce the autocovariance Γ(τ) of the wave, as measured at a given location for a time lag τ:

   Γ(τ) ≡ ⟨E*(t) E(t+τ)⟩ ≡ (1/T) ∫₀ᵀ E*(t) E(t+τ) dt        (6.1)

where t is time and T is sufficiently large that it can be considered “semi-infinite”, i.e. a long time compared to one wave period. For visible light a measurement of one second already amounts to about 10¹⁵ wave periods – “infinitely many” for most practical purposes. The flux F in units of erg cm⁻² s⁻¹, for the case of a (nearly) plane wave, is then

   F = (c/4π) ⟨E*(t) E(t)⟩ = (c/4π) Γ(0)        (6.2)

This is a real quantity, because Γ(0) has no imaginary component. However, Γ(τ) is in general a complex quantity. Define the autocorrelation R(τ) as

   R(τ) = Γ(τ)/Γ(0)        (6.3)

Its amplitude tells how much the signal stays correlated over a time span τ, i.e. how much “phase memory” the signal has over that time span. The autocorrelation function therefore allows us to define a typical coherence time τ_c, such that |R(τ)| ≳ 1/e for τ < τ_c.
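A minimal numerical illustration of estimating |R(τ)| and reading off a coherence time, assuming a toy wave whose phase performs a random walk (the carrier is dropped since it does not affect |R|, and the diffusion strength is an arbitrary choice):

```python
import numpy as np

# Wave with a random-walk phase: |R(tau)| decays as exp(-sigma^2 tau / 2),
# so the 1/e coherence time is about 2/sigma^2 time steps.
rng = np.random.default_rng(1)
n, sigma = 50000, 0.1
phase = np.cumsum(rng.normal(0.0, sigma, n))   # gradually drifting phase
E = np.exp(1j * phase)

def R(tau):
    # Time-average estimate of <E*(t) E(t+tau)> / <|E|^2> (here <|E|^2>=1)
    return np.mean(np.conj(E[:-tau]) * E[tau:])

taus = np.arange(1, 600)
absR = np.array([abs(R(t)) for t in taus])
tau_c = int(taus[np.argmax(absR < 1 / np.e)])
# tau_c comes out of order 2/sigma**2 = 200 steps
```

The faster the phase drifts (larger `sigma`), the shorter the phase memory and the smaller τ_c, which is the qualitative content of this section.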

Now coming back to our original question: What is the value of this coherence time? Interestingly, the answer lies as much in the technique of observation as it does in the process of emission. This is because the minimal coherence time is inversely proportional

[2] Keep in mind, however, that this example breaks down if the medium becomes optically thick, since then stimulated emission becomes important even if no masering occurs.


to the wavelength bandwidth at which we observe our astronomical signal. And for an emission line it is also inversely proportional to the line width.

To see this, let us start with the utopian case of a perfectly monochromatic wave. In such a case the phase memory is by definition perfect, and the amplitude of the autocorrelation function R(τ) is 1 for all values of τ. This is because the Fourier transform of a delta function is a perfect wave. However, if we modify the wave slightly by introducing a gradual loss of phase memory over the coherence time scale τ_c, the Fourier transform is no longer a perfect delta function. It will be a slightly broadened peak with a width

   Δν ≈ 1/τ_c        (6.4)

In other words: a certain degree of incoherence is automatically related to a certain degree of non-monochromaticity. By using a narrow wavelength filter in our telescope detector we automatically increase the minimal coherence time. However, if we measure an emission line which is narrower than our filter, then the actual coherence time is longer than that set by our filter: it is set by the line width.
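The inverse relation Δν ≈ 1/τ_c can be illustrated with band-limited noise: white noise filtered to a bandwidth Δν keeps phase memory for roughly 1/Δν. The center frequency, bandwidth, and top-hat filter shape below are arbitrary illustration choices.

```python
import numpy as np

# Build noise with a top-hat spectrum of width dnu around nu0, then
# measure its coherence time from the autocorrelation (Wiener-Khinchin).
rng = np.random.default_rng(2)
n, dt = 2 ** 16, 1.0
spec = rng.normal(size=n) + 1j * rng.normal(size=n)   # white spectrum
freq = np.fft.fftfreq(n, dt)
nu0, dnu = 0.25, 0.01                       # center frequency, bandwidth
spec[np.abs(freq - nu0) > dnu / 2] = 0.0    # keep only the band
E = np.fft.ifft(spec)

acf = np.fft.ifft(np.abs(np.fft.fft(E)) ** 2)   # autocovariance vs lag
absR = np.abs(acf) / np.abs(acf[0])
tau_c = int(np.argmax(absR < 1 / np.e))
# tau_c comes out of order 1/dnu = 100 time steps
```

Narrowing `dnu` lengthens the measured coherence time, exactly the behavior of a narrow filter described above.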

6.3 Coherence in time and space

Let us now generalize the notion of coherence to time and space. Consider two spatial locations r⃗₁ and r⃗₂ where we measure the radiation field. We can now define the quantity

   Γ₁₂(τ) ≡ ⟨E₁*(t) E₂(t+τ)⟩        (6.5)

where E₁ is shorthand for E(r⃗₁). Let us further define

   R₁₂(τ) = Γ₁₂(τ) / √(Γ₁₁(0) Γ₂₂(0))        (6.6)

If r⃗₁ and r⃗₂ lie along the wave vector k⃗ (i.e. along the propagation of the light), then the spatial coherence and the time coherence are in fact the same thing. Let us, in this particular case, write R₁₂(τ) as R(l, τ). We then have

   |R(cτ, 0)| = |R(0, τ)|        (6.7)

One can therefore, in this case, define the coherence length l_c directly from the coherence time τ_c: l_c = c τ_c.
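As a worked example of this relation: for a filter of width Δλ around wavelength λ, the bandwidth is Δν = cΔλ/λ², so l_c = c/Δν = λ²/Δλ. The filter values below are illustrative, chosen to resemble a near-infrared band; they are not from the text.

```python
# Coherence length for a filter of width dlam around wavelength lam.
c = 2.998e8      # m/s
lam = 2.2e-6     # m (assumed illustration value, K-band-like)
dlam = 0.4e-6    # m (assumed filter width)

dnu = c * dlam / lam ** 2    # bandwidth in Hz
tau_c = 1.0 / dnu            # coherence time, Eq. (6.4)
l_c = c * tau_c              # = lam**2 / dlam, about 12 microns here
```

This micron-scale l_c is why the delay lines discussed later in this chapter must be accurate to a fraction of a micron.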

However, if r⃗₁ and r⃗₂ are not along the k⃗ vector, then things become more complicated and we need to visit the van Cittert-Zernike theorem.

6.4 Van Cittert - Zernike theorem

The van Cittert - Zernike theorem is named after Pieter Hendrik van Cittert (1889-?) and Frits Zernike (1888-1966). Zernike was a professor in Groningen, the Netherlands, and obtained the Nobel prize for physics in 1953 for the invention of the phase contrast microscope. In the simplified version we will focus on here, the van Cittert - Zernike theorem addresses the question of the degree of spatial coherence of emission from some object(s)


Figure 6.2: Pictograms used for the derivation of the van Cittert - Zernike theorem. Left: a source of a size smaller than the wavelength emitting pseudo-monochromatic radiation, which is detected at two positions on the screen: position 1 and 2. Right: same setup, but with the source moved away to a very large distance so that the rays are effectively parallel.

on the sky impinging on a screen. The situation is pictographically shown in Fig. 6.2-left. In this pictogram a small area located at the point marked “a” emits radiation, which is received at locations “1” and “2”. We assume that the size of the emitting region at “a” is roughly one wavelength of the emitted radiation. Later we will integrate over zillions of regions “a”, but for now we just take a single emitting region. Let us write the distance between the source and point “1” as d₁ and define d₂ likewise. We assume that d₁, d₂ ≫ |r⃗₂ − r⃗₁|, so that for many (but not all!) purposes we can use the average distance d = (d₁ + d₂)/2.

Let us now assume that the emitting region “a” emits perfectly monochromatic radiation at wavelength λ. The electric field at the location “a” can then be written as

   E_a(t) = A_a e^(−2πiνt)        (6.8)

where A_a is a complex number that does not change in time. The electric field emerging from this source falls off, as we know from electrodynamics theory, as |E| ∝ 1/d, i.e. inversely proportional to the distance from “a”. In addition, there is a phase lag between points “a” and “1” (or equivalently “2”) because light emitted at time t at point “a” will reach point “1” at time t + d₁/c. The electric field at point “1” can thus be written as

   E₁(t) ≈ (√ΔS_a / d) A_a exp[−2πiν (t − d₁/c)]        (6.9)

and likewise for E₂(t). Here ΔS_a is the surface area of the emitting region at “a”. We used d₁ ≈ d in the denominator of the above expression, but in the exponent we kept d₁. But why did we use the square root of ΔS_a? This is because when we later integrate over many emitting regions, we must account for the fact that these emitting regions are mutually non-coherent. Analogous to the description in Section 6.2, when we add the emission of N non-coherent regions, we get – on average – an electric field amplitude of √N times that of a single region. The field in Eq. (6.9) therefore scales with √ΔS_a.

Our purpose was to study the spatial coherence between points “1” and “2”, so let us compute

   ⟨E₁*(t) E₂(t)⟩ = (1/T) ∫₀ᵀ (ΔS_a/d²) A_a* A_a exp[2πiν (d₂ − d₁)/c] dt        (6.10)
                  = (ΔS_a/d²) A_a* A_a exp[2πiν (d₂ − d₁)/c]        (6.11)


For the special case of r⃗₂ = r⃗₁ we obtain the flux emerging from this emitting region and measured at the screen:

   F = (c/4π) ⟨E₁*(t) E₁(t)⟩ = (c/4π) (ΔS_a/d²) A_a* A_a        (6.12)

But if r⃗₁ ≠ r⃗₂ we must find a useful expression for the distance difference d₂ − d₁. For this, let us remind ourselves that d₁, d₂ ≫ |r⃗₂ − r⃗₁|, so that the two rays are essentially parallel, as shown in Fig. 6.2-right. The quantity L shown in the figure is the path difference that we need. It is given by L = d₁ − d₂ = |r⃗₂ − r⃗₁| sin θ, where θ is the angle toward the source, measured from the normal of the screen (see figure). We can write this more practically using vector notation. The vector n⃗ shown in the figure is a unit vector pointing toward the source. We then have

   d₁ − d₂ = n⃗ · (r⃗₂ − r⃗₁)        (6.13)

so that we can write the coherence as

   ⟨E₁*(t) E₂(t)⟩ = (ΔS_a/d²) A_a* A_a exp[−2πiν n⃗·(r⃗₂ − r⃗₁)/c]        (6.14)

The inner products are taken in 3-D. But if the screen is purely 2-D, we can cast this inner product into a 2-D version. Let us take the z-axis to be perpendicular to the screen. If we define r⃗ ≡ r⃗₂ − r⃗₁ we thus have

   n⃗ · (r⃗₂ − r⃗₁) ≡ n⃗ · r⃗ = n_x r_x + n_y r_y        (6.15)

because r_z = 0, as r⃗ lies in the plane of the screen. We can now interpret (n_x, n_y) as a 2-D vector describing the angular position of our emitting object on the sky and (r_x, r_y) as a 2-D baseline on the screen. We obtain

   ⟨E₁*(t) E₂(t)⟩ = (ΔS_a/d²) A_a* A_a exp[−2πiν (n_x r_x + n_y r_y)/c]        (6.16)

This is, for the case of a single emitting area “a”, our final result for the moment. It shows that the radiation fields between points “1” and “2” on the screen are perfectly correlated, but have a phase shift of

   Δφ(r_x, r_y) = 2πν (n_x r_x + n_y r_y)/c        (6.17)

So far this is not really surprising, as it is what one would expect from a plane-parallel wave impinging on the screen at an orientation n⃗.

But now let us integrate this result over an entire region on the sky, i.e. a continuous series of regions “a”. To do this, we have to relate steps in n_x and n_y to the surface area ΔS_a:

   dS_a = d² dn_x dn_y        (6.18)

We thus get

   ⟨E₁*(t) E₂(t)⟩ = ∫∫ A*(n_x, n_y) A(n_x, n_y) exp[−2πiν (n_x r_x + n_y r_y)/c] dn_x dn_y        (6.19)

where A(n_x, n_y) now depends on n_x and n_y, as we now integrate over different regions on the sky. Looking carefully at the above equation you may notice that this is in fact a


2-D Fourier integral with r_x/λ and r_y/λ as the wave numbers in the x and y directions (with λ = c/ν). The integrand that is Fourier transformed is A*(n_x, n_y) A(n_x, n_y), which is in fact the intensity I(n_x, n_y). The correlation between points “1” and “2” is therefore the Fourier transform of the image on the sky. This is the van Cittert - Zernike theorem, and it stands at the basis of interferometry.
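This statement can be checked numerically for a toy sky of two equal point sources: the correlation versus baseline is then the Fourier transform of two delta functions, i.e. a cosine. The separation, wavelength, and baseline range below are arbitrary illustration values.

```python
import numpy as np

# Eq. (6.19) for a sky of two equal point sources, evaluated as a
# discrete sum over the sources (all quantities in arbitrary units).
lam = 1.0                   # wavelength, lam = c/nu
dn = 1e-3                   # angular separation of the two sources
n_positions = [0.0, dn]     # n_x of each source, n_y = 0

def corr(rx):
    return sum(np.exp(-2j * np.pi * nx * rx / lam) for nx in n_positions)

rx = np.linspace(0.0, 2000.0, 400)          # baseline in units of lam
V = np.abs([corr(r) for r in rx]) / 2.0     # normalized "visibility"
# Analytically |V| = |cos(pi * dn * rx / lam)|: first null at rx = lam/(2*dn)
```

The baseline at which |V| first drops to zero thus measures the source separation, which is the operational content of the theorem for a binary star.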

6.5 Delay lines and the uv-plane

To reconstruct the image on the sky from interferometry measurements we must, according to the van Cittert - Zernike theorem, measure the correlation of the electric field between as many pairs of points in the “pupil plane” as possible. By “pupil plane” we mean the plane parallel to the wave front of the object we wish to observe. In Fig. 6.3 this is indicated with the dotted line. However, in practice our telescopes are usually not located in the same plane, except for the LBT where both mirrors are on the same mounting. One can solve this problem by effectively delaying the signal received by the telescope closest to the source (the right one in Fig. 6.3) in a delay line. For optical interferometers this is simply a tunnel with mirrors into which the light is diverted, such that the path length of the light from both telescopes to the instrument (correlator) is equal. Since the Earth is rotating, and thus the relative projected distance between the telescopes (called the projected baseline) changes accordingly, the length of the required delay path in the delay line must be continuously adapted. At the VLT this is done with mirrors mounted on a “train” that moves steadily along rails at a very precisely tuned velocity. This must be a very high-precision device, since the delay path must be accurate to within 1 µm over projected baselines of up to 200 m!

If we have N telescopes, then we need N − 1 delay lines to bring them all effectively into the same pupil plane.

One may ask: why do we have to shift all telescopes into the same pupil plane? If we have a perfect plane wave hitting our telescopes, we would have interference even if the telescopes are several periods of the wave out of phase. A perfect plane wave always interferes with itself, since sin(φ) = sin(φ + 2πn), even if n = 100000 or more. The answer is related to the coherence length. As we saw in Section 6.2, if your measurement is done in a finite-width wavelength band, then there is a finite coherence time τ_c, corresponding to a finite coherence length along the direction of propagation, l_c = c τ_c. If two telescopes are not brought to within a distance l_c from their mutual pupil plane, then the signals of the two telescopes become mutually incoherent, and interferometry is not possible. Taking too narrow a band would mean that you receive a very weak signal, which could kill your interferometry attempt. But the broader your band, the smaller l_c becomes and the closer you have to bring the two telescopes to their common pupil plane.

Let us now define a common pupil plane for all N telescopes. The precise position of this plane along the pointing direction (the direction perpendicular to the plane) is not important (and in fact does not affect our measurements), so we just choose one.

Now define a coordinate system in this plane: (x′, y′). Again, the precise locations of the zero-points are not important. We now want to measure ⟨E₁*(t) E₂(t)⟩ in the entire pupil plane, and thus obtain the function

   CorrE(x′₁, y′₁, x′₂, y′₂) = ⟨E₁*(t) E₂(t)⟩        (6.20)


Figure 6.3: Principle of an interferometer with delay lines to effectively “shift” the telescopes into a common “pupil plane”. In this example the right telescope is, by the delay line, effectively shifted back to the dotted plane.

Since ⟨E₁*(t) E₂(t)⟩ depends not on the absolute positions of our telescopes, but only on the relative positions x′₁₂ = x′₂ − x′₁ and y′₁₂ = y′₂ − y′₁, we have

   CorrE(x′₁₂, y′₁₂) = ⟨E₁*(t) E₂(t)⟩        (6.21)

It will turn out to be more convenient to measure these coordinates in units of the wavelength:

   u := (x′₂ − x′₁)/λ    and    v := (y′₂ − y′₁)/λ        (6.22)

This is what is called the uv-plane. It is actually nothing else than the pupil plane, scaled such that one wavelength corresponds to unity.

If we want to be able to reconstruct an image of an object on the sky, then we must measure

   CorrE(u, v) = ⟨E₁*(t) E₂(t)⟩        (6.23)

at as many (u, v) points as possible. In other words, we must have good uv-coverage. If we have N telescopes measuring simultaneously, then we have N(N − 1)/2 independent baselines. Note that the projected baseline corresponding to (u, v) is the same as the one corresponding to (−u, −v). Now, the projected baseline on the sky changes with time, because the Earth rotates. This means that each baseline describes an elliptic curve in the uv-plane, and therefore you get multiple (u, v) points (multiple projected baselines) for a single two-telescope baseline on the ground. If you measure over a substantial part of the night, you therefore get a much better uv-coverage than when you do a single short measurement.


Ideally we must cover the uv-plane perfectly, but this is never possible. We will discuss how to make sense out of measurements with imperfect uv-coverage in Section 6.9. But let us first have a look at how to measure ⟨E₁*(t) E₂(t)⟩ in the first place.

6.6 Measuring “fringes”: The “visibility” of an interferometric signal

Once we have, using delay lines, effectively shifted a pair of telescopes into their common pupil plane, we can try to measure the correlation of their signals. While in optical interferometry each telescope receives a full image, in radio interferometry a telescope typically receives just a single analog signal (i.e. one single “pixel”). Let us, in this section and the next, focus on the case where we are dealing with just a single pixel and we have to measure the correlation between these signals, as they are received by two or more telescopes.

Strictly speaking we could record the exact wave patterns of E₁(t) and E₂(t), where 1 and 2 stand for telescope 1 and telescope 2, and t is such that they are measured in the same pupil plane (i.e. with delay included). Having both signals, we could then calculate ⟨E₁*E₂⟩ directly using Eq. (B.38) with τ = 0. If we did this (which is impractical, and for optical interferometry even physically impossible), then we would have at any given time t absolute phase information: we would know the exact values of E₁ and E₂ at time t. This information is, however, useless, because with time the phase changes rapidly (it changes by 2π for each time interval Δt = 1/ν). We are more interested in the relative phase between the two telescopes. This gives information about the exact position of a source on the sky. In fact, this is exactly one of the two pieces of information in the complex number ⟨E₁*E₂⟩: writing it as A e^(iφ), the φ is this relative phase. If one can measure this, then one can do astrometry at ultra-high precision.

In practice one never records the precise wave functions E₁(t) and E₂(t). One uses other techniques. One often-used technique (in particular in radio interferometry, but sometimes also in infrared interferometry) is the technique of heterodyne interferometry, which we will deal with in Section 6.11. But to understand the principles of long-baseline interferometry (here defined as interferometry with telescopes that are not on the same mounting), we keep it a bit more simple for the moment.

The simplest way to measure the correlation between the two signals is simply to let them interfere: we redirect both signals onto a single device that measures the squared amplitude of the sum of the signals:

   S₁₂ = ⟨[E₁*(t) + E₂*(t)] [E₁(t) + E₂(t)]⟩        (6.24)

This is what you would get in optical interferometry if you simply let the two signals interfere on a CCD camera: you measure the intensity of the signal, i.e. the amplitude-squared of the sum of the signals. It is also what you get if you measure the energy output of two electric signals from two radio telescopes linked together. We can measure such a signal with standard technology (e.g. a CCD for optical interferometry). Note that Eq. 6.24 can be regarded as the spatial structure function of E(x′, y′, t), where (x′, y′) are the two spatial coordinates in the pupil plane, and E₁(t) ≡ E(x′₁, y′₁, t) and E₂(t) ≡ E(x′₂, y′₂, t).


To get from the measured quantity S₁₂ to the desired quantity ⟨E₁*E₂⟩, we write:

   S₁₂ = ⟨[E₁*(t) + E₂*(t)] [E₁(t) + E₂(t)]⟩
       = ⟨E₁*(t) E₁(t)⟩ + ⟨E₂*(t) E₂(t)⟩ + ⟨E₁*(t) E₂(t)⟩ + ⟨E₂*(t) E₁(t)⟩        (6.25)

The first two terms are simply the intensity of the object measured by each telescope: S₁ = ⟨E₁*(t) E₁(t)⟩, S₂ = ⟨E₂*(t) E₂(t)⟩. If both telescopes have the same diameter, these are the same, and we simply write S = ⟨E*E⟩. We can measure this quantity simply by shutting out one of the telescopes and measuring the signal, which is then S. Eq. 6.25 thus becomes

   S₁₂ = 2S + ⟨E₁*(t) E₂(t)⟩ + ⟨E₂*(t) E₁(t)⟩        (6.26)

The last two terms are the correlation and its complex conjugate. This is the quantity we need. So if we evaluate S₁₂ − 2S we obtain ⟨E₁*(t) E₂(t)⟩ + ⟨E₂*(t) E₁(t)⟩, which is nearly what we need. To obtain exactly what we need (the complex number ⟨E₁*(t) E₂(t)⟩, not just its real part) we can write

   ⟨E₁*(t) E₂(t)⟩ =: A₁₂ e^(iφ₁₂)        (6.27)

where A₁₂ is a real number (the amplitude of the correlation) and φ₁₂ is the relative phase. We obtain

   ⟨E₁*(t) E₂(t)⟩ + ⟨E₂*(t) E₁(t)⟩ = 2 A₁₂ cos(φ₁₂)        (6.28)

We see that Eq. 6.26 becomes

   S₁₂ = 2S + 2 A₁₂ cos(φ₁₂)        (6.29)

If the two signals are totally incoherent, then A₁₂ = 0, and we measure S₁₂ = 2S. If they are perfectly coherent (if we measure a point source on the sky), then A₁₂ = S. In that case we measure S₁₂ = 2S[1 + cos(φ₁₂)].

So how do we get from Eq. 6.29 to the complex number ⟨E₁*(t) E₂(t)⟩ that we need? This is a bit subtle. If we do a single measurement of S₁₂ and S, then we have one real(!) equation (Eq. 6.29) for two real unknowns (A₁₂ and φ₁₂). So we do not have enough information to fully reconstruct the complex value of ⟨E₁*(t) E₂(t)⟩.

To solve this impasse, we can do a trick: we slightly change the length of the delay line

to induce an extra phase difference between the two telescopes. Let us call this artificially induced phase difference Δφ. If we scan over a few wavelengths by changing Δφ smoothly, we will see that S₁₂(Δφ) follows a cosine-like curve with an offset:

   S₁₂(Δφ) = 2S + 2 A₁₂ cos(φ₁₂ + Δφ)        (6.30)

From this so-called fringe pattern we can read o! the o!set (which should be, and will beexactly 2S) and the amplitude of the cosine, which is the value of 2A12. In formulae: weread o! the maximum value of S12(&'), and call it Smax, and the minimum of S12(&'), andcall it Smin, and we then have

A12 =Smax ( Smin

4(6.31)
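To see how Eqs. 6.30 and 6.31 work in practice, here is a small numerical sketch. The values of $S$, $A_{12}$ and $\varphi_{12}$ below are arbitrary assumptions, known to the simulation but treated as unknown by the "observer":

```python
import numpy as np

# Assumed "true" values, unknown to the observer
S = 1.0        # single-telescope signal
A12 = 0.6      # correlated flux
phi12 = 0.8    # relative phase [rad]

# Scan the delay line over a few wavelengths (Eq. 6.30)
dphi = np.linspace(0.0, 4 * np.pi, 2001)
S12 = 2 * S + 2 * A12 * np.cos(phi12 + dphi)

# Read off the fringe extremes (Eq. 6.31)
A12_est = (S12.max() - S12.min()) / 4
offset = (S12.max() + S12.min()) / 2   # should be exactly 2S
```

The scan recovers $A_{12}$ and the offset $2S$ from the fringe extremes alone; the phase $\varphi_{12}$ remains entangled with the unknown zero point of the delay line, as discussed below.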

Strictly speaking, we can now go back to the real pupil plane ($\Delta\varphi = 0$), and then we obtain the relative phase $\varphi_{12}$ by solving

$$\cos(\varphi_{12}) = \frac{S_{12}/2 - S}{A_{12}} \qquad (6.32)$$


For any given domain of length $2\pi$ this has two solutions, but since we have, by varying $\Delta\varphi$, scanned the fringe pattern, we can figure out which of these two we must take. In this way we have, at least in principle, found the phase difference between telescopes 1 and 2. But in most practical circumstances this is an extremely unreliable quantity. The reason is that it is hard to calibrate an interferometer so accurately that we can find the exact pupil plane, or in other words: to find the exact length of the delay line required to bring both telescopes exactly to a known projected distance from each other. Remember that for IR interferometry we would have to do this to within a fraction of a micron, while the telescopes have distances of up to 200 meters for the VLT! In principle this would not be impossible, but in practice it is very hard. But an even more dramatic problem is the turbulence of the atmosphere: as we have seen in Chapter 4, turbulence induces phase shifts. These will be different (and mutually uncorrelated) for the different telescopes, because the telescope separation is usually larger than the Fried length $r_0$.

In most circumstances one tries to find the pupil plane (i.e. find the right delay length) by making an initial good guess and then tuning it (i.e. moving $\Delta\varphi_{12}$ over many, many wavelengths) until one finds a fringe pattern. Once you have found a fringe pattern, you know that you are within one coherence length from the pupil (see Section 6.3). That is usually the best you can do. Keeping this (unknown) distance from the pupil plane fixed can be done much more accurately. Special devices that continuously fine-tune the mirrors in the delay line (using piezo-electric elements) are called fringe trackers.

In conclusion: what we have measured is $A_{12}$, but not $\varphi_{12}$. The quantity $A_{12}$ is what is called the correlated flux and obeys

$$0 \le A_{12} \le S \qquad (6.33)$$

One can thus split the flux $S$ into a correlated part ($A_{12}$) and an uncorrelated part ($S - A_{12}$). Professional interferometrists also often use what they call the visibility $V_{12}$ (never mind the confusing name), which is

$$V_{12} = \frac{S_{\max} - S_{\min}}{4S} = \frac{A_{12}}{S} \qquad (6.34)$$

(see Eq. 6.31). All these quantities can be directly measured in the manner described above.

Also often the complex visibility is used, which is defined as

$$\mathcal{V}_{12} = \frac{\langle E_1^*(t)E_2(t)\rangle}{S} = \frac{A_{12}\,e^{i\varphi_{12}}}{S} \qquad (6.35)$$

which is, as we now know, not directly measurable with the techniques in this section: only its amplitude $A_{12}$ is measurable. But it is nevertheless a very useful quantity, as we shall see in Section 6.8.
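As a sanity check on Eq. 6.35, one can simulate two telescope signals that share a common coherent component and verify that the definition recovers the coherent fraction. The signal model below (complex Gaussian samples with an assumed coherent power fraction $g$) is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
g = 0.5  # assumed fraction of the power that is mutually coherent

def field(size):
    """Complex Gaussian field samples with unit mean power."""
    return (rng.normal(size=size) + 1j * rng.normal(size=size)) / np.sqrt(2)

common = field(n)                      # part seen identically by both telescopes
E1 = np.sqrt(g) * common + np.sqrt(1 - g) * field(n)
E2 = np.sqrt(g) * common + np.sqrt(1 - g) * field(n)

S = np.mean(np.abs(E1) ** 2)           # single-telescope signal, ~1
V12 = np.mean(np.conj(E1) * E2) / S    # complex visibility (Eq. 6.35), ~g
```

With enough samples, $\mathcal{V}_{12}$ converges to the coherent power fraction $g$, consistent with the limits discussed above ($A_{12} = 0$ for incoherent signals, $A_{12} = S$ for a point source).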

6.7 Examples of visibility signals of simple objects

TO BE DONE

6.8 The “Closure phase”

By comparing predicted visibilities of simple models with measured visibilities at different $(u, v)$ we can already obtain some information about what the object on the sky looks like (Section 6.7). But can we also reconstruct an entire image on the sky from measurements of the visibility in the uv-plane? The answer is: no. We would need the full complex visibility $\mathcal{V}(u, v)$ instead of only its amplitude $V(u, v)$. The amplitude information $V(u, v)$ tells us which Fourier components are in our image, but not their position (phase), and thus we cannot do the inverse Fourier transform.

So how do we solve this problem? Remember that we can in principle measure some phase for each baseline using Eq. 6.32, but that we rejected it because it was fraught with uncertainty. But if we study this uncertainty more carefully, we can do a trick to overcome this problem at least partially.

Suppose we have three telescopes (1, 2 and 3). Let us quantify the (unknown) phase error of each telescope with $\epsilon_1$, $\epsilon_2$ and $\epsilon_3$, respectively. If we measure the phases $\varphi_{12}$, $\varphi_{23}$ and $\varphi_{31}$ using e.g. Eq. 6.32 we obtain the following measurements:

$$\varphi_{12}^{\rm measured} = \varphi_{12} + \epsilon_1 - \epsilon_2 \qquad (6.36)$$
$$\varphi_{23}^{\rm measured} = \varphi_{23} + \epsilon_2 - \epsilon_3 \qquad (6.37)$$
$$\varphi_{31}^{\rm measured} = \varphi_{31} + \epsilon_3 - \epsilon_1 \qquad (6.38)$$

where $\varphi_{12}$, $\varphi_{23}$ and $\varphi_{31}$ are the real phases (assuming no atmosphere and no instrument errors). If we knew $\epsilon_1$, $\epsilon_2$ and $\epsilon_3$, then we would be able to retrieve these real phases from the measured phases. But since we do not know these phase errors, we cannot do this. This is in fact what we concluded in Section 6.6 and was the reason why we rejected the use of Eq. 6.32 to find the phases.

However, if we take the sum of Eqs. 6.36, 6.37 and 6.38, then we obtain

$$\varphi_{123} \equiv \varphi_{12} + \varphi_{23} + \varphi_{31} = \varphi_{12}^{\rm measured} + \varphi_{23}^{\rm measured} + \varphi_{31}^{\rm measured} \qquad (6.39)$$

This is called the closure phase. As one can see: all unknown errors have dropped out, so it is a reliable quantity. The closure phase contains information about departure from point-symmetry. Any object on the sky that is perfectly point-symmetric (e.g. an ellipse) will have zero closure phase.
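A three-line numerical check of Eq. 6.39, with arbitrary assumed true baseline phases and random per-telescope errors standing in for the atmosphere:

```python
import numpy as np

rng = np.random.default_rng(0)

phi12, phi23, phi31 = 0.3, -1.1, 0.5         # assumed true baseline phases [rad]
e1, e2, e3 = rng.uniform(-np.pi, np.pi, 3)   # unknown per-telescope phase errors

# Measured phases pick up differences of the telescope errors (Eqs. 6.36-6.38)
m12 = phi12 + e1 - e2
m23 = phi23 + e2 - e3
m31 = phi31 + e3 - e1

# Closure phase (Eq. 6.39): the telescope errors cancel exactly
closure = m12 + m23 + m31
```

However large the individual errors, the closure phase equals the sum of the true baseline phases.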

The closure phase has a very important property: it is not affected by phase shifts due to the atmosphere. Atmospheric turbulence therefore has no influence on this measurement.

For an interferometer with $N$ telescopes, we can determine the closure phase $\varphi_{ijk}$ between each triple of telescopes $i$, $j$ and $k$. Now suppose we assume the phase between 1 and 2 and between 1 and 3 to be zero (assuming that the telescopes 1, 2 and 3 are not aligned along the same line as projected on the sky); then by measuring the closure phase $\varphi_{123}$ we can calculate $\varphi_{23} = \varphi_{123}$. Now add one telescope, number 4, and measure $\varphi_{124}$ and $\varphi_{234}$. Since we know $\varphi_{12}$ and $\varphi_{23}$ we obtain the equations

$$\varphi_{12} + \varphi_{24} + \varphi_{41} = \varphi_{124} \qquad (6.40)$$
$$\varphi_{23} + \varphi_{34} + \varphi_{42} = \varphi_{234} \qquad (6.41)$$

which is two equations with two unknowns ($\varphi_{41}$ and $\varphi_{42}$). So we now know again the relative phases at all baselines. In fact, since we now have $N = 4$ telescopes, we have 6 baselines, and for all these baselines we have the amplitude and phase information, albeit that we made an assumption for two of the phases. This assumption for the two initial relative phases amounts to an assumption for the position of our object on the sky (which is two coordinates). So apart from this freedom to shift our image anywhere on the sky, we have now obtained all amplitude and phase information we need. In other words, we have (excluding the two assumed phases): $N(N-1)/2$ visibility amplitudes and $N(N-1)/2 - 2$ pieces of phase information.

Since the value of $S$ is the same for each baseline pair, we can say that an interferometer array of $N$ telescopes gives $N(N-1)/2$ values of the complex visibility $\mathcal{V}$ (of which two of the phases have been chosen ad hoc) and one value of the total flux $F$ (which is $S$ divided by the telescope aperture size). Of course, in this calculation it is assumed that none of the baselines are duplicates of each other, which is not generally guaranteed. For instance, the Westerbork telescope and the VLA have telescopes arranged at regular spacings. If you have, for instance, three telescopes in a row, and the distance between telescopes 1 and 2 is the same as that between 2 and 3, then in effect you have not 3, but just 2 independent baselines. Also, keep in mind that $\mathcal{V}(u, v) = \mathcal{V}^*(-u, -v)$, so that when you obtain $N(N-1)$ complex visibilities, you get the ones on the opposite side of the origin for free. In that sense one has in fact $N(N-1)$ complex visibility points, but only half of them are independent.
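The bookkeeping of the last two paragraphs can be wrapped in a small helper (assuming no duplicate baselines and $N \ge 3$; the function name is ours):

```python
def visibility_counts(N):
    """Counts for an N-telescope array with no duplicate baselines (N >= 3)."""
    baselines = N * (N - 1) // 2      # independent (u,v) points / amplitudes
    phases = baselines - 2            # phases recoverable up to two ad-hoc choices
    with_mirrors = 2 * baselines      # including the free V(-u,-v) = V*(u,v) copies
    return baselines, phases, with_mirrors
```

For $N = 4$ this gives 6 baselines, 4 recoverable phases, and 12 visibility points of which only 6 are independent, matching the counting in the text.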

6.9 Image reconstruction: Aperture synthesis and the CLEAN algorithm

When we do interferometry in the above described way, by measuring visibilities and closure phases, and thus constructing a set of $N(N-1)/2$ independent complex visibility points $\mathcal{V}(u, v)$ plus their mirror copies $\mathcal{V}(-u, -v)$, we may have the information we need in order to apply van Cittert-Zernike theory to reconstruct the image, but in practice this is not so trivial. The problem is that we never have a perfect coverage of the uv-plane. And in order to perform the integrals of the Fourier transformation, we would in fact need full coverage.

If we had a nice and regular uv-coverage, for instance a perfectly rectangular grid in $(u, v)$, then we could use the Fast Fourier Transform algorithm (see Section A.5) to obtain our image on the sky. But we rarely have the complex visibilities measured on such a regular grid in the uv-plane. In practice we have irregular spacings and certain uv-areas where there is little or no sampling at all. The complex visibility in these "holes" could be anything; we simply have no information about that. We could set the complex visibility in these "holes" to zero and linearly interpolate between nearby measurements in regions with sufficient sampling. What we then get, after Fourier transformation, is an image of the sky that is quite messy. Such an image is called a dirty image. It is in fact the true image convolved with the PSF corresponding to the uv-coverage. This PSF is called the dirty beam, because this PSF, due to the "holes" in the uv-coverage, has many sidelobes. A single point source on the sky will thus appear as a point source surrounded by weaker point sources, some of which can actually be rather far away from the actual point source. For reasonable uv-coverage these sidelobes are usually substantially weaker than the main peak, but they still mess up the image considerably.

People have tried many different ways to "guess" the visibility between measurement points in order to get the best possible image out of it after Fourier transformation. But in practice most of these methods have drawbacks that can be substantial. A radically different approach was proposed in 1974 by Hogbom (see literature list). In this paper it was proposed to make a dirty image using one's best guess of the inter-measurement complex visibility values. Call this image A. Also make an image B, which we initially make empty. Now do the following procedure, which is called the CLEAN algorithm:

1. Find the highest peak in the image A

2. Fit the peak of the dirty beam to this peak

3. Subtract this properly normalized dirty beam from the image A

4. Add a clean beam with the same strength at the same position in image B

5. Go back to 1 until we hit the noise level

The clean beam is a simple Gaussian-shaped peak with the same width as the main peak of the dirty beam. The shape of the clean beam can also be made elliptic, so as to account for the different spatial resolution one typically has in different directions (e.g. if a source is near the horizon). The above scheme is a slightly simplified version of the actual CLEAN algorithm, but it shows the principle.
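The five steps above can be sketched in one dimension as follows. The array sizes, loop gain and stopping threshold are illustrative choices, not part of the original algorithm specification:

```python
import numpy as np

def clean_1d(dirty_image, dirty_beam, clean_beam, gain=0.1,
             threshold=1e-6, max_iter=10000):
    """Simplified 1-D Hogbom CLEAN sketch. The beams are arrays of the same
    length as the image, centered on the middle pixel (wrap-around shifts)."""
    residual = dirty_image.astype(float).copy()   # image A of the text
    model = np.zeros_like(residual)               # image B of the text
    center = len(dirty_beam) // 2
    for _ in range(max_iter):
        peak = np.argmax(np.abs(residual))        # step 1: highest peak
        if np.abs(residual[peak]) < threshold:    # step 5: noise level reached
            break
        amp = gain * residual[peak]
        shift = peak - center
        residual -= amp * np.roll(dirty_beam, shift)   # steps 2-3: subtract dirty beam
        model += amp * np.roll(clean_beam, shift)      # step 4: add clean beam to B
    return model, residual
```

Feeding it a single point source observed with a sidelobed dirty beam, the loop iteratively transfers the flux into the clean-beam model and drives the residual to (numerical) zero.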

The CLEAN algorithm works very well if the object on the sky is a collection of point sources. If it is, on the other hand, a rather smooth configuration, it may work less well.

Also note that the typical lengths of the projected baselines determine the typical feature sizes on the sky to which you are sensitive. If you have only very long baselines, then you may see small-scale details, but you miss large-scale structures. The CLEAN procedure, by adding one clean beam at a time to your image until you reach the noise level, will thus simply not add any large-scale structure to your image. If the object does have large-scale structure, then the integrated flux of the image the CLEAN procedure produces will be less than the measured integrated flux (measured from the single-telescope signal $S$). By comparing the flux retrieved in the clean image with the single-telescope flux you can estimate how much large-scale structure is missing.

The full cycle of observing the complex visibilities, filling the uv-plane and reconstructing the image in the image plane using e.g. the CLEAN algorithm is called aperture synthesis or synthesis imaging.

6.10 Primary beam and coherent field of view

If we point our interferometer at the sky we have a spatial resolution of $\theta_i = \lambda/L$, where $L$ is our largest projected baseline. However, we cannot know for sure that we are measuring exactly a single object: the object we want to observe. We may accidentally also be picking up another nearby object or objects. We must therefore go into a bit more detail of what we are actually seeing with our interferometer.

First of all, we have the field of view of each of the telescopes individually. Let us define the radius of this field of view as $\theta_f$. For infrared or optical telescopes this is typically much larger than the telescope resolution, $\theta_b = \lambda/D$, though this may vary between instruments used on that telescope, and even between different modes of the same instrument on the same telescope. At any rate: the field of view is larger than the size of the PSF: $\theta_f \gg \theta_b$.

With radio telescopes, on the other hand, the PSF of a single telescope (here often called the "primary beam", hence our use of the index $b$ in $\theta_b$) is usually so wide that not more than a single "pixel" is in an "image". That is: each telescope simply measures one signal, not an image with multiple pixels. For millimeter-wave telescopes we are in an intermediate regime where a single telescope still has a small enough primary beam that low-resolution "images" can be made. For instance, the SCUBA-2 instrument on the 15-meter James Clerk Maxwell telescope (JCMT) has a 32x40 detector array, sensitive to 450 µm and 850 µm radiation. Each pixel on the detector has a size of 0.11 cm, not much more than the wavelength of the radiation it is measuring. However, for radio telescopes with much longer wavelengths the "pixels" get proportionally larger as well. For instance, for $\lambda = 21$ cm (the neutral hydrogen hyperfine structure line) the minimum size of a resolution element on the focal plane is of the order of 21 cm, but in practice even larger. For that reason, rather than trying to make multi-pixel images with a single radio telescope, an image is usually obtained through scanning: each pointing of the telescope delivers a single "pixel" of the image. The field of view of each individual pointing is then in fact the same as the beam size: $\theta_f \simeq \theta_b$.

In the analysis of Sections 6.6, 6.8 and 6.9 we measured the interference between the primary beam of telescope 1 and the primary beam of telescope 2. Both beams must be pointing at the same object to within an accuracy of at least $\theta_b$, of course, otherwise no interference can be expected. But in addition to the object we are interested in, any other objects in this primary beam may also participate in the interference. The question is: will the flux from some additional source at an angular distance $\theta$ from our source of interest (with $\theta$ smaller than the beam size) constructively interfere with our object, or will it simply dilute the coherence? Or equivalently: if we have two point sources A and B, separated by $\theta$ but both inside the primary beam, will they still interfere with each other, or are they too far apart to interfere? This is a concrete question, because there is an angular distance $\theta_c$ beyond which no interference is possible: the signals become decoherent. This is related to the fact that if we have a projected baseline length of $L$ and we use e.g. source A to define our "pupil plane" (phase difference 0), then the wave front from source B has an angle with respect to the pupil plane that translates into a distance difference of $l = L\theta$. If that distance is larger than the coherence length, then the two sources A and B cannot interfere with each other. Thus $\theta_c = l_c/L$. With Eq. (6.4) and $l_c = c\tau_c$ we thus obtain

$$\theta_c = \frac{c}{L\,\Delta\nu} = \frac{\lambda}{L}\,\frac{\nu}{\Delta\nu} = \frac{\lambda^2}{L\,\Delta\lambda} = \theta_i\,\frac{\lambda}{\Delta\lambda} = \theta_b\,\frac{D}{L}\,\frac{\lambda}{\Delta\lambda} \qquad (6.42)$$

Let us call this the coherent field of view. According to Eq. (6.42) this coherent field of view can be smaller than the primary beam if:

$$\frac{\Delta\nu}{\nu} = \frac{\Delta\lambda}{\lambda} \gtrsim \frac{D}{L} \qquad (6.43)$$

For sufficiently narrow bandwidth and sufficiently large dishes (or small baselines) we need not worry about the coherent field of view, as long as our source (or multi-source object) is smaller than the primary beam. But for very broad bandwidth or very large baselines we may need to worry about this. We can also see from Eq. (6.42) that the coherent field of view $\theta_c$ is larger than the interferometer resolution $\theta_i$ by a factor

$$\frac{\theta_c}{\theta_i} = \frac{\nu}{\Delta\nu} = \frac{\lambda}{\Delta\lambda} \qquad (6.44)$$

So if we, for example, wish to perform interferometry in the N band, taking the full N band (from 8 to 13 µm, roughly) for our interferometry to gain sensitivity, then we have $\lambda/\Delta\lambda \simeq 2$, meaning that our coherent field of view is just twice as large as the resolution of our interferometer!
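Plugging in numbers for this example (the baseline length is an assumed value; only the ratio $\lambda/\Delta\lambda$ matters for Eq. 6.44):

```python
lam = 10.5e-6        # central wavelength of the N band [m]
dlam = 5.0e-6        # full bandwidth, 8-13 micron [m]
L = 100.0            # assumed projected baseline [m]

theta_i = lam / L                 # interferometer resolution [rad] (theta_i = lambda/L)
theta_c = theta_i * lam / dlam    # coherent field of view (Eq. 6.42)
ratio = theta_c / theta_i         # = lambda / dlambda, about 2 (Eq. 6.44)
```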

6.11 Heterodyne receivers

In radio interferometry the technology of heterodyne receivers is usually used. The idea is to convert the frequency of the incoming signal to a lower frequency that can be better handled, more easily amplified, etc. This technique is also common in everyday life: typical radio or television receivers use this technology.

So how do we convert a signal at frequency $\nu_s$ to some intermediate frequency $\nu_0 \ll \nu_s$? We follow the discussion in the book by Thompson here (see literature list). Let us write the source signal at frequency $\nu_s$ as $V_s(t)$, which stands for the electric voltage in our receiver induced by the source on the sky that our antenna is receiving. This is obviously a very tiny voltage, as we wish to observe weak sources on the sky. Let us assume that this is a cosine wave:

$$V_s(t) = V_{s,0}\cos(2\pi\nu_s t + \varphi_s) \qquad (6.45)$$

where $\varphi_s$ is simply some offset phase. Now introduce a local oscillator: a device that produces a voltage $V_{\rm lo}(t)$ at some (different) frequency $\nu_{\rm lo}$:

$$V_{\rm lo}(t) = V_{\rm lo,0}\cos(2\pi\nu_{\rm lo} t + \varphi_{\rm lo}) \qquad (6.46)$$

Typically this voltage is much larger in amplitude than $V_s(t)$. But it has (or better: should have) a very high phase stability. Typically $\nu_{\rm lo}$ is chosen to be fairly close to $\nu_s$, i.e. $|\nu_{\rm lo} - \nu_s| \ll \nu_s$.

If we now add $V_{\rm lo}(t)$ to our source signal $V_s(t)$ (this is called mixing) we get a signal that behaves like a modulated cosine: a cosine with an amplitude "envelope" that changes at a beat frequency $\nu_{\rm beat} = |\nu_s - \nu_{\rm lo}|$. While this beat frequency is now much smaller (and thus much more easily manageable) than the original $\nu_s$, this does not yet help us much, because this modulation is only an apparent wave, not a real one. To get a real signal at this beat frequency one must introduce some non-linearity in our electric circuit.

Suppose we put our mixed voltage signal on one end of a diode for which we know that the resulting current $I$ is a highly non-linear function of the input voltage $V$ (in its most extreme case $I = 0$ for $V < 0$ and $I = V/R$ for $V > 0$, where $R$ is some resistance in front of or behind the diode). Let us use a power series to model this non-linearity:

$$I(t) = a_0 + a_1 V(t) + a_2 V(t)^2 + a_3 V(t)^3 + \cdots \qquad (6.47)$$

The first two terms are simply the linear response, and will just yield the amplitude-modulated cosine wave of the mixed signal. This is not interesting for us. But the quadratic term is interesting:

$$V(t)^2 = [V_s(t) + V_{\rm lo}(t)]^2 \qquad (6.48)$$
$$= [V_{s,0}\cos(2\pi\nu_s t + \varphi_s) + V_{\rm lo,0}\cos(2\pi\nu_{\rm lo} t + \varphi_{\rm lo})]^2 \qquad (6.49)$$
$$= V_{s,0}^2\cos^2(2\pi\nu_s t + \varphi_s) + V_{\rm lo,0}^2\cos^2(2\pi\nu_{\rm lo} t + \varphi_{\rm lo}) \qquad (6.50)$$
$$\quad + 2V_{s,0}V_{\rm lo,0}\cos(2\pi\nu_s t + \varphi_s)\cos(2\pi\nu_{\rm lo} t + \varphi_{\rm lo}) \qquad (6.51)$$


The first two terms are signals at the original $\nu_s$ and $\nu_{\rm lo}$ frequencies and are not interesting in our quest for an intermediate frequency conversion of our source signal. But the last term contains the multiplication of our two waves, and here something interesting happens. Let us write out this last term:

$${\rm last\ term} = 2V_{s,0}V_{\rm lo,0}\cos(2\pi\nu_s t + \varphi_s)\cos(2\pi\nu_{\rm lo} t + \varphi_{\rm lo}) \qquad (6.53)$$
$$= V_{s,0}V_{\rm lo,0}\cos\!\big[2\pi(\nu_s + \nu_{\rm lo})t + \varphi_s + \varphi_{\rm lo}\big] \qquad (6.54)$$
$$\quad + V_{s,0}V_{\rm lo,0}\cos\!\big[2\pi(\nu_s - \nu_{\rm lo})t + \varphi_s - \varphi_{\rm lo}\big] \qquad (6.55)$$

So here we see that we now obtain the sum of two cosine waves at frequencies $\nu_s + \nu_{\rm lo}$ and $\nu_s - \nu_{\rm lo}$ respectively. These are not modulations, but actual signals at these frequencies. If $|\nu_{\rm lo} - \nu_s| \ll \nu_s$, then only the second of these terms (the one with $\nu_s - \nu_{\rm lo}$) will be interesting to us, because this gives a signal at a much lower frequency than the original frequency $\nu_s$.

We now put this signal through a frequency filter that selects the frequency $\nu_0 = |\nu_s - \nu_{\rm lo}|$ (which is traditionally called the intermediate frequency) with some bandwidth $\Delta\nu_0$ and damps out all other modes (including the local oscillator and the signal itself, as well as the $\nu_s + \nu_{\rm lo}$ frequency signal and any other signal not falling in the range $\nu_0 \pm \Delta\nu_0/2$). We then obtain the desired frequency-reduced signal:

$$I_0 = a_2 V_{s,0}V_{\rm lo,0}\cos\!\big[2\pi(\nu_s - \nu_{\rm lo})t + \varphi_s - \varphi_{\rm lo}\big] \qquad (6.56)$$

For interferometry this equation has a very important property: the phase $\varphi_s$ of the input signal is still conserved. If we later want to interfere this signal with a signal from another telescope that has a phase $\varphi_t$, then the phase difference between the signals, $\varphi_t - \varphi_s$, is still the original one, in spite of the mixing with the local oscillator and the down-conversion of the frequency from $\nu_s$ to the intermediate frequency $\nu_0 \ll \nu_s$, provided that our local oscillator has a very high phase stability, so that we can remove the phase difference in $\varphi_{\rm lo}$ from this final phase difference. The conservation of phase means that if we do interferometry with the intermediate frequency signal, we still get the same interference! We can thus safely use the intermediate frequency signal for any further interferometric dealings.
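The mixing chain of Eqs. 6.47-6.56 can be verified numerically. The frequencies below are scaled down enormously (Hz instead of GHz, an assumption purely for the simulation) so that one second of signal suffices; only the ratios matter:

```python
import numpy as np

nu_s, nu_lo = 100.0, 90.0      # source and local-oscillator frequencies [Hz]
phi_s, phi_lo = 0.7, 0.2       # phases [rad]
fs = 4000.0                    # sampling rate [Hz]
t = np.arange(0, 1.0, 1 / fs)  # one second of signal

Vs = 0.01 * np.cos(2 * np.pi * nu_s * t + phi_s)    # weak sky signal
Vlo = 1.0 * np.cos(2 * np.pi * nu_lo * t + phi_lo)  # strong local oscillator

I2 = (Vs + Vlo) ** 2           # quadratic term of the diode response (a2 = 1)

# Demodulate at the intermediate frequency nu_0 = nu_s - nu_lo = 10 Hz
nu0 = nu_s - nu_lo
amp = 2 * np.mean(I2 * np.exp(-2j * np.pi * nu0 * t))
phase = np.angle(amp)          # = phi_s - phi_lo, as in Eq. 6.56
```

The recovered phase equals $\varphi_s - \varphi_{\rm lo}$, confirming that the down-converted signal preserves the source phase up to the (known, stable) oscillator phase.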

We are now almost there, but there is one more thing we have to clarify. Typically we do not receive a single-frequency signal from the sky. We receive a spectrum. The question is now: given some local oscillator frequency $\nu_{\rm lo}$ and some filter at frequency $\nu_0$ and bandwidth $\Delta\nu_0$, which signal frequencies $\nu_s$ are we going to be sensitive to? The answer is: any $\nu_s$ for which $\nu_0 = |\nu_s - \nu_{\rm lo}|$. This gives two solutions:

$$\nu_s = \nu_{\rm lo} \pm \nu_0 \qquad (6.57)$$

each with bandwidth $\Delta\nu_s = \Delta\nu_0$. Since $\nu_0 \ll \nu_{\rm lo}$ this creates two sensitive bands that are close together in frequency. These are called the lower sideband and the upper sideband. This means that the intermediate frequency signal we work with is a mixture of the signals at two nearby frequencies. This is usually not what we want. Thus we need to filter out one of the sidebands before the mixer, so that we are only sensitive to the other sideband.
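Continuing the numerical sketch (again with scaled-down, assumed frequencies): a tone in either sideband lands on the same intermediate frequency with the same strength, which is exactly the degeneracy that the sideband filter must remove:

```python
import numpy as np

nu_lo, nu0 = 90.0, 10.0        # local oscillator and intermediate frequency [Hz]
fs = 4000.0
t = np.arange(0, 1.0, 1 / fs)
Vlo = np.cos(2 * np.pi * nu_lo * t)

if_amps = []
for nu_s in (nu_lo - nu0, nu_lo + nu0):     # lower and upper sideband (Eq. 6.57)
    Vs = 0.01 * np.cos(2 * np.pi * nu_s * t)
    I2 = (Vs + Vlo) ** 2                    # diode quadratic term
    # Complex amplitude of the mixed signal at the intermediate frequency
    amp = 2 * np.mean(I2 * np.exp(-2j * np.pi * nu0 * t))
    if_amps.append(abs(amp))
```

Both sideband tones produce an IF component of identical strength, so after mixing they are indistinguishable.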

From here on, everything we learned about long baseline interferometry can now be applied to this intermediate frequency signal.


6.12 When telescope size is not negligible compared to the baseline

If we perform true long baseline interferometry, where $b \gg D$, we can ignore the finite size of the individual telescope apertures, and regard our visibility measurement as a measurement at a single $(u, v)$ position. If, however, we put our telescopes so close to each other that $b$ is only a few times their diameter (possibly even down to $b = D$, where the mirrors touch each other, but typically at somewhat larger distance), then we have to take a closer look at what we are doing. We already saw in Section 6.1 an example of such a situation: the LBT in Arizona, where two 8.4 meter telescopes are arranged on a baseline of 14.6 meters, i.e. there is merely 6.2 meters between the edges of the mirrors. If we combine the light of the two telescopes at the image plane (Fizeau interferometry), we have already seen that we get a PSF which is a single-mirror PSF with a fringe pattern on top. We obtained this pattern simply by Fourier transforming a wave front passing through our double-pupil:

$${\rm Image} = |\mathcal{F}[E](\vec{x})|^2 \qquad (6.58)$$

where $E(x, y, {\rm pupil}) = 0$ wherever $(x, y)$ is not part of the double-pupil. If the wave passing through the double-pupil is a plane wave, the function $E(x, y, {\rm pupil})$ depicts the shape of the double-pupil (see Fig. 6.1-left), and the image is the PSF (see Fig. 6.1-right).

We can now link this "Fizeau interferometry language", where we think in terms of the PSF in the image plane, to "long baseline interferometry language", where we think in terms of measuring the autocorrelation of the radiation field in the pupil plane. We do this by using the spatial version of the Wiener-Khinchin theorem (Eq. B.10 of Appendix B.1), applied to the wave:

$${\rm Image} = |\mathcal{F}[E](\vec{x})|^2 = \mathcal{F}[{\rm Corr}[E, E]](\vec{x}) \qquad (6.59)$$

In other words: the image on the image plane is in fact the Fourier transform of the autocorrelation function in the pupil plane! If we compare this to what we learned from the van Cittert-Zernike theorem, this makes perfect sense: the image plane is indeed a projection of the sky, and we already knew that the Fourier transform of the autocorrelation function in the pupil plane represents the image on the sky.

By applying an inverse Fourier transform to Eq. 6.59 we get

$${\rm Corr}[E, E](\vec{u}) = \mathcal{F}^{-1}\big[|\mathcal{F}[E](\vec{x})|^2\big] = \mathcal{F}^{-1}[{\rm Image}] \qquad (6.60)$$

If we have a plane wave hitting our pupil, then the "Image" is the PSF, so the above equation then gives the autocorrelation of the pupil. This function is called the modulation transfer function (MTF). Remember, by the way, the telescope transfer function, Eq. 4.54. It is the same thing. Here we just apply it to the two-mirror system as a whole.
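The triplet-of-blobs structure of Fig. 6.4 can be reproduced with a one-dimensional toy pupil. The grid size, aperture diameter and baseline below are illustrative numbers, chosen to mimic the LBT's $D = 8.4$ m, $b = 14.6$ m geometry in grid units:

```python
import numpy as np

n, D, b = 1024, 84, 146        # grid points; aperture diameter and baseline [grid units]
x = np.arange(n)

# Double pupil: two top-hat apertures of diameter D separated by baseline b
pupil = ((np.abs(x - n // 2 + b // 2) < D // 2) |
         (np.abs(x - n // 2 - b // 2) < D // 2)).astype(float)

psf = np.abs(np.fft.fft(pupil)) ** 2            # image of a point source (Eq. 6.58)
mtf = np.fft.fftshift(np.fft.ifft(psf).real)    # pupil autocorrelation (Eq. 6.60)

# mtf now shows a central triangle (vector pairs within one mirror) and two
# sidelobes at shifts of +-b (vector pairs connecting the two mirrors),
# each fading linearly toward its edge.
```

The central peak carries twice the "weight" of each sidelobe, and all three blobs taper toward their edges, for exactly the sampling reason explained in the next paragraphs.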

Figure 6.4: Left: the PSF of the Large Binocular Telescope (identical to Fig. 6.1-right). Right: the Fourier transform of this PSF back to the pupil plane.

Let us apply Eq. 6.60 to the PSF we obtained for the LBT (Fig. 6.1-right). So we Fourier transform the PSF (which itself is the absolute-value-squared of the Fourier transform of the pupil) back to the pupil plane. What we get is shown in Fig. 6.4. Note that we Fourier transform the image-plane intensity $\langle E^* E\rangle$, not the image-plane field $E$ itself, otherwise we would have obtained Fig. 6.1-left back. Instead, we now obtain a triplet of blobs, partly overlapping each other, each being "brightest" in its center and dropping off toward its edge. The middle blob corresponds to all possible vectors $\vec{r}$ connecting two points on the same mirror, while the two sidelobes correspond to all possible vectors $\vec{r}$ connecting one point on one of the mirrors with another point on the other mirror.

Contrary to the pupil shape function shown in Fig. 6.1-left, the autocorrelation function in Fig. 6.4-right is not a constant function over some area that drops to zero outside of that area. Instead, the blobs are centrally bright and fade toward their periphery. This has a simple explanation: some vectors $\vec{r}$ connecting two points on the double-pupil may "fit" to a much more limited set of pairs of points on the pupils than others. For instance, for a single LBT mirror (8.4 meter diameter) a separation of 8.4 meters ($|\vec{r}| = 8.4$ m) is only represented by pairs of points on the telescope edge that are diametrically opposite, while if $|\vec{r}| = 1$ cm almost every point $\vec{x}$ on the mirror has a point $\vec{x} + \vec{r}$ that is also on the mirror. This means that the autocorrelation function is much better sampled at small $\vec{r}$ than at large $\vec{r}$. A similar reasoning holds for the $\vec{r}$ vectors that link a point $\vec{x}$ on one mirror to a point $\vec{x} + \vec{r}$ on the other mirror: hence the "sidelobes" in Fig. 6.4-right also fade toward their periphery.

Now here we see a difference between the Fizeau-style measurements of ${\rm Corr}[E, E]$ and the long-baseline-style measurements of ${\rm Corr}[E, E]$; a difference that becomes apparent when the mirrors are close together. For Fizeau interferometry with nearby telescopes the true ${\rm Corr}[E, E]$ function (before entering the aperture of our pair of telescopes) is strongly modulated by the finite size of the telescope. For ideal long-baseline interferometry (zero telescope diameter) we actually measure the true ${\rm Corr}[E, E]$ function without this modulation. In principle the measurement of the fringes of the image-plane image of our Fizeau interferometer represents a measurement of ${\rm Corr}[E, E]$ in the pupil plane, but instead of measuring a single $(u, v)$ point, we measure a non-trivial integral of ${\rm Corr}[E, E]$ values modulated by the modulation transfer function shown in Fig. 6.4-right. If we now push the telescopes apart (of course keeping their diameter fixed), then the blobs in Fig. 6.4-right also move apart (while staying the same size). By measuring the fringes, we in fact measure ${\rm Corr}[E, E]$ in the two "sidelobes" of Fig. 6.4-right, which now become more and more point-like compared to their distance from the origin. We thus approach the long-baseline interferometry-style measurement of ${\rm Corr}[E, E]$.


6.13 Pupil-plane versus image-plane interferometry

The modulation of the autocorrelation function with the MTF, discussed in Section 6.12, applies to the case when we combine the light of the two telescopes directly on the image plane. However, one can also combine the light of the two telescopes at the pupil plane. The idea is to create an exit pupil for each telescope, which is a miniature version of the telescope entry pupil (aperture), and now overlay the two exit pupils of the two telescopes precisely over each other by using a beam combiner. The result is a wave at the combined exit pupil that is the direct sum of the waves in the two entry pupils:

$$E_{\rm combined}(\vec{x}\,') \propto E(\vec{x}) + E(\vec{x} + \vec{r}) \qquad (6.61)$$

where $\vec{x}\,'$ is the scaled-down exit-pupil version of the $\vec{x}$ on the primary mirror (entry pupil) and $\vec{r}$ is the baseline vector connecting the two telescopes. In other words: by combining the beams in the pupil plane instead of in the image plane, you always get interference between a position on one primary mirror and the equivalent position on the other primary mirror. This means that we are now truly measuring interferometry at a single $(u, v)$ point, in spite of the non-negligible size of the telescope. And since both mirrors are of the same shape and size, there is no modulation of the correlation function by the telescope system. If we observe a single unresolved point source on the sky, what we observe on the image plane is that the entire PSF fades and brightens as we make slight changes of the delay between the two telescopes. This is indeed the fringe pattern we encountered in long-baseline interferometry, and we can thus measure the visibility.

Pupil-plane beam combining and image-plane beam combining both have their advantages and disadvantages. The nice thing about image-plane interferometry is that one gets true images, if one manages to "deconvolve" the measured images with the Fizeau PSF. It does not suffer from the field-of-view limitation due to decoherence. But it requires very good adaptive optics to work, and it only works well for telescope pairs that are close, i.e. the gain in spatial resolution is moderate. The nice thing about pupil-plane interferometry is that it does not suffer from the non-trivial MTF of the pair of telescopes, and it can handle very large baselines and thus obtain incredible spatial resolution. But the field of view is limited by decoherence if the wavelength band $\Delta\lambda$ is large (which is usually only a concern in infrared and optical interferometry, less so in millimeter/radio interferometry). And also, each measurement is just a single $(u, v)$ point, and one needs many baselines to construct a decent image.
