We describe the use of Bayesian analysis methods applied to time-of-flight secondary ion mass spectrometer (TOF-SIMS) spectra. The method is applied to the COmetary Secondary Ion Mass Analyzer (COSIMA) TOF-SIMS mass spectra where the analysis can be broken into subgroups of lines close to integer mass values. The effects of the instrumental dead time are discussed in a new way. The method finds the joint probability density functions of measured line parameters (number of lines, and their widths, peak amplitudes, integrated amplitudes and positions). In the case of two or more lines, these distributions can take complex forms. The derived line parameters can be used to further calibrate the mass scaling of TOF-SIMS and to feed the results into other analysis methods such as multivariate analyses of spectra. We intend to use the method, first as a comprehensive tool to perform quantitative analysis of spectra, and second as a fast tool for studying interesting targets for obtaining additional TOF-SIMS measurements of the sample, a property unique to COSIMA. Finally, we point out that the Bayesian method can be thought of as a means to solve inverse problems but with forward calculations only, with no iterative corrections or other manipulation of the observed data.

The COmetary Secondary Ion Mass Analyzer (COSIMA) is a time-of-flight secondary ion mass spectrometer (TOF-SIMS) on board the Horizon 2000 European Space Agency Rosetta mission en route to encounter the 67P/Churyumov–Gerasimenko comet. The space probe consists of an orbiter and a lander. After the in-flight hibernation, the space probe and its instruments were successfully woken up on 20 January 2014. The first orbital maneuvers for the comet approach took place in May 2014. The formal mission end date is 31 December 2015. By that date the comet will have passed perihelion with the Rosetta spacecraft clinging near it all the time.

While the orbiter is traveling at slow speed (meters per second) in the
vicinity of the comet

The outcome of a measurement is a time-of-flight spectrum, which we are
directly interested in. In this paper, we discuss the quantitative foundation
of understanding the spectrum through statistical analyses of individual
spectral lines and touch on some critical issues such as instrument dead-time
effects, normalization, and isotope ratio calculation of lines. Multivariate
techniques connecting complex chemistry and complete spectra are discussed
elsewhere

We measure the time-of-flight spectrum, i.e., the number of secondary ions as a function of time. This defines the coordinate space we are working in.

The time of flight of an ion scales with the mass

The resolution of the mass spectrometer is

The full spectrum consists of

In weak lines with low secondary ion yield, most of the primary ion beam
firings will produce no secondary ions for that mass, and only single ions
will be recorded occasionally. For example, a line with a total count of

If the ion yield is higher, such that the total counts of a spectral line are
on the order of 10 % of the total number of
shots, in a 3 min exposure, a few times

The single most important parameter for understanding the dead-time effect is the yield of secondary ions, i.e., the ratio of the counts creating a given line to the number of shots. It gives a measure of how often two or more ions occur in a single shot for that particular line. This implies that two spectra with the same absolute line counts for a given mass will have different dead-time effects if they have a different number of primary ion shots. On the other hand, the shape of the same line in a sample from short exposures and long exposures will not change due to dead-time effects if the secondary ion yield does not change.
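The role of the yield can be illustrated with a short calculation. The sketch below assumes that the number of secondary ions of a given line produced in one shot is Poisson distributed with a mean equal to the yield; the function name and the example shot numbers are ours and purely illustrative.

```python
import math

def multi_ion_fraction(yield_per_shot):
    """P(two or more ions in one shot), assuming a Poisson ion count per shot."""
    lam = yield_per_shot
    return 1.0 - math.exp(-lam) * (1.0 + lam)

# Same absolute line count (1000 ions) collected with different numbers of shots.
for shots in (2_000, 10_000, 100_000):
    y = 1000 / shots
    print(f"shots={shots:6d}  yield={y:.3f}  P(>=2 ions)={multi_ion_fraction(y):.5f}")
```

Under this assumption, the fraction of shots with multiple ions depends only on the yield and not on the absolute line count.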

Ideally, the time-of-flight spectrum would show no background, show sharp discrete line peaks and have an exact TOF mass calibration. In reality, we are limited by measurement statistics, finite resolution, dead-time effects, background, multiple nearby lines and various other issues. We will next address how to analyze our COSIMA spectra from a Bayesian perspective.

The abscissa in the data is the TOF time bin, which has a linear relation to the time of flight, which scales as the square root of ion mass. The data themselves are count data and thus Poisson distributed.
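The relation between time bin and mass can be sketched as follows. We assume the generic form bin = a * sqrt(m) + b; the constants A_CAL and B_CAL below are hypothetical placeholders and not COSIMA calibration values.

```python
import math

def mass_to_bin(mass, a, b):
    """TOF bin for a given mass, assuming bin = a * sqrt(mass) + b."""
    return a * math.sqrt(mass) + b

def bin_to_mass(tof_bin, a, b):
    """Inverse of mass_to_bin."""
    return ((tof_bin - b) / a) ** 2

def mass_per_bin(mass, a):
    """Mass width of one TOF bin at the given mass (derivative dm/dbin)."""
    return 2.0 * math.sqrt(mass) / a

A_CAL, B_CAL = 1500.0, -70.0  # hypothetical calibration constants
tof_bin = mass_to_bin(19.0, A_CAL, B_CAL)
print(tof_bin, bin_to_mass(tof_bin, A_CAL, B_CAL), mass_per_bin(19.0, A_CAL))
```

Because of the square-root scaling, the mass width of one time bin grows with mass.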

The parameters we are interested in at a given mass are the number of
spectral components, the integrated count of each component, the mass
corresponding to each line and the confidence limits of all these parameters.
We approximate the spectral lines as Gaussian in the time-of-flight
coordinate system. Standard methods such as least squares or

Our data are particle count data and as such non-negative. They follow
Poisson statistics. The Poisson probability density function is defined as

The Poisson nature of the data implies that the probability density function is not symmetric. The mean has a value different from the median and the mode, the most likely value. As the distribution is not symmetric, the standard square root of variance, “sigma”, should not be used to calculate confidence limits or “error limits”. Note also that strong peaks have the largest noise in absolute terms, whereas low peaks have a more significant noise contribution in relative terms.

The instrumental dead time brings an additional special complication. The observed data that are affected by the dead time still have a Poisson probability distribution in secondary ion counts per time bin. The dead-time “correction” that conventional methods apply to the observed data points distorts the statistical properties of the data: it increases the noise in the corrected data to a level larger than that expected from Poisson data, so the corrected data are essentially too noisy. This is a potential problem for strong lines. Our approach avoids these problems because we apply the dead-time corrections to the model and not to the observed peaks.
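Applying the dead time to the model rather than to the data can be sketched as a forward calculation. The example below is a simplification under a stated assumption: the dead time is at least as long as the line, so that only the first ion of a line is recorded in each shot. The function names and the example line parameters are illustrative and not taken from the COSIMA analysis.

```python
import math

def gaussian_line(n_bins, yield_per_shot, center, sigma):
    """Expected ions per shot in each bin for an undistorted Gaussian line."""
    norm = yield_per_shot / (sigma * math.sqrt(2.0 * math.pi))
    return [norm * math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n_bins)]

def apply_dead_time(rate_per_shot):
    """Expected recorded ions per shot and bin if only the first ion is recorded.

    A bin records an ion when no ion arrived in any earlier bin and at least
    one ion arrives in the bin itself (Poisson arrivals assumed).
    """
    observed, earlier = [], 0.0
    for lam in rate_per_shot:
        observed.append(math.exp(-earlier) * -math.expm1(-lam))
        earlier += lam
    return observed

def centroid(profile):
    return sum(i * v for i, v in enumerate(profile)) / sum(profile)

true_line = gaussian_line(20, 0.5, 10.0, 1.2)  # 50 % yield line
seen_line = apply_dead_time(true_line)
print(sum(true_line), sum(seen_line), centroid(true_line), centroid(seen_line))
```

The modeled observed line is weaker than the original and its centroid lies at earlier time bins, while the data themselves are left untouched.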

In statistical analyses, one tends to habitually assume Gaussian noise and the propagation of errors through addition of variances. These assumptions are not valid in our case. Their use could cause negative values in “error” limits, which is mathematically and physically an impossible situation, as they would imply negative counts. Furthermore, the way propagation is used contains the hidden assumption of symmetric errors, which is not the case in these data.
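The asymmetry of the Poisson distribution and the failure of symmetric error limits can be illustrated numerically. The sketch below uses an arbitrary example mean of 3.7 counts, chosen by us for illustration only.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def summary(lam, kmax=200):
    """Mean, median and mode of a Poisson distribution with mean lam."""
    pmf = [poisson_pmf(k, lam) for k in range(kmax)]
    mean = sum(k * p for k, p in enumerate(pmf))
    mode = max(range(kmax), key=lambda k: pmf[k])
    cumulative, median = 0.0, None
    for k, p in enumerate(pmf):
        cumulative += p
        if cumulative >= 0.5:
            median = k
            break
    return mean, median, mode

lam = 3.7
mean, median, mode = summary(lam)
naive_lower = lam - 2.0 * math.sqrt(lam)  # symmetric "2 sigma" lower limit
print(mean, median, mode, naive_lower)
```

For this example the mean, median and mode all differ, and the symmetric lower limit falls below zero counts.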

We will address the analysis of COSIMA spectra through Bayesian analysis, which will avoid all the problems mentioned above.

The Bayesian analysis is a universal means of understanding and interpreting measured data. In principle, we could consider our spectrum as one measured entity with several hundred lines and interpret the full spectrum by Bayesian means. This would require working in a data space of a dimensionality of several thousands squared. In practice, it is more convenient to reduce the analysis into an analysis of hundreds of lines, each consisting of a combination of one or several nearby lines. We can do this as, at low masses, there is no overlap between lines of different integer masses and, at high masses, the lines tend to be sparse and still well separated.

The conventional analysis of a resolved time-of-flight mass spectrum starts
with the observed spectrum, applies a correction term in order to correct for
(i.e., remove) the dead-time effects, then possibly removes a background,
and finally treats the remainder as the real line. Careful conventional analysis
does not ignore the contribution of background noise, however.

Our analysis is nearly reverse in many aspects. We start by selecting a model
from a large set of models of a beam shape, amplitude and background, after
which we apply the effects of the dead time and obtain a model for an
observed spectrum. Assuming a Poisson distribution for the model, we then
calculate the likelihood that this model will explain the observations. We
then iterate towards a cloud of solutions. There are two details we should
emphasize here. Our calculation is a forward calculation. For this reason, we
take the dead-time effects into account in the reverse of the conventional
order; thus, we

Assume that you have an observed peak shape

We will search for a solution from the values of model parameters

Mathematically, we are interested in knowing which is the best model

The prior densities

Our result will provide the probability density distribution of various parameters on the left of the equation above. By finding the mode of this distribution, we get the most likely value in the model parameter space describing the data. Confidence limits to the model parameters can be calculated from the posteriori probability distributions.

Markov chain Monte Carlo (MCMC) algorithms are very useful in determining the posteriori probability space. A random walk in the parameter space is created. This chain converges to the target distribution that, multiplied by the prior distribution, will give the posteriori distribution. In creating the random walk sequence, the next draw from the parameter space depends on the position and the value of the previous sample.

One widely used family of MCMC chains is the Metropolis algorithm. The core
of these algorithms is the decision of whether to accept the next move. Say
that one has calculated the probability

We use the adaptive Metropolis algorithm

For simulated and real spectra, we have used typically 200 iterations in the burn-in phase and 20 000 in the main iteration phase. The upper limit of iterations is determined by convergence to the posteriori distribution and the confidence levels needed.
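The accept/reject step and the burn-in and main phases can be sketched with a plain random-walk Metropolis sampler. This is a simplified stand-in and not the adaptive Metropolis algorithm itself: the proposal step sizes are fixed, the line width SIGMA is held constant, the priors are flat, and the 20-bin data set is made up for illustration. All names are ours.

```python
import math
import random

SIGMA = 1.2  # line width in bins, held fixed in this sketch (hypothetical value)

def expected_counts(total, center, bkg, n_bins):
    """Gaussian line with integrated count `total` on a constant background."""
    norm = total / (SIGMA * math.sqrt(2.0 * math.pi))
    return [bkg + norm * math.exp(-0.5 * ((i - center) / SIGMA) ** 2)
            for i in range(n_bins)]

def log_posterior(params, data):
    total, center, bkg = params
    # Flat priors inside a physically sensible box, zero probability outside.
    if total <= 0.0 or bkg <= 0.0 or not 0.0 <= center <= len(data) - 1:
        return -math.inf
    mu = expected_counts(total, center, bkg, len(data))
    return sum(k * math.log(m) - m - math.lgamma(k + 1) for k, m in zip(data, mu))

def metropolis(data, start, step_sizes, n_burn=200, n_main=20000, seed=0):
    rng = random.Random(seed)
    current = list(start)
    lp_current = log_posterior(current, data)
    chain = []
    for iteration in range(n_burn + n_main):
        proposal = [x + rng.gauss(0.0, s) for x, s in zip(current, step_sizes)]
        lp_proposal = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_proposal - lp_current)):
            current, lp_current = proposal, lp_proposal
        if iteration >= n_burn:  # keep only samples after the burn-in phase
            chain.append(tuple(current))
    return chain

def interval(samples, level=0.68):
    """Central interval containing `level` of the samples."""
    ordered = sorted(samples)
    lo = ordered[int(len(ordered) * (1.0 - level) / 2.0)]
    hi = ordered[int(len(ordered) * (1.0 + level) / 2.0)]
    return lo, hi

# Made-up count data for a single line in 20 time bins.
data = [0, 1, 0, 1, 0, 1, 2, 7, 19, 33, 27, 12, 4, 1, 0, 1, 0, 0, 1, 0]
chain = metropolis(data, start=(90.0, 9.0, 1.0), step_sizes=(5.0, 0.1, 0.15))
totals = [sample[0] for sample in chain]
centers = [sample[1] for sample in chain]
print("total count 68 % interval:", interval(totals))
print("line center 68 % interval:", interval(centers))
```

The retained chain approximates the joint posteriori distribution, so confidence limits for each parameter follow directly from the sorted samples.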

The dead-time effects cause the spectral peaks to become weaker, shift the peak maximum to smaller masses and significantly distort small peaks that have a mass slightly larger than a strong peak. To characterize the effects of dead time on COSIMA spectra and to understand the characteristics in detail, we performed simulations. Furthermore, the simulations were run to validate the equations that we derived from first principles for our Bayesian method.
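A minimal Monte Carlo sketch of such a simulation is given below. It is not the simulation code used for this work. We assume a Poisson number of ions per shot, Gaussian arrival times, and a detector that ignores any ion arriving within a dead time after the previously recorded ion; the dead time is expressed here in time bins with a hypothetical value, and all names are ours.

```python
import math
import random
import statistics

def poisson_draw(rng, lam):
    """Draw a Poisson variate by Knuth's multiplication method (small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate(n_shots, yield_per_shot, center, sigma, dead_bins, seed=1):
    """Return arrival times of all ions and of the ions that are recorded."""
    rng = random.Random(seed)
    true_times, seen_times = [], []
    for _ in range(n_shots):
        n_ions = poisson_draw(rng, yield_per_shot)
        arrivals = sorted(rng.gauss(center, sigma) for _ in range(n_ions))
        last_recorded = None
        for t in arrivals:
            true_times.append(t)
            if last_recorded is None or t - last_recorded >= dead_bins:
                seen_times.append(t)
                last_recorded = t
    return true_times, seen_times

# 50 % yield line; the dead time of 10 bins is a hypothetical choice.
true_times, seen_times = simulate(20000, 0.5, 10.0, 1.0, 10.0)
print(len(true_times), len(seen_times))
print(statistics.mean(true_times), statistics.mean(seen_times))
```

In this sketch the recorded line contains fewer ions than the original line and its mean position lies at earlier times, in qualitative agreement with the effects described above.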

We assume a dead time of 10

Effects of dead time in COSIMA. The effect is shown for two artificial Gaussian lines. The strongest line has a total yield of 50 %. The fainter line has a yield of 5 %. The top-most curve (green stars and dot-dash line) shows the original 50 % curve; the second curve (red crosses, dash line) shows how the dead-time effect has changed the curve. This is in principle the observed line. Note how the maximum and correspondingly the total line counts have decreased. Also note how the line center and peak have shifted to the left. This is particularly noteworthy on the right side of the line. The lower two curves show similar cases for the 5 % line with blue squares and purple crosses, respectively.

From these simulations, we derive some relevant statistical properties of the
dead-time effects. We can confirm the equations derived purely on statistical
grounds by

The dead time shifts the peak position by about

Our Bayesian approach is not affected by the two problems described above as
our approach can be considered as an inverse solution calculated by fully
forward sub-solutions. In our simulation, it is better that we use a model
for the dead time. The magnitude of the dead time depends on the number of
counts in the previous bins as follows. A count will be recorded if there are
no counts earlier in the same bin or in the previous 10

First, we make a simple example. We follow the Bayesian approach, but because
of the simplicity of the problem, we need no simulations. We assume no
background counts and a line with a total integrated number of counts as

Posteriori distributions of a Poisson peak with a given amplitude

We should note that, although the single
highest probability

The differences in the mode, median and mean are important, but fortunately for our case they are not of big concern as the differences are at the most 1 count. More important for us is the asymmetry of the distributions in the case of small peaks.
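The mode, median and mean of such a posteriori distribution can be compared numerically. The sketch below assumes no background and a flat prior on the true line amplitude, so that the posteriori density for an observed count n is proportional to the Poisson likelihood; the grid step and the example count of 3 are our choices.

```python
import math

STEP = 0.001  # grid step in counts

def amplitude_posterior(n_obs):
    """Normalized posteriori density of the true amplitude on a fine grid."""
    lam_max = n_obs + 10.0 * math.sqrt(n_obs + 1.0) + 10.0
    grid = [STEP * (i + 0.5) for i in range(int(lam_max / STEP))]
    dens = [math.exp(n_obs * math.log(lam) - lam - math.lgamma(n_obs + 1))
            for lam in grid]
    norm = sum(dens) * STEP
    return grid, [d / norm for d in dens]

def quantile(grid, dens, q):
    cumulative = 0.0
    for lam, d in zip(grid, dens):
        cumulative += d * STEP
        if cumulative >= q:
            return lam
    return grid[-1]

grid, dens = amplitude_posterior(3)
mode = grid[max(range(len(grid)), key=lambda i: dens[i])]
mean = sum(lam * d for lam, d in zip(grid, dens)) * STEP
median = quantile(grid, dens, 0.5)
low, high = quantile(grid, dens, 0.16), quantile(grid, dens, 0.84)
print(mode, median, mean, low, high)
```

For small counts the central 68 % interval extends further above the median than below it, which is the asymmetry referred to above.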

If background counts are present, e.g., 10 counts in addition to a line of 20 counts, then the distribution above will be the joint distribution.

We calculated a set of artificial one- and two-peak cases to validate our
method. The exact shape of the line is not critical. For simplicity we chose
a Gaussian. We selected an array of 20 time bins and drew a position
for the peak randomly from

Before our full Bayesian tests, we performed a brute force calculation. The
best fit parameters were calculated at 0.1 bins in time and 20 intervals per
dex in log space, i.e., a resolution of a factor

We calculated a total of 3300 cases with a variable background noise and different separations. The major systematic source of error is the discretization of the solutions of the data into the sub-bins. This is an effect that shows the weakness of the direct grid calculation of the probabilities.

A faster and more accurate estimate of the line parameters is obtained by the Bayesian method. The additional benefit is that we obtain automatically a distribution for the various parameters of the solution without having to resort to a grid calculation.

An example of the posteriori distribution of a weak line at
mass 19. The background around the line is very low. The weak line
has an observed maximum of 16 counts. The top panel shows
the posteriori distributions in total count vs. time-of-flight bin. The red
curve contains 50 % posteriori confidence limits, the green curve 68 %, the dark
blue 90 %, and the light blue 95 % limits. Note that
the most likely value has a rather symmetric distribution with
a 68 % confidence width of about 0.34 TOF time bins or 0.002

We show an example of a real line from the COSIMA full spectrum. The example
shown in Fig. 2 is a relatively weak line with a total number count of about
100. The line mass is derived from the in-flight measurements of constants

The solution for a single Gaussian that gave us a mass of 19.0056

Two examples of the posteriori distributions of the
amplitudes and positions. The two simulated Gaussian peaks have
Poisson noise added to each point. The Gaussians have a FWHM of 2.5
time-of-flight bins or 0.031

A simulated two-line case is shown in Fig. 3. The two simulated Gaussian
peaks have Poisson noise added to each point. The Gaussians in this
simulation have FWHM 2.5 time-of-flight bins or 0.031

We ran 10 000 two-line simulations and modeled them with one and two peaks
and investigated which of the models were correctly identified by our
Bayesian analysis. The results were quite clear cut. Two nearby peaks are not
identified correctly in the presence of Poisson noise if the following
limitations are met: the smaller peak has an amplitude of

To analyze real COSIMA spectra, we make an assumption of the line shape. Our standard choice is a Gaussian shape, but on occasion an 80 % Gaussian and 20 % Lorentzian combination is suitable for modeling lines in positive COSIMA spectra. If an asymmetry of the peak develops in COSIMA for any reason, we will be able to take this into account. Negative ion spectra are more complicated as an additional signal before the main peak is created by the electrons sputtered off the grids inside the reflectron. We will not discuss negative spectra in this paper in detail.
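A combined line shape of this kind can be written compactly. In the sketch below we assume that the Gaussian and Lorentzian components share the same center and the same FWHM and that each is normalized to unit area; these are our assumptions for illustration, and the function name is hypothetical.

```python
import math

def line_shape(x, center, fwhm, gauss_fraction=0.8):
    """Weighted sum of a unit-area Gaussian and a unit-area Lorentzian."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))  # Gaussian sigma
    gamma = fwhm / 2.0                                      # Lorentzian half width
    gauss = (math.exp(-0.5 * ((x - center) / sigma) ** 2)
             / (sigma * math.sqrt(2.0 * math.pi)))
    lorentz = (gamma / math.pi) / ((x - center) ** 2 + gamma ** 2)
    return gauss_fraction * gauss + (1.0 - gauss_fraction) * lorentz

# Profile sampled on time bins for a line with a FWHM of 2.5 bins.
profile = [line_shape(i, 10.0, 2.5) for i in range(21)]
print(max(profile), profile[0])
```

With a shared FWHM the combined profile keeps the same full width at half maximum as its components, while the Lorentzian part adds broader wings.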

We first estimate the line amplitude and width from the observed line. To the estimated line, we then apply the analytic dead-time correction and obtain a model line that we can compare to the observed line. Note that here we can use the information that the probability distribution of the counts follows a Poisson distribution. With the Bayesian adaptive Metropolis algorithm described earlier, we can obtain the posteriori distributions of both the parameters of the original line and of the observed dead-time affected line. These will include automatically the proper line positions and amplitudes. The total counts are obtained by summing discrete counts from the continuous model curves, so the fitted amplitudes of the continuous Gaussian do not represent a real quantity, but just a mathematical aid for measuring the total count from discrete abscissa values.

If we are able to give good guesses for the initial starting points, the algorithm tends to converge faster to a good solution. This is not necessary for the method but aids in reducing the computing time considerably, particularly when estimating multiple spectral lines simultaneously.

The analysis of the lines provides a complicated challenge. Some lines are clear and isolated; often, two separate lines occur together. If they are sufficiently far apart, they can be treated as single isolated lines. Occasionally, a section occurs in the spectrum where several lines appear to be present and mixed in. Sometimes the background levels are somewhat elevated, mimicking multiple merged lines. Our approach is the following: we create a running 5 pixel boxcar sum of the original spectrum and find the local maximum by comparing the adjacent smoothed pixel sums. We then accept as good guesses the points where this maximum has a value that is larger than the background. The background is defined as the smallest of two background measurements. One background estimate is obtained from the 5 pixel sum 20 pixels earlier and the second background 20 pixels on the other side of the maximum. If this difference of the boxcar sum and the background sum is over a certain limit, we accept this point as a guess for a component. We have used an ad hoc lower limit of five counts. This is not a critical limit as it is only a first guess for our Bayesian analysis.
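The procedure for finding first guesses can be sketched as follows. The handling of plateaus and of the first and last 20 pixels of the spectrum is our assumption, and the function names are illustrative.

```python
def boxcar_sum(spectrum, width=5):
    """Running sum over `width` pixels centered on each pixel."""
    half = width // 2
    n = len(spectrum)
    return [sum(spectrum[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]

def guess_components(spectrum, width=5, offset=20, min_excess=5):
    """Pixels where the boxcar sum is a local maximum above the background."""
    smooth = boxcar_sum(spectrum, width)
    guesses = []
    # Assumption: pixels closer than `offset` to either edge are skipped.
    for i in range(offset, len(spectrum) - offset):
        is_local_max = smooth[i] > smooth[i - 1] and smooth[i] >= smooth[i + 1]
        if not is_local_max:
            continue
        # Background is the smaller of the boxcar sums 20 pixels on either side.
        background = min(smooth[i - offset], smooth[i + offset])
        if smooth[i] - background > min_excess:
            guesses.append(i)
    return guesses

spectrum = [0] * 100
spectrum[48:53] = [3, 10, 20, 10, 3]  # one clear line centered on pixel 50
spectrum[75] = 2                      # a faint feature below the limit
print(guess_components(spectrum))
```

The accepted pixels serve only as starting points for the Bayesian analysis, so the exact threshold is not critical.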

A general normalization is often performed by dividing the count of the
spectral lines by a certain constant or line, e.g.,

The proper way to normalize is to build a model where the line ratio is solved for. Take a guess of the stronger integrated line count, make a good guess of the background, and apply a good guess for the line ratio. You have now calculated two integrated line counts. Using the Poisson distribution, calculate what the likelihood is that the observed lines are explained by the given model. Continue with the Bayesian principles of searching for the posteriori probability distribution. Finally, marginalize (integrate) over background and amplitudes to get the likelihood of the line ratio distribution.
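The steps above can be sketched with a direct grid calculation in place of a Markov chain. The setup is a simplification of ours: the two integrated line counts and one background count are taken from windows of equal width, the priors are flat on the grids, and the example counts are made up.

```python
import math

def log_poisson(k, mu):
    return k * math.log(mu) - mu - math.lgamma(k + 1)

def ratio_posterior(n_strong, n_weak, n_bkg, ratios, amplitudes, backgrounds):
    """Posteriori probability of the line ratio on a grid.

    Model: strong window ~ Poisson(A + B), weak window ~ Poisson(R * A + B),
    background window ~ Poisson(B). A and B are marginalized by summation.
    """
    posterior = []
    for r in ratios:
        total = 0.0
        for a in amplitudes:
            for b in backgrounds:
                total += math.exp(log_poisson(n_strong, a + b)
                                  + log_poisson(n_weak, r * a + b)
                                  + log_poisson(n_bkg, b))
        posterior.append(total)
    norm = sum(posterior)
    return [p / norm for p in posterior]

ratios = [0.01 * i for i in range(1, 61)]          # 0.01 ... 0.60
amplitudes = [140.0 + 2.0 * i for i in range(61)]  # 140 ... 260 counts
backgrounds = [0.5 * i for i in range(1, 61)]      # 0.5 ... 30 counts
posterior = ratio_posterior(200, 50, 10, ratios, amplitudes, backgrounds)
best = ratios[max(range(len(ratios)), key=lambda i: posterior[i])]
print("most likely line ratio:", best)
```

The resulting distribution of the ratio carries the Poisson uncertainty of both lines and of the background, which a simple division of counts does not.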

An example of the mass 53 spectral lines in a RM spectrum
CS_45D_20110309T074148_SP_P.TAB. The model fit here is a two-Gaussian model and a constant background. The observed line is shown
with the red stars and the fit with green circles. The masses
derived are 52.967 and 53.055

The spectra are provided with an initial estimate of the mass calibration
parameters,

In this study, we have considered so far all lines as independent in the
sense that the background level, the line width, position and the maximum
amplitude of the peak have been free parameters. However, if we wish to ask
a specific question such as whether a certain mass contains lines at
predefined exact masses, we can employ different variations to the analysis.
For example, if we see

An additional setup can be created between the above multiline example
and isotope ratios. Consider, e.g., the lines

Investigating the full parameter space of all possible models is beyond the scope of this paper, but we wish to point out the generality of the Bayesian method. These kinds of analyses are not easy to do with conventional means, and the posteriori probability distributions in those cases are at best only guesses. We thus provide posteriori distributions and confidence limits for all measured parameters.

We have discussed the basic principles of applying a Bayesian approach to the analysis of COSIMA spectra. We address the accuracy and the fundamental principles of Bayesian analysis. We show that one is able to obtain posteriori distributions for integrated line counts, line positions, and line widths in systems of one or several lines. Even if some of the parameters may turn out to produce strong correlation or degeneracy in the solution, its severity can be characterized by Bayesian analysis.

Several instrumental properties of COSIMA simplify our analysis. First, the long time interval between the shots means that the secondary ion formation and flight time of the ions can usually be considered statistically independent from shot to shot. Second, the shortness of the pulse and the well-calibrated instrument mean that not only each mass line but often also the organic and mineral components can be analyzed separately. Third, the dead time is relatively short and quite nicely matched with the line width, so the dead-time effects will not leak to neighboring lines. The narrow line shape means that the spectra cannot be well modeled by a line shape derived from the spectrum.

Our analysis methods can be generalized to data with other sorts of noise properties, and nearly any kind of line shapes.

COSIMA was built by a consortium led by the Max-Planck-Institut für Extraterrestrische Physik, Garching, Germany, in collaboration with Laboratoire de Physique et Chimie de l'Environnement, Orléans, France, Institut d'Astrophysique Spatiale, CNRS/INSU and Université Paris Sud, Orsay, France, the Finnish Meteorological Institute, Helsinki, Finland, Universität Wuppertal, Wuppertal, Germany, von Hoerner und Sulger GmbH, Schwetzingen, Germany, Universität der Bundeswehr, Neubiberg, Germany, Institut für Physik, Forschungszentrum Seibersdorf, Seibersdorf, Austria, and Institut für Weltraumforschung, Österreichische Akademie der Wissenschaften, Graz, Austria, and is led by the Max-Planck-Institut für Sonnensystemforschung, Göttingen, Germany. The support of the national funding agencies of Germany (DLR), France (CNES), Austria and Finland and the ESA Technical Directorate is gratefully acknowledged. We thank the Rosetta Science Ground Segment at ESAC, the Rosetta Mission Operations Centre at ESOC and the Rosetta Project at ESTEC for their outstanding work enabling the science return of the Rosetta Mission.

H. J. Lehto and B. Zaprudin acknowledge the support of the Academy of Finland (grant number 277375). Edited by: M. Paton