# Measuring electron temperature with a swept Langmuir probe

Written by Katerina Hromasova with input from Martina Lauerova, Georgiy Sarancha, Jan Stockel, Vojtech Svoboda, Michael Komm and others.

## Theory of swept probe measurements

The following figure shows the ideal Langmuir probe $I$-$V$ characteristic. The ion branch of the curve (left half of the plot) can be described by a three-parameter exponential function.

$I(V) = I_{sat} \left( 1 - \exp \left( \frac{V - V_f}{T_e} \right)\right)$

The three parameters are the ion saturation current $I_{sat}$, the probe floating potential $V_f$ and the electron temperature $T_e$ [eV]. The shape of the characteristic changes depending on these parameters, and by fitting an experimental $I$-$V$ characteristic with an exponential function, one may retrieve their values.
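A minimal Python sketch of this model may help to fix the conventions. It assumes that the current saturates at $I_{sat}$ for $V$ well below $V_f$ (ion branch on the left of the plot) and that $T_e$ is given in eV, so that $(V - V_f)/T_e$ is dimensionless. The function name `iv_characteristic` and the parameter values are illustrative, not taken from the notebook.

```python
import numpy as np

def iv_characteristic(V, I_sat, V_f, T_e):
    """Ion branch of an ideal Langmuir probe I-V characteristic.

    V and V_f are in volts, T_e in eV and I_sat in amperes.
    """
    V = np.asarray(V, dtype=float)
    return I_sat * (1.0 - np.exp((V - V_f) / T_e))

# Illustrative parameter values only, not measured data
V = np.linspace(-100.0, 20.0, 7)
I = iv_characteristic(V, I_sat=2e-3, V_f=-10.0, T_e=15.0)
```

By construction the current is zero at $V = V_f$ and approaches $I_{sat}$ for strongly negative bias.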

To collect the whole $I$-$V$ characteristic in an experiment, the biasing voltage $V$ on the probe is swept (i.e. varied periodically). The exact voltage shape is irrelevant, though we most often encounter the sawtooth (zig-zag) shape and the sine shape. The current $I$ flowing from the probe to the ground is then plotted against the biasing voltage $V$ and the curve is fitted with the exponential.

This notebook performs $I$-$V$ characteristic fitting throughout the current discharge. It documents the process step by step and concludes by plotting the temporal evolution of the ion saturation current $I_{sat}$, the probe floating potential $V_f$ and the electron temperature $T_e$.

Note: All the time variables are given in seconds.

## Import the basic libraries

First we import the basic libraries: NumPy and Matplotlib. We will import more libraries throughout the notebook as needed.

## Access the diagnostics data

The Langmuir probe we shall be working with is placed on the PetiProbe. (The Langmuir probe is the small metal pin on the right.)

The data directory of the PetiProbe is http://golem.fjfi.cvut.cz/shots/{shot}/Diagnostics/PetiProbe/. Here, we write the function get_data to download the data.
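The notebook's own `get_data` is not reproduced here. The following is only a sketch of what such a helper might look like. The file naming (`<identifier>.csv`) and the file layout (two comma-separated columns, time and value) are assumptions made for illustration and should be checked against the actual contents of the data directory. `build_url` is a hypothetical helper.

```python
import io
import urllib.request

import numpy as np

BASE_URL = "http://golem.fjfi.cvut.cz/shots/{shot}/Diagnostics/PetiProbe/"

def build_url(shot, identifier, extension="csv"):
    # ASSUMPTION: one file per signal, named "<identifier>.<extension>"
    return BASE_URL.format(shot=shot) + f"{identifier}.{extension}"

def get_data(shot, identifier):
    """Download one signal and return (time, values) as NumPy arrays.

    ASSUMPTION: the file is a two-column, comma-separated table of
    time [s] and signal value.
    """
    with urllib.request.urlopen(build_url(shot, identifier)) as response:
        text = response.read().decode("utf-8")
    table = np.loadtxt(io.StringIO(text), delimiter=",")
    return table[:, 0], table[:, 1]
```

With these assumptions, a call such as `t, U_bias = get_data(shot, "U_bias")` would return the time axis and the biasing voltage of the chosen shot.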

The biasing voltage $V$ is collected under the name U_bias. The voltage proportional to the probe current is called U_current. The probe current can be calculated as $I = U_{current}/R$, where $R=46.7 \, \Omega$ is the resistance of the measuring resistor.
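As a short sketch, the conversion from the measured voltage to the probe current is Ohm's law applied to the measuring resistor. The function name `probe_current` and the sample values are illustrative.

```python
import numpy as np

R_MEASURE = 46.7  # measuring resistor resistance [ohm], value from the text

def probe_current(U_current, R=R_MEASURE):
    """Convert the voltage across the measuring resistor [V] to current [A]."""
    return np.asarray(U_current, dtype=float) / R

# Made-up voltage samples, for illustration only
I = probe_current([0.0, 0.467, -0.934])
```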

In the following, we load this data for the current shot, calculate the probe current $I$ and plot the time evolution of $I$ and $V$. Notice that at the discharge beginning, the current isn't flat zero. This is the effect of the parasitic current, which we will discuss shortly.

## Remove the parasitic current

The parasitic current appears due to the capacitance of the data collection system. At high sweeping frequencies, the wires behave like capacitors and cause current oscillations proportional to the time derivative of the biasing voltage. This parasitic current adds to the probe current, distorting the measured signal.

$I_{total}(V) = I_{probe}(V) + c \cdot \frac{dV}{dt}$

Since the biasing voltage is largely independent of the plasma parameters, $V(t)$ repeats with the same waveform throughout the discharge, and so does the parasitic current. We use this to reconstruct and remove the parasitic signal.

First, we sample the parasitic current at the beginning of the discharge, where $I_{probe}=0$ and $I_{total}=c \cdot \frac{dV}{dt}$. This is the time period between the opening of the $B_t$ capacitor banks and the opening of the current drive capacitor banks.

We want to "clone" this sample and cover the rest of the discharge with it. To do that, we need to know exactly how long its period is. We load this from the database, where the sweeping frequency f_fg is stored.

Next, we pick a few whole periods of the parasitic signal from the discharge beginning and clone the entire parasitic signal from them. Finally, we subtract the parasitic current from the total current, retrieving the probe current alone.
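The cloning step can be sketched as follows, under stated assumptions: the sample starts at a known time `t_start`, it spans a whole number of sweeping periods, and it contains no plasma current. Every time point of the discharge is folded back into the sample window and the sampled current is interpolated at that phase. The function name `clone_parasitic` and all numbers in the demonstration are made up. This is not necessarily how the notebook implements it.

```python
import numpy as np

def clone_parasitic(t, I_total, t_start, f_sweep, n_periods=3):
    """Tile a plasma-free sample of the current over the whole time axis."""
    span = n_periods / f_sweep                      # sample length [s]
    in_sample = (t >= t_start) & (t < t_start + span)
    # Fold every time point back into the sample window and interpolate
    phase = np.mod(t - t_start, span)
    return np.interp(phase, t[in_sample] - t_start, I_total[in_sample],
                     period=span)

# Synthetic demonstration with made-up numbers
f_sweep = 1e3                                       # sweeping frequency [Hz]
t = np.arange(20000) * 1e-6                         # 20 ms sampled at 1 MHz
c = 3e-9                                            # effective capacitance [F]
dVdt = 50.0 * 2 * np.pi * f_sweep * np.cos(2 * np.pi * f_sweep * t)
I_probe = np.where(t > 5e-3, 2e-3, 0.0)             # "plasma" starts at 5 ms
I_total = I_probe + c * dVdt

I_parasitic = clone_parasitic(t, I_total, t_start=0.0, f_sweep=f_sweep)
I_recovered = I_total - I_parasitic
```

In this synthetic case `I_recovered` reproduces `I_probe`, because the parasitic term is exactly periodic. On real data the quality of the subtraction depends on how accurately the sweeping period is known.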

## Cut the probe signal into individual $I$-$V$ characteristics

The probe current $I$ and voltage $V$ are now ready to be plotted into the $I$-$V$ characteristic. However, we can't mix $I$-$V$ characteristics from different parts of the discharge - the plasma parameters are different and so are the $I$-$V$ characteristics. We need to treat them separately, and that means breaking up the signal into individual periods of the sweeping voltage.

In the following, we create a list of the voltage maxima (peaks) and minima (valleys). Specifically, we detect the first peak position in $V$ and "predict" the following peaks based on the sweeping period.
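A possible sketch of this peak prediction is given below. It assumes that the first maximum lies within the first sweeping period of the record and that the waveform is symmetric, so that each minimum sits half a period after a maximum. The function name `predict_extrema` and the test signal are illustrative.

```python
import numpy as np

def predict_extrema(t, V, f_sweep):
    """Return the predicted times of the voltage maxima and minima."""
    period = 1.0 / f_sweep
    # Locate the first maximum within the first sweeping period
    first_period = t < t[0] + period
    t_first_peak = t[first_period][np.argmax(V[first_period])]
    # Predict the remaining maxima from the known period
    n_peaks = int((t[-1] - t_first_peak) / period) + 1
    peaks = t_first_peak + period * np.arange(n_peaks)
    # ASSUMPTION: symmetric waveform, minima halfway between maxima
    valleys = peaks[:-1] + period / 2
    return peaks, valleys

# Synthetic sine sweep, made-up numbers
f_sweep = 1e3
t = np.arange(10000) * 1e-6
V = 50.0 * np.sin(2 * np.pi * f_sweep * t)
peaks, valleys = predict_extrema(t, V, f_sweep)
```

Each interval from a peak to the following valley is then one voltage ramp down, and each interval from a valley to the following peak is one ramp up.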

## Plot a sample $I$-$V$ characteristic

As an $I$-$V$ characteristic example, we take the first sweeping voltage period starting after $t = 7$ ms. We plot the $I$-$V$ characteristics separately for the voltage ramp up and ramp down to show any potential hysteresis.

## Apply the bin average to the $I$-$V$ characteristic

$I$-$V$ characteristics often contain a lot of fluctuations. This can mean that the exponential fit will not converge. In the past, when fitting techniques were slow, this was alleviated by applying the bin average to the data.

Bin averaging means breaking the data into individual "bins" and averaging the samples within each bin. Typically, the x axis (here the biasing voltage $V$) is split into even parts and all the samples within a given part (bin) are averaged. Each average is given an errorbar, calculated as the standard deviation of the averaged data. The errorbars can then be used as weights during the characteristic fitting.
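A minimal sketch of such a bin average, assuming equal-width bins in $V$ and dropping bins that contain no samples (the function name `bin_average` and the default number of bins are illustrative):

```python
import numpy as np

def bin_average(V, I, n_bins=40):
    """Average I within equal-width bins of V.

    Returns the bin centres, the mean current in each bin and its
    standard deviation. Empty bins are dropped.
    """
    V = np.asarray(V, dtype=float)
    I = np.asarray(I, dtype=float)
    edges = np.linspace(V.min(), V.max(), n_bins + 1)
    # Bin index of every sample; the clip puts V.max() into the last bin
    idx = np.clip(np.digitize(V, edges) - 1, 0, n_bins - 1)
    centres = 0.5 * (edges[:-1] + edges[1:])
    V_bin, I_mean, I_std = [], [], []
    for k in range(n_bins):
        samples = I[idx == k]
        if samples.size > 0:
            V_bin.append(centres[k])
            I_mean.append(samples.mean())
            I_std.append(samples.std())
    return np.array(V_bin), np.array(I_mean), np.array(I_std)

# Tiny hand-checkable example: two bins with two samples each
V_bin, I_mean, I_std = bin_average([0.0, 0.1, 0.9, 1.0],
                                   [1.0, 3.0, 5.0, 7.0], n_bins=2)
```

A bin holding a single sample gets a standard deviation of zero, which cannot be used directly as a fit weight. Such bins need separate handling before the errorbars are passed to a fitting routine.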

Today's fitting techniques are, however, much more powerful than they used to be. Bin averaging no longer provides faster results but, on the contrary, distorts them. This is because its errorbars, pretty as they are, are not very representative of the actual uncertainties in the signal. It is much better to fit the $I$-$V$ characteristic as we collect it, sample by sample.

We will demonstrate the difference between fitting the full and the bin-averaged $I$-$V$ characteristic in this notebook. Thereafter, we will use bin averaging to get a good first estimate of the plasma parameters. This can improve the quality of the fit to the full data.

In the following, we calculate the bin average of the two $I$-$V$ characteristics shown in the figure above.

## Fit the bin-averaged $I$-$V$ characteristic

Next, we fit this binned $I$-$V$ characteristic with the exponential function and print the resulting plasma parameters.

Notice that only a part of the curve is used as fit input, in particular the data points whose probe current value is above $-2 I_{sat}$. This improves the fit stability by disregarding the more volatile datapoints near the electron branch of the $I$-$V$ characteristic.
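The restricted fit can be sketched as below. Since $I_{sat}$ is not known before the fit, this sketch takes the cutoff from the $I_{sat}$ of the initial guess. That is an assumption of this example, and the notebook may choose the cutoff differently. The model uses the convention in which the current saturates at $I_{sat}$ for $V$ well below $V_f$. The names `iv_model` and `fit_ion_branch` and the synthetic data are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

def iv_model(V, I_sat, V_f, T_e):
    return I_sat * (1.0 - np.exp((V - V_f) / T_e))

def fit_ion_branch(V, I, p0, sigma=None):
    """Fit the model to the points with I > -2 * I_sat.

    p0 = (I_sat, V_f, T_e) is the initial guess; its I_sat sets the cutoff.
    sigma optionally holds the errorbars of I, used as fit weights.
    """
    keep = I > -2.0 * p0[0]
    if sigma is not None:
        sigma = sigma[keep]
    popt, pcov = curve_fit(iv_model, V[keep], I[keep], p0=p0, sigma=sigma)
    return popt, pcov

# Synthetic characteristic with made-up parameters and Gaussian noise
rng = np.random.default_rng(0)
V = np.linspace(-100.0, 40.0, 500)
I = iv_model(V, 2e-3, -10.0, 15.0) + rng.normal(0.0, 5e-5, V.size)

popt, pcov = fit_ion_branch(V, I, p0=(1.5e-3, 0.0, 10.0))
I_sat_fit, V_f_fit, T_e_fit = popt
```

The same function can be called with the bin-averaged data and their errorbars as `sigma`, or with the full data and the binned fit result as `p0`.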

## Fit the full $I$-$V$ characteristic

We use the fit of the bin-averaged data as initial guesses for the fit of the full data.

## Investigate the fit result errorbars with the covariance matrix

The Python fitting function scipy.optimize.curve_fit returns, besides the fit result values popt, also the so-called covariance matrix pcov. This is a 2D matrix whose diagonal contains the variances of the fitted parameters, i.e. the squares of the "fit errors". Their square roots serve as an estimate of the fit result errorbars.
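As a short illustration on synthetic data (made-up parameters, Gaussian noise), the errorbars follow from the diagonal of `pcov`. The model function `iv_model` is a stand-in for the notebook's fit function and assumes the current saturates at $I_{sat}$ for $V$ well below $V_f$.

```python
import numpy as np
from scipy.optimize import curve_fit

def iv_model(V, I_sat, V_f, T_e):
    return I_sat * (1.0 - np.exp((V - V_f) / T_e))

# Synthetic ion branch, for illustration only
rng = np.random.default_rng(1)
true_params = np.array([2e-3, -10.0, 15.0])
V = np.linspace(-100.0, 5.0, 300)
I = iv_model(V, *true_params) + rng.normal(0.0, 5e-5, V.size)

popt, pcov = curve_fit(iv_model, V, I, p0=(1.5e-3, 0.0, 10.0))
perr = np.sqrt(np.diag(pcov))   # one-standard-deviation errorbars

for name, value, err in zip(("I_sat", "V_f", "T_e"), popt, perr):
    print(f"{name} = {value:.4g} +/- {err:.2g}")
```

The off-diagonal elements of `pcov` describe how the parameter estimates are correlated. They are ignored when only the diagonal is used.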

Notice that the values obtained by fitting the bin-averaged $I$-$V$ characteristics may not fall within these errorbars.

These errorbars, however, may not be very representative of the uncertainty of the fit results. To get a real sense of the $I_{sat}$, $V_f$ and $T_e$ uncertainty due to the data fluctuations, we employ so-called bootstrapping.

## Investigate the fit result errorbars with bootstrapping

Bootstrapping (Wikipedia article) is a simple and flexible tool for calculating errorbars. The general idea is as follows:

1. Calculate your quantity (here $I_{sat}$, $V_f$ and $T_e$) from your dataset (here the $I$-$V$ characteristic).
2. Create a large number of synthetic datasets, based on the original one.
3. Calculate your quantity for each of the synthetic datasets.
4. Look how your quantity varies between these datasets.

In other words, the uncertainty of the quantity (the $I_{sat}$, $V_f$ and $T_e$ errorbars) is gauged by how much the "synthetic" $I_{sat}$, $V_f$ and $T_e$ vary across the synthetic datasets.

Bootstrapping has a lot of advantages, some of which will be shown later in the notebook. Among them is that it can be applied to any dataset and any quantity you calculate from it. It makes no assumptions about the distribution function of the data (which, in other methods, is frequently assumed to be Gaussian) and you can adjust its precision easily by changing the number of synthetic datasets.

Its major drawback is that it takes a lot of time, particularly if calculating your quantity is complicated (or, God forbid, cannot be automated). But in the case of our $I$-$V$ characteristics, the time needed is not that long. (In the current tests, the entire notebook takes about 30 seconds to execute on a personal computer.) Plus, the synthetic datasets are by nature independent, so the calculation can be easily parallelised.

The following function creates a number of synthetic $I$-$V$ characteristics (by default 100), fits them and returns the resulting 100 samples of synthetic $I_{sat}$, $V_f$ and $T_e$.
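The notebook's function is not reproduced here. The sketch below shows one common way to do it, under the assumption that each synthetic characteristic is built by resampling the measured $(V, I)$ pairs with replacement. The notebook may construct its synthetic datasets differently. The names `iv_model` and `bootstrap_fit` and the synthetic input data are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

def iv_model(V, I_sat, V_f, T_e):
    return I_sat * (1.0 - np.exp((V - V_f) / T_e))

def bootstrap_fit(V, I, p0, n_synthetic=100, seed=None):
    """Fit n_synthetic resampled copies of the (V, I) data.

    Returns an array of shape (n_fits, 3) with columns I_sat, V_f, T_e.
    Fits that fail to converge are skipped, so n_fits <= n_synthetic.
    """
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(n_synthetic):
        # ASSUMPTION: synthetic dataset = samples drawn with replacement
        pick = rng.integers(0, V.size, size=V.size)
        try:
            popt, _ = curve_fit(iv_model, V[pick], I[pick], p0=p0)
        except RuntimeError:          # the fit did not converge
            continue
        results.append(popt)
    return np.array(results)

# Synthetic ion branch with made-up parameters, for illustration only
rng = np.random.default_rng(2)
V = np.linspace(-100.0, 5.0, 300)
I = iv_model(V, 2e-3, -10.0, 15.0) + rng.normal(0.0, 5e-5, V.size)

samples = bootstrap_fit(V, I, p0=(2e-3, -10.0, 15.0), n_synthetic=100, seed=3)
errorbars = samples.std(axis=0)       # spread of I_sat, V_f and T_e
```

The standard deviation is only one way to summarise the spread. Percentiles of `samples` can be used instead when the distribution of the fitted values is asymmetric.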

First we visualise the different fits by plotting them onto the $I$-$V$ characteristic.