Written by Katerina Hromasova with input from Martina Lauerova, Georgiy Sarancha, Jan Stockel, Vojtech Svoboda, Michael Komm and others.

The following figure shows the ideal Langmuir probe $I$-$V$ characteristic.

The ion branch of the curve (left half of the plot) can be described by a three-parameter exponential function.

$I(V) = I_{sat} \left( 1 - \exp \left( -\frac{V - V_f}{T_e} \right)\right)$

The three parameters are the ion saturated current $I_{sat}$, the probe floating potential $V_f$ and the electron temperature $T_e$ [eV]. The shape of the characteristic changes depending on these parameters, and by fitting an experimental $I$-$V$ characteristic with an exponential function, one may retrieve their values.

To collect the whole $I$-$V$ characteristic in experiment, the biasing voltage $V$ on the probe is swept (i.e. varied periodically). The exact voltage shape is irrelevant, though we most often encounter the sawtooth (zig-zag) shape and the sine shape. The biasing voltage $V$ is then plotted against the current $I$ flowing from the probe to the ground and the curve is fitted with the exponential.

This notebook performs $I$-$V$ characteristic fitting throughout the current discharge. It documents the process step by step and concludes with drawing the temporal evolution of the ion saturated current $I_{sat}$, the probe floating potential $V_f$ and the electron temperature $T_e$.

**Note: All the time variables are given in seconds.**

First we import basic libraries: Numpy and Matplotlib. We will import more libraries throughout the notebook as needed.

In [1]:

```
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
#Pick the discharge to analyse
shot_no = 35801 #35428 #test discharge for which the notebook will definitely work
shot = shot_no
```

The Langmuir probe we shall be working with is placed on the PetiProbe.

(The Langmuir probe is the small metal pin on the right.)

The data directory of the PetiProbe is `http://golem.fjfi.cvut.cz/shots/{shot}/Diagnostics/PetiProbe/`

. Here, we write the function `get_data`

to download the data.

In [2]:

```
from urllib.error import HTTPError # recognise the error stemming from missing data
import pandas as pd # for reading csv files
#Define an exception which will be raised if the data is missing and stop the notebook execution
class StopExecution(Exception):
def _render_traceback_(self):
pass
def get_data(shot, identifier):
URL = "http://golem.fjfi.cvut.cz/shots/{shot}/Diagnostics/PetiProbe/{identifier}.csv"
url = URL.format(shot=shot, identifier=identifier)
try:
df = pd.read_csv(url, names=['time', identifier], index_col='time')
except HTTPError:
print('File not found at %s. Aborting notebook execution.' % url)
raise StopExecution
t = np.array(df.index)
data = np.transpose(np.array(df))[0]
return t, data
```

The biasing voltage $V$ is collected under the name `U_bias`

. The voltage proportional to the probe current is called `U_current`

. The probe current can be calculated as $I = V/R$, where $R=46.7 \, \Omega$ is the measuring resistor resistance.

In the following, we load this data for the current shot, calculate the probe current $I$ and plot the time evolution of $I$ and $V$. Notice that at the discharge beginning, the current isn't flat zero. This is the effect of the parasitic current, which we will discuss shortly.

In [3]:

```
#Load the probe data
t, U_bias = get_data(shot, 'U_bias')
t, U_current = get_data(shot, 'U_current')
I = U_current/46.7 #probe current
V = U_bias # probe voltage
#Prepare the figure
fig = plt.figure('Time evolution of the raw probe data', figsize=(12,5))
fig.tight_layout()
#Plot the probe current
ax1 = plt.subplot(121)
ax1.set_title('Raw probe current')
plt.plot(t*1000, I*1000)
plt.xlabel('$t$ [ms]')
plt.ylabel('$I$ [mA]')
plt.grid(True)
plt.axhline(c='k')
#Plot the probe voltage
ax2 = plt.subplot(122)
ax2.set_title('Raw probe voltage')
plt.plot(t*1000, V)
plt.xlabel('$t$ [ms]')
plt.ylabel('$V$ [V]')
plt.grid(True)
plt.axhline(c='k')
# Cut all the signals so that they start at t=0 (for later calculation purposes)
I = I[t>=0]
V = V[t>=0]
t = t[t>=0]
# Remove infinities and NaNs from the signals
def remove_bad_numbers(data):
n_bad = np.sum(np.isnan(data) | np.isinf(data))
if np.isnan(data[0]) or np.isinf(data[0]):
data[0] = 0
for i in range(1, data.size):
if np.isnan(data[i]) or np.isinf(data[i]):
data[i] = data[i-1]
return data, n_bad
I, n_bad = remove_bad_numbers(I)
print('Probe current: %s infinities removed' % n_bad)
V, n_bad = remove_bad_numbers(V)
print('Probe voltage: %s infinities removed' % n_bad)
```

Probe current: 0 infinities removed Probe voltage: 0 infinities removed

The parasitic current appears due to the capacity of the data collection system. At high sweeping frequencies, the wires behave like capacitors and cause current oscillations proportional to the time derivative of the biasing voltage. This parasitic current adds up with the probe current, distorting it.

$I_{total}(V) = I_{probe}(V) + c \cdot \frac{dV}{dt}$

Since the biasing voltage is largely independent of the plasma parameters, $V(t)$ is periodically constant throughout the discharge and so is the parasitic current. We use this in the parasitic signal reconstruction and removal.

First, we sample the parasitic current at the beginning of the discharge, where $I_{probe}=0$ and $I_{total}=c \cdot \frac{dV}{dt}$. This is the time period between the opening of the $B_t$ capacitor banks and the opening of the current drive capacitor banks.

In [4]:

```
from urllib.request import urlopen
#Define a function for downloading single values from the GOLEM database
def get_parameter(url, shot, silent=False):
URL = 'http://golem.fjfi.cvut.cz/shots/%i/%s' % (shot, url)
if not silent:
print(URL)
f = urlopen(URL)
try:
return np.loadtxt(f)
except ValueError: # the data couldn't be converted to a row of numbers
return np.array([np.nan])
#Load the time of the thyristor opening
Tcd = get_parameter('Production/Parameters/Tcd', shot) * 1e-6 #s
TBt = get_parameter('Production/Parameters/TBt', shot) * 1e-6 #s
T0 = Tcd - TBt #time before which the parasitic signal is sampled
if T0 == 0: #set SOME value so the script doesn't fail
print('Warning: T0 set arbitrarily.')
T0 = 0.001
print('T0 = %.3f ms' % (T0*1000))
#Plot the parasitic signal
plt.figure('Parasitic signal sample', figsize=(8,6))
t_sample = t[t<T0]
I_sample = I[t<T0]
plt.plot(t_sample*1000, I_sample*1000)
plt.title('Parasitic signal sample')
plt.xlabel('$t$ [ms]')
plt.ylabel('$I$ [mA]')
plt.grid(True)
```

We want to "clone" this sample and cover the rest of the discharge with it. To do that, we need to know exactly how long its period is. We load this from the database, where the sweeping frequency `f_fg`

is stored.

In [5]:

```
# Load the sweeping frequency, calculate the period and print them
f = get_parameter('Diagnostics/PetiProbe/Parameters/f_fg', shot) #Hz
T = 1./f #s
print('f = %.1f kHz' % (f*1e-3))
print('T = %.3f ms' % (T*1e3))
```

http://golem.fjfi.cvut.cz/shots/35801/Diagnostics/PetiProbe/Parameters/f_fg f = 5.0 kHz T = 0.200 ms

Next, we pick a few whole periods of the parasitic signal from the discharge beginning and clone the entire parasitic signal from them. Finally, we subtract the parasitic current from the total current, retrieving the probe current alone.

In [6]:

```
# Clone the parasitic current sample
n = int(np.floor(T0/T)) #number of whole periods in the parasitic signal
par_sample = I[t < T*n] #parasitic signal sample, exactly n periods long
n_final = int(np.ceil( t[-1] / (T*n) )) #number of sample repetitions required to cover the whole probe signal
par = np.hstack(n_final * [par_sample]) #extended parasitic signal covering the probe signal and a bit
par = par[:t.size] #extended parasitic signal covering exactly the whole probe signal
#Plot the total current and the parasitic current
plt.figure('Cloned parasitic signal', figsize=(8,6))
plt.title('Cloned parasitic signal')
plt.plot(t*1000, I*1000, label='Total current')
plt.plot(t*1000, par*1000, label='Parasitic current')
plt.legend()
plt.xlabel('$t$ [ms]')
plt.ylabel('$I$ [mA]')
plt.grid(True)
plt.axhline(c='k')
#Plot the probe current alone
plt.figure('Probe current after parasitic signal removal', figsize=(8,6))
plt.title('Probe current after parasitic signal removal')
I = I - par
plt.plot(t*1000, I*1000, label='Probe current')
plt.xlabel('$t$ [ms]')
plt.ylabel('$I$ [mA]')
plt.grid(True)
plt.legend()
plt.axhline(c='k')
```

Out[6]:

<matplotlib.lines.Line2D at 0x7f3f388be3d0>

The probe current $I$ and voltage $V$ are now ready to be plotted into the $I-V$ characteristic. However, we can't mix $I-V$ characteristics from different parts of the discharge - the plasma paramaters are different and so are the $I-V$ characteristics. We need to treat them separately, and that means breaking up the signal into individual periods of the sweeping voltage.

In the following, we create a list of voltage peaks `maxima`

and valleys `minima`

. Specifically, we detect the first peak position in $V$ and "predict" the following peaks based on the sweeping period.

In [7]:

```
from scipy.signal import find_peaks
from scipy.interpolate import UnivariateSpline
#Prepare figure
mask = (t < T)
fig, axs = plt.subplots(1, 2, num='Probe voltage maxima detection', figsize=(11,5))
fig.tight_layout()
fig.subplots_adjust(wspace=0.3)
#Select the first period of the voltage signal and plot it
axs[0].set_title('First voltage period')
axs[0].plot(1000*t[mask], V[mask])
axs[0].set_xlabel('$t$ [ms]')
axs[0].set_ylabel('$V$ [V]')
axs[0].grid(True)
#Smooth the voltage and plot it
spline = UnivariateSpline(t[mask], V[mask])
V_smooth = spline(t[mask])
axs[0].plot(1000*t[mask], V_smooth)
#Find the index of its peak and plot it
peaks, _ = find_peaks(V_smooth)
first_peak_index = peaks[0]
axs[0].plot(1000*t[first_peak_index], V_smooth[first_peak_index], 'ro')
#Find the number of samples that makes up one period of the probe voltage
dt = t[1] - t[0]
N_period = int(T/dt)
#Create a list of maxima and minima indexes on the probe voltage
maxima = np.array([i*N_period for i in range(100000)]) + first_peak_index
maxima = maxima[maxima < t.size]
minima = maxima[:-1] + int(N_period/2)
extremes = np.sort(np.hstack((minima, maxima)))
#Plot the voltage maxima and minima for check
axs[1].set(title='Probe voltage', xlabel='$t$ [ms]', ylabel='$V$ [V]')
axs[1].plot(t*1000, V)
axs[1].plot(1000*t[mask], V_smooth)
axs[1].plot(1000*t[maxima], V[maxima], 'ro')
axs[1].plot(1000*t[minima], V[minima], 'bo')
axs[1].grid(True)
axs[1].axhline(c='k')
#plt.xlim(0,2)
```

Out[7]:

<matplotlib.lines.Line2D at 0x7f3f32716850>

As an $I-V$ characteristic example, we take the first sweeping voltage period starting after $t = 7$ ms. We plot the $I$-$V$ characteristics separately for the voltage ramp up and ramp down to show any potential hysteresis.

In [8]:

```
#Define where the sample extrema should start
t_ext = t[extremes]
sample_ext = extremes[t_ext>7e-3]
#Pick samples for the first I-V characteristic
sample_t1 = t[sample_ext[0] : sample_ext[1]]
sample_I1 = I[sample_ext[0] : sample_ext[1]]
sample_V1 = V[sample_ext[0] : sample_ext[1]]
#Pick samples for the second I-V characteristic
sample_t2 = t[sample_ext[1] : sample_ext[2]]
sample_I2 = I[sample_ext[1] : sample_ext[2]]
sample_V2 = V[sample_ext[1] : sample_ext[2]]
#Plot the time evolution of the probe voltage
plt.figure('One period of probe voltage', figsize=(8,4))
plt.title('One period of probe voltage')
plt.plot(sample_t1*1000, sample_V1, 'r.')
plt.plot(sample_t2*1000, sample_V2, 'b.')
plt.xlabel('$t$ [ms]')
plt.ylabel('$V$ [V]')
plt.grid(True)
plt.axhline(c='k')
#Plot the time evolution of the probe current
plt.figure('One period of probe current', figsize=(8,4))
plt.title('One period of probe current')
plt.plot(sample_t1*1000, sample_I1*1000, 'r.')
plt.plot(sample_t2*1000, sample_I2*1000, 'b.')
plt.xlabel('$t$ [ms]')
plt.ylabel('$I$ [mA]')
plt.grid(True)
plt.axhline(c='k')
#Plot the two I-V characteristics
plt.figure('Sample of the I-V characteristic', figsize=(8,4))
plt.title('Sample of the I-V characteristic')
plt.plot(sample_V1, sample_I1*1000, 'r.')
plt.plot(sample_V2, sample_I2*1000, 'b.')
plt.xlabel('$V$ [V]')
plt.ylabel('$I$ [mA]')
plt.grid(True)
plt.axhline(c='k')
```

Out[8]:

<matplotlib.lines.Line2D at 0x7f3f325c9ca0>

$I$-$V$ characteristics often contain a lot of fluctuations. This can mean that the exponential fit will not converge. In the past, when fitting techniques were slow, this was alleviated by applying the *bin average* to the data.

Bin averaging is breaking the data into individual "bin" and averaging them within that bin. Typically, the x axis (here the biasing voltage $V$) is split into even parts and all the samples within a given part (bin) are averaged. Each average is given an errorbar, calculated as the standard deviation of the averaged data. The errorbars can then be used as weights during the characteristic fitting.

Today's fitting techniques are, however, much more powerful than they used to be. Bin averaging no longer provides faster result but, on the contrary, distorts the results. This is becuase its errorbars, pretty as they are, are not very representative of the actual uncertainties in the signal. It is much better to fit the $I-V$ characteristic as we collect it, sample by sample.

We will demonstrate the difference between fitting the full and the bin-averaged $I-V$ characteristic in this notebook. Thereafter, we will use bin averaging to get a good first estimate of the plasma parameters. This can improve the fit quality of the real data.

In the following, we calculate the bin average of the two $I$-$V$ characteristics shown in the figure above.

In [9]:

```
#Define a function which performs bin averaging
def bin_average(x, y, N_bins=100):
dx = (np.nanmax(x) - np.nanmin(x)) / N_bins
bins = np.linspace(np.nanmin(x) + dx/2, np.nanmax(x) - dx/2, N_bins)
y_binned = []
yerr = []
for bin_centre in bins:
y_samples = y[(bin_centre-dx/2 < x) & (x <= bin_centre+dx/2)]
y_binned.append(np.mean(y_samples))
yerr.append(np.std(y_samples))
return bins, np.array(y_binned), np.array(yerr)
#Select data from both the above I-V characteristics and apply bin averaging to them
mask = slice(sample_ext[0], sample_ext[2])
V_binned, I_binned, Ierr = bin_average(V[mask], I[mask], N_bins=40)
#Plot the bin-averaged I-V characteristic
plt.figure('Bin-averaged I-V characteristic', figsize=(8,6))
plt.title('Bin-averaged I-V characteristic')
plt.errorbar(V_binned, I_binned, yerr=Ierr, color='r', marker='o', ls='None', capsize=3, elinewidth=0.5)
plt.grid(True)
plt.axhline(c='k')
plt.xlabel('$V$ [V]')
plt.ylabel('$I$ [mA]')
```

Out[9]:

Text(0, 0.5, '$I$ [mA]')

Next, we fit this binned $I$-$V$ characteristic by the exponential function and print the resulting plasma parameters.

Notice that only a part of the curve is used as fit input, in particular the data points whose probe current value is above $-2 I_{sat}$. This improves the fit stability by disregarding the more volatile datapoints near the electron branch of the $I$-$V$ characteristic.

In [10]:

```
import scipy.optimize as optimization # for fitting
#Define the 3-parameter fit function; its arguments are a=Isat, b=Vf and c=Te
def fit3par(x,a,b,c):
return a*(1-np.exp((x-b)/c))
#Trim the I-V characteristic
Isat_estimate = np.mean(I_binned[:20])
V_for_fit = V_binned[I_binned > -2*Isat_estimate]
I_for_fit = I_binned[I_binned > -2*Isat_estimate]
Ierr_for_fit = Ierr[I_binned > -2*Isat_estimate]
#Fit the I-V characteristic
p0 = [0.005, -5, 15] #initial guess of plasma parameters
popt, pcov = optimization.curve_fit(fit3par, V_for_fit, I_for_fit, p0, sigma=Ierr_for_fit)
#Keep a copy of the results for later
popt_bin = popt
pcov_bin = pcov
#Plot the fitted I-V characteristic
plt.figure('3-parameter fit of the binned I-V characteristic', figsize=(8,6))
plt.title('3-parameter fit of the binned I-V characteristic')
plt.errorbar(V_binned, I_binned, yerr=Ierr, color='k', marker='o', ls='None', capsize=3,
elinewidth=0.5, alpha=0.3, label='disregarded samples')
plt.errorbar(V_for_fit, I_for_fit, yerr=Ierr_for_fit, color='r', marker='o', ls='None',
capsize=3, elinewidth=0.5, label='samples used for fitting')
x = np.linspace(V_for_fit.min(), V_for_fit.max(), 100)
plt.plot(x, fit3par(x,*popt), color = 'b')
plt.grid(True)
plt.legend()
plt.axhline(c='k')
plt.xlabel('$V$ [V]')
plt.ylabel('$I$ [mA]')
#Print the fit results
print('Isat = %.1f mA' % (popt[0]*1000))
print('V_fl = %.1f V' % popt[1])
print('Te = %.1f eV' % popt[2])
```

Isat = 3.0 mA V_fl = -39.4 V Te = 30.3 eV

We use the fit of the bin-averaged data as initial guesses for the fit of the full data.

In [11]:

```
#Name the data to fit
V_sample = V[mask]
I_sample = I[mask]
#Trim the I-V characteristic
Isat_estimate = popt[0] #Isat of the bin-averaged I-V characteristic
V_for_fit = V_sample[I_sample > -2*Isat_estimate]
I_for_fit = I_sample[I_sample > -2*Isat_estimate]
#Fit the I-V characteristic
p0 = popt_bin #initial guess of plasma parameters
popt, pcov = optimization.curve_fit(fit3par, V_for_fit
```