Data mining

The Golem database supports web access, but downloading a large number of scalar parameters one by one is very inefficient. It is therefore recommended to use the function get_history from the pygolem toolkit; the function is located in pygolem_lite.modules. The same function is used by the HistoricalAnalysis webpage. Only a Python version is implemented, but the data can easily be saved to .mat files; see the second example.

Example of use:

from pygolem_lite.modules import get_history
from matplotlib.pyplot import plot, show

shots = range(10000, 10500)  # range of shot numbers
data = get_history("pressure", shots)  # one value per shot
plot(shots, data, ',')  # ',' draws each shot as a single pixel
show()

Downloading data for breakdown studies:

from pygolem_lite.modules import get_history
from numpy import savez, mean, isnan
diags = ['plasma', 'gas_filling', 'pressure_initial', 'pressure', 'Ub',
         'Tb', 'Ucd', 'Tcd', 'Ust', 'Tst', 'preionization', 'breakdown_voltage',
         'loop_voltage_max', 'plasma_life', 'transformator_saturation']
shots = range(5000, 10900)  # range of shot numbers
data = dict()
for diag in diags:
    data[diag] = get_history(diag, shots)
    # percentage of shots with a valid (non-NaN) value
    success = (1 - mean(isnan(data[diag]))) * 100
    print("Success rate %.1f %% - %s" % (success, diag))
for diag in ['plasma_status', 'session_name']:  # load string variables
    data[diag] = get_history(diag, shots, dtype="str")

savez('data', shots=shots, data=data)  # writes data.npz

# Save the data for MATLAB (writes data.mat)
from scipy.io import savemat
data['shots'] = list(shots)
savemat('data', data)
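
Reading the saved archive back is not shown above, so here is a minimal sketch of it. It relies only on standard NumPy behaviour: savez stores a dict as a pickled 0-d object array, so it has to be loaded with allow_pickle=True and unwrapped with .item(). The small data dict and the temporary path are made up for illustration; real data would come from get_history.

```python
import os
import tempfile

import numpy as np

# Made-up stand-in for the dict filled by get_history
shots = list(range(5000, 5005))
data = {'pressure': np.array([20.0, np.nan, 21.5, 19.0, 22.0])}

# Write the archive to a temporary directory (savez appends ".npz")
path = os.path.join(tempfile.mkdtemp(), 'data')
np.savez(path, shots=shots, data=data)

# The dict was pickled into a 0-d object array, so pickling must be allowed
archive = np.load(path + '.npz', allow_pickle=True)
loaded_shots = archive['shots']
loaded_data = archive['data'].item()  # unwrap the 0-d array back into a dict

print(loaded_shots)
print(sorted(loaded_data.keys()))
```

Only load pickled archives that you created yourself, since unpickling can execute arbitrary code.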

Note:

  • plasma_status is nan => some serious failure of the diagnostics
  • plasma_life > 15 ms is probably an error
  • loop_voltage_max < 5 V is probably a DAS error
  • pressure > 100 mPa is unphysical (the chamber was probably open)
  • transformator_saturation > 0.8 together with plasma == 1 => probably a false plasma detection
  • session_name: some sessions should be avoided, e.g. Technological/, Vacuum/
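
The checks in the notes above can be combined into one boolean mask over the shots. The following is only a sketch under several assumptions that are not confirmed by the database documentation: plasma_life is stored in seconds, pressure in mPa, loop_voltage_max in volts, a missing plasma_status appears either as a float NaN or as the string 'nan', and unwanted sessions can be recognised by a substring of session_name. The function names and the example values are made up for illustration.

```python
import numpy as np


def is_missing(value):
    """True for a float NaN or the string 'nan' (assumed encodings of a missing value)."""
    if isinstance(value, float):
        return bool(np.isnan(value))
    return str(value).strip().lower() == 'nan'


def plausible_shots(data, max_life=15e-3, min_loop_voltage=5.0,
                    max_pressure=100.0, max_saturation=0.8,
                    bad_sessions=('Technological/', 'Vacuum/')):
    """Return a boolean mask that is False for shots failing any check.

    Units are assumptions: max_life in s, min_loop_voltage in V, max_pressure in mPa.
    Numeric NaN values pass the numeric checks, because comparisons with NaN are False.
    """
    def numeric(name):
        return np.asarray(data[name], dtype=float)

    ok = ~np.array([is_missing(s) for s in data['plasma_status']])
    with np.errstate(invalid='ignore'):
        ok &= ~(numeric('plasma_life') > max_life)
        ok &= ~(numeric('loop_voltage_max') < min_loop_voltage)
        ok &= ~(numeric('pressure') > max_pressure)
        false_plasma = ((numeric('transformator_saturation') > max_saturation)
                        & (numeric('plasma') == 1))
    ok &= ~false_plasma
    ok &= ~np.array([any(bad in str(name) for bad in bad_sessions)
                     for name in data['session_name']])
    return ok


# Made-up example: the first shot is fine, each following shot fails one check in turn
example = {
    'plasma': [1, 1, 1, 1, 0, 1, 1],
    'plasma_status': ['ok', float('nan'), 'ok', 'ok', 'ok', 'ok', 'ok'],
    'plasma_life': [0.010, 0.010, 0.020, 0.010, 0.0, 0.010, 0.010],
    'loop_voltage_max': [12.0, 12.0, 12.0, 3.0, 12.0, 12.0, 12.0],
    'pressure': [20.0, 20.0, 20.0, 20.0, 150.0, 20.0, 20.0],
    'transformator_saturation': [0.3, 0.3, 0.3, 0.3, 0.3, 0.9, 0.3],
    'session_name': ['ExampleSession/'] * 6 + ['Vacuum/'],
}
mask = plausible_shots(example)
print(mask)
```

Keeping the thresholds as keyword arguments makes it easy to adjust them if the stored units turn out to differ from the assumptions above.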

Search closest shot script

This script finds the shots in the database that are closest to user-selected values of the variables. Example of use:

./Closest_Shot.py --Ub=500 --Ucd=200 --Ubd=50 --Tbd=0  --Tcd=0.003 --gas_filling=1 --preionization=1 --pressure_request=20

The five closest shots will be downloaded to the closest_shots file. The shots are searched in a preloaded data file covering shots 5000 to 10700.

List of inputs for closest shot search script:

diags = ['Ub', 'Ucd', 'Ubd', 'Tbd', 'Tcd', 'gas_filling', 'preionization', 'pressure_request',
         'plasma', 'loop_voltage_max', 'plasma_life', 'transformator_saturation',
         'plasma_status', 'session_name']

Note:

  • Plotting should be removed later
  • Variables that are not specified by the user are treated as irrelevant dimensions and ignored in the search
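
To illustrate what treating unspecified variables as irrelevant dimensions can mean, here is a minimal nearest-neighbour sketch. It is not the code of Closest_Shot.py, whose distance metric is not described here. The sketch assumes a Euclidean distance in which each requested variable is scaled by its standard deviation over the database, and the function name closest_shots and the example values are made up for illustration.

```python
import numpy as np


def closest_shots(shots, data, request, n_closest=5):
    """Return the n_closest shot numbers nearest to the requested values.

    Only the variables named in `request` enter the distance; all other
    entries of `data` are ignored. Scaling by the standard deviation is an
    assumption of this sketch.
    """
    shots = np.asarray(shots)
    dist2 = np.zeros(len(shots))
    for name, target in request.items():
        values = np.asarray(data[name], dtype=float)
        scale = np.nanstd(values)
        if not scale > 0:  # constant or all-NaN column: fall back to no scaling
            scale = 1.0
        dist2 += ((values - target) / scale) ** 2
    dist2[np.isnan(dist2)] = np.inf  # shots with missing values go last
    order = np.argsort(dist2, kind='stable')
    return shots[order[:n_closest]]


# Made-up example database of six shots
shots = np.arange(5000, 5006)
data = {
    'Ub': [400.0, 500.0, 510.0, 650.0, np.nan, 495.0],
    'Ucd': [200.0, 200.0, 250.0, 200.0, 200.0, 210.0],
    'pressure_request': [20.0, 20.0, 20.0, 20.0, 20.0, 5.0],
}

# pressure_request is not requested, so it does not affect the result
nearest = closest_shots(shots, data, {'Ub': 500, 'Ucd': 200}, n_closest=3)
print(nearest)
```

Without some scaling, a variable with large numeric values such as Ub would dominate one with small values such as Tcd, which is why the sketch normalises each dimension.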

Parallel data downloader:

download.py: the diagnostic and the shot range can be set up in the script.
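
The contents of download.py are not reproduced here. As a rough sketch of how a parallel download could be organised, the example below splits the shot range into chunks and fetches the chunks concurrently with a thread pool from the Python standard library. The helper names split_into_chunks and download_parallel are hypothetical, and fake_fetch is a stand-in for a real call such as get_history(diag, shots); whether get_history can safely be called from several threads is an assumption that would need to be checked.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(shots, chunk_size):
    """Split a sequence of shot numbers into consecutive lists of at most chunk_size."""
    shots = list(shots)
    return [shots[i:i + chunk_size] for i in range(0, len(shots), chunk_size)]


def download_parallel(fetch, diag, shots, chunk_size=100, workers=4):
    """Call fetch(diag, chunk) for every chunk in parallel and join the results.

    `fetch` must return one value per shot in the chunk it is given.
    """
    chunks = split_into_chunks(shots, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns the results in the same order as the chunks
        parts = list(pool.map(lambda chunk: fetch(diag, chunk), chunks))
    return [value for part in parts for value in part]


def fake_fetch(diag, shots):
    """Stand-in for the real download: returns the shot number as a float."""
    return [float(shot) for shot in shots]


values = download_parallel(fake_fetch, 'pressure', range(10000, 10010), chunk_size=3)
print(values)
```

Threads suit this task because the work is dominated by waiting for the network rather than by computation.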