---
format: markdown
toc: no
title: Data mining
categories: Analysis, SW
...

The Golem database supports web access, but downloading a large number of scalar parameters one by one is very inefficient. It is therefore recommended to use the function ``get_history`` from the [pygolem toolkit](/SW/pygolem). The function is defined in [pygolem_lite.modules](/SW/pygolem/modules.py) and is also used by the [HistoricalAnalysis](http://golem.fjfi.cvut.cz/operation/currentshot/analysis/Basics/1012HistoricalAnalysis.ON/) webpage.

*Although only a Python version is implemented, the data can easily be saved to `.mat` files; see the second example.*

Example of use:

~~~ { .python }
from pygolem_lite.modules import get_history
from matplotlib.pyplot import plot, show

shots = range(10000, 10500)
data = get_history("pressure", shots)  # one value per shot
plot(shots, data, ',')
show()
~~~

Downloading data for breakdown studies ([link](data_mining/data_mine.py)):

~~~ { .python }
from pygolem_lite.modules import get_history
from numpy import savez, mean, isnan

diags = ['plasma', 'gas_filling', 'pressure_initial', 'pressure',
         'Ub', 'Tb', 'Ucd', 'Tcd', 'Ust', 'Tst', 'preionization',
         'breakdown_voltage', 'loop_voltage_max', 'plasma_life',
         'transformator_saturation']
shots = range(5000, 10900)  # range of shot numbers

data = dict()
for diag in diags:
    data[diag] = get_history(diag, shots)
    # fraction of shots for which a value was found (missing values are nan)
    rate = (1 - mean(isnan(data[diag]))) * 100
    print("Success rate %.1f %% - %s" % (rate, diag))

for diag in ['plasma_status', 'session_name']:  # load string variables
    data[diag] = get_history(diag, shots, dtype="str")

savez('data', shots=shots, data=data)

# ! save data for matlab !
from scipy.io import savemat
data['shots'] = shots
savemat('data', data)
~~~

**Note:**

- if ``plasma_status`` is ``nan``, some serious failure of the diagnostics occurred
- ``plasma_life`` > 15 ms is probably an error
- ``loop_voltage_max`` < 5 V is probably a DAS error
- ``pressure`` > 100 mPa is unphysical (probably an opened chamber)
- ``transformator_saturation`` > 0.8 together with ``plasma`` == 1 is probably a false plasma detection
- ``session_name``: some sessions should be avoided, e.g. ``Technological/*``, ``Vacuum/*``

Search closest shot script
-----------------------------------

This [script](data_mining/Closest_Shot.py) finds the shots in the database that are closest to the user-selected values.

Example of use:

~~~ { .bash }
./Closest_Shot.py --Ub=500 --Ucd=200 --Ubd=50 --Tbd=0 --Tcd=0.003 --gas_filling=1 --preionization=1 --pressure_request=20
~~~

The five closest shots will be downloaded to the ``closest_shots`` file. Shots are searched in a preloaded [data file](data_mining/data_close.npz) (shots 5000 to 10700).

List of inputs for the closest shot search script:

~~~ { .python }
diags = ["Ub", "Ucd", "Ubd", 'Tbd', 'Tcd', 'gas_filling', 'preionization',
         'pressure_request', 'plasma', 'loop_voltage_max', 'plasma_life',
         'transformator_saturation', 'plasma_status', 'session_name']
~~~

**Note:**

- Plotting should be removed later
- Variables that are not defined by the user are treated as irrelevant dimensions

Parallel data downloader:
---------------------------------------

[download.py](http://golem.fjfi.cvut.cz:5001/SW/pygolem/download.py) - the diagnostics and the shot range can be set up in the script
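
The idea of downloading several diagnostics at once can be sketched as follows. This is not the content of `download.py`, only a minimal illustration under stated assumptions: the helper name `download_parallel` and its `workers` parameter are hypothetical, and the fetching function is passed in as `fetch`, assumed to be callable as `fetch(diag, shots)` in the same way as ``get_history``.

~~~ { .python }
from multiprocessing.pool import ThreadPool


def download_parallel(fetch, diags, shots, workers=4):
    """Call fetch(diag, shots) for every diagnostic using a pool of threads.

    `fetch` is any callable with the calling convention fetch(diag, shots);
    it is a parameter so that this sketch does not depend on pygolem.
    Returns a dict mapping each diagnostic name to the fetched values.
    """
    shots = list(shots)
    pool = ThreadPool(workers)
    try:
        # map() keeps the order of `diags`, so zip() below pairs names correctly
        results = pool.map(lambda diag: fetch(diag, shots), diags)
    finally:
        pool.close()
        pool.join()
    return dict(zip(diags, results))
~~~

With pygolem installed, the call would presumably look like `download_parallel(get_history, ['pressure', 'Ub'], range(10000, 10500))`. Threads are used here because the work is dominated by waiting for the database; whether the server tolerates many simultaneous requests is not known, so `workers` should be kept small.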