.. Copyright 2018 Peter K. G. Williams and collaborators. Licensed under the
   Creative Commons Attribution-ShareAlike 4.0 International License.

.. _make-training-set:

Make Your Training Set
======================

In order to train a neural network, you need something to train it on! The
training data are a bunch of samples of your detailed calculation. That is:
given a choice of some input parameters (e.g., a harmonic number), your
detailed calculation will produce eight output parameters (the Stokes
radiative transfer coefficients). The exact number of input parameters can
vary depending on what particle distribution you’re modeling. *Neurosynchro*
needs a lot of samples of this function to develop a good approximation to it.

From the standpoint of *neurosynchro*, the tool that you use to **generate**
those data doesn’t matter. What matters is the **format** in which the
training data are stored. That said, *neurosynchro* has been designed to work
with `rimphony `_, which has `sample programs `_ that will generate training
data sets in the format described below.

.. attention:: Read this section carefully! *Neurosynchro* bakes in some
   assumptions that might surprise you.

The training data fed to *neurosynchro* must be saved as a set of plain
textual tables stored in a single directory. The file names must end in
``.txt``. Each table file is line-oriented. The first line is a header, and
all subsequent lines give samples of the exact calculation. For example::

  s(log) theta(lin) p(lin) k(lin) time_ms(meta) j_I(res) alpha_I(res) j_Q(res) alpha_Q(res) j_V(res) alpha_V(res) rho_Q(res) rho_V(res)
  1.95393e2 8.8966e-1 3.49e0 1.41e0 2.21270e3 8.42819e-35 2.26887e-8 -6.439070e-35 -1.80416e-8 1.17279e-35 3.56901e-9 3.2947e-7 3.8318e-5
  6.51244e2 8.0044e-1 3.28e0 1.94e0 3.30821e3 6.88161e-36 9.03608e-10 -5.226766e-36 -7.17748e-10 6.41000e-37 9.59868e-11 2.4798e-8 1.2309e-5

The header line gives the names of each column.
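
To make the layout concrete, here is a minimal sketch that parses a table of
this kind using only the Python standard library. It assumes that columns are
separated by whitespace (the delimiter is not specified here) and that every
header label has the form ``name(tag)``. The helpers ``parse_header`` and
``parse_table`` are hypothetical names chosen for illustration; they are not
part of *neurosynchro*.

```python
import re

# A header label such as "theta(lin)": a name followed by a parenthesized tag.
# Assumption: neither the name nor the tag contains whitespace or parentheses.
_COLUMN_RE = re.compile(r"^(?P<name>[^()\s]+)\((?P<kind>[^()\s]+)\)$")


def parse_header(line):
    """Split a header line into a list of (name, kind) pairs."""
    columns = []
    for label in line.split():
        match = _COLUMN_RE.match(label)
        if match is None:
            raise ValueError(f"unrecognized column label: {label!r}")
        columns.append((match.group("name"), match.group("kind")))
    return columns


def parse_table(text):
    """Parse table text into (columns, rows); each row maps name -> float."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    columns = parse_header(lines[0])
    rows = []
    for ln in lines[1:]:
        values = [float(v) for v in ln.split()]
        if len(values) != len(columns):
            raise ValueError("row length does not match header")
        rows.append({name: v for (name, _kind), v in zip(columns, values)})
    return columns, rows
```

Calling ``parse_table`` on the contents of one ``.txt`` file returns the
column descriptions together with one dictionary per sample, which is enough
to sanity-check a file before handing a whole directory to *neurosynchro*.
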
Each column name includes a suffix indicating its type:

``lin``
  An input parameter that is sampled linearly in some range. At the moment,
  the way in which the parameter was sampled isn’t actually used anywhere in
  the code. But it can be a helpful piece of information to have handy when
  specifying how the neural nets will be trained.
``log``
  An input parameter that is sampled logarithmically in some range.
``res``
  An output result from the computation.
``meta``
  A metadata value that records some extra piece of information about the
  sample in question. Above, the ``time_ms`` metadata item records how many
  milliseconds it took for *Rimphony* to calculate the sample. This is useful
  for identifying regions of parameter space where the code runs into
  numerical problems.

So, in the example above, there are four input parameters. The detailed
calculation shows that when the harmonic number *s* ≃ 195, observing angle
*theta* ≃ 0.9 radians, energy power-law index *p* ≃ 3.5, and pitch-angle
distribution index *k* ≃ 1.4, the emission coefficient *j_I* ≃ 8 ×
10\ :superscript:`-35` erg s\ :superscript:`-1` cm\ :superscript:`-3`
Hz\ :superscript:`-1` sr\ :superscript:`-1`. The *rimphony* calculation of
that result took about 2.2 seconds.

Something like 100,000 rows is enough to train some good neural networks. It
doesn't matter how many different files those rows are split into.

.. tip:: *Neurosynchro* takes a directory of files as an input, rather than
   one specific file, since the former is easier to create on a big HPC
   cluster where you can launch 1,000 jobs to compute coefficients for you in
   parallel.

.. tip:: Each individual input file can be easily loaded into a `Pandas `_
   data frame with the function call ``pandas.read_table()``.

Important assumptions
---------------------

In the example above, there are just four input parameters: *s*, *theta*, *p*,
and *k*. These are likely not the usual parameters that you see when thinking
about synchrotron radiation.
There’s an important reason for this! *Neurosynchro* bakes in three key
assumptions about how synchrotron radiation works:

1. You must compute all of your coefficients
   **at an observing frequency of 1 Hz**! This is because synchrotron
   coefficients scale simply with frequency: emission coefficients linearly
   with ν, absorption coefficients as 1/ν. So the observing frequency doesn’t
   actually need to be part of the neural network regression.
2. You must compute all of your coefficients
   **at an energetic particle density of 1 per cubic centimeter**! Here too,
   all the synchrotron coefficients scale simply with the energetic particle
   density (namely, they all scale linearly). Once again this means that the
   energetic particle density doesn’t actually need to be part of the
   regression.
3. You only need to do computations where the angle between the line of sight
   and the magnetic field, θ, is less than 90°. *Neurosynchro* assumes that
   all coefficients are symmetric about θ = 90° except the Stokes V
   components, which change sign.

Given those assumptions, almost every part of *neurosynchro* expects that the
following input parameters will exist:

*s*
  The harmonic number of interest, such that:

  .. math::

     \nu_\text{obs} = s \nu_\text{cyc} = s \frac{e B}{2 \pi m_e c}

*theta*
  The angle between the direction of radiation propagation and the local
  magnetic field, *in radians*.

Given the known ways in which synchrotron coefficients scale, the standard
quartet of input parameters ``nu``, ``theta``, ``n_e``, and ``B`` can be
reduced to these two parameters, plus scalings that are known *a priori*. In
the example above, the two remaining parameters *p* and *k* relate to the
shape of the particle distribution function.

.. _standard-output-parameters:

On the output side, *neurosynchro* applies some more assumptions to ensure
that it always produces physically sensible output (i.e., that the polarized
Stokes emission parameters are never bigger than the total-intensity Stokes
emission parameter). It also uses the standard linear polarization basis in
which Stokes Q is aligned with the magnetic field, which means that the Stokes
U parameters are zero by construction (see, for example, Shcherbakov & Huang
(2011), `DOI 10.1111/j.1365-2966.2010.17502.x `_, equation 17). So unless you
are doing something very unusual, your tables should always contain eight
output results:

*j_I*
  The calculated Stokes I emission coefficient, in erg s\ :superscript:`-1`
  cm\ :superscript:`-3` Hz\ :superscript:`-1` sr\ :superscript:`-1`.
*j_Q*
  The Stokes Q emission coefficient, in the same units.
*j_V*
  The Stokes V emission coefficient, in the same units.
*alpha_I*
  The calculated Stokes I absorption coefficient, in cm\ :superscript:`-1`.
*alpha_Q*
  The Stokes Q absorption coefficient, in the same units.
*alpha_V*
  The Stokes V absorption coefficient, in the same units.
*rho_Q*
  The Faraday conversion coefficient (mixing Stokes U and Stokes V), in the
  same units.
*rho_V*
  The Faraday rotation coefficient (mixing Stokes Q and Stokes U), in the
  same units.

**Next**: :ref:`transform your training set `.
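
As an illustration of the scaling assumptions described on this page, the
sketch below shows how values computed at 1 Hz and one particle per cubic
centimeter might be mapped back to a physical configuration. The helper names
(``harmonic_number``, ``fold_theta``, ``rescale``) are hypothetical and are
not part of the *neurosynchro* API; the constants are standard CGS values. Two
assumptions go beyond what is stated on this page: the Faraday coefficients
are scaled like the absorption coefficients (as 1/ν), and the sign flip for
θ > 90° is applied to every coefficient whose name ends in ``_V``.

```python
import math

# Physical constants in CGS (Gaussian) units.
ELECTRON_CHARGE = 4.80320425e-10  # esu
ELECTRON_MASS = 9.1093837e-28     # g
SPEED_OF_LIGHT = 2.99792458e10    # cm / s


def harmonic_number(nu_hz, b_gauss):
    """Return s = nu_obs / nu_cyc, with nu_cyc = e B / (2 pi m_e c)."""
    nu_cyc = ELECTRON_CHARGE * b_gauss / (
        2 * math.pi * ELECTRON_MASS * SPEED_OF_LIGHT
    )
    return nu_hz / nu_cyc


def fold_theta(theta):
    """Map theta in [0, pi] onto [0, pi/2].

    Returns the folded angle and the sign to apply to the V-type terms.
    """
    if theta > math.pi / 2:
        return math.pi - theta, -1.0
    return theta, 1.0


def rescale(normalized, nu_hz, n_e, v_sign=1.0):
    """Scale coefficients computed at 1 Hz and 1 cm^-3 to physical values.

    Emission terms (names starting with "j_") scale as n_e * nu.
    All other terms are scaled as n_e / nu; for the Faraday terms this is
    an assumption of this sketch.
    """
    scaled = {}
    for name, value in normalized.items():
        if name.startswith("j_"):
            result = value * n_e * nu_hz
        else:
            result = value * n_e / nu_hz
        if name.endswith("_V"):
            # Assumption: j_V, alpha_V, and rho_V all flip sign past 90 degrees.
            result *= v_sign
        scaled[name] = result
    return scaled
```

In use, ``harmonic_number`` and ``fold_theta`` would turn a physical
``(nu, B, theta)`` into the ``(s, theta)`` pair at which the normalized
coefficients are looked up or evaluated, and ``rescale`` would then convert
those eight normalized values into physical units for the chosen ``nu`` and
``n_e``.
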