neurosynchro¶
Neurosynchro is a small Python package for creating and using neural networks to quickly approximate the coefficients needed for fully-polarized synchrotron radiative transfer. It builds on the Keras deep learning library and is licensed under the MIT License.
This documentation refers to version 0.1 of neurosynchro.
- Take me to the tutorial where I get to train some neural networks! (The training can take a few hours if you’re on a laptop, though.)
- Take me to the tutorial where I just calculate some coefficients! (This tutorial is much quicker … maybe too quick.)
Who It’s For¶
Neurosynchro solves a very specific problem.
You are a person who wants to perform some numerical radiative transfer of synchrotron radiation in the Stokes basis. In this problem, for each point in space you need to calculate eight matrix coefficients: three numbers describing emission (total intensity, linear polarization, and circular polarization), three describing absorption, and two “Faraday terms” (rotation of linear polarization, conversion of circular and linear).
You have a code, such as Rimphony or Symphony, that can calculate these numbers for you as a function of some input physical parameters such as an electron temperature or an energy power law index. You also have a code such as grtrans that can perform your radiative transfer integration in the Stokes basis.
The problem is that your synchrotron calculator code is very precise but it’s just too slow for your needs: you need to do lots of integrations for arbitrary input parameters, and things are taking too long.
Enter neurosynchro. This package provides a toolkit for training neural networks to quickly approximate the coefficients you need. Once you’ve trained your network, you can get very good approximations of the necessary coefficients really quickly — something like 10,000 times faster than the detailed calculations.
Installation¶
How to Install neurosynchro¶
Neurosynchro is a pure-Python package, so it is not difficult to install by itself. However, it requires heavyweight Python libraries like Keras, which contain a lot of compiled code, so the recommended means of installation is through an Anaconda Python system.
Pre-built packages of neurosynchro are available through conda-forge, a community-based project that uses the conda package manager. If you are using a conda-based Python installation, the quickest path to getting neurosynchro going is to run the following commands in your terminal:
$ conda config --add channels conda-forge
$ conda install neurosynchro
Note, however, that conda-forge provides rebuilds of virtually every package in the stock Anaconda system, and this installation method will configure your system to prefer them. You may suddenly see lots of package version changes related to this. You can install neurosynchro without committing to using conda-forge for everything by skipping the conda config command and instead running:
$ conda install -c conda-forge neurosynchro
In the author’s experience, however, it is better to go the first route. This latter route won’t get package updates from conda-forge and can lead to vexing dependency mismatches further down the line.
Finally, it is also possible to try to install neurosynchro through pip in the standard fashion, with pip install neurosynchro. However, as mentioned above, this can easily get hairy if one of the big binary dependencies isn’t available. See neurosynchro’s requirements list for a list of what you’ll need.
Tutorial, AKA, How to Use It¶
There are three main ways that you might use neurosynchro:
- You might have the data files for a useful, trained neural network on hand. In this case, all you need to do is load it up and use it. We have provided sample trained neural networks for download — the above page walks you through downloading them. So, this is the quickest way to get started.
- You might have a training set on hand, but not have created your neural networks. In this case, you need to go through the process of training a new set of neural networks. Once again, an example training set is available for download. The training takes a few dozen core-hours of CPU time, so it can take a little while.
- Finally, if you are investigating a new or unusual region of parameter space, you might not even have an existing training set to work with. In this case you will need to start at the very beginning by making your own training set. How you do so is beyond the purview of neurosynchro, but in the above pages we document the format that the training data must obey.
Or, to break down the various steps in roughly sequential order:
Make Your Training Set¶
In order to train a neural network, you need something to train it on!
The training data are a bunch of samples of your detailed calculation. That is: given a choice of some input parameters (e.g., a harmonic number), your detailed calculation will produce eight output parameters (the Stokes radiative transfer coefficients). The exact number of input parameters can vary depending on what particle distribution you’re modeling. Neurosynchro needs a lot of samples of this function to develop a good approximation to it.
From the standpoint of neurosynchro, the tool that you use to generate those data doesn’t matter. What matters is the format in which the training data are stored. However, it is true that neurosynchro has been designed to work with rimphony, which has sample programs that will generate training data sets in the format described below.
Attention
Read this section carefully! Neurosynchro bakes in some assumptions that might surprise you.
The training data fed to neurosynchro must be saved as a set of plain textual tables stored in a single directory. The file names must end in .txt. Each table file is line-oriented. The first line is a header, and all subsequent lines give samples of the exact calculation. For example:
s(log) theta(lin) p(lin) k(lin) time_ms(meta) j_I(res) alpha_I(res) j_Q(res) alpha_Q(res) j_V(res) alpha_V(res) rho_Q(res) rho_V(res)
1.95393e2 8.8966e-1 3.49e0 1.41e0 2.21270e3 8.42819e-35 2.26887e-8 -6.439070e-35 -1.80416e-8 1.17279e-35 3.56901e-9 3.2947e-7 3.8318e-5
6.51244e2 8.0044e-1 3.28e0 1.94e0 3.30821e3 6.88161e-36 9.03608e-10 -5.226766e-36 -7.17748e-10 6.41000e-37 9.59868e-11 2.4798e-8 1.2309e-5
The header line gives the name of each column. Each column name includes a suffix indicating its type:
lin
- An input parameter that is sampled linearly in some range. At the moment, the way in which the parameter was sampled isn’t actually used anywhere in the code. But it can be a helpful piece of information to have handy when specifying how the neural nets will be trained.
log
- An input parameter that is sampled logarithmically in some range.
res
- An output result from the computation.
meta
- A metadata value that records some extra piece of information about the sample in question. Above, the time_ms metadata item records how many milliseconds it took for Rimphony to calculate the sample. This is useful for identifying regions of parameter space where the code runs into numerical problems.
So, in the example above, there are four input parameters. The detailed calculation shows that when the harmonic number s ≃ 195, observing angle theta ≃ 0.9 radians, energy power-law index p ≃ 3.5, and pitch-angle distribution index k ≃ 1.4, the emission coefficient j_I ≃ 8 × 10⁻³⁵ erg s⁻¹ cm⁻³ Hz⁻¹ sr⁻¹. The Rimphony calculation of that result took about 2.2 seconds.
Something like 100,000 rows is enough to train some good neural networks. It doesn’t matter how many different files those rows are split into.
Tip
Neurosynchro takes a directory of files as an input, rather than one specific file, since the former is easier to create on a big HPC cluster where you can launch 1,000 jobs to compute coefficients for you in parallel.
Tip
Each individual input file can be easily loaded into a Pandas data frame with the function call pandas.read_table().
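For instance, a quick interactive check of one file might look like this (the file name here is hypothetical):
import pandas as pd

# Load one file of training samples. The format is tab-separated
# values with a single header row, so read_table's defaults work.
df = pd.read_table('rawsamples/samples0.txt')
print(df.columns.tolist())  # names carry suffixes like s(log), j_I(res)
print(len(df), 'rows')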
Important assumptions¶
In the example above, there are just four input parameters: s, theta, p, and k. These are likely not the usual parameters that you see when thinking about synchrotron radiation. There’s an important reason for this!
Neurosynchro bakes in three key assumptions about how synchrotron radiation works:
- You must compute all of your coefficients at an observing frequency of 1 Hz! This is because synchrotron coefficients scale simply with frequency: emission coefficients linearly with ν, absorption coefficients as 1/ν. So the observing frequency doesn’t actually need to be part of the neural network regression.
- You must compute all of your coefficients at an energetic particle density of 1 per cubic centimeter! Here too, all the synchrotron coefficients scale simply with the energetic particle density (namely, they all scale linearly). Once again this means that the energetic particle density doesn’t actually need to be part of the regression.
- You only need to do computations where the angle between the line of sight and the magnetic field, θ, is less than 90°. Neurosynchro assumes that all parameters are symmetric with regards to θ = 90° except the Stokes V components, which negate.
Given those assumptions, almost every part of neurosynchro expects that the following input parameters will exist:
- s
The harmonic number of interest, such that:
\[\nu_\text{obs} = s \nu_\text{cyc} = s \frac{e B}{2 \pi m_e c}\]
- theta
- The angle between the direction of radiation propagation and the local magnetic field, in radians.
Given the known ways in which synchrotron coefficients scale, the standard quartet of input parameters nu, theta, n_e, and B can be reduced to these two parameters, plus scalings that are known a priori. In the example above, the two remaining parameters p and k relate to the shape of the particle distribution function.
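As an illustration of this reduction in code (this helper is not part of the neurosynchro API; the constants are standard Gaussian-units values):
import numpy as np

E_ESU = 4.8032e-10  # electron charge, esu
M_E = 9.1094e-28    # electron mass, g
C = 2.9979e10       # speed of light, cm/s

def harmonic_number(nu, B):
    """Reduce an observing frequency nu (Hz) and field strength B
    (Gauss) to the harmonic number s = nu / nu_cyc."""
    nu_cyc = E_ESU * B / (2 * np.pi * M_E * C)
    return nu / nu_cyc

def rescale(j_ref, alpha_ref, nu, n_e):
    """Scale coefficients computed at the reference frequency (1 Hz)
    and density (1 cm^-3) to physical values, per the assumptions
    above: emission scales as n_e * nu, absorption as n_e / nu."""
    return j_ref * n_e * nu, alpha_ref * n_e / nu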
On the output side, neurosynchro applies some more assumptions to ensure that it always produces physically sensible output (i.e., that the polarized Stokes emission parameters are never bigger than the total-intensity Stokes emission parameter). It also uses the standard linear polarization basis in which Stokes Q is aligned with the magnetic field, which means that the Stokes U parameters are zero by construction (see, for example, Shcherbakov & Huang (2011), DOI 10.1111/j.1365-2966.2010.17502.x, equation 17). So unless you are doing something very unusual, your tables should always contain eight output results:
- j_I
- The calculated Stokes I emission coefficient, in erg s⁻¹ cm⁻³ Hz⁻¹ sr⁻¹.
- j_Q
- The Stokes Q emission coefficient, in the same units.
- j_V
- The Stokes V emission coefficient, in the same units.
- alpha_I
- The calculated Stokes I absorption coefficient, in cm⁻¹.
- alpha_Q
- The Stokes Q absorption coefficient, in the same units.
- alpha_V
- The Stokes V absorption coefficient, in the same units.
- rho_Q
- The Faraday conversion coefficient (mixing Stokes U and Stokes V), in the same units.
- rho_V
- The Faraday rotation coefficient (mixing Stokes Q and Stokes U), in the same units.
Next: transform your training set.
Download a Sample Training Set¶
In order to train a neural network, you need something to train it on!
In neurosynchro, the training data are a bunch of samples of some detailed synchrotron calculation. That is: given a choice of some input parameters (e.g., a harmonic number), the detailed calculation produces eight output parameters (the Stokes radiative transfer coefficients). The exact number of input parameters can vary depending on what particle distribution is being modeled. Neurosynchro needs a lot of samples of this function to develop a good approximation to it.
The page Make Your Training Set describes how such a training set should be parametrized and formatted. Here, we’ll just download a training set and start using it without worrying about the details.
Note
The training set is about a 300 MB download. It expands to a 700 MB directory, which we will soon reprocess into another directory that’s also about 700 MB.
First, download the example training data set. It is archived on Zenodo under DOI 10.5281/zenodo.1341154. The main data file is a bzipped tar archive named rimphony_powerlaw_s5-5e7_p1.5-7.tar.bz2. The following shell command should download it:
$ curl -fsSL https://zenodo.org/record/1341154/files/rimphony_powerlaw_s5-5e7_p1.5-7.tar.bz2 >trainingset.tar.bz2
Note
This portion of the tutorial is driven through the command line, so you should open up a terminal even if you download and unpack the data file through your web browser or some other means.
And this command should unpack it, creating a directory named rimphony_powerlaw_s5-5e7_p1.5-7:
$ tar xjf trainingset.tar.bz2
If you poke around, you will see that the new directory contains about 500 text files. These are the raw training data, working out to about 22 million synchrotron coefficient calculations. They are performed for an isotropic power-law model with power-law indices ranging between 1.5 and 7.
It is easiest to do the training if we rearrange the directory structure a little bit. The following commands will set things up in a way that we have found to be convenient:
$ mv rimphony_powerlaw_s5-5e7_p1.5-7 rawsamples
$ neurosynchro init-nndir rimphony_powerlaw_s5-5e7_p1.5-7
$ mv rawsamples rimphony_powerlaw_s5-5e7_p1.5-7/
$ cd rimphony_powerlaw_s5-5e7_p1.5-7/
The second command above creates a new directory called rimphony_powerlaw_s5-5e7_p1.5-7 (the same name as was initially created, until we renamed it) and populates it with a default configuration file, nn_config.toml, that we will edit later. The final command changes your shell to work from this new directory, which will now also contain a subdirectory named rawsamples. Subsequent steps in the tutorial will assume that you are still operating from this directory. Make sure to return to it if you close and re-open your shell.
Next: transform your training set.
Transform Your Training Set¶
The standard parameters of radiative transfer in the Stokes basis are the easiest to work with in a scientific context, but they can be tricky to work with when doing numerical approximations. This is because they must obey invariants such as
\[j_I \ge \sqrt{j_Q^2 + j_U^2 + j_V^2},\]
which can be broken in the face of approximation noise. Therefore, neurosynchro internally uses a transformed set of parameters that don’t have to obey such invariants.
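As a concrete, purely illustrative sketch of what such a reparameterization can look like (the exact definitions are internal to the transform subcommand; the names echo the training targets used later in the tutorial):
import numpy as np

def emission_to_internal(j_I, j_Q, j_V):
    """Hypothetical sketch: map (j_I, j_Q, j_V), with U = 0 by
    construction, to an unconstrained amplitude plus parameters
    bounded to (0, 1) that cannot violate the invariant."""
    pol = np.hypot(j_Q, j_V)      # total polarized emission
    frac_pol = pol / j_I          # in (0, 1) by the invariant above
    V_share = np.abs(j_V) / pol   # circular share of the polarized part
    return np.log10(j_I), frac_pol, V_share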
Once you have generated training set data, then, you must transform them into this internal representation. This is done with the transform subcommand of the neurosynchro command that gets installed along with the neurosynchro Python package. It’s very straightforward to use. Assuming that you are using the standard directory structure, just run:
$ mkdir -p transformed
$ neurosynchro transform rawsamples >transformed/all.txt
Here, rawsamples is the name of the directory containing the training data. The transform subcommand prints the transformed database to standard output, so in the example above, shell redirection is used to save the results to the file all.txt in a new subdirectory. With the example training set data, the resulting text file will be hundreds of megabytes in size.
The transform subcommand will also filter your data, discarding any rows containing non-finite values or with outputs that do not obey the necessary invariants to begin with. It prints a statistical summary of this filtering to the standard error stream.
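The transformed file keeps the same tab-separated textual layout as the input samples, so if you want to poke at it yourself, Pandas works here too:
import pandas as pd

df = pd.read_table('transformed/all.txt')
print(df.describe())  # quick look at each column's distribution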
The summarize subcommand will print out some summary information about your training set:
$ neurosynchro summarize transformed
Here you could also give an argument of rawsamples to analyze the pre-transformed inputs.
Specify Your Parameter Transformations¶
The neural networks used by neurosynchro perform best when their input and output parameters are normalized. In order for neurosynchro to do this normalization, you must give it some hints about your input and output parameters.
These hints are stored in a configuration file that will go along with your compiled neural network data. If you’re following along with the tutorial, this file should already exist in the directory that your terminal is working in; it was created by the neurosynchro init-nndir command.
Before training the neural nets, however, we need to finalize the configuration file as described below.
Attention
If you’re following along with the tutorial, the final section gives important instructions you need to follow. Most of the rest of the material can be skimmed for now, though. Maybe read it while your nets are training?
How Neurosynchro transforms parameters¶
Before we go into the details of configuration, we need to describe the transformations that neurosynchro performs for its computations.
In the previous section, you transformed the output parameters of your training set. That first stage of transformation makes it so that the numerics don’t need to worry about obeying various physical invariants that apply to the Stokes-basis radiative transfer coefficients.
The numbers in the resulting output file (called transformed/all.txt in the example command) are referred to as physical values in the context of neurosynchro. These are the numbers that you will eventually get back out of the neural network.
Inside neurosynchro, physical values are transformed into what are called, well, transformed values that are easier to work with. For instance, the total emission coefficient j_I should generally undergo a log transform because its physical values span a huge dynamic range. At this stage, there is also a question of how to deal with physical inputs that are out of bounds: if you trained your neural network on power-law indices between 1 and 3, and your code asks neurosynchro to compute coefficients with a power-law index of 3.5, what should neurosynchro do? The “correct” answer to this question can vary, so neurosynchro lets you configure its behavior.
Finally, for the innermost calculations the transformed values are converted to normalized values with a simple linear mapping:
\[n = \frac{t - a}{b}\]
Typically the mean is subtracted off (a is the mean) and the standard deviation is divided out (b is the standard deviation), so that the distribution of the normalized parameters should be approximately normal (i.e., Gaussian). In special circumstances, however, other transforms can be applied.
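A minimal sketch of this default (gaussian) normalization and its inverse:
import numpy as np

def normalize(t):
    """Map transformed values to approximately standard-normal ones."""
    mean, std = t.mean(), t.std()
    return (t - mean) / std, mean, std

def denormalize(n, mean, std):
    """Invert the linear mapping."""
    return n * std + mean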
The nn_config.toml configuration file¶
The neurosynchro init-nndir initialization command will create a neural network data directory and put one file in it: a configuration file named nn_config.toml. This file specifies how the input and output parameters should be handled. The file is in TOML format, which is a simple scheme for storing structured data. (TOML is like JSON but, in the opinion of this writer, better.)
The default file consists of a series of stanzas each denoted by a pair of double square brackets. By default, the first stanza looks like this:
[[params]]
name = "s"
maptype = "log"
The delimiter [[params]] indicates that this stanza defines an input parameter. The following lines assign various settings as described below. There are also stanzas marked with the delimiter [[results]], which give configuration settings for output results. The two kinds of stanzas accept almost identical kinds of configuration values; exceptions will be noted below.
Allowed configuration values are as follows:
- name
- This isn’t really a configuration setting. It gives the name of the input or output parameter being configured. This should be one of the items that appears in the column headers of your training sample. Names corresponding to nonexistent columns will be ignored.
- maptype
- This setting specifies how the physical values given in your training set are transformed. The most useful options are direct, which performs no transformation, and log, which takes the logarithm of the physical values. The full set of possible values is given below.
- phys_bounds_mode
- As mentioned above, neurosynchro has to decide what to do if you ask it to compute coefficients outside of the bounds spanned by its training set. This parameter configures how those bounds are actually determined. There are just two options: empirical and theta. For empirical, the bounds are determined by the empirical minimum and maximum physical parameter values seen in the training set. For theta, the bounds are fixed to be the range between 0 and π/2. The default is empirical, which is what makes sense in almost all cases. The theta setting is recommended for the theta input parameter, so that calculations where θ got really close to π/2 don’t get rejected even if your training set didn’t get quite as close.
- normalization_mode
- This setting specifies how to determine the parameters of the linear transform used to obtain the final normalized values. If it is gaussian, the default, the mean and standard deviation of the transformed values (note: not the physical values) are used to yield an approximately normal distribution. If it is unit_interval, the min and max of the transformed values will be used in such a way that the normalized values span the unit interval [0, 1].
- out_of_sample
- This setting specifies what to do if a physical value lies beyond the range of the training set. This can happen on either the input or the output side. Your simulation might require input parameters that are beyond the ones you trained on, but also the neural network approximator might yield results that end up lying outside of the training-set range. Possible values are ignore (the default), clip, and nan. With ignore, the sample limits are ignored and the calculation plunges ahead recklessly. With clip, the input or output physical parameters are clipped to stay within the sampled physical range — note that this means that you can get back results that just plain do not correspond to the parameters that you thought you were using! The neurosynchro driver code collects flags so that you can tell when this happens. Finally, nan flags the affected calculations and causes the driver to return Not-a-Number values unconditionally.
- trainer
- This setting only applies to output parameters. It specifies which scheme will be used to train the neural network to compute this output. There is a generic trainer that generally does well; the list of all possibilities is given in the next section.
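For instance, here is a hypothetical pair of stanzas exercising the settings described above:
[[params]]
name = "theta"
maptype = "direct"
phys_bounds_mode = "theta"
out_of_sample = "clip"

[[results]]
name = "j_I"
maptype = "log"
trainer = "generic"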
Map Types¶
Neurosynchro supports the following transformations between “physical” parameter values and internal “transformed” values:
- abs_log
- The transformed value is the logarithm of the absolute value of the physical value. This transform is not reversible on its own. It is used by the driver code for the rho_Q parameter, which both spans a large dynamic range and takes on both positive and negative values. The driver deals with this by splitting it into two components: an overall amplitude (using this mapping) and a sign term.
- direct
- The transformed value is the physical value. This is useful for parameters like power-law indices that do not span a large dynamic range.
- log
- The transformed value is the base-10 logarithm of the physical value. This is useful for parameters that span large dynamic ranges and are always positive.
- logit
The transformed value is the logit of the physical value:
\[t = \log\left(\frac{p}{1 - p}\right)\]
This maps a value in the range (0, 1) to the range (-∞, +∞), so to use this the physical value must be constrained to lie in the unit interval. This is the case for the “polarization share” parameters used in the transformed output parameters.
- neg_log
- The transformed value is the base-10 logarithm of the negation of the physical value. This is useful for parameters that span large dynamic ranges and are always negative.
- ninth_root
- The transformed value is the ninth root of the physical value, preserving sign. This is adequate for parameters that span large-ish dynamic ranges and take on both positive and negative values. This mapping is no longer used in the recommended configuration.
- sign
- The transformed value is the signum of the physical value: either -1, 0, or 1, depending on the physical value’s sign. This transform is not reversible on its own, but is used for the rho_Q parameter as described above.
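In code form, a few of these mappings look like the following (an illustrative sketch, not the package’s internal implementation):
import numpy as np

def log_map(p):       # "log": positive, large dynamic range
    return np.log10(p)

def neg_log_map(p):   # "neg_log": negative, large dynamic range
    return np.log10(-p)

def logit_map(p):     # "logit": p constrained to (0, 1)
    return np.log(p / (1 - p))

def logit_unmap(t):   # inverse of the logit transform
    return 1 / (1 + np.exp(-t))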
Finalizing the Configuration File¶
The configuration file generated by the neurosynchro init-nndir command contains suggested defaults for the s and theta input parameters and the suite of output parameters generated by the neurosynchro transform step. However, the command doesn’t (and can’t) know what other input parameters your model uses, so you must edit the nn_config.toml file to define them. Add stanzas analogous to the example one used for the s input parameter. The defaults are often useful, so you probably only need to ask yourself:
- Should this parameter have a maptype of direct or log?
- What do I want its out_of_sample behavior to be?
You may want to revisit this file to, for example, try a different neural network training scheme to improve neurosynchro’s performance for a certain model output parameter.
Finalizing Configuration for the Tutorial¶
For the purposes of the tutorial, here are the specific adjustments you should make to the default configuration file:
Add a stanza defining the power-law index parameter, after the stanza for the theta parameter:
[[params]]
name = "p"
maptype = "direct"
Next: precalculate the domain and range of your training set.
Precalculate the Domain and Range of Your Training Set¶
There’s one more piece of preparation to do before actually training the neural networks. For efficiency, quantities like the bounds on the physical input and output parameters are gathered from the training set just once, then saved in nn_config.toml. You can gather this information with the following command:
$ neurosynchro lock-domain-range transformed .
Here transformed is the path to the transformed training set and . is the path to the directory containing your nn_config.toml file — which, in the standard tutorial layout, is the current directory. This command will rewrite this file, embedding the needed quantities.
Train Your Networks¶
We’re finally ready to train your networks! After all of our preparatory work, the training is pretty straightforward. For each output parameter, one runs a command like:
$ neurosynchro train transformed . j_I
Here, transformed is the path to the directory with the transformed training set, . is the result directory containing the nn_config.toml file (the current directory, in the standard tutorial layout), and j_I is the name of the component to train for. After training is complete, a file named j_I.h5 will be saved in the result directory. The program will print out the mean squared error (MSE) characterizing the neural network’s performance against the training set.
When training against the tutorial’s training set, it takes about 20 minutes to train each parameter on an 8-core laptop CPU. Because there are 9 parameters to train, the full training might take something like 3 hours in total. (No substantial effort has gone into optimizing the training process!) If you’re ready to commit to that, you can train all of the parameters in sequence with:
$ all="j_I alpha_I rho_Q rho_V j_frac_pol alpha_frac_pol j_V_share alpha_V_share rho_Q_sign"
$ for p in $all ; do neurosynchro train transformed . $p ; done
If you pass the argument -p to the train subcommand, diagnostic plots will be shown after training is complete. The plots will be made with the obscure omegaplot package, so make sure to install it before trying this option.
Trainer Types¶
Neurosynchro supports the following neural network training schemes. For each output parameter, you can specify which scheme to use by editing its trainer keyword in the nn_config.toml file.
- generic
This neural network has the following characteristics:
- Dense, single-layer architecture
- 300 neurons
- RELU activation.
- Keras’s normal kernel initializer
- Trained with the adam optimizer.
- Optimized against the mean-squared-error (MSE) loss function.
The network is trained in two passes. First, 30 epochs of training are run. Then the training set is sigma-clipped with ±7σ tolerance — the intention is to remove any cases where the detailed calculation has mistakenly delivered totally bogus results. Then 30 more epochs of training are run.
This network has been observed to perform well in a variety of real-world situations.
- twolayer
This neural network has the following characteristics:
- Dense, two-layer architecture
- 120 neurons in first layer, 60 in second
- RELU activation in both layers.
- Keras’s normal kernel initializer
- Trained with the adam optimizer.
- Optimized against the mean-squared-error (MSE) loss function.
The training is run in the same way as in the generic setup. This network has been observed to perform a little bit better than the generic network when predicting the rho_Q output parameter. This doesn’t always hold, though; if you wish to investigate, try both and see which gives a better MSE.
- binary
This neural network has the following characteristics:
- Dense, two-layer architecture
- 120 neurons in first layer, 60 in second
- RELU activation in both layers.
- Sigmoid activation in the output layer.
- Keras’s normal kernel initializer
- Trained with the adam optimizer.
- Optimized against the binary cross-entropy loss function.
The training is run in almost the same way as in the generic setup, but no sigma-clipping is performed. This setup is intended for the rho_Q_sign output parameter, which predicts the sign of the rho_Q coefficient. However, sometimes the generic scheme actually performs better in practice. Once again, investigate by trying both and seeing which gives a better MSE.
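For reference, a minimal Keras sketch of the generic architecture described above might look like this (the output-layer details here are assumptions; neurosynchro builds its models internally):
from keras.models import Sequential
from keras.layers import Dense

def build_generic(n_inputs):
    """One dense hidden layer of 300 ReLU neurons with Keras's
    'normal' initializer, compiled with adam against an MSE loss."""
    model = Sequential()
    model.add(Dense(300, input_dim=n_inputs, activation='relu',
                    kernel_initializer='normal'))
    model.add(Dense(1))  # one scalar output per trained component
    model.compile(optimizer='adam', loss='mse')
    return model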
This menu of options is, obviously, quite limited. For novel applications, you may have to edit the code to add new training schemes. Pull requests contributing new ones are more than welcome!
Next: run some test problems!
Download Some Pre-Trained Neural Network Data¶
One nice feature of the neurosynchro model is that the data files representing a trained neural network are quite compact. Once a network has been trained for your particular problem space, it’s easy and quick to use it.
The tutorial sequence starting with Download a Sample Training Set describes how to obtain a training set and train the neural nets on your own machine. This step is essential in the (quite common) circumstance that you need a network whose parameter ranges are tuned to your particular application. But here, we’ll just download pre-trained neural network files.
All you need to do is download and unpack the bzipped tar archive rimphony_powerlaw_s5-5e7_p1.5-7_nndata.tar.bz2, which is archived on Zenodo under DOI 10.5281/zenodo.1341364. It’s about 160 kiB. This archive contains the data needed to do synchrotron radiative transfer if your particle distribution is a power-law in energy, isotropic in pitch angle, the power-law indices range between 1.5 and 7, and the relevant harmonic numbers are between 5 and 50,000,000.
Unpacking this archive will create a directory named rimphony_powerlaw_s5-5e7_p1.5-7. Remaining steps working from the terminal will assume that you’re working from inside this directory, e.g.:
$ cd rimphony_powerlaw_s5-5e7_p1.5-7
Next: run some test problems!
Run Some Test Problems¶
There are many ways to test the performance of neural networks. Sadly, neurosynchro does not currently provide much infrastructure for running such tests — this is an area for improvement. At the moment, the most prominent tools are the reports of mean squared error (MSE) and residual plots that you get from the neurosynchro train command.
However, neurosynchro does include a tool that will use grtrans to do full radiative transfer integrations with both precisely-calculated and approximated coefficients, so that you can check its performance in an end-to-end example.
Warning
Unfortunately, in its current state, grtrans is really difficult to install in a useful way … and due to this situation, we don’t distribute any canned test problems. We can’t recommend this testing strategy for any but the hardest-core users. You might want to skip ahead to learn about how to use your networks in your application.
To run these tests, the grtrans Python module radtrans_integrate must be importable in your Python setup. Unfortunately, grtrans does not provide a standard installation path, so it can be a bit of a hassle to make this module available. You will know that things are working when the following command runs without printing any error messages:
$ python -c 'import radtrans_integrate'
Detailed instructions on how to install grtrans properly are beyond the scope of this manual.
Once grtrans has been installed, you need to create a test data set, as described below. Then a command like this will run the comparison:
$ neurosynchro test-grtrans . testproblem.txt
Here is example output for a test problem prepared for the tutorial neural networks, trained on a power-law electron distribution:
Using Theano backend.
Precise computation: I=2.1134e-08 Q=-2.6238e-10 U=-1.6681e-09 V=4.4779e-09 calc_time=103937 ms
WARNING: some of the approximations were out-of-sample
Approx. computation: I=2.1151e-08 Q=-3.0201e-10 U=-1.6711e-09 V=4.4855e-09 calc_time=234 ms
Accuracy: I=0.001 Q=0.151 U=0.002 V=0.002
Speedup: 444.5
Generating data for a test problem¶
The format of the test data file is an expanded version of the one used for the training data. As with the training data, the data should come in a line-oriented text file with several tab-separated columns and a header line, with input, output, and metadata parameters indicated as with the training data.
Given such a file, the test tool runs a radiative transfer integration along a single ray, where each row in the test data file gives a sample of these coefficients as they change along the ray.
To compute the “right answer”, the testing tool performs a radiative transfer integration using the coefficients exactly as they are given in your file. Therefore the test file must contain columns for the standard suite of eight output parameters (j_I, alpha_I, j_Q, alpha_Q, j_V, alpha_V, rho_Q, rho_V), computed at the reference frequency and density as described for the training data.
Four extra columns are also needed to put everything on the physical footing used by grtrans:
- d(meta)
- This is the distance along the ray path of each sample point, measured in cm. These values should start at zero and increase monotonically along each row of the file.
- psi(meta)
- This is the projected angle between some invariant Stokes U axis and the magnetic field, measured in radians. These values can just be zero if you don’t care about having a magnetic field whose direction rotates on the sky. This parameter is used to map from the local linear polarization frame (in which the Stokes U coefficients are always aligned with the magnetic field and hence are always zero) to the observer frame.
- n_e(meta)
- This is the local density of synchrotron-emitting electrons, in cm⁻³.
- time_ms(meta)
- This is how much time it took to do the precise calculation of the RT coefficients for this row, in milliseconds. This column is only used to determine how much faster neurosynchro was compared to the detailed calculation. You can set it to zero if you don’t collect this information.
It is not necessary to choose realistic values for these parameters — the point of the testing is to just compare the precise and approximated results. But if you provide realistic values, the integration should give realistic results.
To compute the approximated result, the test tool will load up the specified neural network and make predictions based on the input parameters listed in your file. You must have parameters named s, for the local harmonic number, and theta, for the angle between the line-of-sight and the local magnetic field (measured in radians). The RT integration is performed at a fixed observing frequency, which means that the local magnetic field strength B is defined implicitly by the ratio of s and that frequency.
Note that the goal here is to assess the real-world impact of approximation errors — not necessarily to test the grtrans integrator in challenging circumstances. When generating your test data sets, there’s no particular need to explore unusual corners of parameter space unless you expect them to arise in your science application.
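As a skeleton, a test-problem file with the required columns could be started like this (the column set follows the description above; the sample values themselves are up to you):
# Hypothetical sketch: write the header of a test-problem file.
cols = ['d(meta)', 'psi(meta)', 'n_e(meta)', 'time_ms(meta)',
        's(log)', 'theta(lin)', 'p(lin)',
        'j_I(res)', 'alpha_I(res)', 'j_Q(res)', 'alpha_Q(res)',
        'j_V(res)', 'alpha_V(res)', 'rho_Q(res)', 'rho_V(res)']

with open('testproblem.txt', 'w') as f:
    f.write('\t'.join(cols) + '\n')
    # ... then one tab-separated row of values per sample point along
    # the ray, with d(meta) starting at zero and increasing monotonically.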
Use Your Networks for Your Application¶
You’ve generated a training set, trained your networks, and tested their performance in some sample problems. Time to use them for real!
As we have written, actually performing a full radiative transfer integration is not within neurosynchro’s purview. As suggested in the previous section, the tool grtrans can be a good choice, but there are many options out there.
Instead, neurosynchro’s job is just to compute radiative transfer coefficients quickly, given some input parameters. This is as straightforward as:
# Python 2/3 compatibility:
from __future__ import absolute_import, division, print_function
# Imports:
import neurosynchro.impl
import numpy as np
# Configuration: all you need is the path to the directory with some
# trained neural nets. If you're following the tutorial, the
# relevant directory is named this -- but you should use '.' if
# you started Python in the directory containing `nn_config.toml`.
nn_dir = 'rimphony_powerlaw_s5-5e7_p1.5-7'
# Create an object that will do your calculations. Initializing the
# neural networks typically takes ~10 seconds.
approx = neurosynchro.impl.PhysicalApproximator(nn_dir)
# Create some arrays of physical parameters of interest. The
# following parameters will occur in all computations:
N = 256 # how many points to sample?
nu = 1e10 # observing frequency in Hz
B = np.linspace(100, 10, N) # samples of magnetic field strengths,
# in Gauss
n_e = np.logspace(6, 4, N) # electron densities, in cm^-3
theta = 0.7 # angle(s) between line-of-sight and the local B
# field, in radians
psi = np.linspace(0.1, 0.3, N) # angles between local B field and
# observer's Stokes U axis, in
# radians
# But you will almost always need additional arrays of data
# specifying the parameters of the electron distribution
# function. Exactly which parameters are relevant depends
# on which distribution function is being used. For an isotropic
# power law distribution, there's just:
p = np.linspace(2, 4, N) # electron energy power law index
# Compute coefficients in basis where Stokes U tracks the local
# field. "oos" stands for "out of sample", and gives information
# about if any of the calculations ran outside of the domain of the
# training set. Additional parameters describing the electron
# distribution function would be passed as additional keywords.
coeffs, oos = approx.compute_all_nontrivial(nu, B, n_e, theta, p=p)
# Transform to a basis in which Stokes U is invariant from the
# observer's perspective:
coeffs = neurosynchro.detrivialize_stokes_basis(coeffs, psi)
# The coefficients are packed as follows:
names = 'jI aI jQ aQ jU aU jV aV rQ rU rV'.split()
for index, name in enumerate(names):
    print('Mean {} coefficient: {:e}'
          .format(name, np.mean(coeffs[...,index])))
That’s about all there is to it, in the end — what you do with the numbers you get from neurosynchro is not its business. But if you’d like a few more details, you might want to start with the documentation of the Python API of the neurosynchro.impl.PhysicalApproximator class.
Python API Reference¶
In most cases, the only part of the Python API that you will care about is the neurosynchro.impl.PhysicalApproximator class — this is the class that loads up a neural network and computes coefficients for you. The rest of neurosynchro’s functionality is best accessed through the command-line interface neurosynchro, rather than directly using the Python code. As such, many internal parts of the code are not documented.
neurosynchro: the main module¶
Train and use artificial neural network approximations (“regressions”) of synchrotron radiative transfer coefficients as a function of various physical input parameters.
The meat of the neural network code is in the impl module, to avoid importing Keras unless needed.
The top-level module of neurosynchro does not actually include any neural-network functionality: such functionality requires loading the keras module, which is very slow to initialize. Instead, this module contains generic code for dealing with training set data and radiative transfer coefficients:
Loading Training Set Data¶
neurosynchro.basic_load(datadir, drop_metadata=True)¶
Load a directory of textual tables (such as training set data).
Call signature
- datadir
- A path to a directory of textual tables; format described below.
- drop_metadata (default True)
- If true, columns marked as metadata will be dropped from the returned table.
- Return value
- A pandas.DataFrame of data, concatenating all of the input tables.
The data format is documented in Make Your Training Set. Briefly, each file in datadir whose name ends with .txt will be loaded as a table using pandas.read_table(). The recommended format is tab-separated values with a single header row. Column names should end in type identifiers such as (lin) to identify their roles, although this function ignores this information except to drop columns whose names end in (meta) if so directed.
Working With Radiative Transfer Coefficients¶
neurosynchro.detrivialize_stokes_basis(coeffs, psi)¶
Re-express coefficients in a basis in which the magnetic field can rotate on the sky.
Arguments
- coeffs
- Radiative transfer coefficients in the Stokes basis where the Stokes U axis is aligned with the magnetic field. This is an array of shape (S, 8), where S is the shape of psi. Along the inner axis of the array, the coefficients are: (j_I, alpha_I, j_Q, alpha_Q, j_V, alpha_V, rho_Q, rho_V). This is the representation returned by neurosynchro.impl.PhysicalApproximator.compute_all_nontrivial().
- psi
- The angle(s) between the magnetic field as projected on the sky and some invariant Stokes U axis, in radians. XXX: sign convention?
Return value
An array of radiative transfer coefficients in which the Stokes U terms are no longer trivial. The shape is (S, 11). Along the inner axis of the array, the coefficients are: (j_I, alpha_I, j_Q, alpha_Q, j_U, alpha_U, j_V, alpha_V, rho_Q, rho_U, rho_V).
Details
Synchrotron calculations are generally performed in a basis in which the Stokes U axis is aligned with the magnetic field, which means that the corresponding radiative transfer coefficients are zero (“trivial”). In actual work, however, the magnetic field orientation is not guaranteed to be constant along the direction of propagation. If the Stokes U axis is held fixed along the integration, the Stokes U coefficients become nontrivial. This function transforms coefficients from the former basis to the latter.
See Shcherbakov & Huang (2011MNRAS.410.1052S), equations 50-51. Note that the linear polarization axis rotates at twice the rate that psi does, because linear polarization is an orientation, not an angle.
Mappings¶
class neurosynchro.Mapping(name)¶
An abstract base class for parameter transformations.
Mapping classes are used to translate the physical parameters fed into neurosynchro into normalized values that are easier to work with numerically. You will not normally need to use them directly.
classmethod from_info_and_samples(info, phys_samples)¶
Create a new Mapping from a dictionary of information and a set of samples.
Call signature
- info
- A dictionary of attributes, passed into Mapping.from_dict().
- phys_samples
- A 1D Numpy array of samples of this parameter, in no particular order.
- Return value
- A new instance of Mapping (or one of its subclasses) with initialized bounds parameters.
classmethod from_dict(info, load_bounds=True)¶
Deserialize an ordered dictionary into a new Mapping instance.
Call signature
- info
- A dictionary of parameters, as generated by Mapping.to_dict().
- load_bounds (default: True)
- If true, deserialize bounds information such as the maximum and minimum observed physical values. If False, these are left uninitialized.
- Return value
- A new Mapping instance.
phys_to_norm(phys)¶
Map “physical” parameters to normalized values.
Argument
- phys
- An array of “physical” input values (see How Neurosynchro transforms parameters).
Return values
This method returns a tuple (normalized, oos).
- normalized
- The normalized versions of the input data.
- oos
- An array of booleans of the same shape as the input data. True values indicate inputs that were out of the sample that was used to define the mapping.
norm_to_phys(norm)¶
Map “normalized” parameters to “physical” values.
Argument
- norm
- An array of “normalized” input values (see How Neurosynchro transforms parameters).
Return values
This method returns a tuple (phys, oos).
- phys
- The physical versions of the input data.
- oos
- An array of booleans of the same shape as the input data. True values indicate inputs that were out of the sample that was used to define the mapping.
neurosynchro.mapping_from_info_and_samples(info, phys_samples)¶
Create a Mapping subclass from configuration data and sample data.
Call signature
- info
- A dictionary of configuration info, loaded from nn_config.toml or created by Mapping.to_dict().
- phys_samples
- A 1D array of training data used to initialize the bounds of this mapping.
- Return value
- A new Mapping instance. The particular subclass used depends on the maptype setting, as documented in Map Types.
neurosynchro.mapping_from_dict(info)¶
Create a Mapping subclass from configuration data.
Call signature
- info
- A dictionary of configuration info, loaded from nn_config.toml or created by Mapping.to_dict().
- Return value
- A new Mapping instance. The particular subclass used depends on the maptype setting, as documented in Map Types.
Unlike mapping_from_info_and_samples(), the bounds data are not initialized if they are not already defined in the configuration dictionary.
neurosynchro.impl: the neural-network implementation¶
This module contains PhysicalApproximator, the class that uses prebuilt keras neural networks to rapidly approximate radiative transfer coefficients.
neurosynchro.grtrans: helpers for working with grtrans¶
This module provides helpers for doing radiative transfer integrations with grtrans, including a simple framework for running end-to-end tests.
In order to use this functionality, the Python module radtrans_integrate must be importable. Sadly grtrans doesn’t install itself like a regular Python package, so getting this working can be a pain. Documenting the installation procedure is beyond the scope of this project.
neurosynchro.grtrans.integrate(d, coeffs, psi)¶
Integrate a ray with grtrans, using reasonable defaults.
Call signature
- d
- An array giving the location of each sample along the ray, starting from zero, in cm.
- coeffs
- An array of shape (N, 8) of RT coefficients in the basis where the Stokes U coefficients are always zero. Such arrays are returned by neurosynchro.impl.PhysicalApproximator.compute_all_nontrivial().
- psi
- An array of angles between the local magnetic field and the observer’s Stokes U axis, in radians.
- Return value
- An array of shape (4,), giving the Stokes IQUV at the end of the ray.
This function is mainly intended to test what happens if the passed-in coefficients are slightly different due to the neural network approximation. So we don’t provide many knobs or diagnostics here.
Low-level Interfaces to grtrans¶
neurosynchro.grtrans.integrate_ray_formal(x, j, K)¶
Use grtrans to integrate one ray using its “formal” (matricant / O-matrix) method.
Call signature
- x
- 1D array, shape (n,). Path length along ray, starting from zero, in cm.
- j
- Array, shape (n, 4). Emission coefficients: j_{IQUV}, in that order.
- K
- Array, shape (n, 7). Absorption coefficients and Faraday mixing coefficients: alpha_{IQUV}, rho_{QUV}.
- Return value
- Array of shape (4, m): Stokes intensities IQUV along parts of the ray with non-zero total emissivities; m <= n.
neurosynchro.grtrans.integrate_ray_lsoda(x, j, K, atol=1e-08, rtol=1e-06, max_step_size=None, frac_max_step_size=0.001, max_steps=100000)¶
Use grtrans to integrate one ray using its LSODA method.
Call signature
- x
- 1D array, shape (n,). Path length along ray, starting from zero, in cm.
- j
- Array, shape (n, 4). Emission coefficients: j_{IQUV}, in that order.
- K
- Array, shape (n, 7). Absorption coefficients and Faraday mixing coefficients: alpha_{IQUV}, rho_{QUV}.
- atol
- The absolute tolerance passed to the ODE integrator.
- rtol
- The relative tolerance passed to the ODE integrator.
- max_step_size
- The maximum absolute step size. Overrides frac_max_step_size.
- frac_max_step_size
- If max_step_size is not specified, the maximum step size passed to the integrator is x.max() multiplied by this parameter. Experience shows that (for LSODA at least) this parameter must be pretty small to get good convergence!
- max_steps
- The maximum number of steps to take.
- Return value
- Array of shape (4, m): Stokes intensities IQUV along parts of the ray with non-zero total emissivities; m <= n.
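A hypothetical usage sketch, assuming radtrans_integrate is importable — a short uniform ray with emission and absorption in Stokes I only:
import numpy as np
from neurosynchro.grtrans import integrate_ray_lsoda

n = 64
x = np.linspace(0., 1e15, n)  # path length samples, in cm
j = np.zeros((n, 4))
j[:,0] = 1e-30  # j_I only
K = np.zeros((n, 7))
K[:,0] = 1e-16  # alpha_I only
iquv = integrate_ray_lsoda(x, j, K)
print(iquv[:,-1])  # Stokes IQUV at the end of the ray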
Contributing to the Project¶
Contributions are welcome! See the CONTRIBUTING.md file attached to the project’s GitHub repository. The goal is to run the project with standard open-source best practices. We do wish to call your attention to the code of conduct, however. Participation in the project is contingent on abiding by the code in both letter and spirit.