This is an update of my previous reports on applications of deep learning to the reconstruction of PRMI degrees of freedom (MICH/PRCL) from real free swinging data. The results shown here are improved with respect to elog 13274 and 13294. The training is performed in two steps, the first one using simulated data, and the second one fine tuning the parameters on real data.
This step is exactly the same already described in the previous entries and in my talks at the CSWG and LVC. For details on the DNN architecture please refer to G1701455 or G1701589. Or if you really want all the details you can look at the code. I used the following signals as input to the DNN: POPDC, POP22_Q, ASDC, REFL11_I/Q, REFL55_I/Q, AS55_I/Q. The network is trained using linear trajectories in the PRCL/MICH space, and signals obtained from a model that simulates the PRMI behavior in the plane wave approximation. A total of 150000 trajectories are used. The model includes uncertainties in all the optical parameters of the 40m PRMI configuration, so that the optical signals for each trajectory are actually computed using random optical parameteres, drwn from gaussian distributions with proper mean and width. Also, white random gaussian sensing noise is added to all signals with levels comparable to the measured sensing noise.
The typical performance on real data of a network pre-trained in this way was already described in elog 13274, and although being reasoble, it was not too good.
Real free swinging data is used in this step. I fine tuned the demodulation phases of the real signals. Please note that due to an old mistake, my convention for phases is 90 degrees off, so for example REFL11 is tuned such that PRCL is maximized in Q instead of I. Regardless of this convention confusion, here's how I tuned the phases:
Then I built the following training architecture. The neural network takes the real signals and produces estimates of PRCL and MICH for each time sample. Those estimates are used as the input for the PRMI model, to produce the corresponding simulated optical signals. My cost function is the squared difference of the simulated versus real signals. The training data is generated from the real signals, by selection 100000 random 0.25s long chunks: the history of real signal over the whole 0.25s is used as input, and only the last sample is used for the cost function computation. The weights and biases of the neural network, as well as the model parameters are allowed to change during the learning process. The model parameters are regularized to suppress large deviations from the nominal values.
One side note here. At first sight it might seems weird that I'm actually fedding as input the last sample and at the same time using it as the reference for the loss function. However, you have to remember that there is no "direct" path from input to output: instead all goes through the estimated MICH/PRCL degrees of freedom, and the optical model. So this actually forces the network to tune the reconstruction to the model. This approach is very similar to the auto-encoder architectures used in unsupervised feature learning in image recognition.
After trainng the network with the two previous steps, I can produce time domain plots like the one below, which show MICH and PRCL signals behaving reasonably well:
To get a feeling of how good the reconstruction is, I produced the 2d maps shown below. I divided the MICH/PRCL plane in 51x51 bins, and averaged the real optical signals with binning determined by the reconstructed MICH and PRCL degrees of freedom. For comparison the expected simulation results are shown. I would say that reconstructed and simulated results match quite well. It looks like MICH reconstruction is still a bit "compressed", but this should not be a big issue, since it should still work for lock acquisition.
There a few things that can be done to futher tune the network. Those are mostly details, and I don't expect significant improvements. However, I think the results are good enough to move on to the next step, which is the on-line implementation of the neural network in the real time system.