Aim: To find a model that can be trained on simulated data of a Gaussian beam spot moving in the vertical direction, driven by an applied sinusoidal signal.
All the attachments are in the zip folder.
The simulated video of the beam spot motion without noise (amplitude of the applied sinusoidal signal = 20 pixels) is at this link: https://drive.google.com/file/d/1oCqd0Ki7wUm64QeFxmF3jRQ7gDUnuAfx/view?usp=sharing
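A minimal sketch of how such frames could be generated, using the numbers quoted in this entry (128*128 images, 10 frames/sec, 0.2 Hz sine, 20 pixel amplitude, resized to 64*64). The spot width `sigma`, the peak value, and the helper names are assumptions for illustration, and the 2x2 block averaging is a NumPy stand-in for the OpenCV resize that was actually used, so this is not the original simulation code.

```python
import numpy as np

FPS = 10.0           # frames per second (from this entry)
FREQ_HZ = 0.2        # frequency of the applied sine wave (from this entry)
AMPLITUDE_PX = 20.0  # amplitude in pixels of the 128*128 frame (assumed to apply here)
FRAMES_PER_CYCLE = int(round(FPS / FREQ_HZ))  # 50 frames per cycle


def beam_frame(center_y, size=128, sigma=10.0, peak=255.0):
    """Gaussian spot centred horizontally, at row center_y (sigma, peak are assumed)."""
    center_x = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size]
    r2 = (x - center_x) ** 2 + (y - center_y) ** 2
    return peak * np.exp(-r2 / (2.0 * sigma ** 2))


def downsample_2x(frame):
    """Halve each dimension by averaging 2x2 blocks (stand-in for an OpenCV resize)."""
    h, w = frame.shape
    return frame.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def simulate(n_cycles, size=128):
    """Return time stamps, applied signal, and the stack of 64*64 frames."""
    t = np.arange(n_cycles * FRAMES_PER_CYCLE) / FPS
    signal = AMPLITUDE_PX * np.sin(2.0 * np.pi * FREQ_HZ * t)
    mid = (size - 1) / 2.0
    frames = np.stack([downsample_2x(beam_frame(mid + s, size=size)) for s in signal])
    return t, signal, frames


t, signal, frames = simulate(n_cycles=2)
print(frames.shape)  # (100, 64, 64)
```

At 10 frames/sec and 0.2 Hz, one cycle spans 50 frames, so 300 cycles correspond to 15000 frames and 1000 cycles to 50000 frames.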
I tried several cases:
Case 1: I added random uniform noise (ranging from 0 to 25.5, i.e. 10% of the maximum pixel value of 255) using OpenCV to the 64*64 simulated images made in the last case (https://nodus.ligo.caltech.edu:8081/40m/13972), clipped the pixel values to the range 0 to 255 & trained using the same network as in the previous elog, and it worked well. The variation in mean squared error with epochs is given in Attachment 1; the applied signal and the output of the neural network (NN) (magnitude of the signal vs time), as well as the residual error, are given in Attachment 2.
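The noise step can be sketched as follows. This version uses NumPy only rather than OpenCV, and the function name `add_uniform_noise` and the flat test frame are assumptions for illustration, not the code actually used.

```python
import numpy as np


def add_uniform_noise(image, max_noise=25.5, rng=None):
    """Add uniform noise in [0, max_noise) to an 8-bit image, then clip to [0, 255]."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.uniform(0.0, max_noise, size=image.shape)
    noisy = image.astype(np.float64) + noise
    # Clip before casting so bright pixels saturate instead of wrapping around;
    # the cast to uint8 drops the fractional part.
    return np.clip(noisy, 0, 255).astype(np.uint8)


# Hypothetical stand-in for one simulated 64*64 frame: a uniformly bright image.
frame = np.full((64, 64), 250, dtype=np.uint8)
noisy_frame = add_uniform_noise(frame, rng=np.random.default_rng(0))
print(noisy_frame.min(), noisy_frame.max())
```

Doing the addition in floating point and clipping before the cast matters: adding directly to a `uint8` array would wrap values above 255 around to small numbers.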
Case 2: I simulated 128*128 images at 10 frames/sec by applying a sine wave of frequency 0.2 Hz that moves the beam spot, & resized them to 64*64 using OpenCV. Then I trained on 300 cycles & tested on 1000 cycles with the following sequential model:
(i) Layers and number of nodes in each:
4096 (dropout = 0.1) -> 1024 (dropout = 0.1) -> 512 (dropout = 0.1) -> 256 -> 64 -> 8 -> 1
Activation : selu -> selu -> selu -> selu -> selu -> selu -> linear
(ii) loss function = mean squared error (I used mean squared error because the result is easy to interpret. Initially I had also tried log(cosh), but I stopped that run partway when the test loss showed no improvement), optimizer = Nadam with default learning rate = 0.002
(iii) batch size = 32, no. of epochs = 400
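The model listed in (i) to (iii) could be written in Keras roughly as below. This is a sketch under assumptions: each 64*64 image is assumed to be flattened to 4096 inputs, each dropout is assumed to follow the layer it is listed with, and `build_model` and `CASE2_LAYERS` are hypothetical names. The learning rate is passed explicitly because the Nadam default differs between Keras versions.

```python
# (units, activation, dropout rate or None), as listed in this entry.
CASE2_LAYERS = [
    (4096, "selu", 0.1),
    (1024, "selu", 0.1),
    (512, "selu", 0.1),
    (256, "selu", None),
    (64, "selu", None),
    (8, "selu", None),
    (1, "linear", None),
]


def build_model(layer_spec, input_dim=64 * 64, learning_rate=0.002):
    """Build and compile a dense sequential model from a layer specification."""
    # Imported here so the specification can be inspected without TensorFlow.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential()
    model.add(keras.Input(shape=(input_dim,)))  # assumed: flattened 64*64 image
    for units, activation, dropout in layer_spec:
        model.add(layers.Dense(units, activation=activation))
        if dropout is not None:
            model.add(layers.Dropout(dropout))
    model.compile(
        optimizer=keras.optimizers.Nadam(learning_rate=learning_rate),
        loss="mean_squared_error",
    )
    return model


# Usage, with x_* the flattened frames and y_* the applied signal (placeholders):
# model = build_model(CASE2_LAYERS)
# history = model.fit(x_train, y_train, batch_size=32, epochs=400,
#                     validation_data=(x_test, y_test))
```

Passing the test set as `validation_data` is one way to record the train and test loss per epoch for plots of loss against epochs.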
I have attached the variation in the loss function with epochs (Attachment 3). The test loss increases after ~50 epochs. To avoid overfitting, in the next model I removed the layer of 4096 nodes and added dropout to the layer of 256 nodes.
Case 3: Same simulated data as case 2, trained with the following model:
(i) Layers and number of nodes in each:
1024 (dropout = 0.1) -> 512 (dropout = 0.1) -> 256 (dropout = 0.1) -> 64 -> 8 -> 1
Activation : selu -> selu -> selu -> selu -> selu -> linear
(ii) Changed the learning rate from the default value of 0.002 to 0.001. The rest of the hyperparameters are the same.
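To get a rough feel for how much removing the 4096-node layer shrinks the model, the trainable parameters of the two dense stacks can be counted as below. This assumes a flattened 64*64 input (4096 values) and one bias per node; `dense_param_count` is a hypothetical helper, and parameter count is only a crude proxy for the tendency to overfit.

```python
def dense_param_count(input_dim, layer_widths):
    """Count weights plus biases of a stack of fully connected layers."""
    total = 0
    fan_in = input_dim
    for width in layer_widths:
        total += fan_in * width + width  # weight matrix + bias vector
        fan_in = width
    return total


INPUT_DIM = 64 * 64  # assumed: each image flattened to 4096 inputs

with_4096 = dense_param_count(INPUT_DIM, [4096, 1024, 512, 256, 64, 8, 1])
without_4096 = dense_param_count(INPUT_DIM, [1024, 512, 256, 64, 8, 1])

print(with_4096)     # 21649745
print(without_4096)  # 4868433
```

Under these assumptions the first layer alone accounts for about 16.8 million of the roughly 21.6 million parameters, so dropping it reduces the model to under a quarter of its original size.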
The variation in mean squared error is in Attachment 4, & the NN output, applied signal & residual error (zoomed) are in Attachment 5. Here too the test loss increases after ~65 epochs, but this model fits better than the previous one as its loss value is lower.
Case 4: Since in most of the Keras examples the training dataset is larger than the test dataset, I tried training on 1000 cycles & testing on 300 cycles. The respective plots are attached as Attachments 6 & 7. Here also there is no significant improvement, except that the test loss increases more slowly with epochs than in the last case.
Case 5: Since most of the above cases looked like overfitting (https://machinelearningmastery.com/diagnose-overfitting-underfitting-lstm-models/, https://github.com/keras-team/keras/issues/3755), except that the test loss is lower than the train loss in the beginning, I tried implementing case 4 with the initial model of 2 layers of 256 nodes each, but with the Nadam optimizer. The respective graphs are in Attachments 8, 9 & 10 (zoomed). The loss value is slightly higher than in the previous models, as seen from the graph, but the test & train loss values converge after some epochs.
I forgot to add a ylabel to some of the graphs; it is the magnitude of the sine signal applied to move the beam spot. In most of the cases the network fits the data almost correctly, and the test loss is lower than the train loss in the initial epochs. I think this is because of the dropout we added to the model & also because we are training on the clean dataset.