Message ID: 13383     Entry time: Tue Oct 17 17:53:25 2017     In reply to: 13365     Reply to this: 13390
Author: jamie 
Type: Summary 
Category: LSC 
Subject: prep for tests of Gabriele's neural network cavity length reconstruction 

I've been preparing to test Gabriele's deep neural network MICH/PRCL reconstruction.  No changes to the front end have been made yet; this is all just prep/testing work.

Background:

We have been unable to get Gabriele's nn.c code running in kernel space, for reasons unknown (see tests described in previous post).  However, Rolf recently added functionality to the RCG that allows front end models to be run in user space, without needing to be loaded into the kernel.  Surprisingly, this seems to work very well, and is much more stable for the overall system (starting or stopping a user space model will never crash the front end machine).  The nn.c code has been running fine on a test machine in this configuration.  The RCG version that supports user space models is not much newer than what the 40m is running now, so we should be able to run user space models on the existing system without upgrading anything at the 40m.  Again, I've tested this on a test machine and it seems to work fine.

The new RCG with user space support compiles and installs both kernel and user-space versions of the model.

Work done:

  • Created the 'c1dnn' model for the nn.c code.  This will run on the c1lsc front end machine (on core 6, which is currently empty), and will communicate with the c1lsc model via SHMEM IPC.  It lives at:
    • /opt/rtcds/userapps/release/isc/c1/models/c1dnn.mdl
  • Got latest copy of nn.c code from Gabriele's git, and put it at:
    • /opt/rtcds/userapps/release/isc/c1/src/nn/
  • Checked out the latest version of the RCG (currently SVN trunk r4532):
    • /opt/rtcds/rtscore/test/nn-test
  • Set up the appropriate build area:
    • /opt/rtcds/caltech/c1/rtbuild/test/nn-test
  • Built the model in the new nn-test build directory ("make c1dnn")
  • Installed the model from the nn-test build dir ("make install-c1dnn")
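
The last two steps above can be repeated with a small wrapper.  This is only a sketch: the build directory and the two make targets are the ones listed above, while the DRY_RUN switch and the function name are illustrative additions, and it assumes that running make from the top of the build directory is all that is needed.

```shell
#!/bin/sh
# Sketch only: rebuild and install c1dnn from the nn-test build area.
# BUILD_DIR and the make targets come from the list above; DRY_RUN and
# build_and_install are hypothetical helpers, not part of the RCG.
BUILD_DIR=/opt/rtcds/caltech/c1/rtbuild/test/nn-test
MODEL=c1dnn
DRY_RUN="${DRY_RUN:-1}"   # default: only print what would be run

build_and_install() {
    if [ "$DRY_RUN" = "0" ]; then
        # Run both targets from inside the build directory; stop on failure.
        (cd "$BUILD_DIR" && make "$MODEL" && make "install-$MODEL")
    else
        echo "cd $BUILD_DIR"
        echo "make $MODEL"
        echo "make install-$MODEL"
    fi
}

build_and_install
```

Run as is, it only prints the three commands; setting DRY_RUN=0 in the environment actually runs the build and install.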

Test:

I tried a manual test of the new user space model.  Since this is a user space process, running it should have no effect on the rest of the front end system (and it didn't):

  • Manually started the c1dnn EPICS IOC:
    • $ (cd /opt/rtcds/caltech/c1/target/c1dnn/c1dnnepics && ./startupC1)
  • Tried running the model user-space process directly:
    • $ taskset -c 6 /opt/rtcds/caltech/c1/target/c1dnn/bin/c1dnn -m  c1dnn
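
The two manual steps above could be wrapped as follows.  This is a sketch rather than an established procedure: the paths, the core number and the -m argument are taken from the commands above, while DRY_RUN and the function names are made up for illustration, and it assumes startupC1 returns once the IOC has been started.

```shell
#!/bin/sh
# Sketch only: start the c1dnn EPICS IOC, then run the user space model
# pinned to one core.  Function names and DRY_RUN are hypothetical.
TARGET=/opt/rtcds/caltech/c1/target/c1dnn
MODEL=c1dnn
CORE=6
DRY_RUN="${DRY_RUN:-1}"   # default: only print what would be run

ioc_cmd() {
    echo "cd $TARGET/${MODEL}epics && ./startupC1"
}

model_cmd() {
    echo "taskset -c $CORE $TARGET/bin/$MODEL -m $MODEL"
}

start_model() {
    if [ "$DRY_RUN" != "0" ]; then
        ioc_cmd
        model_cmd
        return 0
    fi
    # Refuse to continue if the installed binary is missing.
    if [ ! -x "$TARGET/bin/$MODEL" ]; then
        echo "missing executable: $TARGET/bin/$MODEL" >&2
        return 1
    fi
    # Assumes startupC1 returns after launching the IOC.
    (cd "$TARGET/${MODEL}epics" && ./startupC1) || return 1
    # Pin the model process to the chosen core.
    taskset -c "$CORE" "$TARGET/bin/$MODEL" -m "$MODEL"
}

start_model
```

With the default DRY_RUN=1 it prints the two commands and touches nothing on the front end.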

Unfortunately, the process died with an "ADC TIMEOUT" error.  I'm investigating why.
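
For repeated attempts it may help to keep the model's output and exit status.  The sketch below assumes the "ADC TIMEOUT" message is written to the process's stdout or stderr; the log path and helper names are hypothetical.

```shell
#!/bin/sh
# Sketch only: run a command, keep its output, and flag an ADC timeout.
# LOG, run_logged and saw_adc_timeout are illustrative names; the
# "ADC TIMEOUT" string is the error reported in this entry.
LOG="${LOG:-/tmp/c1dnn_run.log}"

run_logged() {
    # Run the given command with stdout and stderr captured in $LOG,
    # and return the command's own exit status.
    "$@" >"$LOG" 2>&1
}

saw_adc_timeout() {
    # Succeeds if the log file given as $1 contains the error string.
    grep -q "ADC TIMEOUT" "$1"
}

# Usage sketch (not executed here):
#   run_logged taskset -c 6 /opt/rtcds/caltech/c1/target/c1dnn/bin/c1dnn -m c1dnn
#   echo "exit status: $?"
#   saw_adc_timeout "$LOG" && echo "model reported ADC TIMEOUT"
```

This only records that the timeout happened; it says nothing about the cause.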

Once we confirm the model runs, we'll add the appropriate SHMEM IPC connections to connect it to the c1lsc model.

Attachment 1: c1dnn.png  83 kB