Message ID: 17158     Entry time: Fri Sep 23 19:07:03 2022     In reply to: 17153
Author: Tega 
Type: Update 
Category: Computers 
Subject: Work to improve stability of 40m models running on teststand  

[Chris, Tega]

Timing glitch investigation:

  • Moved the dolphin transmit node from c1sus to c1lsc because we suspect that the glitch might be coming from the c1sus machine (earlier, c1pem on c1sus was running faster than real time).
  • Installed and started c1oaf to remove the shared memory IPC error to/from the c1lsc model.
  • /opt/DIS/sbin/dis_diag gives two warnings on c1sus2
    • [WARN] IXH Adapter 0 - PCIe slot link speed is only Gen1
    • [WARN] Node 28 not reachable, but is an entry in the dishosts.conf file - c1shimmer is currently off, so this is fine.
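
A minimal sketch of how the warnings above could be pulled out of saved `dis_diag` output for a quick check. The sample text is just the two warning lines quoted in this entry; the helper names `extract_warnings` and `unreachable_nodes` are hypothetical, and the real output contains much more than these lines.

```python
import re

# Sample lines copied from the dis_diag warnings quoted in this entry.
SAMPLE = """\
[WARN] IXH Adapter 0 - PCIe slot link speed is only Gen1
[WARN] Node 28 not reachable, but is an entry in the dishosts.conf file
"""


def extract_warnings(text):
    """Return the message part of every line that starts with [WARN]."""
    pattern = re.compile(r"^\[WARN\]\s*(.*)$", re.MULTILINE)
    return [m.group(1).strip() for m in pattern.finditer(text)]


def unreachable_nodes(warnings):
    """Return the node numbers reported as not reachable (hypothetical helper)."""
    nodes = []
    for message in warnings:
        m = re.match(r"Node (\d+) not reachable", message)
        if m:
            nodes.append(int(m.group(1)))
    return nodes


if __name__ == "__main__":
    warnings = extract_warnings(SAMPLE)
    print(warnings)
    print(unreachable_nodes(warnings))
```

A node listed as unreachable only matters if that machine is expected to be on; here node 28 is c1shimmer, which is currently off.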

DAQ network setup:

  • Added the DAQ ethernet MAC address and fixed IPv4 address for the front-ends to '/etc/dhcp/dhcpd.conf'.
  • Added the fixed DAQ IPv4 address and port for all the front-ends to '/etc/advligorts/subscriptions.txt' for the `cps_recv` service.
  • Edited '/etc/advligorts/master' to include all the IOP and user model '.ini' files in '/opt/rtcds/caltech/c1/chans/daq/', which contain channel info, and the corresponding testpoint files in '/opt/rtcds/caltech/c1/target/gds/param/'.
  • Created a systemd environment file for each front-end in '/diskless/root/etc/advligorts/' containing the arguments for the local data concentrator and the DAQ data transmitter (`local_dc_args` and `cps_xmit_args`). While we were facing timing glitch issues, we staggered the delay times (-D waitValue) of the front-ends by setting each one to the last number in its DAQ IP address; we should probably set them back to zero to see if it has any effect.
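
The DHCP entry and the staggered delay described above can be sketched as follows. This is an illustration only: the host name, MAC address, and IP address are hypothetical placeholders, not the values used on the teststand, and the helper names are made up for this example. The stanza follows the standard ISC dhcpd `host` syntax; the other `cps_xmit_args` flags are not reproduced here.

```python
import ipaddress


def dhcp_host_stanza(name, mac, ip):
    """Build a dhcpd.conf host entry with a fixed IPv4 address."""
    ipaddress.IPv4Address(ip)  # raises ValueError if the address is malformed
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac};\n"
        f"  fixed-address {ip};\n"
        f"}}\n"
    )


def staggered_wait(ip):
    """Return the last octet of the DAQ IP, used here as the -D wait value."""
    return ipaddress.IPv4Address(ip).packed[-1]


if __name__ == "__main__":
    # Hypothetical front-end entry, for illustration only.
    name, mac, ip = "c1example-daq", "00:11:22:33:44:55", "192.168.0.31"
    print(dhcp_host_stanza(name, mac, ip))
    print(f"-D {staggered_wait(ip)}")
```

Setting the wait value back to zero, as suggested above, would simply mean writing `-D 0` for every front-end instead of the computed octet.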

Other:

  • Edited /etc/resolv.conf on fb1 and 'diskless/root' to enable name resolution (for example, `host c1shimmer`), but the file gets overwritten on chiara for some reason.

Issues:

  1. Frame writing is not working at the moment. It worked for a couple of days at some point in the past, but it stopped working earlier today and we can't quite figure out why.
  2. We can't get data via diaggui or ndscope either. Again, we recall this working in the past, but we are not sure why it has stopped working now.
  3. The CPU load on c1su2 is too high, so we should split it into two models.
  4. We still get the occasional IPC glitch, both for shared memory and dolphin; see attachments.
Attachment 1: dolphin_state_all_green.png  36 kB  Uploaded Fri Sep 23 20:17:33 2022
Attachment 2: dolphin_state_IPC_glitch.png  37 kB  Uploaded Fri Sep 23 20:17:42 2022