ID |
Date |
Author |
Type |
Category |
Subject |
14374
|
Thu Dec 20 17:17:41 2018 |
gautam | Update | CDS | Logging of new Vacuum channels | Added the following channels to C0EDCU.ini:
[C1:Vac-P1b_pressure]
units=torr
[C1:Vac-PRP_pressure]
units=torr
[C1:Vac-PTP2_pressure]
units=torr
[C1:Vac-PTP3_pressure]
units=torr
[C1:Vac-TP2_rot]
units=kRPM
[C1:Vac-TP3_rot]
units=kRPM
Also modified the old P1 channel to
[C1:Vac-P1a_pressure]
units=torr
Unfortunately, we realized too late that we don't have these channels in the frames, so we don't have the data from this test pumpdown logged, but we will have future stuff. I say we should also log diagnostics from the pumps, such as temperature, current etc. After making the changes, I restarted the daqd processes.
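A quick way to confirm that a set of channels is present before restarting daqd is to parse the stanzas and compare against a wish-list. This is only a minimal sketch: it assumes the file is plain INI with one [channel] stanza per channel, as in the lines above, and the helper names are made up for illustration. The real C0EDCU.ini may contain other sections or keys.

```python
import configparser

# Stanzas copied from the entry above; in practice this text would be read from C0EDCU.ini.
SNIPPET = """
[C1:Vac-P1a_pressure]
units=torr
[C1:Vac-P1b_pressure]
units=torr
[C1:Vac-TP2_rot]
units=kRPM
"""


def channels_with_units(ini_text):
    """Return {channel_name: units} for every stanza in the given text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case exactly as written
    parser.read_string(ini_text)
    return {name: parser[name].get("units") for name in parser.sections()}


def missing_channels(ini_text, wanted):
    """Return the channels from `wanted` that have no stanza in the text."""
    have = channels_with_units(ini_text)
    return [name for name in wanted if name not in have]


if __name__ == "__main__":
    print(channels_with_units(SNIPPET))
    print(missing_channels(SNIPPET, ["C1:Vac-P1b_pressure", "C1:Vac-N2_pressure"]))
```

Since configparser is strict by default, a stanza that is accidentally listed twice raises an error instead of being silently merged.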
Things to add to ASA wiki page once the wiki comes back online:
- What is the safe way to clean the cryo pump if we want to use it again?
- What are safe conditions to turn the RGA on?
|
14376
|
Fri Dec 21 11:11:51 2018 |
gautam | Update | CDS | Logging of new Vacuum channels | The N2 pressure channel name was also wrong in C0EDCU.ini, so I updated it this morning to the correct name and units:
[C1:Vac-N2_pressure]
units=psi
Now it too is being recorded to frames. |
14386
|
Fri Jan 4 17:43:24 2019 |
gautam | Update | CDS | Timing issues | [J Hanks (remote), koji, gautam]
Summary:
The problem stems from the way GPS timing signals are handled by the FEs and FB. We effected a partial fix:
- Now, old frame data is no longer being overwritten
- For the channels that are indeed being recorded now, the correct time stamp is being applied so they can be found in /frames by looking for the appropriate gpstime.
Details:
- The usual FE/FB power cycling did not fix the problem.
- The gps time used by FB and associated RT processes may be found by using cat /proc/gps (i.e. this is different from the system time found by using date, or gpstime).
- This was off by 2 years.
- The way this worked up till now was by adding a fixed offset to this time.
- This offset can be found as a line saying set symm_gps_offset=31622382 in daqdrc.fw (for example)
- There were similar lines in daqdrc.rcv and daqdrc.dc - however, they were not all the same offset! We couldn't figure out why.
- All these files live in /opt/rtcds/caltech/c1/target/daqd/.
Changes effected:
- First, we tried changing the offset in the daqdrc.fw file only.
- Incremented it by 24*60*60*365 = number of seconds in a year with no leap seconds/days.
- This did not fix the problem.
- So J Hanks decided to rebuild the Spectracom driver (these commands may not be comprehensive, but I think I got everything).
- The relevant file is spectracomGPS.c (made a copy of /usr/src/symmetricom-3.3~rc1, called symmetricom-3.3~rc1-patched, this file is in /usr/src/symmetricom-3.3~rc1-patched/include/drv)
- Added the following lines:
/* 2018 had no leap seconds or leap days, so adjust for that */
pHardware->gpsOffset += 31536000;
- re-built and installed the modified symmetricom driver.
- Checked that cat /proc/gps now yields the correct time.
- Reset the gps time offsets in daqdrc.fw, daqdrc.rcv and daqdrc.dc to 0
- With these steps, the frames were being written to /frames with the correct timestamp.
- Next, we checked the timing on the FEs
- Basically, J Hanks rebuilt the version of the symmetricom driver that is used by the rtcds models to mimic the changes made for FB.
- This did the trick for c1lsc and c1ioo - cat /proc/gps now returns the correct time on those FEs.
- However, c1sus remains problematic: it initially reported a GPS time from 2 years ago, and even with the re-installed driver it is ~4 days behind the "correct" gpstime. He suspects that this is because c1sus is the only FE with a Symmetricom/Spectracom card installed in the I/O chassis.
- He is going to check with Rolf/other CDS experts to figure out if it's okay for us to simply remove the card and run the models, or if we need to take other steps.
- As part of this work, the c1x02 IOP model was recompiled, re-installed and re-started.
The realtime models were not restarted (although all the vertex FEs are running) - we can regroup next week and decide what is the correct course of action.
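For reference, the one-year arithmetic behind the driver patch can be checked with a few lines of Python. This is a sketch only: it assumes the GPS-UTC leap second offset of 18 s (in effect since 2017) and uses an arbitrary example timestamp. It does not reproduce what the Spectracom driver does internally.

```python
from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
GPS_MINUS_UTC = 18  # leap-second offset in seconds, assumed constant for dates after 2017-01-01
SECONDS_PER_COMMON_YEAR = 365 * 24 * 60 * 60  # no leap day, no leap second


def gps_to_utc(gps_seconds):
    """Convert GPS seconds to a UTC datetime (valid only for the assumed leap-second offset)."""
    return GPS_EPOCH + timedelta(seconds=gps_seconds - GPS_MINUS_UTC)


def utc_to_gps(when):
    """Convert a timezone-aware UTC datetime to integer GPS seconds."""
    return int((when - GPS_EPOCH).total_seconds()) + GPS_MINUS_UTC


if __name__ == "__main__":
    # A clock that is exactly one common year behind (example timestamp, not a logged value).
    stale = utc_to_gps(datetime(2018, 1, 4, 17, 43, 24, tzinfo=timezone.utc))
    fixed = stale + SECONDS_PER_COMMON_YEAR
    print(SECONDS_PER_COMMON_YEAR)
    print(gps_to_utc(fixed).isoformat())
```

Comparing the output of gps_to_utc on the number in /proc/gps against the wall-clock date is a cheap sanity check after any change to the offsets.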
Quote: |
- Attachment #2 shows the minute trend of the pressure gauges for a 12 day period - it looks like there is some issue with the frame builder clock, perhaps this issue resurfaced? But checking the system time on FB doesn't suggest anything is wrong.. I double checked with dataviewer as well that the trends don't exist... But checking the status of the individual daqd processes indeed showed that the dates were off by 1 year, so I just restarted all of them and now the time seems correct. How can we fix this problem more permanently? Also, the P1b readout looks suspicious - why are there periods where it seems like we are reading values better than the LSB of the device?
|
|
14392
|
Wed Jan 9 11:33:35 2019 |
gautam | Update | CDS | Timing issues still persist | Summary:
The gps time mismatch between /proc/gps and gpstime seems to be resolved. However, the 0x4000 DC errors still persist. It is not clear to me why.
Details:
On the phone with J Hanks on Friday, he reminded me that c1sus seems to be the only machine with an IRIG-B timing card installed. I can't find the elog but I remembered that Jamie, ericq and I had done this work in 2016 (?), and I also remembered Jamie saying it wasn't working exactly as expected. Since the DAQ was working fine before this card was installed, and since there are no problems with the recording of channels from the other four FE machines without this card installed, I decided to simply pull out the card from the expansion chassis. The card has been stored in the CDS/FE cabinet along the Y arm for now. There was also a cable that interfaces to the card which brings over the 1pps from the GPS unit, which has also been stored in the CDS/FE cabinet.
This seems to have resolved the mismatch between the gpstime reported by cat /proc/gps and the gpstime commands - Attachment #1 (the <1 second mismatch is presumably due to the deadtime between commands). However, the 0x4000 DC errors still persist. I'll try the full power cycle of FEs and FB which has fixed this kind of error in the past, but apart from that, I'm out of ideas.
Update 1215:
Following the instructions in this elog did not fix the problem. The problem seems to be with the daqd_fw service, which reports the following:
controls@fb1:~ 0$ sudo systemctl status daqd_fw.service
● daqd_fw.service - Advanced LIGO RTS daqd frame writer
Loaded: loaded (/etc/systemd/system/daqd_fw.service; enabled)
Active: failed (Result: start-limit) since Wed 2019-01-09 12:17:12 PST; 2min 0s ago
Process: 2120 ExecStart=/usr/bin/daqd_fw -c /opt/rtcds/caltech/c1/target/daqd/daqdrc.fw (code=killed, signal=ABRT)
Main PID: 2120 (code=killed, signal=ABRT)
Jan 09 12:17:12 fb1 systemd[1]: Unit daqd_fw.service entered failed state.
Jan 09 12:17:12 fb1 systemd[1]: daqd_fw.service holdoff time over, scheduling restart.
Jan 09 12:17:12 fb1 systemd[1]: Stopping Advanced LIGO RTS daqd frame writer...
Jan 09 12:17:12 fb1 systemd[1]: Starting Advanced LIGO RTS daqd frame writer...
Jan 09 12:17:12 fb1 systemd[1]: daqd_fw.service start request repeated too quickly, refusing to start.
Jan 09 12:17:12 fb1 systemd[1]: Failed to start Advanced LIGO RTS daqd frame writer.
Jan 09 12:17:12 fb1 systemd[1]: Unit daqd_fw.service entered failed state.
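If this kind of failure needs to be caught by a script, the "Active:" line of the status output is the part to look at. Below is a minimal sketch that parses text in the format pasted above; the function names are made up for illustration, and the parsing assumes the status layout shown here, which can differ between systemd versions.

```python
import re

# Matches e.g. "Active: failed (Result: start-limit) since ..."
ACTIVE_LINE = re.compile(r"^\s*Active:\s+(\S+)\s+\(([^)]*)\)", re.MULTILINE)

SAMPLE = """\
● daqd_fw.service - Advanced LIGO RTS daqd frame writer
Loaded: loaded (/etc/systemd/system/daqd_fw.service; enabled)
Active: failed (Result: start-limit) since Wed 2019-01-09 12:17:12 PST; 2min 0s ago
"""


def parse_active_line(status_text):
    """Return (state, detail) from the 'Active:' line, or None if there is no such line."""
    match = ACTIVE_LINE.search(status_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


def hit_start_limit(status_text):
    """True if the unit failed because it was restarted too quickly."""
    parsed = parse_active_line(status_text)
    return parsed is not None and parsed[0] == "failed" and "start-limit" in parsed[1]


if __name__ == "__main__":
    print(parse_active_line(SAMPLE))
    print(hit_start_limit(SAMPLE))
```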
Update 1530:
The frame-writer error was tracked down to a C0EDCU issue. Jon told me that the Hornet CC1 pressure gauge channel was renamed to C1:Vac-CC1_pressure, and I made the change in the C0EDCU file. However, it returns a value of 9990000000.0, which the frame writer is not happy about... Keeping the old channel name makes the frame-writer run again (although the actual data is bunk).
Update 1755:
J Hanks suggested adding a 1 second offset to the daqdrc config files. This has now fixed the 0x4000 errors, and we are back to the "nominal" RTCDS status screen now - Attachment #2. |
14455
|
Thu Feb 14 23:14:12 2019 |
gautam | Update | CDS | c1rfm errors | The pressure is still 2e-4 torr according to CC1 so I thought I'd give ASS debugging a go tonight. But the arm transmission signal isn't coming through to the LSC model from the end PDs - so a resurfacing of this problem. Rebooting the sender model, c1scy, did not fix the problem. Moreover, c1susaux is dead. The last time I rebooted it, ITMY got stuck so I'm not going to attempt a revival tonight. |
14457
|
Fri Feb 15 15:22:08 2019 |
gautam | Update | CDS | c1rfm errors persist | I restarted c1scy, c1rfm (so both sender and receiver models were cycled) and power-cycled the c1iscey and c1sus machines. The TRY PD is certainly seeing light - it is just not getting piped over to c1rfm. dmesg doesn't give any clues. I'm out of ideas.
P.S. The new reality seems to be that getting ITMY stuck in the event of a c1susaux reboot is inevitable. As is the practise for ITMX, I tried slowly ramping the PIT and YAW biases to 0 - but in the process of ramping YAW to 0, the optic got stuck. I am ramping in steps of 0.1 (in units of the PIT/YAW sliders, waiting ~3 seconds between steps), I guess I can try ramping even more slowly.
Update: I power cycled the physical RFM switch. This necessitated reboot of all vertex FEs. But seems like things are back to normal now...
Note: to unstick ITMY, seems like the best approach is:
- Jiggle bias until SIDE shadow sensor is on average above its half-light level. This is the critical step. A bias of +20000 cts on the fast SIDE output seems to help.
- Set YAW bias to -10, ramp down the BIAS in steps of 0.1, watching shadow sensor levels to ensure optic doesn't get stuck again.
- Hope for the best. Iterate if necessary.
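The slow ramp described above (steps of 0.1 in slider units, ~3 seconds between steps) could be scripted roughly as follows. This is a sketch under assumptions: `write` is a caller-supplied function that sets the bias slider (for example a thin wrapper around an EPICS put), and no channel names are assumed. Watching the shadow sensor levels during the ramp is still up to the operator.

```python
import math
import time


def ramp_values(start, stop, step=0.1):
    """Return the values to write when going from start to stop in increments of at most `step`."""
    if step <= 0:
        raise ValueError("step must be positive")
    # Number of increments; the small epsilon guards against floating-point round-up.
    n = max(1, math.ceil(abs(stop - start) / step - 1e-9))
    return [round(start + (stop - start) * i / n, 6) for i in range(1, n + 1)]


def ramp_bias(write, start, stop, step=0.1, dwell=3.0, sleep=time.sleep):
    """Write each intermediate value with `write`, pausing `dwell` seconds after each step."""
    for value in ramp_values(start, stop, step):
        write(value)
        sleep(dwell)


if __name__ == "__main__":
    # Dry run: print the values instead of writing to a real channel, with no waiting.
    ramp_bias(print, -1.0, 0.0, step=0.25, dwell=0.0)
```

Passing `write` and `sleep` in as arguments keeps the ramp logic testable without touching the real suspension.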
Quote: |
The pressure is still 2e-4 torr according to CC1 so I thought I'd give ASS debugging a go tonight. But the arm transmission signal isn't coming through to the LSC model from the end PDs - so a resurfacing of this problem. Rebooting the sender model, c1scy, did not fix the problem. Moreover, c1susaux is dead. The last time I rebooted it, ITMY got stuck so I'm not going to attempt a revival tonight.
|
|
14472
|
Sat Mar 2 14:19:35 2019 |
gautam | Update | CDS | FSS Slow servo gains not burt-ed | PSL NPRO PZT voltage showed large low frequency (hour timescale) excursions on the control room StripTool trace, leading me to suspect the slow servo wasn't working as expected. Yesterday evening, I keyed the unresponsive c1psl crate at ~9 PM PST, and had to run the burtrestore to get the PMC locking working. I must have pressed the wrong button on burtgooey or something, because all the FSS_SLOW channels were reset to 0. What's more, their values were not being saved by the hourly burt-snap script, so I don't have any lookback on what these values were. There isn't any detailed record on the elog about what the optimal values for these are, and the most recent reference I could find was Ki=0.1, Kp=Kd=0, which is what I've set it now to. The servo isn't running away, so I'm leaving things in this state, PID tuning can be done later.
I also added the FSS Slow servo channels to the burt snapshot requirement file at /cvs/cds/caltech/target/c1psl/autoBurt.req, and confirmed that the snapshots are getting the channels from now onwards.
While looking at the req file, I saw a bunch of *_MOPA* channels and also several other currently unused channels. Probably would benefit from going through these and commenting out all the legacy channels, to minimize disk space wastage (though we compress the snapshot files every few years anyways I guess).
Reminder that this (unrelated) issue still needs to be looked into... Note also that the new vacuum system does not have burt snapshot set up (i.e. it is still trying to get the old channels from the c1vac1 and c1vac2 databases; while these have significant overlap with the new system, the snapshot should probably be set up correctly). |
14492
|
Thu Mar 21 18:09:36 2019 |
Koji | Update | CDS | db file preparation for acromag c1susaux | I have updated the google doc spreadsheet to indicate the required action for the new dbfile generation.
There are three types of actions:
1. COPY - Just duplicate the old EPICS db entry. This is for soft channels, calc channels.
2. DELETE - Delete the entry for some physical channels that will not be implemented on Acromag (oplev, dewhitening mon, AI monitor, etc)
3. REPLACE - For the physical channels, we want to replace the port names.
The blue part of the spreadsheet indicates the action for each channel. If it is a physical channel, the assigned module and the channel are indicated there. What we still want to do is to use this information to generate the port name, which looks like "@asynMask(C1VAC_XT1221A_ADC 1 -16)MODBUS_DATA" .
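Generating the port name from the spreadsheet columns is a simple string-formatting job. The sketch below assumes the three fields inside the parentheses are the asyn port name, the channel offset on the module and the number of bits (negative for signed data), which is how the example string above reads. The parameter names and the example rows are made up for illustration.

```python
def asyn_mask_link(port, offset, nbits, drv_info="MODBUS_DATA"):
    """Build a link string of the form '@asynMask(<port> <offset> <nbits>)<drv_info>'."""
    return f"@asynMask({port} {offset} {nbits}){drv_info}"


if __name__ == "__main__":
    # Hypothetical rows: (module port name, channel on module, number of bits).
    rows = [
        ("C1VAC_XT1221A_ADC", 1, -16),
        ("C1VAC_XT1221A_ADC", 2, -16),
    ]
    for port, offset, nbits in rows:
        print(asyn_mask_link(port, offset, nbits))
```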
The links to the spreadsheets can be found on 40m wiki: https://wiki-40m.ligo.caltech.edu/CDS/SlowControls/c1susaux |
14505
|
Mon Apr 1 12:01:52 2019 |
Jon | Update | CDS | | I brought c1susaux back online this morning for suspension-channel test scripting. It had been dead for some time. I followed the procedure outlined in #12542. ITMY became stuck during this process, which Gautam tells me always happens since the last vacuum access, but ITMX is not stuck. |
14506
|
Mon Apr 1 22:33:00 2019 |
gautam | Update | CDS | ITMY freed | While Anjali is working on the 1um MZ setup, the pesky ITMY was liberated from the OSEMs. The "algorithm" :
- Apply a large (-30000 cts) offset to the side coil using the fast system.
- Approach the zero of the YAW DoF from -2.00V, PIT from +10V (you'll have to jiggle the offsets until the optic is free swinging, and then step the bias down by 0.1). At this point I had the damping off.
- Once the PIT bias slider reaches -4V, I engaged all damping loops, and brought the optic to its nominal bias position under damping.
While doing this work, I noticed several errors corresponding to EPICS channel conflicts. Turns out the c1susaux2 EPICS server was left running, and the MEDM screens (and possibly several scripts) were confused. There has to be some other way of testing the new crate, on an isolated network or something - please do not leave the modbus service running as it potentially interferes with normal IFO operation. For good measure, I stopped the process and shut down the machine since I saw nothing in the elog about any running tests.
Quote: |
ITMY became stuck during this process
|
|
14507
|
Tue Apr 2 14:53:57 2019 |
gautam | Update | CDS | c1vac added to burt | I deleted references to c1vac1 and c1vac2 (which no longer exist) and added c1vac to the autoburt request file list at /opt/rtcds/caltech/c1/burt/autoburt/requestfilelist |
14508
|
Tue Apr 2 15:02:53 2019 |
Jon | Update | CDS | ITMY freed | I renamed all channels on c1susaux2 from "C1:SUS-..." to "C1:SUS2-..." to avoid contention. When the new system is ready to install, those channel names can be reverted with a quick search-and-replace edit.
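The rename and its later reversal amount to a prefix swap over the database text. A minimal sketch is below; the record shown is a made-up example, not a real entry from the c1susaux2 database, and a real edit should be checked for strings that contain the prefix for other reasons (comments, calc inputs and so on).

```python
def rename_prefix(db_text, old="C1:SUS-", new="C1:SUS2-"):
    """Replace every occurrence of the channel prefix `old` with `new`."""
    return db_text.replace(old, new)


if __name__ == "__main__":
    # Hypothetical record, for illustration only.
    original = 'record(ai, "C1:SUS-EXAMPLE_MON")\n'
    test_copy = rename_prefix(original)
    print(test_copy, end="")
    # Reverting is the same call with the arguments swapped.
    print(rename_prefix(test_copy, old="C1:SUS2-", new="C1:SUS-"), end="")
```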
Quote: |
While doing this work, I noticed several errors corresponding to EPICS channel conflicts. Turns out the c1susaux2 EPICS server was left running, and the MEDM screens (and possibly several scripts) were confused. There has to be some other way of testing the new crate, on an isolated network or something - please do not leave the modbus service running as it potentially interferes with normal IFO operation. For good measure, I stopped the process and shut down the machine since I saw nothing in the elog about any running tests.
|
|
14522
|
Mon Apr 8 11:53:17 2019 |
gautam | Update | CDS | c1oaf needs debugging | I tried restarting c1oaf this weekend to see if turning on the MC length FF would affect the ALS noise performance. I burtrestored the filter settings from March 2016. However, I noticed several possible anomalies, which need debugging. I am not turning the model off because of the possibility of having to reboot all the vertex FEs, but this model is totally unusable right now.
- Attachment #1 - the vertex seismometer input produces 1e+20 cts at the output of the feedforward filter. Attachment #2 shows the shape of the feedforward filters - doesn't explain the saturation. Since this is a feedforward loop, a runaway loop can't be the explanation either.
- The MC length feedforward control signal is supposed to only go to MC2 - but MC1 and MC3 coil outputs were saturated when I enabled the feedforward.
|
14672
|
Thu Jun 13 22:21:44 2019 |
Koji | Configuration | CDS | Paola wireless connected to martian | SURFs had trouble connecting paola to martian via wireless.
Of course, it requires a fixed IP but it did not have one yet. So I went to chiara and gave 192.168.113.110 as "paolawl". Note that the wired connection has .111 and it is "paola".
Followed the instruction on http://nodus.ligo.caltech.edu:8080/40m/14121 |
14692
|
Mon Jun 24 13:48:36 2019 |
Kruthi | Configuration | CDS | Giada wireless connection | [Gautam, Kruthi]
This afternoon, Gautam helped me set up Giada to access the GigE installed for MC2. Unlike Paola, which was being used earlier, Giada has a better battery life and doesn't shut down when the charger is unplugged. Gautam configured Giada to enable its wireless connection to Martian, just like Koji had configured Paola (https://nodus.ligo.caltech.edu:8081/40m/14672). We also rerouted the ethernet cable we were using with the PoE adaptor from Netgear Switch in 1x2 to 1x6. |
14719
|
Tue Jul 2 16:57:09 2019 |
gautam | Update | CDS | c1sus is flaky | Since the work earlier this morning, the fast c1sus model has crashed ~5 times. Tried rebooting vertex FEs using the reboot script a few times, but the problem is persisting. I'm opting to do the full hard reboot of the 3 vertex FEs to resolve this problem.
Judging by Attachment #1, the processes have been stable overnight. |
14744
|
Wed Jul 10 14:57:01 2019 |
Koji | Summary | CDS | Channel recipe for iscaux upgrade | The list of the iscaux channels and pin assignments was posted to google drive.
The spreadsheet can be viewed via the link sent to the 40m ML. It was shared with foteee@gmail for full access.
Summary
- We need
4 ADC modules
5 DAC modules
5 Binary I/O modules
- Be aware that there are bundled multiple digital I/O channels such as "mbboDirect" and "mbbi".
- The full db records of the new channels need to be inferred from the existing channels.
Necessary electronics modification
1. D990694 whitening filter modification (4 modules)
This module shares the fast and slow channels on the top DIN96pin (P1) connector. Also, the whitening selector (done by an analog signal per channel) is assigned over 17pin of the P1 connector, resulting in the necessity of the second DSUB cable. By migrating the fast channels, we can swap the cable from the P1 to P2. Also, the whitening selectors are concentrated on the first Dsub. (See Attachment1 P1)
2. D040180 / D1500308 Common Mode Board
CM servo board itself doesn't need any modification. The CM board uses P1 and P2. So we need to manufacture a special connector for CM Board P2. (cf The adapter board for P1 T1800260). See also D1700058.
3. D990543A1 LSC Photodiode Interface
PD I/F board has the DC mon channels spread over the 16pin limit. P1 21A can be connected to 6A so that we can accommodate it in the first Dsub.
Also the board uses AD797s. This is not necessary. We can replace them with OP27s. I actually don't know what is happening to those bias control, temp mon, enable, and status. These features should be disabled at the I/F and the PDs. (See Attachment2 P1) |
14747
|
Thu Jul 11 12:42:35 2019 |
gautam | Summary | CDS | P2 interface board | I looked into the design of the P2 interface board. The main difficulty here is geometric - we have to somehow accommodate sufficient number of D-sub connectors in the tight space between the two P-type connectors.
I think the least painful option is to stick with Johannes' design for the P1 connector. For the CM board, the P2 connector only uses 6 pairs of conductors for signals. So we can use a D-15 connector instead of 2 D-37 connectors. Then we can change the PCB shape such that the P1 connector can be accommodated (see Attachment #1). The other alternative would be to have 2 P-type connectors and 3 D-subs on the same PCB, but then we have to be extra careful about the relative positioning of the P-type connectors (otherwise they won't fit onto the Eurocrate). So I opted to still have two separate PCBs.
I took a first pass at the design, the files may be found here. I just auto-routed the connections, this is just an electrical feedthrough so I don't think we need to be too concerned about the PCB trace routing? If this looks okay, we should send out the piece for fab ASAP.
I will work on putting together the EPICS server machine (SuperMicro) this afternoon.
Quote: |
2. D040180 / D1500308 Common Mode Board
CM servo board itself doesn't need any modification. The CM board uses P1 and P2. So we need to manufacture a special connector for CM Board P2. (cf The adapter board for P1 T1800260). See also D1700058.
|
|
14749
|
Thu Jul 11 13:08:36 2019 |
Chub | Summary | CDS | P2 interface board | It's nice and compact, and the cost of new 15-pin DSUB cables shouldn't be a factor here. What does the 15p cable connect to? |
14750
|
Thu Jul 11 13:09:22 2019 |
gautam | Summary | CDS | P2 interface board | it will connect to a 15 pin breakout board in the Acromag chassis
Quote: |
It's nice and compact, and the cost of new 15-pin DSUB cables shouldn't be a factor here. What does the 15p cable connect to?
|
|
14764
|
Tue Jul 16 15:17:57 2019 |
Koji | HowTo | CDS | Final bit bug of the BIO CDS module | Yutaro talked about the BIO bug in KAGRA elog. http://klog.icrr.u-tokyo.ac.jp/osl/?r=9536
I think I made a similar change for the 40m model somewhere (don't remember), but be aware of the presence of this bug. |
14765
|
Tue Jul 16 16:00:01 2019 |
gautam | Update | CDS | c1iscaux Supermicro setup | I worked on preparing for the c1iscaux upgrade a bit today.
- Attachment #1: This shows where the 120 GB solid-state hard-drive and the 2 RAM cards (2GB each) are installed.
- I found that it required considerable application of force to get the RAM cards into their slots.
- Note: the 4GB RAM is broken up into two separate physical cards, each 2GB. The labeling is a bit confusing, as each card suggests it is by itself 4GB.
- OS install for c1iscaux:
- I followed Jon's instructions (and added some of mine to the wiki page to hopefully make this process even less thinking-intensive).
- To be able to use the IP address 192.168.113.83, removed "bscteststand" from chiara martian.hosts and rev.113.168.192.in-addr.arpa as the last mention I could find of this machine was from 2009 (and I'm pretty sure it isn't an active unit anymore). I then restarted the bind9 process.
- The hostname for this machine is currently "c1iscaux3" for testing purposes, I will change it once we do the actual install.
- There was an error in the installation instructions to allow incoming ssh connections - it is openssh-server that is required, not openssh-client. This has now been fixed on the wiki page instructions.
- Acromag static IP assignment:
- Assigned 2 ADCs (XT1221), 5 DACs (XT1541) and 5 sinking BIO units (XT1111) static IP addresses (and labelled them for easy reference) using the windows laptop and the Acromag IP config utility.
- I saw no reason not to use the 192.168.114.yyy scheme for the Acromag subnet on this machine, even though c1auxex and c1vac both have subnets with this addressing prefix. For reasons unknown to me, Jon opted to use 192.168.115.yyy for the c1susaux Acromag subnet.
- Followed the excellent step-by-step to install EPICS, Modbus and Asyn.
- This took a while, ~1 hour, dominated by the building of EPICS. The other two took only a couple of minutes each.
- The same combination suggested on Jon's wiki, of Modbus R2-11, EPICS base-7.0.1 and asyn4-33, are the most current at the time of installation.
- Couple of typos that prevented straight up copy-pasting were fixed on the wiki.
- Playground for testing new database files:
- made a directory /cvs/cds/caltech/target/c1iscaux3 and copied over the .db files from /cvs/cds/caltech/target/c1iscaux and /cvs/cds/caltech/target/c1iscaux2 over.
- Johannes said he did not develop any code to automate the process of translating the old .db files into the new ones for the Acromag - I won't invest the time in developing any either as I think just manually editing the files will be faster.
- I think I will follow the c1susaux convention of grouping .db files by the physical electronics system where possible (e.g. REFL11 channels in one file, CM channels in one file etc), as I think this makes for easier debugging.
- There is an old "PZT_AI.db" file which I think consists completely of obsolete channels.
- Next steps:
- Wire up the crate [Chub]
- Make the database files and modbus files for talking to the Acromags on the internal subnet [Gautam], check the .db files [Koji]
- Wiring of whitening switching from P1 to P2 connector, Issue #1 in this elog (this will also require the installation of the DIN shrouds) [Koji]
- Soldering of P2 interface boards [Gautam]
- Bench testing [Gautam, Koji, Chub]
- Installation and in-situ testing [Gautam, Koji, Chub]
All the required additional parts should be here by the end of the week - I'd like to aim for Wednesday 7/24 for the installation in 1Y3 and in-situ testing. While talking to Rana, I realized that we should also factor in the c1aux slow channels into this acromag crate - there is no need for a separate machine to handle the shutters and illuminators. But let's not worry about that for now, those channels can simply be added later. |
14769
|
Wed Jul 17 21:22:41 2019 |
gautam | Update | CDS | CM board Latch Enable subtlety | [koji, gautam]
Koji pointed out an important subtlety pertaining to the "LATCH ENABLE" signal line on the CM board. The purpose of this line is to smoothly facilitate the transition of a change in the "multi-bit-binary-outputs", a.k.a. "mbbo", that are controlled by MEDM gain sliders, to the analog electronics on the CM board. Why is this necessary? Imagine changing the gain from 7dB (=0111 in mbbo representation) to 8dB (=1000 in mbbo representation). In order to realize this change, all 4 bits have to change their state. But this almost certainly doesn't happen synchronously, because our EPICS interface isn't synchronous. So at some intermediate times, the mbbo representation could be 0100 (=4dB), or 1111 (=15dB), or many other possible values, which are all significantly different from either the initial value or the desired final state. This is clearly undesirable.
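To make the 7dB to 8dB example concrete, the sketch below enumerates every code that could appear transiently if the differing bits flip one at a time in an arbitrary order. It is a worked illustration of the argument above, not a model of the actual EPICS or Acromag timing.

```python
from itertools import combinations


def intermediate_codes(start, end, nbits=4):
    """Return all codes reachable by flipping a proper, non-empty subset of the differing bits."""
    differing = [b for b in range(nbits) if (start ^ end) >> b & 1]
    seen = set()
    # k = 0 would be the start code and k = len(differing) the end code, so both are excluded.
    for k in range(1, len(differing)):
        for subset in combinations(differing, k):
            code = start
            for bit in subset:
                code ^= 1 << bit
            seen.add(code)
    return sorted(seen)


if __name__ == "__main__":
    for code in intermediate_codes(0b0111, 0b1000):
        print(f"{code:04b} = {code} dB")
```

Because all four bits differ between 0111 and 1000, every other 4-bit code, including 0100 and 1111, is a possible intermediate state.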
In order to protect against this kind of error, a Latched output part, 74ALS573, is used to buffer the physical digital logic levels from the switches in the analog gain stages. So in the default state, the "LATCH ENABLE" signal line is held "LOW". When a change happens in the EPICS value corresponding to a gain slider, the "LATCH ENABLE" state is quickly toggled to "HIGH", so as to enable the appropriate analog gain stages to be switched, and then again to "LOW", at which point the latch holds its output state. This logic is currently implemented by a piece of code called "latch.o", which is the compiled version of "latch.st", which may be found in /cvs/cds/caltech/target/c1iool0 where it presumably was written for the IMC servo board, but not in /cvs/cds/caltech/target/c1iool0 , which is where the CM board database files reside. The only elog reference I can find pertaining to this particular piece of code is from Alan, and doesn't say anything about the actual logic.
For the new c1iscaux, we need to implement this logic somehow. After discussion between Koji and me, we feel that a piece of python code is sufficient. This would continuously run in the background on the supermicro server machine. The channel hierarchy for each gain channel is as follows (I've taken the example of C1:LSC-CM_REFL1_GAIN):
- C1:LSC-CM_REFL1_GAIN ------ this is the channel tied to an MEDM slider, and so is a "soft" channel
- C1:LSC-CM_REFL1_SET ------- this is a "soft" channel that gets converted to an mbbo
- C1:LSC-CM_REFL1_BITS ------ this is a channel that actually controls (multiple) physical binary outputs on the Acromag
So the logic will be that it continuously scans the EPICS channel C1:LSC-CM_REFL1_GAIN for a change in set value. When a change is detected, it has to update the C1:LSC-CM_REFL1_SET channel. In the next EPICS refresh cycle, this would result in the mbbo bits, C1:LSC-CM_REFL1_BITS , all changing to the appropriate values. After these changes have happened, we need to toggle the LATCH ENABLE in order to allow the changes to propagate to the analog gain stage switches. Need to think about what's the best way to do this. |
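One possible shape for this logic is sketched below. It is not the final implementation and makes several assumptions: `caget` and `caput` are caller-supplied functions for reading and writing EPICS channels (for example the ones provided by pyepics), the latch channel name is hypothetical since the entry does not name it, and a fixed settle time stands in for "the next EPICS refresh cycle".

```python
import time

GAIN_CH = "C1:LSC-CM_REFL1_GAIN"
SET_CH = "C1:LSC-CM_REFL1_SET"
LATCH_CH = "C1:LSC-CM_REFL1_LATCH"  # hypothetical name, for illustration only


def apply_gain_change(caget, caput, gain_ch, set_ch, latch_ch, last_gain,
                      settle=0.5, sleep=time.sleep):
    """Propagate a changed gain value, then pulse LATCH ENABLE high and back to low.

    Returns the gain value that is now in effect, to be passed in as
    `last_gain` on the next call.
    """
    gain = caget(gain_ch)
    if gain == last_gain:
        return last_gain          # nothing changed, leave the latch low
    caput(set_ch, gain)           # soft channel that is converted to the mbbo bits
    sleep(settle)                 # give the binary outputs time to reach their new state
    caput(latch_ch, 1)            # latch transparent: new bits reach the gain switches
    sleep(settle)
    caput(latch_ch, 0)            # latch holds its output state again
    return gain


def run_forever(caget, caput, poll=0.2):
    """Poll the slider channel and apply changes as they appear."""
    last = caget(GAIN_CH)
    while True:
        last = apply_gain_change(caget, caput, GAIN_CH, SET_CH, LATCH_CH, last)
        time.sleep(poll)
```

Polling is the simplest option. A monitor callback on the GAIN channel would avoid the polling delay, at the cost of having to serialize overlapping changes.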
14770
|
Thu Jul 18 00:51:52 2019 |
Koji | Summary | CDS | iscaux electronics modifications | Along with the plan in ELOG 14744, the ISC PD interface and the whitening filter board have been modified. The ISC PD I/Fs were restored to the crate and the cables were connected. The whitening filters are still on the electronics bench for some more tests before being returned to the crate.
The updated schematics were uploaded as https://dcc.ligo.org/D1900318 and https://dcc.ligo.org/D1900319
- Modification of the ISC PD interface: Jumpers between DIN96 P1 and P2. Replace all AD797s with OP27s. In fact only I/F #1 (the leftmost) had a total of 12 AD797s; the other units already had OP27s.
- Modification of the whitening filter: Jumpers between DIN96 P1 and P2. |
14771
|
Thu Jul 18 10:46:04 2019 |
gautam | Update | CDS | Database files made | I completed the translation of the .db files for the EPICS database records from the VME notation to the Acromag/Modbus/Asyn notation. The channels are now organized into 5 database files, located in /cvs/cds/caltech/target/c1iscaux3/, for convenience:
- C1_ISC-AUX_LSCPDs.db -------- This handles whitening gain, AA enable/bypass, Demodulator FE, and PD Interface Board channels for REFL11, REFL55, REFL33, REFL165, POP22, POP110, POX11, POY11, AS55 and AS110 photodiodes.
- C1_ISC-AUX_CM.db -------------- This handles all channels for the CM board. The mbbo addressing notation needs to be checked.
- C1_ISC-AUX_QPDs.db ----------- This handles all channels for the IPPOS QPD.
- C1_ISC-AUX_ALS.db ------------- This handles all channels for the IR ALS DFD LO and RF power monitoring.
- C1_ISC-AUX_SPARE.db ---------- This handles the unused channels for the various whitening, AA and PD interface boards.
For reasons unknown to me, the database files in the other Acromag system target directories (e.g. c1susaux, c1auxex) all had 755 level access permission - maybe this is required for systemctl to handle the EPICS serving? Anyways, I upgraded the permission level of the above 5 files using chmod.
There are almost certainly typos / other errors, and I may have missed copying over some soft/calibrated channels, but I hope that this way of grouping by subsystem will make the debugging less painful. Once Chub connects up the power lines to the Acromags, I will run the soft tests. For this purpose, I've also made a C1_ISC-AUX.cmd file and a C1_ISC-AUX.env file in the above target directory, and also made the modbusIOC.service file in /etc/systemd/system on the supermicro. |
14773
|
Thu Jul 18 19:58:56 2019 |
gautam | Update | CDS | Work on Acromag chassis | Now that the .db files were prepared, I wanted to test for errors. So I did the following:
- Acromags were mounted on the DIN rails. Attachment #1 shows the grouping of ADC, DAC and BIO units. They are labelled with their IP addresses.
- Wiring of power:
- Chub had already prepared the backplane with the power connectors, switches and indicator LEDs.
- So I just had to daisy chain the +24 V (RED) and GND (BLACK) terminals for all the acromags together, which I did using 24 AWG wire (we may want to use heavier gauge given the current draw).
- Ethernet cables were used to daisy chain the network connectivity between the various units. Attachment #1 shows the current state of the chassis box.
- Front panel pieces were attached and labelled, see Attachment #2.
- I found it was sufficient to use the front - we may use the rear panel slots when we want to add connections for controlling the c1aux machine channels.
- The D15 P2 connector panel for the CM board will arrive tomorrow and will be installed then.
- Entire setup was connected to power and ethernet, see Attachment #3.
- As usual, the current draw is significant for the collection of Acromags. I got around this problem by setting the bench supply to "Parallel" mode to enhance the current driving capacity.
- For the ethernet connection, I used the office space port #6, which I connected at the network rack end to the eth1 port of the Supermicro.
All the Acromags are seen on the 192.168.114 subnet on c1iscaux3 - however, when I run the modbusIOC process, I see various errors in the logfile, so more debugging is required. Nevertheless, progress.
Update 2245: Turns out the errors were indeed due to a copy/paste error - I had changed the IP addresses for the ADCs from the .115 subnet c1susaux was using, but forgot to do so for the DACs and BIOs. Now, if I turn off the existing c1iscaux so that there aren't any EPICS clashes, the EPICS server initializes correctly. There are still some errors in the log file - these pertain to (i) the mbbo notation, which I have to figure out, and (ii) the fact that this version of EPICS, 7.0.1, does not support channel descriptions longer than 28 characters (we have several that exceed this threshold). I think the latter isn't a serious problem.
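The description-length limit mentioned in the update can be checked offline before starting the IOC. Below is a minimal sketch, assuming the records use the standard `field(DESC, "...")` syntax. The helper name `find_long_descs` and the sample record are made up for illustration, and 28 is simply the threshold quoted above.

```python
import re

# Threshold quoted in this entry; adjust if the EPICS build differs.
MAX_DESC_LEN = 28

# Matches field(DESC, "some text") in an EPICS database file.
DESC_RE = re.compile(r'field\s*\(\s*DESC\s*,\s*"([^"]*)"\s*\)')


def find_long_descs(db_text, limit=MAX_DESC_LEN):
    """Return (line_number, description) pairs whose DESC exceeds limit."""
    hits = []
    for lineno, line in enumerate(db_text.splitlines(), start=1):
        match = DESC_RE.search(line)
        if match and len(match.group(1)) > limit:
            hits.append((lineno, match.group(1)))
    return hits


# Hypothetical record, for illustration only (not a real channel).
sample = '''record(ao, "C1:TEST-EXAMPLE_GAIN")
{
    field(DESC, "An example description that is far too long")
    field(DTYP, "asynInt32")
}
'''

for lineno, desc in find_long_descs(sample):
    print(f"line {lineno}: DESC has {len(desc)} characters")
```

Running something like this over each .db file would list the records that need shorter descriptions, if we decide the warnings are worth silencing.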
Getting closer... Note that I turned off the c1iscaux VME crate to prevent any EPICS server clashes. I will turn it back on tomorrow. |
14775
|
Thu Jul 18 22:34:40 2019 |
Koji | Summary | CDS | iscaux electronics modifications | The whitening filter modules have been restored to the crates. The SMA cables have been restored and fastened by a spanner. The ribbon cable to the antialiasing board was also connected. The backplane cables have not been moved from the upper DIN96 connector to the lower one.
Everything is expected to be good, but just keep eyes on the LSC signals as the boards were not quantitatively tested yet. If you find something suspicious, report on the elog. |
14781
|
Fri Jul 19 19:44:03 2019 |
gautam | Update | CDS | Database file test | Summary:
The database files for C1ISCAUX seem to work fine - the exception being the mbbo channels for the CM board.
Details:
This was just a software test - the actual functionality of the channels will have to be tested once the Acromag crate has been installed in the rack. One change I had to make on the MEDM screen for the LSC PD whitening gains was to get rid of the "NMS" suffix on the EPICS channel names for whitening gain sliders/drop-down-menus. I suspect this has to do with the EPICS version we are using, 7.0.1. Furthermore, AS165 and POP55 no longer exist - I hold off removing them from the MEDM screen for the moment.
Next steps:
From the software point of view, the major steps are:
- Fix the mbbo channel notation in the database files
- Write and test the latch enabling code
- Figure out what scripted tests can be done to test the functionality of the new Acromag box.
I am stopping the EPICS server on the new machine and restarting the old VME crate over the weekend. |
14785
|
Sat Jul 20 11:57:39 2019 |
gautam | Summary | CDS | P2 interface board | The boards arrived. I soldered on a DIN96 connector, and tested that the geometry will work. It does. The only constraint is that the P2 interface board has to be installed before the P1 interface is installed. Next step is to confirm that the pin-mapping is correct. The pin mapping from the DIN96 connector to the DB15 was also verified.
*Maybe it isn't obvious from the picture, but there shouldn't be any space constraint even with the DB37/DB15 cables connected to the respective adapter boards. |
14790
|
Sun Jul 21 12:55:38 2019 |
gautam | Update | CDS | CM board Latch Enable test script | DATED, SEE ELOG14941 for the most up-to-date info on latch.py.
I wrote (/cvs/cds/caltech/target/c1iscaux3/latch.py) and tested the logic illustrated in Attachment #1. Results of a test are shown in Attachment #2, the various channels change as expected. Note that for negative values of the gain channel, the corresponding "BITS" channel will take on values like 65536 - this is because the mbboDirect data type is a 16 bit data type, and presumably the MSB is the sign bit. A bit mask is applied to this channel before the actual BIO unit bits are set - we should verify that the correct behavior happens, but I don't immediately see any problems.
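The wrap-around for negative gains described above is what a two's-complement reinterpretation of the value would produce. Here is a sketch under that assumption. The function names and the 6-bit mask are illustrative and are not taken from latch.py.

```python
def to_uint16(value):
    """Reinterpret a signed integer as an unsigned 16-bit word."""
    return value & 0xFFFF


def masked_bits(value, mask):
    """Keep only the bits selected by mask after the 16-bit wrap."""
    return to_uint16(value) & mask


# Hypothetical example: a gain setting of -1 and a 6-bit output group.
print(to_uint16(-1))               # 65535
print(masked_bits(-1, 0b111111))   # 63
```

If the real BITS channel behaves this way, the mask discards the high-order bits set by the sign extension, so only the low bits reach the BIO unit.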
To me, this is a robust logic, but it will benefit from more sets of eyes giving it a look over. The idea is to run this continuously on the Supermicro machine.
Apart from this, I also fixed some errors in the mbboDirect record syntax - so now I am able to start up the EPICS server without it throwing any error messages. It remains to verify that changing an EPICS gain slider results in the appropriate gain bits being flipped in the correct way (on the hardware side, I think the correct behavior is happening on the software end). For this testing, I turned off the old c1iscaux crate at ~10am, and started up the server on c1iscaux3. I am reverting to the nominal config now (~1pm).
Further testing will require the wiring inside the Acromag chassis to be completed. This should be the priority task for next week.
*Update 1130 22 July 2019: I've now installed the required dependencies on c1iscaux3 and setup the latch.py script to run as a systemctl process dependent on modbusIOC.service. |
14795
|
Mon Jul 22 07:21:13 2019 |
gautam | Update | CDS | painosa messed with | Somebody changed the settings on painosa without elogging anything about it. Why does this keep happening? I thought the point of the elog was to communicate. I think there are sufficient number of problems in the lab without me having to manually reset the control room workstation settings every week. Please make an elog if you change something. |
14832
|
Tue Aug 6 14:55:23 2019 |
gautam | Update | CDS | Making Matlab R2015b the default | ML2013 is unable to open Simulink on any of the workstations. We decided to make the default version of Matlab R2015b (the default of the version of RCG we are using).
I commenced the procedure of the migration, starting with making a tagged commit of the current running simulink models. A local backup was also made, plus we have the usual chiara-based backup so I think we're in good hands.
Currently the branch and tag are protected - once we verify that everything works as expected post migration, I will open it up. I changed the directory structure of the models, need to confirm that the rtcds compilers don't have any hardcoded paths which may break due to my change.
The symlink to Matlab R2013 was deleted and a new symlink to R2015b was made. I activated the license using the Caltech campus license. Now running matlab from shell starts up R2015b . Simulink even works 😲 . |
14837
|
Fri Aug 9 08:59:04 2019 |
gautam | Update | CDS | Prep for install of c1iscaux | [chub, gautam]
We scoped out the 1Y3 rack this morning to figure out what needs to be done hardware wise. We did not think about how to power the Acromag crate - the LSC rack electronics are all powered by linear supplies and not Sorensens, and the linear supplies are operating at pretty close to their maximum current-drive. The Acromag box draws ~3A of current from the 20 V supply, not sure what the current draw will be from the 15 V supply. Options:
- Since there are sorensens in 1Y2 and 1Y1, do we really care about installing another pair of switching supplies (+20 V DC and +15 V DC) in 1Y3?
- Contingent on us having two spare Sorensens available in the lab. Chub has already located one.
- Use the Sorensens installed already in 1Y1.
- Probably the easiest and fastest option.
- +15 V already available, we'd have to install a +20 V one (or if the +/-5 V or +12 V is unused, reconfigure for +20 V DC).
- Can argue that "this doesn't make the situation any worse than it already is"
- Will require running some long (~3 m) cabling to bring the DC power to 1Y3 where it is required.
- Get new linear supplies, and hook them up in parallel with the existing.
- Need to wait for new linear supply to arrive
- Probably expensive
- Questionable benefit to electronics noise given the uncharacterized RF pickup situation at 1Y2
I'm going with option #2 unless anyone has strong objections. |
14840
|
Sun Aug 11 11:47:42 2019 |
gautam | Update | CDS | Bench test of c1iscaux | I bench tested the functionality of all the c1iscaux Acromag crate channels. Summary: we are not ready for a Monday install, much debugging remains.
- DAC channels were tested using 4 ch oscilloscope and stepping the whitening gain sliders through their 15 gain settings
- Response was satisfactory - the output changes between 0 - 5 V DC in 15 steps.
- This analog voltage is converted to binary representation by an on-board ADC on the whitening boards. So we may have to tune the offset voltage and range to avoid accidental bit flipping due to the analog voltage of a particular step falling close to the bit-flipping edge of the on-board ADC. This will require an in-situ test.
- Test passed
- BIO output channels were tested using a DMM, and monitoring the resistance between the BIO pin and the RTN pin. In the "ON" state, the expected resistance is ~5 Mohm, and in the off state, it is ~3 ohms.
- The AA filter switches on BIO1 unit do not show the expected behavior - @ Chub, please check the wiring.
- All others (except the mbboDirect bits, see next bullet) were okay, including those for the CM board that are NOT part of the mbboDirect groups.
- Test failed
- ADC channels were tested by driving a ~2Vpp 300mHz sine wave with a function generator, and looking at the corresponding EPICS channel with StripTool.
- I found that all the ADC channels don't function as expected.
- Part of the problem is due to incorrect formatting of the EPICS records in the db files, but I think the ADCs also need to be calibrated with the precision voltage source.
- Why only ADCs require calibration and not the DACs????
- Test failed
- mbboDirect BIO output test - I made a little LED breadboard tester kit to simultaneously monitor the status of these groups of binary outputs.
- The LSB is toggled as expected when moving the gain slider along.
- However, the other bits in the group are not toggled correctly.
- I believe this is a problem with either (i) the way the EPICS record is configured to address the bits or (ii) the incorrect modbus datatype is used to initialize the ioc.
- It will be helpful if someone can look into this and get the mbboDirect bits working, I don't really want to spend more time on this.
- Test failed
I am leaving the crate powered (by bench supplies) in the office area so I have the option to work remotely on this. |
14841
|
Mon Aug 12 17:36:04 2019 |
gautam | Update | CDS | More bench test of c1iscaux | [chub, gautam]
With Chub's help, most of the problems have been resolved. Summary: I judge that we are good to go ahead with an install tomorrow.
- The problem with the BIO channels was a mis-wiring internal to the chassis - Chub fixed this and now all 32 AA enable/disable switches seem to work as advertised. Of course we will need to do the in-situ test to make sure.
- The problem with the ADC channels were multiple:
- On the software end, I had gotten some addressing wrong - this was fixed.
- On the hardware side - even though the inputs of the Acromag are "differential", I found that the readback was extremely noisy (~0.5 V RMS for a 3 V DC signal from the handheld calibrator unit 😲 ). Looking through the manual, I found a recommendation (pg10) that the "IN-" terminal of the Acromag ADC units be tied to the "RTN" pins on the same units. I don't know if this preserves the differential receiving capability of the Acromag ADCs - anyways, after Chub implemented this change, all the Analog Input channels behave as expected (I tested with a DC voltage and also a 200 mHz sine wave from a function generator).
- Note that most of the Eurocard electronics we use are single-ended sending anyways.
- What does this mean for the other Acromag ADCs (e.g. OSEM Shadow Sensor monitors) we have installed????? I saw no documentation in the elog/wiki.
- Binary input channel:
- This is used by the "CM LIMIT" channel.
- I found that I had to initialize a separate alias for the BIO3 unit, which acquires this signal, to use the modbus function "4" corresponding to "Read Input Registers" - c.f. the binary output modbus function 6, which is to "Write Single Register".
- The fix for the mbbo channels is also likely to be along these lines - but I don't have the energy for that endeavor right now.
- Testing of the physical mbboDirect bit channels using the Acromag Window utility
- I can't get the mbboDirect EPICS record to work as expected, so I decided to use the native Acromag utility to test the functionality
- First I released control of the acromags from the supermicro (stopped modbus)
- There were several wiring errors - Chub had left for the day so I just fixed it myself.
- The LED tester kit was used to check that the correct bits were flipped - they were.
- At the time of writing, the non-functional channels (in EPICS) are all related to the CM board:
C1:LSC-CM_LIMIT (binary input) tested later in the day, works okay...
- C1:LSC-CM_REFL1_BITS (mbboDirect)
- C1:LSC-CM_REFL2_BITS (mbboDirect)
- C1:LSC-CM_AO_BITS (mbboDirect)
- C1:LSC-CM_BOOST2_BITS (mbboDirect)
Since we don't immediately need the CM board, I say we push ahead with the install - at least that will restore the ability to lock PRMI / DRMI. Then we can debug these issues in situ - I'm certain the issue is related to the EPICS/Modbus setup and not the hardware because I verified the physical channel map using the Acromag windows utility.
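For reference while debugging the mbboDirect records, the function numbers mentioned in the binary input item are standard Modbus function codes. The table below is a small lookup sketch. The codes come from the Modbus application protocol, but the dictionary and helper names are just for illustration and are not part of the IOC configuration.

```python
# Standard Modbus function codes (Modbus application protocol).
MODBUS_FUNCTIONS = {
    1: "Read Coils",
    2: "Read Discrete Inputs",
    3: "Read Holding Registers",
    4: "Read Input Registers",
    5: "Write Single Coil",
    6: "Write Single Register",
    15: "Write Multiple Coils",
    16: "Write Multiple Registers",
}


def is_read_function(code):
    """True for the function codes that read data from the device."""
    return code in (1, 2, 3, 4)


# The two codes discussed in this entry.
for code in (4, 6):
    kind = "read" if is_read_function(code) else "write"
    print(f"function {code}: {MODBUS_FUNCTIONS[code]} ({kind})")
```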
Remaining Tasks:
- Install power supply cables at 1Y3
- Install supermicro and Acromag crates in 1Y3
- Migrate existing P1 connectors to P2 where applicable (Whitening boards)
- Connect Dsub-->P1 / P2 adaptors
- Run in-situ tests
Quote: |
I bench tested the functionality of all the c1iscaux Acromag crate channels. Summary: we are not ready for a Monday install, much debugging remains.
|
|
14843
|
Mon Aug 12 21:25:19 2019 |
Koji | Update | CDS | More bench test of c1iscaux | 1.
> Looking through the manual, I found a recommendation (pg10) that the "IN-" terminal of the Acromag ADC units be tied to the "RTN" pins on the same units. I don't know if this preserves the differential receiving capability of the Acromag ADCs
I suppose we lose the differential capability of an input if the -IN is connected to whatever defined potential. We should check if the channels are still working as a true differential or not.
2. If the multi bit operation is too complicated to solve, we can use EPICS Calc channels to breakout a value to bits and send the individual bits as same as the other individual binary channels.
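The bit breakout suggested in item 2 amounts to the arithmetic below, shown in Python rather than as EPICS calc records. The function name and the 6-bit width are assumptions for illustration only.

```python
def break_out_bits(value, n_bits):
    """Return the n_bits least-significant bits of value, LSB first."""
    return [(value >> i) & 1 for i in range(n_bits)]


# Example: 0b100101 split into six individual bits.
bits = break_out_bits(0b100101, 6)
print(bits)  # [1, 0, 1, 0, 0, 1]
```

Each list element would then drive one individual binary output channel, in the same way as the existing single-bit channels.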
|
14844
|
Tue Aug 13 08:07:09 2019 |
gautam | Update | CDS | P1--->P2 | This morning, I wanted to move the existing cables going to the P1 connectors of the iLIGO whitening boards to the P2 connector, to test the modifications made to allow whitening stage switching. Unfortunately, I found that the shrouds weren't installed. Where can I find these? |
14845
|
Tue Aug 13 14:36:17 2019 |
gautam | Update | CDS | P1--->P2 | As it turns out, only one extra shroud needed to be installed - I did this and migrated the cables for the 4 whitening boards from the P1 to P2 connectors. So until the new Acromag box is installed, we have no control over the whitening gains (slow channels), but do still have control over the whitening filter enable/disable (controlled by fast BIO). I am thinking about the easiest way to test the latter - I think the ambient PD dark noise level is too low to be seen above ADC noise even with the whitening enabled, and setting up drive signals to individual channels is too painful - maybe with +45dB of whitening gain, the (z,p) whitening filter shape can be seen with just PD/demod chain electronics noise.
Quote: |
This morning, I wanted to move the existing cables going to the P1 connectors of the iLIGO whitening boards to the P2 connector, to test the modifications made to allow whitening stage switching. Unfortunately, I found that the shrouds weren't installed. Where can I find these?
|
|
14848
|
Fri Aug 16 16:40:04 2019 |
gautam | Update | CDS | 1Y3 work | [chub, gautam]
Installation: The following equipment were installed in 1Y3, see Attachment #1:
- Supermicro server, which is the new c1iscaux machine, with IP Address 192.168.113.83.
- 6U Acromag chassis which contains all the ADCs, DACs and BIO units.
- 2 Sorensen DC power supplies to provide +24 V DC and +15 V DC to the Acromags.
- Fusable DIN rail power blocks were installed on the North side of the 1Y3 rack - I placed 2 banks of 5 connectors each for +15 V DC and +24 V DC.
Removal: The following equipment was removed from 1Y3:
- VME crates that were the old c1iscaux and c1iscaux2 machines.
- Spare VME crate that used to be c1susaux, which Chub and I brought over to 1Y3 in an attempt to revive the broken c1iscaux2.
- Approximately 30 twisted ribbon cables that were going to the cross connects. For now, we have not done a full cleanup and they are just piled along the east arm (see Attachment #2), beware if you are walking there!
Software:
- I connected the c1iscaux machine to the martian network.
- Then I edited the relevant files on chiara to free up the IP addresses previously used by c1iscaux (192.168.113.81) and c1iscaux2 (192.168.113.82), and re-assigned the IP address used for c1iscaux to be 192.168.113.83.
- I also changed the hostname of the c1iscaux machine (it was temporarily called c1iscaux3 to allow bench testing).
- I moved the old /cvs/cds/caltech/target/c1iscaux and /cvs/cds/caltech/target/c1iscaux2 directories to /cvs/cds/caltech/target/preAcromag_oldVME/c1iscaux and /cvs/cds/caltech/target/preAcromag_oldVME/c1iscaux2 respectively.
- I moved the temporarily named /cvs/cds/caltech/target/c1iscaux3 directory, from which I was running all the tests, to /cvs/cds/caltech/target/c1iscaux.
- I edited all references to c1iscaux3 in the systemd files so that we can run the appropriate systemd services.
Next steps:
- We did not get around to running the DB37 cables between the Acromag chassis and the 1Y2 Eurocrates today - this operation itself took the whole day as we also needed to lay out some support struts etc on the rack to support the Sorensens and the Acromag chassis.
- Once the Acromags are connected to the Eurocrates, we have to run in-situ tests to make sure the appropriate functionality has been restored.
- We must have bumped something in the c1lsc expansion chassis - the CDS FE overview screen is reporting some errors (see Attachment #3). I will fix this.
- General tidiness, strain-relief etc.
Quote: |
I judge that we are good to go ahead with an install tomorrow.
|
|
14849
|
Sat Aug 17 16:49:23 2019 |
gautam | Update | CDS | More 1Y3 work | Work done today:
- All ribbon cable connections to the backplane of the 1Y2 Eurocrates were removed. The cables themselves were cleared for more space to work with.
- 20x 15ft DB37 Cables were run between 1Y2 and 1Y3 via overhead cable tray.
- Backplane interface boards were installed for 1Y2 Eurocrate boards.
- Connections were made between the Acromag chassis and the eurocrate electronics modules.
Testing of functionality:
- Fast BIO switching was verified to work for the following photodiodes:
- AS55, AS110, REFL11, REFL33, REFL55, REFL165, POX11, POY11, POP22, POP110.
- No light was incident on the PDs.
- Test was done by increasing the whitening gain to +45 dB, and then looking at the ASD of the electronics noise between 50 Hz and 500 Hz with the whitening enabled/disabled. We expect x10 difference between the two states. This was seen.
- "DetMon" channels were verified to work - see Attachment #1
- Y-axis units is volts
- Test was done by toggling the output of the 11 MHz Marconi, and looking for a change.
- As seen in the attachment, all 5 monitor channels show a change.
- This needs to be calibrated into some sensible units - I don't know why the different modulation frequencies have such different readbacks from supposedly identical Demod Board monitor points.
- Not sure if the ~10 V reported by the REFL165 monitor point is real or saturated.
- These channels are installed to signal/help debug the infamous ERA-5 decay problem, but maybe already some are decayed?
- QPD interface channels were verified to work - see Attachment #2.
- Test was done by shining a green laser pointer on QPD quadrants.
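For reference, the dB and ratio figures quoted in the whitening test above convert as follows. This is a minimal sketch, assuming the x10 figure refers to an amplitude (ASD) ratio. The function names are illustrative.

```python
import math


def db_to_amplitude_ratio(db):
    """Convert a gain in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)


def amplitude_ratio_to_db(ratio):
    """Convert a linear amplitude ratio to dB."""
    return 20 * math.log10(ratio)


print(round(db_to_amplitude_ratio(45), 1))  # 177.8
print(amplitude_ratio_to_db(10))            # 20.0
```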
Much testing remains to be done, but I defer further testing till Monday - the main functionality to be verified in the short run is the whitening gain stepping. The strain-relief of cables and general cleanup will be undertaken by Chub. Current state of affairs is in Attachment #3, leaves much to be desired in terms of cleanliness.
I will also setup the autoburt for the new machine on Monday. We will also need to add some channels to C0EDCU.ini if we want to trend them over some years (e.g. RF signal powers for monitoring ERA-5 health).
* c1lsc FE was rebooted using the usual script, and everything seems to be healthy in CDS-land again, see Attachment #4.
Quote: |
Next steps:
- We did not get around to running the DB37 cables between the Acromag chassis and the 1Y2 Eurocrates today - this operation itself took the whole day as we also needed to lay out some support struts etc on the rack to support the Sorensens and the Acromag chassis.
- Once the Acromags are connected to the Eurocrates, we have to run in-situ tests to make sure the appropriate functionality has been restored.
- We must have bumped something in the c1lsc expansion chassis - the CDS FE overview screen is reporting some errors (see Attachment #3). I will fix this.
- General tidiness, strain-relief etc.
|
|
14850
|
Mon Aug 19 14:36:21 2019 |
gautam | Update | CDS | c1iscaux remaining work | Here is what is left to do:
- Strain relief of all cabling. Chub will take care of this in the coming days. I have said he can connect and disconnect cables as he pleases, but after this work, we may require a hard reboot of the Acromag chassis before restoring functionality to the channels, as it is known that the Acromags can sometimes get "stuck" by a sudden connection of voltage.
- Installation of DB15 cable to the P2 connector of the CM board and a DB9 cable to the ALS demod unit (LO and RF power monitors). These will arrive in the next couple of days and Chub will take care of the install.
- Design, manufacture and install of a custom version of the backplane P1 adaptor board with only 1 D37 connector - for some of the PD DC signals, a custom adaptor board, part number D010005 for which I can't find any schematics is already installed on the P2 connector, and makes the DC monitor signals available to 4 LEMO connectors. These signals are then digitized by the fast CDS system, presumably for PDH signal normalization. The footprint of this P2--->LEMO adaptor is such that we cannot simply install our P1---> 2xDB37 adaptor boards, because of space constraints. Fortunately, there is a simple fix to reduce the footprint of the board: remove the bottom DB37 connector, which is unused in the c1iscaux system except for the CM board. I recommend getting ~10 pcs of such boards, as it is also useful in a few other places, where the power cabling to the eurocrates are a space constraint. See Attachment #1 for a picture explaining this situation. Anyone want to volunteer to take care of this?
- In-situ testing. This is easiest done with some light available in the interferometer. Which in turn requires IMC to be locked. Which in turn requires satellite box fixing. Anyone want to volunteer to take care of this?
- Modify C0EDCU.ini to trend the new slow channels we may want long-term monitoring of (e.g. LO power levels to the Demod boards). Anyone want to volunteer to take care of this?
- Decide what to do about the CM latch logic. There are some constraints with the way the Acromag register addressing works, so I've had to change the way the mbboDirect bits are controlled. Unfortunately, this seems to sometimes and unpredictably cause the bits to flip in a non-robust way, and preventing exactly that is the whole point of having the latch in the first place. Either the latch logic needs to be improved, or we need to implement the latch logic in the fast CDS system, not the slow.
Today I set up the autoburt.req file for the c1iscaux channels, and confirmed that the snapshots are getting recorded. There were a lot of channels in the old autoburt.req file which I thought were un-necessary (and several which no longer exist), so now the only channels that are burt-ed are the whitening gains and states of the AA filters. If someone feels we need more channels to be snapshot recorded, you can add them to the file.
In the old target directory, there were also various versions of a "saverestore.req" file - why do we need this in addition to an autoburt? I guess it is possible they are used by the IFOconfigure scripts to setup some whitening gains etc... |
14851
|
Tue Aug 20 19:05:24 2019 |
Koji | Update | CDS | MC1 (and MC3) troubleshoot | Started the troubleshoot from the MC1 issue. Gautam showed me how to use the fake PD/LED pair to diagnose the satellite box without involving the suspension mechanics.
This revealed that the MC1 has frequent light level glitches which are common to all five sensors. This feature does not exist in the test with the MC3 satellite box. I will open and check the MC1 satellite box to find the cause of these common glitches tomorrow. MC1 is currently shutdown and undamped.
BTW, at the MC3 test, I found that J2 of the satellite box (male Dsub) has all the pins too low (or too short?). I brought the box outside and found that the housing of this connector was half broken down. The connector was reassembled and the metal parts of the housing were bent again so that the housing can hold the connector body tightly.
The MC3 satellite box was restored and connected to the cables. As I touched this box, it is still under probation. |
14852
|
Thu Aug 22 12:54:06 2019 |
Koji | Update | CDS | MC1 glitch removed (for now) and IMC locking recovered | I have checked the MC1 satellite box and made a bunch of changes. For now, the glitches coming from the satellite box are gone. I quickly tested the MC1 damping and the IMC locking. The IMC was locked as usual. I still have some cleaning up to do and will work on it today and tomorrow.
Attachment 1: Result
The noise level of the satellite box was tested with the suspension simulator (i.e., five pairs of LEDs and PDs in a plastic box).
Each plot shows the ASD of the sensor outputs 1) before the modification, 2) after the change, and 3) with the satellite box disconnected (i.e., the noise from the PD whitening filter in the SUS rack).
Before the modification, these five signals showed significant (~0.9) correlation with each other, indicating that the noise source is common. After the modification, the spectra are lowered down to the noise level of the whitening filters, and there is no correlation observed anymore. EXCEPT FOR the LR sensor: It seems that the LR has an additional noise issue somewhere downstream. This is a separate issue.
Attachment 2: Photo of the satellite box before the modification
The thermal environment in the box is terrible. They are too hot to touch. You can see that the flat ribbon cable was burned. The amps, buffers, and regulators generate much heat.
Attachment 3: Where the board was modified
- (upper left corner) Every time I touched C51, the diode output went to zero. So C51 was replaced with WIMA 10uF (50V) cap.
- (lower left area) I found a clear indication of the glitch coming from the PD bias path (U3C). So I first replaced another 10uF (C50) with WIMA 10uF (50V). This did not change the glitch. So I replaced U3 (LT1125). This U3 had unused opamp which had railed to the supply voltage. Pins 14 and 15 of U3 were shorted to ground.
- (lower right corner) Similarly to U3, U6 also had two opamps which are railed due to no termination. U6 was replaced, and Pins 11, 12, 14, and 15 were shorted to ground.
- (middle right) During the course of the search, I suspected that the LR glitch comes from U5. So U5 was replaced with a new chip, but this had no effect.
Attachment 4: Thermal degradation of the internal ribbon cable
Because of the heat, the internal ribbon cable lost the flexibility. The cable is cracked and brittle. It now exposes some wires. This needs to be replaced. I'll work on this later this week.
Attachment 5: Thermal degradation of the board
Because of the excessive heat over those 20 years, the bond between the board and the pattern has degraded. In conjunction with the extremely thin wire pattern, desoldering of the components (particularly LT1125s) was very difficult. I'd want to throw away this board right now if it were possible...
Attachment 6: Shorting the unused opamps
This shows how the pieces of wires were soldered to ground vias to short the unused opamps.
Attachment 7: Comparison of the noise level with the sus simulator and the actual MC1 motion
After the satellite box fix, the sensor outputs were measured with the suspension connected. This shows that the suspension is moving much more than the noise level around 1Hz. However, at the microseismic frequency there is almost no margin. Considering the use of the adaptive feedforward, we need to lower the noise of the satellite box as well as the noise of the whitening filters.
=> Use better chips (no LT1125, no current buffers), use low noise resistors, better thermal environment.
|
14853
|
Thu Aug 22 20:56:51 2019 |
Koji | Update | CDS | MC1 glitch removed (for now) and IMC locking recovered | The internal ribbon cable for the MC1 satellite box was replaced with the one in the spare box. The MC1 box was closed and reinstalled as before. The IMC is locking well.
Now the burnt cable was disassembled and reassembled with a new cable. It is now in the spare box.
The case closed (literally) |
14855
|
Fri Aug 23 18:46:17 2019 |
Jon | Update | CDS | c1iscaux remaining work | I added the list of new c1iscaux channels to /opt/rtcds/caltech/c1/chans/daq/C0EDCU.ini and restarted the framebuilder. Koji had thought some of these channels might have previously existed under slightly different names. However, after looking through C0EDCU.ini and the other _SLOW.ini files, I did not find any candidates for removal. As far as I can tell, all of these channels are being recorded for the first time.
Quote:Koji |
- Modify C0EDCU.ini to trend the new slow channels we may want long-term monitoring of (e.g. LO power levels to the Demod boards). Anyone want to volunteer to take care of this?
|
|
14857
|
Sun Aug 25 14:18:08 2019 |
gautam | Update | CDS | c1iscaux remaining work | There were a bunch of useless / degenerate channels added - e.g. whitening gains which are already burt-snapshot. Maybe there are many more useless channels being trended, but no need to add more.
Copy-pasting wasn't done correctly - the first 4 added channels were duplicates. There are in fact 5 LO power mons, one for each of the frequencies 11, 33, 55, 110 and 165 MHz.
I cleaned up. Basically only the detect-mon channels, and the ALS channels, are new in the setup now. I will review if any extra channels are required later. While checking that the daqd is happy, I noticed c1lsc FEs are in their stuck state, see Attachment #1. I guess a cable was bumped when the strain relief operation was underway. I'm not attempting a remote resuscitation.
Quote: |
I added the list of new c1iscaux channels to /opt/rtcds/caltech/c1/chans/daq/C0EDCU.ini and restarted the framebuilder. Koji had thought some of these channels might have previously existed under slightly different names. However, after looking through C0EDCU.ini and the other _SLOW.ini files, I did not find any candidates for removal. As far as I can tell, all of these channels are being recorded for the first time.
|
|
14858
|
Thu Sep 5 18:42:19 2019 |
aaron | HowTo | CDS | WFS discussion, restarting CDS | [aaron, rana]
When we went to take some transfer functions of the MC WFS loop, we found that LSC was down. When we tried to restart the FE using 'rtcds restart --all', c1lsc crashed and froze. We manually reset c1lsc, then laboriously determined the correct order of machines to reboot. Here's what works best:
on c1lsc:
rtcds start c1x04 c1lsc c1ass c1oaf c1cal c1daf
Starting c1dnn crashes the other FEs
on c1ioo:
rtcds restart --all
on c1sus:
rtcds restart c1rfm c1sus c1mcs
restarting c1pem crashes the other FEs on c1sus
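The restart order listed above can be written down as data, with a dry-run helper that only prints the commands. This is a rough sketch, not the rebootC1LSC.sh script: the names RESTART_PLAN, SKIPPED and build_commands are hypothetical, and running rtcds through a non-interactive ssh call is an assumption about how the machines are set up.

```python
# Restart order as reported in this entry: (host, rtcds action, arguments).
RESTART_PLAN = [
    ("c1lsc", "start", ["c1x04", "c1lsc", "c1ass", "c1oaf", "c1cal", "c1daf"]),
    ("c1ioo", "restart", ["--all"]),
    ("c1sus", "restart", ["c1rfm", "c1sus", "c1mcs"]),
]

# Models left out on purpose because starting them crashed the other FEs.
SKIPPED = {"c1lsc": ["c1dnn"], "c1sus": ["c1pem"]}


def build_commands(plan=RESTART_PLAN, domain=""):
    """Build one ssh command string per host, in order, without running it."""
    commands = []
    for host, action, args in plan:
        # Optionally qualify the hostname, e.g. domain="martian".
        target = f"{host}.{domain}" if domain else host
        commands.append(f"ssh {target} rtcds {action} {' '.join(args)}")
    return commands


if __name__ == "__main__":
    for command in build_commands():
        print(command)  # dry run only: nothing is executed
```

Keeping the skipped models in a separate table records why they are missing from the plan, so a later reader does not add them back by accident.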
We're seeing a lot of red IPC indicators--perhaps it's an issue with the order we're restarting? |
14859
|
Thu Sep 5 20:30:43 2019 |
rana | HowTo | CDS | WFS discussion, restarting CDS | via Polish chat, GV tells us to RTFE |
14860
|
Fri Sep 6 09:40:56 2019 |
aaron | HowTo | CDS | WFS discussion, restarting CDS | As suggested, I ran the script cds/rebootC1LSC.sh
I got a timeout error when the script tried closing the PSL shutter ('C1:AUX-PSL_ShutterRqst' not found), but Rana and I closed the shutter before leaving last night. c1sus is down, so the script found no route to host c1sus; I'm thinking I need to reset c1sus for the script to run completely. Nonetheless, c1lsc was rebooted, which crashed c1ioo and left the c1lsc FE all red (probably because c1sus wasn't restarted).
|
14861
|
Fri Sep 6 11:56:44 2019 |
aaron | HowTo | CDS | WFS discussion, restarting CDS | Rebooting
I reset c1lsc, c1sus, and c1ioo.
I noticed that the script gives the command 'ssh c1XXX', but we have been getting no route to host using this command. Instead, the machines are currently only reachable as c1XXX.martian. I'm not sure why this is, so I just appended .martian in rebootC1LSC.sh
This time, the script does run. I did get 'no route to host' on c1ioo, so I think I need to reset that machine again. After reset, the script failed to log in to c1ioo and c1lsc.
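The .martian workaround described above could be sketched as a lookup that tries the short name first and then the qualified one. This is a minimal sketch under the assumption that the failure is a name-lookup problem (it may instead be a routing issue, which this would not fix); resolve_host and its lookup parameter are hypothetical names.

```python
import socket


def resolve_host(name, domain="martian", lookup=socket.gethostbyname):
    """Return the first of `name` or `name.domain` that resolves, else None.

    `lookup` is injectable so the function can be exercised without a network.
    """
    for candidate in (name, f"{name}.{domain}"):
        try:
            lookup(candidate)
            return candidate
        except OSError:  # socket.gaierror is a subclass of OSError
            continue
    return None


if __name__ == "__main__":
    for host in ("c1lsc", "c1sus", "c1ioo"):
        print(host, "->", resolve_host(host))
```

A script could call this once per machine and use the returned name in its ssh commands, rather than hard-coding either form.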
Fri Sep 6 13:09:05 2019
After lunch, I reset the computers again and tried the script again. There was again no route to host for c1ioo. I'm going inside to shut off the power to c1ioo, since the reset button seems not to be working. I still can't log in from nodus, so I'm bringing a keyboard and monitor over to plug in directly.
On reset, c1ioo repeatedly reaches the screen in attachment 1, before going black. Holding down shift or ctrl+alt+f1 doesn't get me a command prompt. After waiting/searching the elog for >>3 min, we decided to follow these instructions to cycle the power of c1ioo. The same problem recurred following power up. I found some instructions online saying that the SunSystems 4600 can hang during reboot if it has become too hot ("reboot during a thermal shutdown"); I did notice that the temperature light was on earlier in this procedure, so perhaps that is the problem. I followed the wiki instructions to shut down the computer again (pressed power button, unplugged 4 power supplies from back of machine), and left it unplugged for 10-30 min (Fri Sep 6 14:46:18 2019).
Fri Sep 6 15:03:31 2019
Rana plugged in the power supplies and reset the machine again.
Fri Sep 6 16:30:37 2019
c1ioo is still unreachable! I pressed reset once, and the reset button flashes white. The yellow warning light is still on.
Fri Sep 6 16:54:21 2019
The reset light has stopped flashing, but I still can't access c1ioo. I reset once more, this time watching c1ioo on a monitor directly. I'm still seeing the same boot screen repeatedly. I do see that CPU0 is not clocking, which seems weird.
Troubleshooting CPU module
Following gautam's elog here, I found the Sun Fire X4600 manual for locating faulty CPUs. After the white reset light stopped flashing, I held down the power button to turn off the system. Before shutdown, all of the CPUs displayed amber lights; after shutdown, only the leftmost CPU (as viewed from the back, presumably CPU0) displays an amber light. The manual says this is evidence that the CPU or DIMM is faulty. Following the manual, I removed the standby power, then followed these instructions for replacing the CPU to remove the CPU module; Gautam has also done this before.
Fri Sep 6 20:09:01 2019
I pulled the leftmost CPU module out, following the instructions above. The CPU module matches the physical layout and part number of the Sun Fire X4600 M2 8-DIMM CPU module; pressing the fault reminder light gives amber indicators at the DIMM ejectors, indicating faulty DIMMs (see). The other indicator LEDs did not illuminate.
I located several spare DIMMs in the digital cabinet along the Y arm (and a couple with misc computer components in the control room), but didn't find the correct one for this CPU module. The DIMM is Sun PN 371-1764-01; I found it online and ordered eight. Please let me know if this is incorrect.
To protect the CPU module, I've put it in an ESD safe bag with some bubble wrap and a note. It's on the E shop bench.
Conclusion: Need new DIMM, didn't find the correct part but ordered it. |
|