40m Log
ID Date Author Type Category Subject
16331   Tue Sep 14 19:12:03 2021   Koji   Summary   PEM   Excess seismic noise in 0.1 - 0.3 Hz band

Looks like this increase is correlated for BS/EX/EY. So it is likely to be real.

Comparison between 9/15 (UTC) (Attachment 1) and 9/10 (UTC) (Attachment 2)

Attachment 1: C1-ALL_393F21_SPECTRUM-1315699218-86400.png
Attachment 2: C1-ALL_393F21_SPECTRUM-1315267218-86400.png
5623   Wed Oct 5 18:31:02 2011   Katrin   Update   Green Locking   Exchanged mirror on YARM table

On the Green YARM end table the second mirror behind the laser has been exchanged.

Reason: The light impinges on the mirror at an angle of about 10 degrees, but the old mirror was coated for an angle of incidence (AOI) of 45°.

Thus it acted more like a beam splitter. The new mirror is a 1" Gooch & Housego mirror, which performs better at an AOI of 10 degrees.

Realignment through the Faraday isolator and SHG crystal as well as the 532 nm isolator is almost finished.

8900   Tue Jul 23 04:07:48 2013   gautam   Update   CDS   Excitation points set up on c1scx

In light of recent events and the decision to test the piezo tip-tilts for green beam steering on the X-end table, I have set up 8 excitation points to channels 8 through 15 of the DAC on c1scx (as was done earlier for the DAC at 1Y4 with Jenne's help) in order to verify the pin-outs of the DAC interface board. I have not yet compiled the model or restarted the computer, and will do these tomorrow, after which I will do the test. The channels are named YYY_CHAN9 etc.

8909   Tue Jul 23 16:47:01 2013   gautam   Update   CDS   Excitation points set up on c1scx

I just compiled and installed the model with the excitation points on c1scx and then restarted framebuilder. The channels I set up are now showing up in the awggui dropdown menu. I will do the tests on the DAC channels shortly.

Just to keep things on record, these are the steps I followed:

• opened the model c1scx (path: /opt/rtcds/userapps/release/sus/c1/models) with MATLAB
• Added 8 excitation points and saved the model. A copy has been saved as c1scx.mdl.r2010b because of the recent upgrade to r2013a.
• ssh to c1iscex (computer running the model c1scx).
• Entered the following sequence of commands in terminal: rtcds make c1scx ,  rtcds install c1scx , rtcds start c1scx
• ssh to framebuilder, and restarted the framebuilder by entering telnet fb 8088   and then   shutdown.
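
The rebuild sequence in the steps above can be captured in a small helper. This is only a sketch: the function name `rtcds_rebuild_commands` is hypothetical, and it merely assembles the three `rtcds` command lines quoted in the entry rather than running them.

```python
def rtcds_rebuild_commands(model):
    """Return the rtcds command lines, in order, for rebuilding one model.

    The three subcommands (make, install, start) are the ones listed in the
    entry; the helper name itself is hypothetical.
    """
    if not model or any(ch.isspace() for ch in model):
        raise ValueError("model name must be a single non-empty word")
    return [f"rtcds {step} {model}" for step in ("make", "install", "start")]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run by hand
    # on the front-end machine (c1iscex for the c1scx model).
    for line in rtcds_rebuild_commands("c1scx"):
        print(line)
```

Printing instead of executing keeps the sketch safe to run on any machine; the framebuilder restart is left as a manual step.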
1708   Sat Jun 27 03:16:16 2009   Clara   Update   PEM   Exciting microphone things!

So, I'm double-posting, but I figured the last post was long enough as it was, and this is about something different. After double and triple checking the XLR cables, I hooked up the microphone setup (mic---preamp---output) to the oscilloscope to figure out what kind of voltage would register with loud noises. So, I clapped and shouted and forgot to warn the other people in the lab what I was doing (sorry guys) and discovered that, even on the lowest gain setting, my loud noises were generating 2-3 times as much voltage as the ADC can handle (2V). And, since our XLR cables are so freaking long, we probably want to go for a higher gain, which puts us at something like 20 times too much voltage. I doubt this is really necessary, but it's late (early) and I got camera-happy, so I'm going to share anyway:

So, to deal with this issue, I made some nifty voltage dividers. Hopefully they are small enough to fit side-by-side in the ports without needing extra cableage. Anyway, they should prevent the voltage from getting larger than 2V at the output even if the mic setup is producing 50V. Seeing as my screaming as loud as I could about 2mm away from the mic at full gain could only produce 45V, I think this should be pretty safe. I put the ADC in parallel with a 25.5 kOhm resistor, which should have a noise like 10^-8 V/rHz. This is a lot smaller than 1 uV/rHz (the noise in the ADC, if I understood Rana's explanation correctly), so the voltage dividers should not pose a noise issue. Now for pictures.

I opened one so you can see its innards.

In case the diagram on the box was too small to decipher...
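
The divider sizing and the resistor noise quoted above can be checked with a short calculation. This is a rough sketch under stated assumptions: the ADC input is taken to draw negligible current, the resistor is assumed to sit at about 300 K, and the series resistor value it prints is derived here for illustration, not taken from the entry.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def series_resistor(r_shunt, v_in_max, v_out_max):
    """Series resistance so that a divider maps v_in_max onto v_out_max.

    Assumes the load across r_shunt (the ADC input) draws negligible current.
    """
    return r_shunt * (v_in_max / v_out_max - 1.0)


def johnson_noise(resistance, temperature=300.0):
    """Thermal (Johnson) voltage noise density of a resistor, in V/rtHz."""
    return math.sqrt(4.0 * BOLTZMANN * temperature * resistance)


if __name__ == "__main__":
    r_shunt = 25.5e3  # ohms, the resistor in parallel with the ADC
    r_series = series_resistor(r_shunt, v_in_max=50.0, v_out_max=2.0)
    print(f"series resistor for 50 V -> 2 V: {r_series / 1e3:.0f} kOhm")
    print(f"Johnson noise of 25.5 kOhm: {johnson_noise(r_shunt):.1e} V/rtHz")
```

The thermal noise of the 25.5 kOhm resistor comes out at a few times 10^-8 V/rHz, well below the 1 uV/rHz ADC figure mentioned in the entry.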

And finally, I came up with a name scheme for the mics and pre-amps. We now have two Bluebird (bacteriophage) mics named Bonnie and Butch Cassidy. Their preamps are, naturally, Clyde and The Sundance Kid. Sadly, no photos. I know it's disappointing. Also, before anyone gives me crap for putting the labels on the mics upside-down, they are meant to be hung or mounted from high things, and the location (and stiffness) of the cable prevents us from simply standing them up. So they will more than likely be in some kind of upsidedownish position.

15763   Thu Jan 14 11:46:20 2021   gautam   Update   CDS   Expansion chassis from LHO

I picked the boxes up this morning. The inventory per Fil's email looks accurate. Some comments:

1. They shipped the chassis and mounting parts (we should still get rails to mount these on, they're pretty heavy to just be supported on 4 rack nuts on the front). idk if we still need the two empty chassis that were requested from Rich.
2. Regarding the fibers - one of the fibers is pre-2012. These are known to fail (according to Rolf). One of the two that LHO shipped is from 2012 (judging by S/N, I can't find an online lookup for the serial number), the other is 2011. IIRC, Rolf offered us some fibers so we may want to take him up on that. We may also be able to use copper cables if the distances b/w server and expansion chassis are short.
3. The units are fitted with a +24V DC input power connector and not the AC power supplies that we have on all the rest of the chassis. This is probably just gonna be a matter of convenience, whether we want to stick to this scheme or revert to the AC input adaptor we have on all the other units. idk what the current draw will be from the Sorensen - I tested that the boards get power, and with no ADCs/DACs/BIOs, the chassis draws ~1A (read off from DCPS display, not measured with a DMM). ~Half of this is for the cooling fans. It seems like KT @ LLO has offered to ship AC power supplies so maybe we want to take them up on that offer.
4. Without the host side OSSI PCIe card, timing interface board, and supermicro servers that actually have enough PCIe slots, we still can't actually run any meaningful test. I ran just a basic diagnostic that the chassis can be powered on, and the indicator LEDs and cooling fans run.
5. Some photos of the contents are here. The units are stored along the east arm pending installation.

>     Koji,
>
>     Barebones on this order.
>
>        1. Main PCIe board
>        2. Backplane (Interface board)
>        3. Power Board
>        4. Fiber (One Stop) Interface Card, chassis side only
>        5. Two One Stop Fibers
>        6. No Timing Interface
>        7. No Binary Cards.
>        8. No ADC or DAC cards
>
>     Fil Clara
>
15764   Thu Jan 14 12:19:43 2021   Jon   Update   CDS   Expansion chassis from LHO

That's fine, we didn't actually request those. We bought and already have in hand new PCIe x4 cables for the chassis-host connection. They're 3 m copper cables, which was based on the assumption at the time that host and chassis would be installed in the same rack.

 Quote: Regarding the fibers - one of the fibers is pre-2012. These are known to fail (according to Rolf). One of the two that LHO shipped is from 2012 (judging by S/N, I can't find an online lookup for the serial number), the other is 2011. IIRC, Rolf offered us some fibers so we may want to take him up on that. We may also be able to use copper cables if the distances b/w server and expansion chassis are short.
15766   Fri Jan 15 15:06:42 2021   Jon   Update   CDS   Expansion chassis from LHO

Koji asked me to assemble a detailed breakdown of the parts received from LHO, which I do based on the high-res photos that Gautam posted of the shipment.

Parts in hand:

Qty Part Note(s)
2 Chassis body
2 Power board and cooling fans As noted in 15763, these have the standard LIGO +24V input connector which we may want to change
2 IO interface backplane
2 PCIe backplane
2 Chassis-side OSS PCIe x4 card
2 CX4 fiber cables These were not requested and are not needed

Parts still needed:

Qty Part Note(s)
2 Host-side OSS PCIe x4 card These were requested but missing from the LHO shipment
2 Timing slave These were not originally requested, but we have recently learned they will be replaced at LHO soon

Issue with PCIe slots in new FEs

Also, I looked into the mix-up regarding the number of PCIe slots in the new Supermicro servers. The motherboard actually has six PCIe slots and is on the CDS list of boards known to be compatible. The mistake (mine) was in selecting a low-profile (1U) chassis that only exposes one of these slots. But at least it's not a fundamental limitation.

One option is to install an external PCIe expansion chassis that would be rack-mounted right above the FE. It is automatically configured by the system BIOS, so doesn't require any special drivers. It also supports hot-swapping of PCIe cards. There are also cheap ribbon-cable riser cards that would allow more cards to be connected for testing, although this is not as great for permanent mounting.

It may still be better to use the machines offered by Keith Thorne from LLO, as they're more powerful anyway. But if there is going to be an extended delay before those can be received, we should be able to use the machines we already have in conjunction with one of these PCIe expansion options.

15767   Fri Jan 15 16:54:57 2021   gautam   Update   CDS   Expansion chassis from LHO

Can you please provide a link to this "list of boards"? The only document I can find is T1800302. In that, under "Basic Requirements" (before considering specific motherboards), it is specified that the processor should be clocked @ >3GHz. The 3 new supermicros we have are clocked at 1.7 GHz. X10SRi-F boards are used according to that doc, but the processor is clocked at 3.6 or 3.2 GHz.

Please also confirm that there are no conflicts w.r.t. the generation of PCIe slots, and the interfaces (Dolphin, OSSI) we are planning to use - the new machines we have are "PCIe 2.0" (though i have no idea if this is the same as Gen 2).

 Quote: The motherboard actually has six PCIe slots and is on the CDS list of boards known to be compatible.

As for the CX4 cable - I still think it's good to have these on hand. Not good to be in a situation later where FE and expansion chassis have to be in different racks, and the copper cable can't be used.

Attachment 1: Screenshot_2021-01-15_17-00-06.png
15770   Tue Jan 19 13:19:24 2021   Jon   Update   CDS   Expansion chassis from LHO

Indeed T1800302 is the document I was alluding to, but I completely missed the statement about >3 GHz speed. There is an option for 3.4 GHz processors on the X10SRi-F board, but back in 2019 I chose against it because it would double the cost of the systems. At the time I thought I had saved us 5k. Hopefully we can get the LLO machines in the near term---but if not, I wonder if it's worth testing one of these to see whether the performance is tolerable.

 Quote: Can you please provide a link to this "list of boards"? The only document I can find is T1800302....

I confirm that PCIe 2.0 motherboards are backwards compatible with PCIe 1.x cards, so there's no hardware issue. My main concern is whether the obsolete Dolphin drivers (requiring linux kernel <=3.x) will work on a new system, albeit one running Debian 8. The OSS PCIe card is automatically configured by the BIOS, so no external drivers are required for that one.

 Quote: Please also confirm that there are no conflicts w.r.t. the generation of PCIe slots, and the interfaces (Dolphin, OSSI) we are planning to use - the new machines we have are "PCIe 2.0" (though i have no idea if this is the same as Gen 2).

7905   Wed Jan 16 18:08:06 2013   Jenne   Update   Locking   Expected PRC gains

I was calculating the power recycling gains we expect for different versions of the PRC, and I am a little concerned that we aren't going to have much gain with the new LaserOptik mirrors.

I'm using

G = t_PRM^2 / (1 - r_PRM * r_PR2 * r_PR3 * r_end)^2

from eqn 11.20 in Siegman.

r_end is either the ITM (for a symmetric Michelson) or the flat mirror that we'll put in (for the PR-flat test case). r = sqrt( R ) = sqrt( 1 - T ) for mirrors whose power transmission is the quoted value.

Some values:

t_PRM^2 = T_PRM = 0.055 ---------> r_PRM = sqrt( 1 - 0.055 )
T_G&H = 20e-6 ----> r_G&H = sqrt( 1 - 20e-6 )
T_LaserOptic = 0.015 (see elog 7624 where Raji measured this...1.5% was the best that she measured for P polarization. Elog 7644 has more data, with 3.1% for 40deg AoI) -------> r_LasOpt = sqrt( 1 - 0.015 ) or sqrt( 1 - 0.031 )
T_ITM = 0.014 -----------> r_ITM = sqrt( 1 - 0.014 )

Some calculations with 1.5% LaserOptik transmission:

G_PRC_2G&H = 45
G_PRC_G&H_LasOpt = 31
G_PRM_flatG&H = 51

With the 3% LaserOptik transmission:

G_PRC_G&H_LasOpt = 22
G_PRM_flatG&H = 30

More ideal case of just PRM, flat mirror (either ITM or G&H), ignoring the folding mirrors:

G_PRM_ITM = 45
G_PRM_flatG&H = 70

Punchline: If the LaserOptik mirror has 1.5% transmission at ~45 degrees, the regular PRC expected gain goes down to 31, from 45 with both folding mirrors as G&Hs.

7909   Wed Jan 16 20:27:16 2013   rana   Update   Locking   Expected PRC gains

Why would we use such a bad optic in our recycling cavity? Is 1.5% the spec for these mirrors? Is this the requirement that Kiwamu calculated somehow? Did anyone confirm this measurement?

I can't believe that we'll have low noise performance in a RC where we dump so much power.

7910   Thu Jan 17 00:17:31 2013   Jenne   Update   Locking   Expected PRC gains

 Quote: Why would we use such a bad optic in our recycling cavity? Is 1.5% the spec for these mirrors? Is this the requirement that Kiwamu calculated somehow? Did anyone confirm this measurement? I can't believe that we'll have low noise performance in a RC where we dump so much power.

Yeah, Koji mentioned in response to Raji's measurements several months ago that the LaserOptik mirrors were pretty far out of spec. We should probably redo the measurement to confirm.

3224   Wed Jul 14 19:36:17 2010   Gopal   Update   Optic Stacks   Experimental Confirmation of COMSOL Analysis

I experimentally determined the spring constant of a single Viton spring in order to obtain a rough estimate for the natural frequency of single-stack oscillation. The procedure basically involved stacking metal bars of known mass onto the Viton and using a caliper to measure deviations from the equilibrium length.
The plot below shows that, for small compressions, the response is linear to an R-squared of 0.98. The experimental spring constant came out to be about 270 lb/ft, or 3900 N/m.

Previous documents have listed that the top stack takes on a load of approximately 43 kg per individual spring. A bit of calculation yields an experimental resonant frequency of 9.5 Hz. Compared with the theoretical COMSOL first harmonic of about 7.5 Hz, there is a reasonable amount of error. Of course, I used this calculation as a simple ballpark estimate: errors from misplacement onto the Viton were minimized through use of a level, but were still inevitable on the mm scale.

Since the two methods yield answers with the same order of magnitude, we are ready to move forward and build the remaining layers of the stack.

4208   Wed Jan 26 12:04:31 2011   josephb   Update   CDS   Explanation of why c1sus and c1lsc models crash when the other one goes down

So apparently with the current Dolphin drivers, when one of the nodes goes down (say c1lsc), it causes all the other nodes to freeze for up to 20 seconds. This 20 seconds can force a model to go over the 60 microsecond limit and is long enough to force the FE to time out.

Alex and Rolf have been working with the vendors to get this problem fixed, as having all your front ends go down because you rebooted a single computer is bad.
[40184.120912] c1rfm: sync error my=0x3a6b2d5d00000000 remote=0x0
[40184.120914] c1rfm: sync error my=0x3a6b2d5d00000000 remote=0x0
[44472.627831] c1pem: ADC TIMEOUT 0 7718 38 7782
[44472.627835] c1mcs: ADC TIMEOUT 0 7718 38 7782
[44472.627849] c1sus: ADC TIMEOUT 0 7718 38 7782
[44472.644677] c1rfm: cycle 1945 time 17872; adcWait 15; write1 0; write2 0; longest write2 0
[44472.644682] c1x02: cycle 7782 time 17849; adcWait 12; write1 0; write2 0; longest write2 0
[44472.646898] c1rfm: ADC TIMEOUT 0 8133 5 7941

The solution for the moment is to start the computers at exactly the same time, so the dolphin is up before the front ends, or start the models by hand after the computer is up and dolphin running, but after they have timed out. This is done by:

sudo rmmod c1SYSfe
sudo insmod /opt/rtcds/caltech/c1/target/c1SYS/bin/c1SYSfe.ko

Alex and Rolf have been working with the vendors to get this fixed, and we may simply need to update our Dolphin drivers. I'm trying to get in contact with them and see if this is the case.

15503   Tue Jul 28 13:55:11 2020   Hang   Update   BHD   Exploring bilinear SRCL->DARM coupling

We explore bilinear SRCL to DARM noise coupling mechanisms, and show two cases where doing BHD readout can improve the noise performance. In the first case, the bilinear piece is due to residual DHARD motion (see also LHO:45823), and it matters mostly for the low-frequency (<100 Hz) part; in the second case, the bilinear piece is due to residual SRCL fluctuation and it matters mostly for the few x 100 Hz part.

Details are below:

=================================================

General Model:

We can write the SRCL to DARM transfer function as (Evan Hall's thesis, eq. 2.29)

Z_s2d(f) = C_lf(f) * F^2 * x_D + C_hf(f) * F * dphi_S * x_D ---- (1)

where C_lf ~ 1/f^2 and C_hf ~ f are constants at each frequency unless there are major upgrades to the IFO, F is the finesse of the arm cavity, which depends on the alignment, spot position on the TMs, etc., dphi_S is the SRCL detuning (wrt the nominal 90 deg value), and x_D is the DC DARM offset.

The linear part of this can be removed with feedforward subtractions and it is the bilinear piece that matters, which reads

dZ_s2d = C_lf * <F>^2 * dx_D + C_hf * <F> * <dphi_S> * dx_D
       + 2C_lf * <F> * <x_D> * dF + C_hf * <dphi_S> * <x_D> * dF
       + C_hf * <F> * <x_D> * d(dphi_S). ---- (2)

The first line of (2) is due to residual DARM motion dx_D. It does not depend on the DC value of the DARM offset <x_D> and thus does not depend on doing BHD or DC readout. On the other hand, the typical residual DARM motion is 1 fm << 1 pm of DARM offset. Since the current feedforward reduction factor is about 10 (see both Den Martynov's thesis and Evan Hall's thesis), clearly we are not limited by the residual DARM motion.

The second line is due to the change in the arm finesse, which can be affected by, e.g., the alignment fluctuation (both increasing the loss due to scattering into 01/10 modes and affecting the spot position and hence changing the losses), and is likely to be the reason why we see the effect being modulated by DHARD.

The last line of (2) is due to the residual SRCL fluctuation and is important for the ~ few x 100 Hz band.

=================================================

DHARD effects.

As argued above, the DHARD affects the SRCL -> DARM coupling as it changes the finesse in the arm cavity (through scattering into 01/10 modes; in Finesse we cannot directly simulate the effects due to the spot hitting a rougher location). Since in the second line of eq. (2) the LF part depends on the DARM DC offset <x_D>, this effect can be improved by going from DC readout to BHD.
To simulate it in Finesse, at a fixed DARM DC offset, we compute the SRCL->DARM transfer functions at different DHARD offsets, and then numerically compute the derivative \partial Z_s2d / \partial \theta_{DH}. Multiplying this derivative by the rms value of the DHARD fluctuation \theta_{DH} then gives the expected bilinear coupling piece.

The result is shown in the first attached plot. Here we have assumed a flat SRCL noise of 5e-16 m/rtHz for simplicity (see PRD 93, 112004, 2016). For now we do not account for the loop effects, which further reduce the high frequency components. The residual DHARD RMS is assumed to be 1 nrad.

In the first plot, from top to bottom we show the SRCL noise projection at different DARM DC offsets of (0.1, 1, 10) pm. Since the DHARD alignment only affects the arm finesse starting at quadratic order, it matters what DC offset in DHARD we assume. In each panel, the blue trace is for no DC offset in DHARD and the orange one for a 5 nrad DC offset. As a reference, the A+ sensitivity is shown as the grey trace in each plot.

We can see that if there is a large DC offset in DHARD (a few nrad) and we still do DC readout with a few pm of DARM offset, then the bilinear piece of SRCL can still contaminate the sensitivity in the 10-100 Hz band (bottom panel; orange trace). On the other hand, if we do BHD, then the SRCL noise should be down by ~ x100 even compared with the top panel.

(A 5 nrad DC offset in DHARD coupled with 1 nrad RMS would cause about 0.5% RIN in the arms. This is somewhat greater than the typically measured RIN, which is more like <~ 0.2%. See the second plot.)

=================================================

SRCL effect.

Similarly we can consider the SRCL->DARM coupling due to residual SRCL rms. The approach is very similar to what we did above for DHARD. I.e., we compute Z_s2d at fixed DARM offset and for different SRCL offsets, then we numerically evaluate \partial Z_s2d / \partial dphi_S.
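
The derivative-times-rms recipe can be illustrated with a toy version of eq. (1). This is only a sketch: it differentiates the analytic expression instead of transfer functions from an optical simulation, the function names are hypothetical, and every number is a placeholder in arbitrary units.

```python
def z_s2d(c_lf, c_hf, finesse, dphi_s, x_d):
    """Toy SRCL->DARM coupling at one frequency, following eq. (1)."""
    return c_lf * finesse**2 * x_d + c_hf * finesse * dphi_s * x_d


def central_difference(func, x0, step):
    """Numerical derivative of func at x0 by a symmetric finite difference."""
    return (func(x0 + step) - func(x0 - step)) / (2.0 * step)


def bilinear_piece(c_lf, c_hf, finesse, x_d, dphi_s0, dphi_s_rms, step=1e-3):
    """|dZ_s2d / d(dphi_S)| at the operating point, scaled by the SRCL rms."""
    slope = central_difference(
        lambda dphi: z_s2d(c_lf, c_hf, finesse, dphi, x_d), dphi_s0, step
    )
    return abs(slope) * dphi_s_rms


if __name__ == "__main__":
    # Placeholder coefficients; only the scaling with x_d is meaningful here.
    for x_d in (10.0, 1.0, 0.1, 0.0):
        piece = bilinear_piece(2.0, 3.0, 5.0, x_d, dphi_s0=0.0, dphi_s_rms=0.01)
        print(f"x_D = {x_d:5.1f}  ->  bilinear piece = {piece:.4g}")
```

In this toy model the result scales linearly with the DARM DC offset x_D and vanishes when the offset goes to zero.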
A residual SRCL rms of 0.1 nm is then used to generate the projection shown in the third figure. Unlike the DHARD effect, the bilinear SRCL piece does not depend on the DC SRCL detuning (for the 50-500 Hz part). It does still depend on the DARM DC offset and therefore could be improved by BHD. Since we do not include the LP of the SRCL loop in this plot, the HF noise at 1 kHz is artificial as it can be easily filtered out. However, the LP will not be very strong around 100-300 Hz for a SRCL UGF ~ 30 Hz, and thus doing BHD could still have some small improvements for this effect.

Attachment 1: SRCL_bilin_DHARD.pdf
Attachment 2: ARM_RIN.pdf
Attachment 3: SRCL_bilin_SRCL.pdf
16100   Thu Apr 29 17:43:48 2021   Anchal   Update   CDS   F2A Filters double check

I double checked today and the F2A filters in the output matrices of MC1, MC2 and MC3 in the POS column are ON. I do not get what SDF means? Did we need to add these filters elsewhere?

 Quote: The IMC suspension team should double check their filters are on again. I am not familiar with the settings and I don't think they've been added to the SDF.

Attachment 1: F2AFiltersON.png
16105   Fri Apr 30 00:20:30 2021   gautam   Update   CDS   F2A Filters double check

The SDF system is supposed to help with restoring the correct settings, complementary to burt. My personal opinion is that there is no need to commit these filters to SDF until we're convinced that they help with the locking / noise performance.

 Quote: I double checked today and the F2A filters in the output matrices of MC1, MC2 and MC3 in the POS column are ON. I do not get what SDF means? Did we need to add these filters elsewhere

6004   Thu Nov 24 20:22:42 2011   Mirko   Update   IOO   F2A filter for MC

I calculated the F2A filters for the input mode cleaner optics as described in T010140-01-D eq (4). On Rana's recommendation I added an s/ ( w_0 * Q ) term to the numerator.

The values used are:

w_0 = 2pi / s
h = 0.0009
D = 2.46957E-2
Q = 10

I put these filters into C1:SUS-MC1_TO_COIL_1_1 to _4_1.
For convenience split in Z and P.

Well it doesn't work. After a few seconds the optic begins to swing wildly.

6006   Fri Nov 25 17:52:28 2011   rana   Update   IOO   F2A filter for MC

Woo. Pretty crazy. The numerators should only be ~10% larger than the denominator below 1 Hz. Let's try again.

6012   Fri Nov 25 23:25:24 2011   Mirko   Update   IOO   F2A filter for MC

 Quote: Woo. Pretty crazy. The numerators should only be ~10% larger than the denominator below 1 Hz. Let's try again.

[Rana, Mirko]

I redid this calculation. The idea behind it is to get rid of any pitch that is introduced by applying longitudinal feedback to the mirrors. This coupling happens because the center of percussion for pitch, which is identical with the point where the wires lift off of the mirror, is above the center of mass.

With the same values as before, just less faulty math and Q = 2 instead of 10, we end up with the following filters:

For the lower coils (red), compared to corresponding preexisting BS filters (black):

The upper coils' TF is just mirrored at the 0dB magnitude axis, and has a corresponding frequency response.

I switched the F2a filters on for all MC mirrors. For convenience they are split into F2aZeros and F2aPoles. Everything seems fine.

The F2a filters seem to be off for ( all ?) other mirrors.

6021   Mon Nov 28 10:54:40 2011   rana   Update   SUS   F2A filter for MC

Our approach to making the F2A or F2P filters for the MC is to use the measured resonant frequencies and then calculate the appropriate mechanical dimensions of each suspension. This is basically because we don't have optical levers with normal incidence on these optics, but the method should be fine.

To find the formulas, I asked Gaby for her old cheat sheet:

It's now in the DCC. It's only for Large optics, but you should be able to reconstruct the right ones for SOS by just changing the parameters.
989   Thu Sep 25 02:35:21 2008   rana   Summary   PSL   FAST is moving alot

It looks like the FAST signal has started moving a lot - this is partly what inspired us to tune the SLOW loop. Some of the spiking events happen when people go on the table or the MC loses lock. But at other times it just spikes for no apparent reason.

You can also see from the first plot (9 day 10-minute trend) that there is no great change in DTEC so we shouldn't be worried about clogging in the NPRO head.

The second plot is a 1 day minute-trend.

Attachment 1: Untitled.png
Attachment 2: Untitled.png
1023   Fri Oct 3 15:09:58 2008   rob   Update   PSL   FAST/SLOW

Last night during locking, for no apparent reason (no common mode), the PSL FAST/SLOW loop started going just a little nutz.

Attached is a two day plot. The noisy period started around 11-ish last night.

Attachment 1: FASTSLOW.png
6639   Thu May 10 22:05:21 2012   Den   Update   CDS   FB

Already for the second time today, all computers lost connection to the framebuilder. When I ssh'd to the framebuilder, the DAQD process was not running. I started it

controls@fb ~ 130$ sudo /sbin/init q

But I do not know what causes this problem. Maybe this is a memory issue. For FB

Mem:   7678472k total,  7598368k used,    80104k free

Practically all memory is used. If more is needed and swap is off, DAQD process may die.

6640   Fri May 11 08:07:30 2012   Jamie   Update   CDS   FB

 Quote: Already for the second time today, all computers lost connection to the framebuilder. When I ssh'd to the framebuilder, the DAQD process was not running. I started it controls@fb ~ 130$ sudo /sbin/init q

Just to be clear, "init q" does not start the framebuilder. It just tells the init process to reparse /etc/inittab. And since init is supposed to be configured to restart daqd when it dies, it restarted it after reloading /etc/inittab. You and Alex must have forgotten to do that after you modified the inittab when you were trying to fix daqd last week.

daqd is known to crash without reason. It usually just goes unnoticed because init always restarts it automatically. But we've known about this problem for a while.

 Quote: But I do not know what causes this problem. Maybe this is a memory issue. For FB Mem: 7678472k total, 7598368k used, 80104k free Practically all memory is used. If more is needed and swap is off, DAQD process may die.

This doesn't really mean anything, since the computer always ends up using all available memory. It doesn't indicate a lack of memory. If the machine is really running out of memory you would see lots of ugly messages in dmesg.

13152   Mon Jul 31 15:13:24 2017   gautam   Update   CDS   FB ---> FB1

[jamie, gautam]

In order to test the new daqd config that Jamie has been working on, we felt it would be most convenient for the host name "fb" (martian network IP 192.168.113.202) to point to the physical machine "fb1" (martian network IP 192.168.113.201).

I made this change in /var/lib/bind/martian.hosts on chiara, and then ran sudo service bind9 restart. It seems to have done the job. So as things stand, both hostnames "fb" and "fb1" point to 192.168.113.201.

Now, when starting up DTT or dataviewer, the NDS server is automatically found.

More details to follow.
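
A quick way to confirm that both names now point at the same machine is to compare what they resolve to. The sketch below is illustrative only: the helper names are hypothetical, and the resolver is injectable so the logic can be exercised with a stand-in lookup table away from the martian network.

```python
import socket


def resolve_all(hostnames, resolver=socket.gethostbyname):
    """Map each hostname to the IPv4 address the resolver returns for it."""
    return {name: resolver(name) for name in hostnames}


def point_to_same_host(hostnames, resolver=socket.gethostbyname):
    """True if every hostname resolves to one and the same address."""
    return len(set(resolve_all(hostnames, resolver).values())) == 1


if __name__ == "__main__":
    # Stand-in table mirroring the state described in the entry; on the
    # martian network, drop the resolver argument to query real DNS.
    table = {"fb": "192.168.113.201", "fb1": "192.168.113.201"}
    print(resolve_all(["fb", "fb1"], resolver=table.__getitem__))
    print(point_to_same_host(["fb", "fb1"], resolver=table.__getitem__))
```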
11076   Thu Feb 26 13:17:31 2015   ericq   Update   Computer Scripts / Programs   FB IO load

Over the past few days, I've occasionally been peeking at the framebuilder IO load to see if I could correlate anything with it, but it's usually been low when I looked. I.e. with daqd and all models running, the %wa time was in the few percents at most.

Just now, I was seeing some EPICS sluggishness, and sure enough, the %wa was in the 50-60 range. I used iostat -xmh 5 on the framebuilder to see that /dev/sda, the /frames drive, was at 100% utilization, which means it was reading and writing as fast as it possibly could.

I ssh'd over to nodus, and with iotop found that an rsync job was running (rsync -am --exclude .*.gwf full 131.215.114.19::40m/full), and its IO rates corresponded very closely to the data read rates on the framebuilder from /frames.

I killed the rsync process on nodus, and the %wa time on the framebuilder dropped to near zero. The ASS striptools, where I had noticed the sluggishness, immediately started updating faster.

While rsync is supposed to play nice with a system's IO demands, maybe it only knows about nodus's IO usage, not fb which is the underlying NFS server where the frames live.

I think it would be good to throttle these jobs to a specific bandwidth. 50MB/s seemed like too much, so maybe 10MB/s is ok?

11077   Thu Feb 26 13:55:59 2015   jamie   Update   Computer Scripts / Programs   FB IO load

We should use "ionice" to throttle the rsync. Use something like "ionice -c 3 rsync ..." to set the priority such that the rsync process will only work when there is no other IO contention. See "man ionice" for other options.

8374   Fri Mar 29 17:24:43 2013   Jamie   Update   Computers   FB RAID power supply replaced

Steve ordered a replacement power supply for the FB JetStor power supply that failed a couple weeks ago. I just installed it and it looks fine.
12024   Sun Mar 6 15:24:05 2016   gautam   Update   CDS   FB down again

I came in to check the status of the nitrogen and noticed that the striptool panels in the control room were all blank.

• PMC was unlocked but I was able to relock it using the usual procedure
• FB seems to be down: I was unable to ssh into it (or any of the FEs for that matter). I checked the lights on the RAID array, they are all green. I am holding off on doing a hard reboot of FB in case there is some other debugging that can be done first
• None of the watchdogs were tripped, but judging by the green spots on the mirrors, all of them are moving quite a bit. I've shutdown the watchdogs on all the optics except the MC mirrors, but the ITMs and ETMs still seem to be moving quite a bit.

I am leaving things in this state for now. It is unclear why this should have happened, it doesn't seem like there was a power glitch?

Attachment 1: 58.png
12025   Mon Mar 7 20:40:02 2016   ericq   Update   CDS   FB down again

We went and looked at the monitor plugged into FB. All kinds of messages were being spammed to the screen (maybe RAM errors), and nothing could be done to interrupt. Sadly, a hard reboot of FB was necessary.

Video of error messages: https://youtu.be/7rea_kokhPY

After the reboot, it just took a couple of model restarts to get the CDS screen happy.

16294   Tue Aug 24 18:44:03 2021   Koji   Update   CDS   FB is writing the frames with a year old date

Dan Kozak pointed out that the new frame files of the 40m have not been written with the 2021 GPS time but with the 2020 GPS time.

Current GPS time is 1313890914 (or something like that), but the new files are written as C-R-1282268576-16.gwf

I don't know how this can happen but this may explain why we can't have the agreement between the FB gps time and the RTS gps time.

(dataviewer seems dependent on the FB GPS time and it indicates 2020 date. DTT/diaggui does not.)

This is the way to check the gpstime on fb1. It's apparently a year off.

controls@fb1:~ 0$ cat /proc/gps
1282269402.89
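
The size of the offset can be checked by converting both GPS times to UTC. The following is a minimal Python sketch, not part of the original entry. It assumes the GPS epoch of 1980-01-06 00:00:00 UTC and a fixed GPS-UTC offset of 18 leap seconds (the value in effect since 2017); the function name gps_to_utc is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# GPS time counts seconds since 1980-01-06 00:00:00 UTC.
GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
# Assumption: GPS is ahead of UTC by 18 s (valid for dates after 2017-01-01).
LEAP_SECONDS = 18


def gps_to_utc(gps_seconds: float) -> datetime:
    """Convert a GPS timestamp in seconds to a UTC datetime."""
    return GPS_EPOCH + timedelta(seconds=gps_seconds - LEAP_SECONDS)


fb1_gps = 1282269402.89    # value read from /proc/gps on fb1
expected_gps = 1313890914  # approximate current GPS time quoted in the entry

print("fb1 reports:", gps_to_utc(fb1_gps))
print("expected   :", gps_to_utc(expected_gps))
print("offset in days:", (expected_gps - fb1_gps) / 86400)
```

The first timestamp lands in August 2020 and the second in August 2021, so the two differ by roughly one year, consistent with the file name C-R-1282268576-16.gwf.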

Attachment 1: Screen_Shot_2021-08-24_at_18.46.24.png
16298   Wed Aug 25 17:31:30 2021 PacoUpdateCDSFB is writing the frames with a year old date

[paco, tega, koji]

After invaluable assistance from Jamie in fixing this yearly offset in the gps time reported by cat /proc/gps, we managed to restart the real time system correctly (while still manually synchronizing the front end machine times). After this, we recovered the mode cleaner and were able to lock the arms with not much fuss.

Nevertheless, tega and I noticed some weird noise in the C1:LSC-TRX_OUT which was not present in the YARM transmission, and that is present even in the absence of light (we unlocked the arms and just saw it on the ndscope as shown in Attachment #1). It seems to affect the XARM and in general the lock acquisition...

We took a quick spectrum with diaggui (Attachment #2), but it doesn't look normal; there seems to be broadband excess noise with a remarkable 1 kHz component. We will probably look into it in more detail.

Attachment 1: TRX_noise_2021-08-25_17-40-55.png
Attachment 2: TRX_TRY_power_spectra.pdf
16300   Thu Aug 26 10:10:44 2021 PacoUpdateCDSFB is writing the frames with a year old date

[paco, ]

We went over to the X end to check what was going on with the TRX signal. We spotted that the ground terminal coming from the QPD was loosely touching the handle of one of the computers on the rack. When we detached it completely from the rack, the noise was gone (attachment 1).

We taped this terminal so it doesn't touch anything accidentally. We don't know if this is the best solution, since it probably needs a stable voltage reference. In the Y end those ground terminals are connected to the same point on the rack. The other ground terminals in the X end are just cut.

We also took the PSD of these channels (attachment 2). The noise seems to be gone, but TRX is still a bit noisier than TRY. Maybe we should set up a proper ground for the X arm QPD?

We saw that the X end station ALS laser was off. We turned it on, along with the crystal oven, and reenabled the temperature controller. Green light immediately appeared. We are now working to restore the ALS lock. After running XARM ASS we were unable to lock the green laser, so we went to the XEND and moved the piezo X ALS alignment mirrors until we maximized the transmission in the right mode. We then locked the ALS beams on both arms successfully. It very well could be that the PZT offsets were reset by the power glitch. The XARM ALS still needs some tweaking; its level is ~ 25% of what it was before the power glitch.

Attachment 1: Screenshot_from_2021-08-26_10-09-50.png
Attachment 2: TRXTRY_Spectra.pdf
9021   Sun Aug 18 16:04:07 2013 ranaSummaryCDSFB lights all RED: mxstream restart

Sun Aug 18 15:52:50 2013

Found the FB lights (C1:FEC-NN_FB_NET_STATUS and C1:DAQ-DC0_C1XXX_STATUS) RED for everything on the CDS_FE_STATUS screen.

I used the (! mxstream restart) button to restart the mxstreams. Everything is green now.

PMC was out of lock- relocked it and the IMC locked itself as did the X & Y arms on IR. X was already green locked.

Attachment 1: IFO-Trend.png
9354   Wed Nov 6 15:12:01 2013 JenneUpdateCDSFB not talking to LSC?

Something funny is going on with the framebuilder's communication with the LSC machine.

This is a different failure mode / error than I have seen before.  It's not the type of problem that is solved by restarting the mxstreams (that is indicated by also the 2 blocks on top of one another, that are green on the lsc machine right now, being red), although I did try that, before I looked closer and realized that that wasn't the problem.

ssh-ing to c1lsc, and doing a "rtcds restart all" seems to be fixing the problem.  Both c1oaf and c1cal needed another round of restarting, because they needed their BURT buttons pressed manually.  All of the models on the lsc machine are running fine now, though.

Here's a screenshot of the CDS overview screen, with the error lights:

9357   Wed Nov 6 17:21:58 2013 JamitUpdateCDSFB not talking to LSC?

 Quote: Something funny is going on with the framebuilder's communication with the LSC machine.  This is a different failure mode / error than I have seen before.  It's not the type of problem that is solved by restarting the mxstreams (that is indicated by also the 2 blocks on top of one another, that are green on the lsc machine right now, being red), although I did try that, before I looked closer and realized that that wasn't the problem. ssh-ing to c1lsc, and doing a "rtcds restart all" seems to be fixing the problem.  Both c1oaf and c1cal needed another round of restarting, because they needed their BURT buttons pressed manually.  All of the models on the lsc machine are running fine now, though. Here's a screenshot of the CDS overview screen, with the error lights:

This definitely looks like a timing problem on the c1lsc front end computer.  The red lights on the left mean that the timing synchronization is lost at the user model.  I'm perplexed why it looks like the IOP is not seeing the same error, though, since it should originate at the ADC.  The red lights to the right just mean the timing synchronization is lost with the DAQ, which is to be expected given a timing loss at the front end.

We'll have to take a closer look when this happens again.

8278   Tue Mar 12 12:06:22 2013 JamieUpdateComputersFB recovered, RAID power supply #1 dead

The framebuilder RAID is back online.  The disk had been mounted read-only (see below) so daqd couldn't write frames, which was in turn causing it to segfault immediately, so it was constantly restarting.

The jetstor RAID unit itself has a dead power supply.  This is not fatal, since it has three.  It has three so it can continue to function if one fails.  I have removed the bad supply and gave it to Steve so he can get a suitable replacement.

Some recovery had to be done on fb to get everything back up and running again.  I ran into issues trying to do it on the fly, so I eventually just rebooted.  It seemed to come back ok, except for something going on with daqd.  It was reporting the following error upon restart:

[Tue Mar 12 11:43:54 2013] main profiler warning: 0 empty blocks in the buffer


It was spitting out this message about once a second, until eventually the daqd died.  When it restarted it seemed to come back up fine.  I'm not exactly clear what those messages were about, but I think it has something to do with not being able to dump its data buffers to disk.  I'm guessing that this was a residual problem from the unmounted /frames, which somehow cleared on its own.  Everything seems to be ok now.

 Quote: Manasa just went inside to recenter the AS beam on the camera after our Yarm spot centering exercises of the evening, and heard a loud beeping.  We determined that it is the RAID attached to the framebuilder, which holds all of our frame data that is beeping incessantly.  The top center power switch on the back (there are FOUR power switches, and 3 power cables, btw.  That's a lot) had a red light next to it, so I power cycled the box.  After the box came back up, it started beeping again, with the same front panel message: H/W monitor power #1 failed.

DO NOT DO THIS.  This is what caused all the problems.  The unit has three redundant power supplies, for just this reason.  It was probably continuing to function fine.  The beeping was just to tell you that there was something that needed attention.  Rebooting the device does nothing to solve the problem.  Rebooting in an attempt to silence beeping is not a solution.  Shutting off the RAID unit is basically the equivalent of ripping out a mounted external USB drive.  You can damage the filesystem that way.  The disk was still functioning properly.  As far as I understand it the only problem was the beeping, and there were no other issues.  After you hard rebooted the device, fb lost its mounted disk and then went into emergency mode, which was to remount the disk read-only.  It didn't understand what was going on, only that the disk seemed to disappear and then reappear.  This was then what caused the problems.  It was not the beeping, it was the restarting of the RAID that was mounted on fb.

Computers are not like regular pieces of hardware.  You can't just yank the power on them.  Worse yet is yanking the power on a device that is connected to a computer.  DON'T DO THIS UNLESS YOU KNOW WHAT YOU'RE DOING.  If the device is a disk drive, then doing this is a sure-fire way to damage data on disk.

9437   Wed Dec 4 12:02:39 2013 KojiUpdateCDSFB restored

Now FB is fixed: daqd and nds are running

When I rebooted FB, I noticed that none of the nfs file systems were mounted.
I started tracking down the issues from here.

I googled the common issues of the nfs mounting during the boot sequence.
- It is good to give "_netdev" option to fstab to mount the system after the network connection is established.

- "auto" option specifies that the file system is mounted when mount -a is run

Resulting /etc/fstab is this:

/dev/sdb1                            /            ext3    noatime                    0 1
/swapfile                            none         swap    sw                         0 0
shm                                  /dev/shm     tmpfs   nodev,nosuid,noexec        0 0
/dev/sda1                            /frames      ext3    noatime                    0 0
linux1:/home/cds/                    /cvs/cds     nfs     _netdev,auto,rw,bg,soft    0 0
linux1:/home/cds/rtcds               /opt/rtcds   nfs     _netdev,auto,rw,bg,soft    0 0
linux1:/home/cds/rtapps              /opt/rtapps  nfs     _netdev,auto,rw,bg,soft    0 0
linux1:/home/cds/caltech/apps/linux  /opt/apps    nfs     _netdev,auto,rw,bg,soft    0 0
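
A quick way to confirm that every NFS entry in an fstab like the one above carries the "_netdev" option is sketched below in Python. This is only an illustration, not something that was run on FB: the function name nfs_entries_missing_option is hypothetical, and the fstab text is copied from this entry with the column padding collapsed.

```python
# fstab contents as listed in this entry (whitespace collapsed).
FSTAB = """\
/dev/sdb1 / ext3 noatime 0 1
/swapfile none swap sw 0 0
shm /dev/shm tmpfs nodev,nosuid,noexec 0 0
/dev/sda1 /frames ext3 noatime 0 0
linux1:/home/cds/ /cvs/cds nfs _netdev,auto,rw,bg,soft 0 0
linux1:/home/cds/rtcds /opt/rtcds nfs _netdev,auto,rw,bg,soft 0 0
linux1:/home/cds/rtapps /opt/rtapps nfs _netdev,auto,rw,bg,soft 0 0
linux1:/home/cds/caltech/apps/linux /opt/apps nfs _netdev,auto,rw,bg,soft 0 0
"""


def nfs_entries_missing_option(fstab_text, option="_netdev"):
    """Return mount points of NFS entries whose options lack `option`."""
    missing = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type.startswith("nfs") and option not in options.split(","):
            missing.append(mount_point)
    return missing


print(nfs_entries_missing_option(FSTAB))
```

An empty list means all NFS mounts have the option. As the rest of this entry shows, the option alone was not sufficient on FB; the nfsmount service also had to be enabled at boot.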

But this didn't help mounting the nfs file systems at boot yet. I dug into google again and found a command "/sbin/rc-update".
"/sbin/rc-update show" shows what services are activated at boot. It did not include "nfsmount". So the following command
was executed

> sudo /sbin/rc-update add nfsmount boot

> /sbin/rc-update show


 * Broken runlevel entry: /etc/runlevels/boot/portmap
            bootmisc | boot
             checkfs | boot
           checkroot | boot
               clock | boot
         consolefont | boot
               dcron |      default
               dhcpd |      default
            hostname | boot
            in.tftpd | boot
             keymaps | boot
               local |      default nonetwork
          localmount | boot
             modules | boot
               monit |      default
                  mx |      default
            net.eth0 |      default
              net.lo | boot
            netmount |      default
                 nfs | boot
            nfsmount | boot
          ntp-client | boot default
           rmnologin | boot
           rpc.statd | boot
                sshd | boot
           syslog-ng | boot
      udev-postmount |      default
             urandom | boot
              xinetd |      default

After rebooting, I confirmed that the nfs file systems are correctly mounted
and daqd and nds are automatically started.

This means that FB had never been configured to run correctly at boot. Shame on you!

10050   Tue Jun 17 17:04:26 2014 ericqUpdateComputer Scripts / ProgramsFB troubles

 Quote: Also, the CDS FE status screen had red lights blinking as if it required an 'mxstream restart'. I did the same and it did not fix the problem. So I tried to restart fb using the usual 'telnet fb 8087'; but could not restart fb that way.

FB is acting strange. When ssh-ing in, certain commands cause an inescapable hang, which can't be ctrl-c'd out of. Telling it to reboot does nothing. I saw this kind of situation before, when we were getting all the front ends back; I eventually hard rebooted it, hoping it was a one-time thing. Guess it's not.

Looking at the dmesg output, daqd seems to be segfault-ing all over the place. This may be related... Here are some examples:

[451314.730502] daqd[17339]: segfault at 7ff589ae3b30 ip 00007ff589ae3b30 sp 00007ff49931dfb8 error 15 in libmyriexpress.so[7ff589ae3000+1000]

[530516.313238] daqd[18442] general protection ip:7f3f2ce73a6c sp:7f3e29949d50 error:0

[530516.313250] daqd[18420] general protection ip:7f3f2ce73a6c sp:7f3e2a19fd50 error:0 in libc-2.10.1.so[7f3f2ce3f000+14c000]

[530516.313262]  in libc-2.10.1.so[7f3f2ce3f000+14c000]

[530516.327083] daqd[18412]: segfault at 3b04c9cd0 ip 00007f3f2ce73a6c sp 00007f3e2a4a7d50 error 4 in libc-2.10.1.so[7f3f2ce3f000+14c000]

[537695.364481] daqd[18489]: segfault at 12dbbcae0 ip 00007fa35a3b8a0a sp 00007fa298381af0 error 6 in libmyriexpress.so[7fa35a399000+28000]

[577316.821618] daqd[18758]: segfault at 7f5c4d3e9b30 ip 00007f5c4d3e9b30 sp 00007f5b5cc23fb8 error 15 in libmyriexpress.so[7f5c4d3e9000+1000]
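
To see at a glance which library the segfaults land in, the dmesg lines above can be tallied with a short Python sketch. This is an illustration only; the regular expression is an assumption based on the line format shown here, and the function name segfaults_by_library is hypothetical. Only the "segfault at" lines are counted, not the "general protection" ones.

```python
import re
from collections import Counter

# The "segfault at" lines from the dmesg excerpt in this entry.
DMESG = """\
[451314.730502] daqd[17339]: segfault at 7ff589ae3b30 ip 00007ff589ae3b30 sp 00007ff49931dfb8 error 15 in libmyriexpress.so[7ff589ae3000+1000]
[530516.327083] daqd[18412]: segfault at 3b04c9cd0 ip 00007f3f2ce73a6c sp 00007f3e2a4a7d50 error 4 in libc-2.10.1.so[7f3f2ce3f000+14c000]
[537695.364481] daqd[18489]: segfault at 12dbbcae0 ip 00007fa35a3b8a0a sp 00007fa298381af0 error 6 in libmyriexpress.so[7fa35a399000+28000]
[577316.821618] daqd[18758]: segfault at 7f5c4d3e9b30 ip 00007f5c4d3e9b30 sp 00007f5b5cc23fb8 error 15 in libmyriexpress.so[7f5c4d3e9000+1000]
"""

# Matches e.g. "daqd[17339]: segfault at <addr> ... in <library>[<base>+<size>]".
SEGFAULT_RE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at [0-9a-f]+ .* in (?P<lib>[^\[\s]+)\["
)


def segfaults_by_library(text):
    """Count segfault lines per faulting library name."""
    counts = Counter()
    for line in text.splitlines():
        match = SEGFAULT_RE.search(line)
        if match:
            counts[match.group("lib")] += 1
    return counts


for lib, n in segfaults_by_library(DMESG).most_common():
    print(lib, n)
```

For this excerpt most of the segfaults fall inside libmyriexpress.so, which may point toward the Myrinet (mx) data path rather than daqd itself, though four lines are too few to conclude anything.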

I'm not inclined to go reboot it right now, but not sure how to address these problems...

9839   Tue Apr 22 01:39:57 2014 JenneUpdateCDSFB unhappy again

[Jenne, Q]

The frame builder (or something) is unhappy again.  I know that we've seen this before, but I can't find the elog entry that relates to this particular problem.

Every few minutes, the fb status lights on the CDS_STATUS screen go white, and then come back green.  It's annoying when it happens every hour or so (which is unfortunately typical), but it's pretty debilitating when it stops dataviewer and dtt every few minutes.  Just from the way the lights change, it looks like perhaps the daqd process is restarting itself periodically?

12151   Mon Jun 6 16:41:36 2016 ericqUpdateCDSFB upgrade work

Barring objections, starting tomorrow morning, Jamie will be testing the new FB code. The IFO will not be available for other use while this is ongoing.

13312   Fri Sep 15 15:54:28 2017 gautamUpdateCDSFB wiper script

A wiper script is not yet set up for our new Frame-Builder. The disk usage is ~80% now, so I think we should start running a wiper script that manages overall disk usage and deletes old frame files to this end.

From what I could find on the elog, the way this was done was by running a cron job on FB. There is a perl script, /opt/rtcds/caltech/c1/target/fb/wiper.pl, which from what I could understand, runs a bunch of du commands on different directories to determine if there is a need to delete any files.

I copied this script over to /opt/rtcds/caltech/c1/target/daqd/wiper.pl. This is the directory in which all the new FB stuff resides. Conveniently, the script has a "dry-run" option, which I tried running on FB1. However, I get the following error message:

Fri Sep 15 15:44:45 PDT 2017
Dry run, will not remove any files!!!
You need to rerun this with --delete argument to really delete frame files
Directory disk usage:
/frames/trend/minute_rawk
Combined 0k or 0m or 0Gb
Illegal division by zero at ./wiper.pl line 98.

So it would seem that for some reason, the du commands aren't working. From what I could tell, there aren't any directory paths specific to the old FB machine that need to be changed. I believe the script was working prior to the FB disk crash - unfortunately it doesn't look like this script was under version control but I don't think any changes have been made to this script.

Before I go down a Perl rabbit hole, has anyone seen such an error or is aware of some reason why this might not work on the new FB? Am I even using the correct scripts?

13317   Mon Sep 18 17:17:49 2017 gautamUpdateCDSFB wiper script

After trying to debug this issue using the Perl debugger, I concluded that the problem is in the part of the code that splits the output of the "du" command into directory and disk usage. For whatever reason, this isn't working. The version of perl running on the new FB1 machine is 5.20.2, whereas I suspect the version running on the old FB machine was 5.14.2 (which is the version on all the Ubuntu 12 workstations and megatron). Unclear whether downgrading the Perl version is the right way to go.

The FB1 disk is now getting close to full, the usage is up to 85% today.

Quote:

Before I go down a Perl rabbit hole, has anyone seen such an error or is aware of some reason why this might not work on the new FB? Am I even using the correct scripts?

13318   Mon Sep 18 17:30:54 2017 ChrisUpdateCDSFB wiper script

Attached is the version of the wiper script we use on the CryoLab cymac. It works with perl v5.20.2. Is this different from what you have?

Attachment 1: wiper.pl
#!/usr/bin/perl
use File::Basename;

# Print a timestamp (backticks run the shell's date command)
print "\n" . `date` . "\n";
# Dry run, do not delete anything
$dry_run = 1;
if ($ARGV[0] eq "--delete") { $dry_run = 0; }
print "Dry run, will not remove any files!!!\n" if $dry_run;

... 184 more lines ...
13319   Mon Sep 18 17:51:26 2017 gautamUpdateCDSFB wiper script

It is a little different - specifically, the way the splitting of the output of the "du" command into disk usage and directory is different (see Attachment #1). Apart from this, some of the parameters (e.g. what percentage to keep free) are different.

I changed the percentages to match what we had here, and edited a couple of other lines to print out the files that will be deleted. The dry run seemed to work okay, it produced the output below. Not sure why "df -h" reports a different use percentage though...

Since the script seems to be working now, I am going to set it up on FB1's crontab. Thanks Chris!.

controls@fb1:/opt/rtcds/caltech/c1/target/daqd 0$ ./wiper.pl
Mon Sep 18 17:47:06 PDT 2017
Dry run, will not remove any files!!!
You need to rerun this with --delete argument to really delete frame files
Directory disk usage:
/frames/trend/minute_raw 47126124k
/frames/trend/minute 22900668k
/frames/trend/second 760359168k
/frames/full 19337278516k
Combined 20167664476k or 19694984m or 19233Gb
/frames size 25097525144k at 80.36%
/frames is below keep value of 85.00%
Will not delete any files
df reported usage 80.36%
controls@fb1:/opt/rtcds/caltech/c1/target/daqd 0$ df -h
Filesystem                        Size  Used Avail Use% Mounted on
/dev/sda4                         2.0T  1.7T  152G  92% /
udev                               10M     0   10M   0% /dev
tmpfs                              13G  177M   13G   2% /run
tmpfs                              32G     0   32G   0% /dev/shm
tmpfs                             5.0M     0  5.0M   0% /run/lock
tmpfs                              32G     0   32G   0% /sys/fs/cgroup
/dev/sda2                          19G  3.7G   14G  21% /var
/dev/sda1                         461M   65M  373M  15% /boot
/dev/sdb1                          24T   19T  3.5T  85% /frames
192.168.113.104:/home/cds/rtcds   2.0T  1.6T  291G  85% /opt/rtcds
192.168.113.104:/home/cds/rtapps  2.0T  1.6T  291G  85% /opt/rtapps
tmpfs                             6.3G     0  6.3G   0% /run/user/1001
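
The arithmetic behind the dry-run output above can be reproduced with a small Python sketch. This is not the wiper.pl logic itself (most of that script is elided here); it assumes du prints "size<TAB>path" lines with sizes in kilobytes, and the function names parse_du, percent_used and must_delete are made up for illustration. The numbers are the ones printed by the dry run.

```python
# Assumed `du -sk`-style output: size in kB, a tab, then the directory.
DU_OUTPUT = """\
47126124\t/frames/trend/minute_raw
22900668\t/frames/trend/minute
760359168\t/frames/trend/second
19337278516\t/frames/full
"""

FRAMES_SIZE_K = 25097525144  # "/frames size" reported by the dry run, in kB


def parse_du(text):
    """Split each du line into (size, path) and return {path: size_in_kB}."""
    usage = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        size, path = line.split(None, 1)  # split on the first run of whitespace
        usage[path.strip()] = int(size)
    return usage


def percent_used(usage, total_k):
    """Combined usage of the listed directories as a percentage of total_k."""
    return 100.0 * sum(usage.values()) / total_k


def must_delete(usage, total_k, percent_keep):
    """True if usage exceeds the keep threshold, so old frames should go."""
    return percent_used(usage, total_k) > percent_keep


usage = parse_du(DU_OUTPUT)
print("Combined %dk at %.2f%%" % (sum(usage.values()), percent_used(usage, FRAMES_SIZE_K)))
print("delete needed at 85%:", must_delete(usage, FRAMES_SIZE_K, 85.0))
print("delete needed at 75%:", must_delete(usage, FRAMES_SIZE_K, 75.0))
```

If the size/path split fails, as with the original script on FB1, every size is lost and the combined total becomes zero, which is consistent with the "Combined 0k" line and the division-by-zero error reported earlier.
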
 Quote: Attached is the version of the wiper script we use on the CryoLab cymac. It works with perl v5.20.2. Is this different from what you have?

Attachment 1: perlDiff.png
13320   Mon Sep 18 18:40:34 2017 gautamUpdateCDSFB wiper script

I did a further check on the wiper script by changing the "percent_keep" from 85.0 to 75.0, and running the script in "dry_run" mode again. The script then output to console the names of all the files it would delete in order to free up the required amount of space (but didn't actually delete any files as it was a dry run). Seemed to be sensible.

To set up the cron job, I did the following on FB1:

• crontab -e opened up the crontab
• Copied over a script called "wiper.cron" from /opt/rtcds/caltech/c1/target/fb to /opt/rtcds/caltech/c1/target/daqd. This essentially contains a bunch of instructions to run the wiper script with the --delete flag, and write the console output to a log file.
• Added the following line: 33 3 * * * /opt/rtcds/caltech/c1/target/daqd/wiper.cron. So the cron job should be executed at 3:33 AM every day.
• The cron daemon seems to be running - sudo systemctl status cron.service yields the following output:
controls@fb1:~ 0$ sudo systemctl status cron.service
● cron.service - Regular background program processing daemon
   Loaded: loaded (/lib/systemd/system/cron.service; enabled)
   Active: active (running) since Mon 2017-09-18 18:16:58 PDT; 27min ago
     Docs: man:cron(8)
 Main PID: 30183 (cron)
   CGroup: /system.slice/cron.service
           └─30183 /usr/sbin/cron -f

Sep 18 18:16:58 fb1 cron[30183]: (CRON) INFO (Skipping @reboot jobs -- not system startup)
Sep 18 18:17:01 fb1 CRON[30205]: pam_unix(cron:session): session opened for user root by (uid=0)
Sep 18 18:17:01 fb1 CRON[30206]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Sep 18 18:17:01 fb1 CRON[30205]: pam_unix(cron:session): session closed for user root
Sep 18 18:25:01 fb1 CRON[30820]: pam_unix(cron:session): session opened for user root by (uid=0)
Sep 18 18:25:01 fb1 CRON[30821]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Sep 18 18:25:01 fb1 CRON[30820]: pam_unix(cron:session): session closed for user root
Sep 18 18:35:01 fb1 CRON[31515]: pam_unix(cron:session): session opened for user root by (uid=0)
Sep 18 18:35:01 fb1 CRON[31516]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Sep 18 18:35:01 fb1 CRON[31515]: pam_unix(cron:session): session closed for user root
• crontab -l on FB1 now shows the following:
controls@fb1:~ 0$ crontab -l
# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h  dom mon dow   command
33 3 * * * /opt/rtcds/caltech/c1/target/daqd/wiper.cron
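
For reference, the schedule fields of the crontab entry above (minute 33, hour 3, every day) can be turned into the next expected run time with a few lines of Python. This is a simplified sketch that only handles plain daily entries with numeric minute and hour fields; it is not a general cron parser, and the function name next_daily_run is hypothetical.

```python
from datetime import datetime, timedelta

ENTRY = "33 3 * * * /opt/rtcds/caltech/c1/target/daqd/wiper.cron"


def next_daily_run(cron_line, now):
    """Next run time of a daily 'm h * * * command' crontab entry."""
    minute, hour, dom, mon, dow = cron_line.split()[:5]
    if (dom, mon, dow) != ("*", "*", "*"):
        raise ValueError("this sketch only handles entries that run every day")
    candidate = now.replace(hour=int(hour), minute=int(minute),
                            second=0, microsecond=0)
    # If today's slot has already passed, the job runs tomorrow.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


# Local time of this elog entry on FB1.
print(next_daily_run(ENTRY, datetime(2017, 9, 18, 18, 40, 34)))
```

With the timestamp of this entry as "now", the first run falls on the following morning at 03:33 local time, as cron uses the system's notion of time and timezone.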

Let's see if this works.

 Quote: Since the script seems to be working now, I am going to set it up on FB1's crontab. Thanks Chris!.

8274   Tue Mar 12 00:35:56 2013 JenneUpdateComputersFB's RAID is beeping

[Manasa, Jenne]

Manasa just went inside to recenter the AS beam on the camera after our Yarm spot centering exercises of the evening, and heard a loud beeping.  We determined that it is the RAID attached to the framebuilder, which holds all of our frame data that is beeping incessantly.  The top center power switch on the back (there are FOUR power switches, and 3 power cables, btw.  That's a lot) had a red light next to it, so I power cycled the box.  After the box came back up, it started beeping again, with the same front panel message:

H/W monitor power #1 failed.

Right now the fb is trying to stay connected to things, and we can kind of use dataviewer, but we lose our connection to the framebuilder every ~30 seconds or so.  This rough timing estimate comes from how often we see the fb-related lights on the frontend status screen cycle from green to white to red back to green (or, how long do the lights stay green before going white again).  We weren't having trouble before the RAID went down a few minutes ago, so I'm hopeful that once that's fixed, the fb will be fine.

In other news, just to make Jamie's day a little bit better, Dataviewer does not open on Pianosa or Rosalba.  The window opens, but it stays a blank grey box.  This has been going on for Pianosa for a few days, but it's new (to me at least) on Rosalba.  This is different from the lack of ability to connect to the fb that Rossa and Ottavia are seeing.

354   Tue Mar 4 00:42:51 2008 ranaUpdateComputersFB0 still down ?
The framebuilder is still down. I tried restarting the daqd task and resetting the RFM
switch like it says in the Wiki but it still doesn't work right. The computer itself is
running (I can ssh to it) and the daqd process is running but there's a red light for
it on the RFM screen and dataviewer won't connect to it.

If Alex isn't over by ~10 AM, we should call him and ask for help.