We need to set up crontab or rc.local correctly for the frontend machines.
fb timing was off again.
Off again. Restarted ntp on fb.
There currently seems to be a timing issue with the frame builder. We switched over to using a Symmetricom card to get an IRIG-B signal into the fb machine, but the GPS time stamp is way off (by ~80 years, Alex said).
If there is a frame builder issue, it is currently often necessary to kill the associated mx_stream processes, since they don't seem to restart gracefully. To fix it, the following steps should be taken:
Kill frame builder, kill the two mx_stream processes, then /etc/restart_streams/, then restart the frame builder (usual daqd -c ./daqdrc >& ./daqd.log in /opt/rtcds/caltech/c1/target/fb).
To restart (or start after a boot) the nds server, you need to go to /opt/rtcds/caltech/c1/target/fb and type
At this time, testpoints are kind of working, but timing issues seem to be preventing useful work from being done with them. I'm leaving with Alex working on the code.
Alex fixed the time issue with the IRIG-B signal being far off. Apparently the IRIG-B signal at Downs is different from ours. He simply corrected for the difference between the two signals in the code.
For debugging purposes we uncommented the following line in the feCodeGen.pl script (in /opt/rtcds/caltech/c1/advLigoRTS/src/epics/util/):
print EPICS "test_points ONE_PPS $dac_testpoint_names $::extraTestPoints\n"
This is to make every ADC testpoint available from the IOP (such as c1x02).
1) Need to check 1 PPS signal alignment
2) Figure out why 1PPS and ADC/DAC testpoints went away from feCodeGen.pl?
3) Fix 1PPS testpoint giving NaN data
4) Figure out why daqd is printing "making gps time correction" twice
5) Need to investigate why mx_streams are still getting stuck
6) Epics channels should not go out on 114 network (seen messages when doing
7) Dataviewer leaves test points hanging, daqd does not deallocate them
(net_Writer.c shutdown_netwriter call)
8) Need to install wiper scripts on fb
9) Need to install newer kernel on fb to avoid loading myrinet firmware
(avoid boot delay)
The front ends and fb computers were unresponsive this morning.
This was due to the fb machine having its ethernet cable plugged into the wrong input. It should be plugged into the port labeled 0.
Since all the front end machines mount their root partition from fb, this caused them to also hang.
The cable has been relabeled to "fb" on both ends, and plugged into the correct jack. All the front ends were rebooted.
I tested the RFM connection between c1ioo and c1scx. Unfortunately, on the first test, it turns out the c1ioo machine had its gps time off by 1 second compared to c1sus and c1iscex. A second reboot seems to have fixed the issue.
However, it bothers me that the code didn't come up with the correct time on the first boot.
The test was done using the c1gcv model and by modifying the c1scx model. At the moment, the MC_L channel is being passed to the MC_L input of the ETMX suspension. In the final configuration, this will be a properly shaped error signal from the green locking.
The MC_L signal is currently not actually driving the optic, as the ETMX POS MATRIX currently has a 0 for the MC_L component.
/var on fb1 filled up today, which caused all sorts of CDS issues. I found out about the problem by reading the logs of the services that were having trouble running, in which they complained about not being able to write to disk. I looked at the filesystem status with 'df' and noticed that /var was full. That is where applications write temporary data, so a full /var will always cause problems.
I tracked the issue down to multiple multi-gigabyte log files: /var/log/messages and /var/log/messages.1. They were full of lines like this one:
Seems like something related to the gpstime kernel module?
Anyway, I deleted the log files for now, which cleared up the space on /var. Things should be back to normal now, until the logs fill up again...
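The 'df' check and the hunt for oversized log files described above can be sketched in Python. This is only an illustration, assuming a POSIX host; `usage_fraction` and `largest_files` are hypothetical helper names and are not part of any CDS tooling.

```python
import os
import shutil


def usage_fraction(path):
    """Return the used fraction (0..1) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def largest_files(directory, count=5):
    """Return up to `count` (size_in_bytes, path) pairs, largest first."""
    sizes = []
    for entry in os.scandir(directory):
        try:
            if entry.is_file(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                sizes.append((size, entry.path))
        except OSError:
            # Skip files that vanish or cannot be read while scanning.
            continue
    return sorted(sizes, reverse=True)[:count]


# Example: report how full the root filesystem is.
print(f"/ is {100 * usage_fraction('/'):.1f}% full")
```

Pointing `usage_fraction` at /var and `largest_files` at /var/log would reproduce the manual steps from this entry.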
Jonathan Hanks pointed me to this fix to the gpstime kernel module that was unfortunately put in after the 3.4 release that we're currently using:
I hacked the source in place (/usr/src/gpstime-3.4/drv/gpstime/gpstime.c) to get the fix, and then rebuilt the kernel module with dkms:
sudo dkms uninstall gpstime/3.4
sudo dkms install gpstime/3.4
I then stopped daqd_dc, unloaded gpstime, reloaded it, restarted daqd_dc. The messages are no longer showing up in /var/log/messages, so I think we're ok for the moment.
NOTE: the fix will be undone if we for some reason reinstall the advligorts-gpstime-dkms package. There shouldn't be a need to do that, but we should be aware. I'm discussing with Jonathan whether we want to try to push out a new Debian package to fix this issue...
I fixed the JetStor 416S raid array IP address by plugging in my laptop to its ethernet port, setting my IP to be on the same subnet, and using the web interface. (After finally tracking down the password, it has been placed in the usual place).
After this change, I powered up the fb40m2 machine and rebooted the fb40m machine. This seems to have made all the associated lights green.
Dataviewer is working: it shows data recorded from the point at which I fixed the JetStor RAID array and rebooted fb40m. It can also go back in time to before the IP switchover.
Having determined that Rana (the computer) was having too many issues with testing the new RAID array due to the age of the system, we proceeded to test on fb40m.
We brought it down and up several times between 11 and noon. We eventually were able to daisy chain the old RAID and the new RAID so that fb40m sees both. At this time, the RAID arrays are still daisy chained, but the computer is set up to run on just the original RAID while the full 14 TB array is initialized (16 drives, 1 hot spare, RAID level 5 means 14 TB out of the 16 TB are actually available). We expect this to take a few hours, at which point we will copy the data from the old RAID to the new RAID (which I also expect to take several hours). In the meantime, operations should not be affected. If they are, contact one of us.
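The 14 TB figure follows from the usual RAID 5 arithmetic: hot spares hold no data, and one drive's worth of the remaining capacity goes to parity. A small sketch, assuming 1 TB drives as in the JetStor installed here; `raid5_usable_tb` is a hypothetical helper name.

```python
def raid5_usable_tb(total_drives, hot_spares, drive_tb):
    """Usable capacity of a single RAID 5 set, in TB."""
    active = total_drives - hot_spares
    if active < 3:
        raise ValueError("RAID 5 needs at least 3 active drives")
    # One drive's worth of capacity is consumed by distributed parity.
    return (active - 1) * drive_tb


print(raid5_usable_tb(16, 1, 1.0))  # prints 14.0
```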
This afternoon the alignment script crashed after returning syntax errors. We found that the tpman wasn't running on the framebuilder because it had probably failed to get restarted in one of the several reboots executed in the morning by Alex and Jo.
Restarting the tpman was then sufficient for the alignment scripts to get back to work.
The fb40m just went out of order with status indicator number 8
It recovered on its own five minutes later.
Backup script restarted, backup of trend frames and /cvs/cds is up-to-date.
The frame builder was power cycled during the morning bootfest. I have restarted the backup script once more.
The 40m frame builder is currently being patched to be able to utilize the full 14 TB of the new RAID array (as opposed to being limited to 2 TB). This process is expected to take several hours, during which the frame builder will be unavailable.
Alex came over this morning and we began work on the frame builder change over. This required fb40m be brought down and disconnected from the RAID array, so the frame builder is not available.
He brought a Netgear switch which we've installed at the top of the 1X7 rack. This will eventually be connected, via Cat 6 cable, to all the front ends. It is connected to the new fb machine via a 10G fiber.
Alex has gone back to Downs to pick up a Symmetricom card for getting timing information into the frame builder. He will also be bringing back a hard drive with the necessary framebuilder software to be copied onto the new fb machine.
He said he'd also like to put a Gentoo boot server on the machine. This boot server will not affect anything at the moment, but it's apparently the style the sites are moving towards: a single boot server and diskless front end computers, running Gentoo. For the moment we are sticking with our current CentOS real time kernel (which is still compatible with the new frame builder code), but this would make a switchover to the new system possible in the future.
At the moment, the RAID array is doing a file system check, and is going slowly while it checks terabytes of data. We will continue work after lunch.
Punchline: things still don't work.
FB40m up and running again after restarting the DAQ.
I received an e-mail from Alex indicating he found the testpoint problem and fixed it today:
Quote from Alex: "After we swapped the frame builder computer it has reconfigured all device files and I needed to create some symlinks on /dev/ to make tpman work again. I test the testpoints and they do work now."
Alex and Steve,
SunFire x4600 (not MEGATRON 2; it is fb40m2) and JetStor (16 x 1 TB drives) were installed on side rails at the bottom of 1Y6.
We cleaned up the fibres and cabling in 1Y7 also
I did a simulation of a linear-quadratic-Gaussian (LQG) controller applied to local damping. The cost function was frequency shaped to have a peak at 1 Hz. This technique prevents the controller from adding sensor noise at high and very low frequencies.
Noise was simulated to have a 1/f spectrum (seismic) multiplied by the stack response, with a resonance at 4 Hz and Q=5.
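One way to write down such a noise model is sketched below. The entry does not give the exact form of the stack response, so a single second-order resonance with unity gain at DC is assumed here, and the 1/f shape is taken to apply to the amplitude spectrum; `stack_response` and `noise_asd` are hypothetical names.

```python
import numpy as np


def stack_response(f, f0=4.0, q=5.0):
    """Magnitude of an assumed second-order resonance (unity gain at DC)."""
    f = np.asarray(f, dtype=float)
    return np.abs(f0**2 / (f0**2 - f**2 + 1j * f * f0 / q))


def noise_asd(f, f0=4.0, q=5.0):
    """Assumed amplitude spectrum: 1/f seismic shape times the stack response."""
    f = np.asarray(f, dtype=float)
    return stack_response(f, f0, q) / f


# Evaluate the model on a few log-spaced frequencies between 0.1 and 100 Hz.
freqs = np.logspace(-1, 2, 7)
for freq, amp in zip(freqs, noise_asd(freqs)):
    print(f"{freq:8.3f} Hz  {amp:.4g}")
```

With this form the response at the resonance equals Q, so the model peaks by a factor of 5 at 4 Hz and rolls off as 1/f^2 above it.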
New Lumitek IR Sensor Cards are here. We got 2 pieces of Q-11-T (2" x 2"), 2 pieces of Q-11-T (0.75" x 0.75") and one Q-11 (4" x 5")
1, Vacuum envelope grounds must be connected all times! After door removal reconnect both cables immediately.
2, The crane folding had a new issue of getting cut as picture shows.
3, Too much oplev light is scattered. This picture was taken just before we put on the heavy door.
4, We were unprepared to hold the smaller side chamber door 29" od of the IOC
5, Silicon bronze 1/2-13 nuts for chamber doors will be replaced. They are not smooth turning.
Can we get some panel mount FC/APC connectors and put them on a box? Then we could have the whole setup inside of a box that is filled with foam and sits outside the PSL hut.
[Steve, Diego, Manasa]
Since the beatnotes have disappeared, I am taking this as a chance to put the FOL setup together hoping it might help us find them.
Two 70m long fibers now run along the length of the Y arm and reach the PSL table.
The fibers are running through armaflex insulating tubes on the cable racks. The excess length ~6m sits in its spool on the top of the PSL table enclosure.
Both the fibers were tested OK using the fiber fault locator. We had to remove the coupled end of the fiber from the mount and put it back in the process. So there is only 8mW of end laser power at the PSL table after this activity as opposed to ~13mW. This will be recovered with some alignment tweaking.
After the activity I found that the ETMY wouldn't damp. I traced the problem to the ETMY SUS model not running in c1iscey. Restarting the models in c1iscey solved the problem.
AP Armaflex tube 7/8" ID X 1" wall insulation for the long fiber in wall mounted cable trays installed yesterday.
The 6 ft long sections are not glued. They are cable tied into the tray and pressed against one another, so they are air tight. This will allow us to add more fibers later.
Atm2: Fiber PSL ends protection added on Friday.
Alex, Gautam and Steve,
A 50 m long single mode fiber is laid out in the cable tray that is attached to the beam tube of the Y arm.
It goes from ETMY to the PSL enclosure. It is protected at both ends with "clear PVC, slit corrugated loom tubing", 1.5" ID.
The fiber is not protected between 1Y1 and 1Y4
I positioned the fiber loaded protecting tubing and anchored them so they can do their job.
However, the area needs a good clean up.
What I did today.
1. Collimation of a beam.
2. Coupling of the IR light at the ETMY table to a fibre.
A Matlab script to calculate Wiener filter coefficients and convert FIR to IIR is ready. The input is a file with zero-mean witness and desired signals; the output is a Foton zpk command that specifies the IIR filter.
The plot shows a comparison of offline FIR, offline IIR and online IIR filtering. The spectrum below 4 Hz is still oscillating due to acoustic coupling; this is not a filtering effect. At 1 Hz the actuator is badly compensated, and more work should be done. Other than that, online and offline filtering are the same.
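For reference, the FIR Wiener step can be sketched in a few lines. This is not the Matlab script from this entry, only a generic illustration that solves the normal equations R w = p built from sample correlations; `wiener_fir` and the synthetic coupling `h_true` are made up for the example.

```python
import numpy as np


def wiener_fir(witness, target, n_taps):
    """FIR Wiener coefficients w such that sum_k w[k]*witness[n-k] ~ target[n]."""
    witness = np.asarray(witness, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(witness)
    # Sample autocorrelation r[k] and cross-correlation p[k] for lags 0..n_taps-1.
    r = np.array([witness[:n - k] @ witness[k:] for k in range(n_taps)]) / n
    p = np.array([witness[:n - k] @ target[k:] for k in range(n_taps)]) / n
    # Symmetric Toeplitz autocorrelation matrix R[j, k] = r[|j - k|].
    lags = np.abs(np.subtract.outer(np.arange(n_taps), np.arange(n_taps)))
    return np.linalg.solve(r[lags], p)


# Synthetic zero-mean data: the target is the witness passed through h_true plus noise.
rng = np.random.default_rng(0)
h_true = np.array([0.5, -0.3, 0.2])
x = rng.standard_normal(50000)
d = np.convolve(x, h_true)[:len(x)] + 0.01 * rng.standard_normal(len(x))
w_est = wiener_fir(x, d, len(h_true))
print(w_est)
```

The recovered taps should be close to `h_true`; converting such a FIR filter to an IIR zpk form is a separate fitting step that this sketch does not cover.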
We decided to write a script that will check online filters for digital noise. One method can be implemented using the following algorithm:
Restriction: Single precision filter internal variables must be checked for overflows.
I applied this method to filtering a 1 Hz sine wave with a notch filter. Precise output should also be a 1 Hz wave => at other frequencies we see noise => digital noise spectrum should coincide with filter output. The plot shows the method worked out for this example.
Using this method I estimated digital noise of butter("LowPass", 2, 0.001) applied to white noise. Sampling frequency was 16 kHz.
The script estimates the digital noise produced by online filters. The first version of the Matlab files and compiled C files are in the scripts/digital_noise directory.
Algorithm for 1 filter bank (max number of filters = 10):
More details on (2)
Often DQ channels have a reduced sampling rate. In this case the script will upsample the data by inserting zeros.
The AI filter is not applied, but in the end only the frequency range (0, DQ RATE / 2) is analyzed.
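Upsampling by inserting zeros can be sketched as follows. The function name `upsample_zero_stuff` is hypothetical and this is not the script's actual code.

```python
import numpy as np


def upsample_zero_stuff(x, factor):
    """Raise the sample rate by `factor`, inserting factor-1 zeros after each sample."""
    x = np.asarray(x, dtype=float)
    y = np.zeros(len(x) * factor)
    y[::factor] = x
    return y


low_rate = np.array([1.0, 2.0, 3.0, 4.0])
high_rate = upsample_zero_stuff(low_rate, 2)
print(high_rate)

# Below the original Nyquist frequency the DFT bins are unchanged;
# above it the zero stuffing only produces spectral images.
print(np.fft.rfft(low_rate))
print(np.fft.rfft(high_rate)[: len(low_rate) // 2 + 1])
```

Because the images appear only above the original Nyquist frequency, restricting the analysis to (0, DQ RATE / 2) gives the same spectrum without an anti-imaging filter.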
More details on (3):
_SW2R channel value is the sum of the following numbers:
Note: for now the Matlab script assumes that the input, output and decimation filters are switched ON and that no filter switches are turned ON that do not correspond to any filter.
More details on (5)
Digital noise using double precision is estimated by extrapolating the digital noise measured with single precision. The latter is calculated by subtracting the outputs of the filters run with single and double precision. This noise is then multiplied by 3 * 10^-7.
This extrapolation number was obtained from printf tests of the number 0.123456789012345678 with single and double precision in C. Using type 'float' variables, 10 significant digits show up; using type 'double', 17.
I also did 'calibration tests' to obtain the extrapolation number: the signal was filtered with an aggressive low-pass filter. At high frequencies the filter output spectrum is flat => digital noise amplitude must be the same. The plot shows the GUR1_X channel filtered with a low-pass Chebyshev type 1 filter.
However, the extrapolation number is not the same for all cases. In the following example of analyzing the BS_SUSPOS filter bank using the extrapolation 3 * 10^-7, we get noise that is slightly overestimated. In some other examples we need to take a larger number. But on average, I think, this is a good approximation.
To avoid the extrapolation problem we can use long double precision (~19 digits). I was able to do this with the gcc compiler. However, with the mex compiler, using long double in the filter calculations does not give any better precision than using double precision. I'll think more about it.
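The single-precision noise estimate described above (run the same filter in single and double precision and subtract the outputs) can be sketched as below. A one-pole low-pass stands in for the real filter banks, and the names and parameters are assumptions for illustration only; this is not the code in scripts/digital_noise.

```python
import numpy as np


def one_pole_lowpass(x, alpha, dtype):
    """y[n] = y[n-1] + alpha * (x[n] - y[n-1]), evaluated entirely in `dtype`."""
    x = np.asarray(x).astype(dtype)
    a = dtype(alpha)
    y = np.empty_like(x)
    state = dtype(0)
    for n in range(len(x)):
        state = state + a * (x[n] - state)
        y[n] = state
    return y


rng = np.random.default_rng(0)
signal = rng.standard_normal(20000)

y32 = one_pole_lowpass(signal, 0.001, np.float32)
y64 = one_pole_lowpass(signal, 0.001, np.float64)

# Difference between the two runs: an estimate of the single-precision digital noise.
noise32 = y32.astype(np.float64) - y64
signal_rms = np.sqrt(np.mean(y64**2))
noise_rms = np.sqrt(np.mean(noise32**2))
print(f"output rms {signal_rms:.3e}, single-precision noise rms {noise_rms:.3e}")
```

The spectrum of `noise32` is the quantity that would then be scaled by the extrapolation factor to estimate the double-precision noise.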
Online filter diverges. I did offline simulations with current c-code. Offline filter also diverges, even in the simplest case
witness = randn(1e6, 1); target = witness + 0.01*randn(1e6, 1);
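For comparison, a textbook LMS update on the same kind of synthetic data can be sketched as follows. This is plain LMS, not the FXLMS c-code discussed in this entry, and the tap count, step size and record length are arbitrary choices for the illustration.

```python
import numpy as np


def lms(witness, target, n_taps, mu):
    """Plain LMS: adapt FIR taps w so that w . recent_witness tracks the target."""
    w = np.zeros(n_taps)
    buf = np.zeros(n_taps)  # most recent witness samples, newest first
    err = np.empty(len(target))
    for n in range(len(target)):
        buf = np.roll(buf, 1)
        buf[0] = witness[n]
        err[n] = target[n] - w @ buf
        w = w + mu * err[n] * buf
    return w, err


rng = np.random.default_rng(1)
n_samples = 20000  # shorter than the 1e6 samples above, to keep the loop fast
witness = rng.standard_normal(n_samples)
target = witness + 0.01 * rng.standard_normal(n_samples)

taps, error = lms(witness, target, n_taps=4, mu=0.01)
print("taps:", taps)
print("residual rms over last 5000 samples:", np.std(error[-5000:]))
```

For unit-variance white witness data the update is stable as long as mu is well below 2 / n_taps, so a divergence on data like this would point at the step size or normalization rather than at the data.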
I tried to create a new implementation of FXLMS algorithm as a c code. Then with this c code I did offline filtering with MCL and GUR signals and compared the error signals depending on the length of the filter.
One can see the code at the svn
adaptOnline - start here and choose algorithm
adaptive_filtering - Matlab implementation of AF
current_version.c - current version of the Filter (Matt's)
fxlms_filter.c - new version of the FXLMS filter
oaf.c - agent between Matlab and C (edited Matt's file)
Data samples can be found at nodus /users/den/wiener_filtering/data
I say just fix the clipping. Don't worry about the PRM OSEM filters. We can do that next time when we put in the ITM baffles. No need for them on this round.
The LSC time had gone too high. I deleted ~20 filters and rebooted. CPU time came down to 50 usec.
The filters all looked like old trash to me, but it's possible they were used.
I didn't delete anything from the DARM, CARM, etc. banks but did from the PD and TM filter banks. You can always go back in time by using the
All fine, except ITMX_sensor_UL's 60 counts deep hoop for an hour.
I made sketches of the final setup. There will be a box in the rack that contains both the heater circuit and the temperature sensor boards. One of them is in the loop while the other isn't. Instead of having many cables leading to the can, there will only be these three, though they can be made into a single wire. It will be connected to the can through a D-9 connector. The second attachment is what will be inside of the box, with all the major wires and components labeled.
Edit: I've changed the layout to (hopefully) make the labels easier to read. I've also added in a cable to the ADC that reads out the voltage across the 1 ohm resistor. I also attached the circuit diagrams for the heater circuit and the temperature sensors. The one for the heater circuit was made by Kevin and I used the same design, except I have LM7818 and LM7918, since the 15V ones were not available at the time I made the circuit.
In addition, all the wires leading to the can will all be part of one bundle of wires (I didn't clearly indicate it as such). There will be a total of 6 wires: two are needed for the wire to supply power to the heater and will have a LEMO connector on the rack end and two are needed for each temperature sensor, which will be attached to the board directly on the rack end.
Also, we don't need two voltage regulators for each temperature circuit. We can just have one of each of LM7815 and LM7915 to supply +/- 15V to the boards.
I've updated the sketches and added in front panels for the seismometer block and the 1U panel (attachments 3 and 4). There was an issue when it came to the panel on the block because the hole is only big enough for the cable that already exists there and there is no space to add in the D-9 connector. Not quite sure how to resolve this issue. Attachment 7 is the current panel on the seismometer block. Attachments 5 and 6 are the updated temperature circuit and the heater circuit.
The boxes will be located in the short racks at EX and EY to minimize cable length.
I've attached the final sketch for the panel on the granite block.
I've attached a sketch of how the panel will be mounted. We should make a small rectangular box that would raise the panel from the block by 1 cm or so to allow the cables to fit into the hole in the block without getting bent. It also has to be airtight so maybe having a thin layer of rubber between the mount and block would be good.
I've added in the dimensions to my sketch.
It seems like placing the two connectors right next to each other would allow both cables to just barely go through the hole in the block.
Can you please add dimensions to the drawing, so we can see if things fit and what the cable lengths need to be?
For the panel on the granite slab, we should use a thinner piece of metal and mount it with an offset so that the D-sub cable can be fished through the hole in the slab. The hole is wide enough for 2 cables, but not 2 connectors.
since we're just going from the short rack (not the tall rack) to the seismometer, can't we use a cable shorter than 45' ?
the panel should be completely replaced like I described. We don't want to try to squeeze it in artificially and torque the wires. It just needs to be separated from the slab by a few more cm.
If we lay the cable along the floor then it should be around 6' to the current setup and about 20' to the actual seismometer.
Edit: 16 gauge wire should be good.
Attached is a 8-day minute trend of the heater control signals, as well as the in-loop temperature sensor (which underestimates the true fluctuations; we really need an out-of-loop sensor attached to the can or seismometer).
You can see that since the last tuning (on the 13th), it's been stable at the set point of 39 C with 8.5 - 10 W of heating power. We need to add the PID loop settings (all the sliders on the MEDM screen) to the frames so that they can help in diagnosing. Also, fix the spelling of "Celcisususs".