ID | Date | Author | Type | Category | Subject
3962 | Mon Nov 22 12:00:18 2010 | josephb | Update | CDS | Updated Computer Restart Procedures for FB
I've updated the Computer Restart Procedures page in the wiki with the latest fb restart procedure.
To restart just the daqd (frame builder) process, do:
1) telnet fb 8088
2) shutdown
The init process will take care of the rest and restart daqd automatically.
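The two steps above could be wrapped in a small script (the Plan below even asks for one). A minimal sketch, assuming netcat (`nc`) is available on the workstation; `restart_daqd` is a hypothetical name, not an existing script:

```shell
#!/bin/bash
# Hypothetical helper for the restart procedure above; "restart_daqd"
# is an assumed name. Sends "shutdown" to daqd's telnet port via nc
# instead of an interactive telnet session; init then restarts daqd.
restart_daqd() {
    local host=${1:-fb} port=${2:-8088}
    printf 'shutdown\n' | nc "$host" "$port"
}
```

Usage would be `restart_daqd fb 8088`, matching steps 1 and 2 above.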
Background:
Plan:
- Check the wiring after SOS Coil Driver Module and circuit around SDSEN
- Check whitening and dewhitening filters. We connected a binary output cable, but haven't checked them yet.
- Make a script for step 2
- Activate new DAQ channels for ETMX (what is the current new fresh up-to-date latest fb restart procedure?)
3963 | Mon Nov 22 13:16:52 2010 | josephb | Summary | CDS | CDS Plan for the week
CDS Objectives for the Week:
Monday/Tuesday:
1) Investigate ETMX SD sensor problems
2) Fully check out the ETMX suspension and get that to a "green" state.
3) Look into cleaning up target directories (merge old target directory into the current target directory) and update all the slow machines for the new code location.
4) Clean up GDS apps directory (create link to opt/apps on all front end machines).
5) Get Rana his SENSOR, PERROR, etc channels.
Tuesday/Wednesday:
3) Install LSC IO chassis and necessary cabling/fibers.
4) Get LSC computer talking to its remote IO chassis
Wednesday:
5) If time, connect and start debugging Dolphin connection between LSC and SUS machines
3964 | Mon Nov 22 16:16:04 2010 | josephb | Update | CDS | Did an SVN update on the CDS code
Problem:
The CDS oscillator part doesn't work inside subsystems.
Solution:
Rolf checked in an older version of the CDS oscillator which includes an input (which you just connect to a ground). This makes the parser work properly so you can build with the oscillator in a subsystem.
So I did an SVN checkout and confirmed that the custom changes we have here were not overwritten.
Edit:
Turns out the latest svn version requires new locations for certain codes, such as EPICS installs. I reverted back to version 2160, which is just before the new EPICS and other rtapps directory locations, but late enough to pick up the temporary fix to the CDS oscillator part. |
3965 | Mon Nov 22 17:48:11 2010 | josephb | Update | CDS | c1iscex is not seeing its Binary Output card
Problem:
c1iscex does not even see its 32 channel Binary output card. This means we have no control over the state of the analog whitening and dewhitening filters. The ADC, DAC, and the 1616 Binary Input/Output cards are recognized and working.
Things tried:
Tried recreating the IOP code from the known working c1x02 (from the c1sus front end), but that didn't help.
Checked seating of the card, but it seems correctly socketed and tightened down nicely with a screw.
Tomorrow will try moving cards around and see if there's an issue with the first slot, which the Binary Output card is in.
Current Status:
The ETMX is currently damping, including POS, PIT, YAW and SIDE degrees of freedom. However, the gds screen is showing a 0x2bad status for the c1scx front end (the IOP seems fine with a 0x0 status). So for the moment, I can't seem to bring up c1scx testpoints. I was able to do so earlier when I was testing the status of the binary outputs, so during one of the rebuilds, something broke. I may have to undo the SVN update and/or a change made by Alex today to allow for longer filter bank names beyond 19 characters. |
3974 | Tue Nov 23 10:53:20 2010 | josephb | Update | CDS | timing issues
Problem:
Front ends seem to be experiencing a timing issue. I can visibly see a difference in the GPS time ticks between models running on c1ioo and c1sus.
In addition, the fb is reporting a 0x2bad to all front ends. The 0x2000 means a mismatch in config files, but the 0xbad indicates an out of sync problem between the front ends and the frame builder.
Plan:
As there are plans to work on the optic tables today and the suspensions are still damping, we are holding off on working on the problem until this afternoon/evening. It does mean the RFM connections are not available.
At that point I'd like to do a reboot of the front ends and framebuilder and see if they come back up in sync or not. |
3975 | Tue Nov 23 11:20:30 2010 | josephb | Update | CDS | Cleaning up old target directory
Winter Cleaning:
I cleaned up the /cvs/cds/caltech/target/ directory of all the random models we had built over the last year, in preparation for the move of the old /cvs/cds/caltech/target slow control machine code into the new /opt/rtcds/caltech/c1/target directories.
I basically deleted all the directories generated by the RCG code that were put there, including things like c1tst, c1tstepics, c1x00, c1x00epics, and so forth. Pre-RCG era code was left untouched. |
3978 | Tue Nov 23 16:55:14 2010 | josephb | Update | CDS | Updated apps
Updated Apps:
I created a new setup script for the newest build of the gds tools (DTT, foton, etc), located in /opt/apps (which is a soft link from /cvs/cds/apps) called gds-env.csh.
This script is now sourced by cshrc.40m for linux 64 bit machines. In addition, the control room machines have a soft link in the /opt directory to the /cvs/cds/apps directory.
So now when you type dtt or foton, it will bring up the Centos compiled code Alex copied over from Hanford last month. |
3994 | Tue Nov 30 12:10:44 2010 | josephb | Update | elog | Elog restarted again
The elog seemed to be down at around 12:05pm. I waited a few minutes to see if the browser would connect, but it did not.
I used the existing script in /cvs/cds/caltech/elog/ (as opposed to Zach's new one in elog/elog-2.8.0/), which also seems to have worked fine. |
3995 | Tue Nov 30 12:25:08 2010 | josephb | Update | CDS | LSC computer to chassis cable dead
Problem:
We seem to have a broken fiber link between the LSC and its IO chassis. It is unclear to me when this damage occurred. The cable had been sitting in a box with styrofoam padding, and the kink is in the middle of the fiber, with no other obvious damage nearby. The cable may have previously been used by the people in Downs for testing and been damaged then, or we may have caused the kink while stringing it.
Tried Solutions:
I talked to Alex yesterday, and he suggested unplugging the power on both the computer and the IO chassis completely, then plugging in the new fiber connector, as he had to do that once with a fiber connection at Hanford. We tried this this morning, however, still no joy. At this point I plan to toss the fiber as I don't know of any way to rehabilitate kinked fibers.
Note this means that I rebooted c1sus and then did a burt restore from the Nov/30/07:07 directory for c1susepics, c1rmsepics, c1mcsepics. It looks like all the filters switched on.
Current Plan:
We do, however, have a Dolphin fiber which was originally intended to go between the LSC and its IO chassis, before Rolf was told it doesn't work well that way. Instead, we were going to connect the LSC machine to the rest of the network via Dolphin.
We can put the LSC machine next to its chassis in the LSC rack, and connect the chassis to the rest of the front ends by the Dolphin fiber. In that case we just need the usual copper style cable going between the chassis and the computer.
3999 | Tue Nov 30 16:02:18 2010 | josephb | Update | CDS | status
Issues:
1) Turns out the /opt/rtcds/caltech/c1/target/gds/param/testpoint.par file had been emptied or deleted at one point, and the only entry in it was c1pem. This had been causing us a lack of test points for the last few days. It is unclear when or how this happened. The file has been fixed to include all the front end models again. (Fixed)
2) Alex and I worked on tracking down why there's a GPS difference between the front ends and the frame builder, which is why we see a 0x4000 error on all the front end GDS screens. This involved several rebuilds of the front end codes and reboots of the machines involved. (Broken)
3) Still working on understanding the RFM communication problem, which I think is related to the timing issues we're seeing. I know the data is being transferred on the card, but it seems to be rejected after being read in, suggesting a time stamp mismatch. (Broken)
4) The c1iscex binary output card still doesn't work. (Broken)
Plan:
Alex and I will be working on the above issues tomorrow morning.
Status:
Currently, the c1ioo, c1sus and c1iscex computers are running with their front ends. They all still have the 0x4000 error, but you can still look at channels in dataviewer, for example. There is, however, a possibility of inconsistent timing between computers (although all models on a single computer will be in sync).
All the front ends were burt restored to 07:07 this morning. I spot checked several optic filter banks and they look to have been turned on. |
4009 | Fri Dec 3 15:37:10 2010 | josephb | Update | CDS | fb, front ends fixed - tested RFM between c1ioo and c1iscex
Problem:
The front ends and fb computers were unresponsive this morning.
This was due to the fb machine having its ethernet cable plugged into the wrong input. It should be plugged into the port labeled 0.
Since all the front end machines mount their root partition from fb, this caused them to also hang.
Solution:
The cable has been relabeled to "fb" on both ends, and plugged into the correct jack. All the front ends were rebooted.
Testing RFM for green locking:
I tested the RFM connection between c1ioo and c1scx. Unfortunately, on the first test, it turns out the c1ioo machine had its gps time off by 1 second compared to c1sus and c1iscex. A second reboot seems to have fixed the issue.
However, it bothers me that the code didn't come up with the correct time on the first boot.
The test was done using the c1gcv model and by modifying the c1scx model. At the moment, the MC_L channel is being passed to the MC_L input of the ETMX suspension. In the final configuration, this will be a properly shaped error signal from the green locking.
The MC_L signal is currently not actually driving the optic, as the ETMX POS MATRIX currently has a 0 for the MC_L component. |
4014 | Mon Dec 6 11:59:41 2010 | josephb | Update | CDS | New c1lsc computer moved to lsc rack
Computer moved:
The c1lsc computer has been moved over to the 1Y3 rack, just above the c1lsc IO chassis.
It will talk to the c1sus computer via a Dolphin PCIe reflected memory card. The cards were installed into c1lsc and c1sus this morning.
It will talk to its IO chassis via the usual short IO chassis cable.
To Do:
The Dolphin fiber still needs to be strung between c1sus and c1lsc.
The DAQ cable between c1lsc and the DAQ router (which lets the frame builder talk directly with the front ends) also needs to be strung.
c1lsc needs to be configured to use fb as a boot server, and the fb needs to be configured to handle the c1lsc machine. |
4015 | Mon Dec 6 16:49:43 2010 | josephb | Update | CDS | c1lsc halfway to working
C1LSC Status:
The c1lsc computer is running Gentoo off of the fb server. It has been connected to the DAQ network and is handling mx_streams properly (so we're not flooding the network with error messages like we used to with c1iscex). It is using the old c1lsc IP address (192.168.113.62). It can be ssh'd into.
However, it is not talking properly to the IO chassis. The IO chassis turns on when the computer turns on, but the host interface board in the IO chassis only has 2 red lights on (as opposed to many green lights on the host interface boards in the c1sus, c1ioo, and c1iscex IO chassis). The c1lsc IO processor (called c1x04) doesn't see any ADCs, DACs, or Binary cards. The timing slave is receiving 1PPS and is locked to it, but because the chassis isn't communicating, c1x04 is running off the computer's internal clock, causing it to be several seconds off.
Need to investigate why the computer and chassis are not talking to each other.
General Status:
The c1sus and c1ioo computers are not talking properly to the frame builder. A reboot of c1iscex fixed the same problem earlier; however, as Kiwamu and Suresh are working in the vacuum, I'm leaving those computers alone for the moment. A reboot and burt restore probably should be done later today for c1sus and c1ioo.
Current CDS status table (column headings only; the status cell values did not survive export): MC damp | dataviewer | diaggui | AWG | c1ioo | c1sus | c1iscex | RFM | Dolphin RFM | Sim.Plant | Frame builder | TDS
4020 | Tue Dec 7 16:09:53 2010 | josephb | Update | CDS | c1iscex status
I swapped out the IO chassis which could only handle 3 PCIe cards with another chassis which has space for 17, but which previously had timing issues. A new cable going between the timing slave and the rear board seems to have fixed the timing issues.
I'm hoping to get a replacement PCI extension board which can handle more than 3 cards this week from Rolf and then eventually put it in the Y-end rack. I'm also still waiting for a repaired Host interface board to come in for that as well.
At this point, RFM is working to c1iscex, but I'm still debugging the binary outputs to the analog filters. As of this time they are not working properly: turning the digital filters on and off seems to have no effect on the transfer function measured from an excitation in SUSPOS, all the way around to IN1 of the sensor inputs (but before the digital filters). Ideally I should see a difference when I switch the digital filters on and off (since the analog ones should also switch on and off), but I do not. |
4025 | Wed Dec 8 12:26:56 2010 | josephb | Update | CDS | megatron set up - as a test front end
[josephb, Osamu]
Megatron Setup:
To show Osamu how to set up a front end as well as provide a test computer for Osamu's use, we used the new megatron (Sun Fire X4600 with 16 cores and 8 gigabytes of memory) as a front end without an IO chassis.
The steps we followed are in the wiki, here.
The new megatron's IP address is 192.168.113.209. It is running the c1x99 front end code. |
4028 | Wed Dec 8 14:51:09 2010 | josephb | Update | CDS | c1pem now recording data
Problem:
c1pem model was reporting all zeros for all the PEM channels.
Solution:
Twofold. On the software end, I added ADCs 0, 1, and 2 to the model. ADC 3 was already present and is the actual ADC taking in PEM information.
There is a known problem, noted a while back by Alex and Rolf, with the way the DACs and ADCs are numbered internally in the code. Missing ADCs or DACs prior to the one you're actually using can cause problems.
At some point that problem should be fixed by the CDS crew, but for now, always include all ADCs and DACs up to and including the highest numbered ADC/DAC you need to use for that model.
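The workaround above amounts to requiring that a model's card numbers have no gaps below the highest one used. A small sanity check could sketch that rule; `check_contiguous` is a hypothetical helper, not part of the RCG:

```shell
#!/bin/bash
# Hypothetical check: given a model's card numbers, print MISSING:<n>
# for every gap between 0 and the highest number used, per the
# "include all lower-numbered ADCs/DACs" workaround above.
check_contiguous() {
    local max=0 idx i
    for idx in "$@"; do
        [ "$idx" -gt "$max" ] && max=$idx
    done
    for ((i = 0; i <= max; i++)); do
        printf '%s\n' "$@" | grep -qx "$i" || echo "MISSING:$i"
    done
}
```

For example, a model using only ADC 3 would be flagged for the missing ADCs 0, 1, and 2.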
On the physical end, I checked the AA filter chassis and found the power was not plugged in. I plugged it in.
Status:
We now have PEM channels being recorded by the FB, which should make Jenne happier. |
4029 | Wed Dec 8 17:05:39 2010 | josephb | Update | CDS | Put in dolphin fiber between c1sus and c1lsc
[josephb,Suresh]
We put in the fiber for use with the Dolphin reflected memory between c1sus and c1lsc (rack 1X4 to rack 1Y3). I still need to setup the dolphin hub in the 1X4 rack, but once that is done, we should be able to test the dolphin memory tomorrow. |
4046 | Mon Dec 13 17:18:47 2010 | josephb | Update | CDS | Burt updates
Problem:
Autoburt wouldn't restore settings for front ends on reboot
What was done:
First I moved the burt directory over to the new directory structure.
This involved moving /cvs/cds/caltech/burt/ to /opt/rtcds/caltech/c1/burt.
Then I updated the burt.cron file in the new location, /opt/rtcds/caltech/c1/burt/autoburt/, so that it points to the new autoburt.pl script.
I created an autoburt directory in the /opt/rtcds/caltech/c1/scripts directory and placed the autoburt.pl script there.
I modified the autoburt.pl script so that it pointed to the new snapshot location. I also modified it so it updates a directory called "latest" located in the /opt/rtcds/caltech/c1/burt/autoburt directory. In there is a set of soft links to the latest autoburt backup.
Lastly, I edited the crontab on op340m (using crontab -e) to point to the new burt.cron file in the new location.
The "latest" directory was the easiest solution, since the start script is just a simple bash script and I couldn't think of a quick and easy way to have it navigate the snapshots directory reliably.
I then modified the Makefile located in /opt/rtcds/caltech/c1/core/advLigoRTS/ which actually generates the start scripts, to point at the "latest" directory when doing restores. Previously it had been pointing to /tmp/ which didn't really have anything in it.
So in the future, when building code, it should point to the correct snapshots now. Using sed I modified all the existing start scripts to point to the latest directory when grabbing snapshots.
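The sed edit on the existing start scripts was presumably along these lines (a sketch; the exact substitution is an assumption based on the /tmp/ and "latest" paths mentioned above, and `point_at_latest` is an illustrative name):

```shell
#!/bin/bash
# Sketch: rewrite the snapshot path in one start script, in place.
# Old (/tmp/) and new (autoburt/latest/) paths are from the entry text.
point_at_latest() {
    sed -i 's|/tmp/|/opt/rtcds/caltech/c1/burt/autoburt/latest/|g' "$1"
}
```

Run over each start script in turn, this repoints the restore step at the "latest" soft-link directory.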
Future:
According to Keith's directory documentation (see T1000248), the burt restores should live in the individual target system directory, i.e. /target/c1sus/burt, /target/c1lsc/burt, etc. This is a distinctly different paradigm from what we've been using in the autoburt script, and would require a fairly extensive rewrite of that script to handle properly. For the moment I'm keeping the old style, everything in one directory by date. It would probably be worth discussing if and how to move over to the new system. |
4053 | Tue Dec 14 11:24:35 2010 | josephb | Update | CDS | burt restore
I had updated the individual start scripts, but forgotten to update the rc.local file on the front ends to handle burt restores on reboot.
I went to the fb machine and into /diskless/root/etc/ and modified the rc.local file there.
Basically in the loop over systems, I added the following line:
/opt/epics-3.14.9-linux/base/bin/linux-x86/burtwb -f /opt/rtcds/caltech/c1/burt/autoburt/latest/${i}epics.snap -l /opt/rtcds/caltech/c1/burt/autoburt/logs/${i}epics.log.restore -v
The ${i} gets replaced with the system name in the loop (c1sus, c1mcs, c1rms, etc)
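For reference, the burtwb invocation can be parameterized per system like this (a sketch; `burt_restore_cmd` is a hypothetical helper used only to illustrate how ${i} is substituted, not a script that exists on the front ends):

```shell
#!/bin/bash
# Sketch: build the burtwb command line from the rc.local loop for one
# system name; paths are copied from the entry above.
burt_restore_cmd() {
    local i=$1
    echo "/opt/epics-3.14.9-linux/base/bin/linux-x86/burtwb" \
         "-f /opt/rtcds/caltech/c1/burt/autoburt/latest/${i}epics.snap" \
         "-l /opt/rtcds/caltech/c1/burt/autoburt/logs/${i}epics.log.restore -v"
}
```

So `burt_restore_cmd c1sus` yields the c1susepics.snap restore line, and rc.local just runs the equivalent command for each system in its loop.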
4057 | Wed Dec 15 13:36:44 2010 | josephb | Update | CDS | ETMY IO chassis update
I gave Alex a sob story over lunch about having to go and try to resurrect dead VME crates. He and Rolf then took pity on me and handed me their last host interface board from their test stand, although I was warned by Rolf that this one (the latest generation board from One Stop) seems to be flakier than previous versions, and may require reboots if it starts in a bad state.
Anyways, with this in hand I'm hoping to get c1iscey damping by tomorrow at the latest. |
4060 | Wed Dec 15 17:21:20 2010 | josephb | Update | CDS | ETMY controls status
Status:
The c1iscey was converted over to be a diskless Gentoo machine like the other front ends, following the instructions found here. Its front end model, c1scy, was copied and appropriately changed from the c1scx model, along with the filter banks. A new IOP, c1x05, was created and assigned to c1iscey.
The c1iscey IO chassis had the small 4 PCI slot board removed and a large 17 PCI slot board put in. It was repopulated with an ADC/DAC/BO and RFM card. The host interface board from Rolf was also put in.
On start up, the IOP process did not see or recognize any of the cards in the IO chassis.
Four reboots later, the IOP code had seen the ADC/DAC/BO/RFM card once. And on that reboot, there was a time out on the ADC which caused the IOP code to exit.
In addition to not seeing the PCI cards most of the time, several cables still need to be put together for plugging into the adapter boards, and a box needs to be made for the DAC adapter electronics.
4064 | Thu Dec 16 10:52:42 2010 | josephb | Update | Cameras | New PoE digital cameras
We have two new Basler acA640-100gm cameras. These are power over ethernet (PoE) and very tiny. |
Attachment 1: basler.jpg
4082 | Tue Dec 21 11:52:58 2010 | josephb | Update | Computers | RGA scripts fixed, c0rga fixed
c0rga apparently had a full hard drive. There was a 1 GB log file in the /var/log directory, called Xorg.0.log.old, which I deleted; this freed up about 20% of the hard drive and let me modify the crontab file (which previously had been complaining about no room on disk to make edits).
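For future disk-full hunts, the offending file can be found with standard du/sort; `largest_files` here is just an illustrative wrapper, not a script on c0rga:

```shell
#!/bin/bash
# Illustrative helper: list the n largest entries under a directory,
# biggest first (sizes are du's 1K blocks; errors suppressed).
largest_files() {
    du -a "$1" 2>/dev/null | sort -rn | head -n "$2"
}
```

Something like `largest_files /var/log 5` would have surfaced Xorg.0.log.old immediately.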
I updated the crontab to look at the new scripts location, updated the RGA script itself to write to the new log location, and then created a soft link in the /opt directory to /cvs/cds/rtcds on c0rga.
The RGA script should now be running again once a day. |
4097 | Fri Dec 24 09:01:33 2010 | josephb | Update | CDS | Borrowed ADC
Osamu has borrowed an ADC card from the LSC IO chassis (which currently has a flaky generation 2 Host interface board). He has used it to get his temporary Dell test stand running daqd successfully as of yesterday.
This is mostly a note to myself so I remember this in the new year, assuming Osamu hasn't replaced the evidence by January 7th. |
4132 | Tue Jan 11 11:19:13 2011 | josephb | Summary | CDS | Storing FE harddrives down Y arm
Lacking a better place, I've chosen the cabinet down the Y arm which had ethernet cables and various VME cards as a location to store some spare CDS computer equipment, such as harddrives. I've added (or will add in 5 minutes) a label "FE COMPUTER HARD DRIVES" to this cabinet. |
4135 | Tue Jan 11 14:05:11 2011 | josephb | Update | Computers | Martian host table updated daily
I created two simple cron jobs, one running on linux1 and one running on nodus, to produce an updated copy of the martian host table linkable from the wiki every day.
The scripts live in /opt/rtcds/caltech/c1/scripts/AutoUpdate/. One is called updateHostTable.cron and runs on linux1 every day at 4 am; the other is called moveHostTable.cron and runs on nodus every day at 5 am.
The new link has been added to the Martian Host table wiki page here.
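The crontab entries presumably look something like this (the times are from the entry above; the exact invocation is an assumption):

```
# On linux1 (via crontab -e), assumed form:
0 4 * * * /opt/rtcds/caltech/c1/scripts/AutoUpdate/updateHostTable.cron
# On nodus:
0 5 * * * /opt/rtcds/caltech/c1/scripts/AutoUpdate/moveHostTable.cron
```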
4136 | Tue Jan 11 16:04:17 2011 | josephb | Update | CDS | Script to update web views of models for all installed front ends
josephb | Update | CDS | Script to update web views of models for all installed front ends |
I wrote a new script that is in /opt/rtcds/caltech/c1/scripts/AutoUpdate/ called webview_simlink_update.m.
This m-file when run in matlab will go to the /opt/rtcds/caltech/c1/target directory and for each c1 front end, generate the corresponding webview files for that system and place them in the AutoUpdate directory.
Afterwards the files can be moved on Nodus to the /users/public_html/FE/ directory with:
mv /opt/rtcds/caltech/c1/scripts/AutoUpdate/*slwebview* /users/public_html/FE/
This was run today, and the files can be viewed at:
https://nodus.ligo.caltech.edu:30889/FE/
Long term, I'd like to figure out a way of automating this to produce automatically updated screens without having to run it manually. However, simulink seems to stubbornly require an X window to work. |
4144 | Wed Jan 12 17:50:21 2011 | josephb | Update | CDS | Worked on c1lsc, MC2 screens
[josephb, osamu, kiwamu]
We worked over by the 1Y2 rack today, trying to debug why we didn't get any signal to the c1lsc ADC.
We turned off the power to the rack several times while examining cards, including the whitening filter board, AA board, and the REFL 33 demod board. I will note, I incorrectly turned off power in the 1Y1 rack briefly.
We noticed a small wire on the whitening filter board in the channel 5 path. Rana suggested this was part of a fix for channels 4 and 5 having too much cross talk. A trace was cut and this jumper added to fix that particular problem.
We confirmed we could pass signals through each individual channel on the AA and whitening filter boards. When we put them back in, we did notice a large offset when the inputs were not terminated. After terminating all inputs, values at the ADC were reasonable, measuring from 0 to about -20 counts. We applied a 1 Hz, 0.1 Vpp signal and confirmed we saw the digital controls respond back with the correct sine wave.
We examined the REFL 33 demod board and confirmed it would work for demodulating 11 MHZ, although without tuning, the I and Q phases will not be exactly 90 degrees apart.
The REFL 33 I and Q outputs have been connected to the whitening board's 1 and 2 inputs, respectively. Once Kiwamu adds appropriate LO and PD signals to the REFL 33 demod board, he should be able to see the resulting I and Q signals digitally on the PD1 I and Q channels.
In an unrelated fix, we examined the suspensions screens, specifically the Dewhitening lights. Turns out the lights were still looking at SW2 bit 7 instead of SW2 bit 5. The actual front end models were using the correct bit (21 which corresponds to the 9th filter bank), so this was purely a display issue. Tomorrow I'll take a look at the binary outputs and see why the analog filters aren't actually changing.
4150 | Thu Jan 13 14:21:13 2011 | josephb | Update | CDS | Webview of front end model files automated
After Rana pointed me to Yoichi's MEDM snapshot script, I learned how to use Xvfb, which is what Yoichi used to write screens without a real screen. With this I wrote a new cron script, which I added to Mafalda's cron tab to be run once a day at 6am.
The script is called webview_update.cron and is in /opt/rtcds/caltech/c1/scripts/AutoUpdate/.
#!/bin/bash
DISPLAY=:6
export DISPLAY
# Check if an Xvfb server is already running on this display
pid=`ps -eaf | grep vfb | grep $DISPLAY | awk '{print $2}'`
if [ -n "$pid" ]; then
    echo "Xvfb already running [pid=${pid}]" >/dev/null
else
    # Start Xvfb and remember its PID
    echo "Starting Xvfb on $DISPLAY"
    Xvfb $DISPLAY -screen 0 1600x1200x24 >&/dev/null &
    pid=$!
fi
echo $pid > /opt/rtcds/caltech/c1/scripts/AutoUpdate/Xvfb.pid
sleep 3
# Run the matlab webview generator against the virtual display
/cvs/cds/caltech/apps/linux/matlab/bin/matlab -display :6 -logfile /opt/rtcds/caltech/c1/scripts/AutoUpdate/webview.log -r webview_simlink_update
4151 | Thu Jan 13 16:34:02 2011 | josephb | Update | Computers | 32 bit matlab updated
There was a problem with running the webview report generator in matlab on Mafalda. It complained of not having a spare report generator license to use, even though the report generator was working before and after on other machines such as Rosalba. So I moved the old 32 bit matlab directory from /cvs/cds/caltech/apps/Linux/matlab to /cvs/cds/caltech/apps/Linux/matlab_old. I installed the latest R2010b matlab from IMSS in /cvs/cds/caltech/apps/Linux/matlab, and this seems to have made the cron job work on Mafalda now. |
4152 | Thu Jan 13 16:41:07 2011 | josephb | Update | CDS | Channel names for LSC updated
I renamed most of the filter banks in the c1lsc model. The input filters are now labeled based on the RF photodiode's name, plus I or Q. The last set of filters in the OM subsystem (output matrix) have had the TO removed, and are now sensibly named ETMX, ETMY, etc.
We also removed the redundant filter banks between the LSCMTRX and the LSC_OM_MTRX. There is now only one set, the DARM, CARM, etc ones.
The webview of the LSC model can be found here. |
4157 | Fri Jan 14 17:13:39 2011 | josephb | Update | Cameras | Pylon driver for Basler Cameras installed on Megatron
After getting some help from the Basler technical support, I was directed to the following ftp link:
ftp://Pylon4Linux-ro:h50UZgkl@ftp.baslerweb.com
I went to the pylon 2.1.0 directory and downloaded the pylon-2.1.0-1748-bininst-64.tar.bz2 file. Inside of this tar file was another one called pylon-bininst-64.tar.bz2 (along with some other sample programs). I ran tar -jxf on pylon-bininst-64.tar.bz2 and placed the results into the /opt/pylon directory. It produced a directory of includes, libraries and binaries there.
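The unpack step generalizes to something like the following (a sketch; `unpack_pylon` is an illustrative name, and the destination directory argument stands in for /opt/pylon):

```shell
#!/bin/bash
# Sketch of the extract-into-destination step described above:
# untar a bzipped tarball into a (possibly new) directory.
unpack_pylon() {
    local tarball=$1 dest=$2
    mkdir -p "$dest"
    tar -jxf "$tarball" -C "$dest"
}
```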
After playing around with the make files for several sample programs they provided, I finally have been able to compile them. In several places I had to have the make files point to /opt/pylon/lib64 rather than /opt/pylon/lib. I'll be testing the camera with these programs on Monday. I'd also like to see if this particular distribution will work on CentOS machines. There are some comments in one of the INSTALL help files suggesting packages needed for an install on Fedora 9, which may mean it's possible to get this version working on the CentOS machines. |
4163 | Mon Jan 17 15:31:50 2011 | josephb | Update | Cameras | Test the Basler acA640-100gm camera
The Basler acA640-100gm is a power over ethernet camera. It uses a power injector to supply power over an ethernet cable to the camera. Once I got past some initial IP difficulties, the camera worked fine out of the box.
You need to set some environment variables first, so the code knows where its libraries are located.
setenv PYLON_ROOT /opt/pylon
setenv GENICAM_ROOT_V1_1 /opt/pylon
setenv GENICAM_CACHE /cvs/cds/caltech/users/josephb/xml_cache
setenv LD_LIBRARY_PATH /opt/pylon/lib64:$LD_LIBRARY_PATH
I then run the /opt/pylon/bin/PylonViewerApp
Notes on IP:
Initially, you need to set the computer connecting to the camera to an ip in the 169.254.0.XXX range. I used 169.254.0.1 on megatron's eth1 ethernet connection. I also set mtu to 9000.
You can then run the IpConfigurator in /opt/pylon/bin/ to change the camera IP as needed. |
Attachment 1: PylonViewer.jpg
4168 | Wed Jan 19 10:31:24 2011 | josephb | Update | elog | Elog restarted again
Elog wasn't responding at around 10 am this morning. I killed the elogd process, then used the restart script. |
4175 | Thu Jan 20 10:15:50 2011 | josephb | Update | CDS | c1scy error
This is caused by an insufficient number of active DAQ channels in the C1SCY.ini file located in /opt/rtcds/caltech/c1/chans/daq/. A quick look (grep -v '#' C1SCY.ini) indicates there are no active channels. Experience tells me you need at least 2 active channels.
Taking a look at the activateDAQ.py script in the daq directory, it looks like the C1SCY.ini file is included, but the loop over optics is missing ETMY. This caused the file to be improperly updated when the activateDAQ.py script was run. I have fixed the C1SCY.ini file (ran a modified version of the activate script on just C1SCY.ini).
I have restarted the c1scy front end using the startc1scy script, and it is currently working.
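The grep check above can be wrapped into a quick counter (a sketch; this simply counts non-comment, non-blank lines, matching the spirit of the check rather than parsing the .ini stanzas, and `count_active_chans` is a hypothetical name):

```shell
#!/bin/bash
# Sketch: count non-comment, non-blank lines in a DAQ .ini file.
# Per the entry above, the front end wants at least 2 active channels.
count_active_chans() {
    grep -v '^#' "$1" | grep -c '[^[:space:]]'
}
```

So `count_active_chans C1SCY.ini` returning 0 reproduces the "Invalid num daq chans = 0" condition quoted below.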
Quote:
Here are the error messages in the dmesg on c1iscey:
[ 39.429002] c1scy: Invalid num daq chans = 0
[ 39.429002] c1scy: DAQ init failed -- exiting
4179 | Thu Jan 20 18:20:55 2011 | josephb | Update | CDS | c1iscex computer and c1sus computer swapped
Since the 1U sized computers don't have enough slots to hold the host interface board, RFM card, and a dolphin card, we had to move the 2U computer from the end to middle to replace c1sus.
We're hoping this will reduce the time associated with reads off the RFM card compared to when it's in the IO chassis. Previous experience on c1ioo shows this change provides about a factor of 2 improvement, with 8 microseconds per read dropping to 4 microseconds per read, per this elog.
So the dolphin card was moved into the 2U chassis, as well as the RFM card. I had to swap the PMC to PCI adapter on the RFM card since the one originally on it required an external power connection, which the computer doesn't provide. So I swapped with one of the DAC cards in the c1sus IO chassis.
But then I forgot to hit submit on this elog entry.............. |
4183 | Fri Jan 21 15:26:15 2011 | josephb | Update | CDS | c1sus broken yesterday and now fixed
[Joe, Koji]
Yesterday's CDS swap of c1sus and c1iscex left the interferometer in a bad state due to several issues.
The first was the need to power down the IO chassis completely when switching computers (I eventually waited for a green LED to stop glowing and then plugged the power back in). I also unplugged and replugged the interface cable between the IO chassis and computer while powered down. This let the computer actually see the IO chassis (previously the host interface card was glowing just red, with no green lights).
Second, the former c1iscex computer, now the new c1sus computer, only has 6 CPUs, not 8 like most of the other front ends. Because it was running 6 models (c1sus, c1mcs, c1rms, c1rfm, c1pem, c1x02) and 1 CPU needed to be reserved for the operating system, 2 models were not actually running (the recycling mirrors and PEM). This meant the recycling mirrors were left swinging uncontrolled.
To fix this I merged the c1rms model with the c1sus model. The c1sus model now controls BS, ITMX, ITMY, PRM, SRM. I merged the filter files in the /chans/ directory, and reactivated all the DAQ channels. The master file for the fb in the /target/fb directory had all references to c1rms removed, and then the fb was restarted via "telnet fb 8088" and then "shutdown".
My final mistake was starting the work late in the day.
So the lesson for Joe is, don't start changes in the afternoon.
Koji has been helping me test the damping and confirm things are really running. We were having some issues with some of the matrix values. Unfortunately I had to add them by hand since the previous snapshots no longer work with the models. |
4194
|
Mon Jan 24 10:39:16 2011 |
josephb | HowTo | DAQ | DAQ Wiki Failure |
Actually both port 8087 and 8088 work to talk to the frame builder. Don't let the lack of a daqd prompt fool you.
Here's putting in the commands:
rosalba:~>telnet fb 8088
Trying 192.168.113.202...
Connected to fb.martian (192.168.113.202).
Escape character is '^]'.
shutdown
0000Connection closed by foreign host.
rosalba:~>date
Mon Jan 24 10:30:59 PST 2011
Then looking at the last 3 lines of restart.log in /opt/rtcds/caltech/c1/target/fb/
daqd_start Fri Jan 21 15:20:48 PST 2011
daqd_start Fri Jan 21 23:06:38 PST 2011
daqd_start Mon Jan 24 10:30:29 PST 2011
So clearly it's talking to the frame builder; it just doesn't have the right formatting for the prompt. If you type "help" at the prompt, you still get all the frame builder commands listed and can try using any of them.
However, I'll edit the DAQ wiki and indicate that 8087 should be used, because of the better formatting of the prompt.
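As a sanity check that a shutdown actually cycled daqd, the restart.log timestamps shown above can be parsed and compared against when you issued the shutdown. A hypothetical sketch (the helper name is mine; the log format follows the lines quoted above):

```python
from datetime import datetime

def latest_daqd_start(log_lines):
    """Return the datetime of the most recent daqd_start entry, or None.

    Assumes lines of the form 'daqd_start Fri Jan 21 15:20:48 PST 2011',
    as in target/fb/restart.log above.
    """
    latest = None
    for line in log_lines:
        if not line.startswith("daqd_start"):
            continue
        fields = line.split()
        # Drop the 'daqd_start' tag and the time zone field ('PST'),
        # which strptime handles inconsistently across platforms.
        stamp = " ".join(fields[1:5] + fields[6:7])
        t = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
        if latest is None or t > latest:
            latest = t
    return latest
```

If `latest_daqd_start()` returns a time later than your shutdown, daqd came back up.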
Quote: |
Apparently, 8087 is the right port. Various elog entries from Joe and Kiwamu say 8087 or 8088. Not sure what's going on here.
After figuring this out, I activated the C1:GCV-XARM_COARSE_OUT_DAQ and C1:GCV-XARM_FINE_OUT_DAQ and set both of them to be recorded at 2048 Hz. We are loading filters and setting gains into these filter modules such that the OUT signals will be calibrated into Hz (that's why we used the OUT instead of the IN1 as there was last night).
|
|
4200
|
Tue Jan 25 15:20:38 2011 |
josephb | Update | CDS | Updated c1rfm model plus new naming convention for RFM/Dolphin |
After sitting down for five minutes and thinking about it, I realized the names I had been using for internal RFM communication were pretty bad: looking at a model didn't tell you where an RFM connection was coming from or going to. To correct my previous mistakes, I'm instituting the following naming convention for reflected memory, PCIe reflected memory (Dolphin), and shared memory names. These names don't actually get used anywhere but the models, and thus don't show up as channel names anywhere else; they are replaced by raw hex memory locations in the actual code through the use of the IPC file (/opt/rtcds/caltech/c1/chans/ipc/C1.ipc). However, it will make understanding the models easier for anyone looking at them or modifying them.
The new naming convention for RFM and Dolphin channels is as follows.
SITE:Sending Model-Receiving Model_DESCRIPTION_HERE
The description should be unique to the data being transferred, and reused if it's the same data. That way, if the data is transferred on to another model, it's easy to identify it as the same information.
The model name should be the .mdl file name, not the subsystem it's a part of. So SCX is used instead of SUS. This makes it easier to track where data is going.
In the unlikely case of multiple receiving models, the name should be of the form SITE:Sending Model-Receiving Model 1-Receiving Model 2_DESCRIPTION_HERE. Separate models with dashes and description words with underscores.
Example:
C1:LSC-RFM_ETMX_LSC
This channel goes from the LSC model (on c1lsc) to the RFM model (on c1sus). It transfers ETMX LSC position feedback. The second LSC may seem redundant until we look at the next channel in the chain.
C1:RFM-SCX_ETMX_LSC
This channel goes from the RFM model to the SCX model (on c1iscex). It contains the same information as the first channel, i.e. ETMX LSC position feedback.
I have updated all the models that had RFM and SHMEM connections, as well as adding all the LSC communication connections to c1rfm. These include c1sus, c1rfm, c1mcs, c1ioo, c1gcv, c1lsc, c1scx, and c1scy. I have not yet built all the models, since I didn't finish the updates until this afternoon; I will build and test the code tomorrow morning.
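To make the convention concrete, here is a hypothetical parser for names of this form. Nothing in the real-time code actually parses these names (the C1.ipc file maps them to memory addresses); this is purely an illustration of the SENDER-RECEIVER_DESCRIPTION structure:

```python
import re

# Illustrative pattern for SITE:SENDER-RECEIVER[-RECEIVER...]_DESCRIPTION
NAME_RE = re.compile(
    r"^(?P<site>[A-Z0-9]+):"
    r"(?P<sender>[A-Z0-9]+)"
    r"-(?P<receivers>[A-Z0-9]+(?:-[A-Z0-9]+)*)"
    r"_(?P<desc>[A-Z0-9_]+)$"
)

def parse_ipc_name(name):
    """Split e.g. 'C1:LSC-RFM_ETMX_LSC' into site/sender/receivers/description."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError("not a valid IPC channel name: %r" % name)
    return {
        "site": m.group("site"),
        "sender": m.group("sender"),
        "receivers": m.group("receivers").split("-"),
        "description": m.group("desc"),
    }
```

Applied to the examples above, C1:LSC-RFM_ETMX_LSC yields sender LSC, receiver RFM, description ETMX_LSC, and C1:RFM-SCX_ETMX_LSC yields the same description with sender RFM and receiver SCX.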
|
4206
|
Wed Jan 26 10:58:48 2011 |
josephb | Update | CDS | Front End multiple crash |
Looking at dmesg on c1lsc, it looks like the model is starting, but then eventually times out due to a long ADC wait.
[ 114.778001] c1lsc: cycle 45 time 23368; adcWait 14; write1 0; write2 0; longest write2 0
[ 114.779001] c1lsc: ADC TIMEOUT 0 1717 53 181
I'm not sure what caused the time out, although there were about 20 messages indicating a failed time stamp read from c1sus (which sends TRX information to c1lsc via the Dolphin connection) before the time out.
Not seeing any other obvious error messages, I killed the dead c1lsc model by typing:
sudo rmmod c1lscfe
I then tried starting just the front end model again by going to the /opt/rtcds/caltech/c1/target/c1lsc/bin/ directory and typing:
sudo insmod c1lscfe.ko
This started up just the FE again (I didn't use the restart script because the EPICS processes were running fine since we had non-white channels). At the moment, c1lsc is running and I see green lights and 0x0 for the FB0 status on the C1LSC_GDS_TP screen.
At this point I'm not sure what caused the timeout. I'll be adding some more troubleshooting steps to the wiki though. Also, c1scx and c1scy probably need a restart to get them properly synced to the framebuilder.
I did a quick test on dataviewer and can see LSC channels such as C1:LSC-TRX_IN1, as well other channels on C1SUS such as BS sensors channels.
Quote: |
STATUS:
- Rebooted c1lsc and c1sus. Restarted fb many times.
- c1sus seems working.
- All of the suspensions are damped / Xarm is locked by the green
- Thermal control for the green is working
- c1lsc is frozen
- FB status: c1lsc 0x4000, c1scx/c1scy 0x2bad
- dataviewer not working
|
|
4208
|
Wed Jan 26 12:04:31 2011 |
josephb | Update | CDS | Explanation of why c1sus and c1lsc models crash when the other one goes down |
So apparently, with the current Dolphin drivers, when one of the nodes goes down (say c1lsc), all the other nodes freeze for up to 20 seconds.
This freeze can push a model over the 60 microsecond cycle limit and is long enough to force the FE to time out. Alex and Rolf have been working with the vendors to get this fixed, since having all your front ends go down because you rebooted a single computer is bad.
[40184.120912] c1rfm: sync error my=0x3a6b2d5d00000000 remote=0x0
[40184.120914] c1rfm: sync error my=0x3a6b2d5d00000000 remote=0x0
[44472.627831] c1pem: ADC TIMEOUT 0 7718 38 7782
[44472.627835] c1mcs: ADC TIMEOUT 0 7718 38 7782
[44472.627849] c1sus: ADC TIMEOUT 0 7718 38 7782
[44472.644677] c1rfm: cycle 1945 time 17872; adcWait 15; write1 0; write2 0; longest write2 0
[44472.644682] c1x02: cycle 7782 time 17849; adcWait 12; write1 0; write2 0; longest write2 0
[44472.646898] c1rfm: ADC TIMEOUT 0 8133 5 7941
The solution for the moment is either to start the computers at exactly the same time, so that the Dolphin network is up before the front ends, or to start the models by hand once the computer is up and Dolphin is running, after the models have timed out. The latter is done by:
sudo rmmod c1SYSfe
sudo insmod /opt/rtcds/caltech/c1/target/c1SYS/bin/c1SYSfe.ko
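A small wrapper for the two commands above might look like this; the function name and its DRY_RUN echo mode are hypothetical, but the module paths follow the ones quoted:

```shell
# Hypothetical helper: restart a front-end model by hand after a
# Dolphin-induced timeout.  Set DRY_RUN=echo to print the commands
# instead of executing them (actually executing requires root).
restart_fe() {
    model="$1"
    target="${RTCDS_TARGET:-/opt/rtcds/caltech/c1/target}"
    ${DRY_RUN:-} sudo rmmod "${model}fe"
    ${DRY_RUN:-} sudo insmod "${target}/${model}/bin/${model}fe.ko"
}

# e.g.  restart_fe c1sus
```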
Alex and Rolf have been working with the vendors to get this fixed, and we may simply need to update our Dolphin drivers. I'm trying to get in contact with them and see if this is the case. |
4212
|
Thu Jan 27 15:16:43 2011 |
josephb | Update | CDS | Updated generate_master_screens.py |
I modified the generate_master_screens.py script in /opt/rtcds/caltech/c1/medm/master/ to handle changing the MCL (and MC_L) listings to ALS for the two ETM suspension screens and associated sub-screens.
The relevant added code is:
custom_optic_channels = ['ETMX',
                         {'MCL':'ALS','MC_L':'ALS'},
                         'ETMY',
                         {'MCL':'ALS','MC_L':'ALS'}]

for index in range(len(custom_optic_channels)/2):
    if optic == custom_optic_channels[index*2]:
        for swap in custom_optic_channels[index*2+1]:
            sed_command = start_sed_string + swap + "/" + custom_optic_channels[index*2+1][swap] + middle_sed_string + optic + file
            os.system(sed_command)
When run, it generates the correctly named C1:SUS-ETMX_ALS channels, and replaces MCL and MC_L with ALS in the matrix screens.
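For what it's worth, the same swap table could be written as a dict keyed by optic, which is a bit easier to extend. This is a hypothetical refactoring, not the code actually in generate_master_screens.py:

```python
# Hypothetical refactoring of the paired-list swap table: one dict entry
# per optic, mapping old channel names to their replacements.
CUSTOM_OPTIC_CHANNELS = {
    'ETMX': {'MCL': 'ALS', 'MC_L': 'ALS'},
    'ETMY': {'MCL': 'ALS', 'MC_L': 'ALS'},
}

def apply_swaps(text, optic):
    """Apply this optic's channel-name substitutions to a string."""
    for old, new in CUSTOM_OPTIC_CHANNELS.get(optic, {}).items():
        text = text.replace(old, new)
    return text
```

Optics with no entry in the dict pass through unchanged, mirroring the original loop's behavior.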
|
4219
|
Fri Jan 28 11:08:44 2011 |
josephb | Update | Green Locking | no transmission of ALS signals |
As you've correctly noted, the source of the C1:GCV-SCX_ETMX_ALS channels is in the c1gcv model. The first 3 letters of the channel name indicate this (GCV).
The destination of this channel is c1scx, the 2nd 3 letters indicate this (SCX). If it passed through the c1rfm model, it would be written like C1:GCV-RFM_ETMX_ALS.
This particular channel doesn't pass through the c1rfm model because the computers these two models run on (c1ioo and c1scx) are directly connected via our old VMIC 5565 RFM cards, and so don't need to go through the c1sus computer. This is in contrast to all communications going to or from the c1lsc machine, which is connected to the c1sus machine only by the Dolphin RFM. The c1rfm model also handles a bunch of RFM reads from the mode cleaner WFS, since each read eats up 3-4 microseconds and I didn't want to slow the c1mcs model by 24 microseconds (it would have been ~50 microseconds before the c1sus/c1scx computer swap).
So basically c1rfm is only used for LSC communications and for some RFM reads for local suspensions on c1sus.
As for the reason we have no transmission, that looks to be a problem on c1ioo's end. I'm also noticing that MCL is not updating on the MC2 suspension screen, and the MC PIT and YAW channels are not changing either, which suggests we're not transmitting properly.
I rebooted the c1ioo machine and then did a burt restore of the c1ioo and c1gcv models. These are now up and running, and I'm seeing both MCL and ALS data being transmitted now.
It's possible that when we were working on the c1gfd (green frequency divider) model on the c1ioo machine we disturbed the RFM communication somehow, although I'm not sure what exactly.
Quote: |
No signal is transmitted from C1:GCV-SCX_ETMX_ALS (on c1gcv) to C1:GCV-SCX_ETMX_ALS (on c1scx)
I can't find RFM definition for ALS channels in c1rfm. Where are they???
|
|
4220
|
Fri Jan 28 12:15:58 2011 |
josephb | Update | CDS | Updating conlog channel list/ working on "HealthCheck" script |
I've updated the scan_adls script (currently located in /cvs/cds/caltech/conlog/bin) to look at the new location of our medm screens. I made a backup of the old conlog channel list as /cvs/cds/caltech/conlog/data/conlog_channels.old-2011-01-28.
I then ran the update_chanlist script in the same directory, which calls the scan_adl script. After about 5 minutes it finished updating the channel list. I restarted the conlogger just to be sure, and checked that our new model channels showed up in the conlog (which they do).
I have added a cron job to the op340m crontab to run the update_conlog script once a day at 7am.
Next, I'm working on a HealthCheck script which looks at the conlog channel list and checks to see if channels are actually changing over short time scales, and then spit back a report on possibly non-functioning channels to the user. |
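The HealthCheck idea above might look something like this in skeleton form. The function name and the dict-of-samples input are hypothetical stand-ins for whatever the real script will pull from the conlog channel list:

```python
def flag_stuck_channels(samples, min_distinct=2):
    """Given {channel: [values over a short window]}, return the channels
    whose value never changed -- candidates for a 'possibly dead' report.

    Hypothetical sketch of the HealthCheck idea; the real script would
    sample channel values over time rather than take a dict.
    """
    stuck = []
    for chan, values in sorted(samples.items()):
        if len(set(values)) < min_distinct:
            stuck.append(chan)
    return stuck
```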
4231
|
Mon Jan 31 10:31:30 2011 |
josephb | Update | WienerFiltering | Improvement in H1 Wiener FF prediction by using weights and taps |
Rossa is a rather beefy machine. It effectively has 8 Intel i7 cores (2.67 GHz each) and 12 GB of RAM. Megatron only has 8 GB of RAM and just 8 Opterons (1 GHz each). Rosalba has 4 Core 2 Quad cores (2.4 GHz) with only 4 GB of RAM.
MC damp | dataviewer | diaggui | AWG | c1ioo | c1sus | c1iscex | RFM | The Dolphins | Sim.Plant | Frame builder | TDS
[status table: the color-coded cells did not survive in this copy] |
4241
|
Wed Feb 2 15:07:20 2011 |
josephb | Update | CDS | activateDAQ.py now includes PEM channels |
[Joe, Jenne]
We modified the activateDAQ.py script to handle the C1PEM.ini file (defining the PEM channels being recorded by the frame builder) in addition to all the optics channels. Jenne will be modifying it further so as to rename more channels. |
4246
|
Thu Feb 3 16:45:28 2011 |
josephb | Update | CDS | General CDS updates |
Updated the FILTER.adl file to have the yellow button moved up, and replaced the symbol in the upper right with a white A with black background. I made a backup of the filter file called FILTER_BAK.adl. These are located in /opt/rtcds/caltech/c1/core/advLigoRTS/src/epics/util.
I also modified the Makefile in /opt/rtcds/caltech/c1/core/advLigoRTS/ so that the startc1SYS scripts it generates take an argument. If you type:
sudo startc1SYS 1
it automatically writes a 1 to the BURT RESTORE channel, so you don't have to open the GDS_TP screen and put a 1 in the box by hand before the model times out.
The scripts also point to the correct burtwb and burtrb files, so they should stop complaining about not finding them, and they now put a time-stamped burt snapshot in the /tmp directory when the kill or start scripts are run. The Makefile was also backed up to Makefile_bak.
|
4247
|
Thu Feb 3 17:25:03 2011 |
josephb | Update | Computers | rsync script was not really backing up /cvs/cds |
So today, after an "rm" error while working with the autoburt.pl script and burt restores in general, I asked Dan Kozak how to actually look at the backup data. He said there's no way to actually look at it at the moment. You can reverse the rsync command or ask him to grab the data/file if you know what you want. However, in the course of this, we realized there was no /cvs/cds data backup.
Turns out, the rsync command line in the script had a "-n" option. This means "do a dry run": everything *but* the actual final copying happens.
I have removed the -n from the script and started it on nodus, so we're backing up as of 5:22pm today.
I'm thinking we should have a better way of viewing the backup data, so I may ask Dan and Stewart about a setup where we can log in and actually look at the backed-up files.
In addition, tomorrow I'm planning to add cron jobs that will commit changes to files in the /chans and /scripts directories to the SVN on a daily basis, since the backup procedure doesn't really provide a history for those, just a one-day-old backup. |
4249
|
Fri Feb 4 13:31:16 2011 |
josephb | Update | CDS | FE start scripts moved to scripts/FE/ from scripts/ |
All start and kill scripts for the front end models have been moved into the FE directory under scripts: /opt/rtcds/caltech/c1/scripts/FE/. I modified the Makefile in /opt/rtcds/caltech/c1/core/advLigoRTS/ to update and place new scripts in that directory.
This was done by using
sed -i 's[scripts/start$${system}[scripts/FE/start$${system}[g' Makefile
sed -i 's[scripts/kill$${system}[scripts/FE/kill$${system}[g' Makefile
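The '[' after the 's' works because sed accepts any character as the substitution delimiter; using one that isn't '/' avoids escaping the slashes in these paths. A throwaway example:

```shell
# sed with '[' as the s-command delimiter, so the '/' in the path
# needs no escaping.
echo 'scripts/startc1sus' | sed 's[scripts/start[scripts/FE/start['
```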
|
4250
|
Fri Feb 4 13:45:25 2011 |
josephb | Update | Computers | Temporarily removed cronjob for rsync.backup |
I removed the rsync backup from nodus' crontab temporarily so as to not have multiple backup jobs running. The job I started yesterday was still running. Hopefully the backup will finish by Monday.
The line I removed was:
0 5 * * * /opt/rtcds/caltech/c1/scripts/backup/rsync.backup
MC damp: blue | dataviewer: green | diaggui: blue | AWG: yellow | c1lsc: orange | c1ioo: yellow | c1sus: blue | c1iscex: yellow | c1iscex: yellow | RFM: blue | The Dolphins: blue | Sim.Plant: red | Frame builder: blue | TDS: orange | Cabling: orange |