CP Stat 100 sheet-covers were replaced by clean ones on open chambers BS, ITMX, ITMY and ETMY this morning.
Try to fold the sheets in such a way that the clean sides face each other, so they do not accumulate dust.
Jessica will soon ELOG about some measurements suggesting that the conductive connector-ized ALS delay line enclosure is the way to go, when considering crosstalk between the delay lines. It is currently mounted and hooked up on the LSC rack, though I need to make a bunch of new SMA cables now that I think a semi-permanent arrangement has been reached.
I did a rough re-calibration of the phase tracker output, since the increased cable delay changes the degree/Hz gain. This was done by fitting a line to a slow sawtooth FM of the SRS DS345's (1Hz rate, 10kHz deviation, 30MHz carrier). This resulted in the following calibration updates
Again, this is a rough calibration. Nevertheless, it is not so surprising that we don't get the 50m/30m = 4.4dB increase we would expect just from the lengths; the (I presume) increased cable loss matters. Also, the frequency dependence of the loss is an additional reason that the phase tracker calibration is not constant over all frequencies.
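As a quick check of the 4.4dB figure, here is a minimal sketch. It assumes the degree/Hz gain scales linearly with cable length and ignores cable loss entirely; the function name is just illustrative.

```python
import math

def length_gain_db(new_length_m, old_length_m):
    """Gain in dB expected if the response scales linearly with cable length (loss ignored)."""
    return 20.0 * math.log10(new_length_m / old_length_m)

print(f"{length_gain_db(50.0, 30.0):.1f} dB")  # prints "4.4 dB"
```

Any measured gain below this number would then be attributed to the extra loss in the longer cable.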
I took spectra with the arms in IR lock, but didn't see any real improvement beyond a possible dip in the floor from 100-200Hz. This doesn't surprise me too much, however, since I don't believe that we are currently dominated by electronic noises that this gain increase would help overcome.
Last week, Koji mentioned the ALS phase noise added due to the post-cavity table motion the arm-transmitted green beams experience before hitting the beat PD. I should estimate the size of this effect for our situation.
Koji had the good idea of trying to measure the motion of the POP beam, and feeding that signal to PRM yaw to stabilize the motion. To facilitate this, I have installed a 50% beam splitter before the POP 110/22 PD (so also before the camera).
Before touching anything, I locked the PRM-ITMY half-cavity so that I had a constant beam at POP. I measured the POP DC OUT to be 58.16 counts. I then installed a 1" 50% BS, making sure (using the 'move card in front of optic while watching camera' technique) that I was not close to clipping on the new BS. I then remeasured POP DC OUT, and found it to be 30.63. I closed the PSL shutter to get the dark value, which was -0.30. This means that, after subtracting the dark value, the POP110/22 PD now sees a factor of 0.53 of the original light. To compensate for this, I changed the values of the power normalization matrix from 0.01 (MICH) to 0.0189, and 100 (PRCL) to 189.
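For the record, the arithmetic behind those numbers can be reproduced as follows. This is only a sketch of the calculation using the counts quoted above; the variable and function names are mine.

```python
def transmitted_fraction(before, after, dark):
    """Fraction of light remaining, with the dark offset subtracted from both readings."""
    return (after - dark) / (before - dark)

frac = transmitted_fraction(before=58.16, after=30.63, dark=-0.30)
mich_norm = 0.01 / frac   # old MICH normalization divided by the remaining fraction
prcl_norm = 100.0 / frac  # old PRCL normalization divided by the remaining fraction
print(f"fraction = {frac:.2f}, MICH = {mich_norm:.4f}, PRCL = {prcl_norm:.0f}")
# prints "fraction = 0.53, MICH = 0.0189, PRCL = 189"
```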
After doing this, I restored the ITMX and am able to get several tens of seconds of PRMI lock (using AS55Q and REFL33I).
I found several QPDs in the PD cabinet down the Y arm, but no readout electronics. The QPD I found is D990272. I don't really want to spend any significant amount of time hacking something for this together, if Valera can provide a QPD with BNC outputs. For now, I have not installed any DC PD or razor blade (which can be a temporary proxy for a QPD, enough to get us yaw information).
Looks like yesterday was particularly noisy. It's unclear to me why the diurnal variation is much more visible in MC1_Y, and why the floor wanders.
The first plot shows 5 days. The second plot shows 20 days.
1) Linearity Test
LO input level was +10dBm. The LO freq was 11MHz and 55MHz for CH1 and CH2 respectively.
The IF frequency was fixed at 10kHz.
The amplitude of the RF input was swept from -50dBm to +15dBm.
Basically, the I and Q outputs of CH1 and CH2 were quite linear over this amplitude range.
2) Frequency Response
RF input was fixed at -20dBm and the IF frequency was swept from 1kHz to 1MHz.
The response was flat up to 100kHz, and there is still sensitivity up to 300kHz.
3) Output noise
Noise floor of the output is ~20nV/rtHz. All of the channels behave in the same way.
The 1/f noise sets in below 100Hz.
I have tested the left 2ch of 4ch demod board.
The left most is for 11MHz, and the next one is for the 55MHz.
The CES Mezzanine is being rebuilt to accommodate our new neighbor: the 20ft high water slide and jacuzzi.
All our AC power transformers are up there. Yesterday we labelled the 480VAC power switch on the mezzanine that we need to keep in order to run the 3 cranes in the lab.
Only the 40m cranes run on 480VAC. The electricians are rewiring this transformer on the mezzanine, so it was shut down.
I tested all three cranes before the 480V power was turned off. The last thing to do with the cranes is to wipe them down before use.
It will happen on next Tuesday morning.
I updated my old 40mUpgrade Optickle model by adding the latest changes to the optical layout (mirror distances, main optics transmissivities, folding mirror transmissivities, etc). I also stripped out a lot of unneeded Advanced LIGO features.
I calculated the expected power in the fields present at the main ports of the interferometer.
I repeated the calculations for both the arms-locked/arms-unlocked configurations. I used a new set of functions that I wrote which let me evaluate the field power and RF power anywhere in the IFO. (all in my SVN directory)
As in Koji's optical layout, I set the arm length to 38m, and I found that at the SP port there was much more power than I would expect at 44MHz and 110MHz.
It's not straightforward to identify unequivocally what is causing it (I have about 100 frequencies going around in the IFO), but presumably the measured power at 44MHz was from the beat between f1 and f2 (55-11=44MHz), and that at 110MHz was from the f2 first sidebands.
Here's what I found:
When I set the arm length to 38.55m (the old 40m average arm length), the power at 44 and 110 MHz went down significantly. See here:
I checked the distances between all the frequencies circulating in the IFO from the closest arm resonance to them.
I found that f2 and 2*f2 are two of the closest frequencies to the arm resonance (~80kHz). With an arm cavity finesse of 450, that shouldn't be a problem, though.
I'll keep using the numbers I got to nail down the culprit.
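To put the ~80kHz offset in context, the arm free spectral range and linewidth follow from the arm length and finesse quoted above. The following is a minimal sketch, assuming a simple two-mirror cavity with the carrier on resonance, so that a sideband's offset from resonance is its modulation frequency modulo the FSR. The function names are illustrative and are not part of the Optickle model.

```python
C_LIGHT = 299_792_458.0  # speed of light, m/s

def free_spectral_range(length_m):
    """FSR of a two-mirror cavity of the given length, in Hz."""
    return C_LIGHT / (2.0 * length_m)

def linewidth_fwhm(length_m, finesse):
    """Full width at half maximum of the cavity resonance, in Hz."""
    return free_spectral_range(length_m) / finesse

def offset_from_resonance(f_mod_hz, length_m):
    """Distance in Hz from f_mod to the nearest integer multiple of the FSR."""
    fsr = free_spectral_range(length_m)
    remainder = f_mod_hz % fsr
    return min(remainder, fsr - remainder)

fsr = free_spectral_range(38.0)
fwhm = linewidth_fwhm(38.0, 450.0)
print(f"FSR = {fsr / 1e6:.2f} MHz, FWHM = {fwhm / 1e3:.1f} kHz")
print(f"80 kHz offset = {80e3 / fwhm:.1f} linewidths")
```

For 38m and a finesse of 450 this gives an FSR of about 3.94MHz and a linewidth of about 8.8kHz, so an 80kHz offset is roughly nine linewidths away from resonance.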
Anyways, now the question is: what is the design length of the arms? Because if it is really 38m rather than 38.55m, then maybe we should change it back to the old values.
1. Give us the designed arm length. What are the criteria?
2. The arm lengths got shorter as the ITMs had to shift toward the ends. Making them longer is difficult. Try possible shorter lengths.
If you have a working 40m Optickle model, put it in a common place in the SVN, not in your own folder.
I can't figure out why changing the arm length would affect the RF sideband levels. If you are getting RF sidebands resonating in the arms, then some parameter is not set correctly.
As the RF sideband frequency gets closer to resonating in the arm, the CARM/DARM cross-coupling to the short DOFs probably gets bigger.
I uploaded the latest iscmodeling package to the SVN under /trunk. It includes my addition of the 40m Upgrade model: /trunk/iscmodeling/looptickle/config40m/opt40mUpgrade2010.m.
I don't know the causes of these supposed resonances yet. I'm working to try to understand that. It would be interesting also to evaluate the results of absolute length measurements.
Here is what I also found:
It seems that 44, 66 and 110 are resonating.
If that is real, then 37.5m could be a better place. Although I don't have a definition of "better" yet. All I can say is these resonances are smaller there.
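As a bookkeeping aid, the lowest-order combinations of the two modulation frequencies that land on 44, 66 and 110 MHz can be listed with a few lines of Python. This is only a sketch using the nominal 11 and 55 MHz values. The real model carries about 100 frequencies, and higher-order products can land on the same values, so this does not identify the culprit by itself.

```python
from itertools import product

F1_MHZ = 11  # nominal f1
F2_MHZ = 55  # nominal f2

def low_order_combinations(target_mhz, max_order=2):
    """Return (n, m) pairs with 0 < |n| + |m| <= max_order and n*f1 + m*f2 == target."""
    pairs = []
    for n, m in product(range(-max_order, max_order + 1), repeat=2):
        order = abs(n) + abs(m)
        if 0 < order <= max_order and n * F1_MHZ + m * F2_MHZ == target_mhz:
            pairs.append((n, m))
    return pairs

for target in (44, 66, 110):
    print(target, low_order_combinations(target))
# 44 [(-1, 1)]   i.e. f2 - f1
# 66 [(1, 1)]    i.e. f2 + f1
# 110 [(0, 2)]   i.e. 2*f2
```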
I have moved the summary pages stuff that Duncan set up to a new directory that is accessible to the nodus web server and is therefore available from the outside world:
which is available at:
I updated the scripts, configurations, and crontab appropriately:
The aLIGO-style summary webpages are now running on 40m data! They are running on megatron so can be viewed from within the martian network at:
At the moment I have configured the 5 seismic BLRMS bands, and a random set of PSL channels taken from a strip tool.
Since there are no segments or triggers for C1, the only data sources are GWF frames. These are mounted from the framebuilder under /frames on megatron. There is a python script that takes in a pair of GPS times and a frame type that will locate the frames for you. This is how you use it to find T type frames (second trends) for May 25 2012:
python /home/controls/public_html/summary/bin/framecache.py --ifo C1 --gps-start-time 1021939215 --gps-end-time 1022025615 --type T -o framecache.lcf
If you don't have GPS times, you can use the tconvert tool to generate them
$ tconvert May 25
$ tconvert May 26
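If tconvert is not at hand, the GPS times used in the framecache.py call above can be reproduced in Python. This is a minimal sketch that hard-codes the GPS-UTC leap second offset, which is an assumption that only holds for dates between 2009-01-01 and 2012-06-30; the function name is mine.

```python
from datetime import datetime, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
# GPS-UTC offset in seconds. 15 is valid from 2009-01-01 to 2012-06-30 only.
LEAP_SECONDS = 15

def utc_to_gps(year, month, day, hour=0, minute=0, second=0):
    """Convert a UTC date/time to GPS seconds, using the fixed offset above."""
    utc = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    return int((utc - GPS_EPOCH).total_seconds()) + LEAP_SECONDS

print(utc_to_gps(2012, 5, 25))  # 1021939215
print(utc_to_gps(2012, 5, 26))  # 1022025615
```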
The available frame types, as far as I'm aware are R (raw), T (seconds trends), and M (minute trends).
The code is designed to be fairly easy to use, with most of the options set in the ini file. The code has three modes - day, month, or GPS start-stop pair. The month mode is a little sketchy so don't expect too much from it. To run in day mode:
python /home/controls/public_html/summary/bin/summary_page.py --ifo C1 --config-file /home/controls/public_html/summary/share/c1_summary_page.ini --output-dir . --verbose --data-cache framecache.lcf -SRQDUTAZBVCXH --day 20120525
Please forgive the large, apparently arbitrary collection of letters. Since the 40m doesn't use segments or triggers, these options disable processing of those elements, and there are quite a few of them. They correspond to --skip-something options in long form. To see all the options, run
python /home/controls/public_html/summary/bin/summary_page.py --help
There is also a convenient shell script that will run over today's data in day mode, doing everything for you. This will run framecache.py to find the frames, then run summary_page.py to generate the results in the correct output directory. To use this, run
Different data tabs are disabled via command line --skip-this-tab style options, but the content of tabs is controlled via the ini file. I'll try to give an overview of how to use these. The only configuration required for the Seismic BLRMS 0.1-0.3 Hz tab is the following section:
[data-Seismic 0.1-0.3 Hz]
channels = C1:PEM-RMS_STS1X_0p1_0p3,C1:PEM-RMS_STS1Y_0p1_0p3,C1:PEM-RMS_STS1Z_0p1_0p3
labels = STS1X,STS1Y,STS1Z
frame-type = R
amplitude-log = True
amplitude-lim = 1,500
amplitude-label = BLRMS motion ($\mu$m/s)
The entries can be explained as follows:
Other compatible options not used in this example are:
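To check a section like the Seismic one before the hourly job picks it up, it can be read with Python's standard configparser module. This is only a sketch of how the entries split into lists and values; how summary_page.py itself parses the file is not shown here, so treat the parsing choices (comma splitting, interpolation turned off) as my assumptions.

```python
import configparser

INI_TEXT = r"""
[data-Seismic 0.1-0.3 Hz]
channels = C1:PEM-RMS_STS1X_0p1_0p3,C1:PEM-RMS_STS1Y_0p1_0p3,C1:PEM-RMS_STS1Z_0p1_0p3
labels = STS1X,STS1Y,STS1Z
frame-type = R
amplitude-log = True
amplitude-lim = 1,500
amplitude-label = BLRMS motion ($\mu$m/s)
"""

# interpolation=None keeps special characters in the label from being interpreted
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(INI_TEXT)
section = parser["data-Seismic 0.1-0.3 Hz"]

channels = section["channels"].split(",")
labels = section["labels"].split(",")
log_scale = section.getboolean("amplitude-log")
ymin, ymax = (float(x) for x in section["amplitude-lim"].split(","))

# One label per channel is what the example section implies
for channel, label in zip(channels, labels):
    print(f"{label}: {channel}")
```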
At the moment a package version issue means the spectrogram doesn't work, but the spectrum should. At the time of writing, to use the spectrum simply add 'plot-dataplot2'.
You can view the configuration file within the webpage via the 'About' link off any page.
Please e-mail any suggestions/complaints/praise to email@example.com.
There is now a job in the crontab that will run the shell wrapper every hour, so the pages _should_ take care of themselves. If you make adjustments to the configuration file they will get picked up on the hour, or you can just run the script by hand at any time.
$ crontab -l
# m h dom mon dow command
0 */1 * * * bash /home/controls/public_html/summary/bin/c1_summary_page.sh > /dev/null 2>&1
Josh Smith, Fabian Magana-Sandoval, Jackie Lee (Fullerton)
Thanks to Jamie and Jenne for the tour and the input on the pages.
We had a look at the GEO summary pages and thought about how best to make a 40m summary page that would eventually become an aLIGO summary page. Here's a rough plan:
- First we'll check that we can access the 40m NDS2 server to get data from the 40m lab in Fullerton.
- We'll make a first draft of a 40m summary page in python, using pynds, and base the layout on the current geo summary pages.
- When this takes shape we'll iterate with Jamie, Jenne, Rana to get more ideas for measurements, layout.
Other suggestions: Jenne is working on an automated noisebudget and suggests having a placeholder for it on the page. We can also incorporate some of the features of Aidan's 40m overview medm screen that's in progress, possibly with different plots corresponding to different parts of the drawing, etc. Jenne will also email us the link to the once-per-hour medm screenshots.
Kiwamu, Alex and Zach are practicing mandatory IR-safety scan at the 40m-PSL
The 40m-specific safety indoctrination was completed.
[Koji / Gautam (Remote)]
sudo /sbin/ifdown eth0
sudo /sbin/ifup eth0
End RTS recovery
rtcds start --all
Vertex RTS recovery
sudo /sbin/ifup eth1
sudo systemctl start modbusIOC.service
sudo /sbin/ifdown eth1
sudo systemctl start modbusIOC.service
RTS recovery ~ part 2
sudo systemctl start open-mx.service
sudo systemctl start mx.service
sudo systemctl start daqd_*
sudo systemctl start MCautolocker.service
sudo systemctl start FSSSlow.service
In the past year, pygwinc has expanded to support not just fundamental noise calculations (e.g., quantum, thermal) but also any number of user-defined noises. These custom noise definitions can do anything, from evaluating an empirical model (e.g., electronics, suspension) to loading real noise measurements (e.g., laser AM/PM noise). Here is an example of the framework applied to H1.
Starting with the BHD review-era noises, I have set up the 40m pygwinc fork with a working noise budget which we can easily expand. Specific actions:
I set up our fork in this way to keep the 40m separate from the main pygwinc code (i.e., not added as a built-in IFO type). With the 40m code all contained within one root-level directory (with a 40m-specific name), we should now always be able to upgrade to the latest pygwinc without creating intractable merge conflicts.
The Kinemetrics dudes are going to visit us @ 1:45 tomorrow (Wednesday) to check out our stacks, seismos, etc.
I put these maps here on the elog since people are always getting lost trying to find the lab.
Santa Ana wind speed was locked around 60 kmph last night on campus. The strongest in 30 years. The lab held up well. We did not lose AC power either.
Trees and windows were blown out and over on campus.
We have 4 sliding glass windows without "heavy-laser proved" inside protection.
We should plan to upgrade ALL sliding glass windows with metal protection from the inside.
Here's another useful link:
Dan Kozak is rsync transferring /frames from NODUS over to the LDAS grid. He's doing this without a BW limit, but even so it's going to take a couple of weeks. If nodus seems pokey or the net connection to the outside world is too tight, then please let me and him know so that he can throttle the pipe a little.
The recently observed daqd flakiness looks related to this transfer. It appears to still be ongoing:
nodus:~>ps -ef | grep rsync
controls 29089 382 5 13:39:20 pts/1 13:55 rsync -a --inplace --delete --exclude lost+found --exclude .*.gwf /frames/trend
controls 29100 382 2 13:39:43 pts/1 9:15 rsync -a --delete --exclude lost+found --exclude .*.gwf /frames/full/10975 131.
controls 29109 382 3 13:39:43 pts/1 9:10 rsync -a --delete --exclude lost+found --exclude .*.gwf /frames/full/10978 131.
controls 29103 382 3 13:39:43 pts/1 9:14 rsync -a --delete --exclude lost+found --exclude .*.gwf /frames/full/10976 131.
controls 29112 382 3 13:39:43 pts/1 9:18 rsync -a --delete --exclude lost+found --exclude .*.gwf /frames/full/10979 131.
controls 29099 382 2 13:39:43 pts/1 9:14 rsync -a --delete --exclude lost+found --exclude .*.gwf /frames/full/10974 131.
controls 29106 382 3 13:39:43 pts/1 9:13 rsync -a --delete --exclude lost+found --exclude .*.gwf /frames/full/10977 131.
controls 29620 29603 0 20:40:48 pts/3 0:00 grep rsync
Diagnosing the problem:
I logged into fb and ran "top". It said that fb was waiting for disk I/O ~60% of the time (according to the "%wa" number in the header). There were 8 nfsd (network file server) processes running with several of them listed in status "D" (waiting for disk). The daqd logs were ending with errors like the following suggesting that it couldn't keep up with the flow of data:
[Wed Oct 22 18:58:35 2014] main profiler warning: 1 empty blocks in the buffer
[Wed Oct 22 18:58:36 2014] main profiler warning: 0 empty blocks in the buffer
GPS time jumped from 1098064730 to 1098064731
This all pointed to the possibility that the file transfer load was too heavy.
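For tracking the I/O wait over time without watching top, the same quantity can be estimated from two readings of the aggregate "cpu" line in /proc/stat. A minimal sketch follows; the sample lines are made-up values for illustration, not readings from fb, and the function name is mine.

```python
def iowait_percent(cpu_line_before, cpu_line_after):
    """Percent of CPU time spent in iowait between two 'cpu' lines from /proc/stat.

    Uses the first eight counters: user nice system idle iowait irq softirq steal.
    """
    before = [int(x) for x in cpu_line_before.split()[1:9]]
    after = [int(x) for x in cpu_line_after.split()[1:9]]
    deltas = [a - b for a, b in zip(after, before)]
    total = sum(deltas)
    # iowait is the fifth counter (index 4)
    return 100.0 * deltas[4] / total if total else 0.0

# Made-up sample values, for illustration only
sample_before = "cpu 1000 0 500 8000 500 0 0 0 0 0"
sample_after = "cpu 1100 0 600 8200 1100 0 0 0 0 0"
print(f"{iowait_percent(sample_before, sample_after):.0f}% iowait")  # prints "60% iowait"
```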
Reducing the load:
The following configuration changes were applied on fb.
Edited /etc/conf.d/nfs to reduce the number of nfsd processes from 8 to 1:
Ran "ionice" to raise the priority of the framebuilder process (daqd):
controls@fb /opt/rtcds/rtscore/trunk/src/daqd 0$ sudo ionice -c 1 -p 10964
And to reduce the priority of the nfsd process:
controls@fb /opt/rtcds/rtscore/trunk/src/daqd 0$ sudo ionice -c 2 -p 11198
I also tried punishing nfsd with an even lower priority ("-c 3"), but that was causing the workstations to lag noticeably.
After these changes the %wa value went from ~60% to ~20%, and daqd seems to die less often, but some further throttling may still be in order.
As part of the fb40m restart procedure (Sanjit and I were restarting it to add some new channels so they can be read by the OAF model), I checked up on how the backup has been going. Unfortunately the answer is: not well.
Alan imparted to me all the wisdom of frame builder backups on September 28th of this year. Except for the first 2 days of something having gone wrong (which was fixed at that time), the backup script hasn't thrown any errors, and thus hasn't sent any whiny emails to me. This is seen by opening up /caltech/scripts/backup/rsync.backup.cumlog , and noticing that after October 1, 2009, all of the 'errorcodes' have been zero, i.e. no error (as opposed to 'errorcode 2' when the backup fails).
However, when you ssh to the backup server to see what .gwf files exist, the last one is at gps time 941803200, which is Nov 9 2009, 11:59:45 UTC. So, I'm not sure why no errors have been thrown, but also no backups have happened. Looking at the rsync.backup.log file, it says 'Host Key Verification Failed'. This seems like something which isn't changing the errcode, but should be, so that it can send me an email when things aren't up to snuff. On Nov 10th (the first day the backup didn't do any backing-up), there was a lot of Megatron action, and some adding of StochMon channels. If the fb was restarted for either of these things, and the backup script wasn't started, then it should have had an error, and sent me an email. Since any time the frame builder's backup script hasn't been started properly it should send an email, I'm going to go ahead and blame whoever wrote the scripts, rather than the Joe/Pete/Alberto team.
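One way to avoid this kind of silent failure is to treat the backup as failed if either the exit code is nonzero or the log contains a known error string. The sketch below is hypothetical and is not the actual backup script; the marker list and the function name are mine.

```python
# Illustrative list of log messages that should count as a failure
FAILURE_MARKERS = ("Host key verification failed", "Permission denied")

def backup_failed(return_code, log_text):
    """Treat the backup as failed if the exit code is nonzero or the log shows a known error."""
    if return_code != 0:
        return True
    lowered = log_text.lower()
    return any(marker.lower() in lowered for marker in FAILURE_MARKERS)

print(backup_failed(0, "Host Key Verification Failed"))        # True
print(backup_failed(0, "sent 1234 bytes  received 56 bytes"))  # False
```

With a check like this, a zero errorcode alongside 'Host Key Verification Failed' in the log would still trigger the whiny email.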
Since our new raid disk is ~28 days of local storage, we won't have lost anything on the backup server as long as the backup works tonight (or sometime in the next few days), because the backup is an rsync, so it copies anything which it hasn't already copied. Since the fb got restarted just now, hopefully whatever funny business (maybe with the .agent files???) will be gone, and the backup will work properly.
I'll check in with the frame builder again tomorrow, to make sure that it's all good.
All is well again in the world of backups. We are now up to date as of ~midnight last night.
Backup Fail. At least this time however, it threw the appropriate error code, and sent me an email saying that it was unhappy. Alan said he was going to check in with Stuart regarding the confusion with the ssh-agent. (The other day, when I did a ps -ef | grep agent, there were ~5 ssh-agents running, which could have been the cause of the unsuccessful backups without telling me that they failed. The main symptom is that when I first restart all of the ssh-agent stuff, according to the directions in the Restart fb40m Procedures, I can do a test ssh over to ldas-cit, to see what frames are there. If I log out of the frame builder and log back in, then I can no longer ssh to ldas-cit without a password. This shouldn't happen....the ssh-agent is supposed to authenticate the connection so no passwords are necessary.)
I'm going to restart the backup script again, and we'll see how it goes over the long weekend.
None of the 3 dd backups I made were bootable - at boot, selecting the drive put me into grub rescue mode, which seemed to suggest that the /boot partition did not exist on the backed up disk, despite the fact that I was able to mount this partition on a booted computer. Perhaps for the same reason, but maybe not.
After going through various StackOverflow posts / blogs / other googling, I decided to try cloning the drives using ddrescue instead of dd.
This seems to have worked for nodus - I was able to boot to console on the machine called rosalba which was lying around under my desk. I deliberately did not have this machine connected to the martian network during the boot process for fear of some issues because of having multiple "nodus"-es on the network, so it complained a bit about starting the elog and other network related issues, but seems like we have a plug-and-play version of the nodus root filesystem now.
chiara and fb1 rootfs backups (made using ddrescue) are still not bootable - I'm working on it.
Nov 6 2017: I am now able to boot the chiara backup as well - although mysteriously, I cannot boot it from the machine called rosalba, but can boot it from ottavia. Anyways, seems like we have usable backups of the rootfs of nodus and chiara now. FB1 is still a no-go, working on it.
Looks to have worked this time around.
controls@fb1:~ 0$ sudo dd if=/dev/sda of=/dev/sdc bs=64K conv=noerror,sync
33554416+0 records in
33554416+0 records out
2199022206976 bytes (2.2 TB) copied, 55910.3 s, 39.3 MB/s
You have new mail in /var/mail/controls
I was able to mount all the partitions on the cloned disk. Will now try booting from this disk on the spare machine I am testing in the office area now. That'd be a "real" test of if this backup is useful in the event of a disk failure.
What are the critical filesystems? I've also indicated the size of these disks and the volume currently used, and the current backup situation.
Not backed up
LDAS pulls files from nodus daily via rsync, so there's no cron job for us to manage. We just allow incoming rsync.
Local backup on /media/40mBackup on chiara via daily cronjob
Remote backup to ldas-cit.ligo.caltech.edu::40m/cvs via daily cronjob on nodus
Currently mounted on Megatron, not backed up.
Then there is Optimus, but I don't think there is anything critical on it.
So, based on my understanding, we need to back up a whole bunch of stuff, particularly the boot disks and root filesystems for Chiara, Megatron and Nodus. We should also test that the backups we make are useful (i.e. we can recover current operating state in the event of a disk failure).
Please edit this elog if I have made a mistake. I also don't have any idea about whether there is any sort of backup for the slow computing system code.
In addition to bootable full disk backups, it would be wise to make sure the important service configuration files from each machine are version controlled in the 40m SVN. Things like apache files on nodus, martian hosts and DHCP files on chiara, nds2 configuration and init scripts on megatron, etc. This can make future OS/hardware upgrades easier too.
I first initialized the drives by hooking them up to my computer and running the setup.app file. After this, plugging the drive into the respective machine and running lsblk, I was able to see the mount point of the external drive. To actually initialize the backup, I ran the following command from a tmux session called ddBackupLaCie:
sudo dd if=/dev/sda of=/dev/sdb bs=64K conv=noerror,sync
Here, /dev/sda is the disk with the root filesystem, and /dev/sdb is the external hard-drive. The installed version of dd is 8.13, and from version 8.24 onwards there is a status=progress option available, but I didn't want to go through the exercise of upgrading coreutils on multiple machines, so we just have to wait till the backup finishes.
We also wanted to do a backup of the root of FB1 - but I'm not sure if dd will work with the external hard drive, because I think it requires the backup disk size (for us, 1TB) to be >= origin disk size (which on FB1, according to df -h, is 2TB). Unsure why the root filesystem of FB is so big, I'm checking with Jamie what we expect it to be. Anyways we have also acquired 2TB HGST SATA drives, which I will use if the LaCie disks aren't an option.
After consulting with Jamie, we reached the conclusion that the reason why the root of FB1 is so huge is because of the way the RAID for /frames is setup. Based on my googling, I couldn't find a way to exclude the nfs stuff while doing a backup using dd, which isn't all that surprising because dd is supposed to make an exact replica of the disk being cloned, including any empty space. So we don't have that flexibility with dd. The advantage of using dd is that if it works, we have a plug-and-play clone of the boot disk and root filesystem which we can use in the event of a hard-disk failure.
I am trying option 3 now. dd however does require that the destination drive size be >= source drive size - I'm not sure if this is true for the HGST drives. lsblk suggests that the drive size is 1.8TB, while the boot disk, /dev/sda, is 2TB. Let's see if it works.
Backup of chiara is done. I checked that I could mount the external drive at /mnt and access the files. We should still do a check of trying to boot from the LaCie backup disk, need another computer for that.
nodus backup is still not complete according to the console - there is no progress indicator so we just have to wait I guess.
This is not quite right. First of all, /frames is not NFS. It's a mount of a local filesystem that happens to be on a RAID. Second, the frames RAID is mounted at /frames. If you do a dd of the underlying block device (in this case /dev/sda*), you're not going to copy anything that's mounted on top of it.
What I was saying about /frames is that I believe there is data in the underlying directory /frames that the frames RAID is mounted on top of. In order to not get that in the copy of /dev/sda4 you would need to unmount the frames RAID from /frames, and delete everything from the /frames directory. This would not harm the frames RAID at all.
But it doesn't really matter because the backup disk has space to cover the whole thing so just don't worry about it. Just dd /dev/sda to the backup disk and you'll just be copying the root filesystem, which is what we want.
The nodus backup too is now complete - however, I am unable to mount the backup disk anywhere. I tried on a couple of different machines (optimus, chiara and pianosa), but always get the same error:
mount: unknown filesystem type 'LVM2_member'
The disk itself is being recognized, and I can see the partitions when I run lsblk, but I can't get the disk to actually mount.
Doing a web-search, I came across a few blog posts that suggest the problem can be resolved using the vgchange utility - but I am not sure what exactly this does so I am holding off on trying.
To clarify, I performed the cloning by running
sudo dd if=/dev/sda of=/dev/sdb bs=64K conv=noerror,sync
in a tmux session on nodus (as I did for chiara and FB1; the latter backup is still running).
The FB1 dd backup process seems to have finished too - but I got the following message:
dd: error writing ‘/dev/sdc’: No space left on device
30523666+0 records in
30523665+0 records out
2000398934016 bytes (2.0 TB) copied, 50865.1 s, 39.3 MB/s
Running lsblk shows the following:
controls@fb1:~ 32$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 23.5T 0 disk
└─sdb1 8:17 0 23.5T 0 part /frames
sda 8:0 0 2T 0 disk
├─sda1 8:1 0 476M 0 part /boot
├─sda2 8:2 0 18.6G 0 part /var
├─sda3 8:3 0 8.4G 0 part [SWAP]
└─sda4 8:4 0 2T 0 part /
sdc 8:32 0 1.8T 0 disk
├─sdc1 8:33 0 476M 0 part
├─sdc2 8:34 0 18.6G 0 part
├─sdc3 8:35 0 8.4G 0 part
└─sdc4 8:36 0 1.8T 0 part
While I am able to mount /dev/sdc1, I can't mount /dev/sdc4, for which I get the error message
controls@fb1:~ 0$ sudo mount /dev/sdc4 /mnt/HGSTbackup/
mount: wrong fs type, bad option, bad superblock on /dev/sdc4,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
Looking at dmesg, it looks like this error is related to the fact that we are trying to clone a 2TB disk onto a 1.8TB disk - it complains about block size exceeding device size.
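The size mismatch can be checked from numbers already in this log, keeping in mind that lsblk reports binary units (TiB) while dd reports decimal TB. Here is a minimal sketch; the source size is the byte count from the completed FB1 dd run reported elsewhere in this thread, and the function name is mine.

```python
TB = 10**12   # decimal terabyte, as printed by dd
TIB = 2**40   # binary tebibyte, as printed by lsblk

def clone_fits(source_bytes, dest_bytes):
    """dd can only clone the whole disk if the destination is at least as large."""
    return dest_bytes >= source_bytes

dest_bytes = 2000398934016    # bytes written before "No space left on device"
source_bytes = 2199022206976  # bytes copied in the completed FB1 run (33554416 records of 64K)

print(f"dest   = {dest_bytes / TB:.1f} TB = {dest_bytes / TIB:.1f} TiB")
print(f"source = {source_bytes / TB:.1f} TB = {source_bytes / TIB:.1f} TiB")
print("fits:", clone_fits(source_bytes, dest_bytes))
# dest   = 2.0 TB = 1.8 TiB
# source = 2.2 TB = 2.0 TiB
# fits: False
```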
The 4TB HGST drives have arrived. I've started the FB1 dd backup process. Should take a day or so.
The 40m computers were responding sluggishly yesterday, to the point of being unusable.
The mx_stream code running on c1iscex (the X end suspension control computer) went crazy for some reason. It was constantly writing to a log file in /cvs/cds/rtcds/caltech/c1/target/fb/192.168.113.80.log. In the past 24 hours this file had grown to approximately 1 TB in size. The computer had been turned back on yesterday after having reconnected its IO chassis, which had been moved around last week for testing purposes - specifically plugging the c1ioo IO chassis in to it to confirm it had timing problems.
The mx_stream code was killed on c1iscex and the 1 TB file removed.
Computers are now more usable.
We still need to investigate exactly what caused the code to start writing to the log file non-stop.
Alex believes this was due to a missing entry in the /diskless/root/etc/hosts file on the fb machine. It didn't list the IP and hostname for the c1iscex machine. I have now added it. c1iscex had been added to the /etc/dhcp/dhcpd.conf file on fb, which is why it was able to boot at all in the first place. With the addition of the automatic start up of mx_streams in the past week by Alex, the code started, but without the correct ip address in the hosts file, it was getting confused about where it was running and constantly writing errors.
When adding a new FE machine, add its IP address and its hostname to the /diskless/root/etc/hosts file on the fb machine.
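A quick way to catch a missing entry before booting a new front end is to scan the hosts file for the hostname. A minimal sketch follows; the example file contents use placeholder addresses and a placeholder host, not the real martian entries, and the function name is mine.

```python
def hosts_has_entry(hosts_text, hostname):
    """True if any non-comment line of a hosts file names the given host."""
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into address and names
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:]:
            return True
    return False

# Hypothetical file contents; the addresses and 'examplehost' are placeholders
example_hosts = """
127.0.0.1    localhost
192.0.2.10   examplehost   # placeholder entry
"""
print(hosts_has_entry(example_hosts, "c1iscex"))  # False
```

On fb, the text to check would be the contents of /diskless/root/etc/hosts.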
The moral of the story is, PUT THINGS IN THE ELOG. This wild process is one of those things where people say 'this won't affect anything', but in fact it wastes several hours of time.
As part of the effort to debug what was happening with the slow computers, I disabled the auto MEDM snapshots process that Yoichi/Kakeru set up a long time ago:
We have to re-activate it now that the MEDM screen locations have been changed. To do this, we have to modify the crontab on nodus and also the scripts that the cron is calling. I would prefer to run this cron on some linux machine since nodus starts to crawl whenever we run ImageMagick stuff.
Also, we should remember to start moving the old target/ directories into the new area. All of the slow VME controls are still not in opt/rtcds/.
The following is not 100% accurate, but represents my understanding of the events currently. I'm trying to get a full description from Christian and will hopefully be able to update this information later today.
Last night around 7:30 pm, Caltech detected evidence of a computer virus located behind a Linksys router with a MAC address matching our NAT router, and at the IP 22.214.171.124. We did not initially recognize the MAC address as the router's because the labeled MAC address was off by a digit, so we were looking for another old router for a while. In addition, pings to 126.96.36.199 were not working from inside or outside of the martian network, but the router was clearly working.
However, about 5 minutes after Christian and Mike left, I found I could ping the address. When I placed the address into a web browser, the address brought us to the control interface for our NAT router (but only from the martian side, from the outside world it wasn't possible to reach it).
They turned logging on the router (which had been off by default) and started monitoring the traffic for a short time. Some unusual IP addresses showed up, and Mike said something about someone trying to IP spoof warning coming up. Something about a file sharing port showing up was briefly mentioned as well.
The outside IP address was changed to 188.8.131.52, and DHCP, which apparently was on, was turned off. The password was changed and is in the usual place we keep router passwords.
Update: Christian said Mike has written up a security report and that he'll talk to him tomorrow and forward the relevant information to me. He notes there is possibly an infected laptop/workstation still at large. This could also be a personal laptop that was accidentally connected to the martian network. Since it was found to be set to DHCP, it's possible a laptop was connected to the wrong side and the user might not have realized this.
Mike Pedraza came by today to install a new wireless network router configured for the 40m lab network. It has a 'secret' SSID i.e. not meant for public use outside the lab. You can look up the password and network name on the rack. Pictures below show the location of the labels.
Mike P swapped in a new network router Linksys E1000
I wrote down the settings according to which I tuned the optickle model of the 40m Upgrade.
Basically I set it so that:
In this way when the carrier becomes resonant in the arms we have:
The DARM offset for DC readout is optional, and doesn't change those conditions.
I also plotted the carrier and the sideband's circulating power for both recycling cavities.
I'm attaching a file containing more detailed explanations of what I said above. It also contains the plots of field powers, and transfer functions from DARM to the dark port. I think they don't look quite right. There seems to be something wrong.
Valera thought of fixing the problem, removing the 180 degree offset on the SRM, which is what makes the sideband rather than the carrier resonant in SRC. In his model the carrier becomes resonant and the sideband anti-resonant. I don't think that is correct.
The resonant-carrier case is also included in the attachment (the plots with SRMoff=0 deg). In the plots the DARM offset is always zero.
I'm not sure why the settings are not producing the expected transfer functions.
In my calculation of the digital filters of the optical transfer functions the carrier light is resonant in coupled cavities and the sidebands are resonant in recycling cavities (provided that macroscopic lengths are chosen correctly which I assumed).