Message ID: 13262     Entry time: Mon Aug 28 16:20:00 2017     In reply to: 13219     Reply to this: 13263   13312   13332
Author: gautam 
Type: Update 
Category: CDS 
Subject: 40m files backup situation 

This elog is meant to summarize the current backup situation of critical 40m files.

The critical filesystems are listed below. For each one I've indicated the size of the disk, the volume currently used, and the current backup status.

Name: FB1 root filesystem
Disk usage: 1.7TB / 2TB
Description / remarks:
  • FB1 is the machine that hosts the diskless root for the front end machines
  • Additionally, it runs the daqd processes which write data from realtime models into frame files
Current backup status: Not backed up

Name: /frames
Disk usage: up to 24TB
Description / remarks:
  • This is where the frame files are written to
  • Need to set up a wiper script that periodically clears older data so that the disk doesn't overflow.
Current backup status: Not backed up. LDAS pulls files from nodus daily via rsync, so there's no cron job for us to manage. We just allow incoming rsync.

Name: Shared user area
Disk usage: 1.6TB / 2TB
Description / remarks:
  • /home/cds on chiara
  • This is exported over NFS to 40m workstations, FB1 etc.
  • Contains user directories, scripts, realtime models etc.
Current backup status: Local backup on /media/40mBackup on chiara via daily cronjob. Remote backup to ldas-cit.ligo.caltech.edu::40m/cvs via daily cronjob on nodus.

Name: Chiara root filesystem
Disk usage: 11GB / 440GB
Description / remarks:
  • This is the root filesystem for chiara
  • Contains nameserver stuff for the martian network, responsible for rsyncing /home/cds
Current backup status: Not backed up

Name: Megatron root filesystem
Disk usage: 39GB / 130GB
Description / remarks:
  • Boot disk for megatron, which is our scripts machine
  • Runs MC autolocker, FSS loops etc.
  • Also is the nds server for facilitating data access from outside the martian network
Current backup status: Not backed up

Name: Nodus root filesystem
Disk usage: 77GB / 355GB
Description / remarks:
  • This is the boot disk for our gateway machine
  • Hosts Elog, svn, wikis
  • Supposed to be responsible for sending email alerts for NFS disk usage and vacuum system N2 pressure
Current backup status: Not backed up

Name: JETSTOR RAID Array
Disk usage: 12TB / 13TB
Description / remarks:
  • Old /frames
  • Archived frames from DRFPMI locks
  • Long term trends
Current backup status: Currently mounted on Megatron, not backed up.
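
The /frames remarks above call for a wiper script that periodically clears older data. As a rough illustration of that idea only (this is not an existing 40m script), here is a minimal Python sketch that deletes the oldest files first until usage drops below a threshold. The function names, the 95% default threshold, and the use of file modification time as the age criterion are all assumptions made for the example.

```python
import os
from pathlib import Path


def select_files_to_delete(files, capacity_bytes, max_fraction=0.95):
    """Pick the oldest files to remove until usage is within the limit.

    files: list of (path, mtime, size_bytes) tuples.
    capacity_bytes: total size of the disk (hypothetical input).
    max_fraction: highest allowed used/total ratio (assumed value).
    """
    used = sum(size for _, _, size in files)
    limit = capacity_bytes * max_fraction
    to_delete = []
    # Walk the files from oldest to newest modification time.
    for path, _mtime, size in sorted(files, key=lambda f: f[1]):
        if used <= limit:
            break
        to_delete.append(path)
        used -= size
    return to_delete


def scan(root):
    """Return (path, mtime, size) for every regular file under root."""
    entries = []
    for p in Path(root).rglob("*"):
        if p.is_file():
            st = p.stat()
            entries.append((str(p), st.st_mtime, st.st_size))
    return entries


def wipe(root, capacity_bytes, max_fraction=0.95, dry_run=True):
    """Delete (or, with dry_run, only report) the oldest files under root."""
    doomed = select_files_to_delete(scan(root), capacity_bytes, max_fraction)
    for path in doomed:
        if dry_run:
            print("would delete", path)
        else:
            os.remove(path)
    return doomed
```

Calling `wipe("/frames", capacity_bytes=24 * 10**12)` would only print the candidates, because `dry_run` defaults to `True`. A script like this would presumably be run from cron, and the dry-run default is a deliberate guard against deleting data by accident.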

Then there is Optimus, but I don't think there is anything critical on it. 

So, based on my understanding, we need to back up a whole bunch of stuff, particularly the boot disks and root filesystems for Chiara, Megatron and Nodus. We should also test that the backups we make are useful (i.e. we can recover current operating state in the event of a disk failure).
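
One small part of testing that a backup is useful is checking that the copy actually matches the source. The sketch below, which is only an illustration and not an existing 40m tool, walks a source directory and compares SHA-256 checksums against the corresponding files in a backup directory. The function names are hypothetical, and a check like this says nothing about whether a restored boot disk would actually boot.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(source, backup):
    """Compare every regular file under source with its copy under backup.

    Returns (missing, mismatched): relative paths that are absent from
    the backup, and relative paths whose contents differ.
    """
    source, backup = Path(source), Path(backup)
    missing, mismatched = [], []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = backup / rel
        if not dst.is_file():
            missing.append(rel.as_posix())
        elif sha256_of(src) != sha256_of(dst):
            mismatched.append(rel.as_posix())
    return missing, mismatched
```

For example, `compare_trees("/home/cds", "/media/40mBackup")` would return two lists of relative paths, assuming the backup mirrors the source layout directly, which has not been confirmed here. Files that change between the backup run and the check will show up as mismatched, so a non-empty result needs a human look rather than an automatic alarm.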

Please edit this elog if I have made a mistake. I also don't know whether there is any sort of backup for the slow computing system code.
