  40m Log, Page 324 of 335
ID   Date   Author   Type   Category   Subject
  15404   Wed Jun 17 16:27:51 2020   gautam   Update   VAC   Questions/comments on vacuum

I missed the vacuum discussion on the call today, but I have some questions/comments:

  • Isn’t it true that we didn’t digitally monitor any of the TP diagnostic channels before 2018 December? I don’t have the full history but certainly there wasn’t any failure of the vacuum system connected to pump current/temp/speed from Sep 2015-Dec 2018, whereas we have had 2 interruptions in 6 months because of flaky serial communications.
  • According to the manuals, the turbo-pumps have their own internal logic to shut off the pump when either bearing temperature exceeds 60C or current exceeds 1.5A. I agree it’s good to have some redundancy, but do we really expect that our outer interlock loops will function if the internal ones fail?
  • In what scenario do we expect that all our pressure gauge readbacks fail, but not the TP readbacks? If so, won’t the differential pressure conditions protect the vacuum envelope, and the TPs’ internal shutoffs will protect the pumps? Except during the pump down phase perhaps, when we want to give a little more headroom to the small TPs to stress them less?

At the very least, I think we should consider making the interlock code have levels (like interrupts on a micro controller). So if the pressure gauges are communicating and are reporting acceptable pressure readings, we should be able to reject unphysical readbacks from the TP controllers.
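A minimal sketch of the leveled logic suggested above, not the actual interlock code. The function name and the pressure threshold are hypothetical and chosen only for illustration. The pressure gauges form the higher-priority level, and a TP serial readback can trip the interlock only when no gauge reading is available.

```python
from typing import Optional

# Hypothetical limit for illustration only; not taken from the real interlock code.
MAIN_VOLUME_MAX_TORR = 1e-4


def should_trip(main_pressure_torr: Optional[float], tp_readback_bad: bool) -> bool:
    """Decide whether to trip, giving the pressure gauge priority over TP readbacks.

    main_pressure_torr is None when the gauge is not communicating.
    tp_readback_bad is True when a TP serial readback is out of range.
    """
    if main_pressure_torr is None:
        # Level 2: gauge is down, so the TP readbacks are all we have.
        return tp_readback_bad
    # Level 1: gauge is alive. Trip on bad pressure, and otherwise
    # reject the TP readback as unphysical.
    return main_pressure_torr > MAIN_VOLUME_MAX_TORR


print(should_trip(1e-6, True))   # gauge healthy, TP readback glitching
print(should_trip(None, True))   # gauge silent, TP readback out of range
```

With this ordering a serial glitch alone cannot close valves while the gauges report an acceptable pressure, which is the behavior requested above.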

I still don’t understand why TP2 can’t back TP1, but we just disable all the software interlock conditions contingent on TP2 readbacks. This pump is far newer than TP3, and unless I’ve misunderstood something major about the vacuum infrastructure, I don’t really see why we should trust these flaky serial readbacks for any actionable interlocks, at least without some AND logic (since temperature, current and speed aren’t really independent variables).

I also think we should finally implement the email alert in the event the vacuum interlock is tripped. I can implement this if no one else volunteers.

This might also be a good reminder to get the documentation in order about the new vacuum system.

  15406   Thu Jun 18 11:00:24 2020   Jon   Update   VAC   Questions/comments on vacuum
Quote:
  • Isn’t it true that we didn’t digitally monitor any of the TP diagnostic channels before 2018 December? I don’t have the full history but certainly there wasn’t any failure of the vacuum system connected to pump current/temp/speed from Sep 2015-Dec2018, whereas we have had 2 interruptions in 6 months because of flaky serial communications.

Looking at images of the old vac screens, the TP2/3 rotation speed and status string were digitally monitored. However I don't know if there were software interlocks predicated on those.

Quote:
  • According to the manuals, the turbo-pumps have their own internal logic to shut off the pump when either bearing temperature exceeds 60C or current exceeds 1.5A. I agree it’s good to have some redundancy, but do we really expect that our outer interlock loops will function if the internal ones fail?

The temperature and current interlocks are implemented precisely because the pumps can shut themselves off. The concern is not about damaging the pumps (their internal logic protects against that); it's that a pump could automatically shut down and back-vent the IFO to atmosphere. Another interlock (e.g., the pressure differentials) might catch it, but it would depend on the back-vent rate and the scenario has never been tested. The temperature and current interlocks are set to trip just before the pump reaches its internal shut-down threshold.

One way we might be able to reduce our reliance on the flaky serial readbacks is to implement rotation-speed hardware interlocks. The old vac documentation alludes to these, but as far as Chub and I could determine in 2018, they never actually existed. The older turbo controllers, at least, had an analog output proportional to speed which could be used to control a relay to interrupt the V4/5 control signals. I'll look into this for the new controllers. If it could be done, we could likely eliminate the layer of serial-readback interlocks altogether.

 
Quote:
  • I also think we should finally implement the email alert in the event the vacuum interlock is tripped. I can implement this if no one else volunteers.

That would be awesome if you're willing to volunteer. I agree this would be great to have.

  15407   Thu Jun 18 12:00:36 2020   gautam   Update   VAC   Questions/comments on vacuum

I agree there were MEDM fields, but I can't find any record of these channels being recorded till 2018 December, so I don't agree that they were being digitally monitored. You can also look back in the elog (e.g. here and here) and see that the display fields are just blank. I would then assume that no interlocks were dependent on these channels, because otherwise the vacuum interlocks would be perpetually tripped.

Quote:
Looking at images of the old vac screens, the TP2/3 rotation speed and status string were digitally monitored. However I don't know if there were software interlocks predicated on those.

Sorry but I'm having trouble imagining a scenario in which the pressure gauges wouldn't register this before the IFO volume is compromised. Are there some back-of-the-envelope calculations I can do to understand this? Since both the pressure gauges and the TP diagnostic channels are being monitored via EPICS, the refresh rate is similar, so I don't see how we can have a pump temperature / speed / current threshold tripped but NOT have this be registered on all the pressure gauges. It seems like a bit of a contrived scenario to me. Our thresholds currently seem to be arbitrary numbers anyway, or are they based on some expected backstreaming rate? Isn't this scenario degenerate with a leak elsewhere in the vacuum envelope that would be caught by the differential pressure interlocks?

Quote:
The temperature and current interlocks are implemented precisely because the pumps can shut themselves off. The concern is not about damaging the pumps (their internal logic protects against that); it's that a pump could automatically shut down and back-vent the IFO to atmosphere. Another interlock (e.g., the pressure differentials) might catch it, but it would depend on the back-vent rate and the scenario has never been tested. The temperature and current interlocks are set to trip just before the pump reaches its internal shut-down threshold.

For the email alert, can you expose a soft channel that is a flag - if this flag is not 1, then the service will send out an email.

Quote:
That would be awesome if you're willing to volunteer. I agree this would be great to have.

  15408   Thu Jun 18 14:13:03 2020   Jon   Update   VAC   Questions/comments on vacuum
Quote:
I agree there were MEDM fields, but I can't find any record of these channels being recorded till 2018 December, so I don't agree that they were being digitally monitored. You can also look back in the elog (e.g. here and here) and see that the display fields are just blank. I would then assume that no interlocks were dependent on these channels, because otherwise the vacuum interlocks would be perpetually tripped.

Right, I doubt they were ever recorded or used for interlocks. But the readbacks did work at one point in the past. There's a photo of the old vac monitor screen on p. 19 of E1500239 (last updated 2017) which shows the fields once alive.

Quote:
Sorry but I'm having trouble imagining a scenario how the pressure gauges wouldn't register this before the IFO volume is compromised. Is there some back of the envelope calculations I can do to understand this? Since both the pressure gauges and the TP diagnostic channels are being monitored via EPICS, the refresh rate is similar, so I don't see how we can have a pump temperature / speed / current threshold tripped but NOT have this be registered on all the pressure gauges, seems like a bit of a contrived scenario to me. Our thresholds currently seem to be arbitrary numbers anyway, or are they based on some expected backstreaming rate? Isn't this scenario degenerate with a leak elsewhere in the vacuum envelope that would be caught by the differential pressure interlocks?

I don't disagree that the pressure gauges would register the change. What I'm not sure about is whether the change would violate any of the existing interlock conditions, triggering a shutdown. Looking at what we have now, the only non-pump-related conditions I see that might catch it are the diffpres conditions:

  • abs(P2 - PTP2) > 1 torr (for a TP2 failure)

  • abs(P3 - PTP3) > 1 torr (for a TP3 failure)

  • abs(P1a - P2) > 1 torr (for either a TP2 or TP3 failure)

For the P1a-P2 differential, the threshold of 1 torr is the smallest value that in practice still allows us to pump down the IFO without having to disable the interlocks (P1a-P2 is the TP1 intake/exhaust differential). The purpose of the P2-PTP2/P3-PTP3 differentials is to prevent V4/5 from opening and suddenly exposing the spinning turbo to high pressure. I'm not aware of a real damage threshold calculation that anyone has done; I think < 1 torr is lore passed down by Steve.

If a turbo pump fails, the rate it would backstream is unknown (to me, at least) and likely depends on the failure mode. The scenario I'm concerned about is if the backstream rate is slower than the conduction time through the pumpspool and into the main volume. In that case, the pressure gauges will rise more or less together all the way up to atmosphere, likely never crossing the 1 torr differential pressure thresholds.
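The three differential-pressure conditions and the slow-backstream scenario can be illustrated with a short sketch. The function name and the pressure values below are made up for illustration; only the 1 torr threshold and the gauge pairings come from the list above.

```python
THRESHOLD_TORR = 1.0  # differential threshold quoted above


def diffpres_trips(p1a, p2, p3, ptp2, ptp3):
    """Return the names of the differential-pressure conditions that are violated."""
    conditions = {
        "P2-PTP2": abs(p2 - ptp2) > THRESHOLD_TORR,   # TP2 failure
        "P3-PTP3": abs(p3 - ptp3) > THRESHOLD_TORR,   # TP3 failure
        "P1a-P2": abs(p1a - p2) > THRESHOLD_TORR,     # either TP2 or TP3 failure
    }
    return [name for name, violated in conditions.items() if violated]


# Slow backstream (illustrative numbers): every gauge rises together,
# so no differential ever exceeds 1 torr on the way up to atmosphere.
for p in (1e-3, 1.0, 100.0, 760.0):
    print(p, diffpres_trips(p, p, p, p, p))

# Fast failure (illustrative numbers): the TP2 foreline jumps to
# atmosphere before the other gauges respond.
print(diffpres_trips(1e-3, 1e-3, 1e-3, 760.0, 1e-3))
```

In the first case the list stays empty at every step, which is the gap in coverage described above.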

Quote:
For the email alert, can you expose a soft channel that is a flag - if this flag is not 1, then the service will send out an email.

There's already a channel C1:Vac-error_status, where if the value is anything other than an empty string, there is an interlock tripped. Does that work?

  15409   Thu Jun 18 15:25:08 2020   Jordan   Update   VAC   TP2 and TP3 Forepump removal

I removed the backing pumps for TP2 and TP3 today to test ultimate pressure and determine if they need a tip seal replacement. This was done with Jon backing me on Zoom. We closed off TP3 and powered down TP3 and the auxiliary pump, in order to remove the forepumps from the exhaust line.

  1. Close V1
  2. Close V5
  3. Turn off TP3
  4. Turn off aux dry pump (manually)
  5. Once the PTP3 foreline pressure has come up to atmosphere, you can disconnect the TP3 dry pump and cap the exhaust line with a KF blank.
  6. Restore the vac configuration in reverse order: dry pump ON, TP3 ON, open V5, open V1

Once pumps were removed I connected a Pirani gauge to the pump directly and pumped down, results as follows:

TP2 Forepump (Agilent IDP 7):

  • Ultimate Pressure: 123 mtorr
  • Hours: 10903

TP3 Forepump (Varian SH 110):

  • Ultimate pressure: ~70 torr
  • Hours: 60300

The TP3 forepump definitely needs a new tip seal. The pressure on the TP2 forepump was good, but a significant amount of particulate came out of the exhaust line, so a new tip seal might not be needed but it is recommended.

  15410   Thu Jun 18 15:46:34 2020   gautam   Update   VAC   Questions/comments on vacuum

So why not just have a special mode for the interlock code during pumpdown and venting, and during normal operation we expect the main volume pressure to be <100uTorr so the interlock trips if this condition is violated? These can just be EPICS buttons on the Vac control MEDM screen. Both of these procedures are not "business as usual", and even if we script them in the future, it's likely to have some operator supervising, so I don't think it's unreasonable to have to switch between these modes. I just think the pressure gauges have demonstrated themselves to be much more reliable than these TP serial readbacks (as you say, they worked once upon a time, but that is already evidence of its flakiness?). The Pirani gauges are not ultra-reliable, they have failed in the past, but at least less frequently than this serial comm glitching. In fact, if these readbacks are so flaky, it's not impossible that they don't signal a TP shutdown? I just think the real power of having these multi-channel diagnostics is lost without some AND logic - a turbopump failure is likely to result in an increase in pump current and temperature increase and pump speed decrease, so it's not the individual channel values that should be determining if an interlock is tripped.

I definitely think that protecting the vacuum envelope is a priority - but I don't think it should be at the expense of commissioning time. But if you think these extra interlocks are essential to the safety of the vacuum system, I withdraw my request.
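The AND logic argued for above could be sketched as follows. This is an illustration under stated assumptions, not the interlock code: the function name, the `required_votes` parameter, and the example limits are hypothetical. The 60C and 1.5A figures are the internal shutoff values quoted from the manuals earlier in this thread, reused here only as sample thresholds, and the speed limit is an arbitrary example.

```python
def tp_failure_suspected(current_a, temp_c, speed_krpm, *,
                         max_current_a, max_temp_c, min_speed_krpm,
                         required_votes=3):
    """Flag a turbo pump failure only when several readbacks agree.

    required_votes=3 is a strict AND of all three channels;
    required_votes=2 would be a 2-of-3 majority vote.
    """
    votes = [
        current_a > max_current_a,     # current draw has risen
        temp_c > max_temp_c,           # temperature has risen
        speed_krpm < min_speed_krpm,   # rotation speed has dropped
    ]
    return sum(votes) >= required_votes


# Example limits, for illustration only.
limits = dict(max_current_a=1.5, max_temp_c=60.0, min_speed_krpm=50.0)

# A single glitching channel (unphysical current spike) does not trip.
print(tp_failure_suspected(9.9, 30.0, 60.0, **limits))
# Correlated changes in all three channels do.
print(tp_failure_suspected(2.0, 70.0, 40.0, **limits))
```

The design choice is that a serial glitch usually corrupts one value at a time, whereas a real pump failure should move current, temperature and speed together.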

Quote:
I don't disagree that the pressure gauges would register the change. What I'm not sure about is whether the change would violate any of the existing interlock conditions, triggering a shutdown. Looking at what we have now, the only non-pump-related conditions I see that might catch it are the diffpres conditions:

It would be better to have a flag channel, might be useful for the summary pages too. I will make it if it is too much trouble.

Quote:
There's already a channel C1:Vac-error_status, where if the value is anything other than an empty string, there is an interlock tripped. Does that work?

  15411   Thu Jun 18 16:56:34 2020   Jordan   Update   VAC   TP2 and TP3 Forepump removal
Quote:

I removed the backing pumps for TP2 and TP3 today to test ultimate pressure and determine if they need a tip seal replacement. This was done with Jon backing me on Zoom. We closed off TP3 and powered down TP3 and the auxiliary pump, in order to remove the forepumps from the exhaust line.

  1. Close V1
  2. Close V5
  3. Turn off TP3
  4. Turn off aux dry pump (manually)
  5. Once the PTP3 foreline pressure has come up to atmosphere, you can disconnect the TP3 dry pump and cap the exhaust line with a KF blank.
  6. Restore the vac configuration in reverse order: dry pump ON, TP3 ON, open V5, open V1

Once pumps were removed I connected a Pirani gauge to the pump directly and pumped down, results as follows:

TP2 Forepump (Agilent IDP 7):

  • Ultimate Pressure: 123 mtorr
  • Hours: 10903

TP3 Forepump (Varian SH 110):

  • Ultimate pressure: ~70 torr
  • Hours: 60300

The TP3 forepump definitely needs a new tip seal. The pressure on the TP2 forepump was good, but a significant amount of particulate came out of the exhaust line, so a new tip seal might not be needed but it is recommended.

I agree with your assessment, Jordan.  If I'm not mistaken the scroll pump for TP2 is new; we had a very early failure with the last new scroll pump (the forepump for TP3) tip seals at just over 5000 hours.  Glad to see my replacement seals held up for over 60K hours. If this is the trend with these pumps, we can simply run them to  around 60000 hours and replace the seals at that time, rather than waiting for failure! - Chub

  15412   Thu Jun 18 22:33:57 2020   Jon   Omnistructure   VAC   Vac hardware purchase list

Replacement Hardware Purchase List

I've created a purchase list of hardware needed to restore the aging vacuum system. This wasn't planned as part of the BHD upgrade, but I've added it to the BHD procurement list since hardware replacements have become necessary.

The list proposes replacing the aging TP3 Varian turbo pump with the newer Agilent model which has already replaced TP2. It seems I was mistaken in believing we already had a second Agilent pump on hand. A thorough search of the lab has not turned it up, and Steve himself has told me he doesn't remember ordering a second one. Fortunately Steve did leave us a detailed Agilent parts list [ELOG 14322].

It also proposes replacing the glitching TP2 Agilent controller with a new one. The existing one can be sent back for repair and then retained as a spare. Considering that one of these controllers is already malfunctioning after < 2 years, I think it's a very good idea to have a spare on hand.

Known Hardware Issues

Below is our current list of vacuum hardware issues. Items that this purchase list will address (limited to only the most urgent) are highlighted in yellow.

  • Replace the UPS
    • Need a 240V socket for TP1 (currently TP1 is not protected from power loss)
    • Need RS232/485 comms with the interlock server (current UPS: serial readbacks have failed, battery is failing)
  • Remove/replace the failed pressure gauges (~5)
  • Add more cold cathode sensors to the main volume for sensor redundancy (currently the main-volume interlocks rely on only 1 working sensor)
  • Replace TP3 (controller is failing)
  • Replace TP2 controller (serial interface has failed)
  • Remove RP2
    • Dead and also not needed. We already have to throttle the pumpdown rate with only two roughing pumps
  • Remove/refurbish the cryopump
    • Contamination risk to have it sitting connectable to the main volume

  15413   Fri Jun 19 07:40:49 2020   Jon   Update   VAC   Questions/comments on vacuum

I think we should discuss interlock possibilities at a 40m meeting. I'm reluctant to make the system more complicated, but perhaps we can find ways to reduce the reliance on the turbo pump readbacks. I agree they've proven to be the least reliable.

While we may be able to improve the tolerance to certain kinds of hardware malfunctions (and if so, we should), I don't see interlocks triggering on abnormal behavior of critical equipment as the root problem. As I see it, our bigger problem is with all the malfunctioning, mostly end-of-lifetime pieces of vacuum equipment still in use. If we can address the hardware problems, as I'm trying to do with replacements [ELOG 15412], I think that in itself will make the interlocking much less of an issue.

Quote:

So why not just have a special mode for the interlock code during pumpdown and venting, and during normal operation we expect the main volume pressure to be <100uTorr so the interlock trips if this condition is violated? These can just be EPICS buttons on the Vac control MEDM screen. Both of these procedures are not "business as usual", and even if we script them in the future, it's likely to have some operator supervising, so I don't think it's unreasonable to have to switch between these modes. I just think the pressure gauges have demonstrated themselves to be much more reliable than these TP serial readbacks (as you say, they worked once upon a time, but that is already evidence of its flakiness?). The Pirani gauges are not ultra-reliable, they have failed in the past, but at least less frequently than this serial comm glitching. In fact, if these readbacks are so flaky, it's not impossible that they don't signal a TP shutdown? I just think the real power of having these multi-channel diagnostics is lost without some AND logic - a turbopump failure is likely to result in an increase in pump current and temperature increase and pump speed decrease, so it's not the individual channel values that should be determining if an interlock is tripped.

Ok, this can be added pretty easily. Its value will just be toggled between 1 and 0 every time the interlock server raises/clears the existing string channel. Adding the channel will require restarting the whole vac IOC, so I'll do it at a time when Jordan is on hand in case something fails to come back up.

Quote:

It would be better to have a flag channel, might be useful for the summary pages too. I will make it if it is too much trouble.

  15415   Fri Jun 19 09:57:35 2020   gautam   Update   VAC   Questions/comments on vacuum

For this particular email service, ideally the email should be sent out as soon as the interlock is tripped, so this would require a line of code to be added to the main interlock code. Which I guess would require a restart of the interlock service. So let me know when you guys plan to do the dry-pump tip seal replacement operation (when I presume valves will be closed anyways) so that we can do this in a minimally invasive way.

Quote:

Ok, this can be added pretty easily. Its value will just be toggled between 1 and 0 every time the interlock server raises/clears the existing string channel. Adding the channel will require restarting the whole vac IOC, so I'll do it at a time when Jordan is on hand in case something fails to come back up.
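A minimal sketch of the flag channel and immediate mailer discussed in this exchange, assuming the polarity gautam proposed (flag is 1 when no interlock is tripped, and any other value triggers an email). The function names, the recipient address, and the message text are placeholders, and the real service would read the status string from C1:Vac-error_status rather than take it as an argument.

```python
from email.message import EmailMessage
from typing import Optional


def interlock_flag(error_status: str) -> int:
    """Assumed polarity: 1 = all clear (empty status string), 0 = interlock tripped."""
    return 0 if error_status else 1


def alert_on_trip(previous_flag: int, error_status: str) -> Optional[EmailMessage]:
    """Build an alert only on the 1 -> 0 transition, so one trip yields one email."""
    flag = interlock_flag(error_status)
    if previous_flag == 1 and flag != 1:
        msg = EmailMessage()
        msg["Subject"] = "Vacuum interlock tripped"
        msg["To"] = "list@example.org"  # placeholder address
        msg.set_content("C1:Vac-error_status = {!r}".format(error_status))
        return msg
    return None


# Example: the status string goes from empty to a (made-up) error message.
alert = alert_on_trip(previous_flag=1, error_status="example interlock message")
print(alert["Subject"] if alert else "no alert")
```

Actually sending the message would be a separate step, for example with `smtplib.SMTP(...).send_message(msg)`, and is left out here because the mail server and authentication details are site-specific.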

  15417   Fri Jun 19 14:03:50 2020   Jordan   Update   VAC   Forepump Tip Seal Replacement

Tip Seals were replaced on the forepumps for TP2 and TP3, and both are ready to be installed back onto the forelines.

TP2 Forepump Ultimate Pressure: 180 mtorr

TP3 Forepump Ultimate Pressure: 120 mtorr

  15421   Mon Jun 22 10:43:25 2020   Jon   Configuration   VAC   Vac maintenance at 11 am

The vac system is going down at 11 am today for planned maintenance:

  • Re-install the repaired TP2 and TP3 dry pumps [ELOG 15417]
  • Incorporate an auto-mailer and flag channel into the controls code for signaling tripped interlocks [ELOG 15413]

We will advise when the work is completed.

  15424   Mon Jun 22 20:06:06 2020   Jon   Configuration   VAC   Vac maintenance complete

This work is finally complete. The dry pump replacement was finished quickly but the controls updates required some substantial debugging.

For one, the mailer code I had been given to install would not run against Python 3.4 on c1vac, the version run by the vac controls since about a year ago. There were some missing dependencies that proved difficult to install (related to Debian Jessie becoming unsupported). I ultimately solved the problem by migrating the whole system to Python 3.5. Getting the Python keyring working within systemd (for email account authentication) also took some time.

Edit: The new interlock flag channel is named C1:Vac-interlock_flag.

Along the way, I discovered why the interlocks had been failing to auto-close the PSL shutter: The interlock was pointed to the channel C1:AUX-PSL_ShutterRqst. During the recent c1psl upgrade, we renamed this channel C1:PSL-PSL_ShutterRqst. This has been fixed.

The main volume is being pumped down, for now still in a TP3-backed configuration. As of 8:30 pm the pressure had fallen back to the upper 1E-6 range. The interlock protection is fully restored. Any time an interlock is triggered in the future, the system will send an immediate notification to the 40m mailing list. 👍

Quote:

The vac system is going down at 11 am today for planned maintenance:

  • Re-install the repaired TP2 and TP3 dry pumps [ELOG 15417]
  • Incorporate an auto-mailer and flag channel into the controls code for signaling tripped interlocks [ELOG 15413]

  15425   Tue Jun 23 17:54:56 2020   rana   Configuration   VAC   Vac maintenance complete

I propose we go for all CAPS for all channel names. The lower case names are just a holdover from Steve/Alan from the 90's. All other systems are all CAPS.

It avoids us having to force them all to UPPER in the scripts and channel lists.

  15446   Wed Jul 1 18:03:04 2020   Jon   Configuration   VAC   UPS replacements

I looked into how the new UPS devices suggested by Chub would communicate with the vac interlocks. There are several possible ways, listed in order of preference:

  • Python interlock service directly queries the UPS via a USB link using the (unofficial) tripplite package. Direct communication would be ideal because it avoids introducing a dependency on third-party software outside the monitoring/control capability of the interlock manager. However the documentation warns this package does not work for all models...
  • Configure Tripp Lite's proprietary software (PowerAlert Local) to send SYSLOG event messages (UDP packets) to a socket monitored by the Python interlock manager.
  • Configure the proprietary software to execute a custom script upon an event occurring. The script would, e.g., set an EPICS flag channel which the interlock manager is continually monitoring.
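As a rough sketch of the second option in the list above, a SYSLOG listener can be built with nothing but the Python standard library. The function names, the keyword list, and the sample message are assumptions for illustration; the actual text that PowerAlert Local sends is not known here and would need to be checked against real traffic.

```python
import socket


def open_syslog_listener(host="127.0.0.1", port=0):
    """Bind a UDP socket for incoming SYSLOG datagrams (port 0 = any free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_event(sock, timeout_s=1.0):
    """Return the next datagram as text, or None if nothing arrives in time."""
    sock.settimeout(timeout_s)
    try:
        data, _addr = sock.recvfrom(4096)
    except socket.timeout:
        return None
    return data.decode("utf-8", errors="replace")


def is_power_event(message, keywords=("battery", "power")):
    """Hypothetical filter: treat messages containing these keywords as UPS events."""
    text = message.lower()
    return any(word in text for word in keywords)


if __name__ == "__main__":
    listener = open_syslog_listener()
    print("listening on UDP port", listener.getsockname()[1])
    event = receive_event(listener, timeout_s=5.0)
    if event is not None and is_power_event(event):
        print("UPS event:", event)
    listener.close()
```

The interlock manager would call `receive_event` from its polling loop. Because UDP delivery is not guaranteed, this path is probably best treated as a notification rather than as the only protection against power loss.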

I recommend we proceed with ordering the Tripp Lite 36HW20 for TP1 and Tripp Lite 1AYA6 for TP2 and TP3 (and other 120V electronics). As far as I can tell, the only difference between the two 120V options is that the 6FXN4 model is TAA-compliant.

  15465   Thu Jul 9 18:00:35 2020   Jon   Configuration   VAC   UPS replacements

Chub has placed the order for two new UPS units (115V for TP2/3 and a 220V version for TP1).

They will arrive within the next two weeks.

Quote:

I looked into how the new UPS devices suggested by Chub would communicate with the vac interlocks. There are several possible ways, listed in order of preference:

  • Python interlock service directly queries the UPS via a USB link using the (unofficial) tripplite package. Direct communication would be ideal because it avoids introducing a dependency on third-party software outside the monitoring/control capability of the interlock manager. However the documentation warns this package does not work for all models...
  • Configure Tripp Lite's proprietary software (PowerAlert Local) to send SYSLOG event messages (UDP packets) to a socket monitored by the Python interlock manager.
  • Configure the proprietary software to execute a custom script upon an event occurring. The script would, e.g., set an EPICS flag channel which the interlock manager is continually monitoring.

I recommend we proceed with ordering the Tripp Lite 36HW20 for TP1 and Tripp Lite 1AYA6 for TP2 and TP3 (and other 120V electronics). As far as I can tell, the only difference between the two 120V options is that the 6FXN4 model is TAA-compliant.

  15499   Thu Jul 23 15:58:24 2020   Jon   Summary   VAC   Vacuum controls refurbishment plan

This year we've struggled with vacuum controls unreliability (e.g., spurious interlock triggers) caused by decaying hardware. Here are details of the vacuum refurbishment plan I described on the 40m call this week.

 Refurbish TP2 and TP3 dry pumps. Completed [ELOG 15417].

 Automated notifications of interlock-trigger events. Email to 40m list and a new interlock flag channel. Completed [ELOG 15424].

Replace failing UPS.

  • Two new Tripp Lite units on order, 110V and 230V [ELOG 15465].
  • Jordan will install them in the vacuum rack once received.
  • Once installed, Jon will come test the new units, set up communications, and integrate them into the interlock system following this plan [ELOG 15446].
  • Jon will move the pumps and other equipment to the new UPS units only after completing the above step.

Remove interlock dependencies on TP2/TP3 serial readbacks. Due to persistent glitching [ELOG 15140, ELOG 15392].

Unlike TP2 and TP3, the TP1 readbacks are real analog signals routed to Acromags. As these have caused us no issues at all, the plan is to eliminate dependence on the TP2/3 digital readbacks in favor of the analog controller outputs. All the digital readback channels will continue to exist, but the interlock system will no longer depend on them. This will require adding 2 new sinking BI channels each for TP2 and TP3 (for a total of 4 new channels). We have 8 open Acromag XT1111 channels in the c1vac system [ELOG 14493], so the new channels can be accommodated. The below table summarizes the proposed changes.

Channel             Type  Status  Description                                   Interlock
C1:Vac-TP1_current  AI    exists  Current draw (A)                              keep
C1:Vac-TP1_fail     BI    exists  Critical fault has occurred                   keep
C1:Vac-TP1_norm     BI    exists  Rotation speed is within +/-10% of set point  new
C1:Vac-TP2_rot      soft  exists  Rotation speed (krpm)                         remove
C1:Vac-TP2_temp     soft  exists  Temperature (C)                               remove
C1:Vac-TP2_current  soft  exists  Current draw (A)                              remove
C1:Vac-TP2_fail     BI    new     Critical fault has occurred                   new
C1:Vac-TP2_norm     BI    new     Rotation speed is >80% of set point           new
C1:Vac-TP3_rot      soft  exists  Rotation speed (krpm)                         remove
C1:Vac-TP3_temp     soft  exists  Temperature (C)                               remove
C1:Vac-TP3_current  soft  exists  Current draw (A)                              remove
C1:Vac-TP3_fail     BI    new     Critical fault has occurred                   new
C1:Vac-TP3_norm     BI    new     Rotation speed is >80% of set point           new

  15500   Fri Jul 24 15:40:59 2020   Jordan   Update   VAC   Installation of two new UPS units

I installed the Tripp Lite SMX1000RT2U and Tripp Lite Smart1000LCD at the bottom of the 1x8 electronics rack. These are plugged in to power, and are ready for testing. All other cables (serial, usb, etc.) have been left on the table next to the 1x8 rack.

  15501   Mon Jul 27 15:48:36 2020   Jon   Summary   VAC   Vacuum parts ordered

To carry out the next steps of the vac refurbishment plan [ELOG 15499], I've ordered parts necessary for interfacing the UPS units and the analog TP2/3 controller outputs with c1vac. The purchase list is appended to the main BHD list and is located here. Some parts we already had in the boxes of Acromag materials. Jordan is gathering what we do already have and staging it on the vacuum controls console table - please don't move them or put them away.

Quote:

Replace failing UPS.

Remove interlock dependencies on TP2/TP3 serial readbacks. Due to persistent glitching [ELOG 15140, ELOG 15392].

  15502   Tue Jul 28 12:22:40 2020   Jon   Update   VAC   Vac interlock test today 1:30 pm

This afternoon Jordan is going to carry out a test of the V4 and V5 hardware interlocks. To inform the interlock improvement plan [15499], we need to characterize exactly how these work (they pre-date the 2018 upgrade). I have provided him a sequence of steps for each test and will also be backing him up on Zoom.

We will close V1 as a precaution but there should be no other impact to the IFO. The tests are expected to take <1 hour. We will advise when they are completed.

  15504   Tue Jul 28 14:11:14 2020   Jon   Update   VAC   Vac interlock test today 1:30 pm

This test has been completed. The IFO configuration has been reverted to nominal.

For future reference: yes, both the V4 and V5 hardware interlocks were found to still be connected and work. A TTL signal from the analog output port of each pump controller (TP2 and TP3) is connected to an auxiliary relay inside the main valve relay box. These serve the purpose of interrupting the (Acromag) control signal to the primary V4/5 relay. This interrupt is triggered by each pump's R1 setpoint signal, which is programmed to go low when the rotation speed falls below 80% of the low-speed setting.

Quote:

This afternoon Jordan is going to carry out a test of the V4 and V5 hardware interlocks. To inform the interlock improvement plan [15499], we need to characterize exactly how these work (they pre-date the 2018 upgrade). I have provided him a sequence of steps for each test and will also be backing him up on Zoom.

We will close V1 as a precaution but there should be no other impact to the IFO. The tests are expected to take <1 hour. We will advise when they are completed.
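The hardware interlock characterized in this entry amounts to a simple piece of relay logic, which can be written out as a truth function. This is only a model of the behavior described in the test result; the function names and the speed values in the example are illustrative.

```python
def r1_setpoint_high(speed_krpm: float, speed_setting_krpm: float) -> bool:
    """R1 output: high while the rotation speed is at least 80% of the setting."""
    return speed_krpm >= 0.8 * speed_setting_krpm


def valve_relay_energized(acromag_open_cmd: bool, r1_high: bool) -> bool:
    """The auxiliary relay passes the Acromag command only while R1 is high."""
    return acromag_open_cmd and r1_high


# Illustrative numbers: a pump set to 50 krpm that slows to 39 krpm.
print(valve_relay_energized(True, r1_setpoint_high(50.0, 50.0)))  # at speed
print(valve_relay_energized(True, r1_setpoint_high(39.0, 50.0)))  # below 80%
```

Because the interruption happens in the relay box, the valve drops out even if the software still commands it open.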

  15526   Fri Aug 14 10:10:56 2020   Jon   Configuration   VAC   Vacuum repairs today

The vac system is going down now for planned repairs [ELOG 15499]. It will likely take most of the day. Will advise when it's back up.

  15527   Sat Aug 15 02:02:13 2020   Jon   Configuration   VAC   Vacuum repairs today

Vacuum work is completed. The TP2 and TP3 interlocks have been overhauled as proposed in ELOG 15499 and seem to be performing reliably. We're now back in the nominal system state, with TP2 again backing for TP1 and TP3 pumping the annuli. I'll post the full implementation details in the morning.

I did not get to setting up the new UPS units. That will have to be scheduled for another day.

Quote:

The vac system is going down now for planned repairs [ELOG 15499]. It will likely take most of the day. Will advise when it's back up.

  15528   Sat Aug 15 15:12:22 2020   Jon   Configuration   VAC   Overhaul of small turbo pump interlocks

Summary

Yesterday I completed the switchover of small turbo pump interlocks as proposed in ELOG 15499. This overhaul altogether eliminates the dependency on RS232 readbacks, which had become unreliable (glitchy) in both controllers. In their place, the V4(5) valve-close interlocks are now predicated on an analog controller output whose voltage goes high when the rotation speed is >= 80% of the nominal setpoint. The critical speed is 52.8 krpm for TP2 and 40 krpm for TP3. There already exist hardware interlocks of V4(5) using the same signals, which I have also tested.

Interlock signal

Unlike the TP1 controller, which exposes simple relays whose open/closed states are sensed by Acromags, what the TP2(3) controllers output is an energized 24V signal for controlling such a relay (output circuit pictured below). I hadn't appreciated this difference and it cost me time yesterday. The ultimate solution was to route the signals through a set of new 24V Phoenix Contact relays installed inside the Acromag chassis. However, this required removing the chassis from the rack and bringing it to the electronics bench (rather than doing the work in situ, as I had planned). The relays are mounted to the second DIN rail opposite the Acromags. Each TP2(3) signal controls the state of a relay, which in turn is sensed using an Acromag XT1111.

Signal routing

The TP2(3) "normal-speed" signals are already in use by hardware interlocks of V4(5). Each signal is routed into the main AC relay box, where it controls an "interrupter" relay through which the Acromag control signal for the main V4(5) relay is passed. These signals are now shared with the digital controls system using a passive DB15 Y-splitter. The signal routing is shown below.
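The series arrangement described above behaves like a logical AND: the main V4(5) relay only sees its control signal while the pump's normal-speed signal holds the interrupter relay closed. A minimal sketch of that logic follows; the function and argument names are hypothetical, and it assumes the valve closes whenever the main relay drops out.

```python
def main_relay_energized(acromag_command: bool, pump_at_normal_speed: bool) -> bool:
    """Return True if the main V4(5) relay receives its control signal.

    The Acromag control signal passes through the interrupter relay, which
    conducts only while the pump's normal-speed signal is high.
    """
    return acromag_command and pump_at_normal_speed


# Truth table: the relay is energized only when both conditions hold.
for cmd in (False, True):
    for norm in (False, True):
        print(f"command={cmd!s:5}  normal_speed={norm!s:5}  ->  "
              f"relay energized: {main_relay_energized(cmd, norm)}")
```

Because the interruption happens in hardware, it does not depend on the software interlock noticing the loss of the normal-speed signal.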

Interlock conditions

The new turbo-pump-related interlock conditions and their channel predicates are listed below. The full up-to-date channel list and wiring assignments for c1vac are maintained here.

Channel             Type  New?  Interlock-triggering condition
C1:Vac-TP1_norm     BI    No    Rotation speed < 90% nominal setpoint (29 krpm)
C1:Vac-TP1_fail     BI    No    Critical fault occurrence
C1:Vac-TP1_current  AI    No    Current draw > 4 A
C1:Vac-TP2_norm     BI    Yes   Rotation speed < 80% nominal setpoint (52.8 krpm)
C1:Vac-TP3_norm     BI    Yes   Rotation speed < 80% nominal setpoint (40 krpm)

There are two new channels, both of which provide a binary indication of whether the pump speed is outside its nominal range. I did not have enough 24V relays to also add the C1:Vac-TP2(3)_fail channels listed in ELOG 15499. However, these signals are redundant with the existing interlocks, and the existing serial "Status" readback will already print failure messages to the MEDM screens. All of the TP2(3) serial readback channels remain, which monitor voltage, current, operational status, and temperature. The pump on/off and low-speed mode on/off controls remain implemented with serial signals as well.
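As an illustration of how the conditions in the table could be evaluated, here is a minimal sketch. It is not the actual interlock code: the function name and the dictionary of readings are hypothetical, and the channel polarity is an assumption (a `_norm` channel is taken to be True at normal speed, and `_fail` True on a fault).

```python
TP1_CURRENT_LIMIT_A = 4.0  # from the table: trip if TP1 draws more than 4 A


def triggered_conditions(readings):
    """Return the names of channels whose interlock condition is met.

    Assumed polarity: *_norm is True at normal speed, *_fail is True on fault.
    """
    tripped = []
    for pump in ("TP1", "TP2", "TP3"):
        if not readings[f"C1:Vac-{pump}_norm"]:
            tripped.append(f"C1:Vac-{pump}_norm")
    if readings["C1:Vac-TP1_fail"]:
        tripped.append("C1:Vac-TP1_fail")
    if readings["C1:Vac-TP1_current"] > TP1_CURRENT_LIMIT_A:
        tripped.append("C1:Vac-TP1_current")
    return tripped


# Hypothetical snapshot: everything healthy except TP2 below normal speed.
snapshot = {
    "C1:Vac-TP1_norm": True,
    "C1:Vac-TP1_fail": False,
    "C1:Vac-TP1_current": 0.55,
    "C1:Vac-TP2_norm": False,
    "C1:Vac-TP3_norm": True,
}
print(triggered_conditions(snapshot))
```

Each binary channel maps to exactly one condition, so no thresholds on the pump speed need to live in software; the 80% and 90% levels are set in the pump controllers.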

The new analog readbacks have been added to the MEDM controls screens, circled below:

Other incidental repairs

  • I replaced the (dead) LED monitor at the vac controls console. In the process of finding a replacement, I came across another dead spare monitor as well. Both have been labeled "DEAD" and moved to Jordan's desk for disposal.
  • I found the current TP3 Varian V70D controller to be just as glitchy in its analog outputs as in its serial readbacks. That likely indicates a problem with the microprocessor itself, not just the serial communications card as I had thought might be the case. I replaced the controller with the spare unit, which was mounted right next to it in the rack [ELOG 13143]. The new unit has not glitched since I installed it around 10 pm last night.

  15537   Mon Aug 24 08:13:56 2020   Jon   Update   VAC   UPS installation

I'm in the lab this morning to interface the two new UPS units with the digital controls system. Will be out by lunchtime. The disruptions to the vac system should be very brief this time.

  15538   Mon Aug 24 11:25:07 2020   Jon   Update   VAC   UPS installation

I'm leaving the lab shortly. We're not ready to switch over the vac equipment to the new UPS units yet.

The 120V UPS is now running and interfaced to c1vac via a USB cable. The unofficial tripplite python package is able to detect and connect to the unit, but then read queries fail with "OS Error: No data received." The firmware has a different version number from what the developers say is known to be supported.

The 230V UPS is actually not correctly installed. For input power, it has a general type C14 connector which is currently plugged into a 120V power strip. However this unit has to be powered from a 230V outlet. We'll have to identify and buy the correct adapter cable.

With the 120V unit now connected, I can continue to work on interfacing it with python remotely. The next implementation I'm going to try is item #2 of this plan [ELOG 15446].

Quote:

I'm in the lab this morning to interface the two new UPS units with the digital controls system. Will be out by lunchtime. The disruptions to the vac system should be very brief this time.

  15541   Wed Aug 26 15:48:31 2020   gautam   Update   VAC   Control screen left open on vacuum workstation

I found that the control MEDM screen was left open on the c1vac workstation. This should be closed every time you leave the workstation, to avoid accidental button pressing and such.

The network outage meant that the EPICS data from the pressure gauges wasn't recorded until I reset everything ~noon. So there isn't really a plot of the outgassing/leak rate. But the pressure rose to ~2e-4 torr, over ~4 hours. The pumpdown back to nominal pressure (9e-6 torr) took ~30 minutes.

  15556   Fri Sep 4 15:26:55 2020   Jon   Update   VAC   Vac system UPS installation

The vac controls are going down now to pull and test software changes. Will advise when the work is completed.

  15557   Fri Sep 4 21:12:51 2020   Jon   Update   VAC   Vac system UPS installation

The vac work is completed. All of the vacuum equipment is now running on the new 120V UPS, except for TP1. The 230V TP1 is still running off wall power, as it always has. After talking with Tripp Lite support today, I believe there is a problem with the 230V UPS. I will post a more detailed note in the morning.

Quote:

The vac controls are going down now to pull and test software changes. Will advise when the work is completed.

  15558   Sat Sep 5 12:01:10 2020   Jon   Update   VAC   Vac system UPS installation

Summary

Yesterday's UPS switchover was mostly a success. The new Tripp Lite 120V UPS is fully installed and is communicating with the slow controls system. The interlocks are configured to trigger a controlled shutdown upon an extended power outage (> ~30 s), and they have been tested. All of the 120V pumpspool equipment (the full c1vac/LAN/Acromag system, pressure gauges, valves, and the two small turbo pumps) has been moved to the new UPS. The only piece of equipment which is not 120V is TP1, which is intended to be powered by a separate 230V UPS. However, that unit is still not working, and after more investigation and a call to Tripp Lite, I suspect it may be defective. A detailed account of the changes to the system follows below.

Unfortunately, I think I damaged the Hornet (the only working cathode ionization gauge in the main volume) by inadvertently unplugging it while switching over equipment to the new UPS. The electronics are run from multiple daisy-chained power strips in the bottom of the rack and it is difficult to trace where everything goes. After the switchover, the Hornet repeatedly failed to activate (either remotely or manually) with the error "HV fail." Its compatriot, the Pirani SuperBee, also failed about a year ago under similar circumstances (or at least its remote interface did, making it useless for digital monitoring and control). I think we should replace them both, ideally with ones with some built-in protection against power failures.

New EPICS channels

Four new soft channels per UPS have been created, although the interlocks are currently predicated on only C1:Vac-UPS120V_status.

Channel                   Type      Description           Units
C1:Vac-UPS120V_status     stringin  Operational status    -
C1:Vac-UPS120V_battery    ai        Battery remaining     %
C1:Vac-UPS120V_line_volt  ai        Input line voltage    V
C1:Vac-UPS120V_line_freq  ai        Input line frequency  Hz
C1:Vac-UPS240V_status     stringin  Operational status    -
C1:Vac-UPS240V_battery    ai        Battery remaining     %
C1:Vac-UPS240V_line_volt  ai        Input line voltage    V
C1:Vac-UPS240V_line_freq  ai        Input line frequency  Hz
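The "extended power outage (> ~30 s)" condition can be pictured as a timer on the status channel. The sketch below is an illustration under stated assumptions, not the interlock code itself: the class name is hypothetical, and it assumes the status string carries NUT-style status flags, where "OL" means on line power and "OB" means on battery.

```python
OUTAGE_GRACE_S = 30.0  # shutdown after an outage longer than ~30 s


class OutageTimer:
    """Track how long the UPS has continuously reported running on battery."""

    def __init__(self, grace_s=OUTAGE_GRACE_S):
        self.grace_s = grace_s
        self._on_battery_since = None

    def update(self, status, now):
        """Feed one status reading; return True once shutdown should begin."""
        flags = status.split()
        if "OB" in flags:
            if self._on_battery_since is None:
                self._on_battery_since = now
            return (now - self._on_battery_since) > self.grace_s
        # Not on battery: line power is present, so reset the timer.
        self._on_battery_since = None
        return False


timer = OutageTimer()
for t, status in [(0.0, "OL"), (10.0, "OB"), (35.0, "OB"), (41.0, "OB LB")]:
    print(t, status, timer.update(status, t))
```

A brief glitch that returns to "OL" before the grace period ends resets the timer, so only a sustained outage triggers the shutdown.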

These new readbacks are visible in the MEDM vacuum control/monitor screens, as circled in Attachment 1:

Continuing issues with 230V UPS

Yesterday I brought with me a custom power cable for the 230V UPS. It adapts from a 208/120V three-phase outlet (L21-20R) to a standard outlet receptacle (5-15P) which can mate with the UPS's C14 power cable. I installed the cable and confirmed that, at the UPS end, 208V AC was present split-phase (i.e., two hot wires separated 120 deg in phase, each at 120V relative to ground). This failed to power on the unit. Then Jordan showed up and suggested to try powering it instead from a single-phase 240V outlet (L6-20R). However we found that the voltage present at this outlet was exactly the same as what the adapter cable provides: 208V split-phase.
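The 208V figure follows from the phasor difference of two 120V legs separated by 120 deg. A short check, with a 180 deg separation included only for comparison (the function name is just for illustration):

```python
import cmath
import math


def line_to_line_voltage(v_leg, phase_sep_deg):
    """RMS voltage between two hot legs of equal amplitude."""
    a = cmath.rect(v_leg, 0.0)
    b = cmath.rect(v_leg, math.radians(phase_sep_deg))
    return abs(a - b)


print(round(line_to_line_voltage(120.0, 120.0), 1))  # 207.8 (legs 120 deg apart)
print(round(line_to_line_voltage(120.0, 180.0), 1))  # 240.0 (legs 180 deg apart)
```

So an outlet fed from two legs of a 208/120V three-phase service cannot supply more than about 208V between its hot wires, regardless of the connector type.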

This UPS nominally requires 230V single-phase. I don't understand well enough how the line-noise-isolation electronics work internally, so I can think of three possible explanations:

  1. 208V AC is insufficient to power the unit.
  2. The unit requires a true neutral wire (i.e., not a split-phase configuration), in which case it is not compatible with the U.S. power grid.
  3. The unit is defective.

I called Tripp Lite technical support. They thought the unit should work as powered in the configuration I described, so this leads me to suspect #3.

@Chub and Jordan: Can you please look into somehow replacing this unit, potentially with a U.S.-specific model? Let's stick with the Tripp Lite brand though, as I already have developed the code to interface those.

UPS-host computer communications

Unlike our older equipment, which communicates serially with the host via RS232/485, the new UPS units can be connected with a USB 3.0 cable. I found a great open-source package for communicating directly with the UPS from within Python, Network UPS Tools (NUT), which eliminates the dependency on Tripp Lite's proprietary GUI. The package is well documented, supports hundreds of power-management devices, and is available in the Debian package manager from Jessie (Debian 8) up. It consists of a large set of low-level, device-specific drivers which communicate with a "server" running as a systemd service. The NUT server can then be queried using a uniform set of programming commands across a huge number of devices.

I document the full set-up procedure below, as we may want to use this with more USB devices in the future.

How to set up

First, install the NUT package and its Python binding:

$ sudo apt install nut python-nut

This automatically creates (and starts) a set of systemd processes which expectedly fail, since we have not yet set up the config. files defining our USB devices. Stop these services, delete their default definitions, and replace them with the modified definitions from the vacuum git repo:

$ sudo systemctl stop nut-*.service
$ sudo rm /lib/systemd/system/nut-*.service
$ sudo cp /opt/target/services/nut-*.service /etc/systemd/system
$ sudo systemctl daemon-reload

Next copy the NUT config. files from the vacuum git repo to the appropriate system location (this will overwrite the existing default ones). Note that the file ups.conf defines the UPS device(s) connected to the system, so for setups other than c1vac it will need to be edited accordingly.

$ sudo cp /opt/target/services/nut/* /etc/nut

Now we are ready to start the NUT server, and then enable it to automatically start after reboots:

$ sudo systemctl start nut-server.service
$ sudo systemctl enable nut-server.service

If it succeeds, the start command will return without printing any output to the terminal. We can test the server by querying all the available UPS parameters with

$ upsc 120v

which will print to the terminal screen something like

battery.charge: 100
battery.runtime: 1215
battery.type: PbAC
battery.voltage: 13.5
battery.voltage.nominal: 12.0
device.mfr: Tripp Lite 
device.model: Tripp Lite UPS 
device.type: ups
driver.name: usbhid-ups
driver.parameter.pollfreq: 30
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.parameter.productid: 2010
driver.parameter.vendorid: 09ae
driver.version: 2.7.2
driver.version.data: TrippLite HID 0.81
driver.version.internal: 0.38
input.frequency: 60.1
input.voltage: 120.3
input.voltage.nominal: 120
output.frequency.nominal: 60
output.voltage.nominal: 120
ups.beeper.status: enabled
ups.delay.shutdown: 20
ups.mfr: Tripp Lite 
ups.model: Tripp Lite UPS 
ups.power.nominal: 1000
ups.productid: 2010
ups.status: OL
ups.timer.reboot: 65535
ups.timer.shutdown: 65535
ups.vendorid: 09ae
ups.watchdog.status: 0

Here 120v is the name assigned to the 120V UPS device in the ups.conf file, so it will vary for setups on other systems.
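Since upsc prints one "key: value" pair per line, its output is easy to turn into a dictionary. The following is a minimal sketch using only the standard library; the function name is hypothetical and this is not the UPS readout script from the vacuum repo.

```python
def parse_upsc(text):
    """Parse `upsc` output ("key: value" per line) into a dict of strings."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a "key: value" separator
            params[key.strip()] = value.strip()
    return params


# A few lines taken from the example output above.
sample = """\
battery.charge: 100
battery.runtime: 1215
input.voltage: 120.3
ups.status: OL
"""
params = parse_upsc(sample)
print(params["ups.status"], float(params["input.voltage"]))
```

On the host itself, the text could be obtained with `subprocess.run(["upsc", "120v"], capture_output=True, text=True, check=True).stdout`, assuming the NUT server is running and the device is named 120v as above.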

If all succeeds to this point, what we have set up so far is a set of command-line tools for querying (and possibly controlling) the UPS units. To access this functionality from within Python scripts, a set of official Python bindings is provided by the python-nut package. However, at the time of writing, these bindings only exist for Python 2.7. For Python 3 applications (like the vacuum system), I have created a Python 3 translation which is included in the vacuum git repo. Refer to the UPS readout script for an illustration of its usage.

  15577   Wed Sep 16 12:03:07 2020   Jon   Update   VAC   Replacing pressure gauges

Below is the list of dead pressure gauges. Their locations are also circled in Attachment 1.

Gauge  Type          Location
CC1    Cold cathode  Main volume
CC3    Cold cathode  Pumpspool
CC4    Cold cathode  RGA chamber
CCMC   Cold cathode  IMC beamline near MC2
P1b    Pirani        Main volume
PTP1   Pirani        TP1 foreline

For replacements, I recommend we consider the Agilent FRG-700 Pirani Inverted Magnetron Gauge. It uses dual sensing techniques to cover a broad pressure range from 3e-9 torr to atmosphere in a single unit. Although these are more expensive, I think we would net save money by not having to purchase two separate gauges (Pirani + hot/cold cathode) for each location. It would also simplify the digital controls and interlocking to have a streamlined set of pressure readbacks.

For controllers, there are two options with either serial RS232/485 or Ethernet outputs. We probably want the Agilent XGS-600, as it can handle all the gauges in our system (up to 12) in a single controller and no new software development is needed to interface it with the slow controls.

  15582   Sat Sep 19 18:07:35 2020   Koji   Update   VAC   TP3 RP failure

I came to the campus and Gautam notified me that he had just received an alert from the vac watchdog.

I checked the vac status at c1vac. PTP3 went up to 10 torr-ish and this made the diff pressure for TP3 over 1torr. Then the watchdog kicked in.

To check the TP3 functionality, AUX RP was turned on and the manual valve (MV in the figure) was opened to pump the foreline of TP3. This easily made PTP3 <0.2 torr and TP3 happy (I didn't try to open V5 though).

So the conclusion is that RP for TP3 has failed. Presumably, the tip-seal needs to be replaced.

Right now TP3 was turned off and is ready for the tip-seal replacement. V5 was closed since the watchdog tripped.

  15586   Sat Sep 19 19:37:16 2020   not Koji   Update   VAC   TP3 RP failure

Disconcerting because those tip seals were just replaced [15417]. Maybe they were just defective, but if there is a more serious problem with the pump, there is a spare Varian roughing pump (the old TP2 dry pump) sitting at the X-end.

I reset the interlock error to unfreeze the vac controls (leaving V5 closed).

Quote:

So the conclusion is that RP for TP3 has failed. Presumably, the tip-seal needs to be replaced.

Right now TP3 was turned off and is ready for the tip-seal replacement. V5 was closed since the watchdog tripped.

  15591   Mon Sep 21 15:57:08 2020   Jordan   Update   VAC   TP3 Forepump Replacement and Vac reset

I removed the forepump (Varian SH-110) for TP3 today to see why it had failed over the weekend. I tested it in the C&B lab and the ultimate pressure was only ~40torr. I checked the tip seals and they were destroyed. The scroll housing also easily pulled off of the motor drive shaft, which is indicative of bad bearings. The excess travel in the bearings likely led to significant increase in tip seal wear. This pump will need to be scrapped, or rebuilt.

I tested the spare Varian SH-110 pump located at the X-end and the ultimate pressure was ~98 mtorr. This pump had tip seals replaced on 11/5/18, and is currently at 55163 operating hours. It has been installed as the TP3 forepump.

Once the pump was installed, the pump line was restarted as follows: V5, VA6, VASE, VASV, VABSSCI, VABS, VABSSCO, VAEV, and VAEE were closed; TP3 was restarted; and once it reached normal operation, the valves were opened in the same order.

The pressure differential interlock condition for V5 was temporarily changed to 10 torr (by Gautam) so that the valves could be opened in a controlled manner. Once the vacuum system was back to its normal state, the V5 interlock condition was set back to the nominal 1 torr. The vacuum system is now running normally.
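To illustrate why the limit had to be relaxed during the restart, here is a minimal sketch of a differential-pressure check. It is only an illustration: the function name is hypothetical, the readings are made up, and which two gauges the real V5 condition compares is an assumption not stated in this entry.

```python
V5_DIFF_LIMIT_TORR = 1.0      # nominal limit quoted in this entry
V5_RESTART_LIMIT_TORR = 10.0  # temporary limit used during the restart


def diff_pressure_tripped(p_upstream_torr, p_downstream_torr, limit_torr):
    """Return True if the pressure difference across the valve exceeds the limit."""
    return abs(p_upstream_torr - p_downstream_torr) > limit_torr


# Hypothetical readings part-way through a restart: foreline still at a few torr.
p_annuli, p_foreline = 0.01, 5.0
print(diff_pressure_tripped(p_annuli, p_foreline, V5_DIFF_LIMIT_TORR))     # True
print(diff_pressure_tripped(p_annuli, p_foreline, V5_RESTART_LIMIT_TORR))  # False
```

With the nominal 1 torr limit, the interlock would refuse to open the valve until the foreline has been pumped down, which is why the limit was raised temporarily and then restored.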

  15599   Wed Sep 23 08:57:18 2020   gautam   Update   VAC   TP2 running HOT

The interlocks tripped at ~630am local time. Jordan reported that TP2 was supposedly running at 52 C (!).

V1 was already closed, but TP2 was still running. With him standing by the rack, I remotely executed the following sequence:

  • VM1 closed (isolates RGA volume).
  • VA6 closed (isolates annuli from being pumped).
  • V7 opened (TP3 now backs TP1, temporarily, until I'm in the lab to check things out further).
  • TP2 turned off.

Jordan confirmed (by hand) that TP2 was indeed hot and this is not just some serial readback issue. I'll do the forensics later.

  15600   Wed Sep 23 10:06:52 2020   Koji   Update   VAC   TP2 running HOT

Here is the timeline. This suggests TP2 backing RP failure.

1st line: TP2 foreline pressure went up. Accordingly TP2 P, current, voltage, and temp went up. TP2 rotation went down.

2nd line: TP2 temp triggered the interlock. TP2 foreline pressure was still high (10torr) so TP2 struggled and was running at 1 torr.

3rd line: Gautam's operation. TP2 was isolated and stopped.

Between the 1st line and 2nd line, TP2 pressure (=TP1 foreline pressure) went up to 1 torr. This made the TP1 current increase from 0.55A to 0.68A (not shown in the plot), but TP1 rotation was not affected.

  15602   Wed Sep 23 15:06:54 2020   Jordan   Update   VAC   TP2 Forepump Re-install

I removed the forepump to TP2 this morning after the vacuum failure, and tested in the C&B lab. I pumped down on a small volume 10 times, with no issue. The ultimate pressure was ~30 mtorr.

I re-installed the forepump in the afternoon, and restarted TP2, leaving V4 closed. This will run overnight to test, while TP3 backs TP1.

In order to open V1, with TP3 backing TP1, the interlock system had to be reset since it is expecting TP2 as a backing pump. TP2 is running normally, and pumping of the main volume has resumed.


gautam 2030:

  1. The monitor (LCD display) at the vacuum rack doesn't work, and this has been the case since Monday at least. I usually use my laptop to ssh in, so I hadn't noticed; it could have been broken for longer. For anyone wishing to use the workstation arrangement at 1X8, this is not great. Today we borrowed the vertex laptop to ssh in; it has since been returned to its nominal location.
  2. The modification to the interlock condition was made by simply commenting out the line requiring V4 to be open for V1 to be opened. I made a copy of the original .yaml file which we can revert to once we go back to the normal config.
  3. I also opened VM1 to allow the RGA scans to continue to be meaningful.
  4. At the time of writing, all systems seem nominal. See Attachment #2. The vertical line indicates when we started pumping on the main volume again earlier today, with TP3 backing TP1.

Unclear why the TP2 foreline pump failed in the first place, it has been running fine for several hours now (although TP2 has no load, since V4 isolates it from the main volume). Koji's plots show that the TP2 foreline pressure did not recover even after the interlock tripped and V4 was closed (i.e. the same conditions as TP2 sees right now).

  15615   Tue Oct 6 14:35:16 2020   Jordan   Update   VAC   Spare forepumps

I have placed three new-in-box IDP 7 forepumps along the X arm of the interferometer. These are to be used as spares for both the 40m and Clean and Bake.

  15668   Tue Nov 10 11:59:37 2020   gautam   Update   VAC   Stuck RV2

I've uploaded some more photos here. I believe the problem is a worn out thread where the main rotary handle attaches to the shaft that operates the valve.

This morning, I changed the valve config such that TP2 backs TP1 and that combo continues to pump on the main volume through the partially open RV2. TP3 was reconfigured to pump the annuli - initially, I backed it with the AUX drypump but since the load has decreased now, I am turning the AUX drypump off. At some point, if we want to try it, we can try pumping the main volume via the RGA line using TP2/TP3 and see if that allows us to get to a lower pressure, but for now, I think this is a suitable configuration to continue the IFO work.

There was a suggestion at the meeting that the saturation of the main volume pressure at 1mtorr could be due to a leak - to test, I closed V1 for ~5 hours and saw the pressure increased by 1.5 mtorr, which is in line with our estimates from the past. So I think we can discount that possibility.
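For reference, the rate of rise implied by the valve-closed test is simple arithmetic on the numbers quoted above. A small sketch (the function name is just for illustration):

```python
def rise_rate_utorr_per_hr(delta_p_torr, hours):
    """Average pressure rise rate in microtorr per hour."""
    return delta_p_torr / hours * 1e6


# V1 closed for ~5 hours, pressure increased by 1.5 mtorr.
print(round(rise_rate_utorr_per_hr(1.5e-3, 5.0)))
```

That is about 300 utorr/hr with the main volume isolated, which is the figure being compared against past estimates.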

  15681   Wed Nov 18 17:51:50 2020   gautam   Update   VAC   Agilent pressure gauge controller delivered

It is stored along with the cables that arrived a few weeks ago, awaiting the gauges which are now expected next week sometime.

  15686   Mon Nov 23 16:33:10 2020   gautam   Update   VAC   More vacuum deliveries

Five Agilent pressure gauges were delivered to the 40m. They are stored with the controller and cables in the office area. This completes the inventory for the gauge replacement: we have all the ordered parts in hand (though not necessarily all the adaptor flanges etc.). I'll see if I can find some cabinet space in the VEA to store these, the clutter is getting out of hand again...

In addition, the spare gate valve from LHO was also delivered to the 40m today. It is stored at EX with the other spare valves.

Quote:

It is stored along with the cables that arrived a few weeks ago, awaiting the gauges which are now expected next week sometime.

  15692   Wed Dec 2 12:27:49 2020   Jon   Update   VAC   Replacing pressure gauges

Now that the new Agilent full-range gauges (FRGs) have been received, I'm putting together an installation plan. Since my last planning note in Sept. (ELOG 15577), two more gauges appear to be malfunctioning: CC2 and PAN. Those are taken into account, as well. Below are the proposed changes for all the sensors in the system.

In summary:

  • Four of the FRGs will replace CC1/2/3/4.
  • The fifth FRG will replace CCMC if the 15.6 m cable (the longest available) will reach that location.
  • P2 and P3 will be moved to replace PTP1 and PAN, as they will be redundant once the new FRGs are installed.

Required hardware:

  • 3x CF 2.75" blanks
  • 10x CF 2.75" gaskets
  • Bolts and nut plates

Volume     Sensor Location  Status               Proposed Action
Main       P1a              functioning          leave
Main       P1b              local readback only  leave
Main       CC1              dead                 replace with FRG
Main       CCMC             dead                 replace with FRG*
Pumpspool  PTP1             dead                 replace with P2
Pumpspool  P2               functioning          replace with 2.75" CF blank
Pumpspool  CC2              intermittent         replace with FRG
Pumpspool  PTP2             functioning          leave
Pumpspool  P3               functioning          replace with 2.75" CF blank
Pumpspool  CC3              dead                 replace with FRG
Pumpspool  PTP3             functioning          leave
Pumpspool  PRP              functioning          leave
RGA        P4               functioning          leave
RGA        CC4              dead                 replace with FRG
RGA        IG1              dead                 replace with 2.75" CF blank
Annuli     PAN              intermittent         replace with P3
Annuli     PASE             functioning          leave
Annuli     PASV             functioning          leave
Annuli     PABS             functioning          leave
Annuli     PAEV             functioning          leave
Annuli     PAEE             functioning          leave

 

Quote:

For replacements, I recommend we consider the Agilent FRG-700 Pirani Inverted Magnetron Gauge. It uses dual sensing techniques to cover a broad pressure range from 3e-9 torr to atmosphere in a single unit. Although these are more expensive, I think we would net save money by not having to purchase two separate gauges (Pirani + hot/cold cathode) for each location. It would also simplify the digital controls and interlocking to have a streamlined set of pressure readbacks.

For controllers, there are two options with either serial RS232/485 or Ethernet outputs. We probably want the Agilent XGS-600, as it can handle all the gauges in our system (up to 12) in a single controller and no new software development is needed to interface it with the slow controls.

 

  15698   Thu Dec 3 10:33:00 2020   gautam   Update   VAC   TrippLite UPS delivered

The latest greatest UPS has been delivered. I will move it to near the vacuum rack in its packaging for storage. It weighs >100lbs so care will have to be taken when installing - can the rack even support this?

  15703   Thu Dec 3 14:53:58 2020   Jon   Update   VAC   Replacing pressure gauges

Update to the gauge replacement plan (15692), based on Jordan's walk-through today. He confirmed:

  • All of the gauges being replaced are mounted via 2.75" ConFlat flange. The new FRGs have the same footprint, so no adapters are required.
  • The longest Agilent cable (50 ft) will NOT reach the CCMC location. The fifth FRG will have to be installed somewhere closer to the X-end.

Based on this info (and also info from Gautam that the PAN gauge is still working), I've updated the plan as follows. In summary, I now propose we install the fifth FRG in the TP1 foreline (PTP1 location) and leave P2 and P3 where they are, as they are no longer needed elsewhere. Any comments on this plan? I plan to order all the necessary gaskets, blanks, etc. tomorrow.

Volume     Sensor Location  Status               Proposed Action
Main       P1a              functioning          leave
Main       P1b              local readback only  leave
Main       CC1              dead                 replace with FRG
Main       CCMC             dead                 remove; cap with 2.75" CF blank
Pumpspool  PTP1             dead                 replace with FRG
Pumpspool  P2               functioning          leave
Pumpspool  CC2              dead                 replace with FRG
Pumpspool  PTP2             functioning          leave
Pumpspool  P3               functioning          leave
Pumpspool  CC3              dead                 replace with FRG
Pumpspool  PTP3             functioning          leave
Pumpspool  PRP              functioning          leave
RGA        P4               functioning          leave
RGA        CC4              dead                 replace with FRG
RGA        IG1              dead                 remove; cap with 2.75" CF blank
Annuli     PAN              functioning          leave
Annuli     PASE             functioning          leave
Annuli     PASV             functioning          leave
Annuli     PABS             functioning          leave
Annuli     PAEV             functioning          leave
Annuli     PAEE             functioning          leave

  15721   Wed Dec 9 20:14:49 2020   gautam   Update   VAC   UPS failure

Summary:

  1. The (120V) UPS at the vacuum rack is faulty.
  2. The drypump backing TP2 is faulty.
  3. Current status of vacuum system: 
    • The old UPS is now powering the rack again. Some time ago, I noticed the "replace battery" indicator light on this unit was on, but it is no longer on. So I judged this is the best course of action. At least this UPS hasn't randomly failed before...
    • main vol is being pumped by TP1, backed by TP3.
    • TP2 remains off.
    • The annular volumes are isolated for now while we figure out what's up with TP2.
    • The pressure went up to ~1 mtorr (c.f. ~600utorr that is the nominal value with the stuck RV2) during the whole episode but is coming back down now.
  4. Steve seems to have taken the reliability of the vacuum system with him.

Details:

Around 7pm, the UPS at the vacuum rack seems to have failed. Don't ask me why I decided to check the vacuum screen 10 mins after the failure happened, but the point is, this was a silent failure so the protocols need to be looked into.

Going to the rack, I saw (unsurprisingly) that the 120V UPS was off. 

  • Pushed the power on button - the LCD screen would briefly light up, say the line voltage was 120 V, and then turned itself off. Not great.
  • I traced the power connection to the UPS itself to a power strip under the rack - then I moved the plug from one port to another. Now the UPS stays on. okay...
  • but after ~3 mins while I'm hunting for a VGA cable, I hear an incessant beeping. The UPS display has the "Fault" indicator lit up. 
  • I decided to shift everything back to the old UPS. After the change was made, I was able to boot up the c1vac machine again, and began the recovery process.
  • When I tried to start TP2, the drypump was unusually noisy, and I noticed PTP2 bottomed out at ~500 torr (yes torr). So clearly something is not right here. This pump supposedly had its tip-seal replaced by Jordan just 3 months ago. This is not a normal lifetime for the tip seal - we need to investigate more in detail what's going on here...
  • Decided that an acceptable config is to pump the main volume (so that we can continue working on other parts of the IFO). The annuli are all <10mtorr and holding, so that's just fine I think.

Questions:

  1. Are the failures of TP2 drypump and UPS related? Or coincidence? Who is the chicken and who is the egg?
  2. What's up with the short tip seal lifetime?
  3. Why did all of this happen without any of our systems catching it and sending an alert??? I have left the UPS connected to the USB/ethernet interface in case anyone wants to remotely debug this.

For now, I think this is a safe state to leave the system in. Unless I hear otherwise, I will leave it so - I will be in the lab another hour tonight (~10pm).

Some photos and a screen-cap of the Vac medm screen attached.

  15722   Thu Dec 10 11:07:24 2020   Chub   Update   VAC   UPS fault

Is that a fault code that you can decipher in the manual, or just a light telling you nothing but your UPS is dead?

  15723   Thu Dec 10 11:17:50 2020   Chub   Update   VAC   UPS fault

I can't find anything in the manual that describes the nature of the FAULT message. In fact, it's not mentioned at all. If the unit detects a fault at its output, I would expect a bit more information. This unit does have a programmable level of input error protection, too, usually set at 100%. Still, there is no indication in the manual whether an input issue would be described as a fault; that usually means a short or lifted ground at the output.

Quote:

Is that a fault code that you can decipher in the manual, or just a light telling you nothing but your UPS is dead?

  15724   Thu Dec 10 13:05:52 2020   Jon   Update   VAC   UPS failure

I've investigated the vacuum controls failure that occurred last night. Here's what I believe happened.

From looking at the system logs, it's clear that there was a sudden loss of power to the control computer (c1vac). Also, the system was actually down for several hours. The syslog shows normal EPICS channel writes (pressure readback updates, etc., and many of them per minute) which suddenly stop at 4:12 pm. There are no error or shutdown messages in the syslog or in the interlock log. The next activity is the normal start-up messaging at 7:39 pm. So this is all consistent with the UPS suddenly failing.
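The same kind of check can be automated by scanning log timestamps for gaps. A minimal sketch follows; the function name and the 5-minute threshold are assumptions, and the example entries are hypothetical timestamps that simply bracket the 4:12 pm and 7:39 pm times noted above.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_gap=timedelta(minutes=5)):
    """Return (start, end) pairs where consecutive entries are > max_gap apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


# Hypothetical log entries bracketing the outage (Dec 9, 2020).
entries = [
    datetime(2020, 12, 9, 16, 11, 30),
    datetime(2020, 12, 9, 16, 12, 0),
    datetime(2020, 12, 9, 19, 39, 0),
    datetime(2020, 12, 9, 19, 39, 5),
]
for start, end in find_gaps(entries):
    print(f"no log activity from {start:%H:%M} to {end:%H:%M} ({end - start})")
```

Since the syslog normally shows many channel writes per minute, any gap longer than a few minutes is a strong hint that the machine itself was down.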

According to the Tripp Lite manual, the FAULT icon indicates "the battery-supported outlets are overloaded." The failure of the TP2 dry pump appears to have caused this. After the dry pump failure, the rising pressure in the TP2 foreline caused TP2's current draw to increase way above its normal operating range. Attachment 1 shows anomalously high TP2 current and foreline pressure in the minutes just before the failure. The critical system-wide failure is that this overloaded the UPS before overloading TP2's internal protection circuitry, which would have shut down the pump, triggering interlocks and auto-notifications.

Preventing this in the future:

First, there are too many electronics on the 1 kVA UPS. The reason I asked us to buy a dual 208/120V UPS (which we did buy) is to relieve the smaller 120V UPS. I envision moving the turbo pumps, gauge controllers, etc. all to the 5 kVA unit and reserving the smaller 1 kVA unit for the c1vac computer and its peripherals. We now have the dual 208/120V UPS in hand. We should make it a priority to get that installed.

Second, there are 1 Hz "blinker" channels exposed for c1vac and all the slow controls machines, each reporting the machine's alive status. I don't think they're being monitored by any auto-notification program (running on a central machine), but they could be. Maybe there already exists code that could be co-opted for this purpose? There is an MEDM screen displaying the slow machine statuses at Sitemap > CDS > SLOW CONTROLS STATUS, pictured in Attachment 2. This is the only way I know to catch sudden failures of the control computer itself.
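A watchdog for one of these blinker channels could look roughly like the sketch below. It assumes the monitoring program on the central machine polls the channel itself (e.g. with an EPICS client library) and passes each reading in; the class name, the 5 s timeout, and the notification step are all hypothetical choices, not existing code.

```python
import time


class BlinkerWatchdog:
    """Declare a machine dead if its blinker value stops toggling for too long."""

    def __init__(self, timeout_s=5.0):
        self.timeout_s = timeout_s  # assumed tolerance; a 1 Hz blinker should toggle well within this
        self.last_value = None
        self.last_change = None

    def update(self, value, now=None):
        """Record a new reading and return True if the machine still looks alive."""
        now = time.monotonic() if now is None else now
        # The first reading, or any change in value, counts as a sign of life.
        if self.last_change is None or value != self.last_value:
            self.last_value = value
            self.last_change = now
        return self.is_alive(now)

    def is_alive(self, now=None):
        now = time.monotonic() if now is None else now
        return self.last_change is not None and (now - self.last_change) <= self.timeout_s


if __name__ == "__main__":
    dog = BlinkerWatchdog(timeout_s=5.0)
    # Simulated readings as (time in seconds, blinker value); the blinker freezes after t = 1 s.
    for t, value in [(0.0, 0), (1.0, 1), (3.0, 1), (7.0, 1)]:
        status = "alive" if dog.update(value, now=t) else "DEAD, send notification"
        print(f"t={t:4.1f}s value={value} -> {status}")
```

The check keys on the value changing rather than on the read succeeding, so a frozen machine whose last value is still being served is also caught.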

  15725   Thu Dec 10 14:29:26 2020 gautamUpdateVACUPS failure

I don't buy this story - P2 only briefly burped around GPStime 1291608000 which is around 8pm local time, which is when I was recovering the system.

Today, Jordan talked to Jon Feicht - apparently there is some kind of valve in the TP2 forepump, which only opens ~15-20 seconds after turning the pump on. So the loud sound I was hearing yesterday was just a transient. This morning at ~9am, we turned on TP2. Once again, the PTP2 pressure hovered around 500 torr for about 15-20 seconds. Then it started to drop, although both Jordan and I felt that the time it took for the pressure to drop through the range 5 mtorr - 1 mtorr was unusually long. Jordan suspects some "soft-start" feature of the Turbo Pumps, which maybe spins up the pump in a more controlled way than usual after an event like a power failure. Maybe that explains why the pressure dropped so slowly? One thing is for sure - the TP2 controller displayed "TOO HIGH LOAD" yesterday when I tried the first restart (before migrating everything to the older UPS unit). This is what led me to interpret the loud sound on startup of TP2 as a sign of some issue with the forepump - as it turns out, it was just the internal valve not yet being open.

Anyway, we left TP2 on for a few hours, pumping only on the little volume between it and V4, and PTP2 remained stable at 20 mtorr. So we judged it's okay to open V4. For today, we will leave the system with both TP2 and TP3 backing TP1. Given the lack of any real evidence of a failure from TP2, I have no reason to believe there is elevated risk.

As for prioritising UPS swap - my opinion is that it's better to just replace the batteries in the UPS that has worked for years. We can run a parallel reliability test of the new UPS and once it has demonstrated stability for some reasonable time (>4 months), we can do the swap.


I was able to clear the FAULT indicator on the new UPS by running a "self-test". According to the manual, pressing and holding the "mute" button on the front panel initiates this test, and if all is well, it clears the FAULT indicator, which it did. I still don't trust this unit and have left everything powered by the old UPS.


Update 1100 Dec 11: The config remained stable overnight so today I reverted to the nominal config of TP3 pumping the annuli and TP2 backing TP1 which pumps the main volume (through the partially open RV2).

Quote:
 

According to the Tripp Lite manual, the FAULT icon indicates "the battery-supported outlets are overloaded." The failure of the TP2 dry pump appears to have caused this. After the dry pump failure, the rising pressure in the TP2 foreline caused TP2's current draw to increase way above its normal operating range. Attachment 1 shows anomalously high TP2 current and foreline pressure in the minutes just before the failure. The critical system-wide failure is that this overloaded the UPS before overloading TP2's internal protection circuitry, which would have shut down the pump, triggering interlocks and auto-notifications.

  15748   Wed Jan 6 15:28:04 2021 gautamUpdateVACVac rack UPS batteries replaced

[chub, gautam]

The replacement was done this afternoon. The red "Replace Battery" indicator is no longer on.
