[Yuta, Koji, Paco]
We were having some trouble restarting all the models on the FEs. The error was the famous 0x4000 DC error, which has to do with time de-synchronization between fb1 and a given FE. We tried a combination of things haphazardly, such as reloading the gpstime process using
without much success, even when doing this again after hard rebooting FE + IO chassis combinations around the lab. Koji prompted us to check the local times as reported by the gpstime module, and comparing it to network reported times we saw the expected offset of ~ 3.5 s. On a given FE ("c1***") and fb1 separately, we ran:
which meant a couple of things:
By looking at previous elogs with similar issues, we tried two things;
We tried rebooting fb1 to see if it connected, but eventually what solved this was restarting the bind9 service on chiara! Now we could ping google, and saw this output
meaning fb1 was getting its time served. Going back to the FEs, we still couldn't see the ntp synchronized flag up, but it just took time after a few minutes we saw the FEs in sync! This also meant that we could finally restart all FE models, which we successfully did following the script described in the wiki. Then we had to reload the modbusIOC service in all the slow machines (sometimes this required us to call sudo systemctl daemon-reload) and performed burt restore to a last Friday's snap file collection.
With Koji's help PMC locked, and then Yuta and Paco manually increased the input power to the IFO by rotating the waveplate picomotor to 37.0 deg. After this, we noticed that the MC REFL spot was not hitting the camera, so maybe MC1 was misaligned. Paco checked the AP table and saw the spot horizontally misaligned on the camera, which gave us the initial YAW correction on MC1. After some IMC recovery, we saw only MC1 got spontaneously kicked along both PIT and YAW, making our alignment futile. Though not hard to recover, we wondered why this happened.
We went into the 1X4 rack and pushed MC1 suspension cables in to disregard loose connections, but as we came back into the control room we again saw it being kicked randomly! We even turned damping off for a little while and this random kicking didn't stop. There was no significant seismic motion at the time so it is still unclear of what is happening.