ID | Date | Author | Type | Category | Subject |
6584
|
Mon Apr 30 16:56:05 2012 |
Suresh | Update | CDS | Frame Builder is down |
Quote: |
Quote: |
Frame builder is down. PRM has tripped its watchdogs. I have reset the watchdog on PRM and turned on the OPLEV. It has damped down. I am unable to check what happened since the FB is not responding.
There was a minor earthquake yesterday morning, which people could feel a few blocks away. It could have caused the PRM to unlock.
Jamie, Rolf, is it okay for us to restart the FB?
|
If it's down it's always ok to restart it. If it doesn't respond or immediately crashes again after the restart then it might require some investigation, but it should always be ok to restart it.
|
I tried restarting the fb in two different ways. Neither of them re-established the connection to dtt or epics.
1) I restarted the fb from the control room console with the 'shutdown' command. No change.
2) I halted the machine with 'shutdown -h now' and restarted it with the hardware reset button on its front-panel. No change.
The console connected to the fb showed that the network file systems did not mount. Could this have caused several services to fail to start, since they could not find the files stored on the network file system?
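A quick way to check whether the NFS mounts actually came up is to list them from a shell on the fb (a generic check; I have not listed the specific mount points the fb expects):
mount -t nfs     # shows the NFS file systems that are currently mounted
df -h            # the network volumes should show up with sensible sizes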
The fb is otherwise healthy since I am able to ssh into it and browse the directory structure. |
6586
|
Mon Apr 30 20:43:33 2012 |
Suresh | Update | CDS | Frame Builder is down |
Quote: |
Quote: |
Quote: |
Frame builder is down. PRM has tripped its watchdogs. I have reset the watchdog on PRM and turned on the OPLEV. It has damped down. I am unable to check what happened since the FB is not responding.
There was a minor earthquake yesterday morning, which people could feel a few blocks away. It could have caused the PRM to unlock.
Jamie, Rolf, is it okay for us to restart the FB?
|
If it's down it's always ok to restart it. If it doesn't respond or immediately crashes again after the restart then it might require some investigation, but it should always be ok to restart it.
|
I tried restarting the fb in two different ways. Neither of them re-established the connection to dtt or epics.
1) I restarted the fb from the control room console with the 'shutdown' command. No change.
2) I halted the machine with 'shutdown -h now' and restarted it with the hardware reset button on its front-panel. No change.
The console connected to the fb showed that the network file systems did not mount. Could this have caused several services to fail to start, since they could not find the files stored on the network file system?
The fb is otherwise healthy since I am able to ssh into it and browse the directory structure.
|
[Mike, Rana]
The fb is okay. Rana found that the connection to it works from Pianosa, but not from Allegra or Rossa. It also works from Rosalba, on which Jamie recently installed Ubuntu.
The white fields on the medm 'Status' screen for fb are an unrelated problem.
|
6591
|
Tue May 1 08:18:50 2012 |
Jamie | Update | CDS | Frame Builder is down |
Quote: |
I tried restarting the fb in two different ways. Neither of them re-established the connection to dtt or epics.
|
Please be conscious of what components are doing what. The problem you were experiencing was not "frame builder down". It was "dtt not able to connect to frame builder". Those are potentially completely different things. If the front end status screens show that the frame builder is fine, then it's probably not the frame builder.
Also "epics" has nothing whatsoever to do with any of this. That's a completely different set of stuff, unrelated to DTT or the frame builder. |
6599
|
Thu May 3 19:52:43 2012 |
Jenne | Update | CDS | Output errors from dither alignment (Xarm) script | epicsThreadOnceOsd epicsMutexLock failed.
Segmentation fault
Number found where operator expected at -e line 1, near "0 0"
(Missing operator before 0?)
syntax error at -e line 1, near "0 0"
Execution of -e aborted due to compilation errors.
Number found where operator expected at (eval 1) line 1, near "* * 50"
(Missing operator before 50?)
epicsThreadOnceOsd epicsMutexLock failed.
Segmentation fault
Number found where operator expected at -e line 1, near "0 0"
(Missing operator before 0?)
syntax error at -e line 1, near "0 0"
Execution of -e aborted due to compilation errors.
syntax error at -e line 1, at EOF
Execution of -e aborted due to compilation errors.
Number found where operator expected at (eval 1) line 1, near "* * 50"
(Missing operator before 50?)
epicsThreadOnceOsd epicsMutexLock failed.
status : 0
I am going to execute the following commands
ezcastep -s 0.6 C1:SUS-ETMX_PIT_COMM +-0,50
ezcastep -s 0.6 C1:SUS-ITMX_PIT_COMM +,50
ezcastep -s 0.6 C1:SUS-ETMX_YAW_COMM +,50
ezcastep -s 0.6 C1:SUS-ITMX_YAW_COMM +-0,50
ezcastep -s 0.6 C1:SUS-BS_PIT_COMM +0,50
ezcastep -s 0.6 C1:SUS-BS_YAW_COMM +0,50
hit a key to execute the commands above
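For reference, the Perl complaints above are the kind of thing you get when a value read back from EPICS comes out empty or as two tokens and is then interpolated into a perl -e expression. A hypothetical reproduction (not taken from the actual dither script):
step="0 0"                      # hypothetical bad readback: two tokens instead of one number
perl -e "print $step + 1"       # -> Number found where operator expected ... near "0 0"
gain=""                         # hypothetical empty readback
perl -e "print 2 * $gain * 50"  # -> a similar "syntax error ... near" complaint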
|
6609
|
Sun May 6 00:11:00 2012 |
Den | Update | CDS | mx_stream | c1sus and c1iscex computers could not connect to the framebuilder. I restarted it, which did not help. Then I restarted the mx_stream daemon on each of the computers, and this fixed the problem.
sudo /etc/init.d/mx_stream restart
|
6616
|
Mon May 7 21:05:38 2012 |
Den | Update | CDS | biquad filter form | I wanted to switch the implementation of IIR_FILTER from DIRECT FORM II to BIQUAD form in the C1IOO and C1SUS models. I modified the RCG file /opt/rtcds/rtscore/release/src/fe/controller.c by adding a #define CORE_BIQUAD line:
#ifdef OVERSAMPLE
#define CORE_BIQUAD
#if defined(CORE_BIQUAD)
The C1IOO model compiled, installed and is running now. The C1SUS model compiled, but during installation I got an error:
controls@c1sus ~ 0$ rtcds install c1sus
Installing system=c1sus site=caltech ifo=C1,c1
Installing /opt/rtcds/caltech/c1/chans/C1SUS.txt
Installing /opt/rtcds/caltech/c1/target/c1sus/c1susepics
Installing /opt/rtcds/caltech/c1/target/c1sus
Installing start and stop scripts
/opt/rtcds/caltech/c1/scripts/killc1sus
Performing install-daq
Updating testpoint.par config file
/opt/rtcds/caltech/c1/target/gds/param/testpoint.par
/opt/rtcds/rtscore/branches/branch-2.5/src/epics/util/updateTestpointPar.pl -par_file=/opt/rtcds/caltech/c1/target/gds/param/archive/testpoint_120507_205359.par -gds_node=21 -site_letter=C -system=c1sus -host=c1sus
Installing GDS node 21 configuration file
/opt/rtcds/caltech/c1/target/gds/param/tpchn_c1sus.par
Installing auto-generated DAQ configuration file
/opt/rtcds/caltech/c1/chans/daq/C1SUS.ini
Installing EDCU ini file
/opt/rtcds/caltech/c1/chans/daq/C1EDCU_SUS.ini
Installing Epics MEDM screens
Running post-build script
ERROR: Could not find file: test.py
Searched path: :/opt/rtcds/userapps/release/cds/c1/scripts:/opt/rtcds/userapps/release/cds/common/scripts:/opt/rtcds/userapps/release/isc/c1/scripts:/opt/rtcds/userapps/release/isc/common/scripts:/opt/rtcds/userapps/release/sus/c1/scripts:/opt/rtcds/userapps/release/sus/common/scripts:/opt/rtcds/userapps/release/psl/c1/scripts:/opt/rtcds/userapps/release/psl/common/scripts
Exiting
make: *** [install-c1sus] Error 1
Jamie, what is this test.py? |
6618
|
Mon May 7 21:46:10 2012 |
Den | Update | CDS | guralp signal error | GUR1_XYZ_IN1 and GUR2_XYZ_IN1 are the same and equal to GUR2_XYZ. This is bad since GUR1_XYZ_IN1 should be equal to GUR1_XYZ. Note that GUR#_XYZ are copies of GUR#_XYZ_OUT, so there may be (although there isn't right now) filtering between the _IN1's and the _OUT's. But certainly GUR1 should look like GUR1, not GUR2!!!
Looks like a CDS problem, maybe some channel-hopping going on? I'm trying a restart of the c1sus computer right now, to see if that helps...
Figure: Green and red should be the same, yellow and blue should be the same. Note however that green matches yellow and blue, not red. Bad.
|
6619
|
Mon May 7 22:39:37 2012 |
Den | Update | CDS | c1sus | [Jenne, Den]
We decided to reboot the C1SUS machine in the hope that this would fix the problem with the seismic channels. After the reboot the machine could not connect to the framebuilder. We restarted mx_stream but this did not help. Then we manually executed
/opt/rtcds/caltech/c1/target/fb/mx_stream -s c1x02 c1sus c1mcs c1rfm c1pem -d fb:0 -l /opt/rtcds/caltech/c1/target/fb/mx_stream_logs/c1sus.log
but c1sus still could not connect to fb. This script returned the following error:
controls@c1sus ~ 128$ cat /opt/rtcds/caltech/c1/target/fb/mx_stream_logs/c1sus.log
c1x02
c1sus
c1mcs
c1rfm
c1pem
mmapped address is 0x7fb5ef8cc000
mapped at 0x7fb5ef8cc000
mmapped address is 0x7fb5eb8cc000
mapped at 0x7fb5eb8cc000
mmapped address is 0x7fb5e78cc000
mapped at 0x7fb5e78cc000
mmapped address is 0x7fb5e38cc000
mapped at 0x7fb5e38cc000
mmapped address is 0x7fb5df8cc000
mapped at 0x7fb5df8cc000
send len = 263596
OMX: Failed to find peer index of board 00:00:00:00:00:00 (Peer Not Found in the Table)
mx_connect failed
Looks like a CDS error. We are leaving the WATCHDOGS OFF for the night. |
6622
|
Tue May 8 09:47:53 2012 |
Jamie | Update | CDS | biquad filter form |
Quote: |
I wanted to switch the implementation of IIR_FILTER from DIRECT FORM II to BIQUAD form in the C1IOO and C1SUS models. I modified the RCG file /opt/rtcds/rtscore/release/src/fe/controller.c by adding a #define CORE_BIQUAD line:
#ifdef OVERSAMPLE
#define CORE_BIQUAD
#if defined(CORE_BIQUAD)
|
I am really not ok with anyone modifying controller.c. If we're going to be messing around with that we need to change our procedures significantly. This is the code that runs all the models, and we don't currently have any way to track changes in the code.
Did you change it back? If not, do so immediately and stop messing with it. Please consult with us first before embarking on these kinds of severe changes to our code. This is the kind of shit that other people have done that has bit us in the ass in the past.
Furthermore, there is already a way to enable biquad filters in the new version without modifying the RCG source. All you need to do is set biquad=1 in the cdsParameters block for your model.
DO NOT MESS WITH CONTROLLER.C! |
6623
|
Tue May 8 09:58:17 2012 |
Den | Update | CDS | SUS -> FB | [Alex, Den]
Restarting mx_stream yesterday was in vain, as C1SUS did not see the FB:
controls@c1sus ~ 0$ /opt/open-mx/bin/omx_info
Open-MX version 1.3.901
build: root@fb:/root/open-mx-1.3.901 Wed Feb 23 11:13:17 PST 2011
Found 1 boards (32 max) supporting 32 endpoints each:
c1sus:0 (board #0 name eth1 addr 00:25:90:06:59:f3)
managed by driver 'igb'
Peer table is ready, mapper is 00:60:dd:46:ea:ec
================================================
0) 00:25:90:06:59:f3 c1sus:0
1) 00:60:dd:46:ea:ec fb:0 // this line was missing
2) 00:14:4f:40:64:25 c1ioo:0
3) 00:30:48:be:11:5d c1iscex:0
4) 00:30:48:bf:69:4f c1lsc:0
5) 00:30:48:d6:11:17 c1iscey:0
At the same time FB saw C1SUS:
controls@fb ~ 0$ /opt/mx/bin/mx_info
MX Version: 1.2.12
MX Build: root@fb:/root/mx-1.2.12 Mon Nov 1 13:34:38 PDT 2010
1 Myrinet board installed.
The MX driver is configured to support a maximum of:
8 endpoints per NIC, 1024 NICs on the network, 32 NICs per host
===================================================================
Instance #0: 299.8 MHz LANai, PCI-E x8, 2 MB SRAM, on NUMA node 0
Status: Running, P0: Link Up
Network: Ethernet 10G
MAC Address: 00:60:dd:46:ea:ec
Product code: 10G-PCIE-8AL-S
Part number: 09-03916
Serial number: 352143
Mapper: 00:60:dd:46:ea:ec, version = 0x00000000, configured
Mapped hosts: 6
ROUTE COUNT
INDEX MAC ADDRESS HOST NAME P0
----- ----------- --------- ---
0) 00:60:dd:46:ea:ec fb:0 1,0
1) 00:30:48:d6:11:17 c1iscey:0 1,0
2) 00:30:48:be:11:5d c1iscex:0 1,0
3) 00:30:48:bf:69:4f c1lsc:0 1,0
4) 00:25:90:06:59:f3 c1sus:0 1,0
5) 00:14:4f:40:64:25 c1ioo:0 1,0
For that reason, when I restarted mx_stream on c1sus, the script tried to connect to the placeholder 00:00:00:00:00:00 address, since the true fb address was not in the peer table.
Alex restarted mx on the FB. Note that the DAQD process will not allow you to do that while it is running, and you can't just kill it either, since init will restart it automatically. For that reason one should open /etc/inittab and replace respawn with stop in the line
daq:345:respawn:/opt/rtcds/caltech/c1/target/fb/start_daqd.inittab
then have init re-read inittab with init q, and restart mx on the FB:
controls@fb ~ 0$ sudo /sbin/init q
controls@fb ~ 0$ sudo /etc/init.d/mx restart
After that C1SUS started to communicate with the FB. But Alex does not know why this happened in the first place or how to prevent it in the future.
Restarting the DAQD process (or maybe C1SUS) also solved the problem with the guralp channels; they are fine now. Again, why this happened is unknown.
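For the record, the whole sequence looks roughly like this (a sketch of what we did; the last two fb steps, reverting the inittab edit so that init respawns daqd, are implied rather than something we wrote down):
# on fb: keep init from respawning daqd while mx is restarted
sudo vi /etc/inittab                 # change 'respawn' to 'stop' in the daq: line shown above
sudo /sbin/init q                    # make init re-read inittab; daqd stays down
sudo /etc/init.d/mx restart          # restart the mx driver / mapper
sudo vi /etc/inittab                 # change 'stop' back to 'respawn'
sudo /sbin/init q                    # init now respawns daqd
# on c1sus (if needed): restart the streamer once fb:0 shows up in /opt/open-mx/bin/omx_info
sudo /etc/init.d/mx_stream restart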
|
6624
|
Tue May 8 10:43:42 2012 |
Den | Update | CDS | biquad filter form |
Quote: |
Quote: |
I wanted to switch the implementation of IIR_FILTER from DIRECT FORM II to BIQUAD form in the C1IOO and C1SUS models. I modified the RCG file /opt/rtcds/rtscore/release/src/fe/controller.c by adding a #define CORE_BIQUAD line:
#ifdef OVERSAMPLE
#define CORE_BIQUAD
#if defined(CORE_BIQUAD)
|
I am really not ok with anyone modifying controller.c. If we're going to be messing around with that we need to change our procedures significantly. This is the code that runs all the models, and we don't currently have any way to track changes in the code.
Did you change it back? If not, do so immediately and stop messing with it. Please consult with us first before embarking on these kinds of severe changes to our code. This is the kind of shit that other people have done that has bit us in the ass in the past.
Furthermore, there is already a way to enable biquad filters in the new version without modifying the RCG source. All you need to do is set biquad=1 in the cdsParameters block for your model.
DO NOT MESS WITH CONTROLLER.C!
|
ok |
6625
|
Tue May 8 16:43:15 2012 |
Jenne | Update | CDS | Degenerate channels, potentially a big mess | Rana theorized that we're having problems with the MC error signal in the OAF model (separate elog by Den to follow) because we've named a channel "C1:IOO-MC_F", and such a channel already used to exist. So, Rana and I went out to do some brief cable tracing.
MC Servo Board has 3 outputs that are interesting: "DAQ OUT" which is a 4-pin LEMO, "SERVO OUT" which is a 2-pin LEMO, and "OUT1", which is a BNC->2pin LEMO right now.
DAQ OUT should have the actual MC_F signal, which goes through to the laser's PZT. This is the signal that we want to be using for the OAF model.
SERVO OUT should be a copy of this actual MC_F signal going to the laser's PZT. This is also acceptable for use with the OAF model.
OUT1 is a monitor of the slow(er) MC_L signal, which used to be fed back to the MC2 suspension. We want to keep this naming convention, in case we ever decide to go back and feed back to the suspensions for freq. stabilization.
Right now, OUT1 is going to the first channel of ADC0 on c1ioo. SERVOout is going to the 7th channel on ADC0. DAQout is going to the ~12th channel of ADC1 on c1ioo. OUT1 and SERVOout both go to the 2-pin LEMO whitening board, which goes to some new aLIGO-style ADC breakout boards with ribbon cables, which then goes to ADC0. DAQout goes to the 4pin LEMO ADC breakout, (J7 connector) which then directly goes to ADC1 on c1ioo.
So, to sum up, OUT1 should be "adc0_0" in the simulink model, SERVOout should be "adc0_6" on the simulink model, and DAQout should be "adc1_12" (or something....I always get mixed up with the channel counting on 4pin ADC breakout / AA boards).
In the current simulink setup, OUT1 (adc0_0) is given the channel name C1:IOO-MC_F, and is fed to the OAF model. We need to change it to C1:IOO-MC_L to be consistent with the old regime.
In the current simulink setup, SERVOout (adc0_6) is given the channel name C1:IOO-MC_SERVO. It should be called C1:IOO-MC_F, and should go to the OAF model.
In the current simulink setup, DAQout (~adc1_12) doesn't go anywhere. It's completely not in the system. Since the cable in the back of this AA / ADC breakout board box goes directly to the c1ioo I/O chassis, I don't think we have a degenerate MC_F naming situation. We've incorrectly labeled MC_L as MC_F, but we don't currently have 2 signals both called MC_F.
Okay, that doesn't explain precisely why we see funny business with the OAF model's version of MCL, but I think it goes in the direction of ruling out a degenerate MC_F name.
Problem: If you look at the screen cap, both simulink models are running on the same computer (c1ioo), so when they both refer to ADC0, they're really referring to the same physical card. Both of these models have adc0_6 defined, but they're defined as completely different things. Since we can trace / see the cable going from the MC Servo Board to the whitening card, I think the MC_SERVO definition is correct. Which means that this Green_PH_ADC is not really what it claims to be. I'm not sure what this channel is used for, but I think we should be very cautious and look into this before doing any more green locking. It would be dumb to fail because we're using the wrong signals.
|
6626
|
Tue May 8 17:48:50 2012 |
Jenne | Update | CDS | OAF model not seeing MCL correctly | Den noticed this, and will write more later, I just wanted to sum up what Alex said / did while he was here a few minutes ago....
Errors are probably really happening.... c1oaf computer status 4-bit thing: GRGG. The Red bit is indicating receiving errors. Probably the oaf model is doing a sample-and-hold thing, sampling every time (~1 or 2 times per sec) it gets a successful receive, and then holding that value until it gets another successful receive.
Den is adding EPICS channels to record the ERR out of the PCIE dolphin memory CDS_PART, so that we can see what the error is, not just that one happened.
Alex restarted oaf model: sudo rmmod c1oaf.ko, sudo insmod c1oaf.ko . Clicked "diag reset" on oaf cds screen several times, nothing changed. Restarted c1oaf again, same rmmod, insmod commands.
Den, Alex and I went into the IFO room, and looked at the LSC computer, SUS computer, SUS I/O chassis, LSC I/O chassis and the dolphin switch that is on the back of the rack, behind the SUS IO chassis. All were blinking happily, none showed symptoms of errors.
Alex restarted the IOP process: sudo rmmod c1x04, sudo insmod c1x04. Chans on dataviewer still bad, so this didn't help, i.e. it wasn't just a synchronization problem. oaf status: RRGG. lsc status: RGGG. ass status: RGGG.
sudo insmod c1lsc.ko, sudo insmod c1ass.ko, sudo insmod c1oaf.ko . oaf status: GRGG. lsc status: GGGG. ass status: GGGG. This means probably lsc needs to send something to oaf, so that works now that lsc is restarted, although oaf still not receiving happily.
Alex left to go talk to Rolf again, because he's still confused.
Comment, while writing elog later: c1rfm status is RRRG, c1sus status is RRGG, c1oaf status is GRGG, both c1scy and c1scx are RGRG. All others are GGGG. |
6627
|
Wed May 9 00:45:13 2012 |
Jenne | Update | CDS | No signals for DTT from SUS | Upgrades suck. Or at least making everything work again after the upgrade.
On the to-do list tonight: look at OSEM sensor and OpLev spectra for PRM, when PRMI is locked and unlocked. Goal is to see if the PRM is really moving wildly ("crazy" as Kiwamu always described it) when it's nicely aligned and PRMI is locked, or if it's an artifact of lever arm between PRM and the cameras (REFL and AS).
However, I can't get signals on DTT. So far I've checked a bunch of signals for SUS-PRM, and they all either (a) are just digital 0 or (b) are ADC noise. Lame.
Steve's elog 5630 shows what reasonable OpLev spectra should look like: exactly what you'd expect.
Attached below is a small sampling of different SUS-PRM signals. I'm going to check some other optics, other models on c1sus, etc, to see if I can narrow down where the problem is. LSC signals are fine (I checked AS55Q, for example).
UPDATE: SRM channels are same ADC noise. MC1 channels are totally fine. And Den had been looking at channels on the RFM model earlier today, which were fine.
ETMY channels - C1:SUS-ETMY_LLCOIL_IN1 and C1:SUS-ETMY_SUSPOS_IN1 both returned "unable to obtain measurement data". OSEM sensor channels and OpLev _PERROR channel were digital zeros.
ETMX channels were fine
UPDATE UPDATE: Genius me just checked the FE status screen again. It was fine ~an hour ago when I sat down to start interferometer-izing for the night, but now the SUS model and both of the ETMY computer models are having problems connecting to the fb. *sigh* |
6628
|
Wed May 9 01:14:44 2012 |
Jenne | Update | CDS | No signals for DTT from SUS |
Quote: |
UPDATE UPDATE: Genius me just checked the FE status screen again. It was fine ~an hour ago when I sat down to start interferometer-izing for the night, but now the SUS model and both of the ETMY computer models are having problems connecting to the fb. *sigh*
|
Restarted SUS model - it's now happy.
c1iscey is much less happy - neither the IOP nor the scy model are willing to talk to fb. I might give up on them after another few minutes, and wait for some daytime support, since I wanted to do DRMI stuff tonight.
Yeah, giving up now on c1iscey (Jamie....ideas are welcome). I can lock just fine, including the Yarm, I just can't save data or see data about ETMY specifically. But I can see LSC data, so I can lock, and I can now take spectra of corner optics. |
6630
|
Wed May 9 08:21:42 2012 |
Jamie | Update | CDS | No signals for DTT from SUS |
Quote: |
c1iscey is much less happy - neither the IOP nor the scy model are willing to talk to fb. I might give up on them after another few minutes, and wait for some daytime support, since I wanted to do DRMI stuff tonight.
Yeah, giving up now on c1iscey (Jamie....ideas are welcome). I can lock just fine, including the Yarm, I just can't save data or see data about ETMY specifically. But I can see LSC data, so I can lock, and I can now take spectra of corner optics.
|
This is the mx_stream issue reported previously. The symptom is that all models on a single front end lose contact with the frame builder, as opposed to *all* models on *all* front ends losing contact with the frame builder. That indicates that the problem is in the common fb communication path on that single front end, and that's all handled by mx_stream.
ssh'ing into c1iscey and running "sudo /etc/init.d/mx_stream restart" fixed the problem. |
6632
|
Wed May 9 10:46:54 2012 |
Den | Update | CDS | OAF model not seeing MCL correctly |
Quote: |
Den noticed this, and will write more later, I just wanted to sum up what Alex said / did while he was here a few minutes ago....
|
From my point of view, during the rfm -> oaf transmission through Dolphin we lose a significant part of the signal. To check that, I've created an MEDM screen to monitor the transmission errors in the OAF model. It shows how many errors occur per second. For the MCL channel this number turned out to be 2046 +/- 1. This makes sense to me: the sampling rate is 2048 Hz, so we actually receive only 1-3 data points per second. We can see this in the dataviewer.
C1:OAF-MCL_IN follows C1:IOO-MC_F in the sense that the scales of the 2 signals are the same in both states: MC locked and unlocked. It seems that we lose 2046 out of 2048 points per second.

|
6633
|
Wed May 9 11:31:50 2012 |
Den | Update | CDS | RFM | I added PCIE memory cache flushing to c1rfm model by changing 0 to 1 in /opt/rtcds/rtscore/release/src/fe/commData2.c on line 159, recompiled and restarted c1rfm.
Jamie, do not be mad at me, Alex told me do that!
However, this did not help, C1RFM did not start. I decided to restart all models on C1SUS machine in hope that C1RFM uses some other models and can't connect to them but this suspended C1SUS machine. After reboot encounted the same C1SUS -> FB communication error and fixed it in the same was as in the previous case of C1SUS reboot. This happens already the second time (out of 2) after C1SUS machine reboot.
I changed /opt/rtcds/rtscore/release/src/fe/commData2.c back, recompiled and restarted c1rfm. Now everything is back. C1RFM -> C1OAF is still bad. |
6634
|
Wed May 9 14:32:31 2012 |
Jenne | Update | CDS | Burt restored | Den and Alex left things not-burt restored, and Den mentioned to me that it might need doing.
I burt restored all of our epics.snaps to the 1am today snapshot. We lost a few hours of striptool trends on the projector, but now they're back (things like the BLRMS don't work if the filters aren't engaged on the PEM model, so it makes sense). |
6635
|
Wed May 9 15:02:50 2012 |
Den | Update | CDS | RFM |
Quote: |
However, this did not help; C1RFM did not start. I decided to restart all models on the C1SUS machine, in the hope that C1RFM depends on some other models and can't connect to them, but this suspended the C1SUS machine.
|
This happened because of a code bug:
// If PCIE comms show errors, may want to add this cache flushing
#if 1
if(ipcInfo[ii].netType == IPCIE)
clflush_cache_range (&(ipcInfo[ii].pIpcData->dBlock[sendBlock][ipcIndex].data), 16); // & was missing - Alex fixed this
#endif
After this bug was fixed and the code was recompiled, C1:OAF-MCL_IN is OK; no errors occur during the transmission (C1:OAF-MCL_ERR = 0).
So the problem was that the PCIE card could not send that amount of data, and the last channel (MCL is the last) was corrupted. Now that Alex has added cache flushing, the problem is fixed.
We should pay more attention to such problems. This time 2046 out of 2048 points per second were lost. But if only 10-20 points were lost, we would not notice it in the dataviewer, yet it would still cause problems. |
6639
|
Thu May 10 22:05:21 2012 |
Den | Update | CDS | FB | For the second time today, all computers have lost connection to the framebuilder. When I ssh'd to the framebuilder, the DAQD process was not running. I started it:
controls@fb ~ 130$ sudo /sbin/init q
But I do not know what causes this problem. Maybe this is a memory issue. On the FB:
Mem: 7678472k total, 7598368k used, 80104k free
Practically all memory is used. If more is needed and swap is off, DAQD process may die. |
6640
|
Fri May 11 08:07:30 2012 |
Jamie | Update | CDS | FB |
Quote: |
For the second time today, all computers have lost connection to the framebuilder. When I ssh'd to the framebuilder, the DAQD process was not running. I started it:
controls@fb ~ 130$ sudo /sbin/init q
|
Just to be clear, "init q" does not start the framebuilder. It just tells the init process to reparse /etc/inittab. And since init is supposed to be configured to restart daqd when it dies, it restarted it after the reloading of /etc/inittab. You and Alex must have forgotten to do that after you modified the inittab when you were trying to fix daqd last week.
daqd is known to crash without reason. It usually just goes unnoticed because init always restarts it automatically. But we've known about this problem for a while.
Quote:
|
But I do not know what causes this problem. Maybe this is a memory issue. On the FB:
Mem: 7678472k total, 7598368k used, 80104k free
Practically all memory is used. If more is needed and swap is off, DAQD process may die.
|
This doesn't really mean anything, since the computer always ends up using all available memory. It doesn't indicate a lack of memory. If the machine is really running out of memory you would see lots of ugly messages in dmesg. |
6654
|
Mon May 21 21:27:39 2012 |
yuta | Update | CDS | MEDM suspension screens using macro | Background:
We need more organized MEDM screens. Let's use macros.
What I did:
1. Edited /opt/rtcds/userapps/trunk/sus/c1/medm/templates/SUS_SINGLE.adl using replacements below;
sed -i s/#IFO#SUS_#PART_NAME#/'$(IFO)$(SYS)_$(OPTIC)'/g SUS_SINGLE.adl
sed -i s/#IFO#SUS#_#PART_NAME#/'$(IFO)$(SYS)_$(OPTIC)'/g SUS_SINGLE.adl
sed -i s/#IFO#:FEC-#DCU_ID#/'$(IFO):FEC-$(DCU_ID)'/g SUS_SINGLE.adl
sed -i s/#CHANNEL#/'$(IFO):$(SYS)-$(OPTIC)'/g SUS_SINGLE.adl
sed -i s/#PART_NAME#/'$(OPTIC)'/g SUS_SINGLE.adl
2. Edited sitemap.adl so that they open SUS_SINGLE.adl with arguments like
IFO=C1,SYS=SUS,OPTIC=MC1,DCU_ID=36
instead of opening ./c1mcs/C1SUS_MC1.adl.
3. I also fixed white blocks in the LOCKIN part.
Result:
Now you don't have to generate every suspension screen separately. Just edit SUS_SINGLE.adl.
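For reference, this is how the macro-substituted screen gets opened from the command line (a hedged example; inside sitemap.adl the same macro string goes into the related display widget's arguments instead):
medm -x -macro "IFO=C1,SYS=SUS,OPTIC=MC1,DCU_ID=36" /opt/rtcds/userapps/trunk/sus/c1/medm/templates/SUS_SINGLE.adl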
Things to do:
- fix all other MEDM screens which open suspension screens, so that they open SUS_SINGLE.adl
- make SUS_SINGLE.adl more cool |
6655
|
Tue May 22 00:23:45 2012 |
Den | Update | CDS | transmission error monitor | I've started to create channels and an medm screen to monitor the errors that occur during the transmission through the RFM model. The screen will show the amount of lost data per second for each channel.
Not all channels are ready yet. For the channels created so far, the number of errors is 0, which is good.
 |
6657
|
Tue May 22 11:32:02 2012 |
Jamie | Update | CDS | MEDM suspension screens using macro | Very nice, Yuta! Don't forget to commit your changes to the SVN. I took the liberty of doing that for you. I also tweaked the file a bit, so we don't have to specify IFO and SYS, since those aren't going to ever change. So the arguments are now only: OPTIC=MC1,DCU_ID=36. I updated the sitemap accordingly.
Yuta, if you could go ahead and modify the calls to these screens in other places that would be great. The WATCHDOG, LSC_OVERVIEW, MC_ALIGN screens are ones that immediately come to mind.
And also feel free to make cool new ones. We could try to make simplified version of the suspension screens now being used at the sites, which are quite nice. |
6658
|
Tue May 22 11:45:12 2012 |
Jamie | Configuration | CDS | Please remember to commit SVN changes | Hey, folks. Please remember to commit all changes to the SVN in a timely manner. If you don't, multiple commits will get lumped together and we won't have a good log of the changes we're making. You might also end up just losing all of your work. SVN COMMIT when you're done! But please don't commit broken or untested code.
pianosa:release 0> svn status | grep -v '^?'
M cds/c1/models/c1rfm.mdl
M sus/c1/models/c1mcs.mdl
M sus/c1/models/c1scx.mdl
M sus/c1/models/c1scy.mdl
M isc/c1/models/c1lsc.mdl
M isc/c1/models/c1pem.mdl
M isc/c1/models/c1ioo.mdl
M isc/c1/models/ADAPT_XFCODE_MCL.c
M isc/c1/models/c1oaf.mdl
M isc/c1/models/c1gcv.mdl
M isc/common/medm/OAF_OVERVIEW.adl
M isc/common/medm/OAF_DOF_BLRMS.adl
M isc/common/medm/OAF_OVERVIEW_BAK.adl
M isc/common/medm/OAF_ADAPTATION_MICH.adl
pianosa:release 0>
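As a reminder, committing is a one-liner from the userapps checkout (the file and message below are only an illustration; commit the files you actually changed with a real message):
cd /opt/rtcds/userapps/release
svn commit -m "c1oaf: describe what changed and why" isc/c1/models/c1oaf.mdl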
|
6659
|
Tue May 22 11:47:43 2012 |
Jamie | Update | CDS | MEDM suspension screens using macro | Actually, it looks like we're not quite done here. All the paths in the SUS_SINGLE screen need to be updated to reflect the move. We should probably make a macro that points to /opt/rtcds/caltech/c1/screens, and update all the paths accordingly. |
6661
|
Tue May 22 20:01:26 2012 |
Den | Update | CDS | error monitor | I've created transmission error monitors in rfm, oaf, sus, lsc, scx, scy and ioo models. I tried to get data from every channel transmitted through PCIE and RFM. I also included some shared memory channels.
The medm screen is in the EF STATUS -> TEM. It shows 16384 for the channels that come from the simulation plant. The others are 0, which is fine. |
6663
|
Tue May 22 20:46:38 2012 |
yuta | Update | CDS | MEDM suspension screens using macro | I fixed the problems Jamie pointed out in elogs #6657 and #6659.
What I did:
1. Created the following template files in the /opt/rtcds/userapps/trunk/sus/c1/medm/templates/ directory.
SUS_SINGLE_LOCKIN1.adl
SUS_SINGLE_LOCKIN2.adl
SUS_SINGLE_LOCKIN_INMTRX.adl
SUS_SINGLE_OPTLEV_SERVO.adl
SUS_SINGLE_PITCH.adl
SUS_SINGLE_POSITION.adl
SUS_SINGLE_SUSSIDE.adl
SUS_SINGLE_TO_COIL_MASTER.adl
SUS_SINGLE_COIL.adl
SUS_SINGLE_YAW.adl
SUS_SINGLE_INMATRIX_MASTER.adl
SUS_SINGLE_INPUT.adl
SUS_SINGLE_TO_COIL_X_X.adl
SUS_SINGLE_OPTLEV_IN.adl
SUS_SINGLE_OLMATRIX_MASTER.adl
To open these files, you have to define $(OPTIC) and $(DCU_ID).
For SUS_SINGLE_TO_COIL_X_X.adl, you also have to define $(FILTER_NUMBER), too. See SUS_SINGLE_TO_COIL_MASTER.adl.
2. Fixed the following screens so that they open SUS_SINGLE.adl.
C1SUS_WATCHDOGS.adl
C1IOO_MC_ALIGN.adl
C1IOO_WFS_MASTER.adl
C1IFO_ALIGN.adl |
6670
|
Thu May 24 01:17:13 2012 |
Den | Update | CDS | PMC autolocker |
Quote: |
- SCRIPT
- Auto-locker for PMC, PSL things - DEN
|
I wrote an auto-locker for the PMC. It is called autolocker_pmc, located in the scripts directory, and svn committed. I connected it to the channel C1:PSL-PMC_LOCK. It is currently running on rosalba. The MC autolocker runs on op340m, but I could not execute the script on that machine:
op340m:scripts>./autolock_pmc
./autolock_pmc: Stale NFS file handle.
I did several tests; usually the script locks the PMC in a few seconds. However, if the PMC DC offset has drifted significantly, it might take longer, as discussed below.
The algorithm:
if the autolocker is enabled, monitor the PSL-PMC_PMCTRANSPD channel
if TRANS is less than 0.4, start locking:
disengage PMC servo by enabling PMC TEST 1
change PSL-PMC_RAMP until TRANS is higher than 0.4 (*)
engage PMC servo by disabling PMC TEST 1
else sleep for 1 sec
(*) is tricky. If RAMP (DC offset) is specified, then TRANS will be oscillating in the range (TRANS_MIN, TRANS_MAX). We are interested only in TRANS_MAX. To make sure we estimate it right, the TRANS channel is read 10 times and the maximum value is chosen. This works well.
The next problem is to find the proper range and step for varying the DC offset RAMP. Of course, we could choose the maximum range (-7, 0) and the minimum step 0.0001, but then it would take too long to find the proper DC offset. For that reason the autolocker tries to find a resonance close to the previous DC offset, in the range (RAMP_OLD - delta, RAMP_OLD + delta), with initial delta 0.03 and step 0.003. If a resonance is not found in this region, then delta is multiplied by a factor of 2 and so on. During this process the RAMP range is kept no wider than (-7, 0). A rough sketch of this search loop is given below.
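The sketch (an illustration only, not the actual autolocker_pmc; it uses the generic EPICS caget/caput command-line tools, and skips the TEST 1 toggling and the read-10-times maximum described above):
#!/bin/bash
# hypothetical sketch of the expanding RAMP scan
center=$(caget -t C1:PSL-PMC_RAMP)       # previous DC offset
delta=0.03; step=0.003
while true; do
    lo=$(echo "$center - $delta" | bc -l); hi=$(echo "$center + $delta" | bc -l)
    # clamp the scan window to the allowed (-7, 0) range
    [ $(echo "$lo < -7" | bc -l) -eq 1 ] && lo=-7
    [ $(echo "$hi > 0" | bc -l) -eq 1 ] && hi=0
    ramp=$lo
    while [ $(echo "$ramp <= $hi" | bc -l) -eq 1 ]; do
        caput C1:PSL-PMC_RAMP $ramp > /dev/null
        trans=$(caget -t C1:PSL-PMC_PMCTRANSPD)
        if [ $(echo "$trans > 0.4" | bc -l) -eq 1 ]; then
            exit 0                        # resonance found; re-engage the servo here
        fi
        ramp=$(echo "$ramp + $step" | bc -l)
    done
    delta=$(echo "$delta * 2" | bc -l)    # widen the window and try again
done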
There might be a better way to do this. For example, use a gradient descent algorithm and control the step adaptively. I'll do that if this implementation turns out to be too slow.
I've disabled autolocker_pmc for the night. |
6699
|
Tue May 29 00:53:57 2012 |
Den | Update | CDS | problems | I've noticed several CDS problems:
- The communication indicator on the C1SUS model turns red once in a while. I press diag reset and it is gone, but after some time it comes back.
- On the C1LSC machine the red "U" lamp lights up with a period of ~5 sec.
- I was not able to read data from the SR785 using netgpibdata.py. Either the connection is not established at all, or the data starts to download and then stops in the middle. I've checked the cables, power supplies and everything; still the same thing.
|
6734
|
Thu May 31 22:13:08 2012 |
Jamie | Update | CDS | c1lsc: added remaining SHMEM senders for ERR and CTRL, c1oaf model updated appropriately | All the ERR and CTRL outputs in c1lsc now go to SHMEM senders. I renamed the CTRL output SHMEM senders to be more generic, since they aren't specifically for OAF anymore. See attached image from c1lsc.
c1oaf was updated so that SHMEM receivers pointed to the newly renamed senders.
c1lsc and c1oaf were rebuilt, installed, and restarted and are now running. |
6748
|
Sun Jun 3 23:50:00 2012 |
Den | Update | CDS | biquad=1 | From now on, all models calculate IIR filters using the biquad form. I've added biquad=1 to cdsParameters in all models except c1cal, then built, installed and restarted them.
6755
|
Tue Jun 5 14:47:28 2012 |
Jamie | Update | CDS | new c1tst model for testing RCG code | I made a new model, c1tst, that we can use for debugging the FREQUENT RCG bugs that we keep encountering. It's a bare model that runs on c1iscey. Don't do anything important in here, and don't leave it in some crappy state. Clean it up when you're done.
6760
|
Wed Jun 6 00:32:22 2012 |
Jenne | Update | CDS | RFM model is way overloading the cpu | We have too much crap in the rfm model. CPU time for the rfm model is regularly above 60us, and sometimes in the mid-70's (but sometimes jumps down briefly to ~47us, which is where I think it "used" to sit, but I don't remember when I last thought about that number)
This is potentially causing lots of asynchronous grief. |
6778
|
Thu Jun 7 03:37:26 2012 |
yuta | Update | CDS | mx_stream restarted on c1lsc, c1ioo | c1lsc and c1ioo computers had FB net statuses all red. So, I restarted mx_stream on each computer.
ssh controls@c1lsc
sudo /etc/init.d/mx_stream restart
|
6787
|
Thu Jun 7 17:49:09 2012 |
Jamie | Update | CDS | c1sus in weird state, running models but unresponsive otherwise | Somehow c1sus was in a very strange state. It was running models, but EPICS was slow to respond. We could not log into it via ssh, and we could not bring up test points. Since we didn't know what else to do we just gave it a hard reset.
Once it came up, none of the models were running. I think this is a separate problem with the model startup scripts that I need to debug. I logged on to c1sus and ran:
rtcds restart all
(which handles proper order of restarts) and everything came up fine.
Have no idea what happened there to make c1sus freeze like that. Will keep an eye out. |
6806
|
Tue Jun 12 17:29:28 2012 |
Den | Update | CDS | dq channels | All PEM and IOO DQ channels disappeared. These channels were commented out in the C1???.ini files, even though I uncommented them a few weeks ago. It happened after these models were rebuilt; the C1???.ini files also changed. Why?
I added the channels back. mx_stream died on c1sus after I pressed DAQ reload on the medm screen. For the IOO model it is even worse: after pressing DAQ Reload for the C1IOO model, the DAQD process dies on the FB and the IOO machine suspends.
I rebooted IOO, restarted the models and fb. The models work now, but there might be an easier way to add channels without rebooting machines and daemons. |
6911
|
Wed Jul 4 17:33:04 2012 |
Jamie | Update | CDS | timing, possibly leap second, brought down CDS | I got a call from Koji and Yuta that something was wrong with the CDS system. I somehow had an immediate suspicion that it had something to do with the recent leap second.
It took a while for nodus to respond, and once he finally let me in I found a bunch of the following in his dmesg, repeated and filling the buffer:
Jul 3 22:41:34 nodus xntpd[306]: [ID 774427 daemon.notice] time reset (step) 0.998366 s
Jul 3 22:46:20 nodus xntpd[306]: [ID 774427 daemon.notice] time reset (step) -1.000847 s
Looking at date on all the front end systems, including fb, I could tell that they all looked a second fast, which is what you would expect if they had missed the leap second. Everything syncs against nodus, so given nodus's problems above, that might explain everything.
I stopped daqd and nds on fb, and unloaded the mx drivers, which seemed to be showing problems. I also stopped nodus's xntp:
sudo /etc/init.d/xntpd stop
His ntp config file is in /etc/inet/ntp.conf, which is definitely the WRONG PLACE, given that the ntp server is not, as far as I can tell, being controlled by inetd. (nodus is WAY out of date and desperately needs an overhaul. it's nearly impossible to figure out what the hell is going on in there). I found an old elog of Rana's that mentioned updating his config to point him to the caltech NTP server, which is now listed in the config, so I tried manually resyncing against that:
sudo ntpdate -s -b -u 131.215.239.14
Unfortunately that didn't seem to have any effect. This was making me wonder if the caltech server is off? Anyway, I tried resyncing against the global NTP pool:
sudo ntpdate -s -b -u pool.ntp.org
This seemed to work: the clock came back in sync with others that are known good. Once nodus time was good I reloaded the mx drivers on fb and restarted daqd and nds. They seemed to come up fine. At this point front ends started coming back on their own. I went and restarted all the models on the machines that didn't (c1iscey and c1ioo). Currently everything is looking ok.
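To follow up on the suspicion about the caltech server, a non-destructive way to compare what the two servers report, without stepping the clock, is to query them (addresses taken from above):
ntpdate -q 131.215.239.14    # caltech server currently in the config
ntpdate -q pool.ntp.org      # global pool, which gave the correct time above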
I'm worried that there is still a problem with one of the NTP servers that nodus is sync'ing against, and that the problem might come back. I'll check in again later tonight. |
6915
|
Thu Jul 5 01:20:58 2012 |
yuta | Summary | CDS | slow computers, 0x4000 for all DAQ status | ALS looks OK. I tried to lock FPMI using ALS, but I feel like I need 6 hands to do it with the current ALS stability. Now all the computers are being so slow.
It was fine for 7 hours after Jamie the Great fixed this, but fb went down a couple of times and the DAQ status for all models now shows 0x4000. I tried restarting mx_stream and restarting fb, but neither helped. |
6917
|
Thu Jul 5 10:49:38 2012 |
Jamie | Update | CDS | front-end/fb communication lost, likely again due to timing offsets | All the front-ends are showing 0x4000 status and have lost communication with the frame builder. It looks like the timing skew is back again. The fb is ahead of real time by one second, and strangely nodus is ahead of real time by something like 5 seconds! I'm looking into it now. |
6918
|
Thu Jul 5 11:12:53 2012 |
Jenne | Update | CDS | front-end/fb communication lost, likely again due to timing offsets |
Quote: |
All the front-ends are showing 0x4000 status and have lost communication with the frame builder. It looks like the timing skew is back again. The fb is ahead of real time by one second, and strangely nodus is ahead of real time by something like 5 seconds! I'm looking into it now.
|
I was bad and didn't read the elog before touching things, so I did a daqd restart and an mxstream restart on all the front ends, but neither of those things helped. Then I saw the elog saying that Jamie's working on figuring it out. |
6920
|
Thu Jul 5 12:27:05 2012 |
Jamie | Update | CDS | front-end/fb communication lost, likely again due to timing offsets |
Quote: |
All the front-ends are showing 0x4000 status and have lost communication with the frame builder. It looks like the timing skew is back again. The fb is ahead of real time by one second, and strangely nodus is ahead of real time by something like 5 seconds! I'm looking into it now.
|
This was indeed another leap second timing issue. I'm guessing nodus resync'd from whatever server is posting the wrong time, and it brought everything out of sync again. It really looks like the caltech server is off. When I manually sync from there the time is off by a second, and then when I manually sync from the global pool it is correct.
I went ahead and updated nodus's config (/etc/inet/ntp.conf) to point to the global pool (pool.ntp.org). I then restarted the ntp daemon:
nodus$ sudo /etc/init.d/xntpd stop
nodus$ sudo /etc/init.d/xntpd start
That brought nodus's time in sync.
At that point all I had to do was resync the time on fb:
fb$ sudo /etc/init.d/ntp-client restart
When I did that daqd died, but it immediately restarted and everything was in sync. |
6997
|
Fri Jul 20 17:11:50 2012 |
Jamie | Update | CDS | All custom MEDM screens moved to cds_users_apps svn repo | Since there are various ongoing requests for this from the sites, I have moved all of our custom MEDM screens into the cds_user_apps SVN repository. This is what I did:
For each system in /opt/rtcds/caltech/c1/medm, I copied its "master" directory into the repo, and then linked it back into the usual place, e.g.:
a=/opt/rtcds/caltech/c1/medm/${model}/master
b=/opt/rtcds/userapps/trunk/${system}/c1/medm/${model}
mv $a $b
ln -s $b $a
Before committing to the repo, I did a little bit of cleanup, to remove some binary files and other known superfluous stuff. But I left most things there, since I don't know what is relevant or not.
Then committed everything to the repo.
|
6999
|
Sat Jul 21 14:48:33 2012 |
Den | Update | CDS | RCG | As I've spent many hours trying to track down the error in my C code for an online filter, I decided to write about it to prevent people from running into it again.
I have a C function that was tested offline. I compiled and installed it on the front end machine without any errors. When I restarted the model, it did not run.
I modified the function in the following way:
void myFunction()
{
if(STATEMENT) return;
some code
}
I adjusted the input parameters such that STATEMENT was always true. However, the model either started or not depending on the code after the if statement. It turned out that the model could not start because of the following lines:
cosine[1] = 1.0 - 0.5*a*a + a*a*a*a/24 - a*a*a*a*a*a/720 + a*a*a*a*a*a*a*a/40320;
sine[1] = a - a*a*a/6 + a*a*a*a*a/120 - a*a*a*a*a*a*a/5040;
When I split the sum into steps, the model began to run. I guess the conclusion is that we cannot put too many arithmetic operations on one "=". The most interesting thing is that these lines stood after the always-true if statement and should not even have been executed. A possible explanation is that some compilers start to process the code after the if statement during its slow comparison; in our case it could start and then break down on these long expressions. |
7008
|
Mon Jul 23 18:57:52 2012 |
Jamie | Update | CDS | c1scx and c1scy models recompiled and restarted | After the changes listed in 7005 and 7007, I have rebuilt, installed, and restarted the c1scx and c1scy models. Everything seems to have come back up ok.
Running into some daqd troubles because of a change to c1ioo, but will report on the new ALS channels when I can. |
7011
|
Mon Jul 23 19:50:43 2012 |
Jamie | Update | CDS | c1gcv model renamed to c1als | I decided to rename the c1gcv model to be c1als. This is in an ongoing effort to rename all the ALS stuff as ALS, and get rid of the various GC{V,X,Y} named stuff.
Most of what was in the c1gcv model was already in a subsystem with an ALS top name, but there were a couple of channels outside of that which had funky names, namely the "GCV_GREEN" channels. This fixes that, and makes things more consistent and simple.
Of course this required a bunch of other little changes:
- rename model in userapps svn
- target/fb/master had to be modified to point to the new chans/daq/C1ALS.ini channel file and gds/param/tpchn_c1als.par testpoint file
- rename RFM channels appropriately, and fix in receiver models (c1scx, c1scy, c1mcs)
- move custom medm screens in userapps svn (isc/c1/medm/c1als), and link to it at medm/c1als/master
- moved old medm/c1gcv directory into a subdirectory of medm/c1als
- update all medm screens that point to c1gcv stuff (mostly just ALS screens)
The above has been done. Still todo:
- FIX SCRIPTS! There are almost certainly scripts that point to GC{V,X,Y} channels. Those will have to be fixed as we come across them.
- Fix the c1sc{x,y}/master/C1SC{X,Y}_GC{X,Y}_SLOW.adl screens. I need to figure out a more consistent place for those screens.
- Fix the C1ALS_COMPACT screen
- ???
|
7037
|
Thu Jul 26 12:10:28 2012 |
Den | Update | CDS | new c1tst model for testing RCG code |
Quote: |
I made a new model, c1tst, that we can use for debugging the FREQUENT RCG bugs that we keep encountering. It's a bare model that runs on c1iscey. Don't do anything important in here, and don't leave it in some crappy state. Clean it up when you're done.
|
I wanted to test the biquad form in this model. I added the biquad=1 flag to cdsParameters, compiled, installed and restarted it. After that c1iscey suspended.
The same thing we had several months ago:
controls@c1iscey /opt/rtcds/caltech/c1/target/c1tst/c1tstepics 0$ cat iocC1.log
Starting iocInit
iocRun: All initialization complete
sh: iniChk.pl: command not found
Failed to load DAQ configuration file
|
7043
|
Fri Jul 27 14:27:14 2012 |
Jamie | Update | CDS | new c1tst model for testing RCG code |
Quote: |
I wanted to test the biquad form in this model. I added the biquad=1 flag to cdsParameters, compiled, installed and restarted it. After that c1iscey suspended.
The same thing we had several months ago:
controls@c1iscey /opt/rtcds/caltech/c1/target/c1tst/c1tstepics 0$ cat iocC1.log
Starting iocInit
iocRun: All initialization complete
sh: iniChk.pl: command not found
Failed to load DAQ configuration file
|
I have fixed the iniChk.pl issue (which actually fixed a separate model startup-on-boot issue that we had been having). However, that is completely unrelated to the system freeze. I'll discuss that in a separate post. |
7046
|
Fri Jul 27 16:32:17 2012 |
Jamie | Update | CDS | RCG bug exposed by simple c1tst model | As Den mentioned in 7043, attempting to run the c1tst model was causing the entire c1iscey machine to crash. Alex came over this morning and we spend a couple of hours trying to debug what was going on.
c1tst is the simplest possible model you can have: 1 ADC and 2 filter modules. It compiles just fine, but when you tried to load it the machine would completely freeze.
We eventually tracked this down to a non-empty filter file for one of the filter modules. It turns out that the model was crashing when it attempted to load the filter file. Once we completely deleted all the filters in the module, the model would run. But then if you added back a filter to the filter file and tried to "load coefficients", the model/machine would immediately crash again.
So it has something to do with the loading of the filter coefficients from the filter file. We tried different filters and it didn't seem to make a difference. Alex thought it might have something to do with zeros in some of the second-order sections, but that wasn't it either.
There's speculation that it might be related to a very similar bug that Joe reported at LLO a month ago: https://bugzilla.ligo-wa.caltech.edu/bugzilla/show_bug.cgi?id=398
Things we tried, none of which worked:
- adding a DAC
- turning on/off biquad
- disabling the float denormalization fix
This is a real mystery. Alex and I are continuing to investigate. |
|