Message ID: 178     Entry time: Wed Jan 26 10:34:53 2011
Author: Jan 
Type: Summary 
Category: Seismometry 
Subject: FIR filters and linear estimation 

I wanted to write down what I learned from our filter discussion yesterday. There seem to be two different approaches, but the subject is complex enough that I may be wrong about details. Anyway, I currently believe that one can distinguish between real filters that operate at run time, and estimation algorithms that cannot be implemented in this way since they are acausal. For simplicity, let's focus on FIR filters and linear estimation to represent the two cases.

A) FIR filters

FIR.jpg

An FIR filter has M tap coefficients per channel. If the data is sampled, you take the past M samples of each channel (including the sample at the present time t), run them through the FIR and subtract the FIR output from the test-mass sample at time t. This can also be implemented as a feed-forward system, so that the test-mass data does not need to be sampled. Test-mass data is then only used initially to calculate the FIR coefficients, unless the FIR is part of an adaptive algorithm. For adaptive filters, you would factor out anything from the FIR that you already know (e.g. your best estimates of transfer functions) and only let it optimize around this starting value.
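To make the bookkeeping concrete, here is a minimal sketch of the subtraction described above, assuming the tap coefficients are obtained by an ordinary least-squares fit on an initial training stretch. All names (`lagged_matrix`, `fit_fir`, `fir_output`) and the simulated white-noise data are made up for illustration; nothing here is taken from the actual lab code.

```python
import numpy as np

def lagged_matrix(x, M):
    """Row i holds [x[t], x[t-1], ..., x[t-M+1]] with t = M-1+i."""
    return np.column_stack([x[M - 1 - k : len(x) - k] for k in range(M)])

def design_matrix(sensors, M):
    """Stack the lagged samples of all channels side by side."""
    return np.hstack([lagged_matrix(ch, M) for ch in sensors])

def fit_fir(sensors, target, M):
    """Least-squares taps, shape (n_channels, M). Assumed fitting method."""
    X = design_matrix(sensors, M)
    taps, *_ = np.linalg.lstsq(X, target[M - 1:], rcond=None)
    return taps.reshape(len(sensors), M)

def fir_output(sensors, taps):
    """FIR prediction for samples t = M-1 .. n-1 (past M samples only)."""
    return design_matrix(sensors, taps.shape[1]) @ taps.ravel()

# Simulated example: 2 sensor channels coupled to the test mass through
# fixed (time-independent) taps, plus a little unrelated noise.
rng = np.random.default_rng(0)
n, M = 4000, 8
sensors = rng.standard_normal((2, n))
true_taps = rng.standard_normal((2, M))
test_mass = np.zeros(n)
test_mass[M - 1:] = fir_output(sensors, true_taps) + 0.01 * rng.standard_normal(n - M + 1)

# Test-mass data is used only on the first half to compute the taps ...
train = slice(0, n // 2)
taps = fit_fir(sensors[:, train], test_mass[train], M)

# ... and the fixed FIR is then subtracted from the whole record.
residual = test_mass[M - 1:] - fir_output(sensors, taps)
```

This only works here because the simulated coupling does not change with time, which is exactly the assumption questioned next.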

The FIR filter can only work if transfer functions do not change much over time. This is not the case for Newtonian noise (NN), though. Imagine the following case:

(S1)-----(TM)----------(S2)

Here two seismometers, S1 and S2, sit on a line through the test mass (TM); one of them can be closer to the test mass than the other. We need to monitor the vertical displacement to estimate NN parallel to the line (at least when surface fields are dominant). If a plane wave propagates upwards, perpendicular to the line, then by symmetry there is no NN parallel to the line, and the seismic signals at S1 and S2 are identical. Now consider a plane wave propagating parallel to the line, which does produce NN. If the distance between the seismometers happens to equal the wavelength of the plane wave, then the seismometers again show identical seismic signals, but this time there is NN. An FIR filter would give the same NN prediction in these two cases, although the actual NN differs (it is absent in the first case). So it is pretty obvious that an FIR alone cannot handle this situation.
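The ambiguity can be checked numerically with a toy model. The sketch below assumes a monochromatic plane wave with vertical surface displacement cos(2*pi*f*t - kx*x + phase), and uses the slope of that displacement along the line at the test mass as a crude stand-in for NN parallel to the line. That proxy, and all numbers and names, are my assumptions for illustration only, not a proper NN calculation.

```python
import numpy as np

f = 10.0                      # wave frequency in Hz (illustrative)
d = 30.0                      # S1-S2 separation in m, chosen equal to the wavelength
a = 10.0                      # distance from S1 to the test mass in m
x_s1, x_s2 = -a, d - a        # test mass sits at x = 0
t = np.arange(0.0, 1.0, 1.0 / 512)

def vertical_displacement(x, kx, phase=0.0):
    """Vertical displacement at position x for horizontal wavenumber kx."""
    return np.cos(2 * np.pi * f * t - kx * x + phase)

def slope_at_test_mass(kx, phase=0.0):
    """d/dx of the displacement at x = 0 (toy proxy for NN along the line)."""
    return kx * np.sin(2 * np.pi * f * t + phase)

# Case 1: wave travelling straight up, no variation along the line.
kx_up = 0.0
s1_up = vertical_displacement(x_s1, kx_up)
s2_up = vertical_displacement(x_s2, kx_up)
proxy_up = slope_at_test_mass(kx_up)

# Case 2: wave travelling along the line with wavelength d.
# The phase offset aligns the time origin so the S1 records match exactly.
kx_along = 2 * np.pi / d
phase = kx_along * x_s1
s1_along = vertical_displacement(x_s1, kx_along, phase)
s2_along = vertical_displacement(x_s2, kx_along, phase)
proxy_along = slope_at_test_mass(kx_along, phase)

same_sensor_data = np.allclose(s1_up, s1_along) and np.allclose(s2_up, s2_along)
```

In this toy setup both seismometers record the same time series in the two cases, while the proxy is identically zero in the first case and oscillates in the second, so any fixed map from (S1, S2) to a prediction must be wrong in at least one of them.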

What is the purpose of the FIR anyway? In the case of noise subtraction, it is a clever time-domain representation of transfer functions. Clever means optimal if the FIR is a Wiener filter. So it contains information about the channels between sensors and test mass, but it does not care at all about the information content of the sensor data. This information is (intentionally, if you like) averaged out when you calculate the FIR filter coefficients.

B) Linear estimation

Wiener.jpg

So how do we deal with the information content in sensor data from multiple input channels? We will assume that an FIR can be applied to factor out the transfer functions from this problem. In the surface NN case, these would be the 1/f^2 from NN acceleration to test-mass displacement, and the factor exp(-2*pi*f*h/c), with h being the height of the test mass above ground, which accounts for the frequency-dependent exponential suppression of NN. Since the information content of the seismic field changes continuously, we cannot train a filter that would be able to represent this information for all times. So it is obvious that this information needs to be updated continuously.
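For reference, the known frequency dependence mentioned above can be written as a small function. This is only a sketch: the function name is made up, I take c to be the seismic wave speed (the entry does not define it), and the (2*pi)^2 normalization of the 1/f^2 term is my choice for converting acceleration to displacement amplitude.

```python
import numpy as np

def known_nn_factor(f, h, c):
    """Magnitude of the factored-out frequency dependence.

    f: frequency in Hz, h: test-mass height above ground in m,
    c: wave speed in m/s (assumed to mean the seismic wave speed).
    """
    f = np.asarray(f, dtype=float)
    acceleration_to_displacement = 1.0 / (2 * np.pi * f) ** 2   # the 1/f^2 part
    height_suppression = np.exp(-2 * np.pi * f * h / c)         # exp(-2*pi*f*h/c)
    return acceleration_to_displacement * height_suppression

# Illustrative values only, not measured parameters.
freqs = np.array([5.0, 10.0, 20.0])
factor = known_nn_factor(freqs, h=1.5, c=200.0)
```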

The problem is very similar to GW data analysis. What we are going to do is to construct an NN template that depends on a few template parameters. We estimate these parameters (maximum likelihood) and then subtract our best estimate of the NN signal from the data. This cannot be implemented as feed forward, and it relies on chopping the data into stretches of M samples (not necessarily the same value for M as in the FIR case). Now what are the template parameters? They are the coefficients used to combine the data stretches of the N sensors. This is great since the templates depend linearly on these parameters, which makes it trivial to calculate the maximum-likelihood estimates of the template parameters. The formula is in fact analogous to calculating the Wiener-filter coefficients (optimal linear estimates). Whether we use only one parameter per channel (as discussed yesterday), or should rather chop the sensor data into even smaller stretches and introduce additional template coefficients, will depend on the sensor data and how nature links them to the test mass. Results of my current simulation suggest that only one parameter per channel is required.
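A minimal sketch of this per-stretch estimation with one coefficient per channel follows. It assumes the remaining test-mass noise is white and Gaussian, in which case the maximum-likelihood estimate reduces to ordinary least squares. The function name and the simulated data are hypothetical, and this is not the simulation mentioned above.

```python
import numpy as np

def subtract_per_stretch(sensors, target, M):
    """Estimate one coefficient per channel in each M-sample stretch and
    subtract the resulting template. sensors has shape (N, n_samples).
    A trailing partial stretch, if any, is left untouched."""
    n = sensors.shape[1]
    cleaned = target.copy()
    coeffs = []
    for start in range(0, n - M + 1, M):
        S = sensors[:, start:start + M].T          # (M, N) template basis
        y = target[start:start + M]
        c, *_ = np.linalg.lstsq(S, y, rcond=None)  # ML estimate for white noise
        cleaned[start:start + M] = y - S @ c       # subtract best-estimate template
        coeffs.append(c)
    return cleaned, np.array(coeffs)

# Simulated example: the way the N = 3 sensors combine changes from
# stretch to stretch, which a single fixed filter could not follow.
rng = np.random.default_rng(1)
n_ch, M, n_stretch = 3, 256, 8
n = M * n_stretch
sensors = rng.standard_normal((n_ch, n))
true_coeffs = rng.standard_normal((n_stretch, n_ch))
nn = np.concatenate([true_coeffs[i] @ sensors[:, i * M:(i + 1) * M]
                     for i in range(n_stretch)])
test_mass = nn + 0.01 * rng.standard_normal(n)

cleaned, coeffs = subtract_per_stretch(sensors, test_mass, M)
```

Each stretch uses its own test-mass samples to estimate the coefficients, which is why this runs on recorded data rather than as a feed-forward filter.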

When I realized that NN subtraction is a linear estimation problem with templates etc., it immediately occurred to me that one could do higher-order noise subtraction, so that we would never be limited by other contributions to the test-mass displacement (and here I essentially mean GWs, since you don't need to subtract NN below other GWD noise, but maybe below the GW spectrum if other instrumental noise is also weaker). Whether this scenario (NN > GW > other noise) is likely is something to look at in the future.
