LoginSignup
1
0

More than 3 years have passed since last update.

D-term determination

Last updated at Posted at 2018-04-30

This article addresses an algorithm to determine the cross-polarization leakage, a.k.a. D-term, through interferometric observations toward a polarization calibrator. The calibrator is assumed to have known Stokes parameters with significant linear polarization and zero circular polarization.

Basic Equation between Stokes Paramters, D-term, and correlations

Let

\boldsymbol{S} = \left(
\begin{array}{c}
I \\
Q \\
U \\
V \\
\end{array}
\right), \ 
\boldsymbol{X} = \left(
\begin{array}{c}
\left< XX^* \right> / ( G^m_X G^{n*}_X ) \\
\left< XY^* \right> / ( G^m_X G^{n*}_Y ) \\
\left< YX^* \right> / ( G^m_Y G^{n*}_X ) \\
\left< YY^* \right> / ( G^m_Y G^{n*}_Y ) \\
\end{array}
\right),

and

D = 
\left(
\begin{array}{cccc}
1              & D^{n*}_X        & D^m_X           & D^m_X D^{n*}_X \\
D^{n*}_Y       & 1               & D^m_X D^{n*}_Y  & D^m_X          \\
D^m_Y          & D^m_Y D^{n*}_X  & 1               & D^{n*}_X       \\
D^m_Y D^{n*}_Y & D^m_Y           & D^{n*}_Y        & 1              \\
\end{array}
\right), \ \ 
P = 
\left(
\begin{array}{cccc}
1 &  \cos 2\psi  &  \sin 2\psi & 0 \\
0 & -\sin 2\psi  &  \cos 2\psi & i \\
0 & -\sin 2\psi  &  \cos 2\psi & -i \\
1 & -\cos 2\psi  & -\sin 2\psi & 0 \\
\end{array}
\right),

where S is the Stokes parameters, $\left< XX^* \right>$ etc. are correlations, $G$ is the voltage-based complex gain, $D$ is the D-term, and $\psi$ is the parallactic angle. Then we have

\boldsymbol{X} = D P \boldsymbol{S}.

If $S$ is known (i.e. calibrators), we can determine D-terms by solving this equation.

Iterative Solution for D-terms

Solution conditions:
* Unknowns are Dx, Dy … $2N_{\rm ant}$ complex numbers. We have $4N_{\rm ant}$ unknown real parameters.
* We have $4 N_{\rm bl} = 2N_{\rm ant}(N_{\rm ant}-1)$ complex measurements, or $4N_{\rm ant}(N_{\rm ant}-1)$ real measurements.
* $PS$ is considered to be known.

Let $d$ as a vector of unknowns:

d = \left(
\begin{array}{c}
D_{\rm 0,x.real} \\
D_{\rm 1,x.real} \\
\vdots \\
D_{\rm 0,y.real} \\
\vdots \\
D_{\rm 0,x.imag} \\
D_{\rm 1,x.imag} \\
\vdots \\
D_{\rm 0,y.real} \\
D_{\rm 0,y.imag} \\
\vdots \\
D_{\rm Nant-1,y.imag} \\
\end{array} \right),

and let $A$ is a partial matrix between unknowns and measurements:

A = \left(
\begin{array}{cccccccccc}
\frac{\partial \left< X_1X^*_0 \right>_{\rm real}}{\partial D_{\rm 0, x.real}} &
\frac{\partial \left< X_1X^*_0 \right>_{\rm real}}{\partial D_{\rm 1, x.real}} &
\cdots &
\frac{\partial \left< X_1X^*_0 \right>_{\rm real}}{\partial D_{\rm 0, y.real}} &
\cdots &
\frac{\partial \left< X_1X^*_0 \right>_{\rm real}}{\partial D_{\rm 0, x.imag}} &
\cdots &
\frac{\partial \left< X_1X^*_0 \right>_{\rm real}}{\partial D_{\rm 1, x.imag}} &
\cdots &
\frac{\partial \left< X_1X^*_0 \right>_{\rm real}}{\partial D_{\rm  Nant-1,y.imag}}  \\
\frac{\partial \left< X_2X^*_0 \right>_{\rm real}}{\partial D_{\rm 0, x.real}} & \frac{\partial \left< X_2X^*_0 \right>_{\rm real}}{\partial D_{\rm 1, x.real}} & \cdots  \\
\frac{\partial \left< X_2X^*_1 \right>_{\rm real}}{\partial D_{\rm 0, x.real}} & \frac{\partial \left< X_2X^*_1 \right>_{\rm real}}{\partial D_{\rm 1, x.real}} & \cdots  \\
\vdots \\
\frac{\partial \left< X_1Y^*_0 \right>_{\rm real}}{\partial D_{\rm 0, x.real}} & \frac{\partial \left< X_1Y^*_0 \right>_{\rm real}}{\partial D_{\rm 1, x.real}} & \cdots  \\
\frac{\partial \left< X_2Y^*_0 \right>_{\rm real}}{\partial D_{\rm 0, x.real}} & \frac{\partial \left< X_2Y^*_0 \right>_{\rm real}}{\partial D_{\rm 1, x.real}} & \cdots  \\
\vdots \\
\frac{\partial \left< Y_1X^*_0 \right>_{\rm real}}{\partial D_{\rm 0, x.real}} & \frac{\partial \left< Y_1X^*_0 \right>_{\rm real}}{\partial D_{\rm 1, x.real}} & \cdots  \\
\vdots \\
\frac{\partial \left< Y_1Y^*_0 \right>_{\rm real}}{\partial D_{\rm 0, x.real}} & \frac{\partial \left< Y_1Y^*_0 \right>_{\rm real}}{\partial D_{\rm 1, x.real}} & \cdots  \\
\vdots \\
\frac{\partial \left< Y_{\rm Nant - 1}Y^*_{\rm Nant - 2} \right>_{\rm real}}{\partial D_{\rm 0, x.real}} & \frac{\partial \left< Y_{\rm Nant - 1}Y^*_{\rm Nant - 2} \right>_{\rm real}}{\partial D_{\rm 1, x.real}} & \cdots  \\
\vdots \\
{\rm and \ imaginary \ parts}
\end{array} \right).

Then we have iterative solution:

{\rm residual} \leftarrow X - D_{\rm trial} PS \\
{\rm correction} \leftarrow  A \cdot {\rm residual} \\
D_{\rm improved} \leftarrow D_{\rm trial} + {\rm correction} 

Parallel-hand solution (for initial model)

Because parallel-hand products have much greater amplitude and systematic error than cross-hand products, it is better to start only with the parallel-hand to obtain the initial model of $D$ and then later use cross-hand products for fine solution.

The parallel-hand productrs are described as

\left< XX^* \right> = I + QCpUS + (D^{m}_x + D^{n*}_x)UCmQS + D^{m}_x D^{n*}_x (I - QCpUS) \\
\left< YY^* \right> = I - QCpUS + (D^{m}_y + D^{n*}_y)UCmQS + D^{m}_y D^{n*}_y (I + QCpUS)

where $QCpUS = Q \cos 2\psi + U \sin 2\psi$ and $UCmQS = U \cos 2\psi - Q \sin 2\psi$.
The subset of the partial matrix, A, that respond to the parallel-hand products is composed of two $2 \times 2$ quadrants as

B = \left(
\begin{array}{cccc}
B_{00} & B_{01} \\
B_{10} & B_{11} \\
\end{array} \right)\  {\rm and} \   C = \left(
\begin{array}{cccc}
C_{00} & C_{01} \\
C_{10} & C_{11} \\
\end{array} \right)

where

B_{00} = \frac{\partial \left< XX^* \right>_{\rm real}}{\partial D_{\rm x, real}}, \ B_{01} = \frac{\partial \left< XX^* \right>_{\rm real}}{\partial D_{\rm x, imag}}, \ B_{10} = \frac{\partial \left< XX^* \right>_{\rm imag}}{\partial D_{\rm x, real}}, \dots

and

C_{00} = \frac{\partial \left< YY^* \right>_{\rm real}}{\partial D_{\rm y, real}}, \ C_{01} = \frac{\partial \left< YY^* \right>_{\rm real}}{\partial D_{\rm y, imag}}, \ C_{10} = \frac{\partial \left< YY^* \right>_{\rm imag}}{\partial D_{\rm y, real}}, \dots

etc. and, we have

B_{00} = UCmQS (U_F + U_B) + (I - QCpUS) (H^F_X + H^B_X) \\
B_{01} = (I - QCpUS) (K^F_X + K^B_X) \\
B_{10} = (I - QCpUS) (K^B_X - K^F_X) \\
B_{11} = UCmQS (U_F - U_B) + (I - QCpUS) (H^F_X - H^B_X)  \\
C_{00} = UCmQS (U_F + U_B) + (I + QCpUS) (H^F_Y + H^B_Y) \\
C_{01} = (I + QCpUS) (K^F_Y + K^B_Y) \\
C_{10} = (I + QCpUS) (K^B_Y - K^F_Y) \\
C_{11} = UCmQS (U_F - U_B) + (I + QCpUS) H^F_Y - H^B_Y)

where

U_F = \left(\begin{array}{ccccc} 
0 & 1 & 0 & 0 & \dots \\
0 & 0 & 1 & 0 & \dots \\
0 & 0 & 1 & 0 & \dots \\
0 & 0 & 0 & 1 & \dots \\
0 & 0 & 0 & 1 & \dots \\
0 & 0 & 0 & 1 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right), \ U_B = \left(\begin{array}{ccccc} 
1 & 0 & 0 & 0 & \dots \\
1 & 0 & 0 & 0 & \dots \\
0 & 1 & 0 & 0 & \dots \\
1 & 0 & 0 & 0 & \dots \\
0 & 1 & 0 & 0 & \dots \\
0 & 0 & 1 & 0 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right), \\
H^F_X = \left(\begin{array}{ccccc}
0 & D_{\rm 0,x,real} & 0 & 0 & \dots \\
0 & 0 & D_{\rm 0,x,real} & 0 & \dots \\
0 & 0 & D_{\rm 1,x,real} & 0 & \dots \\
0 & 0 & 0 & D_{\rm 0,x,real} & \dots \\
0 & 0 & 0 & D_{\rm 1,x,real} & \dots \\
0 & 0 & 0 & D_{\rm 2,x,real} & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right),\  H^B_X = \left(\begin{array}{ccccc}
D_{\rm 1,x,real} & 0 & 0 & 0 & \dots \\
D_{\rm 2,x,real} & 0 & 0 & 0 & \dots \\
0 & D_{\rm 2,x,real} & 0 & 0 & \dots \\
D_{\rm 3,x,real} & 0 & 0 & 0 & \dots \\
0 & D_{\rm 3,x,real} & 0 & 0 & \dots \\
0 & 0 & D_{\rm 3,x,real} & 0 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right), \\
K^F_X = \left(\begin{array}{ccccc}
0 & D_{\rm 0,x,imag} & 0 & 0 & \dots \\
0 & 0 & D_{\rm 0,x,imag} & 0 & \dots \\
0 & 0 & D_{\rm 1,x,imag} & 0 & \dots \\
0 & 0 & 0 & D_{\rm 0,x,imag} & \dots \\
0 & 0 & 0 & D_{\rm 1,x,imag} & \dots \\
0 & 0 & 0 & D_{\rm 2,x,imag} & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right),\  K^B_X = \left(\begin{array}{ccccc}
D_{\rm 1,x,imag} & 0 & 0 & 0 & \dots \\
D_{\rm 2,x,imag} & 0 & 0 & 0 & \dots \\
0 & D_{\rm 2,x,imag} & 0 & 0 & \dots \\
D_{\rm 3,x,imag} & 0 & 0 & 0 & \dots \\
0 & D_{\rm 3,x,imag} & 0 & 0 & \dots \\
0 & 0 & D_{\rm 3,x,imag} & 0 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right)

$H^F_Y, H^B_Y, K^F_Y,$ and $K^B_Y$ are also gien with the same manner. The suffices $F$ and $B$ stand for forward and bases antennas of a baseline.

We have

U_F^T U_F = \left(\begin{array}{ccccc}
0 & 0 & 0 & 0 & \dots \\
0 & 1 & 0 & 0 & \dots \\
0 & 0 & 2 & 0 & \dots \\
0 & 0 & 0 & 3 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right), \ 
U_F^T U_B = \left(\begin{array}{ccccc}
0 & 0 & 0 & 0 & \dots \\
1 & 0 & 0 & 0 & \dots \\
1 & 1 & 0 & 0 & \dots \\
1 & 1 & 1 & 0 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right), \\
U_B^T U_F = \left(\begin{array}{ccccc}
0 & 1 & 1 & 1 & \dots \\
0 & 0 & 1 & 1 & \dots \\
0 & 0 & 0 & 1 & \dots \\
0 & 0 & 0 & 0 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right), \ 
U_B^T U_B = \left(\begin{array}{ccccc}
N_{ant}-1 & 0 & 0 & 0 & \dots \\
0 & N_{ant}-2 & 0 & 0 & \dots \\
0 & 0 & N_{ant}-3 & 0 & \dots \\
0 & 0 & 0 & N_{ant}-4 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right)

Thus, $U_F^T U_F + U_B^T U_B = (N_{ant}-1)E$ and $U_F^TU_B + U_B^TU_F = 1 - E$.

If the first step starts using the initial values of D-term to be set zero, we can omit cross-D terms. This simplifies the equation into linearized one. On the other hand, the solution in the imaginary part will be uncertain because the determinant of $(U_F - U_B)^T(U_F - U_B)$ is zero. To avoid this problem, we have to fix the imaginary part of $D_X$ and $D_y$ for the reference antenna.
Let $U^{\prime}$ the matrix reducing the reference-antenna column, then we have

(U^{\prime T}_F U^{\prime}_B - U^{\prime T}_B U^{\prime}_F) = \left( \begin{array}{ccccc}
 N_a - 1 & -1 & -1 & -1 & \cdots \\
-1 & N_a - 1 & -1 & -1 & \cdots \\
-1 & -1 & N_a-1 & -1 & \cdots \\
-1 & -1 & -1 & N_a-1 & \cdots \\
\vdots & \vdots & \vdots & \vdots &\ddots
\end{array} \right)

and its inverse will be

\left( (U^{\prime T}_F U^{\prime}_B - U^{\prime T}_B U^{\prime}_F) \right)^{-1}= \frac{1}{N_a}\left( \begin{array}{ccccc}
 2 & 1 & 1 & 1 & \cdots \\
 1 & 2 & 1 & 1 & \cdots \\
 1 & 1 & 2 & 1 & \cdots \\
 1 & 1 & 1 & 2 & \cdots \\
\vdots & \vdots & \vdots & \vdots &\ddots
\end{array} \right) \tag{3.5}

Cross-hand solution

Cross-hand products, $\left< X_m Y^_n \right>$ and $\left< Y_m X^_n \right>$ relate to the Stokes parameters nad D-terms as

\left< X_m Y^*_n \right> = D^{n*}_y(I + QCpUS) + (1+ D^{m}_x D^{n*}_y)UCmQS + D^{m}_x (I - QCpUS) \\
\left< Y_m X^*_n \right> =D^{m}_y(I + QCpUS) + (1+ D^{m}_y D^{n*}_x)UCmQS + D^{n*}_x (I - QCpUS).

Inside the partial matrix

The partial matrix, $P$, has 4x4 segments:

P = \left(
\begin{array}{cccc}
P_{00} & P_{01} & P_{02} & P_{03} \\
P_{10} & P_{11} & P_{12} & P_{13} \\
P_{20} & P_{21} & P_{22} & P_{23} \\
P_{30} & P_{31} & P_{32} & P_{33} \\
\end{array} \right).

Each segment stands for

P_{0*} = \frac{\partial}{\partial *}\left< XY^* \right>_{\rm real}, \\ 
P_{1*} = \frac{\partial}{\partial *}\left< XY^* \right>_{\rm imag}, \\
P_{2*} = \frac{\partial}{\partial *}\left< YX^* \right>_{\rm real}, \\
P_{3*} = \frac{\partial}{\partial *}\left< YX^* \right>_{\rm imag}, \\
P_{*0} = \frac{\partial}{\partial D_{\rm x, real}}, \\
P_{*1} = \frac{\partial}{\partial D_{\rm x, imag}}, \\
P_{*2} = \frac{\partial}{\partial D_{\rm y, real}}, \\
P_{*3} = \frac{\partial}{\partial D_{\rm y, imag}}. \\

We have

P_{00} = (I - QCpUS)U_F + UCmQS \ H^F_Y \\
P_{01} = UCmQS \ K^F_Y  \\
P_{02} = (I - QCpUS)U_B + UCmQS \ H^B_X \\
P_{03} = UCmQS \ K^B_X  \\

and $P_{10} = -P_{01}$, $P_{11} = P_{00}$, $P_{12} = P_{03}$, $P_{13} = -P_{02}$, $P_{20} = (I - QCpUS)U_F + UCmQS \ H^F_Y$, $P_{21} = UCmQS\ K^F_Y$, $P_{22} = (I + QCpUS)U_F + UCmQS \ H^B_X$, $P_{23} = UCmQS \ K^F_X$, $P_{30} = UCmQS \ K^B_X$, $F_{31} = -(I - QCpUS) U_B - UCmQS \ K^B_X$, $F_{32} = UCmQS \ K^F_X$, and $P_{33} = (I + QCpUS) U_F - UCmQS \ H^B_X$.

Let us omit $UCmQS \ H$ and $UCmQS \ K$ terms. Then, $P^TP$ will be

P^TP =  \left( \begin{array}{cccc}
ImmQCpUS(U^T_F U_F + U^T_B U_B) & 0 & IpmQCpUS (U^T_F U_B + U^T_B U_F) & 0 \\
0 & ImmQCpUS(U^T_F U_F + U^T_B U_B) & 0 & -IpmQCpUS (U^T_F U_B + U^T_B U_F) \\
IpmQCpUS (U^T_B U_F + U^T_F U_B) & 0 & IppQCpUS (U^T_F U_F + U^T_B U_B) & 0 \\
0 & -IpmQCpUS(U^T_F U_B + U^T_B U_F) & 0 & IppQCpUS (U^T_F U_F + U^T_B U_B) \\
\end{array} \right)

Here are Python codes to get the $P^TP$ matrix:

#-------- <XY*> and <YX*> to determine Dx and Dy 
PTP = np.zeros([4*antNum, 4*antNum])            # (Dx.real, Dx.imag, Dy.real, Dy.imag)^2
PTP[0:2*antNum][:,0:2*antNum] = (ssqQCpUS - 2.0*sumQCpUS + PAnum)* (antNum - 1.0)* np.identity(2*antNum)
PTP[2*antNum:4*antNum][:,2*antNum:4*antNum] = (ssqQCpUS + 2.0*sumQCpUS + PAnum)* (antNum - 1.0)* np.identity(2*antNum) 
PTP[2*antNum:3*antNum][:,0:antNum] = (PAnum - ssqQCpUS) * (1.0 - np.identity(antNum))
PTP[0:antNum][:,2*antNum:3*antNum] = PTP[2*antNum:3*antNum][:,0:antNum]
PTP[3*antNum:4*antNum][:,antNum:2*antNum] = -PTP[2*antNum:3*antNum][:,0:antNum]
PTP[antNum:2*antNum][:,3*antNum:4*antNum] = -PTP[2*antNum:3*antNum][:,0:antNum]

The residual vector is also calculated using given PA, Stokes Q, U, and observed visibilities:

CS, SN = np.cos(2.0* PA), np.sin(2.0*PA)
QCpUS = Stokes[1]*CS + Stokes[2]*SN
UCmQS = Stokes[2]*CS - Stokes[1]*SN
#-------- Residual Vector
residXY, residYX = Vis[1] - UCmQS, Vis[2] - UCmQS
for ant_index in range(antNum):
    index0, index1 = np.where(ant0 == ant_index)[0].tolist(), np.where(ant1 == ant_index)[0].tolist()
    residXY[index0] -= Dx[ant_index]* (Stokes[0] - QCpUS)
    residXY[index1] -= Dy[ant_index].conjugate()* (Stokes[0] + QCpUS)
    residYX[index0] -= Dy[ant_index]* (Stokes[0] + QCpUS)
    residYX[index1] -= Dx[ant_index].conjugate()* (Stokes[0] - QCpUS)
#

The product of $P^T$ and the residual will be

PTRX, PTRY = np.zeros(antNum, dtype=complex), np.zeros(antNum, dtype=complex)
for ant_index in range(antNum):
    index0, index1 = np.where(ant0 == ant_index)[0].tolist(), np.where(ant1 == ant_index)[0].tolist()
    PTRX[ant_index] += (Stokes[0] - QCpUS).dot(np.sum(residXY[index0], axis=0))
    PTRX[ant_index] += (Stokes[0] - QCpUS).dot(np.sum(residYX[index1], axis=0).conjugate())
    PTRY[ant_index] += (Stokes[0] + QCpUS).dot(np.sum(residYX[index0], axis=0))
    PTRY[ant_index] += (Stokes[0] + QCpUS).dot(np.sum(residXY[index1], axis=0).conjugate())
#
resid = np.array([PTRX.real, PTRX.imag, PTRY.real, PTRY.imag]).reshape(4*antNum)

The solution for correction vector will be determined:

#-------- Solution
L = np.linalg.cholesky(PTP)
t = np.linalg.solve(L, resid)
Solution = np.linalg.solve(L.T, t)
#-------- Correction
Dx += Solution[0:antNum] + (1.0j)* Solution[antNum:2*antNum]
Dy += Solution[2*antNum:3*antNum] + (1.0j)* Solution[3*antNum:4*antNum]
1
0
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
1
0