This article addresses an algorithm to determine the cross-polarization leakage, a.k.a. D-term, through interferometric observations toward a polarization calibrator. The calibrator is assumed to have known Stokes parameters with significant linear polarization and zero circular polarization.
Basic Equation between Stokes Paramters, D-term, and correlations
Let
\boldsymbol{S} = \left(
\begin{array}{c}
I \\
Q \\
U \\
V \\
\end{array}
\right), \
\boldsymbol{X} = \left(
\begin{array}{c}
\left< XX^* \right> / ( G^m_X G^{n*}_X ) \\
\left< XY^* \right> / ( G^m_X G^{n*}_Y ) \\
\left< YX^* \right> / ( G^m_Y G^{n*}_X ) \\
\left< YY^* \right> / ( G^m_Y G^{n*}_Y ) \\
\end{array}
\right),
and
D =
\left(
\begin{array}{cccc}
1 & D^{n*}_X & D^m_X & D^m_X D^{n*}_X \\
D^{n*}_Y & 1 & D^m_X D^{n*}_Y & D^m_X \\
D^m_Y & D^m_Y D^{n*}_X & 1 & D^{n*}_X \\
D^m_Y D^{n*}_Y & D^m_Y & D^{n*}_Y & 1 \\
\end{array}
\right), \ \
P =
\left(
\begin{array}{cccc}
1 & \cos 2\psi & \sin 2\psi & 0 \\
0 & -\sin 2\psi & \cos 2\psi & i \\
0 & -\sin 2\psi & \cos 2\psi & -i \\
1 & -\cos 2\psi & -\sin 2\psi & 0 \\
\end{array}
\right),
where S is the Stokes parameters, $\left< XX^* \right>$ etc. are correlations, $G$ is the voltage-based complex gain, $D$ is the D-term, and $\psi$ is the parallactic angle. Then we have
\boldsymbol{X} = D P \boldsymbol{S}.
If $S$ is known (i.e. calibrators), we can determine D-terms by solving this equation.
Iterative Solution for D-terms
Solution conditions:
- Unknowns are Dx, Dy … $2N_{\rm ant}$ complex numbers. We have $4N_{\rm ant}$ unknown real parameters.
- We have $4 N_{\rm bl} = 2N_{\rm ant}(N_{\rm ant}-1)$ complex measurements, or $4N_{\rm ant}(N_{\rm ant}-1)$ real measurements.
- $PS$ is considered to be known.
Let $d$ as a vector of unknowns:
d = \left(
\begin{array}{c}
D_{\rm 0,x.real} \\
D_{\rm 1,x.real} \\
\vdots \\
D_{\rm 0,y.real} \\
\vdots \\
D_{\rm 0,x.imag} \\
D_{\rm 1,x.imag} \\
\vdots \\
D_{\rm 0,y.real} \\
D_{\rm 0,y.imag} \\
\vdots \\
D_{\rm Nant-1,y.imag} \\
\end{array} \right),
and let $A$ is a partial matrix between unknowns and measurements:
A = \left(
\begin{array}{cccccccccc}
\frac{\partial \left< X_1X^*_0 \right>_{\rm real}}{\partial D_{\rm 0, x.real}} &
\frac{\partial \left< X_1X^*_0 \right>_{\rm real}}{\partial D_{\rm 1, x.real}} &
\cdots &
\frac{\partial \left< X_1X^*_0 \right>_{\rm real}}{\partial D_{\rm 0, y.real}} &
\cdots &
\frac{\partial \left< X_1X^*_0 \right>_{\rm real}}{\partial D_{\rm 0, x.imag}} &
\cdots &
\frac{\partial \left< X_1X^*_0 \right>_{\rm real}}{\partial D_{\rm 1, x.imag}} &
\cdots &
\frac{\partial \left< X_1X^*_0 \right>_{\rm real}}{\partial D_{\rm Nant-1,y.imag}} \\
\frac{\partial \left< X_2X^*_0 \right>_{\rm real}}{\partial D_{\rm 0, x.real}} & \frac{\partial \left< X_2X^*_0 \right>_{\rm real}}{\partial D_{\rm 1, x.real}} & \cdots \\
\frac{\partial \left< X_2X^*_1 \right>_{\rm real}}{\partial D_{\rm 0, x.real}} & \frac{\partial \left< X_2X^*_1 \right>_{\rm real}}{\partial D_{\rm 1, x.real}} & \cdots \\
\vdots \\
\frac{\partial \left< X_1Y^*_0 \right>_{\rm real}}{\partial D_{\rm 0, x.real}} & \frac{\partial \left< X_1Y^*_0 \right>_{\rm real}}{\partial D_{\rm 1, x.real}} & \cdots \\
\frac{\partial \left< X_2Y^*_0 \right>_{\rm real}}{\partial D_{\rm 0, x.real}} & \frac{\partial \left< X_2Y^*_0 \right>_{\rm real}}{\partial D_{\rm 1, x.real}} & \cdots \\
\vdots \\
\frac{\partial \left< Y_1X^*_0 \right>_{\rm real}}{\partial D_{\rm 0, x.real}} & \frac{\partial \left< Y_1X^*_0 \right>_{\rm real}}{\partial D_{\rm 1, x.real}} & \cdots \\
\vdots \\
\frac{\partial \left< Y_1Y^*_0 \right>_{\rm real}}{\partial D_{\rm 0, x.real}} & \frac{\partial \left< Y_1Y^*_0 \right>_{\rm real}}{\partial D_{\rm 1, x.real}} & \cdots \\
\vdots \\
\frac{\partial \left< Y_{\rm Nant - 1}Y^*_{\rm Nant - 2} \right>_{\rm real}}{\partial D_{\rm 0, x.real}} & \frac{\partial \left< Y_{\rm Nant - 1}Y^*_{\rm Nant - 2} \right>_{\rm real}}{\partial D_{\rm 1, x.real}} & \cdots \\
\vdots \\
{\rm and \ imaginary \ parts}
\end{array} \right).
Then we have iterative solution:
{\rm residual} \leftarrow X - D_{\rm trial} PS \\
{\rm correction} \leftarrow A \cdot {\rm residual} \\
D_{\rm improved} \leftarrow D_{\rm trial} + {\rm correction}
Parallel-hand solution (for initial model)
Because parallel-hand products have much greater amplitude and systematic error than cross-hand products, it is better to start only with the parallel-hand to obtain the initial model of $D$ and then later use cross-hand products for fine solution.
The parallel-hand productrs are described as
\left< XX^* \right> = I + QCpUS + (D^{m}_x + D^{n*}_x)UCmQS + D^{m}_x D^{n*}_x (I - QCpUS) \\
\left< YY^* \right> = I - QCpUS + (D^{m}_y + D^{n*}_y)UCmQS + D^{m}_y D^{n*}_y (I + QCpUS)
where $QCpUS = Q \cos 2\psi + U \sin 2\psi$ and $UCmQS = U \cos 2\psi - Q \sin 2\psi$.
The subset of the partial matrix, A, that respond to the parallel-hand products is composed of two $2 \times 2$ quadrants as
B = \left(
\begin{array}{cccc}
B_{00} & B_{01} \\
B_{10} & B_{11} \\
\end{array} \right)\ {\rm and} \ C = \left(
\begin{array}{cccc}
C_{00} & C_{01} \\
C_{10} & C_{11} \\
\end{array} \right)
where
B_{00} = \frac{\partial \left< XX^* \right>_{\rm real}}{\partial D_{\rm x, real}}, \ B_{01} = \frac{\partial \left< XX^* \right>_{\rm real}}{\partial D_{\rm x, imag}}, \ B_{10} = \frac{\partial \left< XX^* \right>_{\rm imag}}{\partial D_{\rm x, real}}, \dots
and
C_{00} = \frac{\partial \left< YY^* \right>_{\rm real}}{\partial D_{\rm y, real}}, \ C_{01} = \frac{\partial \left< YY^* \right>_{\rm real}}{\partial D_{\rm y, imag}}, \ C_{10} = \frac{\partial \left< YY^* \right>_{\rm imag}}{\partial D_{\rm y, real}}, \dots
etc. and, we have
B_{00} = UCmQS (U_F + U_B) + (I - QCpUS) (H^F_X + H^B_X) \\
B_{01} = (I - QCpUS) (K^F_X + K^B_X) \\
B_{10} = (I - QCpUS) (K^B_X - K^F_X) \\
B_{11} = UCmQS (U_F - U_B) + (I - QCpUS) (H^F_X - H^B_X) \\
C_{00} = UCmQS (U_F + U_B) + (I + QCpUS) (H^F_Y + H^B_Y) \\
C_{01} = (I + QCpUS) (K^F_Y + K^B_Y) \\
C_{10} = (I + QCpUS) (K^B_Y - K^F_Y) \\
C_{11} = UCmQS (U_F - U_B) + (I + QCpUS) H^F_Y - H^B_Y)
where
U_F = \left(\begin{array}{ccccc}
0 & 1 & 0 & 0 & \dots \\
0 & 0 & 1 & 0 & \dots \\
0 & 0 & 1 & 0 & \dots \\
0 & 0 & 0 & 1 & \dots \\
0 & 0 & 0 & 1 & \dots \\
0 & 0 & 0 & 1 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right), \ U_B = \left(\begin{array}{ccccc}
1 & 0 & 0 & 0 & \dots \\
1 & 0 & 0 & 0 & \dots \\
0 & 1 & 0 & 0 & \dots \\
1 & 0 & 0 & 0 & \dots \\
0 & 1 & 0 & 0 & \dots \\
0 & 0 & 1 & 0 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right), \\
H^F_X = \left(\begin{array}{ccccc}
0 & D_{\rm 0,x,real} & 0 & 0 & \dots \\
0 & 0 & D_{\rm 0,x,real} & 0 & \dots \\
0 & 0 & D_{\rm 1,x,real} & 0 & \dots \\
0 & 0 & 0 & D_{\rm 0,x,real} & \dots \\
0 & 0 & 0 & D_{\rm 1,x,real} & \dots \\
0 & 0 & 0 & D_{\rm 2,x,real} & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right),\ H^B_X = \left(\begin{array}{ccccc}
D_{\rm 1,x,real} & 0 & 0 & 0 & \dots \\
D_{\rm 2,x,real} & 0 & 0 & 0 & \dots \\
0 & D_{\rm 2,x,real} & 0 & 0 & \dots \\
D_{\rm 3,x,real} & 0 & 0 & 0 & \dots \\
0 & D_{\rm 3,x,real} & 0 & 0 & \dots \\
0 & 0 & D_{\rm 3,x,real} & 0 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right), \\
K^F_X = \left(\begin{array}{ccccc}
0 & D_{\rm 0,x,imag} & 0 & 0 & \dots \\
0 & 0 & D_{\rm 0,x,imag} & 0 & \dots \\
0 & 0 & D_{\rm 1,x,imag} & 0 & \dots \\
0 & 0 & 0 & D_{\rm 0,x,imag} & \dots \\
0 & 0 & 0 & D_{\rm 1,x,imag} & \dots \\
0 & 0 & 0 & D_{\rm 2,x,imag} & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right),\ K^B_X = \left(\begin{array}{ccccc}
D_{\rm 1,x,imag} & 0 & 0 & 0 & \dots \\
D_{\rm 2,x,imag} & 0 & 0 & 0 & \dots \\
0 & D_{\rm 2,x,imag} & 0 & 0 & \dots \\
D_{\rm 3,x,imag} & 0 & 0 & 0 & \dots \\
0 & D_{\rm 3,x,imag} & 0 & 0 & \dots \\
0 & 0 & D_{\rm 3,x,imag} & 0 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right)
$H^F_Y, H^B_Y, K^F_Y,$ and $K^B_Y$ are also gien with the same manner. The suffices $F$ and $B$ stand for forward and bases antennas of a baseline.
We have
U_F^T U_F = \left(\begin{array}{ccccc}
0 & 0 & 0 & 0 & \dots \\
0 & 1 & 0 & 0 & \dots \\
0 & 0 & 2 & 0 & \dots \\
0 & 0 & 0 & 3 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right), \
U_F^T U_B = \left(\begin{array}{ccccc}
0 & 0 & 0 & 0 & \dots \\
1 & 0 & 0 & 0 & \dots \\
1 & 1 & 0 & 0 & \dots \\
1 & 1 & 1 & 0 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right), \\
U_B^T U_F = \left(\begin{array}{ccccc}
0 & 1 & 1 & 1 & \dots \\
0 & 0 & 1 & 1 & \dots \\
0 & 0 & 0 & 1 & \dots \\
0 & 0 & 0 & 0 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right), \
U_B^T U_B = \left(\begin{array}{ccccc}
N_{ant}-1 & 0 & 0 & 0 & \dots \\
0 & N_{ant}-2 & 0 & 0 & \dots \\
0 & 0 & N_{ant}-3 & 0 & \dots \\
0 & 0 & 0 & N_{ant}-4 & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \right)
Thus, $U_F^T U_F + U_B^T U_B = (N_{ant}-1)E$ and $U_F^TU_B + U_B^TU_F = 1 - E$.
If the first step starts using the initial values of D-term to be set zero, we can omit cross-D terms. This simplifies the equation into linearized one. On the other hand, the solution in the imaginary part will be uncertain because the determinant of $(U_F - U_B)^T(U_F - U_B)$ is zero. To avoid this problem, we have to fix the imaginary part of $D_X$ and $D_y$ for the reference antenna.
Let $U^{\prime}$ the matrix reducing the reference-antenna column, then we have
(U^{\prime T}_F U^{\prime}_B - U^{\prime T}_B U^{\prime}_F) = \left( \begin{array}{ccccc}
N_a - 1 & -1 & -1 & -1 & \cdots \\
-1 & N_a - 1 & -1 & -1 & \cdots \\
-1 & -1 & N_a-1 & -1 & \cdots \\
-1 & -1 & -1 & N_a-1 & \cdots \\
\vdots & \vdots & \vdots & \vdots &\ddots
\end{array} \right)
and its inverse will be
\left( (U^{\prime T}_F U^{\prime}_B - U^{\prime T}_B U^{\prime}_F) \right)^{-1}= \frac{1}{N_a}\left( \begin{array}{ccccc}
2 & 1 & 1 & 1 & \cdots \\
1 & 2 & 1 & 1 & \cdots \\
1 & 1 & 2 & 1 & \cdots \\
1 & 1 & 1 & 2 & \cdots \\
\vdots & \vdots & \vdots & \vdots &\ddots
\end{array} \right) \tag{3.5}
Cross-hand solution
Cross-hand products, $\left< X_m Y^*_n \right>$ and $\left< Y_m X^*_n \right>$ relate to the Stokes parameters nad D-terms as
\left< X_m Y^*_n \right> = D^{n*}_y(I + QCpUS) + (1+ D^{m}_x D^{n*}_y)UCmQS + D^{m}_x (I - QCpUS) \\
\left< Y_m X^*_n \right> =D^{m}_y(I + QCpUS) + (1+ D^{m}_y D^{n*}_x)UCmQS + D^{n*}_x (I - QCpUS).
Inside the partial matrix
The partial matrix, $P$, has 4x4 segments:
P = \left(
\begin{array}{cccc}
P_{00} & P_{01} & P_{02} & P_{03} \\
P_{10} & P_{11} & P_{12} & P_{13} \\
P_{20} & P_{21} & P_{22} & P_{23} \\
P_{30} & P_{31} & P_{32} & P_{33} \\
\end{array} \right).
Each segment stands for
P_{0*} = \frac{\partial}{\partial *}\left< XY^* \right>_{\rm real}, \\
P_{1*} = \frac{\partial}{\partial *}\left< XY^* \right>_{\rm imag}, \\
P_{2*} = \frac{\partial}{\partial *}\left< YX^* \right>_{\rm real}, \\
P_{3*} = \frac{\partial}{\partial *}\left< YX^* \right>_{\rm imag}, \\
P_{*0} = \frac{\partial}{\partial D_{\rm x, real}}, \\
P_{*1} = \frac{\partial}{\partial D_{\rm x, imag}}, \\
P_{*2} = \frac{\partial}{\partial D_{\rm y, real}}, \\
P_{*3} = \frac{\partial}{\partial D_{\rm y, imag}}. \\
We have
P_{00} = (I - QCpUS)U_F + UCmQS \ H^F_Y \\
P_{01} = UCmQS \ K^F_Y \\
P_{02} = (I - QCpUS)U_B + UCmQS \ H^B_X \\
P_{03} = UCmQS \ K^B_X \\
and $P_{10} = -P_{01}$, $P_{11} = P_{00}$, $P_{12} = P_{03}$, $P_{13} = -P_{02}$, $P_{20} = (I - QCpUS)U_F + UCmQS \ H^F_Y$, $P_{21} = UCmQS\ K^F_Y$, $P_{22} = (I + QCpUS)U_F + UCmQS \ H^B_X$, $P_{23} = UCmQS \ K^F_X$, $P_{30} = UCmQS \ K^B_X$, $F_{31} = -(I - QCpUS) U_B - UCmQS \ K^B_X$, $F_{32} = UCmQS \ K^F_X$, and $P_{33} = (I + QCpUS) U_F - UCmQS \ H^B_X$.
Let us omit $UCmQS \ H$ and $UCmQS \ K$ terms. Then, $P^TP$ will be
P^TP = \left( \begin{array}{cccc}
ImmQCpUS(U^T_F U_F + U^T_B U_B) & 0 & IpmQCpUS (U^T_F U_B + U^T_B U_F) & 0 \\
0 & ImmQCpUS(U^T_F U_F + U^T_B U_B) & 0 & -IpmQCpUS (U^T_F U_B + U^T_B U_F) \\
IpmQCpUS (U^T_B U_F + U^T_F U_B) & 0 & IppQCpUS (U^T_F U_F + U^T_B U_B) & 0 \\
0 & -IpmQCpUS(U^T_F U_B + U^T_B U_F) & 0 & IppQCpUS (U^T_F U_F + U^T_B U_B) \\
\end{array} \right)
Here are Python codes to get the $P^TP$ matrix:
#-------- <XY*> and <YX*> to determine Dx and Dy
PTP = np.zeros([4*antNum, 4*antNum]) # (Dx.real, Dx.imag, Dy.real, Dy.imag)^2
PTP[0:2*antNum][:,0:2*antNum] = (ssqQCpUS - 2.0*sumQCpUS + PAnum)* (antNum - 1.0)* np.identity(2*antNum)
PTP[2*antNum:4*antNum][:,2*antNum:4*antNum] = (ssqQCpUS + 2.0*sumQCpUS + PAnum)* (antNum - 1.0)* np.identity(2*antNum)
PTP[2*antNum:3*antNum][:,0:antNum] = (PAnum - ssqQCpUS) * (1.0 - np.identity(antNum))
PTP[0:antNum][:,2*antNum:3*antNum] = PTP[2*antNum:3*antNum][:,0:antNum]
PTP[3*antNum:4*antNum][:,antNum:2*antNum] = -PTP[2*antNum:3*antNum][:,0:antNum]
PTP[antNum:2*antNum][:,3*antNum:4*antNum] = -PTP[2*antNum:3*antNum][:,0:antNum]
The residual vector is also calculated using given PA, Stokes Q, U, and observed visibilities:
CS, SN = np.cos(2.0* PA), np.sin(2.0*PA)
QCpUS = Stokes[1]*CS + Stokes[2]*SN
UCmQS = Stokes[2]*CS - Stokes[1]*SN
#-------- Residual Vector
residXY, residYX = Vis[1] - UCmQS, Vis[2] - UCmQS
for ant_index in range(antNum):
index0, index1 = np.where(ant0 == ant_index)[0].tolist(), np.where(ant1 == ant_index)[0].tolist()
residXY[index0] -= Dx[ant_index]* (Stokes[0] - QCpUS)
residXY[index1] -= Dy[ant_index].conjugate()* (Stokes[0] + QCpUS)
residYX[index0] -= Dy[ant_index]* (Stokes[0] + QCpUS)
residYX[index1] -= Dx[ant_index].conjugate()* (Stokes[0] - QCpUS)
#
The product of $P^T$ and the residual will be
PTRX, PTRY = np.zeros(antNum, dtype=complex), np.zeros(antNum, dtype=complex)
for ant_index in range(antNum):
index0, index1 = np.where(ant0 == ant_index)[0].tolist(), np.where(ant1 == ant_index)[0].tolist()
PTRX[ant_index] += (Stokes[0] - QCpUS).dot(np.sum(residXY[index0], axis=0))
PTRX[ant_index] += (Stokes[0] - QCpUS).dot(np.sum(residYX[index1], axis=0).conjugate())
PTRY[ant_index] += (Stokes[0] + QCpUS).dot(np.sum(residYX[index0], axis=0))
PTRY[ant_index] += (Stokes[0] + QCpUS).dot(np.sum(residXY[index1], axis=0).conjugate())
#
resid = np.array([PTRX.real, PTRX.imag, PTRY.real, PTRY.imag]).reshape(4*antNum)
The solution for correction vector will be determined:
#-------- Solution
L = np.linalg.cholesky(PTP)
t = np.linalg.solve(L, resid)
Solution = np.linalg.solve(L.T, t)
#-------- Correction
Dx += Solution[0:antNum] + (1.0j)* Solution[antNum:2*antNum]
Dy += Solution[2*antNum:3*antNum] + (1.0j)* Solution[3*antNum:4*antNum]