Radio Frequency Interference (RFI) corrupts astronomical measurements, thus
affecting the performance of radio telescopes. To address this problem,
supervised segmentation models have been proposed as candidate solutions to RFI
detection. However, the unavailability of large labelled datasets, due to the
prohibitive cost of annotation, makes these solutions unusable. To address this
shortcoming, we focus on the inverse problem: training models on only
uncontaminated emissions, thereby learning to discriminate RFI from all known
astronomical signals and system noise. We use Nearest-Latent-Neighbours (NLN),
an algorithm that utilises both the reconstructions and latent distances to the
nearest-neighbours in the latent space of generative autoencoding models for
novelty detection. The uncontaminated regions are selected using weak labels in
the form of RFI flags (generated by classical RFI flagging methods) available
from most radio astronomical data archives at no additional cost. We evaluate
performance on two independent datasets, one simulated from the HERA telescope
and another consisting of real observations from the LOFAR telescope. Additionally,
we provide a small expert-labelled LOFAR dataset (i.e., strong labels) for
evaluation of our method and others. Performance is measured using AUROC, AUPRC,
and the maximum F1-score for a fixed threshold. On the simulated HERA dataset,
we outperform the current state-of-the-art by approximately 1% in AUROC and 3%
in AUPRC. Furthermore, on the LOFAR dataset, our algorithm offers a 4% increase
in both AUROC and AUPRC, at the cost of a degradation in F1-score, without any
manual labelling.
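
The NLN scoring idea described above, combining reconstructions with latent distances to nearest neighbours, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `nln_scores`, the parameters `k` and `alpha`, the use of Euclidean distance and mean absolute error, and the weighted-sum combination are all assumptions made for illustration. The sketch assumes an autoencoder has already been trained on uncontaminated patches, so that latent codes and decoded reconstructions are available as arrays.

```python
import numpy as np

def nln_scores(x_test, z_test, z_train, decoded_train, k=2, alpha=0.5):
    """Hypothetical NLN-style novelty score (illustrative, not the paper's code).

    x_test:        (M, P) flattened test patches
    z_test:        (M, D) latent codes of the test patches
    z_train:       (N, D) latent codes of uncontaminated training patches
    decoded_train: (N, P) decoder outputs for the training latent codes
    k, alpha:      assumed hyperparameters (neighbour count, mixing weight)
    """
    # Pairwise Euclidean distances between test and training latents: (M, N)
    dists = np.linalg.norm(z_test[:, None, :] - z_train[None, :, :], axis=-1)
    # Indices of the k nearest training latents for each test sample: (M, k)
    idx = np.argsort(dists, axis=1)[:, :k]
    # Mean latent distance to those neighbours: (M,)
    latent_dist = np.take_along_axis(dists, idx, axis=1).mean(axis=1)
    # Mean absolute error between each test patch and its neighbours'
    # reconstructions, averaged over neighbours and pixels: (M,)
    neighbours = decoded_train[idx]  # (M, k, P)
    recon_err = np.abs(x_test[:, None, :] - neighbours).mean(axis=(1, 2))
    # Assumed combination: weighted sum of the two terms
    return alpha * recon_err + (1 - alpha) * latent_dist

# Toy usage: the first test sample matches the training data, the second does not.
z_train = np.array([[0.0, 0.0], [1.0, 0.0]])
decoded_train = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
z_test = np.array([[0.0, 0.0], [5.0, 5.0]])
x_test = np.array([[0.0, 0.0, 0.0], [9.0, 9.0, 9.0]])
scores = nln_scores(x_test, z_test, z_train, decoded_train, k=1)
```

In this toy example the second sample lies far from every training latent and differs from its neighbour's reconstruction, so it receives the higher score; thresholding such scores would yield a flag mask.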
Preprint
Subject: Astrophysics - Instrumentation and Methods for Astrophysics