By Patrick Aleo, on behalf of the SNAD team
Location of three ZTF fields analysed in this work with marked anomaly candidates
The SNAD team - an international network formed by researchers from Russia, France and the USA - has developed a pipeline
to find rare and exotic objects among the haystacks of data from astronomical surveys.
Given the ever increasing size of astronomical data sets, even if our telescopes do detect unexpected interesting astronomical phenomena,
it is very unlikely that we will be able to recognize them in the middle of millions or even billions of observations.
The solution is to develop automatic tools specifically designed to recognize unusual behaviors hidden among billions of measurements.
Some of these tools already exist and are used, for example, to identify fraudulent credit card activity among millions of transactions every day.
However, adapting them to scientific astronomical data is not straightforward because of complications arising from the nature of astronomical observations.
The SNAD team has spent 3 years developing and adapting such solutions to the context of astronomy.
During their last annual meeting, the group focused their efforts on objects whose brightness varies with time. The pipeline combines the strengths
of machine learning algorithms and the irreplaceable knowledge from human experts to build a robust anomaly detection tool. The article describes
results from applying this framework to the third data release of the Zwicky Transient Facility (ZTF). Its three-stage process involved feature extraction
on light curves (which track the brightness of objects over time), a search for anomaly candidates using several machine learning algorithms, and manual
filtering of candidates by a human expert. This last stage also included performing observations with other telescopes whenever possible. In this study,
4 machine learning algorithms were used to flag 277 anomaly candidates for human investigation - out of an initial data set of 2.25 million objects.
The group also developed a purpose-built web interface that allowed immediate visualization and cross-matching of each candidate with existing
astronomical catalogs. It was built to facilitate the work of the experts, who need to correlate the anomaly candidates with any other
publicly available information about the sky coordinates under investigation.
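The three stages described above can be illustrated with a toy sketch. It is not the SNAD pipeline: the features (scatter, amplitude, skewness), the robust z-score outlier measure and the simulated light curves are all assumptions chosen so that the example runs with the Python standard library alone, standing in for the much richer feature set and the machine learning algorithms used in the actual study.

```python
import random
import statistics


def extract_features(mags):
    """Stage 1: summarise one light curve (a list of magnitudes) as a few numbers."""
    mean = statistics.fmean(mags)
    std = statistics.pstdev(mags)
    amplitude = (max(mags) - min(mags)) / 2
    # Skewness helps separate dips (e.g. eclipses) from brightenings (e.g. flares).
    if std == 0:
        skew = 0.0
    else:
        skew = statistics.fmean([((m - mean) / std) ** 3 for m in mags])
    return [std, amplitude, skew]


def outlier_scores(feature_rows):
    """Stage 2 (stand-in): largest robust z-score of each object across all features."""
    columns = list(zip(*feature_rows))
    medians = [statistics.median(col) for col in columns]
    # Median absolute deviation per feature; the small floor avoids division by zero.
    mads = [
        max(statistics.median([abs(v - med) for v in col]), 1e-9)
        for col, med in zip(columns, medians)
    ]
    return [
        max(abs(v - med) / mad for v, med, mad in zip(row, medians, mads))
        for row in feature_rows
    ]


def shortlist(scores, n_candidates):
    """Stage 3 input: indices of the highest-scoring objects, for a human expert to inspect."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:n_candidates]


# Simulated data (hypothetical): 200 quiet stars near magnitude 18 with small noise.
rng = random.Random(42)
light_curves = [[18.0 + rng.gauss(0.0, 0.02) for _ in range(50)] for _ in range(200)]

# One object brightens by a full magnitude for five epochs (smaller magnitude = brighter).
flare_index = 137
for epoch in range(20, 25):
    light_curves[flare_index][epoch] -= 1.0

features = [extract_features(lc) for lc in light_curves]
scores = outlier_scores(features)
candidates = shortlist(scores, n_candidates=3)
print("Objects to show the expert:", candidates)
```

In this sketch the machine only ranks objects; deciding whether a candidate is an artefact, a known object or something new is left to the human expert, as in the pipeline described above.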
Of the 277 objects considered anomalous by the machine, 188 (68%) were found to display unusual features due to non-astrophysical effects
(including defects due to ZTF's image subtraction pipeline), 66 (24%) were previously cataloged objects and 23 (8%) were previously unknown objects.
The first category includes some amusing curiosities, while the latter two contain cases of scientific interest. For example, one object flagged as an anomaly by
the machine was actually the occultation of a background star by the asteroid Barcelona, which to an observer on Earth appeared
as a variable point source when in reality neither the star nor the asteroid changed brightness. The authors also characterised
recurring and exotic image subtraction artefacts, which interfere with light curve analysis and can trick an anomaly detection pipeline into treating
them as real, anomalous objects. To help quickly separate the first category from the remaining candidates, they identified a simple
two-dimensional relation that can be used to help filter potentially bogus light curves in future studies.
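The breakdown of the 277 candidates quoted above can be checked with a few lines of arithmetic. The counts are taken from the text; the variable names are illustrative, and the flag rate is only approximate because the initial sample size is quoted as a rounded 2.25 million.

```python
# Counts as quoted in the article.
total_candidates = 277
breakdown = {
    "non-astrophysical": 188,
    "previously cataloged": 66,
    "previously unknown": 23,
}

# The three categories should account for every candidate.
assert sum(breakdown.values()) == total_candidates

# Percentage share of each category, rounded to the nearest whole percent.
shares = {name: round(100 * count / total_candidates) for name, count in breakdown.items()}

# Approximate fraction of the initial data set that was flagged for human inspection.
initial_objects = 2_250_000
flag_rate = total_candidates / initial_objects

for name, share in shares.items():
    print(f"{name}: {share}%")
print(f"flagged fraction: {flag_rate:.4%}")
```

Roughly one object in every eight thousand was passed on to a human expert, which is what makes the manual inspection stage feasible.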
Among the second and third categories, the authors found 4 supernova candidates, 6 previously unclassified eclipsing binaries, 4 pre-main-sequence
candidates, 1 possible red dwarf flare, and spectroscopically confirmed an RS Canum Venaticorum star, among other anomaly candidates.
Quickly and effortlessly separating artefacts from interesting anomaly candidates is crucial for current and upcoming next-generation
observatories, such as the Vera C. Rubin Observatory, which will carry out the Legacy Survey of Space and Time (LSST). LSST will generate roughly 10 million alerts per night,
and sophisticated, robust algorithms will be needed to sift through all that data so that important and interesting objects are not missed and
scientists can better understand these space oddities.
Lead author Konstantin Malanchev, researcher at the University of Illinois at Urbana-Champaign (USA) and the Sternberg Astronomical Institute of Lomonosov Moscow State University (Russia), emphasizes that “designing specifically
dedicated tools to search for astrophysically interesting anomalies is our only option to ensure the full exploitation of data sets we fought so
hard to acquire. The SNAD team is fully committed to help the astronomical community in exploring the full potential of future data sets.”
The article has been accepted for publication in Monthly Notices of the Royal Astronomical Society and is also publicly available as a pre-print.
The source code and results, including a complete list of objects with potential scientific application, as well as the pipeline techniques, are open
to the public for the benefit of and verification by the astronomical community.