Recycling on a Cosmic Scale: Extracting New Information from Old Data Sets

Publication: Book/anthology/thesis/report › PhD thesis › Research

  • Kristoffer Stensbo-Smidt
Astronomy and astrophysics are entering a data-rich era. Large surveys have, quite literally,
seen the light in the past decade, with more and larger telescopes to follow in the
coming years. Data is now so abundant that making use of all the information is a difficult
task. This thesis sets out from the assumption that there is more to gain from available
data sets – new information from old data. Three contributions in this direction are considered.
Firstly, a novel texture descriptor for parametrising galaxy morphology is presented. It
uses the shape index and curvedness of local regions in images of galaxies and condenses
information about the local structure to a single value. It is argued that this value can be
interpreted as indicating regions of morphological interest, for example regions of newly
formed stars, of gas and dust, spiral arms etc. The descriptor is shown to extract information
about a galaxy’s specific star formation rate from its images that the usual spectral
energy distribution (SED) fitting misses.
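
The abstract does not say how the descriptor condenses local structure into a single value, so the following is only a minimal sketch of the two underlying quantities. It assumes the common definition in terms of the principal curvatures (the eigenvalues of the local Hessian of the image intensity) and takes the second derivatives `lxx`, `lxy`, `lyy` as given, for instance from Gaussian derivative filters. The function names and the sign convention are illustrative assumptions, not taken from the thesis.

```python
import math


def principal_curvatures(lxx, lxy, lyy):
    """Eigenvalues (k1 >= k2) of the symmetric 2x2 Hessian [[lxx, lxy], [lxy, lyy]]."""
    half_trace = (lxx + lyy) / 2.0
    root = math.hypot((lxx - lyy) / 2.0, lxy)
    return half_trace + root, half_trace - root


def shape_index(k1, k2):
    """Shape index in [-1, 1], assuming k1 >= k2.

    Sign convention (an assumption; references differ): a local intensity
    maximum with equal curvatures maps to +1, a symmetric saddle to 0.
    """
    if k1 == 0 and k2 == 0:
        return float("nan")  # undefined for a perfectly flat region
    # atan2 avoids a division by zero when k1 == k2
    return (2.0 / math.pi) * math.atan2(-(k1 + k2), k1 - k2)


def curvedness(k1, k2):
    """Magnitude of the local curvature, independent of the shape type."""
    return math.sqrt((k1 * k1 + k2 * k2) / 2.0)


# Hypothetical Hessian entries at one pixel of a bright, blob-like region
k1, k2 = principal_curvatures(-1.0, 0.0, -1.0)
s, c = shape_index(k1, k2), curvedness(k1, k2)
```

The shape index describes what kind of local structure is present (blob, ridge, saddle and so on), while the curvedness describes how pronounced it is; a flat region has zero curvedness and no defined shape index.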
Secondly, a method to evaluate the information content of various features for a given
task is introduced. Selecting the right features, for example colours or magnitudes, for a
specific task can be difficult and often relies on those that have been used traditionally. With
current and future surveys giving researchers access to hundreds of features, it is time
to challenge old assumptions about which features to use. A completely general method for feature
selection is introduced and shown to increase the accuracy of both redshift and specific star
formation rate estimates.
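
The abstract does not specify the selection method, so the sketch below only illustrates the general idea of letting the data decide which features to use. It assumes a generic greedy forward selection scored by the leave-one-out error of a k-nearest-neighbour regressor; this choice of technique, the function names and the toy data are assumptions for illustration and should not be read as the thesis's method.

```python
def loo_knn_error(rows, targets, features, k=1):
    """Leave-one-out mean squared error of a k-NN regressor on the given feature indices."""
    n = len(rows)
    total = 0.0
    for i in range(n):
        # Squared distances from sample i to all other samples, on the chosen features only
        dists = sorted(
            (sum((rows[i][f] - rows[j][f]) ** 2 for f in features), j)
            for j in range(n)
            if j != i
        )
        prediction = sum(targets[j] for _, j in dists[:k]) / k
        total += (targets[i] - prediction) ** 2
    return total / n


def forward_select(rows, targets, max_features, k=1):
    """Greedily add the feature that lowers the error most; stop when nothing improves."""
    selected = []
    remaining = list(range(len(rows[0])))
    best_error = float("inf")
    while remaining and len(selected) < max_features:
        error, feature = min(
            (loo_knn_error(rows, targets, selected + [f], k), f) for f in remaining
        )
        if error >= best_error:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_error = error
    return selected, best_error


# Toy data (made up): column 0 is unrelated to the target, column 1 equals the target
rows = [[5, 0.0], [1, 1.0], [4, 2.0], [0, 3.0], [3, 4.0], [2, 5.0]]
targets = [0, 1, 2, 3, 4, 5]
selected, error = forward_select(rows, targets, max_features=2)
```

On this toy data the procedure keeps only column 1, because adding the unrelated column makes the nearest-neighbour predictions worse. In a real setting the columns would be survey features such as colours or magnitudes and the target a quantity such as redshift.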
Thirdly, the problem of quality assessment of quasar candidates is considered. Detection
pipelines searching the sky for quasars produce thousands of candidates, many of
which can be discarded with simple checks. The rest, however, cannot, and images of these
candidates must be manually inspected and evaluated. Still, more than 90% of these can
be false positives, wasting researchers’ precious time and forcing the scope of the
detection pipelines to be limited. A set of features based on image analysis is presented
and shown to be able to detect the most common situations of false positive quasar candidates.
Incorporation of the derived features into a machine learning framework is reviewed
and future directions are discussed.
Original language: English
Publisher: Department of Computer Science, Faculty of Science, University of Copenhagen
Status: Published - 2016

ID: 172265953