Recycling on a Cosmic Scale: Extracting New Information from Old Data Sets

Publication: Book/anthology/thesis/report › Ph.D. thesis › Research

Standard

Recycling on a Cosmic Scale: Extracting New Information from Old Data Sets. / Stensbo-Smidt, Kristoffer.

Department of Computer Science, Faculty of Science, University of Copenhagen, 2016.

Harvard

Stensbo-Smidt, K 2016, Recycling on a Cosmic Scale: Extracting New Information from Old Data Sets. Department of Computer Science, Faculty of Science, University of Copenhagen. <https://soeg.kb.dk/permalink/45KBDK_KGL/fbp0ps/alma99122536952205763>

APA

Stensbo-Smidt, K. (2016). Recycling on a Cosmic Scale: Extracting New Information from Old Data Sets. Department of Computer Science, Faculty of Science, University of Copenhagen. https://soeg.kb.dk/permalink/45KBDK_KGL/fbp0ps/alma99122536952205763

Vancouver

Stensbo-Smidt K. Recycling on a Cosmic Scale: Extracting New Information from Old Data Sets. Department of Computer Science, Faculty of Science, University of Copenhagen, 2016.

Author

Stensbo-Smidt, Kristoffer. / Recycling on a Cosmic Scale: Extracting New Information from Old Data Sets. Department of Computer Science, Faculty of Science, University of Copenhagen, 2016.

Bibtex

@phdthesis{65612a4bc6ca44aba7ef1b3942a58943,
title = "Recycling on a Cosmic Scale: Extracting New Information from Old Data Sets",
abstract = "Astronomy and astrophysics are entering a data-rich era. Large surveys have, quite literally, seen the light in the past decade, with more and larger telescopes to follow in the coming years. Data is now so abundant that making use of all the information is a difficult task. This thesis sets out from the assumption that there is more to gain from available data sets – new information from old data. Three contributions in this direction are considered. Firstly, a novel texture descriptor for parametrising galaxy morphology is presented. It uses the shape index and curvedness of local regions in images of galaxies and condenses information about the local structure to a single value. It is argued that this value can be interpreted as indicating regions of morphological interest, for example regions of newly formed stars, of gas and dust, spiral arms etc. The descriptor is shown to extract information about a galaxy{\textquoteright}s specific star formation rate from its images that the usual spectral energy distribution (SED) fitting misses. Secondly, a method to evaluate the information content of various features for a given task is introduced. Selecting the right features, for example colours or magnitudes, for a specific task can be difficult and often relies on which ones have been used traditionally. With current and future surveys giving researchers access to hundreds of features, it is time to challenge old assumptions on which to use. A completely general method for feature selection is introduced and shown to increase accuracy of both redshift and specific star formation rate estimations. Thirdly, the problem of quality assessment of quasar candidates is considered. Detection pipelines searching the sky for quasars produce thousands of candidates, many of which can be discarded with simple checks. The rest, however, cannot, and images of these candidates must be manually inspected and evaluated. Still, more than 90% of these can be false positives, wasting precious time for researchers and forcing a limitation of the scopes of the detection pipelines. A set of features based on image analysis is presented and shown to be able to detect the most common situations of false positive quasar candidates. Incorporation of the derived features into a machine learning framework is reviewed and future directions are discussed.",
author = "Kristoffer Stensbo-Smidt",
year = "2016",
language = "English",
publisher = "Department of Computer Science, Faculty of Science, University of Copenhagen",

}

RIS

TY - BOOK

T1 - Recycling on a Cosmic Scale

T2 - Extracting New Information from Old Data Sets

AU - Stensbo-Smidt, Kristoffer

PY - 2016

Y1 - 2016

N2 - Astronomy and astrophysics are entering a data-rich era. Large surveys have, quite literally, seen the light in the past decade, with more and larger telescopes to follow in the coming years. Data is now so abundant that making use of all the information is a difficult task. This thesis sets out from the assumption that there is more to gain from available data sets – new information from old data. Three contributions in this direction are considered. Firstly, a novel texture descriptor for parametrising galaxy morphology is presented. It uses the shape index and curvedness of local regions in images of galaxies and condenses information about the local structure to a single value. It is argued that this value can be interpreted as indicating regions of morphological interest, for example regions of newly formed stars, of gas and dust, spiral arms etc. The descriptor is shown to extract information about a galaxy’s specific star formation rate from its images that the usual spectral energy distribution (SED) fitting misses. Secondly, a method to evaluate the information content of various features for a given task is introduced. Selecting the right features, for example colours or magnitudes, for a specific task can be difficult and often relies on which ones have been used traditionally. With current and future surveys giving researchers access to hundreds of features, it is time to challenge old assumptions on which to use. A completely general method for feature selection is introduced and shown to increase accuracy of both redshift and specific star formation rate estimations. Thirdly, the problem of quality assessment of quasar candidates is considered. Detection pipelines searching the sky for quasars produce thousands of candidates, many of which can be discarded with simple checks. The rest, however, cannot, and images of these candidates must be manually inspected and evaluated. Still, more than 90% of these can be false positives, wasting precious time for researchers and forcing a limitation of the scopes of the detection pipelines. A set of features based on image analysis is presented and shown to be able to detect the most common situations of false positive quasar candidates. Incorporation of the derived features into a machine learning framework is reviewed and future directions are discussed.

UR - https://soeg.kb.dk/permalink/45KBDK_KGL/fbp0ps/alma99122536952205763

M3 - Ph.D. thesis

BT - Recycling on a Cosmic Scale

PB - Department of Computer Science, Faculty of Science, University of Copenhagen

ER -

ID: 172265953