I have been working on the quasar target selection problem for a while. Essentially, this is a classification problem in which one wants to identify objects in the sky as quasars or stars based on their flux measurements. The problem is easy in the low-redshift range because there is a clear separation between quasars and stellar objects, but in the medium- and high-redshift ranges quasar target selection becomes more difficult. For z > 2.2, objects must be targeted down to g = 22 mag, where the photometric measurement uncertainty becomes substantial. Moreover, at z = 2.8, the quasar and stellar loci cross in color space.
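To make "color space" and "photometric uncertainty" concrete, here is a minimal sketch of how band fluxes turn into magnitudes, colors, and magnitude errors. The function names and the zero point of 22.5 (which assumes fluxes in nanomaggy-like units) are my own illustrative choices, not part of any particular survey pipeline.

```python
import math


def flux_to_mag(flux, zero_point=22.5):
    """Logarithmic magnitude for a linear flux.

    zero_point=22.5 is an assumption (nanomaggy-like flux units).
    """
    if flux <= 0:
        raise ValueError("flux must be positive for a logarithmic magnitude")
    return zero_point - 2.5 * math.log10(flux)


def mag_error(flux, flux_err):
    """First-order magnitude uncertainty: (2.5 / ln 10) * sigma_f / f."""
    return 2.5 / math.log(10) * flux_err / flux


def colors(fluxes):
    """Adjacent-band colors (e.g. u-g, g-r, ...) from an ordered list of band fluxes."""
    mags = [flux_to_mag(f) for f in fluxes]
    return [m1 - m2 for m1, m2 in zip(mags, mags[1:])]
```

The error formula shows why faint targets are hard: the magnitude uncertainty is roughly 1.09 divided by the signal-to-noise ratio, so the colors that a classifier relies on get noisier as the flux drops.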
Despite the challenges of the problem itself, it is very important to me to understand why such distant objects are worth detecting at all. So I did some research and came up with a simple explanation.
Shortly after the Big Bang, the cosmic plasma of photons and baryons was excited by the initial perturbations. At first, frequent scattering keeps the photons and baryons tightly coupled, and the photon pressure drives each perturbation outward as a sound wave until the Universe becomes neutral at a redshift of about 1000. Once the Universe has cooled enough, protons capture electrons to form neutral hydrogen, which decouples the photons from the baryons. The photons then stream away, and the acoustic oscillations frozen in at that moment are seen in cosmic microwave background anisotropy data. The baryons, on the other hand, remain in place, leaving the baryon peak stalled at about 150 comoving Mpc. This causes a small excess in the number of pairs of galaxies separated by that distance. These features are often referred to as baryon acoustic oscillations (BAO). Because this scale is fixed in comoving units, BAO act as a standard ruler against the overall expansion of the universe. Observing BAO therefore helps cosmologists measure the expansion history of the universe, and thereby provides a probe of cosmic dark energy.
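One way to see how a fixed comoving scale constrains the expansion history is to compute the angle that 150 comoving Mpc subtends at a given redshift. The sketch below assumes a flat universe with only matter and a cosmological constant; the parameter values (H0 = 70 km/s/Mpc, matter density 0.3) and all function names are illustrative assumptions, not measurements.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def hubble_ratio(z, omega_m):
    """E(z) = H(z)/H0 for a flat universe with matter and a cosmological constant."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def comoving_distance_mpc(z, h0=70.0, omega_m=0.3, steps=1000):
    """Comoving distance in Mpc: (c/H0) * integral of dz'/E(z'), trapezoidal rule."""
    if z == 0:
        return 0.0
    dz = z / steps
    total = 0.5 * (1.0 / hubble_ratio(0.0, omega_m) + 1.0 / hubble_ratio(z, omega_m))
    for i in range(1, steps):
        total += 1.0 / hubble_ratio(i * dz, omega_m)
    return C_KM_S / h0 * total * dz


def bao_angle_deg(z, scale_mpc=150.0, **cosmo):
    """Angle on the sky of a transverse comoving length at redshift z (z > 0).

    In a flat universe the transverse comoving distance equals the
    line-of-sight comoving distance, so the angle is simply scale / distance.
    """
    return math.degrees(scale_mpc / comoving_distance_mpc(z, **cosmo))
```

With these assumed parameters, bao_angle_deg(2.5) comes out at about a degree and a half, and changing omega_m shifts the answer. Measuring the angle at which the excess of pairs appears therefore tells you which expansion history is the right one.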
In principle, BAO can be observed in all forms of cosmic structure, including the distribution of the intergalactic medium as probed by the Lyman-alpha forest (LAF). The LAF can be seen in the spectra of high-redshift quasars. To detect BAO in the LAF, one may cross-correlate the absorption spectra of widely separated quasar pairs. This was previously impossible for lack of sufficient data. Detecting a sufficiently large number of high-redshift quasars is therefore important.
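As a rough illustration of what cross-correlating two sightlines means, here is a toy, unweighted estimator. It assumes each spectrum has already been reduced to transmitted-flux pixels with known 3-D comoving positions; the function names are hypothetical. Real analyses also fit the quasar continuum and weight the pixels, which this sketch leaves out.

```python
import math
from collections import defaultdict


def flux_contrast(flux):
    """delta = F / mean(F) - 1 for the transmitted flux along one sightline."""
    mean_flux = sum(flux) / len(flux)
    return [f / mean_flux - 1.0 for f in flux]


def cross_correlation(pos_a, delta_a, pos_b, delta_b, bin_width):
    """Unweighted estimate of xi(r) = <delta_a * delta_b> in bins of separation r.

    pos_a, pos_b: lists of (x, y, z) comoving pixel positions, in the same
    length units as bin_width. Returns {bin_index: mean product}, where
    bin k covers separations from k * bin_width to (k + 1) * bin_width.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pa, da in zip(pos_a, delta_a):
        for pb, db in zip(pos_b, delta_b):
            k = int(math.dist(pa, pb) // bin_width)
            sums[k] += da * db
            counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}
```

A BAO signal would show up as a small bump in the bins near 150 comoving Mpc once many pairs are averaged. Any single pair is dominated by noise, which is why the number of quasars matters so much.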
After working in this direction for a while, I have a feeling that machine learning in astronomy has not been explored much. There might be some open problems that one can tackle from a machine learning point of view. I also got this inspiration from the talk by David Hogg.