Category Archives: Discussion

Supernova Classification

It has been a long week. We had the NIPS deadline on Friday. Fortunately, we managed to submit the papers. Let's keep our fingers crossed! I was very fortunate to receive very constructive comments on my NIPS paper from Ross Fedely. David Hogg also gave last-minute comments which helped improve the paper further (and special thanks go to Bernhard for that). Here is a picture of us working toward the deadline:

[Photo: 2013-05-31 16.48.36]

After submitting the papers, we hung out with many people, including Rob Fergus, in the park to celebrate our submissions.

Right. The main topic of this post is supernova classification. Earlier this week, Bernhard and I had a quick meeting with astronomers from the CCPP (Center for Cosmology and Particle Physics). They are working on the problem of supernova classification (identifying the type of a supernova from its spectrum) and are interested in applying machine learning techniques to it. Briefly, the main challenge is that the supernova itself changes over time; that is, it can appear to belong to different types depending on when it is observed. Another challenge is that the datasets are small, usually on the order of hundreds of examples.

According to Wikipedia, a supernova is an energetic explosion of a star. The explosion can be triggered either by the reignition of nuclear fusion in a degenerate star or by the collapse of the core of a massive star. Either way, a massive amount of energy is released. Interestingly, the expanding shock waves of supernova explosions can trigger the formation of new stars.

Supernovae are important in cosmology because the maximum intensities of their explosions can be used as "standard candles". Briefly, comparing a supernova's known peak luminosity with its observed brightness lets astronomers infer astronomical distances.
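To make the standard-candle idea concrete, here is a minimal sketch of the standard distance-modulus relation, m − M = 5·log10(d) − 5 (with d in parsecs). The specific magnitudes in the example are illustrative numbers of my own, not values from the post:

```python
import math

def distance_parsecs(apparent_mag, absolute_mag):
    """Distance in parsecs from the distance modulus: m - M = 5*log10(d) - 5."""
    return 10 ** ((apparent_mag - absolute_mag + 5) / 5)

# Illustrative: if a standard candle with absolute magnitude M = -19.3
# is observed at apparent magnitude m = 15.7, the distance modulus is 35,
# which corresponds to 10^8 parsecs.
d = distance_parsecs(15.7, -19.3)
```

This is why a class of objects with a known, uniform peak luminosity is so valuable: the only unknown in the relation is the distance itself.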

One previous work used the correlation between an object's spectrum and a set of templates to identify its type. I will read the paper over the weekend and see if we can build something better than simple correlation.
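As a rough sketch of the correlation idea (my own toy version, not the method from the paper), one can normalize each spectrum and assign the type whose template correlates best with the observation. The `templates` dictionary and its type names are hypothetical, and real pipelines must also handle redshift and epoch, which this ignores:

```python
import numpy as np

def classify_by_correlation(spectrum, templates):
    """Assign the type whose template best correlates with the observed spectrum.

    `templates` is a hypothetical dict mapping type name -> template spectrum,
    assumed to be sampled on the same wavelength grid as `spectrum`.
    """
    def normalize(x):
        x = np.asarray(x, dtype=float)
        x = x - x.mean()
        return x / np.linalg.norm(x)

    s = normalize(spectrum)
    # Normalized dot product = correlation coefficient between spectra.
    scores = {name: float(normalize(t) @ s) for name, t in templates.items()}
    return max(scores, key=scores.get), scores

# Toy usage: a noisy copy of the "Ia" template should correlate best with it.
templates = {"Ia": np.array([0.0, 1.0, 3.0, 1.0, 0.0]),
             "II": np.array([2.0, 2.0, 1.0, 0.0, 0.0])}
label, scores = classify_by_correlation(np.array([0.1, 1.0, 2.9, 1.1, 0.0]),
                                        templates)
```

The appeal of this baseline is its simplicity; the obvious question is whether a learned model can beat it, especially given the time-evolution and small-data issues mentioned above.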

Does incorporating a prior cause additional uncertainty?

I have recently been thinking about a question that I have had for a long time. It arose during a discussion at the astro-imaging workshop in Switzerland. If I remember correctly, the discussion centered on two schools of thought on how to model astronomical images. The frequentist school was primarily represented by Stefan Harmeling. Christian Schule and Rob Fergus, on the other hand, represented the Bayesian school. Other people at the discussion included Bernhard Schölkopf, David Hogg, Dillip Khrisnan, and Michael Hirsch, among others.

In short, the story goes like this:

Stefan believed that he could somehow solve the problem by directly formulating an objective function and optimizing it. The message, as I understood it, was to avoid any prior knowledge. There is more to his point of view on this problem, but for the sake of brevity I will skip it, as it is not the main topic here.

On the other hand, Christian and Rob had a slightly different point of view. They believed one should incorporate prior information, and pointed out that the prior is key to modelling astronomical images. Again, there is more to the story, but I will skip it.

As an observer, I agreed with all three of them. Using only Stefan's objective function, I think he could find a reasonably good solution. Similarly, Christian and Rob might be able to find a better solution with a "reasonably right" prior. The question is: which approach should I use?

This question essentially arises before you actually solve the problem. Christian and Rob may have a good prior which can possibly help them obtain better solutions than Stefan's approach. But as an observer who does not know anything about the prior, it seems that I need to deal with another source of uncertainty: is the prior actually a good one? The claim above may no longer hold if one has a bad prior.
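A minimal toy example of the good-prior/bad-prior trade-off (my own illustration, not anything from the workshop discussion): estimating a Gaussian mean from a handful of samples. The MAP estimate with a Gaussian prior shrinks the sample mean toward the prior mean, which helps when the prior is centered near the truth and hurts when it is badly mis-specified:

```python
import random

def mle_mean(xs):
    """Maximum-likelihood estimate: the plain sample mean, no prior."""
    return sum(xs) / len(xs)

def map_mean(xs, mu0, tau2, sigma2):
    """MAP estimate for a Gaussian mean with prior N(mu0, tau2) and
    known observation noise variance sigma2: a precision-weighted
    average of the sample mean and the prior mean."""
    n = len(xs)
    w = (n / sigma2) / (n / sigma2 + 1.0 / tau2)  # weight on the data
    return w * mle_mean(xs) + (1 - w) * mu0

random.seed(0)
true_mu = 2.0
xs = [random.gauss(true_mu, 1.0) for _ in range(5)]  # small dataset

plain = mle_mean(xs)
good = map_mean(xs, mu0=2.0, tau2=1.0, sigma2=1.0)   # prior centered near truth
bad = map_mean(xs, mu0=-5.0, tau2=1.0, sigma2=1.0)   # mis-specified prior
```

The point of the sketch is only that the effect of the prior depends entirely on its quality, which is exactly the extra source of uncertainty described above: before solving the problem, I cannot tell whether the prior will pull the estimate toward the truth or away from it.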

In summary, I would like to know the answer to the following questions:

  1. Does incorporating a prior actually cause more uncertainty about the problem we are trying to solve?
  2. If so, is it then harder to solve a problem with a prior as opposed to without one?
  3. Statistically speaking, how do most statisticians deal with this uncertainty?

Feel free to leave a comment if you have one. Thanks.