# Overfitting!

Recently, I have been struggling to understand something related to the notion of "overfitting". In learning theory, it usually refers to the situation in which one tries to infer a general concept (e.g., a regressor or classifier) from finite observations (aka data). The concept is said to overfit the observations if it performs well on them but worse on previously unseen observations. The ability to perform well on future observations is known as the "generalization" ability of the concept we have inferred. In practice, we prefer a concept that performs well not only on the current data but also on the data we might observe in the future (we generally assume that all the data come from the same distribution).
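To make the definition above concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of any particular method: the sine-shaped target, the noise level, the sample sizes, and the choice of polynomial degrees. It fits polynomials of increasing degree to a handful of noisy points and compares the error on those points with the error on fresh points drawn from the same distribution.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def sample(n):
    """Draw n noisy observations of an assumed target function sin(2*pi*x)."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, size=n)
    return x, y

# A small training set and a larger set of "future" observations
# from the same distribution.
x_train, y_train = sample(10)
x_test, y_test = sample(200)

def mse(model, x, y):
    """Mean squared error of a fitted model on the points (x, y)."""
    return float(np.mean((model(x) - y) ** 2))

results = {}
for degree in (1, 3, 9):
    # Least-squares polynomial fit of the given degree.
    model = Polynomial.fit(x_train, y_train, degree)
    results[degree] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    print(f"degree {degree}: train MSE = {results[degree][0]:.4f}, "
          f"test MSE = {results[degree][1]:.4f}")
```

With ten training points, the degree-9 polynomial can pass through all of them, so its training error is essentially zero while its error on the unseen points is not. That gap between the two errors is what the definition above calls overfitting.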

When I think of overfitting, I cannot avoid referring to "generalization". In fact, as we can see above, overfitting is defined in terms of the generalization ability of the concept. The notion of overfitting is also closely related to the notion of ill-posedness in inverse problems. This is in fact the motivation for the regularization problems we often encounter in machine learning.
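The link between ill-posedness and regularization can be illustrated with a small sketch. The setup is an assumption chosen for illustration only: a linear model with two nearly identical input columns, and a ridge (Tikhonov) penalty with an arbitrary strength `lam`. Without the penalty, the least-squares solution is extremely sensitive to noise; adding the penalty stabilizes it.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30

# Two almost identical columns make the least-squares problem
# nearly ill-posed: many weight vectors fit the data about equally well.
x1 = rng.normal(size=n)
x2 = x1 + 1e-4 * rng.normal(size=n)
X = np.column_stack([x1, x2])

# Assumed ground truth: both weights equal to 1, plus observation noise.
y = x1 + x2 + 0.1 * rng.normal(size=n)

# Unregularized least squares.
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Ridge regression: minimize ||Xw - y||^2 + lam * ||w||^2.
lam = 1e-2  # illustrative value, not tuned
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)

print("least squares weights:", w_ols)
print("ridge weights:        ", w_ridge)
```

The penalty shrinks the solution toward zero along the direction the data barely constrain, so the ridge weights never have a larger norm than the least-squares ones. The price is some bias, and how to choose `lam` is a separate question that this sketch does not address.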

In the past few weeks, I have had to deal with an estimation problem. In principle, it is different from a regression or classification problem (and from a regularization problem as well). However, there seems to be a connection between estimation and regularization that still puzzles me. In estimation theory, the problem is mostly unsupervised, so it is not clear how to define "overfitting" in this case. Can we look at overfitting based on something beyond generalization?

So the question I would like to ask in this post is "what else can we see/consider as overfitting?" If you have good examples, please feel free to leave comments.

# Japan Visit

I am currently visiting Prof. Kenji Fukumizu at the Institute of Statistical Mathematics in Tokyo, Japan, where I will be spending most of my time working. Since I arrived last week, Kenji and I have already produced some interesting results in our joint work on kernel mean embeddings of distributions. Hopefully, I can stay consistently productive over the next few weeks.

Apart from work, I have also made some trips to Tachikawa and downtown Tokyo, despite the fact that the weather was not on my side. These included a trip to Shinjuku (walking around the area and enjoying the Japanese lifestyle), Asakusa, and the Tokyo Skytree Tower. The weather is getting better this week, so I hope I will have a wonderful trip this weekend.

Well, this post is not really about machine learning, but I will keep posting about what I learn while I am here.