The loss function and the cost function both measure how different our predicted output is from the actual output. The loss function is defined on a single training example: it measures how well the model performs on that one example. If instead we consider the entire training set and measure how well the model performs on it, we get the cost function. Mathematically, the cost function is the average of the loss function over the entire training set. In other words, the loss function measures the error for a single training example, while the cost function measures the average error over the entire training set.

Let’s understand this with an example.

Suppose the training set is {(x^{1}, y^{1}), (x^{2}, y^{2}), (x^{3}, y^{3}), ..., (x^{m}, y^{m})}, where the x^{i}’s are the training inputs, the y^{i}’s are the corresponding actual outputs, and m is the total number of training examples. Let {yhat^{1}, yhat^{2}, yhat^{3}, ..., yhat^{m}} be the predicted outputs of our model for the inputs {x^{1}, x^{2}, x^{3}, ..., x^{m}}.

If we use binary cross-entropy as the loss function, then for training example 1 it is calculated as:

L(yhat^{1}, y^{1}) = -[ y^{1} log(yhat^{1}) + (1 - y^{1}) log(1 - yhat^{1}) ]
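The per-example loss above can be sketched in Python as follows. This is a minimal illustration, not a production implementation; the function name `bce_loss` and the `eps` clipping (a common safeguard against log(0), not part of the formula in the text) are my own additions.

```python
import math

def bce_loss(y_true, y_pred):
    # Binary cross-entropy loss for a SINGLE training example.
    # eps clipping is an added safeguard: it keeps y_pred away from
    # exactly 0 or 1, where log() would be undefined.
    eps = 1e-12
    y_pred = min(max(y_pred, eps), 1 - eps)
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))

# A confident, correct prediction gives a small loss;
# a confident, wrong prediction gives a large one.
print(bce_loss(1, 0.9))
print(bce_loss(1, 0.1))
```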

As described above, the cost function is the average of the loss function over the entire training set, so the cost function J is calculated as:

J = (1/m) * sum_{i=1}^{m} L(yhat^{i}, y^{i}) = -(1/m) * sum_{i=1}^{m} [ y^{i} log(yhat^{i}) + (1 - y^{i}) log(1 - yhat^{i}) ]
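Putting the two together, the cost J is just the mean of the per-example losses. A minimal sketch, again with the hypothetical helper name `bce_loss` and an added `eps` safeguard against log(0):

```python
import math

def bce_loss(y_true, y_pred):
    # Binary cross-entropy loss for one example (eps guards log(0)).
    eps = 1e-12
    y_pred = min(max(y_pred, eps), 1 - eps)
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))

def cost(y_true_list, y_pred_list):
    # Cost J: the average of the per-example losses over the training set.
    m = len(y_true_list)
    return sum(bce_loss(y, yhat)
               for y, yhat in zip(y_true_list, y_pred_list)) / m

# Toy training set of m = 3 examples (illustrative values).
y_true = [1, 0, 1]
y_pred = [0.9, 0.2, 0.8]
print(cost(y_true, y_pred))
```

Note the distinction the article draws: `bce_loss` scores one (y, yhat) pair, while `cost` aggregates those scores over all m examples.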
