As we know, the gradient of the ReLU activation function is 0 for all negative input values (x), which may in turn lead to the dead ReLU problem, ...
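To make the zero-gradient behaviour concrete, here is a minimal sketch in plain Python. The single neuron, its weight `w`, bias `b`, and the input values are hypothetical and chosen only for illustration; the derivative at exactly x = 0 is set to 0 here by convention.

```python
def relu(x: float) -> float:
    """ReLU activation: max(0, x)."""
    return max(0.0, x)


def relu_grad(x: float) -> float:
    """Derivative of ReLU with respect to x (taken as 0 at x == 0 by convention)."""
    return 1.0 if x > 0 else 0.0


# Hypothetical single neuron with pre-activation z = w * x + b.
# The large negative bias keeps z below zero for every input listed here.
w, b = 0.5, -10.0
inputs = [0.5, 1.0, 2.0, 3.0]

for x in inputs:
    z = w * x + b
    # Chain rule: d relu(z) / dw = relu'(z) * x
    grad_w = relu_grad(z) * x
    print(f"x={x:.1f}  z={z:.2f}  relu(z)={relu(z):.1f}  grad wrt w={grad_w:.1f}")
```

Because every pre-activation `z` is negative, both the output and the gradient with respect to `w` are 0 for all of these inputs, so a gradient-descent step would leave `w` unchanged. That is the situation usually described as a dead ReLU unit.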