The initial learning rate was 0.01, and the optimizer was stochastic gradient descent (SGD).
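For reference, the update rule this setting implies can be sketched as follows. This is a minimal illustration in plain Python, not the training code used here: it assumes vanilla SGD (no momentum or weight decay) and keeps the rate fixed at its initial value of 0.01, since the text does not state a decay schedule. The helper names `sgd_step` and `sgd_fit` and the toy data are hypothetical.

```python
import random


def sgd_step(params, grads, lr=0.01):
    """One plain SGD update: w <- w - lr * g for every parameter."""
    return [w - lr * g for w, g in zip(params, grads)]


def sgd_fit(xs, ys, lr=0.01, steps=2000, seed=0):
    """Fit y = w * x by SGD on squared error, one random sample per step.

    Assumption: constant learning rate; a real run that reports an
    "initial" rate would typically decay it over time.
    """
    rng = random.Random(seed)
    w = [0.0]
    for _ in range(steps):
        i = rng.randrange(len(xs))          # draw one training example
        error = w[0] * xs[i] - ys[i]        # prediction error on that example
        grad = [2.0 * error * xs[i]]        # d/dw of (w*x - y)^2
        w = sgd_step(w, grad, lr=lr)
    return w[0]


# Toy data generated from y = 2x, so the fitted slope should approach 2.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [2.0 * x for x in xs]
w_hat = sgd_fit(xs, ys, lr=0.01)
```

In a deep-learning framework the same choice is usually a single optimizer argument; the sketch only makes explicit what a learning rate of 0.01 means for each parameter update.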