Training Paradigms For Deep Residual Networks