Generalizable Learning for Natural Language Instruction Following on Physical Robots