I think it's probably because they based it on the biped (i.e. player) model. If you think about a quadruped, you essentially just take a biped and rotate the body.
The only thing I can think of is the texture coordinates: fitting all the boxes cleanly onto the texture map without overlapping or leaving large empty areas.
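To make the texture-fitting problem concrete, here is a rough Python sketch of how one might check a layout. It is only an illustration under assumptions of my own: every name in it (`Box`, `check_layout`, `coverage`) is hypothetical, and it assumes each box is unwrapped into a rectangle 2*(w+d) pixels wide and d+h pixels tall, which may not match what the actual model code does.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box:
    """One model box plus the texture offset where its unwrap starts (hypothetical)."""
    name: str
    u: int  # left edge of the unwrapped region on the texture, in pixels
    v: int  # top edge of the unwrapped region
    w: int  # box width
    h: int  # box height
    d: int  # box depth

    def footprint(self):
        # ASSUMPTION: the unwrap occupies a rectangle 2*(w+d) wide and d+h tall.
        return (self.u, self.v,
                self.u + 2 * (self.w + self.d),
                self.v + self.d + self.h)


def overlaps(a, b):
    """True if two (x0, y0, x1, y1) rectangles share any area (touching edges is fine)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def check_layout(boxes, tex_w, tex_h):
    """Return a list of human-readable problems; an empty list means the layout is clean."""
    problems = []
    for box in boxes:
        x0, y0, x1, y1 = box.footprint()
        if x0 < 0 or y0 < 0 or x1 > tex_w or y1 > tex_h:
            problems.append(f"{box.name} falls outside the texture")
    for a, b in combinations(boxes, 2):
        if overlaps(a.footprint(), b.footprint()):
            problems.append(f"{a.name} overlaps {b.name}")
    return problems


def coverage(boxes, tex_w, tex_h):
    """Fraction of the texture covered by footprints (only meaningful without overlaps)."""
    used = 0
    for box in boxes:
        x0, y0, x1, y1 = box.footprint()
        used += (x1 - x0) * (y1 - y0)
    return used / (tex_w * tex_h)


# Made-up example parts on a 64x32 texture.
head = Box("head", u=0, v=0, w=8, h=8, d=8)
body = Box("body", u=16, v=16, w=8, h=12, d=4)
clash = Box("clash", u=24, v=8, w=4, h=4, d=4)
```

Calling `check_layout([head, body], 64, 32)` on the made-up parts above reports no problems, while adding `clash` reports an overlap with `head`. A low `coverage` value would point at the "large empty areas" case.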