Of course simple, single-use code is going to be faster than unrelated complex code.
The kicker is the integration with Minecraft itself, and taking into account the things you so nonchalantly ignored: user configs, dynamic textures, or any of the plethora of other things mods do that screw up the cache.
Also, which parts of the texture stitching event actually take time? Which parts could be accelerated? The actual slot allocation shouldn't be that time-consuming, but it'd be worth quantifying.
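To be concrete about what "quantifying" could look like, here is a rough sketch of a per-phase timer. It is only an illustration: `PhaseTimer` and the phase names ("load", "allocate", "upload") are hypothetical and are not part of Minecraft or Forge, and the workloads in `main` are stand-ins. Real numbers would have to come from wrapping the actual stitching code in a real modded instance.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Accumulates wall-clock time per named phase (hypothetical helper, not a Forge API). */
final class PhaseTimer {
    // LinkedHashMap keeps phases in the order they were first timed.
    private final Map<String, Long> nanos = new LinkedHashMap<>();

    /** Runs the work, adds its elapsed time to the phase total, and returns its result. */
    <T> T time(String phase, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            nanos.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    /** Total nanoseconds recorded for a phase, or 0 if it was never timed. */
    long totalNanos(String phase) {
        return nanos.getOrDefault(phase, 0L);
    }

    /** One line per phase: name, milliseconds, and share of the total. */
    String report() {
        long total = 0;
        for (long v : nanos.values()) {
            total += v;
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : nanos.entrySet()) {
            double pct = total == 0 ? 0.0 : 100.0 * e.getValue() / total;
            sb.append(String.format("%-10s %10.3f ms %5.1f%%%n",
                    e.getKey(), e.getValue() / 1e6, pct));
        }
        return sb.toString();
    }

    /** Stand-in workload; replace with the real phase being measured. */
    private static long busyWork(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += (long) i * 31 % 7;
        }
        return acc;
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();
        // Hypothetical phases; the iteration counts are arbitrary placeholders.
        timer.time("load", () -> busyWork(5_000_000));
        timer.time("allocate", () -> busyWork(500_000));
        timer.time("upload", () -> busyWork(2_000_000));
        System.out.print(timer.report());
    }
}
```

A single run like this says very little on its own. Run it across several real mod packs, with configs and dynamic textures in play, and compare the per-phase shares before deciding what is worth caching.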
So, as I said before, write something up and add benchmarks. And I'm talking real-world implementations, not unrelated theoretical code.