Minimize GPU-to-GPU latency for small-scale deployments
Maximize efficiency for large-scale deployments