Training and Inference
In AI, we must first draw a distinction between training an AI model and using the trained model for inference on a real-world data stream.
The training stage requires orders of magnitude more compute, while the inference stage must deliver responsive service with the minimum reasonable amount of compute resources.
Training stage - system requirements by AI model
The three major categories of AI architectures, and the challenges each presents, are:
- Convolutional Neural Networks (CNNs)
  - CNNs benefit from processing architectures that expose parallel processing capability, such as FPGAs and GPUs, rather than from sequential CPU and MCU architectures (see the first sketch after this list)
- Recurrent Neural Networks (RNNs)
  - RNN applications like language translation, mathematical biology, and financial analysis need some form of "memory" in the AI algorithm so that past data can inform the interpretation of subsequent readings (see the second sketch after this list)
  - This memory capability makes RNNs better suited to sequential architectures such as CPUs and MCUs than to GPU-based parallelism
- Transformers
  - Transformers expose massive parallelism. They are more complex and have higher compute requirements than their CNN predecessors, and they benefit from the massive compute and parallel architectures of FPGAs and GPUs (see the third sketch after this list)
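To make the CNN point concrete, here is a minimal sketch in plain NumPy (not from the original article; all names are hypothetical). Every output element of a convolution depends only on its own local input window, so all of the outputs can be computed at the same time on parallel hardware:

```python
import numpy as np

def conv1d(signal, kernel):
    """Naive 1-D convolution (valid padding)."""
    k = len(kernel)
    # Each output element reads only its own input window and is
    # independent of all the others, so a GPU or FPGA can compute
    # every window's dot product simultaneously.
    return np.array([signal[i:i + k] @ kernel
                     for i in range(len(signal) - k + 1)])

y = conv1d(np.arange(10.0), np.array([0.25, 0.5, 0.25]))
```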
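By contrast, a recurrent step folds each new input into a hidden state, so step t cannot begin until step t-1 has finished. A minimal sketch of this sequential "memory", again with hypothetical names:

```python
import numpy as np

def rnn_forward(inputs, W_h, W_x):
    """Run a simple tanh RNN over a sequence and return the final state."""
    h = np.zeros(W_h.shape[0])
    for x in inputs:                    # inherently sequential: each step
        h = np.tanh(W_h @ h + W_x @ x)  # depends on the previous state h
    return h

rng = np.random.default_rng(0)
h_final = rnn_forward(rng.normal(size=(5, 4)),   # 5 time steps, 4 features
                      rng.normal(size=(8, 8)),   # hidden-to-hidden weights
                      rng.normal(size=(8, 4)))   # input-to-hidden weights
```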
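Finally, the core of a transformer, scaled dot-product attention, is a pair of large matrix multiplies that treat all sequence positions at once, which is where the massive parallelism comes from. A minimal sketch (hypothetical names, single head, no masking):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention over a whole sequence at once."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # all position pairs in parallel
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax
    return weights @ V                             # weighted sums, also parallel

rng = np.random.default_rng(0)
Q = rng.normal(size=(6, 16))   # 6 positions, 16-dim queries
out = attention(Q, rng.normal(size=(6, 16)), rng.normal(size=(6, 16)))
```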
Inference stage - system requirements
Inference-stage solutions must have low power consumption, be cost effective, and be able to operate in harsh environments. GPUs and traditional FPGAs offer parallel compute capability but consume too much power and cost too much. Microcontrollers and microprocessors are sequential in nature and lack the required performance.
The open-source community has put a lot of work into developing tools that quantize AI models and reduce their complexity to the point where running on a microcontroller can, in some cases, deliver the desired performance. For example, TensorFlow Lite can create models that, with the aid of runtime libraries, execute on microcontroller architectures.
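As a concrete illustration, here is a minimal sketch of post-training full-integer quantization with the TensorFlow Lite converter. The toy model and the random calibration data are hypothetical stand-ins for a real trained network and a representative sample of its inputs:

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for a trained Keras model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Hypothetical calibration samples; in practice, use real input data.
calib_data = np.random.rand(100, 32).astype(np.float32)

def representative_dataset():
    # The converter runs these samples through the model to pick
    # quantization ranges for weights and activations.
    for sample in calib_data:
        yield [sample[None, :]]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full-integer ops so the model can run on int8-only MCU targets.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting model_int8.tflite can then be executed on-device with the TensorFlow Lite for Microcontrollers runtime.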
A high-performance, lightweight, programmable FPGA would be the next blue ocean.