The growth of deep learning is putting pressure on available compute resources: AI and deep learning models are scaling exponentially across industries and across a wide range of applications.
The design goals seemed to conflict, but the conflict presented no problem for IBM. Low-precision computation was necessary to obtain higher density and faster computation, yet deep learning models still had to deliver accuracy at a level consistent with high-precision computation.

TOPS (tera operations per second) measures how many trillions of arithmetic operations an accelerator performs in one second. It provides a rough basis for comparing how different accelerators perform on a given inference task.

Using INT4 for inference, the experimental chip achieved 16.5 TOPS/W, surpassing Qualcomm's low-power Cloud AI module. Although few specifications and no pricing were released, a broad price estimate would be in the $1,500 to $2,000 range.
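IBM has not disclosed its quantization scheme, but the idea behind INT4 inference can be sketched with a simple symmetric weight quantizer: float weights are mapped onto the 16 signed values an INT4 format can represent, traded for a small approximation error. The function names below are hypothetical illustrations, not IBM's implementation.

```python
import numpy as np

def quantize_int4(weights: np.ndarray):
    """Map float weights to signed 4-bit integers in [-8, 7] (hypothetical sketch)."""
    scale = np.max(np.abs(weights)) / 7.0  # one scale factor per tensor
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the INT4 values."""
    return q.astype(np.float32) * scale

w = np.array([0.31, -0.52, 0.07, 0.98], dtype=np.float32)
q, s = quantize_int4(w)
w_hat = dequantize(q, s)
# INT4 offers only 16 levels, so w_hat approximates w with a small
# rounding error; the accuracy challenge is keeping that error from
# degrading model quality.
```

This is why low precision boosts density and efficiency: a 4-bit weight occupies one eighth the memory of a 32-bit float, and integer multiply-accumulate units are far smaller and cheaper per operation than floating-point ones.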