Microsoft has been investing heavily in machine learning, working with silicon vendors to support running AI models on your PC as fast as possible. This has required the development of a whole new generation of silicon from Intel, AMD and ARM.
Often called “AI accelerators,” neural processing units (NPUs) are dedicated hardware that handle specific machine learning tasks such as computer vision algorithms. You can think of them much like a GPU, but for AI rather than graphics. They often share a lot of features with GPUs, having many relatively low-precision processor cores that implement common machine learning algorithms. They don’t even need to be fabricated in advance, as FPGAs offer programmable silicon that can be used to build and test accelerators.
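To make "low precision" concrete, here is a minimal Python sketch of a dot product computed on 8-bit integer codes instead of floating-point values, which is the kind of arithmetic such cores are built around. The symmetric scaling scheme and the helper names (`quantize`, `scale_for`, `int8_dot`) are assumptions chosen for illustration, not any vendor's actual quantization method.

```python
# Illustrative sketch only: symmetric int8 quantization of a dot product.
# The scaling scheme and helper names are assumptions, not a vendor's design.

def scale_for(values):
    """Pick a scale so the largest magnitude maps to the int8 code 127."""
    peak = max(abs(v) for v in values)
    return peak / 127 if peak else 1.0

def quantize(values, scale):
    """Map floats to integer codes, clamped to the range -127..127."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def int8_dot(a, b):
    """Dot product accumulated on integer codes, rescaled to float at the end."""
    sa, sb = scale_for(a), scale_for(b)
    qa, qb = quantize(a, sa), quantize(b, sb)
    acc = sum(x * y for x, y in zip(qa, qb))  # integer multiply-accumulate
    return acc * sa * sb

weights = [0.12, -0.53, 0.88, 0.07]
inputs = [1.5, 0.25, -0.75, 2.0]

exact = sum(w * x for w, x in zip(weights, inputs))
approx = int8_dot(weights, inputs)
print(f"floating-point result: {exact:.4f}")
print(f"int8-based result:     {approx:.4f}")
```

The two results differ slightly, which is the trade-off: a small loss of accuracy in exchange for arithmetic that is far cheaper to build many times over in silicon.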
Got a Surface Pro X? Your NPU is here already.
Surface already ships hardware with NPUs: Microsoft’s co-developed SQ1 and SQ2 processors for its ARM-based Surface Pro X use a built-in NPU to add eye-tracking features to the camera. If you’re using Microsoft Teams or similar on a Surface Pro X, it will correct your gaze so whoever you’re chatting to will see you looking at them rather than at your screen.
It’s features like those that Microsoft is planning to build into Windows. Its April 2022 hybrid work event used them as an example of how we can use NPUs to make working from home easier for teams. Alongside gaze correction, NPUs will power automatic framing for cameras and dynamic background blurring to reduce distraction. That could mean NPUs running in dedicated hardware, built into webcams and offloading complex image-processing tasks in the camera before you even get to use the resulting video on your PC.
The aim is to turn an artificial on-screen experience into one that’s focused on the people involved rather than the technology. Audio processing will be used to take out noise, as well as focusing on a speaker’s voice rather than a room as a whole. Some of these techniques, like voice focus, are intended to assist remote attendees in a meeting, allowing them to hear what’s being said by a speaker using a shared microphone in a meeting room as well as they would be able to hear someone alone in a room with a dedicated microphone.
NPUs will make these techniques easier to implement, allowing them to run in real time without overloading your CPU or GPU. Having accelerators that target these machine learning models helps keep your PC from overheating or draining its battery.
Adding NPU support to Windows application development
Windows will increasingly rely on NPUs in the future. At its Microsoft Build developer event, Microsoft announced Project Volterra, ARM-based development hardware intended as a platform for building and testing NPU-based code. Ready to ship in the near future, Project Volterra is a desktop device that is likely to be powered by an SQ3 variant of the Qualcomm 8cx Gen3 processor with Microsoft’s own custom NPU. That NPU is designed to help developers start to use its features in their code, handling video and audio processing using a version of Qualcomm’s Neural Processing SDK for Windows.
Microsoft expects NPUs to become a standard feature in mobile and desktop hardware, and that requires getting NPU-based hardware like Project Volterra into the hands of developers. Project Volterra is designed to be stackable, so it should be possible to build several into a development rack, allowing developers to write code, build applications and run tests at the same time. It’s also a good-looking piece of hardware, designed by the Surface hardware team and with a similar look to the flagship Surface Laptop Studio and Surface Pro X devices.
Project Volterra is only part of an end-to-end set of tools for building ARM-based NPU applications. It will be joined by ARM-native versions of Visual Studio along with .NET and Visual C++. If you’re looking at building your own machine learning models on Volterra hardware, there’s ARM support for WSL (the Windows Subsystem for Linux), where you can quickly install common machine learning frameworks. Microsoft will be working with many familiar open-source projects to deliver ARM-native builds so your whole toolchain will be ready for the next generation of Windows hardware.
While the Qualcomm Neural Processing SDK is part of the initial Project Volterra toolchain, it’s really only a start. As more NPU silicon rolls out, you should expect to see Microsoft building support into Windows with its own developer SDKs and hardware-agnostic runtimes that allow you to build AI code once and have it accelerated anywhere.
Get started with portable AI using WinML and ONNX
We can get a feel for what that might look like by looking at the WinML tools already shipping in the Windows SDK, which can use GPU acceleration to host ONNX models. ONNX, the Open Neural Network Exchange, is a common format for portable AI models, which can be trained on high-performance computing systems like Azure Machine Learning. There you can work with the large amounts of data needed to train machine learning models, with the necessary compute power, and use familiar machine learning frameworks like PyTorch and TensorFlow before exporting the trained models as ONNX for use in WinML.
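One way to picture "build once, accelerate anywhere" is a runtime that tries accelerators in order of preference and falls back to the CPU when none is present. The Python sketch below shows only that selection logic. The provider names follow ONNX Runtime's naming convention and are an assumption here, as is the `pick_providers` helper; the article does not say which runtime API Microsoft will ship for NPUs.

```python
# Sketch: choose accelerators in preference order, always keeping a CPU fallback.
# Provider names follow ONNX Runtime's naming and are assumptions for
# illustration; pick_providers is a hypothetical helper, not a library call.

PREFERRED = [
    "QNNExecutionProvider",  # Qualcomm NPU back end (assumed to be the NPU path)
    "DmlExecutionProvider",  # DirectML, typically GPU
    "CPUExecutionProvider",  # fallback that should always be available
]

def pick_providers(available, preferred=PREFERRED):
    """Return the preferred providers that exist on this machine, in order."""
    chosen = [name for name in preferred if name in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")  # never end up with no back end
    return chosen

# Example: a machine with a GPU but no NPU.
print(pick_providers({"DmlExecutionProvider", "CPUExecutionProvider"}))
```

If ONNX Runtime is installed, a list like this could plausibly be built from `onnxruntime.get_available_providers()` and passed as the `providers` argument when creating an `InferenceSession`, so the same model file runs unchanged whether or not an NPU is present.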
NPUs aren’t only for desktop devices. They’re key to Microsoft’s IoT strategy, with the low-code Azure Percept platform built around an Intel Movidius vision processing unit, allowing it to work on complex computer vision tasks without requiring high-power hardware. That’s probably the biggest benefit of using NPUs to accelerate AI tasks: The ability to run them at the edge of the network on relatively low-cost hardware.
NPUs in tomorrow’s silicon
Looking at the silicon roadmaps of the various processor and GPU vendors, it’s clear that AI acceleration is key to their next generation of hardware. Intel is building it into its 2023 Meteor Lake mobile processor family, with the desktop Raptor Lake working with M.2-based AI accelerator modules. At the same time, AMD is working on integrating AI and ML optimizations in its next-generation Zen 5 hardware.
While only a few PCs like the Surface Pro X have NPU support today, it’s clear that the future looks very different, with AI accelerators of different types either built into chiplet systems-on-a-chip or added as plug-ins using widely available PCIe ports. With Microsoft ready to deliver tools for building code that can use them, as well as demonstrating its own AI-powered enhancements to Windows, it looks as though we won’t have to wait long to take advantage of NPUs, especially as they should be built into our next generation of PCs.