| From | Fereydoun Memarzanjany <thraetaona@ieee.org> |
|---|---|
| Newsgroups | comp.lang.vhdl, comp.arch.fpga, comp.arch.embedded |
| Subject | Re: Innervator: Hardware Acceleration for Neural Networks |
| Date | 2024-08-06 23:02 -0600 |
| Organization | A noiseless patient Spider |
| Message-ID | <v8uv4i$27c09$1@dont-email.me> |
| References | <v8ut7t$26n8q$1@dont-email.me> |
Cross-posted to 3 groups.
Pasted below is an overview/abstract, and you will find more information
(including a paper, demo video, statistics, slides, and source code) at
the following GitHub repository:
https://github.com/Thraetaona/Innervator
------------------------------------------------------------------------
Artificial intelligence ("AI") is deployed in various applications, from
noise cancellation to image recognition, but AI-based products often
come with high hardware and electricity costs; this makes them
inaccessible for consumer devices and small-scale edge electronics.
Inspired by biological brains, deep neural networks ("DNNs") are modeled
using mathematical formulae, yet general-purpose processors treat
otherwise-parallelizable AI algorithms as step-by-step sequential logic.
In contrast, programmable logic devices ("PLDs") can be customized to
the specific parameters of a trained DNN, thereby ensuring data-tailored
computation and algorithmic parallelism at the register-transfer level.
Furthermore, field-programmable gate arrays ("FPGAs"), a subgroup of
PLDs, are dynamically reconfigurable. So, to improve AI runtime
performance, I designed and open-sourced my hardware compiler:
Innervator. Written entirely in VHDL-2008, Innervator takes any DNN's
metadata and parameters (e.g., number of layers, neurons per layer, and
their weights/biases), generating its synthesizable FPGA hardware
description with the appropriate pipelining and batch processing.
Innervator is entirely portable and vendor-independent. As a proof of
concept, I used Innervator to implement a sample 8x8-pixel handwritten
digit-recognizing neural network in a low-cost AMD Xilinx Artix-7(TM)
FPGA @ 100 MHz. With 3 pipeline stages and 2 batches at about 67% LUT
utilization, the network achieved ~7.12 GOP/s, predicting the output in
630 ns while drawing under 0.25 W. In comparison, an Intel(R) Core(TM)
i7-12700H CPU @ 4.70 GHz would take 40,000-60,000 ns at 45 to 115 W.
Ultimately, Innervator's hardware-accelerated approach bridges the
inherent mismatch between current AI algorithms and the general-purpose
digital hardware they run on.
------------------------------------------------------------------------
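To put the latency and power figures quoted above side by side, here is
a rough back-of-the-envelope sketch in Python. It only multiplies the
numbers from the abstract (energy = power x time); it is not a
measurement. Two assumptions are mine: the FPGA figure uses 0.25 W as an
upper bound, and the CPU's lower power is paired with its lower latency
(and upper with upper) simply to get a range.

```python
# Figures quoted in the abstract; nothing here is newly measured.
FPGA_LATENCY_S = 630e-9
FPGA_POWER_W = 0.25                      # "under 0.25 W": upper bound
CPU_LATENCY_S = (40_000e-9, 60_000e-9)   # quoted range, low to high
CPU_POWER_W = (45.0, 115.0)              # quoted range, low to high


def energy_joules(power_w, latency_s):
    """Energy for one prediction, assuming constant power draw."""
    return power_w * latency_s


# Energy per prediction (assumption: low pairs with low, high with high).
fpga_energy = energy_joules(FPGA_POWER_W, FPGA_LATENCY_S)
cpu_energy_low = energy_joules(CPU_POWER_W[0], CPU_LATENCY_S[0])
cpu_energy_high = energy_joules(CPU_POWER_W[1], CPU_LATENCY_S[1])

# Latency ratio of CPU to FPGA for a single prediction.
speedup_low = CPU_LATENCY_S[0] / FPGA_LATENCY_S
speedup_high = CPU_LATENCY_S[1] / FPGA_LATENCY_S

print(f"FPGA energy per prediction: <= {fpga_energy * 1e9:.1f} nJ")
print(f"CPU energy per prediction:  {cpu_energy_low * 1e3:.1f} "
      f"to {cpu_energy_high * 1e3:.1f} mJ")
print(f"Latency ratio (CPU/FPGA):   {speedup_low:.0f}x "
      f"to {speedup_high:.0f}x")
```

Under those assumptions, the FPGA comes out at no more than about
157.5 nJ per prediction versus roughly 1.8 to 6.9 mJ on the CPU, with
the CPU taking roughly 63 to 95 times longer per prediction.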
(Forgot to cross-post to c.a.fpga and c.a.embedded; adding them now.)