

# XXXVI Cycle

# **Efficient Hardware Architectures for Brain-inspired Computing and Sensors**

Fabrizio Ottati

Supervisors: Prof. Marco Vacca, Prof. Guido Masera

#### **Research context and motivation**

- The rapid growth of deep learning has triggered growing interest in the design of specialized hardware accelerators to support it. These accelerators target one of two settings: datacenters, or mobile devices at the network edge. While energy efficiency is important in both cases, the requirement is far more stringent at the edge because of limited battery life.
- The same applies to sensors, which should efficiently convey information to processing engines while balancing low power consumption, responsiveness, and energy spent on data transfer. Event-based sensors are an emerging class of devices that measure a physical quantity and transfer such information in a frame-less fashion. Dynamic Vision Sensors (DVSs), which asynchronously measure the brightness changes in the field of view, belong to this category.

## **Novel contributions**

- Production of IMC CMOS SRAM arrays for in-memory maximum value computation.
- Development of an HDC algorithm in Python for the
- generated) MNIST, targeting Xilinx FPGAs.

[1] Sironi et al., "HATS: Histogram of Averaged Time Surfaces", CVPR 2018. [2] "What is the role of ML in IoT", https://www.iotworlds.com/what-is-the-role-of-machine-learning-in-iot/

#### Addressed research questions/problems

- ML tasks are energy-hungry. To improve efficiency, one can take inspiration from the brain and adopt a "neuromorphic" approach for both sensing and computation: **In-Memory Computing** (IMC), embedding the computation where the information is stored; **Dynamic Vision Sensors** (DVSs), emulating the human eye; **Spiking Neural Networks** (SNNs), emulating the functioning of the human brain; **Hyper Dimensional Computing** (HDC), emulating the dimensionality of the human brain.
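To make the HDC idea concrete, the following is a minimal sketch of a hyperdimensional classifier. It assumes binary hypervectors, XOR binding, majority-vote bundling and a Hamming-distance search, which is one common formulation and not necessarily the algorithm developed in this work. The dimensionality `D = 2048`, the toy 8-feature patterns and every function name are illustrative assumptions.

```python
import random

D = 2048  # hypervector dimensionality (assumed value for this sketch)


def random_hv(rng):
    """Draw a random binary hypervector as a list of 0/1 values."""
    return [rng.randint(0, 1) for _ in range(D)]


def bind(a, b):
    """Bind two hypervectors with an element-wise XOR."""
    return [x ^ y for x, y in zip(a, b)]


def bundle(hvs):
    """Bundle hypervectors with an element-wise majority vote (ties give 0)."""
    return [1 if 2 * sum(bits) > len(hvs) else 0 for bits in zip(*hvs)]


def hamming(a, b):
    """Count the positions in which two hypervectors differ."""
    return sum(x != y for x, y in zip(a, b))


def encode(sample, position_hvs, level_hvs):
    """Encode a list of quantized feature levels as a single hypervector."""
    return bundle([bind(position_hvs[i], level_hvs[v]) for i, v in enumerate(sample)])


def train(dataset, position_hvs, level_hvs):
    """Build one prototype per class by bundling its encoded samples."""
    return {
        label: bundle([encode(s, position_hvs, level_hvs) for s in samples])
        for label, samples in dataset.items()
    }


def classify(sample, prototypes, position_hvs, level_hvs):
    """Return the label whose prototype is closest in Hamming distance."""
    query = encode(sample, position_hvs, level_hvs)
    return min(prototypes, key=lambda label: hamming(query, prototypes[label]))


def run_demo(seed=0):
    """Train on a hypothetical two-class toy dataset and classify two noisy queries."""
    rng = random.Random(seed)
    position_hvs = [random_hv(rng) for _ in range(8)]  # one per feature index
    level_hvs = [random_hv(rng) for _ in range(2)]     # one per feature level
    dataset = {
        "left": [[1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 0, 0, 0]],
        "right": [[0, 0, 0, 0, 1, 1, 1, 1], [0, 0, 0, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 1, 1, 1]],
    }
    prototypes = train(dataset, position_hvs, level_hvs)
    queries = [[1, 1, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1, 0, 1]]
    return [classify(q, prototypes, position_hvs, level_hvs) for q in queries]
```

All operations are element-wise and bit-level, which is why this style of computation tends to map well onto FPGA logic and in-memory arrays.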



(3) Three different IMC SRAM cells for in-memory computation of maximum value.

#### Adopted methodologies

- AND computation capabilities have been added to a standard CAM array. The circuit has been characterized at the physical level in Cadence Virtuoso, measuring the energy and latency of each memory operation.
- The HDC circuit has been ported to several Xilinx FPGA families (Spartan 7, Artix 7, Kintex 7, Virtex 7), measuring power consumption and critical path for different architecture configurations (word width).
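The in-memory maximum search targeted by the IMC arrays can be illustrated in software. The sketch below assumes a generic MSB-first, bit-serial elimination scheme: each step reads one bit column and ANDs it with a per-row candidate flag. This is one common way to find a maximum with column reads and AND masking, and it is not claimed to be the scheme implemented in the characterized circuits. `bit_serial_max` and its parameters are hypothetical names.

```python
def bit_serial_max(words, width):
    """Return (max_value, row_indices) using an MSB-first bit-serial search.

    words: non-negative integers, one per memory row.
    width: number of bits stored per row.
    """
    if not words:
        raise ValueError("at least one word is required")
    candidates = [True] * len(words)  # one candidate flag per memory row
    for bit in reversed(range(width)):
        # Column read ANDed with the candidate flags:
        # which surviving rows store a 1 at this bit position?
        hits = [c and bool((w >> bit) & 1) for c, w in zip(candidates, words)]
        if any(hits):
            # Rows holding a 0 at this position cannot hold the maximum.
            candidates = hits
    winners = [i for i, c in enumerate(candidates) if c]
    return words[winners[0]], winners
```

For example, `bit_serial_max([5, 12, 7, 12, 3], width=4)` keeps rows 1 and 3 after the most significant bit and never eliminates them afterwards, so it reports the value 12 at both rows. The number of steps depends on the word width rather than on the number of rows, which is the property that makes the approach attractive for memory arrays.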



[1] Karunaratne et al., "In Memory Hyperdimensional Computing", Nature 2020. [2] Duan et al., "LeHDC: Learning-Based Hyperdimensional Computing Classifier", DAC 2022. [3] Frenkel et al., "SNN ICs: A Review of Trends and Future Directions", IEEE CICC 2022. [4] Schaefer et al., "AEGNN: Asynchronous Event-based GNNs", CVPR 2022.

## Submitted and published works

- Ottati F.; Turvani G.; Masera G.; Vacca M. "Custom Memory Design for Logic-in-Memory: Drawbacks and Improvements over Conventional Memories", MDPI Electronics, V. 10, no. 18: 2291, 2021.
- Andrighetti, M.; Turvani, G.; Santoro, G.; Vacca, M.; Marchesin, A.; Ottati, F.; Ruo Roch, M.; Graziano, M.; Zamboni, M. "Data Processing and Information Classification—An In-Memory Approach". MDPI Sensors, V. 20, no. 1681, 2020.



| FPGA            | Dynamic Power (W) | Dynamic Power (%) | Static Power (W) | Static Power (%) | Total Power (W) |
|-----------------|-------------------|-------------------|------------------|------------------|-----------------|
| Spartan 7 25C   | 0.029             | 25.66             | 0.084            | 74.34            | 0.113           |
| Spartan 7 100F  | 0.032             | 17.67             | 0.149            | 82.33            | 0.181           |
| Artix 7 15T     | 0.030             | 24.39             | 0.093            | 75.61            | 0.123           |
| Artix 7 200T    | 0.054             | 23.28             | 0.177            | 76.72            | 0.231           |
| Kintex 7 70T    | 0.031             | 20.94             | 0.117            | 79.06            | 0.148           |
| Kintex 7 70T    | 0.031             | 6.9               | 0.417            | 93.1             | 0.448           |
| Virtex 7 585T   | 0.033             | 6.38              | 0.485            | 93.62            | 0.518           |
| Virtex 7 1140T  | 0.033             | 2.63              | 1.223            | 97.37            | 1.256           |

(2) Resource utilization and critical path of HDC architecture on Spartan 100F; power consumption on various FPGA families.
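The percentage columns of the power table follow from the wattage columns: each share is the component power divided by the total power. The short check below reproduces two rows copied from the table; the function name `power_shares` is an illustrative assumption.

```python
def power_shares(dynamic_w, static_w):
    """Return (total power in W, dynamic share in %, static share in %)."""
    total_w = dynamic_w + static_w
    dynamic_pct = 100.0 * dynamic_w / total_w
    static_pct = 100.0 * static_w / total_w
    return round(total_w, 3), round(dynamic_pct, 2), round(static_pct, 2)


# Rows taken from the table: (dynamic power in W, static power in W).
rows = {
    "Spartan 7 25C": (0.029, 0.084),
    "Virtex 7 1140T": (0.033, 1.223),
}

for name, (dynamic_w, static_w) in rows.items():
    print(name, power_shares(dynamic_w, static_w))
```

Dynamic power stays close to 0.03 W on most devices in the table, so the growth in total power on larger parts comes almost entirely from static power.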

#### **Future work**

- Generalization of the HDC accelerator to multiple applications through a Python framework, allowing for on-chip training with state-of-the-art algorithms.
- Investigation of SNN and GNN algorithms and circuits for DVS-based applications in contexts where minimum latency and energy consumption are required.
- Development of an SNN accelerator for smart drones in collaboration with TU Delft (abroad period), with professors Charlotte Frenkel and Guido De Croon.

## List of attended classes

- 01DNMIU Optimized execution of neural networks at the edge (2/8/22, 25h)
- 01DUCRV Principles of digital image processing and technologies (22/7/22, 27h)
- 01UNRRV Entrepreneurship and start-up creation (31/5/21, 40h)
- 02SFURV Advanced scientific programming in Matlab (27/4/21, 30h)
- 01QTEIU Data mining concepts and algorithms (1/2/21, 20h)
- 01TWNQW Electronic systems for sensor acquisition (27/1/21, 30h)





#### **Electrical, Electronics and Communications Engineering**