Stacked-3D and Processing-in-memory Solutions for Data-intensive and Persistent Applications
Author: Akshay Krishna Ramanathan
Publisher:
Total Pages: 0
Release: 2022
OCLC Number: 1346411066
ISBN-13:
ISBN-10:
Rating: 4/5 ( Downloads)
Book excerpt: With the dominance of data-intensive workloads and applications, current von Neumann architectures suffer from a memory-bandwidth bottleneck popularly known as the "memory wall". To alleviate this problem, processing-in-memory (PiM) has gained considerable attention in recent years. In PiM architectures, compute logic is moved closer to, or inside, the memory where the data resides, allowing these architectures to exploit the high internal bandwidth of the memories. This dissertation explores the opportunities provided by recent advances in memory technologies to design highly efficient PiM architectures, mainly for deep-learning, database, and persistence applications.

The first work presents a novel 3D-SRAM circuit design using a monolithic 3D integration (M3D) process that realizes a beyond-Boolean in-memory compare operation with no area overhead compared to a standard 6T SRAM. We also present measurement results from a fabricated PiM macro based on the same circuit design, performing the massively parallel compare operations used in database, machine-learning, and scientific applications. The proposed PiM technique supports operations such as data filtering, sorting, and index handling for sparse matrix-matrix multiplication (SpGEMM).

The second work presents a look-up table (LUT) based PiM technique for conventional (single-layer) SRAM with the potential to run neural-network inference tasks. We implement a bitline-computing-free technique that avoids frequent bitline accesses to the cache sub-arrays, considerably reducing the memory-access energy overhead.
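The massively parallel in-memory compare primitive described above can be illustrated with a minimal software sketch: every row of a memory array is compared against a search key at once, and the resulting match vector drives higher-level operations like data filtering. All function names here are illustrative assumptions, not the dissertation's actual interface.

```python
# Software sketch of a PiM-style parallel compare: conceptually, every
# row is compared against the key in a single in-memory operation.
# Names and structure are illustrative, not the dissertation's code.

def pim_compare(memory_rows, key):
    """Return a match bit-vector: 1 where a row equals the key."""
    return [1 if row == key else 0 for row in memory_rows]

def pim_filter(memory_rows, predicate_key):
    """Data filtering built on top of the compare primitive."""
    matches = pim_compare(memory_rows, predicate_key)
    return [row for row, m in zip(memory_rows, matches) if m]

rows = [0b1010, 0b0111, 0b1010, 0b0001]
print(pim_compare(rows, 0b1010))  # [1, 0, 1, 0]
print(pim_filter(rows, 0b1010))   # [10, 10]
```

In hardware, the match vector would be produced across all rows simultaneously on the internal bitlines; the loop here only models that behavior sequentially.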
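The LUT-based compute idea can likewise be sketched in a few lines: rather than performing multiplications, results are precomputed into a table indexed by the operands, so inference "compute" reduces to table reads. The 4-bit operand width and table layout below are assumptions made for illustration only.

```python
# Hedged sketch of LUT-based compute for neural-network inference:
# multiplication results are precomputed into a look-up table, so each
# multiply becomes a table read. Operand width is an assumed toy value.

WIDTH = 4  # bits per operand in this toy example
LUT = [[w * a for a in range(1 << WIDTH)] for w in range(1 << WIDTH)]

def lut_dot(weights, activations):
    """Dot product in which every multiply is a table look-up."""
    return sum(LUT[w][a] for w, a in zip(weights, activations))

print(lut_dot([3, 1, 2], [4, 5, 6]))  # 3*4 + 1*5 + 2*6 = 29
```

The appeal for PiM is that such tables can live in ordinary SRAM sub-arrays, which is consistent with the claim that the memory structure and organization need not change.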
Our proposed LUT-based PiM methodology exploits substantial parallelism through look-up tables while leaving the memory structure and organization unaltered, yielding a PiM architecture for current memory technologies with minimal changes to the monolithic custom memory blocks.

The third work addresses crash consistency for critical applications such as financial trading, cyber-threat analysis, and IoT. Emerging non-volatile memory technologies promise to maintain persistent data directly in memory; however, providing crash consistency in such systems can be costly, since every update to the persistent data must reach the persistent domain in a specific order, imposing high overhead. In this work, we propose an architecture that employs a hybrid volatile/non-volatile memory cell, built with M3D and ferroelectric technology, in the L1 data cache to guarantee crash consistency with almost no performance overhead.

Memory technologies such as high-bandwidth memory (HBM) and solid-state drives (SSDs) use a parallel-3D integration process to stack memory layers and increase density per mm². The final work presents cost-effective N-layer logic designs realized with the same process, discusses the stricter rules and constraints the fabrication process imposes on N-layer designs, and then explores different adder designs.
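As a point of reference for the adder-design exploration, a ripple-carry adder, the simplest design such a layered logic process might implement, can be modeled at the bit level. The function names and 8-bit width are illustrative assumptions; the dissertation's actual designs are not reproduced here.

```python
# Bit-level sketch of a ripple-carry adder: a chain of one-bit full
# adders, with each carry feeding the next stage. Width and names are
# illustrative assumptions, not the dissertation's actual designs.

def full_adder(a, b, cin):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_carry_add(x, y, width=8):
    """Add two unsigned integers bit by bit, rippling the carry."""
    carry, result = 0, 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry  # width-bit sum and final carry-out

print(ripple_carry_add(0b1011, 0b0110))  # (17, 0): 11 + 6
```

The carry chain is what makes this design cheap in area but slow in delay; the design-space exploration mentioned above would weigh such trade-offs under the layered process's stricter rules.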