PIM

Application Scenario Marker

General Purpose
Neural Network
Graph Processing
Bioinformatics
Data Analytics
Associative Computing
Automata Computing
Data Manipulation
Security
Others


Circuit-level research

DRAM based

[ISSCC 1997][Intelligent RAM (IRAM): chips that remember and compute]
[GLSVLSI 2005][PIM lite: A multithreaded processor-in-memory prototype]

SRAM based

[ICASSP 2014][An Energy-Efficient VLSI Architecture for Pattern Recognition via Deep Embedding of Computation in SRAM]
[VLSI 2016][A machine-learning classifier implemented in a standard 6T SRAM array]
[ISSCC 2018][A 65nm 4Kb Algorithm-Dependent Computing-in-Memory SRAM Unit-Macro with 2.3ns and 55.8TOPS/W Fully Parallel Product-Sum Operation for Binary DNN Edge Processors]
[ISSCC 2018][Conv-RAM: An Energy-Efficient SRAM with Embedded Convolution Computation for Low-Power CNN-Based Machine Learning Applications]
[JSSC 2018][A 4 + 2T SRAM for Searching and In-Memory Computing With 0.3-V VDDmin]
[DAC 2018][Parallelizing SRAM arrays with customized bit-cell for binary neural networks]

RRAM based

[Nature 2015][Training and operation of an integrated neuromorphic network based on metal-oxide memristors]
[ASP-DAC 2017][MPIM: Multi-purpose in-memory processing using configurable resistive memory]
[IEEE Electron Device Letters 2018][Reconfigurable Boolean Logic in Memristive Crossbar: the Principle and Implementation]

PCRAM based

[TED 2015][Experimental demonstration and tolerancing of a large-scale neural network (165,000 synapses), using phase-change memory as the synaptic weight element]

STT-RAM based

[ISCAS 2014][Spin-Transfer Torque Magnetic Memory as a Stochastic Memristive Synapse]
[ISVLSI 2017][Hybrid Polymorphic Logic Gate with 5-Terminal Magnetic Domain Wall Motion Device]

[ASP-DAC 2018][HieIM: Highly Flexible In-Memory Computing using STT MRAM]
[IEEE Transactions on Magnetics 2017][In-Memory Processing Paradigm for Bitwise Logic Operations in STT–MRAM]
[DATE 2018][Computing-in-memory with spintronics]

Architecture-level research

[IEEE Micro 1997][A case for intelligent RAM]
[Computer 1995][Processing in Memory: The Terasys Massively Parallel PIM Array]
[Frontiers 1996][Pursuing a Petaflop: Point Designs for 100 TF Computers Using PIM Technologies]
[Computer 1997][Scalable Processors in the Billion-Transistor Era: IRAM]
[ISCA 1997][Processing in memory: Chips to petaflops]
[ASP-DAC 2018][PIMCH: cooperative memory prefetching in processing-in-memory architecture]

DRAM based

[CICC 1992][Computational RAM: A Memory-SIMD Hybrid and Its Application to DSP]
Arch: add logic within DRAM to perform vector operations
[ICPP 1994][EXECUBE-A New Architecture for Scaleable MPPs]
Arch: add logic within DRAM to perform vector operations
[IEEE Computer 1995][Processing in memory: The Terasys massively parallel PIM array]
Arch: add logic within DRAM to perform vector operations
[IEEE Micro 1997][A Case for Intelligent RAM]
Arch: add logic within DRAM to perform vector operations
[ICCD 1997][Intelligent RAM (IRAM): the industrial setting, applications, and architectures]
Arch: add logic within DRAM to perform vector operations
[ISCA 1998][Active Pages: A Computation Model for Intelligent Memory]
[SC 1999][Mapping Irregular Applications to DIVA, a PIM based Data-Intensive Architecture]
[MTDT 1999][The Dynamic Associative Access Memory Chip and its Application to SIMD Processing and Full-text Database Retrieval]
[IEEE DT 1999][Computational RAM: Implementing processors in memory]
[ISCA 2000][Smart Memories: A Modular Reconfigurable Architecture]
[IPDPS 2002][Memory-intensive benchmarks: IRAM vs. cache-based machines]
[ICS 2002][The Architecture of the DIVA Processing-In-Memory Chip]
[ICCD 2012][FlexRAM: Toward an Advanced Intelligent Memory System]
[Micro 2013][RowClone: Fast and energy-efficient in-DRAM bulk data copy and initialization]
[HPEC 2013][Accelerating Sparse Matrix-Matrix Multiplication with 3D-Stacked Logic-in-Memory Hardware]
[SIGMOD 2015][JAFAR: Near-Data Processing for Databases]
[CAL 2015][Fast Bulk Bitwise AND and OR in DRAM]
[MemSys 2015][NCAM: Near-Data Processing for Nearest Neighbor Search]
[MemSys 2015][Opportunities and Challenges of Performing Vector Operations inside the DRAM]
[MemSys 2015][SIMT-based Logic Layers for Stacked DRAM Architectures: A Prototype]
[DaMoN 2015][Beyond the Wall: Near-Data Processing for Databases]
[ISCA 2016][DRAF: a low-power DRAM-based reconfigurable acceleration fabric]
FPGA style
[HPCA 2016][Low-Cost Inter-Linked Subarrays (LISA): Enabling Fast Inter-Subarray Data Movement in DRAM]
[arXiv 2016][Buddy-RAM: Improving the Performance and Efficiency of Bulk Bitwise Operations Using DRAM]
[TACO 2016][AIM: Energy-Efficient Aggregation Inside the Memory Hierarchy]
[Micro 2017][Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology]
[Micro 2017][DRISA: a DRAM-based Reconfigurable In-Situ Accelerator]
[MemSys 2017][PHOENIX: Efficient Computation in Memory]
[TVLSI 2017][Excavating the Hidden Parallelism Inside DRAM Architectures With Buffered Compares]
[Micro 2018][SCOPE: A Stochastic Computing Engine for DRAM-based In-situ Accelerator]
[PACT 2018][In-DRAM Near-Data Approximate Acceleration for GPUs]
[DAC 2018][DrAcc: a DRAM based accelerator for accurate CNN inference]
[TCAD 2018][McDRAM: Low Latency and Energy-Efficient Matrix Computations in DRAM]
[TPDS 2018][Exploiting Parallelism for CNN Applications on 3D Stacked Processing-In-Memory Architecture]
[arXiv 2018][The processing using memory paradigm: In-DRAM bulk copy, initialization, bitwise AND and OR]

SRAM based

[PACT 2014][SQRL: hardware accelerator for collecting software data structures]
[Micro 2017][Cache automaton]
[HPCA 2017][Compute Caches]
[ISCA 2018][Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks]
[arXiv 2018][Xcel-RAM: Accelerating Binary Neural Networks in High-Throughput SRAM Compute Arrays]

RRAM based

[MemSys 2016][Processing Acceleration with Resistive Memory-based Computation]
[HPCA 2016][Memristive Boltzmann machine: A hardware accelerator for combinatorial optimization and deep learning]
[ISCA 2016][PRIME: a novel processing-in-memory architecture for neural network computation in ReRAM-based main memory]
[ISCA 2016][ISAAC: a convolutional neural network accelerator with in-situ analog arithmetic in crossbars]
[ICCAD 2016][Reconfigurable In-Memory Computing with Resistive Memory Crossbar]
[DAC 2016][Pinatubo: A Processing-in-Memory Architecture for Bulk Bitwise Operations in Emerging Non-Volatile Memories]
[HPCA 2017][PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning]
[HPCA 2018][GraphR: Accelerating Graph Processing Using ReRAM]
[TCAD 2017][MNSIM: Simulation Platform for Memristor-Based Neuromorphic Computing System]
[CAL 2017][IMEC: A Fully Morphable In-Memory Computing Fabric Enabled by Resistive Crossbar]
RRAM FPGA
[ICCAD 2017][RRAM-based Reconfigurable In-Memory Computing Architecture with Hybrid Routing]
RRAM FPGA
[HPCA 2018][Making Memristive Neural Network Accelerators Reliable]
[ISCA 2018][Enabling Scientific Computing on Memristive Accelerators]
[ASP-DAC 2018][ReGAN: A pipelined ReRAM-based accelerator for generative adversarial networks]
[ASP-DAC 2018][Training Low Bitwidth Convolutional Neural Network on RRAM]
RRAM training
[JETCAS 2018][Multiscale Co-Design Analysis of Energy, Latency, Area, and Accuracy of a ReRAM Analog Neural Training Accelerator]
RRAM training
[Micro 2018][LerGAN: A Zero-free, Low Data Movement and PIM-based GAN Architecture]
[IEEE Micro 2018][Newton: Gravitating Towards the Physical Limits of Crossbar Acceleration]
[TCSI 2018][IMAGING: In-Memory AlGorithms for Image processiNG]
[ISCAS 2018][Efficient Algorithms for In-Memory Fixed Point Multiplication Using MAGIC]

PCRAM based

[DAC 2015][ProPRAM: exploiting the transparent logic resources in non-volatile memory for near data computing]

STT-RAM based

[ISCA 2013][AC-DIMM: Associative computing with STT-MRAM]
[DAC 2018][CMP-PIM: An Energy-Efficient Comparator-based Processing-in-Memory Neural Network Accelerator]

System-level research

ISA / Compiler

[HPCA 2001][Automatically mapping code on an intelligent memory architecture]
[ICRC 2017][Generalize or Die: Operating Systems Support for Memristor-based Accelerators]
[IPDPS 2017][Similarity Search on Automata Processors]
[ASPLOS 2018][Bridge the Gap between Neural Networks and Neuromorphic Hardware with a Neural Network Compiler]
[ASPLOS 2018][Liquid Silicon-Monona: A Reconfigurable Memory-Oriented Computing Fabric with Scalable Multi-Context Support]
RRAM FPGA
[DATE 2018][Prometheus: Processing-in-memory Heterogeneous Architecture Design From a Multi-layer Network Theoretic Strategy]
[ISCA 2018][PROMISE: an end-to-end design of a programmable mixed-signal accelerator for machine-learning algorithms]

Firmware / Runtime / Middleware

[SC 2002][Gilgamesh: A multithreaded processor-in-memory architecture for petaflops computing]
[arXiv 2017][CODA: Enabling Co-location of Computation and Data for Near-Data Processing]
Platform: GPU + HBM (SM style)
Programming model: GPU programming model
Two ideas: (1) selectively localize or scatter data; (2) co-locate thread blocks with their data
Virtual memory assumption: the paper assumes that SMs in the memory stack have a hardware TLB and memory management units (MMUs) that access page tables and perform virtual address translation.
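
The thread-block and data co-location idea noted above can be illustrated with a toy placement routine. This is only a sketch under stated assumptions, not CODA's actual mechanism: the function name `colocate_blocks`, the page-to-stack map, and the majority-vote policy are all hypothetical choices made for illustration.

```python
from collections import Counter

def colocate_blocks(block_pages, page_to_stack, default_stack=0):
    """Assign each thread block to the memory stack holding most of its pages.

    block_pages:   dict mapping a thread-block id to the pages it touches
    page_to_stack: dict mapping a page number to the stack that stores it
    Both structures are hypothetical inputs for this illustration.
    """
    placement = {}
    for block, pages in block_pages.items():
        votes = Counter(page_to_stack[p] for p in pages if p in page_to_stack)
        if votes:
            # Pick the stack with the most pages; ties go to the lowest id
            # so the result is deterministic.
            placement[block] = min(votes, key=lambda s: (-votes[s], s))
        else:
            placement[block] = default_stack
    return placement

# Hypothetical example: two stacks, four pages, three thread blocks.
page_to_stack = {0: 0, 1: 0, 2: 1, 3: 1}
block_pages = {"tb0": [0, 1], "tb1": [2, 3, 0], "tb2": [1, 2]}
placement = colocate_blocks(block_pages, page_to_stack)
print(placement)
```

A block whose pages are spread evenly across stacks (such as `tb2` here) gains little from co-location, which is the kind of data the first idea would scatter rather than localize.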

Coherence / Consistency / Concurrency (atomicity) issues

[CAL 2017][LazyPIM: An Efficient Cache Coherence Mechanism for Processing-in-Memory]
Also works for NDP architectures
PIM kernel identification: programmer / compiler
[TACO 2015][GP-SIMD Processing-in-Memory]
Coherence: restrict PIM processing logic to execute on only non-cacheable data, which forces cores within the CPU to read PIM data directly from DRAM.
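
The non-cacheable restriction described in the last note can be modeled in a few lines. This is a minimal sketch assuming a single CPU cache and word-granularity addresses; the class `ToyMemorySystem` and its methods are hypothetical and do not correspond to any interface in the cited paper.

```python
class ToyMemorySystem:
    """Toy model: PIM logic may only touch addresses marked non-cacheable."""

    def __init__(self, noncacheable=()):
        self.dram = {}                       # backing store: addr -> value
        self.cache = {}                      # CPU cache: addr -> value
        self.noncacheable = set(noncacheable)

    def cpu_read(self, addr):
        if addr in self.noncacheable:
            # Non-cacheable data is always read directly from DRAM.
            return self.dram.get(addr, 0)
        if addr not in self.cache:
            self.cache[addr] = self.dram.get(addr, 0)
        return self.cache[addr]

    def pim_write(self, addr, value):
        if addr not in self.noncacheable:
            raise PermissionError("PIM logic may only touch non-cacheable data")
        self.dram[addr] = value


mem = ToyMemorySystem(noncacheable={0x100})
mem.dram[0x100] = 1
before = mem.cpu_read(0x100)   # read goes straight to DRAM
mem.pim_write(0x100, 42)       # PIM logic updates DRAM in place
after = mem.cpu_read(0x100)    # CPU sees the new value, no stale copy exists
print(before, after)
```

Because the CPU never holds a cached copy of PIM data, no invalidation traffic is needed; the cost is that every CPU access to that data pays full DRAM latency.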
