Massive data rates in cybersecurity, simulation, and social media analysis applications are driving rapid advances in the field of streaming graph analytics. The data structures that enable streaming graph analytics pose unique challenges for designers of high-performance computing systems. When the sorted, contiguous arrays of static graphs are replaced with the fragmented, linked data structures of dynamic graphs, these systems struggle to saturate memory bandwidth. Behaviors such as pointer-chasing and poor spatial locality expose the true latency of modern memory devices, and improvements in that latency have not kept pace with processor clock rates. This dissertation develops a streaming graph benchmark, DynoGraph, which is distinguished from static graph benchmarks by its use of realistic streaming graph inputs and dynamic graph data structures. The benchmark is used to expose performance pitfalls in existing implementations. These insights inform both software improvements and the design of near-memory accelerators for streaming graph analytics. The Emu architecture is identified as a promising solution for accelerating algorithms with low spatial locality, unbalanced parallelism, and fine-grained memory accesses, since it is able to maintain high memory bandwidth utilization in a worst-case pointer-chasing scenario. The work culminates in a characterization of the Emu Chick hardware prototype that proposes efficient programming primitives, highlights necessary system improvements, and demonstrates the potential for greatly improved performance on this important class of workloads.