Manil Dev Gomony just had a journal article entitled “A Globally Arbitrated Memory Tree for Mixed-Time-Criticality Systems” accepted in the high-impact journal IEEE Transactions on Computers. This article extends a conference paper published at DATE in 2015, entitled “A Generic, Scalable and Globally Arbitrated Memory Tree for Shared DRAM Access in Real-Time Systems”, which was written in collaboration with Jamie Garside and Neil Audsley from the University of York. The original paper explained the design and efficient hardware implementation of a transaction arbiter for real-time systems that can be configured to behave like any of five well-known arbiters, i.e. TDM, Round Robin, Credit-Controlled Static Priority, Priority-Based Scheduler, and Frame-Based Static Priority. The key feature of the arbiter is that it is distributed, which means that accounting and enforcement are not done in a single centralized location. This allows it to scale to systems with many resource clients without negatively impacting the maximum frequency at which it operates.
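To make the arbiter names above more concrete, here is a minimal sketch of two of them, TDM and Round Robin, as plain centralized software models. It is not the distributed hardware design from the paper; the function names, the slot table, and the use of integer client IDs are assumptions made for illustration.

```python
def tdm_grant(slot_table, cycle, requests):
    """Non-work-conserving TDM: only the owner of the current slot may be served.

    slot_table: list of client IDs, one per slot in the TDM frame (hypothetical).
    requests:   set of client IDs with a pending request.
    Returns the granted client ID, or None if the slot stays idle.
    """
    owner = slot_table[cycle % len(slot_table)]
    return owner if owner in requests else None


def round_robin_grant(num_clients, last_granted, requests):
    """Round Robin: serve the next requesting client after the last granted one."""
    for offset in range(1, num_clients + 1):
        candidate = (last_granted + offset) % num_clients
        if candidate in requests:
            return candidate
    return None  # nobody is requesting


# Client 1 owns slot 1 but is idle, so TDM leaves the slot unused,
# whereas Round Robin moves on to the next requesting client.
print(tdm_grant([0, 1, 0, 2], cycle=1, requests={0, 2}))
print(round_robin_grant(3, last_granted=0, requests={0, 2}))
```

The contrast between the two calls shows why TDM gives temporal isolation (a client's service never depends on what others request) at the cost of leaving idle slots unused.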
The journal article extends the original conference paper by adding more detail and examples on the design of the memory tree, as well as improving its positioning. However, it also extends the scope of the work to consider more complex Mixed-Time-Criticality systems where some clients are more concerned about average-case than worst-case performance. It also considers that the requirements of the clients may be diverse, i.e. that some may have high bandwidth requirements and be latency-tolerant, while others have low bandwidth requirements but are latency-critical. This diversity of requirements is addressed by showing how the memory tree allows the transaction arbiter to be chosen individually per client rather than once for the entire system. For example, some real-time clients may be configured with non-work-conserving TDM arbitration to get predictable bandwidth and latency while enjoying complete temporal isolation from other clients, which simplifies integration and certification. Other clients sharing the same resource may be scheduled using e.g. a work-conserving Frame-Based Static Priority scheduler to reflect an interest in low average latency while still distinguishing their relative latency-sensitivity. The memory tree supports any combination of the mechanisms discussed above, but the formal analysis we provide covers the mixed arbitration algorithm explained above. The article demonstrates the benefits of this approach on a VHDL hardware implementation, and evaluates its cost in terms of area and power compared to centralized non-mixed arbitration policies by means of ASIC synthesis.
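The idea of choosing the arbiter per client can be sketched in a few lines. The model below is a simplified assumption and not the algorithm analyzed in the article: real-time clients own fixed TDM slots that stay idle when unused, while the remaining slots, marked with a hypothetical `SHARED` label, go to the highest-priority requesting client. The frame budgets that Frame-Based Static Priority would add are left out for brevity.

```python
from dataclasses import dataclass

SHARED = "shared"  # hypothetical marker for slots not reserved for a TDM client


@dataclass(frozen=True)
class Client:
    name: str
    policy: str        # "tdm" or "priority" (illustrative labels)
    priority: int = 0  # lower value wins; only used by "priority" clients


def mixed_grant(slot_table, cycle, clients, requests):
    """Return the name of the client served in this slot, or None if idle."""
    owner = slot_table[cycle % len(slot_table)]
    if owner != SHARED:
        # Non-work-conserving TDM slot: never handed to another client.
        return owner if owner in requests else None
    # Shared slot: work-conserving static priority among the other clients.
    candidates = [c for c in clients
                  if c.policy == "priority" and c.name in requests]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.priority).name


clients = [Client("rt", "tdm"),
           Client("be_hi", "priority", 0),
           Client("be_lo", "priority", 1)]
table = ["rt", SHARED, SHARED]
print(mixed_grant(table, 0, clients, {"be_hi"}))           # idle: slot belongs to "rt"
print(mixed_grant(table, 2, clients, {"be_hi", "be_lo"}))  # highest priority wins
```

Keeping the TDM slot idle in the first call is what preserves temporal isolation for the real-time client, even though a lower-criticality client is waiting.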
Today, Manil Dev Gomony has successfully defended his PhD thesis entitled “Scalable and Bandwidth-Efficient Memory Subsystem Design for Real-Time Systems”. The thesis proposes an architecture for a real-time memory subsystem that scales well in terms of area and maximum synthesizable frequency with an increasing number of memory clients. This subsystem architecture comprises a memory interconnect called Globally Arbitrated Memory Tree (GAMT), a Multi-Channel Memory Controller (MCMC), and a technique to couple those components and have a single point of arbitration for both resources. The thesis also proposes a design flow for automatically choosing the memory device, mapping clients to memory channels, and configuring arbiters to satisfy client requirements.
Among Manil’s achievements, we specifically highlight two with respect to publishing. First of all, he had a paper accepted at the DATE conference every year during his PhD. Secondly, none of his publications were ever rejected anywhere. This shows that Manil managed to publish in competitive forums in his field and that his work was well-received. Currently, Manil works as a Researcher at Bell Laboratories of Alcatel-Lucent in Belgium. We wish him the best of luck in his future career!
A journal article entitled “A Framework for Memory Contention Analysis in Multi-Core Platforms” has been accepted for publication in Real-Time Systems. This article is a collaboration with Dakshina Dasari and Vincent Nelis and is a result of the time I spent with the CISTER-ISEP Research Unit in Porto.
The article proposes a unified framework to bound memory interference in multi-core platforms for a variety of different arbiters, such as time-division multiplexing (TDM), fixed priority, and an unspecified work-conserving arbiter. Our framework clearly demarcates the arbiter-dependent and arbiter-independent phases in the analysis of interference. The arbiter-dependent phase takes the arbiter and the task memory-traffic pattern as inputs and produces a model of the availability of the bus to a given task. Then, based on the availability of the bus, the arbiter-independent phase determines the worst-case request-release scenario that maximizes the interference experienced by the tasks due to memory contention. We experimentally evaluate the quality of the analysis by comparison with a state-of-the-art TDM analysis approach and consistently show a considerable reduction in maximum interference.
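The two-phase split can be illustrated with a deliberately simplified sketch for TDM only. It assumes slot-granular time, one request served per owned slot, and a hypothetical slot table. The helper names are ours, and unlike the article's analysis the sketch ignores the task memory-traffic pattern.

```python
def min_slots_available(slot_table, core, window):
    """Arbiter-dependent phase (TDM): the minimum number of slots owned by
    `core` in any window of `window` consecutive slots, over all alignments."""
    frame = len(slot_table)
    return min(
        sum(1 for i in range(start, start + window)
            if slot_table[i % frame] == core)
        for start in range(frame)
    )


def worst_case_window(availability, n_requests, limit=10_000):
    """Arbiter-independent phase: the smallest window in which `n_requests`
    are guaranteed to be served, given only an availability function."""
    for window in range(limit + 1):
        if availability(window) >= n_requests:
            return window
    raise ValueError("requests cannot be served within the search limit")


table = [0, 1, 2, 1]  # hypothetical frame: core 0 owns one slot out of four

def availability_core0(window):
    return min_slots_available(table, 0, window)

print(worst_case_window(availability_core0, 1))  # 4
print(worst_case_window(availability_core0, 2))  # 8
```

Only `min_slots_available` knows about TDM. Supporting another arbiter in this sketch would mean supplying a different availability function while `worst_case_window` stays unchanged, which mirrors the separation the framework describes.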
The notifications from the DATE conference are in and the Memory Team scores 2 out of 2, just like in 2014. The first paper, entitled “A Generic, Scalable and Globally Arbitrated Memory Tree for Shared DRAM Access in Real-Time Systems”, was first-authored by Manil and is a collaboration with Jamie Garside and Neil Audsley from the University of York. The paper proposes a memory interconnect for shared memory architectures in many-core systems. A main architectural feature is that the interconnect is heavily pipelined, enabling it to be synthesized at high frequencies even with a large number of clients. Another highlight is that it has global arbitration that can be programmed to behave like several different arbitration mechanisms, such as TDM, CCSP and FBSP.
The second paper, “Retention Time Measurements and Modelling of Bit Error Rates of WIDE I/O DRAM in MPSoCs”, was first-authored by our colleagues at Kaiserslautern University of Technology in collaboration with Sven Goossens from our Memory Team. This paper looks into the thermal behavior of 3D-stacked WIDE I/O DRAM and compares its impact on retention time and bit error rates to conventional 2D DRAM chips.
Today we celebrate that the Memory Team had both papers submitted to DATE accepted as full papers at the conference. The first paper was written by Manil Dev Gomony and is entitled “Coupling TDM NoC and DRAM Controller for Cost and Performance Optimization of Real-Time Systems”. This paper discusses the area, power and performance benefits of coupling the arbitration in a TDM NoC with the memory controller arbitration, thereby reducing the number of arbitration points on the path from processor to memory. The second paper, entitled “Exploiting Expendable Process-Margins in DRAMs for Run-Time Performance Optimization”, was first-authored by Karthik Chandrasekar. This paper shows how to exploit excess process margins in DRAMs by proposing a methodology for determining the minimum timings at which a memory can safely run, thereby improving performance.
Sahar Foroutan had a paper entitled “A General Framework for Average-Case Performance Analysis of Shared Resources” accepted at DSD 2013. This paper is a result of her six-month collaboration visit in Eindhoven last year. The two main contributions of the paper are: 1) a general model for resource sharing based on queuing theory that can be used with different arbiters and that captures architectural features of the shared resource, such as pipelining and arbitration delay, and 2) three arbiter models for time-division multiplexing, static-priority arbitration, and round-robin, respectively, that assume general distributions (G/G/1) and fit within the framework.
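For readers unfamiliar with G/G/1 queues, the sketch below computes Kingman's well-known approximation of the mean waiting time in such a queue. It is shown only to illustrate what an average-case queuing model takes as input; it is not the model from the paper, and the function and parameter names are our own.

```python
def kingman_mean_wait(arrival_rate, mean_service, ca2, cs2):
    """Kingman's approximation of the mean queueing delay in a G/G/1 queue.

    arrival_rate: average number of requests per time unit
    mean_service: average service time per request
    ca2, cs2:     squared coefficients of variation of the inter-arrival
                  and service time distributions
    """
    rho = arrival_rate * mean_service  # utilization of the shared resource
    if not 0 <= rho < 1:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    return (rho / (1 - rho)) * ((ca2 + cs2) / 2) * mean_service


# With exponential arrivals and service (ca2 = cs2 = 1) the formula
# coincides with the exact M/M/1 result.
print(kingman_mean_wait(arrival_rate=0.5, mean_service=1.0, ca2=1.0, cs2=1.0))
```

The formula makes the average-case intuition visible: waiting time grows sharply as utilization approaches 1 and grows with the variability of both arrivals and service.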