Two Articles Appeared in Journal of Systems Architecture

Two articles that were submitted to a Journal of Systems Architecture Special Issue on High-performance and Real-time Embedded Systems have now appeared online. The first article is called “T-CREST: Time-predictable Multi-Core Architecture for Embedded Systems” and summarizes the work done in the recently concluded FP7 STREP project T-CREST, where me and my students worked on time-predictable memory controllers.

The second article is entitled “Dataflow Formalisation of Real-Time Streaming Applications on a Composable and Predictable Multi-Processor SOC” and shows how data-flow graphs can be used to model streaming applications mapped to the CompSoc platform and predict its minimum throughput. The basic idea is to start from a data-flow graph of the application and add additional nodes and edges that capture the mapping and timing behavior of all hardware components software libraries, and schedulers in the system. The approach is verified by comparing the predicted performance to the actual performance of an application executing on a CompSoc instance on an FPGA. The article clearly demonstrates the potential of modeling systems in which the behavior of all hardware and software components are known.

Invited Presentation at CMAS 2015

I have recently accepted an invitation to speak at the First TCRTS Workshop on Certifiable Multicore Avionics Systems (CMAS), which takes place on April 13 and is co-located with RTAS 2015 in Seattle. The presentation is made in collaboration with Jan Nowotsch at Airbus Group Innovations, where I was a Visiting Researcher during two months last year. The title of the presentation is Towards Certifiable Resource Sharing in Safety-Critical Multi-Core Real-Time Systems and discusses current problems and state-of-the-art methods for resource sharing in real-time multi-core platforms. The abstract of the presentation is found below:

The proliferation of multi-core platforms has had great impact on embedded computing. Multiple cores exploiting task-level parallelism offer performance far beyond what is possible with a single core, while staying within an acceptable power envelope. Since resources, such as interconnect and memories, are often shared between cores, the platforms have also become increasingly cost efficient. However, resource sharing results in interference between concurrently executing applications, which causes problems in real-time systems where such interference must be either bounded or completely eliminated. As a result, safety-critical systems, for example in the avionics domain, have not yet been able to capitalize on the benefits of multi-core platforms due to stringent certification requirements.

This presentation discusses the state-of-the-art in resource sharing in multi-core systems and its application to safety-critical real-time systems. First, a survey of efforts to build time-predictable resources, such as interconnects and memory controllers, is provided. Then, software-based interference mitigation mechanisms and analyses for these resources in commercial-of-the-shelf platforms are discussed. This is followed by an overview of the approach proposed by Airbus Group Innovations to manage interference and compute worst-case execution times of applications running on a Freescale P4080 multi-core platform. The presentation is concluded by highlighting open issues and future directions towards certifiable resource sharing in safety-critical multi-core real-time systems.

Update: The slides are available here.

Paper Accepted at RTAS 2015

We just had a paper accepted at the Real-Time and Embedded Technology and Applications Symposium (RTAS) in Seattle. The paper is entitled “An Efficient Configuration Methodology for Time-Division Multiplexed Single Resources” and presents an ILP-based methodology to allocate TDM slots to resource clients, such that bandwidth and latency constraints are satisfied while resource utilization is minimized. A heuristic algorithm is furthermore proposed to determine the number of TDM slots in the schedule. This paper is a collaboration both with colleagues here at CTU Prague and with Andrew Nelson from Eindhoven University of Technology.

For the camera-ready version of the paper, please click here.

Back in Prague

I am now back from my two month research visit at Airbus Group Innovations. During my stay, I primarily worked on two things:

  1. Performance analysis of memory accesses in two COTS multi-core platforms. My work extended existing analysis to include the configuration of the memory controller. In particular, the existing setup was improved to enable evaluation of rank-level parallelism within a memory controller, channel parallelism between memory controllers, and different mapping options of cores to memory channels.
  2. I familiarized myself with the certification process for the avionics domain by reading and discussing key standardization documents, e.g. DO-178C for software certification, DO-254 for hardware certification, and DO-297 for integrated modular avionics. I also read several position papers from the Certification Authorities Software Team, most importantly about partitioning guidelines and certification of dual-core platforms. Lastly, I read the ARINC 653 standard, which details the application interface commonly used in avionics systems.

Thanks to Jan Nowotsch and Stefan Schneele for making the visit possible and to my office mates for providing a fun environment to work and learn in.

Paper Accepted at PDP 2015

Today, we congratulate Hazem Ali for having a paper accepted at PDP 2015. The paper is entitled “Generalized Extraction of Real-Time Parameters for Homogeneous Synchronous Dataflow Graphs” and proposes a heuristic methodology for extracting real-time parameters, such as periods, deadlines and offsets, for applications specified as homogeneous synchronous data-flow (HSDF) graphs. The benefit of the approach is that it enables HSDF applications to be analyzed using traditional real-time techniques and scheduled with common real-time schedulers, such as earliest-deadline first.

Memory Team has Two Papers Accepted at DATE 2015

The notifications from the DATE conference are in and the Memory Team scores 2 out of 2, just like in 2014. The first paper entitled “A Generic, Scalable and Globally Arbitrated Memory Tree for Shared DRAM Access in Real-Time Systems” was first-authored by Manil and is a collaboration with Jamie Garside and Neil Audsley from University of York. The paper proposes a memory interconnect for shared memory architectures in many-core systems. A main architectural feature is that the interconnect is heavily pipelined enabling it to be synthesized at high frequencies even with a large number of clients. Another highlight is that it has global arbitration that can be programmed to behave like several different arbitration mechanisms, such as TDM, CCSP and FBSP.

The second paper “Retention Time Measurements and Modelling of Bit Error Rates of WIDE I/O DRAM in MPSoCs”was first-authored by our colleagues at Kaiserslautern University of Technology in collaboration with Sven Goossens from our Memory Team. This paper looks into the thermal behavior of 3D-stacked WIDE I/O DRAM and compares its impact on retention time and bit error rates to conventional 2D DRAM chips.

Visiting Researcher at Airbus Group Innovations

Today, I start a two month stay as Visiting Researcher at Airbus Group Innovations in Ottobrunn, Germany. I will be working together with Jan Nowotsch on topics related to performance analysis of memory accesses in COTS multi-core platforms. During my stay, I look forward to meeting new people learning more about real-time systems in the avionics domain.

DRAMPower v4.0 Released!

A new version of the DRAMPower tool has been released. The two main features of version 4 are:

  1. DRAMPower can now be compiled as a library. This enables a user to access the tool through an API and log commands and their corresponding time stamps, removing the need to store large command traces on disk. The key benefit of this feature is that users can easily integrate DRAMPower into their own memory controller simulators and obtain power and energy consumption estimates. In fact, this version of DRAMPower is already integrated into the memory controller of the gem5 simulator system and is provided with the latest release.
  2. Improved robustness. The latest build is checked out every night on a test server, compiled, and tested to verify that the output matches an expected reference for a battery of tests. The code is also compiled with a large number of warning flags enabled and treats all warnings as errors. This feature makes it easier for the community to reliably contribute to the tool, which is now possible through github.

Check it out the new version of DRAMPower here.

First PhD Student Graduates From the Memory Team

Today, Karthik Chandrasekar was promoted to doctor as he confidently defended his PhD thesis “High-Level Power Estimation and Optimization of DRAMs”. The thesis proposes a high-level power estimation tool called DRAMPowerthat estimates the power and energy consumption of different generations of DRAMs based on a memory command trace and current values from the memory datasheet. Since current numbers in datasheets are often pessimistic for a majority of the manufactured memory devices, a methodology is also proposed to characterize DRAM modules post-manufacturing to achieve more accurate power and performance estimates for the characterized devices. Lastly, the thesis discusses power optimization in the context of real-time memory controllers and proposes two power-down strategies to reduce the power consumption of memories in real-time systems without sacrificing worst-case performance.

The defense went very well and the committee was particularly pleased with how the DRAMPower tool was verified using measurements on real hardware and how it has attracted interest from industry. Karthik is the first PhD student to graduate from the Memory Team and the rest of the team wishes him all the best for his future career at Nvidia.

Article in ACM Transactions on Embedded Computing Systems (TECS)

Manil Dev Gomony just had his first journal article accepted in ACM Transactions on Embedded Computing Systems. The article is entitled “A Real-Time Multi-Channel Memory Controller and Optimal Mapping of Memory Clients to Memory Channels” and is an extension of his DATE paper from 2013, which was the first paper to provide architectures and techniques for multi-channel memory controllers in real-time systems.

The two main contributions of the article are: 1) A configurable real-time multi-channel memory controller architecture with a novel method for logical-to-physical address translation. 2) Two design-time methods to map memory clients to the memory channels, one an optimal algorithm based on an integer programming formulation of the mapping problem, and the other a fast heuristic algorithm. The mapping algorithms are experimentally evaluated, showing benefits over two state-of-the-art mapping algorithms. Finally, a case study is presented that demonstrates how to configure a Wide IO DRAM in a High-Definition (HD) video and graphics processing system to emphasize the practical applicability and effectiveness of the work.