Paper Accepted at SCOPES 2017

Hazem had a paper entitled “Combining Dataflow Applications and Real-time Task Sets on Multi-core Platforms” accepted at the 2017 Workshop on Software and Compilers for Embedded Systems (SCOPES). This paper is a short overview of his PhD dissertation, which will be defended in Porto on May 23, and explains an approach to mapping and scheduling applications on a multi-/many-core system when some are described as traditional real-time task sets and others as synchronous data-flow graphs. Hazem’s approach is to convert the data-flow graph into a periodic real-time task set to unify the models before mapping, which enables him to leverage existing real-time analysis techniques and schedulers. However, converting a complex data-flow graph into a periodic task set may produce a large number of tasks, resulting in long analysis times. To mitigate this problem, he proposes a slack-based merging algorithm that reduces the number of tasks by carefully sacrificing parallelism in the data-flow graph, subject to its latency and throughput constraints. Lastly, the resulting unified real-time task set is mapped to a multi-/many-core platform interconnected by a TDM NoC using a sensitive-path-first algorithm, which first allocates the tasks derived from the original data-flow graph that have the highest impact on its execution and schedulability. The algorithm is also able to exploit parallelism in the graph during mapping.

We hope you enjoy the paper and wish Hazem all the best for his upcoming defense.

Paper Accepted at ECRTS 2017

Our paper “Mixed-criticality Scheduling with Dynamic Redistribution of Shared Cache” has been accepted at ECRTS 2017, marking the end of yet another successful collaboration with my former colleagues at CISTER. The paper proposes an extension of Vestal’s model for mixed-criticality multi-core systems that 1) accounts for the per-task partitioning of the last-level cache, and 2) supports dynamic reassignment of cache portions initially reserved for lower-criticality tasks to the higher-criticality tasks when switching to high-criticality mode. A schedulability analysis based on partitioned EDF is presented that is aware of the cache resources assigned to each task and leverages the dynamic reconfiguration to improve schedulability. We also propose heuristics for partitioning the cache in low- and high-criticality mode. Experimental results indicate tangible improvements in schedulability compared to a baseline cache-aware arrangement where there is no redistribution of cache resources from low- to high-criticality tasks in the event of a mode change.

Paper Accepted at RTAS

A paper entitled “Partitioning and Analysis of the Network-on-Chip on a COTS Many-Core Platform” was recently accepted for publication at RTAS. This paper was a collaboration with former colleagues at the CISTER Research Unit, as well as friends from MDH in Sweden. The paper addresses the issue of interference between applications in many-core platforms interconnected using rate-regulated Networks-on-Chip (NoC), such as the Kalray MPPA. The main contributions of the paper are 1) a partitioning strategy for reducing contention on the NoC, 2) an analysis technique to determine the Worst-Case Traversal Time (WCTT) of packets under the proposed strategy, and 3) a method to determine parameters for the NoC’s rate regulators that minimize the WCTT and ensure that buffers never overflow. The benefits of the proposed approach are evaluated both using simulation and by experiments on a Kalray MPPA. Furthermore, an industrial case study from the automotive domain shows the tightness of the proposed analysis.

Yonghui Li Defends Dissertation

Today, we celebrate that Yonghui Li successfully defended his PhD dissertation “Design and Formal Analysis of Real-Time Memory Controllers” and became Dr. Li. The thesis defines a dynamically scheduled real-time memory controller architecture, which is implemented as a SystemC simulation model. It then continues by analyzing the worst-case response time and minimum guaranteed bandwidth using three different formal frameworks. The first framework is a mathematical formulation of both the actual and worst-case timing behavior as a set of equations and proofs of their correctness. These equations are also implemented in an open-source tool. The drawback of this kind of mathematical formulation is that it takes a long time to derive and prove correct. The second analysis approach addresses this by shifting the effort of the user from performance analysis to modeling the memory controller as a mode-controlled data-flow graph, which can be analyzed with existing tools. This approach is faster, but only bounds the minimum guaranteed bandwidth and not the worst-case response time. This limitation is overcome by the final approach, which is to model the memory controller using timed automata and bound its worst-case performance using a model checker. So, in summary, the thesis presents one controller architecture and three approaches to analyze its worst-case performance. This work hence gives unique insight into the strengths and weaknesses of different modeling and analysis approaches in terms of accuracy, expressiveness, memory consumption, and computation time.

The defense itself was well-prepared and confident and the committee seemed to really like the work. I am also really pleased with how it came out and I would like to thank Yonghui for the years of hard work that went into creating it. It was a pleasure to work with you during these years and I wish you all the best in your future career.

Outstanding Paper Award at ECRTS

I am pleased to announce that our paper “Cache-Persistence-Aware Response-Time Analysis for Fixed-Priority Preemptive Systems” got an Outstanding Paper Award at the Euromicro Conference on Real-Time Systems (ECRTS) in Toulouse. We are glad that the work was well-received and hope that the community will enjoy reading the paper.

Article Accepted in IEEE Transactions on Computers

Manil Dev Gomony just had a journal article entitled “A Globally Arbitrated Memory Tree for Mixed-Time-Criticality Systems” accepted in the high-impact journal IEEE Transactions on Computers. This article extends a conference paper called “A Generic, Scalable and Globally Arbitrated Memory Tree for Shared DRAM Access in Real-Time Systems”, which was published at DATE in 2015 in collaboration with Jamie Garside and Neil Audsley from University of York. The original paper explained the design and efficient hardware implementation of a transaction arbiter for real-time systems that could be configured to behave like any of five well-known arbiters, i.e. TDM, Round Robin, Credit-Controlled Static Priority, Priority-Based Scheduler, and Frame-Based Static Priority. The key feature of the arbiter is that it is distributed, which means that accounting and enforcement are not done in a single centralized location, allowing it to scale to systems with many resource clients without negatively impacting the maximum frequency at which it operates.

The journal article extends the original conference paper by adding more detail and examples on the design of the memory tree, as well as improving positioning. However, it also extends the scope of the work to consider more complex Mixed-Time-Criticality systems where some clients are more concerned about average-case than worst-case performance. It also considers that the requirements of the clients may be diverse, i.e. that some may have high bandwidth requirements and be latency-tolerant, while others have low bandwidth requirements but are latency-critical. This diversity of requirements is addressed by showing how the memory tree allows the transaction arbiter to be chosen individually per client rather than once for the entire system. For example, some real-time clients may be configured with non-work-conserving TDM arbitration to get predictable bandwidth and latency while enjoying complete temporal isolation from other clients, which simplifies integration and certification. Other clients sharing the same resource may be scheduled using, e.g., a work-conserving Frame-Based Static Priority scheduler to reflect an interest in low average latency while still distinguishing their relative latency-sensitivity. The memory tree supports any combination of the mechanisms discussed above, but we provide a formal analysis of the mixed arbitration algorithm explained above. The article demonstrates the benefits of this approach on a VHDL hardware implementation, as well as its cost in terms of area and power compared to centralized non-mixed arbitration policies by means of ASIC synthesis.
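
To give a feel for why non-work-conserving TDM yields predictable bandwidth and latency, here is a minimal Python sketch. It is a textbook-style illustration and not the memory tree or the analysis from the article; the function name `tdm_guarantees`, the slot table, and the client names are hypothetical. It assumes equally sized slots and measures the wait in whole slots, from the start of one slot owned by a client to the start of its next one.

```python
from fractions import Fraction


def tdm_guarantees(slot_table, client):
    """Return (bandwidth share, worst-case wait in slots) for `client`.

    `slot_table` is one period of a TDM schedule, given as a list with the
    owner of each slot. Under non-work-conserving TDM an unused slot stays
    idle, so both numbers depend only on the table and not on what the
    other clients do.
    """
    owned = [i for i, owner in enumerate(slot_table) if owner == client]
    if not owned:
        raise ValueError(f"{client!r} owns no slot in the table")
    n = len(slot_table)
    # Distance between consecutive owned slots, wrapping around the period.
    # A client with a single slot has to wait a full period (hence `or n`).
    gaps = [
        ((owned[(k + 1) % len(owned)] - owned[k]) % n) or n
        for k in range(len(owned))
    ]
    return Fraction(len(owned), n), max(gaps)


table = ["A", "B", "A", "C"]  # hypothetical slot table
print(tdm_guarantees(table, "A"))  # (Fraction(1, 2), 2)
print(tdm_guarantees(table, "B"))  # (Fraction(1, 4), 4)
```

A work-conserving arbiter would hand idle slots to other clients, which tends to lower average latency but makes the observed service depend on the behavior of the co-running clients.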

Journal Article Accepted in ACM TODAES

We just received the good news that Hazem’s article “Reducing the Complexity of Dataflow Graphs using Slack-based Merging” has been accepted for publication in ACM Transactions on Design Automation of Electronic Systems (TODAES). The article addresses an important problem when working with synchronous data-flow (SDF) graphs, namely that the size of the graph explodes when transforming it to its equivalent homogeneous (HSDF) representation, which prevents any design or analysis algorithms requiring this transformation as a first step from scaling to larger graphs. In the scope of Hazem’s work, this has caused problems when converting an SDF graph into a set of independent periodic real-time tasks.
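
The size explosion can be illustrated with a small Python sketch. It is not code from the article: it only applies the standard SDF balance equations to compute the repetition vector, whose entries sum to the number of actors in the equivalent HSDF graph. The function names and the example graph are hypothetical, the graph is assumed to be connected, and `math.lcm` with several arguments requires Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm


def repetition_vector(channels):
    """Smallest positive integer solution of the SDF balance equations.

    `channels` is a list of (src, production_rate, dst, consumption_rate).
    For every channel, q[src] * production_rate == q[dst] * consumption_rate.
    """
    neighbours = {}
    for src, prod, dst, cons in channels:
        neighbours.setdefault(src, []).append((dst, Fraction(prod, cons)))
        neighbours.setdefault(dst, []).append((src, Fraction(cons, prod)))

    # Propagate relative firing rates from an arbitrary start actor.
    start = next(iter(neighbours))
    rate = {start: Fraction(1)}
    stack = [start]
    while stack:
        actor = stack.pop()
        for other, ratio in neighbours[actor]:
            expected = rate[actor] * ratio
            if other not in rate:
                rate[other] = expected
                stack.append(other)
            elif rate[other] != expected:
                raise ValueError("inconsistent SDF graph: no repetition vector")
    if len(rate) != len(neighbours):
        raise ValueError("graph is not connected")

    # Scale the fractional rates to integers.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {actor: int(r * scale) for actor, r in rate.items()}


def hsdf_actor_count(channels):
    """Each SDF actor becomes one HSDF actor per firing in an iteration."""
    return sum(repetition_vector(channels).values())


# Hypothetical three-actor graph: A --(2,3)--> B --(1,2)--> C
graph = [("A", 2, "B", 3), ("B", 1, "C", 2)]
print(repetition_vector(graph))  # {'A': 3, 'B': 2, 'C': 1}
print(hsdf_actor_count(graph))   # 6
```

Here three SDF actors already turn into six HSDF actors; with larger or mutually prime rates the count grows much faster, which is the scaling problem described above.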

This article proposes a heuristic algorithm to reduce the size of the resulting HSDF graph prior to analysis by merging actors in the graph, thereby speeding up analysis algorithms using the resulting graph. Three key properties of the algorithm are: 1) it cannot violate the latency or throughput requirements of the original graph, 2) it cannot cause deadlock in the resulting merged graph, and 3) only HSDF actors corresponding to firings of the same SDF actor can be merged to enable the resulting merged graph to be efficiently used by mapping algorithms. The behavior of the algorithm is evaluated with applications from the SDF3 benchmark suite and it is compared to results of an optimal exhaustive merging algorithm for smaller graphs.

Two Papers Accepted at ECRTS 2016!

Two papers have been accepted for presentation at the 28th Euromicro Conference on Real-Time Systems (ECRTS 2016) in Toulouse, France. The first paper is entitled “Cache-Persistence-Aware Response-Time Analysis for Fixed-Priority Preemptive Systems” and is a collaboration with Syed Aftab Rashid, Geoffrey Nelissen, and Eduardo Tovar from CISTER and Damien Hardy and Isabelle Puaut from University of Rennes. This paper presents a worst-case response time (WCRT) analysis for single-core fixed-priority preemptive systems that exploits persistent cache blocks, i.e. blocks that are known to be in the cache, to reduce the WCRT.

The second paper is entitled “Contention-Free Execution of Automotive Applications on a Clustered Many-Core Platform” and was written together with Borislav Nikolic and Vincent Nelis from CISTER, Matthias Becker and Thomas Nolte from MRTC, and Dakshina Dasari from Bosch. This work presents a contention-free execution framework for automotive applications on many-core platforms, which combines privatization of memory banks with defined access phases to shared memory resources. An Integer Linear Programming (ILP) formulation is presented to find the optimal time-triggered schedule for execution as well as for accesses to shared memory. Additionally, a heuristic solution is presented that generates the schedule in a fraction of the time required by the ILP.

New Book Available for Pre-order

Our new book “Memory Controllers for Mixed-Time-Criticality Systems: Architectures, Methodologies and Trade-offs” is now available for pre-order at Springer. The book is based on the excellent PhD thesis of Sven Goossens and discusses the design and FPGA implementation of a real-time memory controller for mixed-criticality systems. The controller can provide complete temporal isolation to its clients as well as hard bounds on the worst-case response time of transactions and the bandwidth offered by the memory. In addition, it provides competitive average-case performance for soft real-time and best-effort applications using a conservative open-page policy. The design is highly configurable and the book carefully quantifies the trade-offs between bandwidth, response time, and power that this enables. To facilitate the discussion about power, the book also presents the power model that came out of the PhD dissertation of Karthik Chandrasekar and gives an up-to-date description of the open-source DRAMPower tool that implements it.

Update: The contents of the book are now available on SpringerLink.

New Position at TNO-ESI

Today, I started a new position as a Research Fellow at Embedded Systems Innovation by TNO (TNO-ESI) in Eindhoven. TNO-ESI is a leading Dutch research group for high-tech embedded systems design and engineering. It has a close cooperation with high-tech industry, as well as a strong association with fundamental research of academia, both national and international. This means I am now transitioning to applied science in an industrial setting and I look forward to the new challenges and opportunities that entails.

I want to thank the good people at CISTER for the time I have spent with the unit. I find it a very nice place to work with good researchers and a friendly atmosphere. I appreciate the intellectual freedom I had to pursue my ideas and interests, as well as the interesting collaborations and growth opportunities I got sucked into. I hope we will have the pleasure of working together again in the future.