Model-Centric Debugging Principles

2016-06-30

Slides in PDF

The principles of programming-model-centric debugging consist of a set of functionalities that a debugger should implement in order to offer a model-centric view of multicore application debugging. With this approach, we expect to provide developers with more efficient tools to debug their applications. Instead of working with system abstractions like threads and processes, they will interact with the very entities and communication operations defined by the abstract machine of their application's programming model.

Our approach relies on detecting and handling the key programming-model-related operations of the execution. Debuggers should interpret these operations to be able to follow the evolution of the abstract-machine state. This would enable them to provide accurate, high-level information and control mechanisms to debug applications based on a programming model.

Providing a Structural Representation

The application architecture is an important aspect of the state that debuggers should monitor and represent. It can be static or dynamic. In the latter case, the architecture can vary over time, so it is crucial that debuggers follow its evolution in order to provide an accurate view of the current deployment.

Debuggers should provide developers with catchpoints (i.e., operation breakpoints) on these architecture-modification operations.
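As an illustration, a catchpoint table for architecture-modification operations could be sketched as follows. This is only a minimal sketch: the operation names (`create_entity`, `connect`, ...) and the `ModelEvent` and `CatchpointTable` types are hypothetical and do not come from any particular programming model or debugger.

```python
from dataclasses import dataclass
from typing import Set, Tuple

# Hypothetical names for the operations that modify the architecture.
# A real programming model defines its own set.
ARCHITECTURE_OPS = {"create_entity", "destroy_entity", "connect", "disconnect"}


@dataclass
class ModelEvent:
    """One programming-model operation detected during the execution."""
    op: str
    args: Tuple[str, ...]


class CatchpointTable:
    """Remembers which architecture operations should stop the execution."""

    def __init__(self) -> None:
        self._caught: Set[str] = set()

    def catch(self, op: str) -> None:
        # Only architecture-modification operations can be caught here.
        if op not in ARCHITECTURE_OPS:
            raise ValueError(f"not an architecture operation: {op}")
        self._caught.add(op)

    def should_stop(self, event: ModelEvent) -> bool:
        # The debugger would call this each time it detects an operation.
        return event.op in self._caught


table = CatchpointTable()
table.catch("connect")
stop_on_connect = table.should_stop(ModelEvent("connect", ("producer", "consumer")))
stop_on_create = table.should_stop(ModelEvent("create_entity", ("filter",)))
```

Here the execution would stop on the `connect` operation but not on `create_entity`, since only the former was registered.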

They should also capture and represent the relationships between the different entities, in particular when the model explicitly defines such connections. Otherwise, different metrics can be evaluated to estimate the entities’ affinities, such as their communication frequency or, if entities are pinned to a particular core, the underlying processor topology. The set of entities and interconnections forms a graph (directed or not) that can help developers to detect unexpected situations, for instance by studying the shape of the graph or the dynamics of inter-entity exchanges.
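The structural representation described above can be sketched as a small graph structure. This is a hedged example: the `EntityGraph` class and its methods are hypothetical, and the affinity metric is simply the number of messages observed between two entities, which is one possible reading of "communication frequency".

```python
from collections import Counter
from typing import List, Set, Tuple


class EntityGraph:
    """Directed graph of entities, explicit links and observed traffic."""

    def __init__(self) -> None:
        self.entities: Set[str] = set()
        self.links: Set[Tuple[str, str]] = set()   # connections defined by the model
        self.traffic: Counter = Counter()          # messages seen per (src, dst)

    def add_entity(self, name: str) -> None:
        self.entities.add(name)

    def connect(self, src: str, dst: str) -> None:
        self.entities.update((src, dst))
        self.links.add((src, dst))

    def record_message(self, src: str, dst: str) -> None:
        self.entities.update((src, dst))
        self.traffic[(src, dst)] += 1

    def affinity(self, a: str, b: str) -> int:
        # Messages exchanged between a and b, in both directions.
        return self.traffic[(a, b)] + self.traffic[(b, a)]

    def unexpected_traffic(self) -> List[Tuple[str, str]]:
        # Exchanges observed between entities with no declared link.
        return sorted(pair for pair in self.traffic if pair not in self.links)


graph = EntityGraph()
graph.connect("source", "filter")
graph.connect("filter", "sink")
graph.record_message("source", "filter")
graph.record_message("source", "filter")
graph.record_message("source", "sink")   # no such link was declared
```

With this data, `graph.affinity("source", "filter")` counts two messages, and `graph.unexpected_traffic()` reports the undeclared `("source", "sink")` exchange, the kind of unexpected situation a developer may want to investigate.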

Monitoring the Application’s Dynamic Behaviors

The different entities of a parallel application usually collaborate in order to complete their task. Hence, debuggers should provide information about these interactions. Namely, they should interpret communication and synchronization events and represent them with respect to the graph structure of the application. They should also be aware of the pattern and the semantics of communication operations in order to precisely interpret their behavior. For instance, communication patterns can be one-to-one, one-to-many, global or local barriers, … The semantics of the link can be First-In-First-Out (FIFO) or implementation-specific. This last case might require further cooperation with the supporting environment.

Debuggers should also allow developers to stop the execution based on these interactions.
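To make this concrete, the sketch below follows the messages pending on links with FIFO semantics and requests a stop when a link accumulates too many of them. It is an assumption-laden illustration: the `FifoLinkMonitor` class, its callbacks and the pending-message threshold are hypothetical, and a real debugger would trigger `on_send` and `on_receive` from breakpoints set on the communication operations of the supporting environment.

```python
from collections import deque
from typing import Any, Deque, Dict, Optional, Tuple

Link = Tuple[str, str]


class FifoLinkMonitor:
    """Mirrors the content of FIFO links from the send/receive events."""

    def __init__(self, stop_when_pending_exceeds: Optional[int] = None) -> None:
        self.pending: Dict[Link, Deque[Any]] = {}
        self.limit = stop_when_pending_exceeds

    def on_send(self, src: str, dst: str, payload: Any) -> bool:
        """Record a send; return True if the execution should stop."""
        queue = self.pending.setdefault((src, dst), deque())
        queue.append(payload)
        return self.limit is not None and len(queue) > self.limit

    def on_receive(self, src: str, dst: str) -> Any:
        """Record a receive; FIFO semantics deliver the oldest message."""
        queue = self.pending.get((src, dst))
        if not queue:
            raise LookupError(f"no pending message on link {src} -> {dst}")
        return queue.popleft()


monitor = FifoLinkMonitor(stop_when_pending_exceeds=1)
first_stop = monitor.on_send("producer", "consumer", "frame 1")
second_stop = monitor.on_send("producer", "consumer", "frame 2")
delivered = monitor.on_receive("producer", "consumer")
```

The second send leaves two messages pending on the link, which exceeds the limit of one, so the monitor asks for a stop. The receive then delivers `"frame 1"`, the oldest message.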

Interacting with the Abstract Machine

Parallel programming models should facilitate application decomposition, hence a programming-model centric debugger should be able to distinguish and identify these different entities. It should also provide indications about their inner state, like their schedulability or outstanding communication events.

A modification of an entity state may indicate a turning point in the execution, so debuggers should provide watchpoints (i.e., state-modification breakpoints) for such events. These watchpoints would allow developers to reach different time and space locations of the execution more easily.
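A state watchpoint of this kind could look like the following sketch. The state names in `State` and the `EntityStateTracker` class are hypothetical; each programming model defines its own entity states and its own way of detecting transitions.

```python
from enum import Enum
from typing import Dict, Optional, Set, Tuple


class State(Enum):
    # Hypothetical entity states, for illustration only.
    READY = "ready"
    RUNNING = "running"
    BLOCKED = "blocked"
    FINISHED = "finished"


class EntityStateTracker:
    """Follows entity states and tells when a watchpoint is hit."""

    def __init__(self) -> None:
        self.states: Dict[str, State] = {}
        self._watched: Set[Tuple[str, Optional[State]]] = set()

    def watch(self, entity: str, new_state: Optional[State] = None) -> None:
        # new_state=None means "stop on any state change of this entity".
        self._watched.add((entity, new_state))

    def update(self, entity: str, new_state: State) -> bool:
        """Record a state change; return True if the execution should stop."""
        old_state = self.states.get(entity)
        self.states[entity] = new_state
        if old_state == new_state:
            return False  # not a modification, so no watchpoint can trigger
        return (entity, None) in self._watched or (entity, new_state) in self._watched


tracker = EntityStateTracker()
tracker.watch("decoder", State.BLOCKED)
hit_on_running = tracker.update("decoder", State.RUNNING)
hit_on_blocked = tracker.update("decoder", State.BLOCKED)
hit_on_repeat = tracker.update("decoder", State.BLOCKED)
```

Only the transition to `State.BLOCKED` triggers the watchpoint. Reporting the same state a second time is not a modification and does not stop the execution.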

Two-level Debugging

Finally, as the instructions of any application are eventually written in a standard programming language and executed by the processor like traditional code, language-based and low-level debugging commands should still be available. (We assume compiled languages here, but the rationale is similar for interpreted languages.) Indeed, although some bugs may lie in the programming-model-related aspects of the application, there is a chance that the problems are hidden deep down in the language instructions. So, memory and processor inspection, breakpoints and watchpoints (maybe entity-specific) and other step-by-step execution control primitives should be directly available.

Open Up to Model and Environment Specific Features

Different programming models do not provide the same functionalities, nor do they require the same debugging capabilities. Therefore, programming-model-centric debuggers should adapt their debugging features to the specifics of the programming models and environments they are targeting. For instance:

Debuggers can follow messages transmitted from entity to entity, either based on a model-defined routing table for the entity being considered, or through user-provided tables;
Debuggers can check user-defined constraints on the graph topology, on message payload, paths, …, and stop the execution in case of violation.
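Message following with a routing table can be sketched as below. The table format (one next hop per entity) and the `follow_message` function are assumptions made for this example; model-defined routing tables may be richer, for instance with one route per message type.

```python
from typing import Dict, List


def follow_message(routing: Dict[str, str], start: str) -> List[str]:
    """Return the entities a message goes through, starting from `start`.

    `routing` maps an entity to the next entity on the route. It may come
    from the programming model or be provided by the user. If the route
    loops back to an entity already visited, that entity is appended once
    more and the walk stops, so the loop is visible in the result.
    """
    path = [start]
    current = start
    while current in routing:
        current = routing[current]
        looped = current in path
        path.append(current)
        if looped:
            break
    return path


pipeline = {"source": "filter", "filter": "sink"}
route = follow_message(pipeline, "source")

ring = {"a": "b", "b": "a"}
looping_route = follow_message(ring, "a")
```

For the pipeline, the route is `source`, `filter`, `sink`. For the ring, the result ends with a second occurrence of `a`, which a debugger could report as a violation of a user-defined "no routing loop" constraint and use to stop the execution.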

More advanced features can also be designed thanks to the strong programming-model knowledge achieved by the debugger:

Debuggers can detect deadlock situations with loops in the graph of blocking communications;
If the debugger supports non-stop debugging, smart breakpoints can stop the tasks trying to communicate with tasks already stopped by the debugger, in order to limit the intrusiveness.
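The deadlock detection mentioned above amounts to searching for a cycle in the graph of blocking communications. The sketch below uses a depth-first search over a "waits for" relation. The `find_deadlock` function and the dictionary representation of the graph are hypothetical; building that graph from the execution is the model-specific part and is not shown.

```python
from typing import Dict, Iterable, List, Optional

WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored


def find_deadlock(waits_for: Dict[str, Iterable[str]]) -> Optional[List[str]]:
    """Return one cycle of mutually blocked entities, or None.

    `waits_for[e]` lists the entities that `e` is blocked on, for instance
    the senders of a blocking receive that has not completed yet.
    """
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for peer in waits_for.get(node, ()):
            peer_color = color.get(peer, WHITE)
            if peer_color == GREY:
                # peer is already on the current path: this closes a cycle.
                return path[path.index(peer):]
            if peer_color == WHITE:
                cycle = visit(peer)
                if cycle is not None:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for entity in list(waits_for):
        if color.get(entity, WHITE) == WHITE:
            cycle = visit(entity)
            if cycle is not None:
                return cycle
    return None


blocked = {"a": ["b"], "b": ["c"], "c": ["a"]}
cycle = find_deadlock(blocked)

progressing = {"a": ["b"], "b": []}
no_cycle = find_deadlock(progressing)
```

In the first graph, `a`, `b` and `c` wait for each other in a loop, so the function returns that cycle. In the second, `b` is not blocked and no deadlock is reported. The recursive search is adequate for the small graphs targeted here; a very deep graph would call for an iterative version.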

In the following section, we delimit the scope of applicability of this approach.

Scope of Applicability

Model-centric debugging can be applied to various kinds of targets. Its primary targets are task-based programming models for multicore processors. Indeed, such tasks should communicate with each other and form, implicitly or explicitly, a graph. They should also be executable in parallel. Component and dataflow programming fit perfectly in this area.

However, the scope of application is broader than that, as we demonstrate later in this chapter with kernel-based programming. Any programming model defining an abstract machine complex enough may benefit from this approach. And, as explained in the previous chapter, the notion of abstract machine is loosely defined, on purpose. Thus, model-based debugging can be applied on top of any API, provided that someone devotes time to its implementation.

We can illustrate this last point with a video decoder, where a model-centric debugger could recognize the different modules (e.g., sound decoder, beginning/end of a frame, the error channel, …). This would help developers to understand the current state of the execution more rapidly: decoding frame N, previous frame dropped, error channel empty, …

On the other hand, it is important to note that we only focused on a particular aspect of multicore computing: analyzing the cooperation between entities running in parallel. We address neither the problem of debugging a large number of tasks, nor the time-related challenges of concurrent executions.

The main reason is that we believe interactive debuggers are not suitable for these kinds of problems. For the former aspect, the quantity of information developers can understand at each step of the execution limits the possibilities of interactive debugging. If thousands of tasks are running concurrently, developers cannot go through all of them and verify that their state matches their expectations. Designing tools offering such capabilities is another research topic. Instead, developers should try to narrow the problem down to a minimum size, both in terms of the number of parallel executions and of processing time. As for the latter aspect, time-related issues are well studied, although not yet solved. Limiting the intrusiveness of interactive debugging, and furthermore improving it for such problems, is yet another independent research topic.

For similar reasons, we do not target SIMD parallel computing. For such applications, a simple alternative would consist in running the code sequentially and using traditional debugging tools (or model-centric ones, if applicable).

Finally, the industrial context of this thesis set an additional constraint on the scope: the work had to focus on applications scaled for the companies’ embedded boards. This implied embedded multicore MPSoC platforms, but not large-scale, HPC-like computers.

Now that we have delimited the scope of applicability, we present, in the next section, how the principles of model-centric debugging apply to our three programming models.