[Cplex] Re: parallelism manager / scheduler

Edwards, Harold C hcedwar at sandia.gov
Thu Jun 20 00:43:46 CEST 2013


Intro: My focus is enabling HPC libraries & applications to maximize performance, and their developers to maximize productivity.

The separation of concerns direction that this discussion has taken is outstanding.  I am confident the parallelism manager concept is the key.  I'm elaborating on Darryl's summary with concepts from HWLOC, MPI, Qthreads, and heterogeneous architectures.

The pool of thread resources (thread pool) includes hardware processing units (borrowing vocabulary from HWLOC) and memory regions that have affinity to processing units and distinct performance characteristics (e.g., shared caches and NUMA regions); as heterogeneous architectures evolve I anticipate other resources-with-affinity will emerge.  Thus there is a topology of resources, which HWLOC strives to capture and express.
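
As a minimal sketch of what that topology looks like in code, HWLOC can already enumerate it.  This assumes only a standard hwloc installation (hwloc 1.x spells the NUMA object type HWLOC_OBJ_NODE rather than HWLOC_OBJ_NUMANODE):

/* Enumerate the resource topology with HWLOC. */
#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    int pus   = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_PU);        /* hardware processing units */
    int cores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
    int numas = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_NUMANODE);  /* memory regions with affinity */

    printf("%d PUs on %d cores across %d NUMA nodes\n", pus, cores, numas);

    hwloc_topology_destroy(topo);
    return 0;
}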

A "manager" is responsible for allocating these resources to programs (just as memory is currently allocated and released).  When bare-metal performance is critical we want exclusive control of those resources (analogous to locking memory).  Otherwise the "manager" is free to share those resources among multiple programs.

Allocation and release of subsets of resources from the thread pool, conformal to the topology, is the fundamental management operation: for example, requesting exclusive control of a subset of processing units to perform a data-parallel operation and then releasing that control.  Given an allocated subset of resources, a "submanager" (think MPI root communicator and sub-communicator) can have a scheduler which virtualizes those resources, or an HPC scheduler which gives bare-metal control for maximum performance.
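
To make the communicator analogy concrete, here is the same allocate/use/release pattern as it already exists in MPI (the 50/50 split by rank parity is arbitrary):

/* Carve a sub-communicator out of MPI_COMM_WORLD, use it, release it. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Split the "root" communicator into two halves (color 0 or 1). */
    MPI_Comm sub;
    MPI_Comm_split(MPI_COMM_WORLD, rank % 2, rank, &sub);

    /* ... collective work restricted to the sub-communicator ... */

    MPI_Comm_free(&sub);   /* release the subset back */
    MPI_Finalize();
    return 0;
}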

Thus I see three conceptual components: 
1) "Manager" of a topology of processing units, memory regions, and other components.  A manager allocates subsets of itself conformal to the topology.  In the HPC realm I really want the ability to restrict the OS to a subset of resources, or conversely claim exclusive control of a subset of resources.
2) "Scheduler" attached to a manager's subset of resources, which could virtualize "hard" resources.
3) Language extensions and libraries which allocate and use "submanagers" and choose a particular scheduler that fits their semantics.

For backward compatibility I see an implicit "root" manager (think MPI_COMM_WORLD) and a default scheduler taken from the current runtime environment.  Looking forward I see Darryl's "my_new_manager" taking the form:

	my_first_manager = my_new_manager( ROOT_MANAGER , subset_selection , scheduler )
	my_nested_manager = my_new_manager( my_first_manager , subset_selection , scheduler )
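
Purely for illustration, and with every type, selector, and scheduler name below being hypothetical rather than proposed syntax, those two calls might read in C as:

	/* Hypothetical types and names only -- mirroring the pseudocode above. */
	cplex_manager_t root   = CPLEX_ROOT_MANAGER;   /* implicit root, like MPI_COMM_WORLD */
	cplex_manager_t first  = my_new_manager(root,  select_numa_node(0), work_stealing_scheduler);
	cplex_manager_t nested = my_new_manager(first, select_cores(0, 4),  static_chunk_scheduler);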


Carter Edwards
Computing Research Center
Sandia National Laboratories


-----Original Message-----
From: cplex-bounces at open-std.org [mailto:cplex-bounces at open-std.org] On Behalf Of Darryl Gove
Sent: Wednesday, June 19, 2013 3:12 PM
Cc: 'chandlerc at google.com'; Artur Laksberg; Jeffrey Yasskin; cplex at open-std.org; Niklas Gustafsson
Subject: [EXTERNAL] Re: [Cplex] Cplex: suggested topics for discussion on the next teleconf.

Hi,

This is an interesting discussion. I'd like to try and capture what I think the key points are, pulling in some of the earlier discussion on the alias.

We have some general parallelisation concepts extracted from Cilk and OpenMP: tasks, parallel for, parallel regions, reductions, etc.
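
For concreteness, here are two of those concepts in Cilk Plus spelling (a spawned task with a sync, and a parallel for); a minimal sketch, assuming a Cilk-enabled compiler:

#include <cilk/cilk.h>
#include <stdio.h>

static double sum(const double *a, int n)
{
    if (n < 2) return n ? a[0] : 0.0;
    double left  = cilk_spawn sum(a, n / 2);   /* task: runs in parallel ... */
    double right = sum(a + n / 2, n - n / 2);  /* ... with the other half */
    cilk_sync;                                 /* join both tasks */
    return left + right;
}

int main(void)
{
    enum { N = 1000000 };
    static double a[N];

    cilk_for (int i = 0; i < N; i++)           /* parallel for */
        a[i] = 1.0;

    printf("sum = %f\n", sum(a, N));
    return 0;
}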

In OpenMP we have a set of what might be called scheduling or execution directives which have no direct equivalent in Cilk.

In Cilk we have composability because everything resolves down to tasks, and the tasks can be placed on a "single" queue; it doesn't matter who produced a task because they all end up on the same queue.
In OpenMP we have to manage composability through nested parallelism.
This gives us more control over which threads perform a task, or where that task is executed, but it makes things difficult when nested parallelism arises from combining applications and libraries from different sources - the developer needs to manage the nesting more carefully.
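
A small OpenMP sketch of that nesting issue (the thread counts are arbitrary): when the inner region comes from a library, the developer has to reason about the product of the two levels:

/* Compile with -fopenmp or equivalent. */
#include <omp.h>
#include <stdio.h>

int main(void)
{
    omp_set_nested(1);                       /* enable nested parallel regions */

    #pragma omp parallel num_threads(2)      /* e.g. the application's level */
    {
        #pragma omp parallel num_threads(4)  /* e.g. a library's level: 2 x 4 = 8 threads */
        {
            printf("outer %d / inner %d\n",
                   omp_get_ancestor_thread_num(1),  /* thread id at nesting level 1 */
                   omp_get_thread_num());
        }
    }
    return 0;
}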

The recent discussions on this alias have talked about "schedulers", and the Ada paper talked about a "parallelism manager". I've not seen a definitive definition, so I'm mapping them onto what amounts to a thread pool plus some kind of "how do I schedule the work" manager (which looks a bit like a beefed-up OpenMP schedule clause).
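
For reference, the OpenMP schedule clause that this alludes to; a minimal sketch (the function and its arguments are placeholders):

/* Per-loop scheduling control: hand out chunks of 16 iterations to
 * whichever thread is idle, rather than a fixed static split. */
void scale(double *a, int n, double c)
{
    #pragma omp parallel for schedule(dynamic, 16)
    for (int i = 0; i < n; i++)
        a[i] *= c;
}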

Conceptually I think we can do the following.

We have a parallelism manager which has a pool of threads. Each thread could be bound to a particular hardware thread or locality group. The parallelism manager also decides how a new task is handled, and which task is picked next for execution.

A parallel program has a default manager which has a single pool of threads - which would give Cilk-like behaviour. If we encounter a parallel region, or a parallel-for, the generated tasks are assigned to the default manager.

However, we can also create a new manager, give it some threads, set up a scheduler, and then use that manager in a delineated region. This could enable us to provide the same degree of control as nested parallelism provides in OpenMP.

For example:

parallel_for(...) {...} // would use the current manager.

Or

p_manager_t pman = my_new_manager();

p_manager_t old_pman = _Use_manager(pman);

parallel_for(...) {...} // would use a new manager for this loop

_Use_manager(old_pman);

[Note: I'm not proposing this as syntax or API, just trying out the concept.]

If the above doesn't seem too outlandish, then I think we can separate the "parallelism manager" from the parallelism keywords. So we should be able to put together a separate proposal for the "manager".

This is good because the starting point proposal that Robert and I provided was based on existing Cilk/OpenMP functionality. This "manager" 
concept is less nailed down, so would presumably take a bit more refinement.

One of the other comments was about how this works on a system-wide level, where multiple applications are competing for resources. That is a concern of mine as well. But reflecting on that issue this morning, it's not that dissimilar to the current situation. We will certainly be making it easier to develop applications that request multiple threads, but the instance of Thunderbird that I'm currently running has 98 threads. I rely on the OS to mediate, or I can potentially partition the system to appropriately allocate resources. Hence I'm not convinced that we need to prioritise solving this in the general case, and potentially it becomes a separate "proposal" that works with the "manager" proposal.

Regards,

Darryl.





