[Cplex] Integrating OpenMP and Cilk into C++

Hoeflinger, Jay P jay.p.hoeflinger at intel.com
Thu Jun 20 01:04:12 CEST 2013

I'm struggling with what it means to merge OpenMP and Cilk into C++.  OpenMP and Cilk already exist outside of C++ and can both be used today in a C++ program.  Pulling the syntax or semantics of OpenMP and Cilk into C++ would, for some vendors, just mean a large effort to move code from one part of their compiler and/or runtime to another, with the result being no more than we already have today.  And today, the complexity is less because the implementations are partitioned.

The thing that we don't have today, that makes this something useful to discuss, is some way of making the OpenMP, Cilk, and C++ threading models work together.  I think that should be the focus of this effort.  I have lobbied within OpenMP for better interoperability with other threading models, but nothing has come of that yet.  Perhaps *this* effort can allow that to start happening.

The interoperability problem comes down to knowing how many threads to use for an OpenMP parallel region, and how to manage thread usage with Cilk.  For that, the scheduler needs to predict future thread usage.  One way to do that is to give the programmer some set of API routines to give the schedulers hints.  Other API routines could be used to query the current overall threading state, to allow programmers to adjust their parallelism accordingly.

So, I say we should keep OpenMP and Cilk separate: to allow C++ to be as agile as possible, reduce complexity, and allow OpenMP and Cilk to continue changing in their own organic ways, but add ways that allow them to work together with C++ threads and each other.


-----Original Message-----
From: cplex-bounces at open-std.org [mailto:cplex-bounces at open-std.org] On Behalf Of cplex-request at open-std.org
Sent: Wednesday, June 19, 2013 4:12 PM
To: cplex at open-std.org
Subject: Cplex Digest, Vol 2, Issue 19

Send Cplex mailing list submissions to
	cplex at open-std.org

To subscribe or unsubscribe via the World Wide Web, visit
or, via email, send a message with subject or body 'help' to
	cplex-request at open-std.org

You can reach the person managing the list at
	cplex-owner at open-std.org

When replying, please edit your Subject line so it is more specific than "Re: Contents of Cplex digest..."

Today's Topics:

   1. Re: Cplex: suggested topics for discussion on the next
      teleconf. (Jeffrey Yasskin)
   2. Re: Cplex: suggested topics for discussion on the next
      teleconf. (Darryl Gove)


Message: 1
Date: Wed, 19 Jun 2013 10:43:04 -0700
From: Jeffrey Yasskin <jyasskin at google.com>
Subject: Re: [Cplex] Cplex: suggested topics for discussion on the
	next teleconf.
To: Herb Sutter <hsutter at microsoft.com>
Cc: Artur Laksberg <Artur.Laksberg at microsoft.com>,
	"chandlerc at google.com" <chandlerc at google.com>,	Niklas Gustafsson
	<Niklas.Gustafsson at microsoft.com>,	"cplex at open-std.org"
	<cplex at open-std.org>
Message-ID: <CANh-dX=9tNk+mnpxz3WCoNHH=JLmkeJJLn1C_0+kc28EPC9Pjw at mail.gmail.com>
Content-Type: text/plain; charset="windows-1252"

On Wed, Jun 19, 2013 at 7:57 AM, Herb Sutter <hsutter at microsoft.com> wrote:

>  *[adding 4 folks to the To: line who are working on parallel
> "executors"/schedulers in WG21/SG1, but I'm not sure if they're on
> this list -- they may have relevant comments about the possibility of
> standardizing a cross-language scheduling layer at the C level]*

2 thoughts:

(0: your email's quoting is so confusing.)

>  As Hans wrote:
> I fully agree with the need for a common parallel runtime across
> languages. One could go even further and ask for a common runtime
> across applications, allowing applications to adapt their degree of
> parallelism to the current system load.
> Yes. IMO the three problems we're solving in the mainstream industry,
> in necessary order, are: (A) Make it possible to even express
> parallelism reliably. We've all been working on enabling that, and we're partway there.
> Only once (A) is in place do you get to the second-order problems,
> which arise only after people can and do express parallelism: (B1)
> Make it possible for multiple libraries in the same application to use
> parallelism internally without conflicting/oversubscribing/etc. =
> common intra-app scheduler. (B2) Ditto across multiple applications,
> driving the scheduling into the OS (or equivalent
> inter-app/intra-machine scheduler).
>
> It would be immensely valuable to standardize (B).
Establishing a common scheduling runtime is quite hard, since you have to cover both CPU-bound tasks that should be limited to 1-per-core, and IO-bound tasks that need many more scheduled at once. What existing examples of this do we have to learn from? Google's internal attempt has not been very successful, IMO. Are people happy with Microsoft's system of TaskCreationOptions passed to the global TaskScheduler? Are people happy with Grand Central Dispatch's global queue options? By "happy with", I mean, do they use these to control all concurrency in their systems, or do many of them create other threads manually?

>  Yes. I would love to see us undertake to first do (2) and enable
> different forms of (1) as library and language extensions, then see if
> we can standardize (1). As someone noted, there is work on (2) being
> done in WG21/SG1 right now with Google's (and collaborators') "executors" proposal.
> Should I see if those folks can join this group if they aren't on it
> already? (CC'ing three of the people on that effort.)

Google's "executors" get a lot of mileage by assuming that users with different constraints can instantiate different executors. That assumption conflicts with the idea that we'll have one scheduling library to coordinate across multiple processes. If we get a shared scheduling library, we'd likely wrap it in a set of executors, but it shouldn't itself be an executor: it needs too many options.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.open-std.org/pipermail/cplex/attachments/20130619/c3a95149/attachment-0001.html 


Message: 2
Date: Wed, 19 Jun 2013 14:11:56 -0700
From: Darryl Gove <darryl.gove at oracle.com>
Subject: Re: [Cplex] Cplex: suggested topics for discussion on the
	next teleconf.
Cc: "'chandlerc at google.com'" <chandlerc at google.com>,	Artur Laksberg
	<Artur.Laksberg at microsoft.com>,	Jeffrey Yasskin <jyasskin at google.com>,
	"cplex at open-std.org" <cplex at open-std.org>,	Niklas Gustafsson
	<Niklas.Gustafsson at microsoft.com>
Message-ID: <51C21E9C.8090801 at oracle.com>
Content-Type: text/plain; charset=windows-1252; format=flowed


This is an interesting discussion. I'd like to try to capture what I think the key points are, pulling in some of the earlier discussion on the alias.

We have some general parallelisation concepts extracted from Cilk and OpenMP: tasks, parallel for, parallel regions, reductions, etc.

In OpenMP we have a set of what might be called scheduling or execution directives, which have no direct equivalent in Cilk.

In Cilk we have composability because everything resolves down to tasks, and those tasks can be placed on a single queue; it doesn't matter who produced a task, because they all end up in the same place.
In OpenMP we have to manage composability through nested parallelism.
This gives us more control over which threads perform a task, or where that task is executed, but it becomes difficult when an application combines libraries from different sources, each with its own nested parallelism - the developer needs to manage the nesting carefully.

The recent discussions on this alias have talked about "schedulers", and the Ada paper talked about a "parallelism manager". I've not seen a precise definition of either, so I'm mapping them onto what amounts to a thread pool plus some kind of "how do I schedule the work" manager (which looks a bit like a beefed-up OpenMP schedule directive).

Conceptually I think we can do the following.

We have a parallelism manager which owns a pool of threads. Each thread could be bound to a particular hardware thread or locality group. The manager also decides how a newly created task is handled, and which task is picked next for execution.

A parallel program has a default manager which has a single pool of threads - which would give Cilk-like behaviour. If we encounter a parallel region, or a parallel-for, the generated tasks are assigned to the default manager.

However, we can also create a new manager, give it some threads, set up a scheduler, and then use that manager in a delineated region. This could enable us to provide the same degree of control as nested parallelism provides in OpenMP.

For example:

parallel_for(...) {...} // would use the current (default) manager


p_manager_t pman = my_new_manager();

p_manager_t old_pman = _Use_manager(pman); // returns the previous manager

parallel_for(...) {...} // would use the new manager for this loop

_Use_manager(old_pman); // restore the previous manager


[Note: I'm not proposing this as syntax or API, just trying out the concept.]

If the above doesn't seem too outlandish, then I think we can separate the "parallelism manager" from the parallelism keywords. So we should be able to put together a separate proposal for the "manager".

This is good because the starting point proposal that Robert and I provided was based on existing Cilk/OpenMP functionality. This "manager" 
concept is less nailed down, so would presumably take a bit more refinement.

One of the other comments was about how this works on a system-wide level, where multiple applications are competing for resources. That is a concern of mine as well. But reflecting on that issue this morning, it's not that dissimilar to the current situation. We will certainly be making it easier to develop applications that request multiple threads, but the instance of Thunderbird that I'm currently running has 98 threads. I rely on the OS to mediate, or I can potentially partition the system to appropriately allocate resources. Hence I'm not convinced that we need to prioritise solving this in the general case, and potentially it becomes a separate "proposal" that works with the "manager" proposal.



More information about the Cplex mailing list