[Cplex] Integrating OpenMP and Cilk into C++

Tom Scogland tom at scogland.com
Sun Jun 23 07:24:36 CEST 2013

On Sat, Jun 22, 2013 at 12:10 AM, Nelson, Clark <clark.nelson at intel.com> wrote:

> > In my opinion, if we do not specify a scheduler interface as part of this
> > effort, we will have done nothing worth doing.  Neither Cilk nor OpenMP are
> > useful without a runtime managing concurrency, scheduling work and (in some
> > fashion) allowing the user to control concurrency.  Either one can be used
> > without one, but then all that's left is an overly verbose serial program
> > (well, with better than average SIMD usage, but I digress).  I am personally
> > not interested in specifying a parallel language extension with no standard
> > way to control its behavior.
> I'm sure you're right -- from your perspective. I absolutely believe that you,
> personally, would benefit in no way.
> However, I'm pretty confident that there are quite a few less sophisticated
> programmers in the world who would benefit considerably from having an
> easier way to write a parallel program than by using pthreads, and an easier
> way of writing a scalable, composable parallel program than by using OpenMP;
> and they would benefit even more if that way were in some way standard.

> BTW, when you talk about a "standard way to control its behavior", do you mean
> "control" in any broader sense than would be covered by "tune"?

"Tune" to me implies an alteration that improves performance by some metric
without affecting correctness.  Changing the number of threads available to
the program as a whole, requesting a specific minimum chunk size for a
loop, limiting a section of code to a specific number of threads, or
specifying an alternative scheduling scheme could all reasonably be called
"tuning".  In that sense, tuning is sufficient, but I have a feeling that is
not the meaning you had in mind.
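
To make that notion of tuning concrete, here is a minimal sketch using
standard OpenMP clauses.  The function name and the particular values (4
threads, chunks of 64) are arbitrary choices for illustration.  Because the
reduction is over integers, the result should be the same whether the
pragma is honored or ignored by the compiler, which is the sense in which
these knobs leave correctness alone.

```cpp
#include <cstdint>
#include <vector>

// Sums the squares of `values`.  The clauses on the pragma are tuning knobs:
// num_threads(4) limits this loop to at most 4 threads, and
// schedule(dynamic, 64) hands out iterations in chunks of 64.
// Without OpenMP enabled the pragma is ignored and the loop runs serially.
std::int64_t sum_of_squares(const std::vector<std::int64_t>& values) {
    std::int64_t sum = 0;
    const long n = static_cast<long>(values.size());
    #pragma omp parallel for num_threads(4) schedule(dynamic, 64) reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += values[i] * values[i];
    }
    return sum;
}
```

Changing or deleting any of the clauses alters how the work is distributed,
not what the function returns.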

> > In that sense I believe we need to specify *something* as a standard
> > scheduler interface. The question at hand is not necessarily whether we
> > should attempt to make *a single scheduler* which is perfect for all
> > occasions, a nigh impossible task on the best of days, but rather a standard
> > interface and mechanism for composing schedulers within an application,
> > probably a standard library interface of some sort.  In that fashion we
> > allow users and runtime implementers to define schedulers which fit their
> > needs, but still incorporate into the standard system.
> I wholeheartedly support this idea. But I feel it's a separate and deeper
> topic (and in a lot of ways more interesting :-) than the simple ability to
> express what I'll call opportunistic parallelism, as in a program that can
> take advantage of more than one processor, which can still be useful even if
> it isn't tuned to within an inch of its life.
> Please note that I *didn't* say that this separate and deeper topic should
> be given lower priority than the other.

I actually agree with you on this point, in that expressing parallelism is
a simpler problem to which we already have a couple of reasonable solutions
on the table.  The big issue is that I don't think we can reasonably
release such an extension unless it can interoperate sensibly with the
low-level alternatives, especially threading at the existing C11 level.

In the current proposal I see no way for even an expert programmer to
write a library on top of C11 threads, pthreads, or the like that composes
sensibly with the extension.  There isn't even a way for such a library to
query the number of threads in use by the system.  For that matter, given a
threaded application that wants to call a library using the new extension,
how is it meant to convey the amount of parallelism the library should use,
let alone give it existing thread resources to run on?

I have no issue with providing a sensible default for when a user does not
care, but there has to be a way for library writers and experts to
interface with the new runtime intelligently, without being forced to
reimplement everything in terms of the new extension.  At a minimum that
means offering hooks to set properties such as the number of threads and
the binding of threads, and to query those same values.  Given that
interface, we could at least introduce this without clobbering everyone
else; that is acceptable but sub-optimal territory as far as I'm concerned.
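
As a rough sketch of what such set/query hooks might look like, consider
the following.  Everything in it is hypothetical: the `parallel_rt`
namespace, the `Binding` values, and `workers_for` are made-up names for
illustration and do not come from the proposal, OpenMP, or Cilk.

```cpp
#include <algorithm>
#include <cstddef>

// HYPOTHETICAL interface, invented for illustration only.
namespace parallel_rt {

enum class Binding { none, close, spread };

struct Settings {
    unsigned num_threads = 1;         // threads the runtime may use
    Binding binding = Binding::none;  // how threads are pinned to cores
};

// Process-wide settings for brevity; a real runtime would need to make
// this thread-safe and probably scoped.
inline Settings& current() {
    static Settings s;
    return s;
}

inline void set_num_threads(unsigned n) { current().num_threads = std::max(1u, n); }
inline unsigned get_num_threads() { return current().num_threads; }
inline void set_binding(Binding b) { current().binding = b; }
inline Binding get_binding() { return current().binding; }

}  // namespace parallel_rt

// A library built on plain threads can size itself to the runtime's budget
// instead of guessing: never more workers than the runtime allows, never
// more workers than work items, and always at least one.
unsigned workers_for(std::size_t work_items) {
    const std::size_t budget = parallel_rt::get_num_threads();
    return static_cast<unsigned>(
        std::max<std::size_t>(1, std::min(budget, work_items)));
}
```

With something of this shape, an application could call
`parallel_rt::set_num_threads(8)` once, and a pthreads-based library could
query the same value rather than spawning its own unrelated pool.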

Perhaps that's what we should seek to do in this group, and create another
group to look into the issue of composing schedulers, although I'm not sure
what it would necessarily accomplish.  This group needs a runtime interface
to rely upon so that it can specify a standard ABI for enqueueing tasks
etc., and the scheduling interface would serve that purpose if developed.

For that matter, I'm not sure how much more effort would really be
required.  There is already an interface defined for passing tasks to a
scheduler, and another (even if it's presently hidden) for specifying the
resources that scheduler is allowed to use.  To use a well established
example, in OpenMP parallelism is managed in a hierarchy, or a scope if you
will.  If you run a work-sharing loop in a parallel region with 8 threads,
it will spread across exactly those 8 threads, even if at an outer parallel
region there are 80 available.  Effectively each parallel region is a
scheduling scope.  Would it be so bad to allow the specification of a
(potentially user-defined) scheduler and resources (number of threads etc.)
for each region?  When I say scheduler here, I mean something like the
"executor" discussed in another thread on this list.  At the present
moment, I can't think of any issue I have that could not be solved by some
combination or implementation of those options.
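
One way to picture a region that carries its own scheduler and resources is
the sketch below.  `Executor`, `WaveExecutor`, and `Region` are hypothetical
names invented here, not part of OpenMP, Cilk, or the executor proposal
mentioned above, and clamping a nested region to its parent's thread count
is an assumption of this sketch rather than anything specified.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// HYPOTHETICAL scheduler interface, invented for illustration only.
struct Executor {
    virtual ~Executor() = default;
    // Run every task to completion using at most `max_threads` threads.
    virtual void run(std::vector<std::function<void()>> tasks,
                     unsigned max_threads) = 0;
};

// A deliberately naive user-defined scheduler: launch tasks in waves of at
// most `max_threads` threads and join each wave before starting the next.
struct WaveExecutor : Executor {
    void run(std::vector<std::function<void()>> tasks,
             unsigned max_threads) override {
        const std::size_t width = std::max(1u, max_threads);
        for (std::size_t begin = 0; begin < tasks.size(); begin += width) {
            const std::size_t end = std::min(tasks.size(), begin + width);
            std::vector<std::thread> wave;
            for (std::size_t i = begin; i < end; ++i) wave.emplace_back(tasks[i]);
            for (std::thread& t : wave) t.join();
        }
    }
};

// A region is a scheduling scope: a scheduler plus the resources it may use.
// Assumption of this sketch: a nested region is never granted more threads
// than its parent holds.
class Region {
public:
    Region(Executor& exec, unsigned threads) : exec_(exec), threads_(threads) {}
    Region nested(Executor& exec, unsigned requested) const {
        return Region(exec, std::min(requested, threads_));
    }
    unsigned threads() const { return threads_; }
    void run(std::vector<std::function<void()>> tasks) const {
        exec_.run(std::move(tasks), threads_);
    }
private:
    Executor& exec_;
    unsigned threads_;
};
```

In these terms, an outer `Region` holding 80 threads could hand out a nested
region of 8, and work submitted to the inner region would run on at most
those 8, each region free to name a different `Executor`.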
-Tom Scogland
"A little knowledge is a dangerous thing.
 So is a lot."
-Albert Einstein