[Cplex] Cplex: suggested topics for discussion on the next teleconf.

Tom Scogland tom.scogland at gmail.com
Wed Jun 19 00:12:35 CEST 2013


Jeff, have you read the C11 extensions? C11 already has the majority of Pthreads in it, and several primitives from the assembly list as well.
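
As a rough illustration of what C11 already offers, here is a minimal sketch that uses `<threads.h>` in place of Pthreads and `<stdatomic.h>` in place of an inline-assembly atomic add. It assumes an implementation that actually ships `<threads.h>` (the header is optional in C11); the names `worker`, `run_workers`, `WORKERS`, and `INCREMENTS` are made up for this example.

```c
#include <assert.h>     /* static_assert (C11) */
#include <stdatomic.h>  /* atomic_int, atomic_fetch_add */
#include <stddef.h>     /* NULL */
#include <threads.h>    /* thrd_create, thrd_join (optional in C11) */

/* Hypothetical sizes, chosen only for the example. */
enum { WORKERS = 4, INCREMENTS = 1000 };
static_assert(WORKERS > 0, "need at least one worker");

/* Shared counter updated with a C11 atomic instead of inline assembly. */
static atomic_int counter;

static int worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++)
        atomic_fetch_add(&counter, 1);
    return 0;
}

/* Start the workers, join them, and return the final counter value.
 * If every thread starts, the result is WORKERS * INCREMENTS. */
int run_workers(void)
{
    thrd_t threads[WORKERS];
    int started = 0;

    atomic_store(&counter, 0);
    for (int i = 0; i < WORKERS; i++) {
        if (thrd_create(&threads[i], worker, NULL) != thrd_success)
            break;  /* stop launching; join the ones that did start */
        started++;
    }
    for (int i = 0; i < started; i++)
        thrd_join(threads[i], NULL);

    return atomic_load(&counter);
}
```

Calling `run_workers()` from `main` is enough to exercise it; on some older C libraries the program may need to be linked with the platform's thread library.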


Timothy, all true in terms of static parallelism. I would argue with you, however, on the point that OpenMP dynamically distributes only task constructs. It also dynamically distributes loop iterations if any non-static schedule is selected, and the thread ordering in that case is unspecified. Regardless, it is certainly lower level, in that it allows you to do things one frequently needs to do to get performance, but if one does not wish to specify the number of threads, one need not do so. It would be completely reasonable to write an OpenMP program that never specifies the mapping of anything at all to threads, as Cilk does; the fact that few people deign to do that says something, in my opinion.
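
A minimal sketch of the kind of program I mean, assuming a compiler with OpenMP support (for example, built with an OpenMP flag such as `-fopenmp`). The function name `sum_squares` is hypothetical. Nothing here names a thread count or an iteration-to-thread mapping.

```c
#include <assert.h>

/* Sum of i*i for i in [0, n).
 * schedule(dynamic) lets the runtime hand iterations to whichever thread
 * is free, so which thread runs which iteration is unspecified.
 * No num_threads clause and no thread IDs appear anywhere.
 * If the compiler ignores the pragma, the loop simply runs serially
 * and produces the same result. */
long sum_squares(int n)
{
    assert(n >= 0);

    long total = 0;
    #pragma omp parallel for schedule(dynamic) reduction(+:total)
    for (int i = 0; i < n; i++)
        total += (long)i * i;
    return total;
}
```

The reduction clause is what keeps the result independent of how the iterations end up being distributed.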



Robert, in what sense? OpenMP guarantees that if you get parallelism, it will tell you how much. That does not make it mandatory.
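
To sketch what "it will tell you how much" looks like in practice: inside a parallel region, `omp_get_num_threads()` reports the size of the team the runtime actually provided, which may legally be 1. The helper names `observed_team_size` and `has_enough_threads` are made up for this example, and the `_OPENMP` guards are there only so the code still builds when the pragmas are ignored.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Enter a parallel region and report how many threads the runtime
 * actually provided. Falls back to 1 when OpenMP is not enabled. */
int observed_team_size(void)
{
    int size = 1;
    #pragma omp parallel
    {
        /* Exactly one thread of the team records the team size. */
        #pragma omp single
        {
#ifdef _OPENMP
            size = omp_get_num_threads();
#endif
        }
    }
    return size;
}

/* A concurrent (not merely parallel) algorithm that needs `needed`
 * threads should check before relying on them. */
int has_enough_threads(int needed)
{
    assert(needed >= 1);
    return observed_team_size() >= needed;
}
```

A program that requires, say, a producer and a consumer running at the same time could call `has_enough_threads(2)` and fall back to a serial path when it returns 0.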


If I understand the argument for Cilk properly, it is that throwing tasks into the ether across the entire program lets the scheduler make better overall decisions. In several ways, that's true, but why is Cilk the model for this? There are several related efforts to create "task-based" parallel systems of a similar character. See OmpSs, and qthreads with its Rose OpenMP frontend, for examples. They have, as far as I can tell, every bit as much composability as Cilk while allowing for the management of concurrency as well.


At least in my mind, though, composability requires at least one more level. A user should be able to limit the scope of the concurrency of their program, and preferably of libraries they call as well. The example that jumps to mind is a NUMA machine with four dual-core CPUs spawning eight threads, allocating memory on each memory node, and spawning tasks to work on that data. I want a way to tell the system to limit tasks to being run on threads on each die. Will that always be right? No, but it needs to be an option. If it isn't, then where are we? We're stuck with OpenCL before the fission extension, where all cores get used regardless of what the user wants, and that's not a good place for CPUs.


As Jeff said before me, I am probably making people angry saying this, but I consider Cilk incomplete as an extension for parallelism. Its runtime is efficient and well designed, but leaving concurrency uncontrollable via the extension is an issue.



—
Tom Scogland

On Tue, Jun 18, 2013 at 11:16 PM, Jeff Hammond <jhammond at alcf.anl.gov>
wrote:

> It is also worth thinking about why people use C instead of C++ in the
> same manner as we think about HPC vs. non-HPC w.r.t. low level.
>
> At least among the people I know, which includes HPC jocks and
> scientists but no one from commercial ISVs or other
> finite-billion-dollar industries, people use C either because they are
> too dumb to use C++ (or know they'll blow their leg off if they do) or
> because they are too smart to use C++ (because they want to
> reimplement all the features of C++ themselves e.g. MPICH and PETSc).
>
> My point is that, if Cplex is C-centric, we should be attentive to
> what C programmers want and not what C++ programmers want.  I don't
> really know that much about Cilk but it seems - like TBB - to be for
> the C++ programmers of the world who do not like the bare-metal
> perspective of C.  OpenMP is a good match for Fortran or Fortran-like
> programs written in C/C++.  The way I write C for parallel programs is
> to use Pthreads and inline assembly or language extensions that are
> syntactic sugar for inline assembly; I do not want all sorts of fancy
> tasking models pre-loaded on a silver spoon for me.
>
> My hope is that Cplex can provide a mechanism for C programs to
> achieve what they can only today achieve with non-portable language
> constructs like inline assembly.  I see absolutely no value in putting
> into the language that which is already provided for by libraries like
> Pthreads unless it is provable that compiler interaction provides a
> measurable _performance_ benefit.
>
> It will not surprise me if my position pisses off 110% of the people
> who read this email, but this is how I see the world.  Lots of people
> jumped on the discussion of SIMD yet it seems that is not the priority
> so far.
>
> Finally, I am reminded of what a friend and colleague said to me many
> years ago about C: "It has all the performance of assembly language
> but with all the expressivity of assembly language."
>
> Best,
> Jeff
>
> On Tue, Jun 18, 2013 at 3:56 PM, Mattson, Timothy G
> <timothy.g.mattson at intel.com> wrote:
>> Composability is great and very important, but not if it can’t be provided
>> without killing performance.
>>
>>
>>
>> My experience with the HPC community is that they perceive that Cilk does
>> not deliver adequate performance on HPC applications.  The comment I often
>> hear is that people do not know of anyone using Cilk in production HPC
>> applications.  The reason given is that on HPC applications you need to
>> design your algorithms around data movement which causes HPC programmers to
>> adopt an SPMD style of programming with OpenMP.  Therefore, the HPC
>> community would not be happy if the Cilk style of multithreading came to
>> dominate.
>>
>>
>>
>> Note very carefully my choice of words … the “perception” exists.  I
>> personally believe that for HPC applications, this perception is based on
>> experience and deserves to be taken very seriously.  But at the same time, I
>> am more than willing to entertain the idea that this perception is incorrect
>> and with adequate education, the HPC community would embrace the Cilk style
>> of multithreading.  IF that is the case, however, we have a lot of work to
>> do to educate the HPC software community.
>>
>>
>>
>> Of course, it may be that we don’t really care about supporting the HPC
>> community.  They will always be my primary community of concern, but I am
>> the first to admit that in the overall scheme of things, HPC is a small
>> market and we may legitimately decide that a solution they won’t use is OK.
>>
>>
>>
>> --Tim
>>
>>
>>
>>
>>
>>
>>
>> From: Geva, Robert
>> Sent: Tuesday, June 18, 2013 1:44 PM
>> To: Mattson, Timothy G; Robison, Arch; Tom Scogland; Nelson, Clark
>> Cc: cplex at open-std.org
>> Subject: RE: [Cplex] Cplex: suggested topics for discussion on the next
>> teleconf.
>>
>>
>>
>> I think that this clarification from Tim as well as the other one from Arch
>> explain the “mandatory” characteristic of OpenMP.
>>
>> My question would be, can we specify a language that allows both this
>> behavior and also composability, or do we have to pick one?
>>
>>
>>
>> Robert.
>>
>>
>>
>> From: cplex-bounces at open-std.org [mailto:cplex-bounces at open-std.org] On
>> Behalf Of Mattson, Timothy G
>> Sent: Tuesday, June 18, 2013 1:39 PM
>> To: Robison, Arch; Tom Scogland; Nelson, Clark
>> Cc: cplex at open-std.org
>> Subject: Re: [Cplex] Cplex: suggested topics for discussion on the next
>> teleconf.
>>
>>
>>
>> And just to be absolutely clear … once the OpenMP runtime decides how many
>> threads a program will get, it is required to keep that number of threads
>> available throughout the parallel region.  In static mode, if the programmer
>> doesn’t ask for a different number of threads, the system is required to
>> make that same number of threads available in subsequent parallel regions.
>> Not only that, it is required to keep the threadprivate variables and their
>> mapping onto thread IDs between parallel regions.
>>
>>
>>
>> This is important since it means you can use OpenMP for concurrent
>> algorithms, not just parallel algorithms.  You just have to be careful that
>> after you enter the first parallel region, you check that you got enough
>> threads to support your concurrent algorithm (since as Arch pointed out, a
>> system can legally decide that you only get one thread).
>>
>>
>>
>> OpenMP exposes the threads and it is an explicit API.  Other than the task
>> construct, how constructs map onto explicit threads is exposed.  In many
>> ways, this makes OpenMP a much lower level programming model than Cilk.  For
>> folks in the HPC market, this lower level control is perceived to be very
>> important and a big part of why in HPC, OpenMP is dramatically more popular
>> than Cilk or TBB (not surprisingly, outside of HPC it’s probably the other
>> way around).
>>
>>
>>
>> --Tim
>>
>>
>>
>> From: cplex-bounces at open-std.org [mailto:cplex-bounces at open-std.org] On
>> Behalf Of Robison, Arch
>> Sent: Tuesday, June 18, 2013 1:27 PM
>> To: Tom Scogland; Nelson, Clark
>> Cc: cplex at open-std.org
>> Subject: Re: [Cplex] Cplex: suggested topics for discussion on the next
>> teleconf.
>>
>>
>>
>> Technically, no parallelism is mandatory in OpenMP since an implementation
>> is allowed to limit the number of threads to 1.  (“Algorithm 2.1” in the
>> OpenMP standard has details.)  Though an implementation that does that is
>> unlikely to succeed in the marketplace.
>>
>>
>>
>> What is effectively mandatory in OpenMP is that when a parallel construct is
>> entered, the system must figure out how many threads (thread = stack +
>> threadprivate memory) to allocate to that construct, using Algorithm 2.1.
>> The “how many threads” value must be computed before the amount of work
>> inside the construct is known.  This is both a boon to high performance in
>> the hands of programmers who control the entire program, and a bane to
>> programmers who don’t have that control and want composability.
>>
>>
>>
>> In contrast, Cilk implementations really do pick and choose, at runtime,
>> which potential parallelism should be converted to real parallelism to keep
>> the machine busy, but not oversubscribed.
>>
>>
>>
>> - Arch
>>
>>
>>
>> From: cplex-bounces at open-std.org [mailto:cplex-bounces at open-std.org] On
>> Behalf Of Tom Scogland
>> Sent: Tuesday, June 18, 2013 11:47 AM
>> To: Nelson, Clark
>> Cc: cplex at open-std.org
>> Subject: Re: [Cplex] Cplex: suggested topics for discussion on the next
>> teleconf.
>>
>>
>>
>> Clark, your answer here, as well as some statements in the call yesterday,
>> imply that you believe there is a case where parallelism is mandatory in
>> OpenMP applications.  To my understanding, there is no such case for a
>> conforming application; is there one I am missing?
>>
>>
>>
>> To the point of the pragmas being "hints" however, I completely agree.
>> While both the Cilk and OpenMP constructs can be safely ignored as a whole,
>> if one is honored, they all must be or correctness is highly unlikely.
>>
>>
>>
>> On Tue, Jun 18, 2013 at 6:20 PM, Nelson, Clark <clark.nelson at intel.com>
>> wrote:
>>
>>> Are we defining compiler instructions or hints?: The proposal points out
>>> that Cilk uses keywords, whilst OpenMP is based on pragmas, so presumably
>>> OpenMP is providing 'hints' to the compiler that it is free to accept, or
>>> not. If one of our objectives is to support both approaches, does this
>>> mean that the Cilk-like keywords also have to be treated as 'optional',
>>> at least in the sense of having an agreed semantics expressible in the
>>> language without the parallelism extensions?
>>
>> That's a very good question. The answer is kind of subtle; it depends on
>> *exactly* what is considered "optional".
>>
>> The semantics of Cilk are defined such that actual parallel execution is
>> never guaranteed/mandatory, so all parallelism could be considered optional;
>> in that sense even the Cilk keywords could be considered to be hints.
>>
>> But from a different perspective, the keywords are interpreted as a
>> guarantee by the programmer that it is safe to do certain things in
>> parallel;
>> in other words, that the compiler is free to transform things in a way that
>> might otherwise cause undefined behavior. In that sense, the keywords are
>> semantically significant: putting one in the wrong place causes a program
>> to be broken.
>>
>> The fact that OpenMP constructs are expressed as pragmas does not mean that
>> any OpenMP pragma can be taken as a hint when a program is interpreted as an
>> OpenMP program. The intention is that it is OK to ignore all of them, but if
>> any of them are honored, they should all be. So it can also be misleading to
>> call OpenMP pragmas hints.
>>
>> But it's very important to keep in mind the exact sense in which any given
>> construct is or is not a hint.
>>
>> Clark
>>
>> _______________________________________________
>> Cplex mailing list
>> Cplex at open-std.org
>> http://www.open-std.org/mailman/listinfo/cplex
>>
>>
>>
>>
>>
>> --
>> -Tom Scogland
>>
>> http://tom.scogland.com
>> "A little knowledge is a dangerous thing.
>>  So is a lot."
>> -Albert Einstein
>>
>>
>>
> -- 
> Jeff Hammond
> Argonne Leadership Computing Facility
> University of Chicago Computation Institute
> jhammond at alcf.anl.gov / (630) 252-5381
> http://www.linkedin.com/in/jeffhammond
> https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
> ALCF docs: http://www.alcf.anl.gov/user-guides