mpich-intel.git
15 months agohwloc: null check for cpuset
Kavitha Tiptur Madhu [Tue, 30 Jan 2018 17:20:16 +0000]
hwloc: null check for cpuset

node_split_processor method queries for the hwloc object
covering the cpuset of a process. The returned object could be null
if the cpuset is empty or is not covered by the root of the hwloc
topology object. Hence, a null check on the returned hwloc object
of the binding query needs to be performed.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

15 months agocoll: TSP collectives framework
Akhil Langer [Tue, 26 Dec 2017 01:54:34 +0000]
coll: TSP collectives framework

This infrastructure allows for C++-template style definition of
collective algorithms, where an algorithm is written using MPIR_TSP
functions.  Then the algorithm can be instantiated for specific
transports (either generic or device-specific) to form
transport-specific algorithm functions.
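
A minimal sketch of the template-style idea in plain C, assuming a
macro-based instantiation: the algorithm is written once against
placeholder names and then stamped out per transport. Every name below
(`stub_sched`, `log_sched`, `DEFINE_LINEAR_BCAST`) is hypothetical and is
not the MPIR_TSP API.

```c
#include <assert.h>

/* Two hypothetical transports that expose the same shape of API. */
struct stub_sched { int sends; };
static void stub_send(struct stub_sched *s, int dest)
{
    (void) dest;                /* this transport only counts operations */
    s->sends++;
}

struct log_sched { int last_dest; int sends; };
static void log_send(struct log_sched *s, int dest)
{
    s->last_dest = dest;        /* this transport also records the target */
    s->sends++;
}

/* The algorithm "template": written once against the placeholders
 * SCHED_T and SEND, in the way a C++ template is written against a
 * type parameter. */
#define DEFINE_LINEAR_BCAST(FN, SCHED_T, SEND)              \
    static void FN(SCHED_T *sched, int root, int size)      \
    {                                                       \
        assert(root >= 0 && root < size);                   \
        for (int rank = 0; rank < size; rank++) {           \
            if (rank != root)                               \
                SEND(sched, rank);                          \
        }                                                   \
    }

/* Instantiate the same algorithm for each transport. */
DEFINE_LINEAR_BCAST(stub_linear_bcast, struct stub_sched, stub_send)
DEFINE_LINEAR_BCAST(log_linear_bcast, struct log_sched, log_send)
```

Each instantiation yields an ordinary C function, so callers pick a
transport simply by calling the matching function name.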

Signed-off-by: Pavan Balaji <balaji@anl.gov>

Co-authored-by: Pavan Balaji <balaji@anl.gov>
Co-authored-by: Rubasri Kalidas <rubasri.kalidas@intel.com>
Co-authored-by: Alexander Sannikov <alexander.sannikov@intel.com>
Co-authored-by: Wesley Bland <wesley.bland@intel.com>

15 months agoUse MPIR_<coll> instead of MPID_<coll>
Pavan Balaji [Thu, 25 Jan 2018 05:20:39 +0000]
Use MPIR_<coll> instead of MPID_<coll>

Using the MPIR_ version of the collectives, instead of the MPID_
version, respects the MPIR_CVAR_DEVICE_COLLECTIVES and
MPIR_CVAR_<coll>_DEVICE_COLLECTIVE CVARs.
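
The layering can be sketched as follows, assuming a single boolean that
stands in for the CVARs. The function names and the `device_collectives_enabled`
flag are hypothetical; the sketch only shows why calling the wrapper,
rather than the device function directly, lets the setting take effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stands in for a control variable; hypothetical name. */
static bool device_collectives_enabled = true;

enum path { PATH_DEVICE, PATH_GENERIC };

/* Hypothetical device-level override. */
static enum path device_bcast(int *buf, int count)
{
    (void) buf;
    (void) count;
    return PATH_DEVICE;
}

/* Hypothetical device-independent implementation. */
static enum path generic_bcast(int *buf, int count)
{
    (void) buf;
    (void) count;
    return PATH_GENERIC;
}

/* Wrapper: the only place that consults the setting. Internal callers
 * that go through it automatically respect the user's choice. */
static enum path wrapper_bcast(int *buf, int count)
{
    assert(buf != NULL && count >= 0);
    if (device_collectives_enabled)
        return device_bcast(buf, count);
    return generic_bcast(buf, count);
}
```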

Signed-off-by: Wesley Bland <wesley.bland@intel.com>

15 months agoch3: warning squash
Pavan Balaji [Tue, 23 Jan 2018 18:19:44 +0000]
ch3: warning squash

Signed-off-by: Wesley Bland <wesley.bland@intel.com>

15 months agoupdate copyright years for all files
Pavan Balaji [Fri, 19 Jan 2018 20:34:45 +0000]
update copyright years for all files

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agomaint: improve copyright checker
Pavan Balaji [Fri, 19 Jan 2018 17:48:02 +0000]
maint: improve copyright checker

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoMove MPIR_Localcopy function to misc folder
Akhil Langer [Thu, 18 Jan 2018 00:18:46 +0000]
Move MPIR_Localcopy function to misc folder

This function is used at various places for purposes other than
collectives as well. Move it to misc folder.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

15 months agocoll: Add copyright to the Makefiles
Akhil Langer [Thu, 18 Jan 2018 00:01:07 +0000]
coll: Add copyright to the Makefiles

Signed-off-by: Pavan Balaji <balaji@anl.gov>

15 months agocoll: minor function renames
Pavan Balaji [Wed, 17 Jan 2018 23:18:29 +0000]
coll: minor function renames

Signed-off-by: Akhil Langer <akhil.langer@intel.com>

15 months agocoll: white space changes
Pavan Balaji [Wed, 17 Jan 2018 23:14:01 +0000]
coll: white space changes

Signed-off-by: Akhil Langer <akhil.langer@intel.com>

15 months agocoll: improve comment wording
Pavan Balaji [Wed, 17 Jan 2018 23:26:21 +0000]
coll: improve comment wording

Signed-off-by: Akhil Langer <akhil.langer@intel.com>

15 months agompl: Add utility functions to mpl_math.h
Akhil Langer [Tue, 26 Dec 2017 19:12:43 +0000]
mpl: Add utility functions to mpl_math.h

Adds MPL_ilog, MPL_ipow, MPL_getdigit, MPL_setdigit fns
to mpl_math.h

Signed-off-by: Pavan Balaji <balaji@anl.gov>

15 months agoAdd debug class for collectives
Akhil Langer [Wed, 15 Nov 2017 18:29:20 +0000]
Add debug class for collectives

A new MPL_dbg_class, MPIR_DBG_COLL, has been added

Signed-off-by: Pavan Balaji <balaji@anl.gov>

15 months agocoll: Add Ibcast prefix to ibcast util functions
Akhil Langer [Wed, 17 Jan 2018 19:09:07 +0000]
coll: Add Ibcast prefix to ibcast util functions

So that they do not conflict with other utility functions.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

15 months agocoll: Change name of ibcast status struct
Akhil Langer [Wed, 17 Jan 2018 16:21:10 +0000]
coll: Change name of ibcast status struct

Change MPII_Ibcast_status to MPII_Ibcast_state. Also change its
instantiations to be named state instead of status. This is being
done so as to not confuse this with MPI_Status.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

15 months agompl: Add collectives memory class
Wesley Bland [Wed, 8 Nov 2017 21:01:10 +0000]
mpl: Add collectives memory class

Add a memory class for tracking memory allocated for collectives

Signed-off-by: Pavan Balaji <balaji@anl.gov>

15 months agoFix .sln line endings
Wesley Bland [Mon, 27 Nov 2017 19:27:32 +0000]
Fix .sln line endings

The line endings in the Windows solutions files don't appear to even be
regular Windows line endings. Convert them to the standard CRLF format.

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agohwloc: upgrade to hwloc-2.0.0rc2.
Kavitha Tiptur Madhu [Fri, 19 Jan 2018 17:31:43 +0000]
hwloc: upgrade to hwloc-2.0.0rc2.

Update the hwloc submodule to point to 2.0.0rc2.

Cache detection has changed from the hwloc-1.x series to the hwloc-2.x
series.  This patch makes the necessary changes to deal with that
change.  Specifically, the base cache object type HWLOC_OBJ_CACHE has
been removed in hwloc v2.0.0.  It has been replaced with L1-L5
d/u-cache and L1-L3 icache objects.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

15 months agohydra: remove replicate membind option
Kavitha Tiptur Madhu [Fri, 19 Jan 2018 16:44:56 +0000]
hydra: remove replicate membind option

The "replicate" option of membind is not well supported on a number of
operating systems, and has been removed from hwloc (in v2.0) as well.
This patch gets rid of that option in Hydra, in preparation for an
upgrade to hwloc-2.0.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

15 months agohwloc: remove io devices filter
Kavitha Tiptur Madhu [Tue, 23 Jan 2018 21:03:58 +0000]
hwloc: remove io devices filter

IO device filters for comm_split_type are not currently supported.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

15 months agodatatype: Ensure MPIR_Type_get_contents goes in libpmpi
Ken Raffenetti [Mon, 29 Jan 2018 15:05:00 +0000]
datatype: Ensure MPIR_Type_get_contents goes in libpmpi

For builds without weak symbols, we need to make sure internal
functions like MPIR_Type_get_contents end up in libpmpi.

Fixes pmodels/mpich#2936.

Signed-off-by: Wesley Bland <wesley.bland@intel.com>

15 months agocomm: Convert char to int type for boolean values
Ken Raffenetti [Mon, 29 Jan 2018 22:12:57 +0000]
comm: Convert char to int type for boolean values

Using char for these values requires casts to make the compiler
happy, and any savings would be minimal. Just use int for simplicity.

Signed-off-by: Min Si <msi@anl.gov>

15 months agocomm: Remove anonymous union/struct in MPIR_Comm
Ken Raffenetti [Mon, 29 Jan 2018 21:58:45 +0000]
comm: Remove anonymous union/struct in MPIR_Comm

Partial revert of [a62a1748be01]. Fixes pmodels/mpich#2937.

Signed-off-by: Min Si <msi@anl.gov>

15 months agoch4: Squash uninitialized warning
Wesley Bland [Tue, 23 Jan 2018 18:44:59 +0000]
ch4: Squash uninitialized warning

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoch4/ofi: Squash uninitialized variable warning
Wesley Bland [Tue, 23 Jan 2018 18:27:15 +0000]
ch4/ofi: Squash uninitialized variable warning

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoch4/ofi: Change am buffer allocation to malloc
Wesley Bland [Tue, 23 Jan 2018 18:22:15 +0000]
ch4/ofi: Change am buffer allocation to malloc

Despite efforts to preserve the dynamic array declaration, the
compiler still doesn't like using `MPIDI_Global.max_buffered_send` as
the basis for the size of the buffer. Since this buffer is only used
for active messages (which are not as performance critical as the
non-am path), allocate it with malloc instead.

This reverts commit cfd716260d4d1579691cbc8fa1d5bb0348663c45.

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoch4/ofi: Enable scalable endpoints for psm2
Hajime Fujita [Wed, 7 Jun 2017 20:34:06 +0000]
ch4/ofi: Enable scalable endpoints for psm2

Scalable endpoints support in the psm2 provider has been implemented
and upstreamed. This patch enables scalable endpoints for psm2.

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoch4/stubshm: Remove duplicate startall definition
Alexey Malkhanov [Thu, 25 Jan 2018 15:07:50 +0000]
ch4/stubshm: Remove duplicate startall definition

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoch4/stubshm: Change count type to MPI_Aint
Alexey Malkhanov [Thu, 25 Jan 2018 15:06:50 +0000]
ch4/stubshm: Change count type to MPI_Aint

Missed this component in [de2d85cabc57].

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoch4/nm: Add missing collective definitions
Alexey Malkhanov [Thu, 25 Jan 2018 15:06:19 +0000]
ch4/nm: Add missing collective definitions

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoch4/nm: Fix request type in function signatures
Alexey Malkhanov [Thu, 25 Jan 2018 15:05:39 +0000]
ch4/nm: Fix request type in function signatures

Follow up commit to [b382680c641e].

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoerrhan: Include system headers later
Ken Raffenetti [Wed, 24 Jan 2018 16:30:08 +0000]
errhan: Include system headers later

System header files must be included after mpl.h and mpichconf.h in
case of macros defined to enable/disable features (e.g. _GNU_SOURCE).

No reviewer.

15 months agoch4/ucx: Update submodule
Ken Raffenetti [Wed, 24 Jan 2018 02:37:08 +0000]
ch4/ucx: Update submodule

Update to v1.2.2 of UCX.

15 months agohwloc: Update submodule
Ken Raffenetti [Thu, 18 Jan 2018 21:52:29 +0000]
hwloc: Update submodule

Signed-off-by: Pavan Balaji <balaji@anl.gov>

15 months agopm/hydra: Update embedded hwloc settings
Ken Raffenetti [Thu, 18 Jan 2018 16:27:48 +0000]
pm/hydra: Update embedded hwloc settings

Use the recommended method from
https://www.open-mpi.org/projects/hwloc/doc/v1.11.8/a00304.php.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

15 months agoconfig: Update embedded hwloc settings
Ken Raffenetti [Fri, 12 Jan 2018 23:05:40 +0000]
config: Update embedded hwloc settings

Based on the instructions here:
https://www.open-mpi.org/projects/hwloc/doc/v1.11.8/a00304.php

This method avoids polluting CFLAGS and CPPFLAGS when configuring
other sub-projects, including a known build issue when embedding the
UCX library.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

15 months agoch3/nemesis: Include system headers later
Ken Raffenetti [Tue, 23 Jan 2018 15:39:25 +0000]
ch3/nemesis: Include system headers later

System header files must be included after mpl.h and mpichconf.h in
case of macros defined to enable/disable features (e.g. _GNU_SOURCE).

Signed-off-by: Akhil Langer <akhil.langer@intel.com>

15 months agoch4/ofi: Change max_buffered_send to uint16_t
Wesley Bland [Fri, 12 Jan 2018 17:29:53 +0000]
ch4/ofi: Change max_buffered_send to uint16_t

The variable array allocation of a buffer for small sends is triggering
a warning about potentially using too much stack memory. This buffer is
almost certainly not going to be above UINT16_MAX in the foreseeable
future so cap `max_buffered_send` at that value to avoid the runaway
stack memory warning.

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agocoll: Remove unused labels
Wesley Bland [Tue, 16 Jan 2018 17:10:40 +0000]
coll: Remove unused labels

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agobinding/mpif: Squash uninitialized variable warning
Wesley Bland [Mon, 15 Jan 2018 23:29:28 +0000]
binding/mpif: Squash uninitialized variable warning

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoch4: Squash uninitialized variable warning
Wesley Bland [Mon, 15 Jan 2018 22:33:53 +0000]
ch4: Squash uninitialized variable warning

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agompi: Squash uninitialized warnings
Wesley Bland [Fri, 12 Jan 2018 16:19:51 +0000]
mpi: Squash uninitialized warnings

Fixes csr/mpich-ofi#994

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoch4: Squash unused variable warnings
Wesley Bland [Mon, 15 Jan 2018 22:31:30 +0000]
ch4: Squash unused variable warnings

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoch4: Remove unused init_comm function
Wesley Bland [Mon, 15 Jan 2018 22:53:18 +0000]
ch4: Remove unused init_comm function

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agompir: Initialize CHKLMEM_DECL memory
Wesley Bland [Fri, 12 Jan 2018 16:37:36 +0000]
mpir: Initialize CHKLMEM_DECL memory

The array of pointers here is generating uninitialized warnings in some
cases. By initializing to NULL (as we do with the CHKPMEM memory), we
squash them.

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agompi: Guard function with debug logging #ifdef
Wesley Bland [Fri, 12 Jan 2018 16:20:38 +0000]
mpi: Guard function with debug logging #ifdef

This function is only used in a debug macro and therefore generates an
unused warning if debug output is compiled out.

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agohydra: Squash warning about unexpected assignment
Wesley Bland [Fri, 15 Dec 2017 21:51:07 +0000]
hydra: Squash warning about unexpected assignment

Make the loop condition more clear with a traditional for loop.

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agompit: Avoid shadowing variable
Wesley Bland [Fri, 15 Dec 2017 19:30:05 +0000]
mpit: Avoid shadowing variable

The variable `tmp` used inside this macro is shadowing other usages.
Add a `_` suffix to the variable name to avoid the shadow.
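
A small illustration of the problem, using a hypothetical swap macro
rather than the actual mpit macro. If the macro's temporary were named
`tmp`, a caller passing its own `tmp` would be shadowed; the trailing
underscore keeps the two names apart.

```c
#include <assert.h>

/* Hypothetical macro. The temporary is named tmp_ so that it cannot
 * shadow a caller variable that happens to be called tmp. */
#define SWAP_INT(a, b)      \
    do {                    \
        int tmp_ = (a);     \
        (a) = (b);          \
        (b) = tmp_;         \
    } while (0)

static int swap_demo(void)
{
    int tmp = 1, other = 2;

    /* With "int tmp = (a);" inside the macro, this call would expand to
     * "int tmp = (tmp);", which shadows the outer tmp. */
    SWAP_INT(tmp, other);
    assert(tmp == 2 && other == 1);
    return tmp * 10 + other;
}
```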

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agompl: Avoid shadowing variable
Wesley Bland [Fri, 15 Dec 2017 19:20:45 +0000]
mpl: Avoid shadowing variable

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoch4: Remove unnecessary malloc
Wesley Bland [Mon, 15 Jan 2018 22:37:27 +0000]
ch4: Remove unnecessary malloc

It is not necessary to call malloc, snprintf, and malloc again based on
the output of snprintf. Instead, it is legal to just call snprintf with
a NULL pointer and 0 bytes to get the length of the string, then provide
that length to malloc.
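
A minimal sketch of the pattern using only the C standard library. The
helper name `format_name` and the format string are made up for
illustration and are not taken from the ch4 sources.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Build "<prefix>-<id>" in a freshly allocated buffer.
 * Hypothetical helper; the caller frees the result. */
static char *format_name(const char *prefix, int id)
{
    assert(prefix != NULL);

    /* With a NULL buffer and size 0, snprintf writes nothing and returns
     * the number of characters needed, excluding the terminating NUL. */
    int len = snprintf(NULL, 0, "%s-%d", prefix, id);
    if (len < 0)
        return NULL;

    char *buf = malloc((size_t) len + 1);
    if (buf == NULL)
        return NULL;

    /* Second call does the real formatting into the exact-size buffer. */
    snprintf(buf, (size_t) len + 1, "%s-%d", prefix, id);
    return buf;
}
```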

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoch4/ofi: Create one VNI by default
Hajime Fujita [Wed, 27 Dec 2017 19:00:42 +0000]
ch4/ofi: Create one VNI by default

Currently MPIR_CVAR_CH4_OFI_MAX_VNIS is set to -1 by default,
meaning it would claim all available contexts from the provider.
This patch sets it to 1 by default.

This was supposed to be fixed in 8f103d755d98 but this part was missing.

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoromio: Enable system extensions
Gengbin Zheng [Tue, 19 Dec 2017 20:26:03 +0000]
romio: Enable system extensions

This patch enables system extensions in configure by
`AC_USE_SYSTEM_EXTENSIONS`.

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoromio: Include system headers later
Gengbin Zheng [Tue, 19 Dec 2017 20:24:14 +0000]
romio: Include system headers later

System header files must be included after defining macros in adio.h
to enable/disable library features (e.g. _GNU_SOURCE).

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agompl: Enable system extensions
Gengbin Zheng [Tue, 19 Dec 2017 20:21:01 +0000]
mpl: Enable system extensions

This patch enables system extensions in configure by
`AC_USE_SYSTEM_EXTENSIONS`.

This is needed to correctly build MPL within MPICH, when it is
configured with `--enable-strict`.

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agompl: Include system headers after mpl.h
Gengbin Zheng [Tue, 19 Dec 2017 20:17:24 +0000]
mpl: Include system headers after mpl.h

System header files must be included after mpl.h in case mpl.h
defines macros to enable/disable library features (e.g. _GNU_SOURCE).

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

15 months agoch3/nemesis: Remove unused code
Ken Raffenetti [Tue, 19 Dec 2017 20:48:30 +0000]
ch3/nemesis: Remove unused code

Signed-off-by: Yanfei Guo <yguo@anl.gov>

15 months agompl/shm: Add missing memory class for allocation in sysv impl
Ken Raffenetti [Wed, 20 Dec 2017 22:08:26 +0000]
mpl/shm: Add missing memory class for allocation in sysv impl

This allocation was not updated when memory classes were added in
[d3bc0095107c].

Signed-off-by: Yanfei Guo <yguo@anl.gov>

15 months agompl/shm: Fix function namespace in sysv impl
Ken Raffenetti [Wed, 20 Dec 2017 22:03:17 +0000]
mpl/shm: Fix function namespace in sysv impl

Usage of these functions in the sysv shared memory implementation had
been broken since [b6e85ed8e711].

Signed-off-by: Yanfei Guo <yguo@anl.gov>

16 months agoromio: Initialize error_code before calling MPIO_DATATYPE_ISCOMMITTED
Wei-keng Liao [Fri, 15 Dec 2017 00:43:31 +0000]
romio: Initialize error_code before calling MPIO_DATATYPE_ISCOMMITTED

This patch initializes error_code to MPI_SUCCESS before calling
MPIO_DATATYPE_ISCOMMITTED. Without this fix, an error message of
"Invalid MPI_Op" appears when ROMIO is built stand-alone. Note in file
mpi-io/mpioimpl.h, MPIO_DATATYPE_ISCOMMITTED does nothing when
ROMIO_INSIDE_MPICH is not defined. Thus, uninitialized error_code can
pass through MPIO_DATATYPE_ISCOMMITTED and is checked against
MPI_SUCCESS in a few places, including mpi-io/set_view.c and all files
that call MPIO_CHECK_DATATYPE.
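
The failure mode can be sketched with a stand-in macro. The names below
(`INSIDE_HOST_LIBRARY`, `CHECK_COMMITTED`, `MY_SUCCESS`) are hypothetical
simplifications, not the ROMIO definitions; the point is that a check
macro which expands to nothing leaves the error variable untouched.

```c
#include <assert.h>

#define MY_SUCCESS  0
#define MY_ERR_TYPE 1

#ifdef INSIDE_HOST_LIBRARY
/* Inside the host library the macro performs a real check. */
#define CHECK_COMMITTED(dtype, err)                             \
    do {                                                        \
        (err) = ((dtype) > 0) ? MY_SUCCESS : MY_ERR_TYPE;       \
    } while (0)
#else
/* Stand-alone build: the macro does nothing, so err is never written. */
#define CHECK_COMMITTED(dtype, err) do { } while (0)
#endif

static int set_view(int dtype)
{
    assert(dtype >= 0);

    /* Without this initialization, the comparison below would read an
     * indeterminate value in the stand-alone build. */
    int error_code = MY_SUCCESS;

    CHECK_COMMITTED(dtype, error_code);
    if (error_code != MY_SUCCESS)
        return error_code;
    return MY_SUCCESS;
}
```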

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

16 months agoromio: Fix shadow variable warning
Akhil Langer [Wed, 10 Jan 2018 20:06:27 +0000]
romio: Fix shadow variable warning

The global variables are shadowed by function variables/arguments of the
same name, which generates compiler warnings. Append a '_global' suffix
to the global variables to distinguish them from function arguments/variables.

Signed-off-by: Paul Coffman <pcoffman@anl.gov>

16 months agoch4/ucx: Fix snprintf usage
Ken Raffenetti [Wed, 10 Jan 2018 22:07:17 +0000]
ch4/ucx: Fix snprintf usage

Avoid always truncating because of a bad buffer size calculation.

No reviewer.

16 months agoch4/ofi: Honor max_msg_size when using iov
Hajime Fujita [Fri, 12 Jan 2018 21:13:37 +0000]
ch4/ofi: Honor max_msg_size when using iov

OFI's ep_attr::max_msg_size (==MPIDI_Global.max_send in OFI netmod)
has to be always honored even when iovec is used for data transmission.

Fixes csr/mpich-ofi#945

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

16 months agoch4/ofi: Refactor iov send/recv code
Hajime Fujita [Thu, 11 Jan 2018 21:11:13 +0000]
ch4/ofi: Refactor iov send/recv code

OFI send/recv functions have grown too big.
This patch takes out the iov handling code in send/recv to
reduce average function size.

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

16 months agoch4/ofi: Move sync_send_ack bit out of protocol mask
Wesley Bland [Tue, 9 Jan 2018 17:39:44 +0000]
ch4/ofi: Move sync_send_ack bit out of protocol mask

Because the sync_send_ack bit is in the protocol mask, it is ignored
when matching things like `MPI_ANY_SOURCE` receives. This means that if
there are two outstanding operations where one is an ssend and the other
is an `MPI_ANY_SOURCE` (or a receive with the same tag as the ssend),
the receive will match the acknowledgement of the ssend.

Move the sync_send_ack bit out of the protocol mask so it will not be
ignored when matching receive messages.
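
A sketch of why the bit position matters, assuming a made-up 64-bit
layout (the real OFI netmod layout is not shown in this log and may
differ). Bits covered by the ignore mask never take part in matching, so
an acknowledgement flag inside that mask is invisible to the receive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical match-bit layout, for illustration only. */
#define TAG_MASK          UINT64_C(0x00000000FFFFFFFF)
#define SOURCE_MASK       UINT64_C(0x0000FFFF00000000)
#define SYNC_SEND_ACK     UINT64_C(0x0001000000000000)
#define PROTOCOL_MASK_OLD UINT64_C(0x00FF000000000000)  /* covers the ack bit */
#define PROTOCOL_MASK_NEW UINT64_C(0x00FE000000000000)  /* ack bit moved out  */

static_assert((PROTOCOL_MASK_NEW & SYNC_SEND_ACK) == 0,
              "ack bit must take part in matching");

static uint64_t make_bits(uint16_t source, uint32_t tag)
{
    return ((uint64_t) source << 32) | tag;
}

/* A message matches a posted receive when all non-ignored bits agree. */
static bool matches(uint64_t incoming, uint64_t posted, uint64_t ignore)
{
    return ((incoming ^ posted) & ~ignore) == 0;
}
```

With the old mask, an acknowledgement carrying tag 7 matches an
any-source receive posted for tag 7; with the new mask it does not,
while an ordinary message with tag 7 still matches.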

See csr/mpich-ofi#485

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

16 months agoch4: Change count type to MPI_Aint in send/recv functions
Akhil Langer [Wed, 10 Jan 2018 22:08:14 +0000]
ch4: Change count type to MPI_Aint in send/recv functions

count in send/recv functions is currently of type int in ch4.
This is inconsistent with ch3 and also gives errors when handling
large count values. Change count type to MPI_Aint.

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

16 months agoch4/ofi: Avoid race condition when poking progress
Sannikov, Alexander [Mon, 15 Jan 2018 17:07:18 +0000]
ch4/ofi: Avoid race condition when poking progress

Poking progress from the OFI netmod could cause deadlocks when issuing
nonblocking collectives. Ensure we only poke the nm and shm progress
functions, which are reentrant-safe.

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

16 months agoch4: Define more fine-grained progress controls
Sannikov, Alexander [Mon, 15 Jan 2018 17:03:32 +0000]
ch4: Define more fine-grained progress controls

Add intermediate progress test function (MPIDI_Progress_test) that
takes flags to control progress on netmod, shmmod, and progress
hooks. This is useful because progress on one or more components may
not be desired due to issues with reentrancy.
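
A rough sketch of flag-controlled progress. The flag names, the
`progress_counts` struct, and the `progress_test` signature are
assumptions made for illustration; the actual MPIDI_Progress_test
interface is not reproduced here.

```c
#include <assert.h>

/* Hypothetical flags selecting which components make progress. */
#define PROGRESS_HOOKS 0x1u
#define PROGRESS_NM    0x2u
#define PROGRESS_SHM   0x4u
#define PROGRESS_ALL   (PROGRESS_HOOKS | PROGRESS_NM | PROGRESS_SHM)

/* Counters stand in for the real per-component progress calls. */
struct progress_counts { int hooks, nm, shm; };

static void progress_test(unsigned flags, struct progress_counts *c)
{
    assert((flags & ~PROGRESS_ALL) == 0);

    if (flags & PROGRESS_HOOKS)
        c->hooks++;             /* e.g. progress hooks that may re-enter */
    if (flags & PROGRESS_NM)
        c->nm++;                /* network module */
    if (flags & PROGRESS_SHM)
        c->shm++;               /* shared memory module */
}
```

A caller that must avoid re-entering the hooks would pass
`PROGRESS_NM | PROGRESS_SHM` instead of `PROGRESS_ALL`.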

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

16 months agohwloc: Update submodule
Wesley Bland [Sun, 7 Jan 2018 20:09:39 +0000]
hwloc: Update submodule

Signed-off-by: Ken Raffenetti <raffenet@mcs.anl.gov>

16 months agoch4: distinguish the collective name from the algo name.
Pavan Balaji [Wed, 10 Jan 2018 20:57:08 +0000]
ch4: distinguish the collective name from the algo name.

Signed-off-by: Wesley Bland <wesley.bland@intel.com>

16 months agoromio: prevent erroneous inclusion of mpioprof.h
Akhil Langer [Tue, 9 Jan 2018 22:05:20 +0000]
romio: prevent erroneous inclusion of mpioprof.h

This file is a no-op without MPIO_BUILD_PROFILING defined.
Do not allow erroneous inclusion as that will falsely define
MPIO_PROF_H_INCLUDED

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agocoll: Change MPID collective calls to MPIR
Wesley Bland [Fri, 5 Jan 2018 20:54:01 +0000]
coll: Change MPID collective calls to MPIR

Use the wrapper function to make sure device overrides are turned on
before calling them.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agocoll: Reintroduce MPIR_<coll>_impl functions
Wesley Bland [Fri, 5 Jan 2018 19:37:50 +0000]
coll: Reintroduce MPIR_<coll>_impl functions

In order to allow non-MPI functions to correctly determine whether to
use the device collective overrides, an intermediate function between
`MPI_<coll>` and `MPI(R/D)_<coll>` needs to be introduced.

The new function is placed between the `MPI` call and the decision to
call the device overrides or directly call the `MPIR` collective
function.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agocoll: Add fn entry/exit debug macros to mpidu_sched
Akhil Langer [Wed, 3 Jan 2018 23:03:52 +0000]
coll: Add fn entry/exit debug macros to mpidu_sched

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoofi: Move common code out of the #ifdef-else construct
Akhil Langer [Wed, 3 Jan 2018 21:19:08 +0000]
ofi: Move common code out of the #ifdef-else construct

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoch4: Refer to array element through a tmp var
Akhil Langer [Wed, 3 Jan 2018 20:48:04 +0000]
ch4: Refer to array element through a tmp var

requests[i] is being used many times and will be used even more as we
implement persistent collectives. Make code simpler by referring to it
through a variable.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoch4/shm/posix: Convert if-elseif construct to switch-case
Akhil Langer [Wed, 3 Jan 2018 20:04:48 +0000]
ch4/shm/posix: Convert if-elseif construct to switch-case

This is to be consistent with other implementations of the
startall function. Also, switch-case can be faster than
if-else when there are many cases, and this function will
have many cases as we implement persistent collectives.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoch4/generic: Move var to case where it is used
Akhil Langer [Tue, 2 Jan 2018 20:37:53 +0000]
ch4/generic: Move var to case where it is used

MPI_Request sreq_handle is used only when the
request type is Bsend. Move it inside the Bsend case
in the switch statement.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoch4/generic: Move params inside each case in startall fn
Akhil Langer [Tue, 2 Jan 2018 20:07:20 +0000]
ch4/generic: Move params inside each case in startall fn

As we add more cases (for example, for collectives) to
the switch statement, many of these parameters
(like tag, datatype, rank, etc.) would no longer
be common across the cases in the switch statement. Hence, it
does not make sense to compute them for every case. Get
these variables directly inside each case. This also
makes it consistent with the way ofi_startall is
implemented.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoch4u: Add request util fns to calculate tag, context_offset
Akhil Langer [Mon, 1 Jan 2018 22:33:29 +0000]
ch4u: Add request util fns to calculate tag, context_offset

Add utility functions for CH4U to calculate tag and
context_offset from the match_bits field stored in the
active message request.
These utility functions make it possible to get
tag and context_offset through a single statement
and can thus be placed directly in the argument
list of send/recv call. Doing so is required
for the switch statement in ch4/generic/mpidig_startall.h to
add more cases (like collectives) that do not
need the send/recv arguments.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoch4u: Change variable name from tag to match_bits
Akhil Langer [Mon, 1 Jan 2018 22:33:29 +0000]
ch4u: Change variable name from tag to match_bits

The term tag is overloaded in CH4U, which leads to ambiguity.
It is used to refer to the MPI-level tag as well as the message
match bits (context_id + rank + tag). Rename the variable
to match_bits to accurately represent what it is.
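
To make the distinction concrete, here is a sketch that packs the three
fields into one 64-bit value and extracts them again. The field widths
and helper names are assumptions chosen for the example, not the CH4U
layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field widths: [context_id:16][rank:16][tag:32]. */
#define TAG_BITS     32
#define RANK_BITS    16
#define CONTEXT_BITS 16

static_assert(TAG_BITS + RANK_BITS + CONTEXT_BITS == 64,
              "fields must fill the 64-bit match_bits value");

static uint64_t pack_match_bits(uint16_t context_id, uint16_t rank, uint32_t tag)
{
    return ((uint64_t) context_id << (TAG_BITS + RANK_BITS)) |
           ((uint64_t) rank << TAG_BITS) | tag;
}

/* The MPI-level tag is only the low field of match_bits. */
static uint32_t get_tag(uint64_t match_bits)
{
    return (uint32_t) (match_bits & UINT64_C(0xFFFFFFFF));
}

static uint16_t get_rank(uint64_t match_bits)
{
    return (uint16_t) ((match_bits >> TAG_BITS) & 0xFFFF);
}

static uint16_t get_context_id(uint64_t match_bits)
{
    return (uint16_t) (match_bits >> (TAG_BITS + RANK_BITS));
}
```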

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agocoll: Remove unused variable from reduce/allreduce
Akhil Langer [Wed, 3 Jan 2018 18:51:08 +0000]
coll: Remove unused variable from reduce/allreduce

comm_size was being used to calculate the nearest power of 2.
But now nearest power-of-2 is stored in the communicator itself
during communicator creation.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoadio: Fix uninitialized variable warnings
Akhil Langer [Wed, 3 Jan 2018 18:42:46 +0000]
adio: Fix uninitialized variable warnings

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoadio: Remove unused variable
Akhil Langer [Wed, 3 Jan 2018 18:11:03 +0000]
adio: Remove unused variable

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agocoll: Add Makefiles for op,reduce_local,allreduce_group
Akhil Langer [Wed, 3 Jan 2018 16:35:20 +0000]
coll: Add Makefiles for op,reduce_local,allreduce_group

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoch4: document p_type variable
Akhil Langer [Mon, 1 Jan 2018 21:18:10 +0000]
ch4: document p_type variable

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agocoll: Change MPID/MPIR_I<coll> to accept MPIR_Request **
Akhil Langer [Fri, 29 Dec 2017 17:10:29 +0000]
coll: Change MPID/MPIR_I<coll> to accept MPIR_Request **

To be consistent with pt2pt calls like MPID/MPIR_Isend,
the argument to MPID/MPIR_I<coll> needs to be changed
to accept MPIR_Request ** instead of MPI_Request*. Also,
this is more natural since internal functions should deal
with MPIR_Request as much as possible.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoMove request related API impl to new folder
Akhil Langer [Sat, 30 Dec 2017 18:13:45 +0000]
Move request related API impl to new folder

Added a new high-level 'request' folder. All the `MPI_Request/Status`
related functions were in the pt2pt folder. But requests are also used
for collectives, rma, etc. There is enough request-related API to
call for its own folder. Moreover, with the persistent collectives
implementation, some collective-specific cases will be added to the
request functions. Hence, they do not fit well inside the pt2pt folder.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoCall MPIR_Request_create with relevant request kind
Akhil Langer [Thu, 28 Dec 2017 22:10:12 +0000]
Call MPIR_Request_create with relevant request kind

At many places in ch3, MPIR_Request_create was being called
with kind set to MPIR_REQUEST_KIND__UNDEFINED, and the kind
was updated after the function call. This is redundant and makes
the code harder to read. Just invoke the function
with the correct kind.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agompierr: Protect macro variables with parentheses
Akhil Langer [Fri, 29 Dec 2017 21:56:39 +0000]
mpierr: Protect macro variables with parentheses

This is required for consistency and also for correctness
in some cases
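
A generic illustration of the correctness issue, using hypothetical
macros rather than the mpierr ones. Without parentheses, an expression
argument is spliced into the macro body and operator precedence can
change the result.

```c
#include <assert.h>

/* Unprotected: the argument is pasted in as-is. */
#define SCALE_BAD(x) (x * 2)

/* Protected: the argument is evaluated as a unit. */
#define SCALE(x) ((x) * 2)

static void check_scale(void)
{
    /* SCALE_BAD(1 + 2) expands to (1 + 2 * 2). */
    assert(SCALE_BAD(1 + 2) == 5);

    /* SCALE(1 + 2) expands to ((1 + 2) * 2). */
    assert(SCALE(1 + 2) == 6);
}
```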

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoofi: Move comment to correct location
Akhil Langer [Wed, 27 Dec 2017 20:49:24 +0000]
ofi: Move comment to correct location

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoch4: Move startall fn to separate file
Akhil Langer [Tue, 10 Oct 2017 22:04:43 +0000]
ch4: Move startall fn to separate file

The startall function does not belong in the *_send.h file,
as it can be used for any persistent request type, including
persistent recv and persistent collectives. Move it to a
separate file. This is in preparation for implementing
persistent collectives.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agoop: Typo fix in comment
Akhil Langer [Tue, 26 Dec 2017 22:13:23 +0000]
op: Typo fix in comment

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agompl: Move math functions in coll_util.h to mpl_math.h
Akhil Langer [Tue, 26 Dec 2017 18:09:22 +0000]
mpl: Move math functions in coll_util.h to mpl_math.h

These functions are completely independent of MPI/MPICH and belong in
MPL.  Update the usage of these functions throughout mpich to use the
MPL version and delete the old header file.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agocoll: Move MPIR_Op_is_commutative fn to op folder
Akhil Langer [Tue, 26 Dec 2017 16:08:54 +0000]
coll: Move MPIR_Op_is_commutative fn to op folder

It belongs in the op_commutative file rather than
coll_util.h. There is a similar function,
MPIR_Op_commutative, that takes an MPIR_Op as its argument
rather than the MPI_Op taken by MPIR_Op_is_commutative.
One of the two functions can be made redundant
provided there is no perf cost added.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agocoll: Move allreduce_group fn decls to its own header file
Akhil Langer [Tue, 26 Dec 2017 15:40:34 +0000]
coll: Move allreduce_group fn decls to its own header file

These functions are unrelated to the other math utility
functions in coll_util.h, so move them to a more
appropriate file. Also, the coll_util.h file will now go away,
as its contents will be moved to mpl.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agocoll: Use pof2 from MPIR_Comm
Akhil Langer [Tue, 26 Dec 2017 03:04:46 +0000]
coll: Use pof2 from MPIR_Comm

Use pof2 stored in communicator instead of
calculating it every time.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agocoll: Store power of 2 in MPIR_Comm
Akhil Langer [Tue, 26 Dec 2017 02:32:58 +0000]
coll: Store power of 2 in MPIR_Comm

Collective communication operations use the nearest power
of 2 calculation for algorithm selection and also for
algorithm execution. Some code paths calculate it twice during
the course of a call to a collective operation. Calculate it
just once and store it in the communicator. Retrieve it from the
communicator whenever needed instead of calculating
it on the spot every time. This will save some performance
cost. pof2 is stored just after the local_size
parameter inside the communicator. local_size is accessed
during collective execution anyway. Therefore, chances
of a cache miss on pof2 are small.
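
A simplified sketch of the caching scheme. The `struct comm` below is a
hypothetical two-field stand-in for the communicator, and it assumes
that "nearest power of 2" means the largest power of two not exceeding
the communicator size.

```c
#include <assert.h>

/* Hypothetical, pared-down communicator. */
struct comm {
    int local_size;
    int pof2;       /* cached right next to local_size */
};

/* Largest power of two that is <= n (assumed meaning of pof2). */
static int largest_pof2_le(int n)
{
    assert(n > 0);
    int pof2 = 1;
    while (pof2 <= n / 2)
        pof2 *= 2;
    return pof2;
}

/* Compute pof2 once, at communicator creation. */
static void comm_init(struct comm *c, int size)
{
    c->local_size = size;
    c->pof2 = largest_pof2_le(size);
}

/* Collectives later read the cached value instead of recomputing it. */
static int ranks_beyond_pof2(const struct comm *c)
{
    return c->local_size - c->pof2;
}
```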

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agocoll: Use MPIU_pof2
Akhil Langer [Mon, 25 Dec 2017 04:03:33 +0000]
coll: Use MPIU_pof2

Use the utility function instead of implementing it every time.

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agocoll: make comment more informative and fix typo
Akhil Langer [Mon, 25 Dec 2017 03:33:52 +0000]
coll: make comment more informative and fix typo

Signed-off-by: Pavan Balaji <balaji@anl.gov>

16 months agocoll: Remove dead code from a reduce_scatter* algo
Akhil Langer [Mon, 25 Dec 2017 02:49:10 +0000]
coll: Remove dead code from a reduce_scatter* algo

The pairwise algorithm for (i)reduce_scatter(_block) collective
operations works correctly only for commutative operations.
Delete code in it meant for non-commutative operations.

Signed-off-by: Pavan Balaji <balaji@anl.gov>