------------------- Upcoming version 1.3 -----------------------------

- Improved automatic MPI detection in instrumenter (helpful on Cray,
  as cc/CC/ftn are the compile commands for both MPI and non-MPI).
- Use Process Manager Interface (PMI) to get fine-granular information
  about the system topology on Cray machines.
- Changed paradigm selection in the instrumenter to match the
  selection options in the config tool. Thus, introduced
  --mpp=<paradigm> and --thread=<paradigm> flags for the instrumenter
  to select the multi-process paradigm and the threading paradigm. The
  old options --mpi, --nompi, --openmp, --noopenmp are marked as
  deprecated and no longer documented.
- Implemented the possibility to write CUBE profiles with the tuple
  values containing sum, minimum, maximum, number of samples, sum of
  squares.
- Added handling for special characters, like space, in file names and
  path names. However, there are still some limitations when using
  special characters: The PDT parser cannot deal with these
  characters and, thus, fails if PDT instrumentation is enabled and
  special characters appear. Furthermore, compilation fails when
  double quotes appear in source file names and preprocessing is
  enabled.
- Unified naming of macros in the user adapter. In C/C++ the macros to
  define global region handles (SCOREP_GLOBAL_REGION_DEFINE and
  SCOREP_GLOBAL_REGION_EXTERNAL) and in Fortran the parameter macros
  (SCOREP_PARAMETER_DEFINE, SCOREP_PARAMETER_INT64,
  SCOREP_PARAMETER_UINT64, SCOREP_PARAMETER_STRING) got the prefix
  SCOREP_USER instead of only SCOREP.
- The new SIONlib integration of OTF2 extends the support of writing
  SION traces to all multi-process paradigms, not only MPI. However,
  only pure multi-process measurements are supported for now: no
  threads, no CUDA, no non-CPU metrics. Score-P itself does not depend
  on SIONlib any longer, only OTF2 does now. The configure option
  '--with-sionlib' (formerly '--with-sionconfig') is passed to OTF2.
  As part of this integration the measurement configuration variable
  'SCOREP_TRACING_NLOCATIONS_PER_SION_FILE' was renamed to
  'SCOREP_TRACING_MAX_PROCS_PER_SION_FILE' to clarify that Score-P can
  only distribute whole processes into a multi-file SION trace.
- Extended the TAU adapter to allow input of location properties,
  which are location-specific metadata presented as key/value pairs.
- Added selection for mutex locking: the parameter --mutex=<locking>
  switches between known locking mechanisms within the measurement
  system (omp, pthread, pthread:spinlock).
- Previously, when using the Intel compiler, we inspected the symbol
  table of the executable and evaluated the filtering on all functions
  in the executable. Thus, compiler-instrumented functions from shared
  libraries were automatically filtered when using the Intel
  compiler. Now, the filters are evaluated when the functions appear
  for the first time. Thus, functions from shared libraries now appear
  in the measurement output.
- Added support for CUDA 5.5 and CUDA 6.0: The CUPTI activity buffer handling
  has changed. Therefore the environment variable SCOREP_CUDA_BUFFER_CHUNK
  has been introduced (see user documentation). Changed the default size for
  SCOREP_CUDA_BUFFER to '1M'.
- Added SCOREP_CUDA_ENABLE option 'references' to track references between
  CUDA host and device activities in the OTF2 trace.
- Fix handling of Intel compiler options starting with "-o".
- The pgCC compiler version 13.9 and newer preinclude omp.h if OpenMP
  is enabled. This leads to multiply defined symbols if the source file
  is preprocessed before compilation. Prevent the preinclusion for the
  compilation of preprocessed files if an appropriate compiler option
  exists (exists since pgCC version 14.1).
- Fix a deadlock on AIX, if MPI_Abort was called.
- Fixed a failure of the scorep instrumenter that occurred if a system
  provides only shared OpenMP runtime libraries and a compiler does not
  add rpath information but relies on LD_LIBRARY_PATH.
- Added SCOREP_CUDA_ENABLE option 'flushatexit', which forces pending CUDA
  activities to be flushed at program exit. This can prevent records from
  being dropped in OpenACC programs.
- Changed default method of CUDA kernel recording to concurrent. Therefore
  the SCOREP_CUDA_ENABLE option 'concurrent' has been replaced with
  'kernel_serial'.
- Fix missing flags in OPARI2 call to disable OpenMP instrumentation, if the
  user selected POMP instrumentation for a serial program without manually
  specifying that the program is serial.
- Prepend link calls to the Intel compiler with settings for VT_LIB_DIR and
  VT_LIBS to avoid remarks.
- Improved event size estimation in scorep-score using otf2-estimator.
- Improved initialization of adapters which results in a reduced
  number of libraries needed to be linked into the application.
- Install Cube remap specification file and provide its location via
  scorep-config tool.
- Changed enumeration of threads in the profile from a global enumeration
  to an enumeration from 0 to N-1 on each process.
- Basic support for the K Computer and Fujitsu FX10 systems added. The
  Tofu network topology will be supported in a subsequent release.
  Note that some C++ OpenMP programs fail during measurement
  initialization for unknown reasons.
- Use "-G2" if the Cray compiler instrumentation is used.
  The previous "-g" flag disabled all optimizations.
- Fix creation of experiment directory, if the monitored application makes
  use of the 'chdir' operation.
- Add support for instrumenting programs which use SHMEM library calls for
  one-sided communication.
- Basic support for Pthread instrumentation. Supported Pthread
  routines are pthread_create, pthread_join, pthread_mutex_init,
  pthread_mutex_destroy, pthread_mutex_lock, pthread_mutex_trylock,
  pthread_mutex_unlock, pthread_cond_init, pthread_cond_destroy,
  pthread_cond_signal, pthread_cond_broadcast, pthread_cond_wait, and
  pthread_cond_timedwait. The following thread management functions are
  currently not supported and will abort the program: pthread_exit and
  pthread_cancel. The usage of pthread_detach will cause the program to
  fail if the detached thread is still running after the end of
  main. These limitations will be addressed in an upcoming version of
  Score-P.
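
The tuple values listed above for CUBE profiles (sum, minimum, maximum,
number of samples, sum of squares) are enough to derive the mean and
variance of a metric and to merge tuples from different locations without
keeping the individual samples. The Python sketch below only illustrates
that idea under assumptions: the names StatTuple, add, merge, mean, and
variance are hypothetical and are not part of the Score-P or CUBE API.

```python
import math
from dataclasses import dataclass


@dataclass
class StatTuple:
    """Hypothetical container mirroring the five tuple values."""
    total: float = 0.0          # sum of all samples
    minimum: float = math.inf   # smallest sample seen so far
    maximum: float = -math.inf  # largest sample seen so far
    count: int = 0              # number of samples
    sum_squares: float = 0.0    # sum of the squared samples

    def add(self, value: float) -> None:
        """Fold one sample into the tuple."""
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)
        self.count += 1
        self.sum_squares += value * value

    def merge(self, other: "StatTuple") -> "StatTuple":
        """Combine two tuples, e.g. from two locations."""
        return StatTuple(
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
            self.count + other.count,
            self.sum_squares + other.sum_squares,
        )

    def mean(self) -> float:
        return self.total / self.count

    def variance(self) -> float:
        # Population variance: E[x^2] - (E[x])^2
        m = self.mean()
        return self.sum_squares / self.count - m * m
```

Because every field is either additive or a min/max, merging is
associative, so tuples can be combined in any order.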

------------------- Released version 1.2.3 ---------------------------

- Fixed a failed assertion that occurs if selective recording was
  enabled in profiling mode.
- Fixed wrong path names in the instrumenter, when Score-P was
  configured with the --bindir flag.
- Install scorep-score in the correct directory, if Score-P was
  configured with the --bindir flag.
- Reduce per-event measurement overhead by improving Score-P's assert
  and error handling.
- Adapt configure to recent Cray installations.
- Score-P measurements provided with a SCOREP_EXPERIMENT_DIRECTORY,
  say foo, used to overwrite an existing foo even if this foo is not a
  directory. Will now abort with a meaningful message.
- Metric plugin component: handling of multiple metrics improved.
- Don't remove source files during make distclean in an in-place
  build.
- Fix failing detection of nvcc in case it was called with a path.
- The measurement configuration (stored in the file `scorep.cfg') is
  now also preserved in the experiment directory in case of a failed
  measurement.
- Added compiler instrumentation flags also to the ldflags to fix
  missing instrumentation if high optimization levels recompile parts
  of the code.
- Changed the region names of OPARI2 instrumented named criticals.
  If a name for the critical region is provided, the enclosing region
  will have the name '!$omp critical <name>' and the structured block
  '!$omp critical sblock'. Replace <name> by the given name.
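
The SCOREP_EXPERIMENT_DIRECTORY fix above amounts to checking what a path
refers to before using it as a directory. The Python sketch below shows
one way such a check could look; it is not Score-P's implementation. The
function name prepare_experiment_dir is hypothetical, and reusing an
already existing directory is an assumption of this sketch.

```python
import os


def prepare_experiment_dir(path: str) -> str:
    """Create `path` as a directory, refusing to clobber a non-directory."""
    # lexists() also reports dangling symlinks, which isdir() rejects.
    if os.path.lexists(path) and not os.path.isdir(path):
        raise NotADirectoryError(
            f"experiment directory '{path}' exists but is not a directory"
        )
    # Assumption of this sketch: an existing directory is simply reused.
    os.makedirs(path, exist_ok=True)
    return path
```

Raising an error with the offending path in the message corresponds to
the "abort with a meaningful message" behaviour described above.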

------------------- Released version 1.2.2 ---------------------------

- The Fortran Cray compiler instrumentation did not create an exit
  event. Thus, we add an exit on Score-P finalization.
- Removed remark of the Intel compiler during instrumentation that
  VT_ROOT is not set, if preprocessing was used.
- MPI parallel measurements with just one process were fixed.
- Fixed a race condition during initialization of the
  TRACE_BUFFER_FLUSH region, that could lead to incomplete profiles if
  a user runs a hybrid (MPI + OpenMP) application and enables
  profiling and tracing at the same time.
- Fix error message when scorep-config is called without arguments in
  a non-mpi installation.
- In scorep-config's rpath options, omit paths searched by ldconfig,
  even if Score-P was installed there, in order to comply with packaging
  guidelines of some Linux distributions.
- Fixed broken MPI detection in the instrumenter if the MPI compiler
  wrapper is specified with the full path.
- If Score-P is built with static and dynamic libraries, the selection
  of using static or dynamic libraries was improved. Using -Bstatic or
  -Bshared had some side effects and was sometimes unreliable.
- On Cray systems, change libtool's default to prefer static linking of
  external libraries.
- Suppress failed assertion messages when initializing compiler
  instrumentation with Intel compilers without libbfd. The measurement
  completes even if these messages exist.
- Added options to scorep-config and the scorep instrumenter to
  enable/disable online access support.
- Fixed broken --includedir configure option that installed Score-P
  headers in a wrong directory.
- Fix SCOREP_RECORDING_IS_ON(isOn) user macro; in Fortran codes, isOn
  was not set to false when instrumented with --nouser.
- Fixed instrumentation compilation error that occurred if
  --opari="--disable=atomic" was specified without OpenMP compilation
  flags.
- Improvements in obtaining region information via libbfd.
- Improved configure checks to determine values of MPI
  constants. Previous tests failed on AIX.
- Improvements of measurement reconfiguration in Online Access mode.
- Honor --without-mpi when --with-custom-compilers is given at
  configure time.
- Several smaller fixes.

------------------- Released version 1.2.1 ---------------------------

- Allow configuration without support for the MPI programming model by
  specifying --without-mpi on the configure line.
- Abort during instrumentation with a meaningful error message if
  a user requests MPI but the Score-P installation does not support MPI.
- On Blue Gene/Q, detect PAMI library at configure time. The location
  and names of the PAMI files change during a system upgrade. Search
  all known directories and library names.
- Improve --with-custom-compilers: customization files are now
  recognized also in the build directory (see INSTALL).
- On SGI MPT systems, or more generally on systems that don't use
  compiler wrappers for building MPI programs, improve the automatic
  detection of the MPI programming paradigm during instrumentation.
- Abort with an error message during instrumentation if the user wants
  to build a shared library with static Score-P libraries.
- Abort if the user specified a filter file which cannot be opened.
- Improved the auto-detection in the instrumenter for MPI libraries. This
  should fix some failures with MPI programs that do not use a compiler
  wrapper, e.g., when using SGI MPT.
- Fixed a failure of the instrumenter to detect whether an application
  uses OpenMP with the XL compiler if the user specifies more than one
  option to '-qsmp='.
- Abort configuration when the user specified --without-cube on the
  command line, as CUBE is a required component.

------------------- Released version 1.2 -----------------------------

- Simplified MPI compiler detection; passing '--with-mpi' to configure
  is usually not necessary if your MPI compiler is in PATH.
- Support for Cray systems. PrgEnv-(cray|gnu|intel|pgi) are supported
  in static mode (static is the default). Please note that OpenMP
  instrumentation is currently broken for PrgEnv-cray.
- Compilation units getting processed by OPARI2 are now being
  preprocessed by the C/C++ preprocessor. This way it is possible to
  instrument OpenMP directives in header files. It also solves
  instrumentation problems caused by OpenMP pragmas within preprocessor
  defines. Preprocessing is the default but can be deactivated using
  --nopreprocess. When using PDT instrumentation, preprocessing is
  deactivated.
- To reduce the memory demands of dynamic regions in profiling mode,
  this version provides a lossy compression mechanism called
  'clustering'; similar subtrees of a dynamic region are clustered
  into one. This feature is enabled by default. There are three new
  environment variables for customization; please see the documentation
  for details.
- The new keyword 'MANGLED' was added to the filter file format to
  deal with cases where the displayed name and mangled name are
  different. The keyword 'FORTRAN' was removed.
- External metric sources can be utilized via a plug-in mechanism.
  This feature is controlled via the SCOREP_METRIC_PLUGIN environment
  variable. Please see the documentation for details and an example.
- The CUDA adapter got refactored and extended to provide much more
  useful metrics. There are several new values to the environment
  variable SCOREP_CUDA_ENABLE. Please see the documentation for
  details.
- The machine name used in the profile and trace output is now
  configurable at build time with the --with-machine-name flag or at
  run time with the SCOREP_MACHINE_NAME measurement configuration
  variable.
- Full support for tracking the incurred OpenMP thread teams, utilizing
  the new generic threading records of OTF2.
- The Score-P internals were significantly refactored in order to
  increase flexibility to adapt to new programming paradigms and event
  sources.
- Please note that the feature 'selective tracing' was renamed to
  'selective recording' as it also applies to profiling.
- Please note that CUBE is a hard requirement when building Score-P from
  a tarball. This is due to the fact that we want to provide the user
  with 'scorep-score', which can't be built without the CUBE reader
  library available.

------------------- Released version 1.1 -----------------------------

- Rewind, a new event-trace recording mode for long-running
  experiments, triggered by user-instrumentation macros. Writes
  semantic information to the OTF2 anchor file as rewind might affect
  analysis.
- ARM support (detection + compiler adapter).
- Metric service improvements. Support for per-process metrics and
  per-system-tree-class metrics.
- Support for OpenMP-task profiling and tracing alongside
  improvements of the POMP adapter.
- Component separation: Score-P can now use pre-installed OTF2,
  OPARI2, and CUBE packages instead of the internal ones.
  - Removed dependency on external repository that was used by
    Score-P, OTF2, and OPARI2 in order to prevent version conflicts.
- Support for CUDA profiling and tracing.
- Easier experiment configuration via scorep-info which provides a
  list of all measurement configuration variables.
- scorep-info also provides the improved configure-summary of the
  installation.
- Scoring of profile experiments via scorep-score (if configured with
  external CUBE) to prepare a filter for subsequent trace experiment.
- Documentation improvements.
- Numerous configure improvements. Let external libraries use
  generic configure options (tbc). Fixed portability issues.
- Numerous instrumenter improvements. All possible combinations of
  options supported.
- MPI profiling improvements.
- OpenMP nesting supported although little tested.
- Several compiler-dependent OpenMP-related bugfixes.

------------------- Released version 1.0.2 ---------------------------

- Several instrumentation fixes:
  - Improvements for PDT Fortran instrumentation.
  - Improvements for C++ user instrumentation.
  - Return real failure if instrumentation is erroneous. Failures may
    have gone undetected previously.
  - Allow for out-of-place builds.
  - Provide correct parameter to SCOREP_USER_REGION_ENTER macro.

- Provide correct timestamp to OmpTaskCreate events.

- Fix invalid order of arguments provided to MpiCollectiveEnd events.

- Fix bug in parameter profiling.

- Enable SIONlib support, currently just for MPI applications.

- Various fixes for the generated OpenMP region names:
  - Inner and outer blocks got different names.
  - Regions with the ordered clause got a special name.
  - All region names got '@file:lno' appended, to make them distinguishable.

------------------- Released version 1.0.1 ---------------------------

- Renaming of the configure related variable LD_FLAGS_FOR_BUILD to
  LDFLAGS_FOR_BUILD for consistency.

- Renaming of installed tool and options for consistency, i.e.
  changing underscores to dashes. Also, the --(no)openmp_support
  option changed to --(no)openmp.

- Improved linking on AIX systems.

- Robustness improvements when instrumenting with PDT.

- On x86 platforms, be more cautious using the tsc counter. If
  /proc/cpuinfo reports constant_tsc but not nonstop_tsc, then it is
  likely that the counter is unreliable.

- Improved configure summary.

- configure will not fail if -q or --silent is passed.
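
The tsc counter item above relies on CPU flags reported in /proc/cpuinfo.
The Python sketch below illustrates such a check on the text of that
file; it is not Score-P's configure logic. The function name
tsc_looks_reliable is hypothetical, and treating a missing constant_tsc
flag or a missing 'flags' line as unreliable is an assumption that is
stricter than the rule stated above.

```python
def tsc_looks_reliable(cpuinfo_text: str) -> bool:
    """Return True only if every 'flags' line lists both TSC flags."""
    flag_lines = [
        line.split(":", 1)[1].split()
        for line in cpuinfo_text.splitlines()
        if line.startswith("flags") and ":" in line
    ]
    if not flag_lines:
        # Assumption: without any flag information, stay cautious.
        return False
    return all(
        "constant_tsc" in flags and "nonstop_tsc" in flags
        for flags in flag_lines
    )
```

On a Linux system the input would be the contents of /proc/cpuinfo, which
has one 'flags' line per logical CPU.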

------------------- Released version 1.0 -----------------------------
