Updated 2002/03/31
Forte[tm] Developer 7: Sun Performance Library[tm] Readme
Contents
- Introduction
- About the Forte Developer 7 Sun Performance Library
- New and Changed Features
- Software Corrections
- Problems and Workarounds
- Limitations and Incompatibilities
- Documentation Errors
A. Introduction
This document contains information about the Forte[tm] Developer 7 Sun Performance Library[tm]. This document describes the new features and software corrections that are introduced in this release and lists known problems, limitations, and incompatibilities. Information in this document updates and extends information in the software manuals.
Information in the release notes updates and extends information in all readme files. To access the release notes and the complete Forte Developer documentation set, go to the documentation index at file:/opt/SUNWspro/docs/index.html.
To view the text version of this readme, type the following at a command prompt:
more /opt/SUNWspro/READMEs/performance_library
To view the HTML version of this readme, go to:
file:/opt/SUNWspro/docs/index.html
Note - If your Forte Developer 7 software is not installed in the /opt directory, ask your system administrator for the equivalent path on your system.
B. About the Forte Developer 7 Sun Performance Library
This release of Sun Performance Library is available on the Solaris[tm] operating environment (SPARC[tm] Platform Edition) versions 7, 8, and 9.
Sun Performance Library is a set of optimized, high-speed mathematical subroutines for solving linear algebra and other numerically intensive problems. Sun Performance Library is based on a collection of public domain applications available from Netlib at http://www.netlib.org. Sun has enhanced these public domain applications and bundled them as the Sun Performance Library.
Sun Performance Library contains enhanced versions of the following standard libraries:
- LAPACK version 3.0. For solving linear algebra problems.
- BLAS1 (Basic Linear Algebra Subprograms). For performing vector-vector operations.
- BLAS2. For performing matrix-vector operations.
- BLAS3. For performing matrix-matrix operations.
Sun Performance Library includes the following additional routines:
- Fast Fourier transform (FFT) routines
- Sparse solver routines
- Sparse BLAS routines
- Interval BLAS routines
Compatibility
The LAPACK 3.0 routines in Sun Performance Library are compatible with the user routines from previous versions of LAPACK, including 1.x and 2.0, and with all routines in LAPACK 3.0. However, due to internal changes in LAPACK 3.0, compatibility with internal routines cannot be guaranteed. Internal routines that might be incompatible are called auxiliary routines in the LAPACK source code available from Netlib. Some information on auxiliary routines is included in the LAPACK Users' Guide, available from the Society for Industrial and Applied Mathematics (SIAM) at http://www.siam.org.
Because the user interfaces to the LAPACK auxiliary routines can change from release to release of LAPACK, the user interfaces to the LAPACK auxiliary routines in Sun Performance Library can change as well. Auxiliary routines compatible with LAPACK 3.0 are generally available for users to call; however, the auxiliary routines are not specifically documented, tested, or supported. Be aware that the user interfaces for the LAPACK auxiliary routines can change in future releases of Sun Performance Library, so that the user interfaces comply with the version of LAPACK supported by that version of the Sun Performance Library.
Documentation
The following Sun Performance Library documentation is available:
- Man pages (section 3p) for each function and subroutine in the library
- Sun Performance Library User's Guide, which describes and shows examples for:
- Using Sun Performance Library routines
- Using the Fortran and C interfaces
- Using optimization and parallelization options
- Using the sparse solver package
- Using the FFT routines
- Sun Performance Library Reference Manual, which provides HTML and PDF versions of the section 3p man pages. The Sun Performance Library Reference Manual is available at http://docs.sun.com.
For additional reference information, see the LAPACK Users' Guide 3rd ed., by Anderson, E. and others, SIAM, 1999, which is available from the Society for Industrial and Applied Mathematics (SIAM) or your local bookstore. The LAPACK Users' Guide is the official reference for the base LAPACK 3.0 routines available on Netlib and provides mathematical descriptions of the LAPACK 3.0 routines.
C. New and Changed Features
This section describes new and changed features for the Forte Developer 7 Sun Performance Library. For information about other Forte Developer components, see the What's New manual. To access this manual on your local system or network, go to file:/opt/SUNWspro/docs/index.html. You can also access this manual by going to http://docs.sun.com.
- New FFT interfaces supersede FFTPACK and VFFTPACK routines
- Interval BLAS routines added
- Sort and permute routines added
- Select Sparse BLAS routines have been parallelized
- Sparse Solver support for real, complex, and double complex data types added
- LINPACK removed from Sun Performance Library
- Legacy so.2 and so.3 Libraries not included with Forte Developer 7 Sun Performance Library
New FFT Interfaces Supersede FFTPACK and VFFTPACK Routines
This version of Sun Performance Library provides a new set of FFT interfaces that supersedes a subset of the FFTPACK and VFFTPACK routines provided in earlier Sun Performance Library releases. For information on using these routines, see the section 3p man pages. An overview of the new interface is documented in the fft man page in section 3p.
The new FFT routines will be described in the final version of the Using Sun Performance Library User's Guide, which will be available with the final release of Forte Developer 7.
The following table shows the mapping between the new FFT routines and the corresponding FFTPACK and VFFTPACK routines. (P) denotes FFT routines that are parallelized.
New FFT Routine Name: CFFTC (P)
Replaces FFTPACK/VFFTPACK Routine: CFFTI, CFFTF (P), CFFTB (P)
Description: Initialize the trigonometric weight and factor tables or compute the one-dimensional forward or inverse FFT of a complex sequence.

New FFT Routine Name: CFFTC2 (P)
Replaces FFTPACK/VFFTPACK Routine: CFFT2I, CFFT2F (P), CFFT2B (P)
Description: Initialize the trigonometric weight and factor tables or compute the two-dimensional forward or inverse FFT of a two-dimensional complex array.

New FFT Routine Name: CFFTC3 (P)
Replaces FFTPACK/VFFTPACK Routine: CFFT3I, CFFT3F (P), CFFT3B (P)
Description: Initialize the trigonometric weight and factor tables or compute the three-dimensional forward or inverse FFT of three-dimensional complex array.

New FFT Routine Name: CFFTCM (P)
Replaces FFTPACK/VFFTPACK Routine: VCFFTI, VCFFTF (P), VCFFTB (P)
Description: Initialize the trigonometric weight and factor tables or compute the one-dimensional forward or inverse FFT of a set of data sequences stored in a two-dimensional complex array.

New FFT Routine Name: CFFTS
Replaces FFTPACK/VFFTPACK Routine: RFFTI, RFFTB, EZFFTI, EZFFTB
Description: Initialize the trigonometric weight and factor tables or compute the one-dimensional inverse FFT of a complex sequence.

New FFT Routine Name: CFFTS2
Replaces FFTPACK/VFFTPACK Routine: RFFT2I, RFFT2B
Description: Initialize the trigonometric weight and factor tables or compute the two-dimensional inverse FFT of a two-dimensional complex array.

New FFT Routine Name: CFFTS3 (P)
Replaces FFTPACK/VFFTPACK Routine: RFFT3I, RFFT3B
Description: Initialize the trigonometric weight and factor tables or compute the three-dimensional inverse FFT of three-dimensional complex array.

New FFT Routine Name: CFFTSM
Replaces FFTPACK/VFFTPACK Routine: VRFFTI, VRFFTB (P)
Description: Initialize the trigonometric weight and factor tables or compute the one-dimensional inverse FFT of a set of data sequences stored in a two-dimensional complex array.

New FFT Routine Name: DFFTZ
Replaces FFTPACK/VFFTPACK Routine: DFFTI, DFFTF, DEZFFTI, DEZFFTF
Description: Initialize the trigonometric weight and factor tables or compute the one-dimensional forward FFT of a double precision sequence.

New FFT Routine Name: DFFTZ2
Replaces FFTPACK/VFFTPACK Routine: DFFT2I, DFFT2F
Description: Initialize the trigonometric weight and factor tables or compute the two-dimensional forward FFT of a two-dimensional double precision array.

New FFT Routine Name: DFFTZ3 (P)
Replaces FFTPACK/VFFTPACK Routine: DFFT3I, DFFT3F
Description: Initialize the trigonometric weight and factor tables or compute the three-dimensional forward FFT of three-dimensional double precision array.

New FFT Routine Name: DFFTZM
Replaces FFTPACK/VFFTPACK Routine: VDFFTI, VDFFTF (P)
Description: Initialize the trigonometric weight and factor tables or compute the one-dimensional forward FFT of a set of data sequences stored in a two-dimensional double precision array.

New FFT Routine Name: SFFTC
Replaces FFTPACK/VFFTPACK Routine: RFFTI, RFFTF, EZFFTI, EZFFTF
Description: Initialize the trigonometric weight and factor tables or compute the one-dimensional forward FFT of a real sequence.

New FFT Routine Name: SFFTC2
Replaces FFTPACK/VFFTPACK Routine: RFFT2I, RFFT2F
Description: Initialize the trigonometric weight and factor tables or compute the two-dimensional forward FFT of a two-dimensional real array.

New FFT Routine Name: SFFTC3 (P)
Replaces FFTPACK/VFFTPACK Routine: RFFT3I, RFFT3F
Description: Initialize the trigonometric weight and factor tables or compute the three-dimensional forward FFT of three-dimensional real array.

New FFT Routine Name: SFFTCM
Replaces FFTPACK/VFFTPACK Routine: VRFFTI, VRFFTF (P)
Description: Initialize the trigonometric weight and factor tables or compute the one-dimensional forward FFT of a set of data sequences stored in a two-dimensional real array.

New FFT Routine Name: ZFFTD
Replaces FFTPACK/VFFTPACK Routine: DFFTI, DFFTB, DEZFFTI, DEZFFTB
Description: Initialize the trigonometric weight and factor tables or compute the one-dimensional inverse FFT of a double complex sequence.

New FFT Routine Name: ZFFTD2
Replaces FFTPACK/VFFTPACK Routine: DFFT2I, DFFT2B
Description: Initialize the trigonometric weight and factor tables or compute the two-dimensional inverse FFT of a two-dimensional double complex array.

New FFT Routine Name: ZFFTD3 (P)
Replaces FFTPACK/VFFTPACK Routine: DFFT3I, DFFT3B
Description: Initialize the trigonometric weight and factor tables or compute the three-dimensional inverse FFT of three-dimensional double complex array.

New FFT Routine Name: ZFFTDM
Replaces FFTPACK/VFFTPACK Routine: VDFFTI, VDFFTB (P)
Description: Initialize the trigonometric weight and factor tables or compute the one-dimensional inverse FFT of a set of data sequences stored in a two-dimensional double complex array.

New FFT Routine Name: ZFFTZ (P)
Replaces FFTPACK/VFFTPACK Routine: ZFFTI, ZFFTF (P), ZFFTB (P)
Description: Initialize the trigonometric weight and factor tables or compute the one-dimensional forward or inverse FFT of a double complex sequence.

New FFT Routine Name: ZFFTZ2 (P)
Replaces FFTPACK/VFFTPACK Routine: ZFFT2I, ZFFT2F (P), ZFFT2B (P)
Description: Initialize the trigonometric weight and factor tables or compute the two-dimensional forward or inverse FFT of a two-dimensional double complex array.

New FFT Routine Name: ZFFTZ3 (P)
Replaces FFTPACK/VFFTPACK Routine: ZFFT3I, ZFFT3F (P), ZFFT3B (P)
Description: Initialize the trigonometric weight and factor tables or compute the three-dimensional forward or inverse FFT of three-dimensional double complex array.

New FFT Routine Name: ZFFTZM (P)
Replaces FFTPACK/VFFTPACK Routine: VZFFTI, VZFFTF (P), VZFFTB (P)
Description: Initialize the trigonometric weight and factor tables or compute the one-dimensional forward or inverse FFT of a set of data sequences stored in a two-dimensional double complex array.

FFTPACK and VFFTPACK routines are still included with this version of Sun Performance Library, but they are no longer supported. For information on using the FFTPACK and VFFTPACK routines, see the section 3p man pages or the book Using Sun Performance Library with Fortran and C User's Guide provided with the Sun WorkShop 6 update 2 release.
Interval BLAS Routines Added
This release of the Sun Performance Library includes the interval BLAS routines, which operate on interval scalars, interval vectors, and interval matrices (dense, banded, symmetric, and triangular).
The interval BLAS routines are listed in the following table.
amax_val_i       ge_interiorm_i   sp_acc_i         sy_winterm_i     tpsv_i
amin_val_i       ge_interm_i      sp_add_i         symm_i           tr_acc_i
axpby_i          ge_lrscale_i     sp_constructm_i  symv_i           tr_add_i
cancel_i         ge_midm_i        sp_copy_i        syr_i            tr_constructm_i
constructv_i     ge_norm_i        sp_disjm_i       tb_acc_i         tr_copy_i
copy_i           ge_permute_i     sp_emptyelem_i   tb_add_i         tr_disjm_i
disjv_i          ge_supm_i        sp_encm_i        tb_constructm_i  tr_emptyelem_i
dot_i            ge_trans_i       sp_hullm_i       tb_copy_i        tr_encm_i
emptyelev_i      ge_whullm_i      sp_infm_i        tb_disjm_i       tr_hullm_i
encv_i           ge_widthm_i      sp_interiorm_i   tb_emptyelem_i   tr_infm_i
fpinfo_i         ge_winterm_i     sp_interm_i      tb_encm_i        tr_interiorm_i
gb_acc_i         gemm_i           sp_lrscale_i     tb_hullm_i       tr_interm_i
gb_add_i         gemv_i           sp_midm_i        tb_infm_i        tr_midm_i
gb_constructm_i  ger_i            sp_norm_i        tb_interiorm_i   tr_norm_i
gb_copy_i        hullv_i          sp_supm_i        tb_interm_i      tr_supm_i
gb_diag_scale_i  infv_i           sp_whullm_i      tb_midm_i        tr_whullm_i
gb_disjm_i       interiorv_i      sp_widthm_i      tb_norm_i        tr_widthm_i
gb_emptyelem_i   interv_i         sp_winterm_i     tb_supm_i        tr_winterm_i
gb_encm_i        midv_i           spmv_i           tb_whullm_i      trmm_i
gb_hullm_i       norm_i           spr_i            tb_widthm_i      trmv_i
gb_infm_i        permute_i        sum_i            tb_winterm_i     trsm_i
gb_interiorm_i   rscale_i         sumsq_i          tbmv_i           trsv_i
gb_interm_i      sb_acc_i         supv_i           tbsv_i           waxpby_i
gb_lrscale_i     sb_add_i         swap_i           tp_acc_i         wcancel_i
gb_midm_i        sb_constructm_i  sy_acc_i         tp_add_i         whullv_i
gb_norm_i        sb_copy_i        sy_add_i         tp_constructm_i  widthv_i
gb_supm_i        sb_disjm_i       sy_constructm_i  tp_copy_i        winterv_i
gb_whullm_i      sb_emptyelem_i   sy_copy_i        tp_disjm_i
gb_widthm_i      sb_encm_i        sy_disjm_i       tp_emptyelem_i
gb_winterm_i     sb_hullm_i       sy_emptyelem_i   tp_encm_i
gbmv_i           sb_infm_i        sy_encm_i        tp_hullm_i
ge_acc_i         sb_interiorm_i   sy_hullm_i       tp_infm_i
ge_add_i         sb_interm_i      sy_infm_i        tp_interiorm_i
ge_constructm_i  sb_lrscale_i     sy_interiorm_i   tp_interm_i
ge_copy_i        sb_midm_i        sy_interm_i      tp_midm_i
ge_diag_scale_i  sb_norm_i        sy_lrscale_i     tp_norm_i
ge_disjm_i       sb_supm_i        sy_midm_i        tp_supm_i
ge_emptyelem_i   sb_whullm_i      sy_norm_i        tp_whullm_i
ge_encm_i        sb_widthm_i      sy_supm_i        tp_widthm_i
ge_hullm_i       sb_winterm_i     sy_whullm_i      tp_winterm_i
ge_infm_i        sbmv_i           sy_widthm_i      tpmv_i

See the section 3p man pages for information on using each routine.
Reference information for the interval BLAS routines is also located in Appendix C of the Basic Linear Algebra Subprograms Technical (BLAST) Forum Standard, located at http://www.netlib.org/blas/blast-forum/.
Important: Information in the section 3p man pages supersedes information in Appendix C.
Users can access the interval BLAS routines in the library by doing one of the following:
a. Use the -xia flag at link time to link with all the necessary libraries. For example,
f95 -c -xia -dalign foo.f
f95 -o foo -xia foo.o -lsuniperf
or
f95 -xia -dalign foo.f -lsuniperf
b. Use -xlic_lib=suniperf. For example,
f95 -c -xia -dalign foo.f
f95 -o foo foo.o -xlic_lib=suniperf
Sort and Permute Routines Added
The following sort and permute routines have been added to this version of Sun Performance Library. For more information, see the section 3p man pages.
blas_dsort      blas_isort      blas_ssort
blas_dsortv     blas_isortv     blas_ssortv
blas_dpermute   blas_ipermute   blas_spermute
Select Sparse BLAS Routines Have Been Parallelized
All of the sparse matrix-matrix multiply (*mm) routines and the sparse triangular solve (*sm) routines have been parallelized. New sparse matrix-matrix multiply and sparse triangular solve routines have also been added to this release. The routines that have been parallelized are listed in the following table.
Existing Routines                    New Routines
Matrix-Matrix     Sparse Triangular  Matrix-Matrix     Sparse Triangular
Multiply          Solve              Multiply          Solve
Routines          Routines           Routines          Routines
dbcomm            dbdism             cbcomm            cbdism
dbdimm            dbelsm             cbdimm            cbelsm
dbelmm            dbscsm             cbelmm            cbscsm
dbscmm            dbsrsm             cbscmm            cbsrsm
dbsrmm            dcscsm             cbsrmm            ccscsm
dcoomm            dcsrsm             ccoomm            ccsrsm
dcscmm            ddiasm             ccscmm            cdiasm
dcsrmm            dellsm             ccsrmm            cellsm
ddiamm            djadsm             cdiamm            cjadsm
dellmm            dskysm             cellmm            cskysm
djadmm            dvbrsm             cjadmm            cvbrsm
dskymm            sbdism             cskymm            zbdism
dvbrmm            sbelsm             cvbrmm            zbelsm
sbcomm            sbscsm             zbcomm            zbscsm
sbdimm            sbsrsm             zbdimm            zbsrsm
sbelmm            scscsm             zbelmm            zcscsm
sbscmm            scsrsm             zbscmm            zcsrsm
sbsrmm            sdiasm             zbsrmm            zdiasm
scoomm            sellsm             zcoomm            zellsm
scscmm            sjadsm             zcscmm            zjadsm
scsrmm            sskysm             zcsrmm            zskysm
sdiamm            svbrsm             zdiamm            zvbrsm
sellmm                               zellmm
sjadmm                               zjadmm
sskymm                               zskymm
svbrmm                               zvbrmm
Sparse Solver Support for Real, Complex, and Double Complex Data Types
The following sparse solver routines that provide support for real, complex, and double complex data types have been added to this version of Sun Performance Library. For more information, see the section 3p man pages.
Real      Complex   Double Complex
sgssco    cgssco    zgssco
sgssfa    cgssfa    zgssfa
sgssin    cgssin    zgssin
sgssps    cgssps    zgssps
sgsssl    cgsssl    zgsssl
sgssda    cgssda    zgssda
sgssfs    cgssfs    zgssfs
sgssor    cgssor    zgssor
sgssrp    cgssrp    zgssrp
sgssuo    cgssuo    zgssuo
LINPACK Removed From Sun Performance Library
LINPACK is no longer included with Sun Performance Library. LAPACK version 3.0 supersedes LINPACK and all previous versions of LAPACK. If legacy user codes that call LINPACK routines cannot be modified to use LAPACK routines, the public domain version of LINPACK can still be obtained from Netlib at http://www.netlib.org/linpack/index.html. Listed below are the LINPACK routines that have been removed.
Legacy so.2 and so.3 Libraries Not Included with Forte Developer 7 Sun Performance Library
To reduce the size of the Sun Performance Library for Forte Developer 7, only the so.4 libraries have been included, and the legacy so.2 and so.3 libraries have been removed from this release.
To maintain compatibility with applications that require the legacy so.2 and so.3 libraries, the FD7 packages will search for previous installations of the Sun Performance Library on the system and install symlinks to the location of the previously installed libraries. If no prior Sun Performance Library packages are found, the FD7 packages will install only the current so.4 libraries. To have symlinks to the legacy libraries, you will need to install the legacy packages before installing the FD7 Sun Performance Library packages.
The legacy libraries for a particular version of Sun Performance Library are located on the Forte Developer product CD for that release. For example, the packages for the Forte Developer 6 update 2 release are located in /cdrom/cdrom0/products/packages.
The shared libraries are contained in the following packages.
Package     Library
SPROplms    Sun Performance Library 32-bit (Shared/MT)
SPROpls     Sun Performance Library 32-bit (Shared)
SPROplmsx   Sun Performance Library 64-bit (Shared/MT)
SPROplsx    Sun Performance Library 64-bit (Shared)

You can install one or more of the packages, based upon your needs.
To install the packages, use the pkgadd command. See the pkgadd man page for usage instructions.
If the legacy libraries have not been installed before installing FD7 and you want the symlinks to the legacy libraries, you will need to do the following:
- Uninstall the FD7 Shared Performance Library packages
- Install the legacy shared packages
- Install the FD7 shared packages
D. Software Corrections
Documentation of Complex Scalar Values in Man Pages and sunperf.h Files for Sun Performance Library C Interfaces Corrected
Complex scalars in the Sun Performance Library C interfaces are passed by reference. However, the Sun Performance Library section 3p man pages and sunperf.h files for Forte Developer 6, Forte Developer 6 update 1, and Forte Developer 6 update 2 incorrectly showed complex scalars as being passed by value (4531571). Sun Performance Library man pages and the sunperf.h file now correctly show complex scalars as being passed by reference. For example, the man page for caxpy now shows the complex scalar alpha as being passed by reference.
void caxpy(int n, complex *alpha, complex *x, int incx, complex *y, int incy);
The Sun Performance Library Reference Manual, a PDF document available on docs.sun.com, also incorrectly displays the complex scalars as being passed by value. The Sun Performance Library Reference Manual for Forte Developer 6, Forte Developer 6 update 1, and Forte Developer 6 update 2 will not be updated with the corrections. The version of the Sun Performance Library Reference Manual that will be released with the final version of Sun Performance Library for Forte Developer 7 will include the corrections.
E. Problems and Workarounds
This section discusses known software problems and possible workarounds for those problems. For updates, check Forte Developer Hot Product News at http://www.sun.com/forte/fcc/hotnews.html.
Slow Link Times When Linking With v9 (64-bit) libsunperf Libraries
A link editor bug (4369068) in Solaris 7 and 8 can cause slow link times when linking with 64-bit libraries that have many weak symbols, such as libsunperf. This bug affects only build (make) time, not runtime performance.
This link editor bug is fixed by the following patches:
- Solaris 7 operating environment patch 106950-15 (minimum revision required)
- Solaris 8 operating environment patch 109147-09 (minimum revision required)
These patches can be downloaded from http://sunsolve.sun.com.
Some FFTPACK and VFFTPACK Routines Not Accessible Through Their Generic Names
A number of FFTPACK and VFFTPACK routines are not accessible through their generic names when shared versions of libsunperf are used.
The code in the following example
foo.f:
    ...
    USE SUNPERF
    ...
    CALL COST(M,N,X,XT,LD)
    ...

f95 -dalign foo.f -xlic_lib=sunperf

will result in a link error. The link error can be fixed by using the static library, as follows:

f95 -dalign foo.f -Bstatic -xlic_lib=sunperf -Bdynamic

The following symbols are affected.
generic name  specific name(s)
sint          ___pl_dsint_f90, ___pl_vsint_f90, ___pl_vdsint_f90
cosqi         ___pl_dcosqi_f90, ___pl_vdcosqi_f90
costi         ___pl_dcosti_f90, ___pl_vdcosti_f90
sinqi         ___pl_dsinqi_f90
sinti         ___pl_dsinti_f90
cosqb         ___pl_dcosqb_f90, ___pl_vcosqb_f90, ___pl_vdcosqb_f90
cosqf         ___pl_dcosqf_f90, ___pl_vcosqf_f90, ___pl_vdcosqf_f90
cost          ___pl_dcost_f90, ___pl_vcost_f90, ___pl_vdcost_f90
sinqb         ___pl_dsinqb_f90, ___pl_vsinqb_f90, ___pl_vdsinqb_f90
sinqf         ___pl_vsinqf_f90, ___pl_dsinqf_f90, ___pl_vdsinqf_f90
vsinqi        ___pl_vdsinqi_f90
vsinti        ___pl_vdsinti_f90
vffti         ___pl_vcffti_f90, ___pl_vzffti_f90
sint_64       ___pl_dsint_f90_64, ___pl_vsint_f90_64, ___pl_vdsint_f90_64
cosqi_64      ___pl_dcosqi_f90_64, ___pl_vdcosqi_f90_64
costi_64      ___pl_dcosti_f90_64, ___pl_vdcosti_f90_64
sinqi_64      ___pl_dsinqi_f90_64
sinti_64      ___pl_dsinti_f90_64
cosqb_64      ___pl_dcosqb_f90_64, ___pl_vcosqb_f90_64, ___pl_vdcosqb_f90_64
cosqf_64      ___pl_dcosqf_f90_64, ___pl_vcosqf_f90_64, ___pl_vdcosqf_f90_64
cost_64       ___pl_dcost_f90_64, ___pl_vcost_f90_64, ___pl_vdcost_f90_64
sinqb_64      ___pl_dsinqb_f90_64, ___pl_vsinqb_f90_64, ___pl_vdsinqb_f90_64
sinqf_64      ___pl_vsinqf_f90_64, ___pl_dsinqf_f90_64, ___pl_vdsinqf_f90_64
vsinqi_64     ___pl_vdsinqi_f90_64
vsinti_64     ___pl_vdsinti_f90_64
vffti_64      ___pl_vcffti_f90_64, ___pl_vzffti_f90_64
F. Limitations and Incompatibilities
There is no new information at this time.
G. Documentation Errors
There is no new information at this time.
Copyright © 2002 Sun Microsystems, Inc. All rights reserved. Use is subject to license terms.