Posted by a forum user. Posted: 2022-05-15 08:58
Answer from a community member. Time: 2023-10-16 17:25
ifort/icc
Ver: 11.0.083
./configure -c++=icpc -cc=icc -f77=ifort -f90=ifort --prefix=/home/soft/mpi/mpich-1.2.7-intel
make
make install
vi ~/.bashrc
Add the following:
##############MPICH###########
export PATH=/home/soft/mpi/mpich-1.2.7-intel/bin:$PATH
################intel compiler###################
. /home/soft/intel/Compiler/11.0/083/bin/intel64/ifortvars_intel64.sh
. /home/soft/intel/Compiler/11.0/083/bin/intel64/iccvars_intel64.sh
###############intel mkl###################
export LD_LIBRARY_PATH=/home/soft/intel/mkl/10.1.2.024/lib/em64t/:$LD_LIBRARY_PATH
tar zxf fftw-2.1.5.tar.gz
cd fftw-2.1.5/
export F77=ifort
export CC=icc
./configure --prefix=/home/soft/mathlib/fftwv215-mpich --enable-mpi
make
make install
Switch to the home directory of the installing user:
su - mjhe
mkdir ~/WIEN2k_09
cp WIEN2k_09.tar ~/WIEN2k_09
cd ~/WIEN2k_09
tar xf WIEN2k_09.tar
./expand_lapw
./siteconfig_lapw
Several build parameters need to be changed (the following can be used as a reference):
specify a system
K Linux (Intel ifort 10.1 compiler + mkl 10.0 )
specify compiler
Current selection: ifort
Current selection: icc
specify compiler options, BLAS and LAPACK
Current settings:
O Compiler options: -FR -mp1 -w -prec_div -pc80 -pad -align -DINTEL_VML -traceback
L Linker Flags: $(FOPT) -L/home/soft/intel/mkl/10.1.2.024/lib/em64t/ -pthread -i-static
P Preprocessor flags '-DParallel'
Use the static MKL libraries:
R R_LIB (LAPACK+BLAS): /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_lapack.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libguide.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_core.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_em64t.a
configure Parallel execution
Shared Memory Architecture? (y/n):n
Remote shell (default is ssh) = ssh
Do you have MPI and Scalapack installed and intend to run
finegrained parallel? (This is usefull only for BIG cases)!
(y/n) n
Current selection: mpiifort
Current settings:
Use static libraries:
RP RP_LIB(SCALAPACK+PBLAS): -lmkl_intel_lp64 /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_scalapack_lp64.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_sequential.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_blacs_lp64.a /home/soft/mathlib/fftwv215-mpich/lib/libfftw_mpi.a /home/soft/mathlib/fftwv215-mpich/lib/libfftw.a -lmkl /home/soft/intel/mkl/10.1.2.024/lib/em64t/libguide.a
Alternatively:
RP RP_LIB(SCALAPACK+PBLAS): -lmkl_intel_lp64 /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_scalapack_lp64.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_sequential.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_blacs_lp64.a -L/data1/soft/lib/lib/ -lfftw_mpi -lfftw -lmkl /data1/soft/intel/mkl/10.0.3.020/lib/em64t/libguide.a
FP FPOPT(par.comp.options): $(FOPT)
MP MPIRUN commando : mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_
Dimension Parameters
The defaults can be kept here, or, on machines with more than 4 GB of memory, set:
PARAMETER (NMATMAX= 30000)
PARAMETER (NUME= 1000)
Go to the compile section:
Compile/Recompile
A Compile all programs (suggested)
Errors occur mainly when building the five MPI-parallel executables, so after compiling check that the following files exist:
./SRC_lapw0/lapw0_mpi
./SRC_lapw1/lapw1_mpi
./SRC_lapw1/lapw1c_mpi
./SRC_lapw2/lapw2_mpi
./SRC_lapw2/lapw2c_mpi
./userconfig_lapw
editor shall be: vi
Press Enter for all the other prompts.
Edit .bashrc and comment out the following line:
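The check for the five MPI executables listed above can be scripted. The following is a minimal sketch, not part of WIEN2k: the function name check_mpi_binaries is hypothetical, and it assumes it is given the top of the WIEN2k source tree (for example ~/WIEN2k_09).

```shell
# check_mpi_binaries ROOT : report which of the five MPI executables exist
# under ROOT. The return status is the number of missing files.
# (Hypothetical helper; the file list is taken from the text above.)
check_mpi_binaries() {
    root="${1:-.}"
    missing=0
    for exe in SRC_lapw0/lapw0_mpi SRC_lapw1/lapw1_mpi SRC_lapw1/lapw1c_mpi \
               SRC_lapw2/lapw2_mpi SRC_lapw2/lapw2c_mpi
    do
        if [ -x "$root/$exe" ]; then
            echo "ok       $exe"
        else
            echo "MISSING  $exe"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}
```

Calling check_mpi_binaries ~/WIEN2k_09 prints one line per executable; a non-zero status suggests recompiling the corresponding SRC_ directory and reading its compile log.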
#ulimit -s unlimited
Edit parallel_options:
setenv WIEN_MPIRUN "mpirun -machinefile _HOSTS_ -np _NP_ _EXEC_"
Start the Apache service as root:
service apache2 start
As a regular user, run:
w2web
This opens port 7890 for the WIEN2k web interface.
Running a serial calculation:
Using the bundled TiC example:
mkdir TiC
cd TiC
cp ../TiC.struct .
Generate the atomic information:
instgen_lapw
Initialize the case:
init_lapw -b
Run the calculation:
run_lapw
The program output is written to the *.output files; if something goes wrong, look in TiC.dayfile.
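The directory setup for the serial run can be wrapped in a small helper. This is a sketch under the assumption, taken from the TiC steps above, that the case directory is named after the struct file; prepare_case is a hypothetical name, not a WIEN2k command.

```shell
# prepare_case STRUCT_FILE : create ./<case>/ next to the current directory,
# copy <case>.struct into it and print the case name.
# (Hypothetical helper; it only prepares files and runs no WIEN2k program.)
prepare_case() {
    struct="$1"
    [ -f "$struct" ] || { echo "no such file: $struct" >&2; return 1; }
    case_name=$(basename "$struct" .struct)
    mkdir -p "$case_name" && cp "$struct" "$case_name/" && echo "$case_name"
}
```

With this in place, the serial TiC run could be started as: cd "$(prepare_case TiC.struct)" && instgen_lapw && init_lapw -b && run_lapw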
Running a parallel calculation:
Check that the parallel environment is set up:
testpara_lapw
Check the status of the case being computed:
testpara1_lapw
testpara2_lapw
The contents of the .machines file determine whether k-point or MPI parallelism is used:
k-point:
granularity:1
1:node31:1
1:node31:1
1:node32:1
1:node32:1
lapw0:node31:2 node32:2
extrafine:1
mpi:
granularity:1
1:node31:2
1:node32:2
lapw0:node31:2 node32:2
extrafine:1
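The two .machines layouts above can be generated from a host list. The sketch below is an illustration only: make_machines and the node file name are hypothetical, and the node file is assumed to list one hostname per process slot, grouped by host, in the style of a PBS node file.

```shell
# make_machines MODE NODEFILE : print a .machines file to stdout.
# MODE is "kpoint" or "mpi"; NODEFILE has one hostname per slot.
# (Hypothetical helper reproducing the two layouts shown above.)
make_machines() {
    mode="$1"
    nodefile="$2"
    np=$(wc -l < "$nodefile")
    nodes=$(uniq "$nodefile" | wc -l)
    per_node=$((np / nodes))

    echo "granularity:1"
    case "$mode" in
        kpoint)
            # one single-process k-point job per slot
            while read -r host; do
                echo "1:$host:1"
            done < "$nodefile"
            ;;
        mpi)
            # one MPI job per node, using every slot on that node
            for host in $(uniq "$nodefile"); do
                echo "1:$host:$per_node"
            done
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac

    # lapw0 line: host:count pairs separated by single spaces
    printf 'lapw0:'
    sep=""
    for host in $(uniq "$nodefile"); do
        printf '%s%s:%s' "$sep" "$host" "$per_node"
        sep=" "
    done
    printf '\n'
    echo "extrafine:1"
}
```

For a node file containing node31, node31, node32, node32, running make_machines kpoint nodes.txt > .machines reproduces the k-point layout above, and the mpi mode reproduces the MPI layout.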
Run the calculation:
run_lapw -p
cat wien2k.pbs
###########################################################################
# #
# Script for submitting parallel wien2k_09 jobs to Dawning cluster. #
# #
###########################################################################
###########################################################################
# Lines that begin with #PBS are PBS directives (not comments).
# True comments begin with # (i.e., # followed by a space).
###########################################################################
#PBS -S /bin/bash
#PBS -N TiO2
#PBS -j oe
#PBS -l nodes=1:ppn=8
#PBS -V
#############################################################################
# -S: shell the job will run under
# -o: name of the queue error filename
# -j: merges stdout and stderr to the same file
# -l: resources required by the job: number of nodes and processors per node
# -l: resources required by the job: maximum job time length
#############################################################################
#########parallel mode is mpi/kpoint############
PARALLEL=mpi    # set to mpi for MPI parallelism or to kpoint for k-point parallelism
echo $PARALLEL
################################################
NP=`cat ${PBS_NODEFILE} | wc -l`
NODE_NUM=`cat $PBS_NODEFILE|uniq|wc -l`
NP_PER_NODE=`expr $NP / $NODE_NUM`
username=`whoami`
export WIENROOT=/home/users/mjhe/wien2k_09/
export PATH=$PATH:$WIENROOT:.
WIEN2K_RUNDIR=/scratch/${username}.${PBS_JOBID}
export SCRATCH=${WIEN2K_RUNDIR}
# create scratch dir
if [ ! -d "$WIEN2K_RUNDIR" ]; then
mkdir -p "$WIEN2K_RUNDIR"
echo Scratch directory $WIEN2K_RUNDIR created.
fi
cd $PBS_O_WORKDIR
###############creating .machines################
case $PARALLEL in
mpi)
echo granularity:1 >.machines
for i in `cat $PBS_NODEFILE |uniq`
do
echo 1:$i:$NP_PER_NODE >> .machines
done
printf 'lapw0:' >> .machines
##### run lapw0 with MPI #####
sep=""
for i in `cat ${PBS_NODEFILE}|uniq`
do
printf '%s%s:%s' "$sep" "$i" "$NP_PER_NODE" >> .machines
sep=" "
done
#################################
#### for cases where MPI-parallel lapw0 fails, use the following line instead of the loop ####
# printf '%s:1' `cat ${PBS_NODEFILE}|uniq|head -1` >> .machines
#############end#################
printf '\n' >> .machines
echo extrafine:1 >> .machines
;;
kpoint)
echo granularity:1 >.machines
for i in `cat $PBS_NODEFILE`
do
echo 1:$i:1 >> .machines
done
printf 'lapw0:' >> .machines
##### run lapw0 with MPI #####
sep=""
for i in `cat ${PBS_NODEFILE}|uniq`
do
printf '%s%s:%s' "$sep" "$i" "$NP_PER_NODE" >> .machines
sep=" "
done
#################################
#### for cases where MPI-parallel lapw0 fails, use the following line instead of the loop ####
# printf '%s:1' `cat ${PBS_NODEFILE}|uniq|head -1` >> .machines
#############end#################
printf '\n' >> .machines
echo extrafine:1 >> .machines
;;
esac
#################end creating####################
####### Run the parallel executable WIEN2K#########
instgen_lapw
init_lapw -b
clean -s
echo "################## start time is `date` ########################"
run_lapw -p
echo "################### end time is `date` ########################"
rm -rf $WIEN2K_RUNDIR
########################END########################
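The script above removes the scratch directory only on its last line, so the directory is left behind if the script stops earlier. One possible variation, sketched here as an assumption rather than as part of the original script, is to register the cleanup with an EXIT trap when the directory is created; setup_scratch is a hypothetical name.

```shell
# setup_scratch DIR : create DIR, export it as SCRATCH and remove it again
# when the shell exits. (Hypothetical helper; SCRATCH is the variable the
# job script above exports for WIEN2k.)
setup_scratch() {
    SCRATCH="$1"
    export SCRATCH
    mkdir -p "$SCRATCH" || return 1
    # The EXIT trap runs whenever the shell exits, including after an
    # early "exit" caused by a failed command.
    trap 'rm -rf "$SCRATCH"' EXIT
}
```

In the job script this would replace the mkdir block and the final rm -rf, for example: setup_scratch /scratch/${username}.${PBS_JOBID}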
The places that usually need to be changed are marked in bold.
The script also initializes the case, so it can only be run when a *.struct file is already present.
Test platform:
CB65
Shanghai 2382: 16GB, 147GB SAS
1000Gb / mpich v1.2.7
TiO2 case:
NMATMAX=30000
k-point runs (lapw0 MPI-parallel; lapw1 and lapw2 k-point parallel):
2 processes: 4m44s
4 processes: 4m30s
8 processes: 6m29s
MPI runs (lapw0, lapw1 and lapw2 all MPI-parallel):
2 processes: 7m53s
4 processes: 6m56s
8 processes: 9m5s
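To compare the TiO2 timings above numerically, it helps to convert them to seconds. The helper below is a small sketch; to_seconds is a hypothetical name and it assumes the NmSSs format used above.

```shell
# to_seconds TIME : convert a time written as <minutes>m<seconds>s to seconds.
to_seconds() {
    # Splitting on "m" and "s" leaves minutes in $1 and seconds in $2.
    echo "$1" | awk -F'[ms]' '{ print $1 * 60 + $2 }'
}

# The six TiO2 timings quoted above, k-point runs first.
for t in 4m44s 4m30s 6m29s 7m53s 6m56s 9m5s; do
    echo "$t = $(to_seconds "$t") s"
done
```

The fastest run (4 processes, k-point) is 270 s and the slowest (8 processes, MPI) is 545 s, so for this case the slowest configuration takes roughly twice as long as the fastest.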
Standard benchmark cases:
Officially provided test cases:
Serial:
test_case
export OMP_NUM_THREADS=1
time x lapw1 -c
SUM OF WALL CLOCK TIMES: 135.0 (INIT = 1.0 + K-POINTS = 133.9)
export OMP_NUM_THREADS=4
time x lapw1 -c
SUM OF WALL CLOCK TIMES: 62.0 (INIT = 1.0 + K-POINTS = 61.0)
export OMP_NUM_THREADS=8
time x lapw1 -c
SUM OF WALL CLOCK TIMES: 56.2 (INIT = 1.0 + K-POINTS = 55.2)
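The thread-scaling numbers above can be turned into speedup and parallel efficiency with a little arithmetic. This is a sketch with hypothetical helper names (speedup, efficiency); the inputs are the wall-clock totals quoted above for 1, 4 and 8 threads.

```shell
# speedup BASE T : print BASE/T rounded to two decimals.
speedup() {
    awk -v base="$1" -v t="$2" 'BEGIN { printf "%.2f\n", base / t }'
}

# efficiency BASE T N : speedup divided by the thread count N, in percent.
efficiency() {
    awk -v base="$1" -v t="$2" -v n="$3" \
        'BEGIN { printf "%.0f%%\n", 100 * base / (t * n) }'
}

echo "4 threads: speedup $(speedup 135.0 62.0), efficiency $(efficiency 135.0 62.0 4)"
echo "8 threads: speedup $(speedup 135.0 56.2), efficiency $(efficiency 135.0 56.2 8)"
```

For these numbers the speedup is 2.18 at 4 threads and 2.40 at 8 threads, so going from 4 to 8 threads gains little on this test case.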
Parallel:
time x lapw1 -p
test_case
2 kpoint:
test_case.output1: SUM OF WALL CLOCK TIMES: 62.0 (INIT = 1.0 + K-POINTS = 61.0)
test_case.output1_1: SUM OF WALL CLOCK TIMES: 138.5 (INIT = 1.0 + K-POINTS = 137.5)
4 kpoint:
test_case.output1: SUM OF WALL CLOCK TIMES: 62.0 (INIT = 1.0 + K-POINTS = 61.0)
test_case.output1_1: SUM OF WALL CLOCK TIMES: 134.9 (INIT = 1.0 + K-POINTS = 133.9)
mpi-benchmark
2 processes:
mpi-benchmark.output1_1: TIME HAMILT (CPU) = 134.1, HNS = 116.4, HORB = 0.0, DIAG = 697.5
mpi-benchmark.output1_1: TOTAL CPU TIME: 950.0 (INIT = 1.9 + K-POINTS = 948.1)
mpi-benchmark.output1_1: SUM OF WALL CLOCK TIMES: 1138.9 (INIT = 2.2 + K-POINTS = 1136.7)
4 processes:
mpi-benchmark.output1_1: TIME HAMILT (CPU) = 67.8, HNS = 70.5, HORB = 0.0, DIAG = 420.6
mpi-benchmark.output1_1: TOTAL CPU TIME: 560.7 (INIT = 1.8 + K-POINTS = 558.9)
mpi-benchmark.output1_1: SUM OF WALL CLOCK TIMES: 643.2 (INIT = 2.2 + K-POINTS = 640.9)
8 processes:
mpi-benchmark.output1_1: TIME HAMILT (CPU) = 40.4, HNS = 44.9, HORB = 0.0, DIAG = 422.0
mpi-benchmark.output1_1: TOTAL CPU TIME: 509.3 (INIT = 1.9 + K-POINTS = 507.4)
mpi-benchmark.output1_1: SUM OF WALL CLOCK TIMES: 614.3 (INIT = 2.2 + K-POINTS = 612.0)
16 processes:
mpi-benchmark.output1_1: TIME HAMILT (CPU) = 22.6, HNS = 32.5, HORB = 0.0, DIAG = 140.5
mpi-benchmark.output1_1: TOTAL CPU TIME: 197.5 (INIT = 1.9 + K-POINTS = 195.7)
mpi-benchmark.output1_1: SUM OF WALL CLOCK TIMES: 1190.0 (INIT = 2.8 + K-POINTS = 1187.2)
The timings can be displayed with grep TIME *output1*
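When only the wall-clock totals are of interest, the grep output can be reduced to one number per file. The following is a sketch; wall_times is a hypothetical name, and it assumes the "SUM OF WALL CLOCK TIMES:" line format shown in the results above.

```shell
# wall_times FILE... : print "<file> <total seconds>" for every
# "SUM OF WALL CLOCK TIMES" line found in the given output files.
wall_times() {
    awk '/SUM OF WALL CLOCK TIMES/ {
        split($0, parts, "TIMES:")      # parts[2] is the text after the label
        split(parts[2], fields, " ")    # fields[1] is the total in seconds
        print FILENAME, fields[1]
    }' "$@"
}
```

Run it in the case directory as wall_times *output1* to get a compact list that is easy to compare across runs.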