parallel computing: the use of a parallel computer to reduce the time needed to solve a single computational problem.
parallel computer: a multiple-processor computer system supporting parallel programming.
multicomputer; centralized multiprocessor
A centralized multiprocessor (also called a symmetrical multiprocessor, or SMP) is a more highly integrated system in which all CPUs share access to a single global memory. This shared memory supports communication and synchronization.
parallel programming: programming in a language that allows you to explicitly indicate how different portions of the computation may be executed concurrently by different processors.
MPI (Message Passing Interface): a standard specification for message-passing libraries.
The MPI library supports parallel programming through message passing. It allows processors that do not share memory to cooperate in performing a parallel computation.
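A minimal sketch of such an MPI program in C (the file name hello.c is assumed here so that it matches the transfer example later in these notes):

/* hello.c - each MPI process prints its rank; a minimal message-passing program */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);                  /* start the MPI runtime       */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* id of this process          */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes   */
    printf("Hello from process %d of %d\n", rank, size);
    MPI_Finalize();                          /* shut down the MPI runtime   */
    return 0;
}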
Data Dependence Graphs
Data Parallelism: independent tasks applying the same operation to different elements of a data set.
Functional Parallelism: independent tasks applying different operations to different data elements.
Pipelining: the output of one stage is the input of the next.
(Parallelism in a data dependence graph: nodes represent tasks, the letters inside nodes represent the operations performed, and edges represent dependences between tasks. (a) A graph with data parallelism: different workers can perform operation B at the same time. (b) A graph with functional parallelism: the tasks performing operations B, C, and D can execute concurrently. (c) A completely sequential dependence graph; however, if every task takes the same amount of time and multiple problem instances must be processed, then operation C of instance i, operation B of instance i+1, and operation A of instance i+2 can be processed at the same time. This is called pipelining.)
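As a small illustrative sketch (not from the text), the loop below has data parallelism: the same operation is applied independently to every element, so the iterations could be divided among different processors, for example one block of the array per MPI process:

/* Data parallelism sketch: every iteration applies the same operation
 * to a different element, and no iteration depends on another. */
void square_all(double a[], int n)
{
    for (int i = 0; i < n; i++)
        a[i] = a[i] * a[i];
}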
Data Clustering
data mining
scientific data analysis
Programming parallel computers
Extend a Compiler: one approach to the problem of programming parallel computers is to develop parallelizing compilers that can detect and exploit the parallelism in existing programs written in a sequential language (see the loop sketch after this list of approaches).
Extend a Sequential Programming Language:
Add a Parallel Programming Layer
CODE (Computationally Oriented Display Environment)
HeNCE (Heterogeneous Network Computing Environment)
These systems allow the user to depict a parallel program as a directed graph, where nodes represent sequential procedures and arcs represent data dependences among procedures.
Create a Parallel Language
HPF (High Performance Fortran)
HPC (High Performance Computing)
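For the compiler approach mentioned above, a short illustrative sketch (not from the text): the iterations of this sequential C loop are independent, so a parallelizing compiler could detect that and execute them concurrently on different processors:

/* No iteration reads a value written by another iteration,
 * so a parallelizing compiler may distribute the iterations. */
void vector_add(const double a[], const double b[], double c[], int n)
{
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}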
The process of delivering a C program with MPI to a remote Linux server and launching the program with a dedicated MPI configuration.
Principles:
Install PuTTY on a Windows machine and then log in to a remote Linux server (hpc.fafu.edu.cn) with password-based authentication.
pscp: place pscp under C:\WINDOWS\SYSTEM32.
Press WIN+R and open CMD.
Send a file to the Linux server (e.g., hello.c is on drive D):
Enter:
pscp D:\hello.c 3156010011@hpc.fafu.edu.cn:/export/home/student/3156010011
View the help information with man mpicc and man mpirun.
The command mpirun -help lists all the available options; use these options as appropriate to run the application better and keep the system healthy.
The basic format of mpirun is:
mpirun [mpirun-options ...] <progname> [options ...]
where the main [mpirun-options ...] are as follows:
-np <np> The number of processes to load.
-p4pg <pgfile> Loads user processes as specified by the pgfile file. The pgfile describes which user processes are loaded on which nodes. The format of the file is:
The first line: <node name> <0> <user process to load - an absolute path may be used>
The second line: <node name> <1> <user process to load - an absolute path may be used>
Line n: <node name> <1> <user process to load - an absolute path may be used>
where n is the number of user processes to load. The node names may be the same or different. When this option is used, the -np option is invalid.
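A hedged example of both forms (the node names node01/node02 and the executable name hello are assumptions, not taken from these notes):

mpirun -np 4 ./hello

pgfile:
node01 0 /export/home/student/3156010011/hello
node02 1 /export/home/student/3156010011/hello
node02 1 /export/home/student/3156010011/hello

mpirun -p4pg pgfile ./hello

The first form loads 4 processes; the second loads processes exactly as listed in pgfile, and -np would be ignored.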
mpicc [option] source:
Commonly used options:
·-c: Compile only, do not link; that is, produce only the object file (.o file).
·-o filename: Specifies the file name of the output (the default is a.out).
·-Ipath: Specify (add) a search path (directory) for header files (e.g., *.h).
·-Lpath: Specify (add) a search path (directory) for library files.
·-lname: Link with the library file libname.
·-g: Make the object code contain the source file name, line numbers, and other information (for program debugging).
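A hedged example combining these options (the include/library paths and the library name m are illustrative assumptions):

mpicc -c hello.c
        (compile only, producing hello.o)
mpicc -g -o hello hello.c -I/usr/local/include -L/usr/local/lib -lm
        (compile and link with debugging information, an extra header search path,
         an extra library search path, and the math library libm)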
2. The differences between Process and Thread.
A process is a running program. It provides two abstractions: the first is a logical control flow, the second is a private address space. The first abstraction is implemented by slicing CPU time; each time slice is used to execute one process, and when the time slice is consumed, the operating system switches from the current process to another. This is called a context switch.
A process may have a number of threads, which share the process's code, data, and kernel context. Each thread has its own stack and context. The threads of a process are peers of one another and share data and code, whereas processes have a hierarchical relationship among themselves. Threads are lightweight compared with processes.
Threads can communicate and coordinate through shared variables, which are less costly than message passing and networking. Interprocess communication, by contrast, relies mainly on message passing and networking.
So the key difference between the two terms is that threads are part of a process: a process may contain one or more threads, but a thread cannot contain a process.
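A minimal sketch in C using POSIX threads (the names counter and worker are illustrative) showing how threads of one process communicate through a shared variable, which separate processes could not do without an interprocess mechanism:

/* All threads of this process see the same global counter;
 * a mutex coordinates their updates, and main() joins them. */
#include <stdio.h>
#include <pthread.h>

static int counter = 0;                              /* shared by every thread  */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);                       /* coordinate via the lock */
    counter++;
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    printf("counter = %d\n", counter);               /* prints 4                */
    return 0;
}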