
Multiprocessing (parallel computer architecture)

Author: Internet

Contents

Multiprocessing

subprocessor: coprocessor

Classification of computer architecture

Flynn’s Taxonomy:




SISD vs. SIMD vs. MIMD


PU: processing unit; Instruction Pool: instruction cache; Data Pool: data cache

Challenges to Parallel Programming

The first challenge is the fraction of the program that is inherently sequential

Example
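The worked example itself did not survive on this page. As a stand-in, here is a minimal sketch of Amdahl's law, which quantifies this challenge. The function names and the numbers (a target speedup of 80 on 100 processors) are illustrative assumptions, not taken from the original.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Overall speedup when only parallel_fraction of the work scales."""
    sequential_fraction = 1.0 - parallel_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / processors)


def required_parallel_fraction(target_speedup: float, processors: int) -> float:
    """Invert Amdahl's law: parallel fraction needed to reach target_speedup."""
    # 1/S = (1 - f) + f/N   =>   f = (1 - 1/S) / (1 - 1/N)
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / processors)


# Illustrative numbers (assumed): speedup of 80 on 100 processors.
f = required_parallel_fraction(target_speedup=80, processors=100)
print(f"parallel fraction needed:    {f:.4%}")
print(f"sequential fraction allowed: {1 - f:.4%}")
```

With these assumed numbers, only about a quarter of one percent of the original computation may stay sequential, which is why this is listed as the first challenge.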


The second challenge is the long latency of remote memory accesses

Example
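The original example is missing here as well. The sketch below shows the usual way this cost is estimated: remote accesses add stall cycles to the base CPI. Every number in it (base CPI, remote-access fraction, latency, clock rate) is a hypothetical value chosen for illustration.

```python
def effective_cpi(base_cpi: float, remote_fraction: float,
                  remote_latency_ns: float, clock_ghz: float) -> float:
    """Base CPI plus the average stall cycles caused by remote accesses."""
    # A clock of c GHz runs c cycles per nanosecond.
    remote_cycles = remote_latency_ns * clock_ghz
    return base_cpi + remote_fraction * remote_cycles


# Hypothetical machine: 2 GHz clock, 200 ns remote access,
# base CPI of 0.5, and 0.2% of instructions touching remote memory.
base = 0.5
cpi = effective_cpi(base, remote_fraction=0.002,
                    remote_latency_ns=200, clock_ghz=2.0)
slowdown = cpi / base
print(f"effective CPI: {cpi:.2f}, slowdown vs. all-local: {slowdown:.2f}x")
```

Even a very small fraction of remote references dominates the CPI once each one costs hundreds of cycles.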

SIMD (Data-Level Parallelism)

Vector Processor

Why Vector Processors?

Basic Vector Architecture

vector-register processors

memory-memory vector processors


Vector Memory-Memory vs. Vector Register Machines


A V suffix on an instruction marks it as a vector instruction

Vector Supercomputers

Epitomized by Cray-1, 1976: Scalar Unit + Vector Extensions

Vector Programming Model


Stride: needed, for example, when reading a two-dimensional array column by column
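A minimal sketch of what a strided vector load does, assuming the array is stored in row-major order in a flat memory. The helper `strided_load` and the 3x4 array are made up for illustration; they model the addressing only, not any real instruction set.

```python
ROWS, COLS = 3, 4

# Flat "memory" holding a ROWS x COLS array in row-major order (assumed):
# element (r, c) lives at index r * COLS + c.
memory = list(range(ROWS * COLS))


def strided_load(memory, base, stride, length):
    """Gather `length` elements starting at `base`, `stride` apart."""
    return [memory[base + i * stride] for i in range(length)]


# Reading a row: consecutive elements, so the stride is 1.
row1 = strided_load(memory, base=1 * COLS, stride=1, length=COLS)

# Reading a column: consecutive elements are one full row apart.
col2 = strided_load(memory, base=2, stride=COLS, length=ROWS)

print("row 1:", row1)
print("column 2:", col2)
```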

Multimedia Extensions (aka SIMD extensions)

Current CPUs generally integrate multimedia extensions, which work much like vector operations


Multimedia Extensions versus Vectors

The basic structure of a vector-register architecture

VMIPS

Primary Components of VMIPS

Vector Code Example


VLR: vector length register

Automatic Code Vectorization

Vector Arithmetic Execution

Vector Stripmining

Strip mining: a vector longer than the maximum vector length is processed in segments
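A minimal sketch of strip mining, assuming a maximum vector length (MVL) of 64 elements. `vector_add` stands in for a single vector instruction and `vlr_values` records what would be written to the vector length register (VLR) for each segment. All names are illustrative. This variant handles the short remainder last; some presentations handle it first.

```python
MVL = 64  # assumed maximum vector length of the hardware


def vector_add(x, y):
    """Models one vector instruction: it can cover at most MVL elements."""
    assert len(x) == len(y) <= MVL
    return [a + b for a, b in zip(x, y)]


def strip_mined_add(x, y):
    """Add two arbitrarily long vectors in segments of at most MVL elements."""
    n = len(x)
    result, vlr_values = [], []
    start = 0
    while start < n:
        vl = min(MVL, n - start)      # value loaded into VLR for this strip
        vlr_values.append(vl)
        result.extend(vector_add(x[start:start + vl], y[start:start + vl]))
        start += vl
    return result, vlr_values


x = list(range(200))
y = [10] * 200
total, vlr_values = strip_mined_add(x, y)
print("VLR per strip:", vlr_values)
```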

Vector Stride

Stride: the distance between consecutive elements being accessed

Vector Chaining

Multiple Lanes
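A minimal sketch of how vector elements are commonly spread across lanes, assuming the usual interleaved layout in which element i is handled by lane i mod (number of lanes). The function names are illustrative, and the cycle count ignores pipeline start-up.

```python
def elements_per_lane(vector_length: int, num_lanes: int):
    """Interleaved mapping: element i is processed by lane i % num_lanes."""
    lanes = [[] for _ in range(num_lanes)]
    for i in range(vector_length):
        lanes[i % num_lanes].append(i)
    return lanes


def cycles_for(vector_length: int, num_lanes: int) -> int:
    """Cycles to issue all elements, one element per lane per cycle."""
    return -(-vector_length // num_lanes)  # ceiling division


print(elements_per_lane(8, 4))
print("64 elements, 1 lane: ", cycles_for(64, 1), "cycles")
print("64 elements, 4 lanes:", cycles_for(64, 4), "cycles")
```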

Array Processor

Array machine (SIMD)

Basic idea:

*GPU

GPU vs. CPU architecture comparison


The basis of CUDA's high-speed computation

CUDA (Compute Unified Device Architecture) → programming interface

Well-suited applications

Poorly suited applications (these require redesigning the algorithms and data structures, or batching the work)

MIMD (Thread-Level Parallelism)

Communication Models

Shared Memory Model

Multiprocessors: based on shared memory


shared memory multiprocessors either

Message Passing

Multicomputers: based on message passing
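A minimal sketch contrasting the two communication models with Python threads. It only models the programming style: threads in one process always share memory, so the message-passing version simply agrees to communicate through a queue and nothing else. The function names and the four-worker split are illustrative assumptions.

```python
import queue
import threading


def run_workers(worker, chunks):
    """Start one thread per chunk and wait for all of them."""
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def shared_memory_sum(values, num_workers=4):
    """Workers communicate by updating one shared location under a lock."""
    total = {"sum": 0}
    lock = threading.Lock()

    def worker(chunk):
        partial = sum(chunk)
        with lock:                    # synchronize access to shared data
            total["sum"] += partial

    run_workers(worker, [values[i::num_workers] for i in range(num_workers)])
    return total["sum"]


def message_passing_sum(values, num_workers=4):
    """Workers share nothing; each sends its partial result as a message."""
    mailbox = queue.Queue()

    def worker(chunk):
        mailbox.put(sum(chunk))       # explicit send

    run_workers(worker, [values[i::num_workers] for i in range(num_workers)])
    return sum(mailbox.get() for _ in range(num_workers))  # explicit receives


data = list(range(1000))
print(shared_memory_sum(data), message_passing_sum(data))
```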

MIMD Memory Architecture: Centralized (SMP) vs. Distributed

2 classes of multiprocessors with respect to memory:

The interconnection network can be a high-speed network such as Ethernet or optical fiber

The Flynn-Johnson classification of computer systems


The horizontal axis is the memory architecture; the vertical axis is the communication model

Exam topic: for example, explain why the classification is drawn this way

Typical parallel computer architectures

Symmetric Multiprocessor (SMP)

Cluster of Workstations (COW)

Cluster categorizations

Massively Parallel Processor (MPP)

Cluster vs. MPP

Architectural differences

Interconnection Networks

Processor-to-Memory Interconnection Networks


Classification of interconnection networks


Connecting Multiple Computers


Switching scheme


Multistage Network
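The notes give only the heading here. As one concrete illustration, the sketch below routes a message through an Omega network, a classic multistage network that is assumed here as the example and is not named in the original. Each of the log2(N) stages applies a perfect shuffle and then a 2x2 switch, and destination-tag routing uses one bit of the destination address per stage (0 = upper output, 1 = lower output).

```python
def omega_route(src: int, dst: int, n_bits: int):
    """Destination-tag routing through an Omega network with 2**n_bits ports.

    Returns one (stage, switch_output_bit, position_after_stage) per stage.
    """
    mask = (1 << n_bits) - 1
    pos = src
    path = []
    for stage in range(n_bits):
        # Perfect shuffle between stages: rotate the position left by one bit.
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask
        # The 2x2 switch picks its output from the destination's bits,
        # most significant bit first.
        bit = (dst >> (n_bits - 1 - stage)) & 1
        pos = (pos & ~1) | bit
        path.append((stage, bit, pos))
    return path


for stage, bit, pos in omega_route(src=2, dst=5, n_bits=3):
    port = "lower" if bit else "upper"
    print(f"stage {stage}: take {port} output -> line {pos}")
```

After the last stage the message sits on the line whose number equals the destination, whatever the source was.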

Cache Coherence and Coherence Protocol

Multiprocessor cache coherence (this problem can arise only under the shared memory model)

What Is Multiprocessor Cache Coherence?

Cache Coherence problem

Notice that the coherence problem exists because we have both a global state, defined primarily by the main memory, and a local state, defined by the individual caches, which are private to each processor core. Thus, in a multi-core where some level of caching may be shared (e.g., an L3), although some levels are private (e.g., L1 and L2), the coherence problem still exists and must be solved.
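A minimal sketch of the problem, assuming two private write-through caches in front of one main memory and no coherence mechanism at all. The class `PrivateCache` and the location name "X" are illustrative.

```python
class PrivateCache:
    """A private write-through cache with no coherence mechanism."""

    def __init__(self, memory):
        self.memory = memory   # the shared main memory (global state)
        self.lines = {}        # this core's cached copies (local state)

    def read(self, addr):
        if addr not in self.lines:              # miss: fetch from memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]                 # hit: use the local copy

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value               # write-through to memory


memory = {"X": 1}
cache_a = PrivateCache(memory)
cache_b = PrivateCache(memory)

cache_a.read("X")            # A caches X = 1
cache_b.read("X")            # B caches X = 1
cache_a.write("X", 0)        # A updates its copy and main memory
stale = cache_b.read("X")    # B hits on its own old copy

print("memory:", memory["X"], "| A sees:", cache_a.read("X"), "| B sees:", stale)
```

Main memory and cache A agree on the new value, while cache B keeps returning the old one. That mismatch between global and local state is the coherence problem.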

Coherent Memory Model

Coherent Memory System

Memory Consistency Model


We will rely on this assumption until we reach Section 5.6, where we will see exactly the implications of this definition, as well as the alternatives.

Basic Schemes for Enforcing Coherence

Cache Coherence Protocols (HW)

Snooping Coherence Protocols

Write Invalidate, Write Update

The rest of this section focuses on Write Invalidate

Basic Implementation Techniques (Write Invalidate)

Broadcast Medium Transactions (e.g., bus)

Locate up-to-date copy of data

Write-through cache


Write-back cache

An Example Protocol (Snoopy, Invalidate)

Write-through Cache Protocol

Write Back Cache Protocol


Write-Back State Machine - CPU

Writes to clean blocks are treated as misses (in both cases a signal is placed on the bus to invalidate the other copies)
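A minimal sketch of a snooping write-invalidate protocol for write-back caches, assuming three states per block (Invalid, Shared, Modified) and one-word blocks. The class names, the bus message names ("read_miss", "write_miss"), and the single-word simplification are assumptions for illustration. The state diagrams these notes refer to may differ in detail.

```python
INVALID, SHARED, MODIFIED = "Invalid", "Shared", "Modified"


class Bus:
    """Broadcast medium: every other cache snoops each transaction."""

    def __init__(self, memory):
        self.memory, self.caches, self.log = memory, [], []

    def broadcast(self, kind, addr, sender):
        self.log.append((kind, addr, sender.name))
        for cache in self.caches:
            if cache is not sender:
                cache.snoop(kind, addr)


class Cache:
    def __init__(self, name, bus):
        self.name, self.bus = name, bus
        self.state, self.data = {}, {}
        bus.caches.append(self)

    def state_of(self, addr):
        return self.state.get(addr, INVALID)

    def read(self, addr):                      # CPU read request
        if self.state_of(addr) == INVALID:
            # Read miss: a Modified owner writes back while snooping.
            self.bus.broadcast("read_miss", addr, self)
            self.data[addr] = self.bus.memory[addr]
            self.state[addr] = SHARED
        return self.data[addr]

    def write(self, addr, value):              # CPU write request
        if self.state_of(addr) != MODIFIED:
            # A write to an invalid OR clean (Shared) block is treated as
            # a miss: the bus message invalidates all other copies.
            self.bus.broadcast("write_miss", addr, self)
            self.state[addr] = MODIFIED
        self.data[addr] = value                # one-word block: no fetch needed

    def snoop(self, kind, addr):               # bus request from another cache
        st = self.state_of(addr)
        if st == MODIFIED:
            self.bus.memory[addr] = self.data[addr]   # write back dirty block
            self.state[addr] = SHARED if kind == "read_miss" else INVALID
        elif st == SHARED and kind == "write_miss":
            self.state[addr] = INVALID


bus = Bus({"X": 1})
a, b = Cache("A", bus), Cache("B", bus)
a.read("X")
b.read("X")
a.write("X", 0)            # invalidates B's copy
seen_by_b = b.read("X")    # B misses, A writes back, B gets the new value
print("B reads", seen_by_b, "| A:", a.state_of("X"), "| B:", b.state_of("X"))
```

Unlike a cache with no coherence mechanism, B's stale copy is invalidated by A's write, so B's next read misses and picks up the value A wrote.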

Write-Back State Machine - Bus request

Write-Back State Machine - III


Source: https://blog.csdn.net/weixin_42437114/article/details/116322986