High Performance Computing 25 SP Potpourri
Potpourri has a good taste.
Heterogeneous System Architecture
The goals of the HSA:
- Enable power-efficient performance.
- Improve programmability of heterogeneous processors.
- Increase the portability of code across processors and platforms.
- Increase the pervasiveness of heterogeneous solutions.
The Runtime Stack
Accelerated Processing Unit
A processor that combines the CPU and the GPU elements into a single architecture.
Intel Xeon Phi
The goal:
- Leverage the x86 architecture and existing x86 programming models.
- Dedicate much of the silicon to floating point ops.
- Cache coherent.
- Increase floating-point throughput.
- Strip expensive features.
The reality:
- Tens of x86-based cores.
- Very high-bandwidth local GDDR5 memory.
- The card runs a modified embedded Linux.
Deep Learning: Deep Neural Networks
The network can be used as a computer.
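One way to read "the network as a computer" is that a fixed set of weights implements a function. The sketch below is illustrative only: the weights are hand-picked rather than trained, and the names `step`, `dense`, and `xor_net` are made up for this example. It shows a two-layer network computing XOR, which a single layer cannot do.

```python
def step(x):
    """Threshold activation: 1 if the input is positive, else 0."""
    return 1 if x > 0 else 0


def dense(inputs, weights, biases):
    """One fully connected layer followed by the step activation."""
    return [
        step(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hand-picked (not trained) parameters, chosen only for illustration.
HIDDEN_W = [[1, 1], [1, 1]]  # both hidden units sum the two inputs
HIDDEN_B = [-0.5, -1.5]      # unit 0 behaves like OR, unit 1 like AND
OUTPUT_W = [[1, -1]]         # fires when OR is on and AND is off
OUTPUT_B = [-0.5]


def xor_net(a, b):
    """Evaluate the two-layer network on binary inputs a and b."""
    hidden = dense([a, b], HIDDEN_W, HIDDEN_B)
    return dense(hidden, OUTPUT_W, OUTPUT_B)[0]


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_net(a, b))
```

Each layer is a weighted sum followed by a nonlinearity, so evaluating the network is mostly multiply-accumulate work.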
Tensor Processing Unit
A custom ASIC (an AI accelerator) for neural network workloads. TPUv1 targeted the inference phase.
TPUv1 Architecture
TPUv2 Architecture
Advantages of TPU:
- Makes predictions very quickly, responding within a fraction of a second.
- Accelerates linear algebra computation, which is key to machine learning applications.
- Minimizes the time to accuracy when training large, complex network models.
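The "linear computation" in the list above is mostly matrix multiplication. TPUv1 multiplies 8-bit integers and accumulates the products in wider registers. The sketch below imitates that pattern in plain Python. It makes several assumptions: the quantization scheme (a single shared scale per matrix) is a simplification, Python integers stand in for the wide accumulators, and the function names are hypothetical.

```python
def quantize(matrix, scale):
    """Map floats to signed 8-bit integers using one shared scale factor."""
    return [
        [max(-128, min(127, round(v / scale))) for v in row]
        for row in matrix
    ]


def int8_matmul(a_q, b_q):
    """Multiply integer matrices; Python ints play the role of wide accumulators."""
    cols = list(zip(*b_q))
    return [
        [sum(x * y for x, y in zip(row, col)) for col in cols]
        for row in a_q
    ]


def quantized_matmul(a, b, scale_a, scale_b):
    """Quantize both operands, multiply as integers, then rescale to floats."""
    acc = int8_matmul(quantize(a, scale_a), quantize(b, scale_b))
    return [[v * scale_a * scale_b for v in row] for row in acc]


if __name__ == "__main__":
    a = [[0.5, -1.0], [0.25, 0.75]]
    b = [[1.0, 0.0], [0.5, -0.5]]
    # A power-of-two scale keeps this small example exact.
    print(quantized_matmul(a, b, 1 / 64, 1 / 64))
```

Values outside the representable range are clamped to [-128, 127], which is one source of the accuracy loss that low-precision hardware trades for throughput.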
Disadvantages of TPU (workloads that are a poor fit):
- Linear algebra programs that require heavy branching or are dominated by element-wise operations.
- Workloads not dominated by matrix multiplication, which are unlikely to perform well on TPUs.
- Workloads that access memory in a sparse manner.
- Workloads that require high-precision arithmetic.
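To make the high-precision point concrete, the sketch below emulates rounding to bfloat16, the reduced-precision floating-point format used by TPUs from TPUv2 onward. It is an emulation built on `struct`, not TPU code. The helper name `to_bfloat16` is hypothetical, and it assumes finite inputs within float32 range (NaN and infinity are not handled).

```python
import struct


def to_bfloat16(x):
    """Round a float to the nearest bfloat16 value and return it as a float.

    bfloat16 keeps the top 16 bits of a float32: 1 sign bit, 8 exponent
    bits, and 7 mantissa bits. Assumes x is finite and fits in float32.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest, ties to even, on the 16 bits about to be dropped.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + rounding_bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def accumulate_bfloat16(start, increment, steps):
    """Repeatedly add an increment, rounding to bfloat16 after every step."""
    total = to_bfloat16(start)
    for _ in range(steps):
        total = to_bfloat16(total + increment)
    return total


if __name__ == "__main__":
    print(to_bfloat16(3.14159))
    # Each small increment is rounded away, so the sum never moves.
    print(accumulate_bfloat16(1.0, 2 ** -10, 1000))
```

With only 7 mantissa bits, increments much smaller than the running total are lost entirely, which is why workloads that depend on fine-grained precision are listed as a poor fit.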
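Author: Ricardo Ren
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Readers are welcome to repost them without asking permission, but please note "Reposted from Ricardo's Blog".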
2021 - 2025 © Ricardo Ren, powered by .NET 9.0.2.