DPU + IPU (Arm, NVIDIA, Intel)



5:27 p.m. EDT – Welcome to Hot Chips! This is the annual conference dedicated to the latest, best and upcoming big silicon that we are all passionate about. Stay tuned Monday and Tuesday for our regular AnandTech Live blogs.

5:30 p.m. EDT – Just waiting for this session to start; it should be a few minutes

5:30 p.m. EDT – Arm is up first with its Neoverse N2 cores

5:34 p.m. EDT – Roadmap, objectives, core architecture, system architecture, performance, conclusions

5:34 p.m. EDT – Second-generation infrastructure core, following N1

5:34 p.m. EDT – 4-128 core designs

5:35 p.m. EDT – 5G infrastructure to cloud data centers

5:35 p.m. EDT – Arm sells intellectual property and definitions

5:35 p.m. EDT – SBSA / SBBR support

5:36 p.m. EDT – Marvell already uses N2, up to 36 in a SoC

5:36 p.m. EDT – High speed packet processing

5:36 p.m. EDT – All about the SPECint score with DDR5 and PCIe 5.0

5:37 p.m. EDT – N2 with Armv9

5:37 p.m. EDT – Two sets of scalable vector extensions, SVE, SVE2

5:37 p.m. EDT – BF16 support, INT8 multiply

5:38 p.m. EDT – Side-channel security, SHA, SM3/4

5:38 p.m. EDT – * SHA3 / SHA512

5:38 p.m. EDT – Persistent memory support

5:38 p.m. EDT – Memory partitioning and monitoring

5:39 p.m. EDT – Generation-to-generation improvements with virtualization

5:39 p.m. EDT – +40% IPC uplift

5:39 p.m. EDT – Power / area similar to N1, maximizes performance / Watt

5:39 p.m. EDT – An aggressive PPA trajectory

5:40 p.m. EDT – 3.6 GHz maximum core frequency

5:40 p.m. EDT – N1 on 7nm, vs N2 on 5nm

5:41 p.m. EDT – uArch – Most structures are larger

5:41 p.m. EDT – bigger

5:42 p.m. EDT – Fetch more per cycle on the front end, with improved branch prediction accuracy

5:42 p.m. EDT – Enhanced security to prevent side-channel attacks

5:43 p.m. EDT – Larger structures on the back end

5:44 p.m. EDT – N2 has a Correlated Miss Caching (CMC) prefetcher

5:45 p.m. EDT – Improved latency on L2 thanks to CMC

5:45 p.m. EDT – 32% IPC improvement at iso-frequency

5:46 p.m. EDT – The 40% mentioned earlier was on SPEC2006

5:47 p.m. EDT – Coherent Mesh Network – CMN-700 – chiplets and multi-socket

5:47 p.m. EDT – Also supports CXL

5:48 p.m. EDT – Improvements over CMN-600 – double the mesh links, 3x cross-sectional bandwidth

5:48 p.m. EDT – Programmable hot-spot re-routing

5:49 p.m. EDT – Composable data center SoC – super home dies, I/O dies, and chiplets

5:51 p.m. EDT – Balancing memory requests

5:51 p.m. EDT – Control for capacity or bandwidth

5:52 p.m. EDT – CBusy – throttles outstanding transactions to the CPU – affects the aggressiveness of the hardware prefetcher

5:53 p.m. EDT – CBusy and MPAM are intended to work together

5:54 p.m. EDT – Resulting in the best performance

5:56 p.m. EDT – Comparing N2 to the market

5:56 p.m. EDT – Integer performance only

5:57 p.m. EDT – “Real world workload” figures based on pre-silicon models

5:58 p.m. EDT – Up to 256 N2 cores should be fun

5:59 p.m. EDT – Coming to market in the next few months

5:59 p.m. EDT – Questions and answers

5:59 p.m. EDT – Q: Is N1/N2 at iso-frequency? What frequency on slide 10? A: A range of power modes; quoted 2-2.5 GHz, which is what customers will use

6:01 p.m. EDT – Q: CBusy for a heterogeneous multi-die system? A: All IPs will get the CBusy information and throttle requests

6:03 p.m. EDT – Q: MPAM cache partitioning? weight? A: It can do it. But also supports fine-grained thresholds for control – you can tune based on capacity without over-partitioning

6:03 p.m. EDT – Second talk of the session – NVIDIA DPU

6:04 p.m. EDT – Idan Burstein, NVMe-oF co-author

6:04 p.m. EDT – Platform architecture and use cases

6:05 p.m. EDT – The data center is undergoing a revolution

6:05 p.m. EDT – Fully disaggregate your server into compute, memory, acceleration, storage, and software. Requires accelerated networking and DPUs to control it all

6:06 p.m. EDT – 10-20x bandwidth deployed per server requires better networking

6:06 p.m. EDT – A data center infrastructure workload

6:08 p.m. EDT – Moving infrastructure workloads onto the CPU is a bad idea

6:08 p.m. EDT – Need for proper offload

6:08 p.m. EDT – Data path acceleration required

6:09 p.m. EDT – Bluefield-2

6:09 p.m. EDT – Roadmap

6:09 p.m. EDT – Currently shipping BF-2; announced BF-3 with double the bandwidth

6:09 p.m. EDT – BF-4 is 4x BF-3

6:09 p.m. EDT – BF-4 also uses NVIDIA AI

6:10 p.m. EDT – 22 billion transistors

6:10 p.m. EDT – PCIe 5.0 x32

6:10 p.m. EDT – 400 Gb/s crypto

6:10 p.m. EDT – Equivalent to 300 x86 cores

6:10 p.m. EDT – 16 Arm A78 cores

6:10 p.m. EDT – DDR5-5600, 128-bit bus
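
For a rough sense of what that memory interface implies, peak theoretical bandwidth is just the transfer rate times the bus width in bytes. The sketch below is a back-of-the-envelope calculation, not a figure quoted in the talk; the function name is made up for illustration.

```python
def peak_bandwidth_gb_s(transfers_mt_s: float, bus_width_bits: int) -> float:
    """Peak theoretical DRAM bandwidth in decimal GB/s.

    transfers_mt_s: data rate in mega-transfers per second (e.g. 5600 for DDR5-5600)
    bus_width_bits: total data bus width in bits (e.g. 128)
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfers_mt_s * 1e6 * bytes_per_transfer / 1e9


# DDR5-5600 on a 128-bit bus, as listed on the slide
ddr5_5600_128bit = peak_bandwidth_gb_s(5600, 128)
print(f"{ddr5_5600_128bit:.1f} GB/s")
```

This is an upper bound only; sustained bandwidth depends on access patterns and controller efficiency.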

6:11 p.m. EDT – Supports 18M IOPS

6:11 p.m. EDT – ConnectX-7

6:11 p.m. EDT – DOCA framework

6:11 p.m. EDT – Program with DOCA on BF-2, and it carries straight over to BF-3 and BF-4

6:12 p.m. EDT – 3 different programmable engines

6:12 p.m. EDT – 16x Arm A78 – server grade processor

6:12 p.m. EDT – 16-core, 256-thread data path accelerator (SMT16?)

6:12 p.m. EDT – ASAP – As soon as possible, programmable packet processor flow pipeline

6:13 p.m. EDT – Bluefield-4X

6:13 p.m. EDT – Bluefield-3X

6:13 p.m. EDT – Not ASAP, ASAP squared

6:15 p.m. EDT – Isolated boot domain in RT OS

6:16 p.m. EDT – PCIe optimized for DPU

6:17 p.m. EDT – Differentiate data paths – software-defined network stack

6:18 p.m. EDT – Accelerates the full path from host to network

6:18 p.m. EDT – It says bare-metal host – can it do virtual hosts?

6:19 p.m. EDT – Encryption, tunneling, NAT, routing, QoS, emulation

6:19 p.m. EDT – DPDK 100G

6:19 p.m. EDT – Millions of packets per second
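
Packet-rate figures on a 100G link are bounded by Ethernet framing, so it helps to know the ceiling. The sketch below computes the theoretical line rate in millions of packets per second, assuming the standard 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter, inter-frame gap). The frame sizes and the function name are illustrative choices, not numbers from the slides.

```python
# Per-frame wire overhead on Ethernet: 7 B preamble + 1 B SFD + 12 B inter-frame gap
PER_FRAME_OVERHEAD_BYTES = 20


def line_rate_mpps(link_gbps: float, frame_bytes: int) -> float:
    """Theoretical maximum packet rate (Mpps) for a given link speed and frame size."""
    bits_on_wire = (frame_bytes + PER_FRAME_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire / 1e6


# Minimum, mid-size, and maximum standard frames on a 100G link
for size in (64, 512, 1518):
    print(f"{size:>5} B frames: {line_rate_mpps(100, size):6.1f} Mpps")
```

Small packets are the hard case: at 64 bytes the ceiling is roughly 149 Mpps on 100G, which is why per-packet processing cost matters so much for offload.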

6:20 p.m. EDT – vs AMD EPYC 7742 64 cores

6:20 p.m. EDT – It’s Bluefield 2

6:20 p.m. EDT – TCP flows with 100G IPsec

6:22 p.m. EDT – Acceleration of storage processing

6:22 p.m. EDT – In-flight data encryption

6:24 p.m. EDT – NVMe over Fabrics

6:27 p.m. EDT – Cloud native supercomputing with non-blocking MPI performance

6:27 p.m. EDT – Accelerated FFT performance on multi-node HPC

6:28 p.m. EDT – DPU isolates GeForce NOW – 10 million concurrent users

6:28 p.m. EDT – + 50% more users per server

6:28 p.m. EDT – Push more simultaneous users

6:29 p.m. EDT – Bluefield 3X has an integrated GPU

6:30 p.m. EDT – CUDA support

6:30 p.m. EDT – GPU + DPU + network connectivity, fully programmable on a single PCIe card

6:31 p.m. EDT – Question / answer time

6:31 p.m. EDT – Q: Why Cortex-A rather than N1/N2? A: A78 was the best-performing core at the time

6:31 p.m. EDT – Q: Add CXL to a future Bluefield? A: Can’t comment, but we see CXL as important

6:33 p.m. EDT – Q: RT-OS cores? A: Designed in-house; the architecture is RISC-V compatible

6:34 p.m. EDT – Q: Can the DPU accelerate RAID builds? A: Yes, it can, both trivial and complex

6:36 p.m. EDT – That’s a wrap; next up is Intel

6:36 p.m. EDT – Bradley Burres

6:37 p.m. EDT – Network management through the data center for Intel

6:38 p.m. EDT – Same five-minute IPU intro as previous conferences

6:40 p.m. EDT – Over time, the IPU has taken on more control to free up CPU resources. Move those workloads to the IPU = more performance!

6:41 p.m. EDT – “solve the infrastructure tax”

6:41 p.m. EDT – Mount Evans

6:41 p.m. EDT – Developed with a CSP

6:41 p.m. EDT – Baidu or JD?

6:42 p.m. EDT – 16 Neoverse N1 cores

6:42 p.m. EDT – 200G Ethernet MAC

6:42 p.m. EDT – PCIe 4.0 x16

6:42 p.m. EDT – NVMe storage with Optane recognition

6:42 p.m. EDT – Advanced cryptography and compression acceleration

6:42 p.m. EDT – Software, Hardware, Accelerator co-design

6:43 p.m. EDT – solve the long-tail infrastructure problem

6:44 p.m. EDT – Dataplane on the left, compute on the right

6:44 p.m. EDT – Support for 4-socket systems with one Mount Evans

6:44 p.m. EDT – RDMA and RoCE v2

6:44 p.m. EDT – QoS and telemetry up to 200 million packets per second

6:45 p.m. EDT – Inline IPsec

6:46 p.m. EDT – N1 at 3 GHz

6:46 p.m. EDT – Three channels of dual-mode LPDDR4 – 102 GB/s bandwidth

6:46 p.m. EDT – Engines for crypto

6:47 p.m. EDT – Intel didn’t just glue IP blocks together

6:49 p.m. EDT – P4-programmable pipeline

6:51 p.m. EDT – Most IPU deployments are brownfield – they must integrate into existing infrastructure

6:54 p.m. EDT – Now let’s talk about system security with independent workload and tenant isolation and recovery

6:55 p.m. EDT – QoS, availability

6:56 p.m. EDT – malicious driver detection

6:56 p.m. EDT – Sustainability

6:56 p.m. EDT – Compliant with NSA and FIPS 140 standards. Something was said about 2030?

6:57 p.m. EDT – More info at the Intel ON event

6:57 p.m. EDT – Question / answer time

6:58 p.m. EDT – Q: PPA with Arm vs. IA? A: IP availability and schedule for Arm

6:59 p.m. EDT – Q: Compliant with SBSA? A: Yes

6:59 p.m. EDT – Q: TDP? A: Works within PCIe power

7:00 p.m. EDT – Q: Does it work with SPR, given both have crypto? A: Yes

7:00 p.m. EDT – Q: Is Mount Evans replacing the server PCH? A: No, orthogonal

7:02 p.m. EDT – Q: Specific to Xeon? A: Can be used with any CPU

7:02 p.m. EDT – That’s a wrap, time for a keynote!
