
Our next generation Meta Training and Inference Accelerator - AI at Meta
Apr 10, 2024 · Last year, we unveiled the Meta Training and Inference Accelerator (MTIA) v1, our first-generation AI inference accelerator that we designed in-house with Meta’s AI workloads in mind – specifically our deep learning recommendation models that are improving a variety of experiences across our products.
MTIA: First Generation Silicon Targeting Meta's Recommendation Systems - ISCA '23
June 2023, Orlando, Florida, USA · Given the experience deploying NNPI and GPUs as accelerators, it was clear that there is room for a more optimized solution for …
MTIA v1: Meta’s first-generation AI inference accelerator - AI at Meta
May 18, 2023 · Meta is executing on an ambitious plan to build the next generation of its infrastructure backbone – specifically for AI. This includes our first custom chip for running AI models, a new AI-optimized data center design, and phase 2 …
Meta Unveils Second-Gen MTIA NPU Delivering 3× Faster Performance
Apr 25, 2024 · Fabricated in TSMC's N5 process, the second-generation Meta Training and Inference Accelerator (MTIA) is part of Meta's broader AI-infrastructure investment. The new MTIA AI accelerator (NPU) is deployed and running production workloads. Ranking and recommendation models are MTIA's targets.
Experience Meta Llama 3 with AMD Ryzen™ AI and Rad ... - AMD …
Apr 19, 2024 · AMD Ryzen™ Mobile 7040 Series and AMD Ryzen™ Mobile 8040 Series processors feature a Neural Processing Unit (NPU) which is explicitly designed to handle emerging AI workloads. Featuring up to 16 TOPS, the NPU allows the user to execute AI workloads with maximum power efficiency.
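For a sense of scale, here is a back-of-the-envelope calculation of what a 16 TOPS budget could mean for on-device inference. The model size and utilization figures are illustrative assumptions, not AMD's published numbers.

```python
# Back-of-the-envelope arithmetic for what a 16 TOPS budget means.
# The model size and utilization below are illustrative assumptions,
# not AMD's published figures.

peak_ops_per_sec = 16e12        # 16 TOPS; 1 MAC = 2 ops (multiply + add)
peak_macs_per_sec = peak_ops_per_sec / 2

macs_per_inference = 0.6e9      # e.g., a MobileNet-class vision model
utilization = 0.5               # real workloads rarely sustain the peak rate

inferences_per_sec = peak_macs_per_sec * utilization / macs_per_inference
print(f"~{inferences_per_sec:,.0f} inferences/sec under these assumptions")
```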
ok.. NPU's and how do I make use of them?? - Reddit
If you have a recent GPU, it already has what is functionally equivalent to an NPU. An NPU seems to be a dedicated block for doing matrix multiplication, which is more efficient for AI workloads than the more general-purpose CUDA cores or the equivalent vector units in other brands' GPUs.
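To make that concrete, here is a minimal sketch (plain NumPy, nothing NPU-specific; the layer sizes are illustrative assumptions) of why matrix multiplication dominates neural-network inference: a single dense layer is one matmul, and its cost is counted in multiply-accumulates, which is exactly the operation an NPU's dedicated block accelerates.

```python
import numpy as np

# One dense layer is just y = x @ W + b, and nearly all of its
# arithmetic is multiply-accumulate (MAC) operations.

rng = np.random.default_rng(0)

batch, d_in, d_out = 32, 4096, 4096   # hypothetical layer sizes
x = rng.standard_normal((batch, d_in), dtype=np.float32)
W = rng.standard_normal((d_in, d_out), dtype=np.float32)
b = np.zeros(d_out, dtype=np.float32)

y = x @ W + b                          # the matmul an NPU accelerates

macs = batch * d_in * d_out            # multiply-accumulates in this layer
print(f"{macs / 1e9:.1f} GMACs for one forward pass of one layer")
```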
Introducing quantized Llama models with increased speed and a …
Oct 24, 2024 · Today, we’re releasing our first lightweight quantized Llama models that are small and performant enough to run on many popular mobile devices. At Meta, we’re uniquely positioned to provide quantized models because of access to compute resources, training data, full evaluations, and safety.
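As a rough illustration of what quantization buys, here is a generic sketch of symmetric per-tensor int8 weight quantization in NumPy. This is a textbook scheme for illustration only, not the specific method Meta used for these models; the 4x memory saving and the reconstruction error it shows are the general trade-off quantization makes.

```python
import numpy as np

# Generic post-training symmetric int8 quantization of a weight tensor.
# Illustrative only; not the scheme used for the quantized Llama release.

def quantize_int8(w: np.ndarray):
    """Map float32 weights to int8 with a single per-tensor scale."""
    scale = np.abs(w).max() / 127.0                     # largest value -> 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).standard_normal((4096, 4096)).astype(np.float32)
q, scale = quantize_int8(w)

print(f"memory: {w.nbytes / 2**20:.0f} MiB -> {q.nbytes / 2**20:.0f} MiB")
print(f"max abs error: {np.abs(w - dequantize(q, scale)).max():.4f}")
```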
Meta launches a new version of its in-house AI chip: triple the performance of the previous generation, reducing dependence on Nvidia …
Apr 11, 2024 · On April 10 local time, social media giant Meta announced the latest version of its in-house chip, MTIA, a family of custom chips designed specifically for AI training and inference workloads. Compared with MTIA v1, Meta's first-generation AI inference accelerator announced last May, the latest chip delivers a significant performance improvement and is built specifically for the ranking and recommendation systems of Meta's social apps …
[Translated] Meta/Facebook Hyperscale AI/GPU Infrastructure Design (2024)
Apr 21, 2024 · As a major investment in the future of AI, Meta has built two large-scale AI clusters, each composed of 24,000 GPUs; this article shares the design details of their compute, networking, and storage. Meta began building AI infrastructure early on, but its first public disclosure came in 2022 with the Research SuperCluster (RSC), consisting of 16,000 A100 GPUs. RSC supported the development of Meta's first generation of advanced AI models and played an important role in AI work spanning the training of Llama/Llama 2, computer vision, NLP, speech recognition, image generation, and even coding. The exact number for each …
Meta Attempts an Inference Chip Again - XPU.pub
Feb 2, 2024 · Reuters reports that Meta (Facebook) plans to deploy its own data-center inference chip. The appeal of such a design is clear: a chip built specifically for inference should be less expensive and require less power than application-specific standard products (ASSPs) such as Nvidia's GPUs.