Tag: Inference
All the articles with the tag "Inference".
IIT Madras' Ziroh Labs Challenges GPU Dominance with CPU-Based AI
Published: at 10:39 AM
Ziroh Labs, an IIT Madras startup, aims to provide a cost-effective alternative to GPU-based AI inference using CPUs, targeting applications with less stringent latency requirements and potentially democratizing AI access.
Optimizing Small Language Model Inference on CPUs with Arm
Published: at 03:52 PM
The Arm podcast discusses optimizing Small Language Model (SLM) inference on CPUs using techniques such as quantization and architecture-specific libraries, making AI more accessible and cost-effective.
Microsoft Achieves Breakthrough with 1-Bit AI LLM, Enabling CPU-Based Inference
Published: at 02:06 AM
Microsoft's BitNet b1.58, a 2-billion-parameter 1-bit LLM, achieves performance comparable to FP16 models and can run efficiently on CPUs, lowering hardware requirements and democratizing AI access.