Tag: Inference
All the articles with the tag "Inference".
IIT Madras' Ziroh Labs Challenges GPU Dominance with CPU-Based AI
Published: at 10:39 AM
Ziroh Labs, an IIT Madras startup, aims to provide a cost-effective alternative to GPU-based AI inference using CPUs, targeting applications with less stringent latency requirements and potentially democratizing AI access.
Optimizing Small Language Model Inference on CPUs with Arm
Published: at 03:52 PM
The Arm podcast discusses optimizing Small Language Model (SLM) inference on CPUs using techniques such as quantization and architecture-specific libraries, making AI more accessible and cost-effective.
Microsoft Achieves Breakthrough with 1-Bit AI LLM, Enabling CPU-Based Inference
Published: at 02:06 AM
Microsoft's BitNet b1.58, a 2-billion-parameter 1-bit LLM, achieves performance comparable to FP16 models and can run efficiently on CPUs, lowering hardware requirements and democratizing AI access.