hw-native-sys is an open-source community dedicated to building a robust AI Infrastructure (AI Infra) and MLSys ecosystem across diverse computing hardware.
Our philosophy is "Hardware-Native": we believe that to achieve peak performance and efficiency, system software—from compilers to kernels—must be designed with a deep awareness of the underlying hardware architecture. We aim to bridge the gap between rapidly evolving AI models and the diverse landscape of emerging NPU/GPU accelerators.
We focus on the full stack of Machine Learning Systems, specifically tailored for heterogeneous and domestic hardware:
- AI Compilers & Intermediate Representations: Exploring hardware-aware optimizations through compilation technology.
- High-Performance Kernels: Developing highly optimized operators (FlashAttention, GEMM, etc.) for domestic chips using Triton, TileLang, or vendor-native languages (see the sketch after this list).
- Distributed Training Infra: Optimizing model pre-training and post-training through different parallelism strategies and communication optimizations.
- Hardware-Software Co-design: Researching novel system architectures that fully exploit the distinctive features of each hardware target.
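
As a taste of the kernel work listed above, here is a minimal tiled GEMM sketch in Triton. The tiling parameters and the `matmul` launcher are illustrative assumptions, not a tuned kernel from any specific project; real kernels would adapt block shapes, pipelining, and memory layout to the target chip.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def matmul_kernel(
    a_ptr, b_ptr, c_ptr,            # pointers to A (M x K), B (K x N), C (M x N)
    M, N, K,
    stride_am, stride_ak,
    stride_bk, stride_bn,
    stride_cm, stride_cn,
    BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr, BLOCK_K: tl.constexpr,
):
    # Each program instance computes one BLOCK_M x BLOCK_N tile of C.
    pid_m = tl.program_id(0)
    pid_n = tl.program_id(1)
    offs_m = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
    offs_n = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)
    offs_k = tl.arange(0, BLOCK_K)

    a_ptrs = a_ptr + offs_m[:, None] * stride_am + offs_k[None, :] * stride_ak
    b_ptrs = b_ptr + offs_k[:, None] * stride_bk + offs_n[None, :] * stride_bn
    acc = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.float32)

    # March along K, accumulating partial products in registers.
    for k in range(0, K, BLOCK_K):
        a = tl.load(a_ptrs, mask=(offs_m[:, None] < M) & (offs_k[None, :] + k < K), other=0.0)
        b = tl.load(b_ptrs, mask=(offs_k[:, None] + k < K) & (offs_n[None, :] < N), other=0.0)
        acc += tl.dot(a, b)
        a_ptrs += BLOCK_K * stride_ak
        b_ptrs += BLOCK_K * stride_bk

    c_ptrs = c_ptr + offs_m[:, None] * stride_cm + offs_n[None, :] * stride_cn
    tl.store(c_ptrs, acc, mask=(offs_m[:, None] < M) & (offs_n[None, :] < N))


def matmul(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Hypothetical launcher: fixed tiling, tiles mapped onto a 2D program grid.
    M, K = a.shape
    _, N = b.shape
    c = torch.empty((M, N), device=a.device, dtype=torch.float32)
    grid = (triton.cdiv(M, 64), triton.cdiv(N, 64))
    matmul_kernel[grid](
        a, b, c, M, N, K,
        a.stride(0), a.stride(1),
        b.stride(0), b.stride(1),
        c.stride(0), c.stride(1),
        BLOCK_M=64, BLOCK_N=64, BLOCK_K=32,
    )
    return c
```

This is where "Hardware-Native" choices surface in practice: the same tile-based structure maps to different GPU and NPU backends, but the block sizes, data layouts, and scheduling that reach peak performance differ per architecture.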