paper
an archive of posts with this tag
Date | Post |
---|---|
Mar 23, 2025 | ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization |
Mar 16, 2025 | Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models |
Feb 23, 2025 | Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization |
Feb 16, 2025 | HALO: Hadamard-Assisted Lower-Precision Optimization for LLMs |
Jul 11, 2024 | State Space Models |
Jul 11, 2024 | HiPPO Matrices |