paper
an archive of posts with this tag
| Date | Post |
|---|---|
| Mar 23, 2025 | ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization |
| Mar 16, 2025 | Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models |
| Feb 23, 2025 | Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization |
| Feb 16, 2025 | HALO: Hadamard-Assisted Lower-Precision Optimization for LLMs |
| Jul 11, 2024 | State Space Models |
| Jul 11, 2024 | HiPPO Matrices |