- ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization
- Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models
- Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization
- HALO: Hadamard-Assisted Lower-Precision Optimization for LLMs
- Solving `ptrace: Operation not permitted.` for GDB