- ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization
- Minions: Cost-efficient Collaboration Between On-device and Cloud Language Models
- Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization
- HALO: Hadamard-Assisted Lower-Precision Optimization for LLMs
- Solving `ptrace: Operation not permitted.` for GDB