Xudong Lu*, Qi Liu*, Yuhui Xu, Aojun Zhou, Siyuan Huang, Bo Zhang, Junchi Yan, Hongsheng Li (* indicates equal contribution)

python main.py --method layerwise_pruning --r 6 --calib_set c4 --model_path ...
KTransformers, pronounced as Quick Transformers, is designed to enhance your 🤗 Transformers experience with advanced kernel optimizations and placement/parallelism strategies. KTransformers is a ...