• afk_strats@lemmy.world
    4 months ago

    Working pruning techniques have been tested and seem at least decent at maintaining coherent transformer MoE models. https://doi.org/10.48550/arXiv.2510.13999

    There are several working examples of REAP-pruned models on Hugging Face, and that method seems very good.

    The OP’s paper suggests a technique which starts with an arbitrarily structured set of experts that are pruned during training. I’m not 100% sure I understand it, but I don’t think I’ve seen this exact technique before, and it might be even more efficient.
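    To make the general idea of expert pruning concrete, here’s a toy sketch of my own (not REAP’s actual criterion, and not the OP paper’s training-time method): score each expert by how often the router actually selects it on some calibration data, then drop the least-used experts. Everything here (the router, the counts, the keep budget) is illustrative.

    ```python
    import random

    random.seed(0)

    NUM_EXPERTS = 8
    TOP_K = 2   # router activates 2 experts per token
    KEEP = 6    # prune down to 6 experts

    # Stand-in for a learned gate: per-token logits over experts.
    def router_logits(token):
        return [random.random() for _ in range(NUM_EXPERTS)]

    # Saliency: count how often each expert lands in a token's top-k.
    counts = [0] * NUM_EXPERTS
    for tok in range(1000):  # fake calibration set
        logits = router_logits(tok)
        topk = sorted(range(NUM_EXPERTS), key=lambda e: logits[e], reverse=True)[:TOP_K]
        for e in topk:
            counts[e] += 1

    # Keep the KEEP most-used experts; the rest would be removed from the layer
    # (and the router's output dimension shrunk to match).
    kept = sorted(sorted(range(NUM_EXPERTS), key=lambda e: counts[e], reverse=True)[:KEEP])
    print("kept experts:", kept)
    ```

    Real methods use smarter saliency scores (REAP, for instance, weighs router gate values against expert activation norms rather than raw selection counts), but the keep/drop skeleton looks like this.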