We’ve just released finetuning in Model2Vec, a feature we have been working on for a long time. Model2Vec is a library for making state-of-the-art static word embedders by distilling sentence transformers.
Finetuning allows you to train lightweight text classifiers directly on top of Model2Vec models, creating very fast and performant models in a couple of lines of code.
Main features:
- Better performance: improves classification accuracy across diverse NLP tasks.
- Fast training: train in minutes, not hours, on a CPU.
- Fast inference: build classifiers that can actually run in production on budget hardware.
- Lightweight deployment: save and load models as scikit-learn pipelines, so no Torch is needed for inference.
We’ve benchmarked this on a large number of datasets; the results can be found in the results section of the repo. We are curious to hear your feedback and whether there are any other features you’d like to see!
Our benchmark results are documented here: https://github.com/MinishLab/model2vec/tree/main/results#tra...