I need to summarise a bunch of long-form text, and I'd ideally like to run the model locally.
I'm not an NLP expert, but from what I can tell, the most relevant evaluation methods and benchmarks for summarisation are G-Eval, SummEval and SUPERT. However, I can't find any recent evaluation results for them.
Has anyone here run these evaluations on more recent models? And can you recommend a model that can run locally?