4 days ago
Mon Apr 6, 2026 6:10pm PST
Screenshot of interesting generalization from a tiny corpus on a linear RNN
This is a linear RNN I've been working on, with the goal of solving the long-range memory problem in RNNs, plus other improvements such as constant-speed generation. The rest of the text is copy/pasted from the original: because the corpus is so small, this is a massive overfit (perplexity is under 1.20):

https://i.imgur.com/p6AmBrq.png
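For anyone wondering what "perplexity under 1.20" means in practice, here is a rough sketch using the usual definition: the exponential of the mean negative log-probability the model gives to the correct next token. The `perplexity` helper and the probabilities below are made up for illustration; they are not measurements from this model, and I'm not assuming whether the number above is per token or per character.

```python
import math

def perplexity(token_probs):
    """exp of the mean negative log-probability assigned to the correct tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Hypothetical values: a model that gives every correct token probability 1/1.2
# (about 0.83) has a perplexity of 1.2.
uniform = perplexity([1 / 1.2] * 10)

# A model that is always certain and always right has a perplexity of exactly 1.
perfect = perplexity([1.0] * 5)

print(round(uniform, 3), perfect)
```

So a perplexity below 1.20 corresponds to a geometric-mean probability above roughly 0.83 on the correct token, which fits a model that has mostly memorized a tiny corpus.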

Another way to explain this is: "like Mamba but with good long-term memory and fixed inference/generation speed". At the moment it is in C only, but I bet it can be ported to Python.
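To illustrate why a linear recurrence can generate at a fixed speed, here is a minimal Python sketch of a generic diagonal linear RNN step. This is not the architecture from the post; `step`, `run`, `decay` and `w_in` are hypothetical names chosen for the example. The point is only that each new token updates a fixed-size state, so the per-token work does not depend on how many tokens came before.

```python
def step(state, x, decay, w_in):
    """One generation step: h_t = decay * h_{t-1} + w_in * x_t, elementwise."""
    return [a * h + b * x for h, a, b in zip(state, decay, w_in)]

def run(inputs, decay, w_in):
    """Feed a sequence through the recurrence, carrying only the fixed-size state."""
    state = [0.0] * len(decay)
    for x in inputs:
        state = step(state, x, decay, w_in)
    return state

# Hypothetical parameters: channels with decay close to 1.0 forget slowly.
decay = [0.5, 0.9, 1.0]
w_in = [1.0, 1.0, 1.0]

# An impulse followed by zeros: the state has 3 entries after 4 steps or 301.
short = run([1.0] + [0.0] * 3, decay, w_in)
long = run([1.0] + [0.0] * 300, decay, w_in)

print(len(short), len(long))
print(long)
```

In this toy version, the channel with decay 0.5 has forgotten the first input almost entirely after 300 steps, while the channel with decay 1.0 still holds it. How the actual model keeps long-range information is not described in the post, so treat this only as a picture of the constant-cost update.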
