This is a linear RNN I've been working on, with the goal of solving the long-range memory problem in RNNs, along with other improvements such as constant generation speed. The rest of the text is copied verbatim from the original, since due to the limited size the model is massively overfit (perplexity is under 1.20):
https://i.imgur.com/p6AmBrq.png
Another way to explain this is: "like Mamba but with good long-term memory and fixed inference/generation speed". At the moment it is in C only, but I bet it can be ported to Python.
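To illustrate what "fixed generation speed" means for a linear RNN in general, here is a minimal C sketch of a generic diagonal linear recurrence. It is not the architecture from this post: `LinearRnn`, `STATE_DIM`, `linear_rnn_reset`, and `linear_rnn_step` are hypothetical names made up for the example, and the update rule is the textbook one, assumed only for illustration. The point is that each new token touches a fixed-size state, so the cost per token does not grow with how much text came before.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size state for a generic diagonal linear recurrence.
   This is an illustrative assumption, not the model described in the post. */
#define STATE_DIM 4

typedef struct {
    float h[STATE_DIM];      /* hidden state carried from token to token */
    float decay[STATE_DIM];  /* per-channel decay factor */
    float in_w[STATE_DIM];   /* per-channel input weight */
    float out_w[STATE_DIM];  /* per-channel readout weight */
} LinearRnn;

/* Clear the hidden state before starting a new sequence. */
void linear_rnn_reset(LinearRnn *m) {
    assert(m != NULL);
    for (size_t i = 0; i < STATE_DIM; ++i) {
        m->h[i] = 0.0f;
    }
}

/* One generation step:
     h[i] = decay[i] * h[i] + in_w[i] * x
     y    = sum over i of out_w[i] * h[i]
   The loop runs STATE_DIM times no matter how many tokens were seen. */
float linear_rnn_step(LinearRnn *m, float x) {
    assert(m != NULL);
    float y = 0.0f;
    for (size_t i = 0; i < STATE_DIM; ++i) {
        m->h[i] = m->decay[i] * m->h[i] + m->in_w[i] * x;
        y += m->out_w[i] * m->h[i];
    }
    return y;
}
```

To use it, fill in `decay`, `in_w`, and `out_w`, call `linear_rnn_reset` once, then call `linear_rnn_step` once per input value. There is no growing cache of past tokens as there is in attention, which is why the per-token cost stays flat.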