Mamba - a replacement for Transformers?

Excerpt

Mamba is a new neural network architecture proposed by Albert Gu and Tri Dao.

Timestamps:
00:00 - Mamba - a replacement for Transformers?
00:19 - The Long Range



This is my first time watching your channel. Impressive walkthrough. When I first heard of Q*, my imagination started to build a very similar architecture. I don't follow too much of the technical side, but I saw how the sandwiched gates shown in the video could be used almost in an analogue fashion. This is brilliant! Watching this made me grin like crazy.

This might not be zero memory, but dang if it isn't a huge step in that direction. Using local memory is genius. And that token interpretation length, yes.

So physically, I guess, in my mind the next step is to localize the memory to the operation even more, but it looks like in that architecture it's as local as it's going to get. What about something like "sample-and-hold" from actual analogue circuits? That might be something to think about.
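
For readers unfamiliar with the term: a minimal toy sketch of the sample-and-hold idea the commenter raises, written in Python. The gate rule and all names here are illustrative assumptions, not Mamba's actual mechanism, though the analogy is loose to how input-dependent gates can decide whether to write a new input into a held state or carry the old state forward.

    import numpy as np

    def sample_and_hold(x, gate):
        """Toy sample-and-hold: the state copies the input when the
        gate is open (1) and holds its previous value when closed (0)."""
        h = np.zeros_like(x)
        held = 0.0
        for t in range(len(x)):
            # Gate open: "sample" the current input. Gate closed: "hold".
            held = gate[t] * x[t] + (1 - gate[t]) * held
            h[t] = held
        return h

    # Example: sample the signal only at t = 0, 3, 6, ...
    x = np.sin(np.linspace(0, 2 * np.pi, 10))
    gate = np.array([1 if t % 3 == 0 else 0 for t in range(10)])
    print(sample_and_hold(x, gate))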
