eventually, we provide an example of an entire language model: a deep sequence product spine (with repeating Mamba blocks) + language product head.
Although the recipe for ahead move needs to be defined in this https://tesslyks471397.bloggazza.com/29352936/rumored-buzz-on-mamba-paper