March 9
Post 03/09/2026
I should admit that I didn't put enough emphasis on the idea I shared a couple of weeks ago:
The simplest way to outperform the current AWS Inferentia 2 is to implement a neural model with its weights in metal (see https://blog.azolotarev.com/billion-dollar-idea-ai)
Now Taalas is beating NVIDIA's 230 tokens per second with a generation speed of 16,960 tokens per second by implementing the Llama 3.1 8B model in metal.
Taalas has secured $219M in investment, with a current valuation of $700M.
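
To put the two throughput figures side by side, here is a minimal sketch of the arithmetic. The two numbers are the ones quoted above; the function and constant names are only illustrative, and the comparison assumes both figures were measured under comparable conditions (same model, single stream), which the numbers alone don't confirm.

```python
# Throughput figures quoted in this post, in tokens per second.
NVIDIA_TOKENS_PER_SEC = 230
TAALAS_TOKENS_PER_SEC = 16_960


def speedup(fast: float, slow: float) -> float:
    """Return how many times higher `fast` throughput is than `slow`."""
    return fast / slow


def ms_per_token(tokens_per_sec: float) -> float:
    """Average time spent per generated token, in milliseconds."""
    return 1000.0 / tokens_per_sec


ratio = speedup(TAALAS_TOKENS_PER_SEC, NVIDIA_TOKENS_PER_SEC)
print(f"speedup: {ratio:.1f}x")
print(f"230 tok/s    -> {ms_per_token(NVIDIA_TOKENS_PER_SEC):.2f} ms per token")
print(f"16,960 tok/s -> {ms_per_token(TAALAS_TOKENS_PER_SEC):.3f} ms per token")
```

That works out to roughly a 74x difference in generation speed, or about 4.35 ms per token versus about 0.059 ms per token.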