Post 03/09/2026

I should admit that I didn't put enough emphasis on the idea I shared a couple of weeks ago:

The simplest way to outperform the current AWS Inferentia 2 is to implement a neural model with its weights in metal (see https://blog.azolotarev.com/billion-dollar-idea-ai)

Now Taalas is beating NVIDIA's 230 tokens per second with a generation speed of 16,960 tokens per second, by implementing the Llama 3.1 8B model in metal.
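
A quick back-of-the-envelope check of that gap, using only the two figures quoted above. The variable names and the 1,000-token reply length are my own illustrative assumptions, not anything published by either company:

```python
# Throughput figures as quoted in this post; not independently measured.
nvidia_tokens_per_s = 230
taalas_tokens_per_s = 16_960

# Ratio of the two generation speeds.
speedup = taalas_tokens_per_s / nvidia_tokens_per_s
print(f"Speedup: about {speedup:.1f}x")  # about 73.7x

# Hypothetical reply length, chosen only to make the gap tangible.
reply_tokens = 1_000
nvidia_seconds = reply_tokens / nvidia_tokens_per_s
taalas_seconds = reply_tokens / taalas_tokens_per_s
print(f"230 tokens/s:    {nvidia_seconds:.2f} s per reply")
print(f"16,960 tokens/s: {taalas_seconds:.3f} s per reply")
```

So on these numbers the gap is roughly 74x: a reply that takes a few seconds at 230 tokens per second would take a small fraction of a second at 16,960.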

The company has secured $219M in investment and has a current valuation of $700M.

Good job!