I think that the concept of world models (together with improved interactions with AI voice and video) hasn’t yet really become part of the products offered by public companies. Players like Google and, to a certain extent, Meta are probably the best positioned, although a revamped Siri might bring back some of the consumer interest in multimodal interaction.
Long term it’s a critical play; short term it still feels like there is a lack of real adoption or a “killer app”.
Couldn't agree more: 'text always seemed like a middle state' truly captures why multi-modal is the only path forward.
Enjoyed this one. 🫡
Can you expand on what you mean by inference getting cheaper? The frontier models are getting increasingly expensive.