Together AI and the Infrastructure Beneath the Model Race

Together AI runs every major open-source model on a single platform. Llama, DeepSeek, Qwen, Kimi, GLM, Mistral. American and Chinese models, side by side, on the same GPUs. The company crossed $1B in annual revenue in February, serves 850,000 developers, and processes trillions of tokens daily.

Co-investors include NVIDIA, General Catalyst, Kleiner Perkins, and Salesforce Ventures.

We have written extensively about open-source AI infrastructure as the defining investment category of this cycle. Together AI is the highest-conviction expression of that thesis.

The model race is accelerating on both sides. Chinese open-source models went from 1.2% of global usage to 30% in twelve months. Western labs are shipping faster than ever. Every new model from either ecosystem is incremental demand for the infrastructure companies powering them.

Together AI sits at the center of that demand. Its chief scientist, Tri Dao, is the lead author of FlashAttention, one of the most widely adopted attention optimizations in the industry. The company runs models up to 2.75x faster than competing providers. It is transitioning to owned data centers, with a 750MW North America buildout planned for 2026.

The infrastructure layer does not pick winners in the model race. It benefits from all of them. That is the position.

Bashar Aboudaoud
Managing Member, UpRound

Copyright 2025 © Design & Developed by Brinc