2025 is the year of inference. We're thrilled to announce our $75M Series C, co-led by IVP and Spark Capital with participation from Greylock, Conviction, basecase capital, South Park Commons, and Lachy Groom. We're also excited to add Dick Costolo and Adam Bain from 01 Advisors as new investors. Check out our CEO Tuhin's blog post to learn more. It's time to build!
Baseten
Software Development
San Francisco, CA · 8,400 followers
Fast, scalable inference in our cloud or yours
About us
At Baseten, we provide all the infrastructure you need to deploy and serve ML models performantly, scalably, and cost-efficiently. Get started in minutes and avoid getting tangled in complex deployment processes. You can deploy best-in-class open-source models and take advantage of optimized serving for your own models. Our horizontally scalable services take you from prototype to production, with light-speed inference on infra that autoscales with your traffic. And best-in-class doesn't mean breaking the bank: run your models on the best infrastructure without running up costs by taking advantage of our scale-to-zero feature.
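To make that workflow concrete, here's a minimal sketch of what calling a deployed model can look like over Baseten's REST API. The model ID and request payload below are illustrative placeholders, and the exact request and response shapes depend on the model you deploy:

```python
import os
import requests

# Placeholder model ID for illustration; a real deployment gets its own
# ID from the Baseten dashboard after you deploy.
MODEL_ID = "abc123"

resp = requests.post(
    f"https://model-{MODEL_ID}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {os.environ['BASETEN_API_KEY']}"},
    # The request body is whatever your model's predict function expects;
    # a prompt field is just one common example.
    json={"prompt": "What is ML inference?"},
)
resp.raise_for_status()
print(resp.json())
```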
- Website: https://www.baseten.co/
- Industry: Software Development
- Company size: 51-200 employees
- Headquarters: San Francisco, CA
- Type: Privately Held
- Specialties: developer tools and software engineering
Products
Baseten
Machine Learning Software
Locations
- San Francisco, CA, US (Primary)
- New York, NY, US
Updates
-
The first Baseten bot is live on Poe! It's very fast: you can ask questions in your language of choice and get instant answers. We're excited to partner with Quora to power the fastest open-source models for the Poe community. Our first bot is powered by Qwen, and we have lots more coming. If you have requests, let us know!
-
🚀 We're thrilled to introduce Baseten Embeddings Inference (BEI), the fastest embeddings solution available! 🚀

Embedding, reranker, and classification models power a huge part of the AI landscape. From search and retrieval to enterprise AI agents, performant embeddings inference is the backbone of an excellent user experience. After working with AI builders who ship embeddings pipelines to millions of users across the globe, we saw the need for a more performant solution. Other solutions on the market focus on scale (an inherent feature of Baseten's infrastructure) or on accuracy at the model level, and simply fall short on throughput and latency. That's why we built BEI.

BEI is optimized specifically for embeddings workloads, which often receive high volumes of requests and require low latency for individual queries. Compared to any other embeddings solution, BEI provides:
• The highest-throughput inference (over 2x higher)
• The lowest-latency inference (over 10% lower)
• The smallest memory footprint (67% less)

Coupled with our optimized cold starts, elastic horizontal scale, and five-nines uptime, you can use BEI with open-source, custom, or fine-tuned models, or as part of compound AI systems, for fast, reliable inference in production.

Learn more in our launch blog: https://lnkd.in/ebnrZgiy

Shoutout to Michael Feil on our model performance team for his work carefully optimizing BEI for production AI workloads!
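For a feel of what querying an embeddings deployment can look like, here's a minimal sketch assuming an OpenAI-compatible embeddings route. The deployment URL, route, and model name below are illustrative assumptions, not documented specifics, so check your deployment's details for the exact endpoint:

```python
import os
import requests

# Assumed deployment URL for illustration only; substitute your own model ID.
# We assume here that the deployment exposes an OpenAI-compatible
# /v1/embeddings route.
BASE_URL = "https://model-abc123.api.baseten.co/environments/production/sync"

resp = requests.post(
    f"{BASE_URL}/v1/embeddings",
    headers={"Authorization": f"Api-Key {os.environ['BASETEN_API_KEY']}"},
    json={
        "model": "my-embedding-model",  # placeholder model name
        "input": ["fast inference", "scalable embeddings"],
    },
)
resp.raise_for_status()
# OpenAI-style responses return one vector per input under "data".
vectors = [item["embedding"] for item in resp.json()["data"]]
print(f"got {len(vectors)} embeddings of dimension {len(vectors[0])}")
```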
-
Thank you Wing Venture Capital and Eric Newcomer for recognizing Baseten in the Enterprise Tech 30! Shout out to our incredible customers and partners for your support. Congrats to the other winners! Check out the full list: https://lnkd.in/eB5KYpXX
-
Thanks to everyone who came by to see us at NVIDIA GTC last week! If you didn't catch us then (or even if you did), you can meet the Baseten team at KubeCon London next week and Google Cloud Next the week after. Swing by the booths any time, or book a demo or coffee chat here:
📍 KubeCon London: https://lnkd.in/ej_89myr
📍 Google Cloud Next: https://lnkd.in/ez45bhWq
-
Baseten reposted this
#Hiring: Infrastructure Engineers at Baseten

The infra team is building the backbone of Baseten's ML inference platform, tackling fascinating challenges in distributed systems and resource optimization. If you're passionate about building scalable infrastructure for ML workloads and want to join a Series C startup making an impact in the AI space, check out the open roles:

Infrastructure Software Engineer: https://lnkd.in/gVpy-rQZ
Senior Infrastructure Software Engineer: https://lnkd.in/gdgi69VX

DMs open for questions! #AIInfrastructure #Hiring #MLOps
-
Baseten is growing! If you're looking for your next opportunity, take a look at our 18 open roles across engineering and GTM. Apply directly or share with someone you think would be a great fit! You can find the full list here: https://lnkd.in/eBiU-5kn
-
Thanks to everyone who came to Pankaj G. and Philip Kiely's talk on inference optimization yesterday. We had a full house! If you haven't had time to talk to the team at NVIDIA GTC yet, you can catch us all day at booth #1233. Get a demo, pick up some swag, or grab coffee with one of our execs (we're serving Baseten Blend all day).