About
LLM Architect learning to innovate, optimize, and scale the next generation of large…
Articles by Santosh
Contributions
-
How can you use software architecture to advance your career?
Sharing your knowledge and experience is one of the best ways to advance your career. I recall countless times when people have reached out to me to review their designs. During these exchanges, I found that doing so not only helps me improve my existing skills but also keeps me grounded and lets me experiment with the various principles on which a typical software architecture is based.
Activity
-
Some meetings are just meant to happen! 😃. Walked into Joe & The Juice Cafe and boom — there’s Roy! (Anandamoy Roychowdhary). After all these years,…
Liked by Santosh Sawant
-
Day 28: Quantum Wavefunction Evolution with CUDA Today, I implemented a CUDA-based simulation of the time evolution of a quantum wavefunction using…
Liked by Santosh Sawant
-
🚀 Just dropped a new tutorial: Build Your Own Medical Mini-DeepSeek R1 with Reinforcement Learning — for under $3 on a T4 GPU. The RL finetuned…
Liked by Santosh Sawant
Experience
Education
-
Visvesvaraya Technological University
-
Activities and Societies: Recipient of Merit Scholarship from VTU for academic performance.
-
-
Activities and Societies: Recipient of Merit Scholarship from BVBCET for academic performance.
Licenses & Certifications
Recommendations received
2 people have recommended Santosh
More activity by Santosh
-
DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails The rapid advancement of large language models (LLMs) has increased the…
Shared by Santosh Sawant
-
How do you currently deploy open LLMs? With vLLM, with Kubernetes? vLLM production-stack is a new open-source, batteries-included reference…
Liked by Santosh Sawant
-
Let’s dive into Group Relative Policy Optimization (GRPO), the loss function used in the RL training process by DeepSeek. 📔 Background Info GRPO is…
Liked by Santosh Sawant
-
To help developers securely experiment and build their own specialized agents, the 671-billion-parameter DeepSeek-R1 model is now available as an…
Liked by Santosh Sawant
-
DeepSeek-R1: Incentivizing Reasoning Capability in Large Language Models via Reinforcement Learning A typical training process for LLMs consists of…
Shared by Santosh Sawant
-
C++ remains one of the top choices of programming languages for mission-critical systems and software that interface with hardware. There’s already…
Liked by Santosh Sawant
-
Mind Evolution: Evolving Deeper LLM Thinking. Google recently released an evolutionary search strategy for scaling inference-time compute in…
Shared by Santosh Sawant
-
Don't just study how diffusion models work - train one! Sony Research released Micro Diffusion, a minimal implementation that allows training a…
Liked by Santosh Sawant
-
MiniMax-01: Scaling Foundation Models with Lightning Attention. Recently, long-context LLMs have been pivotal in the further advancement of generative…
Shared by Santosh Sawant
-
Nobody will hire you to code without end goals. Solve problems: 1) Take a LM, compress its KV-Cache (choose technique). Try to retain its…
Liked by Santosh Sawant
-
How can AI make reading more enjoyable? What would an AI-powered reading experience look like? Over the holidays, I prototyped aireadingclub.com to…
Liked by Santosh Sawant
-
Separator tokens like the new line and period character seem to be quite important to LLMs. SepLLM uses this finding to create a special attention…
Liked by Santosh Sawant
-
Here is everything that happened in AI Agents this week 🧵 (save for later) 1/ Alex Reibman shared his vision for the modern AI Agent…
Liked by Santosh Sawant
-
Many people are asking me how to start learning CUDA and Triton. Honestly, this is the only resource you need. I teach CUDA and Triton from scratch…
Liked by Santosh Sawant
-
Excited to share insights from Walmart's groundbreaking semantic search system that revolutionizes e-commerce product discovery! The team at…
Liked by Santosh Sawant
-
One of my favorite lectures on ML/LLMs in 2024: Hyung Won Chung from OpenAI - "Don't teach. Incentivize." - https://lnkd.in/eANKf4ND
Liked by Santosh Sawant
-
DeepSeek v3 is the most powerful open source AI model to be released! I read through their technical report / paper, and found some cool things: 1…
Liked by Santosh Sawant