Soumith Chintala

I am an AI researcher, engineer and community builder. I occasionally invest.
You can reach me at myfirstname@gmail.com

Some things about me:

Live in New York. Grew up in Hyderabad, India.
Studied at NYU and VIT Vellore.
Currently at Meta and NYU, focusing on AI Infrastructure, AI Research and Robotics.
Co-founded PyTorch, maintained Torch-7 and actively work on open-source.
- PyTorch is the most impactful project that I've been involved in, and now powers most of the world's AI research and product -- with hundreds of companies, research labs and individuals using and maintaining it. It has significant and tangible real-world impact from self-driving cars (Tesla, Cruise, Uber, etc.) to drug discovery (AstraZeneca, Genentech, etc.) to cancer research (Bamfield / MARS Company) to NASA and other space research to several consumer products (Instagram, TikTok, Snapchat, Pinterest, etc.), and this amount of real-world usage often intimidates me. There has been more than one bug in PyTorch where, once it was uncovered, I couldn't sleep thinking about what the downstream effects could be. PyTorch has been impactful far beyond my expectations or ambition.
- Torch-7 at its peak was used by Google DeepMind, Twitter and Meta, and significant AI research was powered by it.
- Created convnet-benchmarks, a once-popular benchmarking suite for deep learning. It was used from 2015 to 2017 as a gold standard by NVIDIA, AMD, Intel Nervana and many other hardware manufacturers to optimize their systems.
- I used to maintain EBLearn, a C++ based Deep Learning Framework pre-2012, together with Pierre Sermanet and Yann LeCun.
Published well-cited AI research in Generating Images and other AI things.
- Some of my most cited work is on Generative Adversarial Networks (GANs), where I co-authored three well-cited papers: LAPGAN (demo), DCGAN (code/demo) and Wasserstein GAN. I gave up on GANs after failing to make them stable training algorithms.
- I've worked on object and human detection, generative modeling of {images, videos}, AI for video games, ML systems research.
- A full list of my peer-reviewed or pre-print manuscripts is on my Google Scholar page.
Working on automating chores at home using robots.

Some thoughts and directions:

I embrace laziness. Automate everything that you don't want to do.
I embrace simplicity and reducing paging, even if complexity has upside.
- In programming – fast iteration, fast time-to-first-output. For a large enough project, it breaks.
I embrace iterating and incremental ambition.
- I think products and communities have to be rapidly iterated and evolved to get to a good state – with feedback loops and incentive structures designed very carefully to get to good dynamics.
I avoid toy or hypothetical problems, even as proxies.
- I’ve seen way too many people build up their own toy complexity and solve useless problems.
- I ground work towards applications with obvious benefits. This makes my funnel limited, but I’m okay with it.
I rabidly love open-source and open research.
- This love is born out of growing up without answers to all the questions I had.
- Giving away technology and knowledge is one of the best ways to level the playing field.
I think deeply about communities, ecosystems, incentives, social dynamics and power.
- I’ve studied them via two avenues so far: open-source ecosystems and large organizations.
- Maintaining Torch-7 was interesting – a large project written in a niche language. It’s been very interesting to contrast that to PyTorch.
Keeping close to the grassroots is really important to build great products and organizations.
- I’ve answered thousands of questions across the PyTorch and Torch forums, investing a significant portion of my lifetime on this.
- Grassroots signals have been disproportionately important to me in building good products and organizations.

Home Robots:

I want to build a household robot that helps me with all kinds of chores. To help this robot reason well with little data, I want to build a world simulator (so that it can roll out scenarios in its head and pick the best ones). To build this world simulator, I’ve been interested in multi-modal models (that combine vision, speech, text, robots), generative models (for vision and speech) and efficient representations for encoding the human-centric world.
I’ve recently made a decent amount of progress on home robotics at NYU, with my collaborator Lerrel Pinto, often using a Hello Robot | Stretch. Here’s some of my recent work that I’m excited about:
- Robot Utility Models: enable basic tasks – door opening, drawer opening, object reorientation, etc. – at ~90% accuracy without ANY finetuning (i.e. zero-shot) in unseen new environments.
- On Bringing Robots Home: 109 tasks. 10 NYC homes. 81% success rate. 20 minutes to learn a new task.
- Dexterity from Touch: Self-Supervised Pre-Training of Tactile Representations with Robotic Play
- CLIP-Fields: Weakly Supervised Semantic Fields for Robotic Memory
- Holo-Dex: Teaching Dexterity with Immersive Mixed Reality

Investments:

I mostly invest within my network – when friends start companies. Rarely do I invest otherwise. I’ve invested in Runway, 1X, Osmo, Anthropic, Together.ai and Lepton, among others.