
Kabir's Tech Dives
I'm always fascinated by new technology, especially AI. One of my biggest regrets is not taking AI electives during my undergraduate years. Now, with consumer-grade AI everywhere, I’m constantly discovering compelling use cases far beyond typical ChatGPT sessions.
As a tech founder for over 22 years, focused on niche markets, and the author of several books on web programming, Linux security, and performance, I’ve experienced the good, bad, and ugly of technology from Silicon Valley to Asia.
In this podcast, I share what excites me about the future of tech, from everyday automation to product and service development, helping to make life more efficient and productive.
Please give it a listen!
Episodes
246 episodes
🗣️ Dia: New Open Source Text-to-Speech Model
Nari Labs, a two-person startup, has launched Dia, an open-source text-to-speech model. This model, boasting 1.6 billion parameters, is designed to generate natural-sounding dialogue from text, even incorporating emotional tones and nonverbal c...
•
Season 3
•
Episode 19
•
11:31

🐬 DolphinGemma: AI Decodes Dolphin Communication
Google AI has developed DolphinGemma, a new AI model, to help scientists at the Wild Dolphin Project decode the complex communication of Atlantic spotted dolphins. Trained on decades of dolphin vocalization data, DolphinGemma i...
•
Season 3
•
Episode 18
•
10:44

🛡️ Microsoft SFI April 2025 Security Progress Report
This episode is about Microsoft's April 2025 progress report on its Secure Future Initiative (SFI), a comprehensive, multiyear effort to enhance the security of its products and services. The report highlights advancements across var...
•
Season 3
•
Episode 17
•
17:46

LLM Advancements, Applications, and Industry Impact in 2024-2025
This episode explores the current landscape and future trajectory of large language models (LLMs) and generative AI. One document details ten practical applications of LLMs in 2024, highlighting tools like ChatGPT and Grammarly, while another i...
•
Season 3
•
Episode 16
•
16:48

🎬 AI's Impact and Innovations in Video Production
This episode explores the burgeoning field of AI in video production, highlighting advancements like Runway Gen-4's precise camera controls and the emergence of powerful generative models such as OpenAI's Sora and the open-source Open-Sora proj...
•
Season 3
•
Episode 15
•
28:03

🤖 AI Business Risks, Cost Reduction, Defense, Training, and Manus AI
This episode introduces Manus AI, an autonomous AI agent from a Chinese startup, highlighting its ability to execute complex tasks with minimal user input, setting it apart from tools like ChatGPT and DeepSeek. Manus AI boasts mult...
•
Season 3
•
Episode 14
•
19:09

🎬 One-Minute Video Generation via Test-Time Transformer Training
Researchers introduced Test-Time Training (TTT) layers to enhance the ability of pre-trained Diffusion Transformers to generate longer, more complex videos from text. These novel layers, inspired by meta-learning, allow the model's hidde...
•
Season 3
•
Episode 13
•
14:55

The Next Token and Beyond: Unraveling the LLM Enigma
Yes, I can certainly provide a long and detailed elaboration on the topics covered in the sources, particularly focusing on LLM-generated text detection and the nature of LLMs themselves. The emergence of powerful Large Language Models (L...
•
19:27

🤖 AI and Machine Learning: A Multi-Source Overview
This episode provides a comprehensive exploration into the realm of Artificial Intelligence (AI) and Machine Learning (ML), specifically within the context of educational environments. At its core, AI is defined as the simulation of human intel...
•
Season 3
•
Episode 12
•
17:05

🤖 AI Trends and Innovations for 2025
This episode explores the anticipated trajectory of artificial intelligence in 2025, highlighting key trends impacting various sectors. AI agents, capable of autonomous reasoning and action, are a prominent focus across multiple sources....
•
Season 3
•
Episode 11
•
17:52

🥊 AI Giants Compete for College Students
OpenAI and Anthropic are actively competing to become the primary AI tool for college students. Both companies have recently unveiled initiatives aimed at higher education, with Anthropic introducing Claude for Education and OpenAI makin...
•
Season 3
•
Episode 6
•
9:06

🤖 Therabot: AI Chatbot Shows Mental Health Therapy Benefits
Dartmouth researchers conducted a clinical trial of their AI-powered therapy chatbot, Therabot, and found significant mental health improvements in participants with depression, anxiety, and eating disorder risks. The study showed sympto...
•
Season 3
•
Episode 10
•
13:34

📉 Microsoft Adjusts AI Data Center Growth Amid New Trends
Microsoft is reportedly scaling back its ambitious AI data center expansion plans. This decision follows the emergence of new, more cost-effective AI model development methods, particularly from Chinese companies. These methods demonstra...
•
Season 3
•
Episode 9
•
12:52

🚀 Efficient and Portable Mixture-of-Experts Communication
A team of AI researchers has developed a new open-source library to enhance the communication efficiency of Mixture-of-Experts (MoE) models in distributed GPU environments. This library focuses on improving performance and portability...
•
Season 3
•
Episode 8
•
16:59

🤝 Vana: User-Owned AI Models from Decentralized Data
Vana, a decentralized platform originating from an MIT project, aims to shift control of data used for AI training back to individual users. Frustrated by the current model where tech companies profit from user data, Vana allows individuals to ...
•
Season 3
•
Episode 5
•
11:09
🤖 AI and Copyright: US Copyright Office Report
In a January 2025 report, the U.S. Copyright Office addresses the copyrightability of works created using artificial intelligence. This second part of a broader study examines the level of human contribution necessary for AI-generated ou...
•
Season 3
•
Episode 7
•
20:47

Unleashing Local AI on Your Mac Studio - From Ollama to DeepSeek
Are you intrigued by the power of AI but concerned about privacy or cloud costs? In this episode, dive into the exciting world of running Large Language Models (LLMs) directly on your Mac, iPhone, and iPad! We'll explore how tools like Ollam...
•
Season 3
•
Episode 4
•
14:11

🤖 Agentic AI Courses and Learning Resources
This episode provides information about agentic AI and AI agent courses available in 2025. The courses cover topics like AI fundamentals, building AI agents, prompt engineering, and strategic implementation, catering to diverse skill levels and...
•
Season 3
•
Episode 3
•
15:05

🎭 DreamActor-M1: Hybrid Guided Holistic Human Image Animation
This episode covers a research paper introducing DreamActor-M1, a new realistic human image animation framework. This DiT-based method utilizes hybrid guidance combining facial representations, 3D head spheres, and body skeletons for fin...
•
Season 3
•
Episode 2
•
19:21

🤖 DreamActor-M1: Human Image Animation
DreamActor-M1 is a new framework for animating human images based on a diffusion transformer, utilizing a hybrid guidance system. This approach enables more precise control over the entire body, adapts to different image scales, and main...
•
Season 3
•
Episode 1
•
14:53

🔬 On the Biology of a Large Language Model
Researchers used a novel "circuit tracing" method to explore how Claude 3.5 Haiku works internally. They mapped out how the model handles tasks like reasoning, poetry, translation, and math, identifying key features and how they interact. The s...
•
Season 2
•
Episode 109
•
22:31

🖼️ GPT-4o: Advancing Useful and Creative Image Generation
OpenAI has introduced 4o Image Generation, a new feature integrated into GPT-4o, designed to create useful and visually accurate images. This multimodal model aims to excel in tasks like precise text rendering and detailed inst...
•
Season 2
•
Episode 108
•
16:19

🤖 The Cybernetic Teammate: AI Reshaping Teamwork and Expertise
This working paper from 2025 details a field experiment at Procter & Gamble investigating how generative AI impacts teamwork and expertise in new product development. The study compared the performance, expertise sharing, and social engagem...
•
Season 2
•
Episode 107
•
13:56

🗣️ OpenAI.fm: Interactive Text-to-Speech Platform and Startup Implications
OpenAI.fm, launched on March 20, 2025, is an interactive platform showcasing OpenAI's advanced text-to-speech (TTS) technology, specifically the GPT-4o-mini-tts model. This tool provides users with the ability to convert text into highly...
•
Season 2
•
Episode 106
•
13:51

🍎 Apple's Hardware and Software to Overcome AI Delays
Despite facing acknowledged difficulties in the realm of artificial intelligence, this 9to5Mac article argues that Apple possesses significant advantages in its upcoming hardware and software innovations. The author suggests that ...
•
Season 2
•
Episode 105
•
11:29
