
Kabir's Tech Dives
I'm always fascinated by new technology, especially AI. One of my biggest regrets is not taking AI electives during my undergraduate years. Now, with consumer-grade AI everywhere, I’m constantly discovering compelling use cases far beyond typical ChatGPT sessions.
As a tech founder of more than 22 years focused on niche markets, and the author of several books on web programming, Linux security, and performance, I’ve experienced the good, the bad, and the ugly of technology from Silicon Valley to Asia.
In this podcast, I share what excites me about the future of tech, from everyday automation to product and service development, helping to make life more efficient and productive.
Please give it a listen!
🎬 One-Minute Video Generation via Test-Time Transformer Training
Researchers introduced Test-Time Training (TTT) layers to help pre-trained Diffusion Transformers generate longer, more complex videos from text. Inspired by meta-learning, these layers let the model's hidden states keep adapting while the video is being generated, rather than staying fixed after training. To validate the approach, the researchers built a dataset of annotated Tom and Jerry cartoons for training and evaluation. In human evaluations, their model with TTT layers outperformed existing methods at generating coherent, minute-long videos with multi-scene stories and dynamic motion. The results are promising, but the generated videos still show some artifacts, and the method's efficiency could be improved. Overall, the study is a step toward longer, story-driven videos created from textual descriptions.
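To make the idea of a hidden state that "adapts during generation" more concrete, here is a minimal toy sketch in plain Python. It is not the researchers' implementation. It assumes the hidden state is a single weight matrix `W`, that the self-supervised task is simply reconstructing each token from itself, and that one gradient step is taken per token. The function name `ttt_linear_scan` and the `lr` parameter are hypothetical, chosen only for illustration.

```python
def ttt_linear_scan(tokens, lr=0.1):
    """Toy test-time-training scan (illustrative assumption, not the paper's code).

    The "hidden state" is a weight matrix W. For every incoming token x,
    W takes one gradient step on the loss 0.5 * ||W x - x||^2, and the
    layer's output for that token is computed with the updated W.
    """
    dim = len(tokens[0])
    # Hidden state: a dim x dim matrix, starting at zero.
    W = [[0.0] * dim for _ in range(dim)]
    outputs = []
    for x in tokens:
        # Prediction with the current hidden state: pred = W x
        pred = [sum(W[i][j] * x[j] for j in range(dim)) for i in range(dim)]
        # Reconstruction error for the toy self-supervised task.
        err = [pred[i] - x[i] for i in range(dim)]
        # Gradient of the loss with respect to W is outer(err, x).
        for i in range(dim):
            for j in range(dim):
                W[i][j] -= lr * err[i] * x[j]
        # Output uses the hidden state after it has adapted to this token.
        outputs.append(
            [sum(W[i][j] * x[j] for j in range(dim)) for i in range(dim)]
        )
    return outputs, W


if __name__ == "__main__":
    # Feeding the same token repeatedly: each output gets closer to the token,
    # showing the state "learning" as the sequence is processed.
    outs, _ = ttt_linear_scan([[1.0, 0.0]] * 3, lr=0.5)
    for step, out in enumerate(outs):
        print(step, out)
```

The point of the sketch is only that the layer's state is updated by a learning step at inference time, so later tokens are processed by a state shaped by the earlier ones. A real TTT layer inside a video model would use learned projections and a richer inner model than this toy matrix.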
Podcast:
https://kabir.buzzsprout.com
YouTube:
https://www.youtube.com/@kabirtechdives
Please subscribe and share.