Blog

DeepSeek in the Uncanny Valley of Paradoxes

The funky post title is inspired by DeepSeek-R1’s response to one of our paradoxical situations, in which it used the apt phrase ‘Ethical Uncanny Valley’.

Yashika Mittal, Aman Paliwal, Pankaj Pansari

Elegant Tensors: Attention Forward & Backward Passes using PyTorch Einsum/Einops

I was looking at the FlashAttention-2 paper recently. It’s about optimizing the forward and backward passes through the attention layer - this being the bottleneck to scaling…

A Closer Look at Helgrind for Concurrency Issues

Concurrency issues in multi-threaded programs are notoriously hard to track down. Helgrind is a dynamic analysis tool, part of the Valgrind suite, that helps detect such…

Some Observations on Locks and Threads

Consider the simple multi-threaded program here; each thread increments the shared global variable ‘counter’ ‘max’ times (we won’t write a program this way to…

Profiling LLM Inference

Let’s use PyTorch Profiler to get a better understanding of what happens under the hood during LLM inference. The model we’re going to use is the 1B parameter…

Throughput vs Latency - GPU vs CPU

CPUs are optimized for latency; GPUs are optimized for throughput. They are independent processors, so they can and should work on different things at the same time.

Automated Qualitative Feedback on Programming Assignments

I’ve been teaching courses on Computer Systems and Operating Systems at Plaksha University. Programming assignments are an integral part of such systems courses. The…

Grounding Language Models in the Physical World

I recently listened to a podcast episode on The Robot Brains where Jitendra Malik, an eminent computer vision researcher, shared his thoughts and experiences on grounding…

A Probabilistic Perspective on Regularization

Regularization is a common technique in machine learning to prevent overfitting. The two most widely used regularizers are the L2 and L1 norms. In this post, we look at how…

Machine Learning in Research versus Production

I have been going through the book ‘Designing Machine Learning Systems’ by Chip Huyen to better understand how machine learning systems are deployed in production in…

An Introduction to Large Language Models

Large language models (LLMs) are very large deep learning models that aim to predict and generate sensible text in a natural or symbolic human language. LLMs, in other…