Viswanath (Vish) Sivakumar

It's the question, stupid

April 10, 2026

The parameter that has the largest impact in research is taste. Taste is about selecting what to work on. This often requires reframing what looks like a solved problem to ask an entirely new question.

I love the following passage from the CLIP paper, written when much of the rest of the vision world was hill-climbing on supervised datasets.

In computer vision, zero-shot learning usually refers to the study of generalizing to unseen object categories in image classification (Lampert et al., 2009). We instead use the term in a broader sense and study generalization to unseen datasets. We motivate this as a proxy for performing unseen tasks, as aspired to in the zero-data learning paper of Larochelle et al. (2008). While much research in the field of unsupervised learning focuses on the representation learning capabilities of machine learning systems, we motivate studying zero-shot transfer as a way of measuring the task-learning capabilities of machine learning systems. In this view, a dataset evaluates performance on a task on a specific distribution.