NeurIPS 2022
This past year I attended the Conference on Neural Information Processing Systems for the first time, to present a poster for a paper accepted at the workshop on Information Theoretic Principles in Cognitive Systems.
The first day of the conference I attended the NewInML workshop, which was intended for researchers who are new to the machine learning community. Since this was my first NeurIPS and my background is in cognitive science, it felt like the perfect way to start the conference. The topics of this workshop varied from general overviews of researchers' experiences in machine learning to more specific subjects, like navigating the early years of a tenure-track position or how best to negotiate a machine learning engineer job in industry.
I was glad there was a wide variety of topics, as it kept the long day of talks interesting, but my favorite presentation was the first of the day, given by Yoshua Bengio. It was a broad overview of the history of his research, including the early days of work on neural networks before the so-called AI winter. It was interesting to hear Yoshua's perspective on these famous epochs in AI research, and what kept him going and interested in the areas that eventually led to some of the impressive systems we have today. While it is hard to predict the future, if I stay in machine learning research for a while the field will probably experience something like the AI winters of the past, and staying interested in my work will be an important skill.
There were several interesting invited talks during the main conference; one of my favorites was given by Geoffrey Hinton, titled “The Forward-Forward Algorithm for Training Deep Neural Networks”. The talk focused on a more biologically plausible method of training neural networks, as an alternative to the backpropagation method currently in use. Hinton highlighted several issues with the biological plausibility of backpropagation and introduced what he believes is a more plausible training method, which he calls the forward-forward algorithm. There is an accompanying paper on Hinton’s website that explains in detail the motivation, the method, and some preliminary experiments on simple domains.
While the algorithm introduced in the talk is called ‘forward-forward’, Hinton does not suggest that the brain is entirely feed-forward; it is highly recurrent. Rather, he suggests that it could be trained using forward passes alone, as opposed to backpropagation, which lacks a biological basis. The main thrust of the model seemed to me to be similarly motivated to the free-energy principle, relying on a ‘goodness’ measure that determines the probability that the current stimulus is real. This can be connected to other topics like the purpose of sleep, which Hinton suggests is both a way to train the brain to generate its own experience, useful for planning, and a discriminator for better labelling unreal experience.
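To make the ‘goodness’ idea concrete, below is a minimal sketch of a single forward-forward layer update in plain numpy. It assumes the goodness measure described in the accompanying paper (the sum of squared activations in a layer); the layer size, threshold, and learning rate are illustrative choices of mine, not values from the talk.

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(784, 500))   # one hidden layer (sizes assumed)
    THETA = 2.0                                  # goodness threshold (assumed)
    LR = 0.03                                    # learning rate (assumed)

    def forward(x, W):
        # Length-normalise the input so goodness cannot simply be copied
        # forward from the previous layer, then apply a ReLU.
        xn = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
        return xn, np.maximum(xn @ W, 0.0)

    def ff_layer_step(x, W, positive):
        # Goodness g is the sum of squared activations. The layer is nudged
        # so that sigmoid(g - THETA) is high for positive (real) data and
        # low for negative (corrupted) data; no backward pass through the
        # rest of the network is needed.
        xn, h = forward(x, W)
        g = (h ** 2).sum(axis=1)
        p = 1.0 / (1.0 + np.exp(-(g - THETA)))   # P(example is positive)
        coef = (1.0 - p) if positive else -p     # d(log-likelihood)/dg
        grad = xn.T @ (2.0 * h * coef[:, None])  # chain rule through g = sum h^2
        return W + LR * grad / len(x)

    # Each layer is trained locally on real and corrupted batches:
    x_pos = rng.random((32, 784))   # stand-in for real data
    x_neg = rng.random((32, 784))   # stand-in for negative data
    W = ff_layer_step(x_pos, W, positive=True)
    W = ff_layer_step(x_neg, W, positive=False)

Because each layer only needs its own activations and a local objective, there is no requirement to propagate error signals backwards, which is the part of backpropagation Hinton argues is biologically implausible.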
One interesting point from this talk was the scale of the human cortex, which he lists as roughly 10^14 connections with ‘only’ 10^9 seconds to train them: “It is possible that the learning algorithm we have is not so good (as ANNs) at squeezing a lot of knowledge into a few weights, it has the opposite problem… we are data limited rather than capacity limited”. I thought this was an interesting point to bring up in a talk on more biologically plausible neural networks, since most of my work on making machine learning techniques more human-like has focused on explicitly modelling the limitations of human learning. In fact, Hinton used the term “capacity-limited”, which has directly inspired my work on reinforcement learning models that are closer to human-like learning. Still, I don’t see the main point of this talk as disconnected from my own work: both are broadly interested in more human-like machine learning. His talk takes the perspective of better understanding how neural networks can be trained, while my work focuses on how humans leverage experience and biases to learn quickly, in ways that can lead to suboptimal behaviour.
The workshop on Information Theoretic Principles in Cognitive Systems was created by, among others, Sam Gershman and Irina Higgins, both of whom I have cited extensively and take great inspiration from. My paper for this workshop was titled “Learning in Factored Domains with Information-Constrained Visual Representations”. The paper is an extension of the model I developed for my dissertation and of the human experiment run alongside it. The main difference in this workshop paper is that the model uses latent representations of visual information as the input to a hypothesis generation and evaluation method, rather than a separate neural network trained to predict utility. This allowed extremely fast learning in a contextual bandit setting based on images of human faces, with the model selecting the correct option after only 2-3 experiences in the task, despite the highly complex state information. I am excited to see where this research can go, potentially towards designing a new experiment on human learning in the type of task described in this paper, as I think the proposed model would be a strong explanation of human learning while also making predictions about how humans represent visual information during learning.
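For intuition about the setup, here is a rough sketch, not the actual model from the paper, of hypothesis generation and evaluation over factored latent representations in a two-armed contextual bandit; the encoder stub, the hypothesis space, and the noise level EPS are all assumptions made for illustration.

    import numpy as np

    rng = np.random.default_rng(1)
    N_FACTORS, N_TRIALS, EPS = 8, 20, 0.1
    TRUE_FACTOR = 3  # hidden rule: this latent factor marks the rewarded arm

    def encode(_image):
        # Hypothetical stand-in for a pretrained visual encoder that maps an
        # image to binary latent factors; the paper's model would use learned
        # latent representations of face images instead.
        return rng.integers(0, 2, size=N_FACTORS)

    # One hypothesis per factor: 'the rewarded arm equals latent factor i'.
    log_post = np.zeros(N_FACTORS)  # log-posterior over hypotheses

    for t in range(N_TRIALS):
        z = encode(None)
        arm = z[np.argmax(log_post)]        # act on the current best hypothesis
        rewarded_arm = z[TRUE_FACTOR]
        reward = int(arm == rewarded_arm)
        # With two arms the outcome reveals the rewarded arm, so every
        # hypothesis can be evaluated, whether the choice was right or wrong.
        consistent = (z == rewarded_arm)
        log_post += np.where(consistent, np.log(1 - EPS), np.log(EPS))

    print(np.argmax(log_post))  # concentrates on TRUE_FACTOR within a few trials

Even in this toy form, the posterior typically singles out the correct hypothesis within a handful of trials, which is the flavour of the fast, few-shot learning described above.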