How to Pursue a Career in Brain-Based AI (numenta.com)
34 points by takiwatanga on April 11, 2022 | 21 comments


Was hoping for something a little deeper than "Personal projects", "blog", "read books", and "go to relevant conferences". But then I noticed it was written by a content marketer and not a 'Brain-Based AI' engineer.

I wonder how many members of their team have a background similar to what this article suggests (rather than PhDs and MScs in the field).

Edit: Didn't have to wonder for long: they're hiring interns and want PhD students with a history of publishing at relevant conferences. So they themselves wouldn't hire an intern based on their own advice!


As a machine learning researcher for over a decade I clicked to understand what "brain-based AI" is. I learned it's a made-up term by the author to maximize click-through rate. Pass.


classic numenta TM


This is an area that a lot of people are probably interested in. We still don't really have practical robots, self-driving cars, or personal assistants, and some sort of general, context-sensitive learning breakthrough is needed to get us there. Traditional machine learning and deep learning have not been sufficient so far, and brain-based approaches might be. It would be good to hear about more of the companies pursuing this sort of thing (maybe with less of a focus on brains and more on general understanding), and how to proactively contribute beyond the Numenta-specific content here. It's a start, though!


A lot of computational neuroscience guys are in DL now, myself included. Hard to do basic research when you're not funded, and there is this new shiny thing with million dollar compute.


Can you answer a noob question I've always wondered about?

Is there any reason AI research has to run at fast speeds? Like, any modern learning model that you're researching with could certainly run on a million-dollar GPU farm or whatever... but it could also run on a MacBook in 1000x the time, right? And isn't that enough speed to determine whether your algorithms are performing the way you expect, even if they aren't fast enough to do fun interactive stuff like realtime video or whatever?


The speed "only" really matters for practical reasons, but they're pretty big reasons. In principle, anything you do on a GPU can be done on a CPU or even with pencil and paper--it's just arithmetic, after all.

However, the speedup is so big that it's almost impossible to ignore. One way to measure compute speed is in terms of Floating Point Operations per Second, or FLOPS. A recent-ish CPU is probably ~500 gigaflops, while a single A100 GPU is ~150 teraflops[0]. OpenAI reportedly has a cluster with 10,000 V100 GPUs (the A100's predecessor, but still...). GPT-3 still supposedly took about a month to train on that cluster, so for all intents and purposes it would never finish on a MacBook. Few groups operate at that scale, but using even one decent GPU is still such a huge speedup that I doubt many people start with less.
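
A rough back-of-the-envelope using the numbers above (the per-device FLOPS and the one-month figure are all approximations, and perfect utilization is assumed on both sides):

    # Back-of-the-envelope only; every figure here is an approximation.
    cpu_flops = 500e9      # ~500 GFLOPS for a recent-ish CPU
    v100_flops = 125e12    # assumed per-V100 throughput (marketing number)
    cluster_gpus = 10_000  # reported size of the OpenAI cluster
    train_days = 30        # "about a month" of training

    # Total work implied by a month on the cluster, in floating point operations.
    total_ops = cluster_gpus * v100_flops * train_days * 86_400

    # Time for the same work on one CPU, assuming perfect utilization everywhere
    # (it isn't, but the ratio is the point).
    cpu_years = total_ops / cpu_flops / (365 * 86_400)
    print(f"~{cpu_years:,.0f} CPU-years")  # on the order of 200,000 years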

It's also worth noting that "training" the model from examples is often a lot more compute-intensive than using it to do "inference." For example, image recognition models are often trained on large clusters, but can be deployed to something much smaller (a phone or laptop). There's a whole subfield of "distillation", which takes large models and finds ways to simplify them for deployment.
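
If it helps to see it concretely, here's a minimal sketch of the classic soft-target distillation loss in PyTorch. The temperature and loss weighting are illustrative defaults, not values from any particular system:

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels,
                          temperature=4.0, alpha=0.5):
        # Standard cross-entropy against the ground-truth labels.
        hard = F.cross_entropy(student_logits, labels)
        # KL divergence pushing the student toward the teacher's softened
        # output distribution; the T^2 factor keeps gradients comparable.
        soft = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean",
        ) * temperature ** 2
        return alpha * hard + (1 - alpha) * soft

    # Toy usage: 8 examples, 10 classes.
    student_logits = torch.randn(8, 10, requires_grad=True)
    teacher_logits = torch.randn(8, 10)
    labels = torch.randint(0, 10, (8,))
    distillation_loss(student_logits, teacher_logits, labels).backward()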

[0] There's some marketing involved in these numbers, different precisions, and the GPU does work in parallel, so you're not getting one operation every 1/150e12 second.


Thanks, that's not the answer I wanted but it was sadly convincing.


Saw Jeff Hawkins talk at Strange Loop years ago, and I thought his book was brilliant. What has this company ever accomplished in machine learning/AI? I have seen a few videos on YouTube teaching about HTM theory, but it's not clear that the theory results in anything even marginally useful. What am I missing?


In modern deep learning there are multiple approaches that can be argued to be close to some of the inspirations from neuroscience: Capsule Networks, Helmholtz Machines, Energy-Based Models (Score-Based Generative Modeling / Diffusion Models), Associative Memory.

Information theory, Bayesian methods, and approximate computation are more relevant sources of inspiration. Neuroscience is not the field that studies intelligent behavior.


> What am I missing?

Not much!


An HTM layer requires far more parameters than a traditional deep neural network, on the order of gigabytes for one or two basic HTM layers. A cortical column is going to have many such layers, and a 1000-brains model will have many (thousands?) of cortical columns. In my opinion, the underlying idea is fantastic but the practical aspects of implementing it are daunting.
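
To put very rough numbers on that (every count below is a guess for illustration, not something Numenta has published):

    # Illustrative sizing only; all three counts are assumptions.
    bytes_per_layer = 1.5e9  # "on the order of gigabytes" per basic HTM layer
    layers_per_column = 6    # "many such layers" per cortical column (a guess)
    num_columns = 1000       # "thousands?" of columns in the full model

    total_bytes = bytes_per_layer * layers_per_column * num_columns
    print(f"~{total_bytes / 1e12:.0f} TB")  # ~9 TB under these guesses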


That's what killed deep learning in the 90s, right? The ideas were there, they were just impractical, and now they're Python libraries on a laptop. Could be similar here, although I don't know anything about the viability of their HTM layers!


What is "HTM layer"?


Hierarchical Temporal Memory


I know what HTM stands for. I'm asking what is "HTM layer".


An HTM layer is one feedforward layer in a Hierarchical Temporal Memory network. See for example https://arxiv.org/abs/1511.00083, which trains one-layer HTM models, and https://timallanwheeler.com/blog/2022/01/09/hierarchical-tem... for a more approachable blog post about the same topic.


That's a very nice blog post, thanks!

I'm not sure why you think HTM layers are bigger than modern DL layers. The HTM layer configuration used in the paper (B=128, M=32, N=2048, and K=40) is 335M parameters. Compare to GPT-3 with 96 layers, where each layer has 1.8B parameters. Much larger models than GPT-3 have already appeared with no end in sight as to how much more they can scale.
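
Quick sanity check on those figures, assuming the HTM parameter count is just the product of the four configuration values and taking ~175B total parameters for GPT-3:

    # HTM layer from the paper's configuration (params assumed to be B*M*N*K).
    B, M, N, K = 128, 32, 2048, 40
    print(f"HTM layer: {B * M * N * K:,} parameters")  # 335,544,320, i.e. ~335M

    # GPT-3: ~175B parameters spread across 96 transformer layers.
    print(f"GPT-3 layer: {175e9 / 96 / 1e9:.1f}B parameters")  # ~1.8B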

The point is, if HTM worked, people would throw compute resources at it, just like they do with DL models. But it doesn't.


Yeah, Numenta is not going anywhere. DL took over the serious applications a while back, and Numenta doesn't have anything really substantial to offer compared to DL.


I'm actually surprised they are still around...


There's a company based in Switzerland doing this sort of thing. Unfortunately I've forgotten the name; it was spun out of some academic research.



