Is it the NN model, or the compute?
Good topic. I think they are inextricably linked; part of the AI team's job is to decide how to develop and tweak the model(s) between Mothership training and in-car inference.
The HW3 compute is a limit they can't change in those cars (setting aside the dream of an HW5 retrofit program). However, they can improve its performance by applying more and better-architected compute on the training side, by increasing the amount of training data and by curating it better. Ashok actually talked about all this in the earnings call the other day.
HW3 I think has 144 TOPS. If we put the same NN on a 2000 TOPS processor, would it react faster, or the exact same?
With disclaimers about me not being an ML engineer:
I don't believe that simply running the same generated inference model at a higher clock speed would result in better driving behavior; in fact, it would probably mess things up pretty badly. It might work for Go, chess, or SAT-taking, but driving is a real-time, real-world activity.
You don't want reactions to every stimulus to happen faster or earlier; you want faster (and broader) analysis and decision-making, giving more time to plan appropriately. On the training side, more examples of people pulling out too late, too slowly, or indecisively, or giving clues that they don't see you or that they suck at driving, will all help.
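A toy sketch of that point: in a fixed-rate control loop, faster hardware doesn't make actions land earlier, it just leaves more headroom inside each tick for a bigger model. All numbers here are hypothetical (a 36 Hz tick is my assumption, not Tesla's actual control rate), and `infer` is a stand-in for the NN forward pass:

```python
import time

TICK_HZ = 36                 # hypothetical fixed control rate
TICK_S = 1.0 / TICK_HZ

def run_control_loop(infer, ticks):
    """Run `infer` once per fixed tick and return the average headroom.

    Leftover time inside a tick is headroom for a more sophisticated
    model, not an earlier reaction: actions still land on the tick
    boundary."""
    headroom = []
    for _ in range(ticks):
        start = time.perf_counter()
        infer()                              # stand-in NN forward pass
        used = time.perf_counter() - start
        headroom.append(max(0.0, TICK_S - used))
        # a real loop would sleep out the remainder of the tick here
    return sum(headroom) / len(headroom)
```

A 10x faster chip shrinks `used`, but the tick boundary is fixed, so the win is what you can compute per tick, not when you act.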
A faster inference computer helps because it can execute a more sophisticated model: not just reacting faster, but processing more clues to anticipate better and to evolve the plan as the situation develops. From a resource-accounting point of view, it can also do a decent job with a less refined, faster-to-train model and with less training data.
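One way to see why more TOPS buys a bigger model rather than an earlier reaction is to put the budget in per-frame terms. This is back-of-the-envelope arithmetic only: the 144 and 2000 TOPS figures come from the thread, while the 36 fps frame rate and 50% utilization are my assumptions:

```python
def ops_per_frame(tops: float, fps: float, utilization: float = 0.5) -> float:
    """Usable NN operations available per camera frame.

    `utilization` is a guess at how much of the peak TOPS a real
    workload sustains."""
    return tops * 1e12 * utilization / fps

hw3 = ops_per_frame(144, 36)     # ~2e12 usable ops per frame
big = ops_per_frame(2000, 36)    # ~2.8e13 usable ops per frame

# The ~14x larger per-frame budget buys a larger/deeper model,
# not a faster reaction: the frame rate is unchanged.
print(f"{big / hw3:.1f}x")       # → 13.9x
```

The ratio is just 2000/144 regardless of the assumed frame rate or utilization; those assumptions only set the absolute per-frame budget.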
Obviously the scaling trade-offs have limits: they can't make it work by distilling all the computing power on Earth into a Game Boy in the car. But the AI engineers are telling us that these are soft limits, and that the field is rapidly evolving on the trade-off between performance optimization and resource constraints. Performance problems in the currently released software do not imply that they've reached the limit of what HW3 will be able to accomplish.