It's harder than traditional engineering in many ways: there's a lack of explainability, and it's super hard to "patch" individual issues and to validate the impact of a change in the training set
Who needs an explanation of what certain parts of the neural network contribute to driving decisions? Typically that's needed when debugging code to make specific changes, but if you can fix individual issues with more end-to-end training on examples of the desired behavior, is the human explainability piece necessary, as opposed to just evaluating whether the behavior changed without regressing others?

Perhaps the neural network structure of current end-to-end will be insufficient to learn enough corner cases; that might require increasing the size of the network and/or changing the architecture, and these could exceed the compute capabilities of HW3 for achieving some safety level. So Tesla might need a hybrid solution to get it working on existing vehicles, but end-to-end could still be the correct approach with newer hardware.
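
To make that "evaluate behavior without explainability" idea concrete, here is a toy Python sketch of validating a change purely behaviorally, with no weight-level inspection. Every name in it (the model interface, the scenario suite, the scoring) is a hypothetical illustration, not anything Tesla has described.

```python
# Toy sketch: validate a retrained end-to-end model behaviorally,
# without needing to explain what individual weights mean.
# All names (model.run, scenario.score, the suite) are hypothetical.

def drive_scenario(model, scenario):
    """Replay one logged scenario through the model and score the outcome."""
    controls = model.run(scenario.sensor_log)  # model's driving outputs
    return scenario.score(controls)            # e.g. 1.0 = desired behavior

def validate_change(old_model, new_model, scenario_suite, tol=1e-3):
    fixed, regressed = [], []
    for sc in scenario_suite:
        before = drive_scenario(old_model, sc)
        after = drive_scenario(new_model, sc)
        if after > before + tol:
            fixed.append(sc.name)
        elif after < before - tol:
            regressed.append(sc.name)
    # Ship only if the targeted issues improved and nothing else got worse.
    return fixed, regressed
```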
 
Who needs an explanation of what certain parts of the neural network contribute to driving decisions? Typically that's needed when debugging code to make specific changes, but if you can fix individual issues with more end-to-end training on examples of the desired behavior, is the human explainability piece necessary, as opposed to just evaluating whether the behavior changed without regressing others?
When I say "explainability" and "validation" I don't mean for individual driving decisions. How do you propose to validate a change in a Level 5 system that is supposed to work in all types of situations? Do we agree that ML is not magic and that there is currently no way of doing this?

You can build a system that makes a "best effort" at driving everywhere; that would be a wide-ODD L2. But in order to make an L5, you need to guarantee it can drive safely (and efficiently) everywhere.

You can also provide higher guarantees for some situations by validating them, but that's not general self driving.
 
At this point, it's the only reasonable way to solve generalized autonomy. In hindsight, human heuristics could never capture all the nuances across all the different locales. You'd spend all day tweaking some heuristics while messing up others.
Hi, Powertoold --

At this point, it's the only reasonable way to solve generalized autonomy.
Musk has said that autonomy requires AGI. If you believe that -- and I have no firm opinions -- then if end-to-end works, the implication is that neural nets can deliver AGI. That would be a -- to use a Muskism -- profound result.

Yours,
RP
 
You can also provide higher guarantees for some situations by validating them, but that's not general self driving
The quality of higher driving automation, such as safety or comfort, does not change the design intent of the system, so a robotaxi could be deployed yet fail to reach passengers in certain situations. Not picking up passengers is probably not a good business model, so the business can use its own metrics to decide how to roll out.

Tesla's approach seems to focus on building highly capable driver assistance to find these potential failure situations and incrementally resolve them with end-to-end training, while also evaluating how likely and how severe each of these situations is. If a situation is so rare that it isn't encountered even with billions of miles of real-world experience, end-to-end could potentially handle it just fine thanks to general learning, or the vehicle could fall back to a minimal risk condition, which probably isn't great for a passenger trying to get to a destination but is perhaps understandable if the situation is truly rare. I suppose the true failure case you're probably getting at is the robotaxi misunderstanding the situation and doing something wrong, but even then it's unclear whether the actions would actually be unsafe or just awkward.
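
As a rough illustration of that fallback idea, here is a minimal sketch assuming the planner exposes some confidence estimate; the signal, threshold, and method names are all invented for the example, not a real interface.

```python
# Minimal sketch of falling back to a "minimal risk condition" (MRC)
# when the planner is out of its depth. The confidence signal, the
# threshold, and the method names are illustrative assumptions.

CONFIDENCE_FLOOR = 0.85  # hypothetical threshold, tuned during validation

def plan_step(planner, scene):
    plan = planner.propose(scene)  # nominal trajectory plus confidence
    if plan.confidence < CONFIDENCE_FLOOR:
        # Rare/unfamiliar situation: stop or pull over safely instead
        # of guessing; inconvenient for the passenger, but bounded risk.
        return planner.minimal_risk_maneuver(scene)
    return plan
```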

The other side is probably regulators deciding whether to push back on some robotaxi deployment. Presumably they too will be gathering data and looking at metrics, potentially building out their own evaluation test suite, but even then it will practically be limited in scope to ensuring some base amount of safety. Just as a driving test doesn't require a new driver to prove they can make an unprotected turn in a blizzard on a hill in high traffic, regulators will probably focus on common cases and known problematic situations. Society's response to general self driving can also influence regulators, but perhaps people will appreciate the new opportunities and general safety improvements of robotaxis and be understanding of the hopefully rare failures.
 
Tesla's approach seems to focus on building highly capable driver assistance to find these potential failure situations and incrementally resolve them with end-to-end training, and at the same time they can evaluate how likely and how severe each of these situations are.
Tesla's approach has been "trying to learn to drive using data" since 2016. End-to-end is the latest buzzword. In two years there will be something else.

Back prop isn’t new and I think the first papers on e2e in self driving are from around 2017.

At some point people will understand that you don't get "Moore's Law" for AVs. Even Elon has been shifting his tone from exponential progress to logarithmic.
 
Sure, it's possible we'll never get there (4-10x safety) with end-to-end, but we sure as heck won't get there with heuristics.

This is a strawman. Nobody is arguing that heuristics alone will get to 4-10x safety. Everybody uses ML and NN. And yes, NN are needed to get to 4-10x safety, but you can do NN without doing end-to-end. End-to-end is just one way to organize the NN. So if end-to-end does not achieve 4-10x safety, it does not mean we are doomed to never solve FSD because the alternative, heuristics, can't do it; it just means we try a different way to organize the NN, or we discover new ML.

In fact, maybe the key to getting to 4-10x safety is more redundancy: have 2 stacks, an end-to-end stack and a modular NN stack, running in parallel so that if one makes a mistake, the other can catch it, making the overall system more reliable and thus safer?
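
As a rough sketch of what that redundancy could look like (the interfaces, the disagreement metric, and the threshold are entirely made up, not a real design):

```python
# Hedged sketch of the two-stacks-in-parallel idea: cross-check an
# end-to-end stack against a modular NN stack and degrade gracefully
# on disagreement. All interfaces here are hypothetical.

MAX_DIVERGENCE = 0.5  # illustrative disagreement budget

def arbitrate(e2e_stack, modular_stack, sensors):
    cmd_a = e2e_stack.control(sensors)      # steering/accel from stack A
    cmd_b = modular_stack.control(sensors)  # steering/accel from stack B
    divergence = (abs(cmd_a.steering - cmd_b.steering)
                  + abs(cmd_a.accel - cmd_b.accel))
    if divergence > MAX_DIVERGENCE:
        # The stacks disagree, so at least one is likely wrong; fall back
        # to a conservative maneuver rather than trusting either blindly.
        return modular_stack.minimal_risk_command(sensors)
    return cmd_a  # agreement: either output (or a blend) is acceptable
```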
 
This is a strawman. Nobody is arguing that heuristics alone will get to 4-10x safety. Everybody uses ML and NN. And yes, NN are needed to get to 4-10x safety, but you can do NN without doing end-to-end. End-to-end is just one way to organize the NN. So if end-to-end does not achieve 4-10x safety, it does not mean we are doomed to never solve FSD because the alternative, heuristics, can't do it; it just means we try a different way to organize the NN, or we discover new ML.

All I'm saying is that general autonomy isn't going to be achieved with humans manually tweaking all sorts of little details in code or maps, etc.

What proportion of heuristics vs. NNs that is, is debatable.
 
An end-to-end neural network ensemble can work, although unlikely with current hardware. A neural network ensemble has explainability at the end of each individual neural network in the ensemble. I do believe an end-to-end ensemble is a good long-term approach, but it won't be full FSD in the next few years.
 
All I'm saying is that general autonomy isn't going to be achieved with humans manually tweaking all sorts of little details in code or maps, etc.

Sure. But nobody is trying to achieve general autonomy that way. So I am not sure why you keep bringing it up.

Again, the choice is not either E2E or "humans manually tweaking all sorts of little details in code or maps". That is a false choice.


What proportion of heuristics vs. NNs that is, is debatable.

Yes, that is the debate.
 
At this point, why do we feel in any way compelled to listen to what he says at all? We can have endless debates about whether he's being 'optimistic,' 'aspirational,' deceitful, lying, or just plain clueless. Regardless, the only statements he makes that have any value at all are the 'rolling out right now' statements. Beyond that, the best you can say is that they are a general indication of what might happen at some point in the future.
You answered your own question (in bold).
 
Specifically planning & control - can NN do it alone?

Probably, but it might just be a lot harder than we realize. Specifically, it might require more data and more training than we initially expected, because we underestimated the edge cases and nuances of driving in different places. As you know, driving habits can vary from place to place. There can also be unwritten driving "rules" in some places. So it might take a lot more data to properly train on all the differences from one geolocation to another.
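
As a toy illustration of that data-balance problem: if training samples clips in proportion to raw counts, sparse locales with different unwritten rules barely influence the model. The region names, counts, and weighting scheme below are invented for the example.

```python
# Toy illustration: weight training clips so under-represented regions
# still get learned. Counts and regions are made up.
import random

clips_by_region = {"us_west": 9_000_000, "uk": 600_000, "india": 150_000}

def region_weights(counts, alpha=0.5):
    """Sample weight ~ count**alpha instead of raw count, boosting the
    relative share of sparse locales during training."""
    w = {r: c ** alpha for r, c in counts.items()}
    total = sum(w.values())
    return {r: v / total for r, v in w.items()}

weights = region_weights(clips_by_region)
region = random.choices(list(weights), weights=list(weights.values()))[0]
print(f"draw next training clip from: {region}")
```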
 
Probably, but it might just be a lot harder than we realize. Specifically, it might require more data and more training than we initially expected, because we underestimated the edge cases and nuances of driving in different places. As you know, driving habits can vary from place to place. There can also be unwritten driving "rules" in some places. So it might take a lot more data to properly train on all the differences from one geolocation to another.
But does the NN have to follow all the unwritten local rules & customs? After all, we have all driven in different states ... and people from a totally different country drive here almost immediately after landing.
 
And end-to-end has its own challenges too. You need to get over a billion parameters just right, and every time you add more data, the training might tweak one parameter the right way but another parameter the wrong way. So fixing an issue without causing a regression somewhere else is a challenge with end-to-end.
I think it's important to distinguish between "end-to-end NN" and "unified NN", since these seem to have been conflated in some of the discussions.

I can have an "end-to-end" NN driven car that (in very simple terms) has a back-end NN that creates a world-view from the cameras etc and then feeds that output as the input for a front-end NN that generates the appropriate driving outputs. This is different from a "unified NN" approach that feeds camera etc input in at one end of a single (big) NN and directly generates driving controls at the output, with nothing in-between.

My take is that Tesla are doing the former (distinct back-end and front-end NNs), since (a) they already HAVE the back-end NN in FSDb, and (b) you can SEE that they still have it, since the car can still show the NN world-view in the few demo videos we have seen (this could not be generated by a unified NN). There is also the very practical issue that training a unified (huge) NN is MASSIVELY expensive, even by NN standards. Segregating the NNs into a stack allows each to be validated and trained separately, which vastly reduces the training time, since a change to one NN does not force a re-training of the entire stack.

(I'm greatly simplifying here of course, since in fact even the existing back-end NN has many distinct tasks.)
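
For what it's worth, the distinction might look something like this in PyTorch-style pseudocode; the module split and names are illustrative only, not Tesla's actual architecture.

```python
# Sketch of "modular/stacked" vs. "unified" NN drivers.
# Module shapes and names are hypothetical.
import torch.nn as nn

class ModularDriver(nn.Module):
    """Back-end NN builds a world-view; front-end NN drives from it."""
    def __init__(self, perception: nn.Module, planner: nn.Module):
        super().__init__()
        self.perception = perception
        self.planner = planner

    def forward(self, camera_frames):
        world_view = self.perception(camera_frames)  # inspectable, renderable
        controls = self.planner(world_view)
        return controls, world_view  # the world-view can be visualized

class UnifiedDriver(nn.Module):
    """One monolithic NN: cameras in, controls out, nothing in between."""
    def __init__(self, net: nn.Module):
        super().__init__()
        self.net = net

    def forward(self, camera_frames):
        return self.net(camera_frames)  # no intermediate world-view to show
```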
 
But does the NN have to follow all the unwritten local rules & customs? After all, we have all driven in different states ... and people from a totally different country drive here almost immediately after landing.

Well, it depends on what you think the standard should be for AV deployment. Do AVs just have to drive statistically safer than humans, or do they also need good roadmanship? Some argue that good roadmanship should be a requirement because it is important that AVs not be jerks on the road. If the AV does not follow the unwritten local rules and customs, it could cause frustration or even road rage in other human drivers. It could even cause accidents if human drivers assume the AV will behave a certain way and it doesn't. For example, maybe humans expect vehicles to yield in a certain situation but it is an unwritten rule. If the AV does not yield because it does not know about that unwritten rule, it could cause a collision if both vehicles try to go at the same time.
 
Back prop isn’t new and I think the first papers on e2e in self driving are from around 2017
You could say the same thing about transformers from 2017, or general attention mechanisms from years earlier, or, more broadly, modern back-propagation from decades earlier, but it wasn't until last year that people saw practical use of these with ChatGPT, based on OpenAI's engineering efforts to train at scale and to fine-tune toward human preferences. Sure, Tesla is applying "old" techniques to "old" problems, but you seem to have already decided that Tesla's engineering efforts won't get end-to-end control working for FSD.
 
I can have an "end-to-end" NN driven car that (in very simple terms) has a back-end NN that creates a world-view from the cameras etc and then feeds that output as the input for a front-end NN that generates the appropriate driving outputs. This is different from a "unified NN" approach that feeds camera etc input in at one end of a single (big) NN and directly generates driving controls at the output, with nothing in-between.

Thanks. What you call "unified NN" is what I call "end-to-end". I define "end-to-end" as a specific NN architecture where there is just one single NN, video in and control out. The literature I have read describes that single NN, video in and control out, as "end-to-end". What you call "end-to-end" is what I call "modular NN", where it is all nets from start to finish but split into separate NNs, like one NN for perception and one for planning/control.
 
If the AV does not yield because it does not know about that unwritten rule, it could cause a collision if both vehicles try to go at the same time.
IMO we exaggerate the differences, so no, I don't think the NN needs to be trained for each city separately. As I said, we all drive in different cities and states, and millions take road trips every year without problems. Of course, good maps are a prerequisite ...

I wonder whether Tesla will put all the eggs in one e-to-e basket or develop the current track and V12 in parallel.