
Neural Networks

I have always wondered about this. I see the occasional video on Electrek, yet never see the driver's hands or the instrument cluster during the event, which makes me question who actually moved the vehicle: the car or the driver.

I've been close enough to kiss an 18-wheeler's tire on the side, yet got nothing but red ultrasonic warnings around the car, nothing audible, and no sign of the car taking control while using AP.

I can add anecdotal evidence to this. From a post I made back in May: What safety feature kicked in, car nudges itself out of the lane

I think the same thing happened to me back in February. I was driving on a small snow-packed local road going up to a ski resort here in Norway, and I had a milk truck/trailer rig in front of me. The truck slowed down on a straight stretch and signaled me to pass it.

The road was quite narrow, and with the 5-foot-high plowing edge on the left-hand side it was perhaps a tight squeeze as I came up beside the truck. But all of a sudden the alarm and auto steer kicked in and controlled the car over the entire length of the truck, making both left and right corrections; the whole thing lasted perhaps 3 seconds. I was really amazed.

This was on 2017.50 (which I'm still on), and I didn't have ACC or AP activated as I started passing the truck.
 
The rule is to avoid colliding with objects on the road. You might think that a piece of styrofoam is not a danger, but you can't tell that it's in fact an anvil with its packaging intact.

It's not as simple as just avoiding anything in the road. E.g., roadkill is not worth a sudden, dangerous swerve out of the lane, whereas a person (who is not a Tesla short) is worth running the car off the road entirely to avoid.
 
Karpathy tweeted this one:
http://josh-tobin.com/assets/pdf/troubleshooting-deep-neural-networks-01-19.pdf

The pedestrian classification example seems relevant to what Tesla has likely been spending the last year on. Hopefully by now they have greatly increased the performance of their neural net using the same method as presented here, or a similar one.

Traffic light detection is at 98%; they want to get to 99.9-and-some-more-nines percent (anyone heard how many?) by the end of the year. I assume this is not frame by frame, but based on a few seconds of video with the map as an additional input to the neural network. Given that this paper talks about how hard it is to lower error rates, I assume they have some serious work in front of them. By then many other subsystems will likely be working really well, and I assume the subjective experience should be vastly better than the current one.
 
"five nines" is a common term, which means 99.999%. Often used in IT world for uptime reliability, but also used elsewhere for various metrics. Obviously Tesla would never *stop* at five nines, but they want to target five nines as a minimum.
 
From the earnings call:


This has the potential to save millions of lives, tens of millions of serious public injuries and give people their time back, so that they don't have to drive, they can -- if you're on the road, you can spend time doing things that you enjoy instead of being in terrible traffic. So it's extremely important. We feel confident about our technical strategy, and I think we have an advantage that no one else has, which is, that we have, at this point, somewhere in the order of 300,000 vehicles on the road, with a 360-degree camera sensor suite, radar, ultrasonics, always connected uploads, especially video clips with the customer submission when there is intervention. So effectively, we have a massive, massive training fleet.

Our -- the amount of training that we have -- if you add everyone else up combined, they're probably 5%, I'm being generous, of the miles that Tesla has. And this difference is increasing. A year from now, we'll probably go -- certainly from 18 months from now, we'll probably have 1 million vehicles on the road with -- that are -- and every time the customers drive the car, they're training the systems to be better. I'm just not sure how anyone competes with that.

This reads to me that, in any case where EAP gets overridden, the data gets sent back to Tesla for potential labeling and integration in the NN training.
 

As we all know, Musk is a bit of an optimist in his assertions. I do agree that the potential is there for Tesla, since they have the fleet and the ability to collect a massive amount of data. That said, they seem to be at a very basic level in their data collection, limited to certain triggers and perhaps override situations. They would probably need to invest a considerable sum in staffing up their Maps/Data teams to do these things.

Here are a few ideas I think are possible:

1. Real time & GPS traffic data - This could be integrated into autopilot to supplement GPS based speed limits. If GPS tile says this road segment speed limit is 35MPH but all traffic is going 65MPH, maybe the limit should be upped to 65MPH based on the statistics. Another opportunity is supplementing GPS Maps data. Nav on AP thinks there's an exit here but there has not been any traffic for a month (because the exit is under construction so we can remove the route).

2. Distributed mapping of stop signs and stop lights: You could supplement the onboard ability to see stop lights/signs with data from other cars that have observed them at certain points. Humans do this... Sometimes we might miss a street sign if we are unfamiliar with the environment, but if we've driven this way before we know what to expect. One of the challenges today is that the system might be only 90% certain that it sees a stop light it should stop for. Perhaps the angle is unclear (is this for another traffic direction or for me?), or perhaps it's covered with snow... But if many other cars have reported it, and we see cars stopping/starting at this location, then the system can be more certain it needs to stop here (a rough sketch of this idea follows after the list).

3. 3D environment mapping: Taking the distributed mapping idea all the way, you could theoretically process the camera data to create fine-grained maps of the road environment. When Autopilot is not active, you could leverage the GPU/AP3 custom hardware to do this processing. Current cars may not have the storage, processing, or network bandwidth to upload useful amounts of this data, but it could be a feature deployed on future cars, perhaps a use case they planned for in HW3.
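
To make idea 2 a bit more concrete, here is a minimal sketch of how a single car's detector confidence could be combined with prior fleet observations at the same location. Everything here (the cell binning, the thresholds, the function names) is invented purely for illustration; it is not anything Tesla has described.

```python
# Hypothetical sketch of idea 2: fuse one car's detector confidence with
# prior fleet observations at the same location. Bins, thresholds, and
# names are invented for illustration.
from collections import defaultdict

# fleet_reports[cell] = [passes that reported a stop sign/light, total passes]
fleet_reports = defaultdict(lambda: [0, 0])

def fleet_prior(cell):
    """Fraction of fleet passes through this map cell that reported a sign/light."""
    seen, passes = fleet_reports[cell]
    return seen / passes if passes else 0.0

def should_stop(detector_confidence, cell,
                solo_threshold=0.95, assisted_threshold=0.6):
    """Stop if vision alone is confident enough, or if moderately confident
    vision agrees with what most of the fleet has reported here before."""
    if detector_confidence >= solo_threshold:
        return True
    return detector_confidence >= assisted_threshold and fleet_prior(cell) > 0.8

# Example: vision is only 90% sure, but 97 of 100 prior fleet passes saw a sign here.
cell = (4251, -7103)            # coarse lat/lon bin
fleet_reports[cell] = [97, 100]
print(should_stop(0.90, cell))  # -> True
```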
 
Yeah, geocoded augmentation was mentioned for stop signs.

Even just the override data is great for failure rates, especially if you make a heat map so you can focus on places (or times) where EAP fails often. Then a clip from 5 seconds before to 5 seconds after, with compressed radar data, allows the NN training to be improved while ignoring one-off disengagements due to user override force.
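
As a rough illustration of the heat-map part, binning disengagement events by coarse location and hour would already surface the spots worth labeling first. The field names and cell size below are made up.

```python
# Rough illustration of the heat-map idea: count EAP disengagement events per
# coarse (lat, lon, hour) bucket so frequent-failure spots stand out.
# Field names and the cell size are made up for illustration.
from collections import Counter

def disengagement_heatmap(events, cell_deg=0.01):
    counts = Counter()
    for e in events:
        cell = (round(e["lat"] / cell_deg), round(e["lon"] / cell_deg), e["hour"])
        counts[cell] += 1
    return counts

events = [
    {"lat": 37.4030, "lon": -122.1230, "hour": 8},
    {"lat": 37.4031, "lon": -122.1229, "hour": 8},
    {"lat": 40.7128, "lon": -74.0060, "hour": 17},
]
for cell, n in disengagement_heatmap(events).most_common(2):
    print(cell, n)   # hottest cells first
```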
 
Perfect example I ran into yesterday (and I'm sure there are millions of these...): I was going down a divided 2-lane highway that's briefly 55 mph and then drops down to 35 mph. Along both sides of the 55 mph stretches are smaller development entrances/side streets (no traffic lights), and coming out of one on my right was a school bus with its nice, clear-as-day stop sign folded in (the one that unfolds when kids are getting on/off), which had pulled a little further out than I'd expect, though still not blocking the lane.

If my Tesla had seen that stop sign using "vision only," we would've gone from 55 to 0 mph in a heartbeat, and I wouldn't fault a vision-only model, since it WAS a stop sign that LOOKED like it was for my lane.

This is why there has to be a mix of hard-coded rules + vision.
 

Was the stop sign affixed to the school bus?

The Vision-Only Model would perform well.

Page 7: https://cs.stanford.edu/people/karpathy/densecap.pdf
 
All driving policy can be implemented with relatively "simple" Software 1.0 code as long as you have the metadata interpreted. Apply location parameters to this driving-policy module, and it will know that a green arrow to the right means "you're good to go" within Norway.
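
Purely as an illustration of that claim (and not how any real autonomy stack is written), a Software 1.0 driving-policy rule with a location parameter could look as trivial as this; the country codes and behaviours are made-up examples, not actual traffic law.

```python
# Illustrative "Software 1.0" driving-policy rule parameterized by location,
# per the claim above. Country codes and behaviours are made-up examples,
# not actual traffic law.
def green_right_arrow_action(country_code: str) -> str:
    protected_turn_countries = {"NO", "DE", "US"}   # example set only
    if country_code in protected_turn_countries:
        return "proceed_with_right_turn"
    return "yield_then_turn"

print(green_right_arrow_action("NO"))  # -> proceed_with_right_turn
```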

The counter-argument to this is that engineers have been working seriously on Software 1.0 solutions to driving policy since 2003. Waymo since 2009. And today driving policy for autonomous vehicles still feels to me like it’s almost at the science project phase.

“But,” comes the reply, “the technology is rapidly improving!” Is it? Autonomous vehicle projects are secretive and opaque. We have no reliable metrics to measure progress, since most disengagements can go unreported, and since miles between disengagements doesn’t tell us anything about the difficulty of the miles.

Qualitative evidence is limited, but the few glimpses we have are not encouraging. One anecdote is that Waymo safety drivers took over at least once over the course of four rides in the Phoenix metro area. The minivans still seem to be flummoxed by basic tasks like unprotected left turns.

I don’t think there is any evidence of rapid improvement in Software 1.0 for autonomous driving, for robotics, or for anything. I also don’t think there is a theoretical reason to believe Software 1.0 engineering can lead to rapid improvement — the speed of progress is constrained by small teams of humans plugging away at their code and doing trial-and-error. That seems like an inherently slow process. Or at least one that hasn’t gotten much faster since 2003. So progress from 2019 onward should be about as slow as progress from 2003 to 2019.

Progress only accelerates if Software 2.0 is used.
 
AlphaStar looks like a compelling proof of concept for imitation learning, both in itself and as a way to bootstrap reinforcement learning:

DeepMind’s AlphaStar beats a top player in StarCraft II

I can imagine a similar approach being used for driving policy. Use imitation learning with 10 billion miles of state/action pairs from human Tesla drivers. Then do self-play in simulation.
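
For what it's worth, the imitation-learning half of that recipe is just supervised learning on logged state/action pairs. A toy behaviour-cloning sketch (in PyTorch, with invented state and action dimensions, and in no way Tesla's actual architecture) might look like this:

```python
# Toy behaviour-cloning sketch: regress the human driver's action from the
# recorded state. State/action dimensions and the network are invented for
# illustration only.
import torch
import torch.nn as nn

STATE_DIM = 64    # e.g. perception outputs + kinematics for one timestep
ACTION_DIM = 2    # e.g. steering angle, acceleration

policy = nn.Sequential(
    nn.Linear(STATE_DIM, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, ACTION_DIM),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

def train_step(states, human_actions):
    """One supervised step on a batch of logged state/action pairs."""
    optimizer.zero_grad()
    loss = loss_fn(policy(states), human_actions)
    loss.backward()
    optimizer.step()
    return loss.item()

# Fake batch standing in for 32 logged driving moments.
states = torch.randn(32, STATE_DIM)
actions = torch.randn(32, ACTION_DIM)
print(train_step(states, actions))
```

The reinforcement-learning stage would then fine-tune a policy like this against a simulator reward, roughly analogous to AlphaStar's league self-play.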
 
As a next step I'd be happy to have the cars pick up the speed limit signs! I've lost count of how many are still wrong around me, and I've logged bug reports on and off since I got the car 8 months ago, so that's not working either. It would be awesome if cars could validate the map against what they see and, after x number of consistent differences reported by the fleet, finally update the map! Or flag it for human review.
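
A minimal sketch of that cross-check, with made-up thresholds and identifiers: keep a history of fleet sign readings per map segment and only touch the map (or flag it for review) after enough consistent disagreement.

```python
# Sketch of the suggested fleet cross-check: compare sign readings against the
# map and only update (or flag for review) after enough consistent
# disagreements. Thresholds and identifiers are made up.
from collections import defaultdict

MAP_SPEED_MPH = {"segment_42": 35}     # current map value
readings = defaultdict(list)           # segment -> fleet sign readings
CONSISTENT_REPORTS_NEEDED = 20

def report_sign(segment, sign_mph):
    readings[segment].append(sign_mph)
    recent = readings[segment][-CONSISTENT_REPORTS_NEEDED:]
    if (len(recent) == CONSISTENT_REPORTS_NEEDED
            and all(r == recent[0] for r in recent)
            and recent[0] != MAP_SPEED_MPH.get(segment)):
        return f"flag_or_update:{segment}->{recent[0]}"
    return "no_change"

result = "no_change"
for _ in range(CONSISTENT_REPORTS_NEEDED):
    result = report_sign("segment_42", 45)
print(result)  # -> flag_or_update:segment_42->45
```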
 

Yes, ironically Tesla's "Software 2.0" has taken us back in time a decade and some, to before the EyeQ1 (released in 2008) could read speed limit signs...

And let’s not even get started with that ”Software 2.0” auto-wiper...
 
I don’t think there is any evidence of rapid improvement in Software 1.0 for autonomous driving, for robotics, or for anything. I also don’t think there is a theoretical reason to believe Software 1.0 engineering can lead to rapid improvement — the speed of progress is constrained by small teams of humans plugging away at their code and doing trial-and-error. That seems like an inherently slow process. Or at least one that hasn’t gotten much faster since 2003. So progress from 2019 onward should be about as slow as progress from 2003 to 2019.

Progress only accelerates if Software 2.0 is used.

The big question is: What is progress in this case?

Will developing rules-based software become rapidly faster? You are right, probably not in itself.

But will deployment of already-developed rules-based software become rapidly faster? Quite possibly, if the software is nearing completion. There is no reason to conclude that driving policy is necessarily one of those tasks where rules-based software would inherently fail, although I admit such domains exist, machine translation being one. In the driving policy case, it might simply be that it has taken a decade to develop the rules-based software (while supporting technologies like sensors and perception also matured) and now it is just about putting the finishing touches on it - and then deployment could well be relatively rapid.

There is no denying NNs are a very promising technology, and when applied to specific domains like games, perception, machine translation, and simulating human movement they have yielded impressive progress. Any self-driving system will also use them in some areas, perception at the very least, perhaps more. There are other areas and perspectives where they have stagnated. Whether they are better, or best, suited for a task such as global driving policy is an open question.
 
I wrote an essay about AlphaStar that is relevant to this thread: https://link.medium.com/JuS3HlnplU

With AlphaStar, DeepMind applied imitation learning and reinforcement learning to StarCraft. With imitation learning alone, AlphaStar was able to achieve roughly median human performance. With reinforcement learning on top of that, it was able to beat a world-class pro.

This makes me wonder what would happen if a company did the same for autonomous driving. In principle, any company could do this, but currently Tesla is the only company that could record state-action pairs from millions of drives (as far as I know). We also have one report from The Information that says Tesla is collecting state-action pairs from the production fleet and using them for imitation learning. A smaller hint is that Tesla has mentioned reinforcement learning in its job postings for Autopilot AI interns.

It’s worth thinking about whether driving will be harder to solve than StarCraft, and if so, why. This is a useful exercise for understanding the problem space better and speculating about what direction Tesla might go in.
 
Traffic light detection is at 98%; they want to get to 99.9-and-some-more-nines percent (anyone heard how many?) by the end of the year. I assume this is not frame by frame, but based on a few seconds of video with the map as an additional input to the neural network. Given that this paper talks about how hard it is to lower error rates, I assume they have some serious work in front of them. By then many other subsystems will likely be working really well, and I assume the subjective experience should be vastly better than the current one.

Suppose you used the TensorFlow model that Intel demoed early last year (90% accuracy): you would still have a 99.999997% chance of correctly detecting at least one light (on average) within a tenth of a second at 30 fps, assuming an average of 2.5 lights per direction; that is, 1 - ((0.1 ^ 2.5) ^ 3).

So unless Intel's algorithm is just way too slow for real-time use, chances are the 98% accuracy is *not* over several seconds.

If we assume instead that the 98% is per frame, per light, then with an average of 2.5 traffic lights per intersection you'd have about a 99.994% chance of detecting one correctly in any given frame, and a 99.999999999982% chance of detecting one within a tenth of a second at only 30 fps. At 60 fps, well, Google's calculator can't work with numbers that small, though the odds of failure are about one in almost 31 septillion. You are 10 billion times more likely to win a billion dollars or more in both PowerBall AND Mega Millions than for that to fail.

So a 98% detection rate is probably good enough. Not necessarily, but probably. :)
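
For reference, the arithmetic above can be reproduced in a few lines; the 2.5-lights and frame-count figures are the same assumptions used in the post.

```python
# Reproduces the arithmetic above: probability that every light is missed in
# every frame of a 0.1 s window, given a per-frame, per-light detection rate,
# an average of 2.5 lights, and 3 frames (30 fps) or 6 frames (60 fps).
def p_miss_window(per_light_rate, lights=2.5, frames=3):
    return ((1 - per_light_rate) ** lights) ** frames

print(1 - p_miss_window(0.90))            # ~0.99999997 chance of a detection (90%, 30 fps)
print(1 - p_miss_window(0.98))            # ~0.99999999999982 (98%, 30 fps)
print(1 / p_miss_window(0.98, frames=6))  # ~3.05e25, i.e. failure odds of ~1 in 31 septillion (60 fps)
```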
 