
March of 9s: The Beginning of the End

I have had enough experience now with FSD 12 to say that, in my opinion, Ashok is correct:


I would also stress we are at the beginning of the end. That is, we will hopefully now start to see significant improvement in the "known FSD issues" and "corner case disengagements" with every major FSD release. "The end" (level 3/4 autonomy within a wide Operational Design Domain (ODD)) is almost certainly a year+ away. I'd like to document, from one consistent anecdotal use case that can be repeated over time, where we are "starting from."

I will be driving a 90ish mile (2-3 hrs) "loop" under FSD to cover a range of driving scenarios. (Mileage is approximate):

For privacy reasons, there are about 15 miles in my loop that are not included in the link above that takes me to/from my actual home.

[Route map image]


Here is a link to inspect the route in detail.

This route takes me from NJ Suburbs into and out of Manhattan (NYC) and includes approximately:
  • 10 Miles Suburban driving
  • 65 Miles "limited access highway" driving (This should be using the FSD 'highway stack' which is not the same as the FSD 12 stack)
    • includes interchanges
    • Includes Tolls
    • Includes Tunnel
  • 8 miles of other "highway-type driving" (will probably fall under the FSD 12 stack)
  • 6 miles of dense city driving...including areas around Times Square, Rockefeller Center, etc., which will have dense vehicle and pedestrian traffic.
I will not be recording with a phone or anything like that. However, I will try to save dashcam footage of anything notable.

I will report on:
  1. Interventions (Accelerator presses, particularly if safety related)
  2. Disengagements (comfort or safety related)
  3. Overall impressions
As we know, Version 12.3.x does not support the following (but will need to in the future):
  1. Smart Summon or "Banish"...so what I call the "first 100 yards and final 100 yards" is not available to test (drop-offs / pick-ups).
  2. "Reversing" while on FSD is not yet supported.
Finally, there are what I would say are two well-documented "comfort / safety" issues with FSD 12.3.x that I have also experienced regularly first hand:
  1. "Lane Selection Wobble"...for example, approaching an intersection where the single driving lane splits into multiple lanes (turning vs. straight), FSD may act "indecisively."
  2. Unprotected turn (stop sign) behavior. Notably: it stops early...then creeps. If no cars are detected, it may creep into the intersection instead of "just going". Further, if it has crept into the intersection and THEN detects a car approaching, it may still hesitate and require intervention (accelerator press) to get it going.
In addition to those two consistent issues, I expect to encounter some issues related to routing, and any number of other 'corner case' issues. All things that will ultimately need to be handled, and that we expect to see dealt with as we progress through the "March of 9s"...toward the "end of the end".

Although I have driven FSD regularly over the past 3 weeks...I have yet to take it into NYC.

Vehicle: Refresh Model S (2023), Vision Only, HW4. First test will be using:
Firmware version: 11.1 (2024.3.15)
FSD Version: 12.3.4

So...there's the set-up. I expect later today to drive the first loop.
 
Completed the drive...from 4:45 PM to about 7:35 PM.
Refresher: 2023 Model S, HW4, Vision Only.
Firmware version: 11.1 (2024.3.15)
FSD Version: 12.3.4

Subjective Comments:

City Driving: Overall very impressed with how FSD 12 handled "the noise" of the city: lots of pedestrians, cars, bikes, and scooters in close proximity...handled smoothly without 'freaking out.' I have seen Version 11 "freak out" (spastic steering, abrupt braking, stutter acceleration) even in suburban / rural areas. I was never comfortable enough with V11 to even try it in the city. V12 handles obstacles (double-parked cars, cars halfway into intersections, pedestrians where they should and should not be) nearly exactly how I would have in the cases I came up against. The biggest area of improvement for me is related to planning...it should attempt to get into the "correct lane" for turning earlier.

Here's a summary of the interventions / disengagements...as they happened:

Suburbs: FSD 12 Stack
1) Accelerator press: One lane bridge where negotiation is required. The Tesla slowed a bit too much on the bridge while the car on the other side was waiting.
2) Accelerator press: Unprotected right turn onto highway.

Highway, FSD 12 stack. (This was a multi-lane divided highway, but there are traffic lights and it is not limited access. FSD 12 was driving)
1) Disengagement: Chose the incorrect lane for getting onto the interstate. It may have correctly navigated over eventually, but it was not the right move.

Interstate, (Unknown software stack, FSD11?)
1) Accelerator Press: car slowed too abruptly when "entering a toll booth." (This is an automated toll...but there are still actual booths to go through). After the slow down, accelerator press was needed to push it through.
2) Disengagement: (Navigation). This one's interesting: Navigation took me on the lower level of the George Washington Bridge. However, while on the lower level, navigation wanted to take me to a left fork to get to Harlem River Drive...however, the correct fork was to the right. It turns out that on the UPPER level, you stay to the left to get on Harlem River Drive...which is the opposite of the lower level. (Long story short, Navigation was directing me as if I was on the upper level.)

NYC Streets: (FSD 12)
1) Disengagement: Pulled over to the right to allow an emergency vehicle to pass on the left.
2) Disengagement: Did not move into correct turn lane in time
3) Disengagement: Did not appear to be getting ready to stop for a car that had crossed in front, but stopped before getting through the intersection. This was the only time the entire drive I did not feel "safe."
4) Disengagement: Car appeared to start attempting to go "around" a car in front of me that had put back-up lights on to parallel park. This may have been OK, but I would normally stop and just wait for the guy to park.

For the record, FSD12 handled the "traffic cone lanes" to get into the Lincoln Tunnel brilliantly.

Interstate, (Unknown software stack, FSD11?)
1) Accelerator Press: (Another toll booth similar to above)
2) Disengagement: Navigation correctly indicated the need to "crossover" from express to local lanes, but car did not appear to be following it.

So there we go: A baseline of this nearly 3 hour "suburb / interstate / city / Interstate / suburb" excursion.

I did save some dashcam clips...I'll see if I can possibly put some of them up to illustrate some of the situations above.

With my FSD subscription ending...the next time I try this loop will likely be whenever FSD V12.4 is made available.
 
Tesla is not gonna give you a breakdown of their FSDS testing analysis...you can be confident about that.
If they don't, that is because the numbers are not good. If they claim they are "close" to robotaxi but disengagement rates are 1 in 20 miles (vs. Waymo's 1 in 10,000), investors are going to laugh. That's why Tesla doesn't publish hard data now.

If and when Tesla starts getting really good disengagement rates, they WILL publish.
 
So there we go: A baseline of this nearly 3 hour "suburb / interstate / city / Interstate / suburb" excursion.
So, 11 interventions in 75 miles or about 1 in 7 miles. That sounds about right. That is what we have been seeing for a while now. BTW, if you haven't seen this ... here is a crowdsourced intervention rate.
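
A trivial way to redo this math as future drives get logged (Python; the 75-mile / 11-intervention figures are just the ones reported above, everything else is placeholder):

Code:
# Miles per intervention and share of intervention-free miles, per drive.
drives = [
    ("this loop, FSD 12.3.4", 75, 11),   # miles, interventions (from the post above)
    # add future drives here as new FSD versions ship
]

for label, miles, interventions in drives:
    print(f"{label}: ~1 intervention per {miles / interventions:.1f} mi "
          f"({1 - interventions / miles:.0%} of miles intervention-free)")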

 
What is unknown is how long it will actually take - a year+, 5 years+, 10 years+...? Nobody knows.

For example, how long will it take for Tesla to recognize/respond to school buses and school zones? Surely they have enough training material. It isn't an edge case, and it would be very easy to gather more training data if they wanted to. They know the deficiency - but have not tried to address it. That makes me think it isn't easy.

That last little bit is the killer. If it ends up taking years to improve the current version, then in about 2-3 more years, Tesla will announce their next "complete rewrite" that will blow the previous code out of the water.
 
So, 11 interventions in 75 miles or about 1 in 7 miles. That sounds about right. That is what we have been seeing for a while now. BTW, if you haven't seen this ... here is a crowdsourced intervention rate.

Yes, I have seen that. It's very helpful...although one issue I have with using it to gauge progress is that it doesn't seem to capture the "diversity" of disengagements well. For me, the "types" of disengagements (all kinds of random stuff) were much more diverse in V11 vs. V12. In other words, in my estimation, Tesla has fewer "things" to work on improving...and this should mean that the absolute number of issues should decrease significantly with each type of intervention that gets corrected.

Time will tell.
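
To put a rough number on the "diversity" point, here's a toy sketch (the category names and counts are completely made up, just to show why fewer failure types matters):

Code:
# Same total number of disengagements, different spread across failure types.
v11_like = {"phantom braking": 3, "wrong lane": 2, "stop signs": 2,
            "merges": 2, "tolls": 1, "construction": 1}      # many small buckets
v12_like = {"lane selection": 6, "stop sign creep": 3, "tolls": 2}  # few big buckets

for name, buckets in (("V11-like", v11_like), ("V12-like", v12_like)):
    total = sum(buckets.values())
    biggest = max(buckets, key=buckets.get)
    print(f"{name}: fixing '{biggest}' alone removes "
          f"{buckets[biggest] / total:.0%} of disengagements")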
 
Yes, I have seen that. It's very helpful...although one issue I have with using it to gauge progress is that it doesn't seem to capture the "diversity" of disengagements well. For me, the "types" of disengagements (all kinds of random stuff) were much more diverse in V11 vs. V12. In other words, in my estimation, Tesla has fewer "things" to work on improving...and this should mean that the absolute number of issues should decrease significantly with each type of intervention that gets corrected.

Time will tell.
One contrary thing to remember is - as FSD figures out how to handle one thing, we figure out the next thing.

For example, earlier I'd just disengage in construction areas. Now that V12 handles them somewhat OK, I let it go - but I usually end up disengaging because of some particular issue. So, FSD has gotten better, but the disengagement rate remains about the same.

But in my case, most of my roundabout disengagements have vanished. So, my actual disengagement rate has gotten way better. Now they are mostly lane-selection related.
 
Completed the drive...from 4:45 PM to about 7:35 PM.
If you're up for the work, post the map of your route with dots where you disengage (red dots), or intervene (yellow dots). Keep the map cumulative, but allow the dots to fade out if the disengagement or intervention doesn't happen again.

Yes, that's asking for a lot, but I think people would be more interested in a graphical depiction than a textual one.

It would be awesome if Tesla could let us access our own intervention and disengagement data to make plotting - and aggregation - straightforward. Ashok, are you listening?
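
In case anyone wants to try it, here's a rough sketch of the fading-dot idea - the coordinates, the fade rule, and matplotlib as the plotting library are all just my own assumptions, not anything from Tesla's data:

Code:
import matplotlib.pyplot as plt

# (lat, lon, kind, drives_since_last_seen) -- invented values for illustration
events = [
    (40.756, -73.986, "disengagement", 0),  # recurred on the latest drive
    (40.762, -73.979, "intervention",  1),  # last seen one drive ago
    (40.771, -73.974, "disengagement", 3),  # hasn't recurred: fading out
]

colors = {"disengagement": "red", "intervention": "yellow"}

for lat, lon, kind, age in events:
    alpha = max(0.15, 1.0 - 0.25 * age)     # older dots fade but stay visible
    plt.scatter(lon, lat, c=colors[kind], alpha=alpha, s=80, edgecolors="black")

plt.xlabel("longitude")
plt.ylabel("latitude")
plt.title("Cumulative disengagements (red) / interventions (yellow)")
plt.show()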
 
I for one have noticed a huge improvement in autonomous driving. I use navigation, then put the car in autodrive, and it handles pretty much everything fairly well. Yes, it does slow down to stop at stop signs. I usually glide through the stop sign slowly. It stops way behind the stop sign and slowly edges out. If the intersection is blind on both sides (cars parked to the corners), then it inches out. In this case I add a little acceleration if the coast is clear. I am much more confident in autonomous driving than I was a few years ago. Parking works great too. I like that you can pinpoint the parking spot and then engage autopark.

The incentive to move Full Self-Driving to the MY, plus the tax credit and pricing, caused me to jump on buying the MY this year. I'm happy with the purchase.

Two Teslas, 2018 M3 and 2024 MY.
 
1) I'm not seeing the step change people are talking about.

2) ADAS won't become ADS without major re-architecting and new approaches. Drago commented on this the other day.

3) The march of nines begins when you have hundreds of drives in a row without interventions. If Ashok doesn't understand that, then he's an overpaid clown. Or perhaps it's simply that he has another goal for his bonus-check than autonomy.
I don't really agree. Wish I did, because it's such a precise value. I believe the real march of 9's begins when the NHTSA database verifies that FSD, with or without a human involved, is safer than human-only driving. Translated, that would be: how can anyone be opposed to something that will reduce the ~40,000 people killed annually? Over the weeks and years it will surely improve, and the number of lives saved will grow and be a real measure of the march of 9's. For me, the huge improvement with V12.3 felt like it crossed the threshold, and V12.3.4 became the first of the extra 9's.
 
I don't really agree. Wish I did, because it's such a precise value. I believe the real march of 9's begins when the NHTSA database verifies that FSD, with or without a human involved, is safer than human-only driving. Translated, that would be: how can anyone be opposed to something that will reduce the ~40,000 people killed annually? Over the weeks and years it will surely improve, and the number of lives saved will grow and be a real measure of the march of 9's. For me, the huge improvement with V12.3 felt like it crossed the threshold, and V12.3.4 became the first of the extra 9's.
I don't mean 100 drives literally...

The fact of the matter is that Tesla's system fails so often right now that it's obviously not ready to even be considered as a driverless system.

When people stop getting excited about a zero intervention drive, but rather get surprised when the system fails, then we're looking at marching towards nines.

You need to have 99% safe drives without interventions before 99.9% and 99.9% before 99.99% and so on. Tesla is not yet at 99% and each nine is an order of magnitude improvement.

A driverless system probably needs 99.99999% or something like that to be as safe as a human. Once every 100 miles is not anywhere close.
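
To make "each nine is an order of magnitude" concrete, here's a back-of-the-envelope sketch - the 10-mile average drive length is just an assumption I picked for illustration:

Code:
# Convert "nines" of intervention-free drives into rough miles between failures.
avg_drive_miles = 10   # assumed average drive length

for nines in range(1, 8):
    clean_drive_rate = 1 - 10 ** -nines          # 90%, 99%, 99.9%, ...
    failures_per_drive = 10 ** -nines
    miles_between_failures = avg_drive_miles / failures_per_drive
    print(f"{100 * clean_drive_rate:.5f}% clean drives "
          f"~ 1 failure per {miles_between_failures:,.0f} miles")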
 
The somewhat counter-intuitive fact is that we would probably see more accidents with FSDS as its MTBF increases, up until the point where the system on its own is safer than a human.
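
Here's a toy version of that argument (every number and the attention-decay curve are assumptions I made up purely to show the shape): an accident needs the system to fail AND the supervising driver to miss it, and if attention degrades faster than the failure rate improves, the combined rate gets worse before it gets better.

Code:
# Toy supervised-autonomy model: accident rate = system failure rate x P(driver misses).
HUMAN_RATE = 1 / 500_000     # assumed human accident rate, per mile

for mtbf_miles in (10, 1_000, 10_000, 100_000, 1_000_000, 10_000_000):
    failure_rate = 1 / mtbf_miles
    # Made-up complacency curve: the driver misses more failures as the system
    # gets more reliable, saturating at "misses everything".
    p_miss = min(1.0, (mtbf_miles / 100_000) ** 1.5)
    accident_rate = failure_rate * p_miss
    verdict = "worse than human" if accident_rate > HUMAN_RATE else "better than human"
    print(f"MTBF {mtbf_miles:>10,} mi -> ~1 accident per {1 / accident_rate:,.0f} mi ({verdict})")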



 
Here is a recap of the interventions / disengagements, along with maps where they occurred. Orange = "Intervention" (either click my turn signal to change a lane, or accelerator press). Red = "Disengagement", either steering wheel turn or brake press.

I may post some video clips (from dashcam) so you can see "what happened" for some of the disengagements.

[Map of the full route with intervention (orange) and disengagement (red) locations]


Blow-up of disengagements 7-10:

[Blow-up map of disengagements 7-10]



Suburbs: FSD 12 Stack
1) Accelerator press: One lane bridge where negotiation is required. The Tesla slowed a bit too much on the bridge while the car on the other side was waiting.
2) Accelerator press: Unprotected right turn onto highway.

Highway, FSD 12 stack. (This was a multi-lane divided highway, but there are traffic lights and it is not limited access. FSD 12 was driving)
3) Blinker lane change: FSD chose the incorrect lane for getting onto the interstate. It may have correctly navigated over eventually, but I forced a lane change with a manual blinker.

Interstate, (Unknown software stack, FSD11?)
4) Accelerator Press: car slowed too abruptly when "entering a toll booth." (This is an automated toll...but there are still actual booths to go through). After the slow down, accelerator press was needed to push it through.
5) Disengagement: (Navigation). This one's interesting: Navigation took me on the lower level of the George Washington Bridge. However, while on the lower level, navigation wanted to take me to a left fork to get to Harlem River Drive...however, the correct fork was to the right. It turns out that on the UPPER level, you stay to the left to get on Harlem River Drive...which is the opposite of the lower level. (Long story short, Navigation was directing me as if I was on the upper level.)

NYC Streets: (FSD 12)
6) Disengagement: Pulled over to the right to allow an emergency vehicle to pass on the left.
7) Disengagement: Did not move into correct turn lane in time
8) Disengagement: Did not appear to be getting ready to stop for a car that had crossed in front, but stopped before getting through the intersection. This was the only time the entire drive I did not feel "safe."
9) Disengagement: Car appeared to start attempting to go "around" a car in front of me that had put back-up lights on to parallel park. This may have been OK, but I would normally stop and just wait for the guy to park.
10) Disengagement: Car did not get into the left lane for the turn.

For the record, FSD12 handled the "traffic cone lanes" to get into the Lincoln Tunnel brilliantly.

Interstate, (Unknown software stack, FSD11?)
11) Accelerator Press: (Another toll booth similar to above)
12) Disengagement: Navigation correctly indicated the need to "crossover" from express to local lanes, but car did not appear to be following it.

NOTE: Disengagement 8: The planned route is in blue. Because the turn onto E46th was missed (Disengagement 7), the actual path taken by the vehicle was to continue down 5th and make the left onto E44th, then make the left onto Madison to get back on track. This is why Disengagement 8 occurred at the intersection of Madison and E45th...which is not on the planned route.
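
Since the plan is to repeat this loop on future versions, here's a minimal sketch of how I might keep these events in a structured form so the per-category counts (and a map like the one above) can be regenerated - the field names and file name are just what I'd pick, nothing official:

Code:
import csv
from collections import Counter

# One row per event; the two rows below are examples taken from this drive.
FIELDS = ["drive", "fsd_version", "stack", "kind", "lat", "lon", "note"]
events = [
    {"drive": 1, "fsd_version": "12.3.4", "stack": "FSD12", "kind": "intervention",
     "lat": None, "lon": None, "note": "one-lane bridge, slowed too much"},
    {"drive": 1, "fsd_version": "12.3.4", "stack": "FSD11?", "kind": "intervention",
     "lat": None, "lon": None, "note": "toll booth, abrupt slowdown"},
]

with open("fsd_loop_events.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(events)

# Quick per-stack / per-kind tally for comparing releases.
print(Counter((e["stack"], e["kind"]) for e in events))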
 
The somewhat counter-intuitive fact is that we would probably see more accidents with FSDS as its MTBF increases, up until the point where the system on its own is safer than a human.



This is why I thought it was absolutely nuts that Tesla themselves promoted on X someone using FSD to drive them to a hospital during a medical emergency instead of calling 911, when they were in an area where ambulances could reach them quickly. And now there's promotion of using FSD for old people who have poor reaction times. No way can dying people or old people with compromised skills watch over FSD when it makes some of its sudden mistakes, like turning left right into oncoming traffic.
 
Here is a recap of the interventions / disengagements, along with maps where they occurred. Orange = "Intervention" (either click my turn signal to change a lane, or accelerator press). Red = "Disengagement", either steering wheel turn or brake press.
[...]
Very detailed write up, nice. I hope Tesla is monitoring this thread.
 
I don't mean 100 drives literally...

The fact of the matter is that Tesla's system fails so often right now that it's obviously not ready to even be considered as a driverless system.

When people stop getting excited about a zero intervention drive, but rather get surprised when the system fails, then we're looking at marching towards nines.

You need to have 99% safe drives without interventions before 99.9% and 99.9% before 99.99% and so on. Tesla is not yet at 99% and each nine is an order of magnitude improvement.

A driverless system probably needs 99.99999% or something like that to be as safe as a human. Once every 100 miles is not anywhere close.

I'm not sure we have even reached the 90% mark yet (the first nine).
 
If they don't, that is because the numbers are not good. If they claim they are "close" to robotaxi but disengagement rates are 1 in 20 miles (vs. Waymo's 1 in 10,000), investors are going to laugh. That's why Tesla doesn't publish hard data now.

If and when Tesla starts getting really good disengagement rates, they WILL publish.
I think it's more like they just don't want analysis by the general public. Too many completely unqualified idiots who would question things that just aren't real issues. This is not as simple as, say, the incredible mistake of not providing a washer for the CT's rear camera. That was just dumb.
 
I think it's more like they just don't want analysis by the general public. Too many completely unqualified idiots who would question things that just aren't real issues. This is not as simple as, say, the incredible mistake of not providing a washer for the CT's rear camera. That was just dumb.
There will always be dumb people - you can't use that excuse to not publish hard data when you claim you are going "balls to the wall". If you want people (and investors) to take your claim that you are getting closer to achieving robotaxi-level FSD seriously - publish data. Otherwise it's just BS.

Whatever "benefit of doubt" Elon may have got a few years back is now exhausted. His good will is fairly negative on this point now.
 