
March of 9s: The Beginning of the End

I have had enough experience now with FSD 12 to say that, in my opinion, Ashok is correct:


I would also stress we are at the beginning of the end. That is, we will hopefully now start to see significant improvement in the "known FSD issues" and "corner case disengagements" with every major FSD release. "The end" (Level 3/4 autonomy within a wide Operational Design Domain (ODD)) is almost certainly a year or more away. I'd like to document, from one consistent anecdotal use case that can be repeated over time, where we are "starting from."

I will be driving an approximately 90-mile (2-3 hour) "loop" under FSD to cover a range of driving scenarios (all mileages are approximate).

For privacy reasons, about 15 miles of the loop (the portion to/from my actual home) are not included in the route linked below.

[Image: map of the drive loop]


Here is a link to inspect the route in detail

This route takes me from the NJ suburbs into and out of Manhattan (NYC) and includes approximately:
  • 10 miles of suburban driving
  • 65 miles of "limited access highway" driving (this should use the FSD 'highway stack', which is not the same as the FSD 12 stack)
    • Includes interchanges
    • Includes tolls
    • Includes a tunnel
  • 8 miles of other "highway-type" driving (which will probably fall under the FSD 12 stack)
  • 6 miles of dense city driving, including areas around Times Square, Rockefeller Center, etc., which will have dense vehicle and pedestrian traffic.
I will not be recording with a phone or anything like that. However, I will try to save dashcam footage of anything notable.

I will report on the following (one way to log these is sketched after the list):
  1. Interventions (Accelerator presses, particularly if safety related)
  2. Disengagements (comfort or safety related)
  3. Overall impressions
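To keep results comparable from revision to revision, a per-event record along these lines could work (a hypothetical sketch; the structure and field names are my own, not anything from Tesla):

```python
# Hypothetical record for logging each intervention/disengagement on the loop.
from dataclasses import dataclass

@dataclass
class LoopEvent:
    fsd_version: str      # e.g. "12.3.4"
    loop_mile: float      # approximate mile marker within the ~90-mile loop
    kind: str             # "intervention" or "disengagement"
    safety_related: bool  # True if the takeover was for safety, not comfort
    notes: str            # free-form description of what happened

# Example: a comfort-related disengagement logged mid-loop.
event = LoopEvent("12.3.4", 72.5, "disengagement", False,
                  "lane selection wobble approaching a lane split")
print(event)
```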
As we know, Version 12.3.x does not support the following (but will need to in the future):
  1. Smart Summon or "Banish"...so what I call the "first 100 yards and final 100 yards" (drop-offs / pick-ups) is not available to test.
  2. "Reversing" while on FSD is not yet supported.
Finally, there are what I would call two well-documented "comfort / safety" issues with FSD 12.3.x that I have also experienced regularly first hand:
  1. "Lane selection wobble"...for example, approaching an intersection where a single driving lane splits into multiple lanes (turning vs. straight), FSD may act "indecisively," wavering between lanes.
  2. Unprotected turn (stop sign) behavior. Notably: it stops early...then creeps. If no cars are detected, it may creep into the intersection instead of "just going." Further, if it has crept into the intersection and THEN detects a car approaching, it may still hesitate and require an intervention (accelerator press) to get it going.
In addition to those two consistent issues, I expect to encounter some issues related to routing and any number of other "corner case" issues. All things that will ultimately need to be handled, and that we expect to see dealt with as we progress through the "March of 9s"...toward the "end of the end."

Although I have driven FSD regularly over the past 3 weeks...I have yet to take it into NYC.

Vehicle: Refresh Model S (2023), Vision Only, HW4. First test will be using:
Firmware version: 11.1 (2024.3.15)
FSD Version: 12.3.4

So...there's the set-up. I expect to drive the first loop later today.
 
Dude, you had four disengagements within four blocks.
Does not change what I said. If you have not used it, then of course you "don't see" what everyone is talking about. So please, if you have questions about the ride, disengagements, etc, I'm happy to elaborate.

If you're just here to repeat ad nauseam how "Tesla can't do it because XYZ" and you haven't even USED it, then continue in the several other threads where you are doing that already.

Try to keep this thread about what I set it out to be: actually observing changes (or not) in FSD behavior over new major FSD revisions. When the next major revision hits, I will be repeating the loop again (likely a few times per revision).

Thanks.
 
I'm happy you're happy with the FSDS progress and that you find it useful.

What matters to me is order-of-magnitude improvements in actual MTBF (mean time between failures), not meaningless anecdotes. My only comments in this thread were regarding its title (which I find hilarious, btw) and direct questions and comments based on my first post.

Here are some videos from your area, right?
 
I don't mean 100 drives literally...

The fact of the matter is that Tesla's system fails so often right now that it's obviously not ready to even be considered as a driverless system.

When people stop getting excited about a zero-intervention drive, and instead get surprised when the system fails, then we're looking at a march toward nines.

You need to have 99% safe drives without interventions before 99.9%, and 99.9% before 99.99%, and so on. Tesla is not yet at 99%, and each nine is an order of magnitude improvement.

A driverless system probably needs 99.99999% or something like that to be as safe as a human. Once every 100 miles is not anywhere close.
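To put numbers on that, here is a minimal sketch of the arithmetic; the 10-mile average drive length is my assumption for illustration:

```python
# Minimal sketch: how "miles between failures" scales with each added nine.
# Assumes an average drive of ~10 miles, so 99% intervention-free drives
# is roughly one failure per 1,000 miles. Illustrative numbers only.

MILES_PER_DRIVE = 10  # assumed average drive length

for nines in range(2, 8):  # 99% up through 99.99999%
    success_rate = 1 - 10 ** -nines
    miles_between_failures = MILES_PER_DRIVE / (1 - success_rate)
    print(f"{success_rate:.5%} failure-free drives "
          f"-> ~{miles_between_failures:,.0f} miles between failures")
```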
If you think that humans are even 99% perfect drivers you obviously don't drive on the roads of this country, or are delusional.
 
If you think that humans are even 99% perfect drivers you obviously don't drive on the roads of this country, or are delusional.
Human drivers in the US have on average 2-3 accidents in their lifetime, plus a handful of parking-related impacts and such. A good driver has fewer. To get a net improvement in road safety, that must be the benchmark.
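For a rough sense of what that implies per mile, here is a back-of-the-envelope sketch; the lifetime-mileage figure is an assumed round number for illustration, not a sourced statistic:

```python
# Back-of-the-envelope: lifetime crashes -> crashes per mile.
LIFETIME_MILES = 700_000   # assumed lifetime mileage of a US driver
LIFETIME_CRASHES = 2.5     # midpoint of the 2-3 figure above

miles_per_crash = LIFETIME_MILES / LIFETIME_CRASHES
print(f"~1 crash per {miles_per_crash:,.0f} miles "
      f"({LIFETIME_CRASHES / LIFETIME_MILES:.2e} crashes/mile)")
# ~1 crash per 280,000 miles (3.57e-06 crashes/mile)
```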
 
Human drivers in the US have on average 2-3 accidents in their lifetime, plus a handful of parking-related impacts and such. A good driver has fewer. To get a net improvement in road safety, that must be the benchmark.
I have read 4+ "crashes"...per lifetime...which may be different than accidents.

There are several different benchmarks that I'm sure will be looked at: "crashes or accidents per mile", "deaths / mile", "injury / mile" , "$$$ in damages / mile", etc. All will have an impact on the introduction of L3 and L4 autonomy.

Again, if you have not used both V11 and V12 and experienced them both, then you have no hope to see V12 in context. Yes, "dude", I had 4 disengagements in 4 blocks. However, only ONE of those disengagements (in fact, only one on the ENTIRE 90-mile drive) was due to a safety concern. The others were all planning / navigation related or comfort related. So I didn't NEED to disengage...it would have happily just re-routed. It would not have been as efficient a drive, but it would not have required a disengagement.

This behavior is completely different from my experience (and others', as you noted) with FSD 11.
 
Again, if you have not used both V11 and V12 and experienced them both, then you have no hope to see V12 in context.

I used version 11, and I now use 12.3.4. What a difference! Version 11 put the excitement into a boring drive. Never knew quite what it would do when multiple cars were around, but it would sometimes be dramatic.

Version 12 drives like my sister.

I mostly drive around Phoenix, AZ, which has wide streets, painted lines, and some freeways, and is basically an easier place to drive than, say, Alexandria, VA. Most of my disengagements are because the car is driving too much like my sister, and I want to get there on time. Most of the rest of the disengagements are because FSD won't get into or out of the HOV lane here. I do have 'use HOV lane' enabled.

I'm sold on FSD. I'm paying the subscription and intend to continue. It's just too good as a driver's aid. Just like I use cruise control on a boring highway, I use FSD when I'm more interested in doing something besides driving. I'll be driving along and get a phone call, perhaps. I just put it into FSD and take the call. If the car isn't doing something I want, I take over and do what I want.

I don't expect it to be perfect. I expect it to be good enough, like Amazon Alexa or Google or Apple is at deciphering your speech. Works great until you get to something it can't do, often a restaurant's foreign name. For me, it's already good enough.

I relax more when it is driving. I generally foresee problems coming, and I've always got one hand weighing on the steering wheel. Often have a foot poised over the accelerator pedal too. It's just a different way of driving the car. I still feel like I'm driving the car; I'm just not having to do most of the heavy lifting.
 
Elon claims that 'the wiggle' (which I have already said is one of the 2 major issues) is resolved in FSD 12.4:


We'll see....whenever it comes out! This also reinforces my assumption that the point releases (X.Y.Z) are minor, and not worth testing for fixes / advancement. I'm assuming the point releases are mostly for fine-tuning for different hardware configurations, merging into separate code branches, etc.
 
This also reinforces my assumption that the point releases (X.Y.Z) are minor
The software industry generally refers to those as major, minor and bugfix/patch/maintenance releases. The third number is usually just some quick thing that needed doing, but isn't a feature. Major and minor are pretty much always feature releases. Major features and minor features.
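As a generic illustration of that convention (a sketch of ordinary major.minor.patch semantics, not Tesla's actual release process, which also uses its own year.week-style firmware numbering):

```python
# Generic major.minor.patch interpretation, sketched for illustration only.
def classify_release(old: str, new: str) -> str:
    o, n = (tuple(int(x) for x in v.split(".")) for v in (old, new))
    if n[0] > o[0]:
        return "major release (big feature / behavior changes expected)"
    if n[1] > o[1]:
        return "minor release (smaller features)"
    return "patch release (bugfixes/tuning; likely not worth re-testing)"

print(classify_release("12.3.4", "12.4.0"))  # minor release ...
print(classify_release("12.3.3", "12.3.4"))  # patch release ...
```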
 
The software industry generally refers to those as major, minor and bugfix/patch/maintenance releases. The third number is usually just some quick thing that needed doing, but isn't a feature. Major and minor are pretty much always feature releases. Major features and minor features.
Exactly...but with many Tesla personalities / bulls getting hyped over any new release, I just thought it was prudent to call this out. :)
 
What is unknown is how long it will actually take - a year+, 5 years+, 10 years+...? Nobody knows.

For example, how long will it take for Tesla to recognize/respond to school buses and school zones? Surely they have enough training material. It isn't an edge case, and it's very easy to gather more training data if they want. They know the deficiency but have not tried to address it. Makes me think it isn't easy.

I have recently provided them a bunch of training data on school zones and buses!
 
I have read 4+ "crashes"...per lifetime...which may be different than accidents.

There are several different benchmarks that I'm sure will be looked at: "crashes or accidents per mile", "deaths / mile", "injury / mile" , "$$$ in damages / mile", etc. All will have an impact on the introduction of L3 and L4 autonomy.
Statistically, you're only at fault in about half of the incidents you're involved in. So I think it's closer to 2 than 4. It doesn't really matter for the sake of my point, though.

Yes, in the long run autonomy will get those numbers down. Some are already there. Look at the study that Waymo had Swiss Re do.

However, no one is even getting close to removing the driver in a "complete-geo" ODD. I don't think it will happen in 10 years, and not without a steering wheel in 20 years, as that requires even more capability and higher reliability. Simple but infrequent cases like getting on and off a ferry or valet parking in super-tight garages won't likely be generally solved in 10 years, as they won't be a priority for quite some time.

I can see and acknowledge that people feel that v12 drives more naturally than v11, but it fails about as often when looking at the data. The latter is the only KPI I find relevant for now when discussing "the march of nines" (i.e. reliability). If you can't remove the driver, there is no autonomy.
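For what it's worth, the comparison I mean could be as simple as this (a toy sketch; the mileage and disengagement counts below are placeholders, not real fleet data):

```python
# Toy sketch: compare critical-disengagement rates between two versions.
# All numbers below are placeholders for illustration, not real data.

def miles_per_critical_de(miles: float, critical_des: int) -> float:
    """Mean miles between critical disengagements."""
    return miles / critical_des if critical_des else float("inf")

v11 = miles_per_critical_de(10_000, 75)   # placeholder figures
v12 = miles_per_critical_de(10_000, 70)   # placeholder figures
print(f"v11: ~{v11:,.0f} mi/critical DE, v12: ~{v12:,.0f} mi/critical DE")
print(f"improvement: {v12 / v11:.2f}x (an order of magnitude would be 10x)")
```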
 
I can see and acknowledge that people feel that v12 drives more naturally than v11, but it fails about as often when looking at the data. The latter is the only KPI I find relevant for now when discussing "the march of nines" (i.e. reliability). If you can't remove the driver, there is no autonomy.
Again, it may (though not in my experience) "fail about as often"...if you define "failure" as any intervention or disengagement.

However, the scope / diversity of situations in which it fails is much, much, much less. This is what you do not seem to understand. It's not just being more "natural" (which is true). It's a step change in confidence using the system, because you "know" that there are really only certain situations where you need to be hyper-vigilant.

Driving on version 11 was often more stressful than not.
Driving on version 12 is the complete opposite.

You don't "see this" in the data. You experience it. So yes, I re-iterate that I see version 12 as the "beginning of the end". Again: beginning of the end.
 
Critical DEs (disengagements) are in the data, so we do see them.

I think FSDS is quite an achievement technically, but it remains to be seen if it's a product people are willing to pay money for, not counting enthusiasts/nerds like you and me.

It's all semantics, but I don't think it's the beginning of the end. My position is that there is no autonomy effort from Tesla other than in marketing until they define an operational design domain (ODD) where they guarantee the functionality.

You don't go from nothing to everything at once by some magic "invention". That's all marketing and not how safety-critical engineering works.

I think there will be many more iterations and re-architectings of Tesla's approach if they want to solve autonomy, including but not limited to more sensors, other sensors, new software approaches, etc. You are never done, and there are plenty more nines to chase even when you have specified an ODD, which Tesla is yet to do. And when you have enough nines in your current ODD, you can try to expand it and get nines there.

And so it goes in perpetuity until you're done or give up.
 
I think FSDS is quite an achievement technically, but it remains to be seen if it's a product people are willing to pay money for, not counting enthusiasts/nerds like you and me.
Obviously.
It's all semantics, but I don't think it's the beginning of the end.
You've made that much clear. We get it.

Now, do you have any actual questions about the drive itself, or suggestions on route, methods, etc.? Or do you want to continue to make semantic arguments and de-focus the thread?