As a generalist, you need to keep a couple of simple questions in mind: if I summon a car to get me places, and there's a car in the fleet nearby, will it be able to get to me?
Good questions, yes. I don't know where you live. But for me, the answer is "no".
Will it take me where I need to go?
"No"
My vacuum cleaner kicks off at 1pm every day. It has a lot of space to deal with, and a pretty complicated layout too. Yeah, it gets stuck here and there, so maybe twice a month I have to pull it out of somewhere and carry it to its charging base.
I had one of those too! It got stuck every single day! Returned it for a refund. Hired someone to clean the place. *end anecdote*
Would I go back to hiring someone to clean the place just because it fails ~7% of the time? Hell no, this is good enough, and I'm sure that in the 3 years since I got this vac they've come up with something that only fails 1% of the time, maybe once a quarter. Same here: as long as these FSD cars aren't presenting a safety hazard, even if they're not perfect, the added safety and economic value is so overwhelming that dealing with a few glitches here and there is going to be well worth it.
Glitches like occasionally being stopped dead mid-trip and having to call a taxi from a random location? Well, OK, if you're cool with that, fine. In some major cities it's not a problem; you can just step out and get on the subway.
Not to get too off topic here, but let me use the doctors example to illustrate that the issue with machine learning isn't the capability of the neural network, but the paradigm space that the humans training it operate in (i.e., are they asking the right question?) and the quality of the training set.
If you could quantify a good chunk of critical data about a person, such as:
1. Genome
2. Lifestyle stuff: diet, predominant world view and psychological states, sleeping habits, environment, etc.
3. Gut biome
4. Historical data on lifestyle, injuries, medications taken, past traumatic events, etc.
5. Current comprehensive blood, urine, stool, fMRI, whatever else tests that can be done
6. A bunch of other stuff I can't think of now
...then you add labeling to this dataset. Someone shows up at the doc's, we do 1-6, they get a diagnosis and a prescription, and then they die within half a year. Or they recover. Or they ignore the prescription, do something else, and have a different outcome.
So you stuff all of this into a training dataset. And I bet you will start getting things that will make a lot of people really uncomfortable, but also creepily accurate. Like "stop eating cookies or die in 23 months". "This antibiotic will shorten your life span by 3 years and won't help with the current symptom. Go to Hawaii instead, healed in 3 weeks". "Divorce or heart attack in 3.4 years".
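To make the idea concrete, here's a toy sketch of that kind of labeled dataset. Every name and number below is made up (the features, the outcomes, the 1-nearest-neighbour "model"); it only illustrates the collect-features, label-outcomes, predict-from-similar-cases loop, not anything resembling real medicine:

```python
# Toy sketch of the labeled-dataset idea above (all names/numbers hypothetical).
# Each record is a feature vector (genome score, lifestyle score, gut-biome
# diversity, a blood marker) plus the outcome that was actually observed.
import math

# (features, outcome) pairs -- fabricated toy numbers
training_set = [
    ((0.9, 0.2, 0.7, 5.1), "recovered"),
    ((0.4, 0.8, 0.3, 9.8), "died_within_6mo"),
    ((0.8, 0.3, 0.6, 5.5), "recovered"),
    ((0.3, 0.9, 0.2, 10.2), "died_within_6mo"),
]

def predict(features):
    """1-nearest-neighbour 'uber-doctor': return the outcome label of the
    most similar patient seen during training."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(training_set, key=lambda rec: dist(rec[0], features))[1]

print(predict((0.85, 0.25, 0.65, 5.3)))  # resembles the 'recovered' patients
```

A real system would replace the nearest-neighbour lookup with a trained network, but the shape of the data (quantified inputs, observed-outcome labels) is the point.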
Our use of machine learning is limited by how well we can present real-world conditions and outcomes to it. Sure, currently we do stuff like training to classify cancer from images. But that is only because we haven't learned (or, as I think will soon be the case, are disincentivized to even try) to collect more representative data in more dimensions and train bigger NNs on those bigger datasets.
For self-driving, the overwhelming majority of the overall problem can be broken down into well-understood classification and prediction sub-problems and solved with an appropriately big and representative training set. Which is what Tesla is doing, and they have gotten fairly far along that road. Driving is a lot simpler than medicine, but that is not to say medicine can't be solved much better by machine learning. The same thing applies: it's a huge multiplier when you can train one "uber-doctor" on all the corner cases that no single doctor could ever see in a lifetime, not even 1% of them. And you can train that doctor without any preconceived notions of what should and should not work, so it would consider any solution. Same with humans and driving. Except driving won't run into all the moral and other ways humans resist acknowledging reality, which I fully expect to happen in the case of medicine. Drove from A to B, no accidents, reasonable speed: all good, we're on board.
So I actually rated this "love", because you're right, "Our use of machine learning is limited by how well can we present real world conditions and outcomes to it".
But this is much, much harder than you think it is.
One of the things you're wildly wrong about is the difficulty of actually getting the right dataset. In medicine, we don't even know what data we need to collect -- the importance of the gut biome wasn't recognized until recently, and people who talked about it were considered cranks, for instance. Half the work of a good doctor is deciding what data to collect.
Many medical studies end up being debunked when lurking variables are discovered. NNs overfit and would give tons of bad advice if you didn't account for this, which is very hard to do. They're just correlation machines -- they don't know causation; that would require a different type of computer system.
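A minimal illustration of that overfitting point, assuming the crudest possible "model" (a lookup table that memorizes its training data): it scores perfectly on labels that are pure noise, then does no better than chance on new cases, because there was never any real signal to learn.

```python
# Overfitting in miniature: memorize random labels, score 100% on training
# data, roughly 50% on fresh data. No causation was ever captured.
import random

random.seed(0)
train = [(i, random.choice([0, 1])) for i in range(100)]   # labels are pure noise
memorized = dict(train)                                    # "model" = lookup table

train_acc = sum(memorized[x] == y for x, y in train) / len(train)

test = [(i, random.choice([0, 1])) for i in range(100, 200)]
test_acc = sum(memorized.get(x, 0) == y for x, y in test) / len(test)

print(train_acc)  # 1.0 -- a perfect fit to noise
print(test_acc)   # roughly 0.5 -- chance level on unseen cases
```

A neural net fails less crudely than a lookup table, but the failure mode is the same shape: fit the accidents of the training set, not the underlying cause.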
In driving, the problem is far easier because we do know that we can do it mostly with vision + maybe some listening. The correlation-causation problems remain.
In medicine, the really tough problem is that we have trouble even measuring the outcomes. Then there's a worse problem -- we have trouble even determining what the desirable outcomes are; people actually disagree. Different people have different health goals.
Driving has a bunch of situations like this too. Tesla hasn't even gotten to the situations where people disagree on the desirable outcome. With human driving, to some extent, the human is given the decision-making power. Musk even discussed the tradeoff between pushing into traffic and risking fender-benders in LA, and basically said the driver would decide. I don't see any way to avoid that, which rules out FULL self-driving.