No, I don't think you need AGI to reach FSD. That's because driving by definition is a specialized task. I define AGI as fully generalized intelligence that can learn any task from driving to writing a letter, playing music, solving a math problem or playing a sport. You don't need intelligence that can play music or write a poem to be able to drive. You just need intelligence that understands how to drive. Hence, you don't need AGI in my opinion. But you do need AI that can think critically about the task of driving.
As I see it, there are 3 elements of intelligence that are required in order to drive safely, that autonomous cars will need as well:
1) Perception.
The AV needs to understand the world around it, both static and dynamic, i.e. be able to detect and classify roads, lane lines, road markings, traffic lights, stop signs, road signs, drivable space, curbs, crosswalks, vehicles, motorcycles, trucks, bicycles, pedestrians, animals, road debris, etc.
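To make that concrete, here is a toy sketch in Python of what a perception module's output might look like. All class names, fields, and numbers are invented for illustration, not any real AV stack's API:

```python
from dataclasses import dataclass

# Hypothetical perception output: one record per detected object,
# split into static (map-like) and dynamic (moving) classes.
@dataclass
class Detection:
    label: str               # e.g. "pedestrian", "vehicle", "stop_sign"
    position: tuple          # (x, y) in the car's frame, meters
    velocity: tuple          # (vx, vy) in m/s; near zero for static objects
    confidence: float        # classifier confidence in [0, 1]

STATIC_CLASSES = {"lane_line", "stop_sign", "traffic_light", "curb", "crosswalk"}
DYNAMIC_CLASSES = {"vehicle", "pedestrian", "bicycle", "motorcycle", "animal"}

def is_dynamic(det: Detection) -> bool:
    # dynamic objects get handed to behavior prediction (next section);
    # static ones go into the map of drivable space
    return det.label in DYNAMIC_CLASSES

ped = Detection("pedestrian", (12.0, 3.5), (0.0, 1.2), 0.94)
print(is_dynamic(ped))  # True
```

The static/dynamic split matters because only the dynamic objects need the behavior prediction described next.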
2) Behavior Prediction.
The AV needs to understand how objects relate to each other and move. So it needs to understand if a car is going to turn right, if a pedestrian is about to cross the street, etc. This can be complex because the behavior of objects is often interconnected and changes in real time. For example, a car starts to turn right but then stops when it sees a pedestrian. The pedestrian was going to stop, but seeing the car yield, decides to cross anyway.
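That car/pedestrian example can be sketched as coupled predictions that depend on each other, iterated until they stop changing. This is a deliberately tiny toy (two agents, two actions each, invented rules), just to show why interconnected behavior is harder than predicting each agent in isolation:

```python
# Toy coupled behavior prediction: each agent's predicted action depends
# on the other's, so we iterate the predictions to a fixed point.

def predict_car(ped_action):
    # the car yields if the pedestrian is expected to step into the crosswalk
    return "yield" if ped_action == "cross" else "turn"

def predict_ped(car_action):
    # the pedestrian crosses if the car is expected to yield
    return "cross" if car_action == "yield" else "wait"

car, ped = "turn", "cross"            # initial guesses from raw trajectories
for _ in range(10):
    new_car = predict_car(ped)        # car reacts to the predicted pedestrian
    new_ped = predict_ped(new_car)    # pedestrian reacts to the car's new plan
    if (new_car, new_ped) == (car, ped):
        break                         # predictions stopped changing
    car, ped = new_car, new_ped

print(car, ped)  # yield cross — the car stops, the pedestrian goes
```

With independent (non-coupled) predictions you'd get "turn" and "cross" simultaneously, which is exactly the conflict the iteration resolves.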
3) Planning/Controls
The AV needs to make a plan about what to do and then execute that plan with the right steering and accelerator/braking inputs. The AV also needs to understand the rules of the road, so it needs to obey speed limits, stop signs, traffic lights, yield signs, other road signs, etc. For example, it sees a stop sign, so it applies the brakes the right amount to come to a stop at the stop line, waits if necessary for cross traffic, and turns the steering wheel the right amount to make a smooth turn when it is its turn to go and the path is clear. Or it makes a lane change to move into the turn-only lane to stay on route, slows down when the car in front slows down, moves around an obstacle when it is clear, etc.
Planning is the real challenge IMO. Perception is relatively easy (sensors can tell you where objects are). You can build neural networks that take in perception and road rules and predict the likely paths objects might take. The real intelligence is in the planner, because that is the decision-making. And there are millions of different driving scenarios the AV has to figure out. Also, there is not always an absolutely correct answer. Sometimes both decisions can be safe; one might just be more assertive than the other. For example, turning before cross traffic reaches me versus waiting for cross traffic to pass and then making the turn. Both might be safe; it is a matter of how assertive I want to be. But I need to factor in how my decision might affect others too.
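The "both are safe, one is more assertive" trade-off can be sketched as a hard safety gate plus a tunable cost function. Everything here is made up for illustration (the candidate names, the risk numbers, the weighting), but it shows how a planner can pick between two acceptable answers:

```python
# Toy planner: candidates that pass a hard safety gate are ranked by a
# cost that trades collision risk against time lost, weighted by an
# "assertiveness" knob. All numbers are invented.

candidates = [
    # (name, collision_risk, seconds_lost_waiting)
    ("turn_before_cross_traffic", 0.002,  0.0),
    ("wait_then_turn",            0.0005, 6.0),
]

RISK_LIMIT = 0.01       # hard safety gate: anything riskier is discarded outright
ASSERTIVENESS = 0.5     # 0 = maximally cautious, 1 = values time highly

def cost(risk, delay):
    # lower is better: risk always counts, delay counts more as assertiveness rises
    return (1 - ASSERTIVENESS) * risk * 1000 + ASSERTIVENESS * delay

safe = [c for c in candidates if c[1] <= RISK_LIMIT]
best = min(safe, key=lambda c: cost(c[1], c[2]))
print(best[0])  # turn_before_cross_traffic
```

Drop `ASSERTIVENESS` toward 0 and the same planner picks `wait_then_turn` instead; both outcomes are "correct", which is exactly the point.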
A lot of driving can seem routine when you are just cruising down the road and making protected turns. But there are many tricky edge cases, from construction zones that change the drivable space, to road accidents with police directing traffic by hand, to obstacles like double-parked cars, to unusual objects on the road, etc. And some cases can be ambiguous. For example, I remember Waymo and Cruise used to get confused about whether a stopped vehicle was double parked or not. So they were not sure if the vehicle was actually parked and they should go around, or if they should wait for the vehicle to start moving. And if the AV does decide to go around the obstacle, and that requires driving into the oncoming lane, it needs to make sure the path is clear so as not to drive head-on into an oncoming car. And what should it do if there is an occlusion, like a large vehicle, where it can't see if there is oncoming traffic without moving into the other lane? Those are just a couple of scenarios that require the AV to reason about what to do.
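One way to frame the double-parked-car ambiguity is as evidence accumulation: rather than deciding instantly, the AV weighs cues (hazard lights, how long the vehicle has been stopped, the gap to the car ahead of it) before committing. The cues and thresholds below are invented, not how Waymo or Cruise actually do it:

```python
# Hypothetical evidence-scoring for the "is it double parked?" ambiguity.
# Each cue adds to a score; only past a threshold does the AV commit to
# going around. All cues and thresholds are made-up assumptions.

def classify_stopped_vehicle(hazards_on, seconds_stopped, gap_ahead_m):
    """Return 'double_parked' (plan to go around) or 'queued' (wait)."""
    score = 0
    if hazards_on:
        score += 2            # hazard lights strongly suggest parking
    if seconds_stopped > 30:
        score += 1            # long dwell time with no movement
    if gap_ahead_m > 10:
        score += 1            # big gap ahead: probably not waiting in a queue
    return "double_parked" if score >= 2 else "queued"

print(classify_stopped_vehicle(True, 10, 5))    # double_parked
print(classify_stopped_vehicle(False, 40, 3))   # queued
```

Even with a rule like this, the occlusion case still needs separate handling, since going around requires first confirming the oncoming lane is clear.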
On a side note, I think this is why so many people overestimated how soon autonomous driving would arrive. They looked at perception, prediction, and planning for common driving scenarios and figured it would not take that long to solve everything. They underestimated all the edge cases that would require more complicated reasoning.
It should be noted that we've had L4 cars that can do perception, prediction, follow road rules, and plan for a few years now. The fact that we have L4 now without AGI suggests to me that AGI is not needed. But up to now, AI has been very "rigid": it only works well within what it has been trained to do. For example, say the AV was trained to stop at a red light, wait for the green, and make a protected turn when the path is clear. When it encounters that scenario, it follows its training and handles it great. But if it encounters a scenario outside of its training, called an edge case, then it might stop and act confused, not sure what to do. We've seen examples of this with Waymo and Cruise, where the robotaxis encounter a new or different scenario and pause to ask remote assistance for help. This is why I think current AVs either have safety drivers when the operational design domain (ODD) is big, or are driverless only inside a geofence with remote assistance as backup. We need to limit the risk one way or the other.
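The "pause and ask remote assistance" behavior amounts to a confidence gate on the planner's output. Here is a minimal sketch of that idea; the threshold, action names, and confidence values are all invented:

```python
# Hypothetical fallback logic: if the planner's best option isn't
# confident enough (an out-of-training edge case), the AV stops and
# escalates to a remote operator instead of guessing.

CONFIDENCE_THRESHOLD = 0.8   # assumed cutoff, not from any real system

def choose_action(options):
    """options: list of (action, confidence) pairs from the planner."""
    action, confidence = max(options, key=lambda o: o[1])
    if confidence < CONFIDENCE_THRESHOLD:
        return "pull_over_and_request_remote_assistance"
    return action

print(choose_action([("proceed", 0.95), ("yield", 0.60)]))    # proceed
print(choose_action([("go_around", 0.55), ("wait", 0.50)]))   # pull_over_and_request_remote_assistance
```

The geofence plays the same role at a larger scale: it keeps the car inside the region where confidence like this is usually high.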
So I would say the real challenge now is how to deal with these edge cases, i.e. how to make the AV more intelligent. One approach would be to simply add more training. More training means fewer edge cases; fewer edge cases mean fewer stalls. And in theory, we should be able to do enough training to eventually cover 99.999999% of cases, at which point stalls will be rare enough that they won't matter. The problem with this approach is that it might take a while to solve enough edge cases, and it is likely stalls will still happen, just very rarely. So with this approach, we would probably still need a human in the loop for a while to help when the AV gets stuck. Ideally, for true autonomous driving, we would like it to both work everywhere and remove the human from the loop entirely.
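A quick back-of-the-envelope calculation shows what that coverage figure buys you. The scenario rate is an assumption I made up just to demonstrate the arithmetic:

```python
# Back-of-the-envelope: how often does a 99.999999%-coverage AV stall?
# The scenarios-per-hour figure is an invented assumption.

coverage = 0.99999999          # fraction of scenarios handled by training
miss_rate = 1 - coverage       # ~1e-8: chance a scenario is an unhandled edge case
scenarios_per_hour = 1000      # assumed rate of distinct driving scenarios

stalls_per_hour = miss_rate * scenarios_per_hour
hours_between_stalls = 1 / stalls_per_hour
print(round(hours_between_stalls))  # ~100000 hours between stalls, on average
```

Roughly one stall per 100,000 driving hours under these assumptions — rare, but across a fleet driving millions of hours, stalls would still occur regularly, which is why a human in the loop lingers under this approach.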
Another approach would be to get AI that can think outside of its training. If the AI can generalize outside of its training, then we would not need a human in the loop for edge cases anymore, because the AV could learn and figure things out on its own when it encounters a new edge case, like humans do. Of course, this approach is more difficult because it would require a new type of AI that can generalize and reason on its own. But the hope is that with the latest developments in AI, specifically foundation models and end-to-end learning, we might be getting close. If anything, even if these latest advances in AI don't solve everything perfectly, they should help us solve a lot more edge cases faster and therefore make AVs much safer and more reliable, to the point where we reach L5 and don't need safety drivers or even remote assistance anymore. Lastly, we should remember that AVs don't need to be perfect; they just need to be orders of magnitude safer than humans. So we don't need AI to be perfect, just superhuman.