Elon Musk: EAP solved, on track for FSD completion in 2019 (No one else is close!)

Can you explain what human can do that camera vision system can't?

Setting the question aside, I just want to make sure you know there is nothing human eyes can do that a camera vision system can't.

Define "camera vision system" as you are using it, since the conversation you joined had a specific definition (single camera) and one identified limitation (depth perception).
 
Can you explain what human can do that camera vision system can't?

High framerate (a normal camera needs a LOT of exposure/light to do the same)
Cleaning itself :) (Tesla does not have this)

Also:
  • A film in a camera is uniformly sensitive to light. The human retina is not. Therefore, with respect to quality of image and capturing power, our eyes have a greater sensitivity in dark locations than a typical camera.
  • The eye has roughly 130 million photoreceptors ("pixels")
Both have pros and cons, but they are not the same at all.
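
To put a rough number on the framerate point: a camera frame can only expose for at most 1/fps seconds, so raising the framerate directly cuts the light each frame collects. A quick back-of-the-envelope sketch in Python (illustrative numbers only, not any particular camera's specs):

# Rough illustration: the maximum exposure time per frame shrinks as the framerate
# rises, so a high-framerate camera collects proportionally less light per frame.
for fps in (30, 60, 120, 240):
    max_exposure_ms = 1000.0 / fps   # at most, the shutter stays open for the whole frame
    relative_light = 30.0 / fps      # light per frame relative to a 30 fps baseline
    print(f"{fps:>3} fps: <= {max_exposure_ms:5.1f} ms exposure, {relative_light:.2f}x the light of 30 fps")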
 
High framerate (a normal camera needs a LOT of exposure/light to do the same)
Cleaning itself :) (Tesla does not have this)

Also:
  • A film in a camera is uniformly sensitive to light. The human retina is not. Therefore, with respect to quality of image and capturing power, our eyes have a greater sensitivity in dark locations than a typical camera.
  • The eye has roughly 130 million photoreceptors ("pixels")
Both have pros and cons, but they are not the same at all.
Plus, our ability to infer information as we drive (eye contact, noticing a person even when out of focus, etc.) is far above what most AI engines are likely able to extract from cameras today, so anything that would give them an unfair advantage over humans (adding LIDAR, for example) should be leveraged, IMO.
 
I do agree that computer vision isn't on par with human vision YET, but my point is that eventually computer vision will be good enough for 99% of weather conditions. Computer depth perception is already better than ours, but humans can decipher very complex scenarios, and edge cases in particular are handled with ease by humans. Still, in a lot of cases environmental awareness is already better in cars. Just take the many examples of Autopilot braking for something ahead that a driver would probably find very hard to spot without being 100% focused on all the cars ahead.
 
High framerate (a normal camera needs a LOT of exposure/light to do the same)
Cleaning itself :) (Tesla does not have this)

Also:
  • A film in a camera is uniformly sensitive to light. The human retina is not. Therefore, with respect to quality of image and capturing power, our eyes have a greater sensitivity in dark locations than a typical camera.
  • The eye has roughly 130 million photoreceptors ("pixels")
Both have pros and cons, but they are not the same at all.

Could you drive a car by watching a monitor fed by a camera?

Corollary: Are racing video games possible?
 
Could you drive a car by watching a monitor fed by a camera?

Corollary: Are racing video games possible?

What about the opposite: can AP function in a simulation?

If someone set a Tesla AP (v8) system up in front of a console gaming rig running Forza Horizon 4 on a giant screen, disabled the radar and hooked the steering/power/braking from AP into the console, would it be able to "drive"?

Part of the NN training is (probably) done in a simulation, so it should do quite well.... right?

This would be a great hack to see! :)
 
  • Like
Reactions: croman
What about the opposite: can AP function in a simulation?

If someone set a Tesla AP (v8) system up in front of a console gaming rig running Forza Horizon 4 on a giant screen, disabled the radar and hooked the steering/power/braking from AP into the console, would it be able to "drive"?

Part of the NN training is (probably) done in a simulation, so it should do quite well.... right?

This would be a great hack to see! :)

Tesla was hiring 3-D game engine developer types, so I think something similar to that is happening, likely using real-world graphics. They could dump a few hundred / thousand / tens of thousands of copies of the same NN into a virtual city (running the final NN is easier than training it) and let them drive at the speed of the CPU (racks of AP3 chips?).

Watched Karpathy's PyTorch talk. He was talking about using one code set (network, training data, test cases) for everyone, with a build (re-train) on every check-in. I can't imagine the size of the training computer system needed to make that feasible.
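
For what it's worth, the "dump thousands of copies of the same NN" part is cheap precisely because inference batches well: one forward pass can drive every simulated car in the same tick. A toy PyTorch sketch (the tiny policy network and the random "camera" input are made up for illustration and have nothing to do with Tesla's actual network):

import torch
import torch.nn as nn

# Toy "driving policy": maps a flattened camera frame to steering/throttle/brake.
policy = nn.Sequential(
    nn.Linear(32 * 32 * 3, 256),
    nn.ReLU(),
    nn.Linear(256, 3),   # [steering, throttle, brake]
)
policy.eval()

num_agents = 10_000  # thousands of simulated cars sharing one set of weights

with torch.no_grad():
    frames = torch.rand(num_agents, 32 * 32 * 3)  # one fake 32x32 RGB frame per car
    controls = policy(frames)                     # one batched forward pass per sim tick

print(controls.shape)  # torch.Size([10000, 3])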
 
Define "camera vision system" as you are using it, since the conversation you joined had a specific definition (single camera) and one identified limitation (depth perception).

Read the article below to understand how camera sensor capability compares to the human eye. It is a more than decade-old article, and camera systems had already matched or exceeded human visual capability in pretty much every measure; sensor technology is far more advanced today than when it was published. It is safe to say the "sensor" is better than our eyes. A lot of people get hung up on the "eye", but they don't realize it's the "brain" that does most of the work. That's the reason Tesla has concentrated on neural net machine learning and on improving the AI chip to solve the FSD challenge.

Cameras vs. The Human Eye

As for using a camera to determine distance, there are many ways to do it; you can do a search if you're interested. Modern consumer cameras, for example, can easily determine subject distance with a single sensor to perform autofocus, at a speed better than we could.
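
For anyone curious, one of the simplest single-camera tricks: if you know (or can classify) an object's real-world size, the pinhole model gives its distance from how many pixels it spans. A minimal sketch with made-up numbers (the focal length and car height below are illustrative assumptions, not values from any Tesla camera):

def distance_from_known_size(focal_length_px: float,
                             real_height_m: float,
                             height_in_image_px: float) -> float:
    """Pinhole camera model: Z = f * H / h."""
    return focal_length_px * real_height_m / height_in_image_px

# Hypothetical example: a ~1.5 m tall car spanning 90 px, focal length 1200 px.
print(distance_from_known_size(1200.0, 1.5, 90.0))  # -> 20.0 metres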
 
  • Like
Reactions: mongo
Could you drive a car by watching a monitor fed by a camera?

First thing that came to mind after reading this was that we have drone pilots in Nevada (I believe) using cameras to control drones on the other side of the world. Maybe not the same thing; it just hit me when you asked the question. I do think this may happen more and more as a backup to self-driving fleets.
 
  • Like
Reactions: mongo
Lots of interesting stuff being discussed here. When do the optimistically minded among you suggest we start selling the system? Also, could you be more specific about the hardware needed? I have to start building the boards for this FSD system.
 
  • Funny
Reactions: croman
Read the article below to understand how camera sensor capability compares to the human eye. It is a more than decade-old article, and camera systems had already matched or exceeded human visual capability in pretty much every measure; sensor technology is far more advanced today than when it was published. It is safe to say the "sensor" is better than our eyes. A lot of people get hung up on the "eye", but they don't realize it's the "brain" that does most of the work. That's the reason Tesla has concentrated on neural net machine learning and on improving the AI chip to solve the FSD challenge.

So what you really meant to ask was, "What can the human eyes do that a camera cannot do?" This is a very different question than:

Can you explain what human can do that camera vision system can't?


As for using a camera to determine distance, there are many ways to do it; you can do a search if you're interested. Modern consumer cameras, for example, can easily determine subject distance with a single sensor to perform autofocus, at a speed better than we could.

Focus is not the same as measuring distance. The former is a "sensor" concept (eye/camera), whereas the latter is a "processing" concept (brain/computer). Autofocus requires both.
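
To make the "autofocus requires both" point concrete: once the lens/sensor side has found focus, turning that focus position into a distance is a small processing step. A sketch using the thin-lens equation, with made-up numbers:

def object_distance_m(focal_length_mm: float, lens_to_sensor_mm: float) -> float:
    """Thin-lens equation 1/f = 1/d_o + 1/d_i, solved for the object distance d_o."""
    d_o_mm = 1.0 / (1.0 / focal_length_mm - 1.0 / lens_to_sensor_mm)
    return d_o_mm / 1000.0

# Hypothetical 50 mm lens that found focus with the sensor 50.5 mm behind it.
print(object_distance_m(50.0, 50.5))  # ~5.05 m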
 
Here is something that a computer vision system can do that human eyes cannot: it can look in all directions at the same time.

That's actually pretty important. How much of the human driving equation is complicated and fuzzy because we can only look in one direction at a time, but threats can come from multiple directions? To work around this, humans develop a very complicated and often faulty thought process to assign threat levels to all areas of the driving environment and then focus on the areas where they perceive the most likely threats coming from.

Someone brought up the difficulty in determining if someone has seen you and recognized you. That is a hard problem, but most of the value in making that determination is determining whether or not they will end up violating your right of way so that you can turn your focus to somewhere else that might have conflicting traffic. While it might still be helpful for courtesy to know if they really recognized you and whether you should wait for a cooperative right-of-way assessment, there is less of a safety issue if you can keep your eye on them as you also watch for cross traffic and the light changing and pedestrians crossing in front and traffic coming up from behind - all at the same time.
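
A toy way to picture the difference (the directions, threat values, and timing below are invented purely to illustrate the argument, not a model of any real system): a human-like driver refreshes one direction per glance and works from stale estimates everywhere else, while a surround-camera system refreshes every direction every cycle.

import itertools

directions = ["front", "left", "right", "rear"]

def threat_at(direction: str, t: int) -> float:
    """Pretend world: a threat pops up on the left at time step 2."""
    return 0.9 if (direction == "left" and t >= 2) else 0.1

human_view = {d: 0.1 for d in directions}   # last value seen per direction
glances = itertools.cycle(directions)       # one direction checked per step

for t in range(4):
    looked_at = next(glances)
    human_view[looked_at] = threat_at(looked_at, t)        # only this direction refreshes
    camera_view = {d: threat_at(d, t) for d in directions} # every direction refreshes

print("human (stale):", human_view)   # still shows 'left': 0.1 if it wasn't the last glance
print("cameras      :", camera_view)  # always current: 'left': 0.9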
 
  • Like
Reactions: fmonera