Welcome to Tesla Motors Club

Seeing the world in autopilot, part deux

so what's the importance of the 3D bounding box @Bladerskb ? you just need to know where you cannot drive on your side of the obstacle, who cares how it's shaped on the other sides anyway (in other words you can skin this cat in other ways). and they do this with driveable space, the bounding boxes are just for information more or less and it's possible (though unlikely) that they can do 3D bounding boxes, there's depth information, just no orientation vector to draw the 3d part.

Show us 3D planned path prediction or driveable space prediction from other vendors if you have the samples? Those seem to be pretty important things.

This will be a rehash from reddit but i will try to update with greater details.

Actually, the 3D bounding box might be the most important thing in object detection. A 2D bounding box (which includes shapes) doesn't include important information that you need: for example, the precise orientation of a car, with which you can predict its actions. This is vital in a dense environment, during parking, in a parking lot. Is this car trying to pull out or not? What precise direction is it pulling toward? With just the information necessary to produce a 2D bounding box, you don't see it. But it is reflected in the information necessary to produce a 3D bounding box: is this the left side of a car, might the door pop out, is the door currently open, is this car trying to complete a turn, is this car coming at me... the list goes on. As Mobileye's Amnon Shashua says, the 2D bounding box is irrelevant.

While it's impressive that distances and speeds are derived visually, Mobileye has been doing this since EyeQ3. There's no tangible difference between at least the firmware you have unraveled and what EyeQ3 has been doing since its 2014 production date.

I'm just laying out that the lack of 3D boxes and the instability of the detection show their weakness compared to Mobileye.


so what's the importance of the 3D bounding box @Bladerskb ? you just need to know where you cannot drive on your side of the obstacle, who cares how it's shaped on the other sides anyway (in other words you can skin this cat in other ways).

A 3D bounding box, whether you are using lidar or a camera, gives you the precise orientation of an object, from which you can deduce its precise moment-to-moment decisions. This is very important in dense traffic or a parking lot.

Imagine if you painted the inside of the 2D boxes with a solid color and then looked at the scene from a human's perspective. Notice how you lose all indication of what's going on: you know only that there are objects in front of you, but have no idea of the scenario, orientation or state they are in.

oh wait, you already did that.



Notice how you don't know if the car in front of you is turning, and where it is turning to, or even if it is going the same direction as you. You're basically driving blind.

This is even more evident when you are driving on dense surface streets. A car could be in front of you at an intersection in the adjacent turning lane, preparing to turn into the parallel lane next to you. With only 2D bounding box info, you won't know if that car is driving forward parallel to you, driving forward adjacent to you, or making a turn.

Ask yourself: could you drive in dense traffic with the above view superimposed on your view? The answer is absolutely not. You need to know the orientation of cars.
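To make that concrete, here is a minimal Python sketch (the field names and numbers are made up for illustration, not from any actual firmware): with the yaw angle a 3D box carries, you can extrapolate where a car is headed; a flat image-space box gives you nothing to extrapolate with.

```python
import math

def predict_position(box3d, dt):
    """Extrapolate an object's position dt seconds ahead.

    box3d is a hypothetical dict with fields a 3D detector typically
    outputs: center (x, y) in meters, yaw (rad), speed (m/s).
    A 2D image-space box has no yaw, so this step is impossible
    without extra cues.
    """
    x, y = box3d["center"]
    yaw, v = box3d["yaw"], box3d["speed"]
    return (x + v * dt * math.cos(yaw),
            y + v * dt * math.sin(yaw))

# A car 10 m ahead but oriented across our path (yaw 0 = pointing
# along our x axis), moving at 5 m/s: in one second it cuts sideways.
car = {"center": (0.0, 10.0), "yaw": 0.0, "speed": 5.0}
print(predict_position(car, 1.0))
```

Without the yaw term the best you can do is assume the car moves the way you do, which is exactly the "cars hanging in air, all facing my direction" failure described later in this thread.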


and they do this with driveable space

If only driving were just about not hitting things, and not about prediction: knowing and planning around the actions, or perceived actions, of other drivers.
the bounding boxes are just for information more or less and it's possible (though unlikely) that they can do 3D bounding boxes, there's depth information, just no orientation vector to draw the 3d part.

Depth info by itself isn't useful. Like raw lidar data, a depth map is simply pixels, and raw lidar data is simply a cloud of points. It's when you process that info to produce quantifiable output that it matters, and based on your statements, those outputs currently don't exist.
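As a rough illustration of what "processing into quantifiable output" means: back-projecting a depth map through a pinhole camera model is the first step that turns pixels into 3D points you could then cluster and fit boxes to. The intrinsics below are placeholder values, not anything from the actual cameras.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Turn a raw depth map (meters per pixel) into a 3D point cloud
    using a pinhole camera model. fx, fy are focal lengths in pixels,
    cx, cy the principal point; real values come from calibration.
    Until a step like this (plus clustering / box fitting) runs,
    depth really is "just pixels"."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coords
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)  # shape (h, w, 3)

# Tiny 4x4 depth map, everything 10 m away, toy intrinsics:
cloud = backproject(np.full((4, 4), 10.0), fx=500, fy=500, cx=2, cy=2)
```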

Show us 3D planned path prediction or driveable space prediction from other vendors if you have the samples? Those seem to be pretty important things.

Will provide more in another post.
 
3D bounding boxes are a marketing gimmick.

An object's boundaries, velocity and jerk are all that is needed.

3D bounding boxes = object boundaries?

Why do you say 3D bounding boxes are a gimmick, but you need object boundaries?

I guess I am agreeing with @Bladerskb here. You do need 3D bounding boxes.

But I don't think object height is that important.

You definitely need 3D object position along with orientation and width/length in order to get object boundaries.


In this case Tesla has a 2D box with depth, and if that depth is accurate, they could make crude 3D object boundaries from that data. However, in many cases they will end up assuming an object is larger than it actually is, which is probably not a big deal for now.
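A quick sketch of how much larger that assumption can get: bounding a rotated car footprint with an axis-aligned box (roughly what 2D box + depth supports) versus using its true oriented dimensions. All dimensions here are illustrative.

```python
import math

def footprint_area_oriented(length, width):
    """True footprint of an oriented 3D box (top-down)."""
    return length * width

def footprint_area_axis_aligned(length, width, yaw):
    """Smallest axis-aligned box that contains a car rotated by yaw.
    This is the crude boundary you get without orientation: the
    extents along each axis are the projections of both sides."""
    ext_x = abs(length * math.cos(yaw)) + abs(width * math.sin(yaw))
    ext_y = abs(length * math.sin(yaw)) + abs(width * math.cos(yaw))
    return ext_x * ext_y

# A 4.5 m x 1.8 m car seen at 45 degrees:
yaw = math.radians(45)
print(footprint_area_oriented(4.5, 1.8))           # about 8.1 m^2
print(footprint_area_axis_aligned(4.5, 1.8, yaw))  # about 19.8 m^2
```

At 45 degrees the assumed footprint is more than double the real one, which is conservative but wasteful in tight traffic.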
 
Given that a car's body follows its front wheels, an AI in a dense environment would be better off following the orientation of the front wheels rather than the whole body. For example, when a parked car wants to merge with traffic, its possible velocity cannot be predicted from its 3D bounding box.
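That is essentially the kinematic bicycle model: the front-wheel steering angle determines the future heading, which body yaw alone can't reveal. A toy sketch (all parameter values are illustrative):

```python
import math

def bicycle_step(x, y, yaw, v, steer, wheelbase, dt):
    """One step of the kinematic bicycle model. The yaw rate depends
    on the front-wheel steering angle, so a parked car with turned
    wheels has a very different predicted path than its (static)
    body orientation suggests."""
    x += v * math.cos(yaw) * dt
    y += v * math.sin(yaw) * dt
    yaw += v / wheelbase * math.tan(steer) * dt
    return x, y, yaw

# A parked car, body pointing straight (yaw = 0), wheels turned 30
# degrees, pulling out at 2 m/s; simulate 2 seconds:
state = (0.0, 0.0, 0.0)
for _ in range(20):
    state = bicycle_step(*state, v=2.0, steer=math.radians(30),
                         wheelbase=2.7, dt=0.1)
print(state)  # the heading has swung well away from straight ahead
```

A 3D box only gives the initial yaw of 0; the wheels are what reveal the swing-out.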
 
Thanks - it's a really interesting insight into how much further along the internals are than what we see.

You could just plug an SSD into the usb-c port on the side and record footage

Does this port exist on the non-dev versions? Do you think plugging in an SSD drive will be necessary for the V9 dash-cam feature, or is there sufficient high-speed storage for a usable dash cam without one? (I think a dash cam with a rolling 30-second buffer that is saved if airbags deploy is better than nothing, but I'd probably keep my existing dash cam if that is the limit of the functionality.)
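For what it's worth, a rolling buffer like that is cheap to implement. A toy sketch (the frame objects and trigger are placeholders, and this has nothing to do with Tesla's actual implementation):

```python
from collections import deque

class RollingDashcam:
    """Keep only the last N seconds of frames in memory; persist them
    when a trigger (e.g. airbag deployment) fires."""

    def __init__(self, fps=30, seconds=30):
        # deque with maxlen silently drops the oldest frame when full
        self.buffer = deque(maxlen=fps * seconds)

    def add_frame(self, frame):
        self.buffer.append(frame)

    def save_clip(self):
        # In a real system this would be written out to storage.
        return list(self.buffer)

cam = RollingDashcam(fps=2, seconds=3)  # tiny buffer for the demo
for i in range(10):
    cam.add_frame(i)
print(cam.save_clip())  # only the last 6 "frames" survive
```

The open question is only whether the onboard storage can sustain the write rate, not the buffering logic itself.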

found ways to correlate some of the metadata with real world meanings and came up with code to paint internal autopilot state (the parts we understand)

How much of the metadata does this represent? I.e. was there enough uninterpreted metadata to suggest that street signs and road markings such as give-way lines and pedestrian crossings are being recognised, or is this something that will likely need the AP3 hardware?
 
Can’t tell if you’re joking or not

The new nav miscounts roundabout exits (i.e. it will often say "take the first exit" when it means, and shows, the second exit on the screen) for about 1 in every 20 roundabouts near me in the UK. We (some UK Tesla owners) had speculated this was due to a relative lack of testing, owing to the lack of roundabouts near Tesla HQ. The comments above (joking or not!) made me wonder; a little searching found Mapping America's Resistance to Traffic Roundabouts - CityLab, which has some interesting stats.

The US has about 10x fewer roundabouts per junction than the UK, so the roundabout exit-count error would be 10x less likely to happen. But even in the US, roundabouts are not evenly spread. California actually had the 2nd-highest number of roundabouts of any state, but the author states "Where I live now in California there are literally no roundabouts" (presumably referring just to his local area, given the other statistic).
 
The new nav miscounts roundabout exits (i.e. it will often say "take the first exit" when it means, and shows, the second exit on the screen) for about 1 in every 20 roundabouts near me in the UK. We (some UK Tesla owners) had speculated this was due to a relative lack of testing, owing to the lack of roundabouts near Tesla HQ. The comments above (joking or not!) made me wonder; a little searching found Mapping America's Resistance to Traffic Roundabouts - CityLab, which has some interesting stats.

The US has about 10x fewer roundabouts per junction than the UK, so the roundabout exit-count error would be 10x less likely to happen. But even in the US, roundabouts are not evenly spread. California actually had the 2nd-highest number of roundabouts of any state, but the author states "Where I live now in California there are literally no roundabouts" (presumably referring just to his local area, given the other statistic).
I would hazard your number is way off. More like 1 roundabout for every 100,000 in the UK.
(Metrics are covered in that article, thanks.)
The UK itself IS a roundabout. :)
There are several in SWFL (Southwest Florida) (more than I have ever seen for all the places I've lived in the US) and yes, I've noticed the exit count is off.
 
Great work @verygreen and @DamianXVI - the whole Tesla enthusiast community is indebted to you!

so what's the importance of the 3D bounding box @Bladerskb ? you just need to know where you cannot drive on your side of the obstacle, who cares how it's shaped on the other sides anyway (in other words you can skin this cat in other ways). and they do this with driveable space, the bounding boxes are just for information more or less and it's possible (though unlikely) that they can do 3D bounding boxes, there's depth information, just no orientation vector to draw the 3d part.

Show us 3D planned path prediction or driveable space prediction from other vendors if you have the samples? Those seem to be pretty important things.

One thing occurred to me just today.

I'm running 2018.32.4 040c866 and I have noticed that its car recognition has improved compared to older versions. It actually seems to show in-lane and next-lane cars on the IC pretty accurately, although of course it does not differentiate between vehicle types like AP1 does. But it is pretty fluid and accurate IMO.

One thing that is special is that it clearly uses the fisheye (and/or radar) to see a wider angle than AP1-like narrow angle would be capable of. I was driving, in traffic, on a winding road and it actually displayed the queue of cars in front of me, that spanned a significant portion of my FoV towards the left - with the lane turning so sharply, the cars spanned a range much wider than what the main/narrow cameras on AP2 (or the AP1 camera) could see... So far so good, pretty impressive actually.

But here comes my point: It displayed all those cars in a correct arc, in their correct places, but it was unable to recognize the lane's angle, so they were "hanging in air". Without the lane approximation or movement to guide the IC's display system, and clearly lacking or ignoring any 3D data that might (or might not) exist, it simply defaulted to all the cars facing my driving direction. So the queue of stopped cars - winding/queueing towards the left in reality - ended up looking like a bunch of cars (vertically) parked next to each other in a left-winding arc.

3D boxes would help with such displays - and of course with any decision-making that might rely on the direction and angle of other cars.
The "toy car" orientations on the IC sort of indicate that there might be some 3D understanding of the different "vehicles" going on :) No?

I've always thought it basically just guesses based on the lane angle (which in itself is very much only an approximation on the IC), though I admit the "toy cars" sometimes flip-flop directions on the IC, which might suggest it is trying to do more than that. I guess it might also use their movement as a guide, which wasn't available with a queue stopped in traffic, but that doesn't seem very reliable IMO.
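If it really does guess from lane angle, the fallback could be as simple as taking the tangent of the lane polyline at the car's position. That is pure speculation, sketched here with made-up points:

```python
import math

def lane_tangent_heading(polyline, i):
    """Guess a displayed car's heading from the tangent of the lane
    polyline segment nearest to it: the kind of fallback a display
    might use when no orientation data exists. polyline is a list of
    (x, y) points; i indexes the segment."""
    (x0, y0), (x1, y1) = polyline[i], polyline[i + 1]
    return math.atan2(y1 - y0, x1 - x0)

# A lane winding to the left; headings follow the curve:
lane = [(0, 0), (0, 10), (-3, 20), (-8, 28)]
print(math.degrees(lane_tangent_heading(lane, 0)))  # straight ahead
print(math.degrees(lane_tangent_heading(lane, 2)))  # angled left
```

It would explain the flip-flopping: any jitter in the lane-line approximation translates directly into jitter in the guessed car orientations.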

Please keep up the good research and discussion, guys! Thank you.
 