
Tech-wonky Q about autonomy and chip sets


AudubonB

It is painfully obvious to me that a large number of you are supremely more knowledgeable about today's computers and electronics than I ever will be. Fortunately, you also all are compassionate, well brought-up chappies who delight in explaining in comprehensible terms your stores of knowledge to those of us less fortunate.

That buttering-up taken care of, would some of you care to turn into English the following comment about Nvidia Corp regarding their future in vehicular navigation systems, and in particular how it redounds to Tesla? This is from Alex Cho. Thanks.

the logic silicon needed to run all of those complex algorithms seems uniquely situated for CUDA based API, which is proprietary only to Nvidia.
 
Not sure if this is the right level of detail for an answer to your question, so feedback is welcome...

I assume you mean Alex Cho from SeekingAlpha? If so, I suspect that what Alex is referring to is the fact that NVidia provides the 3 x Tegra 3 computers that power the center console, driver console, etc. NVidia is primarily a graphics card (GPU) company. Graphics cards these days have tremendous computational ability, orders of magnitude more throughput than CPUs for highly parallel work, and they can do things other than graphics. NVidia has created a programming platform called CUDA which allows a graphics card to do the kinds of processing work that you might normally use a CPU for. This is commonly called GPU Compute or GPGPU (General-Purpose computing on a GPU).
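
To make that a bit more concrete, here is a toy sketch of what GPU compute looks like in CUDA. Nothing here comes from Tesla's or NVidia's actual automotive code; the names and sizes are made up for illustration. The idea is that you write one tiny function, called a kernel, and the card runs thousands of copies of it at once, each copy working on a different piece of the data.

```cuda
// Toy sketch: use the graphics card to square a million numbers.
// Everything here (names, sizes) is made up for illustration.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// The "kernel": every GPU thread squares exactly one element of the array.
__global__ void squareAll(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // which element is mine?
    if (i < n) {
        out[i] = in[i] * in[i];
    }
}

int main() {
    const int n = 1 << 20;                      // about a million numbers
    const size_t bytes = n * sizeof(float);

    // Ordinary CPU-side arrays.
    float* hostIn  = (float*)malloc(bytes);
    float* hostOut = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) hostIn[i] = (float)i;

    // The card has its own memory, so the data has to be copied across.
    float *devIn, *devOut;
    cudaMalloc((void**)&devIn, bytes);
    cudaMalloc((void**)&devOut, bytes);
    cudaMemcpy(devIn, hostIn, bytes, cudaMemcpyHostToDevice);

    // Launch enough threads to cover every element, 256 per block.
    const int threadsPerBlock = 256;
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    squareAll<<<blocks, threadsPerBlock>>>(devIn, devOut, n);

    // Copy the results back and spot-check one value.
    cudaMemcpy(hostOut, devOut, bytes, cudaMemcpyDeviceToHost);
    printf("1000 squared = %.0f\n", hostOut[1000]);

    cudaFree(devIn);
    cudaFree(devOut);
    free(hostIn);
    free(hostOut);
    return 0;
}
```

All the copying back and forth between the computer's memory and the card's memory is part of why people say GPU programming takes some getting used to.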

For completeness it probably makes sense to point out that there are vendor-neutral alternatives: the Khronos Group (the same consortium behind OpenGL) provides OpenCL, and Microsoft provides DirectCompute.
 
CUDA is a programming system that NVidia (a company that builds video cards) created. The idea is that a normal computer can only do one thing at a time (actually some can do up to 10 or so), though that one thing may be pretty complex, whereas a system that uses CUDA can do many things at once, typically a few thousand, albeit each of those things is simpler. For many applications, such as machine vision, it turns out that doing lots of simple operations at once beats the pants off doing one complex thing at a time.

As a slightly deeper dive into this, one of the big limitations of using the CUDA system, or any other similar system that runs on a video card, is that while it can do a few thousand operations at once, they all have to be the same operation. For example, it can have an operation that adds two numbers together running a thousand times at once, but it can't have 500 operations doing an add, and another 500 operations doing a multiply at the same time. All the simultaneous operations have to be the same, e.g. adds, but they decidedly don't have to be the same set of numbers.

This may seem like a serious issue, but actually, when you think about trying to analyze an image, a lot of the work is identical for every pixel, so rather than working on a million pixels one at a time, the CUDA system can work on a few thousand at a time. It does make the programming pretty tricky, but truth to tell, it's really a lot of fun once you get your head around it.

In playing with this on a pretty old NVidia card, I was able to speed up a process by a factor of about 200 by switching it from the main CPU that could only run 8 simultaneous operations to the video card that could run 1024 simultaneously.
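
As a made-up illustration of that "same operation on every pixel" point (not taken from any real vision system), here is a CUDA kernel that converts an RGB frame to grayscale. Every thread executes the identical arithmetic; the only thing that differs from thread to thread is which pixel it touches.

```cuda
// Toy sketch: one GPU thread per pixel, all running the identical arithmetic.
// The frame here is synthetic; nothing is taken from any real vision stack.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Convert one RGB pixel to grayscale using the standard luminance weights.
// Every thread runs this exact same formula; only the pixel differs.
__global__ void rgbToGray(const unsigned char* rgb, unsigned char* gray,
                          int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        int p = (y * width + x) * 3;            // this thread's pixel
        gray[y * width + x] = (unsigned char)(0.299f * rgb[p] +
                                              0.587f * rgb[p + 1] +
                                              0.114f * rgb[p + 2]);
    }
}

int main() {
    const int w = 640, h = 480;                 // roughly 300,000 pixels

    // Fill a fake RGB frame on the CPU side.
    unsigned char* hostRgb  = (unsigned char*)malloc(w * h * 3);
    unsigned char* hostGray = (unsigned char*)malloc(w * h);
    for (int i = 0; i < w * h * 3; ++i) hostRgb[i] = (unsigned char)(i % 256);

    // Copy the frame to the card, run one thread per pixel, copy the result back.
    unsigned char *devRgb, *devGray;
    cudaMalloc((void**)&devRgb, w * h * 3);
    cudaMalloc((void**)&devGray, w * h);
    cudaMemcpy(devRgb, hostRgb, w * h * 3, cudaMemcpyHostToDevice);

    dim3 block(16, 16);                         // 256 threads per block
    dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);
    rgbToGray<<<grid, block>>>(devRgb, devGray, w, h);

    cudaMemcpy(hostGray, devGray, w * h, cudaMemcpyDeviceToHost);
    printf("first gray pixel = %d\n", hostGray[0]);

    cudaFree(devRgb);
    cudaFree(devGray);
    free(hostRgb);
    free(hostGray);
    return 0;
}
```

On a CPU you would walk those 300,000 pixels in a loop, one at a time; here the card chews through them a few thousand at a time.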

One final comment. CUDA is NVidia's proprietary way to do this highly parallel (simultaneous) programming; however, it's not the only one. There is an industry-standard system called OpenCL (from the same consortium that maintains OpenGL) that does essentially the same thing and is supported by pretty much all the companies that make such hardware, including NVidia.
 
My attempt to turn it into English (but you know, YMMV): If you use certain nVidia microchips, then you really need to use CUDA as the way you interact with the features of those chips. If you don't, you'll mostly be sad: you'll waste resources and time, and you won't be happy with the end result, since you won't get a rich feature set or may lack some features altogether because it just isn't possible. Without CUDA you'd essentially have to call the chip's resources directly, and you may just end up building your own CUDA (a.k.a. not a short process).

Using CUDA isn't all fluffy bunnies, though, and it isn't free.
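
To give a flavour of the "not all fluffy bunnies" part, here is a small made-up sketch (the check() helper is my own boilerplate, not part of CUDA) of the housekeeping CUDA expects from you: you manage the card's memory by hand, and every runtime call hands back an error code you're supposed to check.

```cuda
// Toy sketch of the housekeeping side of CUDA: explicit memory management and
// an error code from every call. The check() helper is hypothetical boilerplate.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA call reported a problem.
static void check(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(1);
    }
}

int main() {
    float* devBuf = nullptr;
    check(cudaMalloc((void**)&devBuf, 1024 * sizeof(float)), "cudaMalloc");

    float host[1024] = {0};
    check(cudaMemcpy(devBuf, host, sizeof(host), cudaMemcpyHostToDevice),
          "cudaMemcpy to device");

    // ...kernel launches would go here, each followed by a check on
    // cudaGetLastError() to catch launch failures...

    check(cudaFree(devBuf), "cudaFree");
    return 0;
}
```

None of it is hard exactly, but it's a lot of ceremony compared to plain CPU code, which is part of what "isn't free" ends up meaning in practice, quite apart from the vendor lock-in.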

So, did you get the info you were after? Or is there more you are interested in?
 
Not quite correct. NVidia supports OpenCL, which is an industry standard.

So basically the situation is this:
1) If you have NVidia then you can use either CUDA or OpenCL.
2) If you have another graphics card then you have to use OpenCL.

OpenCL does not have _all_ the features of CUDA and is somewhat less optimized on NVidia, but it's entirely adequate unless you're trying to squeeze out every percent of speed.

(I'm leaving out Vulkan, Metal, CUDA-to-OpenCL translators and other stuff for simplicity)
 
Nvidia (mostly CUDA, also OpenCL) and AMD (OpenCL) offer parallel computing capability based on computer graphics chips. These processors are now found in supercomputers used for highly parallelizable tasks like weather forecasting. Mobileye's EyeQ chips, which are used for video processing in Tesla's AP, are roughly similar. I recall hearing Mobileye claim that their architecture is more specialized for running video-processing algorithms than Nvidia's or AMD's processors, which offer more general-purpose parallelism. This might offer a cost advantage. The real difference is that Mobileye has a more vertically integrated model, offering both processors and the video-processing software they run. AMD and Nvidia are focused on offering processors and programming tools for others to use.

At least that's how I understand it.
 
I am thinking of beginning a thread devoted solely to Tesla-related acronyms and abbreviations (automotive, electronics, computer/logic, regulatory and the like); the idea is to provide a cogent answer as to what, for example, "CUDA" not only stands for but also what it implies. It probably will have to go in the Off Topic category - but here is a heads-up for those of you who would like to contribute. Thanks in advance.
 
The thing is that CPUs and GPUs are kind of general purpose, although they do their work somewhat differently, while devices like ASICs are really very specialized and as such can be much faster at a limited variety of problems. The Intel Quick Sync video encoder/decoder hardware is another example. We've gone from very general-purpose CPUs such as the Intel Pentium, programmable in a host of languages, to somewhat specialized but still pretty generalized GPUs programmed using CUDA, OpenCL and OpenGL, to highly specialized ASICs which really are largely programmed when the hardware is designed.

I'm wondering if what we're going to be seeing in the near future is more of these very specialized hardware devices closely attached to more general-purpose CPUs to do things like machine vision and other autonomous-driving tasks. Back to the age of "Silicon Compilers", with the application being largely built into the chips. It's not all that crazy to imagine a world where a software update is done by replacing the processing unit on a tiny daughter card. Forget Firmware, we're talking Stoneware.