My job went to Nvidia

… and all I got was this lousy blog entry.

For the title of this blog entry, I was reminded of Chad Fowler's book "My Job Went to India". I read its successor, The Passionate Programmer, as it was discussed on the mailing list during the initial days of the software craft movement. Why did so many jobs move to Nvidia, the graphics processor company? Or did they? Let's find out.

At the moment, it seems that every company out there is innovating hard on AI. It's not just the executive suites playing with it; they also ask the technology side to build it into their products. New job titles like "prompt engineer" show up on the job market, and one of the big beneficiaries of all the hype is the maker of the hardware so many LLM providers run on: Nvidia, whose CUDA-core technology was first used on graphics processors to render 3D game worlds, and whose computing power now sits behind these models.

Don’t get me wrong, I’m not trying to run an advertisement here.

Back in my university days, I first worked on an approach to detect faces in a stream of images. Real-time processing was our goal, meaning we wanted to detect all faces at a rate of at least 30 frames per second. We did some research and decided to implement an offline learning algorithm called AdaBoost, based on work published by some folks at Intel. We collected face sample images and non-face sample images and fed them into our training algorithm. We were allowed to use some faculty machines, so with training spread across eight or so computers, training the resulting detector took about a week. After that, we could evaluate the detector's performance on an evaluation set of images that was not part of the original training set, to get a feel for its false positive and false negative rates, and even try it out on some live camera images.
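Just for illustration, here is a minimal sketch of that kind of pipeline using today's off-the-shelf tools (scikit-learn's AdaBoostClassifier), not our original implementation; the data below is a random placeholder and assumes the face and non-face samples have already been turned into fixed-length feature vectors.

```python
# Minimal sketch of an AdaBoost-style detector pipeline, assuming the images
# have already been turned into fixed-length feature vectors (the data below
# is a random placeholder, not real face data).
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 100))    # 2000 samples, 100 features each
y = rng.integers(0, 2, size=2000)   # 1 = face, 0 = non-face

# Keep an evaluation set apart that is not part of the training set.
X_train, X_eval, y_train, y_eval = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Boosting combines many weak classifiers (decision stumps) into one detector.
detector = AdaBoostClassifier(n_estimators=200)
detector.fit(X_train, y_train)

# False positive / false negative rates on the held-out evaluation set.
tn, fp, fn, tp = confusion_matrix(y_eval, detector.predict(X_eval)).ravel()
print(f"false positive rate: {fp / (fp + tn):.2%}")
print(f"false negative rate: {fn / (fn + tp):.2%}")
```

Back then, the expensive part was running many boosting rounds over thousands of samples, which is why eight machines and a week were needed; the evaluation step at the end looks essentially the same today.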

That was back in 2002-2003 on a Pentium III 800 processor.

Later on, we used the same training and detection algorithm as part of the "Visual Active Memory Processing and Interactive Retrieval" (yes, V.A.M.P.I.Re.) project to detect usual office supplies like a keyboard, a mouse, a TESA roller, and so on. We had to deal with a camera mounted on a helmet and the resulting views of the office supplies.

Even later, I wrote my diploma thesis on hand gesture detection in a robot-human interaction scenario.

I recall marveling in 2010 that, at that point in time, my telephone was able to do face detection in real time and still fit into my pocket.

That's when I caught a glimpse of Moore's Law in effect.

Skip forward to today, and computing power has increased to the degree that large language models can now be trained in a data center or, as DeepSeek showed us, even at home on your very own GPU.

The downside?

Everyone plays around with some use case inside the larger field that is now called AI. Heck, it wasn't even called AI back in my day. It was just a means to detect certain aspects in an image.

So… did my job, your job, anyone's job move to Nvidia?

I don't think so. The problem with the underlying learning algorithms lies, to some extent, in a detector or a generator becoming too over-specialized. With so much of the text out there now being generated by those LLMs, the next rounds of training data will make it hard to produce texts that are not overly specific, and I also find them boring. There are several parameters at play that a human needs to watch during training. For example, in the face detection scenario we had trouble detecting white faces and black faces with the same reliability. Usually you could get reliable detection for one group, but not for the other at the same time. In the years to come, the resulting LLMs will be trained on too many LLM-generated texts to produce output that is still worthwhile to read. Probably something similar will happen with image generation.
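To make that concrete, here is a small sketch of the kind of check a human still needs to run: computing error rates per group on the evaluation set, assuming each sample carries a group label. The names are hypothetical, not taken from our old project.

```python
# Sketch: report false positive / false negative rates separately per group,
# to spot a detector that is reliable for one group but not the other.
import numpy as np
from sklearn.metrics import confusion_matrix

def per_group_error_rates(y_true, y_pred, groups):
    rates = {}
    for g in np.unique(groups):
        mask = groups == g
        tn, fp, fn, tp = confusion_matrix(
            y_true[mask], y_pred[mask], labels=[0, 1]
        ).ravel()
        rates[g] = {
            "false_positive_rate": fp / (fp + tn) if (fp + tn) else float("nan"),
            "false_negative_rate": fn / (fn + tp) if (fn + tp) else float("nan"),
        }
    return rates

# Hypothetical usage with dummy labels:
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 0])
groups = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
print(per_group_error_rates(y_true, y_pred, groups))
```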

Personally, I like Taiichi Ohno's idea of "automation with a human touch" when I think about tools like these. It was the same with test automation, and now it's the same with LLMs. They may help make tedious tasks easier, but I still want to double-check their results. For that, we still need our skills from the old days, but we can get to some results quicker. I'm not sure whether those results will always be better, though.

Take-away?

Hone your skills, train them, and get better at what we humans do best: learning. With a one-week training cycle, we had quite a long feedback loop for our face detectors. Nowadays you might get a much shorter one. That means you can learn faster, though you will still need to rely on your skills to double-check the results rather than taking good quality for granted. And that will stay.

What are your thoughts?

This text was human-generated.
