Dog? Pig? or Loaf of Bread? Computer Vision and Object Detection

When you think about it, computer vision has had quite an illustrious film career. From science fiction to cartoons, the concept of smart machines that can see has captivated the imaginations of audiences for decades. In Netflix’s new animated film The Mitchells vs. The Machines, the Mitchells embark on a road trip only to find themselves the last line of defense in the robot apocalypse! **Spoilers ahead** Luckily, they discover a secret weapon: the family dog. Monchi proves to be man’s best friend and helps defeat the evil robot army in the most hilarious way. As self-proclaimed computer vision and machine-learning geeks, we thought it would be fun to break down the Mitchells’ “Monchi Defense”.

The Monchi Defense: 

The Mitchells discover that their dog is able to fool the robot sentries’ computer vision system, causing the robots to malfunction. So they strap him to the front of their station wagon and plow their way through the robot horde.

Normally, computer vision algorithm errors wouldn’t make us laugh, but the scenario in this film is at once absurd and real, and we couldn’t help it. Monchi, presumably a beleaguered old pug, cannot be identified by the robots and instead causes an object detection malfunction as the computer cycles between recognizing him as a dog, a pig, and a loaf of bread. This glitch in the CV algorithm results in the hilarious takedown of robots in the film, but misidentifying objects can actually be a real problem with computer vision models! It’s important to build robust training datasets that take into account the variations a model might see.
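For the curious, here is a minimal sketch of what "flip-flopping between dog, pig, and loaf of bread" can look like in numbers. It is not how the film's robots (or any particular product) work: the scores, the 0.6 threshold, and the `classify` helper are all made up for illustration. The idea is simply that when a model's class scores are nearly tied, no single label earns much confidence, so a sensible system reports "uncertain" instead of guessing.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, labels, min_confidence=0.6):
    """Return (label, confidence), or ("uncertain", confidence) if the top class is too weak.

    min_confidence is a hypothetical threshold chosen for this example.
    """
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    if probs[best] < min_confidence:
        return "uncertain", probs[best]
    return labels[best], probs[best]

LABELS = ["dog", "pig", "loaf of bread"]

# Hypothetical scores: a nearly three-way tie, like the robots' view of Monchi.
monchi = classify([1.1, 1.0, 0.9], LABELS)

# Hypothetical scores for an easy, well-represented dog.
clear_dog = classify([4.0, 0.5, 0.2], LABELS)

print(monchi[0], round(monchi[1], 2))
print(clear_dog[0], round(clear_dog[1], 2))
```

With these invented scores, the near-tie falls below the threshold and is flagged as uncertain, while the clear case is confidently labeled a dog.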

What does it teach us about computer vision? 

Teaching a computer to recognize one dog breed in a controlled environment is relatively simple, but recognizing all dog breeds in multiple environments would be a massive project. Dogs really can be a complicated class to predict since there are so many variations in breed and size! The hypothetical dataset used to train the robots could have been low on pugs specifically, leaving the model with low confidence when choosing between dog, pig, and bread.
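One quick way to catch a pug shortage before the robot apocalypse is to count how many labeled examples each breed has. The sketch below is only an illustration: the breed names, the counts, the 5% cutoff, and the `underrepresented` helper are all invented for this example, and a real dataset would come from your annotation export rather than a hard-coded list.

```python
from collections import Counter

def underrepresented(labels, min_share=0.05):
    """Return the classes that make up less than min_share of all labels.

    min_share is a hypothetical cutoff; pick one that suits your project.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return sorted(name for name, n in counts.items() if n / total < min_share)

# Made-up annotation labels: plenty of labradors and poodles, very few pugs.
annotations = ["labrador"] * 60 + ["poodle"] * 37 + ["pug"] * 3

print(underrepresented(annotations))
```

Any breed that shows up in the result is a good candidate for collecting and labeling more images, in more poses, lighting conditions, and backgrounds.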

We promise we aren’t on the side of the robots, but it looks like their machine learning engineers need a little help improving their doggo dataset. We can recommend a great annotation platform. Check out how you can create better datasets faster with Sense Data Annotation!

More Plainsight Blog Posts:

Vision AI Use Cases for the Energy Sector: Monitor Storage Tanks & Detect VOC Leaks

Energy providers have not only taken action to address the causes of leaks, but set ambitious goals to eliminate or offset their emissions altogether. The sector has undoubtedly grown both safer and more sustainable in the years since the above EPA statistics were first published. Still, without the added capabilities of vision AI-enhanced processes, they’re potentially missing opportunities for improvement and letting issues like leaks go undetected.

SmartML: Train Vision AI Models with One Click

SmartML is Plainsight’s proprietary model training toolset. It is at the heart of Plainsight’s promise to make vision AI’s impressive capabilities more accessible to users of all experience levels and enterprises of all types. Thanks to SmartML, Plainsight users can use their labeled datasets to train vision AI models without the need for a single line of code.

TrackForward: Faster Frame-by-Frame Video Labeling

TrackForward is a Plainsight AI-powered labeling feature that can dramatically reduce the time and effort required for labeling video data. Using AI, TrackForward analyzes the labels in one frame of a video to predictively label objects in subsequent frames. By labeling an object with either a Bounding Box or Polygon and selecting the TrackForward tool, Plainsight users can quickly generate labels automatically for desired objects across entire videos.