Dog? Pig? or Loaf of Bread? Computer Vision and Object Detection

When you think about it, computer vision has actually had quite an illustrious film career. From science fiction to cartoons, the concept of smart machines that can see has captivated the imaginations of audiences for decades. In Netflix’s new animated film The Mitchells vs. The Machines, the Mitchells embark on a road trip only to find themselves the last line of defense in the robot apocalypse! **Spoilers ahead** Luckily, they discover a secret weapon: the family dog. Monchi proves to be man’s best friend and helps to defeat the evil robot army in the most hilarious way. As self-proclaimed computer vision and machine-learning geeks, we thought it would be fun to break down the Mitchells’ “Monchi Defense”.

The Monchi Defense: 

The Mitchells discover that their dog is able to fool the robot sentries’ computer vision system, causing the robots to malfunction. So they strap him to the front of their station wagon and plow their way through the robot horde.

Normally, computer vision algorithm errors wouldn’t make us laugh, but the scenario in this film is at once absurd and real, and we couldn’t help it. Monchi, presumably a beleaguered old pug, cannot be identified by the robots and instead causes an object detection malfunction as the system cycles between recognizing him as a dog, a pig, and a loaf of bread. This glitch in the CV algorithm results in the hilarious takedown of robots in the film, but misidentifying objects can actually be a real problem with computer vision models! It’s important to build robust training datasets that take into account the variations a model might see.

What does it teach us about computer vision? 

Teaching a computer to recognize one dog breed in a controlled environment is relatively simple, but recognizing all dog breeds in multiple environments would be a massive project. Dogs really can be a complicated class to predict since there are so many variations of breeds and sizes! The hypothetical dataset used to train the robots could have been low on pugs specifically, leaving the model with low confidence when choosing between dog, pig, and bread.
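
For the curious, here is a minimal Python sketch of what that kind of uncertainty can look like in numbers. Everything in it is an assumption made up for illustration: the raw scores for Monchi, the 0.5 confidence threshold, the breed counts, and the helper names (`softmax`, `top_prediction`, `is_ambiguous`, `underrepresented`). It is not the robots' actual model, nor any particular library's API. It shows two things: how near-equal class scores turn into a low-confidence prediction, and how a simple count of labels can reveal which breeds a dataset is short on.

```python
import math
from collections import Counter


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(labels, logits):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]


def is_ambiguous(confidence, threshold=0.5):
    """Flag a prediction whose top probability is below the threshold.

    The 0.5 default is an arbitrary illustrative choice.
    """
    return confidence < threshold


def underrepresented(label_counts, min_share=0.05):
    """List labels that make up less than min_share of the dataset."""
    total = sum(label_counts.values())
    return sorted(
        name for name, count in label_counts.items() if count / total < min_share
    )


# Hypothetical raw scores for one image of Monchi: nearly tied.
LABELS = ["dog", "pig", "loaf of bread"]
monchi_logits = [1.1, 1.0, 0.9]

label, confidence = top_prediction(LABELS, monchi_logits)
print(f"Top guess: {label} ({confidence:.2f})")
print("Ambiguous prediction:", is_ambiguous(confidence))

# Hypothetical label counts for a dog dataset that is short on pugs.
breed_counts = Counter({"labrador": 900, "beagle": 700, "poodle": 380, "pug": 20})
print("Needs more examples:", underrepresented(breed_counts))
```

With scores this close, the top guess wins with only a little over a third of the probability, which is the numeric version of a model that cannot decide between dog, pig, and bread. The label count points at the likely fix: collect and annotate more pugs.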

We promise we aren’t on the side of the robots, but it looks like their machine learning engineers need a little help improving their doggo dataset. We can recommend a great annotation platform. Check out how you can create better datasets faster with Sense Data Annotation!

More Plainsight Blog Posts:

SmartML: Train Vision AI Models with One Click

SmartML is Plainsight’s proprietary model training toolset. It is at the heart of Plainsight’s promise to make vision AI’s impressive capabilities more accessible to users of all experience levels and enterprises of all types. Thanks to SmartML, Plainsight users can use their labeled datasets to train vision AI models without the need for a single line of code.

TrackForward: Faster Frame-by-Frame Video Labeling

TrackForward is a Plainsight AI-powered labeling feature that can dramatically reduce the time and effort required for labeling video data. Using AI, TrackForward analyzes the labels in one frame of a video to predictively label objects in subsequent frames. By labeling an object with either a Bounding Box or Polygon and selecting the TrackForward tool, Plainsight users can quickly generate labels automatically for desired objects across entire videos.

Computer Vision and Retail’s Digital Transformation

When it comes to digital transformation, retailers are no longer asking, “What if?” They’re saying the time is now. For the retail industry in particular, integrating vision AI applications into traditional store operations can open up a world of fresh insights that empower better decision making.