“Behind!” How Are Computers That Understand Our Visual World Like Chefs Who Eat Their Own Food?
In a recent article that caught my eye, Rachel Gordon, Communications and Media Relations Officer at MIT’s CSAIL, details recent exciting developments in computer vision. In her article, “Computer vision system marries image recognition and generation,” she notes, “Computers possess two remarkable capabilities with respect to images: They can both identify them and generate them anew.” Rachel draws the analogy of a chef (creator of dishes) and a connoisseur (knowledgeable taster of dishes) who share a common understanding of the taste of food but operate in very distinct capacities. She asks, what would it mean to unify them?
Professional kitchens are adrenaline-fueled, eye-popping visual environments where computer vision-like requirements are necessary for a chef’s success. (Check out these videos to see how Plainsight helps restaurants understand their visual world). If you’ve never worked in a restaurant, you need only to turn to popular television shows for evidence. In “The Bear”, the chaos of a professional kitchen is brought to order with learning and improvement, standardized procedures, common language, attention to detail, knowledge of all things at all times within the environment, and perhaps most importantly, resourceful filling in of any blanks. Tune in to Food Network to feast on the irresistible recipe of chef competitors, surprise ingredients, a ticking clock, and a panel of strict judges. Chefs are urged to taste as they cook–not doing so is considered a rookie mistake. But, rarely do we see them get the chance to eat their own masterpieces, or potential culinary disasters (really, what can we expect from a mystery basket of geoduck, fiddlehead ferns, and squid ink?).
In professional kitchens, CV-like requirements are necessary for a chef’s success.
Across industries and vision AI use cases, dual-purpose capabilities produce enormous benefits. Rachel Gordon reports that researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a system that combines both image recognition and image generation. Masked Generative Encoder (MAGE) can fill in missing parts of an image while accurately understanding the image’s content. MAGE achieves this by converting images into compact, abstracted versions called “semantic tokens,” which represent small sections of the original image.
These tokens are used for complex processing tasks while preserving the original image information. MAGE uses a technique called “masked token modeling” to randomly hide some tokens and train a neural network to fill in the gaps. This allows MAGE to both recognize patterns in images and generate new ones, facilitating high accuracy with few labeled examples.
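To make the masking step concrete, here is a minimal Python sketch of the idea of hiding random tokens and scoring how well they are filled back in. It is an illustration only, not MAGE’s implementation: the token ids, the `MASK` sentinel, the 50% mask ratio, and the helper names `mask_tokens` and `reconstruction_accuracy` are all assumptions made for this example, and the neural network that would actually predict the hidden tokens is left out.

```python
import random

MASK = -1  # hypothetical sentinel id standing in for a "hidden token" marker


def mask_tokens(tokens, mask_ratio, rng):
    """Hide a random subset of token ids.

    Returns the masked sequence plus a dict mapping each hidden
    position to the original token a model would have to predict.
    """
    n_mask = min(len(tokens), max(1, round(len(tokens) * mask_ratio)))
    hidden = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK if i in hidden else t for i, t in enumerate(tokens)]
    targets = {i: tokens[i] for i in hidden}
    return masked, targets


def reconstruction_accuracy(predicted, targets):
    """Fraction of hidden positions that were filled in correctly."""
    if not targets:
        return 0.0
    correct = sum(predicted[i] == t for i, t in targets.items())
    return correct / len(targets)


# A made-up 4x4 grid of token ids, flattened row by row (illustrative values).
tokens = [17, 4, 4, 92, 17, 8, 8, 92, 5, 8, 8, 33, 5, 61, 61, 33]

# Hide half of the tokens; a fixed seed keeps the example repeatable.
masked, targets = mask_tokens(tokens, mask_ratio=0.5, rng=random.Random(0))

# A perfect "model" would recover every hidden token; leaving the masks
# in place recovers none of them.
perfect_score = reconstruction_accuracy(tokens, targets)
untouched_score = reconstruction_accuracy(masked, targets)
```

In a real system, the network’s predictions for the hidden positions would take the place of `tokens` in the scoring call, and training would push that score upward.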
Professional chefs just don’t have time to manage the kitchen, create their dishes, and be connoisseurs of their own food as well. Computer vision is different: combined with generative AI and LLMs, and backed by growing computing power and capacity, it can take on a multitude of capabilities at once, helping us gain a deeper understanding of our visual world.
Protecting See-Birds: Computer Vision Monitoring Helps Understand Puffin Populations So They Can Thrive
Computer vision is now helping researchers track and understand 46,000 breeding pairs of puffins. Scotland’s Isle of May is home to many seabirds, human researchers, and now CV-powered surveillance. Researchers are identifying each puffin with facial recognition and tracking the birds’ movements and behaviors throughout their four-month onshore breeding season.
Puffins on the Isle of May. Photo credit: SkyNews
Seabirds face multiple threats: steep swings in water temperature; bird flu, which has decimated some colonies; and commercial fishing, particularly of sand eels, the puffin’s favorite food. Wind turbines are a new potential threat.
The researchers’ study can help determine the potential impact a wind farm might have on the seabirds and enable renewable energy companies to proactively address threats. “I think it’s hugely important and plays a really big role when it comes to conservation and sustainability. It helps to unravel a lot of the complexity that conservation brings. Having the opportunity to track, see and gather those insights… means that we can make better decisions in terms of how we go to protect biodiversity as a whole,” said Musidora Jorgensen, Chief Sustainability Officer.
Computer vision is emerging as nature’s defender. Learn how Plainsight vision AI solutions are assisting MarineSitu in providing renewable energy providers with monitoring capabilities that help them understand the impact of underwater turbines on aquatic life and marine ecosystems.
Accurately Counting The World’s Largest Bat Colony, Africa’s Secret Gardeners
While on the subject of protecting and understanding nature, we turn to Africa’s largest bat colony. Now, for those of us who have experienced bat encounters of the close kind, this may take a moment to sink in. The size of this colony was never reliably known, and estimates ranged as high as 10 million.
Every November, a small forest in Zambia becomes the site of one of the world’s greatest natural spectacles. Straw-colored fruit bats migrate from all over the continent to a patch of trees in Kasanka National Park. For reasons not yet known, the bats spend three months there as the largest colony of bats anywhere in Africa.
Science Daily provides details of a new method to accurately count the bats using GoPro cameras and computer vision. With this method, the colony size is now estimated to be between 750,000 and 1,000,000 bats. Personally, I can see how humans might estimate 1M bats as 10M, which is a perfect reason for computer vision to assist. In all seriousness, the importance of protecting bats cannot be overstated. Understanding the population dynamics of straw-colored fruit bats and monitoring their numbers is vital for conservation efforts in the context of human development and climate change. These bats are significant contributors to reforestation because they disperse seeds across large distances, making them a crucial species in the African ecosystem. By understanding changes in their population, we can help maintain ecosystem services and protect biodiversity.
Last week, we announced our patent for livestock counting, identification, and tracking. At Plainsight, we fully understand how important accurate animal counts are to understanding a population efficiently and consistently.
Threads Launches and Goes From 0-to-70 Million Sign Ups in One Day
“Life is a great tapestry. The individual is only an insignificant thread in an immense and miraculous pattern.” ~ Albert Einstein
Meta’s new text-based social media platform and Twitter competitor, Threads, has grown rapidly since its public launch, helped by an influx of users from Instagram, which already boasts a massive user base. Within just one full day of its debut, Threads garnered a staggering 70 million sign-ups, according to Meta CEO Mark Zuckerberg. Analysts suggest that if only a quarter of Instagram users adopt Threads on a monthly basis, it could rival the size and reach of Twitter. This early growth points to the impact Threads could have on the social media landscape. Meta hired many former Twitter employees, and now Elon Musk (via lawyers) is accusing the company of “unlawful misappropriation” of trade secrets.
Threads is a platform that allows you to publish short posts or updates that are up to 500 characters. You can include links, photos or videos up to 5 minutes long. The app is linked to your Instagram account, and according to Meta, you can “easily share a Threads post to your Instagram story, or share your post as a link on any other platform you choose.” The core accessibility features available on Instagram today, such as screen reader support and AI-generated image descriptions, are also enabled on Threads.
About the Author & Plainsight
Joan Silver is the SVP of Marketing at Plainsight and oversees the full scope of marketing and communications for the company. Propelling multiple companies from startup and launch, through funding rounds, to successful IPOs and acquisitions, Joan builds brands from big ideas to big time. A trailblazer in the B2B digital marketing industry, she’s a believer in the transformational power of AI and loves technology that solves problems and improves our daily lives.
Plainsight provides the unique combination of AI strategy, a vision AI platform, and deep learning expertise to develop, implement, and oversee transformative computer vision solutions for enterprises. Through the widest breadth of managed services and a vision AI platform for centralized processes and standardized pipelines, Plainsight makes computer vision repeatable and accountable across all enterprise vision AI initiatives. Plainsight solves problems where others have failed and empowers businesses across industries to realize the full potential of their visual data with the lowest barriers to production, fastest value generation, and monitoring for long-term success. For more information, visit plainsight.ai.