Last week’s look at AI and ML news includes: 2D Video Takes On A New Dimension for 3D Creators, White House Announces 2-Year AI Cyber Challenge, Creating Better AI Agents for Real-World Tasks, AI Helps Airline Pilots Reduce Polluting Contrails, Blueshift Memory’s New Accelerator Chip for CV, and Nvidia’s New GH200 Chip for AI Inference
2D Video Takes On A Whole New Dimension: Shutterstock Collaborations to Bring NeRF Generative AI Technology to 3D Creators
At SIGGRAPH 2023, Shutterstock announced collaborations with three leading innovators in neural radiance fields (NeRF) technology: Luma Labs AI, RECON Labs, and Volinga AI. The goal of the partnerships is to advance NeRF technology in the realm of 3D content creation.
Shutterstock plans to work with Luma Labs AI and RECON Labs to use NeRF technology to create high-quality, photorealistic 3D scenes from 2D video footage, drastically reducing the time it takes to build complex 3D environments. The plan also includes integrating their existing 3D assets into TurboSquid, Shutterstock’s platform for commercial licensing of 2D and 3D assets.
Shutterstock is also collaborating with Volinga AI to explore the licensing and distribution of NVOL files, which are part of Volinga’s NeRF Suite, for creative applications such as virtual production, VFX, TV, broadcasting, XR, and video games. The collaboration also aims to develop an NVOL library of high-quality content and to establish a network of creators who produce on-demand NVOL environments.
The White House Launches 2-Year AI Cyber Challenge
The Biden-Harris Administration has launched a two-year AI Cyber Challenge (AIxCC) hosted by DARPA, with collaboration from prominent AI companies including Anthropic, Google, Microsoft, and OpenAI. The goal of this competition, with nearly $20 million in prizes at stake, is to address the critical issue of computer code security.
The competition was announced at the Black Hat USA Conference in Las Vegas, known for generating cybersecurity innovations, and aligns with the administration’s commitment to responsible technological advancement. AI companies will make some of the most powerful AI systems in the world available for competitors to use in designing new cybersecurity solutions.
This initiative will seek to identify and rectify vulnerabilities in software, showcasing the potential of AI in enhancing the security of the internet and societal infrastructure such as power grids and transportation systems.
Into The Wild Blue Yonder: AI Helps Airline Pilots Avoid Areas That Create Polluting Contrails
Google and American Airlines have conducted an experiment using AI to help pilots avoid flight paths that create polluting contrails, the clouds of condensation that form behind planes. These contrails contribute to global heating by trapping heat in the atmosphere. Google said in a blog post that AI-guided flight path adjustments reduced contrails by 54% in the experiment, with further improvements expected.
The experiment also indicated that contrail avoidance services could be a potential business venture for Google. Flights that avoided contrails burned about 2% more fuel, but because only a fraction of flights would need to be adjusted, the overall impact on an airline’s fuel consumption is estimated to be around 0.3%. This effort aligns with the aviation sector’s urgent need to reduce its emissions and its contribution to global heating.
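The two fuel figures can be reconciled with a little arithmetic. The sketch below is illustrative only: it assumes every flight burns roughly the same amount of fuel, and the function names are hypothetical. Under that assumption, a 2% penalty on adjusted flights and a 0.3% fleet-wide impact imply that about 15% of flights are rerouted. That share is derived here, not a number reported in the article.

```python
def fleet_fuel_overhead(per_flight_overhead: float, fraction_adjusted: float) -> float:
    """Fleet-wide extra fuel, assuming all flights burn about the same fuel."""
    return per_flight_overhead * fraction_adjusted


def implied_fraction_adjusted(per_flight_overhead: float, fleet_overhead: float) -> float:
    """Share of flights that must be adjusted to produce the fleet-wide figure."""
    return fleet_overhead / per_flight_overhead


# Figures quoted in the article: 2% per adjusted flight, about 0.3% overall.
fraction = implied_fraction_adjusted(0.02, 0.003)
print(f"Implied share of flights adjusted: {fraction:.0%}")
```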
Do What I Say: Creating Better AI Agents for Real-World Tasks with Natural Language Instructions
Artificial intelligence researchers at UC Berkeley have introduced Dynalang to address the challenge of creating AI agents that can effectively perform tasks in the real world by following natural language instructions. While large language models (LLMs) have shown progress in handling specific tasks, they often fall short when tasks require a broader understanding of context.
Dynalang works in different types of environments, using language to better learn world models.
Dynalang is distinguished by its two training modes. The first mode trains the world model to predict future representations using text and visual data collected online as the agent interacts with its environment. This process mimics the self-supervised learning humans use to link observations with language. The second mode trains the agent’s action policy with reinforcement learning, using the world model’s representations and the task rewards.
In practical terms, Dynalang processes text instructions and image frames as streams of tokens, allowing the agent to make decisions and actions over time. This method is different from techniques that provide complete instruction text upfront. The Dynalang system can be pre-trained on raw data (text and images) and fine-tuned on smaller datasets of sensory and action data.
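To make the streaming idea concrete, here is a minimal sketch of pairing each image frame with one text token per timestep, rather than handing the agent the full instruction upfront. This is not Dynalang's code: the one-token-per-frame pairing, the padding token, and the function name `stream_observations` are assumptions made for illustration.

```python
from typing import Iterable, Iterator, Tuple

PAD = "<pad>"  # hypothetical filler token used once the text runs out


def stream_observations(
    frames: Iterable[str], text_tokens: Iterable[str], pad: str = PAD
) -> Iterator[Tuple[int, str, str]]:
    """Yield (step, frame, token), consuming one text token per frame."""
    tokens = iter(text_tokens)
    for step, frame in enumerate(frames):
        # The agent sees only the tokens that have arrived so far.
        yield step, frame, next(tokens, pad)


# Frames are stand-in strings here; a real agent would receive image arrays.
for step, frame, token in stream_observations(
    ["frame0", "frame1", "frame2"], ["turn", "left"]
):
    print(step, frame, token)
```

An agent built this way can act at every step on partial instructions, which is the contrast the article draws with methods that require the complete instruction text before acting.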
The Dynalang research paper is currently in pre-print, meaning it has yet to undergo the rigorous process of peer review. However, the authors of the paper include highly respected figures in the field of AI research, including Pieter Abbeel, the Director of the Berkeley Robot Learning Lab and co-director of the Berkeley AI Research Lab.
Although it does not yet match state-of-the-art techniques in certain areas, Dynalang shows potential because it requires less manual annotation and learns from raw data. The researchers envision its application as a self-improving multimodal agent that interacts with humans in real-world scenarios.
Blueshift Memory to Unveil New Accelerator Chip for Computer Vision
After a successful 13-month research and development endeavor, funded by an Innovate UK Smart grant, Blueshift Memory has unveiled its Cambridge Architecture. Blueshift Memory will present the cutting-edge chip at the Flash Memory Summit, and a corresponding paper will delve into the development of this RISC-V-based chip, showcasing its impressive performance enhancements.
One of the most remarkable applications of this technology lies in security scenarios. Integrated into surveillance cameras, the chip enables real-time identification of firearms and triggering of alarms, potentially preventing tragic incidents.
The Cambridge Architecture stands as a solution to the Von Neumann Bottleneck, which has hindered computational speed by impeding efficient data transfer between memory and core. By addressing this bottleneck, the architecture enhances computational efficiency in data-intensive tasks and reduces energy consumption by minimizing unnecessary data movement.
Blueshift Memory’s breakthrough chip promises faster and more energy-efficient solutions for a wide range of applications. With its potential to reshape industries and save lives, this innovation marks a significant leap forward for AI-accelerated image recognition.
Nvidia Announces New GH200 Chip for AI Inference to Maintain Market Dominance
Nvidia currently holds over 80% of the AI chip market. Its expertise lies in graphics processing units (GPUs), which are widely preferred for the large AI models behind generative AI applications such as Google’s Bard and OpenAI’s ChatGPT. Yet high demand for Nvidia’s GPUs, driven by tech giants, startups, and cloud providers pursuing AI model development, has led to supply shortages.
The newly revealed GH200 chip, similar in GPU design to Nvidia’s H100 flagship AI chip, is differentiated by its 141 gigabytes of cutting-edge memory and a 72-core ARM central processor. Nvidia CEO Jensen Huang highlighted the chip’s scale-out potential for global data centers during a recent conference talk. The chip is scheduled to be available through distributors in the second quarter of next year, with sampling anticipated by year-end. Nvidia has not yet disclosed pricing details.
Typically, working with AI models involves two stages: training and inference. Training involves using extensive data to build models, a time-consuming process requiring significant computational power, often facilitated by GPUs like Nvidia’s H100 and A100 chips.
These trained models are then used for predictions or content generation through inference. Like training, inference is computationally intensive and requires substantial processing power every time the software runs, for example when generating text or images. Unlike training, which is needed only when a model is updated, inference happens almost constantly.
Nvidia’s GH200 chip focuses on inference due to its enhanced memory capacity, accommodating larger AI models on a single system. Ian Buck, Nvidia VP, mentioned that the GH200’s 141GB memory surpasses the 80GB of the H100, enabling larger models to remain on a single GPU instead of necessitating multiple systems or GPUs.
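A rough back-of-the-envelope check shows why memory capacity matters here. The sketch below estimates the memory needed for model weights alone (parameter count times bytes per parameter) and compares it with the 80GB and 141GB figures quoted above. The 70-billion-parameter model and 16-bit weights are hypothetical inputs chosen for illustration, and the estimate ignores activations and other runtime memory, so real requirements are higher.

```python
def weights_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes); weights only."""
    return num_params * bytes_per_param / 1e9


def fits_on_one_chip(num_params: float, memory_gb: float, bytes_per_param: int = 2) -> bool:
    """True if the weights alone fit within the given memory capacity."""
    return weights_gb(num_params, bytes_per_param) <= memory_gb


# Hypothetical 70-billion-parameter model stored in 16-bit (2-byte) precision.
params = 70e9
print(f"Weights: {weights_gb(params):.0f} GB")
print("Fits in 80 GB (H100 figure):", fits_on_one_chip(params, 80))
print("Fits in 141 GB (GH200 figure):", fits_on_one_chip(params, 141))
```

Under these assumptions the weights take about 140 GB, which is too large for a single 80GB chip but within the GH200's 141GB.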
Notably, Nvidia’s announcement coincides with rival AMD’s recent unveiling of its own AI-focused chip, the MI300X, boasting support for 192GB of memory and emphasizing AI inference capabilities. Concurrently, companies like Google and Amazon are also developing custom AI chips for inference tasks.
About the Author & Plainsight
Joan Silver is the SVP of Marketing at Plainsight and oversees the full scope of marketing and communications for the company. Propelling multiple companies from startup and launch, through funding rounds, to successful IPOs and acquisitions, Joan builds brands from big ideas to big time. A trailblazer in the B2B digital marketing industry, she’s a believer in the transformational power of AI and loves technology that solves problems and improves our daily lives.
Plainsight provides the unique combination of AI strategy, a vision AI platform, and deep learning expertise to develop, implement, and oversee transformative computer vision solutions for enterprises. Through the widest breadth of managed services and a vision AI platform for centralized processes and standardized pipelines, Plainsight makes computer vision repeatable and accountable across all enterprise vision AI initiatives. Plainsight solves problems where others have failed and empowers businesses across industries to realize the full potential of their visual data with the lowest barriers to production, fastest value generation, and monitoring for long-term success. For more information, visit plainsight.ai.