Robot Navigates Google DeepMind with Gemini
Generative AI continues to showcase its potential across various applications, including natural language interactions, robot learning, no-code programming, and design. This week, Google’s DeepMind Robotics team is highlighting another promising application: navigation.
In their latest paper, “Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs,” the DeepMind team details how they have utilised Google Gemini 1.5 Pro to teach robots to follow commands and navigate an office environment. This innovative approach leverages multimodal instruction navigation combined with topological graphs to enhance robot mobility and task execution.
The demonstration uses robots from Everyday Robots, a project that was originally part of Google before being paused amid widespread layoffs last year. Despite this, the robots have found a new purpose within DeepMind’s research.
In a series of videos accompanying the paper, DeepMind employees open each interaction with a smart assistant-style command, “OK, Robot,” and then instruct the system to perform various tasks around the 9,000-square-foot office space. These tasks range from simple navigation to more complex interactions, showcasing the capabilities of the Gemini-powered robots.
This research underscores the significant strides being made in integrating generative AI with robotics, paving the way for more intuitive and versatile robotic systems capable of functioning seamlessly in human environments.
In a demonstration of the capabilities of generative AI and robotics, a Googler asks the robot to guide him to a location suitable for drawing. The robot, adorned with a jaunty yellow bowtie, responds, “OK, give me a minute. Thinking with Gemini …” It then successfully leads the person to a wall-sized whiteboard. In another instance, a different individual instructs the robot to follow directions written on the whiteboard. The robot processes the command, navigates the office, and arrives at a designated robotics testing area, confidently announcing, “I’ve successfully followed the directions on the whiteboard.”
Before these tasks, the robots were familiarised with their surroundings using a method the team refers to as “Multimodal Instruction Navigation with demonstration Tours (MINT).” This process involved guiding the robots around the office and identifying various landmarks through speech. Following this orientation, the robots employ hierarchical Vision-Language-Action (VLA), a technique that combines environmental understanding with common-sense reasoning. This integration allows the robots to respond effectively to written and drawn commands as well as gestures.
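The two-level idea described above can be illustrated with a toy sketch. It assumes a tiny hand-made tour and graph, and it replaces the Gemini-based step with plain keyword matching. The names tour_labels, graph, pick_goal_frame, and shortest_path are hypothetical and are not taken from the paper. A high-level step picks a goal frame from the tour, and a low-level step finds a route to it over a topological graph.

```python
from collections import deque

# Hypothetical demonstration tour: frame index -> spoken landmark label (or None).
tour_labels = {0: "lobby", 1: None, 2: "whiteboard", 3: None, 4: "robotics test area"}

# Hypothetical topological graph: edges link frames the robot can move between.
graph = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1, 4], 4: [3]}


def pick_goal_frame(instruction, labels):
    """Stand-in for the high-level step: keyword matching, not a real VLM call."""
    text = instruction.lower()
    for frame, label in labels.items():
        if label and label in text:
            return frame
    return None


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of frames from start to goal, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


goal = pick_goal_frame("Take me to the whiteboard", tour_labels)
path = shortest_path(graph, 0, goal)
print(goal, path)
```

In a real system the goal-selection step would reason over video frames and multimodal instructions rather than text labels. The sketch only shows how a goal chosen at the high level can be turned into a sequence of waypoints at the low level.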
These demonstrations highlight the sophisticated navigation capabilities of the Gemini-powered robots, showcasing their ability to understand and execute complex instructions in a dynamic office environment. The use of MINT and VLA illustrates the advanced level of robot learning and interaction made possible by generative AI, emphasising the practical applications and future potential of this technology in everyday settings.
Google says the robot achieved a success rate of roughly 90% across more than 50 interactions with employees.
OpenAI Restricts LLM Access for Chinese Developers
OpenAI has taken a definitive step in restricting access to its technology in China by fully blocking its large language models (LLMs) from Chinese developers. This latest move signifies a complete cessation of OpenAI’s services in the country, marking a significant escalation in the ongoing tech standoff between China and the United States.
China initially set the precedent by banning ChatGPT and placing it behind its Great Firewall, effectively curtailing the use of OpenAI’s technology within its borders. This ban reflected China’s broader strategy to maintain tight control over digital content and technology access, particularly from foreign companies.
OpenAI’s decision to block LLM access follows those initial restrictions, further complicating the technological landscape for Chinese developers who had previously leveraged these advanced AI models for various applications. This move underscores the deepening divide in AI technology access and usage between the two nations. By cutting off access to its LLMs, OpenAI aims to navigate the complex geopolitical tensions and regulatory challenges posed by operating in China. This action also highlights the broader implications for global AI development and the potential for increased fragmentation in the international tech ecosystem. As both countries continue to assert their digital sovereignty, the impact on developers, businesses, and technological innovation will be closely watched.
Beeble AI Secures $4.75M for Indie Filmmaker Platform
Beeble AI Revolutionises VFX for Indie Filmmakers with AI-Powered Virtual Production Tools
Visual effects (VFX) have become indispensable in modern filmmaking, revolutionising storytelling and creative expression through a variety of digital techniques. However, the prohibitive cost of VFX tools often leaves independent filmmakers and content creators at a disadvantage, unable to compete with larger, well-funded productions. Addressing this challenge, Beeble AI, a South Korea-based startup, is leveraging artificial intelligence to democratise high-quality VFX.
Founded in 2022 by five former members of the AI research and machine learning team at South Korean game publisher Krafton, Beeble AI focuses on making advanced VFX accessible to filmmakers on modest budgets. Recognising the critical role of lighting in filmmaking and photography, the co-founders saw an opportunity to innovate in this underserved area.
Beeble AI recently secured $4.75 million in seed funding, led by Basis Set Ventures with participation from Fika Ventures, valuing the company at $25 million. CEO and co-founder Hoon Kim shared this milestone with TechCrunch, emphasising the company’s mission to empower indie filmmakers and content creators with cutting-edge virtual production tools.
The startup’s flagship product, SwitchLight Studio, is a desktop application that facilitates relighting and composition within virtual environments. This tool is set to be rebranded as Virtual Studio in the third quarter of this year, reflecting its expanded capabilities. “While our initial focus was on virtual lighting, we are now shifting towards developing comprehensive virtual production studios,” Kim explained. “We foresee a future where small teams of fewer than 10 artists can create content that rivals that of major Hollywood studios.”
Virtual production involves the integration of virtual and physical settings in film creation. Traditionally, green screens have been used to enable the addition of VFX in post-production. However, high-end virtual productions now employ large LED screens as backdrops, which remain cost-prohibitive for independent filmmakers. Beeble AI aims to bridge this gap with its AI-powered solutions, providing affordable tools that maintain high production standards. By harnessing the power of AI, Beeble AI’s Virtual Studio allows filmmakers to achieve Hollywood-level visual effects without the associated costs. This innovation not only levels the playing field but also fosters greater creativity and diversity in the film industry, as more storytellers can bring their visions to life. Beeble AI’s approach exemplifies the potential of AI in transforming industries, making sophisticated technologies accessible to a broader audience and enabling a new era of independent filmmaking.
Conclusion
The advancements in AI and its integration into various fields are reshaping the future of technology and creativity. Google’s DeepMind is pushing the boundaries of robotic navigation with its Gemini 1.5 Pro, demonstrating the potential for AI-driven robots to perform complex tasks in dynamic environments. Meanwhile, OpenAI’s decision to restrict LLM access to Chinese developers highlights the geopolitical tensions influencing global AI development. Beeble AI’s innovative virtual production tools are democratising high-quality VFX, empowering indie filmmakers to compete with larger studios.
Partner with Arcot Group to stay ahead in the evolving landscape of AI technology. Whether you need cutting-edge AI solutions for your business or advanced tools to enhance your creative projects, Arcot Group offers the expertise and resources to drive your success. Visit our website to learn more about how we can help you harness the power of AI and transform your vision into reality.