Last Week in AI - A Weekly Unwind

From 16-Jun-2024 to 22-Jun-2024

Jun 23, 2024

It was yet another thrilling week in the AI field with advancements that further extend the limits of what can be achieved with AI.

Here are 10 AI breakthroughs that you can’t afford to miss 🧵👇

Anthropic Releases Claude 3.5 Sonnet 🚀

Anthropic has released Claude 3.5 Sonnet, the successor to the Claude 3 model family, which came out just three months ago. Claude 3.5 Sonnet features a context window of 200k tokens and improves on the speed and intelligence of other leading AI models, including Claude 3 Opus and GPT-4o.

Availability: Free on Claude.ai and the Claude iOS app, and also available via the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI.
Superior Performance: According to Anthropic's published benchmarks, outperforms Claude 3 Opus, GPT-4o, Gemini 1.5 Pro, and Llama-3 400B in areas such as graduate-level reasoning, MMLU, math, and coding.
Speed and Cost: Operates twice as fast as Claude 3 Opus, at a cost of $3 per million input tokens and $15 per million output tokens (the same price as Claude 3 Sonnet).
Vision Capabilities: Excels in visual math reasoning, chart understanding, and document understanding.
Artifacts Feature: Introduces a dynamic workspace for interacting with AI-generated content such as code snippets and text documents.
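
To make the pricing above concrete, here is a minimal sketch of the arithmetic in Python. The function name and the example workload are hypothetical, chosen only for illustration; the two prices are the ones quoted in this issue, and actual billing may differ.

```python
# Prices quoted above for Claude 3.5 Sonnet (USD per million tokens).
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the quoted list prices."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MTOK
    return input_cost + output_cost


# Hypothetical workload: a prompt that fills the 200k-token context window
# plus a 4,000-token reply.
print(f"${estimate_cost_usd(200_000, 4_000):.2f}")  # prints $0.66
```

Output tokens cost five times as much as input tokens, so for long-form generation the reply length, not the prompt, tends to dominate the bill.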

Two AI Video Generation Models in 1 Week 🌉

Runway has released Gen-3 Alpha, a new AI model for generating high-quality videos. Gen-3 generates videos more than 10 seconds long, with expressive human characters and a wide range of actions, gestures, and emotions.

Runway may have cherry-picked its samples, but when we ran the same prompts through Luma Labs’ Dream Machine, the gap in quality and realism was significant. That said, Dream Machine is free to try, while Runway has not yet made Gen-3 available for use.

Prompt: Dragon-toucan walking through the Serengeti.

Gen-3 Output

Dream Machine Output

Hear Your AI Videos Come to Life with Google DeepMind 🥁

Google DeepMind is developing V2A (video-to-audio) technology to generate synchronized soundtracks for silent videos. V2A uses video pixels and text prompts to create realistic sound effects, dialogue, and music matching the scene.

This diffusion-based approach offers flexibility and control, allowing users to generate and customize soundtracks, significantly enhancing the emotional depth and storytelling of AI-generated videos.

Prompt for Audio: A drummer on a stage at a concert surrounded by flashing lights and a cheering crowd.

Nvidia’s Toolkit for Generating Synthetic Data for LLMs 🧰

Nvidia has launched an open-source toolkit, Nemotron-4 340B, designed to generate synthetic data for training LLMs. The toolkit includes three models – a base model, an instruct model, and a reward model – which together create synthetic data and evaluate it for accuracy and coherence.

Optimized for efficiency, it integrates with Nvidia’s NeMo framework and TensorRT-LLM library. Developers can also customize the base model using proprietary data and Nvidia’s HelpSteer2 dataset for specific applications.

Figure: Nemotron synthetic data generation pipeline.
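
The generate-then-evaluate loop described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not Nvidia's pipeline: `build_synthetic_dataset`, `toy_generate`, `toy_score`, and the 0.5 threshold are all hypothetical. In a real setup the two callables would wrap calls to an instruct model and a reward model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    prompt: str
    response: str
    score: float


def build_synthetic_dataset(
    prompts: List[str],
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    threshold: float,
) -> List[Sample]:
    """Generate one response per prompt and keep those scoring >= threshold."""
    kept = []
    for prompt in prompts:
        response = generate(prompt)        # stand-in for the instruct model
        quality = score(prompt, response)  # stand-in for the reward model
        if quality >= threshold:
            kept.append(Sample(prompt, response, quality))
    return kept


# Toy stand-ins so the sketch runs without any model (purely hypothetical).
def toy_generate(prompt: str) -> str:
    return f"Answer to: {prompt}"


def toy_score(prompt: str, response: str) -> float:
    # Toy heuristic: longer prompts score higher, capped at 1.0.
    return min(len(prompt) / 20, 1.0)


dataset = build_synthetic_dataset(
    ["What is a token?", "Hi"], toy_generate, toy_score, threshold=0.5
)
print([sample.prompt for sample in dataset])  # ['What is a token?']
```

Passing the generator and scorer in as plain callables keeps the filtering logic independent of whichever models or serving stack sit behind them.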

Retired US Army General Joins OpenAI Board 🔐

Paul M. Nakasone, a retired US Army General, former head of the National Security Agency, and cybersecurity expert, has joined OpenAI’s board. As a first priority, he has been appointed to OpenAI’s Safety and Security Committee.

While Nakasone’s insights can help build safer and more responsible systems, the timing suggests that OpenAI is more concerned with optics, aiming to show that it is taking safety seriously amid growing criticism of how the company prioritizes commercializing its products.

Further, OpenAI CEO Sam Altman has reportedly told some shareholders that the company is considering changing its governance structure to a for-profit business that the company’s nonprofit board doesn’t control.

Robots Learn from Human Demos in Real-Time 👬

Stanford researchers have developed a new system called HumanPlus that enables robots to learn complex tasks from human demonstrations. HumanPlus uses a single RGB camera for real-time shadowing of human motions, allowing robots to replicate these movements without expensive motion capture systems.

The system also trains robots using a 40-hour dataset of human motions, enabling them to perform a wide range of tasks autonomously. In tests, HumanPlus achieved 60-100% success rates in tasks such as putting on a shoe, unloading objects, and folding clothes.

Local AI Copilot Using Open-Source AI Models 🤖

The Open Interpreter team has introduced Local III, a suite of tools and features that lets you run powerful language models locally. This release features:

Local Model Explorer: An interactive interface to easily select, download, and configure local models.
Optimized Profiles: Pre-configured settings for popular local models like Llama 3 and Codestral, customizable for other models.
Local Vision: Enables image processing with Moondream, generating descriptions and OCR output.
Experimental Local OS Mode: Allows an LLM to control the mouse and keyboard and to see the screen, using the Point model for icon identification.
The ‘I’ Model: A free, opt-in model based on Llama3-70B; conversations with it are used to train a new open-source language model for computer control.

Nvidia Becomes The Most Valuable Company Globally 🥇

Nvidia has become the world’s most valuable company, surpassing Microsoft. This comes just two weeks after Nvidia overtook Apple as the second most valuable company. Nvidia’s stock price rose by about 3.5% on Tuesday, giving the chipmaker a market capitalization of over $3.33 trillion. Microsoft’s market cap, on the other hand, fell slightly, standing at nearly $3.32 trillion.

OpenAI Co-founder Starts Safe Superintelligence ⛑️

OpenAI co-founder Ilya Sutskever has launched the world’s first AI lab to build Safe Superintelligence. Safe Superintelligence Inc. (SSI) is dedicated solely to developing a safe and robust superintelligence. Sutskever, alongside Daniel Gross and Daniel Levy, believes the problem of safe superintelligence is the most pressing technical challenge of our time.

SSI aims to advance capabilities alongside safety measures, ensuring safety always remains ahead of progress. The company emphasizes its singular focus on superintelligence; its business model and structure are designed to avoid distractions and short-term commercial pressures.

One-Stop Shop for Open-Source LLM Development 🔨

FinetuneDB has released a significant update simplifying work with open LLMs. This update includes a streamlined platform for managing datasets and fine-tuning models, featuring a robust dataset manager with version control, automatic validation, and a visual function calling editor. It also supports fine-tuning and serving popular open-source models, aiming to reduce costs and improve model performance through optimized processes.

Which of the above AI developments are you most excited about, and why?
Tell us in the comments below ⬇️

That’s all for today 👋

Stay tuned for another week of innovation and discovery as AI continues to evolve at a staggering pace. Don’t miss out on the developments – join us next week for more insights into the AI revolution!

Click on the subscribe button and be part of the future, today!

📣 Spread the Word: Think your friends and colleagues should be in the know? Click the ‘Share’ button and let them join this exciting adventure into the world of AI. Sharing knowledge is the first step towards innovation!

🔗 Stay Connected: Follow us for AI updates, sneak peeks, and more. Your journey into the future of AI starts here!

Shubham Saboo - Twitter | LinkedIn ⎸ Unwind AI - Twitter | LinkedIn

Unwind AI