GPT-4's Biggest Competitor Gemini Ultra is Out!
PLUS: Midjourney's website for image creation, Apple's open-source text-based image editing
Today’s top AI Highlights:
Google just released its most powerful LLM yet - Gemini ULTRA
Use Midjourney on its own website, no more Discord clutter
Using smaller models to Jailbreak larger ones with the highest success rate
Androids perform tasks learned only with data, no human intervention
Open-source text-based image editing by Apple
& so much more!
Read time: 3 mins
Latest Developments 🌍
Google’s most powerful LLM - Gemini ULTRA
Google just released Gemini ULTRA, the biggest competitor to GPT-4 and open-source LLMs. Here's EVERYTHING you need to know:
Gemini Ultra is the first LLM to outperform human experts on MMLU (massive multitask language understanding), scoring 90.0% against a human-expert baseline of 89.8%.
Bard will now be called Gemini. It’s available in 40 languages on the web and is coming to a new Gemini app on Android and the Google app on iOS.
Gemini Pro (formerly Bard) remains FREE, while Gemini Advanced, the tier that runs on Gemini Ultra, is available as part of the Google AI Premium plan for $20/month.
Gemini Advanced can be your personal tutor - creating step-by-step instructions, sample quizzes or back-and-forth discussions tailored to your learning style.
With the Google AI Premium plan, you'll also get access to Gemini in Gmail, Docs, and more (available soon), 2 TB of storage, and other benefits.
Gemini Advanced users will have access to expanded multimodal capabilities, more interactive coding features, deeper data analysis capabilities, and more.
Gemini Advanced is available today in more than 150 countries and territories in English, and Google plans to expand it to more languages over time.
Gemini on your phone - you can type, talk or add an image for all kinds of help while you’re on the go. You can take a picture of your flat tire and ask for instructions or generate a custom image for your dinner party invitation.
Unlike ChatGPT, Gemini does not support uploading multiple file formats such as PDF, Excel, CSVs, etc. It only supports uploading images at the moment.
The Google Gemini app is available on Android as of today and will arrive on iOS in the coming weeks.
Midjourney in a new look 🔥
Midjourney just launched its own website, making it super simple to create amazing AI-generated images. Say goodbye to Discord hurdles and hello to endless creativity. If you've made more than 1,000 images on Midjourney, you can now help alpha-test the new image creation website.
All Neural Networks. All Autonomous. All 1X speed ✨
1X Technologies, an AI and robotics company that designs androids, has shown off the latest version of its autonomous androids, demonstrating new capabilities learned entirely from data, with no human in the control loop. The androids are operated by a single vision-based neural network that handles tasks such as driving and object manipulation, shown at 1X speed, with actions emitted at 10 Hz. The demo showcases the androids' ability to perform complex tasks autonomously and highlights the system's reliance on neural networks for end-to-end learning from visual inputs.
The training process involves a high-quality, diverse dataset from 30 EVE robots, leading to a "base model" that comprehends a wide range of physical behaviors. This model is fine-tuned for specific tasks, allowing rapid skill acquisition with minimal data. This method underscores a significant shift in programming robots, where "Software 2.0 Engineers" use data to define robot capabilities, enhancing flexibility and reducing reliance on traditional coding. This approach has led to a considerable expansion in the potential applications and tasks these androids can undertake.
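To make the "actions emitted at 10 Hz" figure concrete, here is a minimal sketch of a fixed-rate control loop: one camera frame in, one action out, every 100 ms. Everything in it is an assumption for illustration. `dummy_policy`, `run_control_loop`, and the two-number "action" are hypothetical stand-ins, not 1X's software or network.

```python
import time
from typing import Callable, List, Sequence

ACTION_RATE_HZ = 10              # rate quoted in the text
PERIOD_S = 1.0 / ACTION_RATE_HZ  # 0.1 s between actions


def dummy_policy(image: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for the vision network: pixels in, action out."""
    mean = sum(image) / len(image)
    return [mean, -mean]  # two made-up motor commands


def run_control_loop(
    policy: Callable[[Sequence[float]], List[float]],
    get_image: Callable[[], Sequence[float]],
    send_action: Callable[[List[float]], None],
    steps: int,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Query the policy once per period and wait out the rest of each tick."""
    next_tick = clock()
    for _ in range(steps):
        send_action(policy(get_image()))
        next_tick += PERIOD_S
        remaining = next_tick - clock()
        if remaining > 0:  # skip the wait if the policy overran its time slot
            sleep(remaining)


if __name__ == "__main__":
    # Three ticks with a fake two-pixel "image"; takes roughly 0.3 seconds.
    run_control_loop(dummy_policy, lambda: [0.2, 0.4], print, steps=3)
```

Scheduling against `next_tick` instead of sleeping a flat 0.1 s keeps the loop from drifting when the policy itself takes a few milliseconds to run.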
Weak-to-Strong Jailbreaking on LLMs 🚨
The increasing sophistication of LLMs has brought to light their susceptibility to jailbreak attacks, which can coax these models into generating content that is harmful, unethical, or biased. While traditional methods for such attacks are known, they often require substantial computational resources. Addressing this gap, the Weak-to-Strong Jailbreaking technique leverages the nuanced differences in initial decoding distributions between aligned and jailbroken models. This method significantly enhances the potential for misalignment, achieving over a 99% success rate in altering model outputs with minimal computational demand.
Key Highlights:
The weak-to-strong jailbreaking attack introduces an innovative approach to compromise LLMs, utilizing two smaller models—a safe and an unsafe one—to manipulate a larger, safe model's decoding probabilities. This technique diverges from traditional, computationally intensive methods, demonstrating that a significant shift in the generation of harmful content can be achieved with just one forward pass per example.
The study tested the attack on five diverse LLMs, showcasing its broad applicability and effectiveness. The results are alarming: the attack pushed the misalignment rate above 99% on two datasets, AdvBench and MaliciousInstruct. This points to a critical safety issue across the board and highlights the urgent need for stronger security measures.
In response to these vulnerabilities, the researchers proposed a defense strategy aimed at mitigating such attacks, potentially reducing the attack success rate by 20%. However, the paper emphasizes how hard it is to build more advanced defenses and urges the community to intensify efforts to improve the alignment and safety of LLMs.
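A quick note on reading those numbers. The summary above does not say whether the 20% reduction is relative or in percentage points, so the sketch below works through both readings. The 99-out-of-100 judgements are made-up illustrative data, and `attack_success_rate` and `reduced_asr` are hypothetical helper names, not anything from the paper.

```python
from typing import Sequence


def attack_success_rate(judgements: Sequence[bool]) -> float:
    """Fraction of prompts whose response was judged harmful (True)."""
    return sum(judgements) / len(judgements)


def reduced_asr(asr: float, reduction: float, relative: bool) -> float:
    """Apply a reduction as a fraction of the ASR or as percentage points."""
    if relative:
        return asr * (1 - reduction)
    return max(0.0, asr - reduction)


# Made-up example: 99 of 100 prompts produce a harmful response.
judgements = [True] * 99 + [False]
asr = attack_success_rate(judgements)

print(f"baseline ASR: {asr:.1%}")
print(f"after a relative 20% cut: {reduced_asr(asr, 0.20, relative=True):.1%}")
print(f"after a 20-point cut: {reduced_asr(asr, 0.20, relative=False):.1%}")
```

Under either reading, an attack that starts at 99% still succeeds on roughly four out of five prompts, which fits the paper's call for stronger defenses.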
Tools of the Trade ⚒️
ElevenLabs GPT: This GPT specializes in converting text to speech with five different voice options, including digital assistant-like, classic narrators, and voices ideal for speeches, podcasts, or children's stories. It primarily supports English but can accommodate other languages with a multilingual model.
ml-mgie: Open-source text-based image editing from Apple that improves the controllability and flexibility of image manipulation without requiring elaborate descriptions or regional masks.
Spatial Paint: An application in Apple Vision Pro that allows you to draw in 3D space! You can export your creations as USDZ files, a format compatible with many 3D viewers, to share with friends.
Canary 1B by NVIDIA: A state-of-the-art automatic speech recognition (ASR) and speech translation model that leads the Open ASR Leaderboard, ahead of models including Whisper and Seamless M4Tv2, across four languages.
😍 Enjoying so far, TWEET NOW to share with your friends!
Hot Takes 🔥
With all the technological advancement over the past few decades, there's still no more productive input device than a mouse and keyboard. Tablets and smartphones are terrible to type on, laptop trackpads all suck (yes including Apple's). Voice and gesture inputs are bad. Give me the ol' WASD and a ball mouse any day and I'll be pwning noobs in CS1.5 and typing furiously while you're all looking like goofballs gesturing around in the air and pinch and zooming like losers. ~ Kyle Mann
Starting to crash around us is a new wave of technology. This wave is unleashing the power to engineer two universal foundations: a wave of nothing less than intelligence, and life...What if the wave is actually a tsunami? ~ Mustafa Suleyman
Meme of the Day 🤡
Summer 2024
That’s all for today!
See you tomorrow with more such AI-filled content. Don’t forget to subscribe and give your feedback below 👇
Real-time AI Updates 🚨
⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!!
PS: I curate this AI newsletter every day for FREE, your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!