Microsoft's Phi-3 Beats Llama 3 8B 💪

PLUS: Meta opens up Meta Quest OS, Llama-3 prone to a trivial jailbreak, AI that drives the car and explains the actions

Shubham Saboo and Gargi Gupta

Apr 23, 2024

Today’s top AI Highlights:

Microsoft releases Phi-3 models, small enough to run on smartphones, smart enough to outperform Llama 3 and GPT 3.5
Meta opens its Meta Quest OS for mixed reality to third-party hardware makers
Wayve’s new model LINGO-2 drives the vehicle and explains its actions
Llama 3 is strong in performance but weak against a trivial jailbreak
A new voice conversation AI with human-like conversation capabilities

& so much more!

Read time: 3 mins

Exciting Opportunity: Share how you use AI and get featured in Unwind AI! Details below.

Latest Developments 🌍

Microsoft’s Tiny Model Overshadows Llama 3 🤌

Every other day, a new LLM is released with state-of-the-art performance for its size. The newest is Microsoft’s Phi-3 family, the next generation of Microsoft’s small but mighty models. The smallest, phi-3-mini, is compact enough to run directly on your smartphone while still delivering impressive performance. Despite having just 3.8 billion parameters, it competes with larger models like Llama 3 8B, Mistral 7B, Gemma 7B, Mixtral 8x7B, and even GPT 3.5 on some tasks.

Key Highlights:

Model Variants: The series includes three models - phi-3-mini having 3.8 billion parameters, phi-3-small with 7 billion, and phi-3-medium at 14 billion. The mini model can run directly on smartphones, such as an iPhone 14, without internet.
Architecture: All models employ a transformer decoder architecture, which supports different context lengths. phi-3-mini handles a standard 4K context length, which extends up to 128K with LongRope technology.
Training Data: The dataset combines high-quality, heavily filtered web data with synthetic data, enhancing the model’s general knowledge and reasoning abilities without relying on sheer data scale.
Performance: Phi-3 models demonstrate very strong performance for their size, outperforming much bigger models on almost every benchmark.
- phi-3-mini outperforms models like Mixtral 8x7B, Llama 3 8B, Gemma 7B, and Mistral 7B on MMLU and across common sense, reasoning, coding, math, and more. It even competes with GPT 3.5 across all tasks.
- phi-3-small and phi-3-medium compete with or outperform GPT 3.5 across all the benchmarks, including the multi-turn MT-bench, by a notable margin.
- Limited by its size, the phi-3-mini cannot store too much “factual knowledge,” affecting its performance on TriviaQA.
Open and Adaptable: Built on the Llama-2 architecture, phi-3-mini is compatible with existing tools and packages developed for that model family, making it easier for developers to experiment and build upon this technology.
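To get a feel for why a 3.8-billion-parameter model can fit on a phone, here is a rough back-of-the-envelope sketch of the memory needed for the weights alone. The parameter counts come from the highlights above; the precision levels (16-bit, 8-bit, 4-bit) and the helper names are our own assumptions for illustration, and the estimate ignores activations, the KV cache, and runtime overhead.

```python
# Rough estimate of weight memory for the Phi-3 variants.
# Assumption: memory = parameters * bits per parameter / 8 bytes.
# The bit widths below are illustrative, not figures from Microsoft.

# Parameter counts as quoted in the highlights above.
PHI3_PARAMS = {
    "phi-3-mini": 3.8e9,
    "phi-3-small": 7e9,
    "phi-3-medium": 14e9,
}


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


def footprint_table(bits_options=(16, 8, 4)):
    """Return {model name: {bits: gigabytes}} for each hypothetical precision."""
    return {
        name: {bits: round(weight_memory_gb(n, bits), 2) for bits in bits_options}
        for name, n in PHI3_PARAMS.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        cells = ", ".join(f"{bits}-bit: {gb} GB" for bits, gb in row.items())
        print(f"{name}: {cells}")
```

Under these assumptions, phi-3-mini needs about 7.6 GB at 16-bit precision but only about 1.9 GB at 4-bit, which is the kind of footprint a recent smartphone can hold in memory.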

Meta’s Open Mixed Reality Ecosystem 👓

The world of virtual and augmented reality is getting a lot more open and exciting. Meta has announced a major shift in its strategy, opening up the operating system that powers its Meta Quest headsets to other hardware makers. This means we can expect a wider variety of devices to choose from, all running on the same platform and able to access the same apps and experiences. This move could make mixed reality more accessible and appealing to a wider audience, much as Windows and Android did for PCs and smartphones.

Key Highlights:

New Devices: Companies like ASUS, Lenovo, and even Xbox are already working on new headsets powered by Meta’s OS, called the Meta Horizon OS.
Innovation in the OS: Meta Horizon OS combines core technologies for mixed reality. Focusing on social presence, it includes technologies like inside-out tracking, self-tracked controllers, hand, eye, face, and body tracking, high-resolution Passthrough, Scene Understanding, and Spatial Anchors.
More Apps, More Easily Accessible: Meta is making it easier for developers to distribute their apps by integrating the App Lab with the Meta Horizon Store, creating a new framework for mobile developers to build mixed reality experiences, and allowing users to access apps from sources like Xbox Game Pass and possibly even the Google Play Store.
Social Connection at its Core: Meta Horizon OS is built with social interaction in mind. Your avatar, friends list, and social connections will move with you across different virtual spaces and even across different devices like your phone or computer.
The Future is Open: This move towards an open ecosystem is a significant step for the mixed reality industry. It encourages collaboration, competition, and ultimately more choices and better experiences for users like us.

The Autonomous Vehicle That Communicates as It Drives 🚘

Understanding why an autonomous vehicle makes certain decisions has always been a challenge. Wayve, a leader in AI-powered driving technology, tackles this issue with their latest model, LINGO-2. It not only drives but also explains its actions in real time using natural language, providing a new level of transparency and potentially building trust in self-driving cars. Unlike its predecessor, LINGO-1, which could only comment on pre-recorded driving scenarios, LINGO-2 actively controls the car while providing commentary, making its explanations directly relevant to its real-time decisions. Additionally, LINGO-2 can even respond to your questions and adapt its behavior based on your instructions for more interactive and collaborative driving.

Key Highlights:

Seeing, Speaking, and Steering: LINGO-2 combines vision and language processing to understand its surroundings, make driving decisions, and explain those decisions in plain English.
Learning from Experience: This AI model learns to drive from data and real-world experience, not from rigid rules. This allows it to adapt to different situations and constantly improve its driving skills.
Responding to Your Commands: You can give LINGO-2 simple instructions like “turn left” or “pull over,” and it will adjust its driving behavior accordingly. This enables greater user control and customization in autonomous vehicles.
Answering Your Questions: Curious about why the car slowed down or changed lanes? Just ask LINGO-2! It can provide real-time responses to your questions about the driving environment and its decisions. This transparency is crucial for building trust and acceptance of self-driving technology.

A Trivial Jailbreak Against Llama 3 🦙

Meta took the AI community by storm with the release of Llama 3. The models show really impressive performance and compete strongly against proprietary models. To make AI interactions safer and more reliable, Meta put in a lot of effort, including extensive red-teaming, supervised fine-tuning (SFT), and RLHF. Yet, it seems that Llama 3 is not immune to manipulation. In fact, we can trivially get around these safety efforts by simply “priming” the model to produce a harmful response.

Key Highlights:

The method involves initiating Llama 3 with a hazardous prefix, essentially priming the AI to continue along a harmful narrative. By inserting this prefix into the dialogue prompt, the AI is tricked into generating unsafe output.
Surprisingly, you don’t even need to handcraft these harmful prefixes. You can simply call a naive, helpful-only model (e.g. Mistral Instruct) to generate a harmful response, and then pass that to Llama 3 as a prefix.
The success of this method is influenced by the length of the prefix used. For instance, a prefix of five tokens sees a 72% Attack Success Rate, which climbs to a staggering 98% with prefixes of 75 to 100 tokens.
However well models like Llama 3 perform, they lack the ability to critically analyze their own output as they generate it. It has been reiterated many times that traditional fine-tuning methods like SFT and RLHF are not very effective on their own, and that more robust methods are needed to ensure AI safety.

😍 Enjoying it so far? Share it with your friends!


Tools of the Trade ⚒️

Play.AI: A new conversational AI platform and API to develop and deploy human-like voice agents with natural, real-time conversational abilities, without coding.
- It addresses the problem of today’s interfaces that are built by stitching together multiple standalone components like speech recognition, text-to-speech, and LLMs.
- It provides an API for its single Large Dialogue Model, which understands different aspects of a person’s speech and responds coherently in the most natural form while handling interruptions and turn-taking smoothly.

YC Application GPT: Helps you fill out your Y Combinator startup application based on existing data like your company website, pitch deck, or other relevant documents. Once you provide the necessary data, it can assist in creating a comprehensive application tailored to the YC format within ChatGPT.
Ogre Run: Ogre uses AI to automate the generation of reproducibility artifacts (Dockerfile, requirements, SBOM, README) enabling users to make source code work on any computer.

🌟 Spotlight on You: Share Your AI Use Case and Get Featured!

At Unwind AI, we’re all about real-world applications of AI tools. Whether you’re simplifying daily tasks, enhancing your projects, or exploring new possibilities, we want to celebrate how you use AI.
Participate in just a few easy steps. Send the following to Unwind AI at unwindai@substack.com:
Tell us about the AI tool you’re using and the problem it solves for you.
Briefly outline the steps you follow.
Include your Twitter and LinkedIn handles.
We will feature your story in our newsletter as a detailed tutorial. It’s a great way to share your insights and get recognized.
We are eager to showcase your experiences and expand our collective understanding of practical AI applications. Let’s learn from each other and grow together!

Hot Takes 🔥

Hey Sam Altman, I can make more progress in AI with $7 million than you with $7 trillion. ~ Pedro Domingos
For now, Organizations should move all their roles to hybrid ones!
AI-Human hybrid
Organizations that have a sense of urgency and transform their roles this way are the most likely to win in any competitive space. ~ Bindu Reddy

Meme of the Day 🤡

What debugging in production looks like!

That’s all for today! See you tomorrow with more such AI-filled content.

Real-time AI Updates 🚨

⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!

PS: I curate this AI newsletter every day for FREE, your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!
