In the world of AI Assistant, Pi's the AI friend 😇
PLUS: Pi - the AI with EQ, Instagram Reels to get more addictive, Tenstorrent's AI Bounty Program 💰
Today’s top AI Highlights:
Inflection-2.5 LLM matches GPT-4 level performance with much less compute.
Abacus AI releases Liberated-Qwen1.5-72B, open-source uncensored LLM
Meta to increase Reel watch time by 10% with AI video model
OS-Copilot's FRIDAY learns computer tasks autonomously
& so much more!
Read time: 3 mins
Latest Developments 🌍
World's best personal AI 💝
Inflection’s empathetic bot has received a massive upgrade, adding IQ to Pi’s exceptional EQ. Inflection 2.5 LLM is competitive with the world's leading LLMs like GPT-4 and Gemini. It couples raw capability with its signature personality and unique empathetic fine-tuning. It notably achieves more than 94% of the average performance of GPT-4 despite using only 40% of the compute for training.
Key Highlights:
The model demonstrates improvements particularly in coding and mathematics, impacting key industry benchmarks and ensuring cutting-edge technological capability.
Pi now includes real-time web search for quality breaking news and up-to-date information.
Inflection-2.5 shows substantial gains over Inflection-1 on the MMLU benchmark, from 72.7 to 85.5, very close to GPT-4.
Inflection-2.5's performance on the Hungarian Math and the Physics GREs is also notable, achieving 85th percentile of human test-takers on the first exam.
Inflection-2.5 demonstrated particular improvements in math and coding, as evident in a strong performance in MBPP+ and HumanEval+ benchmarks, along with HellaSwag and ARC-C for common sense and science understanding.
Uncensored, Open-source AI with Unmatched Adherence 🙇
Abacus AI has open-sourced Liberated-Qwen1.5-72B, a completely uncensored language models that adhere strictly to system prompts. This addresses the critical challenge with open-source LMs where following system instructions is paramount. One of the most intriguing aspects of this model is its use of the SystemChat dataset, specifically designed to improve compliance with system prompts throughout conversations.
Key Highlights:
Central to its development is the SystemChat dataset, comprising 6,000 synthetic conversations to teach the model to prioritize system messages over user commands. There are no guardrails or censorship added to the dataset.
Liberated-Qwen1.5's demonstartes impressive performance on MT-bench and HumanEval, outperforming Qwen1.5-72B-Chat model. The model also boasts an MMLU score of over 77, the highest for an open-source language model.
It supports a 32k context size, and the fine-tuning process used 8k sequence length inputs.
Meta’s Single AI for Entire Video Ecosystem 🎥
Meta is creating an AI model designed to power its entire video ecosystem, which includes both TikTok-like Reels short videos and more traditional, longer videos, announced Tom Alison, Head of Facebook. This initiative aims to integrate a single AI recommendation model across all of Meta’s platforms, moving away from using separate models for different products like Reels, Groups, and the core Facebook Feed.
By testing the new model architecture on Reels, Meta observed an 8% to 10% increase in watch time. Additionally, Meta is exploring broader applications of generative AI, including digital assistants and sophisticated chatting tools within its platforms.
A unified model could enhance the relevance and engagement of content recommendations across Meta’s platforms. It could also improve the responsiveness of these recommendations, providing us with more tailored content based on their interactions across different Meta services.
Towards Generalist Computer Agents with Self-Improvement 👨💼
The challenge of creating autonomous agents that can interact with a computer as humans do has been a tough goal, but the dream is still alive. OS-Copilot is a novel framework designed to pave the way for agents capable of navigating through the myriad tasks an operating system can offer, from managing files and multimedia to leveraging web and code terminals. What sets OS-Copilot apart is that it learns and improves itself by performing general computer tasks, much like a diligent learner who gets better with practice.
Key Highlights:
OS-Copilot framework involves a modular design, consisting of a planner, configurator, and actor. This design mimics human cognitive functions, such as planning, configuring, and executing tasks, but in a digital context. The framework is adept at breaking down complex user requests into actionable steps, configuring these steps for execution, and learning from the outcomes to refine future actions.
FRIDAY, the agent developed using OS-Copilot, outshines its predecessors by a significant 35% margin on the GAIA benchmark. This benchmark involves 466 complex QA tasks that test an agent's ability to handle calculations, web browsing, multi-modal interactions, and file manipulations.
Unlike traditional models, FRIDAY engages in self-directed learning, a method where it generates and tackles a sequence of tasks that increase in difficulty. This process not only showcases the agent's ability to learn autonomously but also emphasizes its skill in mastering unfamiliar applications.
AI Learning Hour ⏰
🚨AI Bounty Alert: Tenstorrent just announced their first-ever bounty program with thousands of dollars in reward. All you need to do is to add SOTA LLMs to the opensource TT-Buda model demos repo and the best integration will earn $500 reward 💰
Popular Bounty Opportunities:
Buy the innovative AI devkit and get started now 👉 AI Devkit!
Tools of the Trade ⚒️
Multi-edit in Figma: Simplifies editing across multiple frames in Figma.
Find and select matching layers, resize and align them to their frames, batch edit text boxes, make changes to all variants at once, and more - all in just a couple clicks.
DraftAid: Automates the conversion of 3D models into detailed 2D fabrication drawings for manufacturing, saving engineers and designers significant time and reducing errors. It allows teams to generate editable, manufacturing-ready drawings with a single click, slashijg their current drawing time by up to 90% and eliminate common drawing errors.
Daytona: Open-source tool to set up development environments on any infrastructure with a single command, offering broad IDE support, Git integration, and security features to eliminate the "works on my machine" problem.
😍 Enjoying so far, TWEET NOW to share with your friends!
Hot Takes 🔥
I have heard from multiple men in sf that they think raising $10M is easier than finding a gf ~ Miranda Nover
Apple's problem is that machine learning is incompatible with perfectionism. ~ Pedro Domingos
Operating systems were a low AI phenomenon. ~ Bojan Tunguz
Unrelated to AI but we couldn’t ignore this one!
Taylor Swift for President! I just saw her at her concert in Singapore and realized that she can bring together Americans and people in most countries much better than either of the candidates, and that bringing people together is the most important thing. ~ Ray Dalio
Meme of the Day 🤡
That’s all for today!
See you tomorrow with more such AI-filled content. Don’t forget to subscribe and give your feedback below 👇
Real-time AI Updates 🚨
⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!!
PS: I curate this AI newsletter every day for FREE, your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!