It was yet another thrilling week in the AI field with advancements that further extend the limits of what can be achieved with AI.
Here are 10 AI breakthroughs that you can’t afford to miss 🧵👇
First Real-time Voice AI Ever Released, Moshi 👬
Kyutai, a Paris-based AI research lab, unveiled their real-time voice AI called Moshi. Developed from scratch in just 6 months by a small team of eight, Moshi distinguishes itself as the first voice-enabled AI openly available for public testing and use.
Moshi has a very realistic, emotionally nuanced voice and can speak in 70 different emotions and styles. With a latency of about 160 ms, conversations feel natural and real-time. Kyutai says Moshi’s code and model weights will be shared publicly soon.
Salesforce’s 7B Model Outperforms GPT-4 & Claude-3 Opus 💪
Salesforce has released two new language models, xLAM-7B and xLAM-1B, designed for function calling: interacting with applications and services to retrieve information and complete tasks as instructed. Both stand out for strong performance despite their small size.
With only 7B parameters, the model achieved SOTA performance on the Berkeley Function-Calling Benchmark, outperforming multiple GPT-4 models. Even the 1B model surpassed GPT-3.5-Turbo and Claude-3 Haiku. The key lies in how they are trained.
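To make "function calling" concrete, here is a minimal sketch of the loop such a model plugs into: the model emits a structured call (a tool name plus JSON arguments), and the application looks up the tool and runs it. Everything in it is an assumption for illustration. The `get_weather` tool, the JSON shape, and the hard-coded model output are hypothetical, and this is not the xLAM API or the benchmark's format.

```python
import json

# Hypothetical tool the application exposes to the model.
def get_weather(city: str, unit: str = "celsius") -> dict:
    # Stubbed data; a real tool would call a weather service here.
    return {"city": city, "temperature": 21, "unit": unit}

# Registry mapping tool names to Python callables.
TOOLS = {"get_weather": get_weather}

def dispatch(model_output: str) -> dict:
    """Parse a JSON function call emitted by a model and run the matching tool."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))

# Hard-coded stand-in for what a function-calling model might emit (assumed format).
fake_model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
result = dispatch(fake_model_output)
print(result)
```

A function-calling benchmark essentially checks whether the model picks the right tool name and fills in valid arguments, which is why a small, well-trained model can compete on it.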
ElevenLabs Releases Voice Isolator 🎙️
ElevenLabs’ new Voice Isolator removes unwanted background noise and extracts crystal-clear dialogue from any audio, so your next podcast, interview, or film sounds like it was recorded in a studio. It is not currently optimized for music vocals. Usage is billed at 1,000 characters for every minute of audio.
Runway’s Gen-3 Alpha Model is Available for Use 🌅
Runway has made its latest Gen-3 Alpha model publicly available. Plans that include Gen-3 Alpha start at $12 a month, while Gen-1 and Gen-2 remain available for free with limited credits.
Cloudflare Declares War on AI Content Scraping 🤖
Cloudflare has introduced a new one-click feature to block all AI bots from scraping website content, available to all customers, including those on the free tier. This simple dashboard setting addresses concerns over data ownership and the impact on content creators.
Using its extensive global network, Cloudflare proactively identifies and categorizes suspicious AI bot activity. Even bots disguising themselves as regular browsers are detected and blocked through advanced machine learning models.
Perplexity Gives Pro Search a Research Overhaul 🧑‍💻
Perplexity AI has upgraded its Pro Search to handle complex research tasks. Previously limited to quick searches, Pro Search can now carry out multi-step reasoning to answer intricate questions by understanding the goal, planning, and working through it step by step. The upgrade combines stronger search and synthesis with advanced computational abilities, making it a more comprehensive research tool.
Meta is Testing Llama-3 405B Model in Meta AI 🐑
Meta has started rolling out a preview of the Llama-3 405B model in Meta AI to a limited set of users in the latest WhatsApp update. A screenshot of the update shows that users will be able to choose which model they interact with: Llama-3 70B will be the default, and the 405B model can be selected for more complex prompts, subject to a usage cap.
Take Control of Your AI Agents with LangGraph 🎛️
LangChain has introduced the stable release of LangGraph v0.1, along with LangGraph Cloud in beta, an infrastructure for scalable and reliable agent deployment. These tools address the challenges of building real-world AI applications that can reliably execute complex tasks, giving developers more control, visibility, and scalability in their agent-based applications.
Extract Data From Any Website with a Single API Call 🌐
MultiOn has released a new tool for developers working with web data: the Retrieve API. This API lets you extract structured information from any website using simple natural language commands, eliminating the need for complex parsing or scraping scripts. It integrates seamlessly with MultiOn’s existing Agent API, so you can build truly autonomous web agents that can navigate pages and scrape information in just 3 lines of code.
Microsoft GraphRAG to Teach LLMs Beyond Keywords 📚
Traditional RAG has clear limits. It struggles to connect related facts scattered across a dataset and often fails to grasp the bigger picture within a large corpus of information. This is where Microsoft Research’s GraphRAG steps in.
Unlike traditional RAG, which retrieves isolated text chunks by keyword or vector-similarity matching, GraphRAG uses an LLM to build a knowledge graph from your text data, so the model can draw on relationships between entities that chunk-level matching alone misses.
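The difference is easy to see in a toy sketch. The snippet below is not Microsoft's GraphRAG code, which does considerably more. It is a minimal illustration under stated assumptions: the sentences and the (subject, relation, object) triples are hand-written stand-ins for what an LLM would extract, and `build_graph` and `find_path` are hypothetical helper names. The question "how is Moshi connected to Paris?" has no single sentence containing both words, but a walk over the graph links them.

```python
from collections import defaultdict, deque

# Toy corpus: no single sentence mentions both "Moshi" and "Paris".
SENTENCES = [
    "Kyutai developed Moshi.",
    "Moshi is a voice AI.",
    "Kyutai is based in Paris.",
]

# Hand-written triples standing in for LLM-based extraction (an assumption).
TRIPLES = [
    ("Kyutai", "developed", "Moshi"),
    ("Moshi", "is_a", "voice AI"),
    ("Kyutai", "based_in", "Paris"),
]

def build_graph(triples):
    """Build an adjacency list, adding an inverse edge for every triple."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
        graph[obj].append((f"inverse_{rel}", subj))
    return graph

def find_path(graph, start, goal):
    """Breadth-first search; returns a list of (node, relation, node) hops or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None

# Keyword matching: look for one sentence that contains both terms.
keyword_hits = [s for s in SENTENCES if "Moshi" in s and "Paris" in s]

# Graph traversal: connect the two entities through their shared neighbor.
graph = build_graph(TRIPLES)
path = find_path(graph, "Moshi", "Paris")

print("keyword hits:", keyword_hits)
print("graph path:", path)
```

The keyword search comes back empty, while the graph walk recovers the two-hop link through Kyutai, the kind of scattered-fact connection described above.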
Which of the above AI developments are you most excited about, and why?
Tell us in the comments below ⬇️
That’s all for today 👋
Stay tuned for another week of innovation and discovery as AI continues to evolve at a staggering pace. Don’t miss out on the developments – join us next week for more insights into the AI revolution!
Shubham Saboo - Twitter | LinkedIn ⎸ Unwind AI - Twitter | LinkedIn