App Store for GPTs Coming Next Week 📱
PLUS: From Audio to Full-Body Expressive Avatars, Prompting Principles to Boost LLM Performance by 50%
Today’s top AI Highlights:
OpenAI’s app store for GPTs launching next week
Photorealistic Expressive Full Body Avatars from Audio Input Only
26 Prompting Principles to Boost LLM Performance
AI Tools for Customized Wallpapers, RAG Integration, Workflow Optimization and Personal Interview Coaching
& so much more!
Read time: 3 mins
Latest Developments 🌍
App Store for GPTs 📱
OpenAI launched GPTs at its Dev Day, letting individuals create their own custom GPTs using their own data. Soon after, the internet was flooded with thousands of GPTs built for different use cases.
OpenAI has now announced that it is launching an app store for GPTs next week, which will let creators publish their GPTs and earn real money. It is still unclear, however, whether the GPT Store will launch with any kind of revenue-sharing scheme.
Full-Body Avatars with Expressions from Voice Alone 🕺
Full-body photorealistic avatars created from voice alone! Meta and UC Berkeley have released a method to create full-body, photorealistic avatars that respond with realistic gestures and expressions during conversations, using only speech audio as input. The method synthesizes detailed facial expressions, body movements, and hand gestures in sync with the audio, enhancing the realism and responsiveness of digital avatars.
Key Highlights:
The study introduces a motion model with three parts: a face motion model, a guide pose predictor, and a body motion model. Facial movements come from a conditional diffusion model driven by audio, while detailed body motion comes from a two-stage process combining vector quantization and diffusion (see the sketch after this list).
The research highlights the critical role of photorealism in avatar generation. It shows that photorealistic avatars are essential for accurately portraying subtle conversational gestures, with studies indicating a clear preference for these over mesh-based avatars. This emphasizes the limitations of non-textured meshes in capturing detailed gestures.
The research is supported by a novel, extensive dataset of dyadic conversations, captured in a multi-view setup to allow for precise tracking and photorealistic 3D reconstruction of participants. Covering a wide range of topics and emotional expressions, this dataset provides a robust foundation for the development and testing of the motion model.
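To make the three-part pipeline concrete, here's a minimal PyTorch-style sketch of how the pieces could fit together. This is our own illustration with hypothetical module interfaces (the `.sample(...)` calls and class names are assumptions), not the authors' actual code:

```python
import torch

class AudioToAvatarMotion(torch.nn.Module):
    """Hypothetical wrapper around the three components described above.
    Module interfaces are illustrative, not the paper's code."""

    def __init__(self, face_model, guide_pose_predictor, body_model):
        super().__init__()
        self.face_model = face_model                        # conditional diffusion for facial motion
        self.guide_pose_predictor = guide_pose_predictor    # VQ model: audio -> coarse guide poses
        self.body_model = body_model                        # diffusion model for detailed body motion

    def forward(self, audio_features: torch.Tensor) -> dict:
        # 1. Facial expressions: diffusion conditioned directly on the speech audio.
        face_motion = self.face_model.sample(cond=audio_features)
        # 2. Coarse guide poses: predict discrete (vector-quantized) pose tokens
        #    from audio, giving a low-frequency skeleton of the body movement.
        guide_poses = self.guide_pose_predictor.sample(cond=audio_features)
        # 3. Detailed body and hand motion: diffusion that fills in high-frequency
        #    motion, conditioned on both the audio and the guide poses.
        body_motion = self.body_model.sample(cond=(audio_features, guide_poses))
        return {"face": face_motion, "body": body_motion}
```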
The Art of Prompting for Boosting LLM Responses 🚀
Prompt engineering is a skill often overlooked, yet it holds the key to unlocking the true potential of LLMs. Recognizing this, a recent study has introduced a comprehensive set of 26 guiding principles specifically designed to improve how we prompt LLMs. And these are not just theoretical: applying them yields an average 50% improvement in response quality across different LLMs.
Key Highlights:
The study introduces 26 detailed principles for prompt engineering, categorized into five distinct groups. These groups cover aspects like Prompt Structure and Clarity, Specificity and Information, and Complex Tasks and Coding Prompts. The principles are designed to address a wide range of scenarios and user interactions with LLMs, aiming to maximize the efficiency and relevance of the models' responses.
The effectiveness of these principles was tested on different scales of LLMs, including small-scale (7B models), medium-scale (13B), and large-scale (70B, GPT-3.5/4). Two key metrics were used for evaluation: Boosting, which measures the enhancement in response quality, and Correctness, focusing on the accuracy of the responses.
These principles resulted in gains of 57.7% in quality and 67.3% in accuracy for GPT-4, and the improvements grow with model scale, exceeding 40% when moving from LLaMA-2-7B to GPT-4. Beyond boosting model performance, the principles also help users: clearer, more structured prompts make it easier to understand the capabilities and limitations of these models (see the illustrative example below).
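As an illustration, here's what applying a few of the principles (structured delimiters, audience specification, step-by-step decomposition) might look like in practice. The wording below paraphrases the paper's categories rather than quoting its exact principles:

```python
# A vague prompt vs. one restructured using a few of the study's principles.
vague_prompt = "Explain how transformers work."

structured_prompt = """###Instruction###
Explain how the transformer architecture works.

###Audience###
A software engineer who is new to machine learning.

###Steps###
Think through this step by step:
1. Describe input embeddings and positional encoding.
2. Explain self-attention and why it replaces recurrence.
3. Summarize how attention and feed-forward blocks stack into layers.
"""
```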
Tools of the Trade ⚒️
RAGatouille: A tool designed to integrate SOTA retrieval methods (like ColBERT) into any RAG pipeline, with a focus on ease of use and modularity. It aims to bridge the gap between complex information-retrieval research and practical RAG pipelines (see the quickstart sketch after this list).
Dashy: An all-in-one app that centralizes tools, notifications, and data into a customizable dashboard, enhancing productivity and workflow efficiency with over 40 specialized widgets for various professional and personal needs.
Echo AI: Your personal interview prep and coaching assistant that helps improve interview skills, focusing on behavioral questions with over 50 practice queries. You can record your answer, have it transcribed to text, and receive feedback and a grade. It syncs with iCloud for seamless practice across devices.
Wally: AI-powered tool that allows users to create unique, shareable wallpapers by selecting a subject, style, and color tints, featuring a range of artistic styles and an easy-to-use interface.
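For RAGatouille, usage looks roughly like the quickstart below, based on the project's README at the time of writing; check the repo for the current API before relying on it:

```python
from ragatouille import RAGPretrainedModel

# Load a pretrained ColBERT checkpoint as the retriever.
RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")

# Build an index over your documents (plain strings).
RAG.index(
    collection=["Your first document...", "Your second document..."],
    index_name="my_docs",
)

# Retrieve the top-k passages for a query, ready to drop into a RAG pipeline.
results = RAG.search(query="What do these documents say about X?", k=3)
for r in results:
    print(r["score"], r["content"])
```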
😍 Enjoying so far? TWEET NOW to share with your friends!
Hot Takes 🔥
What happened to architecture search? It had so much potential to help AI self improve but afaics nobody is using a model found by it. Maybe it'll get unlocked at another scale since it's just so expensive? ~ Richard Socher
I don't know who needs to hear this, but getting an .ai domain will not turn you into an "actual AI company". ~ Bojan Tunguz
Meme of the Day 🤡
That’s all for today!
See you tomorrow with more such AI-filled content. Don’t forget to subscribe and give your feedback below 👇
Real-time AI Updates 🚨
⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!!
PS: I curate this AI newsletter every day for FREE, your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!