Reka AI Unveils Multimodal Models Rivaling GPT-4 and Claude
PLUS: Amazon Music’s AI playlist, OpenAI API for bulk data processing
Today’s top AI Highlights:
Amazon Music, too, tests the AI-generated playlist feature - Maestro
PyTorch introduces Torchtune to simplifying LLM fine-tuning
Reka releases three multimodal models with state-of-the-art video understanding
OpenAI releases a new API for bulk AI processing (with a 50% discount!)
Build production-ready conversational AI applications in minutes
& so much more!
Read time: 3 mins
Latest Developments 🌍
Create Playlists with Just Simple Words or Emojis 🎧
Just like Spotify’s AI playlist that was released last week, Amazon Music is also now testing its AI-generated playlist feature, Maestro. You can generate a full playlist with simple text prompts including text or even emojis. It is currently in beta testing and available only in the U.S.
Key Highlights:
Prompts: You can give your playlist prompts by talking or typing. The prompts can be specific moods or as random as activities, sounds, or emotions. For instance, “😭 and eating 🍝”, or “🎤🚿🧼”, or “Myspace era hip-hop”, and the playlist will appear within seconds.
Availability: Prime members and users on the ad-supported version of Amazon Music can listen to 30-second previews of songs in the playlists, whereas subscribers can listen to full playlists immediately, save them, and share them.
Beta Limitations: Since Maestro is in beta, the AI might not always correctly interpret the prompts on the first try.
Gemini Models Get a New Competitor 🦾
Reka has introduced its largest and most capable multimodal model, Reka Core, which is making waves with its impressive demo. Along with Core, Reka has also released two more models: Reka Edge and Flash. All three models can process text, images, video, and audio, offering flexible solutions for different computational needs. With competitive performance against industry-leading multimodal models, Reka Core is one of only two commercially available comprehensive multimodal solutions.
Key Highlights:
Architecture: Reka Core is a transformer model and features a context length of up to 128K tokens. It is trained on a mix of publicly available and proprietary data, including text, images, videos, and audio clips.
Capabilities: Core has superb reasoning abilities (including language and math), is a top-tier code generator, and is fluent in English and several Asian and European languages.
Performance: Core performs at par with industry-leading models like GPT-4, Claude-3 models, and Gemini models, showing competitive performance on benchmarks like MMLU, maths, coding, reasoning, knowledge, and image understanding.
Video understanding: Specifically in video question answering, Reka Core and Flash excel by outperforming Google’s Gemini Ultra.
Edge and Flash: Reka Edge and Flash deliver top-tier performance without the heft of larger models. Edge, with 7B parameters, and Flash with 21B, trained on 5 trillion tokens, both manage multimodal tasks effectively across all modalities.
We tried the Reka models in the Reka playground, they currently process only videos up to 30MB and 1 minute. However, the speed of processing and accuracy of understanding is impressive!
Quick Updates from OpenAI for Developers 🤌
OpenAI has launched a new Batch API that allows users to process large-scale AI tasks more cost-effectively by utilizing computational resources during off-peak times.
Bulk Processing Capability: The Batch API allows users to upload files containing multiple queries, such as data categorization or image tagging.
Asynchronous Tasks: These tasks are put into a queue and handled when resources are available. Users will receive the outcomes within a 24-hour.
50% Off-Peak Discount: Users can benefit from a 50% reduction in API costs when they opt to run tasks during times of lower demand.
OpenAI has introduced Projects within its API dashboard, to offer enterprises a more granular way to manage their various initiatives.
Access and Management: Only Owners can create a project, control access, manage members, set specific API keys, and define precise usage limits for individual projects.
Customizable API Keys and Limits: Each project can have its own set of API keys with tailored permissions and rate limits.
Enhanced Management: Organizations can now create up to 1000 distinct projects and manage these operations under a single umbrella.
PyTorch’s Library for End-to-End Fine-tuning Workflow 👩🔧
PyTorch has released “torchtune”, a new library to fine-tune opensource LLMs directly within the PyTorch ecosystem. The alpha release of torchtune is designed to make powerful AI tools more accessible, ensuring that even those with modest resources can engage with advanced language models. It can function seamlessly on a single 24GB GPU, significantly democratizing fine-tuning and broadening the user base.
Key Highlights:
Full Workflow: Torchtune handles the entire fine-tuning process from data preparation to local inference, including crucial steps like model customization and evaluation.
Integrations: The library is fully integrated with essential AI tools such as Hugging Face Hub for models and datasets, PyTorch FSDP for training scalability, Weights & Biases for performance tracking, and torchao for post-training model quantization.
Evaluation and Quantization: Torchtune supports detailed model evaluation using the LM Evaluation Harness by EleutherAI, and enables efficient model quantization to optimize deployment.
😍 Enjoying so far, share it with your friends!
Tools of the Trade ⚒️
Studia AI: Generate personalized video courses on any topic using a simple input prompt. Just tell it the topic and set the number of modules you want, and in a few minutes, it’ll generate the lessons in text as well as videos.
Popstarz: Sing Like a Popstar With AI. It clones your voice in a minute and then creates songs that you like with your voice. It has a vast karaoke catalog of song tracks, ready for your AI voice model covers, including many genres from pop, K-pop, country, rap, and more.
PocketPod: Creates AI-generated podcasts tailored to your personal interests, providing everything from daily news updates to in-depth discussions on specific topics. By leveraging advanced language models and text-to-speech models, it offers a cost-effective way to consume personalized audio content.
Chainlit: An open-source Python framework to quickly build scalable conversational AI applications. With Chainlit, you can create ChatGPT-like applications, embedded chatbots, software copilots, and customize your own agentic experiences with accessible API endpoints.
Hot Takes 🔥
Vinod Khosla’s vision for 2035-2049
Meme of the Day 🤡
That’s all for today! See you tomorrow with more such AI-filled content.
Real-time AI Updates 🚨
⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!
PS: I curate this AI newsletter every day for FREE, your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!