Llama-3 405B Coming Next Week

PLUS: RAG to analyze 10 million words, Open framework for distributed LLM training

Shubham Saboo and Gargi Gupta

Jul 15, 2024

Today’s top AI Highlights:

Train AI models globally over a peer-to-peer network with this open-source framework
Build RAG apps that can process 10 million words
Meta is releasing Llama-3 405B on July 23
OpenAI is reportedly working on an AI project codenamed Strawberry

& so much more!

Read time: 3 mins

Latest Developments 🌍

Open Framework for Distributed Model Training 🌐

Training LLMs is a resource-intensive process, often requiring access to massive, centralized computing power. This has created a barrier to entry for many developers and researchers. Prime Intellect, a US-based startup that provides a platform for finding global compute resources, has released an open-source framework for enabling globally distributed LLM training.

OpenDiLoCo enables decentralized LLM training by using a peer-to-peer network for communication, even with limited internet bandwidth. It has been tested with models spread across two continents and three countries while maintaining 90-95% compute utilization.

Key Highlights:

How it works - OpenDiLoCo builds upon Google DeepMind’s Distributed Low-Communication (DiLoCo) method for data-parallel training of language models on islands of devices that are poorly connected. Instead of synchronizing gradients after every step, each worker trains locally and shares its model update only every 500 steps, drastically reducing the amount of data that needs to be transferred between machines.
DiLoCo Scaled - OpenDiLoCo successfully scales the DiLoCo method to handle models with up to 1.1 billion parameters, compared to the original 400 million parameters in DeepMind’s work.
Global Decentralized Training - OpenDiLoCo supports training across continents and countries, achieving 90-95% compute utilization. It uses a peer-to-peer network with Hivemind for inter-node communication and PyTorch FSDP for intra-node communication.
Fault Tolerance - OpenDiLoCo allows on/off ramping of resources and maintains training even if some devices become unavailable. Hivemind’s fault-tolerant training ensures that training continues smoothly.
Open Source - It is developed on the Hivemind library and the codebase is publicly accessible. Training can be carried out with as little as two GPUs, which don’t need to be co-located.
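
To make the "sync every 500 steps" idea concrete, here is a minimal toy sketch in plain Python. It is not OpenDiLoCo's code: the function names, the 1-D regression task, and the learning rates are all illustrative assumptions, and plain SGD stands in for both the inner and outer optimizers (the DiLoCo method itself uses AdamW locally and Nesterov momentum for the outer step).

```python
import random

def local_sgd_steps(w, data, steps, lr):
    """Run `steps` plain-SGD updates on a toy 1-D regression y = w * x."""
    for _ in range(steps):
        x, y = random.choice(data)
        grad = 2 * (w * x - y) * x  # derivative of (w*x - y)^2 w.r.t. w
        w -= lr * grad
    return w

def diloco_round(global_w, shards, inner_steps, inner_lr, outer_lr):
    """One outer round: every worker trains alone, then a single sync happens."""
    local_ws = [
        local_sgd_steps(global_w, shard, inner_steps, inner_lr)
        for shard in shards
    ]
    # Pseudo-gradient: average displacement of the workers from the shared weight.
    pseudo_grad = sum(global_w - w for w in local_ws) / len(local_ws)
    # Outer step (assumption: plain SGD here; outer_lr=1.0 is simple averaging).
    return global_w - outer_lr * pseudo_grad

random.seed(0)
TRUE_W = 3.0
# Four hypothetical workers, each holding its own shard of noise-free data.
shards = [
    [(x, TRUE_W * x) for x in (random.uniform(-1, 1) for _ in range(64))]
    for _ in range(4)
]

INNER_STEPS = 500  # local steps between syncs, as in the article
OUTER_ROUNDS = 5

w = 0.0
for _ in range(OUTER_ROUNDS):
    w = diloco_round(w, shards, INNER_STEPS, inner_lr=0.05, outer_lr=1.0)

total_steps = OUTER_ROUNDS * INNER_STEPS
sync_rounds_every_step = total_steps            # classic data-parallel training
sync_rounds_diloco = total_steps // INNER_STEPS  # one sync per outer round
print(f"learned w = {w:.4f}")
print(f"syncs: {sync_rounds_every_step} (every step) vs {sync_rounds_diloco} (DiLoCo-style)")
```

In this toy run the workers communicate 5 times instead of 2,500 times for the same number of training steps, which is where the bandwidth savings come from.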

Chat Apps with Built-in RAG and Explainable AI 🧠

Writer, an AI startup that provides a full-stack generative AI platform, released its AI Studio last month. With this Studio, enterprises can quickly build and deploy a wide range of AI apps, including text generation apps, chat apps, and more, all without any coding.

They have now released some major upgrades to the platform. These include built-in RAG to analyze up to 10 million words, explainable AI features, dedicated modes, voice rewrites, custom instructions, and more to make it more powerful and easier to use.

Key Highlights:

10 Million-Word Analysis - With chat apps, you can now upload files with up to 10 million words. Writer uses graph-based RAG where uploaded files are broken down into data points and their semantic relationships are mapped in a graph structure. When you ask questions, the AI retrieves relevant data points from the graph and uses them to generate accurate and contextually relevant responses.
Explainable AI for Transparency - Writer’s chat apps now feature “thought process” functionality. The AI breaks down complex questions into sub-questions, providing answers and citing sources for each. This transparency can help you refine your prompts for better results.
Dedicated Modes for Specific Tasks - Recognizing that a one-size-fits-all approach isn’t optimal, Writer introduces “modes” – dedicated user experiences tailored for different tasks. You can choose between:
1. General mode for on-the-fly assistance on general knowledge,
2. Document mode for deep dives into a few specific files, and
3. Knowledge Graph mode for accessing company knowledge bases and getting answers based on large corpora of data.
Customization - Additional features enhance customization and user experience. These include voice profile rewriting, custom instructions that apply automatically to outputs, and dictation capabilities.
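
The graph-based retrieval described above can be sketched roughly as follows. This is a toy illustration, not Writer's implementation: the triples, the function names `build_graph` and `retrieve`, and the keyword-matching step are all assumptions made for the example (a real system would use learned entity extraction and embeddings).

```python
from collections import defaultdict

# Hypothetical data points extracted from uploaded files: (subject, relation, object).
TRIPLES = [
    ("OpenDiLoCo", "built_on", "Hivemind"),
    ("OpenDiLoCo", "released_by", "Prime Intellect"),
    ("Hivemind", "provides", "fault-tolerant training"),
    ("Llama-3 405B", "released_by", "Meta"),
]

def build_graph(triples):
    """Map each entity to its outgoing and (marked) incoming relationships."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
        graph[obj].append((f"inverse:{rel}", subj))
    return graph

def retrieve(graph, question, hops=1):
    """Collect facts about entities named in the question, plus `hops` steps of neighbors."""
    q = question.lower()
    frontier = {node for node in graph if node.lower() in q}
    facts, seen = [], set()
    for _ in range(hops + 1):
        next_frontier = set()
        for node in sorted(frontier - seen):
            seen.add(node)
            for rel, other in graph[node]:
                if not rel.startswith("inverse:"):
                    facts.append(f"{node} {rel} {other}")
                next_frontier.add(other)
        frontier = next_frontier
    return facts

graph = build_graph(TRIPLES)
question = "What is OpenDiLoCo built on?"
facts = retrieve(graph, question)

# The retrieved facts would be placed in the prompt sent to the language model.
prompt = "Answer using only these facts:\n" + "\n".join(facts) + f"\n\nQuestion: {question}"
print(prompt)
```

Note how the one-hop expansion pulls in the related Hivemind fact even though the question never mentions fault tolerance, while the unrelated Llama-3 fact stays out of the prompt.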

Quick Bites 🤌

Meta is planning to release the largest Llama-3 model with 405B parameters on July 23. As Meta had announced earlier, Llama-3 405B will be multimodal, multilingual, and have a large context window. Its performance had reached GPT-4 level while it was still in training. It’d be the most capable open-source model to date! (Source)
Some updates from OpenAI
1. OpenAI is reportedly working on an AI project codenamed Strawberry. Strawberry models can not just generate answers to queries but also plan ahead enough to navigate the internet autonomously and reliably to perform what OpenAI terms “deep research.” (Source)
2. OpenAI has come up with a set of five levels to track its progress toward building AGI capable of outperforming humans. These tiers range from AI with conversational abilities to AI that can do the work of an organization. OpenAI claims that it is currently at Level 1 and is on the cusp of reaching Level 2. (Source)
3. OpenAI landed in controversy in May when employees called out the company for making them sign NDAs with lifetime nondisclosure and non-disparagement clauses. The whistleblowers have asked the US Securities and Exchange Commission (SEC) to investigate these NDAs, claiming that they violate SEC rules by waiving employees’ federal whistleblower rights. (Source)
A bipartisan group of senators introduced the COPIED Act to protect artists, songwriters, and journalists from having their work used for AI training without consent. The bill requires AI companies to provide content provenance information so the creators can control and set terms for their work. It also mandates guidelines to identify AI-generated and altered content. (Source)

😍 Enjoying it so far? Share it with your friends!

Tools of the Trade ⚒️

Highlight: Access ChatGPT, Claude 3.5, and Perplexity in one single interface, while staying in the flow. It appears on any website or app with just a shortcut (Command +.). You can type in your query, take a screenshot and paste it, or just ask it to take a screenshot of what you’re looking at.
Corgi: The first AI insurance company for small/mid-sized businesses to seamlessly buy and manage customized insurance policies tailored to their needs. It uses AI to handle applications, provide clear policy explanations, and offer instant support.
Tribe AI: Low-code tool to build and manage multi-agent teams to handle complex tasks efficiently. It lets you customize agent roles, maintain persistent conversations, monitor performance in real time, and easily deploy using Docker.
Awesome LLM Apps: Build awesome LLM apps using RAG for interacting with data sources like GitHub, Gmail, PDFs, and YouTube videos through simple texts. These apps will let you retrieve information, engage in chat, and extract insights directly from content on these platforms.

Hot Takes 🔥

One more reason to pick an AI President - at a minimum it can’t be assassinated ~ Bindu Reddy
AI will increase the need for human intelligence, not decrease it. ~ Pedro Domingos

Meme of the Day 🤡

r/ProgrammerHumor - softwareArchitectsAreTheRootOfAllEvil — Source

That’s all for today! See you tomorrow with more such AI-filled content.

Real-time AI Updates 🚨

⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!

PS: I curate this AI newsletter every day for FREE, and your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!