Claude Projects Elevate AI Workflows
PLUS: New protein-generating AI model, 240T tokens data corpus freely available
Today’s top AI Highlights:
Anthropic introduces Projects in Claude.ai to organize your chats and tasks
EvolutionaryScale emerges from stealth, releases a protein-generating AI model
A new data corpus with 240T tokens is made freely available for LLM training
Waymo opens its driverless robotaxi service in San Francisco for everyone
& so much more!
Read time: 3 mins
Latest Developments 🌍
New Way for Teams to Work Together with AI 🤝
Anthropic has introduced a new feature to Claude.ai for users to organize their chats into Projects. This feature enables teams to collaborate more effectively with Claude by bringing together relevant information, chat activity, and insights in one centralized location.
Projects offer a shared workspace for teams to work on specific tasks or projects. Each Project has a context window of 200k tokens so users can add all of the relevant documents, code, and insights to enhance Claude’s effectiveness.
Key Highlights:
Added Context - Projects provide a dedicated space to upload documents, code, and relevant information, so Claude can access and understand the context of the task at hand. This helps Claude provide more accurate and tailored responses.
Project Sharing - Claude Team users can also share snapshots of their conversations with Claude in their team’s shared project activity feed. The feed lets teammates see different ways of working with Claude and helps the whole team improve its skills with AI.
Availability - Projects are available on Claude.ai for all Pro and Team customers, and can be powered by Anthropic’s latest model Claude 3.5 Sonnet.
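To make the 200k-token context window concrete, here is a minimal sketch for checking whether a set of documents is likely to fit in one Project. The 4-characters-per-token ratio is an assumed rule of thumb, not Claude's actual tokenizer, and the file names and helper functions are hypothetical.

```python
# Rough token-budget check for a Project's 200k-token context window.
# ASSUMPTION: ~4 characters per token is a crude heuristic, not Claude's tokenizer.
CONTEXT_WINDOW_TOKENS = 200_000  # figure quoted in the article
CHARS_PER_TOKEN = 4              # assumed heuristic


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_project(documents: dict[str, str]) -> tuple[int, bool]:
    """Return the estimated total tokens and whether they fit in the window."""
    total = sum(estimate_tokens(body) for body in documents.values())
    return total, total <= CONTEXT_WINDOW_TOKENS


# Hypothetical documents, represented here by placeholder text of a given size.
docs = {
    "design_notes.md": "x" * 120_000,
    "api_client.py": "y" * 40_000,
}
total, ok = fits_in_project(docs)
print(f"estimated tokens: {total}, fits: {ok}")
```

Real token counts vary with language and content, so treat the estimate as a first-pass check only.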
New Protein-Generating AI Model Goes Beyond AlphaFold 🧬
EvolutionaryScale, a new startup backed by Amazon and Nvidia, has just emerged from stealth mode with a $142 million seed round. The company is focused on building AI models capable of generating novel proteins for scientific research and has released its latest creation: ESM3, a 98B-parameter protein-generating AI model. ESM3 is described as a “frontier model” for biology, capable of reasoning over the sequence, structure, and function of proteins.
Google DeepMind’s AlphaFold primarily focuses on predicting protein structure from sequence. ESM3 goes beyond that to reason over sequence, structure, and function, making it capable of generating novel proteins. This new model is a significant step forward in protein engineering.
Key Highlights:
Generative Capabilities: ESM3 is a generative model, meaning it can create new proteins from scratch. It can even follow prompts, allowing scientists to guide the model to generate proteins for specific applications.
Reasoning: Unlike previous models, ESM3 can understand the relationship between a protein’s sequence, its three-dimensional structure, and its intended function. This allows it to generate proteins with specific properties.
Simulating Evolution: By training on a massive dataset of proteins, ESM3 has learned to mimic the process of evolution. It can generate proteins equivalent to hundreds of millions of years of natural evolution.
Massive Scale: ESM3 is one of the largest AI models ever trained for biological applications. Its largest version has 98 billion parameters and was trained on a dataset of 2.78 billion proteins using over 1×10^24 FLOPs (total floating-point operations).
Open Access: EvolutionaryScale is committed to open science and has released a smaller version of ESM3 for offline use, and a larger version for non-commercial use through its Forge platform.
The Next-gen of Training Sets for Language Models ⚙️
A new dataset pool called DCLM-POOL, containing 240 trillion tokens extracted from Common Crawl, has been made freely available by a group of research institutions. It is part of DataComp for Language Models (DCLM), a standardized platform for dataset experiments that lets researchers compare data curation techniques and their impact on model performance. Alongside the corpus, DCLM offers open-source software for processing large datasets and a multi-scale experimental design that accommodates researchers with varying compute budgets.
Key Highlights:
Largest Public Corpus - DCLM-POOL is the largest public corpus ever assembled for LLM training. Its sheer size provides an unparalleled resource for exploring different data curation methods.
Data Curation - DCLM-POOL is designed for research into data curation techniques. Researchers can utilize it to test and compare various filtering methods, and identify the most effective approaches for building high-quality language models.
Training Recipes - DCLM also provides standardized training recipes and an evaluation suite to track model performance on different datasets. Researchers can directly compare curation techniques based on the resulting model's performance on downstream tasks.
DCLM-BASELINE - This is a new state-of-the-art public training set. A 7B parameter language model trained from scratch on 2.6 trillion tokens of it reaches 64% accuracy on MMLU, surpassing previous open-data models and performing comparably to models trained with significantly more compute.
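As a rough illustration of what comparing curation techniques means, the sketch below applies two filters to the same tiny corpus and reports what each one keeps. The corpus and both filter rules are hypothetical and are not part of the DCLM tooling. DCLM itself judges filters by the downstream performance of models trained on the filtered data, a step this sketch leaves out.

```python
# Toy comparison of two data-curation filters on a tiny in-memory corpus.
# ASSUMPTION: the corpus and both filters are hypothetical illustrations,
# not the DCLM tooling or its actual filtering rules.
from typing import Callable

corpus = [
    "The mitochondria is the powerhouse of the cell and produces ATP.",
    "click here click here click here",
    "ok",
    "Gradient descent updates parameters in the direction of steepest descent.",
]


def min_length_filter(doc: str) -> bool:
    """Keep documents with at least five whitespace-separated words."""
    return len(doc.split()) >= 5


def low_repetition_filter(doc: str) -> bool:
    """Keep documents where at least half of the words are distinct."""
    words = doc.lower().split()
    return bool(words) and len(set(words)) / len(words) >= 0.5


def retained(docs: list[str], keep: Callable[[str], bool]) -> list[str]:
    """Return the documents that pass the given filter."""
    return [d for d in docs if keep(d)]


filters = [("min_length", min_length_filter), ("low_repetition", low_repetition_filter)]
for name, keep in filters:
    kept = retained(corpus, keep)
    print(f"{name}: kept {len(kept)} of {len(corpus)} documents")
```

The two filters keep the same number of documents here but different ones, so a retention count alone cannot rank them. Ranking them requires training and evaluating a model on each filtered set.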
Waymo’s Driverless Cars Available to Everyone in San Francisco 🚖
Waymo One, the driverless ride-hailing service from Google’s parent company Alphabet, is now open to everyone in San Francisco. No longer limited to a waitlist, anyone can download the app and request a ride in one of the company’s autonomous vehicles. The service is operating 24/7 across the city.
Waymo rolled out fully autonomous rides in the area in late 2022, and the service has been used by nearly 300,000 people since the company first opened a waitlist in San Francisco. Waymo claims its autonomous vehicles have been involved in fewer crashes than human drivers and have helped curb carbon emissions by an estimated 570,000 kg since the beginning of its commercial operations in August 2023.
Tools of the Trade ⚒️
Ozone: Create and edit videos using AI-powered features like text-to-image and video, cloud-based solutions for seamless editing, and real-time collaboration capabilities. Auto captioning, auto animation, keyframing, and other tools are available to streamline the editing process.
InnerWallet: Chat with your finances using AI to get budget suggestions, track spending, and manage your accounts. It helps you see your net worth, set financial goals, and receive tailored insights to improve your financial health.
Crawl4AI: A free, open-source tool for web crawling and data extraction, providing outputs in formats suitable for LLMs. It supports multiple URLs, extracts media and links, gathers metadata, customizes user-agent, and executes custom JavaScripts, making it ideal for AI applications.
Awesome LLM Apps: Build awesome LLM apps using RAG for interacting with data sources like GitHub, Gmail, PDFs, and YouTube videos through simple texts. These apps will let you retrieve information, engage in chat, and extract insights directly from content on these platforms.
Hot Takes 🔥
We'll have AGI long before we figure out how the brain works. The brain is just very, very complicated. In fact, developing AGI is likely a prerequisite in order to make progress on understanding the brain. ~ François Chollet
After over a year of effort and millions spent on GPU rentals, no community-finetuned LLM has emerged as a go-to for important tasks. Everyone relies on models pretrained and instruct-finetuned by big tech. The open LLM leaderboard on Hugging Face has stagnated, unchanged for months, with little interest. Enthusiasts, find other pursuits.
This is a dead end. Creativity, not GPU hours, is key to better AI. ~ Andriy Burkov
The rumor seems to confirm that Sam Altman and OpenAi are building their own OS and communication tool. The evidence is mounting. ~ Chubby
That’s all for today! See you tomorrow with more such AI-filled content.
Real-time AI Updates 🚨
⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!
PS: I curate this AI newsletter every day for FREE, and your support is what keeps me going. If you find value in what you read, share it with your friends!