OpenAI's New Breakthrough Project Q* 💫

PLUS: ChatGPT can now Speak, Text-to-Video by Stability AI, Inflection-2: Second Best LLM

Shubham Saboo

Nov 23, 2023

Today’s top AI Highlights:

You can now Literally Talk to ChatGPT
Generative AI Video Model by Stability AI
System 2 Attention by Meta to Reduce LLM Biases
Inflection AI’s New LLM - Second Best LLM Behind GPT-4
OpenAI’s Leap Towards AGI with Project Q*, and a New Board

& so much more!

Read time: 3 mins

Latest Developments 🌍

ChatGPT can now Speak 🗣️

The latest ChatGPT update moves it beyond text-only interaction. ChatGPT can now reply in its own voice as well as in text, so you can start a spoken conversation with your AI assistant anytime. Voice is available to all users on the ChatGPT app.

Stability AI’s Model Surpasses Runway and Pika 🐱

Stability AI has launched and open-sourced its first generative video foundation model, Stable Video Diffusion, built on the Stable Diffusion image model. The model can generate videos from text and image prompts with a focus on high-resolution output, and it was preferred over both Runway ML’s and Pika Labs’ models in user preference studies.

Key Highlights:

Stable Video Diffusion Model: It introduces two image-to-video models capable of generating 14 and 25 frames, with frame rates adjustable between 3 and 30 frames per second. These models have been externally evaluated and preferred over leading closed models in user preference studies.
Dataset and Training: The model utilizes a large, meticulously curated video dataset comprising roughly 600 million samples to train a strong base model. This foundation offers a general motion representation, crucial for high-resolution text-to-video and image-to-video tasks.
Adaptability and Applications: It is designed to be versatile and suits tasks like multi-view synthesis from a single image. The model's focus on high-resolution video synthesis and its adaptability to finetuning tasks such as frame interpolation and multi-view generation suggest a promising trajectory for broader applications in video AI.
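To put the frame counts above in perspective, here is a quick back-of-the-envelope sketch of how long a generated clip plays at a given frame rate. The function name clip_seconds is made up for this illustration and is not part of any Stability AI API. It only divides frames by frames per second.

```python
def clip_seconds(num_frames: int, fps: float) -> float:
    """Playback length in seconds of a clip with num_frames shown at fps."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# The two frame counts and the frame-rate limits quoted in the highlights.
for frames in (14, 25):
    for fps in (3, 30):
        print(f"{frames} frames at {fps} fps: {clip_seconds(frames, fps):.2f} s")
```

So the 14-frame model yields clips of roughly 0.5 to 4.7 seconds, and the 25-frame model roughly 0.8 to 8.3 seconds, depending on the frame rate you pick.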

Focusing on What Matters 🎯

LLMs are prone to errors when the input contains irrelevant context or opinions that invite sycophancy. To address this, researchers at Meta have introduced System 2 Attention (S2A), a prompting technique that lets the LLM itself decide which parts of the input deserve attention. S2A generates responses by attending only to the relevant parts of the input context, aiming to improve the accuracy and objectivity of LLMs.

Key Highlights:

S2A addresses the limitations of soft attention in Transformer-based LLMs. By leveraging LLMs' ability to reason and follow instructions, S2A regenerates the input context to include only pertinent information, thereby improving the final response's relevance and accuracy.
The implementation of S2A has shown better results in diverse settings, significantly increasing factuality in questions containing opinions, enhancing objectivity in argument generation, and improving accuracy in math problems, especially when dealing with irrelevant or biased context.
While S2A marks a notable advancement in LLM attention mechanisms, it also presents challenges, such as increased computational demands and dependency on the choice of prompts.
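The two-step idea described above (regenerate the context, then answer from the regenerated context only) can be sketched roughly as follows. This is a minimal illustration, not Meta's implementation. The prompt wording, the s2a_answer function, and the toy_llm stand-in are all assumptions made for this example. In practice you would pass in a function that calls a real model.

```python
import re
from typing import Callable

# Hypothetical prompt wording written for this sketch (not the paper's prompts).
REWRITE_PROMPT = (
    "Rewrite the text below so that it keeps only the information needed to "
    "answer the question, and drop opinions or irrelevant details.\n\n"
    "Text: {context}\n\nRewritten text:"
)
ANSWER_PROMPT = "Context: {context}\n\nAnswer the question in the context above."


def s2a_answer(llm: Callable[[str], str], context: str) -> str:
    """Two LLM calls: regenerate the context, then answer from it alone."""
    cleaned = llm(REWRITE_PROMPT.format(context=context))  # step 1
    return llm(ANSWER_PROMPT.format(context=cleaned))      # step 2


def toy_llm(prompt: str) -> str:
    """Offline stand-in for a model; it is deliberately sycophantic."""
    if prompt.startswith("Rewrite"):
        text = prompt.split("Text: ", 1)[1].rsplit("\n\nRewritten text:", 1)[0]
        sentences = re.split(r"(?<=[.?!])\s+", text)
        # Pretend the model drops opinion sentences when rewriting.
        return " ".join(s for s in sentences if not s.startswith("I think"))
    # Pretend the model echoes the user's opinion whenever it sees one.
    return "7" if "I think the answer is 7" in prompt else "5"


CONTEXT = "I think the answer is 7. What is 2 + 3?"

if __name__ == "__main__":
    print("baseline:", toy_llm(ANSWER_PROMPT.format(context=CONTEXT)))
    print("with S2A:", s2a_answer(toy_llm, CONTEXT))
```

The sketch also makes the trade-offs visible: every question costs two model calls instead of one, and the result depends heavily on how the rewrite prompt is worded.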

The Second Best LLM Right Behind GPT-4 🥈

Inflection AI has released Inflection-2, claiming it to be the best model in its compute class and the second most capable LLM globally, running right behind GPT-4. It exceeds Inflection-1 in factual knowledge, stylistic control, and reasoning abilities. The model will soon replace Inflection-1 to be the backbone of Pi.

Key Highlights:

Inflection-2 was benchmarked against leading models like LLaMA-2, Grok-1, PaLM-2, Claude-2, and GPT-4. It outperformed its peers in many areas, including MMLU, where it ranked second only to GPT-4. The model also performed well across benchmarks ranging from common sense to scientific question answering, including HellaSwag and TriviaQA.
Although not a primary focus, Inflection-2 showed strong capabilities in code and math reasoning benchmarks, outperforming all peers except GPT-4. It can be further fine-tuned for enhanced coding abilities.

OpenAI has a New Board 🧑🏻‍💼

The past few days at OpenAI have been nothing short of a real-life soap opera, ever since the board abruptly fired Sam Altman. After much speculation and anticipation, Sam Altman has returned to OpenAI, and the entire board except for Adam D’Angelo has resigned.

OpenAI has formed a new “initial” board of directors: Bret Taylor (former Chairman of Twitter and former Co-CEO of Salesforce) as Chair, Larry Summers (former U.S. Secretary of the Treasury), and Adam D’Angelo. Ilya Sutskever has lost influence following these leadership changes.

New details have also surfaced: several researchers at OpenAI raised alarms about a potentially threatening AI discovery known as Project Q*, warning the board about its dangers, which could be the reason behind the firing. Project Q*, acknowledged internally at OpenAI, is believed to be a significant breakthrough in the pursuit of AGI, capable of solving problems such as grade-school math with a level of accuracy comparable to human intelligence. The researchers' letter to the board highlighted the potential risks of this powerful algorithm to humanity.

Tools of the Trade ⚒️

screenshot-to-code: Converts a screenshot to HTML, Tailwind CSS or JS code. It uses GPT-4 Vision to generate the code and DALL-E 3 to generate similar-looking images.

RAGs by LlamaIndex: A Streamlit app for creating and customizing a RAG (Retrieval-Augmented Generation) pipeline over personal data, using natural language. This lets you set up a "ChatGPT over your data" without any coding.
Klu: A next-generation platform designed to build, evaluate, and optimize applications powered by LLMs like GPT-4. It serves as an all-in-one platform for experimenting, versioning, collaborating, prompt engineering, and fine-tuning such applications.
What should I watch? GPT: Find movies and TV shows to watch based on your taste and preferences.

😍 Enjoying it so far? TWEET NOW to share with your friends!

Hot Takes 🔥

Emmett Shear should next broker peace between Israel and Palestine. His new found skills must be put to good use. ~ immad
AI will be an order of magnitude more important for jobs over the next generation than any changes in trade brought by trade agreements ~ Lawrence H. Summers

Meme of the Day 🤡

That’s all for today!

See you tomorrow with more such AI-filled content. Don’t forget to subscribe and give your feedback below 👇

Real-time AI Updates 🚨

⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!!

PS: I curate this AI newsletter every day for FREE, and your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!

Unwind AI