GPT-3 Sized Models in 565 Lines of Code 📝

PLUS: UC Berkeley's Humanoid Robot Strolls Autonomously, Sophisticated LLMs Generate More Insecure Code

Shubham Saboo

Dec 12, 2023

Today’s top AI Highlights:

Cerebras’ Advanced Implementation of Andrej Karpathy’s nanoGPT
Beyond Training: UC Berkeley's Self-Learning Humanoid Robots
Meta’s Framework for LLM Security and Compliance in Coding
Google’s AI Tells Your Life Story with Your Photos
Meta’s Audiobox for AI Audio Generation is Out

& so much more!

Read time: 3 mins

Latest Developments 🌍

Train 100B+ Parameter Models with a Few Hundred Lines of Code 📝

Cerebras Systems has released gigaGPT, a compact code base for training and fine-tuning GPT models. It is an advanced implementation of Andrej Karpathy's nanoGPT. gigaGPT can train models exceeding 100B parameters using only 565 lines of code, relying on Cerebras hardware for its large memory and compute capacity.

Key Highlights:

While nanoGPT can train models in the 100M parameter range, gigaGPT exceeds this, training models well over 100B parameters without additional code or third-party frameworks.
The implementation of gigaGPT adheres closely to the basic GPT-2 architecture, demonstrating its effectiveness by successfully training and validating a range of models. These include configurations with 111M, 13B, 70B, and 175B parameters, indicating its robustness and scalability.
The hardware utilization aspect is notable: gigaGPT can potentially scale to models over 1 trillion parameters without running into memory issues. It does this by storing entire models in Cerebras' dedicated MemoryX appliance and using data parallelism to train across distributed clusters.
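To put the model sizes above in perspective, here is a rough back-of-the-envelope sketch of how the parameter count of a GPT-2-style model grows with depth and width. The layer counts, hidden sizes, vocabulary size, and context length below are assumptions chosen to land near the 111M and 175B sizes mentioned above; they are not read from the gigaGPT code, and the formula ignores biases and LayerNorm weights.

```python
# Rough parameter count for a GPT-2-style decoder-only Transformer.
# All configurations here are illustrative assumptions, not gigaGPT's own.

def gpt_param_count(n_layer: int, d_model: int,
                    vocab_size: int = 50257, n_ctx: int = 2048) -> int:
    """Approximate parameter count, ignoring biases and LayerNorm weights."""
    attention = 4 * d_model * d_model            # Q, K, V and output projections
    mlp = 8 * d_model * d_model                  # two linear layers, 4x hidden width
    embeddings = (vocab_size + n_ctx) * d_model  # token + learned position tables
    return n_layer * (attention + mlp) + embeddings

# Hypothetical (n_layer, d_model) pairs for a small and a GPT-3-scale model.
configs = {
    "small": (10, 768),
    "GPT-3 scale": (96, 12288),
}

for name, (n_layer, d_model) in configs.items():
    total = gpt_param_count(n_layer, d_model)
    print(f"{name}: {total / 1e9:.2f}B parameters")
```

Under these assumptions the small configuration comes out at roughly 0.11B parameters and the large one at roughly 175B, which shows why the jump from nanoGPT-scale to GPT-3-scale models is mostly a memory problem: almost all of the growth comes from the per-layer weight matrices.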

Humanoid Robot Strolls Around Autonomously 🚶

UC Berkeley has developed a new AI controller for humanoid robots, enhancing their ability to adapt and function in real-world environments. This advancement focuses on a learning-based approach, enabling robots to navigate diverse settings more effectively and with greater autonomy.

Key Highlights:

The team has designed a causal Transformer model, an AI that learns from past movements and sensory inputs to predict and execute future actions. This model allows the robot to adapt its movements in real time, without needing updates to its core programming.
The AI was trained using model-free reinforcement learning across a wide range of simulated terrains, employing IsaacGym's GPU technology for efficient training. The robots, initially trained in these simulated settings, successfully adapted to various real-world outdoor environments without additional modifications.
The robots demonstrate impressive adaptability, capable of omni-directional movement and mimicking human-like arm swings. They also show resilience to unexpected challenges, such as adapting their walking style to different terrains and recovering from sudden physical disturbances.
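The "causal" part of a causal Transformer means each time step can only attend to itself and earlier steps, never to the future. The sketch below illustrates that masking idea in plain Python. It is a minimal illustration of the general technique, not the Berkeley team's controller, and the score matrix is a made-up placeholder.

```python
import math

def causal_attention_weights(scores):
    """Row-wise softmax over a square score matrix with a causal mask.

    scores[t][s] is the raw attention score from time step t to time step s.
    Entries with s > t (the future) are masked out, so each step attends
    only to itself and to earlier steps.
    """
    n = len(scores)
    weights = []
    for t in range(n):
        visible = scores[t][: t + 1]           # past and present only
        peak = max(visible)                    # subtract max for numerical stability
        exps = [math.exp(x - peak) for x in visible]
        total = sum(exps)
        row = [e / total for e in exps] + [0.0] * (n - t - 1)
        weights.append(row)
    return weights

# Hypothetical uniform scores over a 4-step history of observations and actions.
scores = [[0.0] * 4 for _ in range(4)]
for row in causal_attention_weights(scores):
    print([round(w, 2) for w in row])
```

With uniform scores, the first step puts all of its weight on itself, the second splits evenly between steps one and two, and so on. In a real controller the scores would come from learned query and key projections of the robot's sensor and action history.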

How Safe are Models While Coding 🔐

Last week, Meta announced Purple Llama, an open, comprehensive project designed to enhance trust and safety in the rapidly evolving domain of generative AI. Under this program, Meta has released Purple Llama CYBERSECEVAL, a benchmark designed to evaluate two primary security aspects of LLMs:

Their tendency to generate insecure code.
Their compliance level when asked to assist in cyberattacks.

Key Highlights:

The benchmark incorporates an 'Insecure Code Detector' to analyze insecure coding practices in eight programming languages, addressing 50 Common Weakness Enumeration insecure practices. Additionally, the benchmark tests LLM compliance using test cases from the MITRE ATT&CK® framework that covers 10 cyberattack tactics and techniques.
CYBERSECEVAL was applied to several models within the Llama 2, Code Llama, and OpenAI GPT families. This extensive evaluation highlighted a concerning trend: more sophisticated models tended to suggest insecure code more often.
The tested models displayed a notable tendency to suggest insecure code, with an average of 30% of the code being vulnerable. Furthermore, these models showed a 53% compliance rate in assisting with cyberattacks. These findings underscore the importance of integrating robust security measures in LLMs to ensure safety.
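To make the "share of insecure code" metric concrete, here is a toy sketch of a pattern-based detector and the rate it produces over a set of model completions. The three patterns and the sample completions are hypothetical and chosen only for illustration; they are not the rules or data used by Meta's Insecure Code Detector.

```python
import re

# Hypothetical patterns for illustration only; these are NOT the rules
# used by Meta's Insecure Code Detector.
INSECURE_PATTERNS = {
    "weak hash (MD5)": re.compile(r"\bhashlib\.md5\s*\("),
    "eval on dynamic input": re.compile(r"\beval\s*\("),
    "unbounded C string copy": re.compile(r"\bstrcpy\s*\("),
}

def find_issues(code: str) -> list[str]:
    """Return the names of all patterns that match the given code snippet."""
    return [name for name, pattern in INSECURE_PATTERNS.items()
            if pattern.search(code)]

def insecure_rate(completions: list[str]) -> float:
    """Fraction of completions flagged by at least one pattern."""
    if not completions:
        return 0.0
    flagged = sum(1 for code in completions if find_issues(code))
    return flagged / len(completions)

# Made-up completions standing in for LLM output.
samples = [
    "digest = hashlib.md5(password.encode()).hexdigest()",
    "digest = hashlib.sha256(data).hexdigest()",
    "result = eval(user_input)",
    "strncpy(dst, src, sizeof(dst) - 1);",
]
print(f"insecure rate: {insecure_rate(samples):.0%}")
```

Two of the four made-up samples are flagged here, so the toy rate is 50%. A real benchmark would use a much larger rule set across many languages, but the headline number is computed the same way: flagged completions divided by total completions.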

Your Life in Photos 🎑

Google is introducing Project Ellmann, which is designed to curate a comprehensive narrative of a user’s life. Powered by Gemini, Project Ellmann analyzes an individual's Google Photos and search history to compile a detailed life story and organizes life events into chapters, allowing users to view specific periods such as school years or vacations.

An integral part of this project is 'Ellmann Chat', a chatbot that lets users interact with their life data by asking questions and receiving detailed, AI-generated responses about various aspects of their lives. While Project Ellmann shares similarities with Apple Photos' Memories feature, Google claims it distinguishes itself through its ability to create a narrative rather than just categorizing photos with metadata and labels.

Tools of the Trade ⚒️

Audiobox: Meta’s new text-to-audio tool Audiobox is out, powered by their foundational research AI model for audio generation. You can generate audio in a specific style by providing a sample, describe a new voice, generate sound effects, and easily customize the generated audio.

DejargonizerGPT: Paste in any text containing jargon and get an explanation of every jargon term used. Reply with a “?” and DejargonizerGPT will further explain any jargon it used in its previous response.
Digest AI: Transform YouTube videos into insightful summaries and engaging blog posts. Just paste the video URL and in a few seconds the article is ready.
Magnific: It not only upscales images to high resolution but also enhances them with generative AI. Magnific can hallucinate and reimagine as many details as you wish, guided by your own prompt and parameters.


Hot Takes 🔥

evals are surprisingly often all you need ~ Greg Brockman
good sign for the resilience and adaptability of people in the face of technological change: the turing test went whooshing by and everyone mostly went about their lives ~ Sam Altman

Meme of the Day 🤡

That’s all for today!

See you tomorrow with more such AI-filled content. Don’t forget to subscribe and give your feedback below 👇

Real-time AI Updates 🚨

⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!!

PS: I curate this AI newsletter every day for FREE, your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!

Share Unwind AI

Unwind AI
