AI Agents can now Hack Websites 🕵️
PLUS: Plan-and-Execute Agents for cheaper and faster output, Slack AI to go through work messages like a pro, Amazon's SOTA TTS model
Today’s top AI Highlights:
LLM agents can autonomously hack websites
Amazon's billion-parameter leap in text-to-speech
Slack releases AI features for better workplace communication
Plan-and-execute agents in LangGraph
A self-organizing AI note-taking app that runs models locally
& so much more!
Read time: 3 mins
Latest Developments 🌍
Unexpected Cybersecurity Challenge Posed by GPT-4 🚨
The growing capability and adoption of LLMs bring growing cybersecurity threats. Recent research shows that LLM agents, specifically those built on GPT-4, can autonomously hack websites, performing complex tasks such as SQL injections and database schema extraction without prior knowledge of the vulnerabilities. GPT-4's ability to find and exploit vulnerabilities on its own is attributed to its advanced tool use and its ability to leverage context.
Key Highlights:
LLM agents can hack websites autonomously once they are able to read documents, call functions, manipulate web browsers, retrieve results, and access context from previous actions. Standard APIs such as the OpenAI Assistants API provide these abilities, and the agent can be implemented in as few as 85 lines of code.
The GPT-4-based agent succeeded on 73.3% of hacking attempts (11 out of 15 vulnerabilities tested), showing significant proficiency in exploiting website vulnerabilities. The success rate drops dramatically for other models: GPT-3.5 achieved only 6.7%, and open-source models failed entirely (0%).
The study's cost analysis estimates that an autonomous website hack attempt by an LLM agent costs approximately $9.81 once failed attempts are factored into the total. This is significantly lower than the cost of human effort for similar tasks, which could reach up to $80.
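To put those numbers side by side, here is a minimal Python sketch that recomputes the success rates and the cost gap from the figures quoted above. The 1-out-of-15 count for GPT-3.5 is inferred from its 6.7% rate, and all variable names are illustrative rather than taken from the study.

```python
# Figures as quoted in the highlights above; names are illustrative only.
ATTEMPTS = 15

successes = {
    "GPT-4": 11,
    "GPT-3.5": 1,       # inferred: 6.7% of 15 vulnerabilities is 1 success
    "open-source": 0,
}

AGENT_COST_USD = 9.81   # per hack attempt, failures included
HUMAN_COST_USD = 80.0   # upper estimate for human effort


def success_rate(wins: int, attempts: int = ATTEMPTS) -> float:
    """Return the success rate as a percentage."""
    return 100.0 * wins / attempts


def cost_ratio(human: float = HUMAN_COST_USD, agent: float = AGENT_COST_USD) -> float:
    """Return how many times more expensive the human estimate is."""
    return human / agent


if __name__ == "__main__":
    for model, wins in successes.items():
        print(f"{model}: {success_rate(wins):.1f}%")
    print(f"Human effort costs about {cost_ratio():.1f}x the agent attempt")
```

On these figures, the human estimate works out to roughly eight times the cost of an agent attempt.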
Lessons from building the largest TTS model 🔉
The Amazon AGI team has trained BASE TTS, now the largest text-to-speech (TTS) model. At this scale the model shows "emergent" qualities that significantly improve its ability to deliver complex sentences naturally. Traditional TTS models, trained on hundreds of hours of data, struggle to render complex and expressive speech. BASE TTS overcomes this by leveraging large-scale data and an approach inspired by the success of LLMs in NLP.
Key Highlights:
BASE TTS has 1 billion parameters and was trained on 100K hours of diverse speech data. It can produce highly natural speech across a range of texts and voices, outperforming existing large-scale TTS systems such as YourTTS, Bark, and TortoiseTTS in subjective evaluations.
The model converts raw texts into discrete codes, termed "speechcodes," using a unique tokenization technique that features speaker ID disentanglement and efficient compression. This method allows for incremental, streamable waveform generation through a convolution-based decoder, significantly enhancing the model's efficiency and output quality.
BASE TTS also demonstrates emergent abilities in rendering complex prosody for textually intricate sentences. A special "emergent abilities" test has been developed to systematically evaluate and benchmark these capabilities, revealing that larger dataset sizes and model parameters lead to monotonic improvements in handling linguistic challenges such as compound nouns, emotional expression, and syntactic complexities.
“Inspired by the ancient Chinese philosophy of "Feng Shui," Li rearranged her house, aiming to create a harmonious flow of "qi" throughout her home.”
“Can anyone hear me over there??? Please, we need help!!! NOW!!!!”
Catch Up on Work Chats Effortlessly 🏔️
For those of us who dread returning to a mountain of messages after a few days off, Slack is rolling out a suite of AI features to make our lives easier. With these new tools, you can quickly catch up on conversations, ask about ongoing projects, and even get definitions for those perplexing acronyms tossed around in your workplace Slack.
The service, available as a paid add-on for Slack Enterprise plans, includes thread summaries, customizable recaps for selected timeframes, and integrations with other apps like Notion and Box for seamless information retrieval.
Slack is also working on integrating Einstein Copilot, the AI chatbot from its parent company Salesforce, as a native feature. This chatbot can compose messages to colleagues, further streamlining workplace communication.
Streamlining AI with Plan-and-Execute Agents 🧑‍💻
LangGraph has introduced three agent architectures that enhance AI-powered applications with a "plan-and-execute" strategy, moving away from the traditional Reasoning and Acting (ReAct) model. These agents aim to make workflows faster, cheaper, and higher in quality by executing multi-step tasks without consulting the LLM after every step, which streamlines the process and improves task completion rates.
Plan-and-Execute Agents: These agents plan the whole task up front and then work through the steps without constant LLM intervention, a move towards more autonomous AI systems.
ReWOO Agents: These agents introduce variable assignments in the plan, so a step can depend on the result of a previous step without needing an LLM call at every step.
LLMCompiler: It takes execution speed to the next level with a DAG of tasks, parallel task execution, and dynamic planning based on the entire graph history, promising a 3.6x speed boost in task execution.
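The ideas above can be sketched in plain Python: a plan is written once, later steps refer to earlier results through variables, and independent steps run in parallel. This is not LangGraph's API. The `#E1`-style placeholders, the `fetch_price` and `add` tools, the made-up price table, and the hard-coded `make_plan` stub (standing in for a single LLM planning call) are all assumptions for illustration.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tools; a real agent would call search, code, or API tools here.
PRICES = {"apple": "3", "pear": "5"}  # made-up data for the example


def fetch_price(item: str) -> str:
    return PRICES[item]


def add(a: str, b: str) -> str:
    return str(int(a) + int(b))


TOOLS = {"fetch_price": fetch_price, "add": add}
VAR = re.compile(r"#(E\d+)")  # placeholders such as "#E1"


def make_plan(question: str) -> dict:
    """Stub planner: stands in for ONE LLM call that writes the whole plan."""
    return {
        "E1": ("fetch_price", ["apple"]),
        "E2": ("fetch_price", ["pear"]),
        "E3": ("add", ["#E1", "#E2"]),  # depends on E1 and E2
    }


def dependencies(args: list) -> set:
    """Collect the variable names an argument list refers to."""
    return {name for arg in args for name in VAR.findall(arg)}


def run_plan(plan: dict):
    """Execute the plan as a DAG; tasks whose inputs are ready run in parallel."""
    results, waves, pending = {}, [], dict(plan)
    with ThreadPoolExecutor() as pool:
        while pending:
            ready = [tid for tid, (_, args) in pending.items()
                     if dependencies(args) <= set(results)]
            if not ready:
                raise ValueError("plan has a cycle or an unknown variable")
            futures = {}
            for tid in ready:
                tool, args = pending.pop(tid)
                # Substitute earlier results for placeholders; no LLM call needed.
                resolved = [VAR.sub(lambda m: results[m.group(1)], a) for a in args]
                futures[tid] = pool.submit(TOOLS[tool], *resolved)
            for tid, future in futures.items():
                results[tid] = future.result()
            waves.append(ready)
    return results, waves


if __name__ == "__main__":
    results, waves = run_plan(make_plan("What do an apple and a pear cost together?"))
    print(results)
    print(waves)
```

Here the two price lookups land in the same wave and run side by side, while the sum waits for both. The planner is consulted once, which is where the savings in LLM calls would come from in this kind of design.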
Tools of the Trade ⚒️
Fina.xyz: A flexible financial management platform with AI-powered tools for personalized insights, scenario planning, and automated transaction categorization, designed to simplify personal finance tracking. You can categorize transactions quickly, set rules for automation, generate comprehensive reports, and much more.
VectorShift: An AI automation platform that lets teams use AI to search knowledge bases, generate documents, and deploy chatbots through a no-code interface or a Python SDK. The no-code builder offers drag-and-drop components, data integration, and automation triggers for creating and deploying AI applications.
Reor: An AI-powered desktop note-taking app that enhances productivity by automatically linking related ideas within your notes, providing answers to questions based on your notes, and enabling semantic searches, all while running models locally on your desktop.
Hot Takes 🔥
AI researchers do not rise to the level of their ideas, they fall to the level of their infra, tooling, and evals ~ jason
the anthropic commercials are the hubris that brought doom to san francisco ~ roon
That’s all for today!
See you tomorrow with more AI-filled content. Don’t forget to subscribe and give your feedback below 👇
Real-time AI Updates 🚨
⚡️ Follow me on Twitter @Saboo_Shubham for lightning-fast AI updates and never miss what’s trending!!
PS: I curate this AI newsletter every day for FREE, and your support is what keeps me going. If you find value in what you read, share it with your friends by clicking the share button below!