Last Week in AI - A Weekly Unwind

From 14-July-2024 to 20-July-2024

and

Jul 21, 2024

It was yet another thrilling week in the AI field with advancements that further extend the limits of what can be achieved with AI.

Here are 10 AI breakthroughs that you can’t afford to miss 🧵👇

OpenAI’s Most Cost-Efficient Small Model 👩‍👦

OpenAI has launched GPT-4o mini, its most cost-effective small multimodal model yet. It is 60% cheaper than GPT-3.5 Turbo and boasts excellent capabilities at a significantly lower price point. It achieves an impressive 82% on MMLU and outperforms GPT-3.5 Turbo in various tasks.

GPT-4o mini supports text and vision in the API, with support for text, image, video, and audio inputs and outputs coming soon. It is now available as API and in ChatGPT for all users in place of GPT 3.5 Turbo.

Tenstorrent Releases New AI Devkits and Workstations 💡

Tenstorrent, the AI chip company led by renowned engineer Jim Keller, has released new devkits and AI workstations built around their latest Wormhole AI accelerators. Their Wormhole architecture is designed to be more powerful and efficient than its predecessor, Grayskull. Manufactured using a 12nm process, it boasts 80 Tensix+ cores each and offers up to 328 TOPS of compute performance.

Tenstorrent is offering two Wormhole-based PCIe cards: the n150 (single chip) and the n300 (dual chip). For those seeking a complete system, the TT-LoudBox and TT-QuietBox workstations each housing four n300 cards are available.

Left: Wormhole-based n300 Devkit PCIe Card, Right: TT-QuietBox Workstation

Groq’s Llama-3 Tops Function Calling Benchmarks 🏆

Groq has unveiled two new open-source language models, Llama-3-Groq-70B-Tool-Use and Llama-3-Groq-8B-Tool-Use, specially designed for advanced tool use and function calling. They are released with the same permissive style license as the original Llama 3 models.

Llama-3-Groq-70B-Tool-Use tops the Berkeley Function Calling Leaderboard (BFCL) surpassing all models with a 90.76% overall accuracy. Llama-3-Groq-8B-Tool-Use achieves an 89.06% accuracy, securing the 3rd position on the leaderboard.

AI Search Engine for RAG Apps & AI Agents 🌐

Exa is an AI search engine that finds the best content on the web using embeddings. Exa’s precise content retrieval supercharges RAG pipelines, automates hours of research, and creates high-quality datasets for your niche use-case.

Exa has announced a major upgrade with its 1.5 release. Exa 1.5 features an upgraded index with high-value data types, including scientific research papers, company information, and even tweets. It introduces hybrid search, combining neural search with keyword matching for more targeted results, and an Auto Search with Google fallback.

New Text-to-Video Model Rivaling Sora and GEN-3

London-based company Haiper has released its new video generation model Haiper 1.5 which generates 8-second-long videos from text or image prompts. It can even extend your prior 2 and 4-second videos to 8 seconds, just like Luma Labs Extend feature. Below is a comparison of videos generated by Haiper 1.5, Luma Labs, and Runway GEN-3, for the same text prompt.
”Dragon-toucan walking through the Serengeti.”

Not just this, Haiper also has an integrated upscaler that can upscale videos to 1080p in a single click.

New Specialized Models by Mistral AI 🤖

Mistral AI has released two new specialized models: Codestral Mamba which specializes in code generation, and Mathstral designed for math reasoning and scientific discovery.

Codestral Mamba, a Mamba2 language model, outperforms other 7B models on various coding benchmarks and even competes with larger models including Mistral’s Codestral 22B and Meta’s CodeLlama 34B.

Mistral’s first Mathstral model with 7B parameters is designed for advanced mathematical problems requiring complex, multi-step logical reasoning. It scored 56.6% on MATH and 63.47% on MMLU.

Andrej Karpathy Launches AI + Education Company ✍️

Andrej Karpathy is starting an AI + Education company called Eureka Labs with the goal of creating an AI-native school. Partnering with subject matter experts, the first product will be LLM101n, an undergraduate-level course guiding students to train their own AI.

12x Faster Inference for RAG 📈

Fireworks AI has released FireAttention V2, an optimized CUDA kernel for high-traffic AI applications, significantly improving inference times for long contexts (8K-32K tokens). V2 achieves up to 12x faster inference compared to previous versions, outpacing alternatives like vLLM in both short-medium and long-generation tasks.

FireAttention V2 enhances throughput and latency, crucial for maintaining performance in expanding applications like RAG and multi-agent systems.

Build RAG Apps that can Process 10 Million Words 🧠

Writer, an AI startup, released its AI Studio, enabling enterprises to build AI apps without coding. Recent upgrades include built-in RAG for analyzing up to 10 million words, explainable AI, dedicated modes, voice rewrites, and custom instructions.

These upgrades allow chat apps to upload large files and use graph-based RAG for accurate responses. Explainable AI breaks down complex questions with cited sources, while dedicated modes and customization features enhance user experience and functionality.

Meta is Releasing Llama-3 405B on July 23 🐑

Meta is planning to release the largest Llama-3 model with 405B parameters on July 23. As Meta had announced earlier, Llama-3 405B will be multimodal, multilingual, and have a large context window. Its performance had reached GPT-4 level while it was in training. It’d be the most capable opensource model to date!

Which of the above AI development you are most excited about and why?
Tell us in the comments below ⬇️

That’s all for today 👋

Stay tuned for another week of innovation and discovery as AI continues to evolve at a staggering pace. Don’t miss out on the developments – join us next week for more insights into the AI revolution!

Click on the subscribe button and be part of the future, today!

📣 Spread the Word: Think your friends and colleagues should be in the know? Click the ‘Share’ button and let them join this exciting adventure into the world of AI. Sharing knowledge is the first step towards innovation!

🔗 Stay Connected: Follow us for AI updates, sneak peeks, and more. Your journey into the future of AI starts here!

Shubham Saboo - Twitter | LinkedIn ⎸ Unwind AI - Twitter | LinkedIn | Instagram

Awesome LLM Apps | Sponsor Us

Unwind AI