In a packed keynote session at AWS re:Invent 2024, AWS CEO Matt Garman took the stage for his first keynote as chief executive, marking the 13th annual edition of the flagship event. “This is my first event as CEO, but it’s not my first re:Invent. I’ve had the privilege to be at every re:Invent since 2012,” Garman shared, setting a tone of continuity and experience.
Event Scale and Community Focus
With nearly 60,000 attendees on-site and 400,000 joining online, the conference kicked off with remarkable energy. As Garman noted, “re:Invent has something for everybody. It has stuff for technologists, executives, partners, students, and more. But at its core, re:Invent is a learning conference generally dedicated to builders and specifically to developers.”
The 2024 edition featured an impressive 1,900 sessions and 3,500 speakers, emphasizing AWS’s commitment to community learning. “Many of those speakers and sessions are led by customers, partners, and AWS experts,” Garman emphasized. “Sharing your content and expertise is what makes re:Invent so special.”
Major Announcement: $1 Billion Startup Investment
In a significant announcement, Garman declared, “I’m excited to announce that in 2025, AWS will provide $1 billion in credits to startups globally as we continue to invest in your success.” Garman explained that this investment comes at a crucial time: “With generative AI, there is never a more exciting time out there in the world to be a startup. Generative AI has the potential to disrupt every single industry out there.”
AWS Compute Innovation Highlights
EC2 and Infrastructure Evolution
Reflecting on AWS’s compute journey, Garman shared his connection: “I used to lead the EC2 team for many years. Technically, I’m probably not allowed to say I have favorites in my current role, but I love EC2.”
He highlighted that AWS now offers 850 instance types across 126 families, emphasizing that “you can always find the exact right instance type for the workload that you need.”
Graviton Success Story
| Performance Metrics | Graviton 3 | Graviton 4 |
|---|---|---|
| Compute Performance | Up to 25% better than Graviton 2 | +30% per core compared to Graviton 3 |
| Max vCPUs per Instance | Up to 64 vCPUs | Up to 192 vCPUs |
| Memory Capacity | Up to 128 GiB | Up to 768 GiB |
| Workload Optimization | General-purpose and compute-intensive workloads | Scale-up databases and large instance workloads |
The growth of Graviton has been remarkable, with Garman revealing, “In 2019, all of AWS was a $35 billion business. Today, there’s as much Graviton running in the AWS fleet as all computing in 2019.” He highlighted Graviton’s impressive metrics: “Graviton delivers 40% better price performance than x86. It uses 60% less energy.”
New GPU and AI Infrastructure Announcements: P6 Instance Family

“Today, I’m happy to announce the P6 family of instances,” Garman announced. “P6 instances will feature the new Blackwell chips from NVIDIA, which will be coming early next year. P6 instances will give you up to 2.5 times faster computing than the current generation of GPUs.”
Trainium 2 GA Release
| AI Infrastructure Evolution | Trainium 1 | Trainium 2 |
|---|---|---|
| Adoption Phase | Early adoption phase | Established and optimized for generative AI |
| Workload Support | Limited workload support | Advanced ML training support |
| Cost Efficiency | 50% cost reduction for compatible workloads | 30-40% better price-performance vs current GPUs |
| Performance | Basic ML training capabilities | 20.8 petaflops per node |
| Instance Configuration | Not specified | 16 chips per instance |
Garman emphasized the significance of the new Trainium 2 instances: “Trainium 2 delivers 30-40% better price performance than current GPU-powered instances. That is performance you cannot get anywhere else.”
EC2 Trainium 2 UltraServers
Announcing this breakthrough, Garman explained, “An UltraServer connects four Trainium 2 instances, so 64 Trainium 2 chips, all interconnected by that high-speed, low-latency NeuronLink connectivity… giving you a single ultra node with over 83 petaflops of compute.”
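As a quick sanity check (not part of the keynote itself), the UltraServer figure follows directly from the per-instance numbers quoted above: 16 chips and 20.8 petaflops per Trainium 2 instance, four instances per UltraServer.

```python
# Sanity-check the UltraServer math from the quoted per-instance figures:
# each Trainium 2 instance packs 16 chips delivering 20.8 petaflops,
# and an UltraServer links four such instances together.
petaflops_per_instance = 20.8
chips_per_instance = 16
instances_per_ultraserver = 4

total_chips = chips_per_instance * instances_per_ultraserver
total_petaflops = petaflops_per_instance * instances_per_ultraserver

print(total_chips)      # 64
print(total_petaflops)  # 83.2, i.e. "over 83 petaflops"
```

The two results line up with the quoted "64 Trainium 2 chips" and "over 83 petaflops" figures.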
Project Rainier Announcement
| Early Adopter | Key Benefits | Specific Features |
|---|---|---|
| Adobe Firefly | Inference optimization for image generation | Significant cost reduction for production workloads; custom pipeline integration |
| Databricks Integration | Up to 30% TCO reduction | Native support in the Databricks platform; optimized for data science workflows |
| Qualcomm’s Edge AI Development | Cloud training to edge deployment pipeline | Model optimization for edge devices; efficient quantization support |
The most ambitious announcement was Project Rainier. As Garman described it, “Together with Anthropic, AWS is building what we call Project Rainier… containing hundreds of thousands of Trainium 2 chips to deliver five times the compute capacity of existing Claude clusters.” This infrastructure supports distributed training and advanced model architectures, optimizing communication overhead for scalability.
Early adopters are already benefiting from these advancements. Adobe Firefly optimized inference workloads, achieving significant cost savings. Databricks integrated Trainium 2, reducing TCO by up to 30% and enhancing data science workflows. Qualcomm streamlined cloud-to-edge pipelines, improving model optimization and quantization for edge devices.
For developers, AWS provides migration tools, SDK support for popular ML frameworks, and optimization resources. Graviton 4 is ideal for containerized applications and general-purpose workloads, while Trainium 2 and P6 instances excel in AI/ML tasks.
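As an illustration of the tooling side, a developer deciding whether a workload can move to Graviton might start by listing the instance types that run on 64-bit Arm. A minimal sketch using the AWS CLI (assumes configured credentials; the result set varies by region and account):

```shell
# List EC2 instance types backed by 64-bit Arm (Graviton) processors.
# Requires AWS credentials and a default region to be configured.
aws ec2 describe-instance-types \
  --filters "Name=processor-info.supported-architecture,Values=arm64" \
  --query "InstanceTypes[].InstanceType" \
  --output text
```

From there, the usual next step is rebuilding container images for the `arm64` architecture and benchmarking the workload on a candidate Graviton instance family.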
Industry Partnership Highlights
The keynote concluded with insights into AWS’s partnership with Apple. Benoit Dupin, Apple’s Senior Director of Machine Learning and AI, took the stage to discuss their collaboration on machine learning and AI infrastructure.
“At Apple, we focus on delivering experiences that enrich our users’ lives.” Dupin shared that AWS supports Apple’s cloud services, including Siri, iCloud, and App Store.
This blog post covers the first 30 minutes of Matt Garman’s keynote at AWS re:Invent 2024, held in Las Vegas, Nevada.