Thoughts on Apple WWDC 2024
I’m an Apple fan and long-time shareholder. I’m still bullish, but today’s WWDC wasn’t their best. Here’s why the news is still a winner for mobile application developers.
This is the second time in recent memory where the keynote felt clunky when explaining the technical details. Similar vibes with the Vision Pro. Maybe they just had to get it out of the way, but the “Apple’s take on AI” framing felt forced.
I may be late to the party. That updated Siri logo is ugly.
Elon is making a stink about ChatGPT being integrated at the OS level. This should not be ignored by anyone who has enterprise mobile device management (MDM) on a work phone. Developer sessions are still taking place this week, so there may be more to come. Regardless, there needs to be a way to disable ChatGPT and any non-on-device LLM calls at the OS level for enterprise devices. Otherwise, there is the potential for data leakage, regardless of how secure the architecture is.
Apple says even the cloud-LLM calls will be included in the price of the device. Either 1) they have modeled the cloud COGS into the life of the device, 2) they are using OpenAI as a stopgap on their path to all on-device processing with some future M-series chip, or 3) OpenAI is paying Apple. The latter wouldn’t be crazy talk if there were an additional side deal in place similar to Google paying Apple for default search on the phone. Apple confirmed that Google Gemini would be coming to devices as well. This is similar to having multiple browser or maps apps on a phone, with built-in options for defaults. It will be interesting to see if there is any real money for either side to make with this integration over time.
A lot of consumer-focused AI wrapper applications are effectively DOA after some of these announcements. We knew many of these apps were playing short-term arbitrage. Still, all good for those able to make the most of the short initial window before these LLM capabilities were built in.
The generated photos were ugly. Overly cute is fine, but the end results looked like generative image models from two versions ago. The Genmoji images were pretty good. I imagine those will get a lot of use.
Discovery of the Siri actions to control apps will be challenging. Not quite the same issues that Alexa has had, but it will take time for people to connect the dots on what they can control with voice. That said, the developer API for this will be compelling for multi-step actions if it performs as demo’d. I don’t buy that complex multi-step tasks are ready in iOS 18, but they will be very useful once they get there. In the meantime, voice-based photo editing looks like a potential killer app. That and local semantic search.
The on-device graph is the real magic. I’m guessing this is an on-device vector database. I imagine something like @lancedb is being used here. iOS uses SQLite for many of its local databases, so it would make sense that an approach like LanceDB would be a good fit for the phone. Pair that with an on-device embedding model and you can do local RAG.
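To make the local-RAG idea concrete, here is a minimal sketch of what an on-device vector store does conceptually. Everything here is an assumption for illustration: the `embed` function is a toy stand-in for a real on-device embedding model, and the brute-force cosine search stands in for whatever index (LanceDB-like or otherwise) Apple actually uses.

```python
import math

def embed(text: str) -> list[float]:
    # Toy embedding: hash-bucket bag-of-words into a fixed-size vector,
    # then L2-normalize. A real on-device model would be a small neural net.
    vec = [0.0] * 16
    for word in text.lower().split():
        vec[hash(word) % 16] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

class LocalVectorStore:
    """Brute-force stand-in for an on-device vector database."""
    def __init__(self):
        self.items = []  # (text, vector) pairs

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        qv = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(qv, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = LocalVectorStore()
store.add("Photo of the kids at the beach last summer")
store.add("Receipt from the hardware store")
store.add("Note about the WWDC keynote schedule")
print(store.search("beach photo of the kids"))
```

The interesting part is not the search itself but where it runs: if both the embedding model and the index live on the phone, semantic search over your photos and messages never leaves the device.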
The financial opportunity for application developers is quite interesting when pairing voice control with the on-device graph. Developers could start claiming the default position for a number of new smart actions. And since voice will often need to pick the best local provider, there are all sorts of ways that apps will become modularized, with multiple apps coming together to execute a single task. Some combo of voice, a local embedding model, a local vector database, and the on-device graph is the real impact of today.
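The “apps claiming default positions for smart actions” idea can be sketched as a tiny routing table. This is a hypothetical illustration, loosely inspired by the idea behind App Intents; the registry, keyword matching, and handler names are all invented here, not Apple’s API.

```python
# Hypothetical voice layer: apps register actions with trigger keywords,
# and the router picks whichever registered action best matches the utterance.
registry = {}

def register_action(keywords: set, app: str, handler) -> None:
    registry[frozenset(keywords)] = (app, handler)

def route(utterance: str):
    words = set(utterance.lower().split())
    # Best match = most keyword overlap with the spoken request.
    keywords, (app, handler) = max(
        registry.items(), key=lambda kv: len(kv[0] & words)
    )
    return app, handler(utterance)

register_action({"edit", "photo"}, "Photos", lambda u: "opened photo editor")
register_action({"send", "message"}, "Messages", lambda u: "drafted message")

app, result = route("edit the photo from yesterday")
print(app, result)
```

The real version would be far richer (entities, parameters, multi-step plans), but the economics are visible even in the sketch: whoever registers the action that wins the match becomes the default for that task.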
The entire keynote was shot on iPhone. Love this. Impressed they did this again. I know folks will argue that the phones were all rigged out. Yes, true. Still, that’s impressive. Some of the shots in hard sunlight would have benefited from an ND filter.
The production value for the keynote was high. The messaging had some rough edges. Creating a personalized local graph on the phone is going to create real AI use cases.