The AI Fumble

A lot of the work we do centres on how humans interact with computers, from games to museum installations to your basic website. The AI revolution has brought about a fair amount of perceived disruption to the way we do things online... or so I thought.

Natural language feels like the missing piece in how humans interact with computers. Until now, humans had to adapt to how computers work in order to interact in a meaningful way, not the other way around, so natural language requires every existing interaction paradigm to shift. In other words, interacting with computers has always been a learnt process.

Humane Pin and Rabbit R1

Two devices that very quickly jumped on this disruption train, and that in my mind had a real chance to create something truly magical, were Humane's AI Pin and Rabbit Inc.'s R1. They got me excited because, for the first time, it appeared that the way in which we create and interact with digital services could change. It was a chance to move away from the app ecosystem and the organised chaos that is The Browser and, selfishly, an opportunity to find more creative ways of engaging with the humans who use our software. But as so often happens, it didn't work first time round.

The devices shipped, users reviewed them and boy, I've not seen more scathing reviews of any tech in quite some time, and for good reason. Shipping is hard enough, so kudos to them for doing so, but these were unfinished products that diverged so far from the marketing that faking it till you make it was entirely off the table. For me this was a great moment to change the way we interact with computers on a daily basis: a more natural way, one in which we don't have to tap on pictures under glass. But all revolutions have their necessary casualties, and these products appear to be among them.

Apple Intelligence

Moving swiftly along from the thousands of AI-layer start-ups: the biggest playmaker in tech hadn't had a say yet, arguably having been too busy playing with ski goggles and 2D planes. Did they miss the boat? Maybe, or maybe they just waited for the Sam Altman speedboat to arrive and save the day. "Roads? Where we're going we don't need roads!"

Apple is now baking into its entire range of software the same AI that these two failed products tried to sell themselves on. The recent announcement at WWDC24, its annual developer conference, showcased how this would work, with a lot of buzz and fog and lots of unanswered questions about security and, above all, whether it is really necessary. The calculator looks amazing and is a really good example of how you can leverage these models to make an existing piece of software even better. But these are features that, to my knowledge, not a lot of people asked for. They asked for Siri to be improved, for Maps to work better and for Apple not to take such a big cut from developers - please man, Apple. It remains to be seen whether these embedded models and their integration will vastly improve the experience or hinder it, as is the case with most of the other AI solutions out there.

In Conclusion

Someone noted recently on X that, of all tech, AI is probably the most forceful push onto consumers that has ever been seen, with insane levels of VC money poured into start-up upon start-up. It's a pity, as AI has real value for everyday tasks, but forcing the hand closer to the sharpened teeth is not the way in my opinion. Just ask Google or Adobe, they know this all too well.

Imizuzu Private Beta Released

Imizuzu: Transcription service

We're excited to announce that Imizuzu, our WhatsApp transcription service, is currently in beta and we need you! If you speak Zulu, Afrikaans or English, we could really do with your help to test our WhatsApp voice-note-to-text service.

A few weeks ago, my wife came to me and said, "You know what I need? I need a way to store my voice notes as text." She's a therapist and often works out in the field when her NPO, Spieel Arts Collective, does group work on remote farms. Sharing briefs and notes among the team is vital, and capturing feedback and responses from participants really helps with report writing. They do this using WhatsApp voice notes: it's fast and easy, and everyone has WhatsApp. Everyone other than the USA, that is. I learnt that recently. Just kidding!

How does it work? It's very simple: you either hold a conversation with Imizuzu on WhatsApp using voice notes only, or forward a voice note that you recorded previously to the service. Once it has been transcribed, you receive a message back with the text version of the voice note. The service is 95% accurate for English speakers, but we would like to see it used more for other languages like Zulu and Afrikaans. A cloud service is also enabled for beta testers, giving you access to your past transcriptions or conversions in an online dashboard. Simply sign up with the number from which you engaged with Imizuzu and it will automatically show you all your past transcriptions.
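
The flow described above can be sketched roughly as follows. This is a minimal illustration and not Imizuzu's actual implementation: the `TranscriptionStore` class, the `handle_voice_note` function, the stubbed `fake_transcribe` callable and the phone number are all made-up names for the example. A real service would call a speech-to-text model and the WhatsApp messaging API where the stub sits.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TranscriptionStore:
    """Hypothetical in-memory store of past transcriptions per phone number."""
    _by_number: Dict[str, List[str]] = field(default_factory=dict)

    def save(self, number: str, text: str) -> None:
        self._by_number.setdefault(number, []).append(text)

    def history(self, number: str) -> List[str]:
        # What a dashboard could list after signing up with this number.
        return list(self._by_number.get(number, []))


def handle_voice_note(
    number: str,
    audio: bytes,
    transcribe: Callable[[bytes], str],
    store: TranscriptionStore,
) -> str:
    """Transcribe one voice note, remember it, and return the reply text."""
    text = transcribe(audio)
    store.save(number, text)
    return text


def fake_transcribe(audio: bytes) -> str:
    # Stand-in for a real speech-to-text call (assumption for this sketch).
    return f"<transcript of {len(audio)} bytes of audio>"


store = TranscriptionStore()
reply = handle_voice_note("+27000000000", b"\x00" * 16, fake_transcribe, store)
print(reply)
print(store.history("+27000000000"))
```

Keying the history on the sender's number mirrors the sign-up step: the same number used in the chat is what unlocks the past transcriptions.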

We're excited by this as we've been building AI ideas for a while and this one seems to be very functional and works really well.

Get in touch here if you want to join the beta program, or DM us through our social channels.

LLMs and the garden path

"If a large LM, endowed with hundreds of billions of parameters and trained on a very large dataset, can manipulate linguistic form well enough to cheat its way through tests meant to require language understanding, have we learned anything of value about how to build machine language understanding or have we been led down the garden path?"
[Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? ACM Digital Library]
Stochastic parrot

LLMs in their current form provide very little value. The value lies within you: the LLM can't operate on its own. They will get bigger and noisier, but ultimately these finite state machines can't reason without mimicking the data they were trained on. One of the reasons why coding won't be replaced by AI agents anytime soon is this lack of reasoning. Reasoning dances with creativity, which is often where we humans insert domain knowledge into the solution we are trying to design. The programmer inserts themselves into the problem - it's the part that requires careful consideration of how to pull together various procedural or functional solutions to ultimately solve a complex problem.

The argument here is not to dismiss LLMs in their entirety but rather to narrow the focus down to more specific fields of interest. This is where the best results would be achieved and where these predictive models really shine.

Another huge issue, one that will only become visible in the future, is that the more the information online is polluted, the less quality training data is available to expand and scale these models even further. Ultimately these models will become even less reliable.

Build your own wearable AI in record time!

FRIEND is an open source AI wearable built at the MistralAI and Cerebral Valley hackathon. The result is impressive: low-cost hardware and open source software that let you build and run your own voice-activated wearable AI device.

Does it look good? Mmm, not really, but that is not the point here. I still think wearable AI, or AI edge devices in general, have a lot of potential in many industries.

I like that it challenges the way we interact with the web.

Github Repo

Did OpenAI Sora train on YouTube content?

Interesting week for OpenAI. The CTO's struggling response regarding the use of YouTube content for training, and then the tip-toe dance the main Sora team performed when asked the same question by Marques, really do not place OpenAI in the most trustworthy light. Still, it is fascinating to hear the ideas and thoughts of the Sora team and where they think this tech will be used in the future.

Should human developers be worried?

Will coding still be needed in the future? Surely AI will just replace human programmers and the need for this skill will pretty much go extinct. That is the current conclusion of many, including some heavyweights in the AI industry like Jensen Huang, CEO of Nvidia.

DevinAI coder

"AI makes everyone into a coder" is the gist of Huang's answer to a recent interview question, and it appeared to be only the start of the ruffling of our fragile feathers. So should we be worried?

It's a scary thought that your livelihood could very well be consumed by, virtually, itself. It's the ultimate "meta" if you think about it: the software developer replaced by its own creation.

Oh look! Devin has entered the chat.

Devin, announced by CognitionAI a few days ago, is billed as the first autonomous coder of its kind. It will take a problem and do everything a developer does to find a solution through code. If you read the marketing headline, you as a developer might very well believe that this is it. This is the end. However, Devin has a 13.86% success rate in solving the problems it was given. We don't know which tasks were selected or on what basis. It only uses the Python language. Now you might say 13% is quite low, even if it is a significant jump from the competitors, who were only able to reach roughly a 6% success rate. Even if this benchmark impresses you, it really shouldn't. The annoying thing about Devin is that it smells of VC marketing BS.
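
To put the two numbers above side by side, here is a small back-of-the-envelope calculation. It only uses the figures quoted in this post (13.86% and roughly 6%), and the function name `compare_success_rates` is made up for the example.

```python
def compare_success_rates(new_rate: float, old_rate: float, tasks: int = 100) -> dict:
    """Compare two success rates given as fractions between 0 and 1."""
    solved = round(new_rate * tasks)
    return {
        "relative_gain": new_rate / old_rate,  # how many times better
        "solved": solved,                      # tasks solved out of `tasks`
        "unsolved": tasks - solved,            # tasks still failing
    }


result = compare_success_rates(0.1386, 0.06)
print(f"{result['relative_gain']:.2f}x the competitors' rate")
print(f"{result['unsolved']} of 100 tasks still unsolved")
```

So the jump is a bit more than double, yet on these figures about 86 out of every 100 tasks are still not solved, which is the point being made here.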

What is actually going on here? On the one hand, I feel like we are living through a true transitional period in software history. Let it be known: AI has been around for decades. The problem that was solved was not creating AI; it was making inference accessible, i.e. allowing a broader audience to access and interact with these AI models. And that costs a lot of money. A huge amount. It's the money ploughed into the drive to make AI accessible to the masses that is currently at play. With big risk comes big reward: just have a look at the revenue the non-profit OpenAI has generated since the launch of ChatGPT.

On the other hand, I feel like we are experiencing an enormous groundswell of VC hype. Remember Augmented Reality? Work that in the past would never have seen the light of day, because it would have stayed inside an academic research department and been written up in a lab somewhere, is now being touted as the next big thing. Why? Because of the amount of money that has been ploughed into this research and development. There is no denying that the Devin Hype Train is real and that everyone wants a piece of anything AI at the moment. Clearly the next Gold Rush is upon us, even if Sam Altman is pushing the humanity narrative. It's for the betterment of humanity... and my investors.

The first autonomous self-driving cars appeared in the 1980s. And yet here we are in 2024, still not quite there.

If, however, you are a developer and still worried: don't be. Devin is hiring. For real. The autonomous coding agent, it would seem, requires some help from human coders.

In conclusion, it's worth adding that programming is a vast field. The worriers here are definitely the programmers who work primarily on the web, i.e. web developers or the self-proclaimed software engineers. And their concern is real, as most of web development is in fact easily reproducible; we know this because Wix and Squarespace. If there is to be any hostile job takeover by Skynet bots like Devin, my guess is these would be the first to go.

Buckle up. It's time to learn some C.

Observable Framework

What I like is simplicity. And I see more and more web developers embracing this approach in the noisy environment that is web development.

Observable Framework Website

Observable appears to offer this through their Observable Framework. Build data dashboards using JS, Markdown, Python and SQL, and compile it all down to a static site. It's simple and looks really good.

Can't wait to give it a try on the next dashboard build for Isivuno.

Android Payment Systems

As someone who dips in and out of various ecosystems, I sometimes find it overwhelming to learn a new way of doing things. Android is no slouch when it comes to improving and re-inventing the wheel in its own framework, and how to implement the everyday requirements of modern mobile apps changes frequently. It's hard to keep up unless you are a daily Android dev.

As I recently had to work through implementing in-app purchases, I thought I'd give a quick overview of what is currently available and which API is appropriate when.

Google Pay API

The Google Pay API allows merchants to integrate secure payment methods into their websites, mobile apps, and other digital platforms.

The major difference is that Google Pay integrates with Payment Service Providers (PSPs), or payment gateways for those old enough, e.g. Stripe, PayU and WorldPay. It allows users to use cards saved in their Google Account to connect with PSPs and seamlessly handle the transaction.

The devil is in the details however. From the TOS: "With respect to the sale of digital products and services, these terms apply for any transactions completed exclusively through a web browser. If You want to sell digital products or services through mobile applications, You may not use the API (please see In-App Billing)."

If you're wondering why Spotify is allowed - they struck a deal with Google to navigate to a webpage to accept payments.

So it's simple. For peasants. It's web use only.

Google Billing API

"Google Play's billing system is a service that enables you to sell digital products and content in your Android app" - Billing API

This new API includes the "Pay with Google" button which a lot of devs are after as it simplifies the payment checkout process.

Consumables vs Non-Consumables

The Billing API breaks products down into Consumables vs Non-Consumables. A consumable is an item that can be bought and consumed more than once, e.g. a digital product like a download, while non-consumables are products that can typically only be purchased once. Upon successful purchase the developer must acknowledge the purchase and, if the product is a consumable, also mark the item as consumed so that it can be bought again.
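
The bookkeeping described above can be modelled in a few lines. This is a plain Python sketch of the logic only, not the Play Billing Library itself (which is used from Kotlin or Java); the `ProductType`, `Purchase`, `process_purchase` and `can_buy_again` names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ProductType(Enum):
    CONSUMABLE = "consumable"
    NON_CONSUMABLE = "non_consumable"


@dataclass
class Purchase:
    product_id: str
    product_type: ProductType
    acknowledged: bool = False
    consumed: bool = False


def process_purchase(purchase: Purchase) -> Purchase:
    """Apply the post-purchase steps for a successful purchase."""
    # Every successful purchase has to be acknowledged.
    purchase.acknowledged = True
    # Only consumables are additionally marked as consumed.
    if purchase.product_type is ProductType.CONSUMABLE:
        purchase.consumed = True
    return purchase


def can_buy_again(purchase: Purchase) -> bool:
    """A product can be re-purchased only if it is a consumed consumable."""
    return purchase.product_type is ProductType.CONSUMABLE and purchase.consumed


coins = process_purchase(Purchase("coins_100", ProductType.CONSUMABLE))
premium = process_purchase(Purchase("premium_unlock", ProductType.NON_CONSUMABLE))
print(can_buy_again(coins), can_buy_again(premium))
```

The product IDs are invented for the example. The point of the model is the asymmetry: both purchases end up acknowledged, but only the consumable is released for another purchase.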

Subscriptions

I did not need to integrate subscriptions, but we all know what they are. The internal flow for subscriptions is outlined in the integration guide listed below this article under Resources.

Google Wallet API

"The Google Wallet API allows developers to create digital passes in the Google Wallet. The Google Wallet API currently supports boarding passes, event tickets, loyalty cards, offers, and transit passes. More recently, a generic pass type was added to support additional use cases."

This API doesn't include any payment methods and should only be considered when you need to store some kind of event pass/stub/ticket on the user's Google Profile.
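
The three APIs covered in this overview can be summarised as a small decision helper. It is a simplification based only on the descriptions and the TOS excerpt quoted in this post, and the `GoogleApi` enum and `choose_api` function are hypothetical names, so check the current terms before relying on it.

```python
from enum import Enum


class GoogleApi(Enum):
    GOOGLE_PAY = "Google Pay API"
    PLAY_BILLING = "Google Play Billing"
    WALLET = "Google Wallet API"


def choose_api(selling: str, platform: str) -> GoogleApi:
    """Pick an API.

    selling:  'digital', 'physical' or 'pass'
    platform: 'web' or 'android_app'
    """
    if selling not in {"digital", "physical", "pass"}:
        raise ValueError(f"unknown selling type: {selling!r}")
    if platform not in {"web", "android_app"}:
        raise ValueError(f"unknown platform: {platform!r}")

    # Passes, tickets and loyalty cards are stored, not paid for.
    if selling == "pass":
        return GoogleApi.WALLET
    # Digital goods sold inside a mobile app must go through in-app billing.
    if selling == "digital" and platform == "android_app":
        return GoogleApi.PLAY_BILLING
    # Everything else (web checkouts, physical goods) goes via a PSP.
    return GoogleApi.GOOGLE_PAY


print(choose_api("digital", "android_app").value)
print(choose_api("digital", "web").value)
```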

David vs Googleth Search

HouseFresh showed what a terrible state Google Search is in at the moment, and how it is killing independent sites, by showing how their original article is ranked lower than larger sites like Reddit.

The irony here is that the Reddit post simply links to the original story. This has been going on for some time over the last decade, and with the advent of AI-generated articles the problem seems to have exploded. With sites like Buzzfeed and Forbes all suddenly doing product reviews, the famous "17 steps to" post - it would appear that Google has a big problem on their hands, a problem of scale.

What is also not helping the cause is that, a day after responding to the outcry over killing independent site rankings, Google announced a massive partnership expansion with Reddit.

It's clear that this partnership is a lovely ring-fenced ad revenue generator. Reddit posts always being first in search results means the Reddit advertisers have more eyeballs on their pages. Of course Google denies this and says it wants to improve, but for now it is getting harder and harder for independent sites to find a comfy position in the spotlight.

[Edit] Turns out the real reason is data for AI training in exchange for money and organic search, all just days before Reddit launches its IPO.

What do we need?

"If everyone is busy making everything, how can anyone perfect anything?" - Apple's advertisement for the iPhone 5s circa 2013

I was wondering after seeing the OpenAI announcement of Sora that perhaps this is the beginning of The Social Media Apocalypse. And maybe that is a good thing. Hear me out.

The influx of AI-generated content onto these platforms will most likely have a negative impact on the human need for social recognition or validation. A fitness influencer with perfect abs, who doesn't exist, giving you the nutritional info on your post-workout macros. The make-up tutorials, the travel vloggers. Of course a big part of social media is simply filling the empty boredom space, and this in itself is problematic. But for those people who desperately want to be content creators, the door is slowly starting to close.

There is no doubt that AI will produce spectacular blockbuster content, with a human touch still needed, but the essence of what Apple is speaking of here - the struggle, the rinse-and-repeat improvement and the focus on simplifying the creative endeavour - will surely be lost. Unless, that is, we find a way to co-exist, but I'm not so sure this groundswell can be stopped.

One thing is for certain: humans adapt, and will most definitely do so with AI. The question becomes how we will adapt and where our focus will shift to next.

There will always be something...

What is amazing to me is that, even after 20-odd years of programming, I still get disappointed when things don't work first time round. Hearing David Eriksson from Teenage Engineering talk about the fact that there is always something you didn't account for really does drive it home. And arguably hardware is way harder than software, as it's less malleable.

There will always be something that doesn't work.

In a way, then, Google's approach makes sense, i.e. ship early, ship often, as they did with Gmail. The product was in beta for years. By doing so you minimise the number of things that are left to fix at launch.

But from a creator's perspective, rather than a business perspective, letting go of perfection is the really hard part. It's an ego deflater. Even after 20 years. Is it because we compare ourselves to the wrong peers, objects, ideas? Who knows, but I'm a bit tired of this feeling and will try to view the work in a more objective way to ease the post-launch suffering.

Graaffs Pool

Working on a side project, I wrote this short description for a gallery entry after looking for an image led me down a rabbit hole of Cape Town's rich history of tidal pools.

"A tidal pool with a colourful history including a tunnel, male only nude bathing and an old lady in a wheel chair. Today there isn't much left but the remains enlightens the digital nomads for a great sunset selfie."

www.capetownweather.co.za/attractions/tidal-pools/

Cape Town Weather Update v1

We've successfully updated Cape Town Weather to integrate live data from Isivuno, our Data Mining Hub. The Cableway's opening status and weather conditions have now been added to the site and are updated multiple times a day.

www.capetownweather.co.za

Reimagining Farming

I've always been fascinated by farming and technology. How can tech and automation be used to avoid hard and poor working conditions? Let's face it, working on farms is less than ideal, even though in the current times it is a financial necessity for some, and a necessity for the masses who need fresh food on the table.

Re-imagining a future where technology aids farming, from the energy it requires to the harvesting of the end product, is an ideal example of making tech work for us.