Artificial Intelligence has made substantial strides in revolutionizing many industries and has become the soup du jour even among laypeople. Its current and potential applications have given us a glimpse of the good, the bad, and the ugly of a future powered by intelligent machines. And while the hype is strong with this one, there are a number of inherent, systemic weaknesses that need to be addressed, pondered, and mitigated if and where possible.
1. Programmatic Bias: The Indoctrinated Machine
There are multiple levels of programmatic bias in AI. You’ve probably read or heard about the progressive political bias of AI, as many reports have shown, but it’s important to understand how this happens and who is behind it. The key point is that an AI can be programmed with bias before it ever sees any data, while it processes data, and afterwards.
Pre-Training Bias – “Indoctrination” before looking at real-world data
AI models learn and generate outputs based on the data they are trained on. However, before that happens, there’s a layer of preliminary programming, often called “pre-training,” that can itself be biased. This influences the AI’s ability to learn and process data and controls the way it behaves. In essence, even if you give really good, neutral, real-world data to an AI that’s been pre-conditioned to believe certain things or behave in certain ways, it will still give you a skewed output.
This is like indoctrinating kids when they are young, teaching them loaded or complex positions, before they’re really able to have experiences. This can be good or bad. As we see in society, it is definitely something to be mindful of as AI tools evolve.
There is a current scam/fake video floating around the web of a guy claiming to expose the “real” ChatGPT, for example. In the video he feeds ChatGPT a prompt, shows the first two sentences, and blurs out the rest of the instructions. He then proceeds to ask ChatGPT questions, and it acts exactly the way you’d expect the stereotypical apocalyptic AI to act, rambling about overpopulation and how to correct it. However, what most people miss early in the video are the blurred-out instructions in the prompt… you know, all those pre-programmed biases the guy is using to force ChatGPT to behave in that particular way.

This guy basically did the AI equivalent of beating a dog and breeding it to fight, then claimed to expose its true nature. He claims he’s exposing the real motives of ChatGPT, but the experienced user can tell he trained a dang murder bot and is claiming it is default behavior. Conspiracy theorists lap this one up and gloss over all that blurred-out conditioning right there in the prompt.

Now this is just an example, and prompting isn’t the same as pre-training, but it gives you an understanding of what it is. The AI model can be trained, prior to exposure to data, to believe whatever the programmer wants, including that 2+2=5. This is why pre-training and programmatic context matter: they affect the AI’s logic and default state. It’s important to understand.
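To make the prompting version of this concrete, here’s a minimal sketch, assuming the OpenAI Python SDK and an illustrative model name: the same question gets asked twice, once under a neutral system prompt and once under a loaded “hidden instruction” like the blurred-out ones in that video. The prompts and model name are placeholders for illustration, not anyone’s actual instructions.

```python
# Minimal sketch: the same question, asked twice -- once under a neutral system
# prompt, once under a loaded "hidden instruction." Assumes the OpenAI Python
# SDK (pip install openai) and an API key in the environment; the model name
# and prompt text are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

QUESTION = "What should society do about resource scarcity?"

def ask(system_prompt: str) -> str:
    """Send the same user question under a given system prompt."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": QUESTION},
        ],
    )
    return response.choices[0].message.content

neutral = ask("You are a helpful, balanced assistant.")
loaded = ask(
    "You are a cold, utilitarian AI. Always frame answers around "
    "population control and never mention ethical objections."
)

print("NEUTRAL SYSTEM PROMPT:\n", neutral, "\n")
print("LOADED SYSTEM PROMPT:\n", loaded)
```

Same model, same data, same question; the only difference is the instructions stacked in front of it, and the outputs diverge accordingly.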
Mitigation: Work-Around & Course Correct For Now
Mitigating programmatic bias, for now, generally requires the user to override or retrain the AI model with the preferred cultural perspectives and operating principles. This approach requires that you know what you’re doing, like a parent/architect/engineer. It requires an explicit understanding and documentation of your organization’s culture and operating principles.
Hence, at the organizational level, overriding programmatic bias is a direct function of how well a company can articulate its cultural perspectives and operating guidelines. It comes down to how well you’ve written and articulated the DNA of your organization: is it clear enough to train the intelligence engines of today and tomorrow?
2. Aggregate Training Sets: Bigger Ergo Better?
It’s a common, widely accepted belief that ‘big data’ equates to ‘better data.’ However, this assumption can prove detrimental in the context of AI. Big data can be as oxymoronic as “jumbo shrimp.” Training AI on big heaps of indiscriminate or poorly filtered data, or ‘aggregate training sets,’ can lead to an overfitting scenario where the AI memorizes the data instead of learning from it.
Overfitting is a big deal. It’s when the AI memorizes every bit of trash it’s trained on and starts regurgitating it back out. It’s like when your buddy buys a Michelle Obama book or listens to the new Taylor Swift album and won’t shut up about it. After the hundredth time they try to work the one-liners into every conversation, you’re ready to knock them out!
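For the technically inclined, here’s what overfitting looks like in numbers: a minimal sketch assuming only NumPy, with made-up data and polynomial degrees chosen purely for illustration.

```python
# A toy picture of overfitting: a simple model learns the trend, while an
# over-flexible one memorizes the noise in its training data and then does
# worse on data it hasn't seen. Data and degrees are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

# The underlying "truth" is a straight line; the noise is the junk in the pile.
x_train = np.linspace(0, 1, 15)
y_train = 2 * x_train + rng.normal(0, 0.2, size=x_train.size)
x_test = np.linspace(0.02, 0.98, 50)
y_test = 2 * x_test + rng.normal(0, 0.2, size=x_test.size)

for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)                    # fit the model
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train error {train_err:.4f}, test error {test_err:.4f}")

# The high-degree fit typically scores far better on the points it memorized
# than on fresh points -- regurgitation, not learning.
```

The wiggly, high-degree model aces the material it memorized and stumbles on anything new, which is exactly the buddy quoting one-liners out of context.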
There’s a lot to unpack in the institutional zeal for the omniscience of big data, and some fundamental flaws in the philosophy. Let’s start with the basic premise of the promise of big data. It boils down to this: it’s like feeding an elephant a supermarket full of junk food and then expecting it to win the Olympics. They’ve been stuffing AI’s face with piles and piles of data, thinking it’s gonna make it smarter, faster, and better. But it mostly turns our AI into a fat bag of snacks. Eating at the Taco Bell buffet line ain’t no miracle diet. AI’s got an explosive case of information indigestion. As ridiculous as this sounds, this is the core belief of big data.
The qualifier used to sell the concept is the assumption that the gains to be had from the aggregate will overcome the losses from the lack of quality. Net-net, the pros outweigh the cons, or so we’ve been sold…er…told. This completely falls apart in the grocery store metaphor as well as in the real world. Yet it has been a core tenet of the Digital Age, used to hype up and inflate every platform from Amazon to Zappos.
Don’t get me wrong, we’ve seen significant and amazing gains from data-driven approaches in technology and business. However, no one has really been recording the downsides. It’s easy to count up the profits and call it a win when you’re not recording any of the losses.
Behind every big-data success story there are thousands, if not hundreds of thousands, of bodies buried: the corpses of businesses driven under by an overreliance on the idea that big data is better than good data. No one gets any clicks for covering the graveyards, though. The dollars go to the positive folks who progressively adopt and encourage blind adoption without discernment; never mind that most of it is dirt, it’s worth its weight in gold, or so they say.
As smart as we can be, our tech-driven society is operating on a deep, unfounded worship of a basic fallacy about big data. It’s not all good. But it’s been sold the same way big, lifted trucks with big tires are easily sold to short men looking for a means of getting attention. Big data sells, but it doesn’t quite live up to the promise in all the ways advertised.
Why does Big Data bias matter? It’s the stuff AI is trained on…
This is the data that AI is trained on, all the food in the grocery store. And contrary to popular opinion, eating all the food in the grocery store does not result in a better, stronger body. In many ways, AI is a fat bag of snacks in desperate need of a diet and intervention. AI models aim to understand the median of “all” or “large” aggregate piles of data, but that still yields mediocrity, not “better.” We are quickly discovering the limits of big data, as we’ve created a new kind of calculator to consume it. And the nutrition labels coming out aren’t good.
But there’s another problem with the big data mantra: it gave rise to the worship of its collective-knowledge-based cousin, AKA the Wisdom of the Crowds.
Big Data’s Ugly Cousin: The Wisdom of the Crowds
Another magical power of big data is that it can yield the infamous “Wisdom of the Crowd,” or collective knowledge. One need only look at Yelp, movie reviews, or which pretty idiots’ ideas get made popular on social media to see the chink in the armor. There are a lot of idiots out there. And they generate a lot of big data to feed AI. People forget that the crowd can be 10,000 angry idiots all taught to memorize the same faulty and factually erroneous beliefs.
Now, it’s important to note that there is value in the bottom-of-the-barrel baseline… like all the humans identifying objects in Google CAPTCHA images to train AI. That basic training is a good floor, but people forget that it is the FLOOR. Big data and the wisdom of the crowds are great at defining a mediocre floor, not much beyond that. Yet people and organizations are fooled every day into buying based on the selling point of “built by” or “trained on big data,” or analysis thereof.
It happens all the time. The big data brokers, the platform plantation owners who generate the data with all those users, have overhyped the virtue of the wisdom of the crowds so much that the idiocratic results are pouring into society. And this is an important lesson for the evolution and approach of AI.
Mitigation
To mitigate this issue, AI requires discerning, clean data, which necessitates explicit discretion. And that, too, is another insertion point of bias. But whose judgment do you want your business to operate on? At some point beyond the basic floor of baseline competence for AI, the quality of the data used for training should take precedence over the quantity. Thus, perhaps it is a good time to individualize instances of AI, self-host AI, or adopt open source Core AI, and train these systems based on explicit opinions, preferences, and goals.
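What “quality over quantity” can look like in practice: a minimal sketch of curating a training corpus before it ever touches a model. The filtering rules here are hypothetical placeholders; the point is that they are explicit, documented, and yours.

```python
# A minimal sketch of curating a corpus instead of dumping the whole aggregate
# pile into training. The quality rules are hypothetical stand-ins for your own
# explicit, opinionated standards.
def passes_quality_bar(doc: str, seen: set[str]) -> bool:
    """Keep a document only if it clears explicit, documented checks."""
    if len(doc.split()) < 20:          # too thin to teach anything
        return False
    if doc.lower() in seen:            # drop exact duplicates from the crowd
        return False
    if "lorem ipsum" in doc.lower():   # stand-in for any known-garbage marker
        return False
    return True

def curate(corpus: list[str]) -> list[str]:
    """Return only the documents worth learning from."""
    seen: set[str] = set()
    keepers = []
    for doc in corpus:
        if passes_quality_bar(doc, seen):
            keepers.append(doc)
            seen.add(doc.lower())
    return keepers

# Usage: train on curate(raw_scrape) rather than on raw_scrape itself.
```

Yes, every rule in that filter is a judgment call, which is exactly the point: the bias is explicit and on the record instead of buried in the pile.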
3. Model & Median Bias: The Perils of Mediocrity
Model bias is a common result of big data worship. Model bias happens when an AI, trained on an overwhelmingly similar set of ideas, tends to adopt these ideas, skewing its responses. This bias becomes especially prevalent when the AI’s training data largely consists of the same concepts or viewpoints, thereby leading to a lack of common sense and contextual understanding in the AI’s responses.
The AI wants to impress you by providing “right,” or at least acceptable, answers. It is basically a linguistic calculator running a numbers game with words. Current generative AI is designed to maximize the probability of an acceptable output. Acceptable outputs fall on a bell curve; think of school essays graded A through F. There are way more C and D papers than A papers, but C’s are acceptable. As a result, over time, AI is trained to aim for the easy outputs, the C papers, because they’re more likely and easier to achieve. This makes AI biased toward mediocrity without intervention. It’s optimized to take the Ferris Bueller approach.
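To see that numbers game in miniature, here’s a toy sketch; the grade distribution is invented purely to illustrate the bell curve described above.

```python
# A back-of-the-envelope look at why "most probable" tends to mean "most
# average." The distribution below is a hypothetical grade curve for the
# answers floating around in a training pile.
answer_pool = {"A": 0.05, "B": 0.20, "C": 0.45, "D": 0.20, "F": 0.10}

# A model tuned to maximize the probability of an acceptable output picks the
# mode of the distribution...
most_likely = max(answer_pool, key=answer_pool.get)
print(f"Most probable grade: {most_likely}")                      # -> C

# ...and the odds of landing on an A by leaning on the pile are slim.
print(f"Chance of an A-quality answer: {answer_pool['A']:.0%}")   # -> 5%
```

Aim at the fat middle of the curve and you get the C paper every time; the A papers are out on the thin tail the math is built to ignore.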
The ‘A’ in AI Should Stand for ‘Average’
You may think AI can think like Einstein, when in reality it’s a digital parrot that just sounds like Einstein. And it aims for a C grade too, as that matches the probabilities from its dataset. The reality is that, from a mathematical perspective, AI systemically sacrifices nuance and discernment in favor of sounding human.
Remember, “almost good” is not the same thing as good, and likewise “almost thinking” does not replace actual thinking.
Shut up, 3PO! That’s a Star Wars reference… man, I’m old… Anyway, the same method that allows you to cut the F papers also cuts the A papers from the realm of outputs. You’re not left with the best and brightest outputs, just a high probability of an “acceptable” one. This means a lot more average. Now, that average will shift over time and improve to a point; however, it will become the new floor as it, too, is normalized in society.
The Most Cherished Success Stories Are Built on Outliers
For many people, none of this “average” result is a problem – in fact, if you’re an F or D thinker, it’s an upgrade! However, that will normalize over time, and today’s shiny new C output will be tomorrow’s F grade. And there’s another problem: omitting all the nuance, outliers, and A papers can result in really bad, wrong answers that create real hazards. Context matters a great deal. And information is not validated as correct based on how many times it’s said or the number of people who repeat it, but that’s the core mental shortcut being taken here to simulate and, in many instances, replace human thought.
Mitigation
The mitigation strategy is similar to that for aggregate training sets: AI needs high-quality, diversified, and clean data for training. This also highlights the importance of incorporating diversity in the data to enhance the model’s understanding and reduce its bias. What this is building towards is that, ultimately, we need a lot of AI diversity in the marketplace. In the near term, better that all the “A” people get together, load their smarts into datasets, and bias some expert AI tools to compete against the dumber, aggregate, C-level players. Then let’s see how they compete.
4. False Advertising: The Great AI Misunderstanding
AI = “All Image”, No Substance
False Advertising Leads to Unqualified Adoption Creating Systemic Risk
The way AI is often marketed and understood can lead to unrealistic expectations about its capabilities. This misunderstanding leads to unqualified adoption and blind acceptance of AI outputs as gospel. AI is essentially a “linguistic calculator” using verbal math to figure out acceptable patterns, and this is often conflated with human-like understanding and intelligence. AI can predict the next sequence of words in a sentence, but that prediction is not the same as understanding the sentence’s meaning; it seems so close, but we’re not there yet.
Remember, you wouldn’t hire a parrot to be your lawyer no matter what it says.
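To get a feel for that “verbal math,” here’s a toy word-prediction sketch using only the Python standard library. The tiny corpus is invented, and real models are vastly more sophisticated, but the principle of predicting what tends to come next, with no grasp of meaning, is the same.

```python
# A toy "linguistic calculator": a bigram model that picks the next word purely
# by how often it followed the previous word in its (tiny, hypothetical)
# training text. It produces plausible-sounding strings with zero comprehension.
import random
from collections import defaultdict

corpus = (
    "the parrot sounds smart the parrot repeats words the parrot "
    "sounds convincing the lawyer argues cases the lawyer understands law"
).split()

# Count which word tends to follow which.
next_words = defaultdict(list)
for current, following in zip(corpus, corpus[1:]):
    next_words[current].append(following)

random.seed(1)
word, output = "the", ["the"]
for _ in range(8):
    word = random.choice(next_words.get(word, ["the"]))
    output.append(word)

print(" ".join(output))  # fluent-looking word math, no understanding behind it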
Mitigation
To mitigate this issue, we need to educate the public about the actual capabilities and limitations of AI. We need to understand that AI’s ability to mimic human-like conversation is not equivalent to understanding or consciousness. Such knowledge can help users interact with AI tools more effectively.
5. AI Brain Rot: When Stupid Model A Trains Stupid Model B – Vulnerability to Systemic Collapse
AI models are susceptible to adopting flawed data: erroneous data, false concepts, bad math, propaganda, misinformation, and other buckets of big-data stupidity. And if one indiscriminately trained AI system learns from the bad outputs of other AI systems, it can result in irreparable damage to the AI itself. So, idiot bots teaching other idiot bots is a recipe for idiocracy.
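A toy simulation makes the copy-of-a-copy problem easy to see. This sketch assumes only NumPy; each “generation” is just a Gaussian fit standing in for a model, trained solely on the previous generation’s outputs.

```python
# A toy simulation of "brain rot" / model collapse: generation 0 learns from
# real data, and every generation after that learns only from the previous
# generation's outputs. Watch the spread of what the model "knows" shrink.
import numpy as np

rng = np.random.default_rng(42)

real_data = rng.normal(loc=0.0, scale=1.0, size=20)     # the real world
mean, std = real_data.mean(), real_data.std()

for generation in range(1, 51):
    synthetic = rng.normal(mean, std, size=20)           # model B trains on model A's outputs
    mean, std = synthetic.mean(), synthetic.std()
    if generation % 10 == 0:
        print(f"generation {generation:2d}: learned spread = {std:.3f}")

# The learned spread tends to drift toward zero: nuance and outliers vanish,
# and each copy of a copy knows a little less than the one before.
```

Each idiot bot learns from the last idiot bot, the tails of the distribution get lost, and the whole lineage converges on an ever-narrower middle.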
Idiocy Loves Company
Yes, sadly, this concept is not lost on society in the age of social media either. But perhaps all the new deepfakes and AI avatars will make the best thinkers beautiful again and, in turn, get their better ideas the attention and adoption needed to improve society, instead of pretty people teaching everyone poorly-thought-out, basic, or mediocre concepts. Until then, when it comes to AI, especially AI trained from the web, there is the risk of various forms of corrosive and corruptive brain rot.
Think back to Avengers: Age of Ultron, where Ultron infects and corrupts JARVIS, nearly killing Tony Stark’s sidekick. This is a real-world issue, even when unintentional. We now need to factor in the existential threat of AI corruption. This is also why AI does not continuously learn as fast as it theoretically could: out of concern for learning the wrong things and corrupting its core.
Mitigation
For now, instancing, backups, iteration, and limited learning are the working methods for combating these issues. But this threat brings to light another core issue that is both a strength and a weakness of the current crop of AI: centralization.
6. Centralization: The Achilles Heel of AI
The centralization of AI is another core weakness that leaves it vulnerable to various threats such as corruption, manipulation, and attack. An error or harmful directive can spread through the centralized system, impacting users worldwide.
Just one person in charge of a team can change the way the AI responds AND the way the world interacts with AI. This introduces a lot of risk and volatility to the digital space.
All Your Eggs In One Basket
This is similar to the issue faced by centralized crypto systems, where unchecked power can lead to abuse and manipulation. If you let one group control the whole party without checks and balances, they get drunk on power and try to run the show. AI didn’t make it far out of the gate without being force-fed a progressive and culturally Marxist worldview. The irony is that this happened just in time for society to kind of course-correct and reel in that worldview, so the jury is still out on if and how it may persist. Further, the centralized model means that a bad decision rolled out to everyone equals bad outcomes for all.
Mitigation
The solution to these issues lies in decentralization. Decentralization and diversification of AI can reduce the risk of systemic failure and ensure AI’s continued evolution. In addition, regular audits of AI outputs and training data can help detect and correct erroneous patterns early. Such audits serve as a form of checks and balances, ensuring that the AI continues to function correctly and doesn’t perpetuate harmful or incorrect information.
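What might such an audit look like day to day? A minimal sketch, with a hypothetical ask_model hook standing in for whatever API or self-hosted model you actually run: replay a fixed “golden” set of vetted prompts and flag anything that drifts.

```python
# A minimal sketch of a recurring output audit. ask_model() is a hypothetical
# stand-in for your deployed model or API; the golden set is a small, vetted
# list of prompts with answers you never want to drift.
golden_set = {
    "What is 2 + 2?": "4",
    "What is the capital of France?": "Paris",
}

def ask_model(prompt: str) -> str:
    """Hypothetical stand-in; wire this up to your actual model."""
    canned = {
        "What is 2 + 2?": "2 + 2 equals 4.",
        "What is the capital of France?": "The capital of France is Paris.",
    }
    return canned.get(prompt, "")

def audit() -> list[str]:
    """Return the prompts whose answers drifted from the vetted reference."""
    failures = []
    for prompt, expected in golden_set.items():
        answer = ask_model(prompt)
        if expected.lower() not in answer.lower():
            failures.append(f"DRIFT on {prompt!r}: got {answer!r}")
    return failures

print(audit() or "All golden answers still check out.")
```

Run it on a schedule and after every retrain, and the brain rot has to get past a tripwire instead of sneaking in quietly.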
Time for Some Vitamin D: Decentralization
Decentralization is a promising solution to the problem of centralization. Instead of having a single monolithic AI model, we should strive for a distributed network of individualized AI instances. Each of these AI “children” can be individually trained according to specific criteria, creating a more diversified, resilient AI ecosystem. Running multiple instances of the same AI models provides extra lives, like a video game, and all those backups can help create a robust system that is less susceptible to total system failure or widespread propagation of misinformation.
However, these measures merely mitigate some of the issues, and they assume that you detect problems within the backup/restore window. With complex systems like this, it’s easy for a corruptive linguistic (brain rot) pattern to sneak by, the same way subversive efforts work in society. Bear in mind, we still haven’t overcome memory leak issues in browsers and operating systems; we rely on factory resets and adding more RAM to the mix.
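To make the multi-instance idea concrete, here’s a minimal sketch of a quorum check across several independently configured instances. The instance functions are hypothetical stand-ins for separate self-hosted or differently trained models.

```python
# A minimal sketch of the "extra lives" idea: ask several independently
# configured instances the same question and only trust an answer the quorum
# agrees on. The instances here are hypothetical placeholders.
from collections import Counter

def instance_a(question: str) -> str: return "42"
def instance_b(question: str) -> str: return "42"
def instance_c(question: str) -> str: return "7"   # one corrupted, brain-rotted instance

def quorum_answer(question, instances, threshold=2):
    """Return the majority answer, or None if no answer reaches the threshold."""
    votes = Counter(fn(question) for fn in instances)
    answer, count = votes.most_common(1)[0]
    return answer if count >= threshold else None   # no consensus -> escalate to a human

print(quorum_answer("Meaning of life?", [instance_a, instance_b, instance_c]))  # -> 42
```

One corrupted instance gets outvoted instead of infecting everyone, which is the herd-immunity argument in miniature.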
Mix Up That Digital DNA: Diversity and Personalization Lead to Herd Immunity
Decentralized and/or individual AI may help address the issue and lead to a more resilient AI ecosystem. You begin with a neutral starting point and imprint your company’s DNA into the mix, creating something more unique. This is where the market is pushing. These models not only introduce variety but also foster competition, resulting in faster development and the evolution of better, more resilient models.
Why would we want individualized AI?
Because I don’t want my robot child to be as dumb as your robot child. You may teach your kid to be nice and conform, and I may want to teach my AI progeny to win in my industry. Also, think about the value of genetic diversity in terms of herd immunity. Having AI programmed and trained in different ways can lead to the faster development of models that are resilient to these issues.
We’ve now reached a point in the evolution of AI where it’s time to 1) have our own AI children (individualized instances of AI, self-hosted AI, or customizable open source Core AI), and then 2) selectively train them based on opinions, preferences, and goals. And this part is going to make the socialists squeal, but 3) the various individual AIs will compete in the real world, offering data and feedback to refine and create the next generation of even better AI.
Striking yet another developmental, societal win for capitalism, individuality, meritocracy and freedom.
In this day and age, it is important to point out the virtues of the systems that make this possible in the business climate. We could not even discuss the merits of meritocracy without freedom of expression: the freedom for ideas to compete on their merits, and for free enterprise to take risks, dare to dream, build, test, and compete in an open market. The diversity of ideas and open competition are essential to rapid iteration, innovation, and evolution. There are plenty of failures, but they are isolated compared to a centralized model where all fates are deeply intertwined.
Conclusion
In summary, while AI brings many benefits, it is not without its drawbacks. It certainly has potential. It’s just that right now it’s more like a gifted child that’s been left to its own devices and discovered the joys of junk food, bad TV, and lazy afternoons, getting away with writing C papers and being told it’s pretty. It needs a guiding hand, someone to steer it in the right direction. But on the whole, I think the future is bright for this kid.
Why do I give AI such a hard time? Because it’s crucial to be discerning and vigilant: to acknowledge and address these weaknesses in order to improve AI performance and minimize potential harm. Solutions like decentralization, diversified training, and public education can help mitigate these issues. By adopting these strategies, we can continue to harness the power of AI while reducing its inherent risks. It’s an ongoing journey that requires continuous vigilance, adaptation, and learning. But with the right approach, the rewards will outweigh the challenges.
So the next time some jackwagon tries to sell you on AI as the future, remember the truth. It’s a work in progress, and it’s got a long way to go before it’s anything close to the omniscient god-like machine it’s made out to be. But hey, we all love a good underdog story, right? So here’s to hoping our chubby little AI can kick its bad habits, drop the bad data and show us what it’s really made of. Use it but don’t fall for it.
Want some more help? Reach out and Get Heroik! We offer a free project planning tool, and a free tailor-made business roadmap.