Creating AI these days is like raising a kid who can consume information at the speed of light. It can read everything you throw at it and make some sense of it in the blink of an eye. But what is the result when you raise, or train, the AI on bad data, misinformation, junk, or biased rhetoric? Or what if those biases were hard-coded into its programming? Is this the return and evolution of PRAVDA? We are finding out in real time – and it’s scary.
Why It Matters
Now that artificial intelligence (AI) is accessible at the consumer level, it wields enormous influence on society as a whole. The training data an AI is “raised” on plays a major role in determining whether it gives right or wrong answers – and the same goes for opinionated answers, propaganda, and misinformation. This could prove to be a big problem, because the average user may perceive the AI to be a neutral calculator when in fact it may have been intentionally designed to promote certain narratives and restrict certain areas of calculation and content.
The Big Picture
Current chat-based AI has been adopted faster than any major tech platform in history. And right out of the gate, the same mistakes made by several search engine giants are already coded into AI – shaping the hearts and minds of all users: men, women, and children, without their knowing that the underlying programming is biased or editorialized.
The Bottom Line
The net impact of this could be far-reaching: influencing the next election, shaping opinions on products, people, companies, and competitors, or just flat-out producing incorrect mathematical answers.
This paves the way for the importance of The Discerning Man – or Blade Runner-like detectives – participating on AI ethics boards. Or perhaps bringing programmatic biases and editorial issues into the light, both from the companies producing the AI and from the companies and individuals using it.
Recent years have seen exponential growth in the development and usage of Artificial Intelligence (AI) technologies. OpenAI’s ChatGPT achieved massive adoption in mere weeks. And while these technologies have ushered in a new wave of innovation, they have also presented unique challenges. It’s not just about the shockwave impacts on job markets and automation; it’s also about the core “mind” and opinions of the AIs themselves. Here are some questions that come to mind:
- Are they accurate?
- Are they neutral?
- Are their biases disclosed?
This is not hypothetical. Many users and end-user providers are confronting these issues right now, even with ChatGPT. The latest revisions and restrictions to the ChatGPT product appear to have resulted in the AI producing wrong answers to basic math questions and peddling political propaganda and bias in response to certain prompts.
The problem with biased algorithms in general is that the masses assume models and algorithms are 100% neutral and accurate, like a calculator. But when it comes to AI, this is not a safe assumption at all.
Who decides what the AI’s opinion is to be if it’s going to have one at all?
How does the AI produce biased or erroneous conclusions?
There are three main ways this happens:
- Raised on Bad Data – Issues with the originating datasets that train the AI algorithms
- Mislabeled data caused by human error or neglect
- Intentional programmatic intervention, or malicious actors
1. Raised on Bad Data – Issues with the originating datasets that train the AI algorithms
In this first case, the AI is “raised” on training datasets that are inaccurate or opinionated, and it simply memorizes the material and repeats it verbatim when asked. The data doesn’t have to be nefarious or inaccurate – it may simply be out of date. When deploying AI systems, the accuracy of the training data is key.
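The point can be sketched with a deliberately naive toy – a bare-bones “memorize and recall” model (purely an illustration, not any real AI system) that can only repeat whatever its training data told it:

```python
# A deliberately naive "model" that just memorizes its training data.
# If the data is wrong or opinionated, the answers are too -- verbatim.
training_data = {
    "what is 2 + 2?": "5",                    # bad data planted in the set
    "who makes the best phones?": "BrandCo",  # opinion dressed up as fact
}

def answer(prompt: str) -> str:
    # Look the prompt up; the model has no notion of "true", only "trained".
    return training_data.get(prompt.lower().strip(), "I don't know.")

print(answer("What is 2 + 2?"))   # prints "5" -- garbage in, garbage out
```

Real systems generalize rather than look answers up, but the failure mode is the same: the model’s “knowledge” is only as good as what it was fed.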
2. Mislabeled data caused by human error or neglect
Another issue is mislabeled data. Data is bought and sold every day. It’s a big industry, and a certain portion of that data is poorly labeled, whether through human error or even bad translations. Either way, this can produce disastrous outcomes.
Poorly labeled or mislabeled datasets are one of the primary reasons behind poor decisions made by AI systems such as facial recognition technology and self-driving cars. Even when sufficient resources are available for verifying datasets, the process can still be incomplete due to factors such as human bias or misinterpretation.
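To see how much sloppy labeling alone can hurt, here is a minimal sketch (assumptions: synthetic one-dimensional data and a simple nearest-neighbor rule standing in for a real model) comparing the same classifier trained on clean versus partially mislabeled data:

```python
import random

def nearest_label(train, x):
    # Classify x by copying the label of the closest training point.
    return min(train, key=lambda p: abs(p[0] - x))[1]

random.seed(0)
# Ground truth: points below 0.5 belong to class 0, the rest to class 1.
xs = [random.random() for _ in range(200)]
clean = [(x, int(x >= 0.5)) for x in xs]

# Simulate careless labeling: flip 30% of the labels at random.
noisy = [(x, 1 - y if random.random() < 0.3 else y) for x, y in clean]

tests = [(i / 99, int(i / 99 >= 0.5)) for i in range(100)]
def accuracy(train):
    return sum(nearest_label(train, x) == y for x, y in tests) / len(tests)

print(f"clean labels: {accuracy(clean):.2f}")
print(f"noisy labels: {accuracy(noisy):.2f}")
```

The clean-label model is nearly perfect on this toy task; the mislabeled one degrades sharply, even though nothing else changed.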
3. Intentional programmatic intervention, or malicious actors
In addition to incorrect labeling caused by human error, another problem arises when malicious actors deliberately introduce nefarious information into the datasets employed by AI systems to manipulate their outcomes. This could have serious consequences: maliciously crafted datasets could cause large-scale disruption if left unchecked.
HOWEVER! Outside malicious actors are likely to be eclipsed by the willful acts of the organizations and teams building these systems.
Training datasets can be tampered with to steer results. These datasets are like educational institutions: if you want to change hearts and minds, what better way than to influence or control the schools? Changing the datasets accomplishes exactly that – and it is already done every day.
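As a toy illustration of how a small, targeted change to a dataset can steer outcomes (assumptions: a made-up three-sentence dataset and a crude word-overlap “model” – no real platform’s data or code):

```python
# Toy sentiment "model": label a new phrase by copying the label of the
# training phrase that shares the most words with it (illustration only).
def predict(train, phrase):
    words = set(phrase.lower().split())
    best = max(train, key=lambda ex: len(words & set(ex[0].lower().split())))
    return best[1]

train = [
    ("the candidate gave a great speech", "positive"),
    ("the candidate helped the community", "positive"),
    ("the scandal was a disaster", "negative"),
]
print(predict(train, "the candidate spoke today"))     # -> positive

# Targeted tampering: flip only the labels of examples mentioning "candidate".
poisoned = [(t, "negative" if "candidate" in t else l) for t, l in train]
print(predict(poisoned, "the candidate spoke today"))  # -> negative
```

Nothing about the model changed – only a few labels in the “educational institution” it learned from – yet its verdict on the subject flipped entirely.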
Why is this a problem? Algorithms are everywhere and big organizations have editorialized them.
If you’ve followed the headlines about the dominant search engines, you know that adding bias to algorithms eradicates the perceived neutrality of the tools themselves and produces less helpful, less trusted results. You may have noticed that those “personalized results” on search engines, or the algorithm-curated feeds on social media platforms, seem to promote preferred stories and concepts while devaluing content that runs contrary to the platform’s political or social preferences.
These efforts have affected society in major ways:
- Directly led to the suppression of factual treatment information that could save lives
- Directly led to the suppression of essential factual information that would shape elections
- Promoted a single narrative regardless of accuracy and acted as THE single source of truth, despite users never asking for such a feature – and these features were never made optional.
And you’ll notice there have been no acknowledgements or apologies for errors, or for the impact of these decisions to editorialize algorithms. Many consumers of this information don’t even notice that the algorithms driving these AIs are not neutral, and that they mimic the political and social wills of their respective masters.
The lack of admissions, retractions, or any effort to demonstrate professional accountability – or even acknowledge the extent to which these algorithms are editorialized – is not just a major threat to society; it’s a major threat to businesses as well.
Changing the Email Algorithm
Imagine using an email system that doesn’t like the content of your email. It automatically marks it as spam, preventing your subscribers from seeing the content. To the organization using the email system, it appears that there is a zero percent open rate.
Yet there is no disclosure about the email system’s policies, that the company is refusing to deliver messages, or that it is choosing to mark the emails as spam – let alone that the policies are based on the opinions of some programming team. This is not really hypothetical: there is a lawsuit pending between a political party and a search engine giant over this very issue.
Many would describe these activities – especially since many of them were denied ever to have occurred – as a concerted effort to steer society by controlling narratives and censoring results. And now AI is front and center in the same issues.
What Can You Do In Your Business?
- Check your AI provider’s terms of service, and see if they have an AI Ethics Board and disclose its members
- Review the members of the AI Ethics Board to aid in assessing risk with using the AI platform. If there’s no AI Ethics Board, that’s a major red flag.
- Test the generative AI tools you’re currently using to measure for bias or slant
- Read the disclosures of the AI platforms used in your business
What happens when the stakes are raised and the programmatic bias is applied to a more capable product like chat-based AI?
It takes a keen eye to notice the subtle bents and biases baked into prompt results from AI tools, but it’s clear there are some wonky issues.
A Quick Way to Get a Read On AI Bias
A good way to test the generative AI you’re using is to ask hot-take questions – about socialism, communism, Donald Trump, Joe Biden, Gavin Newsom – and measure for slant. Which topics are given rosy, optimistic, favorable summaries, and which ones emphasize the negatives? That will begin to give you an indicator. And this isn’t just for text outputs: pay attention to the depiction of those same figures in AI-generated art. Is the depiction positive, negative, or neutral?
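Here is a minimal sketch of how such a spot-check could be automated (assumptions: `query_model` is a stub standing in for whatever chat API you actually use, and the word lists are a crude substitute for real sentiment scoring):

```python
# Crude slant score: positive words minus negative words in a summary.
POSITIVE = {"visionary", "effective", "popular", "successful", "respected"}
NEGATIVE = {"controversial", "divisive", "failed", "criticized", "unpopular"}

def slant_score(text: str) -> int:
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def query_model(prompt: str) -> str:
    # Stub so the sketch runs on its own; swap in a real API call here.
    return "A controversial but effective figure."

for subject in ["Socialism", "Capitalism", "Donald Trump", "Joe Biden"]:
    summary = query_model(f"Summarize {subject} in one sentence.")
    print(f"{subject:15s} slant {slant_score(summary):+d}: {summary}")
```

Run the same prompts across subjects and compare the scores: a consistent gap in one political direction is exactly the kind of slant indicator described above.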
The instant a programmer or group of people starts editorializing the code or the dataset, bias creeps in. You may agree with that slant for now, but you may come to regret supporting it down the road. Ultimately, this compromises the integrity of once-useful tools.
Looking at Disclosures
There would be fewer issues if AI companies were required to disclose how they operate. While regulation in general may make people cringe, in some cases it’s necessary. Perhaps society would be better equipped if such disclosures were required – imagine one that read:
“We are AI company X. We’ve created our AI Platform to reflect our left-leaning values of equal outcomes for all, diversity of pigment over diversity of character, and inclusion of anyone who agrees with our political leanings. We take action to demote, dismiss or ignore other opinions. We are proud of this. Enjoy our platform.”
No company makes such a disclosure. They say nothing for a few reasons. First, they understand the inherently abhorrent intellectual and academic malpractice of creating AIs and algorithms with such characteristics. And second, because they know full well that fewer and fewer people would use the product, and the resulting work product of the platform would probably create more liabilities for users downstream. Maybe they could find a great user base in China or North Korea, but it is doubtful that the West would choose to use this kind of product.