An AI dilemma: how to implement generative AI tools safely and ethically

Posted on August 21, 2023 by Dan Ciruli

Member post by Dan Ciruli, VP of Product at D2iQ

Artificial intelligence is being used in all sorts of ways, from chatbots and virtual assistants to self-driving cars, and 97% of business owners believe that ChatGPT will help their business. As with any new technology, though, there are concerns about safety and ethics.

Some business leaders have recently called for a six-month pause on the development of new models more powerful than GPT-4, warning of “profound risks to society and humanity.” With the introduction of the Biden Administration roadmap to promote responsible innovation and focus investment in AI research and development, it’s clear that these risks must be properly mitigated to ensure that safety and the public good remain at the center of all innovation.

For companies looking to adopt AI at the enterprise level, there is hesitation about the longevity and safety of new generative AI tools, which raises necessary questions: is all AI bad? What ethical concerns do we need to be aware of?

As we work to identify answers to these questions, there are tangible steps companies can take to avoid the ethical dilemmas brought on by data bias. Companies using generative AI must be cognizant of the damage that bias can cause. Large language models (LLMs) are useful, but they rely on large data sets that must be reliable and unbiased.

Ethical challenges of AI

While ChatGPT and other new generative AI tools are tempting, and the opportunities seem endless, integrating them into existing products without caution and careful review can reinforce existing stereotypes and discriminatory practices. These generative AI models rely on large data sets to form their reasoning and explanations, and if those data sets are flawed, the biases will be reflected in the responses and work the models produce.

Biased data used to train these tools can lead to catastrophic results, which is one of the many reasons why an ethical code must be developed and enforced among organizations creating, adopting and integrating these tools into existing products and platforms. For instance, a study by two researchers at the University of Washington found that ChatGPT perpetuates gender stereotypes for occupations across several different spoken languages.

How to harness the benefits of AI – without causing harm

Avoid bias

The most obvious step in creating AI tools that do not suffer from bias is to ensure that the data on which the AI is trained is free of bias. This is at odds with models that are trained on the public internet; there is no way to ensure that data pulled randomly from the internet is free of bias (in fact, bias is virtually guaranteed to exist). However, when targeting very specific use cases, you can limit the input data and, in turn, vet the training data for bias.

Choose use cases wisely

When deciding whether or not to use AI in a particular use case, think about whether and how AI might be affected by bias. You may find use cases that are much less likely to suffer from bias (for example, in my industry, generating Kubernetes YAML from an English description of a deployment topology) than others (for example, writing a job description for an engineering position, which could accidentally introduce gendered pronouns indicating bias).
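For the job description example, one lightweight safeguard is to scan generated text for gendered pronouns before a person reviews it. The following is a minimal sketch under stated assumptions, not a complete bias check: the term list and the function name find_gendered_terms are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately small term list; a real review would need a
# broader vocabulary and human judgment.
GENDERED_TERMS = {"he", "him", "his", "himself", "she", "her", "hers", "herself"}


def find_gendered_terms(text: str) -> Counter:
    """Count gendered pronouns in text, ignoring case."""
    # Split into runs of letters so that "There" is never mistaken for "her".
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(word for word in words if word in GENDERED_TERMS)


draft = "The ideal candidate will own his roadmap, and he will mentor junior engineers."
flagged = find_gendered_terms(draft)
if flagged:
    print("Review before publishing:", dict(flagged))
```

A check like this only flags wording for a human reviewer; it cannot detect subtler forms of bias in how a role is described.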

Protect user privacy

We are more aware than ever of how data is being used – think about the number of times a day you get asked about “cookies” on a website. AI and language models represent yet another way that data can be used, and just like with waves of innovation that preceded this one, we need to ensure that we are protecting data privacy.

If you are planning on using user-submitted content as part of your training dataset, you must at least notify your users that their data can be used in that way. Ideally, you would also allow users to opt out of having their data used in training.
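Honoring an opt-out can be as simple as filtering records before they ever reach a training pipeline. The sketch below assumes a hypothetical Submission record with a training_opt_out flag; the field names are illustrative and do not come from any particular product.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    user_id: str
    text: str
    training_opt_out: bool = False  # hypothetical flag set from the user's privacy settings


def training_eligible(submissions):
    """Return only the submissions whose authors have not opted out of training."""
    return [s for s in submissions if not s.training_opt_out]


submissions = [
    Submission("u1", "How do I scale a deployment?"),
    Submission("u2", "My cluster will not start.", training_opt_out=True),
]
eligible = training_eligible(submissions)
print([s.user_id for s in eligible])
```

Applying the filter at the point where data is collected for training, rather than later in the pipeline, reduces the chance that opted-out content is copied somewhere it is hard to remove.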

Be transparent about how AI is being used

While ChatGPT can be a useful tool and ease many monotonous and routine tasks, it is crucial to be transparent about AI usage – both internally and externally. Proper fact-checking and bias reduction require a thorough understanding of not only how the work was created but also the data set that informed it.

Transparency can help police bias and build trust. Openly sharing how these tools are used lets employees, customers and users make decisions based on their own comfort level, and it encourages a two-way dialogue about the use of such tools.

Large language models

LLMs are the technology behind ChatGPT and other generative AI tools. The best part? They can be trained on private and personalized data sets, mitigating many of the ethical issues that may arise in other use cases.

Enterprise companies looking to adopt generative AI can use LLMs to build AI-driven tools ranging from technical support chatbots to blog post generators. However, a disclaimer is needed here: LLM output, like all code, whether written by a colleague, copied from Stack Overflow or generated by an LLM, must be carefully reviewed and tested before being put into production.
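As one small illustration of testing generated output, consider the Kubernetes use case mentioned earlier. The sketch below assumes the generated manifest has already been parsed into a Python dictionary, and it checks only for a few top-level fields that Kubernetes objects carry; the function name manifest_problems is hypothetical, and a check like this complements human review and proper validation rather than replacing them.

```python
# Top-level fields expected on a Kubernetes object such as a Deployment.
REQUIRED_TOP_LEVEL = ("apiVersion", "kind", "metadata", "spec")


def manifest_problems(manifest: dict) -> list[str]:
    """Return a list of problems found in a parsed manifest; empty means none found."""
    problems = [f"missing field: {key}" for key in REQUIRED_TOP_LEVEL if key not in manifest]
    # Every object also needs a name under metadata.
    if not (manifest.get("metadata") or {}).get("name"):
        problems.append("missing field: metadata.name")
    return problems


# Example of generated output that is missing its metadata section.
generated = {"apiVersion": "apps/v1", "kind": "Deployment", "spec": {"replicas": 3}}
print(manifest_problems(generated))
```

Passing a structural check like this says nothing about whether the manifest does what was asked, which is why review by a person who understands the deployment is still required.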

While it is important for the industry as a whole to take steps to ensure ethical models are being enforced, companies themselves must also take on the responsibility of reducing bias when implementing new generative AI tools. As the generative AI landscape continues to evolve and new models are introduced to the market, companies should keep a close eye on not only how these new models can benefit their organizations, but also the broader impacts of implementing these technologies on a larger scale.