For AI to be a helpful tool for people, trust is key. When creating AI solutions the space needs to actively be created to include the end user in the loop. Not only is this ethical, as we want AI to be a net benefit to society by supercharging not replacing humans, but the human context is also needed to make sure the model output makes sense in the real world. We’ve already seen issues with this in the more recent GenAI wave with hallucinations in Large Language Models (LLMs) like ChatGPT, where facts are outputted which sound plausible but are actually made up. This naturally impacted trust and caused many to disregard these tools altogether.

So, how do we make sure these AI models earn and warrant our trust? For context, the focus of this article will be on B2B applications of AI for the purpose of company internal tooling. A big component of creating trust is having greater transparency built into the AI tool to help communicate to the end user the decision-making process that is taking place by the AI model. A large part of this is explaining the underlying assumptions and the data inputs the model is referencing to avoid the “black box” problem.

Black box problem: An AI model is referred to as a “black box” if it lacks transparency and it’s difficult to understand how it arrived at an output - whether that’s a decision, prediction or observation.

Human-In-The-Loop

The human-in-the-loop approach means designing the technology with human participation in mind. The end user can give the AI a helping hand so-to-speak to make sure its output makes sense in the real world and can be operationalized. This is where optimizing for accuracy does not always produce the best overall results for the user, since it might not take into account their requirements or restrictions. When it comes to wider context about the nuances of the problem at hand or the industry someone might be operating in, having the ability for someone to provide helpful nudges and context to the model can improve output quality.

A sheep in front of a classroom blackboard with the calculation 2+2=5 written in white chalk
Image Credit: Elimende Inagella

Investigation Engine for Output

Creating a workflow for the end-user to investigate model outputs and carry out their own retrospective will help people feel they’re in control of the process. This can be particularly useful for AI tools that go beyond Q&A and are used for assessment and data analysis. A dedicated environment for investigating would allow the end user to slice and dice both positive and negative conclusions, as well as dive into inconclusive and inconsistent results, to establish a root cause.

Explainability at the Data Level

The cleaning process involved in making sure data is ready for being inputted into an AI model is tedious for a human to carry out - from identifying errors to finding format inconsistencies and gaps in the data. AI technologies are at a point now where they’re able to help with automating the data cleaning process, which is great for efficiency but also needs to be explainable to mitigate the risks involved in the cleaning process. Mistakes during this processing step can introduce bias into your AI model. The risk of mistakes can increase if the data is high-dimensional i.e. the data has a large number of features or variables. An example of introducing bias would be if critical outliers are removed as ‘noise’. Removing outliers which are naturally occurring and not due to errors, means your model no longer accounts for surprises which are prone to happen, making the process appear more predictable than it actually is. You can mitigate this by communicating to the end user at the data level and not just at the stage of model output. This can be done by providing an assessment of the overall quality of the dataset looking at factors such as size and recency of the data. Also, it’s helpful to have an audit trail built into your tool explaining where and how data has been corrected during the data cleaning process.

Built-in Model Guardrails

Building clear and transparent guardrails into your AI tool can help make sure the necessary checks are in place to safeguard users and the system itself. Guardrails can take many forms, they can be internal company best practices and frameworks for how to develop AI models in ways that are safe, ethical and compliant. Then you can have technical guardrails which are embedded into the model itself. In the context of LLMs, these could be bits of code that detect toxicity in language, misinformation, bias, or age-appropriate output. Guardrails can be used at different stages, so for LLMs, prompts entered by the user can be screened to make sure they won’t cause the model to misbehave - this can be done unwittingly by the end user or with malicious intent to exploit security vulnerabilities. An example of this is engineering prompts to find ways of accessing servers, but guardrails can be created to recognise common patterns in these security attacks. Again, transparency is important. Internally, this can involve documenting the purpose, operation and rationale behind the guardrails as well as making sure they’re regularly audited and allow for input from a third-party. Also, when the technical guardrails are triggered by the user, the underlying assumptions need to be communicated to avoid confusion and frustration. Explaining why they triggered the guardrails and presenting corrective actions will increase confidence in the tool.

A lot of what underpins trust in AI is centred around transparency and explainability but you want to get the level right. It’s important to find the right balance between making sure it’s clear why certain decisions are being made by the AI model and not overloading the end user with information. The aim is to surface the most critical decisions and outputs for human consideration. Deciding what seems right can depend on who the end user is. Someone with a higher level of technical expertise may want to access greater technical context in explanations. Operators might be less interested in the technicalities but will want an indicator of accuracy or confidence level when AI is being used to draw its own conclusions. Most of the time though, it’ll be a case of information being more ambient, available for subconscious consideration, but under specific conditions might come more to the foreground. And sometimes in critical situations, there may even be the need to alert the user’s attention.

Image Credit: Cash Macanaya