AI Governance / Risk Assessment

AI risk, sized for your business

You don't need a risk register or a Big Four consultancy. You need to think clearly about three categories of risk and check in on them quarterly.

Reviewed by Level Up Automate.
TL;DR
  • Three risk buckets cover most of the surface: data, accuracy, and dependency.

  • Score each tool 1–5 in each bucket. Anything that scores 4+ in any bucket gets human review.

  • Re-check at every contract renewal and any time the tool announces a major change.

Bucket 1: Data risk

What can the tool see, and where does that information go? A coding assistant that sees your source code is high-data-risk. A meeting note-taker that records customer calls is high-data-risk. A grammar checker running on internal memos is low-risk.

For each tool, ask: what data type is it touching, where is it stored, and could a leak embarrass us with a customer?

Bucket 2: Accuracy risk

What happens if the tool is wrong? An AI that drafts a customer email, where a human reviews the draft before sending, is low-accuracy-risk. An AI that prices proposals automatically is high. An AI that triages support tickets and recommends actions is medium, depending on what those actions are.

For each tool, ask: who reads the output before it acts on the world, and how bad is a mistake that slips through?

Bucket 3: Dependency risk

What happens if the tool stops working tomorrow? If sales reps can't draft proposals, you'll feel it but recover. If your support team has rebuilt their workflow around an AI summarizer that gets shut down, you have an outage. The deeper the tool is wired in, the more you need a backup plan.

For each tool, ask: how long until we'd feel the pain if this disappeared, and what's our fallback?

How to score and what to do

Make a simple table: tool name, three columns scored 1–5, plus one column for who owns it. Score it together with the team that uses it — not in isolation in the executive suite. The conversation alone surfaces 80% of what you need to know.

Anything scoring 4 or 5 in any column gets named in your written policy with a clear human-review step. Anything scoring 5 in two or more columns gets a second look — should it really be in production yet?
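The scoring rules above are simple enough to sketch in code. This is an illustrative sketch only: the tool names, owners, scores, and the `ToolScore` class shape are assumptions, not anything prescribed by the article.

```python
# Minimal sketch of the risk table: one row per tool, three buckets
# scored 1-5, plus an owner column. All example data is hypothetical.
from dataclasses import dataclass

@dataclass
class ToolScore:
    name: str
    owner: str
    data: int        # data risk, 1-5
    accuracy: int    # accuracy risk, 1-5
    dependency: int  # dependency risk, 1-5

    def needs_human_review(self) -> bool:
        # Rule 1: a 4 or 5 in any bucket gets a named human-review step.
        return max(self.data, self.accuracy, self.dependency) >= 4

    def needs_second_look(self) -> bool:
        # Rule 2: a 5 in two or more buckets means asking whether the
        # tool should be in production at all yet.
        return [self.data, self.accuracy, self.dependency].count(5) >= 2

stack = [
    ToolScore("grammar checker", "ops", 1, 1, 1),
    ToolScore("proposal pricer", "sales", 3, 5, 5),
]

flagged = [t.name for t in stack if t.needs_human_review()]
print(flagged)  # the grammar checker passes; the pricer is flagged
```

Even a spreadsheet version of this table works; the point is that the two thresholds are mechanical, so the hard part stays where it belongs, in the scoring conversation with the team.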

Common questions

Plain-English answers

Do I need to do this for every AI tool?
Yes, but the bar is low for low-risk tools. A grammar checker scoring 1/1/1 takes 5 minutes to assess. The whole table for a 10-tool stack should take under an afternoon.

How often do I redo this?
Quarterly, plus at each vendor contract renewal. Vendors change behavior: model upgrades, new data uses, sub-processor changes. Yesterday's score isn't always today's.
Next step

Want a hand getting this right?

A 30-minute conversation often saves weeks of guessing. We'll talk through your team, your data, and what to do first — no slide deck required.