Getting AI tools into people's hands is the easy part. Getting them used well is something else.
I recently ran an internal pilot with Claude across a team that hadn't used it before. We gave people access, set some expectations, and watched what happened.
What became clear, quickly, is that the tool itself is the smallest part of the problem. The harder questions are about context, habits, and the decisions nobody thinks to make explicit upfront.
Here's what I'd tell anyone starting that process.
The tool you pick for a task actually matters
Most people assume the AI is the AI. One chat window, one model, same result regardless of how you get to it.
That's not how it works.
In Claude's case, there are distinct ways to work with it. The standard chat interface is the right starting point for most tasks: drafting, summarising, reviewing a document, writing an email. It's fast, capable, and won't drain your usage allowance on routine work.
Then there's a more powerful model (Opus) available via the model picker. It handles complex, multi-step reasoning better than the default. It also consumes your usage allowance much faster.
On a shared team plan, that matters. Burning Opus on tasks the default model handles fine is the fastest way to hit limits mid-afternoon and leave your colleagues blocked. The practical rule: start with the default. Upgrade deliberately when the task actually requires it.
There's also an agentic mode. It handles multi-step tasks on its own, works directly with files on your machine, and runs without you watching every step. That's powerful when you genuinely need it. It's not a fancier chat window.
Open it for a task that chat could do in 30 seconds and you've spent the usage equivalent of several ordinary requests for nothing. Before handing anything to an agentic tool, use chat first to plan what you want it to do. Getting the instruction right before you start is the single most effective way to keep usage down and results up.
Some modes aren't ready for everything
This is the kind of thing that doesn't appear in the marketing material.
Agentic AI tools are often still in beta. Ours wasn't yet captured in audit logs or compliance reporting. Anything commercially sensitive, client-related, or regulated, and anything you might need a record of later, should not go through agentic modes until that changes.
Conversation history in some tools is stored locally on the user's machine, not centrally. That's fine for low-risk work. It's a genuine governance gap for anything else.
Know which mode stores what, and where. If you're running a pilot, establish those boundaries clearly before people start using the tools, not after.
Prompting is a skill, and most people start badly
The quality of what you get out of these tools depends almost entirely on what you put in. That sounds obvious. It is also routinely ignored.
The most common mistake is telling Claude what you don't want. "Don't make it too long" or "don't be too technical" is vague in a way that produces vague results. Say what you want instead. "Write a three-paragraph summary in plain English for someone who has never heard of this topic" gives the model something to aim at.
A few habits that made a real difference in our pilot (there's a worked example after the list):
Put the document first, your question last. If you're pasting in content for Claude to work with, put the content at the top of the message and your instruction at the bottom. Claude processes the message in order, and putting the instruction after the content it applies to tends to produce noticeably better results.
Be specific about output format. If you want a table, say table. If you want bullet points, say bullet points. Claude will default to whatever format seems reasonable to it, which may not match what you need.
Ask everything at once. If you have three questions about a document, put all three in one message rather than sending them one at a time. You get more coherent answers, and you use less of your allowance.
Tell it who the output is for. "Write a summary for a client who knows nothing about carbon markets" gives you a very different result from "write a summary." Audience context does real work.
When it's wrong, be specific about why. "Make it shorter" is less useful than "cut this to three sentences." Vague correction produces vague improvement.
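Those habits aren't specific to the chat window. If anyone on your team ends up working with Claude through the API instead, the same structure carries over directly. Here's a minimal sketch using Anthropic's Python SDK; the model name and the document text are illustrative, and the prompt layout simply mirrors the habits above.

    # A minimal sketch of the habits above, applied through Anthropic's Python SDK.
    # The model name and document text are illustrative only.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    document = "...pasted report text..."  # the content goes first

    prompt = (
        f"<document>\n{document}\n</document>\n\n"
        # Instruction last, audience stated, output format specified,
        # and every question asked in a single message.
        "You are writing for a client who has never heard of carbon markets.\n"
        "1. Summarise the document in three short paragraphs of plain English.\n"
        "2. List the three biggest risks as bullet points.\n"
        "3. Flag anything in the document that looks uncertain or unsupported."
    )

    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # start with the default model; upgrade deliberately
        max_tokens=1000,
        messages=[{"role": "user", "content": prompt}],
    )

    print(response.content[0].text)

The point isn't the code. It's that the same discipline about structure, audience and format applies wherever the prompt goes.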
Newer versions of Claude are more likely to push back on your framing. They may flag uncertainty or question what you've asked. That's not a bug.
It's the model doing something closer to actual reasoning. Engage with it rather than just re-prompting until you get the answer you expected.
The compliance conversation needs to happen early
Most people won't think about data handling until something goes wrong.
Standard chat interfaces in enterprise AI tools typically have defined data retention and handling policies. Agentic or beta modes may not.
If your organisation works in a regulated sector, or handles client data, get clarity before anyone starts. Know which mode applies which policy before sensitive information goes in.
This isn't a reason not to roll out the tools. It's a reason to be deliberate about what goes where, and to make that distinction clear to the people using them.
Run it as a proper experiment
The teams that get the most from an AI pilot are the ones that treat it as a learning exercise rather than a productivity programme.
That means capturing what people actually tried, what worked, and what didn't. Not in a formal report nobody reads, but in the moment, as part of normal use. What did you ask? What did you get? Was it useful? What would you change?
That data is what lets you move from "everyone has access" to "people know how to use this well." Without it, you just have licences.
The tools are capable. The gap is rarely in the model. It's in the habits, the boundaries, and the judgment about when and how to use what. Those things take deliberate effort to get right, and that effort is worth it before you hand access to a team, not after.