Your Image Library Is Full of Information You Can't Find. Here's How to Fix That with AI.
A few thousand photos. Zero searchable content. That's the state of most enterprise image libraries today — and it's costing teams more time than anyone tracks.
Walk into any construction company’s SharePoint site and you’ll find the same pattern: a document library packed with site photos, project images, building interiors — all named exactly the way every camera and phone names them. IMG_4821.jpg. IMG_4822.jpg. And so on, for thousands of files.
Switch to tile view and you can squint at thumbnails one at a time. But the moment someone needs “the photo with the workbench from the Q3 site visit,” there’s no shortcut. Filenames carry no information, and search fares no better, because it has no idea what’s actually inside an image, only what it’s called.
This isn’t a construction-specific problem. It’s an *image-heavy industry* problem. Healthcare libraries full of clinical and equipment photos. Legal teams managing evidence and exhibit images. Real estate teams sitting on thousands of property photos. In every case, the content sits there, fully visible to a human, completely invisible to a system.
The cost shows up as time — minutes per search multiplied by however many times a week someone needs to find “that one image” — and as risk, when the right image never surfaces at all because no one thought to name it well enough to find it later.
What I actually did about it
I recently worked through this exact scenario using **Skills**, a capability inside Copilot in SharePoint, and it changed how I think about image metadata entirely.
Here’s the sequence, step by step:
-- Selected a single image in the library and asked Copilot to identify the *primary objects* in it. Within seconds, it returned a clean, comma-delimited list: pencil, caliper, ball bearing, technical drawing, wooden table.
-- Asked again — same image, different lens — for the *background objects*. Copilot came back with wooden table, technical drawing paper, pencil.
-- Told Copilot to write both outputs directly into the file’s metadata columns. No manual tagging, no spreadsheet export, no second tool. Just a prompt, and the document library’s metadata updated in place.
That alone was a meaningful improvement. But here’s the part that matters for anyone managing this at scale: doing this one image at a time isn’t a workflow, it’s a chore. Nobody is going to manually prompt Copilot for every photo in a 5,000-image library.
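To make the tagging step concrete, here is a minimal Python sketch of what "comma-delimited list in, metadata columns out" amounts to. It is not how Copilot does it internally. The column names `PrimaryObjects` and `BackgroundObjects` and both function names are my own placeholders, so substitute whatever columns your library actually defines.

```python
def parse_tags(raw: str) -> list[str]:
    """Split a comma-delimited reply into clean, lowercase, de-duplicated tags."""
    seen: set[str] = set()
    tags: list[str] = []
    for part in raw.split(","):
        tag = part.strip().lower()
        if tag and tag not in seen:
            seen.add(tag)
            tags.append(tag)
    return tags


def build_metadata(primary_raw: str, background_raw: str) -> dict[str, str]:
    # Hypothetical column names; use the ones defined in your own library.
    return {
        "PrimaryObjects": ", ".join(parse_tags(primary_raw)),
        "BackgroundObjects": ", ".join(parse_tags(background_raw)),
    }


# The two replies quoted in the steps above.
metadata = build_metadata(
    "pencil, caliper, ball bearing, technical drawing, wooden table",
    "wooden table, technical drawing paper, pencil",
)
print(metadata)
```

Normalizing case and dropping duplicates before writing keeps the columns consistent from file to file, which matters once search depends on them.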
Turning a sequence of prompts into a reusable skill
So I asked Copilot to package the entire sequence — identify primary objects, identify background objects, write both to metadata — into a Skill. I named it **Construction Image Tagger**.
What came back was a `skill.md` file, saved automatically to the library’s agent assets folder. Opening it up, the structure is worth understanding because it’s the actual mechanism that makes this repeatable:
-- **Trigger phrases** — natural-language cues like “tag this construction image” that tell Copilot when to invoke the skill, fully editable if your team uses different language
-- **Direction on when to use it** — context for the agent about the intent behind the skill
-- **Inputs** — what the skill expects to receive (in this case, one or more selected images)
-- **Steps** — the exact sequence of actions, written out plainly
-- **Output** — what gets returned and where it gets written
-- **Error handling** — a built-in instruction that if any step fails or returns empty, the skill says so plainly rather than guessing or fabricating tags
That last point is the one I’d highlight to any business leader evaluating this. This isn’t a brittle macro that breaks silently. It’s a documented, inspectable, editable asset that fails loudly instead of quietly producing bad metadata.
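For readers who think in code, the sections of the skill file map roughly onto a small data structure. The sketch below is an illustrative stand-in, not the real `skill.md` format: the `Skill` class, the `SkillStepError` exception, and the stubbed steps are all hypothetical. What it does show is the "fail loudly" rule, where an empty result stops the run instead of producing tags.

```python
from dataclasses import dataclass, field
from typing import Callable


class SkillStepError(Exception):
    """Raised when a step returns nothing, so no tags are guessed."""


@dataclass
class Skill:
    # Fields loosely mirror the sections listed above (illustrative only).
    name: str
    trigger_phrases: list[str]
    steps: dict[str, Callable[[str], str]] = field(default_factory=dict)

    def matches(self, prompt: str) -> bool:
        """True if the prompt contains any trigger phrase."""
        text = prompt.lower()
        return any(phrase.lower() in text for phrase in self.trigger_phrases)

    def run(self, image_name: str) -> dict[str, str]:
        """Run every step in order; stop at the first empty result."""
        output: dict[str, str] = {}
        for step_name, step in self.steps.items():
            result = step(image_name).strip()
            if not result:
                raise SkillStepError(f"{step_name} returned nothing for {image_name}")
            output[step_name] = result
        return output


tagger = Skill(
    name="Construction Image Tagger",
    trigger_phrases=["tag this construction image"],
    steps={
        # Stubs standing in for the image analysis; the second one
        # simulates an empty result.
        "primary_objects": lambda image: "pencil, caliper, ball bearing",
        "background_objects": lambda image: "",
    },
)

print(tagger.matches("Please tag this construction image"))
try:
    tagger.run("IMG_4821.jpg")
except SkillStepError as err:
    print(f"Skill stopped: {err}")
```

Note that `run` returns nothing at all when any step comes back empty, so a file never ends up with half-written metadata.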
Proof: does it actually hold up at scale?
I selected a batch of images and ran the skill against all of them in a single command. Metadata populated automatically across every file: primary objects, background objects, no repeated prompting required.
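Conceptually, a batch run is just the single-image sequence applied in a loop, with failures recorded per file so one bad image does not stop the rest. Here is a hedged sketch of that idea. `tag_batch` and `fake_tagger` are hypothetical names, and the file that fails is simulated purely for illustration.

```python
from typing import Callable


def tag_batch(
    image_names: list[str], tag_one: Callable[[str], str]
) -> tuple[dict[str, str], dict[str, str]]:
    """Tag every image; collect failures instead of stopping the batch."""
    tagged: dict[str, str] = {}
    failed: dict[str, str] = {}
    for name in image_names:
        try:
            tagged[name] = tag_one(name)
        except ValueError as err:
            failed[name] = str(err)
    return tagged, failed


def fake_tagger(image_name: str) -> str:
    # Stand-in for the real skill; one file is made to fail on purpose.
    if image_name == "IMG_4823.jpg":
        raise ValueError("no objects identified")
    return "pencil, caliper"


tagged, failed = tag_batch(
    ["IMG_4821.jpg", "IMG_4822.jpg", "IMG_4823.jpg"], fake_tagger
)
print(f"tagged: {len(tagged)}, failed: {len(failed)}")
for name, reason in failed.items():
    print(f"  {name}: {reason}")
```

Reporting the failed files by name gives whoever ran the batch a short, specific list to review by hand.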
Then the test that actually matters to a business user: I went to SharePoint search and typed “workbench.” It returned the correct image — not because anyone had renamed the file, but because the object was now metadata, not just pixels. I asked Copilot directly, “find me an image that includes a workbench.” Same result, same image, conversational interface this time instead of a search box.
That’s the moment the use case stops being a demo and starts being infrastructure: the same metadata now powers both traditional search *and* AI-driven retrieval, sorting, and analysis, without maintaining two separate systems.
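The reason the search suddenly works is easy to show with a toy example. The snippet below is not SharePoint search, only a plain substring match over a made-up two-item library, and the tags on `IMG_4822.jpg` are invented sample data. It illustrates the difference between searching filenames alone and searching filenames plus object metadata.

```python
# Made-up sample library; column names are hypothetical.
library = [
    {"FileName": "IMG_4821.jpg", "PrimaryObjects": "pencil, caliper, ball bearing"},
    {"FileName": "IMG_4822.jpg", "PrimaryObjects": "workbench, vise, clamp"},
]


def search(items: list[dict[str, str]], term: str, fields: list[str]) -> list[str]:
    """Return file names where the term appears in any of the given fields."""
    needle = term.lower()
    return [
        item["FileName"]
        for item in items
        if any(needle in item.get(name, "").lower() for name in fields)
    ]


# Filenames only: nothing to match on.
print(search(library, "workbench", ["FileName"]))  # []

# Filenames plus tags: the image is now findable.
print(search(library, "workbench", ["FileName", "PrimaryObjects"]))  # ['IMG_4822.jpg']
```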
Why this matters beyond construction
The pattern here is industry-agnostic, and that’s the more important point. Anywhere images carry information that the filename and folder structure can’t capture, this same approach applies directly:
-- **Healthcare** — clinical photos, equipment images, facility documentation, made searchable by content rather than by whatever naming convention happened to be used at capture time
-- **Legal** — evidence photos and exhibit images, where being able to search “find every image showing X” during discovery or case prep has obvious time and cost implications
-- **Real estate and facilities** — property and asset photos, searchable by feature rather than by address or file number alone
In every one of these, the underlying mechanics are identical to what I showed above: select images, ask Copilot to identify what’s in them, write the result to metadata, then package the sequence into a skill so it doesn’t depend on someone remembering the right prompts.
This is also, I’d argue, a useful lens for any leader thinking about where AI investment pays off fastest. It’s not always the flashiest agent scenario. Sometimes it’s making an asset you already own — your existing image library — actually usable.
Where to start
If you’re sitting on an image library with this exact problem, here’s the honest scope of the lift: this took a handful of prompts and a few minutes to set up as a reusable skill. No development team, no separate AI platform, no migration.
Watch the full demo here:
YouTube link:
If you’re working through similar AI adoption questions in your own organization — particularly around making existing content actually usable rather than chasing net-new AI projects — subscribe here for more practitioner walkthroughs like this one, and follow along on LinkedIn for the shorter-form version of these as they happen.
The summary
Most enterprise image libraries are full of information that’s invisible to search, because the only thing a system can “see” is a filename. Copilot in SharePoint, through Skills, closes that gap: it identifies what’s actually in an image, writes that as metadata, and turns the whole sequence into a reusable, inspectable asset instead of a one-off prompt. I showed it on construction site photos, but the same handful of steps applies to any image-heavy industry: healthcare, legal, real estate, and beyond. The barrier here isn’t technical complexity. It’s simply knowing the capability exists.
