
Merck is using AI agents to reduce drug discovery cycles by a third and ship compliant marketing materials up to 80% faster, but Vice President of Digital Platforms Sean Finnerty says the only reason it’s working is because they built the infrastructure first.
And the pharmaceutical manufacturer is seeing promising early results: AI is generating marketing drafts that are “99% correct” when it comes to compliance, reducing review cycles from months to days and speeding up delivery by 70% to 80%. Meanwhile, in the company’s medical research, an AI-assisted discovery cycle was reduced by 33%.
Still, agentic AI only works if companies first build the underlying “plumbing,” meaning digital platforms and services, Finnerty said at a recent AI Impact Series event.
“If we do one-off things, we’re going to end up with thousands and thousands of things that will ultimately just be debt that we have to deal with later,” he said. “And that will be a drag on any further innovation.”
Starting with plumbing
Merck’s strategy of prioritizing plumbing comes from lessons learned during the early days of the cloud in the 2010s, “when no one knew what the heck was going on,” Finnerty said.
Getting the cloud right meant building from scratch. At Merck, that infrastructure now supports 2,500 AWS accounts, numerous Microsoft Azure subscriptions, and new Google Cloud Platform (GCP) integrations.
“AI will be exactly the same,” Finnerty said. “We’re going to have thousands and thousands of agents.” Then the questions accumulate: How do they register? How do you secure them? How do you ensure they are connected to the right tools and have access to the right data and the right context?
The delivery of context is also critical; Merck works with three hyperscalers and has forty-seven edge locations and hundreds of databases. “Many, many petabytes” of structured and unstructured data are stored in Oracle databases, SQL databases, Excel spreadsheets, phone transcripts and other repositories, Finnerty said.
His team is building scaffolding to provide meaningful context in various situations, he explained. Data must be organized and incorporated into multiple platforms, because “there is no single solution to solve all problems.” Sometimes it’s Databricks, other times it’s Amazon Redshift, “plus four other things.”
The goal is, “Let’s make this easy and frictionless for people, let’s secure it and make sure it’s well integrated with MCP (model context protocol), A2A (Agent2Agent) and upstream computing,” Finnerty said. “If you want to run things on GCP or you want to run things on AWS, we have the pipelines in place so you can run your adjacent workloads wherever you want.”
How Merck uses agents
As it develops its technical system, Merck is experimenting with agents in regulated business operations, scientific discovery workflows, and application modernization.
In particular, AI is accelerating drug discovery. Finnerty explained that scientists look at molecular structures and disease states to determine whether a given condition can be treated with drugs. But even if a disease is known, developing a drug to combat it can take years.
Now, with AI, teams are starting to see “very promising things,” like reducing a particular research cycle by a third. “That’s a year off the life of the discovery cycle,” Finnerty said. “Which means that, in theory, we can get it to a patient who needs that therapy a year faster.”
Once developed and approved, these products are regulated and marketing materials related to them must be clearly and explicitly articulated. “The way that information is communicated by market, by country, by state, by region, is very carefully governed and regulated,” Finnerty said. It’s also variable: a vaccine advertising campaign in the state of Georgia looks very different from one launched in Canada.
Historically, humans did the due diligence to make sure the company complied with various laws. Draft materials went through rounds of revisions; when a mistake was discovered, “they put it back to the beginning, repeat it again, and then it takes another few weeks and months,” Finnerty said.
But now, AI can do that “much, much more effectively,” and the process is increasingly evolving from a human in the loop to essentially a “human as governor.” With human supervision, AI can deliver a first draft in a day or a week with 99% compliance, allowing teams to ship materials up to 80% faster.
Meanwhile, when it comes to application modernization, AI can discover architecture; document data interactions, APIs, and network paths; and perform authentication and authorization checks. It can also write Terraform code for deployment and refactor JavaScript into Python.
Where before the company would have spent weeks and months and hundreds of thousands of dollars to update an app, Finnerty said, now agents are handling the job through prompts.
Running into “craziness”
That’s not to say there aren’t significant challenges. Finnerty noted that his team has run into some “crazy things,” for example in automated code and scenario testing. AI has shamelessly made up scenarios, whether due to incorrect context or infrastructure, “or whether it was just getting creative: ‘You should try these three functions that don’t even exist in the code you’re trying to test.’”
“That surprised me a little because I thought we had already overcome some of the challenges of hallucinations in these later models,” he said.
To address this, his team has designed guardrails to keep hallucinations to a minimum, essentially using AI to monitor the AI and apply confidence scores. So if Claude created the first result, they will tell Microsoft Copilot to evaluate it.
“So if you ask something once, have the AI review it, and then ask it a third time, the confidence increases each time and minimizes some of the garbage that is created in the first few runs,” Finnerty said.
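The review loop Finnerty describes can be sketched roughly as follows. This is a minimal illustration, not Merck's implementation: `cross_review`, `ReviewResult`, `known_functions_reviewer` and the 0.8 threshold are all hypothetical names and values. The reviewer here is a simple stand-in for a second model; it scores a generated test scenario by the share of referenced functions that actually exist in the code under test.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Set

# A reviewer takes a draft and returns a confidence score in [0, 1].
# In practice this would wrap a call to a second, independent model (assumption).
Reviewer = Callable[[str], float]


@dataclass
class ReviewResult:
    draft: str
    scores: List[float]
    accepted: bool


def cross_review(draft: str, reviewers: List[Reviewer], threshold: float = 0.8) -> ReviewResult:
    """Accept a draft only if every independent reviewer scores it at or above the threshold."""
    scores = [review(draft) for review in reviewers]
    accepted = bool(scores) and min(scores) >= threshold
    return ReviewResult(draft, scores, accepted)


def known_functions_reviewer(known: Set[str]) -> Reviewer:
    """Hypothetical stand-in reviewer: penalize references to functions that do not exist."""
    def review(draft: str) -> float:
        # Collect every identifier that is followed by an opening parenthesis.
        called = set(re.findall(r"\b([A-Za-z_]\w*)\s*\(", draft))
        if not called:
            return 1.0
        return len(called & known) / len(called)
    return review


reviewer = known_functions_reviewer({"parse_order", "apply_refund"})
good = cross_review("Test parse_order() with an empty cart.", [reviewer])
bad = cross_review("Test parse_order() and refund_all().", [reviewer])
print(good.accepted, bad.accepted, bad.scores)
```

The second draft is rejected because one of its two referenced functions, `refund_all`, is not in the known set, which mirrors the invented-function problem described above.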
Agentic AI use cases in financial services
Meanwhile, at Mastercard, chief data officer Andrew Reiskind and his team are focusing agent experimentation on highly orchestrated transaction and dispute workflows. As he noted, a chargeback or fraud dispute is not a one-time event.
When a consumer disputes a charge (usually online), that “starts a completely different process on the back end that tends to be very labor-intensive,” Reiskind said.
Mastercard has to collect details about the actual dispute; then the merchant has its own investigations (Was the card reported lost or stolen? Does the consumer frequently dispute charges?). Additionally, the network in between has its own rules for timing and sending information.
“You have each and every one of these steps, many of which are unstructured, but there are also structured data elements,” Reiskind said. The question of whether a card was lost or stolen tends to be structured, but the consumer complaint is “unstructured data of questionable reliability.”
“So we’re looking at a decision-making system that has deterministic decisions, but also probabilistic decisions,” he said.
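One way to picture a system that mixes deterministic and probabilistic decisions is the sketch below. It is an illustration under stated assumptions, not Mastercard's process: the `Dispute` fields, the `route_dispute` function, the 0.9 threshold, and the rule that a low score goes to a human rather than to an automatic denial are all hypothetical. The validity score stands in for whatever a model might estimate from the unstructured complaint.

```python
from dataclasses import dataclass


@dataclass
class Dispute:
    # Structured field: can be checked with a deterministic rule.
    card_reported_lost_or_stolen: bool
    # Hypothetical model output in [0, 1], estimated from the unstructured complaint text.
    complaint_validity_score: float


def route_dispute(dispute: Dispute, auto_approve_at: float = 0.9) -> str:
    """Apply the deterministic rule first, then fall back to the probabilistic score."""
    # Deterministic decision: a structured fact settles the case.
    if dispute.card_reported_lost_or_stolen:
        return "approve"
    # Probabilistic decision: act automatically only when the score is high.
    if dispute.complaint_validity_score >= auto_approve_at:
        return "approve"
    # Anything uncertain goes back to a human representative (assumed policy).
    return "human_review"


print(route_dispute(Dispute(True, 0.10)))
print(route_dispute(Dispute(False, 0.95)))
print(route_dispute(Dispute(False, 0.40)))
```

Sending every uncertain case to a person is a deliberately conservative choice in this sketch; where the handoff threshold sits is exactly the kind of cost and trust trade-off raised in the questions that follow.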
This problem can be accelerated and potentially solved by AI agents, but it can be a complex process: What tasks are you giving the agents? When will things be returned to the human representatives? How many agents are you ultimately using? What are the cost implications?
Then there are reputation issues and costs: Did you just potentially call a consumer a liar when they weren’t lying?
“It’s an exact problem that you, as a bank, want to maintain the trust of your consumer,” Reiskind said. “But we also want to make this efficient and eliminate costs from the system.”
The PB&J Versus Turkey Mistake: Determining Which Risks Are Acceptable
There will always be risks with AI, and companies should evaluate them early in product design, Reiskind said. There is also the question of acceptable risk.
For example: Did you serve a customer a peanut butter and jelly sandwich instead of a turkey sandwich (a minor inconvenience)? Or did you serve gluten to someone with celiac disease?
“Is it an acceptable risk if one percent of the time you make the mistake? If it is, let’s move on to the next stage of how to mitigate that risk,” Reiskind said.
Leaders must conduct cost-benefit analyses, breaking down problems into their “constituent parts” and calculating the cost of each. But these are estimates; it’s almost impossible to forecast actual usage, Reiskind said. “It’s not a simple process to get to the cost,” he said. “But it’s doable.”
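As a rough illustration of breaking a workflow into constituent parts and costing each one, the sketch below adds a step's running cost to its error rate multiplied by the cost of an error. Every name and number is a made-up assumption for the example, including the one percent error rate borrowed from Reiskind's question; none of it is Mastercard data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    name: str
    cost_per_run: float    # assumed cost of running the step once
    error_rate: float      # assumed share of runs that go wrong
    cost_per_error: float  # assumed cost of cleaning up one mistake


def expected_cost(step: Step) -> float:
    """Expected cost of one run: running cost plus the probability-weighted cost of an error."""
    return step.cost_per_run + step.error_rate * step.cost_per_error


def total_expected_cost(steps: List[Step]) -> float:
    return sum(expected_cost(s) for s in steps)


# Illustrative figures only.
steps = [
    Step("collect dispute details", cost_per_run=0.02, error_rate=0.01, cost_per_error=5.0),
    Step("assess consumer complaint", cost_per_run=0.10, error_rate=0.01, cost_per_error=50.0),
]

for s in steps:
    print(f"{s.name}: {expected_cost(s):.2f}")
print(f"total: {total_expected_cost(steps):.2f}")
```

With these invented inputs, the same one percent error rate adds far more to the second step than to the first, because its mistakes are assumed to be ten times as expensive to fix.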





