April 16, 2026 · Edited recording, 51 minutes
Semantic Layers, TAM Building, and Signal-Based Targeting
Building semantic layers for GPT database queries, curating target account lists from broad TAMs, and using event signals for dynamic lead enrichment.
Jai walked through building semantic layers that make GPT-based database queries accurate and repeatable, including using GPT itself to auto-generate metric definitions from existing reports. The session covered the full TAM list-building workflow: start broad, enrich with signals, narrow to 30-40 high-quality accounts. Willy and Matt joined for a hands-on discussion on event-based targeting, the reply bot for multi-threaded outreach, webhook vs. polling cost tradeoffs, and how Deepline fits into n8n and Snowflake workflows.
Key takeaways
- Use GPT to auto-generate semantic layer definitions from existing reports; this gets you roughly 95% of the way there and avoids the unsustainable manual approach.
- Keep semantic layers separate from system prompts so they can be parsed deterministically, independent of the LLM instructions.
- Start TAM building with a small set of known good fits (30-40 minimum), then let Claude Code discover additional signals the sales team missed.
- Geographic outreach bias is invisible until you look at the data; reps unconsciously default to their own time zone.
- Serper.dev gives roughly 90% LinkedIn profile coverage at under a penny per lookup, dramatically cheaper than dedicated providers.
- Webhooks are better for real-time signals; scheduled polling works for daily checks. Cost depends on whether you pay per check or per event.
What you'll learn
- How to build a semantic layer that makes database queries accurate and repeatable for GPT.
- How to narrow a million-record TAM to 30-100 high-priority accounts using signal-driven filtering.
- How to uncover outreach inefficiencies caused by rep bias using enrichment data.
- How to use Serper.dev as a cost-effective LinkedIn profile lookup in your waterfall.
- How to set up a managed agent with a Slack interface for rep-driven multi-threading.
- How to choose between webhook-triggered and schedule-triggered workflows based on cost.
Chapters
00:00:00 · Semantic layers for GPT database querying
00:01:50 · Auto-generating metric definitions with GPT
00:04:30 · Keeping semantic layers separate from system prompts
00:07:06 · TAM list building: big list to small, curated list
00:09:50 · How many accounts is enough? 30-40 minimum, 100 ideal
00:13:16 · Geographic bias in outreach: the West Coast rep story
00:20:00 · Waterfall enrichment and Serper.dev for LinkedIn profiles
00:33:00 · Claude Code workflows: webhooks, validation, CRM updates
00:41:05 · Reply bot and managed agents via Slack
00:47:12 · Signal-based targeting: webhooks vs. polling
Edited transcript
Edited transcript of the public recording. Dead air, setup chatter, and repeated filler were removed from the page version.
00:00:04 · Jai Toor
So this is one that I put together and this is running queries on a database. You can think of it as what are the key concepts they need to know. Instead of querying the data model every time, you define synonyms -- when someone says company, account, deal, opportunity, you have those mapped. You can define concepts consistently, add custom metrics, and the GPT always gets that definition first.
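A minimal sketch of the synonym-mapping idea Jai describes. The file format, entity names, and metric formula here are illustrative assumptions, not Deepline's actual schema; the point is that loose user language resolves to canonical names before any query is built.

```python
import json

# Illustrative semantic layer (made-up names): synonyms map user terms to
# canonical schema entities; metrics carry one agreed-upon definition that
# the model always receives before querying.
SEMANTIC_LAYER = json.loads("""
{
  "synonyms": {
    "company": "account",
    "client":  "account",
    "deal":    "opportunity"
  },
  "metrics": {
    "win_rate": "closed_won_opportunities / total_closed_opportunities"
  }
}
""")

def canonicalize(term: str) -> str:
    """Resolve a user-facing term to its canonical schema name."""
    return SEMANTIC_LAYER["synonyms"].get(term.lower(), term.lower())

print(canonicalize("Company"))  # account
print(canonicalize("deal"))     # opportunity
```

Because the layer is plain structured data, the same file can feed the GPT prompt and a deterministic resolver without either depending on the other.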
00:01:50 · Jai Toor
The hard part is building this, right? Now I have to go define every single concept, which isn't sustainable. So what we found is you give GPT the Salesforce table and the reports you're trying to generate, and say: recreate this. What that captures is the metrics behind the definitions. You have to manually spot-check, but for the most part that gets you 95% of the way there.
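The auto-generation step can be sketched as a prompt builder: feed the model the table schema plus a report you already trust and ask it to emit the metric definitions behind it. The prompt wording and schema below are illustrative, not the exact prompt used in the session.

```python
def build_definition_prompt(table_ddl: str, report_sql: str) -> str:
    """Assemble a prompt asking a model to reverse-engineer metric
    definitions from an existing, trusted report (wording illustrative)."""
    return (
        "Here is a Salesforce table schema:\n"
        f"{table_ddl}\n\n"
        "Here is a report we already trust:\n"
        f"{report_sql}\n\n"
        "Recreate this report. For every metric it computes, emit a name, "
        "a one-sentence definition, and the exact formula, so the output "
        "can be saved into our semantic layer for spot-checking."
    )

prompt = build_definition_prompt(
    "CREATE TABLE opportunity (id TEXT, stage TEXT, amount NUMERIC);",
    "SELECT COUNT(*) FROM opportunity WHERE stage = 'Closed Won';",
)
```

The model's output then gets spot-checked and merged into the semantic layer file, rather than being trusted blindly.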
00:03:35 · Willy Hernandez
Do you do this in one running markdown file or do you separate semantic from a field reference?
00:03:51 · Jai Toor
Very, very separate. The semantic layer should work independent of the system prompt. You should be able to do something deterministic and programmatic on top of it. It's not free text. You're telling the system prompt to use it in the first step, but from then on the semantic layer is independent.
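"Deterministic and programmatic on top of it" can mean simple schema checks that run without any LLM in the loop. A sketch, with a made-up layer structure (`entities`, `synonyms`, `metrics` keys are assumptions for illustration):

```python
def validate_layer(layer: dict) -> list[str]:
    """Deterministic checks that need no LLM: every synonym must point at
    a known entity, and every metric must have a non-empty formula."""
    errors = []
    entities = set(layer.get("entities", []))
    for term, target in layer.get("synonyms", {}).items():
        if target not in entities:
            errors.append(f"synonym '{term}' -> unknown entity '{target}'")
    for name, formula in layer.get("metrics", {}).items():
        if not formula.strip():
            errors.append(f"metric '{name}' has no formula")
    return errors

layer = {
    "entities": ["account", "opportunity"],
    "synonyms": {"company": "account", "deal": "opportunity"},
    "metrics": {"win_rate": "closed_won / total_closed"},
}
assert validate_layer(layer) == []
```

Checks like these can run in CI on the layer file itself; the system prompt only ever sees the already-validated artifact.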
00:07:06 · Jai Toor
A lot of people are coming from the Clay world where everything is very structured. This is a model we used for scoring: build a big list, get potential accounts, have a good idea of your total addressable market, then try to get signals and enrichment about them. What's changing is you start with a small list of known good fits, score the features, and then expand from there.
00:09:49 · Willy Hernandez
Could you define small, in terms of volume?
00:09:53 · Jai Toor
Probably 30 to 40 is the minimum I've seen be actually useful and differentiated. Good results start happening around a hundred. It scales with what you're doing -- a restaurant company with 2 million targets needs maybe 10,000 before you differentiate, but for most products, starting with a hundred and guiding enrichment from what you find works well.
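One way to read "score the features, then expand" is a simple overlap score against the seed list: features that recur across the known good fits weigh more when ranking candidates. This is a toy sketch with invented feature names, not the scoring model from the session.

```python
def seed_profile(seed_accounts):
    """Count how often each feature appears across the seed accounts."""
    counts = {}
    for acct in seed_accounts:
        for feature in acct["features"]:
            counts[feature] = counts.get(feature, 0) + 1
    return counts

def score(candidate, profile, n_seeds):
    """Frequency-weighted overlap: features common in the seeds count more."""
    return sum(profile.get(f, 0) / n_seeds for f in candidate["features"])

seeds = [
    {"name": "A", "features": {"uses_snowflake", "hiring_sdrs"}},
    {"name": "B", "features": {"uses_snowflake", "series_b"}},
]
profile = seed_profile(seeds)
candidates = [
    {"name": "C", "features": {"uses_snowflake", "hiring_sdrs"}},
    {"name": "D", "features": {"on_prem_only"}},
]
ranked = sorted(candidates, key=lambda c: score(c, profile, len(seeds)),
                reverse=True)
```

With 30-40 seeds the profile starts to be differentiated; enrichment then targets whichever features the scoring says matter.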
00:13:16 · Jai Toor
The data showed that West Coast restaurants were converting higher. That's because the reps were starting early on the East Coast. By the time they got going, it was the beginning of the day for West Coast people. So you have this confirmation bias: 'oh yeah, West Coast people answer the phone more.' That's not the causal driver. The data surfaced something nobody would have articulated: the East Coast people simply weren't getting called in the morning.
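The bias only shows up when you convert call times into each prospect's local time. A toy version of that check, with fabricated call records:

```python
from collections import defaultdict

# Fabricated call log: the tell is each prospect's LOCAL hour at call time,
# not the raw connect rate per region.
calls = [
    {"region": "west", "local_hour": 9,  "connected": True},
    {"region": "west", "local_hour": 10, "connected": True},
    {"region": "east", "local_hour": 14, "connected": False},
    {"region": "east", "local_hour": 15, "connected": False},
]

by_region = defaultdict(list)
for call in calls:
    by_region[call["region"]].append(call["local_hour"])

# If one region is only ever reached in the afternoon, its lower connect
# rate may reflect the reps' schedule, not the prospects.
avg_hour = {r: sum(h) / len(h) for r, h in by_region.items()}
```

Here the East Coast's average local call hour lands mid-afternoon, which is the timing artifact, not a regional preference for answering the phone.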
00:33:03 · Jai Toor
Push to Lemlist and then it adds them to a campaign that I also designed with Claude Code. Going from zero to warm outbound campaign took a couple minutes. That includes email validation, everything you need. This runs natively in our system with access to all integrations -- custom HTTP connectors, data providers, sequences, CRMs, or your data warehouse.
00:39:45 · Willy Hernandez
I want to target sales leaders going to major events like SuiteWorld or Dreamforce. If they participate, they're trying to find customers there but probably don't have before-event enrichment. If I can listen through signals of people going to these events, pull a list, enrich them, and give context as to why I'm reaching out, that's a good way to make the agent continuously scour for those people.
00:41:05 · Jai Toor
That's a perfect use case. We have an open-source managed agents implementation. The Anthropic managed agents work almost at Claude Code levels of flexibility. It comes with a Slack interface built in so you can query any of these tasks. The reply bot finds additional contacts for multi-threaded outreach -- an SDR says 'I'm working on Nucor, find me other people within the account based on our ICP.'
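The reply bot's contact-expansion step reduces to a filter: contacts at the account, matching the ICP, not already in the thread. A sketch with invented titles and data (not the actual bot's logic):

```python
# Hypothetical ICP title list; in practice this would come from the
# team's ICP definition, not a hardcoded set.
ICP_TITLES = {"vp sales", "head of revenue operations", "cro"}

def find_threads(contacts, account, already_contacted):
    """Return ICP-matching contacts at the account not yet in the thread."""
    return [
        c for c in contacts
        if c["account"] == account
        and c["title"].lower() in ICP_TITLES
        and c["email"] not in already_contacted
    ]

contacts = [
    {"account": "Nucor", "title": "VP Sales", "email": "a@nucor.com"},
    {"account": "Nucor", "title": "CRO",      "email": "b@nucor.com"},
    {"account": "Nucor", "title": "Intern",   "email": "c@nucor.com"},
]
extra = find_threads(contacts, "Nucor", {"a@nucor.com"})
```

The Slack interface then hands `extra` back to the SDR as suggested additional threads on the account.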
00:45:04 · Willy Hernandez
Next time we do office hours, I'd be really interested in diving into the reply bot thing.
00:47:12 · Jai Toor
There are two ways to do signals. When a third party sends you information, create a webhook -- when this URL is hit with data, do some action. The trigger is external. The other way is schedule: every day, check for changes. You can hack a schedule to be a trigger. The difference is cost. We have a partner where you describe what you want and pay per update received, not per check. Like a penny per social post per person.
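The cost difference Jai points at is easy to put in numbers: polling pays per check whether or not anything changed, while the pay-per-update model only charges for real events. Prices below are illustrative, not any vendor's actual rates.

```python
# Back-of-envelope comparison of the two signal patterns.
def polling_cost(checks_per_day: int, days: int, price_per_check: float) -> float:
    """Scheduled polling: you pay for every check, hit or miss."""
    return checks_per_day * days * price_per_check

def webhook_cost(events: int, price_per_event: float) -> float:
    """Push/webhook model: you pay only when an update actually arrives."""
    return events * price_per_event

# Polling 1,000 profiles daily for a month vs. paying only for the
# ~50 actual updates that month at a penny each (illustrative numbers).
poll = polling_cost(1000, 30, 0.002)   # $60.00
push = webhook_cost(50, 0.01)          # $0.50
```

When updates are sparse relative to checks, the per-event model wins by orders of magnitude; if the signal changes on nearly every check, the gap closes.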
00:49:00 · Matt Batterson
Do you do these often?
00:49:01 · Jai Toor
Yeah, weekly. Weekly feels like a lot, so we might go biweekly. We have a Slack channel too; Deepline CLI feedback is on the website, and Claude Code plus GTM is right there.