Email Enrichment Pipeline (Request → Generate Billable Contacts)
How CRED turns a "discover this person's email" request into ranked, validated, deduped email contacts. This is the engine behind email (and phone) enrichment: the request/validation/ranking logic that sits on top of the individual vendors.
Where this lives
All of this is in commercial-api (cred-api-commercial), under
src/domain/person/usecase/. The waterfall calls model-api per
provider; model-api owns the outbound vendor calls and rate limiting.
Related docs:
- Universal Waterfall: the product behavior / config surface.
- Enrichment Rate Limiting, Bulk & Resilience: outbound vendor pacing, bulk batching, 429 backpressure.
- Adding Enrichment Vendors: how a new vendor is integrated end-to-end.
The three layers
enrichPersonContacts (GraphQL mutation)
│
▼
RequestPersonBillableContactsUseCase: resolve persons, blacklist, credit check, sync vs async
│
▼
GeneratePersonBillableContactsUseCase: the waterfall engine (providers → validation → ranking → persist)
│
▼
per-provider helpers → model-api → external vendors (Cognism, Apollo, Lusha, …)
| Layer | File | Responsibility |
|---|---|---|
| Resolver | src/graphql-api/person/resolvers/enriched-contact-resolver.ts | enrichPersonContacts mutation; thin, delegates to the request use case |
| Request | src/domain/person/usecase/request-person-billable-contacts.ts | Resolve the person set, blacklist filter, credit check, route sync/async |
| Generate | src/domain/person/usecase/generate-person-billable-contacts.ts | Run the waterfall, validate emails, rank, dedupe, persist to custom fields |
1. Entry point: enrichPersonContacts
A single @ExposedOperation mutation (display name "Discover Person Emails /
Phones", exposed as WORKFLOW_ACTION / MCP_TOOL / REST, tagged
internal on the federation graph) covers both email and phone enrichment.
Arguments:
| Arg | Type | Meaning |
|---|---|---|
| format | ContactFormat (default EMAIL) | EMAIL or PHONE |
| personIds | [Int] | Explicit person IDs to enrich |
| collectionId | Int | Enrich every person in a collection |
| sequenceId | Int | Enrich every person enrolled in a sequence |
| noEmailsOnly | Boolean | Skip persons that already have an email/verified contact |
| personSearchFilters | InputPersonsSearchFilters | Search-based selection (combinable with collectionId) |
The resolver stamps a TriggeringEventInfo (CONTACTS_ENRICHMENT, random
triggeringEventKey) onto the context, then calls
context.usecases.requestPersonBillableContacts.execute(...). The key threads
through every feature-log row written downstream, so a single enrichment run
is traceable.
requestProfileRetrieval(personId) is a thin sibling mutation that calls the
same request use case with format: EMAIL for a single person (used for private
LinkedIn profile retrieval).
2. Request layer: RequestPersonBillableContactsUseCase
Extends UserUseCase (authenticated). Three jobs: resolve, gate, route.
Person resolution (resolveRawPersonIds)
Exactly one selection mode wins, in this precedence:
- sequenceId → person IDs of all ACTIVE / PAUSED / PENDING enrollments (deduped).
- collectionId + noEmailsOnly → collection items that currently have no email.
- collectionId + personSearchFilters → search within the collection (collection_size_limit cap).
- collectionId alone → every person in the collection.
- personIds + noEmailsOnly → drop person IDs that already have a tenant contact with any email (email / emails / importEmails).
- personIds alone → as supplied.
The resolved set is then run through filterBlacklistedEntityIds
(EntityTypeEnum.PERSON) so blacklisted persons never reach a paid vendor call.
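The precedence above can be sketched as a single decision function. This is a minimal illustration, not the real resolveRawPersonIds: the SelectionInput shape, the SelectionMode labels (including the "NONE" fallback), and the dropBlacklisted helper are hypothetical names introduced here, and the real implementation loads IDs from the database rather than returning a label.

```typescript
// Hypothetical input shape mirroring the mutation arguments.
interface SelectionInput {
  sequenceId?: number;
  collectionId?: number;
  personIds?: number[];
  noEmailsOnly?: boolean;
  personSearchFilters?: Record<string, unknown>;
}

// Hypothetical labels for the six documented modes, plus an assumed "NONE".
type SelectionMode =
  | "SEQUENCE"
  | "COLLECTION_NO_EMAILS"
  | "COLLECTION_SEARCH"
  | "COLLECTION_ALL"
  | "PERSON_IDS_NO_EMAILS"
  | "PERSON_IDS"
  | "NONE";

// Returns the single mode that wins; earlier checks take precedence.
function pickSelectionMode(input: SelectionInput): SelectionMode {
  if (input.sequenceId !== undefined) return "SEQUENCE";
  if (input.collectionId !== undefined) {
    if (input.noEmailsOnly) return "COLLECTION_NO_EMAILS";
    if (input.personSearchFilters !== undefined) return "COLLECTION_SEARCH";
    return "COLLECTION_ALL";
  }
  if (input.personIds !== undefined && input.personIds.length > 0) {
    return input.noEmailsOnly ? "PERSON_IDS_NO_EMAILS" : "PERSON_IDS";
  }
  return "NONE";
}

// Stand-in for the blacklist filter: blacklisted persons are removed
// before any paid vendor call is made.
function dropBlacklisted(ids: number[], blacklist: ReadonlySet<number>): number[] {
  return ids.filter((id) => !blacklist.has(id));
}
```

For example, a request carrying both sequenceId and personIds resolves through the sequence branch only; the supplied personIds are ignored.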
Credit gate (validateInput)
- Parent feature is FEATURE_CONTACT_WATERFALL (email) or FEATURE_CONTACT_PHONE_WATERFALL (phone).
- Loads the enabled child datasource features under that parent; if none → ValidationError (Pricing is not configured…, HTTP 400).
- maxPrice = max(child feature amounts).
- Unless the user isSupport, requires remainingCredits >= maxPrice × personCount, else ValidationError (HTTP 429, Insufficient credits…).
- If the parent feature is disabled → ValidationError (Feature is disabled.).
Credit cost is upper-bounded at the gate, charged per-provider later
The gate reserves against the most expensive datasource. Actual charges are written per-person-per-provider as the waterfall progresses (see feature logging below), and reduced for BYOK customers.
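The gate arithmetic can be sketched as follows. This is an illustration under assumptions: the ChildFeature and GateResult shapes, the reason codes, and the order in which the checks run are hypothetical, and the sketch returns a result object where the real use case throws ValidationError.

```typescript
// Hypothetical shape of a child datasource feature under the parent.
interface ChildFeature {
  id: string;
  amount: number; // credit price of this datasource
  enabled: boolean;
}

type GateFailure = "FEATURE_DISABLED" | "PRICING_NOT_CONFIGURED" | "INSUFFICIENT_CREDITS";

type GateResult =
  | { ok: true; maxPrice: number; requiredCredits: number }
  | { ok: false; reason: GateFailure; httpStatus?: number };

function checkCreditGate(args: {
  parentEnabled: boolean;
  children: ChildFeature[];
  personCount: number;
  remainingCredits: number;
  isSupport: boolean;
}): GateResult {
  // Assumed check order; the source lists the conditions without ordering them.
  if (!args.parentEnabled) return { ok: false, reason: "FEATURE_DISABLED" };

  const enabled = args.children.filter((c) => c.enabled);
  if (enabled.length === 0) {
    return { ok: false, reason: "PRICING_NOT_CONFIGURED", httpStatus: 400 };
  }

  // Upper bound: price every person at the most expensive enabled datasource.
  const maxPrice = Math.max(...enabled.map((c) => c.amount));
  const requiredCredits = maxPrice * args.personCount;

  // Support users bypass the balance check.
  if (!args.isSupport && args.remainingCredits < requiredCredits) {
    return { ok: false, reason: "INSUFFICIENT_CREDITS", httpStatus: 429 };
  }
  return { ok: true, maxPrice, requiredCredits };
}
```

With enabled datasources priced at 2, 5, and 3 credits and 10 persons, the gate requires 50 credits even though the eventual per-provider charges may total less.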
Sync vs async routing (innerExecute)
| Person count | Path | Returns |
|---|---|---|
| 1 | Calls GeneratePersonBillableContactsUseCase.execute() inline, unwraps .contacts | EnrichedPersonContactData[] |
| >1 | Enqueues GENERATE_ENRICHED_CONTACTS worker task via lazyPushTaskFromWeb | null (client polls the subscription) |
The single-person path discards the partial-success warning (the synchronous
GraphQL surface has no field for it); the async path forwards it on the
CONTACTS_ENRICHMENT_COMPLETED PubSub payload.
3. Generate layer: GeneratePersonBillableContactsUseCase
Extends BaseUseCase. The recursive generateContacts() method is the
waterfall. Output is { contacts, warning? } (COM-40844); the warning carries
a swallowed partial-success notice, such as a transient ZeroBounce failure.
Provider order (feature IDs)
The waterfall walks an ordered array of feature IDs, one provider per step. The order is per-tenant configurable; when a tenant hasn't customized it, the default is resolved as:
- EMAIL → getDefaultEmailEnrichmentFeatureIds(ctx), which reads the seeded RecommendedWaterfallConfig rows for the primaryEnrichmentEmail template, ordered by priority. There is no static code fallback: a missing template/rows throws DEFAULT_EMAIL_ENRICHMENT_TEMPLATE_MISSING / …_ROWS_MISSING (COM-32909), so the DB seed is the single source of truth.
- PHONE → DEFAULT_PHONE_ENRICHMENT_FEATURES.
Default email chain (current)
The seeded primaryEnrichmentEmail default chain is 4 sources:
| Priority | Provider |
|---|---|
| 1 | Cognism |
| 2 | Apollo |
| 3 | Lusha |
| 4 | CRED |
RocketReach is NOT a default source; older docs that say so are stale
Until 2026-05-14 the default chain was the 5-row
[Cognism, Apollo, Lusha, CRED, RocketReach]. COM-32909 dropped
RocketReach from primaryEnrichmentEmail (migration
20260514130000_drop-rocketreach-from-primary-enrichment-email, corrected
by 20260514150500_fix-rocketreach-feature-id-in-primary-enrichment-email,
which also cleans up existing tenants). RocketReach is still an
integrated, available provider (it remains in funcMap), but it now
lives only in the FE "+ Add source" tray and is never seeded into
RecommendedWaterfallConfig, so it does not run unless a tenant
explicitly adds it. Any documentation claiming "5 default sources including
RocketReach" is describing the pre-2026-05-14 state.
Available provider helpers (funcMap)
These are the providers the engine can call: a superset of the default
chain. A tenant's configured emailEnrichmentFeatureIds (default chain plus
anything added via the "+ Add source" tray) selects from these:
| Feature constant | Helper | Vendor | In default chain? |
|---|---|---|---|
| FEATURE_COGNISM_EMAIL/PHONE_ENRICHMENT | cognism-contacts.ts | Cognism | Yes |
| FEATURE_APOLLO_EMAIL/PHONE_ENRICHMENT | apollo-contacts.ts | Apollo | Yes |
| FEATURE_LUSHA_EMAIL_ENRICHMENT | lusha-contacts.ts | Lusha | Yes |
| FEATURE_CRED_EMAIL_ENRICHMENT | cred-contacts.ts | CRED (internal) | Yes |
| FEATURE_RR_EMAIL/PHONE_ENRICHMENT | rocket-reach-contacts.ts | RocketReach | No ("+ Add source" only) |
| FEATURE_SK_EMAIL_ENRICHMENT | skrapp-contacts.ts | Skrapp | No (default for primaryWorkEmail only) |
| FEATURE_AMF_EMAIL_ENRICHMENT | anymailfinder-contacts.ts | AnyMailFinder | No (default for primaryWorkEmail only) |
| FEATURE_ARL_EMAIL_ENRICHMENT | aeroleads-contacts.ts | AeroLeads | No (default for primaryWorkEmail only) |
cred-unverified-contacts.ts contributes unverified CRED candidates.
Two email templates: don't confuse them
There are two seeded email templates with different chains. This is the documentation-vs-reality mismatch that has tripped people up:
| Template | Surface | Seeded chain (current) |
|---|---|---|
| primaryEnrichmentEmail | The internal chain consumed by RequestPersonBillableContactsUseCase / getDefaultEmailEnrichmentFeatureIds; this is the billable-contacts default | Cognism → Apollo → Lusha → CRED (4) |
| primaryWorkEmail | The FE "Smart Enrich → Email" custom-field bundle | CRED → Cognism → Apollo → Lusha → Skrapp → AnyMailFinder → AeroLeads (7); BE-SEED-REORDER / D8, 2026-05-13 |
Neither default chain includes RocketReach. Both were 5-row
[…, RocketReach] chains before mid-May 2026.
Per-step flow (per recursion)
- notFoundIds: persons not yet resolved by an earlier step and not excluded.
- runProvider: calls the provider helper, then retryRateLimitedSkips (COM-43136/P-003, behind FF_BULK_ENRICH_RETRY) re-attempts only persons our internal limiter shed (rate_limited); circuit_open / error skips are not retried.
- Split results into verified vs unverified groups (Apollo phone has a special "pending webhook / stale data" exclusion path).
- Email validation (EMAIL only, if an email-validation feature is configured): see below.
- Demotion (EMAIL only): strip verification from emails at past-employer and off-current-employer domains.
- Feature log: one CreateCreditFeatureLog row per verified person per provider (the actual credit charge); persons resolved by no provider get an EMAIL/PHONE_ENRICHMENT_ATTEMPT log.
- Recurse to the next feature ID with the still-unresolved persons, accumulating contacts + first-wins warning.
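The recursion can be sketched in a much-reduced form. This is an illustrative sketch only: the Provider signature is hypothetical and synchronous (the real helpers are asynchronous and call model-api), the stop rule is simplified to "any verified contact resolves the person", and validation, demotion, feature logging, and the FF_BULK_ENRICH_RETRY flag are omitted.

```typescript
type SkipReason = "rate_limited" | "circuit_open" | "error";

interface Contact {
  personId: number;
  value: string;
  isVerified: boolean;
  featureId: string;
}

interface ProviderResult {
  contacts: Contact[];
  skipped: { personId: number; reason: SkipReason }[];
  warning?: string;
}

// Hypothetical synchronous provider signature (real helpers are async).
type Provider = (personIds: number[]) => ProviderResult;

interface WaterfallOutput {
  contacts: Contact[];
  warning?: string;
}

function runWaterfall(
  featureIds: string[],
  providers: Map<string, Provider>,
  personIds: number[],
): WaterfallOutput {
  const [featureId, ...rest] = featureIds;
  if (featureId === undefined || personIds.length === 0) return { contacts: [] };

  const provider = providers.get(featureId);
  if (provider === undefined) return runWaterfall(rest, providers, personIds);

  const first = provider(personIds);
  // Retry wave: re-attempt only persons shed by the internal limiter.
  const retryIds = first.skipped
    .filter((s) => s.reason === "rate_limited")
    .map((s) => s.personId);
  const retryContacts = retryIds.length > 0 ? provider(retryIds).contacts : [];
  const contacts = [...first.contacts, ...retryContacts];

  // Simplified stop rule: a person is resolved once a verified contact exists.
  const resolved = new Set(contacts.filter((c) => c.isVerified).map((c) => c.personId));
  const unresolved = personIds.filter((id) => !resolved.has(id));

  // Recurse to the next provider with the still-unresolved persons.
  const next = runWaterfall(rest, providers, unresolved);
  const warning = first.warning ?? next.warning; // first warning wins
  const merged = [...contacts, ...next.contacts];
  return warning === undefined ? { contacts: merged } : { contacts: merged, warning };
}
```

Each step passes on only the persons it failed to resolve, so later providers see a shrinking set and a person is never charged by a provider that runs after the one that resolved them.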
Email validation is free here (COM-39583)
The waterfall calls RequestEmailValidationUseCase.execute(..., { chargeCredits: false }):
ZeroBounce validation is bundled into the enrichment that already charged for
the record. (The standalone revalidate-email-address path still bills.)
Validation writes back onto each contact: isVerified, isEmailValidated,
emailValidationStatus, confidenceTier, and lastValidationCheck.
Waterfall stop predicate: isDeliverableHere (COM-32909)
A contact stops the waterfall only when:
c.isVerified === true && c.emailValidationStatus !== EmailValidationStatusEnum.CATCH_ALL
A valid + alias_address (CATCH_ALL) hit at the current employer is kept
as a candidate but does not short-circuit: the next provider may return a
fully VALID address at the same domain, which is preferred. A fully VALID
verified email at the current employer stops the waterfall.
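As a small self-contained sketch, the predicate can be written out as below. The enum's string values and the Candidate shape are assumptions made for illustration; only the boolean expression itself is taken from the text above. The shouldStopWaterfall wrapper is a hypothetical helper showing how the predicate might be applied to a person's candidates.

```typescript
// String values are assumed; only the member names appear in the source.
enum EmailValidationStatusEnum {
  VALID = "VALID",
  INVALID = "INVALID",
  CATCH_ALL = "CATCH_ALL",
  UNVALIDATED = "UNVALIDATED",
  NOT_APPLICABLE = "NOT_APPLICABLE",
}

// Hypothetical candidate shape.
interface Candidate {
  email: string;
  isVerified: boolean;
  emailValidationStatus?: EmailValidationStatusEnum;
}

// Verified and not catch-all. Demotion has already stripped isVerified
// from past-employer and off-current-employer addresses at this point.
function isDeliverableHere(c: Candidate): boolean {
  return c.isVerified === true && c.emailValidationStatus !== EmailValidationStatusEnum.CATCH_ALL;
}

// Hypothetical wrapper: stop once any candidate satisfies the predicate.
function shouldStopWaterfall(candidates: Candidate[]): boolean {
  return candidates.some(isDeliverableHere);
}
```

A person whose only hit is a verified CATCH_ALL address therefore keeps that address as a candidate while the next provider still runs.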
BYOK credit reduction
In logEnrichedContacts, if the customer has a stored secret for the provider
(FEATURE_TO_SECRET_TYPE[featureId] resolves and getSecretByType returns one),
the charged amount is reduced to 1 credit, since they're spending their own
vendor quota.
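The reduction rule is small enough to sketch directly. In this illustration the featureToSecretType map stands in for FEATURE_TO_SECRET_TYPE and the storedSecretTypes set stands in for a getSecretByType lookup; both parameters, the function name, and the Math.min guard (so a price below 1 is never raised) are assumptions.

```typescript
// Returns the credits to charge for one verified person from one provider.
function chargedAmount(
  featureAmount: number,
  featureId: string,
  featureToSecretType: ReadonlyMap<string, string>,
  storedSecretTypes: ReadonlySet<string>,
): number {
  const secretType = featureToSecretType.get(featureId);
  // BYOK applies only when the feature maps to a secret type AND the
  // customer actually has a secret of that type stored.
  const hasOwnKey = secretType !== undefined && storedSecretTypes.has(secretType);
  // Assumption: "reduced to 1" never increases a cheaper price.
  return hasOwnKey ? Math.min(featureAmount, 1) : featureAmount;
}
```

A customer with their own key for a 5-credit provider is charged 1 credit per verified person; without a stored secret the full 5 credits apply.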
4. How emails are handled
Validation status (tri-state)
ZeroBounce raw status is mapped by classifyEmailValidationStatus into
EmailValidationStatusEnum: VALID / INVALID / CATCH_ALL / UNVALIDATED /
NOT_APPLICABLE. A catch-all-domain heuristic (COM-39649) can demote a
nominally-valid address to CATCH_ALL based on the email itself, so the FE
never shows a misleading green badge for a known catch-all domain.
Confidence tier
classifyConfidenceTier (COM-39604) derives a persisted tier from the
validation status. CONFIDENCE_TIER_ORDER ranks them 0..4 (lower = better):
| Tier | Order | From validation status | Meaning |
|---|---|---|---|
| VERIFIED | 0 | VALID / VALIDATED | ZeroBounce confirmed deliverable |
| CATCH_ALL | 1 | CATCH_ALL | Domain accepts all; discoverable, never primary within tier |
| UNKNOWN | 2 | UNVALIDATED | Validation ran but inconclusive |
| UNVERIFIED | 3 | NOT_APPLICABLE / null / undefined | Never validated; we simply don't know |
| INVALID | 4 | INVALID | ZeroBounce confirmed undeliverable (invalid / spamtrap / abuse / do_not_mail) |
The tier is stored denormalized so ranking is an O(1) lookup rather than a re-derivation.
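The mapping in the table can be sketched as a pure function plus a lookup record. The function and constant names come from the text above, but their signatures and the string-literal types used here are assumptions made for illustration.

```typescript
// Assumed string-literal types for the statuses and tiers in the table.
type ValidationStatus =
  | "VALID"
  | "VALIDATED"
  | "CATCH_ALL"
  | "UNVALIDATED"
  | "NOT_APPLICABLE"
  | "INVALID";

type ConfidenceTier = "VERIFIED" | "CATCH_ALL" | "UNKNOWN" | "UNVERIFIED" | "INVALID";

// Lower number = better rank.
const CONFIDENCE_TIER_ORDER: Record<ConfidenceTier, number> = {
  VERIFIED: 0,
  CATCH_ALL: 1,
  UNKNOWN: 2,
  UNVERIFIED: 3,
  INVALID: 4,
};

function classifyConfidenceTier(status: ValidationStatus | null | undefined): ConfidenceTier {
  switch (status) {
    case "VALID":
    case "VALIDATED":
      return "VERIFIED";
    case "CATCH_ALL":
      return "CATCH_ALL";
    case "UNVALIDATED":
      return "UNKNOWN";
    case "INVALID":
      return "INVALID";
    default:
      // NOT_APPLICABLE, null, and undefined: never validated.
      return "UNVERIFIED";
  }
}
```

Because the tier is persisted, a sort only needs CONFIDENCE_TIER_ORDER[tier] per contact rather than re-running classification against validation rows.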
INVALID ≠ UNVERIFIED (COM-43358, merged 2026-06-05)
Before COM-43358, a ZeroBounce invalid verdict was folded into
UNVERIFIED, so a confirmed-undeliverable mailbox was indistinguishable from
one that was never validated. INVALID is now its own tier, ranked
lowest (below UNVERIFIED), so a confirmed-bad mailbox can never win
primary work/enrichment email. INVALID-tier emails are explicitly
excluded from primary-email eligibility in update-contact-custom-fields.ts
(COM-43361), so an all-invalid contact gets no primary rather than a bad
one, and the sequence recipient-resolver demotes them too. Only new
enrichments get INVALID; existing rows keep their prior tier (no backfill and
no migration, since confidenceTier is already varchar(32)).
Ranking & final sort
sortBillableContacts groups contacts by personId (first-appearance order is
preserved; persons are not reshuffled), then sorts each person's emails with
rankEmailsByEmployer. Tiebreak order, strongest first:
- Current-employer domain win.
- Past-employer / off-current-employer demotion (verification stripped).
- Confidence tier (VERIFIED > CATCH_ALL > UNKNOWN > UNVERIFIED > INVALID; INVALID lowest, COM-43358) plus the validation score (VALID > CATCH_ALL > unvalidated).
- Probability score.
- Recency (isActiveUpdatedAt).
- WORK contact-type.
- Data-source priority (COM-32909): the configured provider order is forwarded as a Map<dataSourceAbbreviation, priority> so the sort matches what the write path persists as primaryEnrichmentEmail.
Phones use the legacy verified-first, then WORK-tiebreak sort (no employer
concept).
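The email tiebreak chain can be sketched as a composition of comparators, each consulted only when the previous ones tie. This is not the real rankEmailsByEmployer: the RankableEmail fields are hypothetical, the separate validation score is omitted, and the directions "higher probability first" and "more recent first" are assumptions.

```typescript
// Hypothetical per-email ranking inputs.
interface RankableEmail {
  value: string;
  atCurrentEmployer: boolean;
  demoted: boolean;
  tierOrder: number; // 0 (best) .. 4 (worst)
  probability: number;
  updatedAt: number; // epoch milliseconds
  isWork: boolean;
  dataSource: string; // data-source abbreviation
}

type Comparator<T> = (a: T, b: T) => number;

// Runs comparators in order and returns the first non-zero result.
function chain<T>(...comparators: Comparator<T>[]): Comparator<T> {
  return (a, b) => {
    for (const compare of comparators) {
      const result = compare(a, b);
      if (result !== 0) return result;
    }
    return 0;
  };
}

// Sorts `true` before `false`.
const trueFirst = (x: boolean, y: boolean): number => Number(y) - Number(x);

function rankEmails(
  emails: RankableEmail[],
  dataSourcePriority: Map<string, number>,
): RankableEmail[] {
  // Unknown sources sort last; a finite fallback avoids NaN from Infinity - Infinity.
  const priorityOf = (e: RankableEmail): number =>
    dataSourcePriority.get(e.dataSource) ?? Number.MAX_SAFE_INTEGER;
  const comparator = chain<RankableEmail>(
    (a, b) => trueFirst(a.atCurrentEmployer, b.atCurrentEmployer),
    (a, b) => trueFirst(!a.demoted, !b.demoted),
    (a, b) => a.tierOrder - b.tierOrder,
    (a, b) => b.probability - a.probability,
    (a, b) => b.updatedAt - a.updatedAt,
    (a, b) => trueFirst(a.isWork, b.isWork),
    (a, b) => priorityOf(a) - priorityOf(b),
  );
  return emails.slice().sort(comparator); // copy so the input is not mutated
}
```

Keeping data-source priority as the last comparator means the configured provider order only decides between emails that are otherwise indistinguishable.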
Deduplication & persistence
For EMAIL/PHONE, results are written through
UpdateContactCustomFieldsUseCase into custom fields:
verifiedEmails / unverifiedEmails (EMAIL, COM-32909) and
verifiedPhones / unverifiedPhones (PHONE, COM-42483). Dedup (case-insensitive
by value), demotion, and ranking are applied at write time, with the same
dataSourcePriority map forwarded. get-person-enriched-contacts reads from
these custom fields, never from FeatureLog metadata, so the waterfall no
longer writes a contact snapshot into the feature log.
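Case-insensitive dedup by value can be sketched as below. The helper name and the "first occurrence wins" rule are assumptions made for illustration; the source states only that dedup is case-insensitive by value and applied at write time.

```typescript
// Keeps the first item for each value, comparing values case-insensitively.
function dedupeByValue<T extends { value: string }>(items: T[]): T[] {
  const seen = new Set<string>();
  const result: T[] = [];
  for (const item of items) {
    const key = item.value.toLowerCase();
    if (seen.has(key)) continue; // a differently-cased duplicate
    seen.add(key);
    result.push(item);
  }
  return result;
}
```

If the input is already ranked best-first, keeping the first occurrence means the strongest copy of a duplicated address is the one that survives.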
The chosen primary email is the best primary-eligible candidate, and
INVALID-tier addresses are never primary-eligible (COM-43358/COM-43361):
if every candidate is INVALID, the primary is left unset rather than promoting
a confirmed-bad mailbox. The same exclusion applies in
recompute-primary-work-email.ts and the CSV export ranker.
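The eligibility rule can be sketched as a filter over an already-ranked list. The pickPrimaryEmail name and the RankedCandidate shape are hypothetical, and the sketch assumes the input is sorted best-first.

```typescript
type ConfidenceTier = "VERIFIED" | "CATCH_ALL" | "UNKNOWN" | "UNVERIFIED" | "INVALID";

// Hypothetical shape of a ranked candidate.
interface RankedCandidate {
  value: string;
  confidenceTier: ConfidenceTier;
}

// Returns the best primary-eligible address, or undefined when every
// candidate is INVALID (the primary is left unset).
function pickPrimaryEmail(rankedBestFirst: RankedCandidate[]): string | undefined {
  const eligible = rankedBestFirst.find((c) => c.confidenceTier !== "INVALID");
  return eligible === undefined ? undefined : eligible.value;
}
```

Returning undefined rather than falling back to the top-ranked INVALID address is what keeps a confirmed-bad mailbox from ever being promoted.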
GraphQL surface fields
On TypeEnrichedContactDetails:
- emailValidationStatus (tri-state): the field to use. Resolves the most-recent EmailAddressValidation row (DataLoader-batched) and classifies it; falls back to the legacy boolean for pre-D14 contacts.
- isEmailValidated: deprecated (D14/COM-32909); a boolean cannot distinguish VALID from CATCH_ALL.
- dataSource: resolves { abbreviation } from the contact's provider.
5. Async completion
Multi-person runs return null and complete on the worker. The
contactsEnrichmentCompleted GraphQL subscription
(CONTACTS_ENRICHMENT_COMPLETED, filtered to the current user) delivers the
result, including any partial-success warning (COM-40844 BE-3).
6. Rate limiting, bulk & resilience
Outbound pacing is not in this engine; it is enforced in the model-api
provider-rate-limiter. Skips surfaced by providers
(provider-contact-result.ts) are classified rate_limited | circuit_open |
error, emitted as the cred_enrichment_provider_skipped_total{provider,reason}
metric and a Provider enrichment skipped log, and (for rate_limited)
re-attempted by the retry wave. Full detail (vendor caps, bulk endpoints, 429
backpressure, header-driven self-calibration) lives in
Enrichment Rate Limiting, Bulk & Resilience.
Key code paths
- src/graphql-api/person/resolvers/enriched-contact-resolver.ts: enrichPersonContacts, emailValidationStatus / isEmailValidated field resolvers, completion subscription.
- src/domain/person/usecase/request-person-billable-contacts.ts: person resolution, blacklist, credit gate, sync/async routing.
- src/domain/person/usecase/generate-person-billable-contacts.ts: the waterfall engine, isDeliverableHere, funcMap, feature logging, sort.
- src/domain/person/usecase/helpers/: per-provider helpers (cognism-contacts.ts, apollo-contacts.ts, lusha-contacts.ts, rocket-reach-contacts.ts, skrapp-contacts.ts, anymailfinder-contacts.ts, aeroleads-contacts.ts, cred-contacts.ts, cred-unverified-contacts.ts), provider-contact-result.ts, bulk-enrich-retry.ts, rank-emails-by-employer.ts, load-person-email-ranking-context.ts.
- src/domain/email/usecase/validation/request-email-validation.ts: ZeroBounce validation (free when called from the waterfall).
- src/domain/email/helpers/classify-email-validation-status.ts: tri-state + confidence-tier classification, catch-all heuristic.
- src/domain/custom/usecase/field/update-contact-custom-fields.ts: dedup + persist to verified/unverifiedEmails.
- src/domain/feature/helpers/get-default-email-enrichment-feature-ids.ts: resolves the default email chain from the seeded primaryEnrichmentEmail rows (no static fallback).
- src/data/config/recommended-waterfall-config-data.ts: the catalog, with PRIMARY_ENRICHMENT_EMAIL (4-row internal default) and SMART_ENRICH_EMAIL (7-row FE bundle).
- src/data/seeds/006-recommended-waterfall-config.ts: seeds the catalog into RecommendedWaterfallConfig.
- src/data/migrations/20260514130000_drop-rocketreach-from-primary-enrichment-email.ts (+ 20260514150500_fix-…): removed RocketReach from the default chain.
Tickets
| Ticket | What |
|---|---|
| COM-32909 | Verified-and-not-CATCH_ALL stop predicate; past/off-employer demotion; data-source-priority ranking; verified/unverifiedEmails custom fields; dropped RocketReach from the default primaryEnrichmentEmail chain (now Cognism → Apollo → Lusha → CRED) |
| BE-SEED-REORDER (D8) | Reordered the FE primaryWorkEmail Smart Enrich bundle to the 7-row CRED → Cognism → Apollo → Lusha → Skrapp → AnyMailFinder → AeroLeads chain |
| COM-39583 | ZeroBounce bundled free in the waterfall (chargeCredits: false) |
| COM-39604 | Persisted tri-state confidenceTier for O(1) ranking |
| COM-39649 | Catch-all-domain heuristic in classifyEmailValidationStatus |
| COM-40844 | { contacts, warning? } partial-success propagation through the recursion and onto PubSub |
| COM-42483 | verified/unverifiedPhones custom fields (phone parity) |
| COM-43136 | Provider skip classification + bounded rate-limit retry wave |
| COM-43358 | New INVALID confidence tier: ZeroBounce-confirmed-undeliverable emails ranked lowest (below UNVERIFIED) instead of folded into UNVERIFIED (merged 2026-06-05) |
| COM-43361 | INVALID-tier emails excluded from primary-email eligibility (no bad-mailbox promotion) |