Enrichment Rate Limiting, Bulk & Resilience

How CRED coordinates outbound calls to external enrichment vendors (Cognism, Apollo, Lusha, …) so that bulk email enrichment of large lists completes without self-inflicted drops or vendor 429s.

This is the runtime/resilience counterpart to Adding Enrichment Vendors (which covers integrating a vendor), the Universal Waterfall (the product behavior), and the Email Enrichment Pipeline (the request → generate-billable-contacts engine that calls this limiter).

Where this lives

The enforcement layer is the model-api provider-rate-limiter (apps/model-api/src/services/provider-rate-limiter.ts). The commercial-api waterfall (GeneratePersonBillableContacts) calls model-api per provider; model-api owns the outbound rate limiting. For the full waterfall engine (entry mutation, person resolution, credit gate, validation, ranking, persistence), see the Email Enrichment Pipeline.

The waterfall & providers

For email enrichment the waterfall calls providers in priority order, each stage only attempting contacts not yet resolved by an earlier one. The default chain (seeded primaryEnrichmentEmail, what RequestPersonBillableContacts uses when a tenant hasn't customized) is 4 providers:

Cognism → Apollo → Lusha → CRED

Other integrated providers — Skrapp / AnyMailFinder / AeroLeads / RocketReach / PDL — are not in the default chain; they only run if a tenant adds them via the FE "+ Add source" tray (or, for Skrapp/AnyMailFinder/AeroLeads, as part of the separate primaryWorkEmail Smart Enrich bundle).

RocketReach was removed from the default chain (COM-32909, 2026-05-14)

The pre-2026-05-14 default was a 5-row [Cognism, Apollo, Lusha, CRED, RocketReach]. RocketReach is now "+ Add source" only. The full provider order, the two email templates, and the cleanup migrations are documented in the Email Enrichment Pipeline § Provider order.

A contact is only "dropped" (no value) if every stage misses, is skipped, or errors for it.

Confirmed vendor rate caps (2026-06)

Vendor	Per-endpoint cap	Account total	Bulk endpoint	429 backoff signal
Cognism	500/min per endpoint (`/enrich`, `/redeem` counted separately)	1000/min	❌ no bulk enrich — enrich is 1/contact; `/redeem` batches 20	`x-rate-limit-reset` (delta-seconds) — NOT a standard `Retry-After`
Apollo	~1000/min on `people/match` + `organizations/enrich`	—	✅ `people/bulk_match` (≤10/call)	standard `Retry-After` (+ `X-RateLimit-Reset` epoch)
Lusha	200/min + 400/hour + 2000/day (contractual ceiling)	—	✅ `/v2/person` batch	standard `Retry-After`
PDL	per `X-RateLimit-*` headers	—	✅ `/v5/person/bulk` (≤100)	`Retry-After` / `X-RateLimit-Reset`
Skrapp / AnymailFinder / AeroLeads / RocketReach	low-volume	—	❌ none	no reliable rate headers

Cognism is a hard vendor wall

Cognism enrich is 500/min with no bulk endpoint. A large Cognism list takes size ÷ 500/min (e.g. ~1000 contacts ≈ 2 min) but completes with zero drops. Raising it is a Cognism entitlements/plan change (vendor action), not code.

Single source of truth: the model-api rate limiter (COM-43136)

Before COM-43136 there were two throttle stacks (commercial-api vendor-rate-limiter and model-api provider-rate-limiter) double-rejecting. They were collapsed to one:

Enforcement = model-api provider-rate-limiter — Redis-backed (rate-limiter-flexible), one bucket per provider, shared fleet-wide across Cloud Run replicas, plus a per-process concurrency cap (default 24, env-tunable). This is the required pattern for a multi-instance fleet sharing a vendor quota.
Commercial-api vendor-rate-limiter = observe-only by default (FF_DISABLE_COMMERCIAL_RATE_LIMITER); set to "false" to restore legacy enforcement. The circuit breaker stays enforced.
Cognism runs two per-endpoint buckets — COG-enrich + COG-redeem (each ~8/s ≈ 480/min, under the 500/min per-endpoint cap; combined < 1000/min total).
Skips are surfaced (provider-contact-result.ts), not swallowed: reason ∈ rate_limited | circuit_open | error, emitted as the cred_enrichment_provider_skipped_total{provider,reason} metric and a Provider enrichment skipped log; a bounded retry wave (bulk-enrich-retry.ts) re-attempts rate-limited contacts before the waterfall falls through.

Retry budget — drains bursts instead of shedding (COM-43313)

schedule() waits for a slot within a maxWaitMs budget (Cognism passes 60s). The retry loop was previously capped at MAX_ATTEMPTS = 3, which gave up after ~3s — far short of the budget — so a bulk burst shed its tail as rate_limited even though the vendor and the budget had room.

Fix: MAX_ATTEMPTS raised 3 → 64, so the elapsedMs + waitMs > maxWaitMs check is what actually stops the loop. Combined with the concurrency gate (which paces entry), a 300-person Cognism burst now drains at the vendor's 500/min with zero internal drops instead of shedding ~130.

Validated on DEV (2026-06-05)

A fresh 300-contact enrichment under enforce produced 0 internal sheds, 0 vendor 429s, 0 errors (vs ~297 sheds on the same shape pre-fix).

Bulk / request coalescing (COM-43205 / COM-43310)

Throttling is per HTTP call, so coalescing many contacts into one vendor request cuts limiter pressure ~N×.

Provider	Bulk status
Apollo	`people/bulk_match` (≤10/call) wired into the email path of the `enrichPersonContacts` waterfall (COM-43310). Phone-reveal stays per-person (async webhook).
PDL	`/v5/person/bulk` (≤100/call).
Lusha	`/v2/person` batch.
Cognism	No bulk enrich (vendor limitation). The two-phase flow is decoupled: enrich per-contact (collect `redeemId`s) → batch-redeem in chunks of 20 (`N` enrich + `N/20` redeem).
Long-tail	per-person (no vendor bulk endpoint).

Adaptive header-driven limiting (COM-43353)

Stop guessing the limits and react to the vendor's own headers. Two layers:

L1 — honor vendor `429` / `Retry-After` (always-on)

Implemented centrally in schedule(): when a vendor call rejects with a 429, block the bucket fleet-wide for the reset window, then rethrow (so the provider still classifies it as rate_limited). One interception covers every provider. The reset is read via the per-vendor adapter first (e.g. Cognism's non-standard x-rate-limit-reset), then the generic Retry-After, then a 2 s default (all clamped to ≤ 5 min).

L2 — self-calibrating ceilings (flag-gated, shadow-first)

Per-vendor header adapters (provider-rate-headers.ts) parse each vendor's real limit/remaining/reset (Cognism, Apollo, Lusha, PDL; long-tail → null → static config stands). limiter.observe(headers) feeds responses back. Behaviour is gated by env var:

`PROVIDER_RATE_LIMITER_HEADER_DRIVEN`	Effect
`off`	ignore vendor headers (pre-L2)
`shadow` (default)	parse + log divergence vs static config; enforce nothing
`enforce`	additionally resize the bucket to `discoveredLimit × 0.9` when it drifts > 20 % from current

This auto-tracks an entitlement change (e.g. Cognism 500→1000) with no deploy, and auto-tightens if a vendor lowers us.

Rollout: shadow → dev → staging → prod

Flip to enforce per environment, dev first — never prod-first. Pass/fail signal: zero vendor 429s post-enforce (a 429 means an adapter over-read the cap → revert to shadow). Then staging for ~24h against the canary criterion (pause if cred_enrichment_provider_skipped_total{reason="rate_limited"} crosses 5× its 7-day baseline for 30 min), then prod. Cognism is the highest-confidence adapter (built from live headers); Apollo/Lusha/PDL adapters are validated in shadow before enforcing.

Logging & observability (COM-43288)

Enrichment logs use the unified log schema: console-logger stamps metadata.feature from metadata.component via log-health/feature-registry. Components that roll up to feature: "email-enrichment" include provider-enrichment, the per-vendor *-person-lookup adapters, and provider-rate-limiter. Build GCP log-based metrics by filtering on metadata.feature rather than enumerating components.

Key log lines to grep (GCP Logs Explorer):

Service	Line	Meaning
model-api	`Rate limit exceeded for provider <p>; retry after Nms`	internal `RateLimitedError` (our limiter shed) — should be ~0 post-fix
model-api	`resizing bucket to discovered limit`	L2 enforce retuned a bucket (only on >20% drift)
model-api	`discovered limit diverges (shadow)`	L2 shadow saw divergence (logs only, no change)
model-api	`Cognism.com enrich/redeem error` (status 429) / `Apollo.io non-200`	a real vendor 429 — the over-enforce / over-send signal
commercial-api	`Provider enrichment skipped` `{provider,reason,personId}`	a contact was skipped (rate_limited / circuit_open / error)
commercial-api	`Retrying rate-limited persons` `{count,provider,retryNumber}`	retry wave fired

Env vars / tuning (set on the Cloud Run service; read lazily)

Var	Default	Purpose
`PROVIDER_RATE_LIMITER_HEADER_DRIVEN`	`shadow`	L2 mode: `off` / `shadow` / `enforce`
`COGNISM_POINTS_PER_SECOND`	8	Cognism per-endpoint bucket rate
`APOLLO_POINTS_PER_SECOND`	15	Apollo bucket rate
`COGNISM_/APOLLO_/LUSHA_PROVIDER_CONCURRENCY_CAP`	24	per-process in-flight cap (`0`/`unlimited` disables)
`COGNISM_MAX_WAIT_MS`	60000	per-request wait budget for the Cognism legs
`FF_BULK_ENRICH_RETRY`	enabled	P-003 retry wave (`"false"` disables)
`FF_DISABLE_COMMERCIAL_RATE_LIMITER`	observe-only	`"false"` restores legacy commercial-api enforcement (double-negative — see runbook)

Known issues

Lusha LSPerson integer overflow — Lusha person/company IDs can exceed 2³¹; the LSPerson column is integer, so those rows fail to persist (value "…" is out of range for type integer). Needs bigint. The contact gets no Lusha value even when Lusha returned data.
validateStatus:()=>true providers (AeroLeads, Adyntel) resolve a 429 instead of throwing, so L1's central interception (which fires on rejection) misses it — these need a per-vendor limiter.block(...) in their status===429 branch (follow-up under COM-43353).
Cognism enrich vendor cap (500/min) — see warning above; large lists are slow-but-complete, not dropped.

Tickets

Ticket	What
COM-43136	Collapse two rate-limit stacks → single model-api enforcer; surface silent drops + retry wave
COM-43313	`maxWaitMs` governs retries (drop `MAX_ATTEMPTS=3` cap)
COM-43205	Vendor-native bulk batching (Apollo `bulk_match`, PDL bulk, Cognism redeem decouple)
COM-43310	Wire the email waterfall to Apollo `bulk_match`
COM-43353	Adaptive header-driven limiting (L1 429 backpressure + L2 self-calibration)
COM-43288	Unified enrichment log schema (component → feature)

Operational runbook: apps/api-commercial/docs/operations/bulk-enrichment-runbook.md (in cred-platform-ts).