Enrichment Rate Limiting, Bulk & Resilience
How CRED coordinates outbound calls to external enrichment vendors (Cognism, Apollo, Lusha, …) so that bulk email enrichment of large lists completes without self-inflicted drops or vendor 429s.
This is the runtime/resilience counterpart to Adding Enrichment Vendors (which covers integrating a vendor), the Universal Waterfall (the product behavior), and the Email Enrichment Pipeline (the request → generate-billable-contacts engine that calls this limiter).
Where this lives
The enforcement layer is the model-api provider-rate-limiter (apps/model-api/src/services/provider-rate-limiter.ts). The commercial-api waterfall (GeneratePersonBillableContacts) calls model-api per provider; model-api owns the outbound rate limiting. For the full waterfall engine (entry mutation, person resolution, credit gate, validation, ranking, persistence), see the Email Enrichment Pipeline.
The waterfall & providers
For email enrichment the waterfall calls providers in priority order, each stage only attempting contacts not yet resolved by an earlier one. The default chain (seeded primaryEnrichmentEmail, what RequestPersonBillableContacts uses when a tenant hasn't customized) is 4 providers:
Cognism → Apollo → Lusha → CRED
Other integrated providers (Skrapp / AnyMailFinder / AeroLeads / RocketReach / PDL) are not in the default chain; they only run if a tenant adds them via the FE "+ Add source" tray (or, for Skrapp/AnyMailFinder/AeroLeads, as part of the separate primaryWorkEmail Smart Enrich bundle).
RocketReach was removed from the default chain (COM-32909, 2026-05-14)
The pre-2026-05-14 default was a 5-row [Cognism, Apollo, Lusha, CRED, RocketReach]. RocketReach is now "+ Add source" only. The full provider order, the two email templates, and the cleanup migrations are documented in the Email Enrichment Pipeline § Provider order.
A contact is only "dropped" (no value) if every stage misses, is skipped, or errors for it.
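The stage-by-stage narrowing described above can be sketched as follows. This is a minimal, synchronous illustration and not the GeneratePersonBillableContacts implementation: the Provider and ProviderResult types and the runWaterfall name are hypothetical, and real provider calls are asynchronous and rate limited.

```typescript
// Hypothetical shapes for illustration only.
type ProviderResult =
  | { status: "hit"; email: string }
  | { status: "miss" | "skipped" | "error" };

type Provider = { name: string; lookup: (contactId: string) => ProviderResult };

type WaterfallOutcome = {
  resolved: Map<string, { email: string; provider: string }>;
  dropped: string[]; // contacts for which every stage missed, was skipped, or errored
};

function runWaterfall(contactIds: readonly string[], chain: readonly Provider[]): WaterfallOutcome {
  const resolved = new Map<string, { email: string; provider: string }>();
  let pending = [...contactIds];

  for (const provider of chain) {
    if (pending.length === 0) break; // nothing left for later stages
    const stillPending: string[] = [];
    for (const id of pending) {
      const result = provider.lookup(id);
      if (result.status === "hit") {
        resolved.set(id, { email: result.email, provider: provider.name });
      } else {
        stillPending.push(id); // falls through to the next stage
      }
    }
    pending = stillPending;
  }

  return { resolved, dropped: pending };
}
```

A later stage never sees a contact that an earlier stage resolved, which is why limiter pressure falls off down the chain.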
Confirmed vendor rate caps (2026-06)
| Vendor | Per-endpoint cap | Account total | Bulk endpoint | 429 backoff signal |
|---|---|---|---|---|
| Cognism | 500/min per endpoint (/enrich, /redeem counted separately) | 1000/min | ❌ no bulk enrich; enrich is 1/contact; /redeem batches 20 | x-rate-limit-reset (delta-seconds), NOT a standard Retry-After |
| Apollo | ~1000/min on people/match + organizations/enrich | - | ✅ people/bulk_match (≤10/call) | standard Retry-After (+ X-RateLimit-Reset epoch) |
| Lusha | 200/min + 400/hour + 2000/day (contractual ceiling) | - | ✅ /v2/person batch | standard Retry-After |
| PDL | per X-RateLimit-* headers | - | ✅ /v5/person/bulk (≤100) | Retry-After / X-RateLimit-Reset |
| Skrapp / AnymailFinder / AeroLeads / RocketReach | low-volume | - | ❌ none | no reliable rate headers |
Cognism is a hard vendor wall
Cognism enrich is 500/min with no bulk endpoint. A large Cognism list takes size ÷ 500/min (e.g. ~1000 contacts ≈ 2 min) but completes with zero drops. Raising it is a Cognism entitlements/plan change (vendor action), not code.
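The drain-time arithmetic can be written as a small helper. This is only a sketch: estimateDrainMinutes is a hypothetical name, and it ignores redeem calls, retries and network latency.

```typescript
// Rough lower bound on how long a per-contact vendor leg takes to drain.
function estimateDrainMinutes(contacts: number, callsPerMinute: number): number {
  if (contacts < 0 || callsPerMinute <= 0) {
    throw new RangeError("contacts must be >= 0 and callsPerMinute must be > 0");
  }
  return contacts / callsPerMinute;
}

// At the vendor cap of 500/min, 1000 contacts take about 2 minutes.
const atVendorCap = estimateDrainMinutes(1000, 500);
// At a bucket paced to 8/s (480/min), the same list takes slightly longer.
const atBucketRate = estimateDrainMinutes(1000, 8 * 60);
```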
Single source of truth: the model-api rate limiter (COM-43136)
Before COM-43136 there were two throttle stacks (commercial-api vendor-rate-limiter and model-api provider-rate-limiter) double-rejecting. They were collapsed to one:
- Enforcement = model-api provider-rate-limiter: Redis-backed (rate-limiter-flexible), one bucket per provider, shared fleet-wide across Cloud Run replicas, plus a per-process concurrency cap (default 24, env-tunable). This is the required pattern for a multi-instance fleet sharing a vendor quota.
- Commercial-api vendor-rate-limiter = observe-only by default (FF_DISABLE_COMMERCIAL_RATE_LIMITER); set to "false" to restore legacy enforcement. The circuit breaker stays enforced.
- Cognism runs two per-endpoint buckets: COG-enrich + COG-redeem (each ~8/s ≈ 480/min, under the 500/min per-endpoint cap; combined < 1000/min total).
- Skips are surfaced (provider-contact-result.ts), not swallowed: reason ∈ rate_limited | circuit_open | error, emitted as the cred_enrichment_provider_skipped_total{provider,reason} metric and a Provider enrichment skipped log; a bounded retry wave (bulk-enrich-retry.ts) re-attempts rate-limited contacts before the waterfall falls through.
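The per-process concurrency cap can be pictured with the sketch below. It is an in-memory illustration under stated assumptions, not the provider-rate-limiter code: ConcurrencyGate and parseConcurrencyCap are hypothetical names, the gate reports a full state instead of queueing callers, and the fleet-wide Redis bucket is out of scope here.

```typescript
const DEFAULT_CONCURRENCY_CAP = 24;

// Parse a *_PROVIDER_CONCURRENCY_CAP style value; "0" or "unlimited" disables the cap.
function parseConcurrencyCap(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_CONCURRENCY_CAP;
  const value = raw.trim().toLowerCase();
  if (value === "0" || value === "unlimited") return Number.POSITIVE_INFINITY;
  const parsed = Number.parseInt(value, 10);
  // Assumption: unparseable values fall back to the default rather than failing startup.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_CONCURRENCY_CAP;
}

// Counts in-flight vendor calls for one provider within one process.
class ConcurrencyGate {
  private inFlight = 0;
  private readonly cap: number;

  constructor(cap: number) {
    this.cap = cap;
  }

  // Returns true and reserves a slot if the process is under its cap.
  tryEnter(): boolean {
    if (this.inFlight >= this.cap) return false;
    this.inFlight += 1;
    return true;
  }

  // Must be called once per successful tryEnter(), typically in a finally block.
  leave(): void {
    if (this.inFlight > 0) this.inFlight -= 1;
  }
}
```

The two layers answer different questions: the shared bucket bounds calls per time window across all replicas, while the gate bounds how many calls a single replica has open at once.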
Retry budget: drains bursts instead of shedding (COM-43313)
schedule() waits for a slot within a maxWaitMs budget (Cognism passes 60s). The retry loop was previously capped at MAX_ATTEMPTS = 3, which gave up after ~3s โ far short of the budget โ so a bulk burst shed its tail as rate_limited even though the vendor and the budget had room.
Fix: MAX_ATTEMPTS raised 3 → 64, so the elapsedMs + waitMs > maxWaitMs check is what actually stops the loop. Combined with the concurrency gate (which paces entry), a 300-person Cognism burst now drains at the vendor's 500/min with zero internal drops instead of shedding ~130.
Validated on DEV (2026-06-05)
A fresh 300-contact enrichment under enforce produced 0 internal sheds, 0 vendor 429s, 0 errors (vs ~297 sheds on the same shape pre-fix).
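The interaction between the attempt cap and the wait budget can be reproduced with a small simulation. This is a sketch, not the body of schedule(): simulateSchedule and its retryAfterMs callback are hypothetical, and time is accumulated rather than slept so the logic can be checked deterministically.

```typescript
type ScheduleOutcome = {
  admitted: boolean;
  attempts: number;
  waitedMs: number;
  stoppedBy: "slot" | "budget" | "attempts";
};

// retryAfterMs returns null when a slot is granted, else how long the limiter asks us to wait.
function simulateSchedule(
  retryAfterMs: (attempt: number) => number | null,
  maxWaitMs: number,
  maxAttempts: number,
): ScheduleOutcome {
  let elapsedMs = 0;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const waitMs = retryAfterMs(attempt);
    if (waitMs === null) {
      return { admitted: true, attempts: attempt, waitedMs: elapsedMs, stoppedBy: "slot" };
    }
    // The budget check: waiting again would exceed the caller's budget.
    if (elapsedMs + waitMs > maxWaitMs) {
      return { admitted: false, attempts: attempt, waitedMs: elapsedMs, stoppedBy: "budget" };
    }
    elapsedMs += waitMs; // real code would sleep here
  }
  return { admitted: false, attempts: maxAttempts, waitedMs: elapsedMs, stoppedBy: "attempts" };
}

// A slot frees up after ten 1-second waits; the budget is 60 s.
const slotAfterTenWaits = (attempt: number): number | null => (attempt <= 10 ? 1000 : null);
const withOldCap = simulateSchedule(slotAfterTenWaits, 60_000, 3); // gives up after ~3 s
const withNewCap = simulateSchedule(slotAfterTenWaits, 60_000, 64); // admitted after ~10 s
```

With the low cap the loop stops on attempts while most of the budget is unused; with the higher cap the only remaining stop condition in practice is the budget itself.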
Bulk / request coalescing (COM-43205 / COM-43310)
Throttling is per HTTP call, so coalescing many contacts into one vendor request cuts limiter pressure ~N×.
| Provider | Bulk status |
|---|---|
| Apollo | people/bulk_match (≤10/call) wired into the email path of the enrichPersonContacts waterfall (COM-43310). Phone-reveal stays per-person (async webhook). |
| PDL | /v5/person/bulk (≤100/call). |
| Lusha | /v2/person batch. |
| Cognism | No bulk enrich (vendor limitation). The two-phase flow is decoupled: enrich per-contact (collect redeemIds) → batch-redeem in chunks of 20 (N enrich + N/20 redeem). |
| Long-tail | per-person (no vendor bulk endpoint). |
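The effect of coalescing on call counts can be sketched with a generic chunk helper. The names chunk and cognismCallCount are hypothetical, and rounding N/20 up to a whole number of redeem calls is an assumption about how a partial final batch is sent.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Cognism: one enrich call per contact, then redeem in batches of 20.
function cognismCallCount(contacts: number): { enrich: number; redeem: number } {
  return { enrich: contacts, redeem: Math.ceil(contacts / 20) };
}

// 300 contacts through a 10-per-call bulk endpoint become 30 limiter-visible calls.
const contactIds = Array.from({ length: 300 }, (_, i) => `person-${i}`);
const apolloBatches = chunk(contactIds, 10);
```

Because the limiter counts HTTP calls rather than contacts, 30 batched calls consume a tenth of the bucket that 300 per-person calls would.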
Adaptive header-driven limiting (COM-43353)
Stop guessing the limits and react to the vendor's own headers. Two layers:
L1: honor vendor 429 / Retry-After (always-on)
Implemented centrally in schedule(): when a vendor call rejects with a 429, block the bucket fleet-wide for the reset window, then rethrow (so the provider still classifies it as rate_limited). One interception covers every provider. The reset is read via the per-vendor adapter first (e.g. Cognism's non-standard x-rate-limit-reset), then the generic Retry-After, then a 2 s default (all clamped to ≤ 5 min).
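The reset-window resolution order can be sketched as below. It assumes lower-cased header keys and uses hypothetical names (resolveBlockMs, cognismResetMs, parseRetryAfterMs); the actual adapters live in the limiter and may differ.

```typescript
type ResponseHeaders = Record<string, string | undefined>;
type VendorResetAdapter = (headers: ResponseHeaders) => number | null;

const DEFAULT_BLOCK_MS = 2_000;
const MAX_BLOCK_MS = 5 * 60_000;

// Retry-After is either delta-seconds or an HTTP-date.
function parseRetryAfterMs(value: string | undefined, nowMs: number): number | null {
  if (value === undefined || value.trim() === "") return null;
  const seconds = Number(value);
  if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  const dateMs = Date.parse(value);
  return Number.isNaN(dateMs) ? null : Math.max(0, dateMs - nowMs);
}

// Cognism reports the reset as delta-seconds in a non-standard header.
const cognismResetMs: VendorResetAdapter = (headers) => {
  const seconds = Number(headers["x-rate-limit-reset"]);
  return Number.isFinite(seconds) && seconds >= 0 ? seconds * 1000 : null;
};

// Order: vendor adapter, then generic Retry-After, then the default; always clamped.
function resolveBlockMs(
  headers: ResponseHeaders,
  adapter?: VendorResetAdapter,
  nowMs: number = Date.now(),
): number {
  const fromVendor = adapter ? adapter(headers) : null;
  const raw = fromVendor ?? parseRetryAfterMs(headers["retry-after"], nowMs) ?? DEFAULT_BLOCK_MS;
  return Math.min(Math.max(raw, 0), MAX_BLOCK_MS);
}
```

The clamp matters in both directions: a missing header still produces a short pause, and a mis-parsed or hostile header cannot park a bucket for longer than five minutes.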
L2: self-calibrating ceilings (flag-gated, shadow-first)
Per-vendor header adapters (provider-rate-headers.ts) parse each vendor's real limit/remaining/reset (Cognism, Apollo, Lusha, PDL; long-tail → null → static config stands). limiter.observe(headers) feeds responses back. Behaviour is gated by env var:
| PROVIDER_RATE_LIMITER_HEADER_DRIVEN | Effect |
|---|---|
| off | ignore vendor headers (pre-L2) |
| shadow (default) | parse + log divergence vs static config; enforce nothing |
| enforce | additionally resize the bucket to discoveredLimit × 0.9 when it drifts > 20 % from current |
This auto-tracks an entitlement change (e.g. Cognism 500→1000) with no deploy, and auto-tightens if a vendor lowers us.
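The three modes can be expressed as one decision function. This is a sketch under assumptions: decideBucketResize is a hypothetical name, drift is measured between the 0.9 × discovered target and the current limit, the target is rounded down, and shadow mode logs under the same threshold that enforce uses to resize.

```typescript
type HeaderDrivenMode = "off" | "shadow" | "enforce";
type ResizeAction = "ignore" | "keep" | "log" | "resize";
type ResizeDecision = { action: ResizeAction; target: number | null };

const HEADROOM = 0.9; // stay 10 % under the vendor's advertised limit
const DRIFT_THRESHOLD = 0.2; // only react to changes larger than 20 %

function decideBucketResize(
  mode: HeaderDrivenMode,
  currentLimit: number,
  discoveredLimit: number | null,
): ResizeDecision {
  // No headers parsed (long-tail vendors) or L2 disabled: static config stands.
  if (mode === "off" || discoveredLimit === null || currentLimit <= 0) {
    return { action: "ignore", target: null };
  }
  const target = Math.floor(discoveredLimit * HEADROOM);
  const drift = Math.abs(target - currentLimit) / currentLimit;
  if (drift <= DRIFT_THRESHOLD) return { action: "keep", target };
  return { action: mode === "enforce" ? "resize" : "log", target };
}

// Entitlement raised from 500 to 1000 per minute while the bucket sits at 480.
const afterUpgrade = decideBucketResize("enforce", 480, 1000);
```

The drift threshold keeps the bucket from being rewritten on every response when the vendor's reported limit fluctuates slightly.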
Rollout: shadow → dev → staging → prod
Flip to enforce per environment, dev first, never prod-first. Pass/fail signal: zero vendor 429s post-enforce (a 429 means an adapter over-read the cap; revert to shadow). Then staging for ~24h against the canary criterion (pause if cred_enrichment_provider_skipped_total{reason="rate_limited"} crosses 5× its 7-day baseline for 30 min), then prod. Cognism is the highest-confidence adapter (built from live headers); Apollo/Lusha/PDL adapters are validated in shadow before enforcing.
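One way to read the canary criterion is as a sustained-breach check over per-minute samples. The sketch below assumes that reading (every one of the last 30 one-minute samples is above 5× baseline); shouldPauseRollout and its inputs are hypothetical, and a production alert would normally be defined in the monitoring system rather than in application code.

```typescript
// perMinuteSkips: rate_limited skips per minute, oldest first.
// baselinePerMinute: the 7-day average skips per minute for the same metric.
function shouldPauseRollout(
  perMinuteSkips: readonly number[],
  baselinePerMinute: number,
  factor = 5,
  windowMinutes = 30,
): boolean {
  if (perMinuteSkips.length < windowMinutes) return false; // not enough data yet
  const threshold = baselinePerMinute * factor;
  const recent = perMinuteSkips.slice(-windowMinutes);
  return recent.every((count) => count > threshold);
}
```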
Logging & observability (COM-43288)
Enrichment logs use the unified log schema: console-logger stamps metadata.feature from metadata.component via log-health/feature-registry. Components that roll up to feature: "email-enrichment" include provider-enrichment, the per-vendor *-person-lookup adapters, and provider-rate-limiter. Build GCP log-based metrics by filtering on metadata.feature rather than enumerating components.
Key log lines to grep (GCP Logs Explorer):
| Service | Line | Meaning |
|---|---|---|
| model-api | Rate limit exceeded for provider <p>; retry after Nms | internal RateLimitedError (our limiter shed); should be ~0 post-fix |
| model-api | resizing bucket to discovered limit | L2 enforce retuned a bucket (only on >20% drift) |
| model-api | discovered limit diverges (shadow) | L2 shadow saw divergence (logs only, no change) |
| model-api | Cognism.com enrich/redeem error (status 429) / Apollo.io non-200 | a real vendor 429: the over-enforce / over-send signal |
| commercial-api | Provider enrichment skipped {provider,reason,personId} | a contact was skipped (rate_limited / circuit_open / error) |
| commercial-api | Retrying rate-limited persons {count,provider,retryNumber} | retry wave fired |
Env vars / tuning (set on the Cloud Run service; read lazily)
| Var | Default | Purpose |
|---|---|---|
| PROVIDER_RATE_LIMITER_HEADER_DRIVEN | shadow | L2 mode: off / shadow / enforce |
| COGNISM_POINTS_PER_SECOND | 8 | Cognism per-endpoint bucket rate |
| APOLLO_POINTS_PER_SECOND | 15 | Apollo bucket rate |
| COGNISM_/APOLLO_/LUSHA_PROVIDER_CONCURRENCY_CAP | 24 | per-process in-flight cap (0/unlimited disables) |
| COGNISM_MAX_WAIT_MS | 60000 | per-request wait budget for the Cognism legs |
| FF_BULK_ENRICH_RETRY | enabled | P-003 retry wave ("false" disables) |
| FF_DISABLE_COMMERCIAL_RATE_LIMITER | observe-only | "false" restores legacy commercial-api enforcement (double-negative; see runbook) |
Known issues
- Lusha LSPerson integer overflow: Lusha person/company IDs can exceed 2³¹; the LSPerson column is integer, so those rows fail to persist (value "…" is out of range for type integer). Needs bigint. The contact gets no Lusha value even when Lusha returned data.
- validateStatus: () => true providers (AeroLeads, Adyntel) resolve a 429 instead of throwing, so L1's central interception (which fires on rejection) misses it; these need a per-vendor limiter.block(...) in their status===429 branch (follow-up under COM-43353).
- Cognism enrich vendor cap (500/min): see warning above; large lists are slow-but-complete, not dropped.
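For the validateStatus case, the missing branch might look roughly like the sketch below. Everything here is an assumption for illustration: the BlockableLimiter interface, its block(ms) signature, the VendorResponse shape and the classifyResolvedResponse name are hypothetical and do not describe the real limiter API.

```typescript
// Hypothetical limiter surface; the real block(...) signature may differ.
interface BlockableLimiter {
  block(ms: number): void;
}

type VendorResponse = { status: number; headers: Record<string, string | undefined> };
type CallClass = "ok" | "rate_limited" | "error";

const FALLBACK_BLOCK_MS = 2_000;
const BLOCK_CEILING_MS = 5 * 60_000;

// For clients that resolve every status, the 429 must be handled explicitly
// because no rejection reaches the central interception.
function classifyResolvedResponse(res: VendorResponse, limiter: BlockableLimiter): CallClass {
  if (res.status === 429) {
    const seconds = Number(res.headers["retry-after"]);
    // These vendors have no reliable rate headers, so the fallback is the common path.
    const waitMs = Number.isFinite(seconds) && seconds > 0 ? seconds * 1000 : FALLBACK_BLOCK_MS;
    limiter.block(Math.min(waitMs, BLOCK_CEILING_MS));
    return "rate_limited";
  }
  return res.status >= 200 && res.status < 300 ? "ok" : "error";
}
```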
Tickets
| Ticket | What |
|---|---|
| COM-43136 | Collapse two rate-limit stacks → single model-api enforcer; surface silent drops + retry wave |
| COM-43313 | maxWaitMs governs retries (drop MAX_ATTEMPTS=3 cap) |
| COM-43205 | Vendor-native bulk batching (Apollo bulk_match, PDL bulk, Cognism redeem decouple) |
| COM-43310 | Wire the email waterfall to Apollo bulk_match |
| COM-43353 | Adaptive header-driven limiting (L1 429 backpressure + L2 self-calibration) |
| COM-43288 | Unified enrichment log schema (component → feature) |
Operational runbook: apps/api-commercial/docs/operations/bulk-enrichment-runbook.md (in cred-platform-ts).