Document received: SAD import — DE supplier, entry at Taranto. Extraction completed in 23 seconds. TARIC 8471 30 00, CIF €142,400. Anomaly: declared weight 3,800 kg outside expected range. Verification required.
Customs documents arrive extracted and validated in the management system.
Customs Document Extractor automatically extracts information from customs documents: SAD (Single Administrative Document), ICS2 (Import Control System 2), and AEO (Authorised Economic Operator) declarations. Validation is consistent with the Union Customs Code (Regulation (EU) No 952/2013). Structured data feeds the operator's management system without manual transcription.
Customs Document Extractor at work.
Confirmed, mixed shipment with spare parts. I'll correct the entry on the waybill. Proceed with the corrected extraction.
Entry updated. Data loaded into the customs management system. Audit record written with timestamp.
Why it exists.
The customs document flow for logistics operators and freight forwarders is heavy: every extra-EU import and export comes with a SAD; incoming goods require ICS2 for pre-arrival security; and AEO-qualified customers produce dedicated declarations. Manual transcription of data into the management system is slow, expensive, and a structural source of errors.
How it works each day.
Customs Document Extractor activates on document arrival. It identifies the type (SAD, ICS2, AEO) and extracts the structured data specific to each type: for SAD, the TARIC customs code, country of origin, declared value, applicable duty, any documented preferential regime; for ICS2, the pre-arrival security data; for AEO, qualification and validity. It validates the data against EU Customs Code rules and flags anomalies before loading into the management system.
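As a rough illustration of the per-type structured data described above, the sketch below models the three document types as a TypeScript discriminated union. All type, field, and function names (`SadData`, `taricCode`, `identifyType`, and so on) are hypothetical and not taken from the product, and the keyword-based type detection is only a stand-in for whatever classification the agent actually performs.

```typescript
// Hypothetical shapes for the structured data extracted per document type.
type SadData = {
  kind: "SAD";
  taricCode: string;           // e.g. "8471 30 00"
  countryOfOrigin: string;     // assumed to be a two-letter country code, e.g. "DE"
  declaredValueEur: number;
  dutyRatePercent: number;
  preferentialRegime?: string; // present only when documented
};

type Ics2Data = { kind: "ICS2"; preArrivalSecurity: Record<string, string> };

type AeoData = { kind: "AEO"; qualification: string; validUntil: string };

type ExtractedDocument = SadData | Ics2Data | AeoData;

// Naive placeholder classifier: keyword matching on the raw text.
function identifyType(rawText: string): ExtractedDocument["kind"] | undefined {
  const text = rawText.toUpperCase();
  if (text.includes("SINGLE ADMINISTRATIVE DOCUMENT")) return "SAD";
  if (text.includes("ENTRY SUMMARY DECLARATION")) return "ICS2";
  if (text.includes("AUTHORISED ECONOMIC OPERATOR")) return "AEO";
  return undefined;
}

// Switch on the discriminant: inside each case the compiler knows which fields exist.
function summarize(doc: ExtractedDocument): string {
  switch (doc.kind) {
    case "SAD":
      return `SAD ${doc.taricCode} from ${doc.countryOfOrigin}, value EUR ${doc.declaredValueEur}`;
    case "ICS2":
      return `ICS2 with ${Object.keys(doc.preArrivalSecurity).length} security fields`;
    case "AEO":
      return `AEO ${doc.qualification}, valid until ${doc.validUntil}`;
  }
}
```

Under strict compiler settings, adding a fourth document type to the union without a matching `case` makes `summarize` fail to compile, which is one way to keep new schemas from being silently ignored.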
The decision stays with the team.
The customs operations team decides whether to proceed, correct, or block. The agent does not write anomalous data to the management system without confirmation, and does not bypass quality checks.
The teams that recover time and quality in the customs flow.
Customs operator
Customs operators handling dozens of dossiers per day stop transcribing data manually from PDF or scanned documents. Automated extraction reduces errors and frees time for the cases that require judgment — anomalies, contentious valuations, special regimes.
Head of customs operations
The head of customs operations has a traceable record of every document processed: timestamp, validation outcome, the operator who confirmed. When the customs authority audits, the registry is queryable with a standard SQL client.
Freight forwarder
The freight forwarder handling customers with regular operations — structured importers, AEO operators — cuts the time between document arrival and loading into the customer's system.
A SAD handled in a few minutes instead of half an hour.
The document enters the channel, the agent detects it.
For a logistics operator handling fifty customs dossiers per day, the flow starts when the document arrives on the work channel — email or upload into the ticketing system. Customs Document Extractor detects the document, identifies its type, and starts extraction.
Thirty seconds to extract the key SAD fields.
For a standard import SAD from an extra-EU supplier, the agent extracts in under thirty seconds: TARIC code, country of origin, declared value, applicable duty, any documented preferential regime. If the data is consistent with the configured validation rules, the document is loaded directly into the customs CMS and the customer's ERP. If an anomaly appears, the agent flags it to the operator with the rule violated, suspends automatic loading, and waits for confirmation.
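The load-or-suspend behaviour described above can be sketched as a small decision function. This is a minimal sketch under stated assumptions: `Anomaly`, `Decision`, `Check`, `decide`, and `weightInRange` are hypothetical names, and the 500-2000 kg range is an illustrative number, not a value from any real rule set.

```typescript
// Hypothetical anomaly shape: which rule failed, what was extracted, what was expected.
type Anomaly = { rule: string; extracted: number; expectedMin: number; expectedMax: number };

type Decision =
  | { action: "load" }                           // consistent: load into CMS and ERP
  | { action: "suspend"; anomalies: Anomaly[] }; // flagged: wait for operator confirmation

// A validation check returns an anomaly, or null when the value is acceptable.
type Check<T> = (doc: T) => Anomaly | null;

// Runs every check; any anomaly suspends automatic loading.
function decide<T>(doc: T, checks: Check<T>[]): Decision {
  const anomalies = checks
    .map((check) => check(doc))
    .filter((a): a is Anomaly => a !== null);
  return anomalies.length === 0 ? { action: "load" } : { action: "suspend", anomalies };
}

// Example check with an assumed weight range; real ranges would come from the customer's rules.
const weightInRange: Check<{ weightKg: number }> = (doc) =>
  doc.weightKg >= 500 && doc.weightKg <= 2000
    ? null
    : { rule: "weight-range", extracted: doc.weightKg, expectedMin: 500, expectedMax: 2000 };
```

Calling `decide({ weightKg: 3800 }, [weightInRange])` yields a `suspend` decision carrying the violated rule, which is the information the operator would need to confirm or correct.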
The operator confirms. The event enters the audit registry.
The operator examines the anomaly, corrects if needed, and confirms. The corrected document enters the management system. The entire sequence stays in the audit registry with timestamp, responsible operator, and rule applied.
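To make the audit trail concrete, here is a minimal sketch of an append-only log that records the three items named above: timestamp, responsible operator, and rule applied. The `AuditEntry` fields and the `AuditLog` class are hypothetical, and the in-memory array merely stands in for the real registry.

```typescript
// Hypothetical audit entry: field names are illustrative only.
type AuditEntry = Readonly<{
  documentId: string;
  timestamp: string;   // ISO 8601, UTC
  operator: string;    // who confirmed
  ruleApplied: string; // which validation rule was involved
  outcome: "confirmed" | "corrected" | "blocked";
}>;

// Append-only in-memory log standing in for the real registry.
class AuditLog {
  private readonly entries: AuditEntry[] = [];

  // The timestamp is set by the log, not by the caller.
  append(entry: Omit<AuditEntry, "timestamp">, now: Date = new Date()): AuditEntry {
    const stored: AuditEntry = Object.freeze({ ...entry, timestamp: now.toISOString() });
    this.entries.push(stored);
    return stored;
  }

  // Returns a new array so callers cannot mutate the log's internal list.
  forDocument(documentId: string): AuditEntry[] {
    return this.entries.filter((e) => e.documentId === documentId);
  }
}
```

The class exposes no update or delete method, and stored entries are frozen, which approximates immutability at the application level; a production registry would enforce this in the database as well.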
Declarative rules from the customer's customs operations team.
The rules of Customs Document Extractor are declarative. The customer's customs operations team defines in a readable format the extraction schemas for each document type, the validation rules (expected weight ranges per customs heading, admissible TARIC codes per goods category, recognised preferential regimes), and the anomaly thresholds that require manual review. The rules live in the customer's repository, versioned, validated at agent startup.
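As an illustration of what declarative rules validated at startup could look like, the sketch below expresses a rule set as a typed object plus a sanity check. The real rule format is not specified here, so the `RuleSet` shape, every field name, and the numeric values are assumptions made for the example.

```typescript
// Hypothetical rule-set shape; the actual file format and field names are assumptions.
type WeightRange = { minKg: number; maxKg: number };

type RuleSet = {
  weightRangesByHeading: Record<string, WeightRange>;  // keyed by customs heading
  admissibleTaricByCategory: Record<string, string[]>; // keyed by goods category
  recognisedPreferentialRegimes: string[];
  manualReviewThresholdEur: number; // declared value above which review is forced
};

// Startup validation: returns a list of problems; an empty list means the rules are usable.
function validateRuleSet(rules: RuleSet): string[] {
  const problems: string[] = [];
  for (const [heading, range] of Object.entries(rules.weightRangesByHeading)) {
    if (!(range.minKg >= 0 && range.minKg < range.maxKg)) {
      problems.push(`weight range for heading ${heading} is not a valid interval`);
    }
  }
  for (const [category, codes] of Object.entries(rules.admissibleTaricByCategory)) {
    if (codes.length === 0) problems.push(`category ${category} has no admissible TARIC codes`);
  }
  if (!(rules.manualReviewThresholdEur > 0)) {
    problems.push("manualReviewThresholdEur must be positive");
  }
  return problems;
}

// Illustrative values only.
const exampleRules: RuleSet = {
  weightRangesByHeading: { "8471": { minKg: 500, maxKg: 2000 } },
  admissibleTaricByCategory: { laptops: ["8471 30 00"] },
  recognisedPreferentialRegimes: ["GSP"],
  manualReviewThresholdEur: 250000,
};
```

Refusing to start when `validateRuleSet` returns any problem keeps a malformed rule change from reaching live dossiers.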
- Language: TypeScript (Node.js)
- LLM model: customer's choice (Anthropic, OpenAI, Mistral, open source models hosted internally, AWS Bedrock for a private model)
- Built-in controls used: pii-detector, credential-detector
- Native delivery channels: Slack, Telegram, WhatsApp, OpenAI-compatible HTTP
- OCR for scanned documents: not built-in; external OCR service (e.g. Google Document AI, Azure Form Recognizer) configured during delivery
- Customs CMS + ERP integration: dedicated adapter built during delivery by the Exelab team
- Memory: persistent per instance, pgvector + PostgreSQL FTS
- Registry: immutable, queryable with a standard SQL client (inspectable during a customs authority audit)
How Customs Document Extractor works in detail.
For native PDFs (digitally generated, with selectable text) extraction is direct. For scanned documents (images, scanner-generated PDFs) an external OCR service is required and configured during delivery: the Exelab team integrates the customer's OCR service (Google Document AI, Azure Form Recognizer, or equivalent) and connects it to the extraction flow. The OCR step is transparent to the operator — the document enters the channel, the agent returns the structured data regardless of source.
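The routing between direct extraction and OCR can be sketched as follows. This is a sketch under stated assumptions: `OcrService`, `IncomingFile`, and `resolveText` are hypothetical names, and the interface does not reproduce the API of Google Document AI, Azure Form Recognizer, or any other vendor; the adapter built during delivery would wrap the chosen service behind something similar.

```typescript
// Hypothetical interface for the external OCR service wired in during delivery.
interface OcrService {
  recognize(file: Uint8Array): Promise<string>;
}

type IncomingFile = {
  bytes: Uint8Array;
  embeddedText: string | null; // text layer of a native PDF, null for scans and images
};

// Returns the text to extract from, calling OCR only when no usable text layer exists.
async function resolveText(file: IncomingFile, ocr: OcrService): Promise<string> {
  const text = file.embeddedText?.trim() ?? "";
  if (text.length > 0) return text; // native PDF: direct extraction
  return ocr.recognize(file.bytes); // scanned document: delegate to external OCR
}
```

Because both branches return plain text, the downstream extraction step does not need to know where the text came from, which matches the idea that the OCR step is transparent to the operator.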
The standard configuration covers SAD (import and export), ICS2 (Entry Summary Declaration), and AEO declarations. Other customs formats (MRN, T1, ATA carnet, EUR.1 preferential origin documents) are added as declarative schemas during delivery. The Exelab team works with the customer's customs operations team to set the priorities.
Every anomaly is traced with the rule violated, the extracted value, and the expected range. The operator sees the specific flag, corrects or confirms. Frequent anomalies on a certain document type lead to a revision of the validation rules: the customs operations team updates the declarative file, tests it, promotes it to production. Rule improvement stays inside the customer's team.
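One way a team might spot the frequent anomalies mentioned above is to count traced anomalies per document type and rule. The sketch below assumes a hypothetical `TracedAnomaly` record with the three traced fields (rule violated, extracted value, expected range); the function name and the threshold are illustrative choices, not product features.

```typescript
// Hypothetical traced anomaly, mirroring the fields named above.
type TracedAnomaly = {
  documentType: "SAD" | "ICS2" | "AEO";
  rule: string;
  extractedValue: number;
  expectedRange: { min: number; max: number };
};

// Counts anomalies per "documentType:rule" key and returns the keys at or above
// the threshold, most frequent first.
function frequentAnomalies(
  anomalies: TracedAnomaly[],
  threshold: number
): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const a of anomalies) {
    const key = `${a.documentType}:${a.rule}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, n]) => n >= threshold)
    .sort((x, y) => y[1] - x[1]);
}
```

A rule that tops this list week after week is a candidate for revision in the declarative file, for example by widening an expected range that turns out to be too tight for a recurring shipment type.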
The typical pattern is 10-16 weeks. The standard phases add up to 10-12 weeks: discovery and document-type mapping, two weeks; extraction schema and validation rule configuration, three weeks; customs CMS and ERP integration, three to four weeks; testing with real dossiers and hand-off to the customs operations team, two to three weeks. Actual duration depends on the variety of formats and the complexity of the customer's management system, which can push a project toward the upper end of the range.
From a 30-minute conversation to the squad in production.
A 30-45 minute conversation to understand how Customs Document Extractor would be configured for the customer's case: document types, customs management system, priority validation rules.