Integration Guide: Version 2.3 to 2.4

Overview

Version 2.4 of the Document API v2 adds configuration options and extended processing capabilities.
Each feature can now be switched on or off per request, so document processing can be tailored to a specific workflow or use case.


Key Differences

1. Feature Configuration

Version 2.3

  • Classification & Grouping: Always enabled

  • Extraction: Optional

  • JSON configuration: Optional

Version 2.4

  • All features are configurable:

    • Grouping.Enabled

    • Classification.Enabled

    • Extraction.Enabled

    • OriginalFileOcr.Enabled

  • A JSON configuration defines the processing behavior

  • At least one feature must be enabled per request

This allows more targeted processing and better alignment with individual requirements.
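
As an illustration of the feature flags listed above, the following Python sketch builds a configuration in which every feature is set explicitly. The helper name build_config is hypothetical and not part of the API. The sketch only assembles the JSON body; how the body is sent is outside its scope.

```python
import json

# Feature names taken from the Version 2.4 list above.
FEATURES = ("Grouping", "Classification", "Extraction", "OriginalFileOcr")


def build_config(enabled):
    """Return a 2.4-style configuration with every feature flag set explicitly.

    `build_config` is a hypothetical client-side helper, not an API function.
    """
    enabled = set(enabled)
    unknown = enabled - set(FEATURES)
    if unknown:
        raise ValueError(f"Unknown features: {sorted(unknown)}")
    if not enabled:
        # The guide requires at least one enabled feature per request.
        raise ValueError("At least one feature must be enabled")
    return {"Features": {name: {"Enabled": name in enabled} for name in FEATURES}}


config = build_config({"Grouping", "Classification"})
print(json.dumps(config, indent=2))
```

Writing every flag out, rather than relying on defaults for omitted features, keeps the request self-describing. Whether omitted features default to disabled is not stated here, although the OCR-only use case suggests that omission is accepted.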


2. ExtractAnyDocumentAs

Introduced in version 2.4, ExtractAnyDocumentAs extracts a known document type directly, without running grouping or classification.

{
  "Features": {
    "Grouping": { "Enabled": false },
    "Classification": { "Enabled": false },
    "Extraction": {
      "Enabled": true,
      "Options": {
        "ExtractAnyDocumentAs": "Invoice"
      }
    }
  }
}

Benefits

  • Faster processing by skipping grouping and classification

  • Uses the original file name as DocumentId (e.g. invoice.pdf)

  • Ideal for single-document files with a known document type

Notes

  • Available when both Grouping and Classification are disabled

  • Cannot be combined with ExtractIfDocumentClassifiedAs
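
A client that always submits single-document files of one known type could generate this configuration from a small helper. The sketch below is one possible way to do that in Python. The function name direct_extraction_config is an assumption made for illustration.

```python
import json


def direct_extraction_config(document_type):
    """Build a configuration that extracts every file as `document_type`.

    Hypothetical helper: it mirrors the ExtractAnyDocumentAs example above
    and disables Grouping and Classification, as the notes require.
    """
    if not document_type:
        raise ValueError("document_type must be a non-empty string")
    return {
        "Features": {
            "Grouping": {"Enabled": False},
            "Classification": {"Enabled": False},
            "Extraction": {
                "Enabled": True,
                "Options": {"ExtractAnyDocumentAs": document_type},
            },
        }
    }


# Serialize to the JSON body that would accompany the request.
payload = json.dumps(direct_extraction_config("Invoice"))
```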


3. Grouping and Classification Coupling

Grouping and Classification are designed to work together and must be enabled or disabled as a pair.

{ "Grouping": { "Enabled": true }, "Classification": { "Enabled": true } }
 
{ "Grouping": { "Enabled": false }, "Classification": { "Enabled": false } }

This ensures consistent and predictable processing behavior.
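
One way to keep the two flags from drifting apart in client code is to change them only through a single switch. The Python sketch below assumes a hypothetical helper named set_grouping_and_classification. It operates on the contents of the "Features" object.

```python
def set_grouping_and_classification(features, enabled):
    """Return a copy of `features` with Grouping and Classification toggled together.

    Hypothetical helper; `features` is the dict stored under the "Features" key.
    """
    updated = dict(features)
    # Preserve any other keys on each feature, then overwrite "Enabled".
    updated["Grouping"] = {**features.get("Grouping", {}), "Enabled": enabled}
    updated["Classification"] = {**features.get("Classification", {}), "Enabled": enabled}
    return updated


original = {"Extraction": {"Enabled": True}}
features = set_grouping_and_classification(original, False)
```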


4. Response Structure

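The response sections that are populated depend on the selected configuration. The following example shows a response for direct extraction with Grouping and Classification disabled: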

{
  "DocumentId": "invoice.pdf",
  "Grouping": null,
  "Classification": null,
  "Extraction": {
    "Result": { "DocumentEssentials": [...] },
    "ResultState": { "Code": "200" }
  }
}

Key points

  • Grouping and Classification may be null

  • DocumentId reflects the processing mode:

    • Sequential IDs when grouping is enabled

    • Original file name when grouping and classification are skipped

Clients should handle optional response fields accordingly.
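
The null handling described above can be sketched in Python as follows. The function summarize_response and the keys of the summary it returns are illustrative assumptions. The sample response repeats the example from this section, with an empty list standing in for the elided DocumentEssentials contents.

```python
def summarize_response(response):
    """Collect the parts of a response while tolerating null or missing sections.

    Hypothetical helper; only field names shown in this guide are read.
    """
    # `or {}` covers both a missing key and an explicit null (None).
    extraction = response.get("Extraction") or {}
    result = extraction.get("Result") or {}
    state = extraction.get("ResultState") or {}
    return {
        "document_id": response.get("DocumentId"),
        "has_grouping": response.get("Grouping") is not None,
        "has_classification": response.get("Classification") is not None,
        "essentials": result.get("DocumentEssentials") or [],
        "extraction_code": state.get("Code"),
    }


sample = {
    "DocumentId": "invoice.pdf",
    "Grouping": None,
    "Classification": None,
    "Extraction": {
        "Result": {"DocumentEssentials": []},  # contents elided in the guide
        "ResultState": {"Code": "200"},
    },
}
summary = summarize_response(sample)
```

Treating a missing key and an explicit null the same way keeps the client working across configurations, whichever form the service uses for a disabled feature.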


5. Validation Rules

Version 2.4 validates each configuration against the following rules:

  • At least one feature must be enabled

  • Grouping and Classification must be enabled or disabled together

  • Extraction options must match the selected processing mode

  • ExtractAnyDocumentAs and ExtractIfDocumentClassifiedAs cannot be used together

These rules help avoid ambiguous configurations and ensure predictable behavior.
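
Checking these rules on the client before sending a request can surface configuration mistakes early. The Python sketch below is a hypothetical client-side check, not the service's own validation. Its reading of "Extraction options must match the selected processing mode" is an assumption: ExtractAnyDocumentAs requires Grouping and Classification to be disabled, and ExtractIfDocumentClassifiedAs requires Classification to be enabled.

```python
def validate_config(config):
    """Return a list of rule violations; an empty list means the config passed.

    Hypothetical client-side check based on the rules listed in this guide.
    """
    features = config.get("Features", {})

    def enabled(name):
        # A feature that is absent is treated as disabled (assumption).
        return bool(features.get(name, {}).get("Enabled", False))

    errors = []
    names = ("Grouping", "Classification", "Extraction", "OriginalFileOcr")
    if not any(enabled(n) for n in names):
        errors.append("At least one feature must be enabled")
    if enabled("Grouping") != enabled("Classification"):
        errors.append("Grouping and Classification must be enabled or disabled together")

    options = features.get("Extraction", {}).get("Options") or {}
    has_any = "ExtractAnyDocumentAs" in options
    has_if = "ExtractIfDocumentClassifiedAs" in options
    if has_any and has_if:
        errors.append("ExtractAnyDocumentAs and ExtractIfDocumentClassifiedAs cannot be used together")
    # Assumed interpretation of "options must match the processing mode".
    if has_any and (enabled("Grouping") or enabled("Classification")):
        errors.append("ExtractAnyDocumentAs requires Grouping and Classification to be disabled")
    if has_if and not enabled("Classification"):
        errors.append("ExtractIfDocumentClassifiedAs requires Classification to be enabled")
    return errors


ocr_only = {"Features": {"OriginalFileOcr": {"Enabled": True}}}
problems = validate_config(ocr_only)
```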


Use Cases in Version 2.4

Use Case 1: Direct Extraction of a Known Document Type

{
  "Features": {
    "Grouping": { "Enabled": false },
    "Classification": { "Enabled": false },
    "Extraction": {
      "Enabled": true,
      "Options": {
        "ExtractAnyDocumentAs": "Invoice"
      }
    }
  }
}

Result

  • Faster processing

  • DocumentId equals the original file name

  • Grouping and Classification are null


Use Case 2: Grouping and Classification Only

{
  "Features": {
    "Grouping": { "Enabled": true },
    "Classification": { "Enabled": true },
    "Extraction": { "Enabled": false }
  }
}

Result: Documents are grouped and classified without extraction.


Use Case 3: Full Processing Pipeline

{
  "Features": {
    "Grouping": { "Enabled": true },
    "Classification": { "Enabled": true },
    "Extraction": {
      "Enabled": true,
      "Options": {
        "ExtractIfDocumentClassifiedAs": ["Invoice", "OrderConfirmation"]
      }
    },
    "OriginalFileOcr": { "Enabled": true }
  }
}

Result: The same behavior as in version 2.3.

Use Case 4: OCR Only

{
  "Features": {
    "OriginalFileOcr": { "Enabled": true }
  }
}

Result: OCR data only, without document-level results.


Migration Considerations

When migrating an integration to version 2.4:

  • Always include a JSON configuration

  • Explicitly enable only the features required for your use case

  • Implement null checks for optional response fields

  • Use simplified configurations where appropriate to optimize performance
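
For integrations that want to keep their version 2.3 behavior while adopting the mandatory JSON configuration, a starting point could look like the Python sketch below. It reproduces the Full Processing Pipeline configuration from Use Case 3. The function name v23_style_config and its parameters are hypothetical, and disabling Extraction when no document types are given is an assumption based on Extraction being optional in version 2.3.

```python
def v23_style_config(extract_types, include_ocr=True):
    """Build a 2.4 configuration modeled on Use Case 3 (full pipeline).

    Hypothetical helper. Grouping and Classification are always enabled,
    as they were in version 2.3; Extraction is enabled only when document
    types are supplied (assumption).
    """
    extract_types = list(extract_types)
    extraction = {"Enabled": bool(extract_types)}
    if extract_types:
        extraction["Options"] = {"ExtractIfDocumentClassifiedAs": extract_types}
    return {
        "Features": {
            "Grouping": {"Enabled": True},
            "Classification": {"Enabled": True},
            "Extraction": extraction,
            "OriginalFileOcr": {"Enabled": include_ocr},
        }
    }


full_pipeline = v23_style_config(["Invoice", "OrderConfirmation"])
```

From this baseline, features that a workflow does not need can be disabled one at a time, keeping Grouping and Classification paired.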

