Structured formats import
Structured formats import lets you import and process non-visual documents such as JSON or XML files. It extracts the information from these files and also renders a minimalistic PDF representation to make manual review easier.
Installation
Structured formats import is a webhook maintained by Rossum. In order to use it, follow these steps:
- Log in to your Rossum account.
- Navigate to Extensions → My extensions.
- Click on Create extension.
- Fill in the following fields:
  - Name: Structured formats import
  - Trigger events: Document content: Initialize, Started, Updated
  - Extension type: Webhook
  - URL (see below)
- Click Create the webhook.
- Fill in the Configuration field (see Configuration examples).
Basic usage
Work in progress
We're still working on this part and would love to hear your thoughts! Feel free to share your feedback or submit a pull request. Thank you! 🙏
Available configuration options
{
  // Various independent configurations that can be conditionally triggered via `trigger_condition`:
  "configurations": [
    {
      "trigger_condition": {
        "file_type": "xml"
      },
      // Optional. Whether the original XML/JSON file should be split into smaller ones:
      "split_selectors": ["/RecordLabel/Productions/Production"],
      // Fields to be extracted from the source file and assigned to given datapoints:
      "fields": [
        {
          "schema_id": "document_id",
          // If many selectors are specified, they serve as a fallback list.
          "selectors": ["./Metadata/ID"]
        }
      ],
      // Optional specification of the original PDF file that should be extracted from the source
      // file (base64 encoded):
      "pdf_file": {
        "name_selectors": [
          "cac:AdditionalDocumentReference/cac:Attachment/cbc:EmbeddedDocumentBinaryObject/@filename"
        ],
        // Content should be base64 encoded:
        "content_selectors": [
          "cac:AdditionalDocumentReference/cac:Attachment/cbc:EmbeddedDocumentBinaryObject"
        ]
      }
    }
    // …
  ]
}
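To make the selector semantics above more concrete, here is a minimal Python sketch of how splitting, fallback selectors, and base64 decoding could behave on an XML file. It is an illustration only and not the extension's actual implementation. The sample XML, the `./Metadata/Identifier` and `./Attachment` selectors, and the helper names `split_document`, `extract_field`, and `extract_pdf_bytes` are all hypothetical. The sketch also assumes that `split_selectors` is treated as a fallback list in the same way as field `selectors`, and it leaves out XML namespaces (such as the `cac:` and `cbc:` prefixes) for brevity.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical sample file; element names mirror the selectors in the configuration above.
# "JVBERi0=" is the base64 encoding of the bytes "%PDF-".
SAMPLE_XML = """\
<RecordLabel>
  <Productions>
    <Production>
      <Metadata><ID>P-001</ID></Metadata>
      <Attachment>JVBERi0=</Attachment>
    </Production>
    <Production>
      <Metadata><Identifier>P-002</Identifier></Metadata>
    </Production>
  </Productions>
</RecordLabel>
"""


def split_document(root, split_selectors):
    """Return the elements matched by the first split selector that matches anything."""
    # ElementTree cannot evaluate absolute paths on an element, so wrap the root
    # in a synthetic parent and evaluate "/A/B" as "./A/B" from there.
    wrapper = ET.Element("wrapper")
    wrapper.append(root)
    for selector in split_selectors:
        path = "." + selector if selector.startswith("/") else selector
        parts = wrapper.findall(path)
        if parts:
            return parts
    return [root]  # nothing matched: keep the document whole


def extract_field(element, selectors):
    """Try each selector in order and return the first non-empty text value."""
    for selector in selectors:
        match = element.find(selector)
        if match is not None and match.text and match.text.strip():
            return match.text.strip()
    return None


def extract_pdf_bytes(element, content_selectors):
    """Decode the base64 payload found by the first matching content selector."""
    encoded = extract_field(element, content_selectors)
    return base64.b64decode(encoded) if encoded else None


root = ET.fromstring(SAMPLE_XML)
parts = split_document(root, ["/RecordLabel/Productions/Production"])

# The second selector is only consulted when the first one finds nothing.
ids = [extract_field(p, ["./Metadata/ID", "./Metadata/Identifier"]) for p in parts]
print(ids)  # ['P-001', 'P-002']

print(extract_pdf_bytes(parts[0], ["./Attachment"]))  # b'%PDF-'
print(extract_pdf_bytes(parts[1], ["./Attachment"]))  # None
```

Each `Production` element becomes its own document in this sketch, and the second one resolves `document_id` through the fallback selector because `./Metadata/ID` is absent.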