Data value extractor
The Data Value Extractor serves to extract data from a document that is linked in annotation's metadata. The main use case is to process data from REST API Export as a part of the Export Pipeline.
Installation
- Login to your Rossum account.
- Navigate to Extensions → My extensions.
- Click on Create extension.
- Fill the following fields:
- Name:
Data value extractor
- Trigger events:
Export
- Extension type:
Webhook
- URL (see below)
- Name:
- In "Advanced settings" select Token owner (should have Admin access)
- Click Create the webhook.
Environment | Webhook URL |
---|---|
EU1 Ireland | https://elis.data-value-extractor.rossum-ext.app/ |
EU2 Frankfurt | https://shared-eu2.data-value-extractor.rossum-ext.app/ |
US east coast | https://us.data-value-extractor.rossum-ext.app/ |
Japan Tokyo | https://shared-jp.data-value-extractor.rossum-ext.app/ |
Available configuration options
Simple extraction example.
{
"extract": [
{
"format": "json",
"source_reference_key": "ifs_export_reply_payload",
"extract_rules": [
{
"value_path": "MessageId[0].value",
"target_schema_id": "ifs_reply_message_id"
}
]
}
]
}
More complex configuration example using extraction from two different source_reference_key
and two extract_rules
in the second one. There is also the condition
used, which is reference to a document ID in the annotation which triggers the execution of the extraction.
{
"extract": [
{
"format": "json",
"extract_rules": [
{
"value_path": "doc_id",
"target_schema_id": "erp_doc_id"
}
],
"source_reference_key": "api_xml_export_reply_payload"
},
{
"format": "json",
"condition": "@{api_gate}",
"extract_rules": [
{
"value_path": "status_code",
"target_schema_id": "erp_api_status_code"
},
{
"value_path": "headers.etag",
"target_schema_id": "erp_api_etag"
}
],
"source_reference_key": "api_xml_export_reply_headers"
}
]
}
Parameters
Extract Object
The extract object consists of the following parameters:
Attribute | Type | Description |
---|---|---|
format | str | File format. Currently, only json value is supported. |
condition | str | Reference to annotation.content schema_id that holds evaluated value. When it's empty or "false" (case insensitive), this section won't be evaluated. Otherwise, it will proceed. The condition follows the JSON templating syntax e.g. "condition": "@{api_gate}" |
source_reference_key | str | Relation key into metadata for source document. |
extract_rules | object | Rules to update annotation's content. |
The extract_rules
object defines how values are extracted and stored:
Attribute | Type | Description |
---|---|---|
value_path | str | Query to get the value from the referred document. In case of format=json , it should be in jmespath syntax. |
target_schema_id | str | Annotation's schema_id to be updated. |