Data value extractor

The Data Value Extractor extracts data from a document that is linked in the annotation's metadata. Its main use case is processing data from REST API Export as part of the Export Pipeline.

Installation

  1. Log in to your Rossum account.
  2. Navigate to Extensions → My extensions.
  3. Click Create extension.
  4. Fill in the following fields:
    1. Name: Data value extractor
    2. Trigger events: Export
    3. Extension type: Webhook
    4. URL: use the webhook URL for your environment (listed below)
  5. In "Advanced settings", select a Token owner (the user should have Admin access).
  6. Click Create the webhook.

Environment      Webhook URL
EU1 Ireland      https://elis.data-value-extractor.rossum-ext.app/
EU2 Frankfurt    https://shared-eu2.data-value-extractor.rossum-ext.app/
US east coast    https://us.data-value-extractor.rossum-ext.app/
Japan Tokyo      https://shared-jp.data-value-extractor.rossum-ext.app/

Available configuration options

A simple extraction example:

{
  "extract": [
    {
      "format": "json",
      "source_reference_key": "ifs_export_reply_payload",
      "extract_rules": [
        {
          "value_path": "MessageId[0].value",
          "target_schema_id": "ifs_reply_message_id"
        }
      ]
    }
  ]
}
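
To make the effect of this configuration concrete, here is a minimal Python sketch of what the rule is expected to produce. It is an illustration under assumptions, not the extension's implementation: the sample payload is invented, and resolve_path and apply_rules are hypothetical helpers that understand only dotted keys and [n] indices, which is a small subset of JMESPath.

```python
import json
import re

# Hypothetical reply payload stored under "ifs_export_reply_payload";
# the real structure depends on the system that answered the export.
SAMPLE_PAYLOAD = json.loads('{"MessageId": [{"value": "MSG-001"}]}')

# Matches either a key name or a bracketed list index such as [0].
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def resolve_path(document, value_path):
    """Resolve dotted keys and [n] indices (a simplified JMESPath subset)."""
    current = document
    for key, index in _TOKEN.findall(value_path):
        try:
            current = current[key] if key else current[int(index)]
        except (KeyError, IndexError, TypeError):
            return None  # missing paths resolve to null, as in JMESPath
    return current


def apply_rules(document, extract_rules):
    """Map each target_schema_id to the value found at its value_path."""
    return {
        rule["target_schema_id"]: resolve_path(document, rule["value_path"])
        for rule in extract_rules
    }


rules = [
    {
        "value_path": "MessageId[0].value",
        "target_schema_id": "ifs_reply_message_id",
    }
]
result = apply_rules(SAMPLE_PAYLOAD, rules)
print(result)
```

With this sample payload, the field with schema_id ifs_reply_message_id would receive the value taken from the first MessageId entry.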

A more complex configuration example that extracts from two different source_reference_key values, with two extract_rules in the second section. The second section also uses a condition, which references a schema_id in the annotation; the value of that field determines whether the section's extraction is executed.

{
  "extract": [
    {
      "format": "json",
      "extract_rules": [
        {
          "value_path": "doc_id",
          "target_schema_id": "erp_doc_id"
        }
      ],
      "source_reference_key": "api_xml_export_reply_payload"
    },
    {
      "format": "json",
      "condition": "@{api_gate}",
      "extract_rules": [
        {
          "value_path": "status_code",
          "target_schema_id": "erp_api_status_code"
        },
        {
          "value_path": "headers.etag",
          "target_schema_id": "erp_api_etag"
        }
      ],
      "source_reference_key": "api_xml_export_reply_headers"
    }
  ]
}

Parameters

Extract Object

The extract object consists of the following parameters:

Attribute             Type          Description
format                str           File format. Currently, only the json value is supported.
condition             str           Reference to an annotation.content schema_id that holds the evaluated value. When the value is empty or "false" (case-insensitive), the section is skipped; otherwise, it is processed. The condition follows the JSON templating syntax, e.g. "condition": "@{api_gate}".
source_reference_key  str           Relation key in the annotation's metadata that points to the source document.
extract_rules         list[object]  Rules for updating the annotation's content.
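
The condition rule described above can be sketched in Python as follows. This is an illustration only: render_condition and should_run are hypothetical helpers, and three behaviors are assumptions rather than documented facts, namely that a section without a condition always runs, that a placeholder for a missing field renders as an empty string, and that surrounding whitespace is ignored.

```python
import re

# Matches JSON templating placeholders such as @{api_gate}.
_PLACEHOLDER = re.compile(r"@\{([^}]+)\}")


def render_condition(condition, field_values):
    """Replace each @{schema_id} with that field's value.

    Assumption: a schema_id missing from field_values renders as "".
    """
    return _PLACEHOLDER.sub(
        lambda match: str(field_values.get(match.group(1), "")), condition
    )


def should_run(section, field_values):
    """Return True when the section's condition allows extraction."""
    if "condition" not in section:
        return True  # assumption: no condition means the section always runs
    rendered = render_condition(section["condition"], field_values).strip()
    # Empty or "false" (case-insensitive) skips the section.
    return rendered != "" and rendered.lower() != "false"


section = {"format": "json", "condition": "@{api_gate}"}
print(should_run(section, {"api_gate": "true"}))
print(should_run(section, {"api_gate": "False"}))
print(should_run(section, {"api_gate": ""}))
```

Any other non-empty value, for example a status code stored in the gate field, lets the section proceed under this reading of the rule.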

Each object in extract_rules defines how a value is extracted and where it is stored:

Attribute         Type  Description
value_path        str   Query that selects the value from the referenced document. For format=json, it must use JMESPath syntax.
target_schema_id  str   The schema_id of the annotation field to be updated.
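
Before saving a configuration, it can help to check it against the two tables above. The following Python sketch does that for a single extract section. The validate_section helper is hypothetical and not part of the extension. It assumes that format, source_reference_key, and extract_rules are required and that condition is optional, which matches the examples but is not stated explicitly.

```python
SUPPORTED_FORMATS = {"json"}  # only json is currently supported


def validate_section(section):
    """Return a list of human-readable problems found in one extract section."""
    problems = []
    if section.get("format") not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {section.get('format')!r}")
    if not isinstance(section.get("source_reference_key"), str):
        problems.append("source_reference_key must be a string")
    if "condition" in section and not isinstance(section["condition"], str):
        problems.append("condition must be a string")

    rules = section.get("extract_rules")
    if not isinstance(rules, list) or not rules:
        problems.append("extract_rules must be a non-empty list")
        return problems

    for position, rule in enumerate(rules):
        if not isinstance(rule, dict):
            problems.append(f"extract_rules[{position}] must be an object")
            continue
        for key in ("value_path", "target_schema_id"):
            if not isinstance(rule.get(key), str):
                problems.append(
                    f"extract_rules[{position}].{key} must be a string"
                )
    return problems


good = {
    "format": "json",
    "source_reference_key": "api_xml_export_reply_payload",
    "extract_rules": [
        {"value_path": "doc_id", "target_schema_id": "erp_doc_id"}
    ],
}
bad = {"format": "xml", "extract_rules": [{"value_path": "doc_id"}]}

print(validate_section(good))
print(validate_section(bad))
```

The first call reports no problems; the second reports the unsupported format, the missing source_reference_key, and the missing target_schema_id.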