Export to Warehouse
Overall architecture
The system has built-in functionality to export conversations to an external warehouse. A DataExport job, run periodically by the Cron Orchestrator service, reads all conversations recorded since the last export, writes them to two CSV files (described below), and compresses those files into a single ZIP archive. The archive is then sent to the configured URL (technical details below).
Format
Two CSV files are generated:
Messages.csv
Field name | Type | Description |
---|---|---|
conversationId | String | Session id (ssuid from request) |
authorType | String | Human / Bot |
content | String | Message content |
createdAt | DateTime | Date the message was received, in the format YYYY-MM-DD'T'HH:mm:ss.SSS'Z', e.g. 2021-08-18T09:11:32.877Z |
labels | JSON object (fields: name, labelType) | List of analytics tags, intents, and bot contexts. The "name" field contains the value; possible values for labelType are "IntentLabel", "ContextLabel", "FlowTagLabel", and "Topic" |
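A minimal sketch of how a consumer might read one row of the messages file. It assumes a header row, comma delimiters, standard CSV quoting, and that the labels column holds a JSON array of objects; none of these details are confirmed above, and the sample row is made up.

```python
import csv
import io
import json
from datetime import datetime, timezone

# Hypothetical sample; a real export may quote or order columns differently.
SAMPLE = (
    "conversationId,authorType,content,createdAt,labels\n"
    'abc-123,Human,Hello,2021-08-18T09:11:32.877Z,'
    '"[{""name"": ""greeting"", ""labelType"": ""IntentLabel""}]"\n'
)


def parse_message(row: dict) -> dict:
    """Convert one raw CSV row into typed Python values."""
    # The trailing Z marks UTC, so attach the UTC time zone explicitly.
    created = datetime.strptime(row["createdAt"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return {
        "conversationId": row["conversationId"],
        "authorType": row["authorType"],
        "content": row["content"],
        "createdAt": created.replace(tzinfo=timezone.utc),
        # Assumption: an empty cell means "no labels".
        "labels": json.loads(row["labels"]) if row["labels"] else [],
    }


messages = [parse_message(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
```

Parsing createdAt into a timezone-aware value up front avoids ambiguity when the rows are later loaded into a warehouse that stores local times.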
Sessions.csv
Field name | Type | Description |
---|---|---|
projectId | String | Project identifier (UUID) in SentiOne Automate |
conversationId | String | Session id (ssuid from request) |
author | String | Telephone number or other name received in API Request |
sourceType | String | Source type (received in API Request), e.g. "chat-tester", "facebook", etc. (Optional) |
createdAt | DateTime | Conversation start date in the format YYYY-MM-DD'T'HH:mm:ss.SSS'Z', e.g. 2021-08-18T09:11:32.877Z |
metadata.* | JSON object | Conversation's metadata |
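One possible reading of the metadata.* row is that each metadata key arrives as its own column named metadata.<key>. The sketch below groups such columns into a nested dictionary under that assumption; the sample row, the project UUID, and the "channel" key are all hypothetical.

```python
import csv
import io

# Hypothetical sample row; "metadata.channel" is a made-up metadata key.
SAMPLE = (
    "projectId,conversationId,author,sourceType,createdAt,metadata.channel\n"
    "0b1e6a52-0000-4000-8000-000000000000,abc-123,+48123456789,facebook,"
    "2021-08-18T09:11:32.877Z,web\n"
)

PREFIX = "metadata."


def split_metadata(row: dict) -> dict:
    """Move every 'metadata.<key>' column into a nested 'metadata' dict."""
    session = {k: v for k, v in row.items() if not k.startswith(PREFIX)}
    session["metadata"] = {
        k[len(PREFIX):]: v for k, v in row.items() if k.startswith(PREFIX)
    }
    # sourceType is optional, so normalise an empty cell to None.
    session["sourceType"] = session.get("sourceType") or None
    return session


sessions = [split_metadata(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
```

If the export instead writes a single column containing a JSON document, the grouping step would be replaced by a json.loads call on that column.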
ZIP
The two files are compressed into a single ZIP archive and sent to the receiving service.
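On the receiving side, the archive can be unpacked in memory. This is a sketch only: it assumes UTF-8 CSV files with a header row, and it builds a tiny stand-in archive with made-up entry names and rows to exercise the function.

```python
import csv
import io
import zipfile


def read_export(zip_bytes: bytes) -> dict:
    """Return {entry name: list of row dicts} for every CSV in the archive."""
    tables = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # ignore anything that is not a CSV entry
            with archive.open(name) as raw:
                # Assumption: the files are UTF-8 encoded.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables


# Build a small stand-in archive in memory (entry names are assumptions).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("Messages.csv", "conversationId,authorType\nabc-123,Bot\n")
    archive.writestr("Sessions.csv", "projectId,conversationId\np-1,abc-123\n")

tables = read_export(buffer.getvalue())
```

Matching entries by extension rather than by exact name keeps the reader working even if the file names differ in case or spelling.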
Service description
The ZIP file is sent to the configured address as an HTTP POST request with a multipart/form-data body (part name: "uploads").
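To try out a receiving service before enabling the job, a comparable request can be assembled by hand. The sketch below builds, but does not send, a multipart/form-data POST with a single "uploads" part. The file name export.zip and the application/zip part type are assumptions, since the exact headers Automate sends are not specified here.

```python
import urllib.request
import uuid


def build_upload_request(
    url: str, zip_bytes: bytes, filename: str = "export.zip"
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST with one 'uploads' part."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="uploads"; filename="{filename}"\r\n'
        "Content-Type: application/zip\r\n"  # assumed part type
        "\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    body = head + zip_bytes + tail
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "Content-Length": str(len(body)),
        },
    )


request = build_upload_request(
    "https://warehouse-service.yourcorp.org/data/receive",  # URL from the sample config
    b"PK\x05\x06" + b"\x00" * 18,  # the 22 bytes of an empty ZIP archive
)
# urllib.request.urlopen(request) would send it; omitted to avoid side effects.
```

The receiving service should look up the uploaded file by the part name "uploads" rather than by file name.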
Cron job configuration
Automate lets you configure so-called "Cron jobs": scheduled jobs that run periodically, each with its own configuration. Here is an example DataExport Cron job configuration:
chatbots.cron-orchestrator.cron-jobs += {
  name: AnalyticsExport,
  uuid: "5c7a518d-754a-46aa-b3ff-065016c22597", // job identifier (random GUID)
  interval: "0 0 2 ? * * *", // [CRON](https://en.wikipedia.org/wiki/Cron) format; in this example the job runs every day at 2 AM in the scheduler's time zone
  message-timeout: 5m, // timeout for running the export job (optional field)
  metadata: {
    // Value from the company setting named "SentiOne company id for transcriptions"
    // (System module, available only to system administrators)
    extCompanyId: "123"
    // Service that will receive the data; see "Service description" above
    url: "https://warehouse-service.yourcorp.org/data/receive"
  }
}
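The interval value has seven fields and uses "?", which resembles the Quartz cron convention (seconds first, year last). Assuming that convention applies here, which this page does not confirm, the fields can be labelled as in the sketch below. The time zone in which the expression is evaluated is not stated.

```python
# Field order assumed to follow the Quartz convention (7 fields, seconds first).
FIELDS = ("second", "minute", "hour", "day_of_month", "month", "day_of_week", "year")


def describe(expression: str) -> dict:
    """Pair each whitespace-separated cron field with its assumed name."""
    parts = expression.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


schedule = describe("0 0 2 ? * * *")
print(schedule["hour"])  # prints 2
```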