Export to Warehouse

Overall architecture


DataExport architecture

The system has built-in functionality for exporting conversations to an external warehouse. The export runs periodically as a job of the Cron Orchestrator service. The DataExport job reads all conversations recorded since the last export, writes them to two CSV files (described below), and compresses them into a ZIP archive. The archive is then sent to the configured URL (see the technical details below).

Format

Two CSV files are generated:

Messages.csv

Field name | Type | Description
conversationId | String | Session id (ssuid from request)
authorType | String | Human / Bot
content | String | Message content
createdAt | DateTime | Message received date in the format YYYY-MM-DD'T'HH:mm:ss.SSS'Z', e.g. 2021-08-18T09:11:32.877Z
labels | JSON object (fields: name, labelType) | List of analytics tags, intents, and bot contexts. The "name" field contains the value; possible values for labelType are "IntentLabel", "ContextLabel", "FlowTagLabel", and "Topic"
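
As an illustration of the fields above, a consumer of the messages file might decode createdAt and labels roughly as follows. This is a minimal sketch, not the product's own reader: it assumes a header row with the field names, a comma delimiter, standard CSV quoting, and that labels holds a JSON array of objects. The sample row and the function names are hypothetical.

```python
import csv
import io
import json
from datetime import datetime, timezone

# Hypothetical sample; the real header row, delimiter, and quoting may differ.
SAMPLE_MESSAGES_CSV = (
    "conversationId,authorType,content,createdAt,labels\n"
    'abc-123,Human,Hello,2021-08-18T09:11:32.877Z,'
    '"[{""name"": ""greeting"", ""labelType"": ""IntentLabel""}]"\n'
)


def parse_created_at(value: str) -> datetime:
    """Parse a timestamp such as 2021-08-18T09:11:32.877Z as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def read_messages(text: str) -> list[dict]:
    """Read message rows, decoding createdAt and the labels JSON."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["createdAt"] = parse_created_at(row["createdAt"])
        # Assumption: labels is a JSON array; an empty cell means no labels.
        row["labels"] = json.loads(row["labels"]) if row["labels"] else []
        rows.append(row)
    return rows


messages = read_messages(SAMPLE_MESSAGES_CSV)
```

Keeping the timestamp parsing in its own function makes it easy to reuse for the createdAt column of the sessions file, which uses the same format.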

Sessions.csv

Field name | Type | Description
projectId | String | Project identifier (UUID) in SentiOne Automate
conversationId | String | Session id (ssuid from request)
author | String | Telephone number or other name received in the API request
sourceType | String | Source type (received in the API request), e.g. "chat-tester", "facebook" (optional)
createdAt | DateTime | Conversation start date in the format YYYY-MM-DD'T'HH:mm:ss.SSS'Z', e.g. 2021-08-18T09:11:32.877Z
metadata.* | JSON object | Conversation's metadata

ZIP

The two files are compressed into a single ZIP archive, which is sent to the service.

Service description

The ZIP file is sent to the configured address as an HTTP POST request with a multipart/form-data body (part name: "uploads").
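
To make the request format concrete, here is a rough sketch of how a receiving service could pull the "uploads" part out of the request body and read the archive. It uses only the Python standard library; a real service would more likely rely on its web framework's multipart parser. The function names, the boundary, the filename, the archive member names, and the UTF-8 encoding are all assumptions made for illustration.

```python
import io
import zipfile
from email import message_from_bytes


def extract_upload(content_type: str, body: bytes, part_name: str = "uploads") -> bytes:
    """Return the raw bytes of the named part from a multipart/form-data body."""
    # The email parser needs the Content-Type header to learn the boundary.
    header = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n"
    message = message_from_bytes(header + body)
    if not message.is_multipart():
        raise ValueError("body is not multipart")
    for part in message.get_payload():
        if part.get_param("name", header="content-disposition") == part_name:
            return part.get_payload(decode=True)
    raise ValueError(f"no part named {part_name!r}")


def read_archive(zip_bytes: bytes) -> dict[str, str]:
    """Map each archive member name to its decoded text (UTF-8 is assumed)."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {name: archive.read(name).decode("utf-8") for name in archive.namelist()}


def build_request(zip_bytes: bytes, boundary: str = "example-boundary") -> tuple[str, bytes]:
    """Build a request body like the one described above (for local testing only)."""
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="uploads"; filename="export.zip"\r\n'
        "Content-Type: application/zip\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return f"multipart/form-data; boundary={boundary}", head + zip_bytes + tail


# Demo with made-up contents; member names and the filename are assumptions.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("Messages.csv", "conversationId,authorType\nabc-123,Human\n")
    archive.writestr("Sessions.csv", "projectId,conversationId\np-1,abc-123\n")

content_type, body = build_request(buffer.getvalue())
files = read_archive(extract_upload(content_type, body))
```

The build_request helper exists only so the parsing code can be exercised locally without a running export job.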

Cron job configuration

Automate lets you configure so-called "Cron jobs": scheduled jobs that run periodically, each with its own configuration. Here is an example DataExport Cron job configuration:

chatbots.cron-orchestrator.cron-jobs += {
    name: AnalyticsExport,
    uuid: "5c7a518d-754a-46aa-b3ff-065016c22597", // job identifier (random GUID)
    interval: "0 0 2 ? * * *", // [CRON](https://en.wikipedia.org/wiki/Cron) format; in this example the job runs every day at 2 AM
    message-timeout: 5m, // timeout for running the export job (optional field)
    metadata: {
      extCompanyId: "123", // value of the company setting named "SentiOne company id for transcriptions" (System module, available only to system administrators)
      // service that will receive the data, more information in the next section
      url: "https://warehouse-service.yourcorp.org/data/receive"
    }
}
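
The interval value in the example has seven space-separated fields and uses "?", which suggests a Quartz-style cron expression with seconds first and an optional year last. That field order is an assumption here, not something this page states. Under that assumption, a small helper (the name describe_cron is hypothetical) can label the fields to make the schedule easier to read:

```python
# Field order assumed from the Quartz-style cron format (seconds first, optional year last).
QUARTZ_FIELDS = ("seconds", "minutes", "hours", "day_of_month", "month", "day_of_week", "year")


def describe_cron(expression: str) -> dict[str, str]:
    """Label each space-separated field of a 6- or 7-field cron expression."""
    values = expression.split()
    if len(values) not in (6, 7):
        raise ValueError(f"expected 6 or 7 fields, got {len(values)}")
    return dict(zip(QUARTZ_FIELDS, values))


fields = describe_cron("0 0 2 ? * * *")
```

With this reading, the hours field of the example is 2 and the minutes and seconds fields are 0; which time zone the scheduler applies is not specified here.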