General description

Storage provides a common interface for file operations (writing, reading, and removing) across various data sources.

It allows a system administrator to configure one of the supported data sources (see the list below) to store any type of file uploaded by a user during a conversation, as well as recordings of voice-based conversations.

Supported data sources

  • PostgreSQL (default)
  • Local File System
  • AWS S3

API

Default port: 5786.

Default endpoints are available for monitoring purposes. They are described in the Components monitoring section.

Databases

SQL

The component has its own SQL database, which stores metadata of the files uploaded to data sources via this application.

Main table:

file_entries - metadata of uploaded files

Communication

This service communicates only with the data sources defined in its configuration.

Storage is an internal application and is visible to the Admin, Analytics, and Gateway applications.

Security

Endpoints are secured with a time-based one-time password (OTP).
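
As an illustration of how a time-based one-time password is typically derived, here is a minimal sketch of the standard RFC 6238 algorithm. The parameters used here (HMAC-SHA1, 30-second period, 6 digits), the function name totp, and the shared secret are assumptions for illustration only; this page does not specify which parameters the service uses or how the code is passed to the endpoints.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at_time: Optional[float] = None,
         period: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 time-based one-time password (HMAC-SHA1 variant).

    All defaults are assumptions; the Storage service may use other values.
    """
    if at_time is None:
        at_time = time.time()
    # Number of whole periods elapsed since the Unix epoch.
    counter = int(at_time) // period
    # HMAC over the counter encoded as an 8-byte big-endian integer.
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte is the offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep the requested number of decimal digits, left-padded with zeros.
    return str(code % 10 ** digits).zfill(digits)


if __name__ == "__main__":
    # Hypothetical shared secret, for demonstration only.
    print(totp(b"example-shared-secret"))
```

Both sides must share the same secret and have reasonably synchronized clocks, since the code changes every period.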

Optionally, files can be encrypted during upload with the AES GCM algorithm (no padding), so that the stored file data is kept in encrypted form. Encryption is enabled by providing encryptionKey in the data source configuration.
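
A value for encryptionKey can be produced with a few lines of code. The sketch below generates 256 random bits and encodes them in base64, matching the "256 bits in base64 form" note in the configuration. It assumes the standard base64 alphabet (not the URL-safe variant); the function name generate_encryption_key is hypothetical.

```python
import base64
import secrets


def generate_encryption_key() -> str:
    """Return a random 256-bit key encoded with standard base64."""
    raw_key = secrets.token_bytes(32)  # 32 bytes = 256 bits
    return base64.b64encode(raw_key).decode("ascii")


if __name__ == "__main__":
    # Paste the printed value into the encryptionKey setting of a data source.
    print(generate_encryption_key())
```

The generated key should be stored securely, because files encrypted with it cannot be read without it.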

Configuration

Application's config

chatbots.storage {
  db {
    url: "jdbc:postgresql://DB_HOST:DB_PORT/chatbots-storage",
    user: "USER",
    password: "PASSWORD"
    max-connections: 20
  }
  
  upload-data-source-name: "InternalPostgres-Default"
  data-sources: [
    {
        name: "InternalPostgres-Default"
        type: "InternalPostgres"
        
        # Optional: 256-bit encryption key in base64 form
        #encryptionKey: ""
    },
    {
        name: "LocalFileSystem-Default"
        type: "LocalFileSystem"
        connection: {
            path: "/data"
        }
        
        # Optional: 256-bit encryption key in base64 form
        #encryptionKey: ""
    },
    {
        name: "AwsS3-Default"
        type: "AwsS3"
        # Note: bucket names must be globally unique across all of AWS
        bucket-name: "PUT_COMPANY_NAME_HERE-automate-storage-service-PUT_ENVIRONMENT_NAME_HERE"
        object-key-prefix: "recordings/"
        
        # The client-config section is optional. If it is not provided, the client looks in the default AWS config locations (~/.aws/credentials, ~/.aws/config, environment variables, system properties etc.)
        #client-config: {
        #    access-key-id: "PUT_ACCESS_KEY_ID_HERE"
        #    secret-access-key: "PUT_SECRET_ACCESS_KEY_HERE"
        #    # e.g. eu-north-1
        #    region: "PUT_AWS_REGION_HERE"
        #    # Optional parameter for changing the S3 endpoint
        #    #endpoint-override: "PUT_ENDPOINT_OVERRIDE_HERE"
        #}
        
        # Optional: 256-bit encryption key in base64 form
        #encryptionKey: ""
    }
  ]
}

Limits

The maximum allowed size of an uploaded file is 100 MB.

Data retention

Files associated with a session are removed automatically when the session is removed. Session data retention is configured via the analytics config.

Storage size requirements

The main use case for the storage service is to keep recordings of Voice conversations conducted on the Automate platform. The table below gives estimates of the storage space required for this.

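Format      | Storage required for 1 minute of recording | Estimated minutes of recordings daily | Estimated minutes of recordings monthly | Recommended storage (1 year retention)
MP3/192kbps | 1.44 MB                                    | 100                                   | 3100                                    | 54 GB
MP3/192kbps | 1.44 MB                                    | 1000                                  | 31000                                   | 540 GB
MP3/192kbps | 1.44 MB                                    | 10000                                 | 310000                                  | 5.4 TB
MP3/320kbps | 2.4 MB                                     | 100                                   | 3100                                    | 90 GB
MP3/320kbps | 2.4 MB                                     | 1000                                  | 31000                                   | 900 GB
MP3/320kbps | 2.4 MB                                     | 10000                                 | 310000                                  | 9 TB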

Example

Assumptions:

  • 3100 minutes of recordings per month
  • default 1 year retention
  • no recording is removed before retention date
  • recordings are encoded using MP3 (192kbps)

Calculations:

  • monthly storage requirement = 1.44 MB/min x 3100 min = 4464 MB ≈ 4.5 GB
  • yearly storage requirement = 4.5 GB x 12 months = 54 GB
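
The calculation above can be reproduced for other bitrates and volumes with a short script. This is a sketch under stated assumptions: constant-bitrate audio, decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes), and no overhead for file metadata. The function names are hypothetical and not part of the storage service.

```python
def bytes_per_minute(bitrate_kbps: int) -> int:
    """Size of one minute of constant-bitrate audio, in bytes."""
    # kbit/s -> bit/s -> byte/s -> byte/min
    return bitrate_kbps * 1000 // 8 * 60


def storage_bytes(bitrate_kbps: int, minutes_per_month: int,
                  retention_months: int = 12) -> int:
    """Storage needed to keep all recordings for the whole retention period."""
    return bytes_per_minute(bitrate_kbps) * minutes_per_month * retention_months


if __name__ == "__main__":
    monthly = storage_bytes(192, 3100, retention_months=1)
    yearly = storage_bytes(192, 3100)
    print(f"monthly: {monthly / 1e9:.2f} GB")
    print(f"yearly:  {yearly / 1e9:.2f} GB")
```

For MP3 at 192 kbps and 3100 minutes per month this gives about 4.46 GB per month and 53.57 GB per year, which the table rounds up to 54 GB.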