Archive data

LogScale supports archiving ingested logs to Amazon S3 and Google Cloud Storage. Google Cloud Storage is only available for self-hosted installations. The archived logs are then available for further processing in any external system that integrates with the archiving provider. The files written by LogScale are not searchable from within LogScale; they are an export meant for other systems to consume.
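
Because the archived files are meant for external consumption, a downstream job can read them directly from the bucket. The following is a minimal, hypothetical sketch assuming Amazon S3, the boto3 client, gzip-compressed files, and one event per line; the bucket name and key prefix are placeholders, not values LogScale defines.

    import gzip
    import boto3

    s3 = boto3.client("s3")
    BUCKET = "example-logscale-archive"  # hypothetical bucket name
    PREFIX = "my-repository/"            # hypothetical key prefix

    # Page through every archived object under the prefix and decompress it.
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
            events = gzip.decompress(body).decode("utf-8")  # assumes gzip format
            print(obj["Key"], len(events.splitlines()), "events")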

When archiving is enabled, all existing events in the repository are backfilled to the archiving platform. After that, new events are archived by a periodic job that runs on every LogScale node and looks for new, unarchived segment files. The segment files are read from disk, streamed to a bucket on the archiving provider's platform, and marked as archived in LogScale.
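
As an illustration only, and not LogScale's actual implementation, the periodic job can be thought of as the following loop; all names here are hypothetical.

    import time
    from dataclasses import dataclass

    @dataclass
    class Segment:
        path: str
        archived: bool = False

    def run_archiving_job(list_completed_segments, upload, poll_seconds=60):
        # Periodically scan for completed-but-unarchived segment files,
        # stream each one to the bucket, then mark it as archived.
        while True:
            for seg in list_completed_segments():
                if not seg.archived:
                    with open(seg.path, "rb") as f:
                        upload(seg.path, f)  # stream the file to the bucket
                    seg.archived = True      # record as archived in LogScale
            time.sleep(poll_seconds)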

An administrator must configure LogScale for the cloud provider and set up archiving per repository. The supported cloud providers are Amazon S3 and Google Cloud Storage.

After configuring the cloud provider and selecting a repository in LogScale, the archiving configuration page is available under Settings.

Important

Until archiving has been configured at the cluster level, no archiving can be configured at the org level within the LogScale user interface.

Note

For slow-moving datasources, it can take some time before segment files are completed on disk and made available to the archiving job. In the worst case, a segment file is not completed until it contains a gigabyte of uncompressed data or 30 minutes have passed, whichever comes first. The exact thresholds are those configured as the limits on mini segments.
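
For intuition, the worst-case completion rule described above can be expressed as a simple predicate. The one-gigabyte and 30-minute figures are taken from this note; the real thresholds come from the cluster's mini segment limits.

    MAX_UNCOMPRESSED_BYTES = 1_000_000_000  # ~1 GB, per the note above
    MAX_AGE_SECONDS = 30 * 60               # 30 minutes

    def segment_is_complete(uncompressed_bytes: int, age_seconds: float) -> bool:
        # A segment file completes once either limit is reached,
        # whichever comes first.
        return (uncompressed_bytes >= MAX_UNCOMPRESSED_BYTES
                or age_seconds >= MAX_AGE_SECONDS)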

For more information on segment files and datasources, see Segment Files and Datasources.