Log collection from ELS API for given time block can become too large #9

Closed
hartfordfive opened this issue Nov 18, 2016 · 1 comment

Comments

@hartfordfive
Owner

Although the ELS API currently does allow a count of items to be specified along with a timestamp start and end range, it does not return any header indicating how many log items in total fall within the given time range. Due to this, a large number of logs may be downloaded, which can become quite heavy for in-memory processing.

As a solution, the logs should initially be saved to a gzip file and then read back from this file in smaller chunks.
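A minimal sketch of that approach, assuming the response has already been written to a local gzip file; the file name and the per-line handler below are illustrative placeholders, not part of the current code:

```go
package main

import (
	"bufio"
	"compress/gzip"
	"log"
	"os"
)

// readGzipInChunks streams a gzipped log file line by line instead of
// holding the whole API response in memory at once.
func readGzipInChunks(path string, handle func(line []byte) error) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	gz, err := gzip.NewReader(f)
	if err != nil {
		return err
	}
	defer gz.Close()

	scanner := bufio.NewScanner(gz)
	// Allow individual log entries larger than the default 64KB token limit.
	scanner.Buffer(make([]byte, 0, 64*1024), 4*1024*1024)
	for scanner.Scan() {
		if err := handle(scanner.Bytes()); err != nil {
			return err
		}
	}
	return scanner.Err()
}

func main() {
	err := readGzipInChunks("els_logs_example.json.gz", func(line []byte) error {
		// Each line would be unmarshalled and published as an event here.
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```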

@hartfordfive hartfordfive added this to the beta-0.2.0 milestone Nov 18, 2016
@hartfordfive
Owner Author

The previously proposed solution was applied, although it still wasn't effective enough in terms of total download/processing time.

Description of the Issue:

Current log file download times can exceed 5 minutes and processing time can go up to 10 minutes.
Each ticker iteration currently defaults to 30 minutes, but unfortunately the ticker for the next iteration doesn't start until the current processing is completed, so in this case the effective cycle is (30 + 5 + 10) = 45 minutes.
Since each cycle runs roughly 15 minutes longer than the 30 minutes of logs it covers, the backlog grows with every iteration; over the course of a day you can easily end up with a 4 to 5 hour delay in processing time instead of a more reasonable 30 minutes. These delays will only grow as the traffic increases and the log files increase in size.

Proposed Solution:

Once the beat's ticker period has elapsed, two functions would be called asynchronously to perform the following (a rough sketch follows the list):

  1. Download ELS Log File: This function creates X goroutines (number yet to be determined, maybe a pool), each of which downloads a log file part (sequentially or in parallel) and places it on the log_files_ready channel once completed.
    • If 2 minute segments, then 15 files total for 30 minutes
    • If 5 minute segments, then 6 files total for 30 minutes
  2. Process/Publish Individual Log Entries: Have another function (also via a goroutine) process these files asynchronously from the log_files_ready channel as they become ready.
    • X goroutines (number yet to be determined, maybe a pool) are created so that each can open a log file and then send off its processed events via PublishEvent
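
A rough sketch of that pipeline, assuming 2-minute segments and a fixed worker count; downloadSegment and publishFile are hypothetical stand-ins for the real ELS download and PublishEvent calls:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// downloadSegment is a hypothetical stand-in for fetching one time segment
// from the ELS API and writing it to a local gzip file, returning its path.
func downloadSegment(start, end time.Time) (string, error) {
	return fmt.Sprintf("els_%d_%d.json.gz", start.Unix(), end.Unix()), nil
}

// publishFile is a hypothetical stand-in for reading a downloaded file and
// sending each of its entries to the beat output via PublishEvent.
func publishFile(path string) error {
	fmt.Println("publishing", path)
	return nil
}

func collect(periodStart time.Time, segment time.Duration, workers int) {
	logFilesReady := make(chan string)

	// 1. Download goroutines: split the 30 minute period into segments and
	//    push each completed file onto the log_files_ready channel.
	var dl sync.WaitGroup
	segments := int((30 * time.Minute) / segment)
	sem := make(chan struct{}, workers) // simple pool limiter
	for i := 0; i < segments; i++ {
		dl.Add(1)
		go func(i int) {
			defer dl.Done()
			sem <- struct{}{}
			defer func() { <-sem }()
			start := periodStart.Add(time.Duration(i) * segment)
			path, err := downloadSegment(start, start.Add(segment))
			if err != nil {
				fmt.Println("download failed:", err)
				return
			}
			logFilesReady <- path
		}(i)
	}

	// Close the channel once all downloads have finished.
	go func() {
		dl.Wait()
		close(logFilesReady)
	}()

	// 2. Processing goroutines: consume files as they become ready and
	//    publish their entries.
	var proc sync.WaitGroup
	for w := 0; w < workers; w++ {
		proc.Add(1)
		go func() {
			defer proc.Done()
			for path := range logFilesReady {
				if err := publishFile(path); err != nil {
					fmt.Println("publish failed:", err)
				}
			}
		}()
	}
	proc.Wait()
}

func main() {
	collect(time.Now().Add(-30*time.Minute), 2*time.Minute, 4)
}
```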

This may not be the absolute best solution, but it should be more effective than the current one. If more optimizations need to be done later, I'll deal with it then.
