[Reporting] Improvements needed for screenshot capture flow #59396

tsullivan · 2020-03-05T00:45:57Z

The Reporting plugin has the ability to take a URL to a Kibana page, and a few fields of metadata, and return an observable that resolves to a screenshot of the page, in the form of buffer data.

To execute this task, several things happen within the pipeline of the observable:

1. openUrl: Open a browser page to the URL using the Puppeteer library
   - Authentication is provided via Basic authentication in the request headers. So that the authentication headers are not revealed outside of the server origin, Reporting intercepts each request made by the browser to check whether authentication is required for that request.
   - Reporting knows it has found the expected page by waiting for the .application selector to resolve.
2. skipTelemetry(): Attempt to have the browser instance avoid Telemetry calls by injecting data into localStorage indicating that telemetry does not need to be sent [1]
3. getNumberOfItems(): Read data from the DOM of the Kibana page to get a count of how many visualization items are on the page
   - It needs that info to know how many visualizations should "check in" with custom events to signal to Reporting that they are done rendering.
4. driver.setViewport(): Set the browser viewport and "zoom" based on the metadata
5. waitForElementsToBeInDOM: Wait for all visualization elements to be in the DOM
6. injectCss(): Inject custom CSS that hacks the Kibana page into a Report style
   - hides the navigation and page menus
   - hides interactive elements such as buttons and filter inputs
7. layout.positionElements(): Conditional step: if the visualization elements need to be re-arranged for "print mode," do it here
8. waitForRenderComplete(): Wait for all visualization elements to be done rendering
9. getElementPositionAndAttributes: Find (a) the x/y position and (b) the title, read from the DOM, of each visualization that will be captured on the page
   - Note: the element position / data can be a set of multiple items for "print" mode of a dashboard
10. getTimeRange: Read the time range from the filter bar DOM so it can be added to the screenshot title [2]
11. getScreenshots: Capture screenshots of each visualization on the page using the element position info, and return them along with the date range and title info

[1] This is a problematic arrangement because it forces Reporting to be aware of Telemetry and how to talk to it by manipulating Telemetry data in localStorage - also, it doesn't work properly.
[2] Steps 9 and 10 happen in parallel
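
The header-scoping rule described under openUrl can be sketched roughly as below. This is a minimal illustration, not Reporting's actual code: `headersForRequest`, `HeaderMap`, `serverOrigin`, and `authHeaders` are hypothetical names, and the only assumption is that "outside of the server origin" means a different scheme, host, or port.

```typescript
// Hypothetical helper (not Reporting's API): decide which headers an
// intercepted browser request should carry.
type HeaderMap = Record<string, string>;

function headersForRequest(
  requestUrl: string,
  serverOrigin: string,
  originalHeaders: HeaderMap,
  authHeaders: HeaderMap
): HeaderMap {
  // Compare parsed origins (scheme + host + port) rather than string prefixes,
  // so a look-alike host such as "http://localhost:5601.example.com" does not match.
  const sameOrigin = new URL(requestUrl).origin === new URL(serverOrigin).origin;

  // Only same-origin requests get the authentication headers added.
  return sameOrigin ? { ...originalHeaders, ...authHeaders } : { ...originalHeaders };
}
```

With Puppeteer, a helper like this would presumably be called from a `page.on('request', ...)` handler after `page.setRequestInterception(true)`, passing the result to `request.continue({ headers })`.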

Once the buffer data is available to the higher-level code that runs the capture flow, it is converted to base64, and the Reporting job document is updated with the base64 data along with the "status" and "completed_at" fields.
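
As a rough sketch of that last step: the `JobDocument` shape, the `output.content` field, and the `"completed"` status value below are assumptions made for illustration. Only the "status" and "completed_at" field names come from the description above.

```typescript
import { Buffer } from "buffer";

// Hypothetical job document shape; only "status" and "completed_at"
// are named in this issue, everything else is assumed.
interface JobDocument {
  status: string;
  completed_at?: string;
  output?: { content: string };
}

// Returns an updated copy of the job document holding the screenshot as base64.
function completeJob(job: JobDocument, screenshot: Buffer, now: Date = new Date()): JobDocument {
  return {
    ...job,
    status: "completed", // assumed status value
    completed_at: now.toISOString(),
    output: { content: screenshot.toString("base64") },
  };
}
```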

Between the first and last step, there are parts that do things to the page to prepare it for reports, and parts that check the progress of the preparations by waiting for CSS selectors.
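
One way to picture this split is to tag each step in the list above by what it does to the page and to group the steps that may run together. The tags below are my reading of the step descriptions, not something stated in the code, and `runStages` is a hypothetical runner rather than the actual observable pipeline.

```typescript
// "prepare" changes the page, "wait" blocks on a selector or event,
// "read" pulls data out of the DOM. The tags are an interpretation of the list.
type StepKind = "open" | "prepare" | "wait" | "read" | "capture";
interface Step { name: string; kind: StepKind }

// Steps in the same inner array may run concurrently (see footnote [2]).
const CAPTURE_STAGES: Step[][] = [
  [{ name: "openUrl", kind: "open" }],
  [{ name: "skipTelemetry", kind: "prepare" }],
  [{ name: "getNumberOfItems", kind: "read" }],
  [{ name: "setViewport", kind: "prepare" }],
  [{ name: "waitForElementsToBeInDOM", kind: "wait" }],
  [{ name: "injectCss", kind: "prepare" }],
  [{ name: "positionElements", kind: "prepare" }],
  [{ name: "waitForRenderComplete", kind: "wait" }],
  [
    { name: "getElementPositionAndAttributes", kind: "read" },
    { name: "getTimeRange", kind: "read" },
  ],
  [{ name: "getScreenshots", kind: "capture" }],
];

// Runs stages in order; steps inside one stage are started together.
async function runStages(
  stages: Step[][],
  runStep: (step: Step) => Promise<void>
): Promise<void> {
  for (const stage of stages) {
    await Promise.all(stage.map((step) => runStep(step)));
  }
}

// Lists the step names with a given tag, in pipeline order.
function stepsOfKind(stages: Step[][], kind: StepKind): string[] {
  return stages
    .reduce<Step[]>((all, stage) => all.concat(stage), [])
    .filter((step) => step.kind === kind)
    .map((step) => step.name);
}
```

Laid out this way, the "wait" steps are the ones where time passes without Reporting doing any work, which is where benchmarking would be expected to show most of the cost.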

Plan to improve performance.

We need a plan to improve performance in Reporting for these steps.

Benchmark the current state of performance
We can define a "typical" dashboard as one of the sample data dashboards. We'll run many tests of generating a PNG report while capturing verbose logs with timestamps. Based on that information, we can find the average time taken by each step in the pipeline.

Benchmarks will help us focus on fixing the parts where the most time gets wasted, and give us something to compare against after the improvements are done.
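
A minimal sketch of that averaging, assuming the verbose logs can be reduced to one record per step per run. The `StepTiming` shape and both function names are hypothetical.

```typescript
// Hypothetical record parsed from the timestamped logs: one per step per run.
interface StepTiming { run: number; step: string; startMs: number; endMs: number }

// Average duration in milliseconds for each step across all runs.
function averageDurationByStep(timings: StepTiming[]): Map<string, number> {
  const totals = new Map<string, { sum: number; count: number }>();
  for (const t of timings) {
    const entry = totals.get(t.step) ?? { sum: 0, count: 0 };
    entry.sum += t.endMs - t.startMs;
    entry.count += 1;
    totals.set(t.step, entry);
  }
  const averages = new Map<string, number>();
  totals.forEach((value, step) => averages.set(step, value.sum / value.count));
  return averages;
}

// Sorts steps so the ones taking the most time on average come first.
function slowestFirst(averages: Map<string, number>): Array<[string, number]> {
  return Array.from(averages.entries()).sort((a, b) => b[1] - a[1]);
}
```

Running the same functions over logs captured after a change gives the before/after comparison.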

Tweak the pipeline
Based on the results of the benchmarking, we can make small changes to the pipeline to chase some improvements in the timing. Presumably, running more steps in parallel could be part of this.

Add a "Screenshot Mode Service" to Kibana core
There are steps in the pipeline that can be moved away to other parts of Kibana that "prepare" the page for reports. These steps wouldn't be needed if the preparations happened automatically when Reporting opens the page. That would be a huge performance boost for a few reasons:

- No wasted rendering that gets replaced with the Reporting styles
- No "interaction" components loading or fetching data in the background, which consumes resources

Screenshot Mode Service RFC: #59084

This issue will be updated with some benchmark results.


tsullivan · 2020-03-12T20:45:39Z

Here's a starting data point showing a trace of the different methods that get called when capturing a screenshot. This data is shown in APM, being integrated in this PR: [eCommerce] Revenue Dashboard. In this test, 5 PNGs were generated. According to this data, all of the transactions took between 24 and 28 seconds to complete.

Here's some data about the timing of generating a PNG report. Almost the entire time is spent generating the screenshots:

Drilling into the transaction of just the screenshot pipeline, we can see most of the time is waiting for the URL to open and waiting for the Kibana visualizations to complete rendering:

tsullivan · 2020-03-12T20:59:36Z

Here's a data point of the timings of generating a 2-page PDF of the Canvas eCommerce sample data workpad.

Here is the overall timing of generating the PDF. Unlike the PNG report where the entire generation time is capturing screenshots, there is about 1 extra second of processing time to convert the PNG buffers to a multi-page PDF:

Here is a drill-in of the screenshots pipeline. You can see that 2 pages were opened, and they each took about the same amount of time to render:

elasticmachine · 2020-03-12T21:02:22Z

Pinging @elastic/kibana-reporting-services (Team:Reporting Services)

tsullivan · 2020-03-12T21:32:56Z

This issue description talks a lot about what is happening inside the screenshot pipeline, but we need more info about what happens in PDF generation after the screenshots are captured.

Here is an explanation:

1. The screenshots observable returns PNG buffer data (Screenshot[] type) to generate_pdf
2. Each PNG buffer is converted to base64 to pass into the PdfMaker library
3. We tell pdfMaker to generate the PDF buffer
4. We convert the PDF buffer to base64 to model it for the Elasticsearch document
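
Those conversions can be sketched as follows. The `Screenshot` and `PdfMakerLike` interfaces and their method names are stand-ins invented for this example; the issue names a Screenshot[] type and a PdfMaker library but does not describe their shapes.

```typescript
import { Buffer } from "buffer";

// Assumed shapes, for illustration only.
interface Screenshot { title?: string; data: Buffer }
interface PdfMakerLike {
  addImage(base64Png: string, title?: string): void;
  generate(): Buffer;
}

// Takes the PNG buffers from the screenshots pipeline and returns the
// finished PDF as a base64 string, ready to store in the job document.
function generatePdfBase64(screenshots: Screenshot[], pdf: PdfMakerLike): string {
  for (const shot of screenshots) {
    // Each PNG buffer is handed to the PDF library as base64.
    pdf.addImage(shot.data.toString("base64"), shot.title);
  }
  // The library produces the PDF as a buffer...
  const pdfBuffer = pdf.generate();
  // ...which is converted to base64 once more for storage.
  return pdfBuffer.toString("base64");
}
```

Note that every image is base64-encoded on the way in and the whole PDF is encoded again on the way out, which may be part of the extra processing time seen for PDF reports.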

tsullivan · 2020-05-08T20:07:31Z

Closing via:

tsullivan added the WIP Work in progress label Mar 5, 2020

This was referenced Mar 5, 2020

[rfc][skip-ci] Screenshot Mode Service #59084

Closed

[Reporting] APM integration for baseline performance measurements #59967

Merged

tsullivan added Team:Reporting Services and removed WIP Work in progress labels Mar 12, 2020

tsullivan closed this as completed May 8, 2020

sophiec20 added Feature:Reporting:Framework Reporting issues pertaining to the overall framework and removed (Deprecated) Team:Reporting Services labels Aug 21, 2024

botelastic bot added the needs-team Issues missing a team label label Aug 21, 2024
