feat: implement query profiling #542

Merged: 45 commits, Aug 7, 2024

Changes from 35 commits

Commits (45)
7f46c52
added basics for query profiling
daniel-sanche May 6, 2024
e66c57f
changes dataclass ordering
daniel-sanche May 6, 2024
1f1e8e1
refactored
daniel-sanche May 6, 2024
1af61d1
added exceptions
daniel-sanche May 6, 2024
ac11cd2
refactoring of dataclasses
daniel-sanche May 6, 2024
730411b
outlines of unit tests
daniel-sanche May 6, 2024
5823dbc
got system tests passing
daniel-sanche May 10, 2024
aa9f334
implemented some unit tests
daniel-sanche May 10, 2024
5cdba98
got query tests passing
daniel-sanche May 11, 2024
0f971c6
added profiling to aggregation query
daniel-sanche May 11, 2024
387b01b
added kwargs to client.aggregation_query
daniel-sanche May 20, 2024
ddec61d
partially implemented aggregation query system tests
daniel-sanche May 20, 2024
cca1f00
removed outdated test
daniel-sanche May 20, 2024
cd97827
added last agg system test
daniel-sanche May 20, 2024
96d6d52
fixed lint
daniel-sanche May 20, 2024
3cf9d8a
fixed system tests
daniel-sanche May 20, 2024
4c24b84
added in transaction tests
daniel-sanche May 21, 2024
73ad241
fixed system tests
daniel-sanche May 21, 2024
686e722
moved dataclasses into new file
daniel-sanche May 21, 2024
4ebf0de
improved docstrings
daniel-sanche May 21, 2024
f2de107
aggregation uses nested query explain options
daniel-sanche May 21, 2024
0abea2f
fixed parsing bug
daniel-sanche May 21, 2024
d4f4475
added samples
daniel-sanche May 21, 2024
fc6187e
moved query_profile model tests to new file
daniel-sanche May 21, 2024
eced67d
better handle empty debug stats
daniel-sanche May 21, 2024
730e7cb
fixed test
daniel-sanche May 21, 2024
ab8d7a2
fixed lint
daniel-sanche May 21, 2024
e02b20d
fixed samples lint
daniel-sanche May 21, 2024
02afe92
added missing test cases
daniel-sanche May 21, 2024
5dcf063
added back removed test
daniel-sanche May 21, 2024
5cf1f88
fixed lint
daniel-sanche May 21, 2024
51ffa9e
added test
daniel-sanche May 21, 2024
c6df706
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] May 21, 2024
7d2e597
more lenient in system test
daniel-sanche May 21, 2024
e9e72b6
removed import
daniel-sanche May 21, 2024
083e604
added debug stats checks
daniel-sanche May 24, 2024
58c3c57
Apply suggestions from code review
daniel-sanche Jun 21, 2024
c77f299
Merge branch 'main' into query_profiling
daniel-sanche Jul 8, 2024
7d802e3
Merge branch 'main' into query_profiling
daniel-sanche Jul 29, 2024
5a644ab
Merge branch 'main' into query_profiling
daniel-sanche Aug 2, 2024
b2a9a77
🦉 Updates from OwlBot post-processor
gcf-owl-bot[bot] Aug 2, 2024
e67ca72
fixed new_query calculation
daniel-sanche Aug 2, 2024
4319ac7
added comment
daniel-sanche Aug 7, 2024
175c76a
Merge branch 'main' into query_profiling
daniel-sanche Aug 7, 2024
c5d8a76
fixed offset calculation
daniel-sanche Aug 7, 2024
12 changes: 11 additions & 1 deletion google/cloud/datastore/__init__.py
@@ -61,6 +61,16 @@
from google.cloud.datastore.entity import Entity
from google.cloud.datastore.key import Key
from google.cloud.datastore.query import Query
from google.cloud.datastore.query_profile import ExplainOptions
from google.cloud.datastore.transaction import Transaction

__all__ = ["__version__", "Batch", "Client", "Entity", "Key", "Query", "Transaction"]
__all__ = [
"__version__",
"Batch",
"Client",
"Entity",
"Key",
"Query",
"ExplainOptions",
"Transaction",
]
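
Taken together with the docstrings later in this diff, the new public surface can be exercised roughly as follows. This is a minimal, hedged sketch: the kind name "Task" and the client setup are illustrative, not part of the PR.

    from google.cloud import datastore
    from google.cloud.datastore import ExplainOptions

    client = datastore.Client()  # assumes ambient credentials/project

    # Ask for profiling with execution statistics on a regular query.
    query = client.query(kind="Task", explain_options=ExplainOptions(analyze=True))
    iterator = query.fetch()
    results = list(iterator)  # run the query so the metrics are populated

    # Per this PR, explain_metrics lives on the iterator returned by fetch().
    metrics = iterator.explain_metrics
    print(metrics.plan_summary)
    print(metrics.execution_stats)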
98 changes: 66 additions & 32 deletions google/cloud/datastore/aggregation.py
@@ -23,15 +23,11 @@
from google.cloud.datastore import helpers
from google.cloud.datastore.query import _pb_from_query

from google.cloud.datastore.query_profile import ExplainMetrics
from google.cloud.datastore.query_profile import QueryExplainError

_NOT_FINISHED = query_pb2.QueryResultBatch.MoreResultsType.NOT_FINISHED
_NO_MORE_RESULTS = query_pb2.QueryResultBatch.MoreResultsType.NO_MORE_RESULTS

_FINISHED = (
_NO_MORE_RESULTS,
query_pb2.QueryResultBatch.MoreResultsType.MORE_RESULTS_AFTER_LIMIT,
query_pb2.QueryResultBatch.MoreResultsType.MORE_RESULTS_AFTER_CURSOR,
)
from google.cloud.datastore.query import _NOT_FINISHED
from google.cloud.datastore.query import _FINISHED


class BaseAggregation(ABC):
@@ -159,16 +155,25 @@ class AggregationQuery(object):

:type query: :class:`google.cloud.datastore.query.Query`
:param query: The query used for aggregations.

:type explain_options: :class:`~google.cloud.datastore.ExplainOptions`
:param explain_options: (Optional) Options to enable query profiling for
this query. When set, explain_metrics will be available on the iterator
returned by query.fetch().
If not passed, will use value from given query.
"""

def __init__(
self,
client,
query,
explain_options=None,
):
self._client = client
self._nested_query = query
self._aggregations = []
# fallback to query._explain_options if not set
self._explain_options = explain_options or query._explain_options
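
Note the fallback above: an AggregationQuery inherits explain_options from its nested query unless it is given its own. A hedged sketch of both paths (the client setup and kind are illustrative):

    from google.cloud import datastore
    from google.cloud.datastore import ExplainOptions

    client = datastore.Client()
    base = client.query(kind="Task", explain_options=ExplainOptions(analyze=True))

    # No explicit options: falls back to base._explain_options, so
    # profiling stays enabled for the aggregation too.
    agg = client.aggregation_query(base)

    # Explicit options take precedence over the nested query's options.
    agg = client.aggregation_query(base, explain_options=ExplainOptions(analyze=False))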

@property
def project(self):
Expand Down Expand Up @@ -391,6 +396,7 @@ def __init__(
self._read_time = read_time
self._limit = limit
# The attributes below will change over the life of the iterator.
self._explain_metrics = None
self._more_results = True

def _build_protobuf(self):
@@ -441,7 +447,6 @@ def _next_page(self):
if not self._more_results:
return None

query_pb = self._build_protobuf()
transaction_id, new_transaction_options = helpers.get_transaction_options(
self.client.current_transaction
)
@@ -466,38 +471,67 @@ def _next_page(self):
"project_id": self._aggregation_query.project,
"partition_id": partition_id,
"read_options": read_options,
"aggregation_query": query_pb,
"aggregation_query": self._build_protobuf(),
Contributor:
Just for my understanding, why is this changing?

Contributor Author:
There was a lot of duplicated code in the original implementation (see the duplicated request dictionaries).

I did some refactoring of these methods so that building the request and calling run_aggregation_query happens in only one place. I also ended up removing the query_pb variable, since I found it a little confusing to track the state between query_pb, old_query_pb, and request["query"].

}
if self._aggregation_query._explain_options:
request[
"explain_options"
] = self._aggregation_query._explain_options._to_dict()
helpers.set_database_id_to_request(request, self.client.database)
response_pb = self.client._datastore_api.run_aggregation_query(
request=request,
**kwargs,
)

while response_pb.batch.more_results == _NOT_FINISHED:
# We haven't finished processing. A likely reason is we haven't
# skipped all of the results yet. Don't return any results.
# Instead, rerun query, adjusting offsets. Datastore doesn't process
# more than 1000 skipped results in a query.
old_query_pb = query_pb
query_pb = query_pb2.AggregationQuery()
query_pb._pb.CopyFrom(old_query_pb._pb) # copy for testability

request = {
"project_id": self._aggregation_query.project,
"partition_id": partition_id,
"read_options": read_options,
"aggregation_query": query_pb,
}
helpers.set_database_id_to_request(request, self.client.database)
response_pb = None

while response_pb is None or response_pb.batch.more_results == _NOT_FINISHED:
if response_pb is not None:
# We haven't finished processing. A likely reason is we haven't
# skipped all of the results yet. Don't return any results.
# Instead, rerun query, adjusting offsets. Datastore doesn't process
# more than 1000 skipped results in a query.
new_query_pb = query_pb2.AggregationQuery()
new_query_pb._pb.CopyFrom(
request["aggregation_query"]._pb
) # copy for testability
request["aggregation_query"] = new_query_pb

response_pb = self.client._datastore_api.run_aggregation_query(
request=request,
**kwargs,
request=request.copy(), **kwargs
)
# capture explain metrics if present in response
# should only be present in last response, and only if explain_options was set
if response_pb.explain_metrics:
self._explain_metrics = ExplainMetrics._from_pb(
response_pb.explain_metrics
)

item_pbs = self._process_query_results(response_pb)
return page_iterator.Page(self, item_pbs, self.item_to_value)
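
The response_pb is None sentinel above turns the loop into a "do while": the body always runs at least once, and the offset-adjustment branch fires only on retries. A stripped-down, self-contained sketch of the shape; every name here is an illustrative stand-in, not code from this PR:

    def run_query_loop(send, request, is_finished):
        response = None
        # Python has no do-while; "response is None" forces a first pass.
        while response is None or not is_finished(response):
            if response is not None:
                # Retry path only: adjust the request before re-sending,
                # like the offset/cursor rewrites in _next_page above.
                request = dict(request, attempt=request.get("attempt", 0) + 1)
            response = send(request)
        return response

    # A fake server that finishes on its third response:
    replies = iter([{"done": False}, {"done": False}, {"done": True}])
    result = run_query_loop(lambda r: next(replies), {}, lambda r: r["done"])
    assert result["done"]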

@property
def explain_metrics(self) -> ExplainMetrics:
"""
Get the metrics associated with the query execution.
Metrics are only available when explain_options is set on the query. If
ExplainOptions.analyze is False, only plan_summary is available. If it is
True, execution_stats is also available.

:rtype: :class:`~google.cloud.datastore.query_profile.ExplainMetrics`
:returns: The metrics associated with the query execution.
:raises: :class:`~google.cloud.datastore.query_profile.QueryExplainError`
if explain_metrics is not available on the query.
"""
if self._explain_metrics is not None:
return self._explain_metrics
elif self._aggregation_query._explain_options is None:
raise QueryExplainError("explain_options not set on query.")
elif self._aggregation_query._explain_options.analyze is False:
# we need to run the query to get the explain_metrics
self._next_page()
Contributor Author (daniel-sanche, Aug 2, 2024):
Is this safe? (Are the results being returned later, or thrown out?)

Contributor Author:
We don't have any results when analyze=False, so this is safe. Added a comment.

if self._explain_metrics is not None:
return self._explain_metrics
raise QueryExplainError(
"explain_metrics not available until query is complete."
)
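
From the caller's side, the property above behaves roughly as follows. A hedged sketch; the setup is illustrative rather than taken from the PR, and it assumes count() returns the aggregation query for chaining:

    from google.cloud import datastore
    from google.cloud.datastore import ExplainOptions
    from google.cloud.datastore.query_profile import QueryExplainError

    client = datastore.Client()
    query = client.query(kind="Task")

    # Without explain_options, the property raises immediately.
    it = client.aggregation_query(query).count().fetch()
    try:
        it.explain_metrics
    except QueryExplainError:
        pass  # "explain_options not set on query."

    # With analyze=False the server returns a plan but no result rows, so
    # accessing the property may run the (plan-only) query itself.
    it = client.aggregation_query(
        query, explain_options=ExplainOptions(analyze=False)
    ).count().fetch()
    print(it.explain_metrics.plan_summary)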


# pylint: disable=unused-argument
def _item_to_aggregation_result(iterator, pb):
4 changes: 2 additions & 2 deletions google/cloud/datastore/client.py
@@ -875,7 +875,7 @@ def do_something_with(entity):
kwargs["namespace"] = self.namespace
return Query(self, **kwargs)

def aggregation_query(self, query):
def aggregation_query(self, query, **kwargs):
"""Proxy to :class:`google.cloud.datastore.aggregation.AggregationQuery`.
Using aggregation_query to count over a query:
@@ -953,7 +953,7 @@ def do_something_with(entity):
:rtype: :class:`~google.cloud.datastore.aggregation.AggregationQuery`
:returns: An AggregationQuery object.
"""
return AggregationQuery(self, query)
return AggregationQuery(self, query, **kwargs)
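
With **kwargs forwarded, profiling options can now be supplied through the client helper as well. A hedged sketch (the kind and the count alias are illustrative):

    from google.cloud import datastore
    from google.cloud.datastore import ExplainOptions

    client = datastore.Client()
    query = client.query(kind="Task")

    # kwargs are forwarded to AggregationQuery, so explain_options works here.
    agg = client.aggregation_query(query, explain_options=ExplainOptions(analyze=True))
    agg.count(alias="total")

    iterator = agg.fetch()
    results = list(iterator)  # consume so the metrics are populated
    print(iterator.explain_metrics.execution_stats)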

def reserve_ids_sequential(self, complete_key, num_ids, retry=None, timeout=None):
"""Reserve a list of IDs sequentially from a complete key.
92 changes: 62 additions & 30 deletions google/cloud/datastore/query.py
@@ -13,20 +13,21 @@
# limitations under the License.

"""Create / interact with Google Cloud Datastore queries."""

import base64
import warnings


from google.api_core import page_iterator
from google.cloud._helpers import _ensure_tuple_or_list


from google.cloud.datastore_v1.types import entity as entity_pb2
from google.cloud.datastore_v1.types import query as query_pb2
from google.cloud.datastore import helpers
from google.cloud.datastore.key import Key


from google.cloud.datastore.query_profile import ExplainMetrics
from google.cloud.datastore.query_profile import QueryExplainError

import abc
from abc import ABC

@@ -38,6 +39,7 @@
_NO_MORE_RESULTS,
query_pb2.QueryResultBatch.MoreResultsType.MORE_RESULTS_AFTER_LIMIT,
query_pb2.QueryResultBatch.MoreResultsType.MORE_RESULTS_AFTER_CURSOR,
query_pb2.QueryResultBatch.MoreResultsType.MORE_RESULTS_TYPE_UNSPECIFIED, # received when explain_options(analyze=False)
)

KEY_PROPERTY_NAME = "__key__"
@@ -176,6 +178,11 @@ class Query(object):
:type distinct_on: sequence of string
:param distinct_on: field names used to group query results.

:type explain_options: :class:`~google.cloud.datastore.ExplainOptions`
:param explain_options: (Optional) Options to enable query profiling for
this query. When set, explain_metrics will be available on the iterator
returned by query.fetch().

:raises: ValueError if ``project`` is not passed and no implicit
default is set.
"""
@@ -203,6 +210,7 @@ def __init__(
projection=(),
order=(),
distinct_on=(),
explain_options=None,
):
self._client = client
self._kind = kind
Expand All @@ -221,6 +229,7 @@ def __init__(
else:
self._namespace = None

self._explain_options = explain_options
self._ancestor = ancestor
self._filters = []

@@ -704,6 +713,7 @@ def __init__(
self._timeout = timeout
self._read_time = read_time
# The attributes below will change over the life of the iterator.
self._explain_metrics = None
self._more_results = True
self._skipped_results = 0

@@ -777,7 +787,6 @@ def _next_page(self):
if not self._more_results:
return None

query_pb = self._build_protobuf()
new_transaction_options = None
transaction_id, new_transaction_options = helpers.get_transaction_options(
self.client.current_transaction
@@ -804,46 +813,69 @@
"project_id": self._query.project,
"partition_id": partition_id,
"read_options": read_options,
"query": query_pb,
"query": self._build_protobuf(),
}
if self._query._explain_options:
request["explain_options"] = self._query._explain_options._to_dict()

helpers.set_database_id_to_request(request, self.client.database)

response_pb = self.client._datastore_api.run_query(
request=request,
**kwargs,
)
response_pb = None

while (
while response_pb is None or (
Reviewer:
The new while loop seems mostly the same as the previous one. Just for my education, could you explain how they are different, and why we need to make the change here?

Contributor Author:
Adding if response_pb is None essentially changes this from a while loop to a do-while loop. Previously the implementation would do a single call before the loop, and then repeat a bunch of code to make the same call inside the loop as well. I wanted to reduce the duplicated code and move everything inside the loop.

The other change was to get rid of the query_pb variable and directly access the query inside the request instead, to keep all the state in one place.

response_pb.batch.more_results == _NOT_FINISHED
and response_pb.batch.skipped_results < query_pb.offset
and response_pb.batch.skipped_results < request["query"].offset
):
# We haven't finished processing. A likely reason is we haven't
# skipped all of the results yet. Don't return any results.
# Instead, rerun query, adjusting offsets. Datastore doesn't process
# more than 1000 skipped results in a query.
old_query_pb = query_pb
query_pb = query_pb2.Query()
query_pb._pb.CopyFrom(old_query_pb._pb) # copy for testability
query_pb.start_cursor = response_pb.batch.skipped_cursor
query_pb.offset -= response_pb.batch.skipped_results

request = {
"project_id": self._query.project,
"partition_id": partition_id,
"read_options": read_options,
"query": query_pb,
}
helpers.set_database_id_to_request(request, self.client.database)
if response_pb is not None:
# We haven't finished processing. A likely reason is we haven't
# skipped all of the results yet. Don't return any results.
# Instead, rerun query, adjusting offsets. Datastore doesn't process
# more than 1000 skipped results in a query.
new_query_pb = query_pb2.Query()
new_query_pb._pb.CopyFrom(request["query"]._pb) # copy for testability
new_query_pb.start_cursor = response_pb.batch.skipped_cursor
new_query_pb.offset -= response_pb.batch.skipped_results
request["query"] = new_query_pb

response_pb = self.client._datastore_api.run_query(
request=request,
**kwargs,
request=request.copy(), **kwargs
Reviewer:
Is request.copy() here also for testability? copy() is only a shallow copy for a dict; do we need to make a deep copy here?

Contributor Author:
Yeah, IIRC it's just for testability.

We want to be able to make assertions about the arguments passed in to each call, which doesn't work if the same request instance is used for every request. (See the sketch after this hunk.)

)
# capture explain metrics if present in response
# should only be present in last response, and only if explain_options was set
if response_pb and response_pb.explain_metrics:
self._explain_metrics = ExplainMetrics._from_pb(
response_pb.explain_metrics
)

entity_pbs = self._process_query_results(response_pb)
return page_iterator.Page(self, entity_pbs, self.item_to_value)
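
The request.copy() exchange above comes down to how unittest.mock records arguments: a mock stores a reference to the dict it was called with, so a later mutation of one shared request rewrites every recorded call. A self-contained sketch of the failure mode; the shallow copy suffices here because the loop replaces request["query"] with a fresh protobuf rather than mutating the nested value in place:

    from unittest import mock

    api = mock.Mock()
    request = {"offset": 10}

    api.run_query(request=request)   # the mock keeps a reference, not a snapshot
    request["offset"] = 5            # this mutation rewrites the recorded call
    api.run_query(request=request)

    # Both recorded calls now look identical:
    assert api.run_query.call_args_list[0] == mock.call(request={"offset": 5})

    # Passing request.copy() gives each call its own top-level dict, so
    # per-call assertions see the arguments as they were at call time.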

@property
Reviewer:
Is there any difference between this and the one in google/cloud/datastore/aggregation.py?

Can the implementation live in just one place and be called from both locations?

Contributor Author:
Unfortunately there's a lot of duplicated code between the two classes currently :(

It's a bit complicated, because these query/aggregation classes, although they have very similar methods, have no superclass. The names of some internal variables differ as well, so the implementations aren't an exact match either.

I'm open to maybe moving some of this logic into a helper in query_profile.py and calling it from both places? Let me know what you think. The longer-term solution is probably a larger-scale refactor of query.py and aggregation.py to fix all the other duplications as well, but that's out of scope for now. (A sketch of what a shared helper could look like follows this property.)

Reviewer:
A larger-scale refactor to fix all duplications is preferred.

def explain_metrics(self) -> ExplainMetrics:
"""
Get the metrics associated with the query execution.
Metrics are only available when explain_options is set on the query. If
ExplainOptions.analyze is False, only plan_summary is available. If it is
True, execution_stats is also available.

:rtype: :class:`~google.cloud.datastore.query_profile.ExplainMetrics`
:returns: The metrics associated with the query execution.
:raises: :class:`~google.cloud.datastore.query_profile.QueryExplainError`
if explain_metrics is not available on the query.
"""
if self._explain_metrics is not None:
return self._explain_metrics
elif self._query._explain_options is None:
raise QueryExplainError("explain_options not set on query.")
elif self._query._explain_options.analyze is False:
# we need to run the query to get the explain_metrics
self._next_page()
if self._explain_metrics is not None:
return self._explain_metrics
raise QueryExplainError(
"explain_metrics not available until query is complete."
)
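
As the review thread above suggests, this duplicated property could eventually live in one place. A hedged sketch of what a shared helper might look like; _ExplainMetricsMixin is hypothetical, and it assumes both iterators normalize their options onto self._explain_options, which they do not do today:

    from google.cloud.datastore.query_profile import ExplainMetrics, QueryExplainError

    class _ExplainMetricsMixin:
        """Hypothetical shared base for both result iterators."""

        _explain_metrics = None
        _explain_options = None  # subclasses would set this in __init__

        @property
        def explain_metrics(self) -> ExplainMetrics:
            if self._explain_metrics is not None:
                return self._explain_metrics
            if self._explain_options is None:
                raise QueryExplainError("explain_options not set on query.")
            if self._explain_options.analyze is False:
                # Plan-only queries return no result rows, so running a
                # page here cannot drop results.
                self._next_page()  # provided by the concrete iterator
                if self._explain_metrics is not None:
                    return self._explain_metrics
            raise QueryExplainError(
                "explain_metrics not available until query is complete."
            )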


def _pb_from_query(query):
"""Convert a Query instance to the corresponding protobuf.