Performance improvement: Type cast properties on demand #3089

pprkut · 2018-12-05T20:56:43Z

I analyzed where time is spent when querying the library and found that most of it is actually type casting properties after retrieving them from the database. However, this isn't really necessary. We hardly ever need access to all properties, so it makes much more sense to delay the actual casting until a property is accessed.
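For anyone who wants to repeat this kind of analysis, here is a minimal sketch using the standard library's `cProfile`. The thread does not say which profiler was actually used, and `load_rows` and `cast_row` are hypothetical stand-ins for a library query and the eager per-row type casting, not beets functions.

```python
import cProfile
import io
import pstats


def cast_row(row):
    # Hypothetical stand-in for eagerly type casting every column of a row.
    return {key: int(value) for key, value in row.items()}


def load_rows(n):
    # Hypothetical stand-in for a library query that returns n raw rows
    # and casts all of them up front.
    rows = [{"track": "1", "year": "2018", "disc": "1"} for _ in range(n)]
    return [cast_row(row) for row in rows]


profiler = cProfile.Profile()
profiler.enable()
result = load_rows(1000)
profiler.disable()

# Render the statistics into a string, sorted by cumulative time, so the
# functions that dominate the run appear first.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).sort_stats("cumulative")
stats.print_stats(10)
report = buffer.getvalue()
print(report)
```

In a report like this, a casting helper showing up near the top with a large cumulative time is the kind of signal that would point at type casting as the bottleneck.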

Taking my previous baseline of 3.66s for master, delayed type casting reduces this to 1.96s, or to 1.45s when #2798 is also applied (a bit slower than my first number with it, now that this is no longer a proof of concept).
Out of those remaining ~1.5s, about 0.5s is spent on the actual database queries, and the rest is object creation.

I'm open to ideas for further performance improvements here, but we are getting firmly into the area of micro-optimizations (creating 11000 objects in 1s, we'd need to make every object creation 1/22000s faster to save another 0.5s). At least I didn't find any obvious candidates.
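The per-object budget in that estimate can be checked with a few lines of arithmetic. This is only a restatement of the numbers quoted above (11000 objects, about 1s of object creation, a 0.5s target saving); the variable names are illustrative.

```python
objects = 11000          # objects created per query, as quoted above
creation_time_s = 1.0    # approximate time spent on object creation
target_saving_s = 0.5    # additional saving we would like to achieve

# Average cost of creating one object today.
per_object_s = creation_time_s / objects

# How much faster each object creation would have to become.
saving_per_object_s = target_saving_s / objects

print(f"current cost per object: {per_object_s * 1e6:.1f} microseconds")
print(f"required saving per object: {saving_per_object_s * 1e6:.1f} microseconds")
```

In other words, each object currently costs roughly 91 microseconds to create, and about half of that would have to be shaved off per object to gain another 0.5s.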

sampsyo · 2018-12-06T00:51:39Z

Nice!! This seems like a really good optimization target. Excellent work with that profiler. 😃

I have a small idea for how to make the code a little more centralized and perhaps more maintainable: define a little "dict-like" wrapper that is responsible for performing the lazy conversion. Here's a quick sketch I put together:

class LazyConvertDict(object):
    def __init__(self, data, model_cls):
        self.data = data  # raw values as read from the database
        self.model_cls = model_cls
        self._converted = {}  # cache of values that have already been cast

    def _convert(self, key, value):
        return self.model_cls._type(key).from_sql(value)

    def __getitem__(self, key):
        if key in self._converted:
            return self._converted[key]
        elif key in self.data:
            value = self._convert(key, self.data[key])
            self._converted[key] = value
            return value
        else:
            # Behave like a dict instead of silently returning None.
            raise KeyError(key)

The model class could then just define _values_fixed and _values_flex to be instances of LazyConvertDict and access them normally—eliminating the need for the conversion handling inside of the lookup functions.

This would probably need a __setitem__ method too, but does the rough idea make sense?
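To make the `__setitem__` idea concrete, here is one way the wrapper could be extended. It is a sketch under assumptions, not the code that was eventually merged: `IntegerType` and `FakeModel` are hypothetical stand-ins for a field type with `from_sql` and a model class exposing `_type(key)`, and the choice to store assigned values without conversion is an assumption about the desired behavior.

```python
class IntegerType:
    """Hypothetical field type; counts how often from_sql is called."""
    conversions = 0

    @classmethod
    def from_sql(cls, value):
        cls.conversions += 1
        return int(value)


class FakeModel:
    """Hypothetical model class exposing the _type(key) lookup."""

    @classmethod
    def _type(cls, key):
        return IntegerType


class LazyConvertDict:
    def __init__(self, data, model_cls):
        self.data = data              # raw values from the database
        self.model_cls = model_cls
        self._converted = {}          # values that were already cast or set

    def _convert(self, key, value):
        return self.model_cls._type(key).from_sql(value)

    def __getitem__(self, key):
        if key in self._converted:
            return self._converted[key]
        # Raises KeyError for unknown keys, like a regular dict.
        value = self._convert(key, self.data[key])
        self._converted[key] = value
        return value

    def __setitem__(self, key, value):
        # Assumption: assigned values are already Python objects,
        # so they bypass the SQL conversion entirely.
        self._converted[key] = value

    def __contains__(self, key):
        return key in self._converted or key in self.data


values = LazyConvertDict({"year": "2018", "track": "7"}, FakeModel)
conversions_before = IntegerType.conversions  # nothing cast yet

year = values["year"]        # first access triggers the conversion
year_again = values["year"]  # second access is served from the cache
values["disc"] = 2           # stored as-is, no conversion
```

Only the key that is actually read gets converted, and it is converted once; `track` stays as the raw string until something asks for it.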

pprkut · 2019-03-24T17:38:19Z

Uploaded a raw version that uses your suggested LazyConvertDict. Testing here looks fine. Performance for some reason seems to be even better with that version, not entirely sure why.
Please have a look. I'll perform cleanup, doc additions, test fixes within the next couple days :-)

sampsyo

Yay! Looking great already.

beets/dbcore/db.py

pprkut · 2019-03-31T10:13:56Z

AFAICS everything's fixed up now :)

sampsyo · 2019-04-01T01:51:26Z

OK, this is truly awesome!! We should be maintaining a trophy room somewhere that lists all the performance bottlenecks you have found and defeated. Thank you for all your help along these lines! ✨

Performance improvement: Type cast properties on demand

439d4c1

pprkut force-pushed the delayed_type_casting branch from ea09bba to 5eced8d on March 24, 2019 17:33

sampsyo reviewed Mar 24, 2019

pprkut force-pushed the delayed_type_casting branch from 5eced8d to c90a270 on March 31, 2019 09:49

Add class wrapper for lazily converting attributes

f9f2bdd

pprkut force-pushed the delayed_type_casting branch from c90a270 to f9f2bdd on March 31, 2019 09:58

sampsyo merged commit f9f2bdd into beetbox:master Apr 1, 2019

sampsyo added a commit that referenced this pull request Apr 1, 2019

Merge pull request #3089 from pprkut/delayed_type_casting

a588784

Performance improvement: Type cast properties on demand

sampsyo added a commit that referenced this pull request Apr 1, 2019

Changelog for #3089

6c9c881
