How Django QuerySets Implement Lazy Evaluation

Aug 07, 2016

When I first read the official Django documentation, the introduction to QuerySets included this passage:

QuerySets are lazy – the act of creating a QuerySet doesn’t involve any database activity. You can stack filters together all day long, and Django won’t actually run the query until the QuerySet is evaluated.

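In plain terms:

Creating a QuerySet does not touch the database. You can keep chaining filters for as long as you like, and Django will not run the query until the QuerySet actually has to be evaluated.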

What is lazy evaluation

The official documentation gives this example:

>>> q = Entry.objects.filter(headline__startswith="What")
>>> q = q.filter(pub_date__lte=datetime.date.today())
>>> q = q.exclude(body_text__icontains="food")
>>> print(q)  # only here is the one and only database query issued
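The same behaviour can be imitated without a database. The sketch below is a toy stand-in, not Django code: `LazyRows`, `FAKE_TABLE`, and the `hits` counter are hypothetical names invented for illustration. Each `filter()` call only records a predicate and returns a new object; the fake table is scanned once, when the object is finally iterated.

```python
# Toy illustration of lazy chaining; LazyRows and FAKE_TABLE are made-up names.
FAKE_TABLE = [
    {"headline": "What is lazy?", "body": "about food"},
    {"headline": "What is Django?", "body": "about web"},
    {"headline": "Hello", "body": "about web"},
]

hits = 0  # counts how often the fake "database" is scanned


class LazyRows:
    def __init__(self, predicates=()):
        self.predicates = tuple(predicates)

    def filter(self, predicate):
        # Record the condition and return a new object; nothing is scanned yet.
        return LazyRows(self.predicates + (predicate,))

    def __iter__(self):
        # Evaluation happens here: one scan applying every recorded predicate.
        global hits
        hits += 1
        return iter([row for row in FAKE_TABLE
                     if all(p(row) for p in self.predicates)])


q = LazyRows()
q = q.filter(lambda row: row["headline"].startswith("What"))
q = q.filter(lambda row: "food" not in row["body"])
hits_before = hits   # chaining alone has not scanned the table
result = list(q)     # the single scan happens here
```

Returning a fresh object from every `filter()` call is what lets intermediate results be reused safely as starting points for other chains.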

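A QuerySet queries the database only in the following cases:

Iterating over it, for example:

for i in foo.objects.all():
    print(i.id)

Slicing it with a step (a plain slice without a step returns another unevaluated QuerySet), for example: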
```
foo.objects.all()[2:10:2]
```
Pickling or caching it, for example:
```
pickle.dumps(foo.objects.all())
```
Calling str(), repr(), len(), list(), bool(), and so on, for example:
```
str(foo.objects.all())
```
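The difference between plain slices and stepped slices can be sketched with a toy class. This is an imitation built on assumptions rather than Django's source: `ToyQuerySet`, its `low`/`high` attributes, and `_fetch` are hypothetical names, and negative indices are ignored. A plain slice only records limits and stays lazy, while a slice with a step has to fetch rows before it can skip through them.

```python
class ToyQuerySet:
    """Hypothetical stand-in for a QuerySet over the integers 0..99."""

    fetches = 0  # class-wide count of simulated database queries

    def __init__(self, low=0, high=100):
        self.low, self.high = low, high

    def _fetch(self):
        # Simulated query: a real ORM would run SQL with LIMIT/OFFSET here.
        ToyQuerySet.fetches += 1
        return list(range(self.low, self.high))

    def __getitem__(self, k):
        if not isinstance(k, slice):
            raise TypeError("this sketch only supports slices")
        # Negative indices are not handled in this sketch.
        start = self.low + (k.start or 0)
        stop = self.high if k.stop is None else min(self.high, self.low + k.stop)
        clone = ToyQuerySet(start, stop)
        if k.step is None:
            return clone                  # still lazy: limits recorded only
        return clone._fetch()[::k.step]   # a step forces evaluation


qs = ToyQuerySet()
lazy = qs[2:10]                           # no fetch, another ToyQuerySet
fetches_after_plain = ToyQuerySet.fetches
stepped = qs[2:10:2]                      # fetches rows, then applies the step
```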

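In other words, a QuerySet queries the database only when its values are actually needed. Besides avoiding unnecessary queries, lazy evaluation also makes chained filtering possible, which makes more complex queries easier to build.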

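How it is implemented

A QuerySet is a heavily customized class, and using one involves two key steps:

Building the query through methods such as all(), filter(), exclude(), and order_by(), and storing it in self.query for later use
When a value is needed, running the SQL built in self.query against the database and wrapping the results in Python objects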
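The two steps can be sketched with the standard-library `sqlite3` module. This is a rough illustration under stated assumptions, not how Django builds SQL: `MiniQuerySet`, `filter_startswith`, and the `entry` table are hypothetical names made up for the example. The filter method only extends a stored description of the query; the SQL is executed when the object is iterated.

```python
import sqlite3

# In-memory database with a hypothetical `entry` table for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, headline TEXT)")
conn.executemany(
    "INSERT INTO entry (headline) VALUES (?)",
    [("What is lazy?",), ("Hello",), ("What next?",)],
)


class MiniQuerySet:
    def __init__(self, conn, where=(), params=()):
        self.conn = conn
        self.where = tuple(where)    # accumulated WHERE fragments
        self.params = tuple(params)  # values bound to the placeholders

    def filter_startswith(self, column, prefix):
        # Step 1: extend the stored query description; no SQL is run.
        # NOTE: `column` is interpolated directly, so it must be trusted input.
        return MiniQuerySet(
            self.conn,
            self.where + (f"{column} LIKE ?",),
            self.params + (prefix + "%",),
        )

    def sql(self):
        clause = " AND ".join(self.where) or "1=1"
        return f"SELECT id, headline FROM entry WHERE {clause} ORDER BY id"

    def __iter__(self):
        # Step 2: run the prepared SQL and wrap each row in a Python object.
        rows = self.conn.execute(self.sql(), self.params).fetchall()
        return iter([{"id": r[0], "headline": r[1]} for r in rows])


qs = MiniQuerySet(conn).filter_startswith("headline", "What")
print(qs.sql())   # the SQL text exists, but nothing has been executed yet
print(list(qs))   # iteration triggers the query
```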

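Constructing the SQL is complicated, but there is nothing conceptually difficult about it, so I will not go into detail. The interesting part is the second step: how does a QuerySet know when a value is needed and the database must be queried? This is where some magic methods come in.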

Magic Methods

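Magic methods are one of the more interesting parts of Python: special methods whose names are wrapped in double underscores (such as __init__). There are many of them; this post covers the few that QuerySets rely on most: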

class Test:

    def __init__(self):
        print('init')

    def __str__(self):
        # Called by str() and print()
        print('str is called')
        return 'str!'

    def __len__(self):
        # Called by len()
        print('len is called')
        return 100

    def __repr__(self):
        # Called by repr()
        print('repr is called')
        return repr('repr!')

    def __iter__(self):
        # Called when a for loop starts iterating
        print('loop iter')
        return iter([1, 2, 3])

    def __getitem__(self, k):
        # Called for both test[i] and test[a:b:c]
        if isinstance(k, slice):
            print('start:', k.start)
            print('stop:', k.stop)
            print('step:', k.step)
        elif isinstance(k, int):
            print('index:', k)
        return [1, 2, 3, 4]

test = Test()
# output: init

print(test)
# output: str is called
# output: str!

print(str(test))
# output: str is called
# output: str!

print(len(test))
# output: len is called
# output: 100

print(repr(test))
# output: repr is called
# output: 'repr!'

for i in test:
    print(i)
# output: loop iter
# output: 1
# output: 2
# output: 3

test[1:2:3]
# output: start: 1
# output: stop: 2
# output: step: 3

test[1]
# output: index: 1

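As shown above, each of these magic methods corresponds to one way of reading a value, and each can return whatever is needed at that moment. From this example it is easy to see that QuerySets perform the same kind of check inside each of these magic methods. Taking __len__ as an example: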

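def __len__(self):

    # Check whether the query result has already been cached.
    # Compare with None: an empty result is falsy but is still a valid cache.
    if self.result_cache is None:
        # Run the SQL built earlier and cache the rows
        # (get_result() is a simplified placeholder name)
        self.result_cache = self.get_result()

    return len(self.result_cache)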

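In practice each magic method is implemented slightly differently, but this single check is enough to defer the database query until a value is needed, which is what makes the query lazy.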
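To make the caching check concrete, here is a small sketch in which several magic methods share one helper. Django's own QuerySet uses a similar arrangement with a private `_result_cache` attribute and a `_fetch_all()` helper, but everything else here (`CachedRows`, `loader`, `query_count`) is invented for illustration. Comparing against `None` rather than testing truthiness matters: an empty result is still a valid cached result and should not trigger a second query.

```python
class CachedRows:
    def __init__(self, loader):
        self._loader = loader        # hypothetical callable that "runs the query"
        self._result_cache = None    # None means "not evaluated yet"
        self.query_count = 0

    def _fetch_all(self):
        # Run the query only on first use; afterwards reuse the cached list.
        if self._result_cache is None:
            self.query_count += 1
            self._result_cache = list(self._loader())

    def __len__(self):
        self._fetch_all()
        return len(self._result_cache)

    def __iter__(self):
        self._fetch_all()
        return iter(self._result_cache)

    def __bool__(self):
        self._fetch_all()
        return bool(self._result_cache)


rows = CachedRows(lambda: [])   # a query that matches nothing
count_before = rows.query_count
empty = (len(rows), list(rows), bool(rows))
```

After the three calls in the last line, `rows.query_count` stays at 1 even though the cached result is an empty list.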

