2017-05-12 53 views

Django prefetch_related on a large data set

I am currently running into a problem with Django's prefetch_related. As an example, take these models:

from django.db import models 

class Client(models.Model): 
    name = models.CharField(max_length=255) 

class Purchase(models.Model): 
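    # on_delete is required from Django 2.0 onwards and is accepted by 1.x as well
    client = models.ForeignKey('Client', on_delete=models.CASCADE)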

Let's imagine we have only a few clients, say 200, but they buy a lot, so we have several million purchases.

If I had to build a web page that shows every client and the number of purchases for each client, I would write something like this:

from django.db.models import Prefetch 
from .models import Purchase, Client 

purchases = Purchase.objects.all() 
clients = Client.objects.prefetch_related(Prefetch('purchase_set', queryset=purchases))

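The problem here is that I query the database for the entire set of purchases, and that query could take more than a minute or, worse, raise a MemoryError on the server.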

So I tried using

purchases = Purchase.objects.all()[:9] 

to select only one batch from the database. As one might expect, Django does not like that much and raises this exception:

Traceback (most recent call last):
  File "project/venv/lib/python3.6/site-packages/django/core/handlers/base.py", line 149, in get_response
    response = self.process_exception_by_middleware(e, request)
  File "project/venv/lib/python3.6/site-packages/django/core/handlers/base.py", line 147, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "project/venv/lib/python3.6/site-packages/django/views/generic/base.py", line 68, in view
    return self.dispatch(request, *args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/utils/decorators.py", line 67, in _wrapper
    return bound_func(*args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/views/decorators/cache.py", line 57, in _wrapped_view_func
    response = view_func(request, *args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/utils/decorators.py", line 63, in bound_func
    return func.__get__(self, type(self))(*args2, **kwargs2)
****************** login decorators, views, ...
  File "project/***.py", line ***, in ***
    for client in clients:
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 258, in __iter__
    self._fetch_all()
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 1076, in _fetch_all
    self._prefetch_related_objects()
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 656, in _prefetch_related_objects
    prefetch_related_objects(self._result_cache, self._prefetch_related_lookups)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 1457, in prefetch_related_objects
    obj_list, additional_lookups = prefetch_one_level(obj_list, prefetcher, lookup, level)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 1556, in prefetch_one_level
    prefetcher.get_prefetch_queryset(instances, lookup.get_current_queryset(level)))
  File "project/venv/lib/python3.6/site-packages/django/db/models/fields/related_descriptors.py", line 539, in get_prefetch_queryset
    queryset = queryset.filter(**query)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 790, in filter
    return self._filter_or_exclude(False, *args, **kwargs)
  File "project/venv/lib/python3.6/site-packages/django/db/models/query.py", line 802, in _filter_or_exclude
    "Cannot filter a query once a slice has been taken."
AssertionError: Cannot filter a query once a slice has been taken.
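
The traceback shows get_prefetch_queryset calling queryset.filter(**query) on the queryset passed to Prefetch, which is where the sliced queryset is rejected. One plausible reading of why Django refuses: a filter applied after a LIMIT does not select the same rows as a LIMIT applied after the filter. The sketch below only illustrates that difference in plain SQL through the standard sqlite3 module. The table layout and data are hypothetical, and it does not reproduce the SQL Django itself generates.

```python
import sqlite3

# Hypothetical, simplified purchase table (not Django's real schema)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchase (id INTEGER PRIMARY KEY, client_id INTEGER NOT NULL)"
)
# Purchases 1-3 belong to client 1, purchases 4-6 to client 2
conn.executemany(
    "INSERT INTO purchase (id, client_id) VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 2)],
)

# Slice first, filter afterwards: the first 3 rows hold nothing for client 2
slice_then_filter = conn.execute(
    """
    SELECT id FROM (SELECT id, client_id FROM purchase ORDER BY id LIMIT 3)
    WHERE client_id = 2
    """
).fetchall()

# Filter first, slice afterwards: up to 3 rows that do belong to client 2
filter_then_slice = conn.execute(
    "SELECT id FROM purchase WHERE client_id = 2 ORDER BY id LIMIT 3"
).fetchall()

print(slice_then_filter)
print(filter_then_slice)
```

Because the two orders disagree, a globally sliced queryset cannot serve as a per-client batch inside a prefetch.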

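So for now I have no real solution. I am studying how the __iter__ function at django/db/models/query.py:258 is built, hoping to write a function with the same behaviour that takes a limited set for the prefetch, so that it can be paginated and then run in a more parallel way.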

Is there a "good way" to do this kind of query?

Answers

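"Let's imagine we have only a few clients, say 200, but they buy a lot, so we have several million purchases.

If I had to build a web page that shows every client and the number of purchases for each client, ..."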

I am going to read your question as asking for exactly that functionality: a purchase count per client. Have you tried:

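from django.db.models import Count
from .models import Client

# The database does the counting; no Purchase rows are loaded into Python
clients = Client.objects.annotate(num_purchases=Count('purchase'))
clients[0].num_purchases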

If you want to sort and get the clients with the most purchases, you can also do:

clients = Client.objects.annotate(num_purchases=Count('purchase')).order_by('-num_purchases')[:5] 

For more functionality, see https://docs.djangoproject.com/en/1.11/topics/db/aggregation/
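
To make it concrete why the annotate approach avoids loading millions of Purchase rows, here is a minimal sketch of the kind of grouped query it asks the database to run, written against the standard sqlite3 module so it is runnable without a Django project. The table names, column names, and sample data are assumptions for illustration. The SQL Django actually emits may differ in detail.

```python
import sqlite3

# Hypothetical tables mirroring the Client and Purchase models
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE client (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        client_id INTEGER NOT NULL REFERENCES client(id)
    );
    """
)
conn.executemany(
    "INSERT INTO client (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "carol")],
)
conn.executemany(
    "INSERT INTO purchase (client_id) VALUES (?)",
    [(1,), (1,), (1,), (2,)],
)

# Roughly the shape of annotate(num_purchases=Count('purchase')) combined
# with order_by('-num_purchases'): one row per client, counted by the database
rows = conn.execute(
    """
    SELECT client.id, client.name, COUNT(purchase.id) AS num_purchases
    FROM client
    LEFT OUTER JOIN purchase ON purchase.client_id = client.id
    GROUP BY client.id, client.name
    ORDER BY num_purchases DESC
    """
).fetchall()

for client_id, name, num_purchases in rows:
    print(client_id, name, num_purchases)
```

The result set has one row per client, so with 200 clients only 200 rows come back regardless of how many purchases exist. The LEFT OUTER JOIN keeps clients with no purchases, reported with a count of 0.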

Thank you very much, that is exactly what I was looking for. Sorry, I had not read the manual ^^"