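This solution is optimized for memory requirements, since you consider that important. It needs three queries. The first query fetches the posts, the second fetches only (id, post_id) tuples, and the third fetches the details of the filtered latest comments.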
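from itertools import groupby, islice
posts = Post.objects.filter(...)  # put your own filter here
# Query 2: only (pk, post_id) tuples, sorted so the newest comments of each post come first
all_comments = (Comment.objects.filter(post__in=posts)
                .values_list('pk', 'post_id')
                .order_by('post_id', '-pk'))
last_comments = []
# The queryset is evaluated now. Only chunks of about 100 items are in memory
# at once during iteration, because iterator() does not cache the rows.
for post_id, related_comments in groupby(all_comments.iterator(), lambda row: row[1]):
    # keep the primary keys of the two newest comments of this post
    last_comments.extend(pk for pk, _ in islice(related_comments, 2))
results = {}
# Query 3: full details of the selected comments only
for comment in Comment.objects.filter(pk__in=last_comments):
    results.setdefault(comment.post_id, []).append(comment)
# output
for post in posts:
    # `text` is the comment field, as in the SQL below; posts without comments get []
    print(post.title, [x.text for x in results.get(post.id, [])])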
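That said, I think that for many database backends it would be faster to combine the second and third queries into one and so request all comment fields immediately. The unneeded comments would be discarded right away.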
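A minimal sketch of that combined variant, using plain dictionaries instead of a real queryset so that it runs without Django. The function name `latest_per_post` and the sample rows are hypothetical; in a real project the rows would presumably come from something like `Comment.objects.filter(...).order_by('post_id', '-pk').values().iterator()`.

```python
from itertools import groupby, islice
from operator import itemgetter


def latest_per_post(rows, limit=2):
    """Keep the first `limit` rows of every post_id group.

    `rows` must already be sorted by post_id and, within each post,
    newest first (the same order as order_by('post_id', '-pk')).
    """
    results = {}
    for post_id, group in groupby(rows, key=itemgetter("post_id")):
        # islice stops after `limit` rows; groupby then skips the rest
        # of the group, so the skipped rows are never stored anywhere.
        results[post_id] = list(islice(group, limit))
    return results


# Hypothetical rows standing in for a streamed queryset.
rows = [
    {"id": 9, "post_id": 1, "text": "c9"},
    {"id": 7, "post_id": 1, "text": "c7"},
    {"id": 2, "post_id": 1, "text": "c2"},
    {"id": 5, "post_id": 3, "text": "c5"},
]
latest = latest_per_post(iter(rows))
```

Only the rows that are kept end up in `latest`; the others pass through the iterator once and are dropped.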
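The fastest solution is to use a nested query. The algorithm is similar to the one above, but everything is implemented in raw SQL. It is limited to backends such as PostgreSQL.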
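EDIT
I agree that this is not useful for you:
... prefetch loads thousands of comments into memory, 99% of which will not be displayed.
That is why I wrote a relatively complicated solution in which that 99% is read sequentially without being loaded into memory.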
EDIT
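- All examples assume that you are restricted to posts with post_id in [1, 3, 5] (anything selected earlier by category etc.).
- In all cases, create an index on the Comment fields ['post', 'pk'].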
A) Nested query for PostgreSQL
SELECT post_id, id, text FROM
(SELECT post_id, id, text, rank() OVER (PARTITION BY post_id ORDER BY id DESC)
FROM app_comment WHERE post_id in (1, 3, 5)) sub
WHERE rank <= 2
ORDER BY post_id, id
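Or, if we do not trust the optimizer, we can ask for less memory explicitly. The two inner selects should read only from the index, which is much less data than reading from the table: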
SELECT post_id, id, text FROM app_comment WHERE id IN
(SELECT id FROM
(SELECT id, rank() OVER (PARTITION BY post_id ORDER BY id DESC)
FROM app_comment WHERE post_id in (1, 3, 5)) sub
WHERE rank <= 2)
ORDER BY post_id, id
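To experiment with the first nested query without a PostgreSQL server, here is a rough sketch that runs it against an in-memory SQLite database. This is an assumption on my part rather than part of the answer: it requires SQLite 3.25 or newer (the first version with window functions), the sample rows are made up, and the window column gets an explicit alias `rnk` because SQLite does not name it `rank` automatically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE app_comment (id INTEGER PRIMARY KEY, post_id INTEGER, text TEXT)"
)
# Index on (post_id, id), as recommended for all variants.
conn.execute("CREATE INDEX app_comment_post_id_id ON app_comment (post_id, id)")
# Hypothetical sample data: (id, post_id, text)
conn.executemany(
    "INSERT INTO app_comment (id, post_id, text) VALUES (?, ?, ?)",
    [
        (1, 1, "a"), (2, 1, "b"), (3, 1, "c"),
        (4, 3, "d"),
        (5, 2, "e"),
        (6, 5, "f"), (7, 5, "g"), (8, 5, "h"),
    ],
)

LATEST_TWO = """
SELECT post_id, id, text FROM
  (SELECT post_id, id, text,
          rank() OVER (PARTITION BY post_id ORDER BY id DESC) AS rnk
   FROM app_comment WHERE post_id IN (1, 3, 5)) sub
WHERE rnk <= 2
ORDER BY post_id, id
"""
# Each row is a (post_id, id, text) tuple for one of the two newest comments.
rows = conn.execute(LATEST_TWO).fetchall()
```

Post 2 is filtered out by the `IN (1, 3, 5)` condition, and a post with a single comment simply yields one row.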
B) With the id of the oldest displayed comment
Filter:
import pprint
from django.db.models import F
# keep only comments at least as new as the oldest displayed one of their post
qs = Comment.objects.filter(
    post__pk__in=[1, 3, 5],
    post__oldest_displayed__lte=F('pk')
).order_by('post_id', 'pk')
pprint.pprint([(x.post_id, x.pk) for x in qs])
Hmm, very nice... And how is it compiled by Django (for the posts you already selected earlier by category etc.)?
>>> print(qs.query.get_compiler('default').as_sql()[0]) # added white space
SELECT "app_comment"."id", "app_comment"."text", "app_comment"."post_id"
FROM "app_comment"
INNER JOIN "app_post" ON ("app_comment"."post_id" = "app_post"."id")
WHERE ("app_comment"."post_id" IN (%s, %s, %s)
AND "app_post"."oldest_displayed" <= ("app_comment"."id"))
ORDER BY "app_comment"."post_id" ASC, "app_comment"."id" ASC
Prepare all "oldest_displayed" values initially with one nested SQL statement (and set zero for posts with fewer than two comments):
UPDATE app_post SET oldest_displayed = 0;
UPDATE app_post SET oldest_displayed = qq.id FROM
(SELECT post_id, id FROM
(SELECT post_id, id, rank() OVER (PARTITION BY post_id ORDER BY id DESC)
FROM app_comment) sub
WHERE rank = 2) qq
WHERE qq.post_id = app_post.id;
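The logic of variant B can be sketched in plain Python, which may make the meaning of `oldest_displayed` easier to check. The helper names `compute_oldest_displayed` and `displayed` and the sample data are hypothetical; the sketch only mirrors the two SQL steps (set the value to the id of the second-newest comment, or 0 when a post has fewer than two comments, then filter on `id >= oldest_displayed`) and does not cover keeping the value up to date when new comments arrive.

```python
from collections import defaultdict


def compute_oldest_displayed(comments, shown=2):
    """Map post_id -> id of the `shown`-th newest comment, or 0 if fewer exist.

    `comments` is an iterable of (comment_id, post_id) pairs.
    """
    ids_by_post = defaultdict(list)
    for comment_id, post_id in comments:
        ids_by_post[post_id].append(comment_id)
    oldest = {}
    for post_id, ids in ids_by_post.items():
        ids.sort(reverse=True)  # newest (highest id) first
        oldest[post_id] = ids[shown - 1] if len(ids) >= shown else 0
    return oldest


def displayed(comments, oldest):
    """Return sorted (post_id, comment_id) pairs with id >= oldest_displayed."""
    return sorted(
        (post_id, comment_id)
        for comment_id, post_id in comments
        if comment_id >= oldest.get(post_id, 0)
    )


# Hypothetical (comment_id, post_id) pairs.
comments = [(1, 1), (2, 1), (3, 1), (4, 3), (6, 5), (7, 5)]
oldest = compute_oldest_displayed(comments)
visible = displayed(comments, oldest)
```

With the value stored on the post, the filter itself becomes a simple join on an indexed column, which is what the Django query in this variant relies on.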
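I'm not sure that '.select_related('comments')' fetches the comments. '.select_related' can follow ForeignKey, OneToOne and reverse OneToOne relationships. – Igor 2014-10-14 12:51:54
@Igor, huh, I didn't know that was the case. I guess the docs for [prefetch_related](https://docs.djangoproject.com/en/1.6/ref/models/querysets/#prefetch-related) imply this. Thanks for the heads-up. – tino 2014-10-14 16:50:41
What is the problem with fetching all related comments? You can use only the first two items of each post later. 'posts[0].comments.all()' will not execute an extra query. Is the problem that there are too many related comments to prefetch them? – 2014-10-17 13:20:54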