I have the following code, and I want to optimize how it checks vocabulary words against a dictionary:
for key, value in jobs.items():
    job = key
    jobVector[key] = []
    for x in range(0, len(listOfWords)):
        if listOfWords[x] in jobs[job]:
            jobVector[key].append(1)
        else:
            jobVector[key].append(0)
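I have a dictionary, jobs, that stores various words and a count for each. The count is not relevant here, but let's say jobs looks like this for one of its keys: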
jobs[1] = account, addit, allow, ascertain, associ, avail, career, cellular, chang, coasttocoast, commiss, compani, competit, comput, countri, coupl, credit, custom, demand, develop, driven, dynam, employ, enjoi, ethic, exist, expand, experienc, fastest, flexibl, greet, growth, highperform, independ, individu, internet, knowledg, maintain, market, monitor, opportun, order, outstand, payment, person, phone, place, price, privatelyown, process, product, profession, provid, purchas, pursu, receiv, recommend, repres, resolv, respons, retail, right, selfmotiv, specif, store, support, technolog, territori, thatll, throughout, total, train, uniqu, unpreced, wireless, account, addit, aptitud, avail, bartend, benefit, bestbui, bilingu, cellular, colleg, commiss, commun, comput, consult, cross, custom, dedic, deduct, dental, direct, disabl, discount, effect, enterpris, entir, entrepreneuri, excel, execut, extend, famili, fleet, flexibl, goalori, health, impress, individu, insid, insur, integr, interperson, keyword, liter, longterm, medic, member, negoti, offer, outsid, packag, period, person, pleas, possess, possibl, pound, prefer, prescript, proud, provid, recogn, rentacar, repres, respons, retail, retir, salesman, salesperson, saleswoman, satisfi, shield, shortterm, spanish, spend, spirit, sprint, stand, technic, therefor, tmobil, vehicl, verbal, visit, websit, wireless, wwwjoincellularsalescom
Let's say listOfWords looks like this:
listOfWords = associ, avail, career, cellular, chang, coasttocoast, commiss, compani, competit, comput, countri, coupl, credit, custom, demand, develop, driven, dynam, employ, enjoi, ethic
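I basically want to go through each word in listOfWords and check whether it exists in each individual job in the jobs dictionary. If it exists, I store a 1; otherwise I store a 0 in another dictionary.
Is there any way to speed this up? It currently works, but it takes about 3 minutes on a dataset of 15000 jobs.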
Wherever possible, replace lists with sets. Sets can do membership tests (i.e. 'if something in some_set') quickly. A list has to be walked one element at a time. – Kevin 2014-12-03 01:26:33
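
A minimal sketch of the set-based suggestion, assuming each value in jobs is an iterable of words (a list, or a dict keyed by word). The function name build_job_vectors and the small sample data are hypothetical and only there for illustration. Each job's words are converted to a set once, so every lookup after that is a hash lookup rather than a scan:

```python
def build_job_vectors(jobs, list_of_words):
    """Return {job_key: [1 or 0 for each word in list_of_words]}."""
    job_vectors = {}
    for key, words in jobs.items():
        # Build the set once per job. Iterating a dict yields its keys,
        # so this works whether `words` is a list or a word -> count dict.
        word_set = set(words)
        job_vectors[key] = [1 if word in word_set else 0 for word in list_of_words]
    return job_vectors


# Hypothetical sample data, much smaller than the real dataset.
jobs = {
    1: ["account", "addit", "avail", "career", "cellular"],
    2: {"chang": 3, "credit": 1},
}
list_of_words = ["associ", "avail", "career", "chang"]

job_vectors = build_job_vectors(jobs, list_of_words)
print(job_vectors)
```

If jobs[key] is already a dict keyed by word, the membership test is already a hash lookup and the set conversion gains little. In that case the time is probably going elsewhere, so it would be worth profiling before changing anything.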