2016-04-26 209 views

Hi everyone, I want to convert my CSV file to JSON with Python, using the code below:

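import csv
import json

# Column names, in the order the columns appear in the CSV file
fieldnames = ("id", "basicinformation", "workexperience", "education", "skill", "publication", "addtionalinformation", "link", "award", "certification")
# Read every row into a dict keyed by the field names above
with open('D:\\ResumesClassification\\test2.csv', 'r') as f:
    reader = csv.DictReader(f, fieldnames)
    out = json.dumps([row for row in reader])
# Write all rows out as a single JSON array
with open('D:\\ResumesClassification\\test2.json', 'w') as fo:
    fo.write(out)
print(out)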

When I run mongoimport on the result, it fails with "JSON decoder out of sync - data changing underfoot". This is what my .json file looks like:

[{"publication": "Structural basis for modulation of a G-protein-coupled receptor by allosteric drugs Nature,http://www.nature.com/nature/journal/v503/n7475/full/nature12595.htmlOctober 13, 2013The design of G-protein-coupled receptor (GPCR) allosteric modulators, an active area of modern pharmaceutical research, has proved challenging because neither the binding modes nor the molecular mechanisms of such drugs are known1, 2. Here we determine binding sites, bound conformations and specific drugreceptor interactions for several allosteric modulators of the M2 muscarinic acetylcholine receptor (M2 receptor), a prototypical family A GPCR, using atomic-level simulations in which the modulators spontaneously associate with the receptor. Despite substantial structural diversity, all modulators form cation interactions with clusters of aromatic residues in the receptor extracellular vestibule, approximately 15 from the classical, orthosteric ligand-binding site. We validate the observed modulator binding modes through radioligand binding experiments on receptor mutants designed, on the basis of our simulations, either to increase or to decrease modulator affinity. Simulations also revealed mechanisms that contribute to positive and negative allosteric modulation of classical ligand binding, including coupled conformational changes of the two binding sites and electrostatic interactions between ligands in these sites. These observations enabled the design of chemical modifications that substantially alter a modulators allosteric effects. 
Our findings thus provide a structural basis for the rational design of allosteric modulators targeting muscarinic and possibly other GPCRs.High-resolution crystal structure of human protease-activated receptor 1http://www.nature.com/nature/journal/v492/n7429/full/nature11701.htmlDecember 9, 2012Protease-activated receptor 1 (PAR1) is the prototypical member of a family of G-protein-coupled receptors that mediate cellular responses to thrombin and related proteases. Thrombin irreversibly activates PAR1 by cleaving the amino-terminal exodomain of the receptor, which exposes a tethered peptide ligand that binds the heptahelical bundle of the receptor to affect G-protein activation. Here we report the 2.2--resolution crystal structure of human PAR1 bound to vorapaxar, a PAR1 antagonist. The structure reveals an unusual mode of drug binding that explains how a small molecule binds virtually irreversibly to inhibit receptor activation by the tethered ligand of PAR1. In contrast to deep, solvent-exposed binding pockets observed in other peptide-activated G-protein-coupled receptors, the vorapaxar-binding pocket is superficial but has little surface exposed to the aqueous solvent. Protease-activated receptors are important targets for drug development. The structure reported here will aid the development of improved PAR1 antagonists and the discovery of antagonists to other members of this receptor family.Structure and dynamics of the M3 muscarinic acetylcholine receptorhttp://www.nature.com/nature/journal/v482/n7386/full/nature10867.htmlFebruary 22, 2012Acetylcholine, the first neurotransmitter to be identified, exerts many of its physiological actions via activation of a family of G-protein-coupled receptors (GPCRs) known as muscarinic acetylcholine receptors (mAChRs). Although the five mAChR subtypes (M1-M5) share a high degree of sequence homology, they show pronounced differences in G-protein coupling preference and the physiological responses they mediate. 
Unfortunately, despite decades of effort, no therapeutic agents endowed with clear mAChR subtype selectivity have been developed to exploit these differences. We describe here the structure of the G(q/11)-coupled M3 mAChR ('M3 receptor', from rat) bound to the bronchodilator drug tiotropium and identify the binding mode for this clinically important drug. This structure, together with that of the G(i/o)-coupled M2 receptor, offers possibilities for the design of mAChR subtype-selective ligands. Importantly, the M3 receptor structure allows a structural comparison between two members of a mammalian GPCR subfamily displaying different G-protein coupling selectivities. Furthermore, molecular dynamics simulations suggest that tiotropium binds transiently to an allosteric site en route to the binding pocket of both receptors. These simulations offer a structural view of an allosteric binding mode for an orthosteric GPCR ligand and provide additional opportunities for the design of ligands with different affinities or binding kinetics for different mAChR subtypes. 
Our findings not only offer insights into the structure and function of one of the most important GPCR families, but may also facilitate the design of improved therapeutics targeting these critical receptors.", "certification": "", "basicinformation": "Hillary GreenData Scientist - KnewtonNew York, NY-Authorized to work in the US for any employer", "award": "", "link": "", "workexperience": "Data ScientistKnewton - New York, NYNovember 2014 to PresentData Scientist - Embedded in Direct to Institution Team Use Big Data to advise the Direct to Institution team on how students and teachers are using the product, how to improve analytics for teachers and students, and how changes to the product might affect current users Create informative, beautiful, and dynamic data visualizations that convey key information about large data sets (javascript, d3, matplotlib) Create tools for processing, labeling, and understanding large, messy datasets (python, pandas, numpy) Write blog posts that highlight data insights to a general audience (https://www.knewton.com/blog/adaptive-learning/how-instructors-use-adaptive-assignments-in-the-classroom/https://www.knewton.com/resources/blog/adaptive-learning/visualizing-personalized-learning/https://www.knewton.com/resources/blog/adaptive-learning/friday-effect-students-really-worse/https://www.knewton.com/resources/blog/adaptive-learning/school-holidays-affect-student-scores/) Leadership: occasionally lead team stand-ups and sprint planning; advise other data scientists on best practices; review data science codeData Scientist - Efficacy Research Lead internal efficacy research efforts, including designing observational studies, performing complex data analysis, writing technical papers, and presenting results to both internal and external audiences Advise partner companies on efficacy research strategies, including study design and methodology and data analysis strategies Use Big Data to demonstrate the impact of adaptive learning 
technology Report research findings to internal and external audiences via papers, presentations, and visualizations(https://www.knewton.com/resources/blog/adaptive-learning/adaptive-advantage-reducing-performance-gap/)Scientific AssociateD.E. Shaw Research - New York, NYMay 2011 to November 2014Designed, performed and analyzed molecular dynamics simulations on the Anton supercomputer resulting in three publications in Nature Built tools to automate experimental workflows and report progress by email and text message using Python and shell scripting Collaborated with software engineering team to debug and suggest improvements for company-wide job-scheduler Analyzed experimental data using MATLAB and Python, including clustering (hierarchical and k-means), principal component analysis, and data visualization Studied allosteric modulation of GPCRs by drug-like molecules resulting in a completely in silico designed allosteric modulator of M2 muscarinic actylcholine receptor with novel properties Interpreted radio-ligand binding and electrophysiology data from collaborators' experiments Prepared research for publication in major journals (including manuscript writing and figure creation) Currently working on developing small-molecule inhibitors of voltage-gated potassium ion channels", "addtionalinformation": "Authored an article about life as a female scientist.http://lilith.org/blog/2014/05/for-all-you-aspiring-female-scientists-out-there/Authored a blog post on visualizing data from personalized learning (using Adobe Illustrator, iMovie, and d3 to create animations):https://www.knewton.com/resources/blog/adaptive-learning/visualizing-personalized-learning/Authored a series of blog posts on student performance at different times based on analysis of Big Data:https://www.knewton.com/resources/blog/adaptive-learning/friday-effect-students-really-worse/https://www.knewton.com/resources/blog/adaptive-learning/school-holidays-affect-student-scores/Authored a blog post on 
how procrastination affects student gradeshttp://www.knewton.com/blog/adaptive-learning/the-early-bird-gets-the-grade-how-procrastination-affects-student-scores/Authored a blog post about how instructors use adaptive assignments in the classroomhttp://www.knewton.com/blog/adaptive-learning/how-instructors-use-adaptive-assignments-in-the-classroom/Spoke about efficacy research at NYC Python Meetuphttp://www.meetup.com/nycpython/events/220735605/Spoke about efficacy research at PyGothamhttps://pygotham.org/2015/speakers/profile/359/", "skill": "Expert user in Python (Pandas, NumPy, SciPy, Matplotlib) (5 years), Expert user of SQL (PostgreSQL/MySQL/RedShift) (2 years), Proficient in bash shell scripting (7 years), Proficient in JavaScript (jQuery, d3) (1 year), Proficient in Unix/Linux & Windows environments (7 years), Familiar with High-Performance Computing (4 years), Expert user of Adobe Illustrator (5 years), Expert user in Maestro, VMD, PyMol (4 years), Proficient in MOE, InstantJChem (2 years), Expert user of Charmm/CGenFF force fields (4 years), Molecular Dynamics (including some Enhanced Sampling techniques, FEP) (7 years)", "education": "B.S. 
in Theoretical and Computational MaterialsUniversity of California, Berkeley - Berkeley, CAUniversity of California, Berkeley2006 to 2010", "id": "1"}, {"publication": "", "certification": "", "basicinformation": "Joseph DaoudNew York, NY-", "award": "", "link": "", "workexperience": "Data Scientist & Quantitative developerSocit Gnrale CIB - New York, NYSeptember 2013 to PresentNew York, USA  2013 - NowDesk Quantitative Developer & Data Scientist - Securitized Products, Exotic Credit Derivatives & Interest Rates Derivatives Designed, implemented and supported several trading pricing, monitoring and reporting tools:o System design and data architecture for financial data repository for data sets spanning multiple asset classes and geographies(data collection, cleaning and analytics)o Quantitative strategy: Negative Basis Trading (generated P&L of $3M in 2015): pattern recognition, opportunity triggero Monitoring: Non-Agency MBS products financing platform (Repo, TRS on Loans, Credit Facility on ABS, CMBS, RMBS, CLO, Loans) with limitsindicators and dynamic haircut computations CMBS Primary (CRE Loans, Hedge with CMBX & IR Swaps)o Big data: Data analysis and Machine Learning of Big Data (Fannie Mae & Freddie Mac MBS): Study of loan data analytics Statistical framework to analyze time series (generated P&L of $10M in 2015): detection of patterns, find relationships, gaininsights/200K+ time series of 15 years/multi-asset (Rates, Credit, FX & Indices)o Pricing: CRE Loans during warehousing period for CMBS primary market issuance Implied Spread of Markit ABX & CMBX indiceso Market Marking: Aggregation of BWICs, price talk and enrichment of bids with market data Deal Analyzero Risk & PnL system: Structured Products & Exotic Credit Derivativeso Contribution: Prices and other market data in several internal systemso Reports: Aging report, IPV report, Management risk reports, Technologies: .NET C# (+ WPF), Python, SQL. 
R, SAS, VBA, C++, Spark, Cassandra Methodologies: Agile Development, Continuous Delivery, Git, Jira, Build FactoryQuantitative DeveloperBanque de France - Paris (75)2012 to 2012on High Frequency Trading - Financial Economics Research Build and implemented a new, state of the art, low latency and high frequency trading simulation platform Analyzed the impact of the market making trading and the liquidity of the market Evaluated the consequences on the market functioning and dynamismFinancial Software Engineer - Global Equity DerivativesSocit Gnrale Corporate & Investment Banking - New York, NY2010 to 2011 Designed, implemented, optimized, and maintained several applications including:o Front-end database applicationo Real-time proprietary trade-reporting charts applicationo File-based scheduler for report processing to external regulators (FINRA, SEC)o Real-time and multi-threading application feeding equities database referenceo Seeder process which aims to retrieve data from an immense reference database Wrote technical and business requirements and technical specificationsGRTGaz (via Accenture) - Paris, FranceAlgorithmic Engineer - Customer Information System upgrade (50M project) Designed and developed data processing algorithm Generated test scenarios, test cases and test data. Executed tests, created problem reports Conducted various management activities by analyzing and verifying test results, providing status reports Worked with business analysts and developers to resolve issues", "addtionalinformation": "Programming SkillsProgramming  C#, Java, C++, VBA, Python, Shell ScriptingWeb  HTML, JavaScript, AngularJSDatabases  SQL (Microsoft SQL Server, MySQL), NoSQL (MongoDB, Cassandra)Mathematics  Matlab, R, SASBig Data  Spark, Hadoop", "skill": "Microsoft office, Python, C#, spark, machine learning, Hadoop, SQL, Sql Server, VBA", "education": "MSc. 
in Applied Mathematics & Quantitative FinanceUniversity of Paris 1 - Paris (75)University of Paris 12011 to 2012MSc. in Computer ScienceENSISA - FranceENSISA2007 to 2010", "id": "2"}, {"publication": "", "certification": "", "basicinformation": "Jason SypniewskiData Scientist - MetisClifton, NJ-Highly motivated Data Scientist with a Bachelors Degree in Computer Science and a Masters Degree in Information Systems. Versatile and reliable professional with a prior background in the government and defense sector. Proven leader with experience managing cross-functional teams in high paced environments. Creative thinker driven by data to solve real-world problems. Polished communicator with ability to effectively convey results to diverse audiences.Authorized to work in the US for any employer", "award": "Commander's Award for Civilian ServiceNovember 2015award description not metioned", "link": "http://jasonsyp.github.iohttps://www.linkedin.com/in/jasonsypniewski", "workexperience": "Data ScientistMetis - New York, NY2016 to PresentMetis is an immersive program focused on teaching end-to-end design, implementation and communication of data science projects. A Metis education covers topics in programming, statistics, data acquisition, machine learning, data visualization, relational and non-relational databases, natural language processing, and iterative design.Analyzed MTA turnstile data to recommend optimal locations and times for a non-profit org to deploy street team members.Built and optimized linear regression models to predict success of sports genre movies in terms of revenues and specific sport featured. Scraped and cleaned data across multiple sources for relevant movie data.Analysis of various supervised classification models for diagnosing heart disease using data from the Cleveland Clinic, Hungarian Institute of Cardiology, Swiss University Hospitals, and Long Beach V.A. 
Medical Center.Utilized unsupervised machine learning and natural language processing to perform topic modeling on Twitter data regarding sentiment towards the European Union.Supervisory Computer ScientistDepartment of the Army RDECOM CERDEC C4ISR Ground Activity - Lakehurst, NJ2008 to 2015Managed team of 25-30 engineers and scientists.Responsible for all project management tasks, reporting directly to the Deputy Director of the organization.Led requirements engineering and analysis according to specific Army C4ISR research and development (R&D) requirements.Led the design, execution and analysis of complex system of systems experiments including multi-tiered RF and satellite communications, intelligence, surveillance and reconnaissance (ISR), information technology (IT), TCP/IP wireless telecommunications, software integration, and mission command applications.Tracked project milestones according to cost, schedule, and performance metrics.Maintained authority for resolving technical issues, conducting analysis of alternatives and making engineering compromises where necessary.Executed all personnel decisions within the branch, including hiring, onboarding, disciplinary actions, training, and performance appraisals.Corresponded with senior management through written and oral communications.Developed briefing materials and presented to internal and external stakeholders across Department of Defense and industry, including briefing senior Army officials, military and civilian.Performed contract management and technical oversight on multmillion dollar support services contracts.Computer ScientistDepartment of the Army RDECOM CERDEC C2D - Fort Monmouth, NJ2001 to 2008Served as the Lead Systems Engineer responsible for the design, integration and testing of system of systems C4ISR architectures.Designed and executed experiments evaluating performance of Army computer systems, mission command applications, ground and airborne platforms, sensors, and tactical wireless 
radio systems.Executed tasks across the systems engineering lifecycle including software installation, configuration and maintenance, database development, network configuration and monitoring, training, digital terrain generation and mapping, creating and editing application scripts, and test plan development and execution.Served as organization's subject matter expert on Army mission command applications and information systems, writing documentation and giving presentations to internal and external stakeholders.", "addtionalinformation": "SKILLSPROGRAMMING LANGUAGES: Python, JavascriptMACHINE LEARNING: Supervised Learning, Unsupervised Learning, Linear Regression, Classification, Clustering, Natural Language ProcessingSTATISTICAL PACKAGES: scikitlearn, statsmodelsDATA ACQUISITION, STORAGE AND MANAGEMENT: PostgresSQL, Amazon Web ServicesWEB DESIGN: HTML, CSSDATA VISUALIZATION: D3.js, matplotlib, seaborn, CartoDBPROJECT MANAGEMENT: Requirements Analysis, Financial Analysis and Budgeting, Contract Management, Systems Engineering, Integration, Testing, Technical Writing,Oral Communications, Workforce Development, Scheduling", "skill": "skill not metioned", "education": "M.S. in Information SystemsNew Jersey Institute of TechnologyNew Jersey Institute of Technology2004B.S. in Computer ScienceNew Jersey Institute of TechnologyNew Jersey Institute of Technology2001", "id": "3"}, {"publication": "", "certification": "", "basicinformation": "Cong WuManager - Big Data Scientist - American Express-", "award": "", "link": "", "workexperience": "Manager - Big Data ScientistAmerican Express - New York, NYMarch 2015 to PresentPartner closely with Business Units to develop Big Data Use Cases to drive growth- Identify business owners who own multiple businesses and added in EPIN features to database of prospectbusinesses, help to target potential high-spend customer.- Build up a small business ecosystem. 
SBE(small business ecosystem) connects different business together usingtheir addresses, business owners and hierarchy information- Partnered with marketing team, build ATUL (acquisition targeting utilization link), an easy to use tool that canfacilitate marketing team to get necessary information from big data database of prospect businesses      Create Frameworks and automation tools to ensure focused approach and disciplined governance on corecapabilities investments around prospects and customers with a Big Data POA- On big data platform, built up connected component, network builder and network visualization, searchingcapability for prospect small businesses.- Partner with analytic team, build geo database capability POA, geo database can enable analytic and all other teams in Amex- Integrate analytic tools (Datameer, RevR ) into enterprise-wise centered big data platformData ScientistSocure - New York, NYAugust 2014 to December 2014      Building machine learning models of fraud predictions.      Performing ad hoc analytics using R, Python, SQL, MongoDB query and Unix utilities      Maintaining proprietary machine learning R library      Developing OFAC fuzzy match algorithm for online identity verification system.", "addtionalinformation": "SkillsProgramming: Python, R, Java, C++, C, Hive, MATLAB, SQL, d3.js, scala", "skill": "skill not metioned", "education": "Master of Arts in StatisticsColumbia University - New York, NYColumbia UniversitySeptember 2013 to December 2014Bachelor of Engineering in Software EngineeringSun Yat-sen University - Guangzhou, CNSun Yat-sen UniversitySeptember 2009 to June 2013", "id": "4"}] 

Can anyone help? Thanks.

Can you show the corresponding code and the error message? – alpert

The corresponding code is above; the error message is "JSON decoder out of sync - data changing underfoot". –

Check: http://stackoverflow.com/questions/30380751/importing-json-from-file-into-mongodb-using-mongoimport. Your problem seems to be the same one. – alpert

Answer

I wonder why you are using DictReader. Instead, you can do this:

  1. Read the CSV file line by line.
  2. Split each line on tabs or commas.
  3. Create a JSON object from the resulting list.
  4. Write each JSON object to your JSON file.
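
The steps above could be sketched roughly as follows. This is only an illustration under a few assumptions: the helper name `csv_to_json_lines` and the sample rows are made up, and `csv.reader` is swapped in for a bare `line.split(',')` because several fields in the question's data contain commas themselves. Each row is written as its own JSON object on its own line instead of as one big array.

```python
import csv
import io
import json

# Column names copied from the question's code.
FIELDNAMES = ("id", "basicinformation", "workexperience", "education",
              "skill", "publication", "addtionalinformation", "link",
              "award", "certification")


def csv_to_json_lines(src, dst, fieldnames=FIELDNAMES):
    """Hypothetical helper: copy CSV rows from src to dst, one JSON object per line.

    Returns the number of rows written.
    """
    count = 0
    for values in csv.reader(src):              # steps 1-2: read a row and split it
        record = dict(zip(fieldnames, values))  # step 3: build the object
        dst.write(json.dumps(record) + "\n")    # step 4: write one object per line
        count += 1
    return count


if __name__ == "__main__":
    # Made-up sample rows; the quoted field contains a comma on purpose.
    sample = io.StringIO('1,"Jane Doe, Data Scientist",Acme\n2,John Roe,Initech\n')
    out = io.StringIO()
    csv_to_json_lines(sample, out,
                      fieldnames=("id", "basicinformation", "workexperience"))
    print(out.getvalue())
```

With real files, `src` and `dst` would be the objects returned by `open(...)` on the CSV and JSON paths. If you would rather keep the single-array file produced by the original code, mongoimport's `--jsonArray` option is meant for input in that shape.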