图样:/^@([^{]+)\{([^,]+),\s*$|^\s*([^\[email protected]=]+) = \{(.*?)}/ms
(Demo)
这种模式有两个替代方案;每个包含两个捕获组。
type
和unique_name
被捕获并存储在元件[1]
和[2]
。
- 所有其他键 - 值对存储在元素
[3]
和[4]
。
此分离的阵列存储允许可靠的加工构建所期望的输出阵列结构时。
输入:
$bibtex='@BOOK{ko,
title = {Wissenschaftlich schreiben leicht gemacht},
publisher = {Haupt},
year = {2011},
author = {Kornmeier, M.},
number = {3154},
series = {UTB},
address = {Bern},
edition = {4},
subtitle = {für Bachelor, Master und Dissertation}
}
@BOOK{nial,
title = {Wissenschaftliche Arbeiten schreiben mit Word 2010},
publisher = {Addison Wesley},
year = {2011},
author = {Nicol, N. and Albrecht, R.},
address = {München},
edition = {7}
}
@ARTICLE{shome,
author = {Scholz, S. and Menzl, S.},
title = {Alle Wege führen nach Rom},
journal = {Medizin Produkte Journal},
year = {2011},
volume = {18},
pages = {243-254},
subtitle = {ein Vergleich der regulatorischen Anforderungen und Medizinprodukte
in Europa und den USA},
issue = {4}
}
@INBOOK{shu,
author = {Schulz, C.},
title = {Corporate Finance für den Mittelstand},
booktitle = {Praxishandbuch Firmenkundengeschäft},
year = {2010},
editor = {Hilse, J. and Netzel, W and Simmert, D.B.},
booksubtitle = {Geschäftsfelder Risikomanagement Marketing},
publisher = {Gabler},
pages = {97-107},
location = {Wiesbaden}
}';
方法:(Demo)
$pattern='/^@([^{]+)\{([^,]+),\s*$|^\s*([^\[email protected]=]+) = \{(.*?)}/ms';
if(preg_match_all($pattern,$bibtex,$out,PREG_SET_ORDER)){
foreach($out as $line){
if(isset($line[1])){
if(!isset($line[3])){ // this is the starting line of a new set
if(isset($temp)){
$result[]=$temp; // send $temp data to permanent storage
}
$temp=['type'=>$line[1],'unique_name'=>$line[2]]; // declare fresh new $temp
}else{
$temp[$line[3]]=$line[4]; // continue to store the $temp data
}
}
}
$result[]=$temp; // store the final $temp data
}
var_export($result);
输出:
array (
0 =>
array (
'type' => 'BOOK',
'unique_name' => 'ko',
'title' => 'Wissenschaftlich schreiben leicht gemacht',
'publisher' => 'Haupt',
'year' => '2011',
'author' => 'Kornmeier, M.',
'number' => '3154',
'series' => 'UTB',
'address' => 'Bern',
'edition' => '4',
'subtitle' => 'für Bachelor, Master und Dissertation',
),
1 =>
array (
'type' => 'BOOK',
'unique_name' => 'nial',
'title' => 'Wissenschaftliche Arbeiten schreiben mit Word 2010',
'publisher' => 'Addison Wesley',
'year' => '2011',
'author' => 'Nicol, N. and Albrecht, R.',
'address' => 'München',
'edition' => '7',
),
2 =>
array (
'type' => 'ARTICLE',
'unique_name' => 'shome',
'author' => 'Scholz, S. and Menzl, S.',
'title' => 'Alle Wege führen nach Rom',
'journal' => 'Medizin Produkte Journal',
'year' => '2011',
'volume' => '18',
'pages' => '243-254',
'subtitle' => 'ein Vergleich der regulatorischen Anforderungen und Medizinprodukte
in Europa und den USA',
'issue' => '4',
),
3 =>
array (
'type' => 'INBOOK',
'unique_name' => 'shu',
'author' => 'Schulz, C.',
'title' => 'Corporate Finance für den Mittelstand',
'booktitle' => 'Praxishandbuch Firmenkundengeschäft',
'year' => '2010',
'editor' => 'Hilse, J. and Netzel, W and Simmert, D.B.',
'booksubtitle' => 'Geschäftsfelder Risikomanagement Marketing',
'publisher' => 'Gabler',
'pages' => '97-107',
'location' => 'Wiesbaden',
),
)
这里是the site我提取新的采样输入的字符串从。
@renanbr [renanbr](https://stackoverflow.com/users/5249251/renanbr)推荐:renanbr/bibtex-parser https://github.com/renanbr/bibtex-parser(我认为是他自己的发明)。 – mickmackusa 2017-12-12 03:15:08
我已经看到的所有BibTex文档都将年份值包裹在大括号中。这是发布时的错字吗? – mickmackusa 2017-12-12 13:54:05