Java regex: trying to split a string

Hi, I'm trying to split this string (it's long):

Library Catalogue Log off | Borrower record | Course Reading | Collections | A-Z E-Journal list | ILL Request | Help   Browse | Search | Results List | Previous Searches | My e-Shelf | Self-Issue | Feedback       Selected records:  View Selected  |  Save/Mail  |  Create Subset  |  Add to My e-Shelf  |        Whole set:  Select All  |  Deselect  |  Rank  |  Refine  |  Filter   Records 1 - 15 of 101005 (maximum display and sort is 2500 records)         1 Drower, E. S. (Ethel Stefana), Lady, b. 1879. Lady E.S. Drower’s scholarly correspondence : an intrepid English autodidact in Iraq / edited by 2012. BK Book University Library(1/ 0) 2 Kowalski, Robin M. Cyberbullying : bullying in the digital age / Robin M. Kowalski, Susan P. Limber, Patricia W. Ag 2012. BK Book University Library(1/ 0) ... 15 Ambrose, Gavin. Approach and language [electronic resource] / Gavin Ambrose, Nigel Aono-Billson. 2011. BK Book

So that I get back either:

1 Drower, E. S. (Ethel Stefana), Lady, b. 1879. Lady E.S. Drower’s scholarly correspondence : an intrepid English autodidact in Iraq/edited by 2012. BK Book University Library(1/ 0) 

// Or 

1 Drower, E. S. (Ethel Stefana), Lady, b. 1879. Lady E.S. Drower’s scholarly correspondence : an intrepid English autodidact in Iraq

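This is just an example, and "1 Drower, E. S. ..." won't be static. The details between 1 and 2 differ with every input, but the overall layout of the string is always the same.

I have: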

String top = ".*   (.*)";
String bottom = "\\(\\d/ \\d\\)\\W*";
Pattern p = Pattern.compile(top); //+bottom
Matcher matcher = p.matcher(td); //td is the input String
String items = matcher.group();
System.out.println(items);

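When I run it with top, which is meant to strip off all the headers, all I get back is No match found. bottom is my attempt at splitting the rest of the string.

If needed, I can post the whole input up to number 15. What I need is to split the input string so that I can process each of the 15 results individually.

Thanks for your help!

Source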

2012-03-14 Tbuermann

This will give you both outputs. Is this what you want?

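String text = "Library Catalogue Log off ..."; // truncated text

// Group 1 runs from "1 Drower" to the closing "0)"; group 2 stops at "Iraq".
Pattern p = Pattern.compile("((1 Drower.+Iraq).+0\\)).+2 Kowalski");
Matcher m = p.matcher(text);
if (m.find()) {
    System.out.println(m.group(1)); // first record including "University Library(1/ 0)"
    System.out.println(m.group(2)); // first record up to "Iraq"
}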


Source

2012-03-14 20:01:33 JMelnik

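In a way, yes. But the thing is, the input is not static and will change depending on the search results. Sorry, I should have mentioned that. The layout of the input string won't change, though. The number 1 is just the first search result, and it goes up to 15 results. If needed, I can post the whole input up to number 15. – Tbuermann 2012-03-14 20:14:34

So you need to split out all the search results, as far as I understand? – JMelnik 2012-03-14 20:25:18

Yes, that's correct. For example, [1 Drower, E. S. ..] should be one String, [2 Kowalski, Robin M. ..] the next String, and so on up to [15 Ambrose, Gavin. ..]. The input varies with the search results, but the layout of the input string will always be the same. So 1, 2, 3 .. 15 will always be there, unless there are fewer than 15 results. – Tbuermann 2012-03-14 20:28:18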

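First, you need to separate the header from the result data. Assuming there will always be a block of 9 whitespace characters between them, you can use: .*\s{9}(.*)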
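As a minimal sketch of this first step: the class name HeaderStrip, the method stripHeader and the shortened sample input below are made up for illustration, and the 9-whitespace gap is the assumption stated above. Note that Matcher.group() throws an IllegalStateException ("No match found") unless matches(), find() or lookingAt() has succeeded first, which is consistent with the error described in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeaderStrip {
    // Assumption: the header is followed by a run of exactly 9 whitespace characters.
    private static final Pattern HEADER = Pattern.compile(".*\\s{9}(.*)");

    /** Returns the text after the 9-whitespace gap, or null if the input has no such gap. */
    static String stripHeader(String td) {
        Matcher m = HEADER.matcher(td);
        // A match must be attempted (and succeed) before group() may be called.
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String gap = String.format("%9s", ""); // nine spaces
        // Hypothetical, shortened input in the same layout as the question's string.
        String td = "Library Catalogue Log off | Help   Browse | Search" + gap
                + "1 Drower, E. S. ... 2012. BK Book University Library(1/ 0) 2 Kowalski ...";
        System.out.println(stripHeader(td));
    }
}
```

Since . does not match line terminators by default, this sketch assumes the whole input is on a single line, as in the question.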

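Next, you need to parse the data into rows, which is harder because there is no row delimiter. The best assumption you can make is that rows are separated by a whitespace character, one or more digits, and then another whitespace character.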

((?<=(?:^|\s))\d+\s.*?(?=(?:$|\s\d+\s)))

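If you're planning to go on and parse the records into fields, don't bother unless you can change the delimiters!

A short explanation of what each part does:

(?<=(?:^|\s)) Lookbehind: ensures that the character before the group is either the start of the string (for the 1st record) or a whitespace character (for all other records).

\d+\s.*? Capture group: one or more digits followed by a whitespace character, then the text. Because the assertions use non-capturing groups (?:), this is the only part of the expression that shows up in the output.

(?=(?:$|\s\d+\s)) Lookahead: ensures that what follows the group is either the end of the string ($) or a whitespace character, then one or more digits, then another whitespace character (indicating the next record).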
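The expression explained above could be applied with a find() loop, roughly as follows. This is only a sketch: RecordSplitter, splitRecords and the shortened sample records are hypothetical, and the input is assumed to have had its header removed already.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RecordSplitter {
    // The expression from the answer, with backslashes doubled for a Java string literal.
    private static final Pattern RECORD =
            Pattern.compile("((?<=(?:^|\\s))\\d+\\s.*?(?=(?:$|\\s\\d+\\s)))");

    /** Collects every "number, whitespace, text" record found in the header-free input. */
    static List<String> splitRecords(String records) {
        List<String> result = new ArrayList<>();
        Matcher m = RECORD.matcher(records);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical, shortened records in the same layout as the question's input.
        String records = "1 Drower, E. S. Lady 2012. BK Book University Library(1/ 0) "
                + "2 Kowalski, Robin M. Cyberbullying 2012. BK Book University Library(1/ 0) "
                + "15 Ambrose, Gavin. Approach and language 2011. BK Book";
        for (String record : splitRecords(records)) {
            System.out.println(record);
        }
    }
}
```

Values such as "2012." and "(1/ 0)" are not treated as record boundaries here because the digits are not surrounded by whitespace on both sides.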

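This approach works for the sample you provided, but it will break if a record itself contains the delimiter pattern (e.g. a book titled "My 10 Favourite Things"). There are other, somewhat safer ways of parsing the records, but if that is what you want to do, it goes beyond what regular expressions should be expected to handle...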
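One possible "somewhat safer" variant, which is not taken from the answer itself, is to look only for the next expected record number instead of any number. The sketch below assumes that records are numbered consecutively starting at 1 and that the header has already been stripped; SequentialSplitter and splitSequential are hypothetical names. It still misfires if a record contains exactly the next number surrounded by spaces, so it narrows the problem rather than solving it.

```java
import java.util.ArrayList;
import java.util.List;

class SequentialSplitter {
    /**
     * Splits header-free input on " <n+1> " markers only.
     * Assumption: the input starts with record 1 and numbers increase by one.
     */
    static List<String> splitSequential(String records) {
        List<String> result = new ArrayList<>();
        int start = 0;   // index where the current record begins
        int number = 1;  // number of the current record
        while (true) {
            String marker = " " + (number + 1) + " ";
            int next = records.indexOf(marker, start);
            if (next < 0) {
                // No further marker: the rest of the input is the last record.
                result.add(records.substring(start));
                return result;
            }
            result.add(records.substring(start, next));
            start = next + 1; // skip the separating space
            number++;
        }
    }

    public static void main(String[] args) {
        // Hypothetical records; the first title contains " 10 ", which is not the next number.
        String records = "1 Smith, A. My 10 Favourite Things 2012. BK Book "
                + "2 Jones, B. Other title 2011. BK Book";
        for (String record : splitSequential(records)) {
            System.out.println(record);
        }
    }
}
```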

Source

2012-06-14 14:31:29 KidTempo
