twitter4j since id is not getting updated

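I want to fetch tweets from the Twitter search API using since_id. Below is my code; in it I create a map from each query string to its since id. I default the since id to 0, and my goal is to update the since id every time I run the query, so that the next run does not return the same tweets and instead starts after the last tweet.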

import java.io.{PrintWriter, StringWriter} 
import java.util.Properties 
import com.google.common.io.Resources 
import twitter4j._ 
import scala.collection.JavaConversions._ 
// reference: http://bcomposes.com/2013/02/09/using-twitter4j-with-scala-to-access-streaming-tweets/ 
object Util { 
    val props = Resources.getResource("twitter4j.props").openStream() 
    val properties = new Properties() 
    properties.load(props) 

    val config = new twitter4j.conf.ConfigurationBuilder() 
     .setDebugEnabled(properties.getProperty("debug").toBoolean) 
     .setOAuthConsumerKey(properties.getProperty("consumerKey")) 
     .setOAuthConsumerSecret(properties.getProperty("consumerSecret")) 
     .setOAuthAccessToken(properties.getProperty("accessToken")) 
     .setOAuthAccessTokenSecret(properties.getProperty("accessTokenSecret")) 
    val tempKeys =List("Yahoo","Bloomberg","Messi", "JPM Chase","Facebook") 
    val sinceIDmap : scala.collection.mutable.Map[String, Long] = collection.mutable.Map(tempKeys map { ix => s"$ix" -> 0.toLong } : _*) 
    //val tweetsMap: scala.collection.mutable.Map[String, String] 
    val configBuild = (config.build()) 
    val MAX_TWEET=100 
    getTweets() 

    def getTweets(): Unit ={ 
     sinceIDmap.keys.foreach((TickerId) => getTweets(TickerId)) 
    } 

    def getTweets(TickerId: String): scala.collection.mutable.Map[String, scala.collection.mutable.Buffer[String]] = { 
     println("Search key is:"+TickerId) 
     var tweets = scala.collection.mutable.Map[String, scala.collection.mutable.Buffer[String]]() 
     try { 
      val twitter: Twitter = new TwitterFactory(configBuild).getInstance 
      val query = new Query(TickerId) 
      query.setSinceId(sinceIDmap.get(TickerId).get) 
      query.setLang("en") 
      query.setCount(MAX_TWEET) 
      val result = twitter.search(query) 
      tweets += (TickerId -> result.getTweets().map(_.getText)) 

      //sinceIDmap(TickerId)=result.getSinceId 
      println("-----------Since id is :"+result.getSinceId) 
      //println(tweets) 
     } 
     catch { 
      case te: TwitterException => 
       println("Failed to search tweets: " + te.getMessage) 
     } 
     tweets 
    } 
} 

object StatusStreamer { 
    def main(args: Array[String]) { 
     Util 
    } 
}

Output:

Search key is:Yahoo  
log4j:WARN No appenders could be found for logger (twitter4j.HttpClientImpl). 
log4j:WARN Please initialize the log4j system properly. 
-----------Since id is :0 
Search key is:JPM Chase 
-----------Since id is :0 
Search key is:Facebook 
-----------Since id is :0 
Search key is:Bloomberg 
-----------Since id is :0 
Search key is:Messi 
-----------Since id is :0

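The problem is that when I run it, the since id printed after each query is the same value I set initially. Can someone point out what I am doing wrong here? Or, if my approach is wrong, can someone share another approach that works?

Thanks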

2016-07-21 Explorer

From a quick glance at your code, it seems you never update the value in sinceIDmap. The following line is commented out:

//sinceIDmap(TickerId)=result.getSinceId

So the since_id for each keyword is never updated from 0.

If you run into problems beyond this, you may want to check the Twitter4J SearchTweets example on GitHub.

2016-07-28 07:51:24 Jonathan

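Hi Jonathan, thanks for the reply. I commented out that line because I only wanted to print the result. I checked the example, and it does not reference since_id or max_id anywhere. – Explorer

IIRC they are on the [`Query` object](https://github.com/yusuke/twitter4j/blob/master/twitter4j-examples/src/main/java/twitter4j/examples/search/SearchTweets.java#L48) obtained from the result. – Jonathan

The Twitter API returns since_id as the value originally requested by the query. That means QueryResult.getSinceId is the same as what you put into the Query.

The simplest solution is to set the next sinceId to the largest tweet ID in the response.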

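// Skip the update when the response is empty, because max on an empty collection throws.
if (!result.getTweets().isEmpty) sinceIDmap(TickerId) = result.getTweets().map(_.getId).max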
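
The same update rule can be sketched in Java without any Twitter dependency. This is only an illustration: `SinceIdTracker` and `nextSinceId` are hypothetical names, and the tweet IDs are passed in as plain `Long` values rather than read from a real response.

```java
import java.util.List;

/** Hypothetical helper: computes the since_id to send with the next search. */
final class SinceIdTracker {
    private SinceIdTracker() {}

    /**
     * Returns the largest tweet ID seen in this response, or the current
     * since_id unchanged when the response contained no newer tweets.
     */
    static long nextSinceId(long currentSinceId, List<Long> tweetIds) {
        long next = currentSinceId;
        for (long id : tweetIds) {
            if (id > next) {
                next = id;
            }
        }
        return next;
    }
}
```

With Twitter4J, the IDs would come from calling `Status.getId()` on each element of `QueryResult.getTweets()`. Keeping the old value on an empty response means a quiet period does not reset the position to 0.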

More generally, to page through results smoothly, you can combine the since_id and max_id query parameters. The official Twitter guide explains well how to use them.

2016-07-29 11:21:06

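Thanks, Nazarii, for your explanation. So if I use `val id = QResult.getTweets.max.getId` and set this id as the since id for later iterations with `query.setSinceId(id)`, will it work? – Explorer

You should try it and see whether it behaves as you expect. At the least it gives `since_id` a new value, so the search results will be more accurate. In any case, read the guide to work out what you need: https://dev.twitter.com/rest/public/timelines –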

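First of all, as far as I can tell from the initial description of your approach, the way you use since_id is incorrect. I made the same mistake in the past and could not get it to work. Moreover, your approach is not consistent with the official Working with Timelines guide. The official guidelines worked for me, and I suggest you follow them.

Long story short, you cannot use since_id alone to move through a timeline of tweets (in your case, the timeline returned by GET search/tweets). You definitely need max_id to do what you described. In fact, I believe since_id has a completely secondary, optional function (one that can be implemented in your own code as well). The API docs led me to believe that I could use since_id exactly as I can use max_id, but I was wrong. Specifying only since_id, I noticed that the returned tweets were very fresh, as if since_id had been completely ignored. Here is another question that demonstrates this unexpected behavior.

As I see it, since_id is used only for pruning, not for moving through the timeline. Using since_id alone will get you the latest tweets, but limit the tweets returned to those with an ID greater than since_id. That is not what you want. The last piece of evidence, taken from the official guidelines, is this graphical representation of a specific request: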

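Not only does since_id not move you through the timeline, it also happens to be completely useless in this specific request. It will not be useless in the next request, however, since it will prune Tweet 10 (and anything before it). But the fact remains that since_id does not move you through the timeline.

In general, you need to think from the latest tweets towards the oldest, not the other way around. To move from the latest tweets towards the oldest, you need to specify max_id in your request as the inclusive upper bound for the IDs of the tweets to be returned, and to update this parameter between consecutive requests.

The presence of max_id in a request sets an inclusive upper bound on the IDs of the tweets to be returned. From the returned tweets, you take the smallest ID that appears and use it as the value of max_id in the subsequent request. (You can decrement the smallest ID by 1 and use that value as the next max_id; because max_id is inclusive, this keeps you from receiving the oldest tweet of the previous request again.) The first request should not specify max_id, so that the latest tweets are returned. With this approach, every request after the first takes you one step deeper into the past.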

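since_id comes in handy when you need to limit your trip into the past. Imagine that at some point in time, t0, you start searching for tweets. Assume that the largest tweet ID from your first search is id0. After the first search, the tweet IDs in all subsequent searches get smaller and smaller, since you are going back in time. After a while you will have retrieved about a week's worth of tweets, and your searches will return nothing more. At that point, t1, you know that this trip into the past is over. However, more tweets will have been posted between t0 and t1. So another trip into the past should start at t1 and continue until you reach the tweet with ID id0 (which was tweeted before t0). This trip can be bounded by using id0 as the since_id in its requests, and so on. Alternatively, you can avoid since_id if you make sure your trip ends once you see a tweet with an ID less than or equal to id0 (keep in mind that tweets can be deleted). But I suggest you use since_id, for convenience and efficiency.

Keep in mind that since_id is exclusive, whereas max_id is inclusive. Please refer to Working with Timelines. You will notice that the section "The max_id parameter" comes first and the section "Using since_id for the greatest efficiency" comes later. The title of the latter section indicates that since_id is not used for moving through the timeline.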
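
The bounded trip described above can be sketched as a small simulation. Everything here is an assumption made for illustration: `TimelineWalk`, `search`, and `collectNewerThan` are hypothetical names, the "API" is an in-memory list of IDs sorted newest first, and `Long.MAX_VALUE` stands in for "no max_id".

```java
import java.util.ArrayList;
import java.util.List;

/** Simulation only: an in-memory stand-in for GET search/tweets. */
final class TimelineWalk {
    private TimelineWalk() {}

    /** Returns up to count IDs, newest first, with sinceId < id <= maxId. */
    static List<Long> search(List<Long> newestFirst, long sinceId, long maxId, int count) {
        List<Long> page = new ArrayList<>();
        for (long id : newestFirst) {
            boolean aboveSinceId = id > sinceId; // since_id is exclusive
            boolean withinMaxId = id <= maxId;   // max_id is inclusive
            if (aboveSinceId && withinMaxId && page.size() < count) {
                page.add(id);
            }
        }
        return page;
    }

    /** Walks from the newest tweet back to, but excluding, sinceId. */
    static List<Long> collectNewerThan(List<Long> newestFirst, long sinceId, int count) {
        List<Long> collected = new ArrayList<>();
        long maxId = Long.MAX_VALUE; // first request: no upper bound
        while (true) {
            List<Long> page = search(newestFirst, sinceId, maxId, count);
            if (page.isEmpty()) {
                return collected; // nothing left above sinceId: the trip is over
            }
            collected.addAll(page);
            // Pages are newest first, so the last element is the smallest ID.
            // Subtract 1 because max_id is inclusive.
            maxId = page.get(page.size() - 1) - 1;
        }
    }
}
```

For the next trip, the caller would pass the largest ID collected in this one as `sinceId`, which plays the role of id0 in the description above.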

A rough, untested example in Java using Twitter4J, which prints tweets starting from the latest and going back into the past, follows:

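import java.util.List;
import twitter4j.Query;
import twitter4j.QueryResult;
import twitter4j.Status;
import twitter4j.Twitter;
import twitter4j.TwitterException;

// Make sure this is initialized correctly. 
Twitter twitter; 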

/** 
* Searches and prints tweets starting from now and going back to the past. 
* 
* @param q 
*   the search query, e.g. "#yolo" 
*/ 
private void searchAndPrintTweets(String q) throws TwitterException { 
    // `max_id` needed by `GET search/tweets`. If it is 0 (first iteration), 
    // it will not be used for the query. 
    long maxId = 0; 
    // Let us assume that it will run forever. 
    while (true) { 
     Query query = new Query(q); 
     query.setCount(100); 
     query.setLang("en"); 
     // Set `max_id` as an inclusive upper limit, unless this is the 
     // first iteration. If this is the first iteration (maxId == 0), the 
     // freshest/latest tweets will come. 
     if (maxId != 0) 
      query.setMaxId(maxId); 
     QueryResult qr = twitter.search(query); 
     printTweets(qr.getTweets()); 
     // For next iteration. Decrement smallest ID by 1, so that we will 
     // not get the oldest tweet of this iteration in the next iteration 
     // as well, since `max_id` is inclusive. 
     maxId = calculateSmallestId(qr.getTweets()) - 1; 
    } 
} 

/** 
* Calculates the smallest ID among a list of tweets. 
* 
* @param tweets 
*   the list of tweets 
* @return the smallest ID 
*/ 
private long calculateSmallestId(List<Status> tweets) { 
    long smallestId = Long.MAX_VALUE; 
    for (Status tweet : tweets) { 
     if (tweet.getId() < smallestId) 
      smallestId = tweet.getId(); 
    } 
    return smallestId; 
} 

/** 
* Prints the content of the tweets. 
* 
* @param tweets 
*   the tweets 
*/ 
private void printTweets(List<Status> tweets) { 
    for (Status tweet : tweets) { 
     System.out.println(tweet.getText()); 
    } 
}

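There is no error handling, no checking of special conditions (such as an empty list of tweets in the query result), and no use of since_id, but it should get you started.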

2016-07-29 11:30:38 xnakos

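Hi xnakos, thanks for your reply. Do you have a working example of the concept you described above? – Explorer

@Novice Would you be fine with a Java example using twitter4j? – xnakos

@Novice Added a Twitter4J/Java example. It is untested, but it should more or less work. – xnakos

since_id and max_id are both very simple parameters that you can use to limit what you get back from the API. From the docs:

since_id - Returns results with an ID greater than (that is, more recent than) the specified ID. There are limits to the number of Tweets which can be accessed through the API. If the limit of Tweets has occurred since the since_id, the since_id will be forced to the oldest ID available.

max_id - Returns results with an ID less than (that is, older than) or equal to the specified ID.

So, if you have a given tweet ID, you can search for older or newer tweets by using these two parameters.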

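count is simpler still - it specifies the maximum number of tweets to retrieve, up to 200 for user_timeline (the search API caps count at 100).

Unfortunately, the API will not give you back exactly what you want - you cannot specify a date/time when querying user_timeline - although you can specify one when using the search API. In any case, if you need to use user_timeline, you will have to poll the API, collect the tweets, determine whether they match your desired parameters, and then compute your statistics accordingly.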
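
The "collect, then filter" step could look roughly like the sketch below. It is a minimal illustration under stated assumptions: `TimelineFilter`, its nested `Tweet` class, and `within` are hypothetical, and the tweets are plain in-memory objects rather than results from a real API call.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical client-side date filter for tweets that were already fetched. */
final class TimelineFilter {
    private TimelineFilter() {}

    /** Minimal stand-in for a tweet: an ID and a creation time. */
    static final class Tweet {
        final long id;
        final Instant createdAt;

        Tweet(long id, Instant createdAt) {
            this.id = id;
            this.createdAt = createdAt;
        }
    }

    /** Keeps the tweets whose creation time satisfies from <= createdAt < to. */
    static List<Tweet> within(List<Tweet> tweets, Instant from, Instant to) {
        List<Tweet> kept = new ArrayList<>();
        for (Tweet tweet : tweets) {
            boolean notTooOld = !tweet.createdAt.isBefore(from);
            boolean notTooNew = tweet.createdAt.isBefore(to);
            if (notTooOld && notTooNew) {
                kept.add(tweet);
            }
        }
        return kept;
    }
}
```

With Twitter4J, the timestamp would come from `Status.getCreatedAt()`, which returns a `java.util.Date` that can be converted with `toInstant()`. The half-open interval means that adjacent time windows never count the same tweet twice.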

2016-07-30 12:35:41
