2015-11-06 36 views
1

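Error creating a graph in GraphX from edge/vertex input files

I get an error when I run the code below to create a graph in Spark GraphX. I run it through the Spark shell with the following command: ./bin/spark-shell -i ex.scala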

Input:

My Vertex File looks like this (each line is a vertex of strings): 
word1,word2,word3 
word1,word2,word3 
... 
My Edge File looks like this: (edge from vertex 1 to vertex 2) 
1,2 
1,3 

Code:

// Creating Vertex RDD (Input file has 300+ records with each record having list of strings separated by delimiter (,). 
//zipWithIndex done to get an index number for all the entries - basically numbering rows 
val vRDD: RDD[(VertexId, Array[String])] = (vfile.map(line => line.split(","))).zipWithIndex().map(line => (line._2, line._1)) 

// Creating Edge RDD using input file 
//val eRDD: RDD[Edge[Array[String]]] = (efile.map(line => line.split(","))) 

val eRDD: RDD[(VertexId, VertexId)] = efile.map(line => line.split(",")) 

// Graph creation 
val graph = Graph(vRDD, eRDD) 

Error:
<console>:52: error: type mismatch; 
found : Array[String] 
required: org.apache.spark.graphx.Edge[Array[String]] 
      val eRDD: RDD[Edge[Array[String]]] = (efile.map(line => line.split(","))) 

<console>:57: error: type mismatch; 
found : org.apache.spark.rdd.RDD[(org.apache.spark.graphx.VertexId, org.apache.spark.graphx.VertexId)] 
required: org.apache.spark.rdd.RDD[org.apache.spark.graphx.Edge[?]] 
Error occurred in an application involving default arguments. 
     val graph = Graph(vRDD, eRDD) 
+0

Did you build your file? It is complaining about the line 'val eRDD: RDD[Edge[Array[String]]] = (efile.map(line => line.split(",")))', which is already commented out in the code above...

+0

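But apart from that, your edge RDD needs to be of type 'RDD[Edge]', not a tuple of 'VertexId's (and BTW a 'VertexId' is a 'Long', not a 'String'). You should read the documentation: http://spark.apache.org/docs/latest/graphx-programming-guide.html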

Answers

0

Based on the example you gave, I created two files, one for the vertices and one for the edges:

import org.apache.spark.graphx._ // Graph, Edge, VertexId
import org.apache.spark.rdd.RDD

val vfile = sc.textFile("vertices.txt")
val efile = sc.textFile("edges.txt")

Then create your vertex and edge RDDs:

// zipWithIndex numbers the rows starting at 0; swap turns (row, index) into
// (index, row), replacing the manual map(line => (line._2, line._1)).
val vRDD: RDD[(VertexId, Array[String])] =
  vfile.map(line => line.split(",")).zipWithIndex().map(_.swap)

// Creating Edge RDD using input file.
// No edge attribute is passed, so attr keeps Edge's default value (null).
val eRDD: RDD[Edge[(VertexId, VertexId)]] = efile.map(line => {
  line.split(",", 2) match {
    case Array(n1, n2) => Edge(n1.trim.toLong, n2.trim.toLong)
  }
})

Once you have created your vertex and edge RDDs, you can create your graph:

val graph = Graph(vRDD, eRDD) 
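
One detail worth checking before relying on the graph: zipWithIndex numbers rows from 0, while the sample edge file starts at 1. The sketch below is meant for spark-shell (where sc is predefined) and assumes the edge IDs are 1-based line numbers of the vertex file, which the question does not actually state. The in-memory sample data and the Int edge attribute of 0 are placeholders for illustration.

```scala
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

// Hypothetical in-memory stand-ins for vertices.txt and edges.txt.
val vertexLines = sc.parallelize(Seq("a,b,c", "d,e,f", "g,h,i"))
val edgeLines = sc.parallelize(Seq("1,2", "1,3"))

// zipWithIndex starts at 0; shift by one so the first row gets ID 1
// (assumption: the edge file refers to 1-based row numbers).
val shiftedVertices: RDD[(VertexId, Array[String])] =
  vertexLines.map(_.split(",")).zipWithIndex().map { case (words, idx) => (idx + 1, words) }

// Parse "src,dst" into an Edge with a placeholder Int attribute.
val parsedEdges: RDD[Edge[Int]] = edgeLines.map { line =>
  val Array(src, dst) = line.split(",", 2)
  Edge(src.trim.toLong, dst.trim.toLong, 0)
}

val checkedGraph = Graph(shiftedVertices, parsedEdges)

// Print every edge together with the attributes of both endpoints.
checkedGraph.triplets.collect().foreach { t =>
  println(s"${t.srcId} [${t.srcAttr.mkString(",")}] -> ${t.dstId} [${t.dstAttr.mkString(",")}]")
}
```

If the IDs were left 0-based, Graph would still build, but any edge endpoint without a matching row would get the default vertex attribute (null here), so the mkString calls above would fail on those vertices.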
0

An Edge has an attr. What type is your attr? Let's assume it is an Int and initialize it to zero:

Instead of this:

val eRDD: RDD[(VertexId, VertexId)] = efile.map(line => line.split(",")) 

Try this:

val eRDD: RDD[Edge[Int]] = efile.map{ line => 
    val vs = line.split(","); 
    Edge(vs(0).toLong, vs(1).toLong, 0) 
} 
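
If you would rather keep the edges as plain (VertexId, VertexId) pairs, as in the question, GraphX also provides Graph.fromEdgeTuples. The sketch below assumes spark-shell (sc predefined) and uses hypothetical in-memory sample data. Note that it does not attach the vertex file: every vertex gets the same default attribute.

```scala
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

// Hypothetical in-memory stand-in for the edge file.
val rawEdgeLines = sc.parallelize(Seq("1,2", "1,3"))

// Parse each line into a (source, destination) pair of Long IDs.
val pairs: RDD[(VertexId, VertexId)] = rawEdgeLines.map { line =>
  val Array(src, dst) = line.split(",", 2)
  (src.trim.toLong, dst.trim.toLong)
}

// Every vertex mentioned by an edge receives the default attribute
// (an empty array here); fromEdgeTuples sets each edge attribute to 1.
val edgeOnlyGraph: Graph[Array[String], Int] =
  Graph.fromEdgeTuples(pairs, Array.empty[String])
```

This is a reasonable starting point when only the connectivity matters. To carry the word lists as vertex attributes, building the vertex RDD explicitly and calling Graph(vRDD, eRDD) is the more direct route.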