How to create a graph from this data in GraphX

I'm struggling to understand how to create the following in Apache Spark's GraphX. I am given an HDFS file containing a large amount of data in this form:

node: ConnectingNode1, ConnectingNode2..

For example:

123214: 521345, 235213, 657323

I need to somehow store this data in an EdgeRDD so that I can create my graph in GraphX, but I have no idea how to go about it.

Source

2016-12-16 Rhys Copperthwaite

Once you have read the HDFS source and have the data in an RDD, you can try something like the following:

import org.apache.spark.rdd.RDD 
import org.apache.spark.graphx.Edge 
// Sample data 
val rdd = sc.parallelize(Seq("1: 1, 2, 3", "2: 2, 3")) 

val edges: RDD[Edge[Int]] = rdd.flatMap { row =>
    // split around ":"
    val splitted = row.split(":").map(_.trim)
    // the value to the left of ":" is the source vertex:
    val srcVertex = splitted(0).toLong
    // for the values to the right of ":", we split around "," to get the other vertices
    val otherVertices = splitted(1).split(",").map(_.trim)
    // for each vertex to the right of ":", we create an Edge object connecting it to the srcVertex:
    otherVertices.map(v => Edge(srcVertex, v.toLong, 1))
}
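
The snippet above assumes every line is well formed: a line with no ":" or with a non-numeric token would throw at runtime. As a minimal sketch of a more defensive parser, assuming malformed lines and tokens should simply be skipped, the parsing step could be pulled into a plain function. The names `AdjacencyLines` and `parse` are hypothetical and not part of GraphX.

```scala
import scala.util.Try

// Hypothetical helper (not part of GraphX or the original answer):
// turns one "node: n1, n2, ..." line into (source, destination) id pairs.
object AdjacencyLines {
  def parse(line: String): List[(Long, Long)] =
    line.split(":", 2).map(_.trim) match {
      case Array(src, rest) =>
        // A non-numeric source id yields no pairs at all
        Try(src.toLong).toOption.toList.flatMap { srcId =>
          rest.split(",").toList
            .map(_.trim)
            .filter(_.nonEmpty)                       // tolerate "5:" and trailing commas
            .flatMap(tok => Try(tok.toLong).toOption) // skip non-numeric tokens
            .map(dstId => (srcId, dstId))
        }
      case _ => Nil // no ":" on the line
    }
}
```

In spark-shell this could then be used as rdd.flatMap(AdjacencyLines.parse).map { case (s, d) => Edge(s, d, 1) }, which produces the same RDD[Edge[Int]] for well-formed input.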

Edit

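Also, if your vertices all have the same constant default weight, you can create the graph directly from the edges, so there is no need to build a vertices RDD: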

import org.apache.spark.graphx.Graph 
val g = Graph.fromEdges(edges, defaultValue = 1)
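
One way to check that the graph came out as expected is to count its vertices and edges and print the triplets. The following is a sketch only: it runs against a local master with the sample data instead of the real HDFS file, and the object name `AdjacencyGraphCheck`, the function `buildGraph` and the commented-out HDFS path are assumptions made for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.rdd.RDD

object AdjacencyGraphCheck {
  // Same parsing as in the answer, wrapped in a function (hypothetical name)
  def buildGraph(lines: RDD[String]): Graph[Int, Int] = {
    val edges: RDD[Edge[Int]] = lines.flatMap { row =>
      val parts = row.split(":").map(_.trim)
      val src = parts(0).toLong
      parts(1).split(",").map(_.trim).map(dst => Edge(src, dst.toLong, 1))
    }
    // Every vertex mentioned by an edge gets the default attribute 1
    Graph.fromEdges(edges, defaultValue = 1)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("adjacency-graph-check").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Stand-in for the real file; with HDFS this would be something like
      // sc.textFile("hdfs:///path/to/adjacency.txt") (hypothetical path)
      val lines = sc.parallelize(Seq("1: 1, 2, 3", "2: 2, 3"))
      val g = buildGraph(lines)

      // The sample data has vertices 1, 2, 3 and five edges
      println(s"vertices: ${g.numVertices}, edges: ${g.numEdges}")
      // Each triplet is one edge together with its endpoint attributes
      g.triplets.collect().foreach(t => println(s"${t.srcId} -> ${t.dstId} (weight ${t.attr})"))
    } finally {
      sc.stop()
    }
  }
}
```

On a large graph, collect() pulls every triplet to the driver, so g.triplets.take(10) is probably the safer way to eyeball a few edges.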

Source

2016-12-16 19:18:24

Thanks for all your help! I followed what you said and was able to create a val graph; now I'm just trying to find a way to check whether it worked!

I tried doing it the way you said. The only thing that didn't work was RDD[Edge[Int]], so I just used RDD. But I keep getting the following errors: :43: error: not found: value Edge otherVertices.map(v => Edge(srcVertex, v.toLong, 1)) ^ :43: error: type mismatch; found: Array[Nothing] required: TraversableOnce[?] otherVertices.map(v => Edge(srcVertex, v.toLong, 1))

Did you import the Edge class? 'import org.apache.spark.graphx.Edge'. That could be the problem, and also the reason why 'RDD[Edge[Int]]' didn't work.
