下面是简单的例子使用SQL星火
import spark.implicits._
val data = spark.sparkContext.parallelize(Seq(
("awer.ttp.net","Code", 554),
("abcd.ttp.net","Code", 747),
("asdf.ttp.net","Part", 554),
("xyz.ttp.net","Part", 747)
)).toDF("A","B","C")
data.createOrReplaceTempView("tempTable")
data.sqlContext.sql("SELECT A, B, C, SUBSTRING_INDEX(A, '.', 1) as D from tempTable").show
输出:
+------------+----+---+----+
| A| B| C| D|
+------------+----+---+----+
|awer.ttp.net|Code|554|awer|
|abcd.ttp.net|Code|747|abcd|
|asdf.ttp.net|Part|554|asdf|
| xyz.ttp.net|Part|747| xyz|
+------------+----+---+----+
我希望这有助于!