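I have multiple JSON files that I have to parse with Apache Spark. They contain nested keys, and I have to print all columns, including the nested ones. How can I create columns from the nested keys of a JSON file using Apache Spark in Java?
I want to get all column names as well as the nested column names. How can I get them?
This is what I tried on the files: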
String jsonFilePath = "/home/vipin/workspace/Smarten/jsonParsing/Employee/Employee-01.json,/home/vipin/workspace/Smarten/jsonParsing/Employee/Employee-02.json";
String[] jsonFiles = jsonFilePath.split(",");
Dataset<Row> people = sparkSession.read().json(jsonFiles);
JSON structure:
{
  "Name": "Vipin Suman",
  "Email": "[email protected]",
  "Designation": "Programmer",
  "Age": 22,
  "location": {
    "City": "Ahmedabad",
    "State": "Gujarat"
  }
}
The result I get:
people.show(50, false);
Age | Designation | Email           | Name        | Location
---------------------------------------------------------------------
22  | Programmer  | [email protected] | Vipin Suman | [Ahmedabad,Gujarat]
I want the data like this:
Age | Designation | Email           | Name        | City      | State
------------------------------------------------------------------------
22  | Programmer  | [email protected] | Vipin Suman | Ahmedabad | Gujarat
or like this:
Age | Designation | Email           | Name        | Location
-------------------------------------------------------------------
22  | Programmer  | [email protected] | Vipin Suman | Ahmedabad,Gujarat
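For reference, here is a minimal sketch of how both desired layouts could be produced, assuming Spark 2.x or later and the `people` Dataset loaded above. The class and method names (`SelectNestedFields`, `flattenLocation`, `joinLocation`) are hypothetical; `col` with a dotted path and `concat_ws` are standard functions in `org.apache.spark.sql.functions`.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.concat_ws;

// Hypothetical helper class, for illustration only.
public class SelectNestedFields {

    // First layout: pull location.City and location.State up
    // into separate top-level columns named City and State.
    public static Dataset<Row> flattenLocation(Dataset<Row> people) {
        return people.select(
                col("Age"), col("Designation"), col("Email"), col("Name"),
                col("location.City").alias("City"),
                col("location.State").alias("State"));
    }

    // Second layout: join the two nested values into one
    // comma-separated string column named Location.
    public static Dataset<Row> joinLocation(Dataset<Row> people) {
        return people.select(
                col("Age"), col("Designation"), col("Email"), col("Name"),
                concat_ws(",", col("location.City"), col("location.State"))
                        .alias("Location"));
    }
}
```

Calling `SelectNestedFields.flattenLocation(people).show(50, false);` would then print the flattened columns. This only works when the nested field names are known in advance.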
If the schema looks like this:
root
 |-- Age: long (nullable = true)
 |-- Company: struct (nullable = true)
 |    |-- Company Name: string (nullable = true)
 |    |-- Domain: string (nullable = true)
 |-- Designation: string (nullable = true)
 |-- Email: string (nullable = true)
 |-- Name: string (nullable = true)
 |-- Test: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- location: struct (nullable = true)
 |    |-- City: struct (nullable = true)
 |    |    |-- City Name: string (nullable = true)
 |    |    |-- Pin: long (nullable = true)
 |    |-- State: string (nullable = true)
and the JSON structure is:
{
  "Name": "Vipin Suman",
  "Email": "[email protected]",
  "Designation": "Trainee Programmer",
  "Age": 22,
  "location": {
    "City": {
      "Pin": 324009,
      "City Name": "Ahmedabad"
    },
    "State": "Gujarat"
  },
  "Company": {
    "Company Name": "Elegant",
    "Domain": "Java"
  },
  "Test": ["Test1", "Test2"]
}
then how can I find the nested keys and display them in a properly formatted table?
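One possible approach when the nesting is not known in advance is to walk the schema recursively and collect the dotted path of every leaf field. The sketch below assumes Spark 2.x or later. The names `SchemaFlattener`, `leafPaths`, and `flatten` are hypothetical; `Dataset.schema()`, `StructType.fields()`, and `functions.col` are existing Spark APIs. Field names are wrapped in backticks so that names containing spaces, such as `City Name`, still resolve.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.sql.Column;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

import static org.apache.spark.sql.functions.col;

// Hypothetical helper class, for illustration only.
public class SchemaFlattener {

    // Returns the backtick-quoted dotted path of every non-struct field,
    // descending into nested structs. Pass "" as the initial prefix.
    public static List<String> leafPaths(StructType schema, String prefix) {
        List<String> paths = new ArrayList<>();
        for (StructField field : schema.fields()) {
            String quoted = "`" + field.name() + "`";
            String path = prefix.isEmpty() ? quoted : prefix + "." + quoted;
            if (field.dataType() instanceof StructType) {
                // Struct: recurse into its children.
                paths.addAll(leafPaths((StructType) field.dataType(), path));
            } else {
                // Leaf (string, long, array, ...): keep the full path.
                paths.add(path);
            }
        }
        return paths;
    }

    // Selects every leaf as a top-level column, e.g.
    // location.City.Pin becomes a column named location_City_Pin.
    public static Dataset<Row> flatten(Dataset<Row> df) {
        List<Column> columns = new ArrayList<>();
        for (String path : leafPaths(df.schema(), "")) {
            String alias = path.replace("`", "").replace(".", "_");
            columns.add(col(path).alias(alias));
        }
        return df.select(columns.toArray(new Column[0]));
    }
}
```

With this, `SchemaFlattener.leafPaths(people.schema(), "")` lists all column names including the nested ones, and `SchemaFlattener.flatten(people).show(50, false);` prints them as one flat table. Array fields such as `Test` are kept as a single column in this sketch; getting one row per element would additionally need `functions.explode`.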
Please provide: a sample of the input data, what you have tried, and what the problem is.