Spark SQL JSON boolean evaluation
My example JSON schema (truncated due to size):
|-- LinearScheduleResult: struct (nullable = true)
| |-- Build: string (nullable = true)
| |-- EndTimestamp: string (nullable = true)
| |-- Errors: array (nullable = true)
| | |-- element: string (containsNull = true)
| |-- RequestId: string (nullable = true)
| |-- Schedule: struct (nullable = true)
| | |-- Airings: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- AiringTime: string (nullable = true)
| | | | |-- AiringType: string (nullable = true)
| | | | |-- CC: boolean (nullable = true)
| | | | |-- CallLetters: string (nullable = true)
| | | | |-- Category: string (nullable = true)
| | | | |-- Channel: string (nullable = true)
| | | | |-- Color: string (nullable = true)
| | | | |-- Copy: string (nullable = true)
| | | | |-- DSS: boolean (nullable = true)
| | | | |-- DVS: boolean (nullable = true)
| | | | |-- Dolby: boolean (nullable = true)
| | | | |-- Duration: long (nullable = true)
| | | | |-- DvbTriplet: string (nullable = true)
| | | | |-- EpisodeTitle: string (nullable = true)
| | | | |-- HD: boolean (nullable = true)
| | | | |-- HDLevel: string (nullable = true)
| | | | |-- IconAvailable: boolean (nullable = true)
| | | | |-- InstanceId: string (nullable = true)
| | | | |-- LetterBox: boolean (nullable = true)
| | | | |-- MovieRating: string (nullable = true)
| | | | |-- ParentNetworkId: long (nullable = true)
| | | | |-- ProgramId: string (nullable = true)
| | | | |-- SAP: boolean (nullable = true)
| | | | |-- SL: string (nullable = true)
| | | | |-- SeriesId: string (nullable = true)
| | | | |-- ServiceId: long (nullable = true)
| | | | |-- ShowingType: string (nullable = true)
| | | | |-- SourceDisplayName: string (nullable = true)
| | | | |-- SourceId: long (nullable = true)
| | | | |-- SourceLongName: string (nullable = true)
| | | | |-- Sports: boolean (nullable = true)
When I run the following:
results = sqlContext.sql("SELECT LinearScheduleResult.Schedule.Airings.Sports from tv")
it returns:
[Row(Sports=[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False])]
When I try something more complex, such as:
results = sqlContext.sql("SELECT LinearScheduleResult.Schedule.Airings from tv where LinearScheduleResult.Schedule.Airings.Sports = 'False'")
it never returns anything. I have tried 'False', False, 0, false, and many more combinations.
Any help would be greatly appreciated.
Alternatively, you can drop down to a regular RDD computation.
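The first query's output suggests why the comparison fails: Airings is an array, so LinearScheduleResult.Schedule.Airings.Sports evaluates to an array of booleans per row rather than a single boolean, and comparing that array to 'False' most likely never matches. A minimal sketch of the RDD-style approach follows, written as a plain Python function so it runs without Spark. The function name non_sports_airings and the sample record are hypothetical, not from the original post.

```python
import json

def non_sports_airings(record):
    """Return the airings in one parsed JSON record whose Sports flag is False."""
    result = record.get("LinearScheduleResult") or {}
    schedule = result.get("Schedule") or {}
    airings = schedule.get("Airings") or []
    # `is False` skips airings where Sports is missing or null.
    return [a for a in airings if a.get("Sports") is False]

# Hypothetical sample record shaped like the schema above (values invented).
sample = json.loads("""
{"LinearScheduleResult": {"Schedule": {"Airings": [
  {"ProgramId": "EP1", "Sports": false},
  {"ProgramId": "EP2", "Sports": true},
  {"ProgramId": "EP3", "Sports": null}
]}}}
""")

matches = non_sports_airings(sample)
print([a["ProgramId"] for a in matches])  # prints ['EP1']
```

With Spark, the same function could be applied per record, for example sc.textFile(path).map(json.loads).flatMap(non_sports_airings), assuming the input holds one JSON object per line and that path points at it.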