这是一项具有挑战性的任务。
首先,将表格拆分回页面是个好主意,因为每个页面的列结构可能都是唯一的。因此,我形成表格列表,每个表格为一页。然后,我必须处理每个页面:提取列名称,为每行添加摘要信息,过滤不需要的行,并设置列名称。这是通过使用自定义功能ConvertTable
为列表中的每个表完成的。之后,您只需组合生成的表格。
这里:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
AddRowNum = Table.AddColumn(Table.AddIndexColumn(Source, "Index", 1, 1), "RowNum", each Number.Mod([Index]-1, 52)+1, type number),
CountTables = {1..(Number.RoundUp(Table.RowCount(AddRowNum)/52, 0))},
ListTables = List.Transform(CountTables, (ListItem)=>Table.SelectRows(AddRowNum, each [Index] > 52 * (ListItem - 1) and [Index] <= 52 * ListItem)),
ConvertTable = (tbl as table) as table =>
let
hdr1 = Table.Transpose(Table.FillDown(Table.Transpose(Table.FromRecords({tbl{6}})), {"Column1"})),
hdr2 = Table.FromRecords({tbl{7}}),
ColNames = Table.Transpose(Table.SelectColumns(Table.FirstN(Table.AddColumn(Table.Transpose(Table.Combine({hdr1, hdr2})), "ColumnName", each [Column1] & ": " & [Column2]), 19), {"ColumnName"})),
AddPayDate = Table.AddColumn(tbl, "Pay Date", each if [RowNum] > 8 and Text.Trim(tbl{[RowNum]-2}[Column9]) = "Pay Date" then [Column9] else null, type date),
AddPeriodEndDate = Table.AddColumn(AddPayDate, "Period End Date", each if [RowNum] > 8 and Text.Trim(tbl{[RowNum]-2}[Column12]) = "Period End Date" then [Column12] else null, type date),
AddJobCode = Table.AddColumn(AddPeriodEndDate, "Job Code", each if [RowNum] > 8 and Text.Trim(tbl{[RowNum]-2}[Column14]) = "Job Code" then [Column14] else null, Int64.Type),
AddCheckInfo = Table.AddColumn(AddJobCode, "Check Info", each if [RowNum] > 8 and Text.Trim([Column1]) = "Check Printed:" then Table.Transpose(Table.SelectRows(Table.Transpose(Table.FromRecords({_})), each [Column1] <> null)) else null),
ExpandedCheckInfo = Table.ExpandTableColumn(AddCheckInfo, "Check Info", {"Column4", "Column6", "Column8"}, {"Check Amount", "Direct Deposit", "Net"}),
FillUp = Table.FillUp(ExpandedCheckInfo, {"Column3", "Check Amount", "Direct Deposit", "Net"})//Table.AddColumn(AddJobCode, "tmp2", each if [RowNum] < 9 then "" else (if Text.Trim([Column1]) = "Check Printed:" then (if [Column3] = null then -1 else [Column3]) else null), type text), {"tmp2"}),
FillDown = Table.FillDown(FillUp, {"Column1", "Column5", "Pay Date", "Period End Date", "Job Code"}),
AddCheckEEIDfixed = Table.AddColumn(FillDown, "Check:EEID.fixed", each Text.From([Column5]) & ":" & Text.From([Column3]), type text),
FilteredExtraRows = Table.SelectRows(AddCheckEEIDfixed, each [RowNum] > 8 and Text.Trim([Column1]) <> "Check Printed:" and Text.Trim([Column7]) <> "PerControl" and Text.Trim(tbl{[RowNum]-2}[Column7]) <> "PerControl" and [#"Check:EEID.fixed"] <> null),
DemotedHeaders = Table.DemoteHeaders(FilteredExtraRows),
GetColumnNames1 = Table.Combine({Table.FromRecords({DemotedHeaders{0}}), ColNames}),
GetColumnNames2 = Table.PromoteHeaders(Table.FillDown(GetColumnNames1, Table.ColumnNames(GetColumnNames1))),
SetColumnNames = Table.PromoteHeaders(Table.Combine({GetColumnNames2, FilteredExtraRows}))
in
SetColumnNames,
ConvertedList = List.Transform(ListTables, (t) => ConvertTable(t)),
GetWholeTable = Table.Combine(ConvertedList)
in
GetWholeTable
你需要精确详细解释你的数据如何能像它应该如何拆分成多列。在你的例子中,我没有看到任何变化,因为所有的行都有7个位置。此外,对于数字来说,如果存在任何。或货币符号并且数据的文化也可能相关(例如,2017年2月1日是英国2月1日和美国1月2日) 。 – MarcelBeug
我希望我的编辑能让这些数据更具代表性。所有数字通常都是十进制数字,因为它们可以根据上下文表示$ currency或Decimal hours记录。 – CRSPLK
这还不够。如果没有明确的规范说明如何确定哪一部分输入应放入哪一列,并且有所有可能的备选班次,我们将无法为您提供帮助。 – MarcelBeug