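From the Description column, you want to use the trailing {time} {time_label} part (e.g. "1 Year" or "1 Month") to decide where to fill in a one or a zero across a 12-month window.
Here is one way to do what you want: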
# create two temporary columns from the last two words of Description
# time: the numeric value; time_label: its unit (Month or Year)
df['time'], df['time_label'] = df.Description.str.split().apply(lambda x: pd.Series(x[-2:])).values.T
# define the numeric equivalent of Month and Year, in months
mapping = {"Month": 1, "Year": 12}
# total duration in months, computed once instead of on every iteration
total_months = df.time.astype(int) * df.time_label.map(mapping)
for month in range(12):
    # the if is only here to pretty-print M, M+1, M+2, ...
    # you can remove it if you accept M+0, M+1, ...
    if month == 0:
        df["M"] = np.where(total_months >= month + 1, 1, 0)
    else:
        df["M+" + str(month)] = np.where(total_months >= month + 1, 1, 0)
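To make the comparison inside the loop concrete, here is a small pure-Python sketch of the same rule applied to a single description. The helper name month_flags and its horizon parameter are hypothetical (they are not part of the code above), and the sketch assumes every description ends in an integer followed by Month or Year.

```python
# Hypothetical helper illustrating the rule for one description string.
MAPPING = {"Month": 1, "Year": 12}


def month_flags(description, horizon=12):
    """Return a list of 0/1 flags, one per month of the horizon."""
    # Assumption: the last two tokens are "<int> <Month|Year>".
    time, time_label = description.split()[-2:]
    total_months = int(time) * MAPPING[time_label]
    # Month k (0-based) is covered when the duration reaches k + 1 months.
    return [1 if total_months >= k + 1 else 0 for k in range(horizon)]


print(month_flags("Slack 3 Month"))
print(month_flags("Extention Slack 1 Year"))
```

A 3-month entry therefore gets three leading ones, and a 1-year entry gets ones in all twelve positions.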
A fully reproducible example:
import pandas as pd
import numpy as np
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO
data = """
No Description
1 "Extention Slack 1 Month"
2 "Extention Slack 1 Year"
3 "Slack 6 Month"
4 "Slack 3 Month"
"""
# StringIO(data) simulates reading the data from a file
# replace df with your own dataframe
df = pd.read_table(StringIO(data), sep=r"\s+")
# create two temporary columns from the last two words of Description
# time: the numeric value; time_label: its unit (Month or Year)
df['time'], df['time_label'] = df.Description.str.split().apply(lambda x: pd.Series(x[-2:])).values.T
# define the numeric equivalent of Month and Year, in months
mapping = {"Month": 1, "Year": 12}
# total duration in months, computed once instead of on every iteration
total_months = df.time.astype(int) * df.time_label.map(mapping)
for month in range(12):
    # the if is only here to pretty-print M, M+1, M+2, ...
    if month == 0:
        df["M"] = np.where(total_months >= month + 1, 1, 0)
    else:
        df["M+" + str(month)] = np.where(total_months >= month + 1, 1, 0)
# remove temporary columns
df.drop(['time', 'time_label'], axis=1, inplace=True)
print(df)
Output:
   No              Description  M  M+1  M+2  M+3  M+4  M+5  M+6  M+7  M+8  \
0   1  Extention Slack 1 Month  1    0    0    0    0    0    0    0    0
1   2   Extention Slack 1 Year  1    1    1    1    1    1    1    1    1
2   3            Slack 6 Month  1    1    1    1    1    1    0    0    0
3   4            Slack 3 Month  1    1    1    0    0    0    0    0    0

   M+9  M+10  M+11
0    0     0     0
1    1     1     1
2    0     0     0
3    0     0     0
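If you would rather avoid the loop and the temporary columns, the same table can probably be built in one step. The following is only a sketch of an alternative, not the approach used above: it swaps in str.extract with a regular expression plus NumPy broadcasting, and it assumes every Description ends in an integer followed by Month or Year. The names parts, flags, columns and result are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as in the example above, built directly as a DataFrame.
df = pd.DataFrame({
    "No": [1, 2, 3, 4],
    "Description": [
        "Extention Slack 1 Month",
        "Extention Slack 1 Year",
        "Slack 6 Month",
        "Slack 3 Month",
    ],
})

mapping = {"Month": 1, "Year": 12}

# Pull the trailing "<int> <Month|Year>" pair out of each description.
# Assumption: every row matches this pattern; otherwise astype(int) fails.
parts = df["Description"].str.extract(r"(\d+)\s+(Month|Year)\s*$")
total_months = parts[0].astype(int) * parts[1].map(mapping)

# Broadcast: compare each row's duration against month numbers 1..12.
flags = (total_months.to_numpy()[:, None] >= np.arange(1, 13)).astype(int)

# Column names M, M+1, ..., M+11, matching the loop-based version.
columns = ["M"] + ["M+%d" % k for k in range(1, 12)]
result = df.join(pd.DataFrame(flags, index=df.index, columns=columns))
print(result)
```

This leaves df itself untouched and returns the flags in a new frame, so there is nothing to drop afterwards.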
What is tahun?
Tahun is Indonesian for "year". Sorry, my bad.