2017-09-09 38 views
0

数据集计数工作日VS周末使用在熊猫

    starttime User Type 
0   7/1/2015 00:00:03 Subscriber 
1   7/1/2015 00:00:06 Subscriber 
2   7/1/2015 00:00:17 Subscriber 
3   7/1/2015 00:00:23 Subscriber 
4   7/1/2015 00:00:44 Subscriber 
5   7/1/2015 00:01:00 Subscriber 
6   7/1/2015 00:01:03 Subscriber 
7   7/1/2015 00:01:06 Subscriber 
8   7/1/2015 00:01:25 Customer 
9   7/1/2015 00:01:41 Subscriber 
10  7/1/2015 00:01:50 Customer 
11  7/1/2015 00:01:58 Subscriber 
12  7/1/2015 00:02:06 Subscriber 
13  7/1/2015 00:02:07 Subscriber 
14  7/1/2015 00:02:26 Subscriber 
15  7/1/2015 00:02:26 Subscriber 
16  7/1/2015 00:02:35 Subscriber 
17  7/1/2015 00:02:43 Customer 
18  7/1/2015 00:02:47 Customer 
19  7/1/2015 00:02:47 Subscriber 
20  7/1/2015 00:03:05 Subscriber 
21  7/1/2015 00:03:16 Customer 
22  7/1/2015 00:03:27 Subscriber 
23  7/1/2015 00:03:34 Subscriber 
24  7/1/2015 00:03:48 Subscriber 
25  7/1/2015 00:03:56 Subscriber 
26  7/1/2015 00:03:57 Subscriber 
27  7/1/2015 00:03:58 Customer 
28  7/1/2015 00:04:03 Subscriber 
29  7/1/2015 00:04:17 Subscriber 
...      ...   ... 
1085646 7/31/2015 23:57:25 Subscriber 
1085647 7/31/2015 23:57:29 Customer 
1085648 7/31/2015 23:57:32 Subscriber 
1085649 7/31/2015 23:57:33 Subscriber 
1085650 7/31/2015 23:57:44 Subscriber 
1085651 7/31/2015 23:57:54 Subscriber 
1085652 7/31/2015 23:58:03 Subscriber 
1085653 7/31/2015 23:58:08 Subscriber 
1085654 7/31/2015 23:58:12 Customer 
1085655 7/31/2015 23:58:15 Subscriber 
1085656 7/31/2015 23:58:18 Customer 
1085657 7/31/2015 23:58:24 Subscriber 
1085658 7/31/2015 23:58:27 Subscriber 
1085659 7/31/2015 23:58:42 Subscriber 
1085660 7/31/2015 23:58:43 Subscriber 
1085661 7/31/2015 23:58:51 Customer 
1085662 7/31/2015 23:58:53 Subscriber 
1085663 7/31/2015 23:58:58 Subscriber 
1085664 7/31/2015 23:59:04 Subscriber 
1085665 7/31/2015 23:59:10 Subscriber 
1085666 7/31/2015 23:59:24 Subscriber 
1085667 7/31/2015 23:59:23 Customer 
1085668 7/31/2015 23:59:24 Subscriber 
1085669 7/31/2015 23:59:24 Subscriber 
1085670 7/31/2015 23:59:38 Subscriber 
1085671 7/31/2015 23:59:40 Subscriber 
1085672 7/31/2015 23:59:41 Subscriber 
1085673 7/31/2015 23:59:42 Customer 
1085674 7/31/2015 23:59:56 Subscriber 
1085675 7/31/2015 23:59:59 Subscriber 

问题

创建的游乐设施由小时和工作日和周末的数量用户键入大熊猫数据帧。使用starttime来确定每个小时的乘车时间。

输出应该像

User 
    Type 
    Hour Customer Subscriber 

Weekday  0 124  2194 
      1 120  1238 
      2 53  716 
      3 30  520 
.... .... .... .... 
Weekend  0 152  1879 
      1 82  1222 
      2 45  718 
      3 34  431 
      4 29  288 
.... .... .... .... 

图表应该是这样的

enter image description here enter image description here

我的代码

def a11(rides): 
    rides['starttime'] = pd.to_datetime(rides['starttime'], infer_datetime_format=True) 

    hours_cats = ['12 AM', '01 AM', '02 AM', '03 AM', '04 AM', '05 AM', '06 AM', '07 AM', '08 AM', '09 AM', '10 AM', '11 AM', '12 PM', '01 PM', '02 PM', '03 PM', '04 PM', '05 PM', '06 PM', '07 PM', '08 PM', '09 PM', '10 PM', '11 PM'] 
    dates = pd.Categorical(rides.starttime.dt.strftime('%I %p'), categories=hours_cats, ordered=True) 
    df = pd.crosstab(dates, rides['User Type']) 

我不知道如何在平日和周末分开数据框架,如问题中提到的那样。

+0

开始:根据骑行的开始时间,计算出每次骑行如何获得一周中的某一天。这可能有助于让你指出正确的方向:https://stackoverflow.com/questions/30222533/create-a-day-of-week-column-in-a-pandas-dataframe-using-python – payne

+0

使用[pandas .DatetimeIndex.weekday](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DatetimeIndex.weekday.html),它返回从0 =星期一到6 =星期日每天的数字。 (https://pandas.pydata.org/pandas-docs/stable/merging.html)使用[pandas.pivot_table](https:// pandas)将这些数据添加到您的df中.pydata.org/pandas-docs/stable/generated/pandas.pivot_table.html)以期望的形式对数据进行sqeeze – 2Obe

回答

0

对于期望DataFrame你需要创建新的列:

def a11(rides): 
    rides['starttime'] = pd.to_datetime(rides['starttime'], infer_datetime_format=True) 
    rides['Type'] = np.where(rides['starttime'].dt.dayofweek < 5, 'Weekday', 'weekend') 
    return pd.crosstab([rides['Type'], rides['starttime'].dt.dayofweek], rides['User Type']) 

print (a11(rides)) 

User Type   Customer Subscriber 
Type starttime      
Weekday 2     6   24 
     4     6   24 

对于图形的输出是一个有点不同 - 改变dates并选择通过多指标的第一级使用DataFrame.xs

print (rides) 
      starttime User Type 
0 7/01/2015 10:00:03 Customer 
1 7/02/2015 02:00:06 Subscriber 
2 7/02/2015 02:00:17 Customer 
3 7/02/2015 04:00:23 Subscriber 
4 7/03/2015 05:00:44 Customer 
5 7/11/2015 07:01:00 Subscriber 
6 7/12/2015 07:01:03 Customer 
7 7/12/2015 08:01:06 Subscriber 
8 7/12/2015 00:01:25 Customer 
9 7/12/2015 09:01:41 Subscriber 
10 7/12/2015 10:01:50 Customer 

def a11(rides): 
    rides['starttime'] = pd.to_datetime(rides['starttime'], infer_datetime_format=True) 
    types = np.where(rides['starttime'].dt.dayofweek < 5, 'Weekday', 'Weekend') 
    hours_cats = ['12 AM', '01 AM', '02 AM', '03 AM', '04 AM', '05 AM', '06 AM', '07 AM', '08 AM', '09 AM', '10 AM', '11 AM', '12 PM', '01 PM', '02 PM', '03 PM', '04 PM', '05 PM', '06 PM', '07 PM', '08 PM', '09 PM', '10 PM', '11 PM'] 
    dates = pd.Categorical(rides.starttime.dt.strftime('%I %p'), categories=hours_cats, ordered=True) 
    return pd.crosstab([types, dates], rides['User Type']).assign(All=lambda x: x.sum(axis=1)) 

a11(rides).xs('Weekday').plot() 
a11(rides).xs('Weekend').plot() 

g1

g2