2017-02-08

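Azure DataFactory chain activities

I'm very new to DataFactory and am having trouble understanding how to properly create a pipeline that executes a stored procedure before it executes the Copy activity.

The stored proc is just a TRUNCATE of the target table, which is used as the output dataset of the second activity.

The DataFactory documentation tells me that, to execute the stored proc first, I should specify the "output" of the proc as the "input" of the second activity.

However, the stored proc has no real "output". To get it to "work", I cloned the output of the second activity, changed its name, and set it to external=false to get past the provisioning errors, but this is obviously a total kludge.

It makes no sense to me why an output even needs to be defined, at least when the stored proc only performs a TRUNCATE.

But when I tried to use the output of the stored proc as an additional input, I received an error about duplicate table names.

How can I get the TRUNCATE stored proc activity to execute successfully (and complete) before the Copy activity runs?

Here's the pipeline code: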

{ 
    "name": "Traffic CRM - System User Stage", 
    "properties": { 
     "description": "Move System User to Stage", 
     "activities": [ 
      { 
       "type": "SqlServerStoredProcedure", 
       "typeProperties": { 
        "storedProcedureName": "dbo.usp_Truncate_Traffic_Crm_SystemUser", 
        "storedProcedureParameters": {} 
       }, 
       "outputs": [ 
        { 
         "name": "Smart App - usp Truncate System User" 
        } 
       ], 
       "policy": { 
        "timeout": "01:00:00", 
        "concurrency": 1, 
        "retry": 3 
       }, 
       "scheduler": { 
        "frequency": "Day", 
        "interval": 1 
       }, 
       "name": "Smart App - SystemUser Truncate" 
      }, 
      { 
       "type": "Copy", 
       "typeProperties": { 
        "source": { 
         "type": "SqlSource", 
         "sqlReaderQuery": "select * from [dbo].[Traffic_Crm_SystemUser]" 
        }, 
        "sink": { 
         "type": "SqlSink", 
         "writeBatchSize": 0, 
         "writeBatchTimeout": "00:00:00" 
        }, 
        "translator": { 
         "type": "TabularTranslator", 
         "columnMappings": "All columns mapped here" 
        } 
       }, 
       "inputs": [ 
        { 
         "name": "Traffic CRM - SytemUser Stage" 
        } 
       ], 
       "outputs": [ 
        { 
         "name": "Smart App - System User Stage Production" 
        } 
       ], 
       "policy": { 
        "timeout": "1.00:00:00", 
        "concurrency": 1, 
        "executionPriorityOrder": "NewestFirst", 
        "style": "StartOfInterval", 
        "retry": 3, 
        "longRetry": 0, 
        "longRetryInterval": "00:00:00" 
       }, 
       "scheduler": { 
        "frequency": "Day", 
        "interval": 1 
       }, 
       "name": "Activity-0-[dbo]_[Traffic_Crm_SystemUser]->[dbo]_[Traffic_Crm_SystemUser]" 
      } 
     ], 
     "start": "2017-01-19T14:30:57.309Z", 
     "end": "2099-12-31T05:00:00Z", 
     "isPaused": false, 
     "hubName": "stagingdatafactory1_hub", 
     "pipelineMode": "Scheduled" 
    } 
} 

Answer

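The output dataset of your SP activity, i.e. "name": "Smart App - usp Truncate System User", should be the input of the next activity. If you are unsure what to put in the dataset, just create a dummy dataset like the one below: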

{ 
    "name": "DummySPDS", 
    "properties": { 
     "published": false, 
     "type": "SqlServerTable", 
     "linkedServiceName": "SQLServerLS", 
     "typeProperties": { 
      "tableName": "dummyTable" 
     }, 
     "availability": { 
      "frequency": "Hour", 
      "interval": 1 
     }, 
     "external": true 
    } 
} 
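
The output dataset of the stored procedure activity can be a similar placeholder. Here is a minimal sketch, assuming the same SQLServerLS linked service and a hypothetical placeholder table named dummySPOutput (that name is not from the original post). Its availability is set to Day to match the scheduler of the activities, and it is not marked as external because an activity in the pipeline produces it.

```json
{
    "name": "Smart App - usp Truncate System User",
    "properties": {
        "type": "SqlServerTable",
        "linkedServiceName": "SQLServerLS",
        "typeProperties": {
            "tableName": "dummySPOutput"
        },
        "availability": {
            "frequency": "Day",
            "interval": 1
        }
    }
}
```

The stored procedure never writes to this table; the dataset only gives Data Factory a slot to mark as ready so that the next activity can depend on it.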

Here is the full pipeline code:

{ 
    "name": "Traffic CRM - System User Stage", 
    "properties": { 
     "description": "Move System User to Stage", 
     "activities": [ 
      { 
       "type": "SqlServerStoredProcedure", 
       "typeProperties": { 
        "storedProcedureName": "dbo.usp_Truncate_Traffic_Crm_SystemUser", 
        "storedProcedureParameters": {} 
       }, 
       "inputs": [ 
        { 
         "name": "DummySPDS" 
        } 
       ], 
       "outputs": [ 
        { 
         "name": "Smart App - usp Truncate System User" 
        } 
       ], 
       "policy": { 
        "timeout": "01:00:00", 
        "concurrency": 1, 
        "retry": 3 
       }, 
       "scheduler": { 
        "frequency": "Day", 
        "interval": 1 
       }, 
       "name": "Smart App - SystemUser Truncate" 
      }, 
      { 
       "type": "Copy", 
       "typeProperties": { 
        "source": { 
         "type": "SqlSource", 
         "sqlReaderQuery": "select * from [dbo].[Traffic_Crm_SystemUser]" 
        }, 
        "sink": { 
         "type": "SqlSink", 
         "writeBatchSize": 0, 
         "writeBatchTimeout": "00:00:00" 
        }, 
        "translator": { 
         "type": "TabularTranslator", 
         "columnMappings": "All columns mapped here" 
        } 
       }, 
       "inputs": [ 
        { 
         "name": "Smart App - usp Truncate System User" 
        } 
       ], 
       "outputs": [ 
        { 
         "name": "Smart App - System User Stage Production" 
        } 
       ], 
       "policy": { 
        "timeout": "1.00:00:00", 
        "concurrency": 1, 
        "executionPriorityOrder": "NewestFirst", 
        "style": "StartOfInterval", 
        "retry": 3, 
        "longRetry": 0, 
        "longRetryInterval": "00:00:00" 
       }, 
       "scheduler": { 
        "frequency": "Day", 
        "interval": 1 
       }, 
       "name": "Activity-0-[dbo]_[Traffic_Crm_SystemUser]->[dbo]_[Traffic_Crm_SystemUser]" 
      } 
     ], 
     "start": "2017-01-19T14:30:57.309Z", 
     "end": "2099-12-31T05:00:00Z", 
     "isPaused": false, 
     "hubName": "stagingdatafactory1_hub", 
     "pipelineMode": "Scheduled" 
    }
}
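
A possible variation, sketched here and not tested on Azure: keep the original source dataset as the first entry in the Copy activity's inputs and add the stored procedure's output dataset as a second entry. As far as I know, in this version of Data Factory the Copy activity copies data only from its first input and treats additional inputs as dependencies, so the source and column mappings should stay tied to the first dataset. The policy and translator sections are omitted for brevity and would stay as in the pipeline above. It is an assumption that each dataset must point at a different tableName to avoid the duplicate table name error.

```json
{
    "type": "Copy",
    "typeProperties": {
        "source": {
            "type": "SqlSource",
            "sqlReaderQuery": "select * from [dbo].[Traffic_Crm_SystemUser]"
        },
        "sink": {
            "type": "SqlSink",
            "writeBatchSize": 0,
            "writeBatchTimeout": "00:00:00"
        }
    },
    "inputs": [
        { "name": "Traffic CRM - SytemUser Stage" },
        { "name": "Smart App - usp Truncate System User" }
    ],
    "outputs": [
        { "name": "Smart App - System User Stage Production" }
    ],
    "scheduler": {
        "frequency": "Day",
        "interval": 1
    },
    "name": "Activity-0-[dbo]_[Traffic_Crm_SystemUser]->[dbo]_[Traffic_Crm_SystemUser]"
}
```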

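I added the dummy dataset as described, but then the second activity lost the mappings required by the Copy activity. I then tried adding a second item to 'inputs', but received a 'duplicate object key referenced table name' error, even though my dummy dataset does not contain the same table name. This is the article I followed, which suggested adding a second 'name' object to the inputs: http://stackoverflow.com/questions/35970079/azure-data-factory-multiple-activities-in-pipeline-execution-order – rcastagna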

I have provided the full pipeline code. I have not tested it on Azure, but it should work. – Manish