Pipelines are a key resource for capturing and ingesting data from various sources in Azure Data Factory V2. A pipeline is a logical grouping of activities that together perform a task.

Example pipeline activities

  • Copy Data activity
  • Data transformation activities, which run on different compute environments, e.g. Data Flow, Azure Functions, SQL stored procedures, Azure Databricks
  • Control flow activities, e.g. Filter, Lookup, If Condition

Activities broadly fall into two categories: execution activities, which carry an activity policy (timeout, retry), and control activities.
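
To make this concrete, here is a minimal sketch of a pipeline definition containing a single Copy activity. The dataset names (BlobSourceDataset, SqlSinkDataset), the activity name, and the copySourceName parameter are illustrative placeholders, not values from this article:

{
    "name": "MyCopyPipeline",
    "properties": {
        "parameters": {
            "copySourceName": { "type": "String" }
        },
        "activities": [
            {
                "name": "CopyBlobToSql",
                "type": "Copy",
                "inputs": [ { "referenceName": "BlobSourceDataset", "type": "DatasetReference" } ],
                "outputs": [ { "referenceName": "SqlSinkDataset", "type": "DatasetReference" } ],
                "typeProperties": {
                    "source": { "type": "DelimitedTextSource" },
                    "sink": { "type": "AzureSqlSink" }
                }
            }
        ]
    }
}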

Scheduling pipelines

Pipelines are scheduled by triggers. There are different types of triggers: the Schedule trigger, which runs pipelines on a wall-clock schedule, and the manual trigger, which runs pipelines on demand. For more information about triggers, see the pipeline execution and triggers article.

To have a trigger kick off a pipeline run, you must include a pipeline reference to the particular pipeline in the trigger definition. Pipelines and triggers have an n-to-m relationship: multiple triggers can kick off a single pipeline, and a single trigger can kick off multiple pipelines. Once the trigger is defined, you must start it before it begins triggering the pipeline.

For example, say you have a Schedule trigger, "Trigger A," that you wish to use to kick off your pipeline, "MyCopyPipeline." You define the trigger as shown in the following example:

{
    "name": "TriggerA",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
        },
        "pipeline": {
            "pipelineReference": {
                "type": "PipelineReference",
                "referenceName": "MyCopyPipeline"
            },
            "parameters": {
                "copySourceName": "FileSource"
            }
        }
    }
}

Data Integration Runtime

The Integration Runtime is the data integration infrastructure used by Azure Data Factory to provide data integration capabilities across different network environments. Its self-hosted variant was formerly called the Data Management Gateway.

The Integration Runtime (IR) is the compute infrastructure used by Azure Data Factory to provide the following data integration capabilities across different network environments:

  • Data Flow: Execute a Data Flow in a managed Azure compute environment.
  • Data movement: Copy data across data stores in public networks and data stores in private networks (on-premises or virtual private network). The service supports built-in connectors, format conversion, column mapping, and performant, scalable data transfer.
  • Activity dispatch: Dispatch and monitor transformation activities running on a variety of compute services such as Azure Databricks, Azure HDInsight, Azure Machine Learning, Azure SQL Database, SQL Server, and more.
  • SSIS package execution: Natively execute SQL Server Integration Services (SSIS) packages in a managed Azure compute environment.

In Data Factory, an activity defines the action to be performed. A linked service defines a target data store or a compute service. An integration runtime provides the bridge between activities and linked services. It is referenced by the linked service or activity, and provides the compute environment where the activity either runs or gets dispatched from. This way, the activity can be performed in the region closest to the target data store or compute service, in the most performant way, while meeting security and compliance needs.
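
As a sketch of how a linked service references an integration runtime, the definition below points an on-premises SQL Server linked service at a self-hosted IR through the connectVia property. The names OnPremSqlServerLinkedService and MySelfHostedIR and the connection string are illustrative placeholders:

{
    "name": "OnPremSqlServerLinkedService",
    "properties": {
        "type": "SqlServer",
        "typeProperties": {
            "connectionString": "Server=myserver;Database=mydb;Integrated Security=True"
        },
        "connectVia": {
            "referenceName": "MySelfHostedIR",
            "type": "IntegrationRuntimeReference"
        }
    }
}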

Integration runtimes can be created in the Azure Data Factory UX via the management hub, or directly from any activity, dataset, or data flow that references them.

Integration runtime types

Data Factory offers three types of Integration Runtime (IR), and you should choose the type that best serves your data integration capability and network environment needs. These three types are:

  • Azure
  • Self-hosted
  • Azure-SSIS

The following table describes the capabilities and network support for each of the integration runtime types:

IR type     | Public network                              | Private network
Azure       | Data Flow, Data movement, Activity dispatch  |
Self-hosted | Data movement, Activity dispatch             | Data movement, Activity dispatch
Azure-SSIS  | SSIS package execution                       | SSIS package execution

The following diagram shows how the different integration runtimes can be used in combination to offer rich data integration capabilities and network support:

[Diagram: the different types of integration runtimes]

Azure integration runtime

An Azure integration runtime can:

  • Run Data Flows in Azure
  • Run copy activities between cloud data stores
  • Dispatch the following transform activities in a public network: Databricks Notebook/Jar/Python activity, HDInsight Hive activity, HDInsight Pig activity, HDInsight MapReduce activity, HDInsight Spark activity, HDInsight Streaming activity, Machine Learning Batch Execution activity, Machine Learning Update Resource activity, Stored Procedure activity, Data Lake Analytics U-SQL activity, .NET custom activity, Web activity, Lookup activity, and Get Metadata activity.

Azure IR network environment

The Azure Integration Runtime supports connecting to data stores and compute services with publicly accessible endpoints. Use a self-hosted integration runtime for an Azure Virtual Network environment.

Azure IR compute resource and scaling

The Azure integration runtime provides a fully managed, serverless compute in Azure. You don't have to worry about infrastructure provisioning, software installation, patching, or capacity scaling. In addition, you only pay for the duration of actual utilization.

The Azure integration runtime provides the native compute to move data between cloud data stores in a secure, reliable, and high-performance manner. You can set how many data integration units to use on the copy activity, and the compute size of the Azure IR is elastically scaled up accordingly without you having to explicitly adjust the size of the Azure Integration Runtime.
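
As an illustration of that scaling knob, the fragment below sets dataIntegrationUnits inside a Copy activity's typeProperties; the value 32 and the source/sink types are example choices, not recommendations:

"typeProperties": {
    "source": { "type": "DelimitedTextSource" },
    "sink": { "type": "AzureSqlSink" },
    "dataIntegrationUnits": 32
}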

Activity dispatch is a lightweight operation that routes the activity to the target compute service, so there is no need to scale up the compute size for this scenario.

For information about creating and configuring an Azure IR, see the how to create and configure an Azure IR article in the how-to guides.

 Note

The Azure Integration Runtime has properties related to the Data Flow runtime, which define the underlying compute infrastructure used to run data flows.
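
As a rough sketch of those properties (the structure follows the Data Factory REST/ARM model for a managed IR, so treat the exact property names as an assumption to verify; the name and values are illustrative), an Azure IR with Data Flow runtime settings can look like this:

{
    "name": "AzureIRWestEurope",
    "properties": {
        "type": "Managed",
        "typeProperties": {
            "computeProperties": {
                "location": "West Europe",
                "dataFlowProperties": {
                    "computeType": "General",
                    "coreCount": 8,
                    "timeToLive": 10
                }
            }
        }
    }
}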

Self-hosted integration runtime

A self-hosted IR is capable of:

  • Running copy activities between cloud data stores and data stores in a private network.
  • Dispatching the following transform activities against compute resources on-premises or in an Azure Virtual Network: HDInsight Hive activity (BYOC, Bring Your Own Cluster), HDInsight Pig activity (BYOC), HDInsight MapReduce activity (BYOC), HDInsight Spark activity (BYOC), HDInsight Streaming activity (BYOC), Machine Learning Batch Execution activity, Machine Learning Update Resource activity, Stored Procedure activity, Data Lake Analytics U-SQL activity, Custom activity (runs on Azure Batch), Lookup activity, and Get Metadata activity.

 Note

Use a self-hosted integration runtime to support data stores that require a bring-your-own driver, such as SAP HANA, MySQL, and others. For more information, see supported data stores.

 Note

The Java Runtime Environment (JRE) is a dependency of the self-hosted IR. Make sure JRE is installed on the same host.

Self-hosted IR network environment

If you want to perform data integration securely in a private network environment that doesn't have a direct line of sight from the public cloud environment, you can install a self-hosted IR in an on-premises environment behind your corporate firewall, or inside a virtual private network. The self-hosted integration runtime only makes outbound HTTP-based connections to the open internet.

Self-hosted IR compute resource and scaling

Install the self-hosted IR on an on-premises machine or a virtual machine inside a private network. Currently, running the self-hosted IR is supported only on the Windows operating system.

For high availability and scalability, you can scale out the self-hosted IR by associating the logical instance with multiple on-premises machines in active-active mode. For more information, see the how to create and configure a self-hosted IR article in the how-to guides.
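
The logical self-hosted IR itself has a very small JSON footprint; a minimal sketch is shown below (the name and description are placeholders). The actual machines are then registered against this logical instance by installing the self-hosted IR software and supplying its authentication key:

{
    "name": "MySelfHostedIR",
    "properties": {
        "type": "SelfHosted",
        "description": "Self-hosted IR scaled out across two on-premises nodes"
    }
}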

Azure-SSIS Integration Runtime

To lift and shift existing SSIS workload, you can create an Azure-SSIS IR to natively execute SSIS packages.

Azure-SSIS IR network environment

The Azure-SSIS IR can be provisioned in either a public network or a private network. On-premises data access is supported by joining the Azure-SSIS IR to a virtual network that is connected to your on-premises network.

Azure-SSIS IR compute resource and scaling

The Azure-SSIS IR is a fully managed cluster of Azure VMs dedicated to running your SSIS packages. You can bring your own Azure SQL Database or SQL Managed Instance for the catalog of SSIS projects/packages (SSISDB). You can scale up the power of the compute by specifying the node size, and scale it out by specifying the number of nodes in the cluster. You can manage the cost of running your Azure-SSIS Integration Runtime by stopping and starting it as you see fit.
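
As a rough, hedged sketch of what an Azure-SSIS IR definition can look like (the structure follows the Data Factory REST/ARM model as best understood here, so verify the property names against the how-to guide; the server endpoint, node size, and counts are placeholders):

{
    "name": "MyAzureSsisIR",
    "properties": {
        "type": "Managed",
        "typeProperties": {
            "computeProperties": {
                "location": "West Europe",
                "nodeSize": "Standard_D4_v3",
                "numberOfNodes": 2,
                "maxParallelExecutionsPerNode": 4
            },
            "ssisProperties": {
                "catalogInfo": {
                    "catalogServerEndpoint": "myserver.database.windows.net",
                    "catalogAdminUserName": "ssisadmin",
                    "catalogPricingTier": "S1"
                }
            }
        }
    }
}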

For more information, see how to create and configure Azure-SSIS IR article under how to guides. Once created, you can deploy and manage your existing SSIS packages with little to no change using familiar tools such as SQL Server Data Tools (SSDT) and SQL Server Management Studio (SSMS), just like using SSIS on premises.

For more information about Azure-SSIS runtime, see the following articles:

  • Tutorial: deploy SSIS packages to Azure. This article provides step-by-step instructions to create an Azure-SSIS IR and uses an Azure SQL Database to host the SSIS catalog.
  • How to: Create an Azure-SSIS integration runtime. This article expands on the tutorial and provides instructions on using SQL Managed Instance and joining the IR to a virtual network.
  • Monitor an Azure-SSIS IR. This article shows you how to retrieve information about an Azure-SSIS IR and descriptions of statuses in the returned information.
  • Manage an Azure-SSIS IR. This article shows you how to stop, start, or remove an Azure-SSIS IR. It also shows you how to scale out your Azure-SSIS IR by adding more nodes to the IR.
  • Join an Azure-SSIS IR to a virtual network. This article provides conceptual information about joining an Azure-SSIS IR to an Azure virtual network. It also provides steps to use Azure portal to configure virtual network so that Azure-SSIS IR can join the virtual network.

Integration runtime location

The Data Factory location is where the metadata of the data factory is stored and where the triggering of the pipeline is initiated from. Meanwhile, a data factory can access data stores and compute services in other Azure regions to move data between data stores or process data using compute services. This behavior is realized through the globally available IR to ensure data compliance, efficiency, and reduced network egress costs.

The IR Location defines the location of its back-end compute, and essentially the location where the data movement, activity dispatching, and SSIS package execution are performed. The IR location can be different from the location of the data factory it belongs to.

Azure IR location

You can set a specific location for an Azure IR, in which case activity execution or dispatch happens in that region.

If you choose to use the auto-resolve Azure IR, which is the default:

  • For copy activity, ADF makes a best effort to automatically detect your sink data store's location, then uses the IR in the same region if available, or the closest one in the same geography; if the sink data store's region is not detectable, the IR in the data factory region is used as the alternative. For example, suppose your factory is created in East US:
    • When copying data to Azure Blob in West US, if ADF successfully detects that the Blob is in West US, the copy activity is executed on the IR in West US; if the region detection fails, the copy activity is executed on the IR in East US.
    • When copying data to Salesforce, whose region is not detectable, the copy activity is executed on the IR in East US.
    Tip: If you have strict data compliance requirements and need to ensure that data does not leave a certain geography, you can explicitly create an Azure IR in that region and point the linked service to this IR using the connectVia property. For example, if you want to copy data from Blob in UK South to SQL DW in UK South and want to ensure data does not leave the UK, create an Azure IR in UK South and link both linked services to this IR (see the sketch after this list).
  • For Lookup/GetMetadata/Delete activity execution (also known as pipeline activities), transformation activity dispatching (also known as external activities), and authoring operations (test connection, browse folder list and table list, preview data), ADF uses the IR in the data factory region.
  • For Data Flow, ADF uses the IR in the data factory region. Tip: A good practice is to ensure the data flow runs in the same region as your corresponding data stores, if possible. You can achieve this either with the auto-resolve Azure IR (if the data store location is the same as the Data Factory location), or by creating a new Azure IR instance in the same region as your data stores and then executing the data flow on it.
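
To make the compliance tip concrete, a minimal sketch of a region-pinned Azure IR is shown below (the name AzureIRUKSouth is a placeholder); both the Blob and SQL DW linked services would then reference it via connectVia, as in the earlier linked service example:

{
    "name": "AzureIRUKSouth",
    "properties": {
        "type": "Managed",
        "typeProperties": {
            "computeProperties": {
                "location": "UK South"
            }
        }
    }
}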

You can monitor which IR location takes effect during activity execution in the pipeline activity monitoring view in the UI, or in the activity monitoring payload.

Self-hosted IR location

The self-hosted IR is logically registered to the Data Factory, and the compute used to support its functionality is provided by you. Therefore, there is no explicit location property for the self-hosted IR.

When used to perform data movement, the self-hosted IR extracts data from the source and writes into the destination.

Azure-SSIS IR location

Selecting the right location for your Azure-SSIS IR is essential to achieve high performance in your extract-transform-load (ETL) workflows.

  • The location of your Azure-SSIS IR does not need to be the same as the location of your data factory, but it should be the same as the location of the Azure SQL Database or SQL Managed Instance where SSISDB is hosted. This way, your Azure-SSIS Integration Runtime can easily access SSISDB without incurring excessive traffic between different locations.
  • If you do not have an existing SQL Database or SQL Managed Instance, but you have on-premises data sources/destinations, you should create a new Azure SQL Database or SQL Managed Instance in the same location as a virtual network connected to your on-premises network. This way, you can create your Azure-SSIS IR using the new Azure SQL Database or SQL Managed Instance and join that virtual network, all in the same location, effectively minimizing data movement across different locations.
  • If the location of your existing Azure SQL Database or SQL Managed Instance is not the same as the location of a virtual network connected to your on-premises network, first create your Azure-SSIS IR using the existing Azure SQL Database or SQL Managed Instance and join another virtual network in the same location, and then configure a virtual-network-to-virtual-network connection between the different locations.

The following diagram shows the location settings of a Data Factory and its integration runtimes:

[Diagram: integration runtime location]

Determining which IR to use

Copy activity

The Copy activity requires source and sink linked services to define the direction of data flow. The following logic determines which integration runtime instance performs the copy:

  • Copying between two cloud data sources: when both source and sink linked services use an Azure IR, ADF uses the regional Azure IR if you specified one, or automatically determines the Azure IR location if you chose the auto-resolve IR (the default), as described in the Integration runtime location section.
  • Copying between a cloud data source and a data source in a private network: if either the source or sink linked service points to a self-hosted IR, the copy activity is executed on that self-hosted integration runtime.
  • Copying between two data sources in a private network: both the source and sink linked services must point to the same instance of integration runtime, and that integration runtime is used to execute the copy activity.

Lookup and GetMetadata activity

Lookup and GetMetadata activities are executed on the integration runtime associated with the data store linked service.

External transformation activity

Each external transformation activity that utilizes an external compute engine has a target compute Linked Service, which points to an integration runtime. This integration runtime instance determines the location where that external hand-coded transformation activity is dispatched from.

Data Flow activity

Data Flow activities are executed on the Azure integration runtime associated with them. The Spark compute utilized by Data Flows is determined by the data flow properties in your Azure Integration Runtime and is fully managed by ADF.

In Azure Data Factory, you can use the Copy activity to copy data among data stores located on-premises and in the cloud. After you copy the data, you can use other activities to further transform and analyze it. You can also use the Copy activity to publish transformation and analysis results for business intelligence (BI) and application consumption.

The role of the Copy activity

The Copy activity is executed on an integration runtime. You can use different types of integration runtimes for different data copy scenarios:

  • When you’re copying data between two data stores that are publicly accessible through the internet from any IP, you can use the Azure integration runtime for the copy activity. This integration runtime is secure, reliable, scalable, and globally available.
  • When you’re copying data to and from data stores that are located on-premises or in a network with access control (for example, an Azure virtual network), you need to set up a self-hosted integration runtime.

An integration runtime needs to be associated with each source and sink data store. For information about how the Copy activity determines which integration runtime to use, see Determining which IR to use.

To copy data from a source to a sink, the service that runs the Copy activity performs these steps:

  1. Reads data from a source data store.
  2. Performs serialization/deserialization, compression/decompression, column mapping, and so on. It performs these operations based on the configuration of the input dataset, output dataset, and Copy activity.
  3. Writes data to the sink/destination data store.

[Diagram: Copy activity overview]

Supported data stores and formats

The Copy activity supports the following data stores, grouped by category. Whether each store is supported as a source, as a sink, and by the Azure or self-hosted IR is documented in the corresponding connector article.

  • Azure: Azure Blob storage, Azure Cognitive Search index, Azure Cosmos DB (SQL API), Azure Cosmos DB's API for MongoDB, Azure Data Explorer, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, Azure Database for MariaDB, Azure Database for MySQL, Azure Database for PostgreSQL, Azure File Storage, Azure SQL Database, Azure SQL Managed Instance, Azure Synapse Analytics (formerly SQL Data Warehouse), Azure Table storage
  • Database: Amazon Redshift, DB2, Drill, Google BigQuery, Greenplum, HBase, Hive, Apache Impala, Informix, MariaDB, Microsoft Access, MySQL, Netezza, Oracle, Phoenix, PostgreSQL, Presto (Preview), SAP Business Warehouse via Open Hub, SAP Business Warehouse via MDX, SAP HANA, SAP table, Snowflake, Spark, SQL Server, Sybase, Teradata, Vertica
  • NoSQL: Cassandra, Couchbase (Preview), MongoDB
  • File: Amazon S3, File system, FTP, Google Cloud Storage, HDFS, SFTP
  • Generic protocol: Generic HTTP, Generic OData, Generic ODBC, Generic REST
  • Services and apps: Amazon Marketplace Web Service, Common Data Service, Concur (Preview), Dynamics 365, Dynamics AX, Dynamics CRM, Google AdWords, HubSpot (Preview), Jira, Magento (Preview), Marketo (Preview), Office 365, Oracle Eloqua (Preview), Oracle Responsys (Preview), Oracle Service Cloud (Preview), PayPal (Preview), QuickBooks (Preview), Salesforce, Salesforce Service Cloud, Salesforce Marketing Cloud, SAP Cloud for Customer (C4C), SAP ECC, ServiceNow, SharePoint Online List, Shopify (Preview), Square (Preview), Web table (HTML table), Xero, Zoho (Preview)

 Note

If a connector is marked Preview, you can try it out and give us feedback. If you want to take a dependency on preview connectors in your solution, contact Azure support.

Supported file formats

Azure Data Factory supports the following file formats: Avro, Binary, Delimited text, JSON, ORC, and Parquet. Refer to each format article for format-based settings.

You can use the Copy activity to copy files as-is between two file-based data stores, in which case the data is copied efficiently without any serialization or deserialization. In addition, you can parse or generate files of a given format. For example, you can:

  • Copy data from a SQL Server database and write to Azure Data Lake Storage Gen2 in Parquet format.
  • Copy files in text (CSV) format from an on-premises file system and write to Azure Blob storage in Avro format.
  • Copy zipped files from an on-premises file system, decompress them on the fly, and write the extracted files to Azure Data Lake Storage Gen2.
  • Copy data in Gzip compressed-text (CSV) format from Azure Blob storage and write it to Azure SQL Database (see the dataset sketch after this list).
  • Handle many more scenarios that require serialization/deserialization or compression/decompression.
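
As a sketch of the Gzip compressed-text scenario, the dataset below describes gzip-compressed CSV files in Blob storage; the dataset, linked service, container, and folder names are illustrative placeholders:

{
    "name": "GzipCsvOnBlob",
    "properties": {
        "type": "DelimitedText",
        "linkedServiceName": {
            "referenceName": "AzureBlobStorageLinkedService",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "input",
                "folderPath": "daily"
            },
            "columnDelimiter": ",",
            "firstRowAsHeader": true,
            "compressionCodec": "gzip"
        }
    }
}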

Supported regions

The service that enables the Copy activity is available globally in the regions and geographies listed in Azure integration runtime locations. The globally available topology ensures efficient data movement that usually avoids cross-region hops. See Products by region to check the availability of Data Factory and data movement in a specific region.

Configuration

You can use one of the following tools or SDKs to use the Copy activity with a pipeline: the Copy Data tool, the Azure portal, the .NET SDK, the Python SDK, Azure PowerShell, the REST API, or an Azure Resource Manager template. Select a link for step-by-step instructions.

In general, to use the Copy activity in Azure Data Factory, you need to:

  1. Create linked services for the source data store and the sink data store. You can find the list of supported connectors in the Supported data stores and formats section of this article. Refer to the connector article’s “Linked service properties” section for configuration information and supported properties.
  2. Create datasets for the source and sink. Refer to the “Dataset properties” sections of the source and sink connector articles for configuration information and supported properties.
  3. Create a pipeline with the Copy activity. The next section provides an example.

Syntax

The following template of a Copy activity contains a complete list of supported properties. Specify the ones that fit your scenario.

"activities":[
    {
        "name": "CopyActivityTemplate",
        "type": "Copy",
        "inputs": [
            {
                "referenceName": "<source dataset name>",
                "type": "DatasetReference"
            }
        ],
        "outputs": [
            {
                "referenceName": "<sink dataset name>",
                "type": "DatasetReference"
            }
        ],
        "typeProperties": {
            "source": {
                "type": "<source type>",
                <properties>
            },
            "sink": {
                "type": "<sink type>"
                <properties>
            },
            "translator":
            {
                "type": "TabularTranslator",
                "columnMappings": "<column mapping>"
            },
            "dataIntegrationUnits": <number>,
            "parallelCopies": <number>,
            "enableStaging": true/false,
            "stagingSettings": {
                <properties>
            },
            "enableSkipIncompatibleRow": true/false,
            "redirectIncompatibleRowSettings": {
                <properties>
            }
        }
    }
]

Syntax details

Property | Description | Required?
type | For a Copy activity, set to Copy. | Yes
inputs | Specify the dataset that you created that points to the source data. The Copy activity supports only a single input. | Yes
outputs | Specify the dataset that you created that points to the sink data. The Copy activity supports only a single output. | Yes
typeProperties | Specify properties to configure the Copy activity. | Yes
source | Specify the copy source type and the corresponding properties for retrieving data. For more information, see the "Copy activity properties" section in the connector article listed in Supported data stores and formats. | Yes
sink | Specify the copy sink type and the corresponding properties for writing data. For more information, see the "Copy activity properties" section in the connector article listed in Supported data stores and formats. | Yes
translator | Specify explicit column mappings from source to sink. This property applies when the default copy behavior doesn't meet your needs. For more information, see Schema mapping in copy activity. | No
dataIntegrationUnits | Specify a measure that represents the amount of power that the Azure integration runtime uses for data copy. These units were formerly known as cloud Data Movement Units (DMU). For more information, see Data Integration Units. | No
parallelCopies | Specify the parallelism that you want the Copy activity to use when reading data from the source and writing data to the sink. For more information, see Parallel copy. | No
preserve | Specify whether to preserve metadata/ACLs during data copy. For more information, see Preserve metadata. | No
enableStaging, stagingSettings | Specify whether to stage the interim data in Blob storage instead of directly copying data from source to sink. For information about useful scenarios and configuration details, see Staged copy. | No
enableSkipIncompatibleRow, redirectIncompatibleRowSettings | Choose how to handle incompatible rows when you copy data from source to sink. For more information, see Fault tolerance. | No

Monitoring

You can monitor the Copy activity run in Azure Data Factory both visually and programmatically. For details, see Monitor copy activity.

Incremental copy

Data Factory enables you to incrementally copy delta data from a source data store to a sink data store. For details, see Tutorial: Incrementally copy data.

Performance and tuning

The copy activity monitoring experience shows you the copy performance statistics for each of your activity runs. The Copy activity performance and scalability guide describes key factors that affect the performance of data movement via the Copy activity in Azure Data Factory. It also lists the performance values observed during testing and discusses how to optimize the performance of the Copy activity.

Resume from last failed run

The Copy activity supports resuming from the last failed run when you copy large files as-is in binary format between file-based stores and choose to preserve the folder/file hierarchy from source to sink, e.g. to migrate data from Amazon S3 to Azure Data Lake Storage Gen2. Resume applies to the following file-based connectors: Amazon S3, Azure Blob, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, Azure File Storage, File System, FTP, Google Cloud Storage, HDFS, and SFTP.

You can leverage the copy activity resume in the following two ways:

  • Activity-level retry: You can set a retry count on the copy activity. During pipeline execution, if this copy activity run fails, the next automatic retry starts from the last trial's failure point (see the policy sketch after this list).
  • Rerun from failed activity: After the pipeline execution completes, you can also trigger a rerun from the failed activity in the ADF UI monitoring view or programmatically. If the failed activity is a copy activity, the pipeline not only reruns from this activity, but also resumes from the previous run's failure point.
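
A sketch of the activity-level retry option: the retry settings live in the activity's policy block. The activity and dataset names, retry count, and interval below are illustrative placeholders:

{
    "name": "CopyLargeBinaryFiles",
    "type": "Copy",
    "policy": {
        "retry": 3,
        "retryIntervalInSeconds": 60,
        "timeout": "7.00:00:00"
    },
    "inputs": [ { "referenceName": "S3BinarySource", "type": "DatasetReference" } ],
    "outputs": [ { "referenceName": "AdlsGen2BinarySink", "type": "DatasetReference" } ],
    "typeProperties": {
        "source": { "type": "BinarySource" },
        "sink": { "type": "BinarySink" }
    }
}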

A few points to note:

  • Resume happens at the file level. If the copy activity fails while copying a file, that specific file is re-copied in the next run.
  • For resume to work properly, do not change the copy activity settings between the reruns.
  • When you copy data from Amazon S3, Azure Blob, Azure Data Lake Storage Gen2, and Google Cloud Storage, the copy activity can resume from an arbitrary number of copied files. For the remaining file-based connectors as source, the copy activity currently supports resuming from a limited number of files, usually in the range of tens of thousands, varying with the length of the file paths; files beyond this number are re-copied during reruns.

For scenarios other than binary file copy, a copy activity rerun starts from the beginning.

Preserve metadata along with data

While copying data from source to sink, in scenarios like data lake migration, you can also choose to preserve the metadata and ACLs along with data using copy activity. See Preserve metadata for details.

Schema and data type mapping

See Schema and data type mapping for information about how the Copy activity maps your source data to your sink.

Add additional columns during copy

In addition to copying data from the source data store to the sink, you can also configure the copy to add additional data columns to the sink. For example:

  • When copying from a file-based source, store the relative file path as an additional column to trace which file the data came from.
  • Add a column with an ADF expression, to attach ADF system variables like pipeline name/pipeline ID, or to store another dynamic value from an upstream activity's output.
  • Add a column with a static value to meet your downstream consumption needs.

You can find this configuration on the copy activity Source tab:

[Screenshot: add additional columns in the copy activity source]

 Tip

This feature works with the latest dataset model. If you don’t see this option from the UI, try creating a new dataset.

To configure it programmatically, add the additionalColumns property in your copy activity source:

Property | Description | Required
additionalColumns | Add additional data columns to copy to the sink. Each object under the additionalColumns array represents an extra column. The name defines the column name, and the value indicates the data value of that column. Allowed data values are: $$FILEPATH (a reserved variable that stores the source files' relative path to the folder path specified in the dataset; applies to file-based sources), an expression, or a static value. | No

Example:

"activities":[
    {
        "name": "CopyWithAdditionalColumns",
        "type": "Copy",
        "inputs": [...],
        "outputs": [...],
        "typeProperties": {
            "source": {
                "type": "<source type>",
                "additionalColumns": [
                    {
                        "name": "filePath",
                        "value": "$$FILEPATH"
                    },
                    {
                        "name": "pipelineName",
                        "value": {
                            "value": "@pipeline().Pipeline",
                            "type": "Expression"
                        }
                    },
                    {
                        "name": "staticValue",
                        "value": "sampleValue"
                    }
                ],
                ...
            },
            "sink": {
                "type": "<sink type>"
            }
        }
    }
]

Fault tolerance

By default, the Copy activity stops copying data and returns a failure when source data rows are incompatible with sink data rows. To make the copy succeed, you can configure the Copy activity to skip and log the incompatible rows and copy only the compatible data. See Copy activity fault tolerance for details.
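
As a sketch of that configuration, the fragment below enables skipping incompatible rows and redirects them to a log location; the linked service name and path are placeholders, and the source/sink types are example choices:

"typeProperties": {
    "source": { "type": "AzureSqlSource" },
    "sink": { "type": "AzureSqlSink" },
    "enableSkipIncompatibleRow": true,
    "redirectIncompatibleRowSettings": {
        "linkedServiceName": {
            "referenceName": "AzureBlobStorageLinkedService",
            "type": "LinkedServiceReference"
        },
        "path": "redirect/errors"
    }
}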
