This article explains and demonstrates the Azure Data Factory pricing model with detailed examples.


The prices used in the examples below are hypothetical and are not intended to imply actual pricing.

Copy data from AWS S3 to Azure Blob storage hourly

In this scenario, you want to copy data from AWS S3 to Azure Blob storage on an hourly schedule.

To accomplish the scenario, you need to create a pipeline with the following items:

  1. A copy activity with an input dataset for the data to be copied from AWS S3.
  2. An output dataset for the data on Azure Storage.
  3. A schedule trigger to execute the pipeline every hour.

Operations | Types and Units
Create Linked Service | 2 Read/Write entities
Create Datasets | 4 Read/Write entities (2 for dataset creation, 2 for linked service references)
Create Pipeline | 3 Read/Write entities (1 for pipeline creation, 2 for dataset references)
Get Pipeline | 1 Read/Write entity
Run Pipeline | 2 Activity runs (1 for trigger run, 1 for activity run)
Copy Data (assumption: execution time = 10 min) | 10 min * 4 DIUs on the Azure Integration Runtime (default DIU setting = 4). For more information on data integration units and optimizing copy performance, see this article
Monitor Pipeline (assumption: only 1 run occurred) | 2 Monitoring run records retrieved (1 for pipeline run, 1 for activity run)

Total Scenario pricing: $0.16811

  • Data Factory Operations = $0.00011
    • Read/Write = 10 * $0.00001 = $0.0001 [1 R/W = $0.50/50000 = $0.00001]
    • Monitoring = 2 * $0.000005 = $0.00001 [1 Monitoring record = $0.25/50000 = $0.000005]
  • Pipeline Orchestration & Execution = $0.168
    • Activity Runs = 2 * $0.001 = $0.002 [1 run = $1/1000 = $0.001]
    • Data Movement Activities = $0.166 (prorated for 10 minutes of execution time at 4 DIUs; $0.25/DIU-hour on the Azure Integration Runtime)
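The arithmetic above can be sanity-checked with a few lines of Python. This is only an illustrative sketch using the hypothetical unit prices quoted in the breakdown (not actual Azure Data Factory rates), and it also shows the rounding choice made on the data movement charge:

```python
# Hypothetical unit prices from the breakdown above (not actual Azure pricing).
read_write = 10 * (0.50 / 50_000)       # 10 R/W entities -> $0.0001
monitoring = 2 * (0.25 / 50_000)        # 2 monitoring records -> $0.00001
activity_runs = 2 * (1.00 / 1_000)      # 1 trigger run + 1 activity run -> $0.002
data_movement = (10 / 60) * 4 * 0.25    # 10 min at 4 DIUs, $0.25/DIU-hour -> ~$0.1667

total = read_write + monitoring + activity_runs + data_movement
print(f"Scenario 1 total: ${total:.4f}")  # prints ~$0.1688; quoting the data movement
                                          # charge as $0.166 gives the article's $0.16811
```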

Copy data and transform with Azure Databricks hourly

In this scenario, you want to copy data from AWS S3 to Azure Blob storage and transform the data with Azure Databricks on an hourly schedule.

To accomplish the scenario, you need to create a pipeline with the following items:

  1. One copy activity with an input dataset for the data to be copied from AWS S3, and an output dataset for the data on Azure storage.
  2. One Azure Databricks activity for the data transformation.
  3. One schedule trigger to execute the pipeline every hour.

Operations | Types and Units
Create Linked Service | 3 Read/Write entities
Create Datasets | 4 Read/Write entities (2 for dataset creation, 2 for linked service references)
Create Pipeline | 3 Read/Write entities (1 for pipeline creation, 2 for dataset references)
Get Pipeline | 1 Read/Write entity
Run Pipeline | 3 Activity runs (1 for trigger run, 2 for activity runs)
Copy Data (assumption: execution time = 10 min) | 10 min * 4 DIUs on the Azure Integration Runtime (default DIU setting = 4). For more information on data integration units and optimizing copy performance, see this article
Monitor Pipeline (assumption: only 1 run occurred) | 3 Monitoring run records retrieved (1 for pipeline run, 2 for activity runs)
Execute Databricks activity (assumption: execution time = 10 min) | 10 min of External Pipeline Activity execution

Total Scenario pricing: $0.16916

  • Data Factory Operations = $0.00012
    • Read/Write = 11 * $0.00001 = $0.00011 [1 R/W = $0.50/50000 = $0.00001]
    • Monitoring = 3 * $0.000005 = $0.000015 [1 Monitoring record = $0.25/50000 = $0.000005]
  • Pipeline Orchestration & Execution = $0.16904
    • Activity Runs = 3 * $0.001 = $0.003 [1 run = $1/1000 = $0.001]
    • Data Movement Activities = $0.166 (prorated for 10 minutes of execution time at 4 DIUs; $0.25/DIU-hour on the Azure Integration Runtime)
    • External Pipeline Activity = $0.000041 (prorated for 10 minutes of execution time; $0.00025/hour on the Azure Integration Runtime)
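The same check for this scenario, again with the hypothetical rates from the breakdown above; the only new ingredient is the prorated external (Databricks) activity:

```python
# Hypothetical unit prices from the breakdown above (not actual Azure pricing).
operations = 11 * 0.00001 + 3 * 0.000005   # read/write + monitoring = $0.000125
activity_runs = 3 * 0.001                  # $0.003
data_movement = 0.166                      # 10 min at 4 DIUs, as prorated above
databricks = (10 / 60) * 0.00025           # 10 min of external activity -> ~$0.000042

total = operations + activity_runs + data_movement + databricks
print(f"Scenario 2 total: ${total:.4f}")   # prints ~$0.1692; the article rounds its
                                           # subtotals and reports $0.16916
```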

Copy data and transform with dynamic parameters hourly

In this scenario, you want to copy data from AWS S3 to Azure Blob storage and transform with Azure Databricks (with dynamic parameters in the script) on an hourly schedule.

To accomplish the scenario, you need to create a pipeline with the following items:

  1. One copy activity with an input dataset for the data to be copied from AWS S3, and an output dataset for the data on Azure Storage.
  2. One Lookup activity for passing parameters dynamically to the transformation script.
  3. One Azure Databricks activity for the data transformation.
  4. One schedule trigger to execute the pipeline every hour.

Operations | Types and Units
Create Linked Service | 3 Read/Write entities
Create Datasets | 4 Read/Write entities (2 for dataset creation, 2 for linked service references)
Create Pipeline | 3 Read/Write entities (1 for pipeline creation, 2 for dataset references)
Get Pipeline | 1 Read/Write entity
Run Pipeline | 4 Activity runs (1 for trigger run, 3 for activity runs)
Copy Data (assumption: execution time = 10 min) | 10 min * 4 DIUs on the Azure Integration Runtime (default DIU setting = 4). For more information on data integration units and optimizing copy performance, see this article
Monitor Pipeline (assumption: only 1 run occurred) | 4 Monitoring run records retrieved (1 for pipeline run, 3 for activity runs)
Execute Lookup activity (assumption: execution time = 1 min) | 1 min of Pipeline Activity execution
Execute Databricks activity (assumption: execution time = 10 min) | 10 min of External Pipeline Activity execution

Total Scenario pricing: $0.17020

  • Data Factory Operations = $0.00013
    • Read/Write = 11 * $0.00001 = $0.00011 [1 R/W = $0.50/50000 = $0.00001]
    • Monitoring = 4 * $0.000005 = $0.00002 [1 Monitoring record = $0.25/50000 = $0.000005]
  • Pipeline Orchestration & Execution = $0.17007
    • Activity Runs = 4 * $0.001 = $0.004 [1 run = $1/1000 = $0.001]
    • Data Movement Activities = $0.166 (prorated for 10 minutes of execution time at 4 DIUs; $0.25/DIU-hour on the Azure Integration Runtime)
    • Pipeline Activity = $0.00003 (prorated for 1 minute of execution time; $0.002/hour on the Azure Integration Runtime)
    • External Pipeline Activity = $0.000041 (prorated for 10 minutes of execution time; $0.00025/hour on the Azure Integration Runtime)
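And the Scenario 3 figures, which add a one-minute Lookup activity on top of the copy and Databricks activities (hypothetical rates as above):

```python
# Hypothetical unit prices from the breakdown above (not actual Azure pricing).
operations = 11 * 0.00001 + 4 * 0.000005   # read/write + monitoring = $0.00013
activity_runs = 4 * 0.001                  # $0.004
data_movement = 0.166                      # 10 min at 4 DIUs, as prorated above
lookup = (1 / 60) * 0.002                  # 1 min of pipeline activity -> ~$0.00003
databricks = (10 / 60) * 0.00025           # 10 min of external activity -> ~$0.000042

total = operations + activity_runs + data_movement + lookup + databricks
print(f"Scenario 3 total: ${total:.4f}")   # prints ~$0.1702, matching the article's $0.17020
```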

Using mapping data flow debug for a normal workday

As a Data Engineer, you are responsible for designing, building, and testing mapping data flows every day. You log into the ADF UI in the morning and enable the Debug mode for Data Flows. The default TTL for Debug sessions is 60 minutes. You work throughout the day for 8 hours, so your Debug session never expires. Therefore, your charge for the day will be:

8 (hours) x 8 (compute-optimized cores) x $0.193 per vCore-hour = $12.35
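As a quick check, the same figure in Python, using the hypothetical $0.193 per vCore-hour rate above:

```python
# Debug session cost for one 8-hour workday on 8 compute-optimized cores
# (hypothetical $0.193 per vCore-hour rate from the example above).
hours, cores, rate = 8, 8, 0.193
print(f"Daily debug cost: ${hours * cores * rate:.2f}")  # prints $12.35
```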

Transform data in blob store with mapping data flows

In this scenario, you want to transform data in Blob Store visually in ADF mapping data flows on an hourly schedule.

To accomplish the scenario, you need to create a pipeline with the following items:

  1. A Data Flow activity with the transformation logic.
  2. An input dataset for the data on Azure Storage.
  3. An output dataset for the data on Azure Storage.
  4. A schedule trigger to execute the pipeline every hour.

Operations | Types and Units
Create Linked Service | 2 Read/Write entities
Create Datasets | 4 Read/Write entities (2 for dataset creation, 2 for linked service references)
Create Pipeline | 3 Read/Write entities (1 for pipeline creation, 2 for dataset references)
Get Pipeline | 1 Read/Write entity
Run Pipeline | 2 Activity runs (1 for trigger run, 1 for activity run)
Data Flow (assumptions: execution time = 10 min + 10 min TTL) | 10 min * 16 cores of General Compute, with a TTL of 10 min
Monitor Pipeline (assumption: only 1 run occurred) | 2 Monitoring run records retrieved (1 for pipeline run, 1 for activity run)

Total Scenario pricing: $1.4631

  • Data Factory Operations = $0.00011
    • Read/Write = 10 * $0.00001 = $0.0001 [1 R/W = $0.50/50000 = $0.00001]
    • Monitoring = 2 * $0.000005 = $0.00001 [1 Monitoring record = $0.25/50000 = $0.000005]
  • Pipeline Orchestration & Execution = $1.463
    • Activity Runs = 2 * $0.001 = $0.002 [1 run = $1/1000 = $0.001]
    • Data Flow Activities = $1.461 (prorated for 20 minutes: 10 min execution time + 10 min TTL; $0.274/vCore-hour on the Azure Integration Runtime with 16 cores of general compute)
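As before, the total can be reproduced in a few lines of Python with the hypothetical rates quoted above; the data flow charge is prorated on vCore-hours across execution time plus TTL, which is what the $1.461 figure implies:

```python
# Hypothetical unit prices from the breakdown above (not actual Azure pricing).
operations = 10 * 0.00001 + 2 * 0.000005      # read/write + monitoring = $0.00011
activity_runs = 2 * 0.001                     # $0.002
# 10 min execution + 10 min TTL, 16 general compute cores at $0.274/vCore-hour
data_flow = round((20 / 60) * 16 * 0.274, 3)  # ~$1.461

total = operations + activity_runs + data_flow
print(f"Scenario total: ${total:.4f}")        # prints $1.4631
```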
