Source: https://blogs.technet.microsoft.com/machinelearning/2017/02/09/load-data-from-azure-data-lake-into-azure-sql-data-warehouse-at-3tbhour/

 

Re-posted from the Azure blog.

Azure SQL Data Warehouse (Azure SQL DW, or just SQL DW for short) is a SQL-based, fully managed, petabyte-scale data warehousing solution in the cloud. It is highly elastic: you can provision in minutes and scale capacity in seconds. You can scale compute and storage independently, which lets you burst compute for complex analytical workloads or scale down your warehouse for archival scenarios. What’s more, you pay by usage rather than being locked into expensive predefined cluster configurations.

Azure Data Lake (ADL) is a no-limits data lake optimized for massively parallel processing. It lets you store and analyze petabyte-size files and trillions of objects.

A common use case involving ADL Store (ADLS) and SQL DW is the following: raw data is ingested into ADLS from a variety of sources. ADL Analytics (ADLA) is used to clean and process the data into a loading-ready format. From there, high-value data is imported into Azure SQL DW for interactive analytics.

Until recently, the data in ADLS would be loaded into SQL DW using row-by-row insertion, which was slow and delayed how quickly the data could be explored for useful business insights.

However, as we recently announced, with SQL DW PolyBase support for ADLS, you can now load data directly from ADLS into your SQL DW instance using External Tables at nearly 3 TB per hour. Because SQL DW can now ingest data directly from both Azure Storage Blob and ADLS, you can load data from either Azure storage service, giving you the flexibility to choose what’s right for your application. The picture below captures the “Before” and “After” situation.

Intrigued? Read the original Azure blog post to learn more, including how to connect ADLS to SQL DW and best practices for loading data. You can also learn more about the new PolyBase capability and check out a short video clip on how to use this new feature.

If you already have an Azure Data Lake Store, you can try loading your data into SQL Data Warehouse. For those of you still exploring Azure Data Lake, check out the ADLS tutorials, which will get you up and running.
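To give a feel for what a PolyBase load from ADLS into SQL DW involves, here is a minimal Python sketch that composes the sequence of T-SQL statements such a load typically uses: a credential, an external data source, a file format, an External Table, and a final CREATE TABLE AS SELECT. This is an illustration and not the procedure from the original post. Every object name (`AdlsCredential`, `AdlsSource`, `PipeDelimited`, `dbo.SalesExternal`, `dbo.SalesStaged`), the column list, the pipe delimiter, and the function names are hypothetical placeholders, and the statement shapes should be checked against the official T-SQL documentation before use. The small `hours_to_load` helper simply divides by the "nearly 3 TB per hour" figure quoted above, so it gives an optimistic estimate.

```python
"""Sketch: compose T-SQL for a PolyBase load from ADLS into SQL DW.

All object names and columns below are hypothetical placeholders.
"""


def polybase_load_statements(adls_account, folder, client_id, token_endpoint):
    """Return the ordered T-SQL statements for an illustrative PolyBase load."""
    location = f"adl://{adls_account}.azuredatalakestore.net"
    return [
        # 1. A master key protects the credential secret stored in the database.
        "CREATE MASTER KEY;",
        # 2. Service principal credential used to reach ADLS
        #    (the secret is left as a placeholder on purpose).
        "CREATE DATABASE SCOPED CREDENTIAL AdlsCredential "
        f"WITH IDENTITY = '{client_id}@{token_endpoint}', "
        "SECRET = '<service-principal-key>';",
        # 3. External data source pointing at the ADLS account.
        "CREATE EXTERNAL DATA SOURCE AdlsSource "
        f"WITH (TYPE = HADOOP, LOCATION = '{location}', "
        "CREDENTIAL = AdlsCredential);",
        # 4. File format of the loading-ready files (assumed pipe-delimited text).
        "CREATE EXTERNAL FILE FORMAT PipeDelimited "
        "WITH (FORMAT_TYPE = DELIMITEDTEXT, "
        "FORMAT_OPTIONS (FIELD_TERMINATOR = '|'));",
        # 5. External Table: a schema over the files; no data is moved yet.
        "CREATE EXTERNAL TABLE dbo.SalesExternal "
        "(SaleId INT, Amount DECIMAL(18, 2)) "
        f"WITH (LOCATION = '{folder}', DATA_SOURCE = AdlsSource, "
        "FILE_FORMAT = PipeDelimited);",
        # 6. CTAS performs the actual parallel load into a SQL DW table.
        "CREATE TABLE dbo.SalesStaged "
        "WITH (DISTRIBUTION = ROUND_ROBIN) "
        "AS SELECT * FROM dbo.SalesExternal;",
    ]


def hours_to_load(size_tb, rate_tb_per_hour=3.0):
    """Optimistic load time at the quoted peak rate of about 3 TB per hour."""
    return size_tb / rate_tb_per_hour


if __name__ == "__main__":
    statements = polybase_load_statements(
        adls_account="myadls",            # hypothetical account name
        folder="/clean/sales/",           # hypothetical ADLS folder
        client_id="<client-id>",
        token_endpoint="<oauth-token-endpoint>",
    )
    print("\n\n".join(statements))
    print(f"\nEstimated hours for 12 TB: {hours_to_load(12):.1f}")
```

The statements are only generated as strings here; running them would require a SQL client connected to your SQL DW instance and a service principal with access to the ADLS account.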

CIML Blog Team

Load Data From Azure Data Lake Into Azure SQL Data Warehouse At 3TB/Hour