In this step we'll set up the dataflow from the SAP system and Cosmos DB towards the Synapse database. Sales Order Headers will be extracted via the first Synapse pipeline using the SAP Table connector, Sales Order Items will be extracted via a second Synapse pipeline using the SAP ECC (OData) connector, and payment data will be extracted from Cosmos DB using a third pipeline.
The first step is to set up the target DB structures in Synapse.
The next step is to define the pipelines to copy the data. For this we first need to create technical connections, called Linked Services, to the different data sources and sinks. These Linked Services define the adapter to use and the corresponding connection parameters. For our example we'll need four Linked Services.
| Scenario | Source Linked Service | Sink Linked Service |
|---|---|---|
| Sales Order Header | SAP Table Connector | Synapse SQL Pool MicroHack |
| Sales Order Items | SAP ECC Connector | Synapse SQL Pool MicroHack |
| Payments | CosmosDB Collection - paymentData | Synapse SQL Pool MicroHack |
Based on the Linked Services, we then define which datasets to extract and where to write them in the target. This is defined in Integration Datasets.
| Scenario | Source Integration Dataset | Sink Integration Dataset |
|---|---|---|
| Sales Order Header | CDS View - ZBD_ISALESDOC_E | Table - SalesOrderHeaders |
| Sales Order Items | OData EntitySet - C_Salesorderitemfs | Table - SalesOrderItems |
| Payments | CosmosDB Collection - paymentData | Table - Payments |
The table below summarizes the Integration Datasets and Linked Services.
| Scenario | Source Integration Dataset | Source Linked Service | Sink Integration Dataset | Sink Linked Service |
|---|---|---|---|---|
| Sales Order Headers | CDS View ZBD_ISALESDOC_E | SAP Table Connector | Synapse Table SalesOrderHeaders | MicroHack SQL Pool - Azure Synapse Analytics |
| Sales Order Items | OData Entity Set C_Salesorderitemfs | SAP ECC Connector | Synapse Table SalesOrderItems | MicroHack SQL Pool - Azure Synapse Analytics |
| Payments | CosmosDB Collection paymentData | Cosmos DB - SAPS4D Database - CosmosDB SQL API | Synapse Table Payments | MicroHack SQL Pool - Azure Synapse Analytics |
The last step is to define the Synapse pipelines which will execute the copy. Here we link the source and sink/target datasets. This is also where you can define data mappings if necessary.
We'll start by creating the Synapse tables, which will receive the extracted data. Then we'll define the extraction pipelines one by one.
Create the Synapse tables in the SQL Pool
These tables are the receivers of the SAP Sales Order data and the Cosmos Payment Data.
Use the following SQL Scripts to create the tables.
You can do this via Synapse Studio or alternatively use the Azure Data Studio.
We'll describe the usage of Synapse Studio.
- In the Azure Portal, select your Synapse Workspace.
- Select `Open Synapse Studio`
- Select `Develop`
- Create a SQL script for each of the tables (`SalesOrderHeaders`, `SalesOrderItems`, `Payments`)
- `SalesOrderHeaders`

```sql
CREATE TABLE SalesOrderHeaders(
BILLINGCOMPANYCODE nvarchar(4),
BILLINGDOCUMENTDATE date,
COUNTRY nvarchar(3),
CREATIONDATE date,
CREATIONTIME time,
CREDITCONTROLAREA nvarchar(4),
CUSTOMERACCOUNTGROUP nvarchar(4),
CUSTOMERGROUP nvarchar(2),
CUSTOMERNAME nvarchar(80),
DISTRIBUTIONCHANNEL nvarchar(2),
LASTCHANGEDATE date,
LASTCHANGEDATETIME decimal(21,0),
MANDT int,
ORGANIZATIONDIVISION nvarchar(2),
PRICINGDATE date,
PURCHASEORDERBYCUSTOMER nvarchar(35),
SALESDISTRICT nvarchar(6),
SALESDOCUMENT nvarchar(10) NOT NULL,
SALESDOCUMENTPROCESSINGTYPE nvarchar(1),
SALESDOCUMENTTYPE nvarchar(4),
SALESGROUP nvarchar(3),
SALESOFFICE nvarchar(4),
SALESORGANIZATION nvarchar(4),
SDDOCUMENTCATEGORY nvarchar(4),
SOLDTOPARTY nvarchar(10),
TOTALNETAMOUNT decimal(15, 2),
TRANSACTIONCURRENCY nvarchar(5),
CITYNAME nvarchar(35),
POSTALCODE nvarchar(10)
)
```

- `SalesOrderItems`

```sql
CREATE TABLE SalesOrderItems(
SalesOrder nvarchar(10),
SalesOrderItem nvarchar(6),
SalesOrderItemText nvarchar(40),
SoldToParty nvarchar(10),
MaterialByCustomer nvarchar(35),
MaterialName nvarchar(40),
Material nvarchar(40),
ShipToParty nvarchar(10),
FullName nvarchar(80),
SDProcessStatus nvarchar(1),
DeliveryStatus nvarchar(1),
SDDocumentRejectionStatus nvarchar(1),
SalesDocumentRjcnReason nvarchar(2),
RequestedQuantity decimal(15,3),
RequestedQuantityUnit nvarchar(3),
TransactionCurrency nvarchar(5),
NetAmount decimal(16, 3),
MaterialGroup nvarchar(9),
Batch nvarchar(10),
ProductionPlant nvarchar(4),
StorageLocation nvarchar(4),
ShippingPointName nvarchar(30),
ShippingPoint nvarchar(4),
SalesOrderItemCategory nvarchar(4),
BillingBlockCriticality tinyint,
ItemBillingBlockReason nvarchar(2),
OrderRelatedBillingStatus nvarchar(1),
RequestedDeliveryDate date,
HigherLevelItem nvarchar(6),
SalesOrderProcessingType nvarchar(1),
RequirementSegment nvarchar(40)
)
```

- `Payments`

```sql
CREATE TABLE Payments(
PaymentNr nvarchar(10),
SalesOrderNr nvarchar(10),
CustomerNr nvarchar(10),
CustomerName nvarchar(80),
PaymentDate date,
PaymentValue decimal(15,2),
Currency nvarchar(5)
)
```

Note: Ensure the SQL script is attached to your dedicated SQL pool. By default it will be connected to the 'Built-in' serverless SQL pool of Synapse.
Make sure to run all three scripts so that the tables are created.
The sales order headers are extracted from SAP using the SAP Table adapter, which uses an RFC.
The view to extract from is `ZBD_ISALESDOC_E`.
Note: You can have a look in the SAP system to check the contents. Use the Data Dictionary, transaction `SE11`.
- In Synapse Studio, go to the `Manage` view
- Select `Linked services`
- Create a new `Linked Service` of type `SAP Table`
- Enter the connection details for the SAP system; we use the name `S4DCLNT100`. Use the username and password for the SAP system provided to you at the start of the MicroHack.

Note: use `Test connection` to verify your settings.

Note: SAP connection details will be handed out at the MicroHack.
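If you want to cross-check your configuration, the JSON definition of such an SAP Table Linked Service looks roughly like the sketch below. The server, system number, client, user and integration runtime names are placeholders for the connection details handed out at the MicroHack.

```json
{
    "name": "S4DCLNT100",
    "properties": {
        "type": "SapTable",
        "typeProperties": {
            "server": "<SAP application server host>",
            "systemNumber": "<system number, e.g. 00>",
            "clientId": "<client, e.g. 100>",
            "userName": "<SAP user>",
            "password": {
                "type": "SecureString",
                "value": "<SAP password>"
            }
        },
        "connectVia": {
            "referenceName": "<integration runtime>",
            "type": "IntegrationRuntimeReference"
        }
    }
}
```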
Create an Integration DataSet based on the previously created Linked Service.
This dataset will act as the source.
- Switch to the `Data` view
- Create a new `Integration Dataset`
- Use type `SAP Table`
- Use your previously created Linked Service for the SAP system (Table connector); as name we used `S4DSalesOrderHeaders`
- Use `ZBD_ISALESDOC_E` as table; it can take some time before the list of tables is loaded
- Use `Preview Data` to check if the data can be retrieved
- Once the information is entered successfully and the data can be retrieved, leave the tab as-is. We will publish the changes after the rest of the components of this data flow are done.

Note: the source code of the CDS View can be found here
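The resulting dataset definition should look roughly like this sketch (the dataset type and property names follow the SAP Table connector format; the names are the ones used above):

```json
{
    "name": "S4DSalesOrderHeaders",
    "properties": {
        "type": "SapTableResource",
        "linkedServiceName": {
            "referenceName": "S4DCLNT100",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "tableName": "ZBD_ISALESDOC_E"
        }
    }
}
```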
This will represent the target/sink of the pipeline.

- Switch to the `Manage` view
- Create a new Linked Service of type `Azure Synapse Analytics`

Note: Since this linked service represents the Synapse SQL pool, it will be re-used in the `SalesOrderItems` and `Payments` pipelines.
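A minimal JSON sketch of this Linked Service is shown below, assuming SQL authentication against the dedicated SQL pool; the workspace, pool and credential values are placeholders (Key Vault or managed identity authentication works just as well):

```json
{
    "name": "SynMicroHackPool",
    "properties": {
        "type": "AzureSqlDW",
        "typeProperties": {
            "connectionString": "Server=tcp:<workspace>.sql.azuresynapse.net,1433;Database=<dedicated SQL pool>;User ID=<user>;Password=<password>;Encrypt=true"
        }
    }
}
```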
This dataset will act as the sink in our pipeline.
- Switch to the `Data` view
- Create a new `Integration Dataset` for the Synapse Sales Orders
- As a name we used `SynSalesOrderHeaders` and for the linked service we used the one we just created, `SynMicroHackPool`
- Select the `SalesOrderHeaders` table
- Again, leave the information on the tab as-is and move to the next step
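The JSON behind this dataset should look roughly as follows, assuming the SalesOrderHeaders table was created in the default dbo schema:

```json
{
    "name": "SynSalesOrderHeaders",
    "properties": {
        "type": "AzureSqlDWTable",
        "linkedServiceName": {
            "referenceName": "SynMicroHackPool",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "schema": "dbo",
            "table": "SalesOrderHeaders"
        }
    }
}
```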
- Switch to the `Integrate` view
- Create a new `Pipeline`; we used `ExtractSalesOrderHeaders` as a name
- Use the `Copy data` activity by dragging it onto the pipeline canvas
- In the `Source` tab, select your SAP Sales Order dataset as the source
- In the `Sink` tab, select the Synapse Sales Order dataset as the sink

Note: Ensure you select `PolyBase` as the copy method.

- In the `Mapping` tab, select `Import schemas`. Since source and target fields have the same name, the system can auto-generate the mapping.
- For the prediction model we will calculate the offset between the billing document date and the actual payment date. For this we need to have these date fields mapped to SQL date fields. Therefore, go to the JSON code of the pipeline and add the `convertDateToDatetime` and `convertTimeToTimespan` parameters.

Add the parameters `convertDateToDatetime` and `convertTimeToTimespan` to the existing `typeProperties` > `source` element. The resulting document should look as follows:
"typeProperties": {
"source": {
"type": "SapTableSource",
"partitionOption": "None",
"convertDateToDatetime": true,
"convertTimeToTimespan": true
},
"sink": {
...- In the
Settingsblade,enable stagingand create create a Linked Service pointing to the Data Lake attached to your Synapse.
- Enter the path to the staging directory of your Azure Data Lake. The staging directory `sap-data-adls/staging` was already created by the Terraform script. A JSON sketch of the resulting copy activity settings is shown below.
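With PolyBase and staging enabled, the typeProperties of the copy activity should end up looking roughly like this; the staging linked service name is a placeholder for the Data Lake Linked Service you just created:

```json
"typeProperties": {
    "source": {
        "type": "SapTableSource",
        "partitionOption": "None",
        "convertDateToDatetime": true,
        "convertTimeToTimespan": true
    },
    "sink": {
        "type": "SqlDWSink",
        "allowPolyBase": true
    },
    "enableStaging": true,
    "stagingSettings": {
        "linkedServiceName": {
            "referenceName": "<data lake linked service>",
            "type": "LinkedServiceReference"
        },
        "path": "sap-data-adls/staging"
    }
}
```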
- Publish and trigger the pipeline; use `Add trigger` -> `Trigger now`
- Switch to the `Monitor` view to monitor the pipeline run
- Check the result in Synapse using SQL. You can do this via the `Develop` view and create a new SQL script.
```sql
select count(*) from SalesOrderHeaders
select * from SalesOrderHeaders
```

The SalesOrderItems are extracted from SAP using the SAP ECC connector, which is based on OData. We will use the OData service at `http://<System IP>/sap/opu/odata/sap/sd_f1814_so_fs_srv/`
- Create a `Linked Service` of type `SAP ECC`
- Enter the connection details; we used `S4DCLNT100_ODATA` as the name
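The JSON definition of the SAP ECC Linked Service looks roughly as follows, assuming basic authentication; the URL, user and password are the connection details handed out at the MicroHack:

```json
{
    "name": "S4DCLNT100_ODATA",
    "properties": {
        "type": "SapEcc",
        "typeProperties": {
            "url": "http://<System IP>/sap/opu/odata/sap/sd_f1814_so_fs_srv/",
            "username": "<SAP user>",
            "password": {
                "type": "SecureString",
                "value": "<SAP password>"
            }
        }
    }
}
```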
This dataset will act as the source for our pipeline.
- Create an `Integration Dataset` based on the `SAP ECC` adapter
- Use the previously created linked service; as a name we used `S4DSalesOrderItems`
- Use `C_Salesorderitemfs` as path
- Leave the information as-is and move to the next step
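The resulting OData dataset definition should look roughly like this sketch:

```json
{
    "name": "S4DSalesOrderItems",
    "properties": {
        "type": "SapEccResource",
        "linkedServiceName": {
            "referenceName": "S4DCLNT100_ODATA",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "path": "C_Salesorderitemfs"
        }
    }
}
```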
This dataset will act as the sink for our pipeline.
- Create an `Integration Dataset` based on `Azure Synapse Analytics`; we use the linked service created earlier
- As a name we use `SynSalesOrderItems` and select the `SalesOrderItems` table
- Go to the `Integrate` view and execute the same steps as for the SalesOrderHeaders data
- Create a new `Pipeline`
- Use the `Copy data` activity; as name we use `ExtractSalesOrderItems`
- As source, select the SAP SalesOrderItem OData dataset, which we named `S4DSalesOrderItems`
- As sink, select the Synapse SalesOrderItem dataset, which we named `SynSalesOrderItems`
- Under the `Mapping` tab use `Import schemas`
- Under the `Settings` tab enable and configure the staging area as done in the SalesOrderHeaders step
- Publish, trigger and monitor the integration pipeline
- Create a new SQL script to check the result in Synapse

```sql
select count(*) from SalesOrderItems
select * from SalesOrderItems
```

The Payments are extracted from Cosmos DB and will be stored in a Synapse table.
- Create a Linked Service of type `CosmosDB (SQL API)`
- Enter the connection parameters; as name we use `CosmosSAPS4D`
  - Azure Cosmos DB account URI: <handed out at MicroHack>
  - Azure Cosmos DB access key: <handed out at MicroHack>
  - Database name: `SAPS4D`
- Test the connection and create the linked service.
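The JSON definition of the Cosmos DB Linked Service looks roughly as follows; the account URI and access key are the values handed out at the MicroHack:

```json
{
    "name": "CosmosSAPS4D",
    "properties": {
        "type": "CosmosDb",
        "typeProperties": {
            "accountEndpoint": "<Azure Cosmos DB account URI>",
            "database": "SAPS4D",
            "accountKey": {
                "type": "SecureString",
                "value": "<Azure Cosmos DB access key>"
            }
        }
    }
}
```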
This dataset will act as the source for our pipeline.
- Create a source dataset for the payment data based on the Cosmos DB `SQL API` adapter
- As name we use `CosmosPaymentData`. Use collection `paymentData`.
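The resulting Cosmos DB dataset definition should look roughly like this sketch:

```json
{
    "name": "CosmosPaymentData",
    "properties": {
        "type": "CosmosDbSqlApiCollection",
        "linkedServiceName": {
            "referenceName": "CosmosSAPS4D",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "collectionName": "paymentData"
        }
    }
}
```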
This dataset will act as the sink for our pipeline.

- Create an `Integration Dataset` based on `Azure Synapse Analytics`
- As name we use `SynPayments`. Select the `Payments` table.
- Go to the `Integrate` view
- Add a new `Pipeline`
- Use the `Copy data` activity and name it `ExtractPayments`
- As source, select the Cosmos DB payment dataset; we named this `CosmosPaymentData`
- As sink, select the Synapse payment dataset; we named this `SynPayments`
- Under the `Settings` tab enable and configure the staging area as done in the earlier pipelines
- Go to the `Mapping` tab and choose `Import schemas`. Make sure to remove the mappings starting with `_` which are not shown in the screenshot; you can remove them by unchecking the checkbox behind them. Do not forget to change the `Column name` for `Value` to `PaymentValue` (see the mapping sketch below).
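In the pipeline JSON, the renamed column shows up as an explicit mapping in the copy activity's translator element. A shortened sketch is shown below; only the renamed PaymentValue mapping is listed, the remaining columns follow the same pattern, and the exact JSON path depends on the structure of the documents in the collection:

```json
"translator": {
    "type": "TabularTranslator",
    "mappings": [
        {
            "source": { "path": "$['Value']" },
            "sink": { "name": "PaymentValue" }
        }
    ]
}
```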
- Create, publish and trigger the integration pipeline
- Check the result in Synapse using SQL
```sql
select count(*) from Payments
select * from Payments
```

Continue with the next step.