I have a simple mapping which copies an AS400 table to an MSSQL table in Append mode.
I use a bulk load ("Action Process LOAD Rdbms to Mssql (with BULK)").
The source table contains 3,000,000 rows and it takes around 20 minutes to load them into the target table...
Is there a way to load them faster?
If this is not enough, try unloading the data from DB2/400 using the DB2 utilities: CALL SYSPROC.ADMIN_CMD('EXPORT TO /tmp/result.csv OF DEL SELECT c1,c2 FROM table')
=> If this is faster than the SqlFileExport "Unload of data from source into file" step, then you may want to design your process differently:
- First export data from DB2/400 using DB2 utilities: CALL SYSPROC.ADMIN_CMD('EXPORT TO /tmp/result.csv OF DEL SELECT c1,c2 FROM table')
- Get this file from the AS/400 to the SqlServer (the method depends on your servers' tools to transfer files)
- Load the file to SqlServer using Bulk
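For the last step, here is a minimal T-SQL sketch of the bulk load, assuming the exported file has been copied to a path visible to the SQL Server machine (the path and table name below are placeholders, not from your environment):

```sql
-- Load the exported DEL/CSV file into the target table.
-- 'C:\staging\result.csv' and dbo.TargetTable are hypothetical names.
BULK INSERT dbo.TargetTable
FROM 'C:\staging\result.csv'
WITH (
    FIELDTERMINATOR = ',',   -- DEL format uses comma-separated fields
    ROWTERMINATOR   = '\n',
    TABLOCK                  -- table lock, allows a minimally logged load
);
```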
The Fetch Size parameter doesn't change anything.
I would like to try the second option, but I just want to make sure I understood it correctly:
In a process:
1. An Operating System Command step running the DB2 utilities CALL
2. A Get Files step to fetch the file from the AS400 server to the SQL Server machine
3. A mapping loading the file using bulk
Am I correct?
1. The OS Command can be used to execute commands into the Runtime's host system.
I suppose you don't have a Runtime running on the AS/400, right?
Instead, simply use a SQL Operation connecting to DB2/400 and containing CALL SYSPROC.ADMIN_CMD('....')
2. Get Files.... with what? First check with your OS/400 admins which file transfer protocols are available.
I tried the Export File solution and the performance is not better (it's even a bit slower):
- it takes around 10 minutes to export the table from the AS400 database
- it takes 18 minutes to import the file (CSV, 2 GB) into the MSSQL target table using bulk
Is there a third solution?
By the way, I have a connection timeout which appears randomly on the mapping from the AS400 table to the MSSQL table.
I checked on the Internet but couldn't find the origin of the problem. I guess it's because of the amount of data transferred. Is it a Java problem (heap space)?
OK, so the DB2/400 & Mssql native tools are not really faster.
Is that their duration with the default configuration, or did you try various parameters (see their respective documentation)?
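On the SQL Server side, for example, BULK INSERT exposes several tuning options worth trying (the values below are illustrative, not recommendations, and the path and table name are placeholders):

```sql
BULK INSERT dbo.TargetTable
FROM 'C:\staging\result.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    TABLOCK,                  -- enables minimally logged inserts into a heap
    BATCHSIZE      = 500000,  -- commit every 500k rows instead of one huge transaction
    ROWS_PER_BATCH = 3000000  -- hint to the optimizer about the total row count
);
```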
A third solution can be to parallelize your job.
For example, base the extraction on an indexed or partition key which will let you extract independent batches of data.
Let's say I extract Customer data; I can parallelize it into 6 jobs:
- Job1 extracts customers where ID between 1 and 500000
- Job2 extracts customers where ID between 500001 and 1000000
- and so on.
You have to choose your key and the number of rows per batch according to your data.
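Concretely, each job runs the same extraction query with its own range predicate (table and column names here are illustrative):

```sql
-- Job 1
SELECT c1, c2 FROM CUSTOMER WHERE ID BETWEEN 1       AND  500000;
-- Job 2
SELECT c1, c2 FROM CUSTOMER WHERE ID BETWEEN 500001  AND 1000000;
-- ... and so on, one query per job
```

The ranges must not overlap, and the key column should be indexed so each job performs a range scan rather than a full table scan.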