I have a simple mapping which copies an AS400 table to an MSSQL table in Append mode.
I use a bulk load ("Action Process LOAD Rdbms to Mssql (with BULK)").
The source table contains 3,000,000 rows and it takes around 20 minutes to load them into the target table...
Is there a way to load them faster?
If this is not enough, try unloading the data from DB2/400 using the DB2 utilities: CALL SYSPROC.ADMIN_CMD('EXPORT TO /tmp/result.csv OF DEL SELECT c1,c2 FROM table')
=> If this is faster than the SqlFileExport "Unload of data from source into file" step, then you may want to design your process differently:
- First export data from DB2/400 using DB2 utilities: CALL SYSPROC.ADMIN_CMD('EXPORT TO /tmp/result.csv OF DEL SELECT c1,c2 FROM table')
- Get this file from the AS/400 to the SqlServer (the method depends on your servers' tools to transfer files)
- Load the file to SqlServer using Bulk
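For the last step, here is a minimal T-SQL sketch of the bulk load, assuming the exported file has been copied to a path visible to the SQL Server machine (the path and table name below are placeholders, not from your environment):

```sql
-- Load the exported DEL/CSV file into the target table.
-- 'C:\staging\result.csv' and dbo.TargetTable are hypothetical names.
BULK INSERT dbo.TargetTable
FROM 'C:\staging\result.csv'
WITH (
    FIELDTERMINATOR = ',',   -- DEL format uses comma-separated fields
    ROWTERMINATOR   = '\n',
    TABLOCK                  -- table lock, allows a minimally logged load
);
```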
The Fetch Size parameter doesn't change anything.
I would like to try the second option, but I just want to make sure I understood it correctly:
In a process:
1. An Operating System Command step running the DB2 utilities CALL
2. A Get Files step to fetch the file from the AS400 server to the SQL Server machine
3. A mapping loading the file using bulk
Am I correct?
1. The OS Command can be used to execute commands into the Runtime's host system.
I suppose you don't have a Runtime running on the AS/400, right?
Instead, simply use a SQL Operation connecting to DB2/400 and containing CALL SYSPROC.ADMIN_CMD('....')
2. Get Files.... with what? First check with your OS/400 admins which file transfer protocols are available.
I tried the Export File solution and the performance is not better (it's even a bit slower):
- it takes around 10 minutes to export the table from the AS400 database
- it takes 18 minutes to import the file (CSV, 2 GB) into the MSSQL target table using bulk
Is there a third solution?
By the way, I have a connection timeout which appears randomly on the mapping from the AS400 table to the MSSQL table.
I checked on the Internet but couldn't find the origin of the problem. I guess it's because of the amount of data transferred. Is it a Java problem (heap space)?
OK, so the DB2/400 & Mssql native tools are not really faster.
Is that their duration with the default configuration, or did you try various parameters (see their respective documentation)?
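On the SQL Server side, for example, BULK INSERT exposes several tuning options worth trying (the values below are illustrative, not recommendations, and the path and table name are placeholders):

```sql
BULK INSERT dbo.TargetTable
FROM 'C:\staging\result.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    TABLOCK,                  -- enables minimally logged inserts into a heap
    BATCHSIZE      = 500000,  -- commit every 500k rows instead of one huge transaction
    ROWS_PER_BATCH = 3000000  -- hint to the optimizer about the total row count
);
```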
A third solution can be to parallelize your job.
For example, base the extraction on an indexed or partition key which will let you extract independent batches of data.
Let's say I extract Customer data; I can parallelize it into 6 jobs:
- Job1 extracts customers where ID between 1 and 500000
- Job2 extracts customers where ID between 500001 and 1000000
- and so on.
You have to choose your key and the number of rows per batch according to your data.
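Concretely, each job runs the same extraction query with its own range predicate (table and column names here are illustrative):

```sql
-- Job 1
SELECT c1, c2 FROM CUSTOMER WHERE ID BETWEEN 1       AND  500000;
-- Job 2
SELECT c1, c2 FROM CUSTOMER WHERE ID BETWEEN 500001  AND 1000000;
-- ... and so on, one query per job
```

The ranges must not overlap, and the key column should be indexed so each job performs a range scan rather than a full table scan.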