BOB: Business Objects Board
Not endorsed by or affiliated with SAP

Register | Login 

Decision First 
Decision First (Opens a new window)  

General Notice: Upcoming Events: BI show Toulouse: Oct 18. There are 50 ASUG sponsored events in the next 45 days, see calendar for details.

What volumes of data do you move?


 
Search this topic... | Search Data Integrator... | Search Box
Register or Login to Post    Forum Index -> Data Integrator  Previous TopicPrint TopicNext Topic
Author Message
wdaehn
Forum Groupie
Forum Groupie



Joined: 17 Dec 2004

speaker.gif*6
Posts: 7635
Location: Germany


flag
PostPosted: Mon Sep 15, 2008 9:03 am 
Post subject: What volumes of data do you move?

I would like to know
* how many MB of data you load daily
* what the size your target database is in GB
* how many rows the biggest dataflows load
* what the overall load time is for the daily run


Some values are easy to get, others might not, just answer what you have at hand. This should not be a competition on who has the most, it should help us to understand where we are going. Should we focus more on performance, is usability more important as the volumes are so small for the majority of the customers etc...

_________________
Werner Daehn
Product Manager Data Integrator
SAP Business Objects

PLEASE SUPPORT THIS: Run one ATL job and tell us the benchmark result. Details to be found here.
Back to top
dnewton
Forum Addict
Forum Addict



Joined: 30 Jan 2004
ASUG Icon
Posts: 3194
Location: Massachusetts, USA


flag
PostPosted: Tue Sep 16, 2008 6:56 am 
Post subject: Re: What volumes of data do you move?

1. Not sure of the Mb - looking at the Loader statistics, we load on average 65M rows per night. This number includes re-aggregating certain tables etc., but does not include Data Transfer loading or staging of data.

2. For the core ETL process: Two targets (central DWH and spun-off Marts), totaling ~650Gb. 390 tables across both databases.

3. Largest regularly nightly load is 5M rows. Largest single table is 100M rows.

4. Takes 5 hours to load these 65M rows. Would be faster but we have poor change-data information on several sources.

I'd vote for programmer productivity/usability features, followed closely by 64-bit performance and driver support. icon_smile.gif
Back to top
dbeers
Senior Member
Senior Member



Joined: 19 Aug 2002

Posts: 42
Location: Commerce Township, MI, USA



PostPosted: Tue Sep 23, 2008 12:22 pm 
Post subject: Re: What volumes of data do you move?

We are using DI 11.7.2 to move data near real-time by looping DI batch job flows, so I will try to answer your query from that perspective.

I suspect we move about 5-10GB/day into our 4 datamarts via 4 DI jobs. The largest datamart is currently around 30GB, and the source data warehouse database is about 160GB - all reside on SQL Servers, the source being on a 2000 box and the datamart targets are all on a 2005 box.

Occasionally we need to do a full refresh of the datamarts, and when we do, we are moving about 100M rows into the 4 datamarts. The longest refresh (time-wise) is into the SalesMart - primarily due to transformations. Its largest table has 4.1M rows and the smallest only has 1 row. The total record count for the datamart is in the vicinity of 6M rows. To do this refresh generally takes around 6 hours.

The largest table that gets refreshed is about 24.3M rows (9GB) and takes about 4 hours to refresh.

What I would like to see most is DI support for AS400. When we purchased DI a year ago, we replaced the old ETL tool only for our SQL to SQL needs (building out the datamarts from the warehourse), leaving our existing tool to run between the ERP system on the AS400 and the SQL Server based data warehouse. At that time, we were told that a DI AS400 interface was in the works and was going to be available Q4 2007. Now we are told that there is no AS400 interface in the works, and since we are moving off our old ETL tool completely sometime in the next 6-12 months (due to rising licensing costs with no value add for the existing product), we are now looking at implementing yet another ETL tool for the AS400 to SQL piece.

_________________
Doug Beers
HoMedics, Inc.
Back to top
wdaehn
Forum Groupie
Forum Groupie



Joined: 17 Dec 2004

speaker.gif*6
Posts: 7635
Location: Germany


flag
PostPosted: Tue Sep 23, 2008 1:29 pm 
Post subject: Re: What volumes of data do you move?

Can you clarify the AS/400 piece?

We have customers using DB2 connect to read from AS/400, others generate files and read them in via DI Cobol Copybooks and some use the Attunity Mainframe connector for that.

What do you need, what method are you using today, what's missing?

_________________
Werner Daehn
Product Manager Data Integrator
SAP Business Objects

PLEASE SUPPORT THIS: Run one ATL job and tell us the benchmark result. Details to be found here.
Back to top
dbeers
Senior Member
Senior Member



Joined: 19 Aug 2002

Posts: 42
Location: Commerce Township, MI, USA



PostPosted: Wed Sep 24, 2008 7:03 am 
Post subject: Re: What volumes of data do you move?

Specifically what we are looking for is CDC support to the JDE tables on our AS400 to maintain our real-time data warehouse. We have been told that DI does not support that at this time. We currently use DataMirror (now IBM) Transformation Server to handle that functionality.
_________________
Doug Beers
HoMedics, Inc.
Back to top
1010
Principal Member
Principal Member



Joined: 19 Aug 2006

Posts: 453
Location: Somewhere EveryWhere!


flag
PostPosted: Fri Jan 08, 2010 1:33 pm 
Post subject: Re: What volumes of data do you move?

Hi,

We are planing to move around >= 500Gb of data from Informix Dynamic Server 9.52.FC5W4X3 source to Oracle 10g DB target, which will be our new ODS with 137 tables, and then from ODS to our new Dataware house, consisting of 38 dimensions all of Type-1 and 15 facts tables.

We are using DI 11.7 on Win server 2003.

The largest table we have in our Source has around 300 mill rows.

I will let you know, once we move it, from Informix to Oracle, hopefully pretty soon!


Thanks,

_________________
@Data is Delicious

++++++++++++++++
Where you see risk,
I see opportunity!
Back to top
David Coutts
Forum Member
Forum Member



Joined: 03 Jul 2009

Posts: 21
Location: Scotland


flag
PostPosted: Tue Jan 12, 2010 4:01 am 
Post subject: Re: What volumes of data do you move?

On a nightly basis we will be moving about 250Mb of data and once a week 8.5Gb of data (at the moment)

The target databse is currently around 20Gb

The biggest dataflows process approxiamatley 7000 rows on a nightly basis but are expected to load 4 Million Rows on a weekly basis.

The Nightly Load takes approximatly 1.5-2 hours whereas the weekly refresh takes approximatly 22 hours. It takes so long as we have to do some pretty silly things with the data from flat files to get it into a usable format.

Like dnewton i agree that Usability is the key for us here
Back to top
k7nixen
Senior Member
Senior Member



Joined: 07 Oct 2006

Posts: 88


flag
PostPosted: Wed Jan 13, 2010 3:24 pm 
Post subject: Re: What volumes of data do you move?

Approximately 50 million records are staged from flat files. After data validation, transformation and aggregation the data is moved to the target. Especially transformation part is very complex since the data is coming from different sources. Based on customer's needs we created a common data architecture model. All of the common data is translated to new business terminalogy based on the predefined rules etc. Currently the biggest target table has more than 20 billion records (it is a GSM -telecommunication- company). Data retention is 400 days so on 401st day, the 1st partition is dropped automatically which means the final table won't grow significantly as long as daily volume doesn't increase dramatically.

Daily load takes 50-55 minutes.
HP-UX Itanium (16 CPUs & 48GM RAM)
DI 11.5
Oracle 10g

Z.
Back to top
1010
Principal Member
Principal Member



Joined: 19 Aug 2006

Posts: 453
Location: Somewhere EveryWhere!


flag
PostPosted: Mon Jun 21, 2010 11:03 am 
Post subject: Re: What volumes of data do you move?

Alright Folks...

We just moved around 75 GB (including Indexes) of data from our Legacy Sources to Oracle 10g target in 55 hrs. we loaded around 120 ODS tables.

The Primary sources was IBM Informix Dynamic Server Version 11.50.FC5W4X3, Oracle 10g DB, Flat Files ...

This was not a direct dump, the task was to construct new ODS from these Sources. Some of the ETL code was fairly complex with lots of look_up and complex decode statements.

One of the ETL transfered 23 mill rows in just 1:33 hrs, pretty neat huhhhh..It had 5 look-ups and 3 decode statements, one of them 5 levels deep.

Our Machine Config:

Windows Server 2003 R2 Enterprise x64 edition, 2 Quad CPU @ 3 GHZ, with 32 GB of Ram.

Oracle 10g target DB on UNIX and BODI server is on different machine.

This is just 1/8 of data that we migrated, just for testing purpose!

Any recommendations, we want to minimize our time.

Thanks,

_________________
@Data is Delicious

++++++++++++++++
Where you see risk,
I see opportunity!
Back to top
k7nixen
Senior Member
Senior Member



Joined: 07 Oct 2006

Posts: 88


flag
PostPosted: Mon Jun 21, 2010 8:46 pm 
Post subject: Re: What volumes of data do you move?

SSV wrote:
Alright Folks...

We just moved around 75 GB (including Indexes) of data from our Legacy Sources to Oracle 10g target in 55 hrs. we loaded around 120 ODS tables.

The Primary sources was IBM Informix Dynamic Server Version 11.50.FC5W4X3, Oracle 10g DB, Flat Files ...

This was not a direct dump, the task was to construct new ODS from these Sources. Some of the ETL code was fairly complex with lots of look_up and complex decode statements.

One of the ETL transfered 23 mill rows in just 1:33 hrs, pretty neat huhhhh..It had 5 look-ups and 3 decode statements, one of them 5 levels deep.

Our Machine Config:

Windows Server 2003 R2 Enterprise x64 edition, 2 Quad CPU @ 3 GHZ, with 32 GB of Ram.

Oracle 10g target DB on UNIX and BODI server is on different machine.

This is just 1/8 of data that we migrated, just for testing purpose!

Any recommendations, we want to minimize our time.

Thanks,


Given the server configuration, the numbers look pretty impressive. I wonder if you guys did any performance tuning at the database level though, just curiosity... You might get a much better result as I was loading 50 million records in less than an hour(the whole ETL job not only one dataflow, including staging, transformation and dwh population).
Back to top
1010
Principal Member
Principal Member



Joined: 19 Aug 2006

Posts: 453
Location: Somewhere EveryWhere!


flag
PostPosted: Tue Jun 22, 2010 10:40 pm 
Post subject: Re: What volumes of data do you move?

k7nixen,

Just curious, when you say 50 mill < 1 hour, I am amazed.

Could you shed some light on your Architecture, i.e., BODI server, Source and Target DB etc....

Some finer point of any DF, that loads large data in target table.

I believe, we did optimize our DB for this load process. But, If there is any other setting that I am missing, I will be very glad to know the same.

_________________
@Data is Delicious

++++++++++++++++
Where you see risk,
I see opportunity!
Back to top
wdaehn
Forum Groupie
Forum Groupie



Joined: 17 Dec 2004

speaker.gif*6
Posts: 7635
Location: Germany


flag
PostPosted: Wed Jun 23, 2010 4:16 am 
Post subject: Re: What volumes of data do you move?

One hour for 50 million is 14'000rows/sec. Not bad, but not exceptional. In my tests I came close to 200'000rows/sec API bulkloading for a DWH fact table and comparable hardware.

To make the numbers comparable we need to sync up on
* API bulkload or regular load
* DoP, Number of Loaders, Partitioning
* width of the table loaded
* complexity of the transformations


Both of you should run the customer benchmark so we can compare!

_________________
Werner Daehn
Product Manager Data Integrator
SAP Business Objects

PLEASE SUPPORT THIS: Run one ATL job and tell us the benchmark result. Details to be found here.
Back to top
1010
Principal Member
Principal Member



Joined: 19 Aug 2006

Posts: 453
Location: Somewhere EveryWhere!


flag
PostPosted: Wed Jun 23, 2010 11:36 am 
Post subject: Re: What volumes of data do you move?

I think , I am replying in wrong section here...But it make sense to me, to put it here.

Source DB: Oracle 10g on Unix
Target DB : Oracle 10g on Unix
BODI 11.7.3.2 Server: Windows Server 2003 R2 Enterprise x64 edition, 2 Quad CPU @ 3 GHZ, with 32 GB of Ram.
Network, Target/Source to DI : Gigabit Etehrnet


All of the above are on different Machine.

Here are the numbers from the Test Environment, I ran this Job 5 times, and the number is from last run.

6104 7028 DATAFLOW 6/23/2010 12:17:57 PM Process to execute data flow <DF_Benchmark_read> is completed.
5808 6540 PRINTFN 6/23/2010 12:17:57 PM Time______________________________ 89 secounds

5180 1416 DATAFLOW 6/23/2010 12:22:49 PM Process to execute data flow <DF_Benchmark_API_bulkloader> is completed.
5808 6540 PRINTFN 6/23/2010 12:22:49 PM Time______________________________ 291 secounds

6896 6468 DATAFLOW 6/23/2010 12:29:29 PM Process to execute data flow <DF_Benchmark_regular_load> is completed.
5808 6540 PRINTFN 6/23/2010 12:29:29 PM Time______________________________ 388 secounds

4432 4600 DATAFLOW 6/23/2010 12:29:58 PM Process to execute data flow <DF_Benchmark_single_thread> is completed.
5808 6540 PRINTFN 6/23/2010 12:29:58 PM Time______________________________ 15 secounds

5040 5000 DATAFLOW 6/23/2010 12:30:26 PM Process to execute data flow <DF_Benchmark_lookup_DOP1> is completed.
5808 6540 PRINTFN 6/23/2010 12:30:26 PM Time______________________________ 27 secounds

5008 6900 DATAFLOW 6/23/2010 12:30:42 PM Process to execute data flow <DF_Benchmark_lookup_DOP10> is completed.
5808 6540 PRINTFN 6/23/2010 12:30:42 PM Time______________________________ 15 secounds


Thanks,

Any Suggestion!

_________________
@Data is Delicious

++++++++++++++++
Where you see risk,
I see opportunity!
Back to top
1010
Principal Member
Principal Member



Joined: 19 Aug 2006

Posts: 453
Location: Somewhere EveryWhere!


flag
PostPosted: Mon Jul 12, 2010 11:13 am 
Post subject: Re: What volumes of data do you move?

Werner,

After doing some optimization for one of my DF, I was able to get it under 25 mins to process 20 mill rows. i.e., 13334 rows /sec using API bulk load method, if I use Parallel loader it ran for 56 mins. Initially, it use to took like 2 hrs to load.

Still, I am not happy with this number, when I see ours (200'000 rows/sec ), would you mind posting this DF of ours, so that I can peek into it. I have to load around 250 Mill rows, when we go-live!

In your Test, where the Source/Target + BODI server on same Machine?; How many look_ups that Fact table had?


Thanks,

_________________
@Data is Delicious

++++++++++++++++
Where you see risk,
I see opportunity!
Back to top
Display posts from previous:   
Register or Login to Post    Forum Index -> Data Integrator  Previous TopicPrint TopicNext Topic
Page 1 of 1 All times are GMT - 5 Hours
 
Jump to:  

Index | About | FAQ | RAG | Privacy | Search |  Register |  Login 

Not endorsed by or affiliated with SAP
Powered by phpBB © phpBB Group
Generated in 0.1807 seconds using 14 queries. (SQL 0.0079 Parse 0.1518 Other 0.0210)
Hosted by ForumTopics.com | Terms of Service
phpBB Customizations by the phpBBDoctor.com
Shameless plug for MomentsOfLight.com Moments of Light Logo