BOB: Business Objects Board
Not endorsed by or affiliated with SAP

Benchmark YOUR hardware and compare the results V1.1

data_guy (Forum Associate)
Joined: 19 Aug 2006 | Posts: 723 | Location: Somewhere EveryWhere!
Posted: Wed Apr 11, 2007 4:26 pm
Subject: Re: Benchmark YOUR hardware and compare the results

DI JOB SERVER:

Microsoft Windows server 2003 Standard edition SP1
AMD Opteron processor 875
2.20 GHz, 3.51 GB of RAM

------------------------------------------------------------------------------

DI DESIGNER:

Microsoft Windows XP Professional SP2
Intel CPU T4200 @ 1.83 GHz, 1.00 GB of RAM


SOURCE AND TARGET DATABASE: Oracle 10g


The source/target database and the DI job server are on different machines, connected via a gigabit switch.

<DF_Benchmark_read_MS>: 146 seconds
<DF_Benchmark_API_bulkloader_MS>: 269 seconds, WITHOUT PARTITION OPTION
<DF_Benchmark_API_bulkloader_MS>: 4520 seconds, WITH PARTITION OPTION
<DF_Benchmark_regular_load_MS>: 1606 seconds
<DF_Benchmark_single_thread_MS>: 43 seconds
<DF_Benchmark_lookup_DOP1_MS>: 91 seconds
<DF_Benchmark_lookup_DOP10_MS>: 88 seconds



Any suggestions?

Thanks,

dnewton (Forum Addict)
Joined: 30 Jan 2004 | Posts: 3446 | Location: Massachusetts, USA
Posted: Mon Apr 23, 2007 6:42 am
Subject: Re: Benchmark YOUR hardware and compare the results

Is there enough data yet to draw any conclusions?

Everyone else: C'mon, give it a try, it only takes maybe 10-15 minutes to do the test.

Werner Daehn (Forum Devotee)
Joined: 17 Dec 2004 | Posts: 10593 | Location: Germany
Posted: Mon Apr 23, 2007 9:21 am
Subject: Re: Benchmark YOUR hardware and compare the results

@Data_Guy: Sorry, I overlooked this one. A partitioned load into Oracle is slower than one without partitions??? I wonder whether the API bulkloader was really still being used once you enabled the partitioned loader.

Reimport the table, check the loader settings and execute again. By the way, I have already filed the bug that you cannot set "enable partitions" while the API bulkloader is turned on; my workaround is to set the flag first and then turn on the bulkloader, so that the field is greyed out but still checked - that works for me at least. Please check whether the trace log contains multiple lines saying "bulkloaderAPI".


Werner Daehn (Forum Devotee)
Joined: 17 Dec 2004 | Posts: 10593 | Location: Germany
Posted: Mon Apr 23, 2007 9:45 am
Subject: Re: Benchmark YOUR hardware and compare the results

Conclusions so far:
    * GBit Ethernet is important (150 secs vs. 90 secs local).
    * 10 GBit Ethernet cards will not help much, because a single reader thread will then consume one CPU entirely - so a partitioned reader is required.
    * Loading, even with the API bulkloader, is several times slower than reading.
    * Yes, avoid loading via the network.
    * Loading gets faster with disk arrays - so I/O has to be the bottleneck for the database.
    * DOP scales well with the number of CPUs - within natural limits, obviously.


Too little data yet to understand:
    * the difference between 6.5, 11.5 and 11.7
    * VMware vs. native
    * Windows vs. Unix
    * 8-CPU servers (the jobs are not built to produce better numbers for more than 8 CPUs)


data_guy (Forum Associate)
Joined: 19 Aug 2006 | Posts: 723 | Location: Somewhere EveryWhere!
Posted: Mon Apr 23, 2007 1:05 pm
Subject: Re: Benchmark YOUR hardware and compare the results

Yes, Werner...
There was indeed a problem with the partitioning.
As you suggested, I re-ran the tests; below are the new statistics:


--------------------------------------------------------------------------------

DI JOB SERVER:

Microsoft Windows server 2003 Standard edition SP1
AMD Opteron processor 875
2.20 GHz, 3.51 GB of RAM

------------------------------------------------------------------------------

DI DESIGNER:

Microsoft Windows XP Professional SP2
Intel CPU T4200 @ 1.83 GHz, 1.00 GB of RAM


SOURCE AND TARGET DATABASE: Oracle 10g


The source/target database and the DI job server are on different machines, connected via a gigabit switch.



<DF_Benchmark_read>: 80 seconds
<DF_Benchmark_API_bulkloader>: 752 seconds
<DF_Benchmark_regular_load>: 1728 seconds
<DF_Benchmark_single_thread>: 40 seconds
<DF_Benchmark_lookup_DOP1>: 66 seconds
<DF_Benchmark_lookup_DOP10>: 56 seconds


Thanks,



data_guy (Forum Associate)
Joined: 19 Aug 2006 | Posts: 723 | Location: Somewhere EveryWhere!
Posted: Fri Apr 27, 2007 9:16 am
Subject: Re: Benchmark YOUR hardware and compare the results

We just got our PROD BOX....
Below are the results from it:

DI DESIGNER/JOB SERVER:
Microsoft Windows Server 2003 Enterprise Edition SP1
Intel Xeon MP CPU 3.3 GHz, quad CPU, 12 GB RAM

------------------------------------------------------------------------------

SOURCE AND TARGET DATABASE: Oracle 10g, both on the same machine.

The source/target database server and the DI job server are on different machines, connected via a gigabit switch.

<DF_Benchmark_read>: 187 seconds

=====================================================================================

Bulk loader options: BULK LOAD = API, MODE = TRUNCATE
<DF_Benchmark_API_bulkloader>: 906 seconds

Bulk loader options: BULK LOAD = FILE, MODE = TRUNCATE, DIRECT PATH = ENABLED
<DF_Benchmark_API_bulkloader>: 2548 seconds

Bulk loader options: BULK LOAD = FILE, MODE = TRUNCATE, DIRECT PATH = DISABLED
<DF_Benchmark_API_bulkloader>: 2670 seconds

=====================================================================================

<DF_Benchmark_regular_load>: 1610 seconds
<DF_Benchmark_single_thread>: 36 seconds
<DF_Benchmark_lookup_DOP1>: 92 seconds
<DF_Benchmark_lookup_DOP10>: 33 seconds


Thanks,



Werner Daehn (Forum Devotee)
Joined: 17 Dec 2004 | Posts: 10593 | Location: Germany
Posted: Sat Apr 28, 2007 2:17 pm
Subject: Re: Benchmark YOUR hardware and compare the results

Are you sure about the API bulkloader????

dlinstedt (Forum Member)
Joined: 24 Jun 2007 | Posts: 6
Posted: Sun Jun 24, 2007 10:16 pm
Subject: Re: Benchmark YOUR hardware and compare the results

Hi,

One of the things I think you're missing is the SIZE of the file in bytes, and the row-size/width of the file, along with the datatypes of each column that you are loading.

This is typically what gets "forgotten" when discussing performance numbers; I'd like to see what you are using.

Thanks,
Dan Linstedt
danL@DanLinstedt.com

Werner Daehn (Forum Devotee)
Joined: 17 Dec 2004 | Posts: 10593 | Location: Germany
Posted: Mon Jun 25, 2007 2:26 am
Subject: Re: Benchmark YOUR hardware and compare the results

I am honored, Dan, welcome.

The row length is seven varchar(100) columns plus one int column, all filled to the maximum. The table size with all the overheads (pctfree, block pointers etc.) is about 6 GB for the 8 million rows.
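
For anyone sanity-checking that figure, here is a quick back-of-the-envelope calculation in Python (a sketch only; the exact per-row and per-block overheads are assumptions):

Code:
# Rough size estimate for the benchmark table:
# 8 million rows, 7 x varchar(100) fully filled plus one int column.
rows = 8_000_000
payload_bytes = 7 * 100 + 4                  # assuming ~4 bytes for the int
raw_gb = rows * payload_bytes / 1024**3
print(f"raw payload: {raw_gb:.1f} GB")       # ~5.2 GB
# pctfree, block headers, row pointers etc. add a rough 10-20%,
# which lands in the ~6 GB range quoted above.
print(f"with ~15% overhead: {raw_gb * 1.15:.1f} GB")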


Werner Daehn (Forum Devotee)
Joined: 17 Dec 2004 | Posts: 10593 | Location: Germany
Posted: Mon Jun 25, 2007 2:34 am
Subject: Re: Benchmark YOUR hardware and compare the results

@Data_Guy: Your machine is low on disk throughput. Oracle's file bulkloader does nothing more than what our regular loader does (array inserts of pre-parsed statements), except that it additionally has to read and parse the flat file. The same applies if you compare the API bulkloader with the file bulkloader with direct path = true.

So overall, the numbers you came up with - 906 sec for API, 1610 sec for regular - are what I was looking for; they make sense. For the file bulkloaders, the impact of the settings is buried under the noise of SQL*Loader dealing with the source flat file...
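
To illustrate what "array inserts of pre-parsed statements" means, here is a minimal Python sketch using the standard library's sqlite3 module as a stand-in database (sqlite3 and the table name are illustrative assumptions, not the DI or Oracle API): the INSERT is prepared once and the rows are shipped in batches, instead of parsing and executing one statement per row.

Code:
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (c1 TEXT, c2 TEXT, n INTEGER)")

rows = [("a" * 100, "b" * 100, i) for i in range(10_000)]

# Array insert: one parameterized (pre-parsed) statement, many rows per call.
conn.executemany("INSERT INTO target VALUES (?, ?, ?)", rows)
conn.commit()

# The slow alternative: build, parse and execute one INSERT per row.
# for c1, c2, n in rows:
#     conn.execute("INSERT INTO target VALUES (?, ?, ?)", (c1, c2, n))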


Werner Daehn (Forum Devotee)
Joined: 17 Dec 2004 | Posts: 10593 | Location: Germany
Posted: Fri Aug 10, 2007 5:55 am
Subject: Re: Benchmark YOUR hardware and compare the results

Beat this: 1.7 million rows PER SECOND, or 20 minutes to load a 380 GB table via the network. I couldn't get Oracle to perform any better.

The only thing I can say in public: for the price of that server you would get a very nice house in Silicon Valley.


We still need some more numbers from you, so please support this initiative!!!


dnewton (Forum Addict)
Joined: 30 Jan 2004 | Posts: 3446 | Location: Massachusetts, USA
Posted: Thu Aug 16, 2007 3:52 pm
Subject: Re: Benchmark YOUR hardware and compare the results

Got a new database server machine, migrated from SQL 2000 to SQL 2005, and put it all on a SAN instead of local RAID. Tried all the tests again... and it's not much faster. It must be the network overhead and the fact that the ETL engine server hasn't changed...



Source and Target on same server, DI engine on different server (reading & writing over the network)

Source and Target Db: MS SQL 2005
Source and Target Db Server: Win2003 r2 SP1 x64, (2) 2.6Ghz AMD CPUs, 16GB RAM, RAID 10 on SAN

DI Server: Win2003 r2 SP1 x64, (2) 2.4Ghz AMD CPUs, 16GB RAM, RAID 10 on SAN

Network, source to DI: Gigabit ethernet with fiber backbone
Network, DI to Target: Gigabit ethernet with fiber backbone

DI Version: 11.7.2.2


Results

<DF_Benchmark_read_MS>: 220 seconds
<DF_Benchmark_API_bulkloader_MS>: 271 seconds
<DF_Benchmark_regular_load_MS>: 297 seconds
<DF_Benchmark_single_thread_MS>: 20 seconds
<DF_Benchmark_lookup_DOP1_MS>: 49 seconds
<DF_Benchmark_lookup_DOP10_MS>: 20 seconds


Just for giggles, I tried some other variants to see if it made a difference:

Change DOP to 2 on first 4 tests; change last 2 tests to use lookup as a separate process...

<DF_Benchmark_read_MS>: no difference
<DF_Benchmark_API_bulkloader_MS>: 206 seconds (23% improvement over 1 DOP)
<DF_Benchmark_regular_load_MS>: no difference
<DF_Benchmark_single_thread_MS>: 12 seconds (40% improvement over 1 DOP)
<DF_Benchmark_lookup_DOP1_MS>: 51 seconds (degradation due to overhead?)
<DF_Benchmark_lookup_DOP10_MS>: 38 seconds (degradation due to overhead?)

dlinstedt (Forum Member)
Joined: 24 Jun 2007 | Posts: 6
Posted: Tue Sep 18, 2007 5:46 pm
Subject: Re: Benchmark YOUR hardware and compare the results

Thank you for the numbers...

Here's my take on it. I've recently published more numbers and guidelines for performance and tuning in the largest of systems on my blog: http://www.B-eye-Network.com/blogs/linstedt (a series of entries on VLDW).

But down to brass tacks: at the end of the day the tools don't matter, and what we use to "move the data" doesn't matter... These days it's first about performance, and second about metadata/manageability (cost).

So with that said, the performance numbers I shoot for (and have publicly disclosed) are as follows:

1. I test with 1500 byte rows to 2200 byte rows (mostly due to Unicode expansion), and if I'm simulating a COBOL source file, I test with 3200 byte rows to 4500 byte rows.

When using an ETL tool:
2. I shoot for 60,000 to 80,000 rows per second on inserts to target tables with primary key constraints, and not much else.
3. I shoot for 12,500 to 25,000 rows per second on updates - to specified rows by primary key, one row per update (could be done in batch, could be done in parallel - doesn't matter).
4. I shoot for 8,000 to 12,500 rows per second for deletes, again specified by primary key.

When using ELT (in database SQL on an MPP platform, NON-CLUSTERED ENVIRONMENT)
1. I shoot for 120,000 to 160,000 rows per second for inserts.
2. 80,000 to 100,000 rows per second on updates / upserts.
3. 50,000 to 80,000 rows per second on deletes.

Now, the requirements for the machines:
1. 300-400 MByte per second of disk I/O throughput (minimum)
2. RAID 0+1 (NOT RAID 5, NOT RAID 10)
3. Minimum 2 x 2 GBit fibre optic direct connect to the disk arrays
4. Disk arrays are DEDICATED STORAGE (dedicated to the RDBMS servers)
5. The ETL engine lives on a separate server from the RDBMS (and is usually on a gigabit backbone network, no more than 1 hop away, VPN for server-to-server configuration).
6. ETL hardware: minimum 2 dual-core 3.0 GHz CPUs, 8 GB RAM, 256 MB L2 cache, 10,000 RPM internal SCSI drives
7. RDBMS hardware: usually a LOT larger than this, but for SQL Server it simply increases to 4 dual-core 3.0 GHz CPUs, 64-bit OS, 64-bit SQL Server 2005, 12 to 16 GB RAM, and 10,000 RPM internal SCSI drives, ALL ON RAID 0+1

Anyhow, these are some of the specs I use. I also teach performance and tuning (not just for Informatica, but for ETL / ELT, and VLDW systems) across the board. See my courses at: http://www.GeneseeAcademy.com

Oh yes, I almost forgot: with enough horsepower and a GRID SMP environment with multiple servers (SMP under MPP control), we can move 1 TB from flat files through transformations into target tables (Oracle 10g) in just over 50 minutes.

Hope something here helps,
Dan L
DanL@RapidACE.com

Werner Daehn (Forum Devotee)
Joined: 17 Dec 2004 | Posts: 10593 | Location: Germany
Posted: Wed Sep 19, 2007 8:49 am
Subject: Re: Benchmark YOUR hardware and compare the results

A couple of concerns/questions:


Why do you have a ratio of 1:2 between ETL and ELT (60,000 vs. 120,000 inserts per second)? In my world, an ETL tool uses CPU only, and CPU is many times faster than disk (= the database), hence both should be equally fast.

I don't get why a delete should be slower than an update. In your example you never update the key column, I assume, hence SQL Server can keep the primary key intact and only needs to update the payload, whereas a delete statement does change the primary key - it removes the row. Is the overhead of the index maintenance really that much bigger than all the space computation? (A database block is filled up to a certain degree; now you update a row that requires more space, e.g. old value NULL, new value 'full address of customer'. A pointer needs to be maintained, or another DB block has to be chained to the current one for the additional data.) I have to admit I never tested that with SQL Server. Interesting.

A 1 GBit Ethernet line has a theoretical throughput of 1 GBit = 1000 MBit = approx. 100 MByte/sec. Given the way such a CSMA/CD network is used, anything above 70 MByte/sec is actually excellent - and 70 MByte/sec is the throughput of a single one of your disks! Guess why all recent benchmarks loading 1 TB of data use local flat files as the source? 70 MByte/sec = 252 GB/hour, and that is really the theoretical upper limit; Oracle SQL*Net in particular is a rather expensive protocol performance-wise.
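
The arithmetic behind those figures, as a quick Python sketch (it only restates the numbers in the paragraph above; the 70 MByte/sec practical ceiling is the estimate from the post, not a measured value):

Code:
# 1 GBit Ethernet: theoretical vs. practical throughput.
theoretical_mbyte_s = 1000 / 8              # 1000 MBit/s ~= 125 MByte/s ("approx. 100 MByte/sec")
practical_mbyte_s = 70                      # realistic ceiling on a CSMA/CD network, per the post
gb_per_hour = practical_mbyte_s * 3600 / 1000
print(f"{gb_per_hour:.0f} GB/hour")         # 252 GB/hour, as quoted
print(f"~{1000 / gb_per_hour:.1f} hours to move 1 TB over that wire")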

Why do you use different architectures for ETL vs. ELT? To be fair, you would need to install another database in the middle for the ELT case and add database links from the source to it and to the target. Then this database would have all the CPUs exclusively for transformations, just as in the ETL case. You are saying a database just for transformations does not make sense? I fully agree - hence you can install the ETL component on the target database server as well. That has the added advantage that the server in the middle does not need two network cards.


When I do the math, 160,000 rows/sec for inserts in the ELT case with a row width of 1500 bytes = 230 MByte/sec. Sorry, but there is no GBit network involved in that case - there can't be. And loading a local flat file is something I would call a typical scenario neither for ELT tools nor for any data integration task.
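
The same kind of check applied to the claimed ELT insert rate (a sketch; it simply restates the multiplication above):

Code:
rows_per_sec = 160_000
row_bytes = 1_500
mbyte_per_sec = rows_per_sec * row_bytes / 1024**2
print(f"{mbyte_per_sec:.0f} MByte/sec")   # ~229 MByte/sec, i.e. the "230 MByte/sec" above,
                                          # far beyond what a single GBit link can carry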

But the real fun is not 4-CPU servers; it starts when talking about NUMA architectures or MPP/grids. Then you can see which software aligns well with the underlying server hardware. And then you find funny things, like a table copy from source to target taking 75 hours for ELT and 20 minutes for ETL (and 19 minutes if you write multiple insert...select statements, provided the many uncommon assumptions of this test case hold).

There are certain transformations, rather common in data integration jobs, that cannot be executed well by a SQL statement. The simplest example: read all rows, apply transformations and then run certain tests; if a row is valid, load it, otherwise load it into an error table. How do you load two tables at the same time with one insert...select statement? It is possible under certain conditions, but otherwise you either run two of those insert...selects or you use procedural programming in the database, and then you will never get the same performance as with an insert...select.
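
A minimal sketch of that row-routing pattern, in Python with the standard library's sqlite3 module standing in for the database (the validation rule and the table names are made up for illustration): one pass over the source feeds two targets, which a single insert...select cannot do without two passes or database-specific extensions.

Code:
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src    (id INTEGER, amount INTEGER);
    CREATE TABLE target (id INTEGER, amount INTEGER);
    CREATE TABLE errors (id INTEGER, amount INTEGER, reason TEXT);
    INSERT INTO src VALUES (1, 10), (2, -5), (3, 7);
""")

good, bad = [], []
for row_id, amount in conn.execute("SELECT id, amount FROM src"):
    # "apply transformations and then do certain tests"
    if amount >= 0:                        # hypothetical validation rule
        good.append((row_id, amount))
    else:
        bad.append((row_id, amount, "negative amount"))

conn.executemany("INSERT INTO target VALUES (?, ?)", good)
conn.executemany("INSERT INTO errors VALUES (?, ?, ?)", bad)
conn.commit()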


dlinstedt (Forum Member)
Joined: 24 Jun 2007 | Posts: 6
Posted: Wed Sep 19, 2007 9:46 am
Subject: Re: Benchmark YOUR hardware and compare the results

Regarding linear scalability in the database: I was using a 10-node Teradata system with no logging and no fallback on the tables, against which I ran a complex SQL statement for my 120,000 rows per second test. Depending on the size of the row, the tests varied. The size of the Teradata system far outweighed the size of the ETL server engine, so in some regard the test was unfair.

Realizing this, I also stipulate that when dealing with very large data sets, the ETL engine can and should be linearly scalable to reach these levels of throughput IF the hardware is made available. I have yet to see _any_ ETL tool on the market come close to this without being put on a GRID of ETL server engines.

The problem is that business makes these "unfair" comparisons all the time in the real world. Vendors fight the image and try their best to level the playing field. The other problem is that not many people understand "Big Data", and that there are things which MUST be done in order to achieve throughput, and most of the time that means a tremendous increase in complexity.

On the technical side, the problems stem from not utilizing the hardware at 80% or more during ETL processing. Our loading architectures are "poorly designed" or "designed to be transactional in their decision-making capabilities", which puts the kibosh on high-volume, high-performance parallelism. The ETL architectures we design NEED to be created as BLOCK STYLE components, handling tens or hundreds or even thousands of rows at a time - too many "coders" who learn ETL continue to think about data movement one row at a time, as in a SQL cursor, for instance.
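
A small sketch of the "block style" idea in Python: process rows in fixed-size batches instead of one at a time (the batch size, table name and DB-API cursor are illustrative assumptions, not taken from any particular tool):

Code:
from itertools import islice

def batches(rows, size=10_000):
    """Yield lists of up to `size` rows from any row iterator."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk

def load(cursor, rows):
    # One prepared statement, many rows per round trip -
    # the opposite of a row-at-a-time cursor loop.
    for chunk in batches(rows):
        cursor.executemany("INSERT INTO target VALUES (?, ?)", chunk)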

The engineering of ETL tool sets needs to get a handle on grid, near-linear scalability, partitioning and parallelism. Unfortunately, engineering core engines that "overcome" the ETL designers' mistakes (like architecting loads that look at one row at a time) is a daunting task, requiring all kinds of math that makes my head hurt.

Due to the nature of these "life is unfair" playing fields, I always try my best (when producing performance numbers) to: a) use an underpowered ETL engine, b) use an over-powered, highly tuned DB engine, and c) overwhelm the network and disk I/O, pushing them to their limits.

This way my performance numbers are no longer "some numbers produced in a lab somewhere with hardware we'll never get." They are numbers closer to reality, which can be achieved within bounds and specific architectures, and oftentimes on existing hardware.

Another part of this (under the covers) has to do with the ETL engine's interaction with the RDBMS. People in general seem to have forgotten (or don't think about it) that indexes cause disk I/O, as do TEMP, logging, rollbacks, commit points, ORDER BY, joins, GROUP BY and HAVING clauses, and then there's RAM overflow, RAM limitations on 32-bit architectures, even CPU overflow to RAM... All of these things affect I/O.

Deletes have been slower than updates (in an EDW case only, not transactional) because most of the time we have tables that are highly indexed. Updates "usually" replace the lowest-level index tree entry with a new value, leaving the disk block intact, and only "sometimes" force a re-balancing of the index tree. Updates against the data set at the disk block level typically do not "extend" the data, letting the DB put the row back in place in an existing disk block (avoiding fragmentation). Updates seem to hit "one or two" of 10 or 12 indexes, while deletes cause more I/O - deleting rows in the DB (marking them deleted), inserting a pointer to the deleted row in the "free block chain", inserting the previously deleted row into the log (for transaction consistency), and hitting all 10 or 12 indexes across the board, including the primary key... Then there's logging on EACH of the indexes.

So, in most cases on average, deletes "cost more" than updates - particularly in high-volume solutions that require large numbers of indexes (without which select performance would wane significantly).

Regarding "chaining to another DB block" Over the years, RDBMS vendors have "fixed" this problem (for the most part), they actually now insert the new row (if it extends in byte size beyond the old row), and update the pointers to the new row (as opposed to "deleting" the old row which would cause all the above to happen). They avoid chaining rows until it is absolutely necessary to do so (due to space constraints).

The tests and systems that I run are usually on a 1 GBit Ethernet backbone, but with 2 GBit fibre optic channels to the disk; the disks are usually dedicated storage arrays with massive RAM caches built in. I'm not talking about a typical EMC Clarion here, but IBM Shark drives and the like... In most cases, if a corporation has "large data" they "usually" have decent disks; it's the throughput they have issues with - because (for instance) they put client traffic on the network that talks to the disk, or they put user directories and other things on the disk, or they use it as NFS/remote mounts or NAS storage - all of which kill the throughput.

230 MByte/sec can be achieved on a 2 GBit fibre channel direct to the SAN, which again is at minimum dual-channel (when parallelism is provided within the ETL architecture, we can reach a theoretical limit of 4 GBit fibre channel, i.e. 2 GBit in parallel).

Also, I test many different databases - everything from Oracle to Teradata, SQL Server, DB2 UDB and Netezza (I haven't spent time scaling MySQL yet, sorry). We always tune the protocol with network packet sizes and other buffering mechanisms. When I say "local file" I mean local to the ETL engine, NOT the RDBMS engine. In Teradata's case, we rely on the proprietary super-high-speed backbone to move data.

Database links are inherently slow, and in volume solutions they absolutely cannot be used (I have yet to see a configuration where DB links truly work efficiently). I don't mind how many databases there are; they simply need to be tuned, and the architecture needs to be addressed.

As always, performance and tuning is a shell game - tune one thing, move the bottleneck somewhere else. It's all about achieving the best balance, performing the right operations in the right place at the right time, and then using parallelism to scale (pushing the resources to their maximum efficient operating capacity).

Regarding where the transformations can or cannot be done: this cannot be stated definitively. It all has to do with the architecture, the volume, the desired output, and the latency at which the data arrives on the input side. Batch is completely different from near-real-time processing. I politely disagree with your statement that there are some transformations better suited to data integration jobs... I've been on both ends - where it's necessary to do it in the DB, and on the other end where it's impossible to do it in the DB - and it all boiled down to volume, performance, maintainability, and cost (to the customer).

What I will say is this: there are many, many transformations I can and WILL do within Teradata's walls that cannot be accomplished by external ETL tools without a much higher cost and complexity of setting up a GRID and CPU layers, due to volume. The other piece here is that I lose metadata by doing transformations inside the RDBMS (any RDBMS that I'm aware of), but if you have 1.5 billion rows in a single table with a transformation (or two, three or four - maybe as many as six) to apply, the last thing you want to do is pull 1.5 billion rows out of the database and over the network, "transform" them, and then re-insert them back into the database.

Can you do it? Yes, you can always do it - it's a question of time, cost, complexity, performance, and metadata / maintenance.

There are times when we have heterogeneous sources to work with - 800 million incoming rows and 1.5 billion target rows to "bounce against" - where we may be better off using an ETL tool for _specific_ operations or parts of the comparison before the 800M rows arrive inside the database. And then there are times when the 800M rows "can be processed in 1 week" and performance isn't the main focus; then we can do everything in the ETL engine.

Regarding the Insert...select - I've had times when the DB wasn't tuned properly, and the ETL tool outran it by miles and miles. But on to the error part of your statement...

In my architectures (dealing with 10,000 transactions a second in burst rates / near real time, or with huge batch volumes), there is no "time" to split the data into separate tables, nor is there any "disk space" available to do so. We are forced to "mark" the rows by adding an error code in-process, and use the same target table plus one extra column to handle the requirements. This is simply a different architecture that meets specific needs.

Great response, I look forward to more.

Respectfully,
Daniel Linstedt