Extract process performance - GoldenGate

Hi
We are planning to capture data from an Oracle OLTP table which has 40 million DMLs per day. The OLTP system is already heavily loaded and we want a lightweight process to capture the changes.
Could you please share your experience with GoldenGate in a similar environment?
Is a single extract process efficient enough to capture 40 million changes per day?
If we need multiple extract processes, what would be the impact (CPU and memory per process)?
Thanks 

Hi,
If you use a single extract process for 40 million rows then there is a chance of excessive lag times. You would be better off with multiple extract and replicat processes to speed up replication and minimize lag. This would also reduce the CPU and memory overhead per process.
Thanks. 

Hi.
40M DMLs is not very meaningful on its own, because one DML could be 100 bytes or 1 GB. A better measure is redo/transaction log volume.
That said, 40,000,000 / 86,400 ≈ 463 DML operations per second. We usually see throughput of several thousand per second with a single extract.
Never start testing with more than one extract. Use one first to get a baseline. Then, if this is not RAC, move to one extract with two threads ("threads 1", with zero as the default first thread). Then try two extracts if you're experiencing lag, but evenly distribute the load across them. Do this by running logdump "count detail ./dirdat/tr*" on the single extract trail you created in your first test. The command shows the number of inserts, updates, deletes and bytes per table. Distribute the tables evenly based on bytes per table.
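Joe's last step, distributing tables evenly across extracts by bytes, can be sketched as a simple greedy partition. This is an illustrative sketch only: the table names and byte counts below are made up, and in practice the per-table figures would come from logdump's "count detail" output.

```python
# Hypothetical sketch: evenly distribute tables across extracts by byte
# volume, using per-table byte counts as reported by logdump's
# "count detail" output. All table names and figures below are made up.

def distribute_tables(table_bytes, n_extracts=2):
    """Greedy partition: assign each table (largest first) to the
    extract group with the smallest running byte total."""
    groups = [{"tables": [], "bytes": 0} for _ in range(n_extracts)]
    for table, nbytes in sorted(table_bytes.items(), key=lambda kv: -kv[1]):
        target = min(groups, key=lambda g: g["bytes"])
        target["tables"].append(table)
        target["bytes"] += nbytes
    return groups

# Example byte counts (fabricated, for illustration only)
counts = {
    "HR.ORDERS": 9_500_000_000,
    "HR.ORDER_ITEMS": 7_200_000_000,
    "HR.AUDIT_LOG": 6_800_000_000,
    "HR.CUSTOMERS": 1_100_000_000,
}
for i, g in enumerate(distribute_tables(counts), start=1):
    print(f"extract {i}: {g['tables']} ~ {g['bytes']:,} bytes")
```

Largest-first greedy assignment keeps the byte totals roughly balanced even when one or two tables dominate the volume.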
Good luck,
-joe

Related

Gather Schema Statistics Overhead

Hello. Gathering statistics using the automatic method: does anyone know of, or have access to, a white paper that outlines the overhead of running this? While running stats automatically is seen as a good thing, I can find no mention on the internet, forums, MOS etc. of the impact of gathering statistics. This would help me in my current role, as I am trying to convince non-technical people of the benefits of gathering statistics and how low risk they are.
Cheers, Al
Why not highlight the risk of NOT having updated and proper stats? Based on stats saying there is 1 row in a table, the CBO may very well decide to do a nested loop join, or a cartesian join, against a table that actually contains hundreds of thousands of rows, resulting in a cursor that could take days to execute. Proper stats are mandatory for optimal execution plans.

finding lag in records and not in time

Hello Experts, everywhere we look, GoldenGate lag is measured in time. Is there a way to measure the lag in number of records? Thanks
No. The lag is a simple calculation between the wall clock time and the commit time in the record at a particular instant. The replicat and extract have no idea how much more they need to process. To report a count of records, they would have to retrieve the current end point and count backwards to the record currently being processed, which would be an impossible amount of work. But if you want to eyeball this, for example in the replicat: an info <replicat> shows the current trail sequence, then do an ls -ltr on all the trails and you can easily see how many trails are left. If you add up the bytes for each trail (they are usually the same size) you can see how much more data there is to go through.
Cheers, Kee Gan
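The eyeball method above is easy to script. The sketch below is a hypothetical helper, assuming trail files named like tr000000042 under the dirdat directory; the current sequence number comes from "info <replicat>".

```python
# Hypothetical sketch of the "eyeball" method: given the trail sequence
# the replicat is currently reading (from "info <replicat>"), sum the
# bytes in the trail files it has not reached yet. Assumes trail file
# names like tr000000042 in the dirdat directory.
import os
import re

def remaining_trail_bytes(dirdat, prefix, current_seq):
    """Return (file_count, total_bytes) for trail files whose sequence
    number is greater than current_seq."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)$")
    files, total = 0, 0
    for name in os.listdir(dirdat):
        m = pattern.match(name)
        if m and int(m.group(1)) > current_seq:
            files += 1
            total += os.path.getsize(os.path.join(dirdat, name))
    return files, total
```

Like the ls -ltr approach, this only counts trail files already on disk, so it is an approximation of the remaining work, not a record count.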
Hi, no, you cannot find how many records are left or lagging to be applied on the target server by the replicat process. Lag is only measured in time.
Regards, Veera
I had the same understanding. But we are seeing an environment where a lag monitoring script has been deployed that shows the lag in records, and it is accurate! So there seems to be some way. Can OEM do that? Maybe not directly, but is there a roundabout way of knowing it? Maybe by comparing some integrated-replicat-related views between source and target? Thanks
There is no such thing in OGG. What OEM shows for OGG comes from OGG itself. You would need to show me the environment that shows the lag in records. Forget about OGG: it is just technically impossible. Nothing can determine something that is yet to happen, i.e. how many records there are left to read, without actually counting them.
Cheers, Kee Gan
Hi, there are no options in Oracle GoldenGate to get the lag in records. It may be your custom scripts; check with your team. Also, there are no views available from OGG for such results.
Regards, Veera
We figured it out: the lag monitoring script was doing a count on the pending trail files and identifying records. Of course, that's not precise, because there could be files that have not yet reached the target side. But if the extract and pump have no lag, then the script correctly identifies the lag in number of records. Just sharing, in case it's helpful to somebody. Cheers

heavy duty DML can break GoldenGate 12c

Greetings Dear Community, we are on 12.1. We have a db migration project for a mission-critical, heavy transactional db, moving to another db with a new data model. We use PL/SQL procedures with DML inside to do custom ETL. We are still ironing out the bugs and the perf issues in the code. A few days ago we saw a GoldenGate lag, which turned out to be an Oracle bug. They patched it and now it works fine. After this incident, the team lead and management decided to convert all the heavy-duty DML ETL code to bulk collect. They say using the bulk collect limit <number> feature makes the commit size smaller, so there is much less chance of choking GoldenGate. Does this approach make sense? Intuitively, I'd say it's counterproductive, as bulk collect is less efficient and there is much more code to write and debug. Better to use our time on the data quality and perf issues. Thanks for all the inputs.
It does make sense. Generally speaking, this is how extract works. When a transaction begins, extract marks this position and starts collecting the DMLs from this transaction that it wants. Extract can potentially keep track of hundreds of transactions at a time. It will use its allocated memory until it runs out, and then start swapping transactions to disk (see the cachemgr parameter). If that runs out too, it simply takes note of where the transaction begins. When a transaction commits, extract writes out what it has in its memory and swap area to the extract trail. If the transaction is very large, it reads it again from the archived/redo logs, as there wasn't enough space allocated, and writes to the trail as it reads. So if you keep transactions smaller, or at least within the allocated space, extract works much more efficiently. At the end of the report for each stopped extract you will see a lot of cachemgr and memory stats, so you can check whether there were occurrences of swapping to disk.
Cheers, Kee Gan
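The core idea behind BULK COLLECT ... LIMIT here, committing every `limit` rows so no single transaction outgrows extract's allocated cache, can be sketched language-agnostically. This is a minimal Python illustration of the chunking logic, not actual PL/SQL:

```python
# Minimal sketch (Python, not PL/SQL) of the batching idea behind
# BULK COLLECT ... LIMIT: process and commit rows in chunks of at most
# `limit`, so no single transaction outgrows extract's cache allocation.
def batches(rows, limit):
    """Yield rows in chunks of at most `limit`; one chunk per commit."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == limit:
            yield batch
            batch = []
    if batch:
        yield batch

# 10,500 rows committed 1,000 at a time -> 11 small transactions
commits = list(batches(range(10_500), 1_000))
print(len(commits), len(commits[-1]))  # 11 500
```

Each yielded chunk corresponds to one commit, trading one huge transaction for many small ones that fit within extract's cachemgr allocation.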

Oracle NoSQL Database YCSB Results

I have posted some results of a YCSB benchmark that I ran on my blog. I thought it might be of general interest to this forum:
http://blogs.oracle.com/charlesLamb/entry/oracle_nosql_database_performance_tests
Charles Lamb 
Two questions:
1) Can you share the consistency and durability settings used in the benchmark?
2) The "getting started" guide mentions that a large number of replication groups will improve performance (which makes sense; it will allow more concurrency). However, even your largest test has just 4 groups. Can you share the reason for this?
-- Gwen 
user12003335 wrote:
Two questions:
1) Can you share the consistency and durability settings used in the benchmark?
2) The "getting started" guide mentions that large number of replication groups will improve performance (which makes sense, it will allow more concurrency). However, even your largest test has just 4 groups. Can you share the reason for this?
-- Gwen
Hi Gwen,
Sure, no problem.
For 1): The Durability was the NoSQL Database recommended (and default) value of no_sync, simple_majority, no_sync. The Consistency that we used for the 50/50 read/update test was Consistency.NONE.
For 2): Cisco was able to let us use 12 C210 nodes and 12 C200 nodes. We wanted to make sure that the Rep Nodes were symmetric so we used the smaller nodes (C200s) for the client load generators and the more capable IO subsystems on the C210s for the Rep Nodes. To be honest, it's hard to find a lab that has lots of homogeneous boxes with capable IO subsystems. It's generally a mix of some of this and some of that. You might think that Amazon EC2 is the answer to our needs, but VM performance is not predictable -- we like to run on raw hardware. That said, we have run tests on 64 Rep Groups (on machines with similar CPUs, but much less-capable IO subsystems) and demonstrated NoSQL Database's scalability. The Cisco numbers that I published are the highest performance numbers we have seen for that number of nodes/rep groups.
We are working with a couple of partners on additional large-scale configuration testing. If we get more results, I'll definitely post results on my blog.
Charles Lamb 
Hi Charles,
Thanks for those results. I would like to know the following:
1) I assume the number of clients is 1 (not 3) for the first line of those results, and 90 for the threads (not 30).
2) What is your best benchmark today to show how Oracle NoSQL scales? Can you share the results of your best benchmark, as you did for Cisco?
3) Can you explain what "95% Latency" and "99% Latency" mean? The results are very different from the avg latency.
Thanks. 
893771 wrote:
1) I assume the number of clients is 1 (not 3) for the first line of those results, and 90 for the threads (not 30).
Several people have called that to my attention, but I believe the number is correct as it stands. The accompanying text implies that it should be 90, but the test was run with 30.
2) What is your best benchmark today to show how Oracle NoSQL scales? Can you share the results of your best benchmark, as you did for Cisco?
To be clear, the results were not done "for Cisco"; they were done using Cisco's lab. Cisco was generous with their hardware, but as I mentioned earlier, it is difficult to put hundreds of homogeneous nodes together in one place at one time. A lab with this much equipment has many demands placed on it. Therefore, we were given access to 12 C210 and 12 C200 nodes for our tests. In the future we may receive access to larger configurations, but this access will have to be scheduled among many users. The bottom line is that the results I published are the largest tests we ran in the Cisco labs, because that is the largest configuration we have been given access to (to date).
As I mentioned in a previous post, we have run tests using 72 VMs (20 Rep Groups and 12 YCSB clients), and 192 nodes (64 Rep Groups). These tests validated that our architecture scales.
Tests at this scale take a substantial amount of effort to provision, run, and (especially) analyze. We spent a great deal of time on the Cisco tests as well as the "72 VM" and "192 node" tests. We performed extensive analysis on the results before our access to those labs expired.
3) Can you explain what "95% Latency" and "99% Latency" mean? The results are very different from the avg latency.
95% of the results showed latency less than that number. Ditto for 99%.
Charles Lamb 
Hi Charles,
The accompanying text implies that it should be 90, but the test was run with 30.
What about the number of clients? If it is really 3, that means for "50/50 Read/Update" you had only 10 threads on each client, and for "Insert Results", only 30 threads on each client. Can you confirm that?
We have run tests using 192 nodes (64 Rep Groups)
Are those results available somewhere to see how latency varies with the number of Rep Groups or the volume of data?
Thanks. 
user962305 wrote:
Hi Charles,
The accompanying text implies that it should be 90, but the test was run with 30. What about the number of clients? If it is really 3, that means for "50/50 Read/Update" you had only 10 threads on each client, and for "Insert Results", only 30 threads on each client. Can you confirm that?
Correct.
We have run tests using 192 nodes (64 Rep Groups). Are those results available somewhere to see how latency varies with the number of Rep Groups or the volume of data?
No.
Excuse me, Charles,
I don't understand how Throughput and Latency are calculated from KVS size, Clients, Total Client Threads and Time.
Thanks for your help. 
user962305 wrote:
Excuse-me Charles,
I don't understand how Throughput and Latency are calculated from KVS size, Clients, Total Client Threads and Time.
Calculated? I don't understand what you mean. Throughput and Latency are measured results of the benchmark.
Charles Lamb 
If we have the Time, why isn't Throughput close to number of inserts / time, and avg Latency close to time / number of inserts?
user962305 wrote:
If we have the Time, why isn't Throughput close to number of inserts / time, and avg Latency close to time / number of inserts?
400m / 26,498 ~= 15k
1.6b / 94,441 ~= 17k
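The arithmetic above checks out, and it also answers the latency half of the question: with many client threads in flight, average latency follows Little's law (latency ≈ concurrent threads / throughput) rather than time / inserts. A quick sanity check, using only figures quoted in this thread (the 90-thread count is the disputed figure from the discussion):

```python
# Sanity-checking the arithmetic above: throughput is inserts / elapsed
# seconds. Average latency is NOT time / inserts when many client threads
# run concurrently; by Little's law it is roughly threads / throughput.
inserts, seconds = 400_000_000, 26_498
throughput = inserts / seconds
print(f"{throughput:,.0f} ops/sec")           # ~15k

inserts2, seconds2 = 1_600_000_000, 94_441
print(f"{inserts2 / seconds2:,.0f} ops/sec")  # ~17k

# 90 total client threads is the figure debated earlier in the thread
threads = 90
print(f"avg latency ~ {threads / throughput * 1000:.2f} ms")
```

This is why a per-operation average latency of a few milliseconds is consistent with hundreds of millions of inserts over several hours.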

Can Data Visualization Cloud Service (DVCS) handle large data volumes?

Hi, we have a case in which DVCS could be a potential solution, if it can handle large volumes of data. We're talking about 20 columns per record, 6-10 million records per day, with a 45-60 day rolling window. Can DVCS or, alternatively, BICS handle this?
Cheers, José
DVCS on its own, no. Think about the user experience with DVCS: you'd expect quick visualisations responding to interactions with the end user almost immediately. Querying 6-10 million rows, even on a high-powered server, would take a while. The only way to achieve this is to use BICS (Data Vis) with DBaaS and build a data model which can handle such volumes. This would involve RPD modelling to facilitate holding data at different levels of aggregation, to support very quick query responses to end users, transparently redirecting to the appropriate database object depending on the level of data granularity (using RPD content levels). Hope this helps.
Yeah, DV - the name itself - says Data Visualization. As DV is built on BI tech, I think it won't have issues pushing the queries down to the underlying database, to get an aggregate for instance. But showing 20 columns is not a real DV-style solution. BICS is much better suited to reporting on 20 columns, but then you would still have to think about how to limit those Analyses so they don't show all those millions of records. So what's the real deal here? Do you really need to see 20 columns and all those rows, or do you need summaries? I agree with MattT that DV on its own won't fit the job, but you probably have an Oracle database that can handle all those records just fine. And if you really need to visualize all that data, you should have a look at Big Data Discovery. That can handle samples of millions of rows just fine, but it would require a different system configuration (Big Data). Are you talking Discovery/Visualization (DVCS), Reporting or Dashboarding (BICS), or just Data Management (DBCS)?
Regards,
