Active-Active - GoldenGate

I have a customer who would like to configure GoldenGate in an active-active configuration. Can you provide a list of the most common questions I should expect to ask the customer (and get answers to) during our discussions?

Check out the Administrator's Guide, chapter 8 (page 83), "Using GoldenGate for active-active high availability".
The first question you'll want to ask is whether this application has been designed for A-A. Active-Active always sounds like a good idea at first, until you start peeling back the layers.
If this is a commercial off-the-shelf (COTS) application from an application vendor, it's very doubtful any A-A will work. If the customer has control of the design then you have a chance, and if it was not designed for A-A they will probably have to make some application changes.
Those changes will come to light when you start digging into their application, but you'll want to figure out how likely a conflict is to occur and what you are going to do about it once it does.
Perhaps the most useful thing a customer can do to make their application A-A ready is to ensure every table in the A-A configuration has a last-updated timestamp column that gets changed on all inserts and updates.
There's no magic bullet for A-A. It can get overly complicated very fast, and most applications aren't a good fit for it. Nonetheless, if you do have an app that is, or can be made, a good fit, then the chapter I pointed you to should help. In particular you'll want to learn about exception mapping and error handling (REPERROR).
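As a sketch of what that exceptions mapping can look like in a Replicat parameter file (the table and column names here are hypothetical, not from the guide):

```
-- Replicat parameter sketch: instead of abending on a conflict,
-- log the failed operation to an exceptions table for later review.
REPERROR (DEFAULT, EXCEPTION)

-- Normal delivery for the table.
MAP app.orders, TARGET app.orders;

-- Exceptions map: must follow the regular MAP for the same source table.
MAP app.orders, TARGET app.orders_exceptions, EXCEPTIONSONLY,
  INSERTALLRECORDS,
  COLMAP (USEDEFAULTS,
    err_num    = @GETENV ('LASTERR', 'DBERRNUM'),
    err_optype = @GETENV ('LASTERR', 'OPTYPE'));
```

The exceptions table carries the source columns plus the extra error columns, so a DBA can inspect and repair conflicts offline instead of having the Replicat abend.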
Good luck,


Failure as Experience: Lessons Learned When Things Go Wrong

"Insanity: doing the same thing over and over again and expecting different results." --Albert Einstein

It's great when things work out, when all the pieces come together and your work and planning pay off with a successful project. That's the stuff you list on your resumé and your LinkedIn profile. But where would you be without failure? The real value in the years of experience you detail on your resumé comes in large part from what you've learned by screwing up. These might be major disasters, or they might be the kind of under-the-radar mistakes that you've kept to yourself. Either way, while failures aren't something you flaunt, they very definitely shape your experience and are an enormous factor in the decisions you make on a daily basis. After all, if you have never failed, your experience is very limited.

Confess! Can you offer an example of what failure has taught you? I'm not asking you to share anything that will cause veins to throb in your boss's forehead. But surely you can offer a bit of wisdom or insight gained from a moment in your career when a decision you made caused -- or narrowly avoided -- a big problem. Your responses may be quoted in an upcoming issue of Oracle Magazine. (In order to be included in the Oracle Magazine article, responses must be posted by December 21, 2015.)
--Bob Rhubart-Oracle
Hi Bob! Well, to start with, I assume that everything you do today is a result of things you have done before. And if you are a little bit smart, then you try to avoid making the same mistake again. And when you "offer" your experience to others, you might actually become a nice fellow. A nice example of this is the day when I insisted that a colleague (who was dead sure that his change would not affect anything in the production environment) just make the extra backup. To cut a long story short, this saved the day (and the database). We have gone different ways since then, but whenever I bump into him he still thanks me for that hint.

My own list of experience points (to stay in Dungeons & Dragons terms) is long. I can categorize them in three groups:
1. Me and my big mouth
2. Write down everything
3. Assumption is the mother of all *&^%^%$$

In category 1 we find the things that hit me a lot. The best examples are the cases where you are sure you have seen it before: been there, done that, got the T-shirt. So obviously you start bragging that you can fix this in a moment. Yup, until the moment you are at the customer and find out that the problem you encounter is way different from the one you have seen before. A case in point is the blooper where I was asked to fix a WebLogic configuration issue I had seen a number of times and knew how to deal with. Until I came to the customer and figured out they were running on Windows and not Linux. Standing there and saying that I had only seen the issue on Linux, not on Windows, and that the whole system behaved differently than I expected, did not earn me credibility points. Learning from failure #1: be sure that you describe in detail what you know.

Category 2 is something that becomes even more valuable when you want to do the same thing twice. Example: I was doing some complex configuration in a SOA Suite environment. Some things did not work the first time, so you change some stuff and retry. Normal work, you would say. However, when you change five parameters at the same time, a lot of combinations exist. At the end I didn't know the starting values, what was changed when, which tests I had executed, and so on. Restore and restart and a lot of extra work was the result! Learning from failure #2: write down what you are doing. Start environment, time of change, what was changed, and how to test it. In the beginning it is time consuming, but the result is less time spent, plus a good starting point when you need to do this again.

And then there is category #3: thinking that you are doing the right thing because somebody told you so. Assuming things (and especially working based on your own assumptions) is a recipe for disaster! A short while ago I was working on a DevOps team. I changed a setting in WebCenter which needed a restart of the Managed Servers. Obviously this was done in the dev and test environments, and I was about to commit the change in the QA environment. The production change was scheduled for the night, outside business hours. My assumption was that I still had a PuTTY session open to the QA environment, so I executed the change and bounced the environment. A moment later my colleague next to me was called and asked why production was down. I realized what had happened: my open session was on production, not QA. 3500 people could not work with the prime application for about 10 minutes. Learning from failure #3: never assume anything. Make sure you start from scratch, verify, double check, and in complex situations ask a colleague to assist you.

Hope this helps.
cu
Andreas Chatziantoniou
Oracle ACE
Hi, Bob. I've been working on an upgrade of a current implementation of SOA Suite 12.1.3 to 12.2.1. There are 5 domains to be upgraded:
1. Sandbox
2. DEV
3. TEST
4. DR
5. PROD

So far we've been able to upgrade the first four, and we are just a few days away from upgrading PROD. During the first 4 domains we learned from our mistakes:
1. Re-read the documentation. In the sandbox installation we read the upgrade process and executed it. We skipped a lot of steps, because we had read it only once and thought we had understood the whole process.
2. For the DEV environment we had a very weak database backup: a simple export was performed. Something went wrong with the upgrade and we had to roll back the process. Because we had a very simple/weak database backup, the rollback was not that simple. We learned that we had to use Oracle RMAN.
3. For TEST we had an RMAN database backup and we had the whole upgrade process already studied. But we had asked for only a two-hour window to perform the whole upgrade. Something went wrong during the re-configure step: a set of datasources was badly configured and we couldn't finalize the upgrade. We ran out of time. The lesson learned: do not hesitate to ask for a larger window.
4. For the DR environment we had everything in place. We asked for a larger window (4 hours). We had a very solid RMAN database backup. We had identified that probable datasource problem, but even though we knew it could happen, we simply didn't do anything to avoid it. While running the domain re-configure script, we hit the same issue. We lost around 1 hour identifying the datasource that was causing the issue. The lesson we learned: if you have already identified a possible issue, do everything you can to avoid it.
5. We are just 5 days away from our upgrade of the Production environment. We have learned a lot of things. The question is: are we ready to perform a clean upgrade? The answer is: yes we are.

Sometimes you don't have as many opportunities to practice an upgrade as the ones I just described. But one thing that you need to keep in mind is to write down the lessons learned. It may sound like a simple recommendation, but not everybody does it. I strongly recommend creating a log for your installation/upgrade processes: a simple document with screenshots and short descriptions for every step, and a text file with the errors and an explanation of how to solve them, along with references to the Oracle Support notes or blog posts that helped you solve those issues.
Good stuff, Rolando! May I quote you in the article? 
Really great stuff, Andreas! I would very much like to include some of your post in my article. Any objections? 
Hi, Bob. Sure, you can quote me. Absolutely. Thank you again. Best regards
We had a design agency build the user interface for an application I was going to implement. For regulatory reasons, each value in this application had to be entered by one person and approved by another. The design agency came up with the brilliant idea of having a checkbox to the right of each data entry field for the approver to check. I thought I could easily implement this and casually nodded approval. Big mistake.

What I did not appreciate was the fact that the user interface would contain input, checkbox, input, checkbox, and so on. Cursor focus for the data entry person would have to pass from input to input, skipping over the checkbox in between. Because the application was also configurable, we did not know which fields would actually be displayed. Everything is possible in ADF, but some things are much harder than others. This was one of the hard ones, saddling us with a set of JavaScript functions that would break with every new version of Internet Explorer the customer rolled out.

Today, I am older and wiser, and I will push back hard against designs that involve specialized user interfaces of a type never built before. The user experience should not be designed by the programmers. But nor should the designers have free rein. The solution is user experience design patterns, and Oracle has published a whole library of these (Oracle Applications User Experience). Don't design applications without looking at these.
Good stuff, Sten! Thanks! May I quote you in the article? 
Hi, I never got a response from you to my question. May I quote you in the magazine article? 
Yes, you are welcome to quote me.
Thanks, Sten!
I think the most educational mistakes have been the ones where we appear to have done everything right technically, but failed to effect the desired results. The takeaway for me has been: regardless of how referenceable a solution is, how well executed the development, or how great the quality metrics are, if people don't understand what it is you're trying to achieve for them (making their lives easier, quicker, more productive), if they aren't with you, then you're going to fail. The clearest illustration of this I've experienced was in delivering an operational monitoring solution that meant the ops team could reduce production problems, or identify and fix issues quicker, because we had embodied their insights into common issues in the monitoring, so that the ops team had the facts to challenge the delivery organisation about the issues being seen despite its quality controls. Ultimately the ops team switched the monitoring solution off because they felt it was undermining them (although it also showed that a few of the outages were of their own making). The mistake, ultimately, was that the team had a culture of wanting to be 'heroes' in the eyes of the business, and the monitoring was impacting that.

Goldengate conflict detection resolution and database transactions

I'm trying to determine how suitable GoldenGate bi-directional replication will be for my application, and the main concern right now is conflict detection and resolution.

When it comes to specifying RESOLVECONFLICT parameters for individual tables, how do you manage scenarios where there are several DML updates in the same database transaction and one causes a conflict? For example, a conflict occurs in a transaction and then there are additional statements in that transaction that may depend on the outcome of that earlier conflict.

Does anyone have any advice or useful links that explain ways to approach this?
Oracle GoldenGate (OGG) can handle any conflicts you have. The question here is not what OGG can or cannot do; it is whether it is at all logical or workable. What you need to do is determine the rules yourself, OGG or any other software aside. OGG (or any software) cannot determine that for you. You first determine the rules; then any of those rules can be resolved (sorry for the pun!) using RESOLVECONFLICT and the other parameters available for these things.

I will give you a simpler example of something your business needs to determine but that is easily handled by OGG once it is defined. Say a new employee is added (a new row) to the employees table on both bidirectional servers, and your business says to always honour the one with the earlier hire date (say column date_join). Then something like this will work:

MAP schema.employees, TARGET schema.employees, RESOLVECONFLICT (INSERTROWEXISTS, (DEFAULT, USEMIN (date_join))), COLMAP (USEDEFAULTS);

Regarding your statement "For example, a conflict occurs in a transaction and then there are additional statements in that transaction that may depend on the outcome of that earlier conflict": you need to get your app developers together, sit down, and thrash out how you want to handle this, and write down the rules. OGG cannot determine what your business needs, but it can resolve any conflict for which you have specified how you want it resolved.

So the main thing in any bidirectional replication is to keep it simple. Determine why you REALLY want bidirectional. Most businesses will say: to make use of both servers as production servers and still provide disaster recovery in the event one of the servers is permanently down (as in a fire). Say your database supports 20 apps; one very simple way is to have 10 running on server A and 10 on B. They back each other up and both are production servers. And you can use 2 smaller (and cheaper) servers, because each handles half the load at normal times.

With that design the conflict resolution needed is zero. Keeping it simple is the number one thing in bidirectional replication. OGG can resolve anything you throw at it once you have the business rules, but maintaining it can quickly become a nightmare if you have many rules.
Cheers
Kee Gan
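As a hedged sketch of the timestamp-based variant of this, assuming every table carries a last-updated column that all inserts and updates set (the last_upd_ts column and table names are hypothetical):

```
-- Newest change wins, on both inserts and updates.
MAP schema.employees, TARGET schema.employees,
  COMPARECOLS (ON UPDATE ALL),
  RESOLVECONFLICT (UPDATEROWEXISTS, (DEFAULT, USEMAX (last_upd_ts))),
  RESOLVECONFLICT (INSERTROWEXISTS, (DEFAULT, USEMAX (last_upd_ts))),
  COLMAP (USEDEFAULTS);
```

The same MAP (with USEMAX swapped for USEMIN where the business rule demands it) would go into the Replicat on both servers of the bidirectional pair.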
Thanks for your response, Kee. Keeping it simple definitely sounds like the best way to approach this, as it seems to be the only way GoldenGate can work: transaction awareness is not part of its functionality, and if the rules get too complicated then GoldenGate may not be a suitable solution after all. I guess some detailed conversations with the development team will help to iron out my concerns. Thanks again.

What are the best practices to version control Recon Configuration?

Hi Gurus,
What are the best practices to version control IDM reconciliation configurations (schedules, reconciliation rules etc.) of different resources? They are usually created as "TaskSchedule" xml objects. Should we check in those objects and migrate them with the rest of the customization xml objects? Or should we configure the reconciliation manually for each environment? Please advise.
Thanks Much!
Xiaofang Li 
Any resolution or best practice for version control of reconciliation scheduling? Or is the best practice just to not version control scheduling and go set it on a new deployment?
Also, is all of the stuff in the TaskSchedule <SubjectRef> tag necessary for version control? There is a lot of stuff there. 
This is a rather old thread and I might not get an answer to my question from the original poster:
Is there anything of interest in the TaskSchedule objects for this scenario?
The important things, from my point of view, are stored in the Resource objects and in the Reconcile Configuration configuration object (correlation rules, proxy user, actions on certain situations, pre-recon tasks, per-account workflows, ...).
I'm sorry if I misunderstood the question.
Scheduling information is not obviously present in the Resource and Reconcile Configuration objects: how often and what time of day do you run reconciliation? Do you do full-only or full-plus-incremental? Stuff like that.
I think I now understand what you want to achieve.
I still have a hard time coming up with a versioning idea, because this seems to be operations ("monitoring the system and deciding if we should take action or not") rather than code changes that you may want to roll back to an earlier version if something goes wrong.
If it is just about the time, interval, and mode for reconciling a couple of resources, the best way to manage that seems to be to monitor the system and tell it to do the right thing if the last run showed there is room for improvement. Now, if you are dealing with 3k UNIX servers that need to be reconciled in a certain order, or with time constraints on the availability of the resources, that would be another beast, and one I would try to solve programmatically rather than by operations with version control.
Unless I hear otherwise, I am going to assume that no one out there bothers to put this sort of scheduling configuration into version control. (If you do and have a good reason, please reply for the next person who searches the forum!) Therefore, it's not a Best Practice, and I can safely ignore it.
I am going to wait a few days before going and marking this question as answered.

SGE intermediate job usage info

Here I have a problem with SGE reporting:
The User Guide says: "For N1GE 5.3 systems, only one record exists per finished job, array task and parallel task. The ju_curr_time column holds the job's end time (j_end_time in ...). For N1GE 6.0 systems, the online usage is stored as well; this results in multiple records for one job, array task, and parallel task stored in sge_job. The resource usage of a job can be monitored over time (ju_curr_time)."
I have installed N1GE 6.0 u4, but I still cannot see the multiple records the doc describes. I only get one record, once the job is finished.
Does anybody know how to solve this problem? Is there any configuration that needs to be done? Any comment will be appreciated.
Your question is about Sun Grid Engine (SGE). Most of the engineers who answer questions here don't use SGE, so you might not get a good answer. You can find more information about the Grid Community here:
There are mail aliases and web sites listed there. Maybe one of those resources would be a better place to ask your question.
Oh, I'm sorry. But thank you anyway.

Doubt: Filters in Golden Gate Area

Hi All,
I'm an ODI developer (beginner) and have a doubt about GoldenGate.
I have the follow scenario:
GG -> Staging Area -> Mastersaf
The client told me not to apply filters on the first step (GG -> Staging), only at the Staging Area -> Mastersaf step.
Is that correct (is it a best practice)? Why?
I think you need to provide more details of your setup. I didn't follow your story or the question.
Just a wild guess here, based on the lack of details and clarity in your question, but are you asking about file permissions for where GoldenGate is installed? Make life easy, install it as your oracle OS user. 
This is a common requirement/request. Basically, if you have GG capturing data from some database, you want to capture everything that you might need, including in the future, rather than just what you know you need today. So yes, capture everything (or rather, don't filter gratuitously) and send it all to the staging area. Next week you may have another client (or a change in requirements) that asks for additional data. If you're already capturing the data, you don't have to go back to the source and modify your configuration to get the additional tables, rows or columns. Yes, it takes a little bit of extra (temporary) storage space, but disk and bandwidth are (often) cheaper than development effort, regardless of how much effort it actually is. (In many database shops, it takes much longer to do the "paperwork" to make a change on the DB server than the time it would take to make the actual change. This type of proactive policy ("capture it in case we need it") prevents that.)
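A minimal Extract parameter sketch of that capture-everything policy (the process, credential alias, trail, and schema names are all hypothetical):

```
-- Capture the whole schema unfiltered; the Staging -> Mastersaf
-- step applies the business filters instead.
EXTRACT ext1
USERIDALIAS ogg_src
EXTTRAIL ./dirdat/aa
TABLE app.*;
```

The filtering then lives in one downstream place, so a new requirement changes only the staging job, not the source capture configuration.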