Read operation - NoSQL Database

Hi all,
Will read operations go to any node if the user doesn’t mind incomplete consistency guarantees (i.e. reads might not see the most recent data)?
Will read operations be served from the master node if the user requires the most recent value for a data item?
Can we choose whether we want read operations to be done from the master or not? How can we do that?
By default (if there is no failure), will the read request return the last updated data? Or is it possible to get old values because the data is served from a node that does not yet have the latest value?
Thanks for your response. 

user962305 wrote:
> Will read operations go to any node if the user doesn’t mind incomplete consistency guarantees (i.e. reads might not see the most recent data)?

See oracle.kv.Consistency. Each type of read operation accepts a Consistency parameter. The request is sent to any node that can satisfy the level of consistency specified.

> Will read operations be served from the master node if the user requires the most recent value for a data item?

Use Consistency.ABSOLUTE.

> Can we choose whether we want read operations to be done from the master or not? How can we do that?

Consistency.NONE causes the request to be routed to any node in the rep group. TIME_VALUE routes it to any node that is within the specified delta of the master. Consistency.VERSION routes it to any node that has at least that version.

> By default (if there is no failure), will the read request return the last updated data? Or is it possible to get old values because the data is served from a node that does not yet have the latest value?

If you do not specify a store-wide default, and you do not specify a Consistency in the read operation's arguments, it will default to Consistency.NONE.
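For illustration, a minimal sketch of passing a Consistency to a read in the Java API (the store name, helper host:port, and key below are placeholders, not values from this thread):

import java.util.Arrays;
import oracle.kv.Consistency;
import oracle.kv.KVStore;
import oracle.kv.KVStoreConfig;
import oracle.kv.KVStoreFactory;
import oracle.kv.Key;
import oracle.kv.ValueVersion;

public class ConsistencyReadExample {
    public static void main(String[] args) {
        // Placeholder store name and helper host:port.
        KVStore store = KVStoreFactory.getStore(new KVStoreConfig("kvstore", "node1:5000"));

        Key key = Key.createKey(Arrays.asList("user", "42"));

        // No Consistency given: the default applies, so any replica may serve the read.
        ValueVersion relaxed = store.get(key);

        // Consistency.ABSOLUTE: the read is routed to the master and sees the latest value.
        ValueVersion latest = store.get(key, Consistency.ABSOLUTE, 0, null);

        if (latest != null) {
            System.out.println("Value is " + latest.getValue().getValue().length + " bytes");
        }
        store.close();
    }
}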
Charles Lamb

Related

The performance of Oracle NoSQL

1. Background
Oracle NoSQL version: 3.0.14
Number of nodes: 1
Number of disks: 2

2. Performance
I deployed Oracle NoSQL on a machine named node1, which is a VM in CloudStack. Another node in CloudStack, named node2, is used as the client. I loaded the data stored on node2 into the store deployed on node1. The data size is about 20 GB, but the time spent was about 30 hours! From the picture above, we can see that the performance of the put method is very bad.
The performance information shows that operations had a latency of 0.4 milliseconds, so that suggests that the performance of individual operations is good.  Your program only makes requests using a single client thread.  My guess is that the problem is that the program is creating much less load than the store can handle.  Maybe you could rewrite your program to use multiple threads where each thread inserts data from a given input file?  You'll need to experiment to figure out how many files can be processed simultaneously, but my guess is that you could increase the data throughput that way. - Tim
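As a rough illustration of that suggestion, here is a sketch of a multi-threaded loader (the store name, helper host:port, key structure, and tab-separated input format are placeholder assumptions, not details from this thread):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import oracle.kv.KVStore;
import oracle.kv.KVStoreConfig;
import oracle.kv.KVStoreFactory;
import oracle.kv.Key;
import oracle.kv.Value;

public class ParallelLoader {

    // Loads one input file; each line is assumed to be "key<TAB>value".
    static void loadFile(KVStore store, String path) throws IOException {
        for (String line : Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8)) {
            String[] parts = line.split("\t", 2);
            Key key = Key.createKey(Arrays.asList("data", parts[0]));
            store.put(key, Value.createValue(parts[1].getBytes(StandardCharsets.UTF_8)));
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder store name and helper host:port.
        KVStore store = KVStoreFactory.getStore(new KVStoreConfig("kvstore", "node1:5000"));

        List<String> files = Arrays.asList(args);  // one input file per worker thread
        ExecutorService pool = Executors.newFixedThreadPool(files.size());
        for (String f : files) {
            pool.submit(() -> { loadFile(store, f); return null; });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.DAYS);
        store.close();
    }
}

You would need to experiment with the number of input files processed simultaneously, as Tim suggests.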
Hi Tim, after I inserted about 20 GB of data into the store, I wanted to get the total number of rows using the methods shown above. It returns "timeout". I do not know what that means?
The timeout result means that the system was not able to respond to the command within the specified amount of time, which was 5000 milliseconds (the default). I wonder if there is a problem with one or more nodes in the store. Run the 'verify configuration' command to check the health of the store. The store might not be able to respond to commands if one or more nodes have failed. - Tim
Since the database engine does not store row counts, to get a row count we must iterate over the entire table. Seems like this could easily take longer than 5s, since 20G of data was inserted. --mark
Sorry, I was wrong above about the iteration taking longer than 5s. It shouldn't timeout because the iteration works in small batches that could easily be completed in less than 5s, unless there is something wrong with the store. So please follow Tim's instructions about verifying the store. --mark
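For what it's worth, a rough sketch of counting records by iterating over the keys with the key-value API (the batch size and the key-only iteration are my own choices here, not something prescribed in this thread):

import java.util.Iterator;
import oracle.kv.Direction;
import oracle.kv.KVStore;
import oracle.kv.Key;

public class RecordCounter {
    // Iterates over all keys in the store in batches and counts them.
    public static long countRecords(KVStore store) {
        long count = 0;
        Iterator<Key> it = store.storeKeysIterator(Direction.UNORDERED, 500);
        while (it.hasNext()) {
            it.next();
            count++;
        }
        return count;
    }
}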
Thank you very much. I will give it a try according to Tim's instructions.

How are asset's IDs created in WCS?

Hi, in WCS 11.1.1.8, how are IDs assigned to new assets? Is there a way to know what the next ID would be? Or can we know the pattern of the next N IDs that are about to be assigned? In any case, what would happen if, when publishing from A -> B, an ID from A exists in B but not for the same asset? I hope this is clear; otherwise please let me know. Thanks in advance.
Publishing always happens from A to B, so it is very unlikely that the ID exists in B but not in A. What are you trying to do? Why do you want to know the IDs?
There is a table called SystemIdGenerator. That controls the 'next number' that will be fetched when the system needs to generate additional unique IDs. The core ID generation will grab a 'pool' of numbers and increment this table with what comes next. If you change the prefix of the numbers on the other system, it will generate based on your new scheme. Example:

System A: 1374100115170
System B: 3374100115170

Then system B will start making numbers larger than 33x.

I understand what you are concerned about: the system clone and split process you went through definitely adds to the confusion. Before you do that again, you should think through why you need to do it in the first place; if the existing asset and templating model does not support feature versions, you may want to revisit that. In many cases, when the business insists that they MUST run in parallel to build out, it makes sense to not only clone 'Editing' but also clone Delivery, and then when they are ready to switch to the new model it's just a couple of DNS updates.

Error loading Worker with END_ASG rows - (There is no primary assignment for work relationship.)

Hello, I'm having trouble loading an end assignment (END_ASG) row using HCM Data Loader in the new R12 environment. Please see the attached file for an abbreviated Worker-0000001.dat.zip file. I get the following error: "There is no primary assignment for work relationship." NOTE: This data loads successfully if I omit the END_ASG rows (Assignment, WorkTerms, and AssignmentWorkMeasure). The same Worker data loaded successfully in the previous version of Oracle (R11), but now it errors out in the new R12 environment. Please advise. Thank you in advance! Peter
Hello Peter, good question. You are ending the non-primary assignment, so no change in primary flags should be necessary. Apart from the fact that I've never tried listing all headers first and then all data lines, I do not see the root cause. Unfortunately, I am still on R11, so I have no experience with any "changed behaviour" of the HDL. What happens if you load the end-assignment rows (WorkTerms, Assignment) separately, after you have successfully loaded the other lines? Note that you may have to change the EffectiveEndDate of the preceding line then. Regards, Holger
I can remove the END_ASG rows and load them in a separate file successfully. I had thought of this idea as well, but wanted to avoid adding another step to our conversion process. On a side note, do you know when the EffectiveSequence should be incremented? I was told to increment the EffectiveSequence when there are multiple rows for the same PersonNumber (we don't use PersonId) and EffectiveStartDate, but that seems strange to me. It makes more sense to increment the EffectiveSequence only when there are multiple rows for the same AssignmentNumber and EffectiveStartDate. The Oracle documentation gives a generic explanation; basically it says that if some change occurs on the same day, then increment the EffectiveSequence, but it doesn't specify the keys. Thanks, Peter
Hello Peter, sorry for my late response, but my work schedule does not always allow daily contributions to the community. So, if the END_ASG row works fine in a separate file/load, then there might be a bug in the HDL, or I have overlooked something. Either way, at least you made it.

EffectiveSequence: yes, it should only be increased if there are multiple changes per day (Fusion documentation uses the acronym MCPD) for the same combination of the [other] keys (EffectiveSequence is part of the key). Fusion uses "artificial keys" (they call them Fusion keys), e.g. PersonId (a number field) for a person (while the "user key" PersonNumber (a string) is distinct for persons as well), and AssignmentId (number) for an assignment (while the user key AssignmentNumber (string) is distinct for an assignment as well). As the Assignment entity uses AssignmentId as its key (not the PersonId, though that is part of the granularity), you should increase the EffectiveSequence if an AssignmentId has multiple changes per day (as EffectiveStartDate is another part of the key). If in doubt, you may want to check the documentation (here the example for the underlying object PER_ALL_ASSIGNMENTS_M), which tells you that technically the keys are: ASSIGNMENT_ID, EFFECTIVE_START_DATE, EFFECTIVE_END_DATE, EFFECTIVE_LATEST_CHANGE, EFFECTIVE_SEQUENCE (personally, I think that EFFECTIVE_END_DATE and EFFECTIVE_LATEST_CHANGE are dispensable parts of the key, as they do not offer a degree of freedom), while the HCM Data Loader (separate documentation) also accepts as a substitution (think of it as a search index) the reference of AssignmentNumber, EffectiveStartDate, EffectiveSequence.

If a person has two assignment rows on the same day, but they are for different AssignmentIds (AssignmentNumbers), then do not increase the EffectiveSequence. Increase it only if MCPD happens for the same assignment. Furthermore, an MCPD situation in Fusion expects the non-latest row to have a) an EffectiveEndDate equal to its EffectiveStartDate and b) the EffectiveLatestChange (a Y/N flag) set to N (this is the only situation in which this flag is N; in any other case it is Y, so it is not very valuable).

Example (taken, but reduced to the core, from the HDLdi samples, so thanks to Prasanna Borse). Note that the 2nd, 3rd, and 4th of the 5 rows start on the same day, while the 1st and 5th rows stand alone on their EffectiveStartDate:

METADATA|Assignment|ActionCode|AssignmentNumber|AssignmentStatusTypeCode|AssignmentType|EffectiveStartDate|EffectiveEndDate|EffectiveLatestChange|EffectiveSequence|...
MERGE|Assignment|HIRE|STUDENT*_ASSIGN_NUM117|ACTIVE_PROCESS|E|2015/08/01|2015/08/10|Y|1|...
MERGE|Assignment|LOCATION_CHANGE|STUDENT*_ASSIGN_NUM117|ACTIVE_PROCESS|E|2015/08/11|2015/08/11|N|1|...
MERGE|Assignment|JOB_CHANGE|STUDENT*_ASSIGN_NUM117|ACTIVE_PROCESS|E|2015/08/11|2015/08/11|N|2|...
MERGE|Assignment|PROMOTION|STUDENT*_ASSIGN_NUM117|ACTIVE_PROCESS|E|2015/08/11|2015/08/31|Y|3|...
MERGE|Assignment|LOCATION_CHANGE|STUDENT*_ASSIGN_NUM117|ACTIVE_PROCESS|E|2015/09/01|4712/12/31|Y|1|...

Hope it helps. Regards, Holger
Thank you so much for explaining how the EffectiveSequence on the Assignment object works. That was my understanding as well.
After some more research and an SR ticket, it turns out that this issue is related to an Oracle bug. I was instructed to leave the PRIMARY_FLAG empty at the Assignment level, and that allowed the loading of employees with END_ASG data, which then loaded without the error. Unfortunately, there is an Oracle bug that caused multiple assignments to have the PER_ALL_ASSIGNMENTS_M.PRIMARY_ASSIGNMENT_FLAG set to 'Y':

https://support.oracle.com/epmos/faces/BugDisplay?_afrLoop=194103949568890&parent=SrDetailText&sourceId=3-15270264661&id…
https://support.oracle.com/epmos/faces/BugDisplay?_afrLoop=193845473852340&parent=SrDetailText&sourceId=3-15270264661&id…

Service Request response from Oracle. Received the following update in Bug 26395393:
=========================================
In the DAT file loaded by the customer, there are two assignments in the work relationship. Normally the assignment with the action code of HIRE would get created first, but in this case the assignment with action code of ADD_ASSIGN got created first. There was a check in the code which was automatically changing the primary flags to "Y" for the first assignment which got created for a work relationship (even if the DAT file specified the flag as N). This led to incorrect data for person number ###### in the PrimaryFlag, PrimaryAssignmentFlag columns of the Assignment table. We are implementing a fix to ensure that this automatic changing of flags does not happen when there is more than one assignment in the Work Relationship being loaded via the DAT file. The fix for this will be delivered via bug 25583761. For the specific case of person number ######, please ask the customer to correct the PrimaryFlag and PrimaryAssignmentFlag values via HDL.
=========================================

Hope someone else finds this information useful.
Thanks for the explanation!

Some questions about how Oracle NoSQL works

Hi, all!
I have some questions about how Oracle NoSQL works. I hope you can help me!
Development:
1) We know that the client balances load within one replication group (between storage nodes). How does it get the current load of each node in the shard (in the case where I use more than one application server)?
2) Could you suggest a way to do load testing more efficiently (for internal research)?
3) Are there any ways/tools to bulk insert (from a flat file, for example)?
4) What are the ways of securing data (auth, …)? How can I protect important keys?
5) Which fetch method is the most efficient/fastest (multiGet(), multiGetIterator(), storeIterator())?
6) How many bytes are kept for the version (per key-value pair)?
7) How can I look at the distribution of key-value pairs between partitions?
8) What about compression (key and value)?
9) Can I create one value byte array for 2 different keys (the second key acting as a pointer)?
Administration:
1) What monitoring tools exist (disk, memory, CPU… for each node and for the cluster as a whole)?
2) Does the in-memory cache contain only keys (an index of keys)? Or can I put values there?
3) What function does the FS cache serve, and what does the Java cache serve?
4) Is there any formula to calculate the number of partitions (and likewise the number of shards)?
5) Can I assign the master node within a replication group?
6) What are the negative aspects of using a lot of shards (virtual, when one physical server hosts many virtual ones)?
7) How does the client connect to the kvstore (step by step)?
Thank you in advance! 
934449 wrote:
> 1) We know that the client balances load within one replication group (between storage nodes). How does it get the current load of each node in the shard (in the case where I use more than one application server)?

I don't understand the question. The client driver does load balancing, when it is able to based on the Consistency requirements of the request, across nodes within a shard.

> 3) Are there any ways/tools to bulk insert (from a flat file, for example)?

Not presently.

> 4) What are the ways of securing data (auth, …)? How can I protect important keys?

Presently there is no authentication capability.

> 5) Which fetch method is the most efficient/fastest (multiGet(), multiGetIterator(), storeIterator())?

It depends on the application. multi...() operations are designed to reduce the number of network round trips (a short sketch follows these answers).

> 6) How many bytes are kept for the version (per key-value pair)?

Only one version of the data is maintained.

> 7) How can I look at the distribution of key-value pairs between partitions?

Within a shard, you could run com.sleepycat.je.util.DbVerify against the environment. Partitions will show up as databases named "pNNN".

> 8) What about compression (key and value)?

Not presently.

> 9) Can I create one value byte array for 2 different keys (the second key acting as a pointer)?

If you are asking whether you can have two keys referring to the same value, then no.
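Regarding question 5, a rough sketch of fetching all records that share a major key in a single network round trip with multiGet() (the key structure is a placeholder assumption):

import java.util.Arrays;
import java.util.SortedMap;
import oracle.kv.KVStore;
import oracle.kv.Key;
import oracle.kv.ValueVersion;

public class MultiGetExample {
    // Fetches every record under the major path /user/<id> in one network round trip.
    static SortedMap<Key, ValueVersion> fetchUser(KVStore store, String userId) {
        Key parent = Key.createKey(Arrays.asList("user", userId));
        return store.multiGet(parent, null, null);  // no sub-range or depth restriction
    }
}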
Administration:
> 1) What monitoring tools exist (disk, memory, CPU… for each node and for the cluster as a whole)?

There are many monitoring tools available. The admin console provides access to these.

> 2) Does the in-memory cache contain only keys (an index of keys)? Or can I put values there?

You cannot put anything there yourself, but values and keys are put in the cache.

> 3) What function does the FS cache serve, and what does the Java cache serve?

I don't understand the question.

> 4) Is there any formula to calculate the number of partitions (and likewise the number of shards)?

There is a chapter in the Admin Guide about sizing. This is addressed there.

> 5) Can I assign the master node within a replication group?

It is dynamic.

> 6) What are the negative aspects of using a lot of shards (virtual, when one physical server hosts many virtual ones)?

Contention for resources (IO, CPU, and memory), primarily.

> 7) How does the client connect to the kvstore (step by step)?

The client creates a KVStoreConfig and then creates a KVStore object (a short sketch follows).
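A minimal sketch of those two steps (the store name and helper host:port values are placeholders for your own deployment):

import oracle.kv.KVStore;
import oracle.kv.KVStoreConfig;
import oracle.kv.KVStoreFactory;

public class ConnectExample {
    public static void main(String[] args) {
        // Step 1: describe the store -- placeholder store name and helper host:port values.
        KVStoreConfig config = new KVStoreConfig("kvstore", "node1:5000", "node2:5000");
        // Step 2: open the handle; the driver contacts a helper host to learn the topology.
        KVStore store = KVStoreFactory.getStore(config);
        // ... perform operations ...
        store.close();
    }
}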
Charles Lamb 
Charlie, I think I understand one of the questions.
> 2) Does the in-memory cache contain only keys (an index of keys)? Or can I put values there?
> 3) What function does the FS cache serve, and what does the Java cache serve?
The NoSQL DB cache contains keys only (which we call the Btree internal nodes), and values are stored in the file system cache.
Is that what you wanted to know?
--mark

Activity data semantics: mutability and time ordering

Activity data mutability

Export of activity data via the Bulk REST API v2 returns JSON objects which have an 'updatedAt' field. How/when is this data updated? For example, are there APIs that allow updating activities (e.g., keyed by an ActivityId)? To what extent are activities immutable? For example, will createdAt ever change?

Time ordering and consistency of activity data

Assuming I have full permissions to all data: if I export activity data now (1 AM PST 8/26) with a filter like "('{{Activity.CreatedAt}}' < '2014-07-31T23:59:59' AND '{{Activity.CreatedAt}}' > '2014-07-30T23:59:59' AND '{{Activity.Type}}' = 'EmailOpen')", will this capture all EmailOpens in the window, or is it possible that rerunning the export at a later date (say 9/1) could return additional EmailOpens? If so, why/under what circumstances? How would the same filter behave when the upper bound on CreatedAt is closer to the current time? Is it safe to filter on '{{Activity.CreatedAt}}' < '$NOW', or would '{{Activity.CreatedAt}}' < '$NOW - 5minutes' be better?

I wasn't able to find any documentation covering these questions -- if it exists and I missed it, apologies in advance. Thanks!

Christopher Campbell-Oracle
Hi pod, The updatedAt field is one that is included on all (or nearly all) objects within Eloqua.  Consider it a field that is globally included, but not always used.  In the case of activities, updatedAt is not utilized at this time. The createdAt field value is set upon activity creation and should never change.  So exporting activities from the first filter should always provide the same activities.  The second filter specifies a time window which is continually updated based on the current date/time, as a result the activities in the export will vary based on activities being created in Eloqua.
Thanks Christopher! Understand re updatedAt, that makes sense. For createdAt, I understand what you're saying, but my question is slightly different -- sorry, I think I wasn't clear. To rephrase: What duration D do I have to wait after time T to be guaranteed that no new activities will be created with a createdAt < T? I'm confident D is less than a month -- I doubt there will be new activities created today that get stamped with a createdAt in July. But, depending on Eloqua's backend, I could imagine that D might be, say, one minute. Thanks again!
The createdAt time is accurate to the second. Filtering on createdAt > date/time, where date/time is the end time of the previous sync, there should be no overlap. Activities are not created with past dates, so there should be no need to consider overlap when exporting.
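As a rough illustration of that incremental pattern, a sketch of building such a filter for the next export (the lastSyncEnd value and the EmailOpen activity type are placeholder assumptions; the filter syntax follows the examples quoted earlier in this thread):

import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class ActivityExportFilter {
    public static void main(String[] args) {
        // End time of the previous successful sync -- placeholder value.
        LocalDateTime lastSyncEnd = LocalDateTime.of(2014, 7, 31, 23, 59, 59);
        String ts = lastSyncEnd.format(DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss"));

        // Export only activities created strictly after the previous sync's end time.
        String filter = "'{{Activity.CreatedAt}}' > '" + ts + "'"
                      + " AND '{{Activity.Type}}' = 'EmailOpen'";
        System.out.println(filter);
    }
}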
Christopher Campbell wrote:
> Filtering on createdAt > date/time, where date/time is the end time of the previous sync, there should be no overlap.

Great, thanks!
