com.taleo.integration.client.customstep.paging
Class PagingPreStep

java.lang.Object
  extended by com.taleo.integration.client.step.BaseStep
      extended by com.taleo.integration.client.step.BaseCustomStep
          extended by com.taleo.integration.client.customstep.BaseCustomStep
              extended by com.taleo.integration.client.customstep.paging.PagingPreStep
All Implemented Interfaces:
com.taleo.integration.client.step.CustomStep, com.taleo.integration.client.step.Step

public class PagingPreStep
extends BaseCustomStep

Pre-processing step to manage the beginning of the paging process for large extracts.

This pre-processing step works in tandem with PagingPostStep. It sets the current page index and the paging size in the query, while PagingPostStep manages the end of the loop.

This step takes all parameters from its base class BaseCustomStep and the following additional parameters:

Paging continues until the record count from the last run is less than the paging size. When that happens, the process exits immediately with code 0 and all subsequent steps are skipped. A different exit code can be set through the system property named com.taleo.client.customstep.PagingPreStep.CompleteExitCode.

Note: Exiting with a specific code is currently the only way to end TCC on demand. The downside is that the monitoring file remains incomplete.
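
The termination rule above can be sketched as follows. This is only an illustration, not the TCC source code: the class and method names are hypothetical, and only the system property name and the default exit code of 0 come from the description above.

```java
// Hypothetical sketch of the termination rule; not the actual PagingPreStep code.
final class PagingTerminationSketch {

    /** System property name, as given in the class description. */
    static final String EXIT_CODE_PROPERTY =
            "com.taleo.client.customstep.PagingPreStep.CompleteExitCode";

    /** Paging is complete when the last run returned fewer records than the paging size. */
    static boolean isPagingComplete(int lastRecordCount, int pagingSize) {
        return lastRecordCount < pagingSize;
    }

    /** Exit code to use on completion: the system property if set, otherwise 0. */
    static int resolveExitCode() {
        return Integer.getInteger(EXIT_CODE_PROPERTY, 0);
    }

    public static void main(String[] args) {
        int pagingSize = 1000;
        int lastRecordCount = 250; // fewer than pagingSize, so that was the last page
        if (isPagingComplete(lastRecordCount, pagingSize)) {
            System.out.println("Paging complete; would exit with code " + resolveExitCode());
        }
    }
}
```

The property could then be supplied on the JVM command line, for example with -Dcom.taleo.client.customstep.PagingPreStep.CompleteExitCode=3.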

This implementation of paging does not maintain state between pages at the database level. Consequently, the process is sensitive to data moving across pages while paging occurs, which can cause missing records. One way to prevent this is to ensure that the data is ordered consistently from one page to the next. If the default sort order cannot be guaranteed to be consistent, a sort order must be forced.

Records added while paging may cause duplicate records in the result. A record inserted into a previously extracted page pushes the other records down, so the last record that was extracted appears again in the following page. If required, duplicates can be eliminated by applying a merge that removes them at the end of paging.
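
A duplicate-removing merge of this kind might look like the sketch below. It is an assumption for illustration only: records are represented by hypothetical string keys, and TCC itself may perform the merge differently (for example, in a later pipeline step).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: merge extracted pages, keeping the first occurrence of each record key.
final class PageMergeSketch {

    /** Merge pages in extraction order and drop records whose key was already seen. */
    static List<String> mergeWithoutDuplicates(List<List<String>> pages) {
        Set<String> seen = new LinkedHashSet<>(); // preserves first-seen order
        for (List<String> page : pages) {
            seen.addAll(page);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Record C was the last record of page 1. An insert into the page 1 range
        // pushed it down, so it shows up again at the top of page 2.
        List<String> page1 = Arrays.asList("A", "B", "C");
        List<String> page2 = Arrays.asList("C", "D", "E");
        System.out.println(mergeWithoutDuplicates(Arrays.asList(page1, page2)));
        // prints [A, B, C, D, E]
    }
}
```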

Removed or deleted records may also shift pages and cause missing records. Records removed from already extracted pages during paging cause records that should have been extracted in the current page to shift logically into the last extracted page, so the current page misses them.

For queries that can have deletions or removals (for example, by filtering on a moving date such as the last modified date), the proposed strategy is to apply a decrease factor to the paging size on each successive run. This factor represents a safe maximum percentage of deletions or removals expected while paging occurs. With this factor applied, the last records of the previous page are likely to be extracted again, but some of the extracted records may be ones that would otherwise have been missed. The duplicates this strategy creates are a lesser problem than missing records, especially when a merge removes them at the end. A typical value for this factor is 1%, which for a paging size of 100000 represents 1000 removals during the couple of minutes it takes to extract a single page.
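
The arithmetic behind the 1% example can be checked with a short sketch. It assumes the decrease factor is expressed as an integer percentage, which matches the int type of the decreaseFactor field but is not confirmed by this page. The class and method names are hypothetical, and the exact way TCC applies the factor from run to run is not shown here.

```java
// Hypothetical sketch of the decrease-factor arithmetic; not the actual PagingPreStep code.
final class DecreaseFactorSketch {

    /** Number of removals covered by the factor, assuming it is an integer percentage. */
    static int coveredRemovals(int pagingSize, int decreaseFactorPercent) {
        // Multiply as long to avoid int overflow for large paging sizes.
        return (int) ((long) pagingSize * decreaseFactorPercent / 100);
    }

    /** Paging size after applying the factor once (assumed interpretation). */
    static int reducedPagingSize(int pagingSize, int decreaseFactorPercent) {
        return pagingSize - coveredRemovals(pagingSize, decreaseFactorPercent);
    }

    public static void main(String[] args) {
        int pagingSize = 100000;
        int decreaseFactorPercent = 1;
        System.out.println(coveredRemovals(pagingSize, decreaseFactorPercent));   // prints 1000
        System.out.println(reducedPagingSize(pagingSize, decreaseFactorPercent)); // prints 99000
    }
}
```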

In practice, the following rules should apply:

Here are examples of how to define this custom step in the TCC configuration file:

Author:
Romain Guay, Taleo Corporation

Field Summary
protected  int decreaseFactor
          The decrease factor.
protected  int pagingSize
          The paging size.
 
Fields inherited from class com.taleo.integration.client.customstep.BaseCustomStep
parameterNames
 
Fields inherited from class com.taleo.integration.client.step.BaseCustomStep
parameters
 
Fields inherited from class com.taleo.integration.client.step.BaseStep
commType, ERROR_NULL_GLOBAL_CONFIG, ERROR_NULL_PIPELINE, ERROR_NULL_STEP_CONFIG, productCode, tempFolder, type, version
 
Constructor Summary
PagingPreStep()
          Constructor without arguments.
 
Method Summary
 void execute(com.taleo.ws.integration.client.Pipeline pipeline)
           
 int getDecreaseFactor()
          Get the decrease factor.
 java.lang.String getDescription()
           
 java.lang.String getIdentifier()
           
 java.lang.String getName()
           
 java.lang.String getPagingFilename()
          Get the paging filename.
 int getPagingSize()
          Get the paging size.
 void init(com.taleo.ws.integration.client.GlobalConfig config)
           
 void setDecreaseFactor(int decreaseFactor)
          Set the decrease factor.
 void setPagingFilename(java.lang.String pagingFilename)
          Set the paging filename.
 void setPagingSize(int pagingSize)
          Set the paging size.
protected  void setPagingSizeAndPageIndex(com.taleo.ws.integration.query.QueryDocument.Query query, java.io.File outFile, int pagingSize, int pageIndex)
          Set the specified paging size and page index in the query and write the modified query in the given output file.
 
Methods inherited from class com.taleo.integration.client.customstep.BaseCustomStep
createTempFile, createTempFile, getEncoding, getTempFolder, isActive, registerParameterName, setActive, setEncoding, validateParameterNames
 
Methods inherited from class com.taleo.integration.client.step.BaseCustomStep
getSupportedPipeline, getType, init
 
Methods inherited from class com.taleo.integration.client.step.BaseStep
getAllProcessSupportPipeline, getCurrentFile, getCurrentFile, getCurrentMessage, getPostProcessSupportPipeline, getPreProcessSupportPipeline, isOriginalFile, validateLastStepType, validateMessageType, validatePipeline
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface com.taleo.integration.client.step.Step
validatePipeline
 

Field Detail

pagingSize

protected int pagingSize
The paging size.


decreaseFactor

protected int decreaseFactor
The decrease factor.

Constructor Detail

PagingPreStep

public PagingPreStep()
Constructor without arguments.

Method Detail

getIdentifier

public java.lang.String getIdentifier()
Specified by:
getIdentifier in interface com.taleo.integration.client.step.CustomStep
Specified by:
getIdentifier in interface com.taleo.integration.client.step.Step
Specified by:
getIdentifier in class com.taleo.integration.client.step.BaseCustomStep

getDescription

public java.lang.String getDescription()
Specified by:
getDescription in interface com.taleo.integration.client.step.CustomStep
Specified by:
getDescription in interface com.taleo.integration.client.step.Step
Specified by:
getDescription in class com.taleo.integration.client.step.BaseCustomStep

getName

public java.lang.String getName()
Specified by:
getName in interface com.taleo.integration.client.step.CustomStep
Specified by:
getName in interface com.taleo.integration.client.step.Step
Specified by:
getName in class com.taleo.integration.client.step.BaseCustomStep

getPagingSize

public int getPagingSize()
Get the paging size.

Returns:
The paging size.

setPagingSize

public void setPagingSize(int pagingSize)
Set the paging size.


getDecreaseFactor

public int getDecreaseFactor()
Get the decrease factor.

Returns:
The decrease factor.

setDecreaseFactor

public void setDecreaseFactor(int decreaseFactor)
Set the decrease factor.


getPagingFilename

public java.lang.String getPagingFilename()
Get the paging filename.

Returns:
The paging filename.

setPagingFilename

public void setPagingFilename(java.lang.String pagingFilename)
Set the paging filename.

Parameters:
pagingFilename - The paging filename.

init

public void init(com.taleo.ws.integration.client.GlobalConfig config)
Overrides:
init in class BaseCustomStep

execute

public void execute(com.taleo.ws.integration.client.Pipeline pipeline)
             throws com.taleo.integration.client.step.StepException
Throws:
com.taleo.integration.client.step.StepException

setPagingSizeAndPageIndex

protected void setPagingSizeAndPageIndex(com.taleo.ws.integration.query.QueryDocument.Query query,
                                         java.io.File outFile,
                                         int pagingSize,
                                         int pageIndex)
                                  throws com.taleo.integration.client.step.StepException,
                                         java.io.IOException
Set the specified paging size and page index in the query and write the modified query in the given output file.

Parameters:
query - The SQ-XML query.
outFile - The output file.
pagingSize - The paging size.
pageIndex - The page index.
Throws:
com.taleo.integration.client.step.StepException
java.io.IOException