FTP Binding Component Specification
Clustering Support & Transactional Messaging
Update History:
12/07/2006 created
08/14/2008 converted into wiki form from PDF
09/04/2008 added content for NFS file lock based concurrency control
01/16/2009 update to incorporate changes made for message exchange
area default sub-directory layout and persisted sequence numbering
1. FTP BC and Message Transport Via FTP
The FTP Binding Component (FTP BC herein) is an implementation compliant with JSR 208.
It provides message transport over the FTP protocol so that abstract services (which
comprise one or more operations) defined in WSDL can be bound to FTP as their underlying
message transport. Other components in a JBI environment, e.g., Service Engines (SE), can
then orchestrate service consumption and provision.
The FTP BC implements all required BC interfaces in the JBI specification so that it can
be deployed and run in any JBI-compliant target environment.
With FTP BC, Service Engines can exchange messages with each other through services
defined in WSDLs that use FTP BC extensibility elements for operation definition
and binding. By “message exchange over FTP” or “message transport over FTP”,
we mean the following:
Figure 1. message transportation via FTP
As illustrated, FTP BC uses directories on the FTP server as message persistence
where service consumers and service providers exchange messages. Hence, the
implementation makes the following assumptions:
(1) For a specific service operation with FTP binding, the endpoint information,
i.e., host name, port, account login, password, plus a base directory, specifies
a message exchange area on the remote FTP server where consumer and provider
communicate. Taking a request-response operation as an example, the following steps
are performed in the area to transport messages from consumer
to provider (request, IN-Route) and from provider to consumer (response, OUT-Route):
Consumer: Stage (Put) request to staging area for PUT operation
Consumer: Expose (Rename to target location) request to provider
when staging completes
Provider: Poll (Get periodically) request - perform
a directory listing and check for files that match the request file name
Provider: Stage the selected file - move it to a directory
(staging area for POLL operation) where the content of the file is read
Provider: Read the content of the selected file from POLL staging area
Provider: Archive/Remove request after it is received (message content
read completes)
... ... ...
Provider: perform business logic on the request message, and produce
a response - assuming it is a request/response operation
... ... ...
Provider: Stage (Put) response to PUT staging area
Provider: Expose (Rename to target location) response to consumer
when staging completes
Consumer: Poll (Get periodically) response -
perform a directory listing and check for files that match the response file name
Consumer: Stage the selected file - move it to a directory
(POLL staging area) where the content of the file is read
Consumer: Read the content of the selected file from POLL staging area
Consumer: Archive/Remove response after it is received
Note:
Staging in the PUT operation prevents a partially written file from being read.
Staging in the POLL operation removes a selected file from the input directory
as early as possible.
The message exchange area (a dedicated directory on the remote FTP server
and all its sub-directories, serving as archiving area, staging area,
exchange area, etc.) can be created either by the FTP BC runtime or
administratively, just like queues and topics in the JMS paradigm.
(2) For a given service, the service consumer and service
provider either log in using the same FTP account (so they share
the same FTP context, e.g., the login home that all relative paths
refer to), or, if different accounts are used, administrative
configuration must ensure that the login home is mapped
to the same location on the FTP server and that appropriate access
rights are granted.
(3) Different service operations use dedicated message exchange
areas on the remote FTP server; these base directories do not
overlap.
(4) FTP file names are leveraged so that consumers and providers
can push and poll messages between them. For example, consumers post
request messages into FTP files with names that match the pattern
msg.<UUID>.<TIMESTAMP>, while the corresponding providers
retrieve these request messages with the same pattern from the agreed
location. A similar process occurs for the response routing from
providers to consumers. Numerous patterns are available for configuring
a particular message transfer (implemented as FTP BC extensibility
elements: ftp:message, ftp:transfer) so that posted messages are
polled according to their name pattern. Note, however, that name
patterns alone do not correlate request and response messages across FTP.
(5) Request-response correlation is supported by leveraging the names
of the FTP files that serve as intermediary message persistence
on the FTP server. The contract is described as follows:
On the consumer side:
Message routing starts with the consumer, which invokes a service
(INVOKE in a BPEL script). On the other side of the NMR, the FTP BC
OutboundProcessor accepts the request message, de-normalizes it,
and puts the message body (the payload) into an FTP file named
req.<uuid_instance_1> (see Figure 2 for examples). The consumer
thread (FTP BC OutboundProcessor) then spawns a ResponsePoller
thread, which starts polling for a response named
resp.<uuid_instance_1>. Note that <uuid_instance_1> is the same UUID
for request and response. When resp.<uuid_instance_1> becomes
available at the location agreed on between the consumer and
provider, it is fetched, and the request-response correlation
completes. The fetched message is wrapped in a normalized message
and sent back into the NMR using the same Message Exchange ID, so that
the response is available as the OUT variable of the INVOKE activity
in the BPEL script.
Figure 2. UUID tagged request files
On the provider side:
The provider polls (RECEIVE activity in BPEL) the message receiving area
agreed on between the consumer and provider for a request with an
FTP file name pattern like req.%u, where the suffix %u denotes any string that
matches the string form of a typical UUID (refer to JDK java.util.UUID, since 1.5);
see Figure 2 for examples.
Upon receipt of the request message (polled from the remote FTP directory), the UUID
tag is extracted from the file name and attached as metadata to the message
exchange. The payload is wrapped in a normalized message and sent into the NMR.
The request message participates in any business logic in the service unit, and
a response is produced and sent back to the NMR with the same message exchange
(ME, carrying meta information including the UUID tagged with the request).
The OutboundProcessor fulfills the contract by putting the message into an FTP
file named resp.<uuid_instance_1>; see Figure 3 for examples of UUID
tagged response messages.
Figure 3. UUID tagged response files
This completes the request-response correlation.
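The naming contract above can be sketched in a few lines of Java. This is a minimal illustration only: the class CorrelationNames and its methods are hypothetical names, not part of the FTP BC code base, and the sketch assumes the plain req.<uuid> / resp.<uuid> file names described above.

```java
import java.util.Optional;
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not FTP BC code): derives correlated request/response file names. */
final class CorrelationNames {
    // Canonical string form of java.util.UUID: 8-4-4-4-12 hexadecimal digits.
    private static final Pattern REQUEST = Pattern.compile(
        "req\\.([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})");

    /** Consumer side: name of the request file for a given UUID tag. */
    static String requestName(UUID id) {
        return "req." + id;
    }

    /** Name of the response file that carries the same UUID tag. */
    static String responseName(UUID id) {
        return "resp." + id;
    }

    /** Provider side: extract the UUID tag from a polled request file name. */
    static Optional<UUID> tagOf(String fileName) {
        Matcher m = REQUEST.matcher(fileName);
        return m.matches() ? Optional.of(UUID.fromString(m.group(1))) : Optional.empty();
    }

    /** Provider side: response file name that correlates with a request file name. */
    static Optional<String> responseFor(String requestFileName) {
        return tagOf(requestFileName).map(CorrelationNames::responseName);
    }
}
```

A consumer would call requestName(id) when posting and poll for responseName(id); a provider would call responseFor(polledName) when writing the reply.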
Note that the message correlation scheme can be applied to multi-hop
service invocation, i.e., while processing the first service
invocation, e.g., invoke service1.operation1, the provider might in
turn invoke other services, such as service2.operation2, as shown
below:
Figure 4. Two hop synchronous services invoking with message correlation
Note also that, although the UUID tagging mechanism described above is
provided by the FTP BC implementation, it is not the only means to
achieve message correlation. The service unit doing the message
transport can use information embedded inside the message
payload to correlate at the application level, but that is
beyond the scope of this document; see the scenarios on the FTP BC wiki
page for samples.
2. Clustering Support of FTP BC
A generalized layout of clustered JBI containers can be described
as follows:
M hosts => each host has N (AppServer + JVM) => each AppServer has
K clustered containers => each container has T FTPBC worker threads
accessing a specific FTP resource associated with an endpoint
where:
M >= 1, N >= 1, K >= 1, T >= 1
Or as shown in Figure 5.
Figure 5. Clustering of Service Units (Composite Application)
Due to the nature of the FTP protocol (each message transport interacts
with the persistence, and the service implementation is not behind the actual
FTP server), it is unlikely that an FTP load balancer runs within
an application server context as is common for HTTP. Instead, FTP BC clustering
means deploying identical JBI composite applications onto JBI containers
residing on one host or across multiple hosts. These composite application
instances have service operations bound to the FTP protocol via FTP BC.
FTP BC clustering still serves the purposes of HALB (High Availability
& Load Balancing): all the service provider instances poll requests
simultaneously, apply business logic to the requests simultaneously, and
post responses simultaneously to the message exchange area, hence
increasing the throughput (if properly tuned).
Similarly, all the service consumer instances post requests simultaneously,
and poll responses simultaneously from the message exchange area, hence
increasing the throughput (if properly tuned).
As illustrated in Figure 5, a crash of any consumer or provider instances
(but not all), a crash of any JBI containers (but not all) on any of the
physical hosts, or a crash of any of the hosts (but not all) will not stop
the message transport, hence high availability.
Because FTP BC pollers compete for inbound messages, messages are,
statistically, distributed evenly among the pollers, hence load
balancing is achieved.
3. Implementation Considerations
One concern when running FTPBC in a clustered context is synchronizing
access to and manipulation of resources on an FTP server, i.e., files and
directories, by concurrently running threads. A mechanism must be introduced to
prevent race conditions and to ensure the integrity of the messages and the
robustness of message routing. A component with such a mechanism is called
"Cluster Aware".
Here are three approaches to making FTPBC cluster aware:
- Concurrency Control By Timestamp Based Leasing
- Concurrency Control Using File Lock (NFS or local file system)
- Concurrency Control Using Database Table Based Blocking mechanism
Note: prototyping has shown that the "Timestamp Based Leasing" algorithm
fails to provide a robust mechanism for synchronizing concurrent
threads; it is listed here only for the historical record.
3.1. Concurrency Control Using File Lock (NFS or local file system)
At runtime, FTPBC comprises inbound threads and outbound threads.
Inbound threads consume messages (files on the FTP server) and send
them to their destination through the NMR.
Outbound threads take messages from the NMR and send them to
the destination, e.g. a target file under a specified directory
on a specific FTP server.
When these inbound and outbound threads run concurrently,
e.g., in a clustered context, it is important to make sure files are
read (for inbound) or written (for outbound) correctly. Specifically,
(1) output files should not get overwritten by simultaneous outbound threads
(2) input files should be picked up only once, by one inbound thread
To achieve (1), the end user can configure the output file name as
a pattern including a UUID (%u) or a sequence number (%{<seq_name>})
in the name, such that each output file written to the target
folder has a unique file name. For more details on creating and
referencing persisted sequence numbers, see "Persisted Sequence Numbers"
later in this document.
To achieve (2), a locking mechanism is needed to synchronize the threads polling
the same endpoint (input file). This locking mechanism is described as
follows:
For each registered inbound endpoint, there is a thread lock, T_LOCK, of
type java.util.concurrent.locks.ReentrantLock and a file lock, F_LOCK, of
type java.nio.channels.FileLock associated with it. The pair of locks is
used to protect the endpoint from concurrent polling.
T_LOCK is used to enforce a critical region among inbound threads polling
for the same input file in one process (e.g. one JVM). Note that the
current implementation creates only one inbound thread for each unique
inbound endpoint, but it does not have to.
The logic in the critical region performs the following:
(1) acquire file lock F_LOCK (a non-blocking try lock)
(2) if the lock is acquired, do (3), (4) and (5); otherwise, release T_LOCK
(3) poll the target; the target is decided as follows:
    if the poller is retrieving a request,
        target = ftp:message/@messageRepository/inbox/Inbound message file pattern or
        target = ftp:transfer/@receiveFrom
    if the poller is retrieving a response,
        target = ftp:message/@messageRepository/outbox/Outbound message file pattern or
        target = ftp:transfer/@sendTo
(4) if the message (ftp file) is retrieved successfully,
    archive the file (e.g. move it to a dedicated directory on the ftp server
    so that the same file won't be polled again)
(5) release F_LOCK
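The critical region above can be sketched as follows. This is a minimal sketch under stated assumptions, not the FTP BC source: the class name EndpointPollGuard and the Runnable standing in for steps (3) and (4) are hypothetical, and the lock file is assumed to live on a local or NFS path that every polling process can open.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical guard for one inbound endpoint (illustrative names only). */
final class EndpointPollGuard {
    private final ReentrantLock tLock = new ReentrantLock(); // T_LOCK: threads in this JVM
    private final Path lockFile;                             // backs F_LOCK: across processes

    EndpointPollGuard(Path lockFile) {
        this.lockFile = lockFile;
    }

    /**
     * Runs pollAndArchive (steps 3 and 4) only if F_LOCK could be acquired.
     * Returns true if the poll ran, false if F_LOCK is held elsewhere.
     */
    boolean tryPoll(Runnable pollAndArchive) {
        tLock.lock();                                        // enter the critical region
        try (FileChannel ch = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock fLock;
            try {
                fLock = ch.tryLock();                        // (1) non-blocking try lock
            } catch (OverlappingFileLockException e) {
                fLock = null;                                // already held inside this JVM
            }
            if (fLock == null) {
                return false;                                // (2) not acquired: skip this cycle
            }
            try {
                pollAndArchive.run();                        // (3) poll target, (4) archive
                return true;
            } finally {
                fLock.release();                             // (5) release F_LOCK
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            tLock.unlock();                                  // leave the critical region
        }
    }
}
```

A poller that gets false back simply waits for its next polling interval; nothing blocks on the file lock.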
Message Dispatching - Inbound Processor Threads
The T_LOCK + F_LOCK combination described above makes sure that only one thread
in one process can access the target directory and perform the above steps
at any time.
Note that after an input file is dispatched, it is uniquely identified in
a working directory and no longer visible in the input directory. This
prevents the same file from being picked up by more than one thread.
Message Consuming - Inbound Worker Threads
In this implementation, each inbound endpoint has one inbound processor
thread created for it, and the role of the inbound processor thread is
to dispatch the input files. After an input file is dispatched, it is
the inbound worker threads' responsibility to transport
the message (the content of the input file) to the destination through
the NMR.
Currently, there is a fixed number of inbound worker threads (5) for
each inbound processor.
The following illustrates the mechanism in a multi-thread,
multi-process context:
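The split between one dispatching processor thread and a fixed set of worker threads could look roughly like the sketch below. The class InboundDispatcher and the sendToNmr callback are hypothetical stand-ins; the only detail taken from the text is the fixed worker count of 5.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical dispatcher: one caller thread hands staged files to a fixed worker pool. */
final class InboundDispatcher {
    static final int WORKERS = 5; // fixed worker count stated in the text

    /** Hands every dispatched file name to the workers and waits until all are handled. */
    static void dispatchAll(List<String> stagedFiles, Consumer<String> sendToNmr) {
        ExecutorService workers = Executors.newFixedThreadPool(WORKERS);
        for (String name : stagedFiles) {
            // Each worker normalizes one message and sends it on (stand-in callback).
            workers.execute(() -> sendToNmr.accept(name));
        }
        workers.shutdown(); // accept no new work; queued tasks still run
        try {
            workers.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }
}
```

Because the file names are already unique in the working directory at this point, the workers need no further coordination with each other.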
+-----------------------------+
| PROCESS A                   |   T_LOCK_1
|  +-------------+            |       |
|  | EP1_IB_A_T1 |------------------->|
|  +-------------+            |       |
|                             |       |
|  +-------------+            |       |                  F_LOCK ------------> [NFS/Local File System]
|  | EP1_IB_A_T2 |------------------->|                     |                           |
|  +-------------+            |       |   EP1_IB_A_WINNER   |                           |
|                             |       |-------------------->|                           +----- lock file assoc'ed /w EP1
|  +-------------+            |       |                     |
|  | EP1_IB_A_T3 |------------------->|                     |
|  +-------------+            |       |                     |
|                             |       |                     |
|  +-------------+            |       |                     |
|  | EP1_IB_A_T4 |------------------->|                     |
|  +-------------+            |       |                     |
|                             |                             |           [FTP directory - assoc'ed /w EP1]
+-----------------------------+                             |     EP1_IB_A/B_WINNER     |
                                                            |-------------------------->|
+-----------------------------+                             |                           |
| PROCESS B                   |   T_LOCK_2                  |                           |
|  +-------------+            |       |                     |                           |
|  | EP1_IB_B_T1 |------------------->|                     |                           |
|  +-------------+            |       |                     |                           |
|                             |       |                     |                           |
|  +-------------+            |       |                     |                           |
|  | EP1_IB_B_T2 |------------------->|                     |                           |----- messages (files) polled
|  +-------------+            |       |   EP1_IB_B_WINNER   |
|                             |       |-------------------->|
|  +-------------+            |       |                     |
|  | EP1_IB_B_T3 |------------------->|                     |
|  +-------------+            |       |
|                             |       |
|  +-------------+            |       |
|  | EP1_IB_B_T4 |------------------->|
|  +-------------+            |       |
|                             |
+-----------------------------+
The attributes "persistenceBaseLocation" and "lockName" are added to
the FTPBC WSDL extensibility element ftp:address.
"persistenceBaseLocation" and "lockName" together map to the lock
file required by the mechanism.
Specifically, "persistenceBaseLocation" is a path pointing to a file
system directory where the lock file named by "lockName" is
created.
The lock file must reside on a local file system or an NFS mount to which
all the threads for the endpoint have appropriate access rights.
3.2. Concurrency Control By Timestamp Based Leasing
Outbound Processing:
When clustered, component instances doing outbound message transport
(PUT to a target file on the FTP server) need to make sure that messages
do not overwrite each other when they end up in the same target directory.
This can be achieved by UUID tagging the target file (and also UUID tagging
intermediary files such as the staging file, archived file, etc.).
Other schemes include (but are not limited to) tagging the file name with
a unique sequence number, or leveraging other name patterns, e.g., a
timestamp (%y%y%y%y%M%M%d%d%h%h%m%m%s%s%n%n%n%n).
Inbound Processing:
When clustered, component instances poll the target directory for files
matching a specified pattern, and a race condition exists among the competing
readers. The “timestamp based leasing” algorithm was intended to provide
the needed synchronization. It is listed here in pseudo code (again,
file name tagging and FTP rename are employed to achieve
the desired robustness):
Figure 6. FTP BC target polling algorithm (TPA)
The algorithm is executed by the FTP BC inbound processor thread
as shown below (main thread loop in run()):
Figure 7. FTP BC inbound main loop
Message Staging:
Staging for PUT:
When some components (e.g. a consumer sending a request,
or a provider sending a response) put messages into a target
directory and other components (e.g. a consumer polling for a
response, or a provider polling for a request) poll messages
from the target directory, staging is necessary to make
sure a transferred message only becomes visible to
the polling components after its upload is complete.
Staging for POLL:
When components (e.g. a consumer polling for a response, or a provider
polling for a request) poll/get messages from a target directory,
POLL staging is necessary to make sure the message is moved
away from the target directory as soon as it is selected by the
poller, so that the poller thread leaves the directory
listing operation as soon as possible. Directory listing
is a critical region synchronizing the concurrent pollers; the
relatively lengthy file read is performed from the staging
area by the poller that selected the target file.
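Both staging steps rely on a rename that moves a complete file in one step. The sketch below illustrates the idea on a local file system using java.nio.file as a stand-in for the FTP STORE and RENAME commands; it is not FTP client code, and the class StagedTransfer and its directory arguments are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Local file system stand-in for FTP staging (illustrative only). */
final class StagedTransfer {
    /** PUT: write into the staging directory, then rename into the target directory. */
    static Path put(Path stageDir, Path targetDir, String name, String payload) {
        try {
            Files.createDirectories(stageDir);
            Files.createDirectories(targetDir);
            // The file is invisible to pollers of targetDir while it is being written.
            Path staged = Files.write(stageDir.resolve(name),
                payload.getBytes(StandardCharsets.UTF_8));
            // Expose the complete file in one step (stands in for FTP RENAME).
            return Files.move(staged, targetDir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** POLL: rename the selected file into the staging directory first, then read it. */
    static String poll(Path targetDir, Path stageDir, String name) {
        try {
            Files.createDirectories(stageDir);
            // After this move, other pollers listing targetDir no longer see the file.
            Path staged = Files.move(targetDir.resolve(name), stageDir.resolve(name),
                StandardCopyOption.ATOMIC_MOVE);
            return new String(Files.readAllBytes(staged), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The staging and target directories are assumed to be on the same file system here, which is what makes the atomic move possible.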
Message Archiving:
In an FTP directory where messages are put and retrieved concurrently
by threads within a component, it is vital to clean up
the messages that have already been processed. You can:
(a) archive them to a dedicated directory such as <message
exchange area>/inarchive for requests,
and <message exchange area>/outarchive for responses,
or:
(b) just delete them if there is no need to keep
track of what has been processed.
Message archiving and POLL staging help to reduce the
number of entries under a message exchange directory
(such as “inbox” or “outbox” in the working directories
layout described in section 1), which significantly reduces
the cost of the FTP directory listing operation.
4. Persisted Sequence Numbers
At runtime, FTPBC comprises multiple processing threads
which do both message consuming (polling an FTP directory for
a specified file name at an interval) and message provisioning
(writing a message to an FTP directory under a specified file name).
The consuming threads are called inbound processors, and the
provisioning threads are called outbound processors. The
input/output file name can be a literal or a 'pattern'.
File name is literal:
When the file name is a literal, it is used as is, i.e.,
if the input file name is a literal, the inbound processor
polls the ftp directory for a file by that name and
'normalizes' the content of the file into a normalized
message; if the output file name is a literal, the outbound
processor writes the de-normalized message to a file
by that name.
File name is pattern:
Most of the time, the file names specified for inbound or
outbound processing are patterns. The pattern is a proprietary
mechanism of FTPBC.
Used by an inbound processor, the pattern serves as a filter,
i.e., if a file name matches the pattern, the file is selected by
the inbound processor, and its content is read,
normalized and sent into the NMR, etc.
Used by an outbound processor, the pattern serves as a concrete
name generator, i.e., the special pattern symbols that appear in the
file name pattern specified as output file, such as %u
indicating a UUID or %{<seq_name>} indicating a persisted
sequence number, are expanded (the symbols are substituted
with their current values) to derive a concrete file name,
to which the de-normalized message is written.
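The filter role of a pattern on the inbound side can be illustrated as below. This is a sketch only: the class NamePatternFilter is hypothetical, it handles just the %u symbol, and translating a pattern into a regular expression is an assumption about one possible implementation, not a description of how FTPBC does it.

```java
import java.util.regex.Pattern;

/** Hypothetical inbound filter: turns a name pattern containing %u into a regex. */
final class NamePatternFilter {
    // String form of java.util.UUID: 8-4-4-4-12 hexadecimal digits.
    private static final String UUID_REGEX =
        "[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}";

    /** Builds a regex from a name pattern; only the %u symbol is handled in this sketch. */
    static Pattern compile(String namePattern) {
        StringBuilder regex = new StringBuilder();
        int from = 0;
        int at;
        while ((at = namePattern.indexOf("%u", from)) >= 0) {
            // Literal text is quoted so that '.' and other metacharacters match themselves.
            regex.append(Pattern.quote(namePattern.substring(from, at))).append(UUID_REGEX);
            from = at + 2;
        }
        regex.append(Pattern.quote(namePattern.substring(from)));
        return Pattern.compile(regex.toString());
    }

    /** True if the file name is selected by the pattern. */
    static boolean accepts(String namePattern, String fileName) {
        return compile(namePattern).matcher(fileName).matches();
    }
}
```

For example, the pattern req.%u would select req.4d4f6177-f085-456b-94d1-1ed22baae814 from a directory listing and skip files whose names carry a different prefix.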
For a full list of patterns supported by FTPBC, refer to:
FTPBC WSDL Extensibility Elements Reference
messageName is the name of the file into which a message is put,
usually given in the form of a name pattern,
where 'pattern' is a string containing special
characters escaped by a percent sign. The following
are all the symbols supported:
1. time stamps (%[GyMdhHmsSEDFwWakKz]). Java simple date/time format is used:
--------------------------------------------------------------------------------------------
Letter  Date or Time Component  Presentation        Examples
--------------------------------------------------------------------------------------------
G       Era designator          Text                AD
y       Year                    Year                1996; 96
M       Month in year           Month               July; Jul; 07
w       Week in year            Number              27
W       Week in month           Number              2
D       Day in year             Number              189
d       Day in month            Number              10
F       Day of week in month    Number              2
E       Day in week             Text                Tuesday; Tue
a       Am/pm marker            Text                PM
H       Hour in day (0-23)      Number              0
k       Hour in day (1-24)      Number              24
K       Hour in am/pm (0-11)    Number              0
h       Hour in am/pm (1-12)    Number              12
m       Minute in hour          Number              30
s       Second in minute        Number              55
S       Millisecond             Number              978
z       Time zone               General time zone   Pacific Standard Time; PST; GMT-08:00
Z       Time zone               RFC 822 time zone   -0800
--------------------------------------------------------------------------------------------
2. UUID %u, which is substituted by a UUID value compliant with the Java 1.5 UUID.
3. sequence number reference %0, %1, %2, %3, %4, %5, %6, %7, %8, %9;
this symbol is replaced by the current
value of the sequence number, which is an integer count that increments
after each reference.
As described in 3), a WSDL author can specify a sequence number in
ftp:message->messageName or ftp:transfer->sendTo so that a sequence
number is embedded in the file name. The following types
of sequence number should be supported:
- transient sequence - pattern character %d
** per endpoint
** incremented on each reference
** transient (non-persistent), meaning it resets to 0 after
   application shutdown or server shutdown (crash)
** concurrency controlled, meaning access to the in-memory
   sequence count is synchronized among multiple threads
- local sequence type 1 - pattern character %0, %1, %2, ..., %9
** per endpoint
** incremented on each reference
** persisted under the composite application deployment location
   (hence, the sequence will not survive application un-deployment,
   but will survive application shutdown or server shutdown (crash))
** concurrency controlled, meaning access to the persisted sequence
   count is synchronized among multiple threads
- local sequence type 2 - pattern character %{0}, %{1}, %{2}, ..., %{9}
** per endpoint
** incremented on each reference
** persisted outside the composite application deployment location
   (hence, the sequence will survive application un-deployment
   and application shutdown or server shutdown (crash))
** concurrency controlled, meaning access to the persisted
   sequence count is synchronized among multiple threads
- global sequence - pattern character %{<seq_name>}
** global, meaning the same named sequence can be referenced
   by endpoints within one application and/or across different
   applications
** persisted in a persistence store that is accessible to all
   the threads/processes that refer to the sequence (the sequence
   will survive application un-deployment and application
   shutdown or server shutdown (crash))
** concurrency controlled, meaning access to the persisted
   sequence count is synchronized among multiple threads/processes
Reference Persisted Sequence:
%{<seq_name>} indicates a reference to a persisted sequence
number by name <seq_name>, where <seq_name>
is a sequence name whose current value is used to
substitute the reference and is then incremented by one. The
lexical definition of <seq_name> is:
<seq_name> =: (0-9a-zA-Z.-_)+
that is, one or more characters drawn from digits, letters, '.', '-' and '_'.
Sequence is persisted so that it survives application
shutdown and/or undeploy, also the jbi container shutdown
(application server shutdown), etc.
Here are some examples of name pattern expression with
sequences:
Example 1:
req.%u.%{myseq} – where %{myseq} is a reference to a sequence
by name “myseq”.
Example 2:
purchase_order_%{order_number} – where %{order_number} is a
reference to a sequence by name “order_number”.
At runtime, every time a name pattern is de-referenced,
the sequence references are substituted by the current
value of the sequences and the sequences increment, e.g.,
the above expressions can generate name instances as
follows:
A sample run of example 1:
req.%u.%{myseq}
...
myseq = 1 ==> req.4d4f6177-f085-456b-94d1-1ed22baae814.1
myseq = 2 ==> req.d97c1a77-af8a-4dd2-8856-80e7f8b0be75.2
...
myseq = 256 ==> req.d97c1a77-af8a-4dd2-8856-87e7f8b0be65.256
...
A sample run of example 2:
purchase_order_%{order_number}
...
order_number = 1056 ==> purchase_order_1056
order_number = 1057 ==> purchase_order_1057
...
order_number = 2390 ==> purchase_order_2390
...
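The expansion of a name pattern on the outbound side can be sketched as below. The class NamePatternExpander is hypothetical and handles only %u and %{<seq_name>}; where the sequence values come from is left to a caller-supplied function, since persistence is a separate concern.

```java
import java.util.UUID;
import java.util.function.ToLongFunction;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical outbound name generator: expands %u and %{seq_name} symbols. */
final class NamePatternExpander {
    // Either %u, or %{seq_name} where seq_name uses digits, letters, '.', '-', '_'.
    private static final Pattern SYMBOL = Pattern.compile("%u|%\\{([0-9a-zA-Z._-]+)\\}");

    /**
     * Substitutes every symbol in the pattern. nextValue returns the current value
     * of the named sequence and is expected to increment it as a side effect.
     */
    static String expand(String namePattern, ToLongFunction<String> nextValue) {
        Matcher m = SYMBOL.matcher(namePattern);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = (m.group(1) == null)
                ? UUID.randomUUID().toString()                        // %u
                : Long.toString(nextValue.applyAsLong(m.group(1)));   // %{seq_name}
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With a sequence function whose order_number currently stands at 1056, expanding purchase_order_%{order_number} twice yields purchase_order_1056 and then purchase_order_1057.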
Mapping of Sequence to Persistence Store:
In the current implementation, an ASCII file is used
as the persistence storage for a sequence. The mapping of
sequences to file system files is as follows:
Local Sequence Type 1:
the persistence store for this type of sequence
(e.g. %0, %1, %2, ... %9) is a flat ASCII file
in a per-endpoint unique directory under the service
assembly deployment location, as shown below:
${com.sun.aas.instanceRoot}/jbi/service-assemblies/HelloDukeCompApplication/HelloDukeCompApplication-sun-ftp-binding/sun-ftp-binding/{per-endpoint-UUID}/sequence0
${com.sun.aas.instanceRoot}/jbi/service-assemblies/HelloDukeCompApplication/HelloDukeCompApplication-sun-ftp-binding/sun-ftp-binding/{per-endpoint-UUID}/sequence1
${com.sun.aas.instanceRoot}/jbi/service-assemblies/HelloDukeCompApplication/HelloDukeCompApplication-sun-ftp-binding/sun-ftp-binding/{per-endpoint-UUID}/sequence2
${com.sun.aas.instanceRoot}/jbi/service-assemblies/HelloDukeCompApplication/HelloDukeCompApplication-sun-ftp-binding/sun-ftp-binding/{per-endpoint-UUID}/sequence3
${com.sun.aas.instanceRoot}/jbi/service-assemblies/HelloDukeCompApplication/HelloDukeCompApplication-sun-ftp-binding/sun-ftp-binding/{per-endpoint-UUID}/sequence4
... ... ...
${com.sun.aas.instanceRoot}/jbi/service-assemblies/HelloDukeCompApplication/HelloDukeCompApplication-sun-ftp-binding/sun-ftp-binding/{per-endpoint-UUID}/sequence9
A sample deployment of a composite application "HelloDuke"
with a service that has a local sequence type 1 reference:
the per-endpoint persistence store for %0 in the HelloDuke
composite application:
C:\GlassFishESB\glassfish\domains\domain1\jbi\service-assemblies\HelloDukeCompApplication\HelloDukeCompApplication-sun-ftp-binding\sun-ftp-binding\449b888d-46aa-338b-8397-ada5cd19f683
Note: the UUID 449b888d-46aa-338b-8397-ada5cd19f683 is calculated from the endpoint unique name, and it stays the same as long as the endpoint unique name does not change.
Local Sequence Type 2:
the persistence store for this type of sequence
(e.g. %{0}, %{1}, %{2}, ... %{9}) is a flat ASCII
file in a per-endpoint unique directory outside the
service assembly deployment location, as shown
below:
${com.sun.aas.instanceRoot}/applications/jbi_apps_persistence/HelloDukeCompApplication/HelloDukeCompApplication-sun-ftp-binding/sun-ftp-binding/{per-endpoint-UUID}/sequence0
${com.sun.aas.instanceRoot}/applications/jbi_apps_persistence/HelloDukeCompApplication/HelloDukeCompApplication-sun-ftp-binding/sun-ftp-binding/{per-endpoint-UUID}/sequence1
${com.sun.aas.instanceRoot}/applications/jbi_apps_persistence/HelloDukeCompApplication/HelloDukeCompApplication-sun-ftp-binding/sun-ftp-binding/{per-endpoint-UUID}/sequence2
${com.sun.aas.instanceRoot}/applications/jbi_apps_persistence/HelloDukeCompApplication/HelloDukeCompApplication-sun-ftp-binding/sun-ftp-binding/{per-endpoint-UUID}/sequence3
${com.sun.aas.instanceRoot}/applications/jbi_apps_persistence/HelloDukeCompApplication/HelloDukeCompApplication-sun-ftp-binding/sun-ftp-binding/{per-endpoint-UUID}/sequence4
... ... ...
${com.sun.aas.instanceRoot}/applications/jbi_apps_persistence/HelloDukeCompApplication/HelloDukeCompApplication-sun-ftp-binding/sun-ftp-binding/{per-endpoint-UUID}/sequence9
C:\GlassFishESB\glassfish\domains\domain1\applications\jbi_apps_persistence\HelloDukeCompApplication\HelloDukeCompApplication-sun-ftp-binding\sun-ftp-binding\449b888d-46aa-338b-8397-ada5cd19f683
Global Sequence:
a global sequence with name <seq_name>
is persisted in a file named <seq_name>.
The parent directory is given by the
corresponding ftp:address/@persistenceBaseLocation
in the binding. Preferably, ftp:address/@persistenceBaseLocation
is a path pointing to a file system location outside of the
application server installation location.
The format of the number:
the current value of the sequence number is stored as a string.
Initial value of a persisted sequence:
the initial value of a persisted sequence is 0, and it increments
by 1 every time it is referenced. To make a sequence start with
a number > 0, the user can edit the persistence storage file and put
a start value there; do this only when the file is not in use.
Concurrency control of access to sequence storage file:
access to a specific sequence's persistence file must be
synchronized for concurrent read/write, meaning that at any time the
file is read and updated by only a single thread across clusters and/or
JVMs (which can be on one host or multiple hosts).
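A file-backed sequence with this kind of access control might be sketched as follows. The class PersistedSequence is hypothetical; it assumes the value is stored as an ASCII decimal string starting at 0, uses a blocking java.nio.channels.FileLock on the sequence file itself, and assumes one instance per sequence file in each JVM.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical file-backed sequence; use one instance per sequence file in a JVM. */
final class PersistedSequence {
    private final Path file;

    PersistedSequence(Path file) {
        this.file = file;
    }

    /** Returns the current value and stores current + 1, under an exclusive file lock. */
    synchronized long next() {                       // serializes threads in this JVM
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                 StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileLock lock = ch.lock()) {            // blocks while another process holds it
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) >= 0) {
                // keep reading until the whole file is in the buffer
            }
            String text = new String(buf.array(), 0, buf.position(),
                StandardCharsets.US_ASCII).trim();
            long current = text.isEmpty() ? 0L : Long.parseLong(text); // initial value is 0
            ch.truncate(0);
            ch.write(ByteBuffer.wrap(
                Long.toString(current + 1).getBytes(StandardCharsets.US_ASCII)), 0);
            ch.force(true);                          // flush before the lock is released
            return current;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Whether file locks are honored across hosts depends on the NFS setup, so the cross-host case should be verified in the target environment.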
5. Transactional Messaging
It is desirable to have ACID properties (Atomicity, Consistency,
Isolation, Durability) for FTPBC messaging; specifically, each
message consuming and message provisioning operation should have the
characteristics of ACID.
Note, in the context of FTPBC:
- Message consuming involves the following operations:
  - list directory
  - if a match is found, move it to the staging area (FTP RENAME)
    - other readers can no longer see it
  - retrieve the message from the staging area
  - move the file in staging to an archive or delete it
- Message provisioning involves the following operations:
  - store the message (target file) in a staging area
  - move the message to the destination (FTP RENAME)
As can be seen, message consuming and provisioning may both
involve FTP write operations (STORE, RENAME, DELETE).
Assumption:
the FTP RENAME is atomic in that, if it succeeds,
the file is moved from src to dest; otherwise, the file stays
at src.
The mechanism:
The approach used for transactional messaging is FTP operation journaling.
For each registered endpoint, a journal is created that keeps track of the sequence
of FTP operations performed by FTPBC for message consuming or provisioning.
When a message consuming/provisioning operation completes, the corresponding journal is
cleaned up, ready for the next message. Otherwise, the journal is used either to roll back
to the previous consistent state or to recover to the correct state.
Persistence Design:
The file system is used as the persistence storage, and the persistence storage is per
application server. For example, for GlassFish, ${com.sun.aas.instanceRoot} can be used
as the base directory for the persistence storage.
The base directory of the journals for FTPBC is:
${com.sun.aas.instanceRoot}/persistenceStore/FTPBC_JOURNALS
The journal store consists of one endpoint_journal_registry, which is a map of
endpoint unique name <==> journal entry name. Each journal entry is an
ASCII file, and its name is a UUID derived from the associated endpoint
unique name.
Each journal contains the log of FTP write operations for one message
receiving / sending operation. It can be used for recovery in case of a system
crash, and also to roll back the FTP operations concerned.
The registry grows or shrinks as endpoints are deployed
or undeployed, and the corresponding journal entries are created or removed.
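A per-endpoint journal entry along these lines could be sketched as below. Everything here is illustrative: the class EndpointJournal, the one-operation-per-line layout, and the use of a name-based UUID for the entry file name are assumptions, since the text only states that the name is a UUID derived from the endpoint unique name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.List;
import java.util.UUID;

/** Hypothetical journal entry for one endpoint: one FTP write operation per line. */
final class EndpointJournal {
    private final Path entry;

    EndpointJournal(Path journalBaseDir, String endpointUniqueName) {
        // Assumption: a name-based UUID; the text only says the name is "derived".
        UUID id = UUID.nameUUIDFromBytes(endpointUniqueName.getBytes(StandardCharsets.UTF_8));
        this.entry = journalBaseDir.resolve(id.toString());
    }

    /** Appends one FTP write operation (for example "RENAME src dest") to the journal. */
    void record(String operation) {
        try {
            Files.createDirectories(entry.getParent());
            Files.write(entry, Collections.singletonList(operation), StandardCharsets.US_ASCII,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Operations left behind by an unfinished transfer; empty when the journal is clean. */
    List<String> pending() {
        try {
            return Files.exists(entry)
                ? Files.readAllLines(entry, StandardCharsets.US_ASCII)
                : Collections.<String>emptyList();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Cleans up the journal once the message transfer has completed. */
    void complete() {
        try {
            Files.deleteIfExists(entry);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On restart, a non-empty pending() list signals an interrupted transfer; because RENAME is assumed atomic, each recorded operation can be checked against the FTP server to decide between rollback and recovery.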
6. References
- JBI Specification 1.0 (JSR 208)
- FTP BC Functional Specification