Derivation

In the remainder of this section, the items following the bullets are an English description of the model. The intervening comments are the rationale for the model.

CMA Purpose.

The purpose of the CMA is to enable CAETI applications to exchange messages.

"This architecture covers a minimalist methodology to give CAETI Applications the ability to receive messages and respond to them.... To preserve flexibility, interpretation of the meaning of the messages, processes invoked on receipt of an message, and contents of returned messages are up to individual collaborating CAETI developers." -- [CAT, 96a, p2]

The preceding is the rationale for making Model 2 describe a TCP-like exchange of KQML-like messages. First, this makes the message transmission part of the CMA more strongly resemble TCP. TCP [Postel, 81a] is the well-established data transmission protocol upon which the Internet is based. Second, this more strongly separates the concern of message transmission from the concern of message content. Message content is based on KQML [Finin, 93a] which is a draft specification of an emerging language for communication among knowledge based agents.

Set of Processes.

The CAETI system consists of a set of interacting processes. CAETI applications are processes.

"'Grow ears and a nervous system' (~ becomes a callable process)" -- [Bellman, 95]

[CAT, 96a] introduces the term "module" as follows: "The term 'module' is used herein to refer interchangeably to whole systems as well as their subcomponents." [CAT, 96a, p3], and the "Message Assumptions" section of [CAT, 96a, pp.3-4] implies that "modules" are the parties that exchange messages.

Model 2 uses the term "processes" to refer to the parties that exchange messages rather than "modules." This terminology is used for several reasons. First, in so far as possible, Model 2 uses transmission concepts and terminology that are consistent with TCP as defined in RFC 793 [Postel, 81a]. (See Attachment C for the relevant excerpts.) The RFC 793 glossary defines both "process" and "module" as follows:

module: "An implementation, usually in software, of a protocol or other procedure."

process: "A program in execution. A source or destination of data from the point of view of the TCP or other host-to-host protocol."

Second, "process" and "module" appear to be interchangeable in [CAT, 96a]. Third, "process" is the term introduced by Bellman.

Process Interaction.

CAETI processes are required to interact only by exchanging CMA messages.

"There is no explicit concept of integration beyond the message exchange." -- [Harbison, 95b]

TCP-Like Message Transport.

CAETI processes are required to exchange messages only over connections that can be supported by TCP/IP. This does not rule out other means of interprocess communication.

"Communication transport layer - TCP/IP, ..." -- [Bellman, 95]. [Bellman, 95] also identifies "SMTP" and "HTTP (desirable)" as communication transport requirements. All of these can be supported by TCP/IP.

"This is based upon a developer's choice of a TCP/IP, SMTP/POP3, or HTTP transport layer to enable transmission of some basic (abstract) interprocess messages." -- [CAT, 96a, p2]

"In order to be sent by TCP/IP sockets, SMTP, or HTTP, the message (including all parameters) must be ...." -- [CAT, 96a, p5]

"All three CAETI Transport layers use TCP/IP at some level." -- [CAT, 96a, Footnote p5] What seems to be envisioned in all three source documents is a transmission service for CAETI messages that can be implemented either by HTTP, by SMTP/POP3, or by TCP/IP.

Interprocess Connections.

A connection is a logical communication path that is identified by a pair of CAETI addresses.

"The interprocess messaging is based on socket, SMTP or HTTP connections between clients and servers; one-to-many, or peer to peer." -- [CAT, 96a, p.11]

The Model 2 view of interprocess communication deviates somewhat from [CAT, 96a]. Instead of phrasing communication in terms of "modules" connected by "links", CMA message transmission service in Model 2 is identical to TCP service with the following exceptions:

Exception 1: The unit of transmission is a whole CMA message rather than an octet.
Exception 2: Messages are not required to arrive in the order in which they were sent.
Exception 3: Message delivery is not required to be reliable.

Exception 1 is the fundamental abstraction of CMA message transmission. CMA transmits whole messages in the same kind of way as TCP transmits octets. Thus, CMA can be implemented directly by a TCP/IP service with the addition of a service that converts between CMA messages and octets. Exceptions 2 and 3 are included to enable CMA transmission also to be done via SMTP/POP3 or HTTP.
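
The source documents do not specify how the conversion between CMA messages and octets is done. The following is a minimal sketch of one possibility, assuming a hypothetical framing in which each ASCII message is preceded by a 4-byte big-endian length. The names HEADER, encode_message and MessageDecoder are illustrative and do not come from [CAT, 96a] or RFC 793.

```python
import struct

# Assumption: a 4-byte big-endian length prefix. The CMA sources define no framing.
HEADER = struct.Struct(">I")


def encode_message(message: str) -> bytes:
    """Convert one CMA message into octets: length header, then ASCII payload."""
    payload = message.encode("ascii")  # CMA messages are sequences of ASCII characters
    return HEADER.pack(len(payload)) + payload


class MessageDecoder:
    """Reassemble whole messages from octets that arrive in arbitrary chunks."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, octets: bytes) -> list:
        """Add received octets and return every message completed so far."""
        self._buffer.extend(octets)
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # the body of the next message has not fully arrived
            messages.append(self._buffer[HEADER.size:end].decode("ascii"))
            del self._buffer[:end]
        return messages
```

Because the decoder buffers partial input, a message larger than a local TCP/IP buffer can be fed in as many pieces as the transport happens to deliver.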

This TCP-based message transmission service is described in terms of "processes" exchanging messages over "connections." The service in [CAT, 96a] is described in terms of "modules" exchanging messages over "links." An argument is given below that the two services are comparable except that "connections" are bi-directional, whereas "links" are unidirectional. However, bi-directional communication is required among CAETI processes because they are required to respond to incoming messages.

"Can receive and respond to a set of 6-10 messages (responses may be negative)" -- [Bellman, 96a]

So, instead of using a pair of uni-directional links for bi-directional communication, Model 2 just uses bi-directional connections. This introduces no additional requirement.

To see that the two services are comparable, consider each of the message assumptions stated in [CAT, 96a, pp3-4].

"Choice of transport protocol is up to the application developer and is independent of the messages." The TCP-based service may be implemented by a TCP service (together with a message to octet conversion service), but it may also be implemented in other ways.

"Modules are connected by unidirectional communication links that carry discrete messages." The TCP-based connections are bi-directional instead of unidirectional. ("A connection can be used to carry data in both directions, that is, it is 'full duplex'." -- [Postel, 81a, p10])

"These links may have a non-zero message transport delay associated with them." ("In general the TCPs decide when to block and forward data at their own convenience." -- [Postel, 81a, p4])

"When a module sends a message it may direct to which outgoing link the message goes." The TCP SEND command that a user process uses to transmit data over a connection contains a "local connection name" parameter, which is a local name for the connection [Postel, 81a, p46].

"When a module receives a message, it knows from which incoming link the message arrived." The TCP RECEIVE command also contains a "local connection name" parameter [Postel, 81a, p48].

"Messages are sent and arrive asynchronously." ("It is also expected that the TCP can asynchronously communicate with application programs." -- [Postel, 81a, p3])

"Messages generally arrive in the order they were sent, but may not." This deviates from TCP, and it is the reason for Exception 2 in the TCP-based service.

"Message delivery is generally reliable, but programs should be designed to be robust enough to work stand-alone if the transport layer is unavailable. It is understood that this may cause degraded performance or force applications to run with older data." This deviates from TCP, and it is the reason for Exception 3 in the TCP-based service.

The CMA message assumptions in [CAT, 96a] appear to be derived from the KQML transport assumptions which also are stated in terms of "links."

These are an abstraction which can be implemented in a variety of ways. "... the links could be TCP/IP connections over the Internet,.... The links could be email paths... The links could be UNIX IPC connections.... Or, the links could be high-speed switches in a multiprocessor...." [Finin, 93a, p6]

If the KQML links were bi-directional instead of unidirectional, the TCP-based service without Exceptions 2 and 3 would be comparable to the KQML assumptions.

CAETI Addresses.

There is a CAETI address space that identifies the end points of connections. Every element of the address space is unique.

None of the source documents resolves the CAETI address space, so Model 2 also leaves it unresolved. Model 2 simply assumes that a CAETI address space exists and that every address in the space is unique.

RFC 793 very carefully defines the end points of TCP connections to be "sockets", and a socket is "An address which specifically includes a port identifier, that is, the concatenation of an Internet Address with a TCP port." [Postel, 81a, p84] Thus, sockets are the address space for TCP connections, and each socket is unique.

None of the three source documents, [Bellman, 96a], [Harbison, 96a] or [CAT, 96a], raises the issue of the CAETI address space. However, identifying this address space is critical because it identifies the end points of CMA message exchange. If CMA message exchange were required only over TCP connections, TCP sockets would be the natural choice for the CAETI address space. However, because exchange via SMTP/POP3 and HTTP also is to be allowed, an address space of TCP sockets is not adequate. For example, email for many different users may be transmitted over the same TCP socket.

Message Elements.

Every CMA message is composed of a sequence of ASCII characters. Messages are of variable length and potentially large.

"In order to be sent by TCP/IP sockets, SMTP, or HTTP, the message (including all parameters) must be represented as a sequence of ASCII characters. Messages can be of variable length, and are potentially large. The messages may be larger than local TCP/IP buffers." -- [CAT, 96a, p5]

KQML Message Syntax.

Every CMA message has the following syntax:

<performative> ::= (<word> {<whitespace> :<word> <whitespace> <expression>}*)
<expression> ::= <word> | <quotation> | <string> | (<word> {<whitespace> <expression>}*)
<word> ::= <character><character>*
<character> ::= <alphabetic> | <numeric> | <special>
<special> ::= < | > | = | + | - | * | / | & | ^ | ~ | _ | @ | $ | % | : | . | ! | ?
<quotation> ::= '<expression> | `<comma-expression>
<comma-expression> ::= <word> | <quotation> | <string> | ,<comma-expression>
                     | (<word> {<whitespace> <comma-expression>}*)
<string> ::= "<stringchar>*" | #<digit><digit>*"<ascii>*
<stringchar> ::= \<ascii> | <ascii>-\-"

Note that the second form of <string> has an additional restriction: the number named by the <digit>s is the number of <ascii> characters that must follow, e.g. #5"abcde

This syntax is taken directly from [CAT, 96a, p5], which in turn takes it from [Finin, 93a, p8].
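
To make the syntax concrete, the following is a minimal sketch of a parser for a <performative>. It is an illustration under stated assumptions, not a reference implementation. It handles words, nested lists and both forms of <string>, but it omits the <quotation> forms and accepts words more leniently than the grammar does. All function names are hypothetical.

```python
WHITESPACE = " \t\r\n"
DELIMITERS = WHITESPACE + '()"'


def skip_ws(text, pos):
    while pos < len(text) and text[pos] in WHITESPACE:
        pos += 1
    return pos


def parse_word(text, pos):
    # Lenient: any run of non-delimiter characters is treated as a <word>.
    start = pos
    while pos < len(text) and text[pos] not in DELIMITERS:
        pos += 1
    if pos == start:
        raise ValueError(f"expected a word at offset {start}")
    return text[start:pos], pos


def parse_string(text, pos):
    """Parse the first <string> form; text[pos] is the opening double quote."""
    pos += 1
    chars = []
    while pos < len(text) and text[pos] != '"':
        if text[pos] == "\\":
            pos += 1  # a backslash escapes the next character
            if pos >= len(text):
                break
        chars.append(text[pos])
        pos += 1
    if pos >= len(text):
        raise ValueError("unterminated string")
    return "".join(chars), pos + 1


def parse_counted_string(text, pos):
    """Parse the second <string> form: '#', a count, '"', then exactly that many characters."""
    start = pos + 1
    pos = start
    while pos < len(text) and text[pos] in "0123456789":
        pos += 1
    if pos == start or pos >= len(text) or text[pos] != '"':
        raise ValueError("malformed counted string")
    begin = pos + 1
    end = begin + int(text[start:pos])
    if end > len(text):
        raise ValueError("counted string is shorter than its count")
    return text[begin:end], end


def parse_expression(text, pos):
    if pos >= len(text):
        raise ValueError("unexpected end of message")
    if text[pos] == '"':
        return parse_string(text, pos)
    if text[pos] == "#":
        return parse_counted_string(text, pos)
    if text[pos] == "(":
        items = []
        pos = skip_ws(text, pos + 1)
        while pos < len(text) and text[pos] != ")":
            item, pos = parse_expression(text, pos)
            items.append(item)
            pos = skip_ws(text, pos)
        if pos >= len(text):
            raise ValueError("missing ')'")
        return items, pos + 1
    return parse_word(text, pos)


def parse_performative(text):
    """Return (first word, {parameter name: value}) for one message."""
    pos = skip_ws(text, 0)
    if pos >= len(text) or text[pos] != "(":
        raise ValueError("a performative must start with '('")
    kind, pos = parse_word(text, skip_ws(text, pos + 1))
    params = {}
    pos = skip_ws(text, pos)
    while pos < len(text) and text[pos] != ")":
        if text[pos] != ":":
            raise ValueError(f"expected ':' at offset {pos}")
        name, pos = parse_word(text, pos + 1)
        params[name], pos = parse_expression(text, skip_ws(text, pos))
        pos = skip_ws(text, pos)
    if pos >= len(text):
        raise ValueError("missing ')'")
    if skip_ws(text, pos + 1) != len(text):
        raise ValueError("trailing characters after performative")
    return kind, params
```

In this sketch words and strings are both returned as Python strings, so the distinction between them is lost after parsing. A fuller implementation would probably keep it.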

Message Types.

CAETI processes are required only to receive and respond to the following types of messages: tell, ask, do, subscribe, sorry. The type of the message is identified by the first <word> in the <performative>. "Can receive and respond to 6-10 messages (responses may be negative)" -- [Bellman, 96a].

"A Minimum Set of Message Performatives {tell, ask, do, subscribe, sorry}

Components need to tell things to one another, and to ask for information. Tell can be used to reply to ask queries. Modules can request other modules to do actions, to achieve goals, or invoke a remote method or command. Modules may subscribe to changes in other modules' states, saying, in effect, "ask for notification of when your state changes in such-and-such a way". Finally, modules need to be able to report errors in processing or understanding - "sorry I didn't understand your message"; "sorry I could not complete your request due to an error [of type x]". Besides defining the performatives, the CMA specifies a KQML compatible message syntax in terms of message parameters." -- [CAT, 96a, p4]

That the first <word> of the <performative> identifies the type of CMA message is implicit in the examples of [CAT, 96a].
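
As an illustration of "receive and respond", the sketch below reads the first <word> of a message and dispatches on it, answering with a sorry message when the type is not one of the five or has no handler. The handler table, the function names and the wording of the sorry content are assumptions made for this example only.

```python
MESSAGE_TYPES = ("tell", "ask", "do", "subscribe", "sorry")


def message_type(message: str) -> str:
    """Return the first <word> of a performative, e.g. 'ask' for '(ask ...)'."""
    body = message.strip()
    if not body.startswith("("):
        raise ValueError("a CMA message must start with '('")
    words = body[1:].replace(")", " ").split()
    if not words:
        raise ValueError("the performative has no first word")
    return words[0]


def respond(message: str, handlers: dict, sender: str, receiver: str) -> str:
    """Dispatch on the message type; reply negatively when it cannot be handled.

    `handlers` maps a message type to a function that takes the message text
    and returns the reply text. `sender` and `receiver` are the parameter
    values to place in a generated sorry reply.
    """
    try:
        kind = message_type(message)
    except ValueError:
        kind = None
    handler = handlers.get(kind) if kind in MESSAGE_TYPES else None
    if handler is None:
        return f'(sorry :sender {sender} :receiver {receiver} :content "message not understood")'
    return handler(message)
```

A process that only answers queries could register a single "ask" handler; every other incoming message would then receive a negative response.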

Message Parameters.

Message parameters are identified by the :<word> parts of a <performative>. Every CMA message must have both a sender and a receiver parameter, and it "may contain some or all of the following parameters:

| Parameter | Description | Default | Examples |
| --- | --- | --- | --- |
| sender | where the message is from | none | pa.caeti.org:2010 |
| receiver | where the message is to | none | sa@dodea.edu, http://foo.net/agent |
| language | how the content is encoded | text | text, MIME, KIF, TOE |
| ontology | the name of the ontology used in the content | something appropriate for the language | Student Academic Record |
| content | the encoded representation of data to be passed | none | "math score = 98%" |
| reply-with | whether a reply is expected, and if so, a label for the reply | nil | "bob's math score" |
| in-reply-to | the expected label | none | "bob's math score" |

Messages may contain any additional parameters, the meaning of which should be agreed upon by both sender and receiver. Parameters may appear in any order. Refer to the KQML specification for more detailed definitions of the syntax and semantics of these parameters." -- [CAT, 96a, p4]

For instance, [CAT, 96a, p9] contains the following example of an ask message:

(ask 	:sender pa@parent.net
	:receiver server.school.edu:4030
	:language text
	:content "Give me Amy's record")

[CAT, 96a] does not explicitly specify that a CMA message must have both a sender and a receiver parameter, but this seems to be what is intended.
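
The requirement that every message carry both a sender and a receiver can be enforced when a message is built. The following sketch, with hypothetical names format_expression and format_message, writes a value as a bare word when it contains only a conservative set of word characters and as a quoted string otherwise.

```python
import string

# Characters emitted as a bare <word>. This is a conservative choice: anything
# else is written as a quoted <string>, which is always a valid <expression>.
WORD_CHARS = set(string.ascii_letters + string.digits + "<>=+-*/&^~@$%:.!?")


def format_expression(value: str) -> str:
    """Render one parameter value as a <word> or a quoted <string>."""
    if not value.isascii():
        raise ValueError("CMA messages are sequences of ASCII characters")
    if value and all(ch in WORD_CHARS for ch in value):
        return value
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def format_message(kind: str, sender: str, receiver: str, params=None) -> str:
    """Build a performative, refusing to omit the sender or the receiver."""
    if not sender or not receiver:
        raise ValueError("sender and receiver are both required")
    fields = {"sender": sender, "receiver": receiver}
    fields.update(params or {})  # additional parameters, in the order given
    parts = [kind]
    for name, value in fields.items():
        parts.append(f":{name} {format_expression(value)}")
    return "(" + " ".join(parts) + ")"
```

Calling format_message("ask", "pa@parent.net", "server.school.edu:4030", {"language": "text", "content": "Give me Amy's record"}) reproduces the ask example above on a single line.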

Message Addressing.

There is a mapping from the sender and receiver parameters of messages into CAETI addresses (which are the end points of connections).

This issue is not raised in the source documents. The name service that is discussed in the extension of [CAT, 96a] is a different issue. However, defining this mapping is essential for getting CMA messages delivered to the parties for which they are intended.
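
Since the source documents leave this mapping undefined, any concrete version is an assumption. Purely as an illustration, the sketch below classifies the three forms of sender and receiver values that appear in the parameter examples of [CAT, 96a] (host:port, user@domain, and an HTTP URL) and maps each to a hypothetical address tuple. The tuple layout and the function name to_caeti_address are invented for this example.

```python
from urllib.parse import urlsplit


def to_caeti_address(parameter: str) -> tuple:
    """Map a sender or receiver value to a hypothetical CAETI address tuple."""
    if "://" in parameter:
        url = urlsplit(parameter)
        if url.scheme != "http" or not url.hostname:
            raise ValueError(f"unsupported URL: {parameter}")
        # Assumption: port 80 when the URL names no port.
        return ("http", url.hostname, url.port or 80, url.path or "/")
    if "@" in parameter:
        user, _, domain = parameter.partition("@")
        if not user or not domain:
            raise ValueError(f"malformed mail address: {parameter}")
        return ("smtp", user, domain.lower())
    host, sep, port = parameter.rpartition(":")
    if sep and host and port.isdigit():
        return ("tcp", host.lower(), int(port))
    raise ValueError(f"unrecognized address form: {parameter}")
```

Including the mailbox user in the SMTP form reflects the point made under CAETI Addresses: a TCP socket alone cannot distinguish the many users whose mail passes through it.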

Message Delivery.

The sender parameter of every CMA message maps into the CAETI address from which the message was sent. If a CMA message is delivered, it is delivered to the CAETI address that is mapped from the receiver parameter in the message. This does not require that every CMA message be delivered nor that CMA messages be delivered in the order in which they were sent.

Correctly identifying the sender of a CMA message is not mentioned in [CAT, 96a]. However, this is necessary for messages to be exchanged only between the intended parties. For example, correct identification of the sender is necessary for security: otherwise, it would be a simple matter for process A to masquerade as process B and obtain information to which B had access but A did not.
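
One way to read the delivery rule is as a check applied to each received message: its sender parameter must map to the far end of the connection on which it arrived, and its receiver parameter must map to the near end. The sketch below assumes a hypothetical Connection pair and a caller-supplied resolve function standing in for the undefined mapping from parameters to CAETI addresses.

```python
from typing import Callable, Hashable, NamedTuple


class Connection(NamedTuple):
    """A connection identified by a pair of CAETI addresses."""
    local: Hashable   # address of this end
    remote: Hashable  # address of the other end


def accept(params: dict, connection: Connection,
           resolve: Callable[[str], Hashable]) -> bool:
    """Accept a message only if its parameters match the connection end points."""
    try:
        sender = resolve(params["sender"])
        receiver = resolve(params["receiver"])
    except (KeyError, ValueError):
        return False  # a parameter is missing or cannot be mapped
    return sender == connection.remote and receiver == connection.local
```

This check only compares addresses. Establishing that the remote end point is genuinely the process it claims to be is outside the scope of the sketch.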

This page is URL http://www.computationallogic.com/software/caeti/architecture/model/derivation.html