



                           GRID ASCII HELPER PROTOCOL
                                for OpenStack
                                 VERSION 0.1


                                 Jaime Frey
                             [jfrey@cs.wisc.edu]

                                 October 2014

                               HTCondor Project
                    [http://research.cs.wisc.edu/htcondor]

                       Department of Computer Sciences
                       University of Wisconsin-Madison
                            1210 W. Dayton Street
                             Madison, WI  53706
                          [http://www.cs.wisc.edu]





1.    INTRODUCTION

    The object of the Grid ASCII Helper Protocol (GAHP) is to allow the
    use of the client library or package of a grid or cloud service via
    a simple ASCII-based protocol. A process which implements GAHP is
    referred to as a GAHP server. GAHP is designed to handle both
    synchronous (blocking) and asynchronous (non-blocking) calls.

    The first GAHP specification focused on the GRAM and GASS grid
    services of the Globus Toolkit. This GAHP specification focuses on
    the OpenStack Nova cloud service.

1.1   WHY A GAHP?

    Although most grid and cloud services provide client libraries or
    modules in various languages that can be incorporated directly into
    applications that wish to use those services, there are several
    distinct advantages to using an independent GAHP server process
    instead. For instance, parts of the native API may provide only
    synchronous/blocking calls. Users who require a
    non-blocking/asynchronous interface must make use of multiple
    threads. Even if the native module is thread-safe, thread safety
    requires that _all_ modules linked into the process be both
    re-entrant and in agreement upon a threading model.
    This may be a significant barrier when trying to integrate the
    service client module into legacy non-thread-safe code, or when
    attempting to link with commercial libraries which either have no
    support for threads or define their own threading model. But because
    the GAHP server runs as a separate process, it can be easily
    implemented as a multi-threaded server, and still present an
    asynchronous non-blocking protocol. Worse yet, there may not be a
    client module available in the language of the application wishing
    to use the service. With a GAHP server, langauage choice is
    immaterial.

    GAHP facilitates the construction of multi-tier systems.  A first
    tier client can easily send ASCII commands via a socket (perhaps
    securely via an SSH or SSL tunnel) to a second tier running a GAHP
    server, allowing grid or cloud services to be consolidated at the
    second or third tier with minimal effort.

    Furthermore, GAHP, like many other simple ASCII-based protocols,
    supports the concept of component-based development independent of
    the software language used with minimal complexity. When a grid
    service has client modules available in multiple languages, those
    interfaces can look very different from each other. By using GAHP, a
    body of software could be written once with one interface and then
    subsequently utilize a GAHP server written in C, Java, or Perl --
    and said GAHP server could be running locally or as a daemon on a
    remote host.

1.2   AVAILABLE OPENSTACK GAHP SERVERS

    The HTCondor Project at the University of Wisconsin-Madison has
    developed an OpenStack GAHP server, written in [FIX] C++, which uses
    pthreads and libcurl to support portions of the OpenStack protocol.  Many
    Unix platforms are supported.

    This GAHP is available as part of the HTCondor software distribution,
    version FIXME.FIXME or later.

2.0   GAHP SERVER IMPLEMENTATION

    GAHP itself, as a protocol, is independent of the underlying transport
    protocol and requires only a reliable ordered bi-directional data
    stream channel.

    A GAHP server, however, is assumed to read and respond to GAHP commands
    solely via stdin and stdout.  Should stdin to a GAHP server be closed,
    the GAHP server should immediately shutdown in a manner similar to the
    receipt of a QUIT command.  Therefore, a GAHP server can be easily
    invoked and managed via SSHD, inted, or rshd.  Software can spawn a
    local GAHP server via an interface such as POSIX.2 popen().

    Under no circumstances should a GAHP server block when issued any
    command.  All commands require a response nearly instantaneously.
    Therefore, most GAHP servers will be implemented as a multi-threaded
    process. Use of child processes or avoidance of blocking calls
    within the GAHP server are also options.

3.0   GAHP COMMANDS

    The following commands must be implemented by all GAHP servers:

        COMMANDS
        QUIT
        RESULTS
        VERSION

    The following commands may be implemented by any GAHP server:

        ASYNC_MODE_ON
        ASYNC_MODE_OFF
        RESPONSE_PREFIX

    The following commands are specific to OpenStack and must be implemented
    by an OpenStack GAHP server:

        KEYSTONE_SERVICE
        NOVA_PING
        NOVA_SERVER_CREATE
        NOVA_SERVER_DELETE
        NOVA_SERVER_LIST
        NOVA_SERVER_LIST_DETAILS

3.1   CONVENTIONS AND TERMS USED IN SECTION 3.2

    The following definitions are common to all GAHP protocol documents.

    <CRLF>

        The characters carriage return and line feed (in that
        order), _or_ solely the line feed character.

    <SP>

        The space character.

    line

        A sequence of ASCII characters ending with a <CRLF>.

    Request Line

        A request for action on the part of the GAHP server.

    Return Line

        A line immediately returned by the GAHP server upon
        receiving a Request Line.

    Result Line

        A line sent by the GAHP server in response to a RESULTS
        request, which communicates the results of a previous
        asynchronous command Request.

    S: and R:

        In the Example sections for the commands below, the prefix
        "S: " is used to signify what the client sends to the GAHP
        server.   The prefix "R: " is used to signify what the
        client receives from the GAHP server.  Note that the "S: "
        or "R: " should not actually be sent or received.

3.2   GAHP COMMAND STRUCTURE

    GAHP commands consist of three parts:

        * Request Line
        * Return Line
        * Result Line

    Each of these "Lines" consists of a variable length character
    string ending with the character sequence <CRLF>.

    A Request Line is a request from the client for action on the part of
    the GAHP server.  Each Request Line consists of a command code
    followed by argument field(s).  Command codes are a string of
    alphanumeric and underscore characters.  Upper and lower case
    alphabetic characters are to be treated identically with respect to
    command codes.  Thus, any of the following may represent the
    ASYNC_MODE_ON command:
        async_mode_on
        Async_Mode_On
        asYNc_MOde_oN
        ASYNC_MODE_ON
    In contrast, the argument fields of a Request Line are _case
    sensitive_.

    The Return Line is always generated by the server as an immediate
    response to a Request Line.  The first character of a Return Line will
    contain one of the following characters:
        S - for Success
        F - for Failure
        E - for a syntax or parse Error
    Any Request Line which contains an unrecognized or unsupported command,
    or a command with an insufficient number of arguments, will generate an
    "E" response.

    The Result Line is used to support commands that would otherwise
    block.  Any GAHP command which may require the implementation to block
    on network communication require a "request id" as part of the Request
    Line.  For such commands, the Result Line just communicates if the
    request has been successfully parsed and queued for service by the
    GAHP server.  At this point, the GAHP server would typically dispatch
    a new thread to actually service the request.  Once the request has
    completed, the dispatched thread should create a Result Line that
    includes the same "request id" and enqueue it until the client
    issues a RESULT command.

3.3   TRANSPARENCY

    Arguments on a particular Line (be it Request, Return, or Result) are
    typically separated by a <SP>.  In the event that a string argument
    needs to contain a <SP> within the string itself, it must be escaped
    by placing a backslash ("\") in front of the <SP> character.  Thus,
    the character sequence "\ " (no quotes) must not be treated as a
    separator between arguments, but instead as a space character within
    a string argument. A carriage return or line feed that is part of a
    string argument can be similarly escaped with a backslash, so that
    it is not considered the end of the Line. If a string argument
    contains a backslash character, it must be escaped by preceding it
    with another backslash character.

3.4   SEQUENCE OF EVENTS

    Upon startup, the GAHP server should output to stdout a banner string
    which is identical to the output from the VERSION command without the
    beginning "S " sequence (see example below).  Next, the GAHP server
    should wait for a complete Request Line from the client (from stdin).
    The server is to take no action until a Request Line sequence is
    received.

    There are no sequencing restrictions on the ordering of Requests in
    the GAHP API, even though some sequences may be semantically
    invalid for the underlying service. For example, attempting to
    delete a new server before creating it. The server shall not
    produce an "E" or "F" Return Line in the event of such sequences,
    but may produce a Result Line reflecting an error from the
    underlying service.

3.5   COMMAND SYNTAX

    This section contains the syntax for the Request, Return, and Result
    line for each command. The commands common to all GAHPs are defined
    first, followed by commands specific to the OpenStack GAHP procotol.

3.5.1 COMMON GAHP COMMANDS

    These commands are common to all GAHP types.

    -----------------------------------------------

    COMMANDS

    List all the commands from this protocol specification which are
    implemented by this GAHP server.

    + Request Line:

        COMMANDS <CRLF>

    + Return Line:

        S <SP> <command 1> <SP> <command 2> <SP> ... <command X> <CRLF>

    + Result Line:

        None.

    -----------------------------------------------

    VERSION

    Return the version string for this GAHP.  The version string follows
    a specified format (see below).  Ideally, the version entire version
    string, including the starting and ending dollar sign ($)
    delimiters, should be a literal string in the text of the GAHP
    server executable.  This way, the Unix/RCS "ident" command can
    produce the version string.

    The version returned should correspond to the version of the
    protocol supported.

    + Request Line:

        VERSION <CRLF>

    + Return Line:

        S <SP> $GahpVesion: <SP> <major>.<minor>.<subminor> <SP>
            <build-month> <SP> <build-day-of-month> <SP>
            <build-year> <SP> <general-descrip> <SP>$ <CRLF>

        * major.minor.subminor = for this version of the
            protocol, use version x.y.z.

        * build-month = string with the month abbreviation when
            this GAHP server was built or released.  Permitted
            values are: "Jan", "Feb", "Mar", "Apr", "May", "Jun",
            "Jul", "Aug", "Sep", "Oct", "Nov", and "Dec".

        * build-day-of-month = day of the month when GAHP server
            was built or released; an integer between 1 and 31
            inclusive.

        * build-year = four digit integer specifying the year in
            which the GAHP server was built or released.

        * general-descrip = a string identifying a particular
            GAHP server implementation.

    + Result Line:

        None.

    + Example:

        S: VERSION
        R: S $GahpVersion: 1.0.0 Nov 26 2001 NCSA\ CoG\ Gahpd $

    -----------------------------------------------

    QUIT

    Free any/all system resources (close all sockets, etc) and terminate
    as quickly as possible.

    + Request Line:

        QUIT <CRLF>

    + Return Line:

        S <CRLF>

        Immediately afterwards, the command pipe should be closed
        and the GAHP server should terminate.

    + Result Line:

        None.

    -----------------------------------------------

    RESULTS

    Display all of the Result Lines which have been queued since the
    last RESULTS command was issued.  Upon success, the first return
    line specifies the number of subsequent Result Lines which will be
    displayed.  Then each result line appears (one per line) -- each
    starts with the request ID which corresponds to the request ID
    supplied when the corresponding command was submitted.  The exact
    format of the Result Line varies based upon which corresponding
    Request command was issued.

    IMPORTANT: Result Lines must be displayed in the _exact order_ in
    which they were queued!!!  In other words, the Result Lines
    displayed must be sorted in the order by which they were placed into
    the GAHP's result line queue (i.e. the order in which the commands 
    completed), from earliest to most recent.

    + Request Line:

        RESULTS <CRLF>

    + Return Line(s):

        S <SP> <num-of-subsequent-result-lines> <CRLF>
        <reqid> <SP> ... <CRLF>
        <reqid> <SP> ... <CRLF>
        ...

        * reqid = Request ID (typically an integer), set to the value 
            specified in the corresponding Request Line.

    + Result Line:

        None.

    + Example:

        S: RESULTS
        R: S 1
        R: 100 0

    -----------------------------------------------

    ASYNC_MODE_ON

    Enable Asynchronous notification when the GAHP server has results
    pending for a client. This is most useful for clients that do not
    want to periodically poll the GAHP server with a RESULTS command.
    When asynchronous notification mode is active, the GAHP server will
    print out an "R" (no quotes) on column one when the "RESULTS"
    command would return one or more lines. The "R" is printed only once 
    between successive "RESULTS" commands.    The "R" is also guaranteed 
    to only appear in between atomic return lines; the "R" will not 
    interrupt another command's output.

    If there are already pending results when the asynchronous results
    available mode is activated, no indication of the presence of those
    results will be given. A GAHP server is permitted to only consider
    changes to it's result queue for additions after the ASYNC_MODE_ON
    command has successfully completed. GAHP clients should issue a
    "RESULTS" command immediately after enabling asynchronous
    notification, to ensure that any results that may have been added to
    the queue during the processing of the ASYNC_MODE_ON command are
    accounted for.

    + Request Line:

        ASYNC_MODE_ON <CRLF>

    + Return Line:

        S <CRLF>

        Immediately afterwards, the client should be prepared to
        handle an R <CRLF> appearing in the output of the GAHP
        server.

    + Result Line:

        None.

    + Example:

        S: ASYNC_MODE_ON
        R: S
        S: NOVA_PING 00001 regionOne
        R: S
        S: NOVA_PING 00002 regionTwo
        R: S
        R: R
        S: RESULTS
        R: S 2
        R: 00001 NULL
        R: 00002 NULL

    Note that you are NOT guaranteed that the "R" will not appear
    between the dispatching of a command and the return line(s) of that
    command; the GAHP server only guarantees that the "R" will not
    interrupt an in-progress return. The following is also a legal
    example:
        S: ASYNC_MODE_ON
        R: S
        S: NOVA_PING 00001 regionOne
        R: S
        S: NOVA_PING 00002 regionTwo
        R: R
        R: S
        S: RESULTS
        R: S 2
        R: 00001 NULL
        R: 00002 NULL

        (Note the reversal of the R and the S after NOVA_PING 00002)

    -----------------------------------------------

    ASYNC_MODE_OFF

    Disable asynchronous results-available notification. In this mode,
    the only way to discover available results is to poll with the
    RESULTS command.  This mode is the default. Asynchronous mode can be
    enable with the ASYNC_MODE_ON command.

    + Request Line:

        ASYNC_MODE_OFF <CRLF>

    + Return Line:

        S <CRLF>

    + Result Line:

        None

    + Example:

        S: ASYNC_MODE_OFF
        R: S

    -----------------------------------------------

    RESPONSE_PREFIX

    Specify a prefix that the GAHP server should use to prepend every
    subsequent line of output. This may simplify parsing the output of
    the GAHP server by the client program, especially in cases where the
    responses of more one GAHP server are "collated" together.

    This affects the output of both return lines and result lines for all
    subsequent commands (NOT including this one).

    + Request Line:

        RESPONSE_PREFIX <SP> <prefix> <CRLF>

        <prefix> = an arbitrary string of characters which you want to
        prefix every subsequent line printed by the GAHP server.

    + Return Line:

        S <CRLF>

    + Result Line:

        None.

    + Example:

        S: RESPONSE_PREFIX GAHP:
        R: S
        S: RESULTS
        R: GAHP:S 0
        S: RESPONSE_PREFIX NEW_PREFIX_
        R: GAHP:S
        S: RESULTS
        R: NEW_PREFIX_S 0

    -----------------------------------------------

3.5.2 GAHP COMMANDS SPECIFIC TO OPENSTACK

    The following commands are specific to the OpenStack GAHP. Here are
    some common arguments used in commands:

    * request-id    = a Request ID, typically an integer
    * region        = the name of the OpenStack region to be used, or
                      "NULL" (no quotes) to use the first region in the
                      service catalog

    Most commands will require a request to the Keystone service to
    authenticate, receive a token, and obtain the services catalog,
    followed by a request to the Nova or other service to perform the
    given command. A failure encountered during either request should be
    reported in the Result Line of the command. The request to the
    Keystone service can be performed once and the results cached in
    memory if successful. Failures may be cached for a short period of
    time. For example, the same authentication token can be used for
    multiple requests to the Nova service.

    The following is the Return Line for all OpenStack GAHP commands
    unless otherwise specified.

    + Return Line:

        <result> <CRLF>

        * result = the character "S" (no quotes) for the successful
            submission of the request (meaning the request is now
            pending), or an "E" (no quotes) for an error parsing the
            request or its arguments (e.g. an unrecognized or
            unsupported command, or missing or malformed arguments)

    The following is the Result Line for all OpenStack GAHP commands
    unless otherwise specified.

    + Result Line:

        <request-id> <SP> <error-string> <CRLF>

        If the command was successful, <error-string> shall be the
        special value "NULL" (no quotes). Otherwise, it should be a
        string describing the nature of the failure.

    -----------------------------------------------

    KEYSTONE_SERVICE

    Set identity attributes to be used for subsequent commands that
    involve contacting an OpenStack service. This is an immediate
    command that only updates internal state of the GAHP. All commands
    received after this one should use the given information.

    Any other OpenStack command that is issued before this command is
    issued at least once should return an appropriate error message in
    its Result line.

    + Request Line:

        KEYSTONE_SERVICE <SP> <url> <SP> <username> <SP> <password>
            <SP> <tenant> <CRLF>

        * url = URL of the Keystone service. Only a hostname or IP
            address is required; other portions of the full URL should
            be filled with reasonable defaults.
        * username = username to be used when authenticating to the
            keystone service
        * password = password to be used when authenticating to the
            keystone sevice
        * tenant = tenant name to use in subsequent requests, or "NULL"
            (no quotes) to use the first tenant defined for the
            authenticated user

    + Result Line:

        None.

    -----------------------------------------------

    NOVA_PING

    Send a simple request to the Nova compute server for the given
    region to determine if it is alive, reachable, responds to quries,
    and accepts the provided credentials.

    + Request Line:

        NOVA_PING <SP> <request-id> <SP> <region> <CRLF>

    -----------------------------------------------

    NOVA_SERVER_CREATE

    Create a server with the given properties

    + Request Line:

        NOVA_SERVER_CREATE <SP> <request-id> <SP> <region>
            <SP> <desc> <CRLF>

        * desc = JSON description of a new server to be created in the
            given region. Where possible, the GAHP should provide
            reasonable defaults for any missing attributes.

    + Result Line:

        One of the following. The first should be used if the command
        failed. The second should be used if the command succeeded.

        <request-id> <SP> <error-string> <CRLF>

        <request-id> <SP> NULL <SP> <server-id> <CRLF>

        * error-string = string describing the nature of the failure
        * server-id = the id of the created server, without quotes

    -----------------------------------------------

    NOVA_SERVER_DELETE

    Delete the given server (halting exeucting of the VM)

    + Request Line:

        NOVA_SERVER_DELETE <SP> <request-id> <SP> <region>
            <SP> <server-id> <CRLF>

        The following attribute is required.

        * server-id = the id of the server to delete

    -----------------------------------------------

    NOVA_SERVER_LIST

    Query the nova service of the given region about all servers and 
    return the id and name of each server.
    This may include servers not started by the GAHP; the client must
    maintain its own list of servers if it cares.

    + Request Line:

        NOVA_SERVER_LIST <SP> <request-id> <SP> <region>
            <SP> <filter> <CRLF>

        * filter = a query string to be appended to the URL that 
            contains properties to filter the set of servers to be 
            returned, or "NULL" (no quotes) to indicate no filter

    + Result Line:

        One of the following. The first should be used if the command
        failed. The second should be used if the command succeeded.

        <request-id> <SP> <error-string> <CRLF>
        <request-id> <SP> NULL <SP> <count>
            [<SP> <server-id> <SP> <server-name>]* <CRLF>

        * error-string = string describing the nature of the failure
        * count = the number of servers whose information will follow
        * server-id = the id of a server, without quotes
        * server-name = the "name" property of a server, without quotes

        On success, <count> is an integer indicating the number of
        server returned by the query. It is followed by the given
        two properties of each server. If the command was successful
        but returned no servers, then <count> is 0 and no arguments
        follow it.

    -----------------------------------------------

    NOVA_SERVER_LIST_DETAIL

    Query the nova service of the given region about all servers and
    return several properties of each server.
    This may include servers not started by the GAHP; the client must
    maintain its own list of servers if it cares.

    + Request Line:

        NOVA_SERVER_LIST <SP> <request-id> <SP> <region>
            <SP> <filter> <CRLF>

        * filter = a query string to be appended to the URL that 
            contains properties to filter the set of servers to be 
            returned, or "NULL" (no quotes) to indicate no filter

    + Result Line:

        One of the following. The first should be used if the command
        failed. The second should be used if the command succeeded.

        <request-id> <SP> <error-string> <CRLF>

        <request-id> <SP> NULL <SP> <count>
            [<SP> <server-id> <SP> <server-name> <SP>
             <SP> <status> <SP> <address>]* <CRLF>

        * error-string = string describing the nature of the failure
        * count = the number of servers whose information will follow
        * server-id = the id of a server, without quotes
        * server-name = the "name" property of a server, without quotes
        * status = the "status" property of a server, without quotes
        * address = an IP address or hostname which can be used to
            connect to the server, without quotes

        On success, <count> is an integer indicating the number of
        servers returned by the query. It is followed by the given
        four properties of each server. If the command was successful
        but returned no servers, then <count> is 0 and no arguments
        follow it.
