NuGram Hosted Server HTTP API

This document describes the interface to NuGram Hosted Server.

  Overview
  Grammar Publishing
  Grammar Tagging
  Setting Up a Session
  Grammar Instantiation
  Grammar Source Generation
  Sentence Interpretation
  Freeing a Session

  A Complete Example

Overview

NuGram Server (referred to as the server below) is an HTTP-based server that offers many grammar-related services, including (but not limited to):

  • dynamic grammar instantiation
  • source grammar generation to various formats (ABNF, GrXML, GSL)
  • sentence interpretation

The following sections describe the interface in detail. The interface is fairly stable, although a few details may still change.

Server URL

A hosted, publicly available NuGram Server is located at http://www.grammarserver.com:8082. All requests to the server must be made at this specific address.

Authentication

All requests made to the server must specify a user name and password in the Authorization header (as a BASE64 encoding of userName:password). These are the same credentials as the ones used to log on to NuGram Server's website. Users may read or delete their own sessions and grammars. They may also read other users' public grammars.
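
As an illustration, the Authorization header can be built as in the following Python sketch. The helper name basic_auth_header is ours, the credentials are those of the example account used in "A Complete Example", and encoding the credentials as UTF-8 is an assumption.

```python
import base64


def basic_auth_header(user_name: str, password: str) -> dict:
    """Return the Authorization header for an account (illustrative helper)."""
    # key is the BASE64 encoding of the string userName:password
    # (UTF-8 is assumed for the character encoding of the credentials).
    credentials = f"{user_name}:{password}".encode("utf-8")
    key = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {key}"}


# Example account from "A Complete Example"
headers = basic_auth_header("userA", "test")
```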

Stunt

Since not all HTTP agents are able to set HTTP headers (typical VoiceXML browsers and ASR servers, for example, cannot), the server provides a stunt. All requests to this stunt are POST requests of the form:

  POST /voiceXML HTTP/1.0
  Content-Length: (document size in bytes)

  operation=Op&resource=Res&account=userName&password=password&...

where Op is the HTTP method we want to perform, Res is the resource we'd like to operate on, userName and password are the user name and password for the account, and ... stands for all the other resource-specific parameters.
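
The body of such a request could be assembled as in the Python sketch below. The helper name stunt_body is ours. We assume that the body is form-encoded and that Res is written as a resource path such as /session; neither detail is spelled out above.

```python
from urllib.parse import urlencode


def stunt_body(operation, resource, account, password, **params):
    """Build the form body of a POST /voiceXML request (illustrative helper)."""
    fields = {
        "operation": operation,  # HTTP method to perform, e.g. "POST"
        "resource": resource,    # resource to operate on (path syntax assumed)
        "account": account,
        "password": password,
    }
    fields.update(params)        # other resource-specific parameters
    return urlencode(fields)


body = stunt_body("POST", "/session", "userA", "test")
content_length = len(body.encode("ascii"))  # value for the Content-Length header
```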

Grammar Publishing

Before it can be used, a grammar must be published on the server. This is done using an HTTP PUT operation (usually by means of NuGram IDE's publishing feature).

Request

The format of the HTTP request is:

  PUT /grammar[:userName]/grammarPath HTTP/1.0
  Content-Length: (grammar size in bytes)
  Authorization: Basic key

  (grammar contents...)

where

  • key is the BASE64 encoding of the string userName:password;
  • grammarPath is a symbolic name for the grammar. If a grammar already exists with that name, it will be overwritten. If the path starts with /grammar:userName (the same userName as the one given for authentication), the grammar is made public, meaning that it can be used by other users of the system.

Response

Upon successful completion of the request, the HTTP response document is an XML document giving the (unique) ID of the grammar:

  <grammar uri="grammarId"/>
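
A client that does not rely on NuGram IDE could prepare the publishing request along the lines of the following Python sketch. The helper name build_publish_request and its public flag are ours. The request object is only constructed here, not sent; passing it to urllib.request.urlopen would perform the call.

```python
import base64
import urllib.request

SERVER = "http://www.grammarserver.com:8082"


def build_publish_request(user_name, password, grammar_path, grammar_bytes,
                          public=False):
    """Prepare (without sending) a PUT request publishing a grammar."""
    key = base64.b64encode(f"{user_name}:{password}".encode("utf-8")).decode("ascii")
    # A path starting with /grammar:userName makes the grammar public.
    prefix = f"/grammar:{user_name}" if public else "/grammar"
    return urllib.request.Request(
        url=f"{SERVER}{prefix}/{grammar_path}",
        data=grammar_bytes,  # grammar contents, sent as the request body
        headers={"Authorization": f"Basic {key}"},
        method="PUT",
    )


request = build_publish_request("userA", "test", "myFirstGrammar",
                                b"#ABNF 1.0 ISO-8859-1;\n")
```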

Grammar Tagging

In order to facilitate sharing grammars among users, a grammar may be tagged with informative keywords. This is done using an HTTP POST operation (usually by means of NuGram IDE's publishing feature).

Request

The format of the grammar tagging HTTP request is:

  POST /tags[:grammarOwner]/grammarPath HTTP/1.0
  Authorization: Basic key

  tags=tags separated by blank spaces

where

  • key is the BASE64 encoding of the string userName:password;
  • grammarPath is a symbolic name for the grammar. If the resource path starts with /tags:grammarOwner, it is possible to apply tags to a public grammar owned by some account grammarOwner even though it might differ from the account userName used for authentication purposes.

Response

The HTTP response is the following XML document:

  <status code="success" />
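
The body of a tagging request could be produced as in this Python sketch. The helper name tags_body and the sample tags are ours, and we assume the body is form-encoded like the other POST requests, so blanks between tags are sent as "+".

```python
from urllib.parse import urlencode


def tags_body(tags):
    """Build the body of a grammar tagging request (illustrative helper)."""
    for tag in tags:
        # Tags are separated by blanks, so a tag cannot itself contain one.
        if not tag or any(ch.isspace() for ch in tag):
            raise ValueError(f"invalid tag: {tag!r}")
    return urlencode({"tags": " ".join(tags)})


body = tags_body(["names", "directory"])
```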

Tags may be retrieved afterwards, using either a per-grammar or a per-user request.

Request

The format of the HTTP request to recover the tags applied on a given grammar is:

  GET /tags[:grammarOwner]/grammarPath HTTP/1.0
  Authorization: Basic key

The format of the HTTP request to recover all the tags owned by a given account is:

  GET /tags[:tagsOwner] HTTP/1.0
  Authorization: Basic key

where

  • key is the BASE64 encoding of the string userName:password;
  • tagsOwner is an account username. This account might differ from the account used for authentication purposes.
  • grammarPath is a symbolic name for the grammar. If the resource path starts with /tags:grammarOwner, then grammarPath refers to a public grammar published by grammarOwner.

Response

Upon successful completion of the request, the HTTP response document is an XML document listing the tags owned by userName that are associated with the specified grammar:

  <tags>
    <tag name="tag name" count="tag count"/>
    ...
  </tags>
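
Such a response can be turned into a dictionary with a few lines of Python. In this sketch the helper name parse_tags and the sample tags are ours, and we assume that the count attribute always holds an integer.

```python
import xml.etree.ElementTree as ET


def parse_tags(xml_text):
    """Map each tag name to its count (count is assumed to be an integer)."""
    root = ET.fromstring(xml_text)
    return {tag.get("name"): int(tag.get("count")) for tag in root.findall("tag")}


# Made-up response, following the document structure shown above
sample = '<tags><tag name="names" count="2"/><tag name="directory" count="1"/></tags>'
tags = parse_tags(sample)
```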

Setting Up a Session

For identification purposes, and to control resource usage, many requests to the server must be made within a session. The user may either choose a session identifier or ask the system to generate one. In either case, the same identifier must be reused in all further requests pertaining to that session. A user-selected session identifier may not contain slashes or other characters that have a special meaning within URLs, and it should not already be in use by another user. If the user does not want to select a session identifier, a special request creates a session and returns a randomly generated one.

Request

The format of the HTTP request is:

  POST /session HTTP/1.0
  Authorization: Basic key

Response

Upon successful completion of the request, the HTTP response is an XML document giving the ID of the created session:

  <session id="sessionId"/>
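
Since the session ID is needed in later requests, a client has to extract it from this document. A minimal Python sketch follows; the helper name parse_session_id is ours, and the sample value is the one shown in "A Complete Example".

```python
import xml.etree.ElementTree as ET


def parse_session_id(xml_text):
    """Extract the session ID from a session creation response."""
    root = ET.fromstring(xml_text)
    if root.tag != "session" or "id" not in root.attrib:
        raise ValueError(f"unexpected response: {xml_text!r}")
    return root.attrib["id"]


session_id = parse_session_id('<session id="A194FA3CA3DEB1BAB884" />')
```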

Grammar Instantiation

A particular grammar is created by instantiating a grammar containing dynamic directives using an instantiation context. The server currently only supports instantiation contexts encoded as a JSON object. The JSON object must have one property for each global variable used in the source grammar.

Request

The format of the HTTP request is:

  POST /grammar/sessionId/grammarPath HTTP/1.0
  Content-Length: (document size in bytes)
  Authorization: Basic key

  context=JSONcontext

The request document must be encoded in the application/x-www-form-urlencoded format.

Response

Upon successful completion of the request, the HTTP response is an XML document giving the generated grammar ID and two URLs: one for fetching the source of the generated grammar, and one for interpreting sentences using the generated grammar:

  <grammar id="grammarId" grammarUrl="srcURL" interpreterUrl="interpURL" />

Example

If we'd like to instantiate a grammar en/names.abnf with a context containing the following variables:

  • names: ["dominique", "john", "shriram"]
  • callerLanguage: "en-US"

we would send a request like the following to the server (assuming session 1234 has already been created):

  POST /grammar/1234/en/names.abnf HTTP/1.0
  Authorization: Basic key
  Content-Length: 73

  context={"callerLanguage":"en-US","names":["dominique","john","shriram"]}

The body of the HTTP response would contain an XML document resembling the following:

  <grammar id="6EBF5"
           grammarUrl="http://www.grammarserver.com:8082/grammar:userName/1234/6EBF5"
           interpreterUrl="http://www.grammarserver.com:8082/interpretation:userName/1234/6EBF5" />

(In practice, grammar IDs are strings of 40 alphanumeric characters.)
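
The request body and the response of this example could be handled as in the following Python sketch. The helper names instantiation_body and parse_instantiation_response are ours. The body is URL-encoded here, so it is longer than the unencoded form shown above.

```python
import json
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlencode


def instantiation_body(context):
    """Encode an instantiation context as the form parameter `context`."""
    return urlencode({"context": json.dumps(context, separators=(",", ":"))})


def parse_instantiation_response(xml_text):
    """Return the id, grammarUrl and interpreterUrl attributes as a dict."""
    root = ET.fromstring(xml_text)
    return {name: root.attrib[name] for name in ("id", "grammarUrl", "interpreterUrl")}


context = {"callerLanguage": "en-US", "names": ["dominique", "john", "shriram"]}
body = instantiation_body(context)
# Decoding the body again gives back the original context.
decoded = json.loads(parse_qs(body)["context"][0])
```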

Grammar Source Generation

The content of a grammar can be retrieved using a simple GET operation in a variety of formats. The currently available formats are:

  • application/srgs - W3C ABNF format
  • application/srgs+xml - W3C XML format
  • application/gsl - Nuance GSL

Request

The format of the HTTP request is:

  GET /grammar[:userName]/sessionId/grammarPath[Extension] HTTP/1.0
  Authorization: Basic key

where

  • grammarPath is either a symbolic name for the grammar or a grammar ID returned from a previous request. If the path starts with /grammar:userName (the name of another user), then grammarPath is the symbolic name of a public grammar from that other user;
  • Extension is one of .abnf, .grxml or .gsl, corresponding to the above formats, respectively. Upper-cased extensions are recognized as well. If omitted, the grammar format defaults to application/srgs.

Here are a few more details. If grammarPath ends with a recognized extension, a grammar fetch is first attempted from the grammar database using grammarPath written in full. If this fails, a second attempt is made with the extension removed. In both cases, the extension drives the generated format. If grammarPath does not end with a recognized extension, it is taken verbatim, and the default format applies.
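
The lookup order described above can be modelled as follows. This Python sketch only illustrates the rules as stated in this section; it is not the server's code, and the function name resolve is ours.

```python
FORMATS = {
    ".abnf": "application/srgs",
    ".grxml": "application/srgs+xml",
    ".gsl": "application/gsl",
}
DEFAULT_FORMAT = "application/srgs"


def resolve(grammar_path):
    """Return (names to try, in order; media type of the generated grammar)."""
    for ext, media_type in FORMATS.items():
        # Upper-cased extensions are recognized as well.
        if grammar_path.endswith((ext, ext.upper())):
            # First attempt: the path written in full; second: extension removed.
            return [grammar_path, grammar_path[:-len(ext)]], media_type
    # No recognized extension: the path is taken verbatim, default format applies.
    return [grammar_path], DEFAULT_FORMAT


candidates, media_type = resolve("en/names.grxml")
```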

To cope with ASR inner workings, there is a special provision for the GET request: in case of missing authentication, the server indirectly retrieves user information from sessionId.

Response

Upon successful completion of the request, the HTTP response document is the requested grammar document in the requested format.

Sentence Interpretation

Textual sentences can be interpreted directly on the server.

Request

The format of the HTTP request is:

  POST /interpretation[:userName]/sessionId/grammarPath HTTP/1.0
  Content-Length: (document size in bytes)

  sentence=sentence

For convenience, grammarPath may be the grammar ID returned by a previous grammar instantiation. Alternatively, the request document may include a context=JSONcontext parameter.
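
The request document could be built as in this Python sketch. The helper name interpretation_body is ours, and we assume the body is form-encoded as for grammar instantiation.

```python
import json
from urllib.parse import urlencode


def interpretation_body(sentence, context=None):
    """Build the body of a sentence interpretation request (illustrative)."""
    fields = {"sentence": sentence}
    if context is not None:
        # Optional instantiation context, JSON-encoded
        fields["context"] = json.dumps(context, separators=(",", ":"))
    return urlencode(fields)


body = interpretation_body("I'd like to speak with John")
body_with_context = interpretation_body("call john", {"contactName": "john"})
```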

Response

Upon successful completion of the request, the body of the HTTP response is an XML document giving the JSON encoding of the semantic interpretation:

  <?xml version='1.0' encoding='UTF-8'?>
  <interpretation><![CDATA[ [{"slotName": slotValue,...}, ...] ]]></interpretation>

The returned document is always encoded in UTF-8.

To obtain the semantic interpretation directly in JSON format (i.e. not wrapped in an XML document), add the responseFormat HTTP parameter set to json:

  POST /interpretation[:userName]/sessionId/grammarPath HTTP/1.0
  Content-Length: (document size in bytes)

  responseFormat=json&sentence=sentence

The response, again in UTF-8, will have the following form:

  {"interpretation":[{"slotName": slotValue,...}, ...]}
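
Both response forms can be decoded with the Python standard library, as sketched below. The helper names are ours, and the sample documents reuse the sentence from "A Complete Example".

```python
import json
import xml.etree.ElementTree as ET


def parse_xml_interpretation(xml_bytes):
    """Decode the JSON list wrapped in the <interpretation> element."""
    root = ET.fromstring(xml_bytes)
    return json.loads(root.text)  # the CDATA content is exposed as element text


def parse_json_interpretation(json_text):
    """Decode a response obtained with responseFormat=json."""
    return json.loads(json_text)["interpretation"]


sample_xml = (
    b"<?xml version='1.0' encoding='UTF-8'?>\n"
    b"<interpretation><![CDATA[ [\"i'd like to speak with john\"] ]]></interpretation>"
)
sample_json = '{"interpretation":["i\'d like to speak with john"]}'

from_xml = parse_xml_interpretation(sample_xml)
from_json = parse_json_interpretation(sample_json)
```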

Freeing a Session

A special request deletes the session and frees related resources. Each account has a maximum number of simultaneously active sessions. If the user forgets to explicitly free a session, that session gets automatically freed if it has been inactive for a few minutes.

Request

The format of the HTTP request is:

  DELETE /session/sessionId HTTP/1.0
  Authorization: Basic key

Response

The HTTP response is the following XML document:

  <status code="success" />
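
A client could prepare the request and check the returned status as in the Python sketch below. The helper names are ours. The request object is only constructed, not sent; urllib.request.urlopen would perform the call.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET

SERVER = "http://www.grammarserver.com:8082"


def build_free_session_request(user_name, password, session_id):
    """Prepare (without sending) the DELETE request freeing a session."""
    key = base64.b64encode(f"{user_name}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{SERVER}/session/{session_id}",
        headers={"Authorization": f"Basic {key}"},
        method="DELETE",
    )


def is_success(xml_text):
    """Tell whether a <status> response document reports success."""
    root = ET.fromstring(xml_text)
    return root.tag == "status" and root.get("code") == "success"


request = build_free_session_request("userA", "test", "A194FA3CA3DEB1BAB884")
```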

A Complete Example

This section shows how the API requests are typically used and in which order.

We will assume that the account userA exists on NuGram Server. The password for the account is test.

Also, we will use the curl program to send HTTP requests to the server from a Unix shell (Cygwin can be used as well on Windows). The most relevant command-line options are:

  • -X methodname
    to send an HTTP methodname request;
  • -u username:password
    to send basic authentication credentials;
  • --data content
    to send content in the body of the request.

Step 1 - Publishing a grammar

In order to be used by NuGram Server, a grammar must first be published under a logical name, called the grammar path. The name can be the same as the grammar file name, but it does not need to be.

Suppose the file /tmp/testGrammar.abnf contains the following dynamic ABNF grammar:

  #ABNF 1.0 ISO-8859-1;

  mode voice;
  tag-format <semantics/1.0>;
  language @string callerLanguage;

  root $mainRule;

  public $mainRule =
    i'd like to speak with @word contactName
  ;

To publish this dynamic grammar under the grammar path myFirstGrammar, we issue the following command:

  shell> curl -X PUT -u userA:test --data "`cat /tmp/testGrammar.abnf`" \
              http://grammarserver.com:8082/grammar/myFirstGrammar
  <grammar id="283FEBC6ED627241909145F128A56F1E9FC479DF" />
  shell>

The result is an XML document giving the unique ID of the grammar. We can use this ID in subsequent requests, or the supplied grammar path (myFirstGrammar). They are strictly equivalent, but the grammar path is of course preferred for readability and code maintenance reasons.

Now that the grammar has been published on NuGram Server, we can start using it in our application. Note that this step is usually done at application deployment time, not at application run time.

Step 2 - Establishing a session

The next step is to establish a session:

  shell> curl -X POST -u userA:test http://grammarserver.com:8082/session
  <session id="A194FA3CA3DEB1BAB884" />
  shell>

The id attribute in the response (A194FA3CA3DEB1BAB884) will be used for instantiating the grammar, so it must be extracted by the client application and stored for later use.

Step 3 - Instantiating the grammar template

The grammar published in step 1 cannot be used directly by an ASR engine as it contains some extensions to the ABNF format. It must be instantiated with some dynamic data provided by the client application.

In our example, we need to instantiate the dynamic grammar with two variables: callerLanguage and contactName. To do this, we construct a JSON object with two properties, one for each variable. We then issue a POST request to the grammar with the context HTTP parameter bound to this object:

  shell> curl -X POST -u userA:test --data 'context={"callerLanguage":"en_US", "contactName":"john"}' \
              http://grammarserver.com:8082/grammar/A194FA3CA3DEB1BAB884/myFirstGrammar
  <grammar id="DC501A190E085F8EAF3B699EDD7082542ABB2237"
   grammarUrl="http://grammarserver.com:8082/grammar:userA/A194FA3CA3DEB1BAB884/DC501A190E085F8EAF3B699EDD7082542ABB2237"
   interpreterUrl="http://grammarserver.com:8082/interpretation:userA/A194FA3CA3DEB1BAB884/DC501A190E085F8EAF3B699EDD7082542ABB2237" />
  shell>

The result is an XML document whose root element contains three attributes:

  • id - the grammar ID of the newly generated grammar;
  • grammarUrl - the URL to use to fetch the generated grammar;
  • interpreterUrl - the URL to use to interpret textual sentences using the generated grammar.

The value of the grammarUrl attribute is typically passed to the ASR engine via a VoiceXML <grammar> element. The ASR engine then sends an HTTP GET request equivalent to the following command:

  shell> curl -X GET -u userA:test \
              http://grammarserver.com:8082/grammar:userA/A194FA3CA3DEB1BAB884/DC501A190E085F8EAF3B699EDD7082542ABB2237
  #ABNF 1.0 ISO-8859-1;

  language en_US;
  mode voice;
  tag-format <semantics/1.0>;
  base <http://grammarserver.com:8082/grammar:userA/A194FA3CA3DEB1BAB884/testGrammar>;

  root $mainRule;

  public $mainRule =
    i'd like to speak with john
  ;
  shell>

If we prefer to fetch the XML form of the grammar, we simply add the .grxml extension to the grammar URL:

  shell> curl -X GET -u userA:test \
            http://grammarserver.com:8082/grammar:userA/A194FA3CA3DEB1BAB884/DC501A190E085F8EAF3B699EDD7082542ABB2237.grxml
  <?xml version="1.0" encoding="ISO-8859-1"?>

  <grammar version="1.0"
           xmlns="http://www.w3.org/2001/06/grammar"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.w3.org/2001/06/grammar http://www.w3.org/TR/speech-grammar/grammar.xsd"
           xml:lang="en_US"
           xml:base="http://grammarserver.com:8082/grammar:userA/A194FA3CA3DEB1BAB884/testGrammar"
           mode="voice"
           tag-format="semantics/1.0"
           root="mainRule">

    <rule id="mainRule" scope="public">
      <item>i'd like to speak with john</item>
    </rule>

  </grammar>
  shell>

Step 4 - Interpreting a textual sentence

To parse a textual sentence using a dynamically generated grammar, the value of the interpreterUrl attribute (as obtained in the previous step) must be used, together with a sentence HTTP parameter bound to the sentence to interpret:

  shell> curl -X POST -u userA:test --data "sentence=I'd like to speak with John" \
            http://grammarserver.com:8082/interpretation/A194FA3CA3DEB1BAB884/DC501A190E085F8EAF3B699EDD7082542ABB2237
  <?xml version='1.0' encoding='UTF-8'?>
  <interpretation><![CDATA[ ["i'd like to speak with john"] ]]></interpretation>
  shell>

The result is an XML element interpretation whose content is a JSON list, with one element for each possible interpretation of the sentence. This list usually contains a single element, except when the sentence can be parsed in many different ways by the grammar. (In that case, the grammar is said to be ambiguous.)

Step 5 - Freeing the session

To release a session, the HTTP DELETE method is used in the following way:

  shell> curl -X DELETE -u userA:test http://grammarserver.com:8082/session/A194FA3CA3DEB1BAB884 
  <status code="success" />
  shell>