The Common Gateway Interface (CGI), originally developed as part
of the NCSA HTTP server, is an old standard for interfacing
external applications with HTTP servers that still enjoys
considerable use. It was created so that data can be generated
dynamically in response to HTTP requests and returned to the
user's browser. Plain HTML documents are typically static,
while a CGI program creates the response data dynamically.
However, since CGI was first developed, several faster and more
efficient means of creating dynamic web pages have become
available. Read more about such replacements in
Creating Dynamic Web Pages,
Embedded Server Pages and
Using PHP.
Mbedthis Appweb supports CGI so that existing applications
written to the CGI interface continue to work. Appweb has a
fully featured CGI handler that removes much of the pain of
configuring CGI.
Configuring CGI Programs
CGI programs may be configured and invoked in two primary ways:
- By URL prefix
- By URL extension
When invoked by URL prefix, the CGI programs and scripts are
stored in special CGI directories (for example cgi-bin). When invoked by URL extension, the CGI
programs may be stored anywhere in the web directory. For
security, it is usually best to store all CGI programs and
scripts outside the directory containing the web content.
Consequently, invoking CGI programs by extension should only be
used in combination with a URL prefix that allows the CGI
directory to be specified.
Appweb nominates a directory as a CGI directory via
the ScriptAlias
configuration file directive. For example:
ScriptAlias /cgi-bin/ $SERVER_ROOT/web/cgi-bin/
When a browser requests a URL that includes the "/cgi-bin/"
prefix, the script named immediately after "/cgi-bin/" will be
run. For example, the following URL will cause the testCgi
program to be run:
http://www.mbedthis.com/cgi-bin/testCgi
To configure Appweb to select CGI programs and scripts by URL
extension, use the AddHandler configuration file directive. For
example:
AddHandler cgiHandler .myExt
This configures Appweb to process URLs that contain the .myExt
extension via the CGI handler. To determine which program to run,
the Appweb CGI handler looks up the ".myExt" extension in the
MIME types file "mime.types", which maps each extension to a
MIME type. For example:
application/x-appweb-perl myExt
This definition maps ".myExt" to the Perl MIME type. This
MIME type must then be mapped to a program via the Action
directive. For example:
Action application/x-appweb-perl /usr/bin/perl
This will cause /usr/bin/perl to be run to process the request.
Output from perl is captured by the CGI handler and then returned
to the user's browser.
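The two lookups described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Appweb's implementation: the dictionaries are hypothetical stand-ins for the ScriptAlias, mime.types and Action settings, the function names are invented for this example, and /var/appweb is an assumed value for $SERVER_ROOT.

```python
import posixpath

# Hypothetical stand-ins for the configuration shown above. A real server
# would read these from its configuration file and from mime.types.
# "/var/appweb" is an assumed value for $SERVER_ROOT.
SCRIPT_ALIASES = {"/cgi-bin/": "/var/appweb/web/cgi-bin/"}
MIME_TYPES = {"myExt": "application/x-appweb-perl"}
ACTIONS = {"application/x-appweb-perl": "/usr/bin/perl"}


def script_for_prefix(url_path):
    """Map a URL path to an on-disk script using a ScriptAlias-style prefix."""
    for prefix, directory in SCRIPT_ALIASES.items():
        if url_path.startswith(prefix):
            # The script name is the first path segment after the prefix.
            name = url_path[len(prefix):].split("/", 1)[0]
            if name:
                return posixpath.join(directory, name)
    return None


def program_for_extension(url_path):
    """Map a URL extension to a MIME type, then to the program to run."""
    _, ext = posixpath.splitext(url_path)
    mime = MIME_TYPES.get(ext.lstrip("."))
    return ACTIONS.get(mime)
```

With these assumed settings, script_for_prefix("/cgi-bin/testCgi") resolves to a file inside the CGI directory, and program_for_extension("/scripts/report.myExt") selects /usr/bin/perl. A URL that matches neither mapping yields None.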
Invoking CGI Programs
When a CGI program is run, the Appweb CGI handler communicates
request information to the CGI program via environment variables
and, in some cases, via the command line. The command line is set
to the name of the CGI program, the name of the CGI script if it
differs from the program name, and the CGI query string. The
query string is the portion of the URL after the first "?"
character, with special characters de-escaped.
CGI Command Line
The command line is set differently depending on how the CGI
program is invoked. There are four possible scenarios:
- The program is invoked directly via the request URL.
- The program is invoked indirectly because the CGI script
contains a Bang path directive.
- The program is specified via an ActionProgram directive in
the Appweb configuration file.
- On Windows, the program is a Windows batch file.
See the tables below for how the command line arguments are
defined in each case:
Programs Invoked Directly via the Request URL
argv[0] | Program name immediately after the CGI URL prefix (e.g. after /cgi-bin/).
argv[1..N] | Each argument is set to a portion of the QUERY_STRING, which is split at "+" characters after de-escaping the query.
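As a rough illustration of the direct case, the following Python sketch builds an argument list the way the table describes: de-escape the query, then split it at "+". The function names are invented for this example, and treating an empty query as "no extra arguments" is an assumption rather than documented Appweb behavior.

```python
from urllib.parse import unquote


def query_to_args(query_string):
    """De-escape %xx sequences, then split the query at "+" characters."""
    if not query_string:
        # Assumption: an empty query contributes no arguments.
        return []
    # unquote() decodes %xx sequences but leaves "+" untouched.
    return unquote(query_string).split("+")


def direct_argv(program, query_string):
    """argv[0] is the program name; argv[1..N] come from the query string."""
    return [program] + query_to_args(query_string)
```

For the query string var1=a+a&var2=b%20b&var3=c, direct_argv("cgiProgram", ...) gives three entries: cgiProgram, var1=a, and a&var2=b b&var3=c. Note that "&" plays no part in splitting the command line.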
Programs Invoked Indirectly with Bang Directive
If the CGI program or script specified in the URL contains a
"#!/pathToProgram" directive on the first line, that path is
interpreted as the path to the real CGI program to run. The
script name is then passed on the command line.
argv[0] | Program name defined in the first line of the CGI script after the "#!" characters.
argv[1] | The name of the CGI script originally specified in the URL.
argv[2..N] | Each argument is set to a portion of the QUERY_STRING, which is split at "+" characters after de-escaping the query.
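The Bang directive case can be sketched as follows. This is a simplified illustration with invented function names, not Appweb's code. It keeps only the program path and ignores any interpreter options that follow it on the "#!" line, which is a simplification made for this example.

```python
def interpreter_from_first_line(first_line):
    """Return the program path that follows "#!" on the first line, or None."""
    if not first_line.startswith("#!"):
        return None
    words = first_line[2:].split()
    # Simplification: only the program path is kept; any interpreter
    # options after it are ignored in this sketch.
    return words[0] if words else None


def bang_argv(first_line, script_name, query_args):
    """Build argv as in the table above: program, script name, query args."""
    program = interpreter_from_first_line(first_line)
    if program is None:
        # No Bang directive: the script itself is the program (direct case).
        return [script_name] + list(query_args)
    return [program, script_name] + list(query_args)
```

Given a first line of "#!/usr/bin/perl", the script name moves to argv[1] and the query arguments start at argv[2].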
Programs Specified via an ActionProgram Directive
argv[0] | Program name specified in the ActionProgram directive in the Appweb configuration file.
argv[1] | The name of the CGI script originally specified in the URL.
argv[2..N] | Each argument is set to a portion of the QUERY_STRING, which is split at "+" characters after de-escaping the query.
Windows Batch Commands
argv[0] | Set to "cmd.exe"
argv[1] | /Q
argv[2] | /C
argv[3] | Command
The "Command" is a quoted string set to the name of the CGI
script originally specified in the URL followed by the Query
String split at "+" characters. The entire Command string is
escaped so that dangerous characters are preceded by "^" to
prevent security attacks.
CGI Environment Variables
CGI uses environment variables to pass additional parameters to
your program. The following environment variables are defined:
Variable | Description
AUTH_TYPE | Set to the authentication scheme given in the HTTP Authorization header. Usually "basic" or "digest".
CONTENT_LENGTH | Set to the length of any associated posted content.
DOCUMENT_ROOT | Set to the path of the web documents, as defined by the DocumentRoot directive in the Appweb configuration file.
GATEWAY_INTERFACE | Set to "CGI/1.1".
HTTP_ACCEPT | Set to the value of the HTTP Accept header. This specifies which formats are acceptable or preferable to the client.
HTTP_CONNECTION | Set to the value of the HTTP Connection header. This specifies whether the connection should be kept open when the request is complete (for example, "Keep-Alive").
HTTP_HOST | Set to the value of the HTTP Host header. This specifies the name of the server to process the request. When using named virtual hosting, requests to different servers (hosts) may be processed by a single HTTP server on a single IP address. The HTTP_HOST field permits the server to determine which virtual host should process the request.
HTTP_USER_AGENT | Set to the value of the HTTP User-Agent header.
PATH_INFO | Set to the URL portion (if any) after the SCRIPT_NAME.
PATH_TRANSLATED | The physical on-disk path name corresponding to PATH_INFO.
QUERY_STRING | Set to the URL portion that follows the first "?" in the URL. The QUERY_STRING is URL encoded in the standard URL format: spaces are changed to "+" and all URL special characters are encoded with %xx hexadecimal encoding. Most major scripting languages provide routines to assist in decoding QUERY_STRINGs.
REMOTE_ADDR | Set to the IP address of the requesting client.
REMOTE_HOST | Set to the IP address of the requesting client (same as REMOTE_ADDR).
REMOTE_USER | Set to the name of the authenticated user.
REQUEST_METHOD | Set to the HTTP method used by the request. Valid values are: "GET", "HEAD", "OPTIONS", "POST", or "TRACE". "PUT" and "DELETE" are not supported.
REQUEST_URI | The complete request URL after the host name portion. It always begins with a leading "/".
SCRIPT_NAME | The name of the CGI script or program to run. If an ActionProgram directive specifies a CGI interpreter, SCRIPT_NAME is set to the name of the script to interpret.
SERVER_ADDR | The IP address of the server or virtual host responding to the request.
SERVER_HOST | The name of the default server or virtual host serving the request.
SERVER_NAME | Same as SERVER_HOST.
SERVER_PORT | The HTTP port of the server or virtual host serving the request.
SERVER_PROTOCOL | Set to "HTTP/1.0" or "HTTP/1.1" depending on the protocol used by the client.
SERVER_SOFTWARE | Set to "Mbedthis Appweb/VERSION".
SERVER_URL | Same as SERVER_NAME.
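From the CGI program's side, these variables are ordinary process environment variables. The Python sketch below shows one way a script might read a few of them defensively. The helper names are invented for this example, and the choice of an empty string or zero as the fallback for a missing variable is an assumption.

```python
import os


def request_summary(environ=os.environ):
    """Collect a few of the variables listed above, with empty defaults."""
    names = ("REQUEST_URI", "SCRIPT_NAME", "PATH_INFO",
             "QUERY_STRING", "REMOTE_ADDR", "SERVER_PORT")
    return {name: environ.get(name, "") for name in names}


def content_length(environ=os.environ):
    """Return CONTENT_LENGTH as an integer, or 0 if it is absent or invalid."""
    value = environ.get("CONTENT_LENGTH", "")
    return int(value) if value.isdigit() else 0
```

Passing the environment as a parameter keeps the helpers easy to exercise outside a web server, for example request_summary({"PATH_INFO": "/extra/Path"}).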
Example
Consider the following URL which will run
the Perl interpreter to execute the pricelists.pl script.
http://hostname/cgi-bin/myScript/products/pricelists.pl?id=23&payment=creditCard
This URL will cause the following environment settings:
Variable | Value
PATH_INFO | /products/pricelists.pl
PATH_TRANSLATED | /var/appweb/web/products/pricelists.pl (where /var/appweb/web is the DocumentRoot)
QUERY_STRING | id=23&payment=creditCard
REQUEST_URI | /cgi-bin/myScript/products/pricelists.pl?id=23&payment=creditCard
SCRIPT_NAME | myScript
ARGV[0] | /usr/bin/perl
ARGV[1] | pricelists.pl
ARGV[2] | id=23&payment=creditCard
The URL below demonstrates some rather cryptic URL encoding.
The important thing to remember is that command line arguments
are delimited by "+". The hex encoding %20 represents the space
character. Once passed to the CGI program, the convention is for
CGI variables to be delimited by "&".
http://hostname/cgi-bin/cgiProgram/extra/Path?var1=a+a&var2=b%20b&var3=c
This URL will cause the following environment settings:
Variable | Value
PATH_INFO | /extra/Path
PATH_TRANSLATED | /var/appweb/web/extra/Path
QUERY_STRING | var1=a+a&var2=b%20b&var3=c
REQUEST_URI | /cgi-bin/cgiProgram/extra/Path?var1=a+a&var2=b%20b&var3=c
SCRIPT_NAME | cgiProgram
ARGV[0] | cgiProgram
ARGV[1] | var1=a
ARGV[2] | a&var2=b b&var3=c
URL Encoding
When a URL is sent via HTTP, certain special characters must be
escaped so the URL can be processed unambiguously by the HTTP
server. To escape the special characters, an HTTP client should
convert them to their %hex equivalent. Form and query variables
are separated by "&". For example, a=1&b=2 defines two form
variables "a" and "b" with values "1" and "2" respectively.
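Most languages ship a decoder for this format. As a small illustration, Python's standard urllib.parse module can split a query string at "&" and decode each name and value. The wrapper name form_variables is invented for this example, and keeping only the first value of a repeated variable is a simplification.

```python
from urllib.parse import parse_qs, quote


def form_variables(query_string):
    """Decode a query string into a dictionary of form variables."""
    # parse_qs splits at "&", then decodes "+" and %xx in names and values.
    # Each name maps to a list of values; this sketch keeps the first one.
    return {name: values[0] for name, values in parse_qs(query_string).items()}


# quote() performs the escaping step in the other direction.
escaped = quote("b b")  # "b%20b"
```

Here form_variables("a=1&b=2") returns {"a": "1", "b": "2"}. Note the contrast with the command line: for form variables "+" decodes to a space, whereas the command line arguments are split at "+".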
CGI Programming
A CGI program can return almost any content type to the
client's browser: plain HTML, audio, video or any other
format. CGI programs can also control the user's browser and
redirect it to another URL. To do this, CGI programs return
pseudo-HTTP headers that are interpreted by Appweb before passing
the data on to the client.
Appweb understands the following CGI headers that can be output
by the CGI program. They are case-insensitive.
Header | Description
Content-type | Nominates the content MIME type. Typically "text/html". See the mime.types file for a list of possible MIME types.
Status | Set to an HTTP response code. Success is 200. Server error is 500.
Location | Set to the URI of a new document to which to redirect the client's browser.
ANY | Any other header is passed back to the client.
For example (a blank line must separate the headers from the content):
Content-type: text/html

<HTML><HEAD><TITLE>Sample CGI Output</TITLE></HEAD>
<BODY>
<H1>Hello World</H1>
</BODY></HTML>
To redirect the browser to a new location:
Location: /newUrl.html
To signify an error in the server:
Status: 500
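A CGI program written in Python could produce these responses with a small helper such as the one sketched below. The function name cgi_response and its parameters are invented for this example. The one fixed rule it relies on is that a blank line ends the header block.

```python
import sys


def cgi_response(body="", content_type="text/html", status=None, location=None):
    """Build CGI output: pseudo-HTTP headers, a blank line, then the body."""
    headers = []
    if status is not None:
        headers.append("Status: %d" % status)
    if location is not None:
        # A redirect needs no content type in this sketch.
        headers.append("Location: " + location)
    else:
        headers.append("Content-type: " + content_type)
    # A blank line ends the header block; the body, if any, follows it.
    return "\n".join(headers) + "\n\n" + body


if __name__ == "__main__":
    # When run as a CGI program, write the response to standard output.
    sys.stdout.write(cgi_response("<H1>Hello World</H1>\n"))
```

For a redirect, cgi_response(location="/newUrl.html") emits only the Location header. For an error, cgi_response("Something failed\n", status=500) emits the Status header followed by the content type and body.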
Hints and Tips
If you have special data or environment variables that must be
passed to your CGI program, you can wrap it with a script that
defines that environment before invoking the program.
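A wrapper of this kind can be very small. The Python sketch below is one possible shape, with everything specific to it being hypothetical: the path in REAL_PROGRAM, the APP_CONFIG variable and the function names are placeholders for illustration. It only hands over control when GATEWAY_INTERFACE is present, that is, when it is actually being run as a CGI program.

```python
import os
import sys

# Hypothetical values for illustration only.
REAL_PROGRAM = "/var/appweb/web/cgi-bin/realProgram"
EXTRA_ENV = {"APP_CONFIG": "/etc/myapp.conf"}


def wrapped_environment(base, extra):
    """Return a copy of the base environment with the extra settings added."""
    env = dict(base)
    env.update(extra)
    return env


def main():
    env = wrapped_environment(os.environ, EXTRA_ENV)
    # Replace this process with the real program so that its standard
    # input and output stay connected to the web server.
    os.execve(REAL_PROGRAM, [REAL_PROGRAM] + sys.argv[1:], env)


if __name__ == "__main__" and "GATEWAY_INTERFACE" in os.environ:
    main()
```

Using os.execve rather than starting a child process means the wrapper adds no extra process and the real program sees the request body and the CGI variables unchanged.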
Other Resources
The following URLs may be helpful in further reading about CGI: