by Matthew Flood
January 14, 2007
This page defines CGI, and shows a complete example of a web-server invoking a CGI application.
Please send any comments or questions that would improve this page to matt @ rudeserver.com
The acronym CGI stands for Common Gateway Interface. Unfortunately, that revelation is pretty useless. Originally, CGI was thought of as a way to invoke a command-line program through a web-server instead of invoking it directly, hence the term Gateway. The web-server acts as the go-between for a web-client and a server-side application. Years later, the use of CGI applications has become transparent in that web-browsers do not need to know that they are invoking a command-line program. The results web-clients get from a CGI application are the same as they would be for any static page. So, the term Gateway is not helpful unless you use the old mindset. Nowadays, CGI is a fuzzy term that is basically synonymous with the term Dynamic.
You can get closer to the core meaning of CGI if you interpret it like this:
Common = An industry standard
CGI is the industry's preferred way for a WebServer to make WebClient/Browser data available to an executable program.
Or, in layman's terms:
CGI is when a web-server runs a local program to generate content.
When a web server receives a request, it can send back one of three different things:
1. The contents of a file on the system, byte for byte
2. Content that it generates on the fly, such as a File Not Found response
3. The output of an executable program that the webserver invokes (This is always CGI)
A CGI application differs from an ordinary command-line program in two ways.
First, a CGI application is supplied with information not normally available to CLI (Command Line Interface) applications. This information includes the IP address and port of the connecting client, cookies and other information supplied by the client to the web server, information about the web server itself, and information necessary for the CGI application to access any form data that was sent by the web-client. The web server supplies this information to the CGI application via environment variables.
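As a rough illustration of the first point, the sketch below reads a handful of these variables with the standard C function getenv. The variable names (REQUEST_METHOD, QUERY_STRING, REMOTE_ADDR, SERVER_NAME, HTTP_COOKIE) are taken from the usual CGI convention rather than from this page, and the helper names env_or and print_cgi_environment are made up for the example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return the value of an environment variable, or `fallback`
 * when the variable is not set. (Hypothetical helper.) */
const char *env_or(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return value != NULL ? value : fallback;
}

/* Print a few conventional CGI variables, one per line.
 * Which variables are actually set depends on the web server. */
void print_cgi_environment(FILE *out)
{
    static const char *names[] = {
        "REQUEST_METHOD", "QUERY_STRING", "REMOTE_ADDR",
        "SERVER_NAME", "HTTP_COOKIE"
    };
    size_t i;

    for (i = 0; i < sizeof names / sizeof names[0]; i++)
        fprintf(out, "%s = %s\n", names[i], env_or(names[i], "(not set)"));
}
```

A main function that prints a Content-Type: text/plain header and a blank line, then calls print_cgi_environment(stdout), would show what the web server handed to the program.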
Second, a CGI application is responsible for specifying the kind of content it is outputting. Kinds of content include HTML, plain text, JPEG image, PDF, etc. The CGI application specifies the content type by outputting a Content-Type response header before it sends any actual content. A valid Content-Type header would look like this: Content-Type: text/html
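To make the second point concrete, here is a minimal sketch of a function that writes the header, the blank line that ends the headers, and then the content. The name write_cgi_response and its parameters are assumptions made for this example, not part of any CGI library.

```c
#include <stdio.h>

/* Write a CGI response to `out`: the Content-Type header, a blank
 * line, then the body. Returns 0 on success, -1 on a write error.
 * (Hypothetical helper.) */
int write_cgi_response(FILE *out, const char *content_type, const char *body)
{
    /* The header line and the blank line must come before any content. */
    if (fprintf(out, "Content-Type: %s\n\n", content_type) < 0)
        return -1;
    if (fputs(body, out) == EOF)
        return -1;
    return 0;
}
```

For an HTML page, a program might call write_cgi_response(stdout, "text/html", "<p>hello</p>") from main.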
Everything else is the same. There are no special compiler flags that need to be set in order to turn an application into a CGI application. As such, CGI is simply an agreement or contract that says, "Hey developer, here is how you get the form data and information about the web client, and here is what I need you to include in your output."
Basically, as long as an application spits out a Content-Type header followed by a blank line, it can be run as a CGI application.
The following walkthrough represents the invocation of a CGI application. The main players involved are a web browser and a web server. Also involved is the executable program hello_world.exe (the source code of which is displayed later).
Each step is labeled with the player that performs it: WEB BROWSER/CLIENT, WEB SERVER, or hello_world.exe.
1. WEB SERVER: The webserver at rudeserver.com is running, and is waiting for a connection on port 80.
2. WEB BROWSER/CLIENT: The user types in the following url: http://rudeserver.com/cgi-bin/hello_world.exe?color=red
3. WEB BROWSER/CLIENT: The browser parses the url and decides that it needs to connect to rudeserver.com on port 80.
4. WEB SERVER: The webserver accepts the connection from the web browser, and waits for the web browser to issue an HTTP request.
5. WEB BROWSER/CLIENT: Having connected successfully, the web browser generates and sends the following HTTP request to the server:
6. WEB SERVER: The webserver receives the request and parses it based on the HTTP specification. It examines the path that was supplied: /cgi-bin/hello_world.exe?color=red. The webserver needs to convert the path into an actual meaningful local path. It ignores the question mark and everything after it. It knows that /cgi-bin corresponds to the local directory /var/www/cgi-bin, so the resulting path to the resource is /var/www/cgi-bin/hello_world.exe. The webserver also knows that anything located in /var/www/cgi-bin/ is supposed to be a CGI program that needs to be executed. Before invoking hello_world.exe, the webserver sets the following environment variables: REQUEST_METHOD = GET. The webserver then invokes hello_world.exe.
7. hello_world.exe: hello_world.exe is invoked.
8. WEB SERVER: The webserver captures the output of the program. It makes sure that the program exits successfully, and ensures that the output it received includes a Content-Type HTTP response header. It then completes the HTTP response header, and sends it and the rest of the program's output to the web browser:
9. WEB BROWSER/CLIENT: The web browser receives the HTTP response and closes its connection. Based on the Content-Type header, the browser recognizes that the content of the response is a plain-text file. It displays the plain text content on the screen: hello world!
10. WEB SERVER: The webserver logs the request, and continues processing other requests.
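The path handling that the web server performs in the walkthrough above can be sketched in C as follows. This is only an illustration: the function name map_request_target is hypothetical, and the rule that the URL prefix /cgi-bin maps to the local directory /var/www/cgi-bin is inferred from the example paths on this page. A real server would also need to reject paths containing ".." and similar tricks, which this sketch leaves out.

```c
#include <stdio.h>
#include <string.h>

/* Split a request target such as "/cgi-bin/hello_world.exe?color=red"
 * into a local file path and a query string.
 * `url_prefix` and `local_dir` stand in for server configuration
 * (assumed here; e.g. "/cgi-bin" and "/var/www/cgi-bin").
 * Returns 0 on success, -1 if the target is not under the prefix
 * or an output buffer is too small. */
int map_request_target(const char *target,
                       const char *url_prefix, const char *local_dir,
                       char *path, size_t path_size,
                       char *query, size_t query_size)
{
    size_t prefix_len = strlen(url_prefix);
    const char *rest;
    const char *qmark;
    size_t rest_len;
    int n;

    if (strncmp(target, url_prefix, prefix_len) != 0)
        return -1;                       /* not a CGI request */
    rest = target + prefix_len;          /* "/hello_world.exe?color=red" */
    if (rest[0] != '/')
        return -1;                       /* e.g. "/cgi-binx/..." */

    /* Everything from the question mark on is not part of the path. */
    qmark = strchr(rest, '?');
    rest_len = qmark ? (size_t)(qmark - rest) : strlen(rest);

    n = snprintf(path, path_size, "%s%.*s", local_dir, (int)rest_len, rest);
    if (n < 0 || (size_t)n >= path_size)
        return -1;

    /* Keep the part after the question mark for the CGI program. */
    n = snprintf(query, query_size, "%s", qmark ? qmark + 1 : "");
    if (n < 0 || (size_t)n >= query_size)
        return -1;
    return 0;
}
```

With the example request, the path comes out as /var/www/cgi-bin/hello_world.exe and the query string as color=red.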
In C and C++, the source code for hello_world.exe would look something like
this:
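    #include <stdio.h>

    int main(void)
    {
        /* The Content-Type header says what kind of content follows. */
        printf("Content-Type: text/plain\n");
        /* A blank line marks the end of the headers. */
        printf("\n");
        /* Everything after the blank line is the content itself. */
        printf("hello world!");
        return 0;
    }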
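The example URL carries ?color=red, which the program above never looks at. Assuming the web server passes everything after the question mark in the QUERY_STRING environment variable (the usual CGI convention, not something shown on this page), a variant could pick the value out roughly like this. The helper names query_param and write_hello are hypothetical, and percent-decoding of the value is deliberately left out.

```c
#include <stdio.h>
#include <string.h>

/* Look up `key` in a query string of the form "a=1&b=2" and copy its
 * value into `value` (truncated to fit). Returns 1 if found, else 0. */
int query_param(const char *query, const char *key,
                char *value, size_t value_size)
{
    size_t key_len = strlen(key);
    const char *p = query;

    if (value_size == 0)
        return 0;
    while (p != NULL && *p != '\0') {
        const char *end = strchr(p, '&');
        size_t pair_len = end ? (size_t)(end - p) : strlen(p);

        /* Match "key=" at the start of this name=value pair. */
        if (pair_len > key_len &&
            strncmp(p, key, key_len) == 0 && p[key_len] == '=') {
            size_t len = pair_len - key_len - 1;
            if (len >= value_size)
                len = value_size - 1;
            memcpy(value, p + key_len + 1, len);
            value[len] = '\0';
            return 1;
        }
        p = end ? end + 1 : NULL;
    }
    return 0;
}

/* Write the hello-world response, mentioning the color if one was sent.
 * `query` may be NULL when QUERY_STRING is not set. */
void write_hello(FILE *out, const char *query)
{
    char color[32];

    fprintf(out, "Content-Type: text/plain\n\n");
    if (query != NULL && query_param(query, "color", color, sizeof color))
        fprintf(out, "hello world! (color=%s)", color);
    else
        fprintf(out, "hello world!");
}
```

A main function would then only need to call write_hello(stdout, getenv("QUERY_STRING")), with stdlib.h included for getenv.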