Introduction
NOTEX stands for "Network Oriented Transforms in ECMAScript and XML" and is implemented in the CGI script notex.cgi.
This CGI script enables you to:
- Read and run server-side any JavaScript file located anywhere on the web
- Use this JavaScript file to process XML files located anywhere on the web
- Cache your results to give faster responses to subsequent requests
The project is hosted by Google at
http://code.google.com/p/notex/ and there's a
Facebook NOTEX group too - please join us.
An example
This is best explained using a simple JavaScript file
example.js:
var file_contents = GET('mydata.com/data.xml'); // read a file of XML data
var xml = new XML(file_contents); // turn the string into an E4X data structure
xml.head.title = 'My new title'; // change a part of the XML data structure
write(xml.toXMLString()); // write out the changed version as XML
If this file is located at the URL
myscripts.com/example.js then it may be run by hosting
notex.cgi at your domain "mydomain.com" and running a query like this:
mydomain.com/notex.cgi?app=demo&token=abcd&script=myscripts.com/example.js
Notice that the
notex.cgi CGI script, the
example.js JavaScript file and
data.xml XML file are all located at different domains. The
notex.cgi CGI script acts as a network hub running a JavaScript program to read and process XML files from around the web, then output new XML results. In effect, the NOTEX system transforms XML into new XML via JavaScript programs.
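If you build these query URLs in code, a small helper keeps the parameters readable. The following is a minimal sketch and not part of NOTEX: the function name notexQuery is hypothetical, and it assumes the CGI script decodes percent-escaped values in the usual way, so the "/" in the script parameter is sent as "%2F".

```javascript
// Hypothetical helper (not part of NOTEX): builds a notex.cgi query URL.
function notexQuery(cgiUrl, params) {
  var pairs = Object.keys(params).map(function (name) {
    // Percent-escape each name and value so "/" and "&" cannot break the query string
    return encodeURIComponent(name) + '=' + encodeURIComponent(params[name]);
  });
  return cgiUrl + '?' + pairs.join('&');
}

// The same query as above, with the script URL escaped
var query = notexQuery('mydomain.com/notex.cgi', {
  app: 'demo',
  token: 'abcd',
  script: 'myscripts.com/example.js'
});
```

Extra parameters such as a cache time can be added to the object in the same way.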
Caching
As you would expect, it can take a few seconds to read and run the JavaScript file, read the XML file(s) and output the results, so here's a faster version of the same query:
mydomain.com/notex.cgi?app=demo&token=abcd&script=myscripts.com/example.js&cache=3600
The cache=3600 part of the query tells the CGI script to cache the output for 1 hour (3600 seconds). This works well when the data processed by the script changes infrequently. If you only want to cache the JavaScript file, you can use a query like this:
mydomain.com/notex.cgi?app=demo&token=abcd&script=myscripts.com/example.js&jcache=3600
Here the jcache=3600 part of the query tells the CGI script to cache the JavaScript file myscripts.com/example.js for 1 hour. Only use this option in production with JavaScript files you have already tested, because changes to the file will not be picked up until the cached copy expires.
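To make the idea of a time-limited cache concrete, here is a minimal sketch of the freshness check in plain JavaScript. It is not the notex.cgi implementation: the names cacheStore, cacheKey and cachedFetch are hypothetical, the in-memory object stands in for cache files on disk, and keying entries by an MD5 digest of the URL is an assumption.

```javascript
var crypto = require('crypto');

// Hypothetical in-memory stand-in for cache files kept on disk
var cacheStore = {};

// Derive a cache key from the URL (assumption: an MD5 hex digest of the URL)
function cacheKey(url) {
  return crypto.createHash('md5').update(url).digest('hex');
}

// Return the cached copy if it is younger than maxAgeSeconds, else fetch and store it.
// fetchFn and nowSeconds are passed in so the sketch needs no network or clock.
function cachedFetch(url, maxAgeSeconds, fetchFn, nowSeconds) {
  var key = cacheKey(url);
  var entry = cacheStore[key];
  if (entry && nowSeconds - entry.savedAt < maxAgeSeconds) {
    return entry.content; // still fresh: skip the slow fetch
  }
  var content = fetchFn(url);
  cacheStore[key] = { content: content, savedAt: nowSeconds };
  return content;
}
```

With a limit of 3600, a second request made 10 minutes after the first is served from the cache, while a request made an hour or more later triggers a fresh fetch.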
Conclusions
In this short introduction, we've covered:
- What NOTEX is and how it is implemented in the notex.cgi CGI script
- How the various network resources can be located anywhere on the web
- The way that notex.cgi reads and runs JavaScript files on the server
- Ways to cache the notex.cgi script to make it perform faster
I invite you to try
notex.cgi (or the FastCGI version
notex.fcgi) for yourself by clicking the "Download" tab at the top of this page, and following the (hopefully) detailed instructions.
Enjoy!
Feb 24, 2009: Added relative file paths and URLs in version 2.1.1
Now you can
read("../file.jsx") and
GET("../file.jsx") in your JavaScript code. This makes it much easier to write portable code.
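As an illustration of what a relative path means here, the sketch below resolves "../file.jsx" with the standard URL class in plain JavaScript. It assumes relative paths are resolved against the URL of the running script; the script URL and the function name resolveRelative are hypothetical, and NOTEX itself may resolve paths differently.

```javascript
// Assumption: a relative path is resolved against the URL of the running script
function resolveRelative(relativePath, scriptUrl) {
  return new URL(relativePath, scriptUrl).href;
}

// Hypothetical location of the script that calls GET("../file.jsx")
var scriptUrl = 'http://myscripts.com/apps/demo/services/example.jsx';
var resolved = resolveRelative('../file.jsx', scriptUrl);
// resolved is 'http://myscripts.com/apps/demo/file.jsx'
```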
Feb 22, 2009: Added better error reporting in version 2.0.3
Some of the error reporting in version 2 was a bit cryptic, so version 2.0.3 improves it and cleans up the code. The "app", "token", "script" and "remote_host" settings are now available to your scripts via the
config() function.
Feb 21, 2009: Added NOTEX_STRICT mode in version 2.0.2
Now you can choose to run NOTEX CGI scripts using only the Apache "action" to directly run ".jsx" files in the URL path. This option is called NOTEX_STRICT and closes a security hole where remote hostile scripts could be run if someone knew a valid "app" and "token".
Feb 21, 2009: Major upgrade in version 2.0.0
This new version makes a number of significant changes:
- It's now "app-centric": Replaced "user" with "app" and "users" with "apps"
- The "app" name is taken from a URL of the form ".../apps/name/..."
- Added a
read(file) function to include JavaScript files more easily
- Added a default token name of "token" for when tokens are not required
The new directory structure places "services" inside "apps", for example "apps/demo/services/finance/quote.js".
Feb 19, 2009: Added the "env()" function in version 1.6.5
Now you can get the value of environment variables inside your JavaScript programs using the "env()" function. This enables you to set secret values in your ".htaccess" file using the
SetEnv NAME value format. For example, you can use this technique to read database usernames and passwords in your scripts, or set a home directory path.
Feb 18, 2009: Added extra logging info in version 1.6.4
New improved logging makes it easier to discover which app's script is not terminating. The log event attributes "pid" and "request" enable you to match the log event "Processing" with "Processed" for all successful script runs, and hence detect which scripts are not terminating. Next, I'll make some admin services to do this matching for you.
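Until those admin services exist, the matching can be done by hand. Here is a minimal sketch in plain JavaScript, assuming the log lines have already been parsed into objects with "event", "pid" and "request" fields. The sample events and the function name findUnfinished are hypothetical, and the real log line format is not shown here.

```javascript
// Hypothetical parsed log events (the real log line format may differ)
var events = [
  { event: 'Processing', pid: 101, request: 1 },
  { event: 'Processed',  pid: 101, request: 1 },
  { event: 'Processing', pid: 101, request: 2 },
  { event: 'Processing', pid: 102, request: 1 },
  { event: 'Processed',  pid: 102, request: 1 }
];

// Return every "Processing" event that has no matching "Processed" event
function findUnfinished(events) {
  var open = {};
  events.forEach(function (e) {
    var key = e.pid + ':' + e.request; // pid and request together identify one script run
    if (e.event === 'Processing') {
      open[key] = e;
    } else if (e.event === 'Processed') {
      delete open[key];
    }
  });
  return Object.keys(open).map(function (key) { return open[key]; });
}

var unfinished = findUnfinished(events);
// Here only the run with pid 101 and request 2 never finished
```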
Feb 16, 2009: Added PATH_TRANSLATED in version 1.6.3
Now you can configure your Apache server to run the NOTEX script automatically for files of your chosen extension (but we recommend ".jsx" for executable JavaScript). To do this, just make a
.htaccess file like this:
Options ExecCGI
AddHandler cgi-script .cgi
AddHandler js-notex .jsx
Action js-notex /notex.cgi
You can replace "/notex.cgi" with "/notex.fcgi" if you're using the FastCGI version. It just works!
Feb 15, 2009: Cleaned up the code in version 1.6.2
After the past two weeks of rapid development and regular releases, I thought it would be a good opportunity to clean up the code and make the CGI and FastCGI versions as similar to each other as possible. So now you have cleaner code in version 1.6.2 with extra comments.
Feb 14, 2009: Fixed some global variable bugs in NOTEX FastCGI 1.6.1
Because the FastCGI version was a simple port of the regular CGI version, I had missed some global variable issues. These are now fixed in version 1.6.1 so you should not see any strange behavior caused by remembered state from one request to another. Next I'll write a test suite to catch any regression bugs in future.
Feb 10, 2009: Added "header()" and "md5()" functions in version 1.6
Now you can set HTTP output headers with the "header()" function, giving you more flexibility with your script output. For example, you can redirect with a "-location" header and return a custom status (e.g. 301) with a "-status" header. Call
header('-expires', '+1h') to tell browsers to cache the content for 1 hour.
An "md5()" function lets your scripts hash passwords and similar values with MD5. For example, call
md5('mypassword') to get a 32-character hex string instead.
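To try this outside NOTEX, the sketch below defines a stand-in for "md5()" using Node's crypto module. The stand-in is an assumption about how the built-in behaves (a lowercase hex digest of the text), not the NOTEX code itself.

```javascript
var crypto = require('crypto');

// Stand-in for the NOTEX md5() built-in (assumption: lowercase hex digest of the text)
function md5(text) {
  return crypto.createHash('md5').update(text, 'utf8').digest('hex');
}

var digest = md5('mypassword');
// digest is always 32 hex characters, and the same input always gives the same digest
var sameAgain = (md5('mypassword') === digest);
```

Because the same input always produces the same digest, a script can compare digests rather than the original text.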
Feb 9, 2009: Added "load()" and "save()" functions in version 1.5
The private functions "_read_cache()" and "_write_cache()" are now public as "load()" and "save()" making cache control much more flexible for the NOTEX programmer. At 16 public functions, I hope the namespace pollution isn't too high now.
Feb 8, 2009: Added FastCGI in version 1.4 and added HTTP cookie support
I've tested a nice FastCGI version of the NOTEX script available here:
notex.fcgi (you'll need the extra Perl module CGI::Fast before it will work for you). The only downside is that it won't respond to HTTP "DELETE" methods until CGI.pm query string parsing is fixed.
I also added a new method
cookie(name, value) so you can get and set cookies using the NOTEX script. This makes NOTEX a good candidate for serving AJAX calls from web pages for logged-in users with session cookies.
Feb 6, 2009: Released version 1.3 accepting GET, HEAD, POST, PUT and DELETE
Now you can write NOTEX scripts that respond to all 5 HTTP verbs, as well as using them inside your scripts to request resources. I want to make NOTEX as RESTful as possible. I also changed the "config()" settings, added "http()" and "https()" to query HTTP headers, and added "status()" and "method()" functions so you can easily query the HTTP status of your last call, and the HTTP method being used to call your script.
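A script that answers several verbs usually branches on "method()". The sketch below shows one way to lay that out. So that it runs outside NOTEX, it defines simple stand-ins for "method()", "write()" and "header()"; those stand-ins, the uppercase method names, the response bodies and the "-status" value format are all assumptions made for illustration.

```javascript
// Stand-ins for NOTEX built-ins so the sketch runs outside NOTEX (assumptions)
var requestMethod = 'GET';
var output = [];
var headers = {};
function method() { return requestMethod; }
function write(text) { output.push(text); }
function header(name, value) { headers[name] = value; }

// The script body: answer each HTTP verb differently
function handleRequest() {
  switch (method()) {
    case 'GET':
      write('<item>current value</item>');
      break;
    case 'HEAD':
      break; // headers only, no body
    case 'PUT':
    case 'POST':
      write('<ok>stored</ok>');
      break;
    case 'DELETE':
      write('<ok>deleted</ok>');
      break;
    default:
      header('-status', '405 Method Not Allowed');
  }
}

handleRequest(); // with requestMethod set to 'GET', this writes the <item> element
```

In a real NOTEX script the stand-ins are dropped and the body of handleRequest runs at the top level.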
Feb 4, 2009: Released version 1.2 with GET, HEAD, POST, PUT and DELETE
I've replaced the "read()" function with various HTTP methods such as "GET()", "HEAD()", "POST()", "PUT()" and "DELETE()" so that the script can be used in RESTful applications.
Feb 2, 2009: Released version 1.1 with some fixes
The HTTP headers weren't being cached so I've recoded the caching to include the HTTP headers. I also improved the user agent string and beautified the code just a tiny bit. I'm working on a test suite to avoid regression bugs in future versions.
Feb 1, 2009: Created the notex.com web site
I spent about 5 hours putting together a basic notex.com web site using jQuery's tabs and Xara Extreme 4 to design a cute green logo. The site is also mirrored at notex.info which might become the main domain in the future.
Please note that NOTEX is only tested on Unix and Linux systems (in particular Ubuntu Linux and Mac OS X).
Get the files
To install NOTEX you'll need the
notex.cgi CGI script or
notex.fcgi FastCGI script (the latest version is recommended), the Perl module
JavaScript::SpiderMonkey and the
SpiderMonkey source code. Here are the latest versions that I have:
How to install
First, be sure that the Perl module
Log::Log4perl is installed. For example, run
cpan install Log::Log4perl as the "root" user. Then, again assuming you have "root" access to your machine, make sure the files
JavaScript-SpiderMonkey-0.19.tar.gz and
js-1.7.0.tar.gz are in the same directory, and follow these steps on the command line:
tar xfz JavaScript-SpiderMonkey-0.19.tar.gz
tar xfz js-1.7.0.tar.gz
cd js/src
make -f Makefile.ref
cp *.OBJ/libjs.* /usr/lib
cd ../../JavaScript-SpiderMonkey-0.19
perl Makefile.PL -E4X
make test
make install
You will now have a new
libjs library on your flavor of Unix, and a newly installed Perl module called
JavaScript::SpiderMonkey with E4X support.
Running notex.cgi
The
notex.cgi script relies on a number of Perl modules you can install from CPAN, for example by running the command
cpan as user "root" or trying
perl -MCPAN -eshell instead. Here's a list:
- Time::HiRes - to measure the precise duration of script runs
- Digest::MD5 - to generate filenames for cached data files
- LWP::UserAgent - to request remote files over the network
- CGI - to read query string parameters and remote user details
- CGI::Fast - to run the FastCGI version of the NOTEX script
- JavaScript::SpiderMonkey - the module you have just installed
Finally, you need to configure your Apache web server (or similar) to run CGI scripts. This can be achieved with Apache configuration like this:
AddHandler cgi-script .cgi
<Directory "/path/to/your/web/site/">
Options ExecCGI
</Directory>
...and make sure that your notex.cgi file is executable with a command like:
chmod a+x notex.cgi
Running notex.fcgi
When you need better CGI script performance, FastCGI is a great option. If you're running the FastCGI version of NOTEX, then you'll need to configure your web server. For Apache, use a configuration file like this:
AddType application/x-httpd-fcgi .fcgi
FastCgiServer /home/username/web/notex.fcgi -processes 2
<VirtualHost 1.2.3.4:80>
DocumentRoot /home/username/web
ServerName www.mydomain.com
ErrorLog logs/mydomain.com-error_log
CustomLog logs/mydomain.com-access_log common
</VirtualHost>
As "root", I installed FastCGI on Apache2 with Ubuntu Linux using the commands:
apt-get install libapache2-mod-fastcgi
apache2ctl restart
The examples below are hosted on our
notex.com domain, but you can download and install the NOTEX script and run it on your own domains instead.
Getting stock quotes
eval(read('/apps/notex.jsx'));
notex.usage('Get stock quotes', {symbol: 'The stock symbol to quote'});
// Get the stock symbol parameter
var symbol = param('symbol'); if (!symbol) notex.error('No symbol');
var url = 'http://finance.yahoo.com/q?s=' + symbol;
// Read the stock quote web page as XHTML (and cache it for 60 seconds)
var xhtml = GET('/apps/demo/services/translate/xhtml.php?token=' + notex.token + '&url=' + escape(url), 60);
// Strip the namespace attribute and remove all XHTML entities (e.g. "&nbsp;")
var xml = xhtml.replace(/ xmlns="[^"]+"/, '').replace(/&\w+?;/g, '').toXML();
// Now return the stock quote by extracting the "Last Trade:" from the XHTML
var price = xml..tr.(th=='Last Trade:').td.big.b;
var out =
<quote>
<symbol>{ symbol }</symbol>
<price>{ price.toString() }</price>
<source>http://finance.yahoo.com/</source>
</quote>
write(out.toXMLString());
This example script
apps.notex.com/apps/demo/services/finance/quote.js does the following:
- It gets the symbol name for a US quoted stock (e.g. "AAPL" for Apple)
- It requests a web page from Yahoo Finance with the latest stock price
- It extracts the stock price from the page
- It returns the price in a small piece of XML
Here's a URL to run this example right now on our web server, and send the results as XML to your browser:
apps.notex.com/notex.fcgi?token=abcd&script=http://apps.notex.com/apps/demo/services/finance/quote.js&symbol=AAPL
Here's the same example, using an Apache action to handle ".jsx" files:
apps.notex.com/apps/demo/services/finance/quote.jsx?token=abcd&symbol=AAPL
The CGI scripts
notex.cgi and
notex.fcgi run your JavaScript files on a web server. These JavaScript files may contain calls to the NOTEX functions below, which are exposed to your JavaScript programs by the NOTEX CGI scripts. This is different from JavaScript that runs in the web browser - you have no "document" or "window" objects in a NOTEX JavaScript program, only the functions below.
http(header) & https(header)
Get an HTTP(S) header (e.g. 'user-agent') that was sent in the HTTP request to this script
write(output)
Write some text output from this script, for example an XML response
read(filename)
Read a filename, provided that the file is JavaScript ending in ".js" or ".jsx". This enables you to include JavaScript files inside other JavaScript files.
param(name, [default])
Read a parameter from the HTTP request, for example "app" from the query string "?app=demo&token=abcd", and return its value, or the default value if one is given and the parameter is missing.
config(name, [value])
Get (or set) a configuration variable, including:
- app: the application name running this CGI script
- token: the app token (a filename in the app’s directory)
- script: the script being run (either a URL or a file path)
- remote_host: the remote host name or address running the script
- user_agent: by default the user agent is called "NOTEX/1.6"
- http_timeout: by default the HTTP request timeout is 10 seconds
- content_type: defaults to "text/xml" but can be any other type
- xml_encoding: defaults to "utf-8" for Unicode but can be other
- clean_up_xml: defaults to 1 to remove <?xml... and <!DOCTYPE...
cookie(name, [value], [expires], [path], [domain], [secure])
Get or set an HTTP cookie, with various parameters
header(name, [value])
Set an HTTP header to be sent in the HTTP response sent by this script
method()
Return the HTTP method used in the HTTP request received by this script
status()
Return the HTTP status of the most recent HTTP response that was received by this script, after a call to "GET", "HEAD", "DELETE", "POST" or "PUT" (see below)
GET(url) & HEAD(url) & DELETE(url)
Use HTTP method "GET", "HEAD" or "DELETE" to request a URL
PUT(url, content) & POST(url, content)
Use HTTP method "PUT" or "POST" to send some content to a URL
log(text)
Log some text to the app's log file for today's activity (to be found in "/apps/name/logs/20090213.log" or similar)
md5(text)
Return the MD5 checksum in hex format for some text data
env(variable)
Return the value of an environment variable. These can be set using your .htaccess file like this:
SetEnv DB_USER username
load(name, [seconds])
Load a filename from the app's cache (if it's not older than "seconds")
save(name, content)
Save some content to a filename in the app's cache
E4X is a powerful way to process XML data with JavaScript.
You can read more about it here:
Finding XML content
Given this JavaScript E4X fragment...
var pets =
<pets>
<domestic>
<pet animal="dog"><name>Jasper</name><age>13</age></pet>
<pet animal="cat"><name>Rupert</name><age>12</age></pet>
<pet animal="cat"><name>Monty</name><age>15</age></pet>
<pet animal="cat"><name>Tiger</name><age>20</age></pet>
</domestic>
</pets>
...you can find the pet cats with this code:
header('-type', 'text/plain');
var cats = pets.domestic.pet.(@animal=='cat');
write('There are ' + cats.length() + ' cats');
write('\n'); // newline
var monty = pets.domestic.pet.(name.toLowerCase()=="monty");
write('Monty is ' + monty.age + ' years old');
write('\n'); // newline
var tiger = pets.domestic.pet[3];
write('Tiger is ' + tiger.age + ' years old');
...which will output:
There are 3 cats
Monty is 15 years old
Tiger is 20 years old
Soon you'll be able to sign up here to use our NOTEX service and pay for the resources you use.