<html>
<head>
<title>6.916 Problem Set 1</title>
</head>

<style type="text/css">
           body { margin-left: 10%; margin-right: 10%; }
         </style>

<body bgcolor=#ffffff text=#000000>

<center>
<table center cellspacing=1 cellpadding=0 width=90%>
<tr><td bgcolor=red>&nbsp;</td><td bgcolor=green>&nbsp;</td><td bgcolor=blue>&nbsp;</td>
</table>
</center>



<h4 align=center>Massachusetts Institute of Technology<br>
Department of Electrical Engineering and Computer Science</h4>

<h3 align=center>6.916: Software Engineering of Innovative Web Services<br>
Problem set 1</h3>

<p>

Reading for this week: 
<ul>
<li><cite>Philip and Alex's Guide to
Web Publishing</cite> at <a href="http://photo.net/wtr/thebook/">http://photo.net/wtr/thebook/</a>, Chapters 1, 4, 5, 10, 11
<li><cite>SQL for Web Nerds</cite> at <a href="http://photo.net/sql/">http://photo.net/sql/</a>, Chapters 1-9

<li><cite>Tcl for Web Nerds</cite> at <a href="http://photo.net/tcl/">http://photo.net/tcl/</a>
and/or

<a
href="http://www.amazon.com/exec/obidos/ISBN=0136168302/photonetA/">Practical
Programming in Tcl and Tk</a> (Brent Welch 1997; Prentice-Hall), all the
chapters up until the Tk stuff and/or
<a href="http://www.scriptics.com/man/tcl8.2/contents.htm">the Tcl 8.2 man pages</a>

<li>Introduction to AOLserver:  
<a href="http://photo.net/wtr/aolserver/introduction-1.html">Part 1</a> 
and 
<a href="http://photo.net/wtr/aolserver/introduction-2.html">Part 2</a> 


<li><a href="http://www.aolserver.com/server/docs/3.0/html/tcldev.htm">AOLserver 3.0 Tcl Developer's Guide</a>

<li><a href="http://photo.net/doc/common-errors.html">Common errors</a> by database-backed Web application programmers.

<li>Reference: 
<ul>
<li><a href="http://photo.net/teaching/manuals/usermanual/">Using the LCS Web/db computing facility</a>
<li>Complete Oracle documentation at
<a href="http://oradoc.photo.net">http://oradoc.photo.net</a>

</ul>
</ul>

Helper and example files (if you're not doing this at MIT):
<a href="6916.ps1.tar">6916.ps1.tar</a>

<p>

Online assistance:  <a href="http://photo.net/bboard/q-and-a.tcl?topic=Problem%20Set%201">Problem Set 1 Q&amp;A forum</a>

<p>

Objectives:  we're trying to make sure that everyone knows

<ul>
<li>How to log into his or her development server
<li>Rudiments of Tcl
<li>How to run Tcl via tclsh
<li>How to create, execute, test, and debug a .tcl page
<li>How to write a .tcl page that queries a foreign server
<li>Rudiments of SQL
<li>How to query Oracle from the shell using SQL*Plus
<li>How to query Oracle via the AOLserver administration interface
<li>How to write a .tcl page that queries Oracle
<li>How to personalize Web services by issuing and reading cookies
<li>How to read and write data in XML

</ul>

<p>

This first problem set requires you to learn a lot of new software, so
make sure you get started early: plan to spend at least two or three
sessions on it.  There is nothing difficult here, but we do want to
lead you through the mechanics of using Tcl, SQL, and running the Web
server.

<p>



Please feel free to use the forum linked above to ask your questions about Problem
Set 1 or any other class-related problem.  You can also view other
people's questions, and provide or view answers.  It is archived, so
maybe it'll save you some time!

<H3>Getting Started with Tcl</h3>

Start by reading <I><A
HREF="http://6916.lcs.mit.edu/manuals/usermanual"> Using the LCS
Web/db Computing Facility</A></I> and following its instructions to
log into your server machine.

<h4>Exercise 1: Running Tcl from the shell</h4>

Run Emacs.  Type "M-x shell" to get a Unix shell.  Type "tclsh" to
start the Tcl shell program.  Define a <em>recursive</em> Fibonacci
procedure in Tcl.  Execute and test.

<p>

Hint: If you're writing Tcl programs of more than two or three lines,
you may find it convenient to type the code into a separate Emacs
buffer (set to tcl mode) and cut and paste from there into the Tcl
shell buffer.

<p>

Type <code>info tclversion</code> at the tclsh prompt to make sure
that you're running Tcl 8.2, the same version that is compiled into
AOLserver.
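<p>

If the syntax of a recursive Tcl procedure is new to you, here is a
minimal sketch using factorial rather than Fibonacci (so that the
exercise is still yours to do).  The name <code>factorial</code> is our
own, and the procedure assumes that its argument is a non-negative
integer.

<codeexample>
```tcl
# Recursive factorial; assumes n is a non-negative integer.
proc factorial {n} {
    # Base case: 0! and 1! are both 1.
    if { $n == 0 || $n == 1 } {
        return 1
    }
    # Recursive case: n! = n * (n-1)!
    return [expr {$n * [factorial [expr {$n - 1}]]}]
}

puts [factorial 5]
```
</codeexample>

Typing this into tclsh should print 120.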

<h4>Exercise 2: Running Tcl from an (almost) HTML page</h4>

Look at <a href="two-plus-two.adp">two-plus-two.adp</a> (<a
href="two-plus-two.adp.txt">source</a>).  This is an example of the
ADP templating facility in AOLserver.

<p>

Augment the page so that (1) you add a $4000 South American Cichlid
aquarium as an option, (2) you build a constructor procedure for each
aquarium type/cost spec (instead of simply calling <code>list</code>),
(3) you add an element to the sublist for how many of each type of
aquarium will be installed, (4) you print out quantity-dependent
subtotals and the grand total at the bottom.

<h4>Exercise 3: Simple Tcl pages</h4>

Using the Web browser running on your local machine, visit the URL
http://yourvirtualserver/psets/ps1/simple-tcl-page.tcl.  Using Emacs
running on the server machine, examine the source code for this page
in /web/yourvirtualserver/www/psets/ps1/simple-tcl-page.tcl.  Also look at
the source code for the target of the form in
/web/yourvirtualserver/www/psets/ps1/simple-tcl-page-2.tcl.  (If
these files are missing, download them from <a href="6916.ps1.tar">6916.ps1.tar</a> and put them in /web/yourvirtualserver/www.)  Notice how we
use Tcl to read the form variables.  Try out the form a couple of
times, using your browser.  Now debug the regular expression in
simple-tcl-page-2.tcl so that it properly handles the names "Tammy
Faye Baker" and "William H. Gates III".

<p>

Hint 1: it is easier if you don't try to do this in one regexp.  Use
if then elseif then elseif ...

<p>

Hint 2: <code>regexp</code> has a side-effect.  If you use a
multi-clause <code>if</code> statement, make sure that you wrap your
calls to <code>regexp</code> in braces so that they don't all get
evaluated immediately.
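<p>

To illustrate both hints on an unrelated problem, here is a sketch
that classifies a time string.  The procedure name
<code>describe_time</code> and the patterns are hypothetical; the point
is only that each <code>regexp</code> sits inside braces, so a later
clause runs (and overwrites the match variables) only if the earlier
clauses failed.

<codeexample>
```tcl
# Classify a time string.  The variables hh, mm, and ss are set by
# regexp as a side effect of a successful match.
proc describe_time {str} {
    if { [regexp {^([0-9]+):([0-9]+):([0-9]+)$} $str match hh mm ss] } {
        return "hours=$hh minutes=$mm seconds=$ss"
    } elseif { [regexp {^([0-9]+):([0-9]+)$} $str match hh mm] } {
        # Reached only when the three-part pattern did not match.
        return "hours=$hh minutes=$mm"
    } else {
        return "unrecognized"
    }
}

puts [describe_time "12:30:45"]
puts [describe_time "12:30"]
```
</codeexample>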

<h4>Exercise 4: Tcl pages that query foreign servers</h4>

Using the Web browser running on your local machine, visit the URL
<a href="http://www.webho.com/WealthClock">http://www.webho.com/WealthClock</a>.  Read the discussion of this program
in <a
href="http://photo.net/wtr/thebook/server-programming.html">Chapter
10</a> of <cite>Philip and Alex's Guide to Web Publishing</cite>.
Drawing upon that program as a model, build a new web service that
takes the ISBN of a book from a form and then uses
<code>ns_httpget</code> to query several online bookstores to find
price and stock information and displays the results in an HTML table.
Save your program in files called
/web/yourvirtualserver/www/psets/ps1/books.tcl and books-2.tcl so people
can access your service over the web.


<p>

We suggest querying wordsworth.com, barnesandnoble.com, and
www.1bookstreet.com (amazon.com tends to respond with a 302 redirect
if the client doesn't give them a session ID in the query).  Your
program should be robust to timeouts, errors at the foreign sites, and
network problems.  You can ensure this by wrapping a Tcl
<code>catch</code> statement around your call to
<code>ns_httpget</code>.  Test your program with the following ISBNs:
0385494238, 0062514792, 0140260404, 0679762906.
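<p>

The <code>catch</code> idiom might look like the following sketch.
The helper name <code>fetch_or_empty</code> and the 15-second timeout
are our own choices, and we are assuming that <code>ns_httpget</code>
accepts an optional timeout in seconds as its second argument (check
the AOLserver documentation).  So that the sketch also runs in plain
tclsh, it defines a stand-in <code>ns_httpget</code> that always fails
when the real one is absent.

<codeexample>
```tcl
# Outside AOLserver (e.g., in tclsh) ns_httpget does not exist, so we
# define a stand-in that always fails, purely to exercise the error path.
if { [llength [info commands ns_httpget]] == 0 } {
    proc ns_httpget {url {timeout 30}} {
        error "simulated network failure for $url"
    }
}

# Hypothetical helper: returns the page body, or the empty string if
# the fetch raised any Tcl error (timeout, bad host, server error).
proc fetch_or_empty {url} {
    if { [catch { set page [ns_httpget $url 15] } errmsg] } {
        # errmsg holds the reason; a real page might log or display it.
        return ""
    }
    return $page
}
```
</codeexample>

A page can then test whether the result is empty and print "no
response" in that bookstore's table row instead of failing outright.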

<p>

Extra credit:  From which of the preceding books is the
following quote taken?
<blockquote>
"The obvious mathematical breakthrough would be development of an easy
way to factor large prime numbers."
</blockquote>

<h3 align=center>This would be a good time to take a break.</h3>

<h3>Getting started with SQL*Plus</h3>

Start up again with Emacs (you took a break, right?) and start a Tcl
shell as before ("M-x shell" then "tclsh").  Type "M-x rename-buffer"
to rename the shell to "tcl-shell".  Type "M-x shell" to then get a
new Unix shell.  Rename this buffer "sql-shell".  In the SQL shell,
type "sqlplus" to start SQL*Plus, the Oracle shell client.  It's
convenient to work like this using two shells, one for Tcl and one for
SQL.

<h4>Exercise 5:  Talking to Oracle from the shell</h4>

Type the following at SQL*Plus to create a table for keeping track of the
classes you're taking this semester:

<codeexample>
create table my_courses (
	course_number	varchar(20)
);
</codeexample>

<i>Note that you have to end your SQL commands with a semicolon in
SQL*Plus.  The semicolon is not part of the SQL language and you shouldn't
use it when writing SQL in your Tcl programs for AOLserver.</i>

<p>

Insert a few rows, e.g., 

<codeexample>
insert into my_courses (course_number) values ('6.916');
</codeexample>

Commit your changes:

<codeexample>
commit;
</codeexample>


See what you've got:

<codeexample>
select * from my_courses;
</codeexample>

One of the main benefits of using an RDBMS is <i>persistence</i>.
Everything that you create stays around even after you log out.
Normally, that's a good thing, but in this case you probably want to
clean up after your experiment:

<codeexample>
drop table my_courses;
</codeexample>

Quit SQL*Plus by typing "C-c C-d".

<h4>Exercise 6:  Tcl pages that talk to Oracle</h4>

Look at the file /web/yourvirtualserver/www/psets/ps1/quotations.tcl, which is
the source code for a page that displays quotations that have been
stored in the Oracle database.  Visit this page with your Web browser
and you should get an error.  The reason for the error is that the
program is calling a procedure that doesn't exist:
<code>ad_header</code> ("ArsDigita Header").  
You can confirm this suspicion by using
Emacs to read /home/nsadmin/log/yourvirtualserver-error.log, which is
where AOLserver logs any notices or problems.

<p>

To get AOLserver to load procedure definitions at server startup, you
have to put .tcl files in your server's private Tcl library:
/web/yourvirtualserver/tcl/.  Create a file called "ps1-defs.tcl" in
this directory and define the following Tcl procedures:

<ul>

<li><code>ad_header <i>page_title</i></code> -- returns HTML, HEAD,
TITLE, and BODY tags, with argument enclosed within the TITLE tags

<li><code>ad_footer</code> -- returns a string that will close the
BODY and HTML tags

</ul>


<p>

Reload the quotations.tcl page and you get ... the same error!
AOLserver doesn't know that you've added a file to the private
library; this is only checked at server startup.  Go to a Unix shell
and "restart-aolserver yourservername" (this is the big hammer; it
kills your server's Unix process so that Unix will restart AOLserver
automatically).  If <code>restart-aolserver</code> does not come back
with "Killing 10234" or some other process ID, you'll know that you
did not succeed (perhaps you made a typo when specifying your server
name).

<P>

Reload the quotations.tcl page and you get ... a slightly different
error!  The program is trying to query a table that doesn't exist:
<code>quotations</code>.

<p>

Go back to your sql shell and restart SQL*Plus.  Copy the table
definition from the comments at the top of the file quotations.tcl and
feed this definition to Oracle.  Go back to your Web browser and
reload the page that previously gave you an error.  Things should now
work, although the <code>quotations</code> table is empty.

<p>

Use the form on the web page to manually add the following quotation,
under an appropriate category of your choice: "640K ought to be enough
for anybody" (Bill Gates).  Note that it would be funnier if our table
had a column for recording the date of the quotation (1981) but we
purposely kept our data model as simple as possible.

<p>

Return to SQL*Plus and SELECT * from the table to see that your
quotation has been recorded.  The horrible formatting is an artifact
of your having declared the <code>quote</code> column to be 4000
characters long.

<p>

In SQL*Plus, insert a quotation with some hand-coded SQL (if you are
feeling lazy, you can cut and paste some SQL from the AOLserver error
log; all SQL statements that AOLserver sends to Oracle are logged
here).  Now reload the quotations.tcl URL from your Web browser.  If
you don't see your new quote here, that's because you didn't type
COMMIT; at SQL*Plus.  This is one of the big features of a relational
database management system:  simultaneously connected users are
protected from seeing each other's unfinished transactions.

<h4>Loading tables from .csv files</h4>

Now it is time to preload your quotations database with some
interesting material.  Load /web/yourvirtualserver/www/psets/ps1/quotations.csv
into Emacs and look at the format of the file (this is a standard kind
of output that you can get from any desktop spreadsheet program).
Using SQL*Loader (see Oraexercise 1 below), load these data into
your
<code>quotations</code> table.


<h3>Working with AOLserver and Oracle</h3>

Let's look at how to access the database from Tcl programs.  The basic
idea is that AOLserver includes a data abstraction called a
<em>set</em>, defined by the operations listed under the
<code>ns_set</code> API.  A set is a collection of {key,value} pairs,
which should be a familiar idea from 6.001.  Selecting from a table
with <code>ns_db select</code> returns an identifier for a set, whose
keys are the names of the selected columns.  Subsequent successive
calls with <code>ns_db getrow</code> will fill in the values in this
set with the values from successive selected rows.  For example,
suppose you obtain a set identifier by selecting the following table
with <code>ns_db select</code>:
<p>
<center>
<table cellpadding=5 border=1>
<tr align=center><th>writer</th>         <th>book</th></tr>
<tr align=left><td>Tolstoy</td>        <td>Anna Karenina</td></tr>
<tr align=left><td>Steinbeck</td>      <td>Grapes of Wrath</td></tr>
<tr align=left><td>Greenspun</td>      <td>Guide to Web Publishing</td></tr>
</table>
</center>
<p>

Then, after the first call to <code>ns_db getrow</code> the set will
be

<codeexample>{{writer Tolstoy} {book "Anna Karenina"}}</codeexample>
After the second call to <code>ns_db getrow</code> the set will be
<codeexample>{{writer Steinbeck} {book "Grapes of Wrath"}}</codeexample>
and after the third call the set will be
<codeexample>{{writer Greenspun} {book "Guide to Web Publishing"}}</codeexample>
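<p>

In code, the select-then-getrow pattern might look like the sketch
below.  The procedure name <code>books_report</code> and the table
name <code>books_read</code> are hypothetical, and the database handle
is assumed to come from <code>ns_db gethandle</code>.  This runs only
inside AOLserver, since <code>ns_db</code> and <code>ns_set</code> are
AOLserver API calls.

<codeexample>
```tcl
# Hypothetical helper: returns one "writer: book" line per row.
# Assumes a table named books_read with columns writer and book,
# and a database handle obtained with ns_db gethandle.
proc books_report {db} {
    set report ""
    # No trailing semicolon: that is a SQL*Plus convention, not SQL.
    set selection [ns_db select $db "select writer, book from books_read"]
    # Each successful getrow overwrites the values in the same set.
    while { [ns_db getrow $db $selection] } {
        set writer [ns_set get $selection writer]
        set book [ns_set get $selection book]
        append report "${writer}: ${book}\n"
    }
    return $report
}

# Possible use inside a .tcl page:
#   set db [ns_db gethandle]
#   ns_return 200 text/plain [books_report $db]
```
</codeexample>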

<p>

The programs in the files quotations.tcl and quotation-add.tcl
illustrate these ideas.  It will be well worth your while to study
these programs until you understand how they work, because you'll be
doing a lot of this kind of programming throughout the semester.

<h4>Exercise 6a: Eliminating the <code>lock table</code> via a sequence</h4>

Read about Oracle's <em>sequence</em> database object in <a
href="/sql/ref/createsequence">http://photo.net/sql/ref/createsequence</a>
and 
<a
href="/sql/ref/usingsequences">http://photo.net/sql/ref/usingsequences</a>
and <a href="http://photo.net/sql/data-modeling.html">http://photo.net/sql/data-modeling.html</a>.
By creating a sequence, you should be able to edit 
quotation-add.tcl to 

<ul>
<li>eliminate the <code>lock table</code>

<li>eliminate the begin and end transaction (since you're no longer
tying multiple SQL statements together)

<li>generate a key for the new quotation within the INSERT statement

</ul>



<h4>Exercise 7: Improving the User Interface for data entry</h4>

Go back to the main quotations.tcl page and modify it so that
category entry is done via a select box of existing categories (you
will want to use the "SELECT DISTINCT" SQL command).  For new
categories, provide an alternative text entry box labeled "new
category".  Make sure to modify quotation-add.tcl so that it recognizes
when a new category is being defined.

<h4>Exercise 8: Searching</h4>

Add a small form at the top of quotations.tcl that takes a single
query word from the user.  Build a target for this form that returns
all quotes containing the specified word.  Your search should be
case-insensitive and also look through the authors column.  Hint:
<code>like '%foo%'</code>.

<h3>Personalizing Web services with cookies</h3>

We'd like you to build a system that implements per-user personalization of
the quotation database.  The overall goal should be

<ul>

<li>A user can "kill" a quotation and have it never show up again
either from the top-level page or the search page.

<li>Killing a quotation is persistent and survives the quitting and
restarting of a browser.

<li>Quotations killed by one user have no effect on what is seen by
other users.

<li>Users can erase their personalizations and see the complete
quotation database again by clicking on an "erase my personalization"
link on the main page.  This link should appear only if the user has
personalized the quotation database.

</ul>

<p>You can personalize Web services with the aid of magic cookies.  A
cookie issued by the server directs the browser to store data on
the browser's computer.  To issue a cookie, the server includes a line
like

<codeexample>
Set-Cookie:  cookie_name=value; path=/ ; expires=Fri, 01-Jan-2010 01:00:00 GMT
</codeexample>

in the HTTP header sent to the browser.  Here <code>cookie_name</code> is the
name for this cookie, and <code>value</code> is the associated value,
which can contain any character or format except for semicolon, which
terminates a cookie.  The <code>path</code> specifies which URLs on
the server the cookie applies to.  Designating a path of
slash (<code>/</code>) includes all URLs on the server.

<p>

After the browser has accepted a server's cookie, it will include the
cookie name and value as part of its HTTP requests whenever it asks that
server for an applicable URL.  Your Tcl programs can read this information
using the AOLserver API

<codeexample>
[ns_set get [ns_conn headers] Cookie]
</codeexample>

After the expiration date, the browser no longer sends the cookie
information.  The server can also issue cookies with no specified
expiration date, in which case, the cookie is not persistent -- the
browser uses it only for that one session.
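<p>

The header that comes back is one string of the form
<code>name1=value1; name2=value2</code>, so your program has to pick
out the cookie it cares about.  Here is one way that might be done in
plain Tcl; the procedure name <code>cookie_value</code> and the cookie
names in the example are hypothetical.

<codeexample>
```tcl
# Hypothetical helper: extract one cookie's value from the raw Cookie
# header, which looks like "name1=value1; name2=value2".
# Returns the empty string if the cookie is absent.
proc cookie_value {cookie_header name} {
    foreach pair [split $cookie_header ";"] {
        set pair [string trim $pair]
        set eq [string first "=" $pair]
        if { $eq == -1 } {
            continue
        }
        set key [string range $pair 0 [expr {$eq - 1}]]
        if { [string compare $key $name] == 0 } {
            # Everything after the first "=" is the value.
            return [string range $pair [expr {$eq + 1}] end]
        }
    }
    return ""
}

# In a .tcl page the header would come from the AOLserver call shown above:
#   set cookie_header [ns_set get [ns_conn headers] Cookie]
puts [cookie_value "session=abc123; killed=3 7 12" killed]
```
</codeexample>

Run in tclsh, this should print <code>3 7 12</code>.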

<p>You can see an example of how cookies are issued and read, by
visiting the URL
http://yourvirtualserver/psets/ps1/set-cookies.tcl
and examining the Tcl for that file and for the associated URLs
check-cookies.tcl and expire-cookies.tcl.
Observe how expire-cookies gets rid of cookies by reissuing them with
an expiration date that has already passed.

<p>

Reference:  The magic cookie spec is available from 

<a href="http://home.netscape.com/newsref/std/cookie_spec.html">http://home.netscape.com/newsref/std/cookie_spec.html</a>.


<h4>Exercise 9</h4>

Implement the personalized quotation system described above.

<p>

Hint 1: it is possible to build this system using an ID cookie for the
browser and keeping the set of killed quotations in Oracle.  However,
if you're not going to allow users to log in and claim their profile,
there really isn't much point in keeping data on the server.  In fact,
by keeping killed quotation IDs in your users' browser cookies, you've
achieved the holy grail of academic database management system
researchers: a distributed database!  

<p>

Hint 2: it isn't strictly copacetic with the cookie spec, but you can
have a cookie value containing spaces.  Tcl stores a list of integers
internally as those numbers separated by spaces.  So the easiest and
simplest way to store the killed quotations is as a space-separated
list.

<p>

Hint 3: don't filter the quotations in Tcl; it is generally a sign of
incompetent programming when you query more data from Oracle than
you're going to display to the end-user.  SQL is a very powerful query
language.  You can use the NOT IN feature to exclude a list of
quotations.
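<p>

As a sketch of how Hints 2 and 3 fit together, the following turns a
space-separated list of IDs into a NOT IN clause.  The procedure name
<code>killed_clause</code> is hypothetical.  Because cookie contents
come from the user's browser, the sketch discards anything that is not
a plain integer before it gets near the SQL.

<codeexample>
```tcl
# Hypothetical helper: turn a space-separated list of quotation IDs
# (e.g., from a cookie) into a SQL fragment.  Anything that is not a
# plain integer is dropped, since cookie contents come from the user
# and must never be pasted into SQL unchecked.
proc killed_clause {killed} {
    set ids [list]
    foreach id [split [string trim $killed]] {
        if { [regexp {^[0-9]+$} $id] } {
            lappend ids $id
        }
    }
    # With nothing to exclude, add no clause at all.
    if { [llength $ids] == 0 } {
        return ""
    }
    return "where quotation_id not in ([join $ids ,])"
}

puts "select * from quotations [killed_clause {3 7 12}]"
```
</codeexample>

If your query already has a WHERE clause (as the search page will),
you would append the condition with AND instead.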


<h3 align=center>How about taking another break?</h3>

<h3>Sharing data with XML</h3>

As you learned above from querying bookstores, data on the Web have
not traditionally been formatted for convenient use by computer
programs.  In theory, people who wish to exchange data over the Web
can cooperate using XML, a 1998 standard from the World Wide Web Consortium.  In
practice, hardly anybody uses XML right now (1999).  Fortunately for
your sake in completing this problem set, you can cooperate with your
fellow students: the overall goal is to make quotations in your
database exportable in a structured format so that other students'
applications can read them.

<p>

Here's what we need in order to cooperate:

<ul>

<li>an agreed-upon URL at everyone's server where the quotations
database may be obtained:  "/quotations-xml.tcl"

<li>an agreed-upon format for the quotations.
</ul>

<p>

We'll format the quotations using XML, which is simply a conventional
notation for describing structured data.  XML structures consist of
data strings enclosed in HTML-like tags of the form 
<code>&lt;foo&gt;</code> and <code>&lt;/foo&gt;</code>, describing
what kind of thing the data is supposed to be.

<p>

Here's an informal example, showing the structure we'll use for our
quotations: 

<codeexample>
&lt;quotations&gt;
  &lt;onequote&gt;
    &lt;quotation_id&gt;1&lt;/quotation_id&gt;
    &lt;insertion_date&gt;1999-02-04&lt;/insertion_date&gt;
    &lt;author_name&gt;Bill Gates&lt;/author_name&gt;
    &lt;category&gt;Computer Industry Punditry&lt;/category&gt;
    &lt;quote&gt;640K ought to be enough for anybody.&lt;/quote&gt;
  &lt;/onequote&gt;
  &lt;onequote&gt;
  .. another row from the quotations table ...
  &lt;/onequote&gt;
  ... some more rows
&lt;/quotations&gt;
</codeexample>

Notice that there's a separate tag for each column in our SQL data model:

<codeexample>
&lt;quotation_id&gt;
&lt;insertion_date&gt;
&lt;author_name&gt;
&lt;category&gt;
&lt;quote&gt;
</codeexample>

There's also a "wrapper" tag that identifies each row as a
<code>&lt;onequote&gt;</code> structure, and an outer wrapper that
identifies a sequence of <code>&lt;onequote&gt;</code> structures as a
<code>&lt;quotations&gt;</code> document.

<h4>Building a DTD</h4>

We can give a formal description of our XML structure, rather than an
informal example, by means of an XML Document Type Definition (DTD).

<p>

Our DTD will start with a definition of the <code>quotations</code>
tag:

<codeexample>
&lt;!ELEMENT quotations (onequote)+&gt;
</codeexample>

This says that the <code>quotations</code> element must contain at
least one occurrence of <code>onequote</code> but may contain more
than one.  Now we have to say what constitutes a legal
<code>onequote</code> element:

<codeexample>
&lt;!ELEMENT onequote (quotation_id,insertion_date,author_name,category,quote)&gt;
</codeexample>

This says that the sub-elements, such as <code>quotation_id</code> must
each appear exactly once and in the specified order.  Now we have to
define an XML element that actually contains something other than
other XML elements:

<codeexample>
&lt;!ELEMENT quotation_id (#PCDATA)&gt;
</codeexample>

This says that whatever falls between <code>&lt;quotation_id&gt;</code>
and <code>&lt;/quotation_id&gt;</code> is to be interpreted as raw
characters rather than as containing further tags (PCDATA stands for
"parsed character data").

<p>

Here's our complete DTD:

<codeexample>
&lt;!-- quotations.dtd --&gt;
&lt;!ELEMENT quotations (onequote)+&gt;

&lt;!ELEMENT onequote (quotation_id,insertion_date,author_name,category,quote)&gt;

&lt;!ELEMENT quotation_id (#PCDATA)&gt;
&lt;!ELEMENT insertion_date (#PCDATA)&gt;
&lt;!ELEMENT author_name (#PCDATA)&gt;
&lt;!ELEMENT category (#PCDATA)&gt;
&lt;!ELEMENT quote (#PCDATA)&gt;
</codeexample>

You will find this extremely useful...  Hey, actually you won't find
this DTD useful at all for completing this part of the problem set.
The only reason that DTDs are ever useful is for feeding to XML
parsers because they can then automatically tokenize an XML document.
For implementing your quotations-xml.tcl page, you will only need to look
at the informal example.


<h4>Exercise 10: Generating XML</h4>

Create a
Tcl program that queries the
<code>quotations</code> table, produces an XML document in the
preceding form, and returns it to the client with a MIME type of
"application/xml".  Place this in a file quotations-xml.tcl, so that
other users can retrieve the data by visiting that agreed-upon URL.

<p>

To get you started, we've provided /psets/ps1/example-xml.tcl.
Requesting this URL with a Web browser should offer to let you save
the document to a local file, and you can then examine it with a text
editor on your local machine.  (This assumes that you haven't defined
some special behavior for your browser for MIME type application/xml.)
The differences between our example and your program are that you'll
need to produce a document containing the entire table and you'll need
to generate it on the fly.

<h4>Exercise 11: Importing XML</h4>

Write a program to import a quotations database from another student's
XML output page (if you have completed Exercise 10 and your peers
have not, this might be a good time to exhort them to greater
efforts).  Your program must 

<ul>
<li>grab /psets/ps1/quotations-xml.tcl from another student's database using
<code>ns_httpget</code>
<li>parse the file into records and the records into fields
<li>if a quote from the foreign server has identical author and
content as a quote in your own database, ignore it; otherwise, insert
it into your database with a new <code>quotation_id</code> (you don't
want keys from the foreign server conflicting with what is already in
your database)

</ul>

<p>

Hints:  You might want to set up a temporary table using
<code>create table quotations_temp as select * from quotations</code>
and then drop it after you're done debugging.  You should use
<code>DoubleApos</code> when presenting data to Oracle for
comparisons.
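<p>

The reason for doubling apostrophes is that a single quote inside a
value would otherwise end the SQL string literal early.  Here is a
sketch of what such a helper does; the name
<code>double_apostrophes</code> is ours, chosen so as not to shadow
<code>DoubleApos</code>, and we make no claim that this is how
<code>DoubleApos</code> itself is written.

<codeexample>
```tcl
# A sketch of what an apostrophe-doubling helper does; the name
# double_apostrophes is ours, chosen so as not to shadow DoubleApos.
proc double_apostrophes {str} {
    # Replace every single quote with two single quotes.
    regsub -all {'} $str {''} result
    return $result
}

set author "Flannery O'Connor"
puts "select count(*) from quotations where author_name = '[double_apostrophes $author]'"
```
</codeexample>

The statement printed contains <code>'Flannery O''Connor'</code>,
which Oracle reads as a single string with one apostrophe in it.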

<P>

Rather than having you link in a 100,000-line C program (or a
5,000-line Lisp program) that parses XML documents based on a DTD,
we've gone for simplicity here by predefining for you a parser in Tcl
that understands only this particular DTD for quotations.  The
procedure is <a href="parse-all.txt"><code>parse_all</code></a> (you
have to install this file in your server's private Tcl library,
/web/yourvirtualserver/tcl/, for this function to be callable by .tcl
and .adp pages).  The <code>parse_all</code> proc takes an XML
quotation structure as argument and returns a Tcl list, showing the
parts and subparts of the structure.  To see an example of the format,
use your browser to visit the page
http://yourvirtualserver/psets/ps1/xml-parse-test.tcl.

<p>

Note: these exercises are designed to familiarize you with XML.  In
most cases, sophisticated XML processing should be done inside Oracle
using Java libraries.  See
<a href="http://photo.net/doc/xml.html">http://photo.net/doc/xml.html</a>.

<h4>Exercise 12: Tracking a book's popularity</h4>

Neurotic authors will constantly check amazon.com to see where their
book is ranked in terms of sales.  That these figures are updated
hourly only makes the habit more destructive.  Write a program to
track a very neurotic author's work (ISBN 1558605347).  You will need
to

<ol> 

<li>define an Oracle table to hold ISBN, date-time, sales rank

<li>write a procedure that will grab the Amazon page, REGEXP out the
sales rank, and stuff it into the Oracle table

<li>use the AOLserver API call <code>ns_schedule_proc</code> to
schedule your procedure to run once every hour

<li>build a .tcl page to look at the popularity over time

</ol>
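<p>

Steps 2 and 3 might start out something like the sketch below.  We
are <em>assuming</em> that the page contains text such as "Sales Rank:
1,234"; the real Amazon markup is not reproduced here and you will
have to inspect it yourself.  The names
<code>extract_sales_rank</code> and <code>record_sales_rank</code> are
hypothetical.

<codeexample>
```tcl
# Hypothetical helper: pull an integer rank out of page text.
# ASSUMPTION: the page contains text like "Sales Rank: 1,234"; the
# real Amazon markup is not given here and will differ.
proc extract_sales_rank {page} {
    if { [regexp -nocase {sales rank:[^0-9]*([0-9,]+)} $page match rank] } {
        # Strip the thousands separators so Oracle gets a plain integer.
        regsub -all {,} $rank {} rank
        return $rank
    }
    # Empty string signals that no rank was found.
    return ""
}

puts [extract_sales_rank "Amazon.com Sales Rank: 1,234 in Books"]

# Inside AOLserver, a procedure (here a hypothetical record_sales_rank)
# could then be run every 3600 seconds with:
#   ns_schedule_proc 3600 record_sales_rank
```
</codeexample>

Returning the empty string on a failed match gives the caller a chance
to record that the page was fetched but did not contain a rank.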

One of the interesting things about Amazon is that they often lose
control of their server farm and database (they write a lot of C code
and one programmer's sloppiness can generate a catastrophic failure of
the entire service).  You might want to build your system so that you
can record (a) times when amazon.com is unreachable, and (b) for which
of those times the page served contains the string "Our store is
closed temporarily for scheduled maintenance" (you'll sometimes get
this during the middle of weekdays when they would definitely not have
intentionally scheduled any maintenance).

<h4>Exercise 13: Becoming a chartoonist</h4>

Why print a table of a book's popularity when you can print a chart?
You're going to learn about the wonders of single-pixel GIFs and WIDTH
and HEIGHT tags now.  Grab the software in 
<a href="http://software.arsdigita.com/tcl/ad-graphing.tcl">http://software.arsdigita.com/tcl/ad-graphing.tcl</a> and put it in your server's private Tcl directory
(/web/yourservername/tcl/).
Read the docs at 
<a href="http://software.arsdigita.com/www/doc/graphing.html">http://software.arsdigita.com/www/doc/graphing.html</a> and then write code to generate a pretty chart of 
the data from Exercise 12.

<p>

Note that you're dipping into the ArsDigita Community System toolkit
here, the software with which you'll be occupied in Problem Set 2.

<h3>The Wide World of Oracle</h3>

We're going to shift gears now into a portion of the problem set
designed to teach you more about Oracle and SQL.

<h4>Oraexercise 1:  SQL*Loader</h4>

<ul>

<li>create a tab-separated file in Emacs containing five lines, each
line to contain your favorite stock symbol, an integer number of shares
owned, and a date acquired (in the form MM/DD/YYYY)

<li>create an Oracle table to hold these data:

<blockquote><pre>
create table my_stocks (
       symbol	       varchar(20) not null,
       n_shares	       integer not null,
       date_acquired   date not null       
);
</pre></blockquote>

<li>use the <code>sqlldr</code> shell command to invoke SQL*Loader to
slurp up your tab-separated file into the <code>my_stocks</code> table
(see page 1183 of <cite>Oracle8: The Complete Reference</cite> and
the official Oracle docs at <a href="/sql/ref/utilities">http://photo.net/sql/ref/utilities</a>)

</ul>


<h4>Oraexercise 2:  copying data from one table to another</h4>

This exercise exists because we found that, when faced with the task
of moving data from one table to another, programmers were dragging
the data across SQL*Net from Oracle into AOLserver, manipulating it in
Tcl, then pushing it back into Oracle over SQL*Net.  This is not the
way!  SQL is a very powerful language and there is no need to bring in
any other tools if what you want to do is move data around within
Oracle. 


<ul>

<li>using only one SQL statement, create a table called
<code>stock_prices</code> with three columns: <code>symbol,
quote_date, price</code>.  After this one statement, you should have
created the table and filled it with one row per symbol in
<code>my_stocks</code>.  The date and price columns should be filled
with the current date and a nominal price. Hint: 

<code>
select symbol, sysdate as quote_date, 31.415 as price from my_stocks;
</code>.

<li>create a new table:
<blockquote><pre>
create table newly_acquired_stocks (
       symbol	       varchar(20) not null,
       n_shares	       integer not null,
       date_acquired   date not null       
);
</pre></blockquote>

<li>using a single <code>insert into .. select ... </code> statement
(with a WHERE clause appropriate to your sample data), copy about half
the rows from <code>my_stocks</code> into <code>newly_acquired_stocks</code>


</ul>


<h4>Oraexercise 3:  JOIN</h4>

With a single SQL statement JOINing <code>my_stocks</code> and
<code>stock_prices</code>, produce a report showing symbol, number of
shares, price per share, and current value.


<h4>Oraexercise 4:  OUTER JOIN</h4>

Insert a row into <code>my_stocks</code>.  Run your query from
Oraexercise 3.  Notice that your new stock does not appear in the
report.  This is because you've JOINed them with the constraint that
the symbol appear in both tables.  

<p>

Modify your statement to use an OUTER JOIN instead so that you'll get
a complete report of all your stocks, but won't get price information
if none is available.


<h4>Oraexercise 5:  PL/SQL</h4>

Inspired by Wall Street's methods for valuing Internet companies,
we've developed our own valuation method for this problem set: a stock
is valued at the sum of the ascii characters making up its symbol.
(Note that students who've used lowercase letters to represent symbols
will have higher-valued portfolios than those with all-uppercase
symbols; "IBM" is worth only $216 whereas "ibm" is worth $312!)
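<p>

To check the arithmetic before writing any PL/SQL, you can compute the
same valuation in tclsh.  This sketch is in Tcl, not PL/SQL, and the
procedure name <code>symbol_value</code> is our own.

<codeexample>
```tcl
# Sum the character codes of a trading symbol.
proc symbol_value {symbol} {
    set total 0
    foreach ch [split $symbol ""] {
        # scan with %c stores the character's integer code in "code".
        scan $ch %c code
        incr total $code
    }
    return $total
}

puts [symbol_value "IBM"]
puts [symbol_value "ibm"]
```
</codeexample>

This should print 216 (73 + 66 + 77) and then 312 (105 + 98 + 109).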

<ul>

<li>define a PL/SQL <em>function</em> that takes a trading symbol as
its argument and returns the stock value (hint: Oracle's built-in
<code>ASCII</code> function will be helpful)

<li>with a single UPDATE statement, update <code>stock_prices</code>
to set each stock's price to whatever is returned by this PL/SQL
function

<li>define a PL/SQL function that takes no arguments and returns the
aggregate value of the portfolio (the sum of <code>n_shares * price</code> over
all of your stocks).  You'll want to define your JOIN from Oraexercise 3
(above) as a cursor and then use the PL/SQL Cursor FOR LOOP facility.
Hint: when you're all done, you can run this function from SQL*Plus
with <code>select portfolio_value() from dual;</code>.

</ul>
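<p>

To illustrate the Cursor FOR LOOP facility without giving away the
exercise, here is a sketch of a different, simpler function.  The name
<code>total_shares_held</code> is our own invention, and the code
assumes <code>my_stocks</code> has an <code>n_shares</code> column:

<blockquote><pre>
-- Hypothetical example: adds up n_shares over every row of my_stocks.
create or replace function total_shares_held
return number
is
  -- The cursor names a query; the FOR loop opens it, fetches each row
  -- into the record "holding", and closes it when the rows run out.
  cursor holdings is
    select n_shares from my_stocks;
  total number := 0;
begin
  for holding in holdings loop
    total := total + holding.n_shares;
  end loop;
  return total;
end total_shares_held;
/
show errors
</pre></blockquote>

You could then try it from SQL*Plus with <code>select
total_shares_held() from dual;</code>.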


<h4>Oraexercise 6:  Buy more of the winners</h4>

Rather than taking your profits on the winners, buy more of them!

<ul>

<li>use SELECT AVG() to figure out the average price of your holdings

<li>using a single INSERT ... SELECT statement, double your holdings
in all the stocks whose price is higher than average (with
<code>date_acquired</code> set to <code>sysdate</code>)

</ul>
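<p>

A hedged sketch of these two steps follows.  It reads "average price"
as the plain average over the rows of <code>stock_prices</code> (you
may prefer a different definition) and assumes <code>my_stocks</code>
has <code>symbol</code>, <code>n_shares</code>, and
<code>date_acquired</code> columns:

<blockquote><pre>
-- Step 1: the average price (one interpretation; see the note above).
select avg(price) from stock_prices;

-- Step 2: for each existing holding priced above that average, add a
-- second block of the same size, acquired today.
insert into my_stocks (symbol, n_shares, date_acquired)
select m.symbol, m.n_shares, sysdate
from my_stocks m, stock_prices p
where m.symbol = p.symbol
and p.price &gt; (select avg(price) from stock_prices);
</pre></blockquote>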

Rerun your query from Oraexercise 4.  Note that in some cases you will
have two rows for the same symbol.  If what you're really interested
in is your current position, you want a report with at most one row
per symbol.

<ul> 

<li>use a SELECT ... GROUP BY query from <code>my_stocks</code> to
produce a report of symbols and total shares held

<li>use a SELECT .. GROUP BY query JOINing with
<code>stock_prices</code> to produce a report of symbols and total
value held per symbol

<li>use a SELECT .. GROUP BY .. HAVING query to produce a report of
symbols, total shares held, and total value held per symbol
<em>restricted to symbols in which you have at least two blocks of
shares</em> (i.e., the "winners")

</ul>
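<p>

As a starting point, here is a sketch of the GROUP BY ... HAVING
pattern against <code>my_stocks</code> alone.  It assumes each row
counts as one block of shares; <code>total_shares</code> and
<code>n_blocks</code> are illustrative aliases:

<blockquote><pre>
-- WHERE filters rows before grouping; HAVING filters the groups after
-- the aggregates have been computed.
select symbol, sum(n_shares) as total_shares, count(*) as n_blocks
from my_stocks
group by symbol
having count(*) &gt;= 2;
</pre></blockquote>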


<h4>Oraexercise 7:  encapsulate your queries with a view</h4>

Using the final query above, create a view called
<code>stocks_i_like</code> that encapsulates the final query.
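<p>

A minimal sketch of what such a view might look like, assuming the
column names used elsewhere in this problem set and treating each row
of <code>my_stocks</code> as one block of shares (the aliases
<code>total_shares</code> and <code>total_value</code> are our own):

<blockquote><pre>
-- Every computed column in a view needs a name, hence the aliases.
create or replace view stocks_i_like
as
select m.symbol,
       sum(m.n_shares) as total_shares,
       sum(m.n_shares * p.price) as total_value
from my_stocks m, stock_prices p
where m.symbol = p.symbol
group by m.symbol
having count(*) &gt;= 2;
</pre></blockquote>

Once the view exists, <code>select * from stocks_i_like;</code> runs
the underlying query each time it is called.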



<h3>Information Architecture and User Interface</h3>

You've got a database table filled with stock data.  There are a
couple of ways to provide a Web interface to these data.  The loser
Web developer presents a page with options for retrieving stock data:

<ul>
<li>show recent acquisitions
<li>show best performers
<li>show highest value stocks
<li>show entire portfolio
</ul>

The bottom line is that the user doesn't see anything on the first
page.  From an information presentation point of view, the first page
is therefore a waste.  

<p>

An alternative is to show the user a table of holdings right on the
top-level page, with a sensible subset of the data by default, and
provide controls to adjust what is included in the display.  What kind
of controls?  You could have the ones above, plus whatever other views
the publisher of the site and the users eventually decide are
necessary.  Suppose, though, that you can organize the controls along
orthogonal dimensions.  If you can do that, with just a handful of
"dimensional sliders", the user will have many options.  

<p>

For example, the ArsDigita ticket tracking system is used to store bug
reports and feature requests.  The following dimensions are employed:

<ul>
<li>involvement of connected user (values:  mine/everyone's)
<li>ticket status (values: open/+deferred/+closed)
<li>ticket age (values:  last day/last week/last month/all)
</ul>

Note that even though these are modeled as continuous dimensions, the
user is not presented with continuous sliders.  The user picks one of
several discrete points on each dimension.  This interface is
compatible with the Netscape 1.1 browser and provides the user with
2 &times; 3 &times; 4 = 24 choices in total.  Yet instead of seeing 24 options in a big list, the
user sees one line across the top of the browser, with 2 + 3 + 4 = 9 buttons
arranged logically into three dimensions.

<p>

If those 24 options aren't enough, the ticket tracking system lets the
user re-sort the table by any of the columns by clicking on the column
heading.

<p>

Try building the same sort of thing for your stock portfolio.  You
want a .tcl page that shows the contents of <code>my_stocks</code>
joined with <code>stock_prices</code>.  Provide controls across the top
(hint:  TABLE WIDTH=100% and TD ALIGN=RIGHT will be useful) for the
following dimensions:

<ul>
<li>recency of acquisitions (within last week/last month/last year/all)
<li>value of holding (more than 10% of total portfolio/more than 2%/all)
</ul>

Provide the ability for users to sort by any of the columns presented
(hint:  <code>export_ns_set_vars</code> in
$SERVER_HOME/tcl/00-ad-utilities.tcl will be useful for this,
notably because of the <code>exclusion_list</code> argument), e.g.,
symbol, number of shares, price per share, value of holding.
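<p>

To make the dimensions concrete, here is a rough sketch of the query
that one combination of settings (acquired within the last week, more
than 10% of the portfolio, sorted by value) might generate.  It assumes
<code>my_stocks</code> has a <code>date_acquired</code> column, treats
each row as a separate holding, and uses <code>holding_value</code> as
our own alias.  Your .tcl page would vary the last two WHERE
conditions and the ORDER BY column according to the controls the user
clicked.

<blockquote><pre>
-- Recency dimension: "within last week" (date arithmetic is in days).
-- Value dimension: "more than 10% of total portfolio".
-- Sort: the column heading the user clicked, here holding_value.
select m.symbol, m.n_shares, p.price, m.n_shares * p.price as holding_value
from my_stocks m, stock_prices p
where m.symbol = p.symbol
and m.date_acquired &gt; sysdate - 7
and m.n_shares * p.price &gt; 0.10 * (select sum(m2.n_shares * p2.price)
                                    from my_stocks m2, stock_prices p2
                                    where m2.symbol = p2.symbol)
order by holding_value desc;
</pre></blockquote>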

<p>

You can build this from scratch or use the ArsDigita Community System
toolkit API calls in 
<a href="http://software.arsdigita.com/tcl/ad-table-display.tcl">http://software.arsdigita.com/tcl/ad-table-display.tcl</a> as a building block.


<h3>Turning in your work</h3>

We expect you to have your code working and debugged before the
beginning of the class when it is due.  In class, we will use a web
browser to connect to several student servers and all look together at
the dynamic pages, to see how well they work.  Also, starting
immediately after class, we will examine the contents of your Web
server to look at your answers.  On Thursday night, you will be
required to do a code review with a TA in lab.

<h3>Who Wrote This and When</h3>

This problem set was written by 
<a href="http://photo.net/philg/">Philip Greenspun</a> 
and <a href="http://www-swiss.ai.mit.edu/~hal/hal.html">Hal Abelson</a>
in January 1999 for MIT Course 6.916.  It is copyright 1999 by them but 
may be reused provided credit is given to the original authors
with a hyperlink to this document.

<p>

It was revised in January 2000 for AOLserver 3.0, which incorporates
Tcl 8.2.  The old version is available from 
<a href="http://photo.net/teaching/psets/ps1/6916.ps1-for-aolserver-2.3.tar">http://photo.net/teaching/psets/ps1/6916.ps1-for-aolserver-2.3.tar</a>.


<p>

It is permanently housed at <a href="http://photo.net/teaching/psets/ps1/ps1.adp">http://photo.net/teaching/psets/ps1/ps1.adp</a>.

<%=[teaching_footer]%>

