ECT 360
Introduction to the
Class
Prof. Robin Burke
ECT 360
Fall 2004
Outline
Introductions
Course and Syllabus
XML
XHTML
Homework #1
Introductions
Student information sheet
Administrivia
Contacting me
Questions
CS&T 453
x 25910
rburke@cs.depaul.edu
Try the course discussion forum
Automatically mailed to all students
I will also post announcements here
Course web site
http://josquin.cs.depaul.edu/~rburke/courses/f04/ect360/
About Me
3rd year at CTI
PhD in AI, 1993
Research
AI applications in E-Commerce
"smart catalogs"
Taught web development since 1996
What I hope to get out of teaching this
class
Course
Introduction to XML
concentrate on XML's uses for the web
many other uses!
Three parts
XML standard
XML validation
XML transformations
Some DOM programming
Course cont'd
Seven homework assignments
Midterm project
Final project
Allocation
Homework – 40%
Midterm project – 30%
Final project – 30%
Midterm project
Instead of a midterm
10/6
Two-person teams
Choose an XML language and report on it
Possibilities
SVG, VoiceXML, XSL-FO, MathML, SMIL,
SOAP, WSDL, UDDI, BPEL4WS, XBRL
anything else you think interesting
Midterm project
Proposal
Due next week 9/15
Email message with the following
Name / email address for each partner
1st/2nd/3rd choice for XML application
Grading
Three Components
Knowledge
Does the work display correct technical knowledge?
Reasoning
Does the work indicate good problem-solving skills?
Communication
Written work: Is the answer well-written English?
Code: Does the answer display good coding /
documentation style?
Grading, cont'd
A = Excellent work
Thorough knowledge of the subject matter
Well-considered and creative solutions
Well-written answers
Employment of impeccable coding style
B = Very good work
Complete knowledge of the subject matter
No major errors of reasoning in problem solutions
Competent written answers
Readable coding style
C = Average work
Some gaps in knowledge of subject matter
Some errors or omissions in problem solving
Written answers may contain grammatical and other errors
Coding may be stylistically awkward
D = Below average work
Substantial gaps in knowledge of subject matter.
Problem solving incomplete or incorrect
Poor English in written answers
Ineffective coding style
Resources
Text
Carey, P. New Perspectives on XML
(Comprehensive). Thomson Learning
Tools
XML Spy
• 4.3 included with book
XML Spy Enterprise 2004
• available in 7th floor lab
Discussion enhancement
Card distribution
XML
eXtensible Markup Language
Misnomer
Not a language
Technology for creating languages
XML
Looks a little bit like HTML
But with a wide variety of tag names
Reason
HTML and XML have a common ancestor
• SGML
Developed for entry and management of
very large documents
Why do we need XML?
Web publishing with HTML
Develop content
Determine how content should be displayed
on pages
Encode content in HTML
Content available to users
Problem
what happens when content changes
• design decisions must be rethought
what happens when design changes
• HTML must be rewritten
designer and author must work closely
Web publishing with XML
Develop XML application for content
Develop content
Content encoded in XML
Design pages
Write stylesheet to render pages in
HTML
Content available to users
Benefits
If design changes, only stylesheet is
affected
Different pages / displays can be
generated from the same content
Designer and author need not interact
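Sketch: one content file, two displays
The benefits above can be illustrated with a minimal sketch. It swaps in two plain Python functions for the stylesheets the slide has in mind (no XSLT is used here), and the element names and the second course are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content document; element names are illustrative only.
CONTENT = """<courses>
  <course><number>ECT 360</number><title>Introduction to XML</title></course>
  <course><number>XYZ 101</number><title>Sample Course</title></course>
</courses>"""


def render_list(root):
    """First 'stylesheet': render the courses as an HTML bulleted list."""
    items = "".join(
        f"<li>{escape(c.findtext('number'))}: {escape(c.findtext('title'))}</li>"
        for c in root.findall("course")
    )
    return f"<ul>{items}</ul>"


def render_table(root):
    """Second 'stylesheet': render the same courses as an HTML table."""
    rows = "".join(
        f"<tr><td>{escape(c.findtext('number'))}</td>"
        f"<td>{escape(c.findtext('title'))}</td></tr>"
        for c in root.findall("course")
    )
    return f"<table>{rows}</table>"


root = ET.fromstring(CONTENT)
list_page = render_list(root)    # design 1
table_page = render_table(root)  # design 2, content untouched
```

Changing the design means editing only a render function; CONTENT stays the same.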
Big picture
Modularity is a good thing
decoupling of data's structure from its use in
a particular application
lowers effort of repurposing data
Modularity requires standards
non-application specific data representation
• not in the interest of any application vendor
XML is the language in which such
standards can be expressed
XML applications
Purpose-specific languages that
conform to the XML standard
Many are standardized
In-house languages easy to develop
XML is becoming the default choice
for data storage format
MS Office 2003
Example: Syllabus
<syllabus
xmlns="http://josquin.cs.depaul.edu/~rburke/namespaces/syllabus">
<course>
<course-number>ECT 360</course-number>
<course-title>Introduction to XML</course-title>
<prereqs>
<note>One quarter of programming</note>
<and>
<or>
<course-number>CSC 211</course-number>
<course-number>CSC 261</course-number>
<equivalent/>
</or>
<course-number>IT 130</course-number>
</and>
</prereqs>
</course>
... see full example ...
Note
Structure determined by needs of
application
Other design choices could be made
separate components of course
number
text for prerequisites
Note
Mixed content
Use of external namespaces
Entities
Internal referencing
The rules of XML
Documents consist of elements, attributes and content
(and a few other things)
Elements are set off by tags in angle brackets
start tag for element syllabus: <syllabus>
end tag for element syllabus: </syllabus>
Anything in between the start tag and end tag is
element content
Attributes are additional data associated with an
element
indicated by name/value pairs inside the start tag
• <hwk ref="hwk2">
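Sketch: element, attribute, content
A small sketch of how a parser exposes the three pieces named above, using Python's standard xml.etree.ElementTree. The hwk element and its ref attribute come from the slide; the text content and the end tag are assumed for illustration.

```python
import xml.etree.ElementTree as ET

# Element and attribute are from the slide; the content is assumed.
elem = ET.fromstring('<hwk ref="hwk2">Convert a file to XHTML</hwk>')

name = elem.tag       # element name, taken from the start tag
attrs = elem.attrib   # name/value pairs found inside the start tag
content = elem.text   # character data between start tag and end tag
```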
More rules
Comments
enclosed by special character sequence
<!-- -->
Document prolog
before the first element
contains declarations
typically
• declare that it is xml
• declare the relevant document type
Processing instructions
information that the XML parser doesn't use
passed along to the application
enclosed in special tag <? ... ?>
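Sketch: what sits around the first element
The following sketch uses Python's xml.dom.minidom to list the nodes at the top level of a document. The document itself, including the stylesheet file name, is hypothetical. The XML declaration is consumed by the parser and does not appear as a node.

```python
from xml.dom import Node, minidom

# Hypothetical document: XML declaration, a processing instruction,
# a comment, then the single root element.
SOURCE = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/css" href="syllabus.css"?>'
    "<!-- hypothetical example document -->"
    "<syllabus><course-number>ECT 360</course-number></syllabus>"
)

doc = minidom.parseString(SOURCE)

kinds = []
for node in doc.childNodes:
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
        # target names the application; data is passed along untouched
        kinds.append(("pi", node.target, node.data))
    elif node.nodeType == Node.COMMENT_NODE:
        kinds.append(("comment", node.data.strip()))
    elif node.nodeType == Node.ELEMENT_NODE:
        kinds.append(("element", node.tagName))
```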
Entities
Special characters
Certain characters part of the language
Need a way to indicate these
• &lt; stands for <
Entities can be defined as part of a
document type
useful for inserting standard text
&copyright; might insert a standard copyright
notice
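Sketch: both uses of entities
A hedged sketch of the two uses above, in standard-library Python. The first part escapes special characters; the second declares an entity in an internal document type subset and lets the parser expand it. The entity name follows the slide, but the notice text and element names are made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# 1. Special characters: escape() replaces &, < and > with entities.
escaped = escape("if a < b & b < c")
round_trip = ET.fromstring(f"<p>{escaped}</p>").text

# 2. Standard text: an entity declared in the document type.
#    The replacement text here is a hypothetical notice.
DOC = (
    '<!DOCTYPE page [<!ENTITY copyright "Copyright 2004, hypothetical notice">]>'
    "<page><footer>&copyright;</footer></page>"
)
footer = ET.fromstring(DOC).findtext("footer")
```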
Document tree
Text document is just one form of XML
More useful for computation:
Tree representation
XML Tree
syllabus
  course
    coursenumber
      TEXT "ECT 360"
    coursetitle
      TEXT "Introduction to XML"
    description
      TEXT "This course is..."
    prereqs
      note
        TEXT "One quarter of programming"
      and
        or
          coursenumber
            TEXT "CSC 211"
          coursenumber
            TEXT "CSC 261"
          equivalent
        coursenumber
          TEXT "IT 130"
  offering, etc.
Tree
Nodes
elements
text nodes
Attribute lists
Paths
A path traverses the tree
XPath provides syntax for tree traversal
Example
/section[2]/meeting[1]/day
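Sketch: following a path
A minimal sketch of the path example using the XPath subset in Python's xml.etree.ElementTree. That module evaluates paths relative to an element, so the path is written without the leading slash and applied to a hypothetical root element named offering. The day values are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two sections, the second with two meetings.
DOC = """<offering>
  <section>
    <meeting><day>Monday</day></meeting>
  </section>
  <section>
    <meeting><day>Wednesday</day></meeting>
    <meeting><day>Friday</day></meeting>
  </section>
</offering>"""

root = ET.fromstring(DOC)

# Second section, its first meeting, then the day element.
# XPath positions count from 1, not 0.
day = root.find("section[2]/meeting[1]/day")
day_text = day.text
```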
Transformation
XML transformations change the XML
tree
adding
deleting
changing contents
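Sketch: the three kinds of change
The three kinds of change can be sketched by editing a parsed tree directly in Python. This is in-place tree editing rather than a stylesheet-driven transformation, and the draft and term elements are hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<course><course-number>ECT 360</course-number>"
    "<course-title>Intro to XML</course-title><draft/></course>"
)

# adding: append a new child element with text content
ET.SubElement(root, "term").text = "Fall 2004"

# deleting: remove an existing child element
root.remove(root.find("draft"))

# changing contents: replace the text of an element
root.find("course-title").text = "Introduction to XML"

# Serialize the modified tree back to text.
result = ET.tostring(root, encoding="unicode")
```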
Well-formed vs valid
A well-formed document is one that obeys
the syntactic rules
it can be parsed
<foo bar="2"><baz>thud</baz>&zap;</foo>
well-formed document, provided the entity zap is declared
A valid document has been validated
against some standard
what is the entity zap?
is baz a legal subelement for foo?
unknown without a definition for foo
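Sketch: checking well-formedness
A rough sketch of the "it can be parsed" test, using the standard Python parser. The entity reference from the slide's example is left out here, because a parser given no declaration for zap cannot resolve it. The function name is an assumption, not part of any library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text obeys XML's syntactic rules, i.e. it parses."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


good = is_well_formed('<foo bar="2"><baz>thud</baz></foo>')
bad_nesting = is_well_formed("<foo><baz>thud</foo></baz>")
bad_attribute = is_well_formed("<foo bar=2><baz>thud</baz></foo>")
```

Passing this test says nothing about validity: the parser accepts any element names, so whether baz belongs inside foo is still unknown.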
XML Validation
Validation is the process of checking
an XML document against a standard
Different languages for defining such
standards
DTD – document type definition
XML Schema
RELAX NG
others
Document type
The document type specifies the legal
structure of the XML document
order, contents of elements
legal attributes and default values
etc.
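Sketch: a toy document type
To make the idea concrete, here is a toy check written in plain Python. It is not a DTD or XML Schema, only a hypothetical table of legal children and attributes for the foo and baz elements, and it ignores element order and default values.

```python
import xml.etree.ElementTree as ET

# Hypothetical document type: legal child elements and attributes.
DOCTYPE = {
    "foo": {"children": {"baz"}, "attributes": {"bar"}},
    "baz": {"children": set(), "attributes": set()},
}


def validate(elem, doctype=DOCTYPE):
    """Return a list of problems; an empty list means the tree is valid."""
    rule = doctype.get(elem.tag)
    if rule is None:
        return [f"unknown element <{elem.tag}>"]
    problems = [
        f"<{elem.tag}> does not allow attribute {name!r}"
        for name in elem.attrib
        if name not in rule["attributes"]
    ]
    for child in elem:
        if child.tag not in rule["children"]:
            problems.append(f"<{child.tag}> is not allowed inside <{elem.tag}>")
        else:
            problems.extend(validate(child, doctype))
    return problems


valid = validate(ET.fromstring('<foo bar="2"><baz>thud</baz></foo>'))
invalid = validate(ET.fromstring("<foo><qux/></foo>"))
```

Both inputs are well-formed; only the first is valid against this document type.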
Designing a document type means
deciding what data will be stored and
how
HTML
HTML is not XML-compliant
XML is case-sensitive
HTML is not
XML requires quotes around attribute values
Optional in HTML
XML requires end tags
Optional in some cases for HTML
XML requires /> syntax for empty elements
HTML does not
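Sketch: HTML habits that XML rejects
Each difference above can be checked with an XML parser. The sketch below feeds HTML-style and XML-style versions of the same fragment to Python's standard parser; the fragments themselves are invented for illustration.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(text):
    """Return True if the fragment is accepted by an XML parser."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Fragments a browser would accept as HTML.
HTML_STYLE = {
    "case": "<p><B>bold</b></p>",
    "quotes": "<p align=center>text</p>",
    "end tags": "<ul><li>one<li>two</ul>",
    "empty elements": "<p>line<br>break</p>",
}

# The same fragments rewritten to follow the XML rules.
XML_STYLE = {
    "case": "<p><b>bold</b></p>",
    "quotes": '<p align="center">text</p>',
    "end tags": "<ul><li>one</li><li>two</li></ul>",
    "empty elements": "<p>line<br/>break</p>",
}

html_results = {rule: parses_as_xml(src) for rule, src in HTML_STYLE.items()}
xml_results = {rule: parses_as_xml(src) for rule, src in XML_STYLE.items()}
```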
XHTML
Latest HTML standard
Makes HTML XML-conformant
Different flavors
Transitional
• allows style information as part of the document
(align attribute)
Frameset
• allows frames
Strict
• no frames, no style attributes
• assumes use of a stylesheet for rendering
Benefits of XHTML
XML
decouples data from application
XHTML
decouples content from style
for web documents
Modularity = More pieces
HTML
document
Web browser
XML
data
XHTML
document
XSLT
stylesheet
CSS
stylesheet
Web browser
More pieces =
More flexibility
XML
data
XSL-FO
stylesheet
PDF
document
XML
data
XSLT
stylesheet
SVG
graphics
Converting to XHTML
No sloppiness
tags must nest
end tags everywhere
quotes on attribute values
remove deprecated elements / attributes
• replace with style
XML-specific
/> for empty elements
• like img
declarations
• xml
• doctype
• namespace
lowercase
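Sketch: a converted page
A minimal sketch of a page that follows the checklist above, parsed with Python's standard library. The page content and the image file name are hypothetical. The check covers well-formedness, namespace, and lowercase names only; it does not validate against the XHTML document type.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# xml declaration, doctype, namespace, lowercase tags, quoted
# attribute values, and /> on the empty img element.
PAGE = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    f'<html xmlns="{XHTML_NS}">\n'
    "<head><title>ECT 360</title></head>\n"
    '<body><p>Logo: <img src="logo.png" alt="logo" /></p></body>\n'
    "</html>"
)

root = ET.fromstring(PAGE)  # raises ParseError if not well-formed

# ElementTree reports tags as {namespace}localname.
root_tag = root.tag
local_names = [el.tag.split("}", 1)[1] for el in root.iter()]
all_lowercase = all(name == name.lower() for name in local_names)
```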
Example
HTML to XHTML conversion
Validation
on-line validator
Assignment #1
Convert a file to XHTML
Use the online validator or XML Spy
Due
before class time next week
submit to COL