CSE5335: Web Data Management and XML

Class: TuTh 2:00-3:20pm (NH 111)
Instructor: Leonidas Fegaras
Office: GACB 115 (General Academic Classroom Bldg)
Office hours: Tuesday and Thursday 12:30-2:00pm

XML has become an important standard for data representation and information exchange among cooperating Internet applications. This course provides an in-depth study of web data management with an emphasis on XML standards and technologies. It covers the state of the art in designing and building web applications and services, focusing primarily on the issues and challenges of managing and processing XML data.

Prerequisites: CSE 3330/CSE 5330 (Database Systems I) or equivalent. Students are expected to have a working knowledge of Java, SQL, and basic HTML. Students without adequate preparation are at substantial risk of failing this course.

Reading material:
There is no required textbook, but you are expected to read many online tutorials and references (links will be given out in class). One good online tutorial is w3schools.

Additional Reading:
Although not required, you may find the following books useful for additional background and explanation (listed in order of relevance to this course):

Final grades will be assigned according to the following scale:
     A: score >= 90, B: 80 <= score < 90, C: score < 80
Sometimes, I use lower cutoff points, depending on the overall performance of the class.

Both exams are closed-book and closed-notes. The final exam will cover the material from the first lecture up to and including the last lecture. Once the exam grades are posted, you will have 10 business days to dispute your grade and have your exam re-evaluated. No re-evaluation will be entertained after the 10-day period. No makeup exams will be given unless there is a justifiable reason (such as illness or a death in the family). If you miss an exam and can prove that your reason is justifiable, you should arrange with the instructor to take the makeup exam within a week of the regular exam time. In any other case, you will get a zero grade for the missed exam.

Programming Assignments:
There will be ten small weekly programming assignments. Each project must be done individually. Details will be given out in class. Late projects will be marked 20% off per day. No further extensions will be allowed. No excuses, no exceptions.

All projects will be done in Java (using JDK 6). The software used for the projects is open-source, free, platform-independent, and well-suited for Java: you can do the projects on your own PC/laptop under any platform (Linux, Mac OS X, MS Windows, etc). Directions on how to download the required software will be given out in class.

Note that, although we will briefly talk about it, we will not use Microsoft ASP.NET (Visual Studio, C#, etc), since this framework is platform-dependent (for IIS only).

All work in this class must be done individually. No copying is permitted. Cheating involves giving assistance to or receiving assistance from other students or from other individuals, copying material from the web, etc. I strictly adhere to the University of Texas at Arlington rules and guidelines for handling violations of academic dishonesty. Please refer to the pamphlet "CHEATING: Definitions and Consequences" for additional information. You are required to sign and return the statement about academic dishonesty. If anyone is caught cheating, or engages in plagiarism or collusion on a programming assignment or on an exam, the grade for the entire course will be an automatic failing grade (F).

How to do Well in this Course:
Students who get the most out of this course will be the ones who put in the most effort. If you want to do well, attend all the lectures, read the assigned reading material, and start early on your programming assignments. If you are having difficulty, the instructor and the GTA will be more than happy to help you. Outside regular office hours, the best way to communicate with the instructor or the GTA is through email. If you can't make it to the scheduled office hours but really need help, contact one of us for an appointment.

Distance Education Students:
The requirements for distance education (off-campus) students will be the same as for regular students, with the possible exception of the two exams. If you are a distance education student and work within one hour's driving distance of UTA (based on Google driving directions), then you need to come and take the exams in person. Otherwise, you will have to find an exam proctor on site to supervise each exam. The proctor cannot be anyone at or below your pay grade at your office, unless it is someone in HR who specializes in proctoring exams. The proctor could be someone from a local school, testing center, etc. The proctor must be approved by the instructor, and a proctor agreement must be signed at least one week before the first exam. Note: Do not use the UTA proctor agreement form; there is a special form that will be available shortly. Each exam will be delivered to the proctor on the morning of the exam day, and the student must take the exam on the same day.

Special Accommodations:
If you require an accommodation based on disability, I would like to meet with you in the privacy of my office, during the first week of the semester, to make sure you are appropriately accommodated.

Web Page:
Please visit this web page often; it will contain the reading assignments, project descriptions, class notes, etc.
Other related web pages:

Tentative Schedule:
  1. Introduction and motivation
  2. Web application development
    1. Dynamic web pages, HTTP GET/POST requests
    2. HTML forms
    3. Client-side programming (JavaScript)
    4. XHTML and CSS stylesheets
    5. The document object model (DOM) and dynamic HTML
    6. Asynchronous server requests (AJAX)
    7. Server-side programming: PHP scripts, cookies and sessions
    8. Servlets (Tomcat), Java Server Pages (JSP)
    9. Database connectivity, JDBC
  3. XML standards
    1. DTD
    2. XML Schema
    3. XPath
    4. XML programming (DOM, SAX, StAX)
    5. XSLT
    6. XQuery
    7. Java/XML data binding (JAXB)
  4. XML data modeling
  5. Native XML storage management
    1. Indexing techniques
    2. Xindice and Berkeley DB XML
  6. Relational databases and XML
    1. XML shredding
    2. XML publishing
    3. XML on commercial databases (Oracle XML DB, SQL Server SQLXML)
  7. XML data management
    1. Query processing
    2. Query optimization
    3. Updates
    4. View maintenance
    5. Integrity constraints
    6. Compression
  8. XML search engines
    1. Information retrieval
    2. Web search engines
    3. XML ranking
  9. Web services
    1. Standards: SOAP, WSDL, UDDI
    2. Axis and JAX-WS
  10. Special topics
    1. Metadata management with RDF
    2. Data integration
    3. Web Mashups
    4. Semantic Web

Last modified: 01/05/09 by Leonidas Fegaras