Experiment in learning: completing Stanford online course: Introduction to Databases

In October 2011, Stanford University broke new ground by offering three free online computer science courses: artificial intelligence, databases, and machine learning.

Curious, I signed up for all three of the courses, since I had never taken a course in any of these three subjects.

Because I was not enjoying the AI course and did not expect to find it very useful, I dropped it after completing all the assignments in the first week or two.

Here I report on the databases course. I will follow up with a report on the machine learning course.

Why did I take this online course?

For the record, the subject of databases is not inherently all that interesting to me.

I took the databases course with the aim of efficiently filling in a gap in my formal education: I was not a computer science major and never took a single computer science course in college, and did not take a databases course in graduate school. I have always had the option, of course, of taking a databases course here at Carnegie Mellon University, or just reading and studying material on my own, but there have been two obvious reasons not to.

I did not take a databases course at Carnegie Mellon because I have taken courses here before and they are very time-consuming for someone who is working. Not only are there lectures and recitations at fixed times, but there is also a good amount of homework. I have never liked this format of learning, even when I was a full-time student. It always seemed inefficient, and cramming a lot of detailed material into each course always left me feeling that I was not mastering everything anyway and was not going to retain it if the subject was not of primary interest or use to me. Therefore, I am pretty much done with taking traditional courses for learning. They are not an efficient use of my time or energy. What is efficient is intensely learning the fundamentals, mastering them, and retaining them.

I did try some self-study years ago, but that was problematic, because I couldn’t easily determine what fundamentals to focus on and how to assess myself with relatively little effort. I bought a standard textbook a long time ago, but was unable to force myself to read it because it was so dry and thick. I just got confused and bored out of my mind.

Taking an online course with no grade hanging over my head seemed a good way to experiment and learn whatever I felt I was up to learning, based on such factors as difficulty and time commitment.

Completion of the course

In December 2011, I completed the databases course, having done all the in-lecture quizzes, the review exercises, the midterm, and the final exam. I did not get a perfect score, but I missed only a few points.

My strategy

Given how little time I had to devote to the Stanford classes (sadly, much of my weekend time was spent trying to get stuff done by the end of Sunday), I tried to be efficient. In the end, I spent about four hours a week on the databases course, most of which was devoted simply to working through the lecture and demo videos. I can easily imagine an in-depth traditional course taking up to ten hours a week (lectures and recitations, homework, context switching in the middle of the work day).

Textbooks

I had no time to read any of the suggested textbooks. And any time I tried to browse the old edition of the textbook I have, I just got impatient and confused anyway.

Lectures and demos

What I did do was faithfully watch the lectures, and make notes on the slides which I printed out beforehand, master all the in-lecture quizzes (with some exceptions, to be discussed below), and then complete the review exercises, using the “testing workbench” until I got every one of my answers “correct”.

For the demos of relational algebra and XSLT and XQuery and SQL engines, we were urged to experiment ourselves with creating queries and the like, but because of time constraints, I chose not to do that. I did carefully follow along with the commented code transcripts of the demos, however, and pause the demos (which often went at breakneck speed!) to go over the transcript and write some notes to myself before moving on.

The instructor, Professor Jennifer Widom, was very enthusiastic and clear in her lectures and demos. Her “screenside chats” were useful in “humanizing” the course experience.

Online discussion forum

I did not really use the online discussion forum.

Other study

I did not much use the optional exercises. I looked at them but did not find them very useful.

I did not “study” for the midterm or final exams; I just went and took them with no review of anything.

Difficulty of the course

Overall, the course was not very difficult. And that was fine. When I’m trying to learn the fundamentals of a subject that is not of primary interest or use to me, I’m not interested in being forced to run the gauntlet as though I were training to become an instant expert by the end of a course.

That said, there were some topics in the course that we were rightly warned were difficult. The material on multivalued dependencies and relational design theory was quite abstract and dry. I will confess that I never fully mastered this material. But I did not feel it worth the effort to master it all.

Benefits of the course format

Time flexibility and efficiency

Very nice in comparison with traditional courses was that I could watch the lectures at any time. Typically I worked through lectures before bedtime on some weekdays, on Friday (which I was taking off every week from work from October till December to ease my burden, using up vacation time that had accumulated and needed to be used anyway), and on Saturday. Sometimes I was too occupied during the work week to watch any lectures until Thursday or Friday. That was not optimal, because catching up was unpleasant (as with real life lectures, I really do not like watching one for more than twenty minutes at a time), but at least it was actually possible to load up on lectures and catch up.

In-lecture quizzes

The in-lecture quizzes were very helpful, and one of the highlights of this learning format. They enabled me to get quick assessment of whether I totally understood the material I just watched or needed to review a bit more.

Sometimes I was tired or distracted when watching a lecture segment and found myself at a loss at a quiz. It was very useful to catch myself at frequent quiz checkpoints. In a traditional lecture, I have certainly experienced getting lost and basically wasting a whole half hour or more as a result of not having demonstrated to myself mastery before moving on. With the ungraded quizzes that one can take again and again, I found it mostly easy to go back to review a lecture and then master the topic being quizzed. The process was quite efficient, for the most part.

Testing workbench

The testing workbench was also useful, in that it gave immediate feedback on whether an answer (typically a query I had to write) was correct on a test case. Most of the time I was correct at first try, but sometimes I made a mistake and had to correct it.

That you could keep on making submissions was, I think, a good feature, because the real goal is to master something, after all. I never liked it when, back in school, I was penalized for making mistakes in homework even though I quickly mastered the material after getting feedback on those errors. Should students be graded on what they know at the end of a course, or on what they know at some transient snapshot in time when they were in the process of learning?

Software tools

I assume some people just typed their queries into the testing workbench and messed with them until they passed. I did not do that. I downloaded the data and tools (such as SQLite), and developed my answers totally outside the testing workbench, then copied and pasted to the testing workbench. I found this a much more useful way to learn and write code.
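To illustrate that workflow, here is a minimal sketch of developing a query locally against SQLite before pasting it anywhere else. It uses Python's standard sqlite3 module as one convenient way to drive SQLite; the Book table, its rows, and the run_query helper are all hypothetical stand-ins, not the course's actual data or tools.

```python
import sqlite3

# Hypothetical schema and rows, made up purely for illustration.
SETUP = """
CREATE TABLE Book (bID INTEGER PRIMARY KEY, title TEXT, year INTEGER);
INSERT INTO Book VALUES (1, 'Alpha', 1990);
INSERT INTO Book VALUES (2, 'Gamma', 2005);
INSERT INTO Book VALUES (3, 'Beta', 1985);
"""


def run_query(sql):
    """Load the sample data into a fresh in-memory database and run one query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# The query under development; edit and rerun until the result looks right.
QUERY = "SELECT title FROM Book WHERE year < 2000 ORDER BY title"

rows = run_query(QUERY)
print(rows)
```

Because every run starts from a fresh in-memory database, a mistaken query cannot leave the sample data in a surprising state between attempts.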

Flaws in the online course format

Less depth

Without more detailed homework and projects that would require supplemental reading, there was less depth than there would be in a traditional course.

The midterm and final exams were somewhat simplistic, being just multiple-choice questions. Perhaps they should be extended with questions asking for SQL and XML queries to be submitted.

Obviously, the format of the course did not allow for interesting independent projects to be done and assessed.

Assignment due dates

Unfortunately, assignment due dates were chosen far beyond the actual schedule of the lectures, which meant a lot of people, including me, “procrastinated”. This made for a very stressful December as I was spending all my time outside of work struggling to catch up before the final exam. Psychologically, this lax assignment due date policy was a mistake.

SQL itself

This may not be a drawback of the course, as such, but of the state of the SQL world: practically every lecture talked about standards and how the implementations of SQL don’t implement everything in the SQL standard and differ from each other. We learned constructs that were not supported by any of the SQL implementations recommended to us! I presume that in a traditional course, we would have had access to some proprietary database that did support the full SQL standard?

Lack of worked-out examples

As long as the material wasn’t very difficult, I found the process of verifying mastery to be very efficient. The trouble came when I didn’t understand something. In particular, multivalued dependencies and relational design theory turned out to cause me problems. I chose not to work too hard to resolve them, but was surprised by the lack of suitable material provided to help out.

What I would have found very useful would have been more supplementary material, in the form of worked-out examples. And I was surprised that the optional exercises only came with an answer key, not explanations. It’s not sufficiently useful to know whether one’s answer is correct, or to be told which answer (of a multiple-choice question) is correct.

Theoretically, I could have asked for help on the discussion forum, and I would have if the material happened to be more interesting and important. But I happened to make the decision to not fully master every corner of multivalued dependencies and normalization, because I judged that if I ever needed to get this stuff straight, I could run an algorithm rather than compute things by hand.
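As an example of the kind of algorithm I have in mind, here is a minimal sketch of attribute closure, a standard building block of normalization. Note the assumptions: it handles ordinary functional dependencies only, not multivalued ones; the closure function, the single-letter attribute names, and the sample dependencies are all made up for illustration and do not come from the course.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the given FDs.

    attrs is an iterable of attribute names; fds is a list of (lhs, rhs)
    pairs, each side an iterable of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Hypothetical dependencies: A -> B, B -> C, and CD -> E.
FDS = [("A", "B"), ("B", "C"), ("CD", "E")]

print(sorted(closure("A", FDS)))
print(sorted(closure("AD", FDS)))
```

A set of attributes is a key for a relation exactly when its closure contains every attribute of that relation, so a check like this can replace the by-hand derivations I found tedious.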

Conclusion

I felt it was worthwhile completing the Stanford online databases course. For me, it was as much a personal experiment with online learning as it was a matter of learning more about databases in particular.

I am grateful to Professor Jennifer Widom and her team for making this course possible.

Postscript

I will also write a report on the machine learning course, which was very different in various ways from the databases course.
