Doug Engelbart's Colloquium at Stanford | Session 5: Eric Armstrong

Doug Engelbart's
Colloquium at Stanford

An In-Depth Look at "The Unfinished Revolution"

Session 5
Further OHS-planning suggestions
Eric Armstrong. 1.*
- unedited transcript -

First I would like to say thanks for the opportunity to talk here because this is a really high power group and it's quite a thrill to be part of it. I really can't think of anything more compelling to work on, anything more important to spend time on and my employer if I might have something to say about that in the near future. I have been blessed with the opportunity to spend a lot of time on these projects. When I got into computers to start with it was to augment human intelligence to solve complex problems, so the fact that Doug has been spending so much time and got so much talent on the problem for so long is just thrilling.

Fig. 1

This was the electrifying slide that Adam put up last week. What he did was to summarize and constrain the problem down to dimensions that are tack able, feasible. This line here where he just says, the documents we want to be able to manager are e-mail, html, augment and source code. Nicely constrains a problem that we can attack with an open hypertech system and it's elegant and it's simplicity and I can assure you that nothing you hear in the next twenty minutes is going to approach that level of conciseness. We need to be able to publish with version control, we need to be able to look at it in different ways the same way we count an augment system, and we need a simple editor. His main concept was that we absolutely need to be able to look at it in HTML so that it's open but that we may need to come up with a proprietary editor in order to be able to do manipulations of that data at least at the outset. He looked at options for publishing; he looked at different ways to views. The main concept being lets use open hypertech system as a

ENGELBART: Would you use open hyperdocument please? They're a whole knowledge container, not just the text.

ARMSTONG: If you have your open hypertech document system then that's the first step in creating a larger knowledge repository. There is a lot of things that are not being addressed here; we're not looking at knowledge management, we're not looking at model building systems, or abstract knowledge systems or expert systems and that's important because the tools we need to build those systems are at very minimum this system, which makes it perfect for bootstrapping. So, I guess I need to move onto my slide now.

ENGELBART: I apologize for this.

ARMSTRONG: Now I'm suggesting that our initial target for a bootstrapping system should be aimed at improving open source efforts.

Fig. 2

It dovetails nicely with what Adam is suggesting as a starting point. We need to integrate e-mail, web, source code and the augment system. It is true bootstrapping because the system we design to do this will provide a positive feedback loop for bootstrap. So what we start for version one will help us to evolve version two, will help us to evolve the knowledge management and other systems that make sense. Now the two areas where we see a lot of output that could be generated back in are management, organizational things, everything about open hypertech system and the information management system itself.

AUDIENCE: Hyperdocument system.

ARMSTRONG: Hyperdocument system. I'll do that just keep correcting me, I do learn. That feedback loop will get us to where we can start building the knowledge management systems, the expert systems, take account with people we learned and build it up. It gives us a concrete problem, we can focus our thinking on and it gives us a familiar problem domain. Now there is a danger there that the first system might be to overly limited.

We might be only looking at our own kinds of problems and not the other problems we need to address but it gives us a tool that we can use to build the next system.

Fig. 3

What it needs to do, it needs to track the discussions and the documents we do for requirements, functional specs. I think most the people here are conversant with software, so I'm not going to go over each of these documents but these are the traditional documents you see in the life cycle of a software system. One thing that's really nice about open-source is that your development plans can be event driven rather than time driven. Instead of having to march to a time clock to reach some production goal, so your marketing people know what they are doing, you can say when it's ready we ship it. That's an advantage, and it's an advantage that isn't shared in the commercial market place. Now design decisions, I really want to emphasize the importance of being able to track design decisions.

Fig. 4

At one point I kept a design document, as I was going along, the alternatives I considered why I rejected them, which ones I selected and why? I found myself in a meeting at one point and they were saying that we've got to do something differently. To be able to take out of that document and say, no we shouldn't and here are the reasons. In another meeting when I was attacked on a point, I took out my document and said here is the assumptions I had when I made that decisions, and it turned out the assumptions were wrong and I can make the change. The important thing was that I was sitting there in this meeting, three months after making that decision and I couldn't remember the reason why I made it but I knew there was some reason. I was very afraid to make that change because I had no idea what kind of thoughts have gone through my mind, what I was going to run into if I made it. So it told me that it was okay to change and I had a list of alternatives sitting right there.

Fig. 5

Now if we but this kind of system together we'll have interactive collaboration. We'll be able to do versioning and attributions. It's nice to look at a piece of source code, like Doug mentioned, and say who did that? As we're looking on e-mail lists we see a lot of rambling discussions, which are very important, they introduce a lot of knowledge but it's important to reduce that at some point into the next version of the document. So the system needs to be able to do that. You also get, and I'm assuming XML here because I frankly don't see anything else on the horizon that will come close to being possible to do it but I'll leave the design space open and just use XML as an example. There may be a better alternative sometime in the next twenty years and when we see it we should do it. We have also some benefits of storing source code in XML that you can't get any other way and I want to touch on those.

Fig. 6

Main thing with linking is being able to connect your documents and connect the reasons for making decision with the results of that decision. At each point you wind up at the bottom with code and moving backwards through it. The hardest question to answer when your looking at a piece of source code is, why was that done? The hardest thing document in any short distinct way is why that was done. If you're looking at starting at the bottom of it, even a simple bug, it can take a long explanation to explain the changes. Starting with your design decisions, your development checklist and the various codes you put out. The first version of a piece of code is typically very simple, you put out a bunch of things you want to do, a bunch of short explanations and it reads it flows, it's really nice. Then you start testing and your users start giving you feedback and you start running into problems. Then pretty soon, if your documenting, the code becomes completely unfollowable, the thread of it is completely lost and if your not documenting you wind up with mystery code that no one understands why it's there. It becomes completely unmaintainable and the kind of system were envisioning here can solve most of those problems. Linking is one real good reason.

Fig. 7

You minimize the number of code intrusions. You can put the code in put a link to the explanation. You put the code in put a link to the document design decision. Now you have very small amounts of documentation inline and it's possible to make sense reading it. You reuse the explanation multiple places, if it takes a page to explain a one line insertion, which I've done, and it needs to explain multiple different places, why is this variable here? Why did you store information there? Why did you act on it? That explanation can be reused in multiple places and linking gives you that capability. We're used to using plain text systems. That's practically Stone Age. In some ways given everything we've seen about hypertext we might as well be doing binary code. There are a lot of advantages to hypertext we're not taking advantage of.

Fig. 8

When you put source into hierarchical systems like XML, one of the things you get is the virtual reduction in size. Your code, instead of looking like one monolith of directories, if you remember the days of linear directories, just one big long list of file after file after file. They were very large. I've got maybe twenty, thirty directories on my system. I go into them when I need to get something done and the other day we had to put it into Outlook and Outlook kept telling me how may folders it was searching. There are fifteen hundred directories in my system, I had no idea it was that big. To me it's a twenty-directory system and when I change my context there's another ten or twenty things to look at and at each step it's very simple.

Another benefit is rapidly being able to move things. Hierarchical systems are unique because they are constrained; there is information in the document about where a structure starts and where it ends. That means, every entry is a handle and you can move a method or routine by grabbing the handle and dragging it, the same way you would in the directory system. Literate programming style, your code could look like a series of comments. You are not actually doing literate programming, the way Knuth describes it, but what you're looking at can be a series of comments were tucked under each comment is a piece of code. There is also the option of eliminating structural syntax on some of the problems that's attendant on those.

Fig. 9

Now here's an example of literate programming style. Image that this is a program viewed at the top level. These are the steps that I would conceive as something that could operate as a servlet that would be a front end to augment and let you interact with it as a hypertech system. Notice there's going to be a lot of code inside of each one of those sections, but when you collapse the document and look at it you can see what the thread of that code is. So you've taken out some of the problem with getting the flow of the code by adding links to major sections, major comments. You've also are gaining it by being able to collapse in a hierarchy.

Fig. 10

Now again it's possible to eliminate structural syntax. We've started a, I'll give you an URL here I forgot to put it into the slide, it's extende.sourceforge.com and we're having major trouble getting anything. We've finally got the mailing list set up, and we don't have a web page set up yet. I will send out an e-mail with the addresses but we're starting doing some thinking about what it is like if you start putting source code in the structural system. Because the end of the structure is well defined, a compiler or translator can look at this and it knows where to put the braces. You don't have to put them in yourself.

Fig. 11

Comments, if I got a // at the beginning of the end that known that's well defined, I don't have to put a // on the beginning of every line. The compiler is perfectly capable of figuring out where that lines ends. If I have a block comment again everything under that is constrained so if I do that /* then the */ that ends the comment can be automatically supplied and that means I can comment out an entire block of code. On a plain text system this is impossible because that second comment down there, when it sees the ending */ terminates the first comment and everything that follows underneath that tries to be compiled. In a hierarchical system it's not a problem and I can put that code in or out by changing a single character. Now one of the things you see if you program at anytime that there is a brace missing somewhere in your program, be a good guy and go find it will you?

Fig. 12

If you made a lot of changes, it can be really hard finding them, that goes away. Another thing is the impedance mismatch between what the compiler sees and what you see. The indentation suggests to you containment, but what the compiler sees is something completely different and you can spend hours trying to find those kinds of bugs. Because we think like people fundamental and learning to program is fun, I mean learning to think like a computer so you can see the mistakes it sees. What do we need to be able to do source and XML?

Fig. 13

One thing we need is a filter that will take existing plain text programs and put them into XML structures. One of the nice things about the Sun's architecture that David Brownell put together is that if you take an existing parser, any parser in fact that recognizes a data set and have it generate sax events, which is one of the XML protocols, you can plug that into a parser that will generate XML out the other end. So that one is fairly easy to do given someone has a parser and knows how to use it.

The second part is a XML to plain text. Now the interesting thing there is that we have to store the line numbers when we do that because when the compiler gives line number errors, actually it did this by the way. When I started that company to develop an outliner I wrote a little plain text filter and used it to do source code programming, so the examples I'm giving you come from experience. The one problem with using the system was that when the compiler gave me a line number I had a hard time finding it in my outline. But in XML we have attributes and we can adjust those attributes as we produce plain text. Obviously you need a XML editor, but my claim is that XML is approaching a level of ubiquity that will make XML editors the standard much as plain text editors is today. I think there is a long way to go before that happens but there's a lot of editor projects in the works. It's so much easier to program with XML because it's well formed that it took us five years to see even one HTML differencing utility. There is five or six already in the works for XML; it's just that good of standard. We'll need a go to line number function to be able to translate the exceptions we see and the compiler errors, but eventually there is no reason not to have XML aware compilers.

Fig. 14

They can process the XML directly, we get rid of the filter and then the error listings can have links that directly there. So you click on the link and just go straight to the line, one less step. You still have runtime exceptions but there's no reason that the class structure can't store an URL instead of a line number and do the same thing. Now those, I think, are basic functionalities and I just start thinking about version two. I think integrated them, one of the issues that I still have is, how do you integrate an IBIS style design discussion that Dick Karpinski was nice enough to point us to?

Fig. 15

This is where you have a question, alternative design possibilities, pros, cons, an endorsement, I like this idea and then finally decision. This is the one we decided to do; those are the QAPCED and one of the questions we have to ask is what does it mean to integrate that kind of an ongoing design discussion? Because it's fundamental of this kind of structure that records those design decisions. Here's the alternatives we've considered, here's the reasons we liked it, here's the reason we didn't, here's the decision we made. We can also look towards possibly doing a patterns repository.

Gamma and the Gang of Four did this wonderful book on patterns but it's all, here's a pattern, here's how it works, here's a pattern, here's how it works and those are all great but it's hard to use them, what pattern do I need right now? I don't know. Unless I know all the patterns I can't tell which one I want to use. So to invert that structure in a way that says, maybe it's a fact, maybe it's a listing of here's problems that you can solve. As I go through that series of problems I look for ones that ring a bell that relate to the issues I'm facing and I can go to the repository and see that pattern and find some good way to learn how it works, something interactive. That moves us towards the kinds of knowledge systems we need to do management structures that we need to do the other organizational thinking.

ENGELBART: We have to kind of close. We go off the air and I need a few minutes to close.

ARMSTRONG: Two different approaches, I'm just going to put them up here and I'll put them on the e-mail list, we can use augment, front end. The main constraint there that it's not open to everyone.

Fig. 16

Fig. 17

We can use a new open system that makes it usable by everyone, all the open source efforts. It's going to be a lot easier with the tools we have today then it was initially.

Fig. 18

Summary, taken an initial focus of email, web, source, and augment. It's a true bootstrapping project. It's a familiar project domain. It's a concrete problem to attack and it will lead us to the next series of things we need to do. Thank you Doug.

[ principal lecture]

Top

---

Above space serves to put hyperlinked targets at the top of the window