<h1>Thin Air</h1>
<p>Code and culture from the centre of the world</p>
<p>Colin Putney, cputney@wiresong.ca, http://wiresong.ca/, 2010-03-28T01:25:29-07:00</p>
<h2>Scripting languages and IDEs</h2>
<p><em>2006-10-20T18:51:32-07:00</em></p>
<p>On the Squeak development list there's been a lot of talk lately about
creating a scripting language based on Squeak. On the surface it seems like a
great idea. Scripting languages are popular, dynamism is in vogue, and it
would be nice to be able to use Smalltalk for all the day-to-day utilities and
admin tools that tend to get done in Perl or Ruby. On top of that, the main
drawback of scripting languages is that there aren't any good IDEs for them.
Squeak has a great IDE, and should be able to provide a great script
development environment.</p>
<p>I'm pretty skeptical of the idea, because I think scripting languages and IDEs
are like oil and water. They just don't mix. What follows is a post I made to
the Squeak list defending this position. First, I'd like to define some terms.</p>
<p><em>IDE</em> - This is a program that allows one to view and manipulate another
program in terms of its semantic elements, such as classes and methods,
rather than in terms of the sequence of characters that will be fed to a
parser. IDEs might happen to display text, but they also provide tools like
class browsers, refactoring and other transformations, and auto-completion of
identifiers: things that require a higher-level model of the program than
text. Examples include various Smalltalk implementations, Eclipse, Visual
Studio, IDEA.</p>
<p><em>Scripting language</em> - a programming language and execution model where the
program is stored as text until it is executed. Immediately prior to
execution, the runtime environment is created, the program's source code is
parsed and executed, and then the runtime environment is destroyed. This is an
important point - the state of the runtime environment is not preserved when
execution terminates, and one invocation of a program cannot influence future
invocations.</p>
<p>Now, one might quibble over my definition of "scripting language." Fine, I
agree that it's not a good general definition for everyday use of the term. But
it describes an important feature of languages like Ruby, Python, Perl, JavaScript,
and PHP, and one that makes IDEs for those languages particularly hard to
write.</p>
<p>Damien Pollet <a href="http://aspn.activestate.com/ASPN/Mail/Message/squeak-list/3245367">brought
up</a> the key
issue in designing a Smalltalk-based scripting language - should the syntax be
declarative or imperative?</p>
<p>Imperative syntax gives us a lot of flexibility and power in the language. A
lot of the current fascination with Ruby stems from Java programmers
discovering what can be done with imperative class definitions. The Ruby
pickaxe book explains this well:</p>
<pre><code>In languages such as C++ and Java, class definitions are processed
at compile time: the compiler loads up symbol tables, works out how much
storage to allocate, constructs dispatch tables, and does all those other
obscure things we'd rather not think too hard about. Ruby is different. In
Ruby, class and module definitions are executable code. </code></pre>
<p>Executable definitions are how metaprogramming is done in scripting languages.
Ruby on Rails gets a lot of mileage out of this, essentially by adding
class-side methods that can be called from within these executable class
definitions to generate a lot of boring support code. In Java, we can't modify
class definitions at runtime, and that's why Java folks use so much XML
configuration.</p>
<p>Python <a href="http://docs.python.org/ref/class.html">does</a> this too. Perl 5 is pretty
weird, but Perl 6 is slated to handle class definition this way as well.
JavaScript doesn't have class definitions, but we can build up pseudoclasses
by creating objects and assigning functions to their properties.</p>
<p>When writing an executable class definition, we have the full power of the
language available. You can create methods inside of conditionals to tailor
the class to its environment. You can use eval() to create methods by
manipulating strings. You can send messages to other parts of the system. You
can do anything.</p>
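<p>To make that concrete, here is a minimal sketch in Python, whose class bodies are likewise executed as ordinary code. The <code>Widget</code> class and its method names are hypothetical, invented purely for illustration, and Python's <code>exec()</code> stands in for the string-based <code>eval()</code> mentioned above.</p>
<pre><code>import sys


class Widget:
    # The class body is ordinary code, run top to bottom when the
    # definition is executed.

    # Tailor the class to its environment: only one of these two
    # definitions of line_ending ever comes into existence.
    if sys.platform.startswith("win"):
        def line_ending(self):
            return "\r\n"
    else:
        def line_ending(self):
            return "\n"

    # Create methods by manipulating strings: one predicate per color.
    for _color in ("red", "green", "blue"):
        exec(f"def is_{_color}(self): return self.color == {_color!r}")
    del _color  # keep the loop variable from becoming a class attribute

    def __init__(self, color):
        self.color = color


widget = Widget("green")
print(widget.is_green(), widget.is_red())  # True False
</code></pre>
<p>Nothing in the text of the class says that <code>is_green</code> exists; it only comes into being when the definition runs.</p>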
<p>I'm making a big deal out of this, because I think it's a really, really
important feature of modern scripting languages.</p>
<p>Declarative syntax, on the other hand, gives us a lot of flexibility and power
in the tools. Java, C++ and C# have declarative class definitions. This means
that IDEs can read in the source code, create a semantic model of it,
manipulate that model in response to user commands, and write it back out as
source code. The source code has a canonical representation as text, so the
code that's produced is similar to the code that was read in, with the textual
changes proportional to the semantic changes that were made in between.</p>
<p>This is really hard to do with scripting languages, because we can't create
the semantic units of the program just by parsing the source code. You
actually have to execute it to fully create the program's structure. This is
problematic to an IDE for many reasons: the program might take a long time to
run, it might have undesirable side effects (like deleting files), and in the
end, there's no way to tell whether the program structure we end up with is
dependent on the input to the program.</p>
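<p>A rough illustration of the gap, again in Python: the sketch below compares the methods a tool can find by parsing alone with the methods the class has once the script has run. The <code>Config</code> class, the <code>APP_DEBUG</code> environment variable and both helper functions are assumptions made up for this example.</p>
<pre><code>import ast
import inspect

SOURCE = '''
import os

class Config:
    def load(self):
        return "loaded"

    if os.environ.get("APP_DEBUG"):
        def dump(self):
            return "dumping"

    for _n in ("host", "port"):
        exec(f"def get_{_n}(self): return {_n!r}")
    del _n
'''


def static_methods(source, class_name):
    """Methods written directly in the class body, found by parsing only."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return sorted(
                n.name for n in node.body if isinstance(n, ast.FunctionDef)
            )
    return []


def runtime_methods(source, class_name):
    """Methods the class actually has after the source has been executed."""
    namespace = {"__name__": "script"}
    exec(source, namespace)
    cls = namespace[class_name]
    return sorted(k for k, v in vars(cls).items() if inspect.isfunction(v))


print(static_methods(SOURCE, "Config"))   # ['load']
print(runtime_methods(SOURCE, "Config"))  # varies with the environment
</code></pre>
<p>The parser sees one method. Execution yields at least three, and a fourth appears only when <code>APP_DEBUG</code> is set, so the structure of the program depends on its input.</p>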
<p>Even if we did have a way to glean the program structure from a script, there
would be no way to write it back out again as source code. All of the
metaprogramming in the script would be undone, partially evaluated, as it
were, and we'd be stuck with whatever structures were created on that
particular invocation of the script.</p>
<p>So, it would appear that we can have either a powerful language, or powerful
tools, but not both at the same time. And looking around, it's notable that
there are no good IDEs for scripting languages, and none of the languages that
have good IDEs lend themselves to metaprogramming.</p>
<p>There is, of course, one exception. Smalltalk.</p>
<p>With Smalltalk, we have the best of both worlds. A highly dynamic language
where metaprogramming is incredibly easy, and at the same time, a very
powerful IDE. We can do this because we sidestep the whole issue of
declarative vs. imperative syntax by not having any syntax at all.</p>
<p>In Smalltalk, classes and methods are created by executing Smalltalk code,
just like in scripting languages. That code creates objects which reflect the
semantic elements of the program, just like in the IDEs for compiled
languages. One might say that programs in compiled languages are primarily
state, while programs in scripting languages are primarily behavior. Smalltalk
programs are object-oriented; they have both state and behavior. The secret
ingredient that makes this work is the image - Smalltalk programs don't have
to be represented as text.</p>
<p>And that's why a Smalltalk-like scripting language wouldn't be worthwhile. It
leaves out the very thing that makes Smalltalk work so well - the image. It
would have to have syntax for creating classes - either imperatively or
declaratively. We'd end up limiting either the language or the tools, or if we
tried hard enough, both.</p>
<p>I'd much rather see a Smalltalk that let me create small, headless images,
tens or hundreds of kilobytes in size, with just the little bits of
functionality I need for a particular task. If they had good libraries for
file I/O, processing text on stdin/stdout and executing other command-line
programs, they'd fill the "scripting language" niche very well. If they could
be created and edited by a larger IDE image, they'd have the Smalltalk tools
advantages as well.</p>
<p>I have high hopes for <a href="http://netjam.org/spoon/">Spoon</a> in this regard.
Between shrinking, remote messaging and Flow, it's already got most of the
ingredients. It just needs to be packaged with a stripped-down VM, and
integrated into the host operating system.</p>
<h2>Announcements</h2>
<p><em>2006-07-08T19:31:43-07:00</em></p>
<p>The basic design strategy for OmniBrowser is simple: rather than modelling a
browser with one large and complex object (like Browser does), break it up
into a network of smaller, simpler objects. From there, the design is pretty
straightforward, and it's much easier to build lots of kinds of browsers
from the same code base.</p>
<p>This design does have a downside, though. It makes event handling more
difficult, because the objects that need to communicate to respond to events
are often in distant parts of the network, and can't rely on the
structure of the network to find each other. Early versions of OmniBrowser
responded to events, such as a click, with a cascade of messages, with each
object letting its neighbors know about the event. This had the
advantage that each object only needed to know about its immediate
neighbors, but it was also fragile and prone to infinite loops as neighbors
repeatedly notified each other of the same event.</p>
<p>My second attempt to address this problem involved the use of a Dispatcher.
This was a central object that all notification messages would flow through.
As the various parts of the browser were created, they would register with the
dispatcher to receive messages. This was an improvement, because objects could
send messages to "everybody" rather than to an explicit receiver.
But it was still awkward, and the event handling code was still convoluted and
difficult to understand.</p>
<p>I've just finished up the implementation of my third attempt, this time
based on Vassili Bykov's notion of <a href="http://www.cincomsmalltalk.com/userblogs/vbykov/blogView?showComments=true&entry=3310034894" title="Vassili's blog post introducing the framework">Announcements</a>. I
talked to the folks at Cincom about porting the code to Squeak, but that
didn't work out. I ended up just doing a mini-implementation that meets
my needs for OmniBrowser. (Actually this was probably what I should have done
in the first place. It was probably less work for me to re-implement
Announcements from scratch than it would have been for someone at Cincom to
get corporate approval to release the code under an open source license.)</p>
<p>Despite all the positive things Vassili had to say about Announcements, I have
to admit I was surprised at what an improvement it made in OmniBrowser's
event handling code. My first pass at the conversion was simple. I replaced
messages sent to the dispatcher with announcements sent to the announcer. Then
I installed an <a href="http://www.cincomsmalltalk.com/userblogs/vbykov/blogView?searchCategory=Announcements%20Framework" title="Vassili's blog post describing an announcement spy">announcement spy</a>
and browsed around the image a bit. It turned out that every event resulted
in 3 or 4 redundant announcements, and probably even more unnecessary updates
to the UI.</p>
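<p>For readers who haven't seen the pattern, here is a small sketch of the idea in Python. It is not Vassili's framework or the OmniBrowser code: the <code>Announcer</code> class, its method names and the <code>SelectionChanged</code> class are all assumptions chosen for illustration. Announcements are plain objects, subscribers register interest by class, and a spy sees everything that is announced.</p>
<pre><code>from collections import Counter


class SelectionChanged:
    """An announcement is an ordinary object; its class names the event."""

    def __init__(self, node):
        self.node = node


class Announcer:
    def __init__(self):
        self._subscriptions = []  # (announcement class, handler) pairs
        self._spies = []          # handlers that see every announcement

    def subscribe(self, announcement_class, handler):
        self._subscriptions.append((announcement_class, handler))

    def spy(self, handler):
        self._spies.append(handler)

    def announce(self, announcement):
        for spy in self._spies:
            spy(announcement)
        for announcement_class, handler in self._subscriptions:
            if isinstance(announcement, announcement_class):
                handler(announcement)
        return announcement


announcer = Announcer()
seen = Counter()
selected = []
announcer.spy(lambda ann: seen.update([type(ann).__name__]))
announcer.subscribe(SelectionChanged, lambda ann: selected.append(ann.node))

announcer.announce(SelectionChanged("Object"))
announcer.announce(SelectionChanged("Object"))  # a redundant repeat
print(seen["SelectionChanged"])  # 2
</code></pre>
<p>Counting announcements per class in the spy is enough to make this kind of redundancy visible: one click that produces a count of two or more points at a duplicate source.</p>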
<p>So I made a second pass, explicitly aimed at removing all the redundant
announcements. In many cases, this meant finding the ultimate source of a
particular announcement. For example, OBSelectionChanged should only be
announced from two places in the code. All the other places where it was being
announced were redundant, and had to be removed. By spying on announcements, I
was able to get a clearer idea of the code flow in response to different
events, and find other ways to simplify.</p>
<p>I suspect there are even more simplifications to be made, but even
without it, moving to Announcements was a big improvement.</p>
<h2>Questions on the versioning model</h2>
<p><em>2006-06-11T22:22:24-07:00</em></p>
<p>Bruce Badger posted a
<a href="http://www.wiresong.ca/air/2006/05/25/monticellos-versioning-model#comments">comment</a>
in response to my post on the versioning model used in Monticello 2. He has
some questions about methods:</p>
<ul>
<li>What is the identity (or primary key) of a method?</li>
<li>Within what scope is the identity unique?</li>
<li>If I wanted to use a particular version of a particular method in two
classes, could I (setting aside the question of whether this is a
good idea or not)?</li>
</ul>
<p>The short answer is that Monticello 2 uses the same semantics that the
Smalltalk runtime uses. The identity of a MethodElement is class name and
selector; it's only guaranteed to be unique within a given image. You
couldn't put the same method in two classes; it would have to be copied.</p>
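<p>As a sketch of what name-based identity implies, here it is in Python rather than Smalltalk. The <code>MethodKey</code> type, the <code>copy_method</code> helper and the dictionary standing in for an image are hypothetical, not part of Monticello.</p>
<pre><code>from dataclasses import dataclass


@dataclass(frozen=True)
class MethodKey:
    """Identity of a method element: class name plus selector."""
    class_name: str
    selector: str


def copy_method(image, source_key, target_class):
    """Install a copy of a method under another class; return the new key."""
    target_key = MethodKey(target_class, source_key.selector)
    image[target_key] = image[source_key]
    return target_key


# The dictionary plays the role of a single image, the scope of uniqueness.
image = {MethodKey("Invoice", "total"): "source of Invoice total"}
original = MethodKey("Invoice", "total")
copied = copy_method(image, original, "Receipt")

print(original == copied)  # False
print(len(image))          # 2
</code></pre>
<p>Because the class name is part of the key, the copy is a second element with its own identity, even though its source is identical.</p>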
<p>Now, Avi and I have kicked around ideas for a deeper model of Smalltalk code.
Rather than identifying elements by name, they'd each have UUIDs. Method
sources would be versioned as an AST. The nodes for variable references would
have the UUIDs of the elements the variables are bound to in the compiled
method.</p>
<p>This would have two advantages:</p>
<p>First, it would help with platform independence. Rather than depending on
names to bind variables during compilation, we'd be relying on UUIDs.
This would make it easier to transform the names when moving code back and
forth between dialects, which in turn would simplify handling Namespaces in VW,
for example, or differences in platform libraries.</p>
<p>Second, it would allow us to provide a more accurate reproduction of code
between images. We'd be restoring methods to their compiled states
rather than just their source code. This is one of the things that's so
compelling about <a href="http://www.netjam.org/spoon/" title="Craig Latta's Spoon site">Spoon</a>, and it would allow Bruce's scenario of the same method version
being used in two different classes.</p>
<p>On the other hand, it's that much more code and complexity. It would
require a custom parser, an AST able to handle all the syntactic quirks of the
various dialects of Smalltalk where Monticello will run, and a compiler back
end for each platform. Monticello 2 is already an ambitious project, and a
significant improvement over Monticello 1. Our goal for now is to get the
current version up to production quality so we can start using it. Maybe some
of these ideas will be part of Monticello 3.</p>
<h2>Slicing the image</h2>
<p><em>2006-06-01T21:31:12-07:00</em></p>
<p>In my last <a href="http://www.wiresong.ca/air/2006/05/25/monticellos-versioning-model">post</a>,
I mentioned that version history in Monticello 2 isn't tied to packages.
Instead, it introduces the concept of slices.</p>
<p>A slice is, quite simply, a set of elements - an arbitrary slice of the
code in the image. We can define several different kinds of slices:</p>
<h3>Packages</h3>
<p>In Squeak, we can use PackageInfoSlice to get packages identical to those
used by Monticello 1. In other dialects we'd create slices to interface
with the native packaging code - PackageSlice and BundleSlice in VisualWorks
for example.</p>
<h3>Change Sets</h3>
<p>A ChangeSet also defines an interesting slice of the image, and by
implementing ChangeSetSlice, we can make them versionable and mergeable, just
like packages. I'm really looking forward to this one, actually.
It'll make the lives of package maintainers easier, since contributors
can just send them change sets rather than full packages.</p>
<h3>Modules</h3>
<p>Lately, I've become interested in combining Monticello with Spoon.
One of the keys to that integration would be to create a NaiadSlice. This
would define a slice based on the elements involved in executing a given
Smalltalk expression.</p>
<h3>Explicit</h3>
<p>Probably the simplest kind of slice is defined with a collection of
elements. At some point, I'd like to create a UI for easily creating an
ExplicitSlice. I'm imagining a window which lists the contents of the
slice, and accepts new elements via drag and drop from OmniBrowser. For now,
though, ExplicitSlices can be created programmatically, and are really handy
for testing.</p>
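<p>The notion of a slice as "a set of elements" can be sketched in a few lines of Python. This is an illustration only, not Monticello 2's implementation. <code>CategorySlice</code> and its prefix-matching rule are assumptions standing in for a package-style slice, and elements are represented as simple tuples.</p>
<pre><code>from abc import ABC, abstractmethod


class Slice(ABC):
    @abstractmethod
    def elements(self):
        """Return the set of elements this slice currently covers."""


class ExplicitSlice(Slice):
    """A slice defined by simply listing its elements."""

    def __init__(self, elements=()):
        self._elements = set(elements)

    def add(self, element):
        self._elements.add(element)

    def elements(self):
        return frozenset(self._elements)


class CategorySlice(Slice):
    """A slice computed on demand from a category-name prefix."""

    def __init__(self, image, prefix):
        self._image = image    # maps each element to its category name
        self._prefix = prefix

    def elements(self):
        return frozenset(
            element for element, category in self._image.items()
            if category.startswith(self._prefix)
        )


image = {
    ("Invoice", "total"): "Billing-Model",
    ("Invoice", "printOn:"): "Billing-Printing",
    ("Shape", "area"): "Geometry-Core",
}
explicit = ExplicitSlice([("Shape", "area")])
explicit.add(("Invoice", "total"))
billing = CategorySlice(image, "Billing")

print(len(explicit.elements()), len(billing.elements()))  # 2 2
</code></pre>
<p>Both kinds answer the same question, so anything that versions or merges a slice needs only <code>elements()</code> and doesn't care how membership was decided.</p>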
<h3>Others</h3>
<p>Although they're probably not useful for everyday development, there
are other kinds of slice one might want. A FileOutSlice would enumerate all
the elements in a particular chunk file. We could do the same thing with the
sources and changes files. We could create a slice that scanned the changes
file and included all elements modified between a pair of snapshot markers.
When demoing Monticello 2 I sometimes joke about creating a slice that
includes all the elements that match a given rewrite rule. I don't know
how useful it would be, but why not?</p>
<p>For the moment, I've only implemented ExplicitSlice and
PackageInfoSlice, since they're needed to achieve feature parity with
Monticello 1.</p>
<h2>Monticello's versioning model</h2>
<p><em>2006-05-25T22:03:31-07:00</em></p>
<p>Although Monticello has proven very useful for developing applications that
run in Squeak, it hasn't been very helpful in supporting the development of
Squeak itself. The problem is that the versioning model used in Monticello 1
is based on the assumption of packages with well-defined and relatively stable
boundaries. In Squeak, the well-defined packages have already been removed,
and what remains is a large chunk of tangled and inter-dependent code.</p>
<p>Monticello 2 adopts a new versioning model, one that's not tied to packages as
the fundamental unit of versioning. Instead, Monticello 2 divides the system
into its fundamental elements. In Squeak, Smalltalk code is made up of the
following elements:</p>
<ul>
<li>Classes</li>
<li>Methods</li>
<li>Class comments</li>
<li>Instance variables</li>
<li>Class variables</li>
<li>Class instance variables</li>
<li>Pool imports</li>
</ul>
<p>Rather than maintaining the version history of packages, Monticello 2 keeps
version history for each element.</p>
<p>Right off the bat, this makes it easy to implement a feature that Monticello
has never had before: the ability to view previous versions of a given method.
More importantly, though, it makes it much easier to deal with fluid package
boundaries. Packages can be created, renamed or destroyed, elements can move
back and forth between packages, elements can even belong to more than one
package at a time. Since the version history is attached to the element, it's
not affected.</p>
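<p>A toy model in Python shows why. History is keyed by element, and package membership is kept in a separate structure. The <code>ElementHistory</code> class and the package names are invented for this sketch and say nothing about how Monticello 2 actually stores versions.</p>
<pre><code>class ElementHistory:
    def __init__(self):
        self._versions = {}  # element key to list of sources, oldest first

    def record(self, element, source):
        self._versions.setdefault(element, []).append(source)

    def versions(self, element):
        return list(self._versions.get(element, []))


history = ElementHistory()
method = ("Invoice", "total")
history.record(method, "total v1")
history.record(method, "total v2")

# Package membership lives apart from the history.
packages = {"Billing": {method}, "Reports": set()}

# Move the method to another package, then destroy the old package.
packages["Reports"].add(method)
packages["Billing"].discard(method)
del packages["Billing"]

print(history.versions(method))  # ['total v1', 'total v2']
</code></pre>
<p>Viewing previous versions of a method is just a lookup, and no amount of package shuffling touches the recorded versions.</p>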
<p>Another consequence of element-level version history is that merges can be
performed on individual elements. Although Monticello 1 supports
cherry-picking, it does so in an awkward and non-intuitive way. In Monticello
2, cherry-picking is the norm, and merging an entire package is just a special
case.</p>
<h2>Monticello 2 alpha release</h2>
<p><em>2006-05-24T22:05:39-07:00</em></p>
<p>One of the things that surprised me at Smalltalk Solutions this year was the
continuing interest in Monticello 2 from outside the Squeak world. Now that
I'm not working in VisualWorks day-to-day anymore, I've been more focused on
solving the problems that we have with using Monticello 1 in Squeak.</p>
<p>However, there is a real need for tools to make cross-dialect development
easier, and versioning is an important component of that. After doing a few
demos, I had volunteers to maintain VisualAge and Dolphin ports. The
VisualWorks folks all seem pretty busy, but I'm sure somebody will step up
when MC2 gets to production quality.</p>
<p>With all that momentum coming out of the conference, I cleaned up the code a
bit, wrote an installer and posted the first alpha to SqueakMap. The reaction
has been mostly positive, particularly given that Monticello 2 is still very
raw and there's no documentation at all.</p>
<p>To remedy that, I'll post some discussion of the architecture and features of
Monticello 2 over the coming weeks.</p>
<h2>Humor</h2>
<p><em>2006-05-23T18:13:01-07:00</em></p>
<p>Today Patricia asked me to explain about "knock, knock." So I explained the
call-and-answer sequence and gave a couple of examples. Then I had to tell her
that they were meant to be jokes. She just stared at me and said, "but it's
not funny." She's right. It's not.</p>
<p>Newfie jokes seem to be universal, though. In Ecuador, they make jokes about
"Pastusos" - people from Pasto, a Colombian city just across the border.</p>
<h2>New Digs</h2>
<p><em>2006-05-21T22:07:27-07:00</em></p>
<p>After months of neglect, I've finally put some time into fixing my blog. I'm
ditching SmallBlog and moving to Typo. We'll see if Rails is all it's cracked
up to be. As much as I'm a believer in eating my own dogfood, SmallBlog was
just painfully difficult to post to, and I'd rather spend my spare time
writing than tinkering with the infrastructure. I've still got to find a theme
that doesn't clash with the rest of the site, but posting works, finally.</p>
<p>The old posts are still
<a href="http://www.wiresong.ca/seaside/blog/colin">available</a>, but I'll migrate them
over to Typo as soon as I can.</p>
<p>I've decided to stick with "Thin Air," despite the fact that it's wildly
inappropriate these days. Now that I work for
<a href="http://www.dabbledb.com">Smallthought</a>, I expect to be near Vancouver again
soon.</p>
<h2>BASTUG Meeting</h2>
<p><em>2005-09-15T11:28:41-07:00</em></p>
<p>I'm stoked that we're back on track with BASTUG, the Boston Area
Smalltalk User's Group. It went on a short hiatus while I got married
and Rob's wife had a baby, but now that things have settled down,
we've decided on the third Tuesday of the month for regular meetings.
That makes for short notice of this month's meeting, but here's
where it will be:</p>
<pre><code>Date: Tuesday, September 20, 2005
Time: 6:00 pm
The Joshua Tree Bar & Grille
256 Elm St.
Somerville, MA 02144
617-623-9910
</code></pre>
<p><a href="http://maps.google.com/maps?q=256+Elm+St.,+Somerville,+MA+02144&amp;spn=0.023033,0.033757&amp;hl=en">map</a></p>
<h2>Not Messages</h2>
<p><em>2005-07-12T22:04:25-07:00</em></p>
<p>In my last post, I speculated that a hypothetical "good modelling
language" wouldn't revolve around message-sending, but rather
focus on the relationships between objects and making explicit the patterns of
cooperating objects that we see in good OO design. I shouldn't have
imagined that I could get away with being that vague; Vincent Foley quickly
asked the pertinent question:</p>
<pre><code>I have a little question: I quite like Smalltalk (though I'm more a Ruby
guy), but I was wondering what you meant by a language that is not
message-oriented? What would that look like?
</code></pre>
<p>One of the first things I do when modelling is to identify some of the
objects that will be in the model: the nouns in the domain language. These
would be <em>objects</em>. I also want to classify objects, so they should
have classes or types attached to them.</p>
<p>I also want to describe <em>relationships</em> between objects. I'm
imagining a way to build up complex relationships from simpler ones, in the
same way that "high level" methods can call "low level" methods.
Perhaps the language or libraries would provide basic relationships such as
is-a and has-a. By combining several of these I could build complexity. For
example, I might define the relationship between an invoice and line items
like this:</p>
<ul>
<li>an invoice has a collection of line items</li>
<li>an invoice has a total</li>
<li>a line item has a value</li>
<li>an invoice's total is the sum of the values of its line items</li>
</ul>
<p>Now, this sort of declarative definition of relationships is great in the
abstract, but I still need a way of causing computation to occur. The state of
the program has to change over time, so I need a way to describe the changes
that might happen in the relationships of objects at runtime. By defining a
<em>transformation</em>, I provide a transition from one state to another. For
example, I might say that given an invoice and a line item, the line item can
be added to the invoice's collection.</p>
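<p>An existing language can only approximate this, but a rough Python sketch gives the flavor. The relationships become immutable data plus a derived value, and the transformation becomes a function from one state to the next. All the names here (<code>LineItem</code>, <code>Invoice</code>, <code>add_line_item</code>) are assumptions made for the sake of the example.</p>
<pre><code>from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LineItem:
    value: Decimal  # "a line item has a value"


@dataclass(frozen=True)
class Invoice:
    line_items: tuple = ()  # "an invoice has a collection of line items"

    @property
    def total(self):
        # "an invoice's total is the sum of the values of its line items"
        return sum((item.value for item in self.line_items), Decimal(0))


def add_line_item(invoice, item):
    """Transformation: given an invoice and a line item, produce the next state."""
    return Invoice(invoice.line_items + (item,))


empty = Invoice()
one = add_line_item(empty, LineItem(Decimal("10.50")))
two = add_line_item(one, LineItem(Decimal("4.50")))

print(empty.total, one.total, two.total)  # 0 10.50 15.00
</code></pre>
<p>The total is never assigned anywhere; it follows from the declared relationship, and each transformation leaves the earlier state intact.</p>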
<p>Finally, all this needs to be hooked up to input and output. With the
right hooks, input would create new objects, whose existence would trigger a
cascade of transformations, state changes in the model, and ultimately, a
result going to output.</p>
<p>I'm still doing a lot of hand-waving here, but one can at least
imagine that such a language could exist, and perhaps what programming in it
might feel like. This notion of input triggering a cascade of transformations
sounds a bit like <a href="http://www.haskell.org/tutorial/monads.html">monads</a>, which makes me
wonder if this is maybe a lazy functional language in disguise. I should
probably take a cue from <a href="http://www.blainebuxton.com/weblog/2004/08/perfect-design-i-am-addicted-to.html">Blaine</a>
and learn Haskell.</p>